Search Results for: data broker

Muyi Xiao, Paul Mozur, Isabelle Qian, and Alexander Cardia of the New York Times put together a haunting short documentary about the state of surveillance in China. It shows a complete loss of privacy, in which any attempt to maintain one’s sense of self is regarded as suspicious. From my limited perspective, I cannot imagine making such a fundamental sacrifice.

This is why it is so important to match the revulsion we feel over things like that Cadillac Fairview surreptitious facial recognition incident or Clearview AI — in its entirety — with strong legislation. These early-stage attempts at building surveillance technologies that circumvent legal processes forecast an invasive future for everyone.

Nicole Nguyen and Cordilia James, Wall Street Journal:

Different types of data, including information that can be subpoenaed from period trackers, can create an extremely detailed profile of you when combined. Prof. Fowler says she thinks it is likely that user data will have greater importance if more places criminalize abortion.

While period trackers collect and store health data, there aren’t typically special protections governing that information, said Prof. Fowler. Apps can use your data how they choose as outlined in their privacy policies, she said, adding that ideally the data would be stored on your devices — rather than in the cloud — and not be subject to third-party tracking.

The sometimes sketchy privacy policies of period tracking apps, and the legal jeopardy in which they can place users, are explicitly called out in Sen. Elizabeth Warren’s announcement of a bill to curtail data brokers.

Apple’s first-party Health app is the only one that encrypts users’ data end-to-end. Unfortunately, it is halfway between an all-in-one health tracking app and a repository for other apps’ data. I do not have experience with entering a menstrual cycle, but I find manually adding cycling distance or — new in iOS 16 — medication to be confusing and inelegant.

Even if a period tracking app is sharing data with Health, it is worth remembering that its own in-app privacy and data use policies apply.

Jon Brodkin, Ars Technica:

A bill introduced by Sen. Elizabeth Warren (D-Mass.) would prohibit data brokers from selling Americans’ location and health data, Warren’s office said Wednesday.

“Largely unregulated by federal law, data brokers gather intensely personal data such as location data from seemingly innocuous sources including weather apps and prayer apps—oftentimes without the consumer’s consent or knowledge,” a bill summary said. “Then, brokers turn around and sell the data in bulk to virtually any willing buyer, reaping massive profits.”

I do love the sound of this. Though Brodkin says it bans selling certain data types, it is actually more comprehensive — if passed, data brokers would be prohibited from doing just about anything with location and health data “declared or inferred”.

It seems too good to be true, and my hopes were quashed when I read this piece from Jeffrey Neuburger of the National Law Review:

[…] The bill makes exceptions for health information transfers done lawfully under HIPAA, publication of “newsworthy information of legitimate public concern” under the First Amendment, or disclosure for which the individual provides “valid authorization.” The FTC would be responsible for adapting the HIPAA-related term “valid authorization” to fit the location data context. It is possible that the conspicuous notice and consent processes surrounding the collection and use of the data — as is currently in place in many mobile applications — will suffice.

If all big ideas for protecting privacy come down to the same notice and consent laws that have had mixed results around the world, I do not think we will find ourselves in a better place. Everyone will simply be more irritated by the technology they use while finding few privacy benefits. I understand the value of someone consenting to have information collected and shared, but there needs to be a better model for confirming an opt-in, and firmer limits on how that information can be used.

Julia Conley, Common Dreams:

Warren noted that location data has already been used by federal agencies to circumvent the Fourth Amendment by purchasing private data instead of obtaining it via a subpoena or warrant and to out LGBTQ+ people.

I continue to wonder how much of a factor it is that law enforcement and intelligence agencies rely on anti-privacy companies and data brokers as a workaround for more scrutinized legal measures.

Dimitrios Katsifis, writing in the Platform Law Blog, published by a British–Belgian law firm:

On 10 June 2022, the UK Competition and Markets Authority (CMA) published its Final Report on its year-long market study into mobile ecosystems — namely mobile operating systems, app stores, and web browsers. The CMA found that Apple and Google have a tight grip over these increasingly crucial ecosystems, which in turn places them in a very powerful position. As a result, thousands of businesses which rely on these ecosystems to reach their users face restrictions and terms which they have little choice but to accept, while consumers are likely to miss out on new innovations, have less choice, and ultimately face higher prices. In response, the CMA has identified a wide range of potential interventions that could help unlock competition and protect millions of businesses and people.

Even so, ecosystem operators — and Apple in particular — have fiercely opposed any intervention on the part of the CMA, arguing that this would compromise user privacy, security, and safety. […]

Of course there are times when real privacy and security risks conflict with permissive on-platform competition. But the CMA identified several areas where it says Apple and Google overstated those concerns in defence of their platform rules. That is a bit of a case of the boy crying “wolf!”. It is difficult to believe platform operators are always making these decisions only in the name of privacy and security when there are conflicts of interest in their own business lines.

Ideally, platform owners would be the ones making decisions about what to allow in their ecosystems, and regulators would not need to micromanage.

One other thing, from Katsifis:

First, the choice architecture for the ATT prompt — which Apple chose without conducting any user testing — may not maximise user comprehension and thus could unduly influence some users to opt out of data sharing. Among others, the framing of the prompt could result in limited user comprehension, while Apple bars developers from offering any incentives for users to opt in to sharing their data (which in principle is not unlawful under UK privacy legislation).

Permitting tracking and data collection by coercion seems ethically fraught to me. How hard is it to not be creepy? At this point, just wipe the entire digital advertising and data broker industry off the face of the Earth and start from scratch.

Issie Lapowsky, Protocol:

Every single pig in the D.C. metro area took flight Friday when three key bipartisan lawmakers unveiled a draft of their actual, real-life, long-promised, but rarely materializing comprehensive privacy bill.

The draft of the American Data Privacy and Protection Act represents a crucial compromise between Reps. Frank Pallone Jr., Cathy McMorris Rodgers and Sen. Roger Wicker, and would give Americans unprecedented rights over their privacy, including the right to sue tech companies that violate it.

So, has the time for federal privacy law in the U.S. finally come? Come on back down to ground, little piggies. This could take a while.

Shoshana Wodinsky, Gizmodo:

And at least from a brief reading of the 10-pager outlining the bill’s basics, it looks pretty good! Upon a deeper reading though, the thing is… well, it’s not pretty good, or even remotely good. It carves out exemptions for bad bosses and law enforcement officials, while letting data brokers continue buying and selling vast amounts of our personal data with impunity.

I am following U.S. privacy legislation with great interest as it has the potential for worldwide knock-on effects. This bill seems promising at first glance, but Wodinsky documents enough loopholes and flaws to make me question how much it was influenced by anti-privacy industries. It is worrying to see a blanket exemption for data that has been “de-identified”, for example, even though we know that means nothing in terms of privacy protections.

Please try again.

Leaky Forms is a new study by Asuman Senol, Gunes Acar, Mathias Humbert, and Frederik Zuiderveen Borgesius (emphasis theirs):

Email addresses — or identifiers derived from them — are known to be used by data brokers and advertisers for cross-site, cross-platform, and persistent identification of potentially unsuspecting individuals. In order to find out whether access to online forms are misused by online trackers, we present a measurement of email and password collection that occur before form submission on the top 100K websites.

These researchers received marketing emails from some of the leaky sites where, I will repeat, they never submitted a form. Their typed email address was captured and whisked into the ad tech and data broker machinery without their explicit consent. When they assessed these forms with a U.S.-based crawler, researchers found a higher rate of email address collection (PDF, section 4.3) than with an E.U.-based crawler, “perhaps due to stricter data protection regulations”.

The worst offenders were, according to researchers, fashion and beauty websites, with shopping and general news sites in second and third places. Notably more private: porn sites, the only category for which not a single one was found to have leaky forms.
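For the curious, the detection approach is simple to picture: type a unique email address into a form, never submit it, then search the site’s outgoing network traffic for that address, or for a hash or encoding of it. Here is a minimal Python sketch of that kind of substring matching; the tracker URL is made up:

```python
import base64
import hashlib
from urllib.parse import quote

def email_signatures(email: str) -> dict:
    """Common encodings of an email address that a crawler might
    search for in a site's outgoing network requests."""
    e = email.strip().lower()
    return {
        "plain": e,
        "url-encoded": quote(e, safe=""),
        "base64": base64.b64encode(e.encode()).decode(),
        "md5": hashlib.md5(e.encode()).hexdigest(),
        "sha1": hashlib.sha1(e.encode()).hexdigest(),
        "sha256": hashlib.sha256(e.encode()).hexdigest(),
    }

def find_leaks(request_payload: str, email: str) -> list:
    """Report which encodings of the email appear in a request payload."""
    return [name for name, sig in email_signatures(email).items()
            if sig in request_payload]

# Example: a made-up tracker beacon fired before the form was submitted.
beacon = ("https://tracker.example/collect?uid="
          + hashlib.md5(b"jane@example.com").hexdigest())
print(find_leaks(beacon, "jane@example.com"))  # ['md5']
```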

Joseph Cox, Vice:

The Centers for Disease Control and Prevention (CDC) bought access to location data harvested from tens of millions of phones in the United States to perform analysis of compliance with curfews, track patterns of people visiting K-12 schools, and specifically monitor the effectiveness of policy in the Navajo Nation, according to CDC documents obtained by Motherboard. The documents also show that although the CDC used COVID-19 as a reason to buy access to the data more quickly, it intended to use it for more general CDC purposes.

Location data is information on a device’s location sourced from the phone, which can then show where a person lives, works, and where they went. The sort of data the CDC bought was aggregated — meaning it was designed to follow trends that emerge from the movements of groups of people — but researchers have repeatedly raised concerns with how location data can be deanonymized and used to track specific people.

Remember, during the early days of the pandemic, when the Washington Post published an article chastising Apple and Google for not providing health organizations full access to users’ physical locations? In the time since it was published, the two companies released their jointly-developed exposure notification framework which, depending on where you live, has either been somewhat beneficial or mostly inconsequential. Perhaps unsurprisingly, regions with more consistent messaging and better privacy regulations seemed to find it more useful than places where there were multiple competing crappy apps.

I bring that up because, as it turns out, a new app that invades your privacy in the way the Post seemed to want was unnecessary: a bunch of other apps on your phone already do that job just fine. And, for the record, that is terrible.

In a context vacuum, it would be better if health agencies were able to collect physical locations in a regulated and safe way for all kinds of diseases. But there have been at least two stories of wild overreach during this pandemic alone: this one, in which the CDC wanted location data for all sorts of uses beyond contact tracing, and Singapore’s acknowledgement that data from its TraceTogether app — not based on the Apple–Google framework — was made available to police. These episodes do not engender confidence.

Also — and I could write these words for any of the number of posts I have published about the data broker economy — it is super weird how this data can be purchased by just about anyone. Any number of apps on our phones report our location to hundreds of these companies we have never heard of, and then a government agency or a media organization or some dude can just buy it in ostensibly anonymized form. This is the totally legal but horrific present.

Reports like these underscore how frustrating it was to see the misplaced privacy panic over stuff like the Apple–Google framework or digital vaccine passports. Those systems were generally designed to require minimal information, report as little externally as possible, and use good encryption for communications. Meanwhile, the CDC can just click “add to cart” on the location of millions of phones.
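As a point of contrast, the exposure notification framework was designed so that matching happens entirely on the device. Here is a heavily simplified Python sketch of its general shape; the real protocol derives identifiers with HKDF and AES, so treat the HMAC below as a stand-in:

```python
import hmac
import hashlib
import os

def rolling_identifiers(daily_key: bytes, intervals: int = 144) -> set:
    """One rotating identifier per ten-minute interval, derived from a
    daily key. (The real framework uses HKDF and AES; HMAC stands in here.)"""
    return {
        hmac.new(daily_key, i.to_bytes(4, "little"), hashlib.sha256).digest()[:16]
        for i in range(intervals)
    }

# Phones broadcast these rotating identifiers over Bluetooth and record
# the ones they hear nearby: no names, no locations, no central log.
my_daily_key = os.urandom(16)
identifiers_heard = set()  # filled in as other phones are encountered

def check_exposure(published_keys: list, heard: set) -> bool:
    """A diagnosed user uploads only their daily keys; everyone else
    re-derives the identifiers and checks for overlap on their own device."""
    return any(rolling_identifiers(key) & heard for key in published_keys)
```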

Sam Biddle and Jack Poulson, the Intercept:

Anomaly Six software lets its customers browse all of this data in a convenient and intuitive Google Maps-style satellite view of Earth. Users need only find a location of interest and draw a box around it, and A6 fills that boundary with dots denoting smartphones that passed through that area. Clicking a dot will provide you with lines representing the device’s — and its owner’s — movements around a neighborhood, city, or indeed the entire world.

[…]

To fully impress upon its audience the immense power of this software, Anomaly Six did what few in the world can claim to do: spied on American spies. “I like making fun of our own people,” Clark began. Pulling up a Google Maps-like satellite view, the sales rep showed the NSA’s headquarters in Fort Meade, Maryland, and the CIA’s headquarters in Langley, Virginia. With virtual boundary boxes drawn around both, a technique known as geofencing, A6’s software revealed an incredible intelligence bounty: 183 dots representing phones that had visited both agencies potentially belonging to American intelligence personnel, with hundreds of lines streaking outward revealing their movements, ready to track throughout the world. “So, if I’m a foreign intel officer, that’s 183 start points for me now,” Clark noted.

Clark was able to show, according to Biddle and Poulson, up to a year’s worth of location history for each of those nearly two hundred devices. Any of these devices could easily be de-anonymized because, well, Anomaly Six had their entire location history. It is worth being cautious about these capabilities given the self-promotional context of the claims, but multiple experts told the Intercept they found them believable.
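It is worth appreciating how little sophistication any of this requires once the raw pings have been bought. A hypothetical sketch of the geofence-and-intersect query described above, with field names and coordinates of my own invention:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Ping:
    device_id: str  # an advertising identifier, as sold in broker feeds
    lat: float
    lon: float

def devices_in_box(pings, south, west, north, east):
    """IDs of every device with at least one ping inside the bounding box."""
    return {p.device_id for p in pings
            if south <= p.lat <= north and west <= p.lon <= east}

def seen_at_both(pings, box_a, box_b):
    """The '183 start points' trick: devices that appear in both
    geofences are just a set intersection."""
    return devices_in_box(pings, *box_a) & devices_in_box(pings, *box_b)

# Hypothetical usage: boxes drawn roughly around two buildings.
fort_meade = (39.10, -76.78, 39.12, -76.76)
langley = (38.94, -77.16, 38.96, -77.14)
# seen_at_both(purchased_pings, fort_meade, langley) -> set of device IDs
```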

Byron Tau of the Wall Street Journal has previously reported on Anomaly Six’s capabilities, which are derived from the inclusion of its SDK in third-party apps as well as the broader data broker economy. That economy is potentially open to buyers from other countries, too, given the United States’ almost nonexistent protections on personal data privacy. Much of the world’s tech industry is also based in the U.S., and those companies’ privacy policies often say U.S. jurisdiction applies.

Not only does the American military-industrial complex have the ability to spy on the world’s devices, adversarial nations could create similar capabilities — again, partly thanks to the weak privacy protections afforded by U.S. law and its concentration of tech companies.

It does not really matter how well-educated you are as a consumer or user. Short of not owning anything that connects to the internet, there is no reliable way of opting out of surveillance by a company nobody really thinks about. The only way this gets improved is by minimizing data generation and collection, and through stricter privacy laws. Perhaps this is one reason why American lawmakers have been reluctant to pass such laws.

I have written an awful lot about data brokers for years now and others have been covering this industry for much longer. Yet it persists, and I am glad it is getting the kind of spotlight that John Oliver’s “Last Week Tonight” can throw on it. It is a good high-level overview, accurately covering many familiar stories, and will hopefully motivate more comprehensive reforms.

This video is only available in the United States right now, but I am sure you are a clever person.

Ted Gioia, the Honest Broker:

I had a hunch that old songs were taking over music streaming platforms — but even I was shocked when I saw the most recent numbers. According to MRC Data, old songs now represent 70% of the US music market.

[…]

I can understand the frustrations of music lovers getting no satisfaction from the current songs, though they try and they try. I also lament the lack of imagination on many current hits. But I disagree with their larger verdict. I listen to 2-3 hours of new music every day, and I know that there are plenty of outstanding young musicians out there. The problem isn’t that they don’t exist, but that the music industry has lost its ability to discover and nurture their talents.

Gioia explores many possibilities for why catalogue music — recordings from more than eighteen months prior — is dominating charts. Maybe big artists are delaying releases due to the pandemic, causing audiences to retreat to songs they already know and love. Perhaps TikTok is to blame for keeping tracks with endlessly reusable choruses — like Gayle’s “ABCDEFU” — on the charts, though how that explains Glass Animals’ “Heat Waves” spending over a year on the Billboard Hot 100 is anyone’s guess.

But there is a simpler reason I think makes the most sense.

Ben Gilbert, Synchtank:

Kriss Thakrar, consultant at MIDiA Research, believes the answer is connected to both technology and demographics. “The audiences of streaming platforms are getting older. Most of the early adoption was from millennials who are now in their late 20s and 30s. There are twice as many people over the age of 35 as there are under 25 on streaming so the listening habits will naturally skew towards older music, coupled with younger listeners also being inclined to listen to older songs,” he told Synchtank.

“However, millennials remain Spotify’s core audience and music from this millennium (but still over two plus years old) forms the vast majority of music consumption. With catalog forming the majority of consumption on streaming, it is no wonder that the owners of that catalog are set to benefit the most. This creates new opportunities for older artists to monetise their catalog and overall it works well for labels and publishers,” commented Thakrar.

No matter how much new music I listen to, there is an ever-growing catalogue of music I have already heard and can return to. That is not to say there is only a singular factor, nor that no new stars have emerged from the music industry — Lil Nas X and Doja Cat have established their presence with aplomb. But in an aging nation (PDF) like the U.S., it seems likely that more people gravitating toward the familiar gives older music an edge.

Alfred Ng and Jon Keegan, the Markup:

There is an estimated $12 billion market of companies that buy and sell location data collected from your cellphone. And the trade is entirely legal in the U.S.

Without legislation limiting the location data trade, Apple and Google have become the de facto regulators for keeping your whereabouts private — through shifts in transparency requirements and crackdowns on certain data brokers.

[…]

Workers in the location data industry told The Markup that data brokers are increasingly collecting data directly from app developers instead of relying on SDKs, which often leave a digital footprint. And it’s unclear how Apple and Google could even monitor how apps are sharing and selling data once they obtain it.

The short version is that nobody is policing it, and location data collected from anyone with a smartphone and commonplace third-party apps has become a massive unregulated market.

I understand the argument for Apple and Google to operate their platforms by their rules, but privacy should not be a policy decision made by a parent company. Nobody should have to decide what unknown privacy tradeoffs they are making by choosing software from one company over another. There must be clear rules restricting the collection and use of this information, and the U.S. desperately needs those laws since most of the world’s major technology companies are based in that country.

Jon Keegan and Alfred Ng, the Markup:

The family safety app Life360 announced on Wednesday that it would stop selling precise location data, cutting off one of the multibillion-dollar location data industry’s largest sources. The decision comes after The Markup revealed that Life360 was supplying up to a dozen data brokers with the whereabouts of millions of its users.

Good. But Life360 says it will keep selling precise data to Arity, and “aggregate” data to Placer.ai. This is all at the company’s discretion — there is nothing legally preventing it from entering into similar data marketing agreements again.

Joseph Cox, Vice:

“The Banning Surveillance Advertising Act does what its title suggests. The legislation prohibits advertising facilitators (e.g., Facebook, Google DoubleClick, data brokers) from targeting ads with the exception of broad location targeting to a recognized place (e.g., municipality),” a press release announcing the proposed legislation reads. “The bill also prohibits advertisers from targeting ads based on protected class information and any information they purchase. Violations can be enforced by the Federal Trade Commission, state attorneys general, or private lawsuits,” it adds. The legislation would also prohibit targeted advertisements based on protected class attributes such as race, gender, and religion.

Reps. Anna G. Eshoo of California and Jan Schakowsky of Illinois, and Sen. Cory Booker of New Jersey are the Democratic lawmakers behind the proposed legislation.

Can Duruk:

My hope is that we will look back at the current state of the internet, funded solely by adtech, like when we used asbestos for insulation, lead for toys, and land mines for defense.

There is no chance that this bill becomes law in the U.S., thereby causing the world’s ad tech market to adjust to a better model, but a simple Canadian boy can dream.

Jon Keegan and Alfred Ng, the Markup:

Life360, a popular family safety app used by 33 million people worldwide, has been marketed as a great way for parents to track their children’s movements using their cellphones. The Markup has learned, however, that the app is selling data on kids’ and families’ whereabouts to approximately a dozen data brokers who have sold data to virtually anyone who wants to buy it.

In 2019, Apple pulled about a dozen parental control apps from the App Store over privacy concerns, since they abused Mobile Device Management, though I cannot find any reports that Life360 was among them. However, I did come across a Wired article from later that year in which Louise Matsakis reported that Life360’s public trading prospectus indicated the value it sees in mining its vast collection of user data — largely of children — for profit.

Last month, Life360 announced it would be acquiring Tile.

Jeffrey Dastin, Chris Kirkham, and Aditya Kalra, Reuters:

Amazon’s lobbying against privacy protections aims to preserve the company’s access to detailed consumer data that has fueled its explosive online-retailing growth and provided an advantage in emerging technologies, according to the Amazon documents and former employees. The data Amazon amasses includes Alexa voice recordings; videos from home-camera systems; personal health data from fitness trackers; and data on consumers’ web-searching and buying habits from its e-commerce business.

Some of this information is highly sensitive. Under a 2018 California law that passed despite Amazon’s opposition, consumers can access the personal data that technology companies keep on them. After losing that state battle, Amazon last year started allowing all U.S. consumers to access their data. (Customers can request their data at this link.) Seven Reuters reporters obtained and examined their own Amazon dossiers.

Even setting aside its massive cloud computing business, it is staggering to imagine how much information Amazon, a company with historically poor internal controls, has about its users. For its heaviest users — Prime members with Ring doorbells and Alexa devices in every room, who read their Kindle most nights and shop at Whole Foods — Amazon has a more-or-less complete picture of their lifestyle.

I am a very light Amazon user, with just one order made in 2021, and six in 2020. I do not have any Alexa or Kindle devices, and have never shopped at Whole Foods. So I was a little surprised when I requested my data on November 19 and was told that it would take up to a month for them to produce a copy. I delayed writing about this story because I wanted to have a copy of my own data in hand, but it has been five days and I have not received anything. Every other large technology company has produced a copy of my data within hours of my making the request, and even the slowest information brokers have taken just a couple of days. Is Amazon relying on an entirely manual process?

Some of the examples cited by Reuters are a little weak on their face:

Alexa devices also pulled in data from iPhones and other non-Amazon gear – including one reporter’s iPhone calendar entries, with names of people he was scheduled to contact.

I am not sure it is newsworthy that Alexa devices need to know about users’ calendar entries in order to respond to queries like “what time is my meeting with Leslie?”, for example. But perhaps it should be — if this reporter was not aware of how much information a smart speaker must ingest and share with Amazon’s servers, it is understandable that it feels like an invasion of privacy. If something can be done locally, it probably ought to be.

One more thing:

As executives edited the draft, Herdener summed up a central goal in a margin note: “We want policymakers and press to fear us,” he wrote. He described this desire as a “mantra” that had united department leaders in a Washington strategy session.

This is a terrible goal to even suggest in a margin note, and it is indicative of the kind of ruthless work culture that urgently needs to die.

From the U.S. Federal Trade Commission:

Many internet service providers (ISPs) collect and share far more data about their customers than many consumers may expect — including access to all of their Internet traffic and real-time location data — while failing to offer consumers meaningful choices about how this data can be used, according to an FTC staff report on ISPs’ data collection and use practices.

This report is alarming, yet painfully obvious to anyone who has been paying attention to the behaviour of American internet providers. Because they are conglomerates operating in many markets, they have a uniquely comprehensive view of Americans’ lives, which they pitch as an advantage in the miserable world of targeted advertising. And it is a mutually beneficial market.

From the report (PDF):

Second, there is a trend in the ISP industry to buy consumer information from third party data brokers, which many ISPs in our study use for advertising purposes. One reported using data from data brokers to market their own products to new customers only. For example, they might get lists of new homeowners in a particular geographic area. A sizable number of the ISPs in our study also buy data from data brokers about their existing customers. For example, an ISP might send the data broker subscriber names and addresses, which the data broker would then append with demographic information (e.g., gender, age range, race and ethnicity information, marital status, parental status) and interest data (e.g., hiking, biking, gardening, bodybuilding, high-end spirits) for those subscribers. Or, for those ISPs that do not want to share their customers’ names and contact information with third-party data brokers, the ISP might send persistent identifiers (e.g., cookies, advertising identifiers, or hashed or encrypted account numbers or telephone numbers) associated with their subscribers to third party “matching services.” These matching services then sync these identifiers with similar identifiers they receive from other sources and provide the list of identifiers to the ISP. Once the ISP has the synced list of identifiers, the ISP can then check with data brokers to request demographic and interest data associated with all of those identifiers, without sharing consumers’ name and contact information.
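That “matching service” flow sounds elaborate, but mechanically it is just a join on hashed identifiers. A minimal sketch with invented subscriber and segment data, which also shows why hashing a phone number is not meaningful anonymization:

```python
import hashlib

def match_key(identifier: str) -> str:
    """Hash an identifier so two parties can join their records without
    exchanging the raw email address or phone number."""
    return hashlib.sha256(identifier.strip().lower().encode()).hexdigest()

# The ISP's side, keyed by hashed phone number (invented data):
isp_subscribers = {match_key("555-0100"): "subscriber-81"}

# The broker's side: the same hash, computed from data gathered elsewhere.
broker_segments = {match_key("555-0100"): ["hiking", "high-end spirits"]}

# The join: interest data flows to the ISP without names changing hands,
# which is why hashing is not meaningful anonymization here.
for hashed_id, subscriber in isp_subscribers.items():
    if hashed_id in broker_segments:
        print(subscriber, broker_segments[hashed_id])
```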

The data brokerage industry is vile. For comparison, here in Canada, internet providers are prohibited from using subscriber information for auxiliary business purposes without express permission. Bell, one of the big telecom providers in Canada, runs a “tailored marketing program” that requires subscribers to opt into receiving ads based on their Bell-provided services. I still think it is gross, but at least it is off by default and requires explicit permission.

Because it is opt-in, I bet this business is tiny. I asked Bell for more information about it, including the number of subscribers, and have not heard back. But I imagine very few people agree to allow the use of their web activity and television habits to serve them ads, probably because most people do not think the privacy tradeoffs are worth it. iOS’ App Tracking Transparency feature has similarly low opt-in rates. Even though many apps do not respect it, this indicates that most people do not want their activities recorded for the milquetoast reason of making ads a little bit more relevant.

U.S. service providers should respect those kinds of wishes. Unfortunately, while mainstream attention has finally turned to the egregious privacy practices of companies like Facebook and Google, ISPs have not been subjected to similar scrutiny. This is as true for the press as it is for regulators. The CEOs of tech companies have spent hours over the past few years testifying before Congress about their privacy practices, but telecom CEOs have not been asked to do the same. Reports about lobbying have highlighted how much money is being spent by technology companies, without acknowledging similarly huge spending by telecoms.

I know this is not a new observation, but: these egregious violations of user privacy will not change without regulation, yet rules protecting consumers’ personal data are unlikely to materialize when lawmakers are taking so much money from the businesses they are supposed to regulate.

Johana Bhuiyan, the Guardian:

Geofence location warrants and reverse search warrants such as the ones McCoy dealt with are increasingly becoming the tool of choice for law enforcement. Google revealed for the first time in August that it received 11,554 geofence location warrants from law enforcement agencies in 2020, up from 8,396 in 2019 and 982 in 2018.

It’s a concerning trend, argue experts and advocates. They worry the increase signals the start of a new era, one in which law enforcement agencies find ever more creative ways to obtain user information from data-rich tech companies. And they fear agencies and jurisdictions will use this relatively unchecked mechanism in the context of new and controversial laws such as the criminalization of nearly all abortions in Texas.

If this topic sounds familiar to you, thank you for being a regular reader. I think this is a critical topic to understand: law enforcement, which is generally prohibited from monitoring large groups of people indiscriminately, is able to work around those pesky restrictive laws by subpoenaing advertisers and data brokers. Byron Tau of the Wall Street Journal has covered this extensively, as have Joseph Cox of Vice and reporters at Buzzfeed News. In some cases, law enforcement is able to collect information without a warrant, as Tau revealed in an article earlier this week.

Where I think this article jumps the rails is in its attempt to tie Apple’s proposed CSAM detecting efforts to the above warrantless data collection methods:

For tech companies that count advertising among their revenue streams – or as a major source of revenue, as is the case for Google, there’s no real technical solution to curbing government requests for their data. “It would be technically impossible to have this data available to advertisers in a way that police couldn’t buy it, subpoena it or take it with a warrant,” Cahn said.

That’s why Apple’s now-postponed plan to launch a feature that scans for CSAM caused such a furor. When the FBI in 2019 asked Apple to unlock the phone of the suspect in a mass shooting in San Bernardino, California, Apple resisted the request arguing the company couldn’t comply without building a backdoor, which it refused to do. Once Apple begins scanning and indexing the photos of anyone who uses its devices or services, however, there’s little stopping law enforcement from issuing warrants or subpoenas for those images in investigations unrelated to CSAM.

While I understand the concern, this is simply not how the proposed feature would work.

For one thing, Apple is already able to respond to warrants with photos stored in iCloud. The CSAM detection proposal would not change that.

For another, photos would not really be scanned or indexed. They would be compared against hashes of known CSAM photos and flagged with information about whether a match was found. This would apply only to photos stored in iCloud, so someone could opt out of the feature entirely by disabling iCloud Photo Library.
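To illustrate the distinction, here is a simplified sketch. Apple’s actual proposal used a perceptual hash, NeuralHash, plus a match threshold and private set intersection; the plain digest below is a stand-in, but the shape is the same: a set membership check, not an index of your photo library.

```python
import hashlib
from pathlib import Path

known_hashes = set()  # hashes of known images, supplied externally

def flag_before_upload(photo: Path) -> bool:
    """Runs only on photos bound for iCloud; yields a match flag,
    never the photo's contents."""
    digest = hashlib.sha256(photo.read_bytes()).hexdigest()
    return digest in known_hashes
```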

Perhaps I am missing something key here, but Bhuiyan’s attempt to connect this feature with dragnet warrants seems tenuous at best. When law enforcement subpoenas Apple, they ask for information connected to specific Apple IDs or iCloud accounts. That is very different from the much scarier warrants issued based on the devices connected to a location, or the users that are connected with search queries.

Ad tech companies and data brokers have so much information about individual users that their databases can be used as a proxy for mass surveillance — that is a more pressing ongoing concern.

Common Sense Media recently completed an assessment of ten streaming video services and five dedicated devices, and has some concerns (PDF):

Many viewers know that free streaming apps are most likely selling their personal information, but most viewers may not know that most paid subscription streaming apps are also selling users’ data. Even more expensive streaming plans with “no ads” or “limited ads” still collect viewing data from use of the app to track and serve users advertisements on other apps and services across the internet. Also, data brokers buy and sell users’ data and share it with other companies for data recombination purposes.

[…]

Our privacy evaluations of the top 10 streaming apps indicate that all streaming apps (except Apple TV+) have privacy practices that put consumers’ privacy at considerable risk including selling data, sending third‐party marketing communications, displaying targeted advertisements, tracking users across other sites and services, and creating advertising profiles for data brokers.

It is the same story for devices, too.

Via Karl Bode, Techdirt:

Some of the failures were downright ugly, like making no real exceptions for the data collection of children. Many of the issues revealed weren’t the end of the world, but they make it repeatedly clear that companies aren’t being transparent about what is collected, and often enjoy making opting out of data collection and monetization as cumbersome and annoying as possible.

Also remember that smart televisions are among the worst privacy offenders. Even if you use an Apple TV box and watch shows through Apple TV Plus, your television may still be automatically recognizing everything you watch.

I would hate to begin any post here in the way that some first-year college student would start an essay: with a definition. But the meaning of “privacy” is so variable that I invite you to see how different entities explain it. NIST has a few different explanations, while the NTIA has a much longer exploration. PC Magazine’s glossary entry is pretty good, too, and closely mimics Steve Jobs’ thesis.

So, with all of this understanding of what privacy represents — at least in a kind of abstract sense — parts of this article by Benedict Evans come across as hollow, even as it makes several great arguments. I’m going to start by quoting the second paragraph, because it begins “first”:

First, can we achieve the underlying economic aims of online advertising in a private way? Advertisers don’t necessarily want (or at least need) to know who you are as an individual. As Tim O’Reilly put it, data is sand, not oil – all this personal data actually only has value in the aggregate of millions. Advertisers don’t really want to know who you are – they want to show diaper ads to people who have babies, not to show them to people who don’t, and to have some sense of which ads drove half a million sales and which ads drove a million sales. […]

Already, I find myself wondering if Evans is being honest with himself. The argument that advertisers want to work in bulk more often than at the individual level is an outdated one in an era of ads that can be generated to uncanny specificity. Even conceding that Facebook’s influence on the 2016 election was overstated, the Trump campaign was “running 40,000 to 50,000 variants of its ads” every day. This ain’t the world of high-quality, thoughtful advertising — not any more. This is a numbers game: scaled individualization driven by constant feedback and iteration. If advertisers believe more personal information will make ads more effective, they will pursue that theory as far as they can take it.

Evans acknowledges that consumer demands and Apple’s industry influence have pushed the technology industry to try improving user privacy. On-device tracking systems are seen, he says, as a more private way of targeting advertising without exposing user data to third parties.

But:

This takes me to a second question – what counts as ‘private’, and how can you build ‘private’ systems if we don’t know?

Apple has pursued a very clear theory that analysis and tracking is private if it happens on your device and is not private if [it] leaves your device or happens in the cloud. Hence, it’s built a complex system of tracking and analysis on your iPhone, but is adamant that this is private because the data stays on the device. People have seemed to accept this (so far), but acting on the same theory Apple also created a CSAM scanning system that it thought was entirely private – ‘it only happens on your device!’ – that created a huge privacy backlash, because a bunch of other people think that if your phone is scanning your photos, that isn’t ‘private’ at all. […]

I will get back to the first part of this quoted section at the end of this response because I think it is the most important thing in Evans’ entire piece.

For clarity, the backlash over CSAM scanning seems less about privacy than it does about device ownership and agency. This is, to some extent, perhaps a distinction without a difference. Many of the definitions I cited in the first paragraph describe privacy as a function of control. But I think there is a subtle point of clarity here: Apple’s solution probably is more private than checking those photos server-side, but it means that a user’s device is more than a mere client connected to cloud services — it is acting as a local agent of those services.

Continued from above:

[…] So is ‘on device’ private or not? […]

This feels like a trick question or a false premise, to which the only acceptable answer is “it depends”. In general, probably, but there are reasonable concerns about Google’s on-device FLoC initiative.

On / off device is one test, but another and much broader one is the first party / third party test: that it’s OK for a website to track what you do on that website but not OK for adtech companies to track you across many different websites. This is the core of the cookie question, and sounds sensible, and indeed one might think that we do have a pretty good consensus on ‘third party cookies’ – after all, Google and Apple are getting rid of them. However, I’m puzzled by some of the implications. “1p good / 3p bad” means that it’s OK for the New York Times to know that you read ten New York Times travel pieces and show you a travel ad, but not OK for the New Yorker to know that and show you the same ad. […]

This is where this piece starts to go off the rails. I have read the last sentence of this quoted paragraph several times and I cannot figure out if this is a legitimate question Evans is asking.

If we engage with it on its premise, of course it is not okay for the New Yorker to show an ad based on my Times browsing history. It is none of their business what I read elsewhere. It would be as if I went to a clothing store and then, later at a restaurant, the waiter told me I should have bought the other shirt I tried on because they thought it looked better. That would be creepy! And if any website could show me ads based on what I viewed somewhere else, that would mean my web browsing history is public knowledge. It violates both the first- and third-party definition and the on- and off-device definition.

But the premise is wrong — or, at least, incomplete. The New Yorker contains empty frames that can be filled by whatever a series of unknown adtech companies decide is the best fit for me, based on the slice of my browsing history each has collected, like little spies with snippets of information. If it were a direct partnership to share advertising slots, we could at least assume that someone who reads both publications sees them as similarly trustworthy organizations. But this is not a decision between the New Yorker and the Times. There may be a dozen other companies involved in selecting the ad, most of which a typical user has never heard of. How much do you, reader, trust Adara, Dataxu, GumGum, MadHive, Operative, SRAX, Strossle, TelMar, or Vertoz? I do not know if any of them have ever been involved in ad spots in the New Yorker or the Times, but they are all real companies that are really involved in placing ads across the web — and they are only a few names in a sea of thousands.

At this point one answer is to cut across all these questions and say that what really matters is whether you disclose whatever you’re doing and get consent. Steve Jobs liked this argument. But in practice, as we’ve discovered, ‘get consent’ means endless cookie pop-ups full of endless incomprehensible questions that no normal consumer should be expected to understand, and that just train people to click ‘stop bothering me’. Meanwhile, Apple’s on-device tracking doesn’t ask for permission, and opts you in by default, because, of course, Apple thinks that if it’s on the device it’s private. Perhaps ‘consent’ is not a complete solution after all.

Evans references Jobs’ consent-based explanation of privacy that I cited at the top of this piece — a definition which, unsurprisingly, Apple continues to favour. But an over-dependency on a consent model offloads the responsibility for privacy onto individual users. At best, this allows the technology and advertising industries to distance themselves from their key role in protecting user privacy; at worst, it allows them to exploit whatever they are permitted to gather by whatever technical or legal means possible.

The Jobs definition of privacy and consent is right, but it becomes even more right if you expand its scope beyond the individual. As important as it is for users to confirm who is collecting their data and for what purpose, it is more important that there are limits on the use and distribution of collected information. This sea of data is simply too much to keep track of. Had you heard of any of the ad tech companies mentioned above? What about data brokers that trade and “enrich” personal information? Even if users affirm that they are okay with an app or a website tracking them, they may not be okay with how a service that app relies on ends up reselling or sharing user data.

Good legislation can restrict these industries. I am sure Canada’s is imperfect, but there has to be a reason why the data broker industry here is, thankfully, almost nonexistent compared to the industry in the United States.

But the bigger issue with consent is that it’s a walled garden, which takes me to a third question – competition. Most of the privacy proposals on the table are in absolute, direct conflict with most of the competition proposals on the table. If you can only analyse behaviour within one site but not across many sites, or make it much harder to do that, companies that have a big site where people spend lots of time have better targeting information and make more money from advertising. If you can only track behaviour across lots of different sites if you do it ‘privately’ on the device or in the browser, then the companies that control the device or the browser have much more control over that advertising (which is why the UK CMA is investigating FLoC).

With GDPR, we have seen the product of similarly well-intentioned privacy legislation that restricts the abilities of smaller companies while further entrenching the established positions of giants. I think regulators were well aware of that consequence, and it is a valid compromise position between where the law existed several years ago and where it ought to be going.

As regulations evolve, these competition problems deserve greater focus. It is no good if the biggest companies on the planet, or those higher up the technology stack — like internet service providers — are able to use their position to abuse user privacy. But it would be a mistake to loosen policies on privacy and data collection just to make sure smaller companies have a chance of competing. Regulations must go in the other direction.

And, as an aside, if you can only target on context, not the user, then Hodinkee is fine but the Guardian’s next landmark piece on Kabul has no ad revenue. Is that what we want? What else might happen?

This is not a new problem for newspapers. Advertisers have always been worried that their ads will be placed alongside “hard news” stories. You can find endless listicles of examples — here’s one from Bored Panda. In order to avoid embarrassing associations, it is commonplace for print advertisers to ask for exceptions: a car company, for example, may request their ad not be placed alongside stories about collisions.

This has been replicated online at both ends of the ad buying market. The New York Times has special tags to limit or remove ads on some stories, while advertisers can construct lists of words and domains they want to keep their ads away from (sketched below). But what is new about online news compared to its print counterpart is that someone will go from the Guardian story about Kabul to Hodinkee without “buying” the rest of the Guardian, or even looking at it. This is a media-wide problem that has little to do with privacy-sensitive ad technologies. If serving individualized ads tailored to a user’s browsing history were so incredible, you would imagine the news business would be doing far better than it is.
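Mechanically, those exclusion lists amount to a simple filter. A sketch, with invented terms and domains:

```python
# Invented terms and domains; real lists are supplied by the advertiser.
BLOCKED_TERMS = {"crash", "collision", "war", "kabul"}
BLOCKED_DOMAINS = {"hard-news.example"}

def placement_allowed(domain: str, article_text: str) -> bool:
    """May this ad appear on this page? Domain check, then keyword check."""
    if domain in BLOCKED_DOMAINS:
        return False
    words = {w.strip(".,!?\"'").lower() for w in article_text.split()}
    return not (words & BLOCKED_TERMS)

print(placement_allowed("hodinkee.example", "A hand-finished dive watch"))  # True
print(placement_allowed("news.example", "Airstrikes hit Kabul overnight"))  # False
```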

All of this leads to the final paragraph in Evans’ piece, which I think raises worthwhile questions:

These are all unresolved questions, and the more questions you ask the less clear things can become. I’ve barely touched on a whole other line of enquiry – of where all the world’s $600bn of annual ad spending would be reallocated when all of this has happened (no, not to newspapers, sadly). Apple clearly thinks that scanning for CSAM on the device is more private than the cloud, but a lot of other people think the opposite. You can see the same confusion in terms like ‘Facebook sells your data’ (which, of course, it doesn’t) or ‘surveillance capitalism’ – these are really just attempts to avoid the discussion by reframing it, and moving it to a place where we do know what we think, rather than engaging with the challenge and trying to work out an answer. I don’t have an answer either, of course, but that’s rather my point – I don’t think we even agree on the questions.

Regardless of whether we disagree on the questions, or whether you — as I do — think that Evans is misstating concerns without fully engaging with them, I think he is entirely right here. Questions about user privacy on the web are often flawed because of the expansive and technical nature of the discussion. We should start with simpler questions about what we hope to achieve, and fundamental statements of what “privacy” really looks like. There should be at least some ground-level agreement about what information is considered personal and confidential. At the very least, I would argue that this applies to data points like non-public email addresses, personal phone numbers, dates of birth, government identification numbers, and advertiser identifiers that are a proxy for an individual or a device.

But judging by the popularity of data enrichment companies, it does not appear that there is broad agreement that anything is private any more — certainly not among those in advertising technologies. The public is disillusioned and overwhelmed, and it is irresponsible to leave it to individuals to unpack this industry. There is no such thing as informed consent in marketing technologies when there is no corresponding legislation requiring the protection of collected data. These kinds of fundamental concerns must be addressed before moving on to more abstract questions about how the industry will cope.

Jennifer Elias, CNBC:

Alphabet reported Q2 2021 earnings after the bell. The stock rose more than 3% after hours on the strong numbers, which crushed analyst expectations.

[…]

Total Google ad revenue increased to $50.44 billion, up 69% from the year-ago quarter, which was hurt by the onset of the Covid pandemic.

Tripp Mickle, Wall Street Journal:

Google’s parent company flexed its digital dominance, reporting its highest quarter ever for sales and profit behind a gusher of online advertising from businesses vying for customers across reopened economies.

[…]

Other tech companies have benefited from a soaring digital ad market. Snap Inc. last week reported revenue more than doubled behind strong user growth, while Twitter Inc. reported sales surged 74% behind increased advertising.

Facebook:

Advertising revenue growth in the second quarter of 2021 was driven by a 47% year-over-year increase in the average price per ad and a 6% increase in the number of ads delivered. Similar to the second quarter, we expect that advertising revenue growth will be driven primarily by year-over-year advertising price increases during the rest of 2021.
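Compounded, those two figures work out to roughly 1.47 × 1.06 ≈ 1.56, or about 56% year-over-year ad revenue growth, most of it from higher prices.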

Elizabeth Culliford and Nivedita Balu, Reuters:

Facebook Inc said on Wednesday it expects revenue growth to “decelerate significantly,” sending the social media giant’s shares down 3.5% in extended trading even as it reported strong ad sales.

[…]

Facebook said it expects Apple’s recent update to its iOS operating system to impact its ability to target ads and therefore ad revenue in the third quarter. The iPhone maker’s privacy changes make it harder for apps to track users and restrict advertisers from accessing valuable data for targeting ads.

Facebook said much the same thing in its earnings press release last quarter. Perhaps its advertising revenues will begin to be impacted by App Tracking Transparency after all, but it seems likely that the feature will benefit the online advertising duopoly. In this riskier climate, advertisers seem to be favouring the known quantities of Google and Facebook. I will repeat what I wrote in April:

As is often the case for stories about privacy changes — whether regulatory or at a platform level — much of the coverage about App Tracking Transparency has been centred around its potential effects on the giants of the industry: Amazon, Facebook, and Google. But this may actually have a greater impact on smaller ad tech companies and data brokers. That is fine; I have repeatedly highlighted the surreptitious danger of these companies that are not household names. But Facebook and Google can adapt and avoid major hits to their businesses because they are massive — and they may, as Zuckerberg said, do even better. They are certainly charging more for ads.

Privacy should not be something that users must buy, nor should its violation be a key selling point. Privacy is something that should be there, for all of us, regardless of the device we use, the websites we visit, or the ad tech networks we unknowingly interact with.