Granular Private Data Is the Foundation of Targeted Advertising, Obviously

What people with Big Business Brains often like to argue about the unethical but wildly successful ad tech industry is that it is not as bad as it looks because your individual data does not have any real use or value. Ad tech vendors would not bother retaining such granular details because it is beneficial, they say, only in a more aggregated and generalized form.

The problem with this argument is that it keeps getting blown up by their demonstrable behaviour.¹ For a recent example, consider Avast, an antivirus and security software provider, which installed to users’ computers a web browser toolbar that promised to protect against third-party tracking but, in actual fact, was collecting browsing history for — and you are not going to believe this — third-party tracking and advertising companies on behalf of the Avast subsidiary Jumpshot. It was supposed to be anonymized but, according to the U.S. Federal Trade Commission, this “proprietary algorithm” was so ineffective that Avast managed to collect six petabytes of revealing browsing history between 2014–2020. Then, it sold access (PDF):

[…] For example, from May 2017 to April 2019, Jumpshot granted LiveRamp, a data company that specializes in various identity services, a “world-wide license” to use consumers’ granular browsing information, including all clicks, timestamps, persistent identifiers, and cookie values, for a number of specified purposes. […]

One agreement between LiveRamp and Jumpshot stated that Jumpshot would use two services: first, “ID Syncing Services,” in which “LiveRamp and [Jumpshot] will engage in a synchronization and matching of identifiers,” and second, “Data Distribution Services,” in which “LiveRamp will ingest online Client Data and facilitate the distribution of Client’s Data (i.e., data segments and attributes of its users associated with Client IDs) to third-party platforms for the purpose of performing ad targeting and measurement.” These provisions permit the targeting of Avast consumers using LiveRamp’s ability to match Respondents’ persistent identifiers to LiveRamp’s own persistent identifiers, thereby associating data collected from Avast users with LiveRamp’s data.

We know these allegations due to the FTC’s settlement — though, I should say, these claims have not been proven, because Avast paid a $16.5 million penalty and said it would not use any of the data it collected “for advertising purposes”. The caveat makes this settlement feel a little incomplete to me. While there are other ways aggregated personal data can be used, like in market research, it does not seem Avast and Jumpshot were all that careful about obtaining consent when this software was first rolled out. When they did, the results were predictable (PDF):

Respondents had direct evidence that many consumers did not want their browsing information to be sold to third parties, even when they were told that the information would only be shared in de-identified form. In 2019, when Avast asked users of other Avast antivirus software to opt-in to the collection and sale of de-identified browsing information, fewer than 50% of consumers did so.

I am interpreting “fewer than 50%” as “between 40–49%”; if 18% of users had opted in, I expect the FTC would have said “fewer than 20%”. Most people do not want to be tracked. For comparison, this seems to be at the upper end of App Tracking Transparency opt-in rates.

I noted the LiveRamp connection when I first linked to investigations of Avast’s deceptive behaviour, though it seems Wolfie Christl beat me to the punch in December 2019. Christl also pointed out Jumpshot’s supply of data to Lotame, something the FTC also objected to. LiveRamp’s whole thing is resolving audiences based on personal information, though it says it will not return this information directly. Still, this granular identity resolution is not the kind of thing most people would like to participate in. Even if they consent, it is unclear if they are fully aware of the consequences.

This is just one settlement but it helps illustrate the distribution and mingling of granular user data. Marketers may be restricted to larger audiences and it may not be possible to directly extract users’ personally identifiable information — though it is often trivial to do so. But it is not comforting to be told collected data is only useful as part of a broader set. First of all, it is not: there are existing albeit limited ways it is possible to target small numbers of people. Even if that were true, though, this highly specific data is the foundation of larger sets. Ad tech companies want to follow you as specifically and closely as they can, and there are only nominal safeguards because collecting it all is too damn valuable.

Well, and also how weird it is to be totally okay with collecting a massive amount of data with virtually no oversight or regulations so long as industry players pinky promise to only use some of it. ↥︎