Google’s Nightmare ‘Web Integrity API’ ⇥ arstechnica.com

Ben Wiser, et al., of Google, proposing a new Web Integrity standard:

The trust relationship between websites and clients is frequently established through the collection and interpretation of highly re-identifiable information. However, the signals that are considered essential for these safety use cases can also serve as a near-unique fingerprint that can be used to track users across sites without their knowledge or control.

We would like to explore whether a lower-entropy mechanism – Web Environment Integrity – could help address these use cases with better privacy respecting properties.

The goals of this project are laudable. Between this and features like passkeys, you can imagine browsing a web where you are less frequently challenged to prove your identity and, when that happens, it is less interruptive.

Unfortunately, the likely reality is more worrisome. Since awareness of this API began bubbling up early last week, I have read and re-read Google’s proposal trying to make sense of it. I could not quite hit on an explanation that resonated with me.

Then I read Ron Amadeo’s summary, for Ars Technica, and it made complete sense — it is DRM for the web:

Google’s plan is that, during a webpage transaction, the web server could require you to pass an “environment attestation” test before you get any data. At this point your browser would contact a “third-party” attestation server, and you would need to pass some kind of test. If you passed, you would get a signed “IntegrityToken” that verifies your environment is unmodified and points to the content you wanted unlocked. You bring this back to the web server, and if the server trusts the attestation company, you get the content unlocked and finally get a response with the data you wanted.
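To make the flow Amadeo describes concrete, here is a toy simulation. It is a sketch under loud assumptions, not the proposal's actual API: the function names, the token fields, and the shared HMAC key are all hypothetical, and a real scheme would presumably use public-key signatures so the web server never holds the attester's secret.

```python
import hashlib
import hmac
import json
from typing import Optional

# Hypothetical secret, shared between attester and server only to keep the toy short.
ATTESTER_KEY = b"demo-attester-key"


def attester_issue_token(environment_ok: bool, content_id: str) -> Optional[dict]:
    """Toy attester: sign a token only if the environment passes its test."""
    if not environment_ok:
        return None
    # The token names the content it unlocks, as in the description above.
    payload = json.dumps({"content": content_id, "verdict": "unmodified"}, sort_keys=True)
    signature = hmac.new(ATTESTER_KEY, payload.encode(), hashlib.sha256).hexdigest()
    return {"payload": payload, "signature": signature}


def server_respond(token: Optional[dict], content_id: str) -> str:
    """Toy web server: release content only for a valid token naming that content."""
    if token is None:
        return "blocked"
    expected = hmac.new(ATTESTER_KEY, token["payload"].encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, token["signature"]):
        return "blocked"  # signature does not come from the trusted attester
    if json.loads(token["payload"])["content"] != content_id:
        return "blocked"  # token was issued for different content
    return "content"
```

In this sketch, a browser that fails the attester's test never receives a token, so the server has nothing to verify and withholds the page. That is the gatekeeping step the DRM comparison rests on.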

If this comes to pass and becomes popular, it will be a sad day for the open web. Unfortunately, Google has both the incentive to release it and the position to standardize it. To wit, the first argument Wiser, et al., make for why user integrity is important:

Users like visiting websites that are expensive to create and maintain, but they often want or need to do it without paying directly. These websites fund themselves with ads, but the advertisers can only afford to pay for humans to see the ads, rather than robots. This creates a need for human users to prove to websites that they’re human, sometimes through tasks like challenges or logins.

It goes without saying, so I will say it anyway, that publishers and advertisers want to ensure humans look at ads. Google does as well. As the world’s largest online ad company, perhaps criminally so, Google has to walk a fine line between doing right by its advertising customers and doing right by the users of its web browser, which just so happens to be the world’s most popular. I am not calling conspiracy here, but I do think the alignment of objectives is noteworthy.

Besides, there are plenty of other reasons to differentiate legitimate human traffic from illegitimate and automated traffic. That is why the CAPTCHA exists. And Wiser does write that, should this standard be adopted, users ought to be able to view pages even if they do not pass the Web Integrity check.

But what that could look like in practice is for unauthenticated users to be aggressively challenged by login prompts and CAPTCHAs. If website operators can be confident the vast majority of their user base can be validated by Web Integrity attestation, they could adjust the threshold for robot detection to present more CAPTCHAs more of the time. As the purveyor of the world’s most popular search engine, video streaming site, email provider, and maps product, Google is positioned perfectly to force adoption, similar to how it strong-armed publishers into using AMP.¹

More worrisome is what this means for the open web. While the native app marketplaces on some platforms have rules about what is permissible, the web is entirely free on those same devices. Google’s proposal makes the web less open and less free.

Update: Tim Perry:

Of course, Google isn’t the first to think of this, but in fact they’re not even the first to ship it. Apple already developed & deployed an extremely similar system last year, now integrated into MacOS 13, iOS 16 & Safari, called “Private Access Tokens”: […]

[…]

That said, it’s not as dangerous as the Google proposal, simply because Safari isn’t the dominant browser. Right now, Safari has around 20% market share in browsers (25% on mobile, and 15% on desktop), while Chrome is comfortably above 60% everywhere, with Chromium more generally (Brave, Edge, Opera, Samsung Internet, etc) about 10% above that.

Apple’s lower market share could explain why I see so many more CAPTCHAs and have more trouble accessing pages when using iCloud Private Relay and Safari. It is a crummy little preview of what the web looks like if this technology becomes an expectation.

¹ When I use Google to search the web instead of DuckDuckGo, it is usually because I am combing through its more extensive results for something specific. I often use advanced search operators. This is something Google is especially sensitive about: if I repeatedly search a website by using the site: or inurl: operators, I will see a CAPTCHA for just about each page of search results. I am picturing that, but for every few YouTube videos I watch.