Google Sues SerpApi ⇥ seroundtable.com
Barry Schwartz, Search Engine Roundtable:
On Friday, Google announced it had filed a lawsuit (PDF) against SerpApi for scraping the Google search results. Google alleges that SerpApi is running an “unlawful” operation that bypasses Google’s security measures to scrape search results at an astonishing scale.
[…]
Google claims SerpApi uses hundreds of millions of fake search requests to mimic human behavior. This allows them to bypass CAPTCHAs and other automated defenses that Google uses to prevent bots from overwhelming its systems.
In October, as part of its lawsuit against Perplexity, Reddit sued SerpApi and a couple of other scraper companies. Figuring out the difference between the ostensibly bad kind of scraping practiced by SerpApi and the good kind practiced by Google seems like it will require a narrow definition, one Google is happy to provide.
Halimah DeLaine Prado, Google’s general counsel:
Google follows industry-standard crawling protocols, and honors websites’ directives over crawling of their content. Stealthy scrapers like SerpApi override those directives and give sites no choice at all. SerpApi uses shady back doors — like cloaking themselves, bombarding websites with massive networks of bots and giving their crawlers fake and constantly changing names — circumventing our security measures to take websites’ content wholesale. This unlawful activity has increased dramatically over the past year.
This explanation is not wrong, per se, though it is quite self-serving. The way many people begin their search for a product, service, or local business is with Google. A typical website owner is therefore desperate for a Google link, to the extent that they will reconstruct their site on a regular basis to suit its shifting ranking criteria. That means Google has broad power to do basically whatever it wants to the web. If publishers wanted to rank highly in search results, they were required to adopt the company’s proprietary fork of HTML. It can inject links, scrape third parties, and build a self-preferencing silo, and website owners have to be okay with it or lose valuable referral traffic from Google users.
All of that is nominally ethical in Google’s view. What is not, apparently, is a company using workarounds to get a window into Google’s practices. I sympathize with that argument. The only tool we have is robots.txt and, regardless of SerpApi’s intent, I do not think circumvention efforts should be tolerated, though that should be paired with aggressive antitrust action to prevent incumbent powers from abusing their position.
Recent rulings by U.S. courts, for example, have found Google illegally maintained its search monopoly. In issuing proposed remedies earlier this year, the judge noted the rapidly shifting world of search thanks to the growth of generative artificial intelligence products. “OpenAI” is mentioned (PDF) thirty times as an example of a potential disruptor. However, the judge does not mention that OpenAI’s live search data is at least partially powered by SerpApi.