The Carelessness of Perplexity

Alex Heath, of The Verge, spoke with Aravind Srinivas, CEO of Perplexity, earlier this week, and they had quite the conversation.

Many publishers have been upset with you for scraping their content. You’ve started cutting some of them checks. Do you feel like you’re in a good place with publishers now, or do you feel there’s still more work to be done?

I’m sure there’s more work to be done, but it’s in a way better place than it was last time we spoke. We are scraping but respecting robots.txt. We only use third-party data providers for anything that doesn’t allow us to scrape.

Heath has no follow-up, no request for clarification, nothing, so I am not sure if I am reading this right, but I think I am. Srinivas here says that Perplexity’s own scraper respects website owners’ permissions but, where it is disallowed, the company gets the same data through third parties. If a website owner does not want their data used by Perplexity, they must disallow Perplexity’s scraper plus every single scraper that might sell to Perplexity. That barely resembles “respecting robots.txt”.

Again, I could have this wrong, but Heath does not bother to clarify this response.
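
To make the robots.txt point concrete, here is a minimal sketch using Python’s standard library. It assumes a site that blocks Perplexity’s crawler by name and allows everyone else. The robots.txt contents are my own example, and “ExampleBrokerBot” is a made-up name standing in for any third-party scraper; nothing here describes how Perplexity or its data providers actually operate.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: blocks one named crawler, allows all others.
ROBOTS_TXT = """\
User-agent: PerplexityBot
Disallow: /

User-agent: *
Allow: /
"""


def who_may_fetch(robots_txt, agents, url):
    """Return {agent: True/False} for whether each agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


if __name__ == "__main__":
    # "ExampleBrokerBot" is an invented user agent, used only for illustration.
    agents = ["PerplexityBot", "ExampleBrokerBot"]
    result = who_may_fetch(ROBOTS_TXT, agents, "https://example.com/article")
    for agent, allowed in result.items():
        print(agent, "allowed" if allowed else "blocked")
```

Under these rules the named crawler is refused while the invented broker is let through, so the same page remains reachable by anyone the site owner did not think to name. Closing that gap means listing every such scraper individually, or disallowing all crawlers outright.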

Perplexity is currently working on its own web browser, Comet, and has signalled interest in buying Chrome should Google be forced to divest it. Srinivas describes the browser as a “containerized operating system” and explains the company’s thinking in response to a question about ChatGPT’s Memory feature:

Our strategy is to allow people to stay logged in where they are. We’re going to build a browser, and that’s how we’ll access apps on behalf of the user on the client side.

I think memory will be won by the company that has the most context. ChatGPT knows nothing about what you buy on Instagram or Amazon. It also knows nothing about how much time you spend on different websites. You need to have all this data to deeply personalize for the user. It’s not about who rolls out memory based on the retrieval of past queries. That’s very simple to replicate.

If you are a money person, there is a logical next step to this, which Srinivas revealed on a small podcast with a couple of finance bros. They asked a question that just so happens to promote one of their sponsors: “how are you thinking about advertising in the context of search? […] Is there a future where, if you search for ‘what’s the best corporate card?’, Ramp is going to show up at the top if they bid on that?” Srinivas responded “hopefully not” before going on to explain how Perplexity could eventually become ad-supported.

Julie Bort, TechCrunch:

“On the other hand, what are the things you’re buying; which hotels are you going [to]; which restaurants are you going to; what are you spending time browsing, tells us so much more about you,” he explained.

Srinivas believes that Perplexity’s browser users will be fine with such tracking because the ads should be more relevant to them.

“We plan to use all the context to build a better user profile and, maybe you know, through our discover feed we could show some ads there,” he said.

These are comments on a podcast and perhaps none of this will come to pass, but anyone can see how this is financially alluring. The “business-friendly” but privacy-hostile environment of the U.S. means companies like Perplexity can do this stuff with impunity. Its pitch sounds revolting now, exactly how Google’s behaviourally targeted ads sounded twenty years ago.

Perplexity is another careless business. It does not care if a website has specifically prohibited it from scraping; Perplexity will simply rely on a third-party scraper. Perplexity does not care about your privacy. I see no reason to treat this as a problem specific to individual companies, and these technologies do not respect geographic boundaries, either. We need better protections as users, which means more policy-based protections by governments taking privacy seriously.

But this industry is moving too fast. It is a “race”, after all, and any attempts to regulate it are either knocked down or compromised. There is a real need for lawmakers and regulators who care about privacy as a fundamental human right. These companies do not care and will not regulate themselves.