The ‘A.I.’ Trust Crisis ⇥ simonwillison.net
Dropbox recently added some features, marketed as “A.I.”, which it turned on by default for people outside the European Economic Area, U.K., and — depending on which part of the page you read — maybe also Canada. In those named regions, it must be enabled by users. So you can imagine how, when people discovered a “third-party A.I.” toggle in their Dropbox settings switched to the “on” position, many were immediately concerned that all their stuff was being used to train third-party models.
Not only is this a setting which, according to Dropbox, is relevant only if you use its “A.I.” features; it is also protective of your work.
Here’s copy from that Dropbox preference box, talking about their “third-party partners” — in this case OpenAI:
Your data is never used to train their internal models, and is deleted from third-party servers within 30 days.
It’s increasingly clear to me that people simply don’t believe OpenAI when they’re told that data won’t be used for training.
What’s really going on here is something deeper, then: AI is facing a crisis of trust.
This is a very good article, comparing mistrust of machine learning to the ad-tech-is-listening-through-your-microphone fiction. In both cases, the available evidence — or lack thereof — for the theory is weighed against decades of industry deregulation and the corporate malfeasance that followed.
Willison suggests OpenAI and other machine learning companies should provide more information about how they train their models, which I think would be helpful. But that does not go far enough because, as Willison writes, people do not believe the overwhelming evidence against the microphone spying theory, either. How much of this is human nature and how much is learned behaviour is well beyond anything I can write about, but it does seem to be something which can be coaxed in one direction or another by setting expectations. There ought to be a legal privacy framework that gives people the confidence their data is not being misused, and there ought to be a way to audit the entire chain. Does your Dropbox-hosted document really get removed from OpenAI’s servers within a month after you use a summarization feature? It would be nice if there were some way to confirm that — for example, I feel confident that a connection between web services has been severed when I delete the relevant OAuth token.
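To make that OAuth comparison concrete, token revocation is a standardized action with an observable result. The sketch below assumes a hypothetical provider at provider.example and made-up client credentials; only the revocation mechanism itself, from RFC 7009, is real.

    import requests

    # Hypothetical endpoint and credentials, for illustration only.
    REVOCATION_URL = "https://provider.example/oauth/revoke"
    RESOURCE_URL = "https://provider.example/api/me"
    CLIENT_ID = "my-client-id"
    CLIENT_SECRET = "my-client-secret"

    def revoke_token(token: str) -> None:
        """Revoke an OAuth token per RFC 7009."""
        resp = requests.post(
            REVOCATION_URL,
            data={"token": token, "token_type_hint": "access_token"},
            auth=(CLIENT_ID, CLIENT_SECRET),
            timeout=10,
        )
        # RFC 7009 servers return HTTP 200 on success, even if the
        # token was already invalid, so any error is worth surfacing.
        resp.raise_for_status()

    def token_still_works(token: str) -> bool:
        """Probe a protected resource; a revoked token should be rejected."""
        probe = requests.get(
            RESOURCE_URL,
            headers={"Authorization": f"Bearer {token}"},
            timeout=10,
        )
        return probe.status_code not in (401, 403)

After revocation succeeds, any further use of the token visibly fails. There is no equivalent probe for whether a document has actually been purged from a third party’s servers; you simply have to take the company’s word for it.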
This is important to me. Longtime readers know I am someone who cares deeply about privacy. It is understandable why people were scared: there is deservedly little trust in corporations, this toggle was poorly labelled, and Dropbox did not effectively communicate this addition. This is why it is vital for privacy legislation to be comprehensive and to give people confidence. Not only would it be, you know, effective, it should also reduce the spread of these kinds of theories when they are unfounded, and give relevant authorities the ability to investigate when there is a legitimate problem.
People should be able to trust systems and institutions, and systems and institutions need to give people reasons to trust them. The correct thing to do is to repair these critical parts of society, not tear them down or prevent them from being built in the first place.