Assigning Copyright Liability for the Output of ‘A.I.’ Systems

Camilla Hodgson, Financial Times (syndicated at Ars Technica):

AI models are “trained” on data, such as photographs and text found on the internet. This has led to concern that rights holders, from media companies to image libraries, will make legal claims against third parties who use the AI tools trained on their copyrighted data.

The big three cloud computing providers have pledged to defend business customers from such intellectual property claims. But an analysis of the indemnity clauses published by the cloud computing companies shows that the legal protections only extend to the use of models developed by or with oversight from Google, Amazon and Microsoft.

Ira Rothken, Techdirt:

Here’s the crux: the LLM itself can’t predict the user’s intentions. It simply processes patterns based on prompts. The LLM learning machine and idea processor shouldn’t be stifled due to potential user misuse. Instead, in the rare circumstances when there is a legitimate copyright infringement, users ought to be held accountable for their prompts and subsequent usage and give the AI LLM “dual use technology” developers the non-infringing status of the VCR manufacturer under the Sony Doctrine.

It seems there are two possible points of copyright infringement: input and output. I find the latter so much more interesting.

To me, it seems to depend on how much of a role machine learning models play in determining what is produced, and I find that fascinating. These models have been marketed as true artificial intelligence but, in their defence, are often compared to photocopiers, and there is a yawning chasm between those perspectives. It makes sense for Xerox to bear zero responsibility if someone uses one of its machines to duplicate an entire book. Taking it up a notch, I have no idea whether a printer manufacturer might be found culpable for enabling currency counterfeiting (I am not a lawyer), but it is noteworthy that anti-duplication measures have been present in scanners and printers for decades, yet Bloomberg reported in 2014 that around 60% of fake U.S. currency was made on home-style printers.

But those are examples of strict duplication: these devices have very little in the way of a brain, and the same is true of a VHS recorder. Large language models and other forms of generative “intelligence” are a little different. Somewhere, something like a decision happens. It seems plausible an image generator could produce a result uncomfortably close to a specific visual style without direct prompting from the user, or it could clearly replicate a specific work. In that case, is it the fault of the user or the program, even if the output goes unused and mostly unseen?

To emphasize again: I am not a lawyer, while Rothken is, so I am just talking out of my butt. All I want to highlight is that these tools raise some interesting questions. Fascinating times ahead.