NYT: Apple Explores Licensing Deals With Publishers for Large Language Model Training nytimes.com

Benjamin Mullin and Tripp Mickle, New York Times:

Apple has opened negotiations in recent weeks with major news and publishing organizations, seeking permission to use their material in the company’s development of generative artificial intelligence systems, according to four people familiar with the discussions.

This is very different from the way existing large language models have been trained.

Kali Hays, of Business Insider, in November:

Most tech companies seemed to agree that being required to pay for the huge amounts of copyrighted material scraped from the internet and used to train large language models behind AI tools like Meta’s Llama, Google’s Bard, and OpenAI’s ChatGPT would create an impossible hurdle to develop the tech.

“Generative AI models need not only a massive quantity of content, but also a large diversity of content,” Meta wrote in its comment. “To be sure, it is possible that AI developers will strike deals with individual rights holders, to develop broader partnerships or simply to buy peace from the threat of litigation. But those kinds of deals would provide AI developers with the rights to only a minuscule fraction of the data they need to train their models. And it would be impossible for AI developers to license the rights to other critical categories of works.”

If licensing published materials were a requirement for training large language models, it would limit the viability of those models to companies that could afford the significant expense. Mullin and Mickle report Apple is offering “at least $50 million”. Then again, large technology companies are already backing the “A.I.” boom.

Mullin and Mickle:

The negotiations mark one of the earliest examples of how Apple is trying to catch up to rivals in the race to develop generative A.I., which allows computers to create images and chat like a human. […]

Tim Bradshaw, of the Financial Times, as syndicated by Ars Technica:

Apple’s latest research about running large language models on smartphones offers the clearest signal yet that the iPhone maker plans to catch up with its Silicon Valley rivals in generative artificial intelligence.

The paper, entitled “LLM in a Flash,” offers a “solution to a current computational bottleneck,” its researchers write.

Both writers frame this as Apple needing to “catch up” to Microsoft — which licenses generative technology from OpenAI — Meta, and Google. But surely this year has demonstrated both how exciting this technology is and how badly some of these companies have fumbled their use of it — from misleading demos to “automated bullshit”. I have no idea how Apple’s entry will fare in comparison, but it may, in retrospect, look wise to have dodged this kind of embarrassment and the legal questions hanging over today’s examples.