Carolyn Jones, the Markup:

Google has recently made the visual search tool easier to use on the company’s Chrome browser. When users click on an icon in the Google search bar, a moveable bubble pops up. Wherever the bubble is placed, a sidebar appears with an artificial intelligence answer, description, explanation or interpretation of whatever is inside the bubble. For students, it provides an easy way to cheat on digital tests without typing in a prompt, or even leaving the page. All they have to do is click.

“I couldn’t believe it,” said teacher Dustin Stevenson. “It’s hard enough to teach in the age of AI, and now we have to navigate this?”

As browsers are increasingly augmented with A.I. features, I expect to see more stories like this one. In Google’s case, it is particularly egregious as the company’s Chromebooks are widely used in education.

Corporate Europe Observatory:

Over the past year, tech industry lobby groups have used their lavish budgets to aggressively push for the deregulation of the EU’s digital rulebook. The intensity of this policy battle is also reflected in the fact that Big Tech companies have on average more than one lobby meeting per day with EU Commission officials.

This lobbying offensive appears to be paying off. Recently, a string of policy-makers have called for a pause of the Artificial Intelligence Act, and there is also a concerted push to weaken people’s data protection rights under the GDPR. Moreover, the EU’s Digital Markets Act (DMA) and Digital Services Act (DSA) are being constantly challenged by Big Tech, including via the Trump administration.

Jack Power, Irish Times:

Meta, the company that owns Facebook, Instagram and WhatsApp, has privately tried to convince the Irish Government to lead a pushback against data protection laws at European Union level, correspondence shows.

[…]

“We believe that the EU’s data protection and privacy regimes require a fundamental overhaul and that Ireland has a very important and meaningful role to play in achieving this,” she [Meta’s Erin Egan] wrote.

Thanks to years of being a tax haven, Ireland has now found itself in a position of unique responsibility. For example:

Ms Egan said Meta has been going back and forth with regulators about the company’s plans to train its AI models using public Facebook and Instagram posts.

An effective green light from the Data Protection Commission, which enforces data and privacy laws in Ireland, was a “welcome step”, she wrote.

The Commission made a series of recommendations giving E.U. citizens more control over the user data Meta is using to train its A.I. models. Still, it means user data on Meta platforms is being used to train A.I. models. While groups in Ireland and Germany objected to those plans, courts seemed largely satisfied with the controls and protections the DPC mandated, controls so basic this article calls them an “effective green light”.

Though it is apparently satisfied with the outcome, Meta does not want even that level of scrutiny. It wants to export its U.S.-centric view of user privacy rights — that is, that they are governed only by whatever Meta wants to jam into its lengthy terms of service agreements — around the world. I know lobbying is just something corporations do and policymakers are expected to consider their viewpoints. On the other hand, Meta’s entire history of contempt toward user privacy ought to be disqualifying. The correct response to Meta’s letter is to put it through a shredder without a second thought.

Matt O’Brien, Associated Press:

Social media platform Reddit sued the artificial intelligence company Perplexity AI and three other entities on Wednesday, alleging their involvement in an “industrial-scale, unlawful” economy to “scrape” the comments of millions of Reddit users for commercial gain.

[…]

Also named in the lawsuit are Lithuanian data-scraping company Oxylabs UAB, a web domain called AWMProxy that Reddit describes as a “former Russian botnet,” and Texas-based startup SerpApi, which lists Perplexity as a customer on its website.

Mike Masnick, Techdirt:

Most reporting on this is not actually explaining the nuances, which require a deeper understanding of the law, but fundamentally, Reddit is NOT arguing that these companies are illegally scraping Reddit, but rather that they are illegally scraping… Google (which is not a party to the lawsuit) and in doing so violating the DMCA’s anti-circumvention clause, over content Reddit holds no copyright over. And, then, Perplexity is effectively being sued for linking to Reddit.

This is… bonkers on so many levels. And, incredibly, within their lawsuit, Reddit defends its arguments by claiming it’s filing this lawsuit to protect the open internet. It is not. It is doing the exact opposite.

I am glad Masnick wrote about this despite my disagreement with his views on how much control a website owner ought to have over scraping. This is a necessary dissection of the suit, though I would appreciate views on it from actual intellectual property lawyers. They might be able to explain whether a positive outcome of this case for Reddit would yield clear rules delineating this conduct from the ways in which artificial intelligence companies have so far benefitted from a generous reading of fair use and terms of service documents.

Tabby Kinder in New York and George Hammond, Financial Times:

OpenAI has signed about $1tn in deals this year for computing power to run its artificial intelligence models, commitments that dwarf its revenue and raise questions about how it can fund them.

Emily Forgash and Agnee Ghosh, Bloomberg:

For much of the AI boom, there have been whispers about Nvidia’s frenzied dealmaking. The chipmaker bolstered the market by pumping money into dozens of AI startups, many of which rely on Nvidia’s graphics processing units to develop and run their models. OpenAI, to a lesser degree, also invested in startups, some of which built services on top of its AI models. But as tech firms have entered a more costly phase of AI development, the scale of the deals involving these two companies has grown substantially, making it harder to ignore.

The day after Nvidia and OpenAI announced their $100 billion investment agreement, OpenAI confirmed it had struck a separate $300 billion deal with Oracle to build out data centers in the US. Oracle, in turn, is spending billions on Nvidia chips for those facilities, sending money back to Nvidia, a company that is emerging as one of OpenAI’s most prominent backers.

I possess none of the skills most useful for understanding what all of this means. I am not an economist; I did not have a secret life as an investment banker. As a layperson, however, it is not comforting to read from some People With Specialized Knowledge that this is similar to historically good circular investments, just at an unprecedented scale, while other People With Specialized Knowledge say this has been the force preventing the U.S. from entering a recession. These articles might be like one of those prescient papers from before the Great Recession. Not a great feeling.

Last year, Robb Knight figured out how Perplexity, an artificial intelligence search engine, was evading instructions not to crawl particular sites. Knight learned that Perplexity’s engine would use an unlisted user agent to scrape summaries of pages on websites where Perplexity was blocked. In my testing, I found the summaries were outdated by hours-to-days, indicating to me the pages were not being actively visited as though guided by a user. Aravind Srinivas, CEO of Perplexity, told Mark Sullivan, of Fast Company, it was the fault of a third-party crawler and denied wrongdoing.

This dispute was, I think, a clear marker in a debate concerning what control website owners have — or ought to have — over access to and interpretation of their websites, an issue that was recently re-raised in an article by Mike Masnick of Techdirt. Masnick explores scraper gating services offered by Cloudflare and Reddit’s blocking of the Internet Archive, and concludes the web is being cleaved in two:

There are plenty of reasons to be concerned about LLM/AI tools these days, in terms of how they can be overhyped, how they can be misused, and certainly over who has power and control over the systems. But it’s deeply concerning to me how many people who supported an open internet and the fundamental principles that underlie that have now given up on those principles because they see that some AI companies might benefit from an open internet.

The problem isn’t just ideological — it’s practical. We’re watching the construction of a fundamentally different internet, one where access is controlled by gatekeepers and paywalls rather than governed by open protocols and user choice. And we’re doing it in the name of stopping AI companies, even though the real result will be to concentrate even more power in the hands of those same large tech companies while making the internet less useful for everyone else.

This is a passionately argued article about a thorny issue. I, too, am saddened by an increasingly walled-off web, whether through payment gates or the softer barriers of login or email subscriptions. Yet Masnick misses the mark in ways I think he is usually more careful about.

In the second quoted paragraph above, for example, Masnick laments an internet “governed [less] by open protocols and user choice” than “controlled by gatekeepers”. These are presented as opposing qualities, but they are in fact complementary. Open protocols frequently contain specifications for authentication, allowing users and administrators to limit access. Robots.txt is an open standard that is specifically intended to communicate access rules. Thus, while an open web is averse to centralization and proprietary technologies, it does not necessarily mean a porous web. The open web does not necessarily come without financial cost to human users. I see no reason the same principle should not be applied to robots, too.
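To make that concrete, here is a minimal sketch of a robots.txt file expressing per-robot access rules without closing the site to human visitors. The user agent tokens are ones the respective companies have publicly documented, though any administrator should verify them before relying on this:

# Block OpenAI’s training crawler from the whole site.
User-agent: GPTBot
Disallow: /

# Opt out of Google’s A.I. training without affecting Google Search.
User-agent: Google-Extended
Disallow: /

# Every other robot may crawl normally.
User-agent: *
Disallow:

Whether crawlers actually honour these rules is, of course, a separate question, and one I return to below.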

Masnick:

This illustrates the core problem: we’re not just blocking bulk AI training anymore. We’re blocking legitimate individual use of AI tools to access and analyze web content. That’s not protecting creator rights — that’s breaking the fundamental promise of the web that if you publish something publicly, people should be able to access and use it.

Masnick is entirely correct: people should be able to access and use it. They should be able to use any web browser they like, with whatever browser extensions and user scripts they desire. That does not necessarily extend to machines. The specific use case Masnick is concerned with is that he uses Lex as a kind of editorial verification step. When he references some news sites, however, Lex is blocked from reading them and therefore cannot provide notes on whether Masnick’s interpretation of a particular article is accurate. “I’m not trying to train an A.I. on those articles”, Masnick writes. “I’m just asking it to read over the article, read over what I’ve written, and give me a sense” if they jibe.

That may well be the case, but the blame for mistrust lies squarely with artificial intelligence companies. The original sin of representatives of this industry was to believe they did not require permission to ingest a subset of the corpus of human knowledge and expression, nor did they need to offer compensation. They did not seem to draw hard ethical lines around what they would consume for training, either — if it was publicly available, it could become part of their model. Anthropic and Meta both relied on materials available at LibGen, many of which are hosted without permission. A training data set included fan-made subtitles, which can be treated as illicit derivative works. I cannot blame any publisher for treating these automated visitors as untrustworthy or even hostile because A.I. companies have sabotaged attempts at building trust. Some seem to treat the restrictions of a robots.txt file as mere suggestions to be worked around. How can a publisher be confident the user-initiated retrieval of their articles, as Masnick is doing, is not used for training in any way?

Masnick is right, however, to be worried about how this is bifurcating the web. Websites like 404 Media have explicitly cited A.I. scraping as the reason for imposing a login wall. A cynical person might view this as a convenient excuse to collect ever-important email addresses and, while I cannot disprove that, it is still a barrier to entry. Then there are the unintended consequences of trying to impose limits on scraping. After Reddit announced it would block the Internet Archive, probably to comply with some kind of exclusivity expectations in its agreements with Google and OpenAI, it implied the Archive does not pass along the robots.txt rules of the sites in its collection. If a website administrator truly does not want the material on their site to be used for A.I. training, they would need to prevent the Internet Archive from scraping as well — and that would be a horrible consequence.

Of course, Reddit does not block A.I. scraping on principle. It appears to be a contractual matter, where third parties pay the company some massive amount of money for access. Anthropic’s recently proposed settlement supposed a billion-and-a-half dollars would sufficiently compensate authors of the books it pirated. M.G. Siegler called this “pulling up a drawbridge” by setting a high cost floor that will lock out insufficiently funded competitors. Masnick worries about the same thing, predicting the ultimate winners of this will be “the same large tech companies that can afford licensing deals and that have the resources to navigate an increasingly complex web of access restrictions”.

To be sure, intellectual property law is a mess, and encouraging copyright maximalism will have negative consequences. The U.S. already has some of the longest copyright protections in the world, which have unfortunately spilled into Canada thanks to trade agreements. But A.I. organizations have not created a bottom-up rebellious exploration of the limits of intellectual property law. They are big businesses with deep pockets exploiting decades of news, blogging, photography, video, and art. Nobody, as near as makes no difference, expected something they published online would one day feed the machines that now produce personalized Facebook slop.

Masnick acknowledges faults like these in his conclusion, but I do not think his proposed solutions are very strong:

None of this means we should ignore legitimate concerns about AI training or creator compensation. But we should address those concerns through mechanisms that preserve internet openness rather than destroy it. That might mean new business models, better attribution systems, or novel approaches to creator compensation. What it shouldn’t mean is abandoning the fundamental architecture of the web.

The “new business models” and “better attribution systems” are not elucidated here, but the compensation pitch seems like a disaster in the making to me. That pitch also comes from Masnick; here is the nut of his explanation:

But… that doesn’t mean there isn’t a better solution. If the tech companies need good, well-written content to fill their training systems, and the world needs good, high-quality journalism, why don’t the big AI companies agree to start funding journalists and solve both problems in one move?

What Masnick proposes is that A.I. companies could pay journalists to produce new articles for their training data. Respectfully, this would be so insubstantial as to be worthless. To train their models, A.I. companies are ingesting millions of websites, tens of millions of YouTube videos, hundreds of thousands of books, and probably far more — the training data is opaque. It is almost like a perverse version of fair use. Instead of a small amount of an existing work becoming the basis of a larger body of work — like the quotes I am using and attributing in this article — this is a massive library of fully captured information. Any single piece is of little consequence to the whole, but the whole does not work as well without all those tiny pieces.

The output of a single journalist is inconsequential, an argument Masnick also makes: “[a]ny individual piece of content (or even 80k pieces of content) is actually not worth that much” in the scope of training a large language model. This appears near the beginning of the same piece he concludes by arguing we need “novel approaches to creator compensation”. Why would A.I. companies pay journalists to produce a microscopic portion of the words training their systems when they have historically used billions — perhaps trillions — of freebies? There are other reasons I can think of why this would not work, but this is the most obvious.

One thing that might help, not suggested by Masnick, is improving the controls available to publishers. Today marked the launch of the Really Simple Licensing standard, which offers publishers a way to define machine-readable licenses. These can be applied site-wide, sure, but also at a per-page level. It is up to A.I. companies to adhere to the terms but with an exception — there are ways to permit access to encrypted material. This raises concerns about a growing proliferation of digital rights management, bringing me back to Masnick’s reasonable concern about a web increasingly walled-off and accessible only to authorized visitors.

I am not saying I have better ideas; I appreciate that Masnick at least brought something to the table in that regard, as I have nothing to add. I, too, am concerned about dividing the web. However, I think publishers are coming at this from a reasonable place. This is not, as Masnick puts it, a “knee-jerk, anti-A.I. stance” to which publishers have responded with restrictions because “[i]f it hurts A.I. companies, it must be good”. A.I. companies largely did this to themselves by raising billions of dollars in funding to strip-mine the public web without permission and, ultimately, with scant acknowledgement. I believe information should be freer than it is, that intellectual property hoarding is wrong, and that we are better when we build on top of each other. That is a fine stance for information reuse by fellow human beings. However, the massive scale of artificial intelligence training comes with different standards.

In writing this article, I am acutely aware it will become part of a training data set. I could block those crawlers — I have blocked a few — but that is only partly the point. I simply do not know how much of the control I reclaim now will remain relevant in the future, and I am sure the same is true of any real media organization. I write here for you, not for the benefit of building the machines producing a firehose of spam, scams, and slop. The artificial intelligence companies have already violated the expectations of even a public web. Regardless of the benefits they have created — and I do believe there are benefits to these technologies — they have behaved unethically. Defensive action is the only control a publisher can assume right now.

At the time I wrote about the fundamentally dishonest complaints from Elon Musk about the App Store’s ranking of Grok, a lawsuit had not been filed. Two weeks later, though, Musk followed through.

Annie Palmer, CNBC:

Elon Musk’s xAI sued Apple and OpenAI on Monday, accusing the pair of an “anticompetitive scheme” to thwart artificial intelligence rivals.

The lawsuit, filed by Musk’s AI startup xAI and its social network business X, alleges Apple and OpenAI have “colluded” to maintain monopolies in the smartphone and generative AI markets.

Iain Thomson, the Register:

It accuses Apple of downgrading other AI apps in favor of ChatGPT. While the lawsuit acknowledges iPhones can use other AI engines, it claims that OpenAI competitors don’t get enough promotion.

The lawsuit cites the list of “Must-Have Apps” posted on Sunday, in which OpenAI was the only AI app listed. Also included were Tinder, Hinge, and Bumble. Musk’s lawyers claim that Cook & Co’s statement in the T&Cs that Apple’s store “is designed to be fair and free of bias,” is a lie.

There are many problems one can find in the App Store, Apple’s editorial process, and the way OpenAI seems to be everywhere. I think xAI is a bad plaintiff for this case, however. When I wrote that Musk’s frenzied posting on X was “dishonest”, what I meant was he was inventing or exaggerating controversy to boost the app’s rankings. At the time, it was unclear whether this strategy would work. On the day I published my commentary, Grok was fifth in the U.S. in overall free downloads. By the day this lawsuit was filed, it had declined steeply and fallen off the chart. Meanwhile, Google’s Gemini has climbed from placing in the mid-fifties in mid-August to third place today. Perplexity has grown from placing in the hundreds to twenty-fifth place today. (Sensor Tower does not allow me to create permalinks of those charts, so act fast.)

Of course, even though this information appears to invalidate the lawsuit’s claim (PDF) that “Apple has deprioritized the apps of super app and generative AI chatbot competitors, like the products offered by Plaintiffs, in its App Store rankings to favor OpenAI”, it will simply feed the persecution complex of xAI. And the lawsuit raises a good point: Apple should more urgently open up third-party A.I. integration, something it said it would do. This is going to be painful to watch.

Ashley Belanger, Ars Technica:

Authors revealed today that Anthropic agreed to pay $1.5 billion and destroy all copies of the books the AI company pirated to train its artificial intelligence models.

In a press release provided to Ars, the authors confirmed that the settlement is “believed to be the largest publicly reported recovery in the history of US copyright litigation.” Covering 500,000 works that Anthropic pirated for AI training, if a court approves the settlement, each author will receive $3,000 per work that Anthropic stole. “Depending on the number of claims submitted, the final figure per work could be higher,” the press release noted.

Foster Kamer, Futurism:

Kyle Chayka, a staff writer at The New Yorker whose work zeroes in on the intersection between technology, art, and culture, is the author of not one but two books that popped up in LibGen: 2024’s “Filterworld: How Algorithms Flattened Culture” and 2020’s “The Longing For Less: Living With Minimalism.” Also found in LibGen was the Italian translation of Filterworld. All in, he could stand to make upwards of $12K!

We asked Kyle: How does the sum of “$3,000 per class work” feel as a number given that his intellectual property was used to train an AI? Low, high, not worth it on principle, or about right?

“It should be a license, really,” he replied. “Because the training never goes away. So it could be $5,000 every 5 years, or $1,000 / year as long as they exist. But the price seems about right, honestly — a decent percentage of most book advances, and about the price of an institutional speaking gig.”

Yet another complication for the fair use arguments of generative A.I. companies, though one which was obviously undermined by using pirated data to begin with. Though I think it makes sense to focus on this case for now, the question looming over it is what precedent it sets. It does not singlehandedly eliminate the fair use argument for training on public information, but what about other illicitly reproduced information ingested into data sets?

Meta and now Apple are being sued in similar cases. If you are a published author, visit the case settlement website to register for your share of the pot.

Update: The judge in the case is not thrilled with this proposed settlement.

Katherine Bunt, Meredith McGraw, and Megan Bobrowsky, Wall Street Journal:

President Trump on Thursday led leaders of the world’s biggest technology companies in a version of his cabinet meetings, in which each participant takes a turn thanking and praising him, this time for his efforts to promote investments in chip manufacturing and artificial intelligence.

Present at the table were Sam Altman, Tim Cook, Sundar Pichai, David Sacks, and — immediately to Trump’s right — Mark Zuckerberg. Bill Gates was also there for some reason. Here is a fun exchange the Journal pulled from all the grovelling:

Trump also addressed Alphabet CEO Sundar Pichai about a federal judge’s ruling this week on an antitrust case related to Google’s monopoly in search. The judge levied relatively light penalties and rejected the most significant measures sought by the Justice Department, which filed the lawsuit in 2020.

“You had a very good day yesterday,” Trump said. “Do you want to talk about that big day you had yesterday?”

“I’m glad it’s over,” Pichai said.

“Biden was the one who prosecuted that lawsuit,” Trump said. “You know that, right?”

Beginning this section by reminding readers the suit was filed under the first Trump administration is a kind way of calling out the president’s flexible concepts of time and responsibility.

At least nobody gave him any solid gold statues this time, as far as I know.

Tripp Mickle and Don Clark, New York Times:

Echoing IBM, Microsoft in 1985 built its Windows software to run on Intel processors. The combination created the “Wintel era,” when the majority of the world’s computers featured Windows software and Intel hardware. Microsoft’s and Intel’s profits soared, turning them into two of the world’s most valuable companies by the mid-1990s. Most of the world’s computers soon featured “Intel Inside” stickers, making the chipmaker a household name.

In 2009, the Obama administration was so troubled by Intel’s dominance in computer chips that it filed a broad antitrust case against the Silicon Valley giant. It was settled the next year with concessions that hardly dented the company’s profits.

This is a gift link because I think this one is particularly worth reading. The headline calls it a “long, painful downfall”, but the remarkable thing about it is that it is short, if anything. Revenue is not always the best proxy for this, but the cracks began to show in the early 2010s when its quarterly growth contracted; a few years of modest growth followed before the company was clobbered from mid-2020 onward. Every similar company in tech seems to have made a fortune off the combined forces of the covid-19 pandemic and artificial intelligence except Intel.

Tobias Mann, the Register:

For better or worse, the US is now a shareholder in the chipmaker’s success, which makes sense given Intel’s strategic importance to national security. Remember, Intel is the only American manufacturer of leading edge silicon. TSMC and Samsung may be setting up shop in the US, but hell will freeze over before the US military lets either of them fab its most sensitive chips. Uncle Sam awarded Intel $3.2 billion to build that secure enclave for a reason.

Put mildly, the US government needs Intel Foundry and Lip-Bu Tan needs Uncle Sam’s cash to make the whole thing work. It just so happens that right now Intel isn’t in a great position to negotiate.

Mann’s skeptical analysis is also worth your time. There is good sense in the U.S. government holding an interest in the success of Intel. Under this president, however, it raises unique questions and concerns.

The top two stories on Techmeme right now are lies. What I mean by that is not that the reporters are lying, but that the stories themselves are fundamentally dishonest because of who and what they are about. The first is by Katherine Blunt, of the Wall Street Journal:

Artificial-intelligence startup Perplexity on Tuesday offered to purchase Google’s Chrome browser for $34.5 billion as it works to challenge the tech giant’s web-search dominance.

Perplexity’s offer is significantly more than its own valuation, which is estimated at $18 billion. The company told The Wall Street Journal that several investors including large venture-capital funds had agreed to back the transaction in full.

Perplexity will not be buying Chrome. Someone there is very good at getting press, but this is ridiculous.

The second story is by Surbhi Misra, of Reuters:

Billionaire Elon Musk said on Monday his artificial intelligence startup xAI would take legal action against Apple, accusing the iPhone maker of breaching antitrust regulations in managing App Store rankings.

“Apple is behaving in a manner that makes it impossible for any AI company besides OpenAI to reach #1 in the App Store, which is an unequivocal antitrust violation. xAI will take immediate legal action,” Musk said in a post on his social media platform X.

Not only was Musk’s claim debunked in a community note under his tweet, but Henry Chandonnet of Business Insider pointed out Grok topped the free apps chart in February. Also, the day before Musk tweeted this, he retweeted someone who said Grok was the top app in the Netherlands.

The whole gimmick was revealed less than two hours later when Musk, quote-tweeting one of his drooling sycophants, started a campaign to increase its App Store popularity. I have no idea whether it is working. Grok was not among the top free apps last Tuesday, but entered the chart on Wednesday; by Friday, it was number eight. On Sunday, it was number five. All of this happened pre-campaign. It has stayed in the number five slot since, probably in part because the App Store rankings are not immediate. Also, maybe ChatGPT is just really popular and well-known.

I cannot find a relevant lawsuit. I do think Apple should enforce its App Store rules since Grok is so unmoderated. This whole story is built on a foundation of lies to create what is effectively a viral marketing campaign for the permanently aggrieved.

Jay Peters, the Verge:

Reddit says that it has caught AI companies scraping its data from the Internet Archive’s Wayback Machine, so it’s going to start blocking the Internet Archive from indexing the vast majority of Reddit. The Wayback Machine will no longer be able to crawl post detail pages, comments, or profiles; instead, it will only be able to index the Reddit.com homepage, which effectively means Internet Archive will only be able to archive insights into which news headlines and posts were most popular on a given day.

Surely, this has something to do with Reddit’s decision to license the data created by its users, as Peters writes, but it also puts the Internet Archive in an uncomfortable middle seat with a massive trove of third-party data. Unfortunately for many publishers, the Archive seems to be perfectly happy with scrapers and is unbothered if its collection is used to train artificial intelligence. While the Wayback Machine preserves a copy of a website’s robots.txt file, any publisher serious about restricting A.I. training on their material must also block the Internet Archive for fear this could happen to them. That would be a terrible loss for all of us.

Mark Zuckerberg is not much of a visionary. He is ambitious, sure, and he has big ideas. He occasionally pops into the public consciousness to share some new direction in which he is taking his company — a new area of focus that promises to assert his company’s leadership in technology and society. But very little of it seems to bear fruit or be based on a coherent set of principles.

For example, because of its scale, Meta is running into limitations on its total addressable market based on global internet connectivity. It has therefore participated in several related projects, like measuring the availability of internet connectivity worldwide with the Economist, which has not been updated since 2022. In 2014, it acquired a company building a solar-powered drone to beam service to people in more remote locations; the project was cancelled in 2018. It made a robot to wrap fibre optic cable around existing power lines, which it licensed to Hibot in 2023; Hibot has nothing on its website about the robot.

It is not just Meta’s globe-spanning ambitions that have faltered. In 2019, Zuckerberg outlined a “privacy-focused vision for social networking” for what was then Facebook, the core tenets of which in no way conflict with the company’s targeted advertising business. Aside from the things I hope Facebook was already doing — data should be stored securely, private interactions should remain private, and so on — there were some lofty goals. Zuckerberg said the company should roll out end-to-end encrypted messaging across its product line; that it should add controls to automatically delete or hide posts after some amount of time; that its products should be extremely interoperable with those from third-parties. As of writing, Meta has added end-to-end encryption to Facebook Messenger and Instagram, but it is only on by default for Facebook. (WhatsApp was end-to-end encrypted by default already.) It has not added an automatic post deletion feature to Facebook or Instagram. Its apps remain stubbornly walled off. You cannot even sign into a third-party Mastodon app with a Threads account, even though it is amongst the newest and most interoperable offerings from Meta.

Zuckerberg published that when it was advantageous for the company to be seen as doing its part for user privacy. Similarly, when it was smart to advocate for platform safety, Zuckerberg was contrite:

But it’s clear now that we didn’t do enough. We didn’t focus enough on preventing abuse and thinking through how people could use these tools to do harm as well. That goes for fake news, foreign interference in elections, hate speech, in addition to developers and data privacy. We didn’t take a broad enough view of what our responsibility is, and that was a huge mistake. It was my mistake.

Then, when it became a good move to be brash and arrogant, Zuckerberg put on a gold chain and a million-dollar watch to explain how platform moderation had gone too far.

To be clear, Meta has not entirely failed with these initiatives. As mentioned, Threads is relatively interoperable, and the company defaulted to end-to-end encryption in Facebook Messenger in 2023. It said earlier this year it is spending $10 billion on a massive sub-sea cable, which is a proven technology that can expand connectivity more than a solar-powered drone could.

But I have so far not mentioned the metaverse. According to Zuckerberg, this is “an embodied internet where you’re in the experience, not just looking at it”, and it was worth pivoting the entire company to be “metaverse-first”. The company renamed itself “Meta”. Zuckerberg forecasted an “Altria moment” a few years prior and the press noticed. In announcing this new direction in 2021, Zuckerberg acknowledged it would be a long-term goal, though predicted it would be “mainstream in the next five to ten years”:

Our hope is that within the next decade, the metaverse will reach a billion people, host hundreds of billions of dollars of digital commerce, and support jobs for millions of creators and developers.

Granted, it has not been even four years since Zuckerberg made these announcements, but are we any closer to his company’s vision becoming mainstream? If you broaden the definition of “metaverse” to include all augmented and virtual reality products then, yes, it appears to be a growing industry. But the vision shown at Connect 2021 is scarcely anywhere to be found. We are not attending virtual concerts or buying virtual merch at virtual after-parties. I am aching to know how the metaverse real estate market is doing as I am unaware of anyone I know living in a virtual house.

As part of this effort, Meta announced in May 2022 it would support NFTs on Instagram. These would be important building blocks for the metaverse, the company said, “critical for how people will buy, use and share virtual objects and experiences” in the virtual environment it was building. Meta quickly expanded availability to Facebook and rolled it out worldwide. Then, in March 2023, it ended support for NFTs altogether, saying “[a]ny collectibles you’ve already shared will remain as posts, but no blockchain info will be displayed”.

Zuckerberg has repeatedly changed direction on what his company is supposed to stand for. He has plenty of ideas, sure, and they are often the kinds of things requiring resources in an amount only possible for a giant corporation like the one he runs. And he has done it again by dedicating Meta’s efforts to what he is calling — in a new manifesto, open letter, mission statement, or whatever this is — “personal superintelligence”.

I do have to take a moment to acknowledge the bizarre quality of this page. It is ostensibly a minimalist and unstyled document of near-black Times New Roman on a white background — very hacker, very serious. It contains about 3,800 characters, which should mean a document barely above four or five kilobytes, accounting for HTML tags and a touch of CSS. Yet it is over 400 kilobytes. Also, I love that keywords are defined:

<meta name="keywords" content="Personal 
Superintelligence, AI systems improvement, 
Superintelligence vision, Mark Zuckerberg 
Meta, Human empowerment AI, Future of 
technology, AI safety and risks, Personal
AI devices, Creativity and culture with 
AI, Meta AI initiatives">

Very retro.

Anyway, what is “superintelligence”? It is a reasonable question, and a term Zuckerberg does not define. I guess it is supposed to be something more than or different from artificial intelligence, which is yesterday’s news:

As profound as the abundance produced by AI may one day be, an even more meaningful impact on our lives will likely come from everyone having a personal superintelligence that helps you achieve your goals, create what you want to see in the world, experience any adventure, be a better friend to those you care about, and grow to become the person you aspire to be.

He decries competitors’ ambitions:

This is distinct from others in the industry who believe superintelligence should be directed centrally towards automating all valuable work, and then humanity will live on a dole of its output. At Meta, we believe that people pursuing their individual aspirations is how we have always made progress expanding prosperity, science, health, and culture. This will be increasingly important in the future as well.

I am unsure what to make of this. It is sorely tempting to dismiss the whole endeavour as little more than words on a page for a company deriving 98% of its revenue (PDF) from advertising.1 If we consider it more seriously, however, we are left with an ugly impression of what “valuable work” may consist of. Meta is very proud of its technology to “generate photorealistic images”, thereby taking work from artists and photographers. Examples of its technology also include generating blog posts and building study plans, so it seems writing and tutoring are not entirely “valuable work” either.

I am being a bit cheeky but, with Zuckerberg’s statement entirely devoid of specifics, I am also giving it the gravitas it has earned.

While I was taking way too long to write this, Om Malik examined it from the perspective of someone who has followed Zuckerberg’s career trajectory since it began. It is a really good piece. Though Malik starts by saying “Zuck is one of the best ‘chief executives’ to come out of Silicon Valley”, he concludes by acknowledging he is “skeptical of his ability to invent a new future for his company”:

Zuck has competitive anxiety. By repeatedly talking about being “distinct from others in the industry” he is tipping his hand. He is worried that Meta is being seen as a follower rather than leader. Young people are flocking to ChatGPT. Programmers are flocking to Claude Code.

What does Meta AI do? Bupkiss. And Zuck knows that very well. You don’t do a company makeover if things are working well.

If you are solely looking at Meta’s earnings, things seem to be working just fine for the company. Meta beat revenue expectations in its most recent quarter while saying the current quarter will also be better than analysts thought. Meta might not be meeting already-low analyst expectations for revenue in its Reality Labs metaverse segment, but the stock jumped by 10% anyhow. Even Wall Street is not taking Zuckerberg seriously as an innovator. Meta is great at selling ads. It is not very exciting, but it works.

Back to the superintelligence memo, emphasis mine:

We believe the benefits of superintelligence should be shared with the world as broadly as possible. That said, superintelligence will raise novel safety concerns. We’ll need to be rigorous about mitigating these risks and careful about what we choose to open source. Still, we believe that building a free society requires that we aim to empower people as much as possible.

And here is what Zuckerberg wrote just one year ago:

Meta is committed to open source AI. I’ll outline why I believe open source is the best development stack for you, why open sourcing Llama is good for Meta, and why open source AI is good for the world and therefore a platform that will be around for the long term.

[…]

There is an ongoing debate about the safety of open source AI models, and my view is that open source AI will be safer than the alternatives. I think governments will conclude it’s in their interest to support open source because it will make the world more prosperous and safer.

No mention of being careful, no mention of choosing what to open source. Zuckerberg took an ostensibly strong, principled view supportive of open source A.I. when it benefitted the company, and is now taking an ostensibly strong, principled view that it requires more nuance.

Zuckerberg concludes:

Meta believes strongly in building personal superintelligence that empowers everyone. We have the resources and the expertise to build the massive infrastructure required, and the capability and will to deliver new technology to billions of people across our products. I’m excited to focus Meta’s efforts towards building this future.

On this, I kind of believe him. I believe the company has the resources and reach to make “personal superintelligence” — whatever it is — a central part of Meta’s raison d’être, just as Malik says in his article he has “learned not to underestimate Zuckerberg”. The language in Zuckerberg’s post is flexible, vague, and optimistic enough to provide cover for whatever the company does next. It could be a unique virtual assistant, or it could be animated stickers in chats. Whatever it is, this technology will also assuredly be directed toward the company’s advertising machine, as its current A.I. efforts are providing “greater efficiency and gains across our ad system”. Zuckerberg is telling investors: imagine what we could do with superintelligence.

In December 2023, Simon Willison wrote about the trust crisis in artificial intelligence, comparing it to the conspiracy theory that advertisers use audio from real-world conversations for targeting:

The key issue here is the same as the OpenAI training issue: people don’t believe these companies when they say that they aren’t doing something.

One interesting difference here is that in the Facebook example people have personal evidence that makes them believe they understand what’s going on.

With AI we have almost the complete opposite: AI models are weird black boxes, built in secret and with no way of understanding what the training data was or how it influences the model.

Meta has pulled off a remarkable feat. It has ground down users’ view of their own privacy into irrelevance, yet its services remain ubiquitous to the point of being essential. Maybe Meta does not need trust for its A.I. or “superintelligence” ambitions, either. It is unfathomably rich, has a huge volume of proprietary user data, and a CEO who keeps pushing forward despite failing at basically every quasi-visionary project. Maybe that is enough.


  1. Do note, two slides later, the company’s effective tax rate dropping from 17% in Q3 and Q4 2023 to just 9% in Q1 2025, and 11% in the most recent quarter. Nine percent on over $18 billion in income. ↥︎

Tim Bradshaw and Ivan Levingston, Financial Times:

Spotify founder Daniel Ek’s investment company is leading a €600mn funding round in Helsing, valuing the German defence tech group at €12bn and making it one of Europe’s most valuable start-ups.

The deal comes as the Munich-based start-up is expanding from its origins in artificial intelligence software to produce its own drones, aircraft and submarines.

Laura Molloy, NME:

Xiu Xiu have announced that they are in the process of removing their music from Spotify, over CEO Daniel Ek’s “investment in AI war drones”.

[…]

It comes after Deerhoof also recently pulled their catalogue from the platform for the same reason, stating: “We don’t want our music killing people. We don’t want our success being tied to AI battle tech,” Deerhoof said in a statement.

Financial relationships between the music industry and arms suppliers have been documented before, but they were more of a hop-skip-and-jump away. Ek’s investment is pretty direct. A Spotify subscription boosts his net worth, which he puts into his fund, which gives that money to a drone company he helps oversee.

Update: King Gizzard and the Lizard Wizard has also removed its music from Spotify.

Mike Masnick, Techdirt, reacting to Grok’s Nazi turn:

We need to take back control over the tools that we use.

Especially these days, as so many people have started (dangerously) treating AI tools as “objective” sources of truth, people need to understand that they are all subject to biases. Some of these biases are in their training data. Some are in their weights. And some are, as is now quite clear, directly in their system prompts.

The problem isn’t just bias — it’s whose bias gets embedded in the system. When a centralized AI reflects the worldview of tech billionaires rather than the diverse perspectives of its users, we’re not getting artificial intelligence. We’re getting artificial ideology.

I am half compelled by this argument, and half concerned. I obviously believe we should be skeptical of how much trust we place in corporations. After all, they have given us ample reason to be suspicious of them.

Even before it was “X”, Twitter did not have the best reputation for quality discussion. And then it was bought by Elon Musk. I still do not believe there is sufficient evidence for bias in users’ feeds during the recent U.S. presidential election, but the anti-“political correctness” written into Grok is a plainly obvious problem. On top of that, a new version of Grok was launched this week, which consults Musk’s tweets when it gets stuck on a query. All of this should undermine the little bit of trust anyone might have left in X and xAI.

A company with a much better reputation, historically, is Google. Even though it has faced decades of scrutiny and questions about its secret website rankings, it has generally gotten things more right than not. To be clear, I can point to dozens of times when it has been bad at search — especially in the last five years — but it remains what most people think of when they think of searching the web. Yet, because A.I. feels like magic to some, that reputation is on the line with good criticisms and very dumb ones. The Attorney General of Missouri — the state that nearly prosecuted a journalist for viewing the source of a website — is investigating Google, Meta, Microsoft, and OpenAI for being insufficiently supportive of the president’s record on Israel–U.S. relations. The Attorney General approvingly cites Missouri v. Biden, which the state lost.

Yet, even with all this in mind, we need to be able to trust institutions to some extent. This is the part of me that is concerned about Masnick’s piece. I think it is a great suggestion that we should control our own tools, where anyone can “choose your own values, your own sources, and your own filters”. However, most people are unlikely to do these things. Most of us will probably use something from some big company we do not really trust, but it is what ships with the system or is built into the apps we use most, or whatever. We need to ensure the areas where we have little control are trustworthy, too.

What that probably means is some kind of oversight, akin to what we have for other areas of little control. This is how we have some trust in the water we drink, the air we breathe, the medicine we take, and the planes we fly in. Consumer protection laws give us something to stand on when we are taken advantage of. Yes, there are places where this is done better than others, and I think we should learn from them instead of throwing up our hands and pretending this problem will be solved on an individual basis. To be clear, I am not reading Masnick’s writing as some kind of libertarian fantasy or an anti-regulation screed, nor do I interpret that in Alex Komoroske’s manifesto either. But I also believe there should be some regulation because we need to be realistic about the practical limitations of how much time and effort people will invest in controlling their experience.

Jonathan Vanian, CNBC:

Mark Zuckerberg said Monday that he’s creating Meta Superintelligence Labs, which will be led by some of his company’s most recent hires, including Scale AI ex-CEO Alexandr Wang and former GitHub CEO Nat Friedman.

Zuckerberg said the new AI superintelligence unit, MSL, will house the company’s various teams working on foundation models such as the open-source Llama software, products and Fundamental Artificial Intelligence Research projects, according to an internal memo obtained by CNBC.

Kyle Orland, Ars Technica:

When I hear Zuckerberg talk about the promise of AI these days, it’s hard not to hear echoes of his monumental vision for the metaverse from 2021. If anything, Zuckerberg’s vision of our AI-powered future is even more grandiose than his view of the metaverse.

Orland allows for key differences, like how people actually use A.I. products, including those from Meta — Zuckerberg says “more than 1 billion monthly actives”. That seems, to me, to be a pretty big caveat. The series 404 Media has been running about A.I. slop on Facebook looks bad, but at least it suggests people are using A.I. in connection with Meta’s products, something nobody can say about the metaverse it decided to use as the foundation for rebranding itself. Embarrassing.

A good faith read of Orland’s argument is that Meta is taking advantage of — and growing — the hype around A.I. in the same way as it attempted to do with the metaverse. This is obviously not a new thing for tech companies. They routinely proclaim world-changing advancements without earning them, and Meta is a particularly poor narrator of its own supposed brilliance. I would not trust it — but not because this all sounds a bit like the metaverse. Meta and Zuckerberg personally simply have not demonstrated a capacity for being visionary. The company has a knack for acquisitions and an ability to retain users’ attention. It has not shown an ability to invent the future.

All of these stories were published yesterday. I am linking to them in chronological order.

Alex Kantrowitz, in his Big Technology newsletter:

Tim Cook ought to call Perplexity CEO Aravind Srinivas and offer him $30 billion for his AI search engine. And he should do it right away.

[…]

“Not likely!” Perplexity chief business officer Dmitry Shevelenko told me of a potential tie-up with Apple. “But Meta-Scale is so unlikely that I feel we aren’t living in a world of likelies.”

Apple and Perplexity have had no M&A discussions to date, Shevelenko added, not even a wink.

Mark Gurman, Katie Roof, and Riley Griffin, Bloomberg:

Meta Platforms Inc. held discussions with artificial intelligence search startup Perplexity AI Inc. about a possible takeover before moving ahead with a multibillion-dollar investment in Scale AI, according to people familiar with the matter.

Deirdre Bosa and Ashley Capoot, of CNBC, confirmed Bloomberg’s reporting, adding that one source “said Perplexity walked away from a potential deal”.

Mark Gurman, Bloomberg:

Apple Inc. executives have held internal discussions about potentially bidding for artificial intelligence startup Perplexity AI, seeking to address the need for more AI talent and technology.

You will note the day began with Kantrowitz’s article calling for Apple to buy Perplexity. It was not a reaction to Gurman’s report, which was published late in the afternoon and came after a different story about another possible Perplexity acquisition, to which Gurman also contributed. Heck of a coincidence all of these dropped on the same day.

Christine Wang, Axios:

Ten years ago, Google crawled two pages for every visitor it sent a publisher, per [Cloudflare CEO Matthew] Prince.

[…]

Now:

  • For Google, it’s 18:1

  • For OpenAI, it’s 1,500:1

  • For Anthropic, it’s 60,000:1

It is a curious side effect of Cloudflare’s size and position that it is among a true handful of companies with this kind of visibility into a meaningful slice of global web traffic.

In an alternate world, these artificial intelligence businesses may have tried to work with publishers. Perhaps they would have given greater prominence to references, self-policed the amount of summarization they would offer, and provided some kind of financial kickback. Instead, they have trained their systems on publishers’ vast libraries without telling them until it was far too late to matter. They take so much while providing so little in return. This will surely accelerate the walling-off of the necessarily paid web, further affirming what I have taken to calling “Robinson’s Law”. This helps explain the increasingly unethical means of acquiring this training data.

I, like millions of people, spend a good chunk of my Mondays through Fridays in Microsoft 365 (formerly Office) applications, many of which I partly or wholly access through a browser. For years, I could just type portal.office.com in my browser and get links to Word, Excel, Outlook — all the classics — plus some organization-specific applications.

Microsoft being Microsoft, it has decided to bury all that. We are in the artificial intelligence era now, baby. Instead of seeing the applications millions of people rely on, Microsoft has decided the default view should be a Copilot chat. The familiar suite of apps is on another page accessible from the sidebar; each app can, at least, be pinned to that same sidebar. This was barely announced and seems to have taken many by surprise.

But, hey, Copilot should at least make me more productive, right? It is Microsoft’s own service — albeit backed by OpenAI’s work — and integrated, in theory, with our organization’s applications and files. Maybe it could function as an application launcher. So I told Copilot to “launch Word”. Copilot:

I can’t directly launch Microsoft Word from here, but I can help you open it on your device. Here’s how you can do it:

On Windows:

  1. Press the Windows key on your keyboard or click the Start menu.

  2. Type “Word” in the search bar.

  3. Click on Microsoft Word from the search results.

On macOS:

  1. Open Finder or press Command + Space to open Spotlight Search.

  2. Type “Word” and press Enter when Microsoft Word appears.

It is truly embarrassing to interrupt users’ learned workflows for something worse, just because it is important for Microsoft to be seen as a cutting-edge innovator. It has made Copilot part of its branding — so much so that it will advertise Copilot even where it is not available, for “consistency”.

This, from Adam Newbold, is a perfect encapsulation of a bunch of ethical problems related to artificial intelligence. The prompt:

Generate an image for a Calvin & Hobbes strip. Four panels. Calvin and Hobbes are walking through the woods, talking to each other, both holding smart phones and looking at them intently the entire time.

Panel 1: Calvin says to Hobbes, “This strip was made entirely with ChatGPT, which should be impossible given the strict intellectual property rights restrictions on Calvin & Hobbes content.”

Panel 2: Hobbes responds to Calvin, “Oh? Then how did it make it?”

Panel 3: Calvin responds to Hobbes, “Some guy just typed this into a box and clicked a button. That’s all it took.”

Panel 4: Hobbes responds to Calvin, “That’s so fucked up.”

This is entirely doable without generative artificial intelligence, but it requires far more skill. The ease of this duplication is maddening. I find this offensive in exactly the way Newbold intended it to be.

More important, I think, is the control exercised over the likenesses of Calvin and Hobbes by the strip’s creator Bill Watterson, as Newbold noted in the strip. Watterson famously rejected all but a handful of licensed merchandising ideas. But the mechanism for how he might protect this is the same as the one used by Disney when it fights parody and reinterpretation of its vast intellectual property, even though the motivations are different. Watterson’s protectiveness is admirable, driven by artistic integrity to the extent he has left many millions of dollars’ worth of tchotchkes on the table to retain the spirit of the strips. Disney’s is entirely business-motivated, as evidenced by the tens of billions of dollars in licensed tchotchkes sold last year alone.

This is not the first “Calvin & Hobbes” strip made with generative A.I., nor does generative A.I. begin and end at self-referential prompts like these. Some assholes have created plugins — more-or-less — to badly emulate Watterson’s unique style in generative A.I. programs. It is awful.

I want to live in a world where we can allow the necessary reinterpretation of intellectual property while respecting the wishes of artists. This is a tricky line, I know. It requires us — individually, but also the organizations responsible for generative A.I. stuff — to think about who is making such a request in good faith, and decide whether we are going to honour that.

One more thing: Watterson is a pretty private person, rarely giving interviews. But, right above this paragraph, I think we can get a sense of how he might feel about this.

Drew DeVault:

Now it’s LLMs. If you think these crawlers respect robots.txt then you are several assumptions of good faith removed from reality. These bots crawl everything they can find, robots.txt be damned, including expensive endpoints like git blame, every page of every git log, and every commit in every repo, and they do so using random User-Agents that overlap with end-users and come from tens of thousands of IP addresses – mostly residential, in unrelated subnets, each one making no more than one HTTP request over any time period we tried to measure – actively and maliciously adapting and blending in with end-user traffic and avoiding attempts to characterize their behavior or block their traffic.

As curious and fascinating as I find many applications of generative artificial intelligence, I find it difficult to square that interest with the flagrantly unethical way these systems have been trained. Server admins have to endure and pay for massive amounts of traffic from well-funded corporations, without compensation, all of which treat robots.txt as something to be worked around. Add to that the kind of copyright infringement that would cost users thousands of dollars per file, and it is clear the whole system is morally bankrupt.
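For contrast, actually respecting robots.txt is technically trivial. Here is a minimal sketch, using Python’s standard library with a placeholder domain and bot name, of the check a well-behaved crawler is supposed to perform before each request:

import urllib.robotparser

# Fetch and parse the site's robots.txt (example.com is a placeholder).
parser = urllib.robotparser.RobotFileParser()
parser.set_url("https://example.com/robots.txt")
parser.read()

# A polite crawler identifies itself honestly and checks every URL
# against the published rules before requesting it.
url = "https://example.com/repo/commits/abc123"
if parser.can_fetch("ExampleTrainingBot", url):
    print("Allowed to fetch", url)
else:
    print("Disallowed; a polite crawler stops here.")

DeVault’s point is that these crawlers skip this check entirely, or defeat it by rotating user agents and IP addresses so the rules cannot even be applied to them.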

Do not get me wrong — existing intellectual property law is in desperate need of reform. Big, powerful corporations have screwed us all over by extending copyright terms. In Canada, the number of works in the public domain will be stagnant for the next eighteen years after we signed onto the Canada–United States–Mexico Agreement. But what artificial intelligence companies are proposing is a worst-of-both-worlds situation, in which some big businesses get to retain a tight grip on artists’ works, and others get to assume anything remotely public is theirs to seize.