The Internet Archive Is Increasingly Restricted by Publishers ⇥ wired.com
Kate Knibbs, Wired:
A number of other major journalism organizations have also recently moved to restrict the Wayback Machine from archiving their stories, including The New York Times. According to analysis by the artificial-intelligence-detection startup Originality AI, 23 major news sites are currently blocking ia_archiverbot, the web crawler commonly used by the Internet Archive for the Wayback project. The social platform Reddit is too. Other outlets are limiting the project in different ways: The Guardian does not block the crawler, but it excludes its content from the Internet Archive API and filters out articles from the Wayback Machine interface, which makes it harder for regular people to access archived versions of its articles.
This problem was so foreseeable that I foresaw it. It is just one of many ripple effects of artificial intelligence that affect all of us in ways large and small, regardless of whether it changes our employment prospects. I see way more CAPTCHAs and rate limiting now than I ever have, and I do not think that is coincidental. The web and its services are becoming less useful for most of us precisely because restricting access is the only leverage comparatively powerless media organizations have over well-funded and amoral A.I. firms.