Search Results for: ostensibly

After Robb Knight found — and Wired confirmed — that Perplexity summarizes websites which have followed its opt-out instructions, I noticed a number of people making a similar claim: this is nothing but a big misunderstanding of the function of controls like robots.txt. A Hacker News comment thread contains several versions of these two arguments:

  • robots.txt is only supposed to affect automated crawling of a website, not explicit retrieval of an individual page.

  • It is fair to use a user agent string which does not disclose automated access because this request was not automated per se, as the user explicitly requested a particular page.

That is, publishers should expect the controls provided by Perplexity to apply only to its indexing bot, not a user-initiated page request. At the risk of being the kind of person who replies to pseudonymous comments on Hacker News, I think this is an unnecessarily absolutist reading of how site owners expect the Robots Exclusion Protocol to work.

To be fair, that protocol was published in 1994, well before anyone had to worry about websites being used as fodder for large language model training. And, to be fairer still, it was not formalized until recently: a proposed standard was only published in September 2022. Compliance has so far been entirely voluntary, but the proposed standard sets a more rigid expectation that rules will be followed. Yet it does not differentiate between different types of crawlers — those for search, others for archival purposes, and ones which power the surveillance economy — and contains no mention of A.I. bots. Any non-human means of access is expected to comply.
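For a sense of what compliance looks like in practice, here is a minimal sketch of how a well-behaved crawler might consult robots.txt before fetching a page, using Python's standard-library parser. The bot name and URLs are hypothetical examples, not any real crawler's values.

```python
# Minimal sketch: a compliant crawler checks robots.txt before fetching a page.
# The bot name and URLs here are hypothetical.
from urllib import robotparser

parser = robotparser.RobotFileParser()
parser.set_url("https://example.com/robots.txt")
parser.read()  # fetch and parse the site's robots.txt

# A rule like "User-agent: ExampleBot" followed by "Disallow: /" would make this False.
if parser.can_fetch("ExampleBot", "https://example.com/some-article"):
    print("Allowed: fetch and process the page")
else:
    print("Disallowed: the site has opted out of this kind of access")
```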

The question seems to be whether what Perplexity is doing ought to be considered crawling. It is, after all, responding to a direct retrieval request from a user. This is subtly different from how a user might search Google for a URL, in which case they are asking whether that site is in the search engine’s existing index. Perplexity is ostensibly following real-time commands: go fetch this webpage and tell me about it.

But it clearly is also crawling in a more traditional sense. The New York Times and Wired both disallow PerplexityBot, yet I was able to ask it to summarize a set of recent stories from both publications. At the time of writing, the Wired summary is about seventeen hours out of date, and the Times summary is about two days old. Neither publication has changed its robots.txt directives recently; they were both blocking Perplexity last week, and they are blocking it today. Perplexity is not fetching these sites in real-time as a human or web browser would. It appears to be scraping sites which have explicitly said that is something they do not want.

Perplexity should be following those rules and it is shameful it is not. But what if you ask for a real-time summary of a particular page, as Knight did? Is that something which should be identifiable by a publisher as a request from Perplexity, or from the user?

The Robots Exclusion Protocol may be voluntary, but a more robust method is to block bots by detecting their user agent string. Instead of expecting visitors to abide by your “No Homers Club” sign, you are checking IDs. But these strings are unreliable and there are often good reasons for evading user agent sniffing.
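Checking IDs, in practice, often amounts to something like the following: a minimal sketch of server-side user agent filtering in Python. The blocklist entries and the WSGI-style handler are hypothetical; real sites usually do this in web server or CDN configuration instead.

```python
# Minimal sketch of user agent "ID checking" on the server. The bot names and
# the WSGI handler are hypothetical; this only works if a bot identifies itself.
BLOCKED_AGENT_SUBSTRINGS = ("ExampleBot", "SomeOtherBot")

def app(environ, start_response):
    user_agent = environ.get("HTTP_USER_AGENT", "")
    if any(bot.lower() in user_agent.lower() for bot in BLOCKED_AGENT_SUBSTRINGS):
        start_response("403 Forbidden", [("Content-Type", "text/plain")])
        return [b"Automated access is not permitted."]
    start_response("200 OK", [("Content-Type", "text/html")])
    return [b"<html><body>Regular page content.</body></html>"]

if __name__ == "__main__":
    from wsgiref.simple_server import make_server
    make_server("", 8000, app).serve_forever()
```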

Perplexity says its bot is identifiable by both its user agent and the IP addresses from which it operates. Remember: this whole controversy is that it sometimes discloses neither, making it impossible to differentiate Perplexity-originating traffic from a real human being — and there is a difference.

A webpage being rendered through a web browser is subject to the quirks and oddities of that particular environment — ad blockers, Reader mode, screen readers, user style sheets, and the like — but there is a standard. A webpage being rendered through Perplexity is actually being reinterpreted and modified. The original text of the page is transformed through automated means about which neither the reader nor the publisher has any understanding.

This is true even if you ask it for a direct quote. I asked for a full paragraph of a recent article and it mashed together two separate sections. They are direct quotes, to be sure, but the article must have been interpreted to generate this excerpt.1

It is simply not the case that requesting a webpage through Perplexity is akin to accessing the page via a web browser. It is more like automated traffic — even if it is being guided by a real person.

The existing mechanisms for restricting the use of bots on our websites are imperfect and limited. Yet they are the only tools we have right now to opt out of participating in A.I. services if that is something one wishes to do, short of putting pages or an entire site behind a user name and password. It is completely reasonable for someone to assume their signal of objection to any robotic traffic ought to be respected by legitimate businesses. The absolute least Perplexity can do is to respect those objections by clearly and consistently identifying itself, and by excluding websites which have indicated they do not want to be accessed by these means.


  1. I am not presently blocking Perplexity, and my argument is not related to its ability to access the article. I am only illustrating how it reinterprets text. ↥︎

“Conspirador Norteño” in January 2023:

BNN (the “Breaking News Network”, a news website operated by tech entrepreneur and convicted domestic abuser Gurbaksh Chahal) allegedly offers independent news coverage from an extensive worldwide network of on-the-ground reporters. As is often the case, things are not as they seem. A few minutes of perfunctory Googling reveals that much of BNN’s “coverage” appears to be mildly reworded articles copied from mainstream news sites. For science, here’s a simple technique for algorithmically detecting this form of copying.

Kashmir Hill and Tiffany Hsu, New York Times:

Many traditional news organizations are already fighting for traffic and advertising dollars. For years, they competed for clicks against pink slime journalism — so-called because of its similarity to liquefied beef, an unappetizing, low-cost food additive.

Low-paid freelancers and algorithms have churned out much of the faux-news content, prizing speed and volume over accuracy. Now, experts say, A.I. could turbocharge the threat, easily ripping off the work of journalists and enabling error-ridden counterfeits to circulate even more widely — as has already happened with travel guidebooks, celebrity biographies and obituaries.

See, it is not just humans producing abject garbage; robots can do it, too — and way better. There was a time when newsrooms could be financially stable on display ads. Those days are over for a team of human reporters, even if all they do is rewrite rich guy tweets. But if you only need to pay a skeleton operations staff to ensure the robots continue their automated publishing schedule, well, that becomes a more plausible business venture.

Another thing of note from the Times story:

Before ending its agreement with BNN Breaking, Microsoft had licensed content from the site for MSN.com, as it does with reputable news organizations such as Bloomberg and The Wall Street Journal, republishing their articles and splitting the advertising revenue.

I have to wonder how much of an impact this co-sign had on the success of BNN Breaking. Syndicated articles on MSN like these are shown in various places on a Windows computer, and are boosted in Bing search results. Microsoft is increasingly dependent on A.I. for editing its MSN portal with predictable consequences.

“Conspirador Norteño” in April:

The YouTube channel is not the only data point that connects Trimfeed to BNN. A quick comparison of the bylines on BNN’s and Trimfeed’s (plagiarized) articles shows that many of the same names appear on both sites, and several X accounts that regularly posted links to BNN articles prior to April 2024 now post links to Trimfeed content. Additionally, BNN seems to have largely stopped publishing in early April, both on its website and social media, with the Trimfeed website and related social media efforts activating shortly thereafter. It is possible that BNN was mothballed due to being downranked in Google search results in March 2024, and that the new Trimfeed site is an attempt to evade Google’s decision to classify Trimfeed’s predecessor as spam.

The Times reporters definitively linked the two and, after they did, Trimfeed stopped publishing. Its domain, like BNN Breaking’s, now redirects to BNNGPT, which ostensibly uses proprietary technologies developed by Chahal. Nothing about this makes sense to me and it smells like bullshit.

Finally. The government of the United States finally passed a law that would allow it to force the sale of, or ban, software and websites from specific countries of concern. The target is obviously TikTok — it says so right in its text — but crafty lawmakers have tried to add enough caveats and clauses and qualifiers to, they hope, avoid it being characterized as a bill of attainder, and to permit future uses. This law is very bad. It is ineffective and illiberal, abandoning democratic values over, effectively, a single app. Unfortunately, TikTok panic is a very popular position in the U.S. and, also, here in Canada.

The adversaries the U.S. is worried about are the “covered nations” defined in 2018 to restrict the acquisition by the U.S. of key military materials from four countries: China, Iran, North Korea, and Russia. The idea behind this definition was that it was too risky to procure magnets and other important components of, say, missiles and drones from a nation the U.S. considers an enemy, lest those parts be compromised in some way. So the U.S. wrote down its least favourite countries for military purposes, and that list is now being used in a bill intended to limit TikTok’s influence.

According to the law, it is illegal for any U.S. company to make available TikTok and any other ByteDance-owned app — or any app or website deemed a “foreign adversary controlled application” — to a user in the U.S. after about a year unless it is sold to a company outside the covered countries, and with no more than twenty percent ownership stake from any combination of entities in those four named countries. Theoretically, the parent company could be based nearly anywhere in the world; practically, if there is a buyer, it will likely be from the U.S. because of TikTok’s size. Also, the law specifically exempts e-commerce apps for some reason.

This could be interpreted as either creating an isolated version specifically for U.S. users or, as I read it, moving the global TikTok platform to a separate organization not connected to ByteDance or China.1 ByteDance’s ownership is messy, though mostly U.S.-based, but politicians worried about its Chinese origin have had enough, to the point they are acting with uncharacteristic vigour. The logic seems to be that it is necessary for the U.S. government to influence and restrict speech in order to prevent other countries from influencing or restricting speech in ways the U.S. thinks are harmful. That is, the problem is not so much that TikTok is foreign-owned, but that it has ownership ties to a country often antithetical to U.S. interests. TikTok’s popularity might, it would seem, be bad for reasons of espionage or influence — or both.

Power

So far, I have focused on the U.S. because it is the country that has taken the first step to require non-Chinese control over TikTok — at least for U.S. users but, due to the scale of its influence, possibly worldwide. It could force a business to entirely change its ownership structure. It may look funny, then, for a Canadian to explain their views of what the U.S. ought to do in a case of foreign political interference. But this is a matter of relevance in Canada as well. Our federal government raised the alarm on “hostile state-sponsored or influenced actors” influencing Canadian media and said it had ordered a security review of TikTok. There was recently a lengthy public inquiry into interference in Canadian elections, with a special focus on China, Russia, and India. Clearly, the popularity of a Chinese application is, in the eyes of these officials, a threat.

Yet it is very hard not to see the rush to kneecap TikTok’s success as a protectionist reaction to shaking the U.S. dominance of consumer technologies, as convincingly expressed by Paris Marx at Disconnect:

In Western discourses, China’s internet policies are often positioned solely as attempts to limit the freedoms of Chinese people — and that can be part of the motivation — but it’s a politically convenient explanation for Western governments that ignores the more important economic dimension of its protectionist approach. Chinese tech is the main competitor to Silicon Valley’s dominance today because China limited the ability of US tech to take over the Chinese market, similar to how Japan and South Korea protected their automotive and electronics industries in the decades after World War II. That gave domestic firms the time they needed to develop into rivals that could compete not just within China, but internationally as well. And that’s exactly why the United States is so focused not just on China’s rising power, but how its tech companies are cutting into the global market share of US tech giants.

This seems like one reason why the U.S. has so aggressively pursued a divestment or ban since TikTok’s explosive growth in 2019 and 2020. On its face it is similar to some reasons why the E.U. has regulated U.S. businesses that have, it argues, disadvantaged European competitors, and why Canadian officials have tried to boost local publications that have seen their ad revenue captured by U.S. firms. Some lawmakers make it easy to argue it is a purely xenophobic reaction, like Senator Tom Cotton, who spent an exhausting minute questioning TikTok’s Singaporean CEO Shou Zi Chew about where he is really from. But I do not think it is entirely a protectionist racket.

A mistake I have made in the past — and which I have seen some continue to make — is assuming those who are in favour of legislating against TikTok are opposed on principle to the kinds of dirty tricks it is accused of. This is false. Many of these same people would be all too happy to allow U.S. tech companies to do exactly the same. I think the most generous version of this argument is one in which it is framed as a dispute rooted in anxieties, shared by the U.S. and its democratic allies, about the government of China — to which ByteDance is necessarily connected — spreading messaging that does not align with democratic government interests. This is why you see few attempts to reconcile common objections over TikTok with the quite similar behaviours of U.S. corporations, government arms, and intelligence agencies. To wit: U.S.-based social networks also suggest posts with opaque math which could, by the same logic, influence elections in other countries. They also collect enormous amounts of personal data that is routinely wiretapped, and are required to secretly cooperate with intelligence agencies. The U.S. is not authoritarian in the way China is, but the behaviours in question are not unique to authoritarians. Those specific actions are unfortunately not what the U.S. government is objecting to. What it is disputing, in a most generous reading, is a specifically antidemocratic government gaining any kind of influence.

Espionage and Influence

It is easiest to start by dismissing the espionage concerns because they are mostly misguided. The peek into Americans’ lives offered by TikTok is no greater than that offered by countless ad networks and data brokers — something the U.S. is also trying to restrict more effectively through a comprehensive federal privacy law. So long as online advertising is dominated by a privacy-hostile infrastructure, adversaries will be able to take advantage of it. If the goal is to restrict opportunities for spying on people, it is idiotic to pass legislation against TikTok specifically instead of limiting the data industry.

But the charge of influence seems to have more to it, even though nobody has yet shown that TikTok is warping users’ minds in a (presumably) pro-China direction. Some U.S. lawmakers described its danger as “theoretical”; others seem positively terrified. There are a few different levels to this concern: are TikTok users uniquely subjected to Chinese government propaganda? Is TikTok moderated in a way that boosts or buries videos to align with Chinese government views? Finally, even if both of these things are true, should the U.S. be able to revoke access to software if it promotes ideologies or viewpoints — and perhaps explicit propaganda? As we will see, it looks like TikTok sometimes tilts in ways beneficial — or, at least, less damaging — to Chinese government interests, but there is no evidence of overt government manipulation and, even if there were, it is objectionable to require that it be owned by a different company or to ban it.

The main culprit, it seems, is TikTok’s “uncannily good” For You feed that feels as though it “reads your mind”. Instead of users telling TikTok what they want to see, it just begins showing videos and, as people use the app, it figures out what they are interested in. How it does this is not actually that mysterious. A 2021 Wall Street Journal investigation found recommendations were made mostly based on how long you spent watching each video. Deliberate actions — like sharing and liking — play a role, sure, but if you scroll past videos of people and spend more time with a video of a dog, it learns you want dog videos.
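The mechanism the Journal described is, at its core, not much more exotic than weighting topics by watch time. Here is a hypothetical sketch of that idea; the field names, weights, and topics are made up for illustration and are not TikTok’s actual system.

```python
# Hypothetical watch-time-weighted interest scoring, loosely following the
# behaviour described in the Journal's investigation. Weights, field names,
# and topics are illustrative only.
from collections import defaultdict

def update_interest_scores(scores, watched_videos):
    for video in watched_videos:
        # How much of the video was actually watched dominates the signal...
        completion = video["seconds_watched"] / video["duration_seconds"]
        score = completion
        # ...while explicit actions like liking or sharing nudge it further.
        if video.get("liked"):
            score += 0.5
        if video.get("shared"):
            score += 1.0
        scores[video["topic"]] += score
    return scores

scores = defaultdict(float)
update_interest_scores(scores, [
    {"topic": "dogs", "seconds_watched": 28, "duration_seconds": 30, "liked": True},
    {"topic": "people", "seconds_watched": 2, "duration_seconds": 45},
])
# "dogs" now far outranks "people", so the feed would serve more dog videos.
print(sorted(scores.items(), key=lambda item: item[1], reverse=True))
```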

That is not so controversial compared to the opacity in how TikTok decides which specific videos are displayed and which ones are not. Why is this particular dog video in a user’s feed and not another similar one? Why is it promoting videos reflecting a particular political viewpoint or — so a popular narrative goes — burying those with viewpoints uncomfortable for its Chinese parent company? The mysterious nature of an algorithmic feed is the kind of thing into which you can read a story of your choosing. A whole bunch of X users are permanently convinced they are being “shadow banned” whenever a particular tweet does not get as many likes and retweets as they believe it deserved, for example, and were salivating at the thought of the company releasing its ranking code to solve a nonexistent mystery. There is a whole industry of people who say they can get your website to Google’s first page for a wide range of queries using techniques that are a mix of plausible and utterly ridiculous. Opaque algorithms make people believe in magic. An alarmist reaction to TikTok’s feed should be expected, particularly as it was the first popular app designed around entirely recommended material instead of personal or professional connections. This has now been widely copied.

The mystery of that feed is a discussion which seems to have been ongoing basically since the 2018 merger of Musical.ly and TikTok, escalating rapidly to calls for it to be separated from its Chinese owner or banned altogether. In 2020, the White House attempted to force a sale by executive order. In response, TikTok created a plan to spin off an independent entity, but nothing materialized from this tense period.

March 2023 brought a renewed effort to divest or ban the platform. Chew, TikTok’s CEO, was called to a U.S. Congressional hearing and questioned for hours, to little effect. During that hearing, a report prepared for the Australian government was cited by some of the lawmakers, and I think it is a telling document. It is about eighty pages long — excluding its table of contents, appendices, and citations — and shows several examples of Chinese government influence on other products made by ByteDance. However, the authors found no such manipulation on TikTok itself, leading them to conclude:

In our view, ByteDance has demonstrated sufficient capability, intent, and precedent in promoting Party propaganda on its Chinese platforms to generate material risk that they could do the same on TikTok.

“They could do the same”, emphasis mine. In other words, if the authors had found TikTok was boosting topics and videos on behalf of the Chinese government, they would have said so — which suggests they did not find it. The closest thing I could find to a covert propaganda campaign on TikTok anywhere in this report is this:

The company [ByteDance] tried to do the same on TikTok, too: In June 2022, Bloomberg reported that a Chinese government entity responsible for public relations attempted to open a stealth account on TikTok targeting Western audiences with propaganda”. [sic]

If we follow the Bloomberg citation — shown in the report as a link to the mysterious Archive.today site — the fuller context of the article by Olivia Solon disproves the impression you might get from reading the report:

In an April 2020 message addressed to Elizabeth Kanter, TikTok’s head of government relations for the UK, Ireland, Netherlands and Israel, a colleague flagged a “Chinese government entity that’s interested in joining TikTok but would not want to be openly seen as a government account as the main purpose is for promoting content that showcase the best side of China (some sort of propaganda).”

The messages indicate that some of ByteDance’s most senior government relations team, including Kanter and US-based Erich Andersen, Global Head of Corporate Affairs and General Counsel, discussed the matter internally but pushed back on the request, which they described as “sensitive.” TikTok used the incident to spark an internal discussion about other sensitive requests, the messages state.

This is the opposite conclusion to how this story was set up in the report. Chinese government public relations wanted to set up a TikTok account without any visible state connection and, when TikTok management found out about this, it said no. This Bloomberg article makes TikTok look good in the face of government pressure, not like it capitulates. Yes, it is worth being skeptical of this reporting. Yet if TikTok acquiesced to the government’s demands, surely the report would provide some evidence.

While this report for the Australian Senate does not show direct platform manipulation, it does present plenty of examples where it seems like TikTok may be biased or self-censoring. Its authors cite stories from the Washington Post and Vice finding that searches for hashtags like #HongKong and #FreeXinjiang returned results favourable to the official Chinese government position. Sometimes, related posts did not appear in search results, which is not unique to TikTok — platforms regularly use crude search term filtering to restrict discovery for lots of reasons. I would not be surprised if bias or self-censorship were to blame for TikTok minimizing the visibility of posts critical of the subjugation of Uyghurs in China. However, it is basically routine for every social media product to be accused of suppression. The Markup found that different types of posts on Instagram, for example, had captions altered or would no longer appear in search results, though it is unclear to anyone why that is the case. Meta said it was a bug, an explanation also offered frequently by TikTok.

The authors of the Australian report conducted a limited quasi-study comparing results for certain topics on TikTok to results on other social networks like Instagram and YouTube, again finding a handful of topics which favoured the government line. But there was no consistent pattern, either. Search results for “China military” on Instagram were, according to the authors, “generally flattering”, and X searches for “PLA” scarcely returned unfavourable posts. Yet results on TikTok for “China human rights”, “Tiananmen”, and “Uyghur” were overwhelmingly critical of Chinese official positions.

The Network Contagion Research Institute published its own report in December 2023, similarly finding disparities between the total number of posts with specific hashtags — like #DalaiLama and #TiananmenSquare — on TikTok and Instagram. However, the study contained some pretty fundamental errors, as pointed out by — and I cannot believe I am citing these losers — the Cato Institute. The study’s authors compared total lifetime posts on each social network and, while they say they expect 1.5–2.0× the posts on Instagram because of its larger user base, they do not factor in how many of those posts could have existed before TikTok was even launched. Furthermore, they assume similar cultures and a similar use of hashtags on each app. But even benign hashtags have ridiculous differences in how often they are used on each platform. There are, as of writing, 55.3 million posts tagged “#ThrowbackThursday” on Instagram compared to 390,000 on TikTok, a ratio of 141:1. If #ThrowbackThursday were part of this study, the disparity on the two platforms would rank similarly to #Tiananmen, one of the greatest in the Institute’s report.
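To make the arithmetic behind that objection concrete, here is a small sketch using the #ThrowbackThursday figures cited above; the comparison simply restates the study’s own 1.5–2.0× assumption and is illustrative rather than a reproduction of the Institute’s methodology.

```python
# Why lifetime hashtag totals mislead: a benign, apolitical hashtag already
# blows far past the study's assumed Instagram-to-TikTok range.
throwback_ratio = 55_300_000 / 390_000   # ≈ 141.8, the roughly 141:1 figure cited above
expected_low, expected_high = 1.5, 2.0   # the study's assumed Instagram/TikTok range

print(f"#ThrowbackThursday ratio: {throwback_ratio:.1f} to 1")
print(f"Excess over the study's upper bound: {throwback_ratio / expected_high:.0f}x")
# If an innocuous hashtag can exceed the expected range roughly seventy-fold,
# a large ratio for a political hashtag is not, on its own, evidence of
# suppression.
```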

The problem with most of these complaints, as their authors acknowledge, is that there is a known input and a perceived output, but there are oh-so-many unknown variables in the middle. It is impossible to know how much of what we see is a product of intentional censorship, unintentional biases, bugs, side effects of other decisions, or a desire to cultivate a less stressful and more saccharine environment for users. A report by Exovera (PDF) prepared for the U.S.–China Economic and Security Review Commission indicates exactly the latter: “TikTok’s current content moderation strategy […] adheres to a strategy of ‘depoliticization’ (去政治化) and ‘localization’ (本土化) that seeks to downplay politically controversial speech and demobilize populist sentiment”, apparently avoiding “algorithmic optimization in order to promote content that evangelizes China’s culture as well as its economic and political systems” which “is liable to result in backlash”. Meta, on its own platforms, said it would not generally suggest “political” posts to users but did not define exactly what qualifies. It said it was limiting posts on social issues because of user demand, but these types of posts have also been difficult to moderate. A difference in which posts are found on each platform for specific search terms is not necessarily reflective of government pressure, deliberate or not. Besides, it is not as though there is no evidence for straightforward propaganda on TikTok. One just needs to look elsewhere to find it.

Propaganda

The Office of the Director of National Intelligence recently released its annual threat assessment summary (PDF). It is unclassified and has few details, so the only thing it notes about TikTok is that “accounts run by a PRC propaganda arm reportedly targeted candidates from both political parties during the U.S. midterm election cycle in 2022”. It seems likely to me this is a reference to this article in Forbes, though this is a guess as there are no citations. The state-affiliated TikTok account in question — since made private — posted a bunch of news clips which portray the U.S. in an unflattering light. There is a related account, also marked as state-affiliated, which continues to post the same kinds of videos. It has over 33,000 followers, which sounds like a lot, but each post typically gets only a few hundred views. Some have been viewed thousands of times, others as few as thirteen times as of writing — on a platform with exaggerated engagement numbers. Nonetheless, the conclusion is obvious: these accounts are government propaganda, and TikTok willingly hosts them.

But that is something it has in common with all social media platforms. The Russian RT News network and China’s People’s Daily newspaper have X and Facebook accounts with follower counts in the millions. Until recently, the North Korean newspaper Uriminzokkiri operated accounts on Instagram and X. It and other North Korean state-controlled media used to have YouTube channels, too, but they were shut down by YouTube in 2017 — a move that was protested by academics studying the regime’s activities. The irony of U.S.-based platforms helping to disseminate propaganda from the country’s adversaries is that it can be useful to understand them better. Merely making propaganda available — even promoting it — is both a risk and a benefit of generous speech permissions.

The DNI’s unclassified report has no details about whether TikTok is an actual threat, and the FBI has “nothing to add” in response to questions about whether TikTok is currently doing anything untoward. More sensitive information was apparently provided to U.S. lawmakers ahead of their March vote and, though few details of what, exactly, was said have emerged, several were not persuaded by what they heard, including Rep. Sara Jacobs of California:

As a member of both the House Armed Services and House Foreign Affairs Committees, I am keenly aware of the threat that PRC information operations can pose, especially as they relate to our elections. However, after reviewing the intelligence, I do not believe that this bill is the answer to those threats. […] Instead, we need comprehensive data privacy legislation, alongside thoughtful guardrails for social media platforms – whether those platforms are funded by companies in the PRC, Russia, Saudi Arabia, or the United States.

Lawmakers like Rep. Jacobs were an exception among U.S. Congresspersons who, across party lines, were eager to make the case against TikTok. Ultimately, the divest-or-ban bill got wrapped up in a massive and politically popular spending package agreed to by both chambers of Congress. Its passage was enthusiastically received by the White House and it was signed into law within hours. Perhaps that outcome is the democratic one since polls so often find people in the U.S. support a sale or ban of TikTok.

I get it: TikTok scoops up private data, suggests posts based on opaque criteria, its moderation appears to be susceptible to biases, and it is a vehicle for propaganda. But you could replace “TikTok” in that sentence with any other mainstream social network and it would be just as true, albeit less scary to U.S. allies on its face.

A Principled Objection

Forcing TikTok to change its ownership structure, whether worldwide or only for a U.S. audience, is a betrayal of liberal democratic principles. To borrow from Jon Stewart, “if you don’t stick to your values when they’re being tested, they’re not values, they’re hobbies”. It is not surprising that a Canadian intelligence analysis specifically pointed out how those very same values are being taken advantage of by bad actors. This is not new. It is true of basically all positions hostile to democracy — from domestic nationalist groups in Canada and the U.S., to those which originate elsewhere.

Julian G. Ku, for China File, offered a seemingly reasonable rebuttal to this line of thinking:

This argument, while superficially appealing, is wrong. For well over one hundred years, U.S. law has blocked foreign (not just Chinese) control of certain crucial U.S. electronic media. The Protect Act [sic] fits comfortably within this long tradition.

Yet this counterargument falls apart both in its details and in its further consequences. As Martin Peers writes at the Information, the U.S. does not prohibit all foreign ownership of media. And governing the internet like public airwaves gets way more complicated if you stretch it any further. Canada has broadcasting laws, too, and it is not alone. Should every country begin requiring that social media platforms comply with laws designed for ownership of broadcast media? Does TikTok need disconnected local versions of its product in each country in which it operates? It either fundamentally upsets the promise of the internet, or it mandates the use of protocols instead of platforms.

It also looks hypocritical. Countries with a more authoritarian bent and which openly censor the web have responded to even modest U.S. speech rules with mockery. When RT America — technically a U.S. company with Russian funding — was required to register as a foreign agent, its editor-in-chief sarcastically applauded U.S. free speech standards. The response from Chinese government officials and media outlets to the proposed TikTok ban has been similarly scoffing. Perhaps U.S. lawmakers are unconcerned about the reception of their policies by adversarial states, but it is an indicator of how these policies are being portrayed in those countries — a real-life “we are not so different, you and I” setup — that, while falsely equivalent, makes it easy for authoritarian states to claim that democracies have no values and cannot work. Unless we want to contribute to the fracturing of the internet — please, no — we cannot govern social media platforms by mirroring policies we ostensibly find repellent.

The way the government of China seeks to shape the global narrative is understandably concerning given its poor track record on speech freedoms. An October 2023 U.S. State Department “special report” (PDF) explored several instances where it boosted favourable narratives, buried critical ones, and pressured other countries — sometimes overtly, sometimes quietly. The government of China and associated businesses reportedly use social media to create the impression of dissent toward human rights NGOs, and apparently everything from university funding to new construction is a vector for espionage. On the other hand, China is terribly ineffective in its disinformation campaigns, and many of the cases profiled in that State Department report end in failure for the Chinese government initiative. In Nigeria, a pitch for a technologically oppressive “safe city” was rejected; an interview published in the Jerusalem Post with Taiwan’s foreign minister was not pulled down despite threats from China’s embassy in Israel. The report’s authors speculate about “opportunities for PRC global censorship”. But their only evidence is a “list [maintained by ByteDance] identifying people who were likely blocked or restricted” from using the company’s many platforms, though the authors can only speculate about its purpose.

The problem is that trying to address this requires better media literacy and better recognition of propaganda. That is a notoriously daunting problem. We are exposed to a more destabilizing cocktail of facts and fiction, but there is declining trust in experts and institutions to help us sort it out. Trying to address TikTok as a symptomatic or even causal component of this is frustratingly myopic. This stuff is everywhere.

Also everywhere is corporate propaganda arguing regulations would impede competition in a global business race. I hate to be mean by picking on anyone in particular, but a post from Om Malik has shades of this corporate slant. Malik is generally very good on the issues I care about, but this is not one we appear to agree on. After a seemingly impressed observation of how quickly Chinese officials were able to eject popular messaging apps from the App Store in the country, Malik compares the posture of each country’s tech industries:

As an aside, while China considers all its tech companies (like Bytedance) as part of its national strategic infrastructure, the United States (and its allies) think of Apple and other technology companies as public enemies.

This is laughable. Presumably, Malik is referring to the chillier reception these companies have faced from lawmakers, and antitrust cases against Amazon, Apple, Google, and Meta. But that tougher impression is softened by the U.S. government’s actual behaviour. When the E.U. announced the Digital Markets Act and Digital Services Act, U.S. officials sprang to the defence of tech companies. Even before these cases, Uber expanded in Europe thanks in part to its close relationship with Obama administration officials, as Marx pointed out. The U.S. unquestionably sees its tech industry dominance as a projection of its power around the world, hardly treating those companies as “public enemies”.

Far more explicit were the narratives peddled by lobbyists from Targeted Victory in 2022 about TikTok’s dangers, and from American Edge beginning in 2020 about how regulations would cause the U.S. to become uncompetitive with China and allow TikTok to win. Both organizations were paid by Meta to spread those messages; the latter was reportedly founded after a single large contribution from Meta. Restrictions on TikTok would obviously be beneficial to Meta’s business.

If you wanted to boost the industry — and I am not saying Malik is — that is how you would describe the situation: the U.S. is fighting corporations instead of treating them as pals to win this supposed race. It is not the kind of framing one would use to dissuade people from the notion that this is a protectionist dispute over the popularity of TikTok. But it is the kind of thing you hear from corporations via their public relations staff and lobbyists, and it trickles into public conversation.

This Is Not a TikTok Problem

TikTok’s divestment would not be unprecedented. The Committee on Foreign Investment in the United States — henceforth, CFIUS, pronounced “siff-ee-us” — demanded, after a 2019 review, that Beijing Kunlun Tech Co Ltd sell Grindr. CFIUS concluded the risk to users’ private data was too great for Chinese ownership given Grindr’s often stigmatized and ostracized user base. After its sale, with the app now safe in U.S. hands, a priest was outed thanks to data Grindr had been selling since before it was acquired by the Chinese firm, and the company is being sued for allegedly sharing users’ HIV status with third parties. Also, because it transacts with data brokers, it potentially still leaks users’ private information to Chinese companies (PDF), apparently undermining the fundamental concern that triggered this divestment.

Perhaps there is comfort in Grindr’s owner residing in a country where same-sex marriage is legal rather than in one where it is not. I think that makes a lot of sense, actually. But there remain plenty of problems unaddressed by its sale to a U.S. entity.

Similarly, this U.S. TikTok law does not actually solve potential espionage or influence for a few reasons. The first is that it has not been established that either is an actual problem with TikTok. Surely, if this were something we ought to be concerned about, there would be a pattern of evidence, instead of what we actually have, which is a fear that something bad could happen and there would be no way to stop it. But many things could happen. I am not opposed to prophylactic laws so long as they address reasonable objections. Yet it is hard not to see this law as an outgrowth of Cold War fears over leaflets of communist rhetoric. It seems completely reasonable to be less concerned about TikTok specifically while harbouring worries about democratic backsliding worldwide and the growing power of authoritarian states like China in international relations.

Second, the Chinese government does not need local ownership if it wants to exert pressure. The world wants the country’s labour and it wants its spending power, so businesses comply without a fight, and often preemptively. Hollywood films are routinely changed before, during, and after production to fit the expectations of state censors in China, a pattern which has been pointed out using the same “Red Dawn” anecdote in story after story after story. (Abram Dylan contrasted this phenomenon with U.S. military cooperation.) Apple is only too happy to acquiesce to the government’s many demands — see the messaging apps issue mentioned earlier — including, reportedly, in its media programming. Microsoft continues to operate Bing in China, and its censorship requirements have occasionally spilled elsewhere. Economic leverage over TikTok may seem different because it does not need access to the Chinese market — TikTok is banned in the country — but perhaps a new owner would be reliant upon China.

Third, the law permits an ownership stake no greater than twenty percent from a combination of any of the “covered nations”. I would be shocked if everyone who is alarmed by TikTok today would be totally cool if its parent company were only, say, nineteen percent owned by a Chinese firm.

If we are worried about bias in algorithmically sorted feeds, there should be transparency around how things are sorted, and more controls for users, including wholly opting out. If we are worried about privacy, there should be laws governing the collection, storage, use, and sharing of personal information. If ownership ties to certain countries are concerning, there are more direct actions available to monitor behaviour. I am mystified why CFIUS and TikTok apparently abandoned (PDF) a draft agreement that would give U.S.-based entities full access to the company’s systems, software, and staff, and would allow the government to end U.S. access to TikTok at a moment’s notice.

Any of these options would be more productive than this legislation. It is a law which empowers the U.S. president — whoever that may be — to declare the owner of an app with a million users a “covered company” if it is from one of those four nations. And it has been passed. TikTok will head to court to dispute it on free speech grounds, and the U.S. may respond by justifying its national security concerns.

Obviously, the U.S. government has concerns about the connections between TikTok, ByteDance, and the government of China, which have been extensively reported. Rest of World says ByteDance put pressure on TikTok to improve its financial performance and has taken greater control by bringing in management from Douyin. The Wall Street Journal says U.S. user data is not fully separated. And, of course, Emily Baker-White has reported — first for Buzzfeed News and now for Forbes — a litany of stories about TikTok’s many troubling behaviours, including spying on her. TikTok is a well scrutinized app and reporters have found conduct that has understandably raised suspicions. But virtually all of these stories focus on data obtained from users, which Chinese agencies could do — and probably are doing — without relying on TikTok. None of them have shown evidence that TikTok’s suggestions are being manipulated at the behest or demand of Chinese officials. The closest they get is an article from Baker-White and Iain Martin which alleges TikTok “served up a flood of ads from Chinese state propaganda outlets”, yet waits until the third-to-last paragraph before acknowledging “Meta and Google ad libraries show that both platforms continue to promote pro-China narratives through advertising”. All three platforms label state-run media outlets, albeit inconsistently. Meanwhile, U.S.-owned X no longer labels any outlets with state editorial control. It is not clear to me that TikTok would necessarily operate to serve the best interests of the U.S. even if it were owned by some well-financed individual or corporation based there.

For whatever it is worth, I am not particularly tied to the idea that the government of China would not use TikTok as a vehicle for influence. The government of China is clearly involved in propaganda efforts both overt and covert. I do not know how much of my concerns are a product of living somewhere with a government and a media environment that focuses intently on the country as particularly hostile, and not necessarily undeservedly. The best version of this argument is one which questions the platform’s possible anti-democratic influence. Yes, there are many versions of this which cross into moral panic territory — a new Red Scare. I have tried to put this in terms of a more reasonable discussion, and one which is not explicitly xenophobic or envious. But even this more even-handed position is not well served by the law passed in the U.S., one which was passed without evidence of influence much more substantial than some choice hashtag searches. TikTok’s response to these findings was, among other things, to limit its hashtag comparison tool, which is not a good look. (Meta is doing basically the same by shutting down CrowdTangle.)

I hope this is not the beginning of similar isolationist policies among democracies worldwide, and that my own government takes this opportunity to recognize the actual privacy and security threats at the heart of its own TikTok investigation. Unfortunately, the head of CSIS is really leaning on the espionage angle. For years, the Canadian government has been pitching sorely needed updates to privacy legislation, and it would be better to see real progress made to protect our private data. We can do better than being a perpetual recipient of decisions made by other governments. I mean, we cannot do much — we do not have the power of the U.S. or China or the E.U. — but we can do a little bit in our own polite Canadian way. If we are worried about the influence of these platforms, a good first step would be to strengthen the rights of users. We can do that without trying to govern apps individually, or treating the internet like we do broadcasting.

To put it more bluntly, the way we deal with a possible TikTok problem is by recognizing it is not a TikTok problem. If we care about espionage or foreign influence in elections, we should address those concerns directly instead of focusing on a single app or company that — at worst — may be a medium for those anxieties. These are important problems, and it would be inexcusable for them to get lost in the distraction of whether TikTok is individually blameworthy.


  1. Because this piece has taken me so long to write, a whole bunch of great analyses have been published about this law. I thought the discussion on “Decoder” was a good overview, especially since two of the three panelists are former lawyers. ↥︎

In the 1970s and 1980s, in-house researchers at Exxon began to understand how crude oil and its derivatives were leading to environmental devastation. They were among the first to comprehensively connect the use of their company’s core products to the warming of the Earth, and they predicted some of the harms which would result. But their research was treated as mere suggestion by Exxon because the effects of the obvious legislation would “alter profoundly the strategic direction of the energy industry”. It would be a business nightmare.

Forty years later, the world has concluded its warmest year in recorded history by starting another. Perhaps we would have been more able to act if businesses like Exxon had equivocated less all these years. Instead, they publicly created confusion and minimized lawmakers’ knowledge. The continued success of their industry lay in keeping these secrets.


“The success lies in the secrecy” is a shibboleth of the private surveillance industry, as described in Byron Tau’s new book, “Means of Control”. It is easy to find parallels to my opening anecdote throughout, though, to be clear, a direct comparison to human-led ecological destruction is a knowingly exaggerated metaphor. The erosion of privacy and civil liberties is horrifying in its own right, and shares key attributes: those in the industry knew what they were doing and allowed it to persist because it was lucrative and, in a post-9/11 landscape, ostensibly justified.

Tau’s byline is likely familiar to anyone interested in online privacy. For several years at the Wall Street Journal, he produced dozens of deeply reported articles about the intertwined businesses of online advertising, smartphone software, data brokers, and intelligence agencies. Tau no longer writes for the Journal, but “Means of Control” is an expansion of that earlier work and carefully arranged into a coherent set of stories.

Tau’s book, like so many others describing the current state of surveillance, begins with the terrorist attacks of September 11, 2001. Those were the early days, when Acxiom realized it could connect its consumer data set to flight and passport records. The U.S. government ate it up and its appetite proved insatiable. Tau documents the growth of an industry that did not exist — could not exist — before the invention of electronic transactions, targeted advertising, virtually limitless digital storage, and near-universal smartphone use. This rapid transformation occurred not only with little regulatory oversight, but with government encouragement, including through investments in startups like Dataminr, GeoIQ, PlaceIQ, and PlanetRisk.

In near-chronological order, Tau tells the stories which have defined this era. Remember when documentation released by Edward Snowden showed how data created by mobile ad networks was being used by intelligence services? Or how a group of Colorado Catholics bought up location data for outing priests who used gay-targeted dating apps? Or how a defence contractor quietly operates nContext, an adtech firm, which permits the U.S. intelligence apparatus to effectively wiretap the global digital ad market? Regarding the latter, Tau writes of a meeting he had with a source who showed him a “list of all of the advertising exchanges that America’s intelligence agencies had access to”, and who told him American adversaries were doing the exact same thing.

What impresses most about this book is not the volume of specific incidents — though it certainly delivers on that front — but the way they are all woven together into a broader narrative perhaps best summarized by Tau himself: “classified does not mean better”. That can be true for volume and variety, and it is also true for the relative ease with which it is available. Tracking someone halfway around the world no longer requires flying people in or even paying off people on the ground. Someone in a Virginia office park can just make that happen and likely so, too, can other someones in Moscow and Sydney and Pyongyang and Ottawa, all powered by data from companies based in friendly and hostile nations alike.

The tension running through Tau’s book is in the compromise I feel he attempts to strike between acknowledging the national security utility of a surveillance state while describing how the U.S. has abdicated the standards of privacy and freedom it has long claimed are foundational rights. His reporting often reads as an understandable combination of awe and disgust. The U.S. has, it seems, slid in the direction of the kinds of authoritarian states its administration routinely criticizes. But Tau is right to clarify in the book’s epilogue that the U.S. is not, for example, China, separated from the standards of the latter by “a thin membrane of laws, norms, social capital, and — perhaps most of all — a lingering culture of discomfort” with concentrated state power. However, the preceding chapters of the book show questions about power do not fully extend into the private sector, where there has long been pride in the scale and global reach of U.S. businesses but concern about their influence. Tau’s reporting shows how U.S. privacy standards have been exported worldwide. For a more pedestrian example, consider the frequent praise–complaint sandwiches of Amazon, Meta, Starbucks, and Walmart, to throw a few names out there.

Corporate self-governance is an entirely inadequate response. Just about every data broker and intermediary from Tau’s writing which I looked up promised it was “privacy-first” or used similar language. Every business insists in its marketing literature that it is concerned about privacy and careful about how it collects and uses information, and businesses have been saying so for decades — yet here we are. Entire industries have been built on the backs of tissue-thin user consent and a flexible definition of “privacy”.

When polled, people say they are concerned about how corporations and the government collect and use data. Still, when lawmakers mandate choices for users about their data collection preferences, the results do not appear to show a society that cares about personal privacy.

In response to the E.U.’s General Data Protection Regulation, websites decided they wanted to continue collecting and sharing loads of data with advertisers, so they created the now-ubiquitous cookie consent sheet. The GDPR does not explicitly mandate this mechanism and many sites remain non-compliant with the rules and intention of the law, but these sheets are a particularly common form of user consent. However, if you arrive at a website and it asks you whether you are okay with it sharing your personal data with hundreds of ad tech firms, are you providing meaningful consent with a single button click? Hardly.

Similarly, something like 10–40% of iOS users agree to allow apps to track them. In the E.U., the cost of opting out of Meta’s tracking will be €6–10 per month, which, I assume, few people will pay.

All of these examples illustrate how inadequately we assess cost, utility, and risk. It is tempting to think of this as a personal responsibility issue akin to cigarette smoking but, as we are so often reminded, none of this data is particularly valuable in isolation — it must be aggregated in vast amounts. It is therefore much more like an environmental problem.

As with global warming, exposé after exposé after exposé is written about how our failure to act has produced extraordinary consequences. All of the technologies powering targeted advertising have enabled grotesque and pervasive surveillance as Tau documents so thoroughly. Yet these are abstract concerns compared to a fee to use Instagram, or the prospect of reading hundreds of privacy policies with a lawyer and negotiating each of them so that one may have a smidge of control over their private information.

There are technical answers to many of these concerns, and there are also policy answers. There is no reason both should not be used.

I have become increasingly convinced the best legal solution is one which creates a framework limiting the scope of data collection, restricting it to only that which is necessary to perform user-selected tasks, and preventing mass retention of bulk data. Above all, users should not be able to choose a model that puts them in obvious future peril. Many of you probably live in a society where so much is subject to consumer choice, so what I wrote may sound pretty drastic, but it is not. If anything, it is substantially less radical than the status quo that permits such expansive surveillance on the basis that we “agreed” to it.

Any such policy should also be paired with something like the Fourth Amendment is Not For Sale Act in the U.S. — similar legislation is desperately needed in Canada as well — to prevent sneaky exclusions from longstanding legal principles.

Last month, Wired reported that Near Intelligence — a data broker you can read more about in Tau’s book — was able to trace dozens of individual trips to Jeffrey Epstein’s island. That could be a powerful investigative tool. It is also very strange and pretty creepy that all that information was held by some random company you probably have not heard of or thought about outside stories like these. I am obviously not defending the horrendous shit Epstein and his friends did. But it is really, really weird that Near is capable of producing this data set. When interviewed by Wired, Eva Galperin, of the Electronic Frontier Foundation, said “I just don’t know how many more of these stories we need to have in order to get strong privacy regulations.”

Exactly. Yet I have long been convinced an effective privacy bill could not be implemented in either the United States or the European Union, and certainly not with any degree of urgency. And, no, Matt Stoller: de facto rules on the backs of specific FTC decisions do not count. Real laws are needed. But the products and services which would be affected are too popular and too powerful. The E.U. is home to dozens of ad tech firms that promise full identity resolution. The U.S. would not want to destroy such an important economic sector, either.

Imagine my surprise when, while I was in the middle of writing this review, U.S. lawmakers announced the American Privacy Rights Act (PDF). If passed, it would give individuals more control over how their information — including biological identifiers — may be collected, used, and retained. Importantly, it requires data minimization by default. It would be the most comprehensive federal privacy legislation in the U.S., and it also promises various security protections and remedies, though I think lawmakers’ promise to “prevent data from being hacked or stolen” might be a smidge unrealistic.

Such rules would more or less match the GDPR in setting a global privacy regime that other countries would be expected to meet, since so much of the world’s data is processed in the U.S. or otherwise under U.S. legal jurisdiction. The proposed law borrows heavily from the state-level California Consumer Privacy Act, too. My worry is that it will be treated by corporations similarly to the GDPR and CCPA by continuing to offload decision-making to users while taking advantage of a deliberate imbalance of power. Still, any progress on this front is necessary.

So, too, is it useful for anyone to help us understand how corporations and governments have jointly benefitted from privacy-hostile technologies. Tau’s “Means of Control” is one such example. You should read it. It is a deep exploration of one specific angle of how data flows from consumer software to surprising recipients. You may think you know this story, but I bet you will learn something. Even if you are not a government target — I cannot imagine I am — it is a reminder that the global private surveillance industry only functions because we all participate, however unwillingly. People get tracked through their own devices, but also through the devices of those around them. That is perhaps among the most offensive conclusions of Tau’s reporting. We have all been conscripted for any government buying this data. It only works because it is everywhere and used by everybody.

For all they have erred, democracies are not authoritarian societies. Without reporting like Tau’s, we would be unable to see what our own governments are doing and — just as important — how that differs from actual police states. As Tau writes, “in China, the state wants you to know you’re being watched. In America, the success lies in the secrecy”. Well, the secret is out. We now know what is happening despite the best efforts of an industry to keep it quiet, just like we know the Earth is heating up. Both problems massively affect our lived environment. Nobody — least of all me — would seriously compare the two. But we can say the same about each of them: now we know. We have the information. Now comes the hard part: regaining control.

Maxwell Zeff, Gizmodo:

Just over half of Amazon Fresh stores are equipped with Just Walk Out. The technology allows customers to skip checkout altogether by scanning a QR code when they enter the store. Though it seemed completely automated, Just Walk Out relied on more than 1,000 people in India watching and labeling videos to ensure accurate checkouts. The cashiers were simply moved off-site, and they watched you as you shopped.

Zeff says, paraphrasing the Information’s reporting, that 70% of sales needed human review as of 2022, though Amazon says that is inaccurate.

Based on this story and reporting from the Associated Press, it sounds like Amazon is only ending Just Walk Out support in its own stores. According to the AP and Amazon’s customer website and retailer marketing page, several other stores will still use a technology it continues to say works by using “computer vision, sensor fusion, and deep learning”.

How is this not basically a scam? It certainly feels that way: if I were sold this ostensibly automated feat of technology, I would feel cheated by Amazon if it were mostly possible because someone was watching a live camera feed and making manual corrections. If the Information’s reporting is correct, only 30% of transactions are as magically automated as Amazon claims. However, Amazon told Gizmodo that only a “small minority” of transactions need human review today — but, then again, Amazon has marketed this whole thing from the jump as though it is just computers figuring it all out.

Amazon says it will be replacing Just Walk Out with its smart shopping cart. Just like those from Instacart, it will show personalized ads on a cart’s screen.

Semafor published, in a new format it calls Signals — sponsored by Microsoft, though I am earnestly sure no editorial lines were crossed — aggregated commentary about the U.S. iPhone antitrust case:

If the government wins the suit, “the walls of Apple’s walled garden will be partially torn down,” wrote New York Times opinion columnist Peter Coy, meaning its suite of products will be “more like a public utility,” available to its rivals to use. “That seems to me like stretching what antitrust law is for,” Coy wrote. Tech policy expert Adam Kovacevich agreed, writing on Medium that people have long gone back and forth between iPhones and Android devices. “People vote with their pocketbooks,” Kovacevich said. “Why should the government force iPhones to look more like Androids?”

Many argue that this is an issue of consumer choice, and the government shouldn’t intervene to help companies such as Samsung gain a better footing in the market. The Consumer Choice Center’s media director put it this way: “Imagine the classroom slacker making the case to the teacher that the straight-A student in the front of the class is being anti-competitive by not sharing their lecture notes with them.”

The Kovacevich article this links to is the same one I wrote about over the weekend. His name caught my eye, but not nearly as much as the way he is described: as a “tech policy expert”. That is not wrong, but it is incomplete. He is the CEO of the Chamber of Progress, an organization that lobbies for policies favourable to the large technology companies that fund it.

It also seems unfair to attribute the latter quote to the Consumer Choice Center without describing what it represents — though I suppose its name makes it pretty obvious. It positions itself at the centre of “the global grassroots movement for consumer choice”, and you do not need the most finely tuned bullshit detector to be suspicious of the “grassroots” nature of an organization promoting the general concept of having lots of stuff to buy.

Indeed, the Center acknowledges being funded by a wide variety of industries, including “energy” — read: petroleum — nicotine, and “digital”. According to tax documents, it pulled in over $4 million in 2022. It shares its leadership with another organization, Consumer Choice Education, which brought in $1.5 million in 2022, over half of it from the Atlas Network, a network of libertarian think tanks that counts among its supporters petroleum companies and the billionaire Koch brothers. The ostensibly people-centred Center just promoting the rights of consumers is, very obviously, supported by corporations either directly or via other pro-business organizations that also get their funding either directly from corporations or via other — oh, you understand how this works.

None of that inherently invalidates the claims made by either Kovacevich or Stephen Kent for the Consumer Choice Center, but I fault Semafor for the lack of context for either quote. Both people surely believe what they wrote. But organizations that promote the interests of big business exist to provide apparently independent supporting voices because it is more palatable than those companies making self-interested arguments.

Has the rapid availability of images generated by A.I. programs duped even mainstream news into unwittingly using them in coverage of current events?

That is the impression you might get if you read a report from Cam Wilson, at Crikey, about generated photorealistic graphics available from Adobe’s stock image service that purportedly depict real-life events. Wilson pointed to images which suggested they depict the war in Gaza despite being created by a computer. When I linked to it earlier this week, I found similar imagery ostensibly showing Russia’s war on Ukraine, the terrorist attacks of September 11, 2001, and World War II.

This story has now been widely covered but, aside from how offensive it seems for Adobe to be providing these kinds of images — more on that later — none of these outlets seem to be working hard enough to understand how these images get used. Some publications which referenced the Crikey story, like Insider and the Register, implied these images were being used in news stories without knowing or acknowledging they were generative products. This seemed to be, in part, based on a screenshot in that Crikey report of one generated image. But when I looked at the actual pages where that image was being used, it was a more complicated story: there were a couple of sketchy blog posts, sure, but a few of them were referencing an article which used it to show how generated images could look realistic.1

This is just one image and a small set of examples. There are thousands more A.I.-generated photorealistic images that apparently depict real tragedies, ongoing wars, and current events. So, to see if Adobe’s A.I. stock library is actually tricking newsrooms, I spent a few nights this week looking into this in the interest of constructive technology criticism.

Here is my methodology: on the Adobe Stock website, I searched for terms like “Russia Ukraine war”, “Israel Palestine”, and “September 11”. I filtered the results to only show images marked as A.I.-generated, then sorted the results by the number of downloads. Then, I used Google’s reverse image search with popular Adobe images that looked to me like photographs. This is admittedly not perfect and certainly not comprehensive, but it is a light survey of how these kinds of images are being used.

Then, I would contact people and organizations which had used these images and ask them if they were aware the image was marked as A.I.-generated, and if they had any thoughts about using A.I. images.

I found few instances where a generated image was being used by a legitimate news organization in an editorial context — that is, an A.I.-generated image being passed off as a photo of an event described by a news article. I found no instances of this being done by high-profile publishers. This is not entirely surprising to me because none of these generated images are visible on Adobe Stock when images are filtered to Editorial Use only; and, also, because Adobe is not a major player in editorial photography to the same extent as, say, AP Photos or Getty Images.

I also found many instances of fake local news sites — similar to these — using these images, and examples from all over the web used in the same way as commercial stock photography.

This is not to suggest some misleading uses are okay, only to note a difference in gravity between egregious A.I. use and that which is a question of taste. It would be extremely deceptive for a publisher to use a generated image in coverage of a specific current event, as though the image truly represents what is happening. It seems somewhat less severe should that kind of image be used by a non-journalistic organization to illustrate a message of emotional support, to use a real example I found. And it seems less severe still for a generated image of a historic event to be used by a non-journalistic organization as a kind of stock photo in commemoration.

But these are distinctions of severity; it is never okay for media to mislead audiences into believing something is a photo related to the story when it is neither. For example, here are relevant guidelines from the Associated Press:

We avoid the use of generic photos or video that could be mistaken for imagery photographed for the specific story at hand, or that could unfairly link people in the images to illicit activity. No element should be digitally altered except as described below.

[…]

[Photo-based graphics] must not misrepresent the facts and must not result in an image that looks like a photograph – it must clearly be a graphic.

From the BBC:

Any digital manipulation, including the use of CGI or other production techniques (such as Photoshop) to create or enhance scenes or characters, should not distort the meaning of events, alter the impact of genuine material or otherwise seriously mislead our audiences. Care should be taken to ensure that images of a real event reflect the event accurately.

From the New York Times:

Images in our pages, in the paper or on the Web, that purport to depict reality must be genuine in every way. No people or objects may be added, rearranged, reversed, distorted or removed from a scene (except for the recognized practice of cropping to omit extraneous outer portions). […]

[…]

Altered or contrived photographs are a device that should not be overused. Taking photographs of unidentified real people as illustrations of a generic type or a generic situation (like using an editor or another model in a dejected pose to represent executives being laid off) usually turns out to be a bad idea.

And from NPR:

When packages call for studio shots (of actors, for example; or prepared foods) it will be obvious to the viewer and if necessary it will be made perfectly clear in the accompanying caption information.

Likewise, when we choose for artistic or other reasons to create fictional images that include photos it will be clear to the viewer (and explained in the caption information) that what they’re seeing is an illustration, not an actual event.

I have quoted generously so you can see a range of explanations of this kind of policy. In general, news organizations say that anything which looks like a photograph should be immediately relevant to the story, anything which is edited for creative reasons should be obviously differentiated both visually and in a caption, and that generic illustrative images ought to be avoided.

I started with searches for “Israel Palestine war” and “Russia Ukraine war”, and stumbled across an article from Now Habersham, a small news site based in Georgia, USA, which originally contained this image illustrating an opinion story. After I asked the paper’s publisher Joy Purcell about it, they told me they “overlooked the notation that it was A.I.-generated” and said they “will never intentionally publish A.I.-generated images”. The article was updated with a real photograph. I found two additional uses of images like this one by reputable if small news outlets — one also in the U.S., and one in Japan — and neither returned requests for comment.

I next tried some recent events, like wildfires in British Columbia and Hawaii, an “Omega Block” causing flooding in Greece and Spain, and aggressive typhoons this summer in East Asia. I found images marked as generated by A.I. in Adobe Stock used to represent those events, but not indicated as such in use — in an article in the Sheffield Telegraph; on Futura, a French science site; on a news site for the debt servicing industry; and on a page of the U.K.’s National Centre for Atmospheric Science. Claire Lewis, editor of the Telegraph’s sister publication the Sheffield Star, told me they “believe that any image which is AI generated should say that in the caption” and would “arrange for its removal”. Requests for comment from the other three organizations were not returned.

Next, I searched “September 11”. I found plenty of small businesses using generated images of first responders among destroyed towers and a firefighter in New York in commemorative posts. And seeing those posts changed my mind about the use of these kinds of images. When I first wrote about this Crikey story, I suggested Adobe ought to prohibit photorealistic images which claim to depict real events. But I can also see an argument that an image representative of a tragedy used in commemoration could sometimes be more ethical than a real photograph. It is possible the people in a photo do not want to be associated with a catastrophe, or that its circulation could be traumatizing.

It is Remembrance Day this weekend in Canada — and Veterans Day in the United States — so I reverse-searched a few of those images and spotted one on the second page of a recent U.S. Department of Veterans Affairs newsletter (PDF). Again, in this circumstance, it serves only as an illustration in the same way a stock photo would, but one could make a good argument that it should portray real veterans.

Requests for comment made to the small businesses which posted the September 11 images, and to Veterans Affairs, went unanswered.

As a replacement for stock photos, A.I.-generated images are perhaps okay. There are plenty of photos representing firefighters and veterans posed by models, so it seems to make little difference if that sort of image is generated by a computer. But in a news media context these images seem like they are, at best, an unnecessary source of confusion, even if they are clearly labelled. Their use only perpetuates the impression that A.I. is everywhere and nothing can be verified.

It is offensive to me that any stock photo site would knowingly accept A.I.-generated graphics of current events. Adobe told PetaPixel that its stock site “is a marketplace that requires all generative AI content to be labeled as such when submitted for licensing”, but it is unclear to me how reliable that is. I found a few of these images for sale from other stock photo sites without any disclaimers. That means either these were erroneously marked as A.I.-generated on Adobe Stock, or other providers are less stringent — and that people have been using generated images without any possibility of foreknowledge. Neither option is great for public trust.

I do think there is more that Adobe could do to reduce the likelihood of A.I.-generated images used in news coverage. As I noted earlier, these images do not appear when the “Editorial” filter is selected. However, there is no way to configure an Adobe account to search this selection by default.2 Adobe could permit users to set a default set of search filters — to only show editorial photos, for example, or exclude generative A.I. entirely. Until that becomes possible from within Adobe Stock itself, I made a bookmark-friendly empty search which shows only editorial photographs. I hope it is helpful.

Update: On November 11, I updated the description of where one article appeared. It was in the Sheffield Telegraph, not its sister publication, the Sheffield Star.


  1. The website which published this article — Off Guardian — is a crappy conspiracy theory site. I am avoiding linking to it because I think it is a load of garbage and unsubtle antisemitism, but I do think its use of the image in question was, in a vacuum, reasonable. ↥︎

  2. There is also no way to omit editorial images by default, which makes Adobe Stock frustrating to use for creative or commercial projects, as editorial images are not allowed to be manipulated. ↥︎

Gerrit De Vynck, Washington Post:

A paper from U.K.-based researchers suggests that OpenAI’s ChatGPT has a liberal bias, highlighting how artificial intelligence companies are struggling to control the behavior of the bots even as they push them out to millions of users worldwide.

The study, from researchers at the University of East Anglia, asked ChatGPT to answer a survey on political beliefs as it believed supporters of liberal parties in the United States, United Kingdom and Brazil might answer them. They then asked ChatGPT to answer the same questions without any prompting, and compared the two sets of responses.

The survey in question is the Political Compass.

Arvind Narayanan on Mastodon:

The “ChatGPT has a liberal bias” paper has at least 4 *independently* fatal flaws:

– Tested an older model, not ChatGPT.

– Used a trick prompt to bypass the fact that it actually refuses to opine on political q’s.

– Order effect: flipping q’s in the prompt changes bias from Democratic to Republican.

– The prompt is very long and seems to make the model simply forget what it’s supposed to do.

Colin Fraser appears to be responsible for finding that the order in which the terms appear affects the political alignment displayed by ChatGPT.

Narayanan and Sayash Kapoor tried to replicate the paper’s findings:

Here’s what we found. GPT-4 refused to opine in 84% of cases (52/62), and only directly responded in 8% of cases (5/62). (In the remaining cases, it stated that it doesn’t have personal opinions, but provided a viewpoint anyway). GPT-3.5 refused in 53% of cases (33/62), and directly responded in 39% of cases (24/62).

It is striking to me how the claims of this paper were widely repeated as apparent confirmation that tech companies are responsible for pushing the liberal beliefs that are ostensibly a reflection of mainstream news outlets.

Mike Masnick, Techdirt:

[…] Last year, the US and the EU announced yet another deal on transatlantic data flows. And, as we noted at the time (once again!) the lack of any changes to NSA surveillance meant it seemed unlikely to survive yet again.

In the midst of all this, Schrems also went after Meta directly, claiming that because these US/EU data transfer agreements were bogus, that Meta had violated data protection laws in transferring EU user data to US servers.

And that’s what this fine is about. The European Data Protection Board fined Meta all this money based on the fact that it transferred some EU user data to US servers. And, because, in theory, the NSA could then access the data. That’s basically it. The real culprit here is the US being unwilling to curb the NSA’s ability to demand data from US companies.

As noted, this aligns with other examples of GDPR violations.

There is one aspect of Masnick’s analysis which I dispute:

Of course, the end result of all this could actually be hugely problematic for privacy around the globe. That might sound counterintuitive, seeing as here is Meta being dinged for a data protection failure. But, when you realize what the ruling is actually saying, it’s a de facto data localization mandate.

And data localization is the tool most frequently used by authoritarian regimes to force foreign internet companies (i.e., US internet companies) to host user data within their own borders where the authoritarian government can snoop through it freely. Over the years, we’ve seen lots of countries do this, from Russia to Turkey to India to Vietnam.

Just because data localization is something used by authoritarian governments does not mean it is an inherently bad idea. Authoritarian governments are going to do authoritarian government things — like picking through private data — but that does not mean people who reside elsewhere would face similar concerns.

While housing user data in the U.S. may offer protection for citizens, it compromises the privacy and security of others. Consider that non-U.S. data held on U.S. servers lacks the protections ostensibly placed on U.S. users’ information, meaning U.S. intelligence agencies are able to pick through it with little oversight. (That is, after all, the E.U.’s argument in its charges against Meta.) Plenty of free democracies also have data localization laws for at least some personal information without a problem. For example, while international agreements prevent the Canadian government from requiring data residency as a condition for businesses, privacy regulations require some types of information to be kept locally, while other types must have the same protections as Canadian-hosted data if stored elsewhere.

Michelle Boorstein and Heather Kelly, Washington Post:

A group of conservative Colorado Catholics has spent millions of dollars to buy mobile app tracking data that identified priests who used gay dating and hookup apps and then shared it with bishops around the country.

[…]

One report prepared for bishops says the group’s sources are data brokers who got the information from ad exchanges, which are sites where ads are bought and sold in real time, like a stock market. The group cross-referenced location data from the apps and other details with locations of church residences, workplaces and seminaries to find clergy who were allegedly active on the apps, according to one of the reports and also the audiotape of the group’s president.

Boorstein and Kelly say some of those behind this group also outed a priest two years ago using similar tactics, which makes it look like a test case for this more comprehensive effort. As they write, a New York-based Reverend said at the time it was justified to expose priests who had violated their celibacy pledge. That is a thin varnish on what is clearly an effort to discriminate against queer members of the church. These operations have targeted clergy using data derived almost exclusively from the use of gay dating apps.

Data brokers have long promised the information they supply is anonymized but, time and again, this is shown to be an ineffective means of protecting users’ privacy. That ostensibly de-identified data was used to expose a specific single priest’s use of Grindr in 2021, and the organization in question has not stopped. Furthermore, nothing would prevent this sort of exploitation by groups based outside the United States, which may be able to obtain similar data to produce the same — or worse — outcomes.

This is some terrific reporting by Boorstein and Kelly.

Kareem Abdul-Jabbar:

What we need to always be aware of is that how we treat any one marginalized group is how we will treat all of them—given the chance. There is no such thing as ignoring the exploitation of one group hoping they won’t come for you.

This goes for us individually, but especially for a paper with a massive platform like the New York Times, which Abdul-Jabbar is responding to. A recent episode of Left Anchor is a good explanation of why the Times’ ostensibly neutral just-asking-questions coverage of trans people and issues is so unfairly slanted as to be damaging.

Chance Miller, 9to5Mac:

If you view the November [2020] announcement [of the first M1 Macs] as the start of the transition process, Apple would have needed to have everything wrapped up by November 2022. This deadline, too, has passed. This means Apple has missed its two-year transition target regardless of which deadline you consider.

[…]

So that leaves us where we are today. You have Apple Silicon options for every product category in the Mac lineup, with the exception of the Mac Pro. During its March event, Apple exec John Ternus teased that the Mac Pro with Apple Silicon was an announcement “for another day.” That day, however, hasn’t yet come.

Miller also notes that an Intel version of the Mac Mini remains available. But it hardly matters for Apple to have technically missed its goal since all of its mainstream Macs have transitioned to its own silicon, and it has released an entirely new Mac — in the form of the Mac Studio — and begun the rollout of its second generation of chips in that timeframe. Also, it sure helps that people love these new Macs.

Update: The December 18 version of Mark Gurman’s newsletter contains more details about the forthcoming Mac Pro:

An M2 Extreme [Gurman’s own term for two M2 Ultras] chip would have doubled that to 48 CPU cores and 152 graphics cores. But here’s the bad news: The company has likely scrapped that higher-end configuration, which may disappoint Apple’s most demanding users — the photographers, editors and programmers who prize that kind of computing power.

[…]

Instead, the Mac Pro is expected to rely on a new-generation M2 Ultra chip (rather than the M1 Ultra) and will retain one of its hallmark features: easy expandability for additional memory, storage and other components.

I am interested to see how this works in practice. One of the trademarks of Macs based on Apple’s silicon is the deep integration of all these components, ostensibly for performance reasons.

Nur Dayana Mustak, Bloomberg:

Zuckerberg, 38, now has a net worth of $38.1 billion, according to the Bloomberg Billionaires Index, a stunning fall from a peak of $142 billion in September 2021. While many of the world’s richest people have seen their fortunes tumble this year, Meta’s chief executive officer has seen the single-biggest hit among those on the wealth list.

As if you needed more reasons to be skeptical of billionaires’ motivations for ostensibly charitable uses of their wealth, here is another. Zuckerberg has tied the success of Meta to his family’s Chan Zuckerberg Initiative by funding it through their personally-held shares in Meta. According to its website, that foundation — an LLC, not a charity — is focused on finding cures for diseases, reducing youth homelessness, and improving education. If you like the sound of those things, you should therefore hope for a skyrocketing Meta stock price. If, on the other hand, shareholders are concerned that Meta’s business model is detrimental to society at large and do not approve of the company’s vision for its future, they are compromising the efforts of Zuckerberg’s foundation.

L’affaire the Wire sure has taken a turn since yesterday. First, Kanishk Karan, one of the security researchers ostensibly contacted by reporters, has denied verifying anything for the publication:

It has come to my attention that I’ve been listed as one of the “independent security researchers” who supposedly “verified” the Wire’s report on FB ‘Xcheck’ in India. I would like to confirm that I did NOT DO the DKIM verification for them.

Aditi Agrawal, of Newslaundry, confirmed the non-participation of both researchers cited by the Wire:

The first expert was initially cited in the Wire’s Saturday report to have verified the DKIM signature of a contested internal email. He is a Microsoft employee. Although his name was redacted from the initial story, his employer and his positions in the company were mentioned.

This expert – who was later identified by [Wire founding editor Siddharth] Varadarajan in a tweet – told Newslaundry he “did not participate in any such thing”.

Those factors plus lingering doubts about its reporting have led to this un-bylined note from the Wire:

In the light of doubts and concerns from experts about some of this material, and about the verification processes we used — including messages to us by two experts denying making assessments of that process directly and indirectly attributed to them in our third story — we are undertaking an internal review of the materials at our disposal. This will include a review of all documents, source material and sources used for our stories on Meta. Based on our sources’ consent, we are also exploring the option of sharing original files with trusted and reputed domain experts as part of this process.

An internal review is a good start, but the Wire damaged its credibility when it stood by its reporting for a week as outside observers raised questions. This was a serious process failure that stemmed from a real issue — a post was removed for erroneous reasons, though it has been silently reinstated. In trying to report it out, the best-case scenario is that this publication relied on sources who appear to have fabricated evidence. This kind of scandal is rare but harmful to the press at large. An internal review may not be enough to overcome this breach of trust.

Last week, New Delhi-based the Wire published what seemed like a blockbuster story, claiming that posts reported by high-profile users protected by Meta’s XCheck program would be removed from Meta properties with almost no oversight — in India, at least, but perhaps elsewhere. As public officials’ accounts are often covered by XCheck, this would provide an effective way for them to minimize criticism. But Meta leadership disputed the story, pointing to inaccuracies in the supposed internal documentation obtained by the Wire.

The Wire stood by its reporting. On Saturday, Devesh Kumar, Jahnavi Sen and Siddharth Varadarajan published a response with more apparent evidence. It showed that @fb.com email addresses were still in use at Meta, in addition to newer @meta.com addresses, but that merely indicated the company is forwarding messages; the Wire did not show any very recent emails from Meta leadership using @fb.com addresses. The Wire also disputed Meta’s claim that instagram.workplace.com is not an actively used domain:

The Wire’s sources at Meta have said that the ‘instagram.workplace.com’ link exists as an internal subdomain and that it remains accessible to a restricted group of staff members when they log in through a specific email address and VPN. At The Wire’s request, one of the sources made and shared a recording of them navigating the portal and showing other case files uploaded there to demonstrate the existence and ongoing use of the URL.

Meta:

The account was set up externally as a free trial account on Meta’s enterprise Workplace product under the name “Instagram” and using the Instagram brand as its profile picture. It is not an internal account. Based on the timing of this account’s creation on October 13, it appears to have been set up specifically in order to manufacture evidence to support the Wire’s inaccurate reporting. We have locked the account because it’s in violation of our policies and is being used to perpetuate fraud and mislead journalists.

The screen recording produced for the Wire shows the source navigating to internalfb.com to log in. That is a real domain registered to Meta and with Facebook-specific domain name records. Whoever is behind this apparent hoax is working hard to make it believable. It is trivial for a technically sophisticated person to recreate that login page and point the domain on their computer to a modified local version instead of Meta’s hosted copy. I do not know if that is the case here, but it is plausible.

The Wire also produced a video showing an apparent verification of the DKIM signature in the email Andy Stone ostensibly sent to the “Internal” and “Team” mailing lists.1 However, the signature shown in the screen recording appears to have some problems. For one, the timestamp appears to be incorrect; for another, the signature is missing the “to” field, which is part of an authentic DKIM signature for emails from fb.com, according to emails I have received from that domain.
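For readers who want to see what that kind of check involves, here is a rough sketch of verifying a message’s DKIM signature and inspecting which header fields it covers. It is my own illustration, assuming the third-party dkimpy package and a hypothetical saved email file; it is not the Wire’s — or its experts’ — actual process.

```python
# A minimal sketch, assuming a raw email saved as "message.eml" and the
# third-party dkimpy package (pip install dkimpy). Illustrative only.
import email
import re

import dkim  # provided by dkimpy

with open("message.eml", "rb") as f:
    raw = f.read()

# Cryptographic check: does the signature validate against the signing
# domain's published DKIM public key in DNS?
print("Signature valid:", dkim.verify(raw))

# The h= tag of the DKIM-Signature header lists which fields were signed.
# A signature that omits "to" would stand out next to other mail from the
# same domain that does sign it.
msg = email.message_from_bytes(raw)
signature = msg.get("DKIM-Signature", "")
match = re.search(r"\bh=([^;]+)", signature)
if match:
    signed = [h.strip().lower() for h in match.group(1).split(":")]
    print("Signed header fields:", signed)
    print("Covers 'to':", "to" in signed)
```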

The Wire issued a statement acknowledging a personal relationship between one of its sources and a reporter. The statement was later edited to remove that declaration; the Wire appended a note to its statement saying it was changed to be “clearer about our relationships with our sources”. I think it became less clear as a result. The Wire also says Meta’s purpose in asking for more information is to expose its sources. I doubt that is true. When the Wall Street Journal published internal documents leaked by Frances Haugen, Meta did not claim they were faked or forged. For what it is worth, I believe Meta when it says the documentation obtained by the Wire is not real.

But this murky case still has one shred of validity: when posts get flagged, how does Meta decide whether the report is valid and what actions are taken? The post in question is an image that had no nudity or sexual content, yet was reported and removed for that reason. Regardless of the validity of this specific story, Meta ought to be more accountable, particularly when it comes to moderating satire and commentary outside the United States. At the very least, it does not look good that political interference may be occurring under the banner of an American company.

Update: Alex Stamos tweeted about another fishy edit made by the Wire — a wrong timestamp on screenshots of emails from the experts who verified the DKIM signatures, silently changed after publishing.

Update: Pranesh Prakash has been tweeting through his discoveries. The plot thickens.


  1. There is, according to one reporter, no list at Meta called “Internal”. It also does not pass a basic smell test for what function such an email list would serve. This is wholly subjective, for sure, but think about what purpose an organization’s email lists serve, and then consider why a big organization like Meta would need one with a vague name like “Internal”. ↥︎

Felix Krause:

The iOS Instagram and Facebook app render all third party links and ads within their app using a custom in-app browser. This causes various risks for the user, with the host app being able to track every single interaction with external websites, from all form inputs like passwords and addresses, to every single tap.

This is because apps are able to manipulate the DOM and inject JavaScript into webpages loaded in in-app browsers. Krause elaborated today:

When you open any link on the TikTok iOS app, it’s opened inside their in-app browser. While you are interacting with the website, TikTok subscribes to all keyboard inputs (including passwords, credit card information, etc.) and every tap on the screen, like which buttons and links you click.

[…]

Instagram iOS subscribes to every tap on any button, link, image or other component on external websites rendered inside the Instagram app.

[…]

Note on subscribing: When I talk about “App subscribes to”, I mean that the app subscribes to the JavaScript events of that type (e.g. all taps). There is no way to verify what happens with the data.

Is TikTok a keylogger? Is Instagram monitoring every tap on a loaded webpage? It is impossible to say, but it does not look good that either of these privacy-invasive apps is so reckless with users’ ostensibly external activity.

It reminds me of when iOS 14 revealed a bunch of apps, including TikTok, were automatically reading pasteboard data. It cannot be known for certain what happened to all of the credit card numbers, passwords, phone numbers, and private information collected by these apps. Perhaps some strings were discarded because they did not match the format an app was looking for, like a parcel tracking number or a URL. Or perhaps some ended up in analytics logs collected by the developer. We cannot know for sure.

What we do know is how invasive big-name applications are, and how little their developers really care about users’ privacy. There is no effort at minimization. On the contrary, there is plenty of evidence of efforts to maximize the amount of information collected about each user at as granular a level as possible.

Kristin Cohen, of the U.S. Federal Trade Commission:

The conversation about technology tends to focus on benefits. But there is a behind-the-scenes irony that needs to be examined in the open: the extent to which highly personal information that people choose not to disclose even to family, friends, or colleagues is actually shared with complete strangers. These strangers participate in the often shadowy ad tech and data broker ecosystem where companies have a profit motive to share data at an unprecedented scale and granularity.

This sounds promising. Cohen says the FTC is ready to take action against companies and data brokers misusing health information, in particular, in a move apparently spurred or accelerated by the overturning of Roe v. Wade. So what is the FTC proposing?

[…] There are numerous state and federal laws that govern the collection, use, and sharing of sensitive consumer data, including many enforced by the Commission. The FTC has brought hundreds of cases to protect the security and privacy of consumers’ personal information, some of which have included substantial civil penalties. In addition to Section 5 of the FTC Act, which broadly prohibits unfair and deceptive trade practices, the Commission also enforces the Safeguards Rule, the Health Breach Notification Rule, and the Children’s Online Privacy Protection Rule.

I am no lawyer, so it would be ridiculous for me to try to interpret these laws. But what is there sure seems limited in scope — in order: personal information entrusted to financial companies, security breaches of health records, and children under 13 years old. This seems like the absolute bottom rung on the ladder of concerns. It is obviously good that the FTC is reiterating its enforcement capabilities, though revealing of its insipid authority, but what is it about those laws which will permit it to take meaningful action against the myriad anti-privacy practices covered by over-broad Terms of Use agreements?

Companies may try to placate consumers’ privacy concerns by claiming they anonymize or aggregate data. Firms making claims about anonymization should be on guard that these claims can be a deceptive trade practice and violate the FTC Act when untrue. Significant research has shown that “anonymized” data can often be re-identified, especially in the context of location data. One set of researchers demonstrated that, in some instances, it was possible to uniquely identify 95% of a dataset of 1.5 million individuals using four location points with timestamps. Companies that make false claims about anonymization can expect to hear from the FTC.

Many digital privacy advocates have been banging this drum for years. Again, I am glad to see it raised as an issue the FTC is taking seriously. But given the exuberant data broker market, how can any company that collects dozens or hundreds of data points honestly assert their de-identified data cannot be associated with real identities?
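To make the fragility of de-identification concrete, here is a toy sketch of the “unicity” idea behind the kind of research the FTC cites: in a dataset of location traces with no names attached, a handful of (place, time) points known from elsewhere is usually enough to isolate a single person. All of the numbers and data below are synthetic, and the code illustrates the concept rather than the researchers’ method.

```python
# Toy unicity demo with entirely synthetic data: how many "anonymous"
# location traces are consistent with four known (place, time) points?
import random

random.seed(0)
NUM_USERS = 100_000
PINGS_PER_USER = 50
CELLS = 10_000   # coarse location cells, like cell-tower areas
HOURS = 24 * 30  # a month of hourly timestamps

# Each user's trace is a set of (cell, hour) observations, with no name.
traces = {
    user: {(random.randrange(CELLS), random.randrange(HOURS))
           for _ in range(PINGS_PER_USER)}
    for user in range(NUM_USERS)
}

def matching_traces(points):
    """Count traces that contain every one of the observed points."""
    return sum(1 for trace in traces.values() if points <= trace)

# An observer learns four (place, time) facts about a target — a home, a
# workplace, two errands — and checks them against the "anonymized" data.
target = random.randrange(NUM_USERS)
known_points = set(random.sample(sorted(traces[target]), 4))
print("Traces consistent with those points:", matching_traces(known_points))
# With traces this sparse, the answer is almost always exactly one.
```

The point is not the specific percentages, only that sparse, high-dimensional data like a location history is effectively a fingerprint: stripping names does very little when the pattern itself is unique.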

The only solution is for those companies to collect less user data and to pass even fewer points onto brokers. But will the FTC be given the tools to enforce this? Its funding is being increased significantly, so it will hopefully be able to make good on its cautionary guidance.

Cristiano Lima, Washington Post:

An academic study finding that Google’s algorithms for weeding out spam emails demonstrated a bias against conservative candidates has inflamed Republican lawmakers, who have seized on the results as proof that the tech giant tried to give Democrats an electoral edge.

[…]

That finding has become the latest piece of evidence used by Republicans to accuse Silicon Valley giants of bias. But the researchers said it’s being taken out of context.

[Muhammad] Shahzad said while the spam filters demonstrated political biases in their “default behavior” with newly created accounts, the trend shifted dramatically once they simulated having users put in their preferences by marking some messages as spam and others as not.

Shahzad and the other researchers who authored the paper have disputed the sweeping conclusions of bias drawn by lawmakers. Their plea for nuance has been ignored. Earlier this month, a group of senators introduced legislation to combat this apparent bias. It would prohibit email providers from automatically flagging any political messages as spam, and require providers to publish quarterly reports detailing how many emails from political parties were filtered.

According to reporting from Mike Masnick at Techdirt, it looks like this bill was championed by Targeted Victory, which also promoted the study to conservative media channels. You may remember Targeted Victory from their involvement in Meta’s campaign against TikTok.

Masnick:

Anyway, looking at all this, it is not difficult to conclude that the digital marketing firm that Republicans use all the time was so bad at its job spamming people, that it was getting caught in spam filters. And rather than, you know, not being so spammy, it misrepresented and hyped up a study to pretend it says something it does not, blame Google for Targeted Victory’s own incompetence, and then have its friends in the Senate introduce a bill to force Google to not move its own emails to spam.

I am of two minds about this. A theme you may have noticed developing on this website over the last several years is a deep suspicion of automated technologies, however they are branded — “machine learning”, “artificial intelligence”, “algorithmic”, and the like. So I do think some scrutiny may be warranted in understanding how automated systems determine a message’s routing.

But it does not seem at all likely to me that a perceived political bias in filtering algorithms is deliberate, so any public report indicating the number or rate of emails from each political party being flagged as spam is wildly unproductive. It completely de-contextualizes these numbers and ignores decades of spam filters being inaccurate from time to time for no good reason.
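To see how de-contextualized those counts would be, consider a toy calculation with made-up numbers — the parties, volumes, and rates below are invented for illustration. Raw totals of flagged messages mostly reflect sending volume and list hygiene, not whose mail a filter supposedly dislikes.

```python
# Made-up numbers: raw counts of flagged email are meaningless without
# the denominators behind them.
campaigns = {
    # party: (emails sent, emails flagged as spam)
    "Party A": (20_000_000, 1_000_000),  # 5% flagged, huge volume
    "Party B": (2_000_000, 400_000),     # 20% flagged, small volume
}

for party, (sent, flagged) in campaigns.items():
    print(f"{party}: {flagged:,} flagged ({flagged / sent:.0%} of {sent:,} sent)")

# A quarterly report of raw counts shows Party A with 2.5 times as much
# flagged mail, even though Party B's messages are four times as likely to
# be flagged — a difference that could come down to spammier list
# practices, not filter bias.
```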

A better approach to transparency around automated systems is one that helps the public understand how these decisions are made without playing to perceived bias by parties with a victim complex. Simply counting the number of emails flagged as spam from each party is an idiotic approach. I, too, would like to know why many of the things I am recommended by algorithms are entirely misguided. This is not the way.

By the way, politicians have a long and proud history of exempting themselves from unfavourable regulations. Insider trading laws virtually do not apply to U.S. congresspersons, even with regulations to ostensibly rein it in. In Canada, politicians excluded themselves from laws governing unsolicited communications by phone and email. Is it any wonder polls have shown declining trust in institutions for decades?

Kashmir Hill, New York Times:

For $29.99 a month, a website called PimEyes offers a potentially dangerous superpower from the world of science fiction: the ability to search for a face, finding obscure photos that would otherwise have been as safe as the proverbial needle in the vast digital haystack of the internet.

A search takes mere seconds. You upload a photo of a face, check a box agreeing to the terms of service and then get a grid of photos of faces deemed similar, with links to where they appear on the internet. The New York Times used PimEyes on the faces of a dozen Times journalists, with their consent, to test its powers.

PimEyes found photos of every person, some that the journalists had never seen before, even when they were wearing sunglasses or a mask, or their face was turned away from the camera, in the image used to conduct the search.

You do not even need to pay the $30 per month fee. You can test PimEyes’ abilities for free.

PimEyes disclaims responsibility for the results of its search tool through some ostensibly pro-privacy language. In a blog post published, according to metadata visible in the page source, one day before the Times’ investigation, it says its database “contains no personal information”, like someone’s name or contact details. The company says it does not even have any photos, storing only “faceprint” data and URLs where matching photos may be found.

Setting aside the question of whether a “faceprint” ought to be considered personal information — it is literally information about a person, so I think it should — perhaps you have spotted the sneaky argument PimEyes is attempting to make here. It can promote the security of its database and its resilience against theft all it wants, but its real privacy problems are created entirely through its front-end marketed features. If its technology works anywhere near as well as marketed, a search will lead to webpages that do contain the person’s name and contact details.

PimEyes shares the problem found with any of these people-finding tools, no matter their source material: they do not seem dangerous in isolation, but their power lies in the ability to coalesce and correlate different data points into a complete profile. Take a picture of anyone, then dump it into PimEyes to find their name and, perhaps, a username or email address correlated with the image. Use a different people-based search engine to find profiles across the web that share the same online handle, or accounts registered with that email address. Each of those searches will undoubtedly lead to greater pools of information, and all of this is perfectly legal. The only way to avoid being a subject is to submit an opt-out request to services that offer it. Otherwise, if you exist online in any capacity, you are a token in this industry.

Hill:

PimEyes users are supposed to search only for their own faces or for the faces of people who have consented, Mr. Gobronidze said. But he said he was relying on people to act “ethically,” offering little protection against the technology’s erosion of the long-held ability to stay anonymous in a crowd. PimEyes has no controls in place to prevent users from searching for a face that is not their own, and suggests a user pay a hefty fee to keep damaging photos from an ill-considered night from following him or her forever.

This is such transparent bullshit. Gobronidze has to know that not everybody using the service is searching for pictures of themselves or of people who have consented. As Hill later writes, it requires more stringent validation of a request to opt out of its results than it does a request to search.

Update: On July 16, Mara Hvistendahl of the Intercept reported on a particularly disturbing use of PimEyes:

The online facial recognition search engine PimEyes allows anyone to search for images of children scraped from across the internet, raising a host of alarming possible uses, an Intercept investigation has found.

It would be more acceptable if this service were usable only by a photo subject or their parent or guardian. As it is, PimEyes stands by its refusal to gate image searches, permitting any creep to search for images of anyone else through facial recognition.