The Bullshit Web

My home computer in 1998 had a 56K modem connected to our telephone line; we were allowed a maximum of thirty minutes of computer usage a day, because my parents — quite reasonably — did not want to have their telephone shut off for an evening at a time. I remember webpages loading slowly: ten to twenty seconds for a basic news article.

At the time, a few of my friends were getting cable internet. It was remarkable seeing the same pages load in just a few seconds, and I remember thinking about the kinds of the possibilities that would open up as the web kept getting faster.

And faster it got, of course. When I moved into my own apartment several years ago, I got to pick my plan and chose a massive fifty megabit per second broadband connection, which I have since upgraded.

So, with an internet connection faster than I could have thought possible in the late 1990s, what’s the score now? A story at the Hill took over nine seconds to load; at Politico, seventeen seconds; at CNN, over thirty seconds. This is the bullshit web.

But first, a short parenthetical: I’ve been writing posts in both long- and short-form about this stuff for a while, but I wanted to bring many threads together into a single document that may pretentiously be described as a theory of or, more practically, a guide to the bullshit web.

A second parenthetical: when I use the word “bullshit” in this article, it isn’t in a profane sense. It is much closer to Harry Frankfurt’s definition in “On Bullshit”:

It is just this lack of connection to a concern with truth — this indifference to how things really are — that I regard as of the essence of bullshit.

I also intend it to be used in much the same sense as the way it is used in David Graeber’s “On the Phenomenon of Bullshit Jobs”:

In the year 1930, John Maynard Keynes predicted that, by century’s end, technology would have advanced sufficiently that countries like Great Britain or the United States would have achieved a 15-hour work week. There’s every reason to believe he was right. In technological terms, we are quite capable of this. And yet it didn’t happen. Instead, technology has been marshaled, if anything, to figure out ways to make us all work more. In order to achieve this, jobs have had to be created that are, effectively, pointless. Huge swathes of people, in Europe and North America in particular, spend their entire working lives performing tasks they secretly believe do not really need to be performed. The moral and spiritual damage that comes from this situation is profound. It is a scar across our collective soul. Yet virtually no one talks about it.

[…]

These are what I propose to call ‘bullshit jobs’.

What is the equivalent on the web, then?

1

The average internet connection in the United States is about six times as fast as it was just ten years ago, but instead of making it faster to browse the same types of websites, we’re simply occupying that extra bandwidth with more stuff. Some of this stuff is amazing: in 2006, Apple added movies to the iTunes Store that were 640 × 480 pixels, but you can now stream movies in HD resolution and (pretend) 4K. These much higher speeds also allow us to see more detailed photos, and that’s very nice.

But a lot of the stuff we’re seeing is a pile-up of garbage on seemingly every major website that does nothing to make visitors happier — if anything, much of this stuff is deeply irritating and morally indefensible.

Take that CNN article, for example. Here’s what it contained when I loaded it:

  • Eleven web fonts, totalling 414 KB

  • Four stylesheets, totalling 315 KB

  • Twenty frames

  • Twenty-nine XML HTTP requests, totalling about 500 KB

  • Approximately one hundred scripts, totalling several megabytes — though it’s hard to pin down the number and actual size because some of the scripts are “beacons” that load after the page is technically finished downloading.

The vast majority of these resources are not directly related to the information on the page, and I’m including advertising. Many of the scripts that were loaded are purely for surveillance purposes: self-hosted analytics, of which there are several examples; various third-party analytics firms like Salesforce, Chartbeat, and Optimizely; and social network sharing widgets. They churn through CPU cycles and cause my six-year-old computer to cry out in pain and fury. I’m not asking much of it; I have opened a text-based document on the web.

In addition, pretty much any CNN article page includes an autoplaying video, a tactic which has allowed them to brag about having the highest number of video starts in their category. I have no access to ComScore’s Media Metrix statistics, so I don’t know exactly how many of those millions of video starts were stopped instantly by either the visitor frantically pressing every button in the player until it goes away or just closing the tab in desperation, but I suspect it’s approximately every single one of them. People really hate autoplaying video.

Also, have you noticed just how many websites desperately want you to sign up for their newsletter? While this is prevalent on so many news and blog websites, I’ve dragged them enough in this piece so far, so I’ll mix it up a bit: this is also super popular with retailers. From Barnes & Noble to Aritzia, Fluevog to Linus Bicycles, these things are seemingly everywhere. Get a nominal coupon in exchange for being sent an email you won’t read every day until forever — I don’t think so.

Finally, there are a bunch of elements that have become something of a standard with modern website design that, while not offensively intrusive, are often unnecessary. I appreciate great typography, but web fonts still load pretty slowly and cause the text to reflow midway through reading the first paragraph. And then there are those gigantic full-width header images that dominate the top of every page, as though every two-hundred-word article on a news site was the equivalent of a magazine feature.

So that’s the tip of the bullshit web. You know how building wider roads doesn’t improve commute times, as it simply encourages people to drive more? It’s that, but with bytes and bandwidth instead of cars and lanes.

2

As Graeber observed in his essay and book, bullshit jobs tend to spawn other bullshit jobs for which the sole function is a dependence on the existence of more senior bullshit jobs:

And these numbers do not even reflect on all those people whose job is to provide administrative, technical, or security support for these industries, or for that matter the whole host of ancillary industries (dog-washers, all-night pizza delivery) that only exist because everyone else is spending so much of their time working in all the other ones.

So, too, is the case with the bullshit web. The combination of huge images that serve little additional purpose than decoration, several scripts that track how far you scroll on a page, and dozens of scripts that are advertising related means that text-based webpages are now obese and torpid and excreting a casual contempt for visitors.

Given the assumption that any additional bandwidth offered to web developers will immediately be consumed, there seems to be just one possible solution, which is to reduce the amount of bytes that are transmitted. For some bizarre reason, this hasn’t happened on the main web, because it somehow makes more sense to create an exact copy of every page on their site that is expressly designed for speed. Welcome back, WAP — except, for some reason, this mobile-centric copy is entirely dependent on yet more bytes. This is the dumbfoundingly dumb premise of AMP.

Launched in February 2016, AMP is a collection of standard HTML elements and AMP-specific elements on a special ostensibly-lightweight page that needs an 80 kilobyte JavaScript file to load correctly. Let me explain: HTML5 allows custom elements like AMP’s <amp-img>, but will render them as <span> elements without any additional direction — provided, in AMP’s case, by its mandatory JavaScript file. This large script is also required by the AMP spec to be hotlinked from cdn.amp-project.org, which is a Google-owned domain. That makes an AMP website dependent on Google to display its basic markup, which is super weird for a platform as open as the web.

That belies the reason AMP has taken off. It isn’t necessarily because AMP pages are better for users, though that’s absolutely a consideration, but because Google wants it to be popular. When you search Google for anything remotely related to current events, you’ll see only AMP pages in the news carousel that sits above typical search results. You’ll also see AMP links crowding the first results page, too. Google has openly admitted that they promote AMP pages in their results and that the carousel is restricted to only AMP links on their mobile results page. They insist that this is because AMP pages are faster and, therefore, better for users, but that’s not a complete explanation for three reasons: AMP pages aren’t inherently faster than non-AMP pages, high-performing non-AMP pages are not mixed with AMP versions, and Google has a conflict of interest in promoting the format.

It seems ridiculous to argue that AMP pages aren’t actually faster than their plain HTML counterparts because it’s so easy to see these pages are actually very fast. And there’s a good reason for that. It isn’t that there’s some sort of special sauce that is being done with the AMP format, or some brilliant piece of programmatic rearchitecting. No, it’s just because AMP restricts the kinds of elements that can be used on a page and severely limits the scripts that can be used. That means that webpages can’t be littered with arbitrary and numerous tracking and advertiser scripts, and that, of course, leads to a dramatically faster page. A series of experiments by Tim Kadlec showed the effect of these limitations:

AMP’s biggest advantage isn’t the library — you can beat that on your own. It isn’t the AMP cache — you can get many of those optimizations through a good build script, and all of them through a decent CDN provider. That’s not to say there aren’t some really smart things happening in the AMP JS library or the cache — there are. It’s just not what makes the biggest difference from a performance perspective.

AMP’s biggest advantage is the restrictions it draws on how much stuff you can cram into a single page.

[…]

AMP’s restrictions mean less stuff. It’s a concession publishers are willing to make in exchange for the enhanced distribution Google provides, but that they hesitate to make for their canonical versions.

So: if you have a reasonably fast host and don’t litter your page with scripts, you, too, can have AMP-like results without creating a copy of your site dependent on Google and their slow crawl to gain control over the infrastructure of the web. But you can’t get into Google’s special promoted slots for AMP websites for reasons that are almost certainly driven by self-interest.

3

There is a cumulative effect of bullshit; its depth and breadth is especially profound. In isolation, the few seconds that it takes to load some extra piece of surveillance JavaScript isn’t much. Neither is the time it takes for a user to hide an email subscription box, or pause an autoplaying video. But these actions compound on a single webpage, and then again across multiple websites, and those seemingly-small time increments become a swirling miasma of frustration and pain.

It’s key to recognize, though, that this is a choice, a responsibility, and — ultimately — a matter of respect. Let us return to Graeber’s explanation of bullshit jobs, and his observation that we often experience fifteen-hour work weeks while at the office for forty. Much of the same is true on the web: there is the capability for pages to load in a second or two, but it has instead been used to spy on users’ browsing habits, make them miserable, and inundate them on other websites and in their inbox.

As for Frankfurt’s definition — that the essence of bullshit is an indifference to the way things really are — that’s manifested in the hand-wavey treatment of the actual problems of the web in favour of dishonest pseudo-solutions like AMP.

An actual solution recognizes that this bullshit is inexcusable. It is making the web a cumulatively awful place to be. Behind closed doors, those in the advertising and marketing industry can be pretty lucid about how much they also hate surveillance scripts and how awful they find these methods, while simultaneously encouraging their use. Meanwhile, users are increasingly taking matters into their own hands — the use of ad blockers is rising across the board, many of which also block tracking scripts and other disrespectful behaviours. Users are making that choice.

They shouldn’t have to. Better choices should be made by web developers to not ship this bullshit in the first place. We wouldn’t tolerate such intrusive behaviour more generally; why are we expected to find it acceptable on the web?

An honest web is one in which the overwhelming majority of the code and assets downloaded to a user’s computer are used in a page’s visual presentation, with nearly all the remainder used to define the semantic structure and associated metadata on the page. Bullshit — in the form of CPU-sucking surveillance, unnecessarily-interruptive elements, and behaviours that nobody responsible for a website would themselves find appealing as a visitor — is unwelcome and intolerable.

Death to the bullshit web.