Fifty Million Facebook Profiles Harvested for Cambridge Analytica theguardian.com

Matthew Rosenberg, Nicholas Confessore, and Carole Cadwalladr, New York Times:

[Cambridge Analytica] had secured a $15 million investment from Robert Mercer, the wealthy Republican donor, and wooed his political adviser, Stephen K. Bannon, with the promise of tools that could identify the personalities of American voters and influence their behavior. But it did not have the data to make its new products work.

So the firm harvested private information from the Facebook profiles of more than 50 million users without their permission, according to former Cambridge employees, associates and documents, making it one of the largest data leaks in the social network’s history. The breach allowed the company to exploit the private social media activity of a huge swath of the American electorate, developing techniques that underpinned its work on President Trump’s campaign in 2016.

Carole Cadwalladr and Emma Graham-Harrison, the Guardian:

The data was collected through an app called thisisyourdigitallife, built by academic Aleksandr Kogan, separately from his work at Cambridge University. Through his company Global Science Research (GSR), in collaboration with Cambridge Analytica, hundreds of thousands of users were paid to take a personality test and agreed to have their data collected for academic use.

However, the app also collected the information of the test-takers’ Facebook friends, leading to the accumulation of a data pool tens of millions-strong. Facebook’s “platform policy” allowed only collection of friends’ data to improve user experience in the app and barred it being sold on or used for advertising. The discovery of the unprecedented data harvesting, and the use to which it was put, raises urgent new questions about Facebook’s role in targeting voters in the US presidential election. It comes only weeks after indictments of 13 Russians by the special counsel Robert Mueller which stated they had used the platform to perpetrate “information warfare” against the US.

Both the Times and the Guardian describe this as a “data breach”, but I don’t think that accurately describes what went on here. When I hear “data breach”, I think of a stolen password or a hacked system. But Facebook VP Andrew Bosworth tweeted that nothing was stolen — users willingly gave their information to an app, which went behind their backs to use that information in a sketchy way they did not expect.

Which, when you think about it, is kind of Facebook’s business model. Maciej Cegłowski:

The data that Facebook leaked to Cambridge Analytica is the same data Facebook retains on everyone and sells targeting services around. The problem is not shady Russian researchers; it’s Facebook’s core business model of collect, store, analyze, exploit.

Facebook preempted the publication of both of these stories with a press release indicating that they’ve suspended Strategic Communications Laboratories — Cambridge Analytica’s parent — from accessing Facebook, including the properties of any of their clients.

However, the reason for that suspension is not what you may think: it isn’t because Kogan, the developer of the thisisyourdigitallife app, passed information to Cambridge Analytica, but rather because he did not delete all of the data after Facebook told him to.

Also, from that press release:

We are constantly working to improve the safety and experience of everyone on Facebook. In the past five years, we have made significant improvements in our ability to detect and prevent violations by app developers. Now all apps requesting detailed user information go through our App Review process, which requires developers to justify the data they’re looking to collect and how they’re going to use it – before they’re allowed to even ask people for it.

Of course, this kind of review process doesn’t exist for new projects created by Facebook itself, beyond the company’s blanket privacy policy.1 When Facebook starts analyzing user photos for facial recognition purposes without telling users first, that’s a similar violation of expectations and trust.

Marco Rogers:

Today, Facebook execs are going out of their way to let us know that this is the intended purpose of the platform. This isn’t unexpected. This is why they built it. They just didn’t expect to be held accountable.

Facebook can make all the policy changes it likes, but I don’t see any reason why something like this can’t happen again at some point in the future. Something will slip through the cracks, with unintended consequences, because third-party companies have extraordinary access to one of the largest databases of people anywhere.

Facebook is more than happy to collect the world’s information, but it is clear to me that they have no intention of taking full responsibility for what that entails.


  1. Which users often don’t understand the implications of before accepting. ↥︎