Oh, the Places Your Apple ID Will Go

Here is a short and curious Twitter thread from app developers and security researchers Tommy Mysk and Talal Haj Bakry:

Apple’s analytics data include an ID called “dsId”. We were able to verify that “dsId” is the “Directory Services Identifier”, an ID that uniquely identifies an iCloud account. Meaning, Apple’s analytics can personally identify you.

Apple states in their Device Analytics & Privacy statement that the collected data does not identify you personally. This is inaccurate. We also showed earlier that the #AppStore keeps sending detailed analytics to Apple even when sharing analytics is switched off.

Apple also refers to the DSID by other names, such as the “Apple User Account Identifier”, “Apple ID Number”, “Apple ID Reference Number”, and “Original Unique Identifier”. Based on my 2021 data request, it is, as described, a proxy for a specific Apple ID. It identifies you with Apple’s services, including for things like marketing and communications efforts. I have a spreadsheet of the nearly nine hundred times my DSID and I ignored Apple’s attempts to upsell me on Apple One, a service which launched just thirteen months before I made this data request. I also have a list of all the times I contacted AppleCare, and the same identifier is attached. In most, but not all, instances, this numeric identifier is the only personal identification entirely without redaction. In my records from Apple, my name, email address, Apple ID and aliases, and phone number are only shown in part.

I am not surprised Apple assigns a personal identifier for its services; Mysk and Bakry say they found the same identifier in analytics logs for the App Store, Apple Music, and other company services.[1] The researchers point to Apple’s Device Analytics & Privacy document, where the iOS Device Analytics section says that “[n]one of the collected information identifies you personally”. But that statement does not pertain to Apple’s services, which are covered by entirely different policies. The policies for both the App Store and Apple Music say usage information is collected. These are not device analytics; they are services analytics. How else are recommendations or search features supposed to work? If anything, I wish Apple used this information in even smarter ways: up until recently, a search for “Low” in Apple Music would always return several results related to the Flo Rida song first, which does not see any playback from me, instead of the band I often listen to. I wish those results were more tailored to my use of the service.

In fairness, perhaps the Device Analytics toggle in Settings should be worded more clearly to indicate that turning it off will not opt out of store and services activity. I am also shocked by the granularity of information in these storefront analytics. It is relevant to Apple’s recommendation engine if I listened to an album or song and whether I finished it, but it is hard to see what value there is in knowing my track playback to the millisecond. I also think the identifier used by Apple’s services should be different from the Apple ID that is correlated with your device purchase history and support requests.

Where I think things take a more concerning turn is in the logs Apple collects alongside bug reports and crashes. If I am reading the Device Analytics policy correctly, these would fall under a category of logged personal data which “is subject to privacy preserving techniques such as differential privacy, or is removed from any reports before they’re sent to Apple”. However, I am not sure that is strictly true. I downloaded the copy hosted by Apple of a sysdiagnose package sent by my MacBook Pro (which does not have a beta profile installed and is running a public non-beta version of macOS) and found my identifier in three files. If these are in the copy I downloaded from Feedback Assistant, Apple has copies of these three files, all of which are associated with iCloud features. Because that identifier is also used in some iCloud API requests, I also spotted the same value in activity logs for third-party applications using things in my iCloud account, as well as in metadata for local copies of documents I downloaded from my drive at iCloud.com. However, I did not see this identifier in any other diagnostic reports, usage logs, or other analytics on my Mac.[2]
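For anyone who wants to repeat this kind of check on their own files, here is a minimal sketch of the search in Python. It assumes the sysdiagnose archive has already been unpacked into a folder and that you know your own identifier from a data request. The function name, the size cap, and the example path and identifier in the comments are all hypothetical and mine, not anything Apple provides.

```python
import os


def find_identifier(root, identifier, max_bytes=50_000_000):
    """Return sorted relative paths of files under `root` that contain
    `identifier` as a plain run of bytes.

    Assumption: the identifier appears as readable text. Occurrences inside
    compressed files, or stored as binary integers, will be missed.
    """
    needle = identifier.encode("utf-8")
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # Skip very large files rather than reading them into memory.
                if os.path.getsize(path) > max_bytes:
                    continue
                with open(path, "rb") as handle:
                    if needle in handle.read():
                        hits.append(os.path.relpath(path, root))
            except OSError:
                # Unreadable files (permissions, broken links) are ignored.
                continue
    return sorted(hits)


# Hypothetical usage, with a made-up folder and identifier:
# for match in find_identifier("/tmp/sysdiagnose_unpacked", "1234567890"):
#     print(match)
```

A plain byte search like this can only confirm that an identifier is present in readable form; finding nothing does not prove it is absent, since some logs are compressed or encoded.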

I may be getting something wildly wrong here, but I am not sure I see the presence of this Apple ID proxy in Apple’s services logs as a violation of either its own policies or users’ expectations for using internet services in general. Its highly granular analytics are more comprehensive than I think many people would believe is necessary, to the extent that they violate the spirit of what Apple professes to stand for, and it would be better if this identifier were sandboxed to avoid any association with real-world activity like service requests. I do not think it is news that device analytics are not the same as services analytics, certainly not to the extent that it justifies a lawsuit.

But there is a quirk that interests me: does Apple continue to view the iPhone as a device with a unified and interconnected set of hardware, software, and services it controls at a platform level? While it is possible to use an iPhone without an Apple ID, it is not possible to use the App Store without one, and installing software outside of the App Store is officially not possible. Because of DRM, it is also not possible to sign into the App Store for the purpose of downloading a third-party app, then sign out of an Apple ID and be able to use that app. Apple may not strictly be associating someone’s use of an iPhone with a personal identifier, but it is extremely limiting to use an iPhone’s features without being associated with that identifier. A wall between these aspects may be overprotective, but overprotective is how Apple markets itself.

A good question is whether Apple violated privacy laws like GDPR with the use of this identifier combined with the description of the device analytics opt-out. An answer to that question is well outside my expertise.

  1. As an aside, and I do not intend this to be mean, I think it is a little funny how Gizmodo described the way Mysk and Bakry gathered information on the analytics Apple collects: “they used a jail broken iPhone running iOS 14.6, which allowed them to decrypt the traffic and examine exactly what data was being sent”, and they “also examined a regular iPhone running iOS 16”. It makes it sound like this information is impossible to find without some laborious and technical work.

    But this same information appears to be available if you just ask for it. I have some giant spreadsheets here containing all sorts of analytics about my activity in the App Store, Apple Music, Apple Books, and other Apple services. Maybe I am missing something, but this does not strike me as a massive secret if it is something Apple will hand over if you simply ask.

    Collecting this information at the device or network level may not be telling the whole story. Apple says it adds layers of randomization upon receipt of the data, before it or its products are made available internally.

  2. My iPhone is running the latest beta seed of iOS, so I assumed it would collect more information. A spot check of a few analytics and usage files did not turn up my identifier, but I would not draw general conclusions about iOS from beta builds.