Reddit’s Partner Policies, Applicable to ‘A.I.’ Licensees, Prohibit Using Deleted Posts ⇥ redditinc.com

Reddit:

Our policy outlines the information partners can access via a public-content licensing agreement as well as the commitments we make to users about usage of this content. It takes into account feedback from a group of moderators we consulted when developing it:

We require our partners to uphold the privacy of redditors and their communities. This includes respecting users’ decisions to delete their content and any content we remove for violating our Content Policy.

This sounds like a good policy, but how does it work in practice? Is it really possible to disentangle someone’s deleted Reddit post from training data? Models that have already been trained on Reddit comments will not be retrained every time a post or account is deleted.

There are, it seems, some good protections in these policies, and I do not want to dump on them entirely. I just do not think it is fair to imply to users that their deleted posts cannot or will not be used in artificial intelligence models.