Adequate Red Teaming of A.I. Models Requires Better Policy

Riana Pfefferkorn, in an op-ed for the New York Times:

A.I. companies like xAI can and should do more not just to respond quickly and decisively when their models behave badly, but also to prevent them from generating such material in the first place. This means rigorously testing the models to learn how and why they can be manipulated into generating illegal sexual content — then closing those loopholes. But current laws don’t adequately protect good-intentioned testers from prosecution or correctly distinguish them from malicious users, which frightens companies from taking this kind of action.

To the extent A.I. companies are truly “red teaming” their models at all (the industry has misused the term, often outsourcing the work to contractors in developing nations), current laws limit how far that testing can go. On this I agree with Pfefferkorn, and I think she is right to call for a change in policy.

But I am not convinced xAI is much interested in making its model safer. CSAM is obviously over the line, and I would be surprised if anyone there were to defend Grok on that. Most anything else, though, I suspect xAI would consider permissible, if perhaps unseemly, for Grok to generate. Remember: Grok is supposed to be “unfiltered”. Does it offend you? Because it should, buddy. That is the freedom you get when you look at the world through the lens of a mall-grade edgelord who will be turning 55 years old in June.