So today I came across Glaze, the first attempt I've seen at a publicly accessible anti-generative-model trap-tool:
https://glaze.cs.uchicago.edu/
The principle is that it makes very subtle tweaks to an image, hard for humans to detect, that get AI models to read it as stylistically very different from how we see it: using the difference between model perception and human perception to keep the base image intact while really confusing the model. Glaze mainly messes with the stylistics, to make it harder to serve "X in the style of Y" type requests in systems like Stable Diffusion and Midjourney.
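Glaze's exact recipe isn't spelled out here, but the broad shape of this kind of perturbation is well known from the adversarial-examples literature. Here's a minimal sketch of the general idea, not Glaze's actual algorithm: it uses VGG-16 Gram matrices as a stand-in for whatever style representation the target generative model really uses, and the function and parameter names are mine, not Glaze's.

```python
# Minimal "style cloak" sketch, NOT Glaze's actual algorithm: nudge an
# image so a feature extractor reads its style as something else, while
# keeping the pixel change too small to notice. VGG-16 Gram matrices
# stand in for the real model's style features; input normalisation is
# omitted for brevity.
import torch
import torch.nn.functional as F
import torchvision.models as models

features = models.vgg16(weights=models.VGG16_Weights.DEFAULT).features[:16].eval()
for p in features.parameters():
    p.requires_grad_(False)

def gram(x):
    # Channel-correlation (Gram) matrix of feature maps: the classic
    # "style" representation from neural style transfer.
    b, c, h, w = x.shape
    f = x.view(b, c, h * w)
    return f @ f.transpose(1, 2) / (c * h * w)

def cloak(image, style_target, eps=4 / 255, steps=100, lr=1 / 255):
    """Push image's style features toward style_target's, keeping the
    perturbation inside an L-infinity ball of radius eps (PGD-style)."""
    delta = torch.zeros_like(image, requires_grad=True)
    target = gram(features(style_target)).detach()
    for _ in range(steps):
        loss = F.mse_loss(gram(features(image + delta)), target)
        loss.backward()
        with torch.no_grad():
            delta -= lr * delta.grad.sign()                    # step toward the target style
            delta.clamp_(-eps, eps)                            # imperceptibility constraint
            delta.copy_((image + delta).clamp(0, 1) - image)   # keep pixels valid
        delta.grad.zero_()
    return (image + delta).detach()
```

The real trick, per the quote below, is doing this against the generative model's own notion of style rather than a hand-picked proxy, and choosing a style_target far enough away that "in the style of Y" prompts land somewhere else.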
I think we're going to see a ton more of this stuff for style protection, but also similar techniques for "gaming" LLM and image-model outputs (once people work out how to maximise product placement from these models, someone will make an advertising fortune). And there'll be outright "malware" inputs that try to poison the training set, or get the AI to replicate damaging code, or whatever. There are some big vulnerabilities and possibilities in this area that haven't really been considered yet, because the tech is so new that people are only just reacting to it.
A quote from Prof. Ben Zhao, who worked on Glaze, via TechCrunch:
What we do is we try to understand how the AI model perceives its own version of what artistic style is. And then we basically work in that dimension — to distort what the model sees as a particular style. So it’s not so much that there’s a hidden message or blocking of anything… It is, basically, learning how to speak the language of the machine learning model, and using its own language — distorting what it sees of the art images in such a way that it actually has a minimal impact on how humans see. And it turns out because these two worlds are so different, we can actually achieve both significant distortion in the machine learning perspective, with minimal distortion in the visual perspective that we have as humans.
This comes from a fundamental gap between how AI perceives the world and how we perceive the world. This fundamental gap has been known for ages. It is not something that is new. It is not something that can be easily removed or avoided. It’s the reason that we have a task called ‘adversarial examples’ against machine learning. And people have been trying to fix that — defend against these things — for close to 10 years now, with very limited success. This gap between how we see the world and how AI model sees the world, using mathematical representation, seems to be fundamental and unavoidable… What we’re actually doing — in pure technical terms — is an attack, not a defence. But we’re using it as a defence.
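The "adversarial examples" task Zhao mentions is the textbook form of this gap: a tiny, gradient-guided pixel perturbation that flips a classifier's output while looking unchanged to a human. The classic one-step version (Goodfellow et al.'s fast gradient sign method) fits in a few lines; ResNet-18 here is just a convenient stand-in classifier:

```python
# Fast gradient sign method (FGSM): a single gradient-sign step is often
# enough to change a classifier's prediction, with a perturbation humans
# can't see. ResNet-18 is a stand-in; any differentiable model works.
import torch
import torch.nn.functional as F
import torchvision.models as models

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT).eval()

def fgsm(image, true_label, eps=2 / 255):
    """Return image perturbed to increase the loss on its true label,
    with each pixel moved by at most eps."""
    image = image.clone().requires_grad_(True)
    loss = F.cross_entropy(model(image), true_label)
    loss.backward()
    adv = image + eps * image.grad.sign()  # step that most increases the loss
    return adv.clamp(0, 1).detach()
```

Glaze's move, as Zhao says, is to take exactly this attack machinery and point it at style features instead of class labels, wielded as a defence.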