University of Chicago Researchers Have Found a Way to ‘Poison’ Training Data for AI

AI companies using publicly available data without consent has been an ongoing issue. Many individuals and companies have filed copyright infringement lawsuits, but most owners of the images being used still have no practical recourse. With the new tool developed by researchers, they might not need to worry about that anymore.

(Image: Corrupted code. Credit: Getty Images)

'Nightshade' Poisoning Technique

Researchers from the University of Chicago have created a tool that gives artists and image owners the ability to protect their work from AI companies that use it to train AI models, especially popular generative AI tools like Midjourney, DALL-E 3, and Stable Diffusion.

What the university calls a "poison pill" is an open-source tool that makes changes to an image that corrupt an AI model's training process, all of it invisible to the naked eye, as reported by Ars Technica.

Co-author of the research paper and university professor Ben Y. Zhao says that the point of the tool is to "balance the playing field between model trainers and content creators." Since there's not much that artists can do right now to fight back, AI companies hold all the cards.

"The only tools that can slow down crawlers are opt-out lists and do-not-crawl directives, all of which are optional and rely on the conscience of AI companies, and of course none of it is verifiable or enforceable and companies can say one thing and do another with impunity," says Zhao.

With the tool, content creators can deter AI model trainers from scraping work they have not licensed or compensated the owners for, since doing so will now disrupt the training process itself as a consequence.

The way it works is that the tool alters images of, for instance, a dog, in a way that makes an AI model trained on them generate a cat even when the prompt asks for a canine. The image is subtly modified so it appears unchanged to the human eye, but in the model's embedding space it encodes a different concept.
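To illustrate the idea in broad strokes, here is a minimal sketch, not the researchers' actual Nightshade implementation, of how such a concept-shifting perturbation could be computed. It assumes PyTorch, uses a torchvision ResNet-50 as a stand-in image encoder, and the `poison` function, its parameters, and the encoder choice are all illustrative assumptions:

```python
import torch
import torchvision.models as models

# Stand-in feature extractor (an assumption for this sketch); the real attack
# targets the encoders actually used by text-to-image models.
encoder = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
encoder.fc = torch.nn.Identity()  # use penultimate features as the embedding
encoder.eval()

def poison(dog_img, cat_img, eps=0.03, steps=200, lr=0.01):
    """Nudge dog_img (a 3xHxW float tensor in [0, 1]) so its embedding moves
    toward cat_img's embedding, while each pixel stays within +/- eps of the
    original so the change is nearly invisible. Illustrative only; input
    normalization and perceptual constraints are omitted for brevity."""
    with torch.no_grad():
        target = encoder(cat_img.unsqueeze(0))  # anchor concept: "cat"
    delta = torch.zeros_like(dog_img, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        z = encoder((dog_img + delta).clamp(0, 1).unsqueeze(0))
        loss = torch.nn.functional.mse_loss(z, target)  # pull toward "cat"
        opt.zero_grad()
        loss.backward()
        opt.step()
        with torch.no_grad():
            delta.clamp_(-eps, eps)  # keep the perturbation imperceptible
    return (dog_img + delta).clamp(0, 1).detach()
```

The actual technique is considerably more sophisticated, but the core ingredients are the same as in this sketch: a small, bounded pixel change, and an embedding pushed toward a different concept than the one a human sees.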

Researchers tried feeding "poisoned" images to the open-source text-to-image generator Stable Diffusion. After 50 poisoned images, the dogs it generated began to appear distorted, and after 100, cats started to appear instead.

This Could Be a Significant Innovation

AI companies have both the legal resources and the money to stay afloat despite the copyright infringement lawsuits thrown their way, and even if a few artists are compensated afterward, many more whose work is infringed lack the means to do anything about it.

Government agencies have yet to come up with concrete policies and regulations that would do content creators and artists justice. With this tool, anyone can easily mark their images and artwork without distorting their appearance.

The best part is that it will not affect every generative AI tool on the market, which means we will still have access to convenient tools whose models are trained legally. Only the ones illegally scraping data from the internet will be affected.
