While it’s fun to imitate the works of long-gone artists like Van Gogh or Frida Kahlo with an AI generator, the same is not true for living artists. Allowing just about anyone to create an image “in the style of” an artist can flood that artist’s market with knockoffs, adding confusion around authenticity and potentially costing them income.
In response, artists are looking for ways to keep their images out of AI art generators. None of the options is perfect or foolproof, but there are steps you can take to defend your work.
How AI Generators Get Your Images
AI art generators go through a period of “training” to learn how to produce an image when given a text prompt. During training, a model studies hundreds of millions of image-text pairs so that it can eventually generate accurate images of real-world objects, colors, and scenes, as well as reproduce art techniques and styles.
As it happens, AI models have to learn from the creativity of humans. For example, Midjourney and Stable Diffusion are two AI art generators trained on the open-source LAION-5B dataset, which indexes billions of images from across the internet.
The people who build these datasets use web crawlers to “scrape” websites for data, compiling lists of image URLs and their captions in something that might resemble a massive Excel spreadsheet. If you have posted your art online, it might be in an image dataset and may therefore have been used to train AI, whether or not you consented.
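To make the “massive spreadsheet” idea concrete, here is a minimal Python sketch of what looking for your own work in such a list could involve. The sample rows, the domain example-artist.com, and the function name find_my_images are all made up for illustration; real datasets are far larger and are usually searched through tools such as Have I Been Trained? rather than by hand.

```python
import csv
import io
from urllib.parse import urlparse

# Hypothetical sample rows: each row pairs an image URL with its caption.
SAMPLE_INDEX = """url,caption
https://example-artist.com/gallery/fox.jpg,Watercolor fox in a snowy forest
https://cdn.example-shop.net/img/123.png,Red running shoes on white background
https://www.example-artist.com/prints/owl.jpg,Ink drawing of an owl at night
"""


def find_my_images(index_csv, my_domain):
    """Return the rows whose image is hosted on my_domain or one of its subdomains."""
    matches = []
    for row in csv.DictReader(io.StringIO(index_csv)):
        host = urlparse(row["url"]).hostname or ""
        if host == my_domain or host.endswith("." + my_domain):
            matches.append(row)
    return matches


for row in find_my_images(SAMPLE_INDEX, "example-artist.com"):
    print(row["url"], "|", row["caption"])
```

The point of the sketch is that a dataset entry is only a link and a caption, so any image that is publicly reachable at a URL can end up listed.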
1. Opt Out of AI Training Datasets
Spawning is a group of artists whose popular website, Have I Been Trained?, can be used to see if your images are in the LAION-5B dataset. Taking it upon themselves, they later added a function to opt out of the dataset. Under an agreement, Spawning will pass on user opt-out lists to LAION, which has said that it will honor the requests and remove those images from its collection.
The opt-out tool by Spawning still requires some development since, at the time of writing, you can’t add multiple images at once. Nor are there opt-out agreements with any other dataset that might be used to train AI models.
Since many AI companies don’t disclose the finer details about how their AI models are built, it’s sometimes not clear what dataset they are using. DALL-E is one popular AI art generator that doesn’t share this information.
Alternatively, if you use DeviantArt to share your artwork, your images are now flagged by default as off-limits for AI training datasets. It works by tagging your image with a “noai” directive, meaning that anyone who uses a tagged image to train an AI model is violating DeviantArt’s Terms of Service.
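As a rough illustration of how a tag like “noai” can be read by software, the Python sketch below scans a page’s HTML for a robots meta tag and checks whether it lists “noai” or “noimageai”. The exact markup is an assumption made for this example, and the class and function names are hypothetical; a crawler still has to choose to perform a check like this.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives listed in <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for part in attr.get("content", "").split(","):
                self.directives.add(part.strip().lower())


def page_opts_out_of_ai(html):
    """Return True if the page carries a noai or noimageai directive."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return bool({"noai", "noimageai"} & parser.directives)


# Assumed example markup, not copied from any real page.
tagged_page = '<html><head><meta name="robots" content="noai, noimageai"></head></html>'
print(page_opts_out_of_ai(tagged_page))
```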
Of course, these opt-out measures are not enforced, so third parties can ignore them if they choose. While this isn’t the most effective solution, it is leading the way for more rules and regulations to protect artists’ work.
In an ideal world, people would be given the option to opt in, as opposed to having to opt out. We hope to see this happen in the future. For more details about how to remove your images from datasets, see our guide on how to opt out your images from AI training.
2. Copyright Your Work
Whether scraping images from the internet is legal is now a question before the courts, and the copyright that artists hold in their images is a key piece of evidence on their side.
At the start of 2023, the well-known comic artist Sarah Andersen was part of a group of artists who brought a lawsuit against the AI companies Stability AI and Midjourney, as well as the art-sharing website DeviantArt, for scraping their artwork without consent, along with the art of millions of other artists.
Another example is the stock image website Getty Images, which filed a lawsuit against Stability AI for scraping its images without a license. Getty discovered that its copyrighted images were being used when AI-generated images started showing up with the Getty Images watermark, a pretty obvious giveaway.
While the legality will be determined in due course, copyright is one of the few tools that can be used to fight for artists’ rights, as seen in the cases we mentioned. Copyright law may not be up to date with AI technology, but it can add to your defense moving forward.
It’s a practice that is well worth learning about anyway so you can protect your work from being stolen, whether or not AI is involved. Follow our guide on how to copyright your photos for an in-depth look at how it all works.
3. Block Website Crawlers With Robots.txt
Image datasets are only able to index a large number of images because they use something called web crawlers. As the name might suggest, they crawl across websites in search of particular information.
Some crawlers are useful and help search engines like Google find and index the most relevant information to display on their search results pages. Others are used to scrape websites for images to include in AI training datasets.
That’s where robots.txt comes in. Robots.txt is a plain text file placed in the root directory of a website that tells web crawlers which pages and files they may and may not access. You can use it to ask a crawler to stay away from certain pages or files, which is helpful if you don’t want your images to be used by AI. Keep in mind that compliance is voluntary: well-behaved crawlers follow the rules, but the file cannot force them to.
If you want to know more, read our guide on what a web crawler is and how it works. For those who have a website, ask your web developer to add a robots.txt file to your site’s root directory to discourage crawlers from scraping your images for AI training datasets.
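To show what such a file can look like, the Python sketch below holds a small example robots.txt and uses the standard library’s urllib.robotparser to check how a rule-abiding crawler would read it. CCBot, the user agent published by Common Crawl, is used here only as an example of a crawler you might want to turn away; the domain, the paths, and the helper name is_allowed are assumptions for illustration, and you should check each crawler’s own documentation for its current user-agent name.

```python
from urllib.robotparser import RobotFileParser

# Example robots.txt contents (hypothetical rules for illustration).
# The first group turns one named crawler away from the whole site;
# the second lets every other crawler in, except for /private/.
ROBOTS_TXT = """\
User-agent: CCBot
Disallow: /

User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(user_agent, url):
    """Return True if the rules above permit user_agent to fetch url."""
    return parser.can_fetch(user_agent, url)


image_url = "https://example-artist.com/gallery/fox.jpg"
print("CCBot:", is_allowed("CCBot", image_url))
print("Googlebot:", is_allowed("Googlebot", image_url))
```

Listing a specific crawler before the catch-all group lets you keep your site visible to search engines while asking dataset crawlers to stay out.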
What to Expect in the Future
It’s frustrating having to compete with AI models, but more solutions are on their way.
On the one hand, court proceedings are in the process of working out what is legal and how copyright works with AI image generation. The outcome of these public debates will set legal standards, and possibly prompt regulations, that AI companies have to follow.
On the other hand, developers are thinking of how to solve the problem using new technology. In one promising study, researchers showed that AI can be turned against itself by making an image “unlearnable” for the models that train on it.
As Dr. Sarah Monazam Erfani at the University of Melbourne explains: “We have devised a machine learning-based technique that identifies and changes just enough pixels in an image to confuse AI and turn it into an ‘unlearnable’ image. The change is very small and imperceptible to human eyes, but it introduces enough ‘noise’ into an image to make it useless for training AI.”
If you are affected by AI image generators, it’s worth making your voice heard so that these companies are pressured into changing their practices. It was only because of strong feedback from the DeviantArt community that a new opt-out preference was created, so make sure to give feedback to the art-sharing platforms and AI companies.
Defending Your Images From AI
You can protect your images from AI art generators by opting out of AI training datasets, copyrighting your pictures, and using the Robots.txt standard. While it won’t guarantee that your images stay out of AI systems, using all three methods will give you the best defense until more solutions are developed.
New tools are on their way, including ways to imperceptibly adjust your image so that AI can’t learn from it, making it useless for training AI art generators. In the meantime, don’t give up. There are still ways to protect your images from AI art generators.