How Artists are Sabotaging AI to Take Revenge on Image Generators (theconversation.com) 25

Some text-to-image generators "have been trained by indiscriminately scraping online images," reports the Conversation, "many of which may be under copyright.

"Researchers who want to empower individual artists have recently created a tool named 'Nightshade' to fight back against unauthorised image scraping." The tool works by subtly altering an image's pixels in a way that wreaks havoc to computer vision but leaves the image unaltered to a human's eyes.... This can result in the algorithm mistakenly learning to classify an image as something a human would visually know to be untrue. As a result, the generator can start returning unpredictable and unintended results... [A] balloon might become an egg. A request for an image in the style of Monet might instead return an image in the style of Picasso... The models could also introduce other odd and illogical features to images — think six-legged dogs or deformed couches. The higher the number of "poisoned" images in the training data, the greater the disruption.

Because of how generative AI works, the damage from "poisoned" images also affects related prompt keywords. For example, if a "poisoned" image of a Ferrari is used in training data, prompt results for other car brands and for other related terms, such as vehicle and automobile, can also be affected. Nightshade's developer hopes the tool will make big tech companies more respectful of copyright, but it's also possible users could abuse the tool and intentionally upload "poisoned" images to generators to try and disrupt their services... [Technological fixes] include the use of "ensemble modeling" where different models are trained on many different subsets of data and compared to locate specific outliers. This approach can be used not only for training but also to detect and discard suspected "poisoned" images. Audits are another option. One audit approach involves developing a "test battery" — a small, highly curated, and well-labelled dataset — using "hold-out" data that are never used for training. This dataset can then be used to examine the model's accuracy.
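For the curious, here is a minimal Python sketch of the two defences described above: the ensemble outlier check and the hold-out "test battery" audit. The train_model and predict callables and the 0.4 agreement threshold are hypothetical placeholders rather than anything from the article; treat it as a sketch of the idea, not a working pipeline.

import random

def ensemble_outlier_check(images, labels, train_model, predict, n_models=5):
    """Train several models on random halves of the data and flag images
    whose own label most of the ensemble disagrees with (possible poison)."""
    data = list(zip(images, labels))
    models = [train_model(random.sample(data, k=len(data) // 2))
              for _ in range(n_models)]
    suspects = []
    for img, lbl in data:
        votes = [predict(m, img) for m in models]
        if votes.count(lbl) / len(votes) < 0.4:   # agreement threshold is invented
            suspects.append(img)
    return suspects

def audit_with_test_battery(model, battery, predict):
    """Score a trained model on a small, hand-curated hold-out set
    that was never used for training (the 'test battery' audit)."""
    correct = sum(predict(model, img) == lbl for img, lbl in battery)
    return correct / len(battery)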

The article adds that the most obvious fix "is paying greater attention to where input data are coming from and how they can be used.

"Doing so would result in less indiscriminate data harvesting. This approach does challenge a common belief among computer scientists: that data found online can be used for any purpose they see fit."
  • by alvinrod ( 889928 ) on Sunday December 17, 2023 @09:04PM (#64088003)
    Maybe that will be a short term effect, but I think it will just lead the model to a better approximation of human vision in the long run.

    Also they're missing a golden opportunity to poison the AI so that it draws dicks on everything.
    • by Bahbus ( 1180627 )

      Maybe that will be a short term effect, but I think it will just lead the model to a better approximation of human vision in the long run.

      This is precisely what I was thinking.

    • ...the tool will make big tech companies more respectful of copyright, but it's also possible users could abuse the tool and intentionally upload "poisoned" images to generators to try and disrupt their services

      Filed Under: No Shit, Sherlock.

    • Re: (Score:3, Insightful)

      by rudy_wayne ( 414635 )

      The tool works by subtly altering an image's pixels in a way that wreaks havoc on computer vision but leaves the image looking unaltered to the human eye

      That sounds suspiciously like marketing bullshit.

      Also they're missing a golden opportunity to poison the AI so that it draws dicks on everything.

      +1 Funny.

  • by Rei ( 128717 ) on Sunday December 17, 2023 @09:18PM (#64088027) Homepage

    ... product, Glaze. It's a nice compute-intensive way to hurt the quality of your images while not having any impact on AIs whatsoever. Back with Glaze, there were people deliberately putting glazed images into their datasets for training custom models to see what would happen, and the answer is "absolutely nothing, no impact whatsoever". And Glaze could be removed with something like 16 lines of python code anyway.

    Even the developers admitted that whatever impact it was supposed to have broke with SDXL, because these systems are ridiculously fragile: they rely on the presumption that nothing about the model changes at all, and on the assumption that there's *some* difference between how the model is trained to represent human vision and the model of human vision that the tool itself uses, which is a flimsy assumption right off the bat.

    We'll just ignore that massive image datasets predating Glaze and Nightshade already exist, and no matter what you do today, it's not going to unexist previous datasets.

    And lastly, there are universal detection methods anyway: not ones that detect some specific attack, but ones that catch literally any attack that could possibly be conceived. Specifically, you train a baseline model on preexisting images, and then for each new image you check how much it skews the model's understanding of preexisting concepts. If some image tries to radically skew one or more concepts, it's probably bad (either intentionally or accidentally). Indeed, if you really wanted to, you could then turn that into a corrector tool that runs the similarity process in reverse until the image no longer heavily skews concepts, and from that get a universal deglazer.
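    A rough sketch of what that skew check could look like; embed_concepts and finetune_one are hypothetical stand-ins for whatever training stack is actually in use, and the rejection threshold is invented:

    import numpy as np

    def concept_skew(baseline, image, caption, embed_concepts, finetune_one):
        """Fine-tune a copy of the baseline on one candidate image and measure
        how far its embeddings of unrelated, preexisting concepts move.
        Large moves suggest the image is skewing concepts it shouldn't touch."""
        before = embed_concepts(baseline)              # {concept: vector}
        tuned = finetune_one(baseline, image, caption)
        after = embed_concepts(tuned)
        worst = 0.0
        for concept, v0 in before.items():
            v1 = after[concept]
            cos = np.dot(v0, v1) / (np.linalg.norm(v0) * np.linalg.norm(v1))
            worst = max(worst, 1.0 - cos)              # cosine distance
        return worst

    # e.g. reject the image when concept_skew(...) > 0.1  (threshold invented)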

    It's garbage. It's a feel-good way for people who hate AI to lower the quality of their images, but it doesn't actually do anything. All the people glazing their images did nothing to prevent SDXL. All the people using nightshade today will do nothing to prevent the next-generation models.

  • by danielcolchete ( 1088383 ) on Sunday December 17, 2023 @09:32PM (#64088043)

    Everyone will be buying their product based on this bullshit?

    AI training pipelines transform and compress images so heavily precisely because subtle pixel-level differences would otherwise wreck everything. And everyone encodes / compresses their images differently. The only way to create something that would confuse all AI models would be to create something that confuses humans as well.
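    As an illustration of the kind of rewrite the commenter is alluding to, here is a minimal Pillow sketch of the resize-and-re-encode step many ingestion pipelines apply before a model ever sees an image; whether it actually strips any particular perturbation is an empirical question:

    from io import BytesIO
    from PIL import Image

    def typical_preprocess(path, size=512, quality=85):
        """Resize and lossily re-encode an image, rewriting its pixel values
        the way a typical ingestion pipeline would."""
        img = Image.open(path).convert("RGB")
        img = img.resize((size, size))                 # default bicubic resample
        buf = BytesIO()
        img.save(buf, format="JPEG", quality=quality)  # lossy JPEG re-encode
        buf.seek(0)
        return Image.open(buf)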

    But, sure, let them market and profit on the fear that creative folks have of AI right now.

    Such bullshit.

  • I see some academic paper without any images. =/
    Where are the pics showing how this technique will wreak havoc?
    Just like that paper where they added an elephant to a room.

  • by jenningsthecat ( 1525947 ) on Sunday December 17, 2023 @10:36PM (#64088119)

    This was covered here about two months ago: https://slashdot.org/story/23/... [slashdot.org]

  • If they can really create such adversarial images as surgically as they claim, then perhaps they should be the ones we hire to do mechanistic interpretability, instead of Anthropic and all the people currently investing tens of millions of dollars in it.
  • Wonder what the effect on stuff like Google Image Search would be. Are they going for the description attached or do they actually "look" at the image?

    Anyway, I have a hunch we'll see a lot of funny effects from this.

  • ... for every asshat out there with a computer, there's another asshat out there with a computer. I applaud efforts to jam AI. That's the proper way to get those asshats to harden their systems. AI will be unfit for duty until then. To me, it's similar to entities who get breached and then build stronger gates.

  • You can't trust anything you see, because it may just be AI generated. But if AI also can't trust anything it learns, that is an equal and opposite win. This will happen anyway the more AI pisses in the pool of knowledge, but speeding the process up should still be good for the lolz.
  • Limited sympathy (Score:4, Informative)

    by bradley13 ( 1118935 ) on Monday December 18, 2023 @02:22AM (#64088323) Homepage

    If you put something online, where it can be viewed without a login, you can't complain when the public sees it.

    Whether that's some young artist copying your style, or an AI learning to generate images, makes zero difference. You put it out there; it was seen and used.

    • by BeerCat ( 685972 )

      Since many artists claim their images are being used without recompense, they may also find that image houses are less keen on purchasing an altered picture.

      Artist: "But why didn't you buy my latest picture? I spent hours on it!"

      House: "In the same way that we don't buy your rough drafts, or the one that got data mangled. You deliberately made your work worse than it could be, so we are not interested"

      • by Bahbus ( 1180627 )

        I'm sure the thought process would be to provide the unaltered image to those wishing to actually purchase it, while the altered, AI-poisoned one would be what is freely visible on the internet - essentially a watermarked version.

        But as stated by someone else, this tool is just as useless as watermarks. Watermarks can be easily removed, and current AI models already pretty much ignore these poison pills.

  • Unfiltered web data was the first generation; now they are doing quality filtering and rewriting captions where they don't match. It's not indiscriminate use.
  • Funny thing: I’ve been intentionally poisoning data collection on me for decades. I despise licorice; Amazon doesn’t think so. Geothermal currents in volcanoes? Who cares (not me). Even now my employer is pushing our support and R&D to embrace KCS so they can train the AI they hope to sell, but I’m not contributing; ChatGPT is.

    I’ve been promoting poisoning data for years and years just to mess things up for advertisers; it works well for that, and for the government as well.

  • We're only at the point where it's mostly the image creators who are concerned their images are used in the training. What happens when the corporations who produce the things get involved? The likenesses of many autos need licensing to be used in games, films, and in the production of replicas. Even armaments companies have objected to the military hardware they manufacture being depicted in games. How long until those companies decide images depicting their products must be removed from training data? How lo
