AI

Shutterstock Is Removing AI-Generated Images

Shutterstock appears to be removing images generated by AI systems like DALL-E and Midjourney. Motherboard reports: On Shutterstock, searches for images tagged "Midjourney" yielded several photos with the AI tool's unmistakable aesthetic, with many having high popularity scores and marked as "frequently used." But late Monday, the results for "Midjourney" seem to have been reduced, leaving mainly stock photos of the tool's logo. Other images use tags like "AI generated" -- one image, for example, is an illustration of a futuristic building with an image description reading "Ai generated illustration of futuristic Art Deco city, vintage image, retro poster." The image is part of a collection the artist titled "Midjourney," which has since been removed from the site. Other images marked "AI generated," like this burning medieval castle, seem to remain up on the site.

As Ars Technica notes, neither Shutterstock nor Getty Images explicitly prohibits AI-generated images in their terms of service, and Shutterstock users typically make around 15 to 40 percent of what the company makes when it sells an image. Some creators have not taken kindly to this trend, pointing out that these systems use massive datasets of images scraped from the web. [...] In other words, the generated works are the result of an algorithmic process which mines original art from the internet without credit or compensation to the original artists. Others have worried about the impacts on independent artists who work for commissions, since the ability for anyone to create custom generated artwork potentially means lost revenue.

  • by Rei ( 128717 ) on Tuesday September 20, 2022 @08:49PM (#62899959) Homepage

    an algorithmic process which mines original art from the internet without credit or compensation to the original artists

    StableDiffusion is trained on 2.3B images. The weightings are 2GB. Aka, less than 1 byte per training image.

    If your "creative work" contains less than 1 byte of originality, how can that even qualify for copyright protection in the first place?
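
The "less than 1 byte per training image" figure is plain division. A minimal sketch, taking the comment's round numbers at face value (they are the commenter's approximations, not verified figures) and checking both the decimal and binary readings of "2GB":

```python
# Figures as quoted in the comment above; both are the commenter's approximations.
TRAINING_IMAGES = 2.3e9             # "2.3B images"
WEIGHTS_BYTES_DECIMAL = 2 * 10**9   # "2GB" read as decimal gigabytes
WEIGHTS_BYTES_BINARY = 2 * 2**30    # "2GB" read as binary gibibytes


def bytes_per_image(weights_bytes: float, images: float) -> float:
    """Average number of weight bytes per training image."""
    return weights_bytes / images


if __name__ == "__main__":
    for label, size in (
        ("decimal GB", WEIGHTS_BYTES_DECIMAL),
        ("binary GiB", WEIGHTS_BYTES_BINARY),
    ):
        ratio = bytes_per_image(size, TRAINING_IMAGES)
        print(f"{label}: {ratio:.2f} bytes per image")
```

Either reading lands below one byte per image. That is an average over the whole training set and says nothing about how much any individual image influenced the weights.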

    • by pt73 ( 2506856 )
      The banking system rounds X Trillion transactions. Say I cause it to round down every time and take the part cents.

      If your "interest" is rounded down and so less than 1 cent, how can that even qualify as stealing?

      • It was done many years ago with one of the first electronic money transfer systems. The man who did it ended up very rich, so they changed the rules to stop it from happening again.
      • by Rei ( 128717 )

        Wow, I had no idea that people lose a byte of their artwork every single time an AI trains on them!

      • False equivalence? As far as I'm aware there's no threshold to property but there is a threshold for copyrightable creativity.
        • It seems like a Ship of Theseus problem to me.

          • In text, one byte of information is roughly a digraph. Assembling a book out of ten thousand digraphs doesn't sound like a Ship of Theseus problem to me.
            • If you change one byte of an artwork- is it a completely new artwork?
              For example, instead of one pixel being D7 38 27 70, it's D7 43 27 70. Is the 13 million pixel image now a new work of art?
              How many bytes can you change before the image is no longer considered the same?
              Hence the ship of Theseus.

              In any case, it's very misleading to say that an average of one byte is used. Based on what I've seen of these images, they do not take 1 byte from thousands of images. Instead, they use a large sample from sin

    • Just keep the concept of "anti-ai bias" on the top most shelf way in the back...you will need it eventually.

    • If your "creative work" contains less than 1 byte of originality, how can that even qualify for copyright protection in the first place?

      The originality and creativity were in designing and training the AI.

      • by Rei ( 128717 )

        AI software also has licenses. But that's a change of subject. We're talking about the creative endeavour, in terms of prompt and command crafting (which people may spend hours on at times), selection (from potentially thousands of images), workflows (which involve potentially dozens of steps, even potentially dozens of runs through img2img, some of which may be iterative), including many manual compositing and postprocessing steps.

        Obviously there is a line between that and just typing "a fluffy dog" and

        • But we're talking about quality outputs, not lazy ugly things.

          Really? I thought we were talking about copyrights.

          Copyrights apply to crappy art just as much as good art.

          • by Rei ( 128717 )

            Copyright does NOT apply to things in which no human creative endeavour applies.
            Copyright DOES apply to things in which human creative endeavour applies.
            Hence, the amount of human creative endeavour not only matters, it's the key element that matters.

    • by xalqor ( 6762950 )

      less than 1 byte of originality

      I can hash the same image data into a single value of 16, 20, 32, 48, or 64 bytes depending on which algorithm I use. Does that mean they each have less than one bit of originality, on average? No, it doesn't. You're mixing up what the algorithm does with what humans value, but they are not the same.
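
The digest sizes listed above match common hash functions. A small sketch, assuming the commenter had MD5, SHA-1, SHA-256, SHA-384 and SHA-512 in mind (the comment names no algorithms) and using placeholder bytes in place of a real image:

```python
import hashlib

# Assumed mapping: the comment lists only sizes (16, 20, 32, 48, 64 bytes),
# not algorithm names.
ALGORITHMS = ("md5", "sha1", "sha256", "sha384", "sha512")


def digest_sizes(data: bytes) -> dict:
    """Return the digest length in bytes for each algorithm."""
    return {name: len(hashlib.new(name, data).digest()) for name in ALGORITHMS}


# Placeholder "image": about 1 MiB of arbitrary bytes, not a real picture.
fake_image = bytes(range(256)) * 4096
sizes = digest_sizes(fake_image)

if __name__ == "__main__":
    for name, size in sizes.items():
        print(f"{name}: {size} bytes from {len(fake_image)} input bytes")
```

The digest length is fixed by the algorithm and does not depend on the input, which is why output size alone says little about what the input contained.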

      • by Rei ( 128717 )

        You cannot restore an image from a hash. Nobody is trying to sue people for hashing images. Because it would be bloody stupid to do so and get thrown out of court.

        • by xalqor ( 6762950 )

          You also can't restore an image from the weightings dataset alone. pixelz.ai [pixelz.ai] has limited free play with some different algorithms, including stable diffusion. The results obviously reuse significant parts of the original images.

          Nobody is trying to sue people for hashing images.

          Exactly. They are suing because the generated images contain significant bits of copyrighted images for which they didn't give a license.

          You mentioned some number as the size of the weightings dataset, and then made a claim about the

          • by Rei ( 128717 )

            The results obviously reuse significant parts of the original images.

            Only frequently-repeated motifs, and even then, only to a limited degree. It can barely even reproduce a US flag, and that's everywhere.

            You want such systems to learn how to replicate common motifs. If someone asks for the Mona Lisa, you want the Mona Lisa, not some random woman. But you also don't want it reproducing everything with such fidelity. And it can't, it only has less than 1 byte per image. There's a gulf between "reproduc

            • by xalqor ( 6762950 )

              Only frequently-repeated motifs, and even then, only to a limited degree

              Certainly more than one byte. Try "fox playing cards", it uses hand-drawn elements of foxes and cards there that take a lot more than one byte to represent digitally.

              It can barely even reproduce a US flag, and that's everywhere

              Irrelevant. The current limitations of the copyright-infringing technology don't make it any less copyright infringing, when the source image used is copyrighted. The US flag is in the public domain, but if a

              • by Rei ( 128717 )

                Certainly more than one byte. Try "fox playing cards", it uses hand-drawn elements of foxes and cards there that take a lot more than one byte to represent digitally.

                No, literally less than one byte. Look, I have it installed on my computer. It doesn't need the internet to run, and the weightings are only two gigs, and it was trained on more than 2 billion images. Do the math.

                I just ran "fox playing cards" on SD. I got a bunch of playing cards with vague foxes on them. Har har, SD.

                I changed it to "foxe

                • by xalqor ( 6762950 )

                  I've been making analogies trying to get you to understand my point and it's not working. This will be my last attempt.

                  Paraphrasing your first post:

                  1. Over 2 billion images got processed into 2 gigabytes of data (I'm just using your numbers here, the specific numbers don't really matter for what I'm trying to convey)

                  2. That's an average of approximately 1 byte per image

                  3. Therefore each source image contains 1 byte of creativity

                  4. Therefore the source images are not deserving of copyright protection

                  And here

                  • by Rei ( 128717 )

                    but it does not follow that the original images only had one byte of creativity.

                    It does follow if they're arguing that it violates their copyright. Their image only contributed, on average, one byte's worth of data. If an image created using less than one byte of data about your copyrighted work is enough to establish a copyright claim, then your innovation level is only one byte's worth.

                    To repeat: that's THEIR argument if they want to assert copyvio. Not mine. It's the implications of THEIR accusatio

  • Maybe they're using an AI system to decide which photos stay up on the site, and which ones are taken down. When the AI detected photos generated by another AI, well, there wasn't enough room for two AIs on the site, one of them had to go.

    • Maybe they're using an AI system to decide which photos stay up on the site

      Using an AI to detect the work of another AI is exactly how a GAN [wikipedia.org] creates the images in the first place.

  • I have seen the argument leveled against AI that because it was trained on images without the consent of the artists it's unethical. I do not see a similar claim leveled against human artists - why?
    • by narcc ( 412956 )

      Imagine that you're trying to make music with a Markov chain. You've trained it using simple children's songs. What do you think the music will sound like? As it happens, all of your generated songs will sound reminiscent of the children's songs you've selected.

      There is no other possibility. You can generate music for generations and they'll all still sound reminiscent of the children's songs you selected. There is no creativity there. How could there be? It can only produce things similar to what
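
The Markov chain analogy above can be made concrete. A toy sketch, assuming a first-order chain over note names and two short hand-typed melodies as hypothetical training data; it illustrates the analogy only and is not how diffusion models work:

```python
import random
from collections import defaultdict

# Hypothetical training data: opening notes of two children's songs.
SONGS = [
    "C C G G A A G".split(),  # "Twinkle, Twinkle, Little Star"
    "E D C D E E E".split(),  # "Mary Had a Little Lamb"
]


def train(songs):
    """Record, for each note, every note that followed it in training."""
    transitions = defaultdict(list)
    for song in songs:
        for current, following in zip(song, song[1:]):
            transitions[current].append(following)
    return transitions


def generate(transitions, start, length, rng):
    """Walk the chain, picking each next note from the observed followers."""
    melody = [start]
    while len(melody) < length:
        options = transitions.get(melody[-1])
        if not options:  # note never had a follower in training
            break
        melody.append(rng.choice(options))
    return melody


if __name__ == "__main__":
    model = train(SONGS)
    print(" ".join(generate(model, "C", 16, random.Random(0))))
```

Every adjacent pair of notes in the output already occurs somewhere in the training songs. New orderings can appear, but no new transition can.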

      • Have you actually used it? It's a very specific aesthetic that's unlike any artist I know.

        And there are plenty of artists that are starting to use it to improve their own production process. It's great for generating props and backgrounds, for instance.

        • by narcc ( 412956 )

          Have you actually used it? It's a very specific aesthetic that's unlike any artist I know.

          I must not have done a very good job of explaining how it works. I used Markov chains because I thought most people here would be familiar with them. This doesn't work the same way, of course, but they're very much analogous.

          The AI is not contributing anything new to the generated images beyond what was in the training data. That's simply not possible. You can get a sense of this yourself by spending enough time with one of those text to image toys, as you'll start to notice similar elements appear acr

      • by Rei ( 128717 )

        Human artists THINK they're a lot more innovative than they actually are. Archaeologists track the migration of human cultures throughout history by shifts in the art styles in the archaeological record. Humans learn from their surroundings. And they also suck at randomness compared to computers. Try asking a human to write 100 random numbers from 1 to 10 and then handing it over to a statistician to see how random it really is.

        Now, one might say that "humans are exposed to many other things besides art

        • by narcc ( 412956 )

          In your rush to rant about ... I'm not sure what ... you've completely misunderstood my post. As it was written to be simple to understand, this is an astonishing achievement.

          Try reading it again. You should notice that I'm trying to answer EmoryM's question about why artists and AI tools are treated differently. To do that, I needed to explain a few things about AI. There isn't much I needed to make my point, so I decided to use something simple, Markov chains, to explain the relevant concepts.

          See, most

          • by Rei ( 128717 )

            Where did I make any sort of metaphysical argument? You're the one injecting metaphysics into this, trying to insert something magical about human characteristics which are in reality demonstrably rote and predictable compared to the output of AIs. It is a fact that AIs are better at randomness than humans. It is a fact that an AI trained on basically the entire breadth of the internet has a more diverse range of inputs than a human has had. Leave metaphysics out of this and just look at the facts

            And for

            • by narcc ( 412956 )

              It is a fact that an AI trained on basically the entire breadth of the internet [...]

              No, actually, it's not. I've explained this to you already. The limitations there are very clear.

              Where did I make any sort of metaphysical argument?

              That's all you. You seem to think that AI is magical and can somehow store every image on the internet or whatever that nonsense was about. As "evidence" ... You play a game with wikipedia that highlights your own ignorance...

              As I've explained to you, this type of AI is limited in ways that can be objectively quantified. For example, we know exactly the maximum amount of information the AI can store from th

              • by Rei ( 128717 )

                That's all you. You seem to think that AI is magical and can somehow store every image on the internet or whatever that nonsense was about

                This thread is surreal, as right above you I'm arguing with someone who ACTUALLY believes that they somehow store every image on the internet, and am trying to explain how that's WRONG.

                For example, we know exactly the maximum amount of information the AI can store from the training data.

                (Sets 2TB hard drive in the computer)

                dd if=/dev/zero of=/dev/sda

                Hey look, two terabyte

                • by Rei ( 128717 )

                  For fun, I decided to test Viren Rasquinha and Moira Shire to the above challenge.

                  I couldn't even begin to picture what "Viren Rasquinha" would look like, as I can't even guess at the ethnicity, so I auto-failed that one. SD showed sports in 3 of six images, a fully shaven man in 4 of six, and a balding or bald man in 5 of six, which is correct, though it got the face wrong (odds of random-guessing a face would be really low). So SD didn't do great, but I didn't do "at all".

                  Shire of Moira, my brain imm

                  • by narcc ( 412956 )

                    You know your test is completely meaningless, right?

                    • by Rei ( 128717 )

                      Subject: the breadth of "knowledge" on the subject of creating images from text, SD vs. humans
                      Test: the breadth of "knowledge" on the subject of creating images from text, SD vs. humans
                      You: "You know your test is completely meaningless, right?"

                      Propose a better test or bugger off.

                    • by narcc ( 412956 )

                      We don't need a test. As I've explained, we already know the exact upper bound, to the bit!

                      He wants "breadth" to mean something nebulous and unquantifiable so that he can pretend the AI is magical.

                      If you really want to play his stupid game, provide a meaningful definition of "Breadth". How many bits of information constitutes one unit of "breadth"?

                    • by Rei ( 128717 )

                      "How many bits of information constitutes one unit of breadth"

                      Hey, look at my 4TB hard drive full of zeros, it contains so much more information than SD's 2GB of weightings!

                      And again, YOU'RE the one attributing magic to HUMANS. I'm attributing magic to NOBODY and NOTHING.

                      Lastly: "Gee, who to listen to, YOU OR MY LYING EYES?"
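
The "hard drive full of zeros" quip turns on the difference between storage capacity and information content. A rough sketch, using compressed size as a stand-in for information content (an assumption: compression only gives an upper bound, and the 1 MiB buffers are arbitrary):

```python
import os
import zlib

SIZE = 1 << 20  # 1 MiB per buffer; an arbitrary size chosen for illustration


def compressed_size(data: bytes) -> int:
    """Length of the zlib-compressed data, a crude proxy for information content."""
    return len(zlib.compress(data, 9))


if __name__ == "__main__":
    zeros = bytes(SIZE)       # a "drive full of zeros" in miniature
    noise = os.urandom(SIZE)  # incompressible random bytes for contrast
    print(f"zeros: {SIZE} bytes -> {compressed_size(zeros)} compressed")
    print(f"noise: {SIZE} bytes -> {compressed_size(noise)} compressed")
```

The zero buffer shrinks to a tiny fraction of its size while the random one barely changes. File size caps how much information can be stored, but by itself it does not show how much is actually there.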

                    • by narcc ( 412956 )

                      What on earth are you babbling about?

  • Lud, Ned (Score:2, Insightful)

    by baomike ( 143457 )

    This has a Luddite feel to it.

    • How is curating your collection of wares Ludditism? Especially when the items you're removing may be of a dubious or surreal quality that does not match what your customers are typically seeking? (Have you seen the new captchas asking for lions with open eyes? That shit is creepy.) For that matter, what rights do these "artist"/uploaders actually hold on the images they are adding?

      • Apparently it actually was what customers were seeking, which is in the summary...

        Might want to read it some time.

        • People seeking out such imagery on the platform is not the same as some of these images being listed with a vague phrase such as "frequently used."

          • Okay, granted.

            Now I've read the article itself and the links it refers to, I really have to say this whole thing is very light on actual facts. So for now I can't even say that these images are actually being removed, let alone the reason why.

      • by Rei ( 128717 )

        I can't even tell what you're asking. When you say "artist/uploaders", do you mean AI artists? The short of it is: it's fuzzy. Under US copyright law, works created entirely by an AI are not copyrightable, as it requires human creative endeavor. However, the longer and more complex your workflow for creating images, the better positioned you'd be to argue a claim of copyright.

        If you've basically written a short story as a prompt and spent hours tuning the generation and manual selection of images (incl. pot

        • I have no illusions that one must hold copyright to sell, else works in the public domain, such as printings of the Constitution or the periodic table, would not exist. However, stock image services are predicated on licensing, which does necessitate copyright.

  • by Snowhare ( 263311 ) on Wednesday September 21, 2022 @12:11AM (#62900273)

    Stock image sites make their money by selling licenses to copyrighted images. AI generated art has been specifically ruled by courts to NOT be copyrightable. Which means they would have no legal leverage to enforce their licensing for AI generated artwork.

    tl;dr: AI artwork breaks their business model in a way they have no legal remedy for except excluding it from their platforms

    • AI generated art has been specifically ruled by courts to NOT be copyrightable.

      Nuh-nuh. What they rejected was the idea of registering the AI itself as the author of the picture.

      • by Rei ( 128717 ) on Wednesday September 21, 2022 @06:17AM (#62900737) Homepage

        Neither of these is an accurate statement (this case has been greatly misrepresented). The ruling was based on the lack of human creative endeavor, as the AI created the images fully on its own with no human guidance.

        People seeking to assert copyright over images that involved AI creation steps will need to show that they put a meaningful amount of creative endeavor into their work. For now, the case law is limited and fuzzy. I do however expect it to increasingly drift into AI artists' favour. If you're writing a couple-hundred word prompt that goes through dozens of iterations, and spend many minutes or even several hours tuning, selecting, compositing and postprocessing (which I certainly have), that certainly seems like you've passed a higher bar than a lot of things that nobody disputes are copyrighted today.

    • I just discussed this with an IP lawyer yesterday.

      The MidJourney license is quite interesting. I have the feeling that this issue is mostly due to that license, and not the fact it is AI generated.

      AI generation is like any other tool: the copyright is not for the tool. You need to be sentient enough to be able to claim copyright - just see the case of the monkey picture. The sentient that presses the button gets the copyright. If the button is pressed by something not sentient, there is no copyright. Hence

    • Tools like Midjourney are exactly this: just tools. The creator is not the AI, but the human who input parameters. You know, even Photoshop has auto-generating tools (content aware fill as an example).

  • The threat that AI might somehow replace artists is little different from the notion that compilers would replace programmers. What happens is that the trivial examples look impressive; what actually happens is that in the same way programmers could solve larger, more complex problems thanks to compilers and better computer languages, artists can move on to more challenging works using AI. It used to be that to become a surrealist an artist needed to use oil paint and be well trained in classical paintin

    • by HiThere ( 15173 )

      OTOH, the skill sets used are very different. I, for one, don't think the controls on the generated image that I've heard of are sufficient to create the image that one holds in one's mind, though they might create a different one that's about as good.
      That said, while I'm capable of creating a very good image with colored pencils, I'm not capable of enjoying the creation. Others can do both. So I ended up as a programmer. Perhaps if this new approach had been available, I'd have ended up as an artist. T
