Shutterstock Is Removing AI-Generated Images
Shutterstock appears to be removing images generated by AI systems like DALL-E and Midjourney. Motherboard reports: On Shutterstock, searches for images tagged "Midjourney" yielded several photos with the AI tool's unmistakable aesthetic, with many having high popularity scores and marked as "frequently used." But late Monday, the results for "Midjourney" seem to have been reduced, leaving mainly stock photos of the tool's logo. Other images use tags like "AI generated" -- one image, for example, is an illustration of a futuristic building with an image description reading "Ai generated illustration of futuristic Art Deco city, vintage image, retro poster." The image is part of a collection the artist titled "Midjourney," which has since been removed from the site. Other images marked "AI generated," like this burning medieval castle, seem to remain up on the site.
As Ars Technica notes, neither Shutterstock nor Getty Images explicitly prohibits AI-generated images in their terms of service, and Shutterstock users typically make around 15 to 40 percent of what the company makes when it sells an image. Some creators have not taken kindly to this trend, pointing out that these systems use massive datasets of images scraped from the web. [...] In other words, the generated works are the result of an algorithmic process which mines original art from the internet without credit or compensation to the original artists. Others have worried about the impacts on independent artists who work for commissions, since the ability for anyone to create custom generated artwork potentially means lost revenue.
"Mines the internet" (Score:4, Insightful)
StableDiffusion is trained on 2.3B images. The weightings are 2GB. Aka, less than 1 byte per training image.
If your "creative work" contains less than 1 byte of originality, how can that even qualify for copyright protection in the first place?
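As a rough check of the arithmetic above, here is a minimal sketch that takes the comment's figures at face value: 2.3 billion training images and 2 GB of weights, read as decimal gigabytes. Both numbers are assumptions carried over from the comment, not verified values, and the helper name is made up for illustration.

```python
# Figures quoted in the comment above; treated as assumptions, not verified values.
TRAINING_IMAGES = 2_300_000_000  # "2.3B images"
WEIGHTS_BYTES = 2 * 10**9        # "2GB", read here as decimal gigabytes


def bytes_per_image(weights_bytes: int, images: int) -> float:
    """Average number of weight bytes per training image (hypothetical helper)."""
    return weights_bytes / images


ratio = bytes_per_image(WEIGHTS_BYTES, TRAINING_IMAGES)
print(f"{ratio:.2f} bytes per training image")  # prints "0.87 bytes per training image"
```

Reading "2GB" as 2 GiB instead gives roughly 0.93, still under one byte per image. The division only yields an average; it says nothing about how much any single image influenced the weights.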
Re: (Score:1)
If your "interest" is rounded down and so less than 1 cent, how can that even qualify as stealing?
Re: (Score:2)
Re: (Score:2)
Yes, but he later invented a substance that could weaken Superman, so he had that going for him.
They (Score:2)
Re: (Score:2)
Either that, or this is the plot to the movie "Office Space"
Re: (Score:2)
Wow, I had no idea that people lose a byte of their artwork every single time an AI trains on them!
Re: (Score:2)
Re: (Score:1)
It seems like a Ship of Theseus problem to me.
Re: (Score:2)
Re: (Score:1)
If you change one byte of an artwork, is it a completely new artwork?
For example, instead of one pixel being D7 38 27 70, it's D7 43 27 70. Is the 13 million pixel image now a new work of art?
How many bytes can you change before the image is no longer considered the same?
Hence the ship of Theseus.
In any case, it's very misleading to say that an average of one byte is used. Based on what I've seen of these images, they do not take 1 byte from thousands of images. Instead, they use a large sample from sin
Re: "Mines the internet" (Score:1)
Just keep the concept of "anti-ai bias" on the top most shelf way in the back...you will need it eventually.
Re: (Score:1)
If your "creative work" contains less than 1 byte of originality, how can that even qualify for copyright protection in the first place?
The originality and creativity were in designing and training the AI.
Re: (Score:2)
AI software also has licenses. But that's a change of subject. We're talking about the creative endeavour, in terms of prompt and command crafting (which people may spend hours on at times), selection (from potentially thousands of images), workflows (which involve potentially dozens of steps, even potentially dozens of runs through img2img, some of which may be iterative), including many manual compositing and postprocessing steps.
Obviously there is a line between that and just typing "a fluffy dog" and
Re: (Score:2)
But we're talking about quality outputs, not lazy ugly things.
Really? I thought we were talking about copyrights.
Copyrights apply to crappy art just as much as good art.
Re: (Score:2)
Copyright does NOT apply to things in which no human creative endeavour applies.
Copyright DOES apply to things in which human creative endeavour applies.
Hence, the amount of human creative endeavour not only matters, it's the key element that matters.
Re: (Score:3)
I can hash the same image data into a single value of 16, 20, 32, 48, or 64 bytes depending on which algorithm I use. Does that mean they each have less than one bit of originality, on average? No, it doesn't. You're mixing up what the algorithm does with what humans value, but they are not the same.
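To make the digest sizes concrete, here is a small sketch using Python's standard `hashlib`. The choice of MD5, SHA-1, SHA-256, SHA-384 and SHA-512 is an assumption on my part; the comment does not name its algorithms, these are just common ones that happen to produce 16, 20, 32, 48 and 64 bytes. The input bytes are a placeholder, not real image data.

```python
import hashlib

# Assumed algorithm choice: common hashes whose digests match the sizes in the comment.
ALGORITHMS = ("md5", "sha1", "sha256", "sha384", "sha512")


def digest_sizes(data: bytes) -> dict:
    """Map each algorithm name to the length in bytes of its digest of `data`."""
    return {name: len(hashlib.new(name, data).digest()) for name in ALGORITHMS}


small = digest_sizes(b"x")              # a 1-byte input
large = digest_sizes(b"x" * 1_000_000)  # a 1 MB placeholder standing in for image data

print(small)
print(small == large)  # digest length does not depend on input size
```

The output size is fixed by the algorithm, not by how much went in, which is the point being made: the size of a derived artifact alone does not measure what the input contained.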
Re: (Score:2)
You cannot restore an image from a hash. Nobody is trying to sue people for hashing images. Because it would be bloody stupid to do so and get thrown out of court.
Re: (Score:2)
You also can't restore an image from the weightings dataset alone. pixelz.ai [pixelz.ai] has limited free play with some different algorithms, including stable diffusion. The results obviously reuse significant parts of the original images.
Exactly. They are suing because the generated images contain significant bits of copyrighted images for which they didn't give a license.
You mentioned some number as the size of the weightings dataset, and then made a claim about the
Re: (Score:2)
Only frequently-repeated motifs, and even then, only to a limited degree. It can barely even reproduce a US flag, and that's everywhere.
You want such systems to learn how to replicate common motifs. If someone asks for the Mona Lisa, you want the Mona Lisa, not some random woman. But you also don't want it reproducing everything with such fidelity. And it can't, it only has less than 1 byte per image. There's a gulf between "reproduc
Re: (Score:2)
Certainly more than one byte. Try "fox playing cards", it uses hand-drawn elements of foxes and cards there that take a lot more than one byte to represent digitally.
Irrelevant. The current limitations of the copyright-infringing technology don't make it any less copyright infringing when the source image used is copyrighted. The US flag is in the public domain, but if a
Re: (Score:2)
No, literally less than one byte. Look, I have it installed on my computer. It doesn't need the internet to run, and the weightings are only two gigs, and it was trained on more than 2 billion images. Do the math.
I just ran "fox playing cards" on SD. I got a bunch of playing cards with vague foxes on them. Har har, SD.
I changed it to "foxe
Re: (Score:2)
I've been making analogies trying to get you to understand my point and it's not working. This will be my last attempt.
Paraphrasing your first post:
1. Over 2 billion images got processed into 2 gigabytes of data (I'm just using your numbers here, the specific numbers don't really matter for what I'm trying to convey)
2. That's an average of approximately 1 byte per image
3. Therefore each source image contains 1 byte of creativity
4. Therefore the source images are not deserving of copyright protection
And here
Re: (Score:2)
It does follow if they're arguing that it violates their copyright. Their image only contributed, on average, one byte's worth of data. If an image created using less than one byte of data about your copyrighted work is enough to establish a copyright claim, then your innovation level is only one byte's worth.
To repeat: that's THEIR argument if they want to assert copyvio. Not mine. It's the implications of THEIR accusatio
Re: (Score:3)
Except those services are the source material the AI was trained on. If they disappear, the AI loses access to the new images that it needs to update itself. Saying DALL-E can replace Shutterstock is like saying Google search can replace the Internet; you just type something into the Google search bar and the answer just pops up, no need for all those old
Re: (Score:1)
That seems optimistic. With no actual new information, how could it possibly grow? The generated images now all have "the AI tool's unmistakable aesthetic"
Here's my prediction: Trying to do so will result in something more like incest. Continuing to train on its own generated images will ultimately result in information being lost, not gained. We will call the final image "Cleopatra".
Re: They're scared (Score:2)
That's about the most dystopian thing I've ever heard.
"Humans have created everything worth creating, now we'll just get machines to provide us with remixes."
Re: (Score:2)
"Humans have created everything worth creating, now we'll just get machines to provide us with remixes."
That is only true if you believe machines can never create anything of value.
Re: (Score:2)
Artistic value is a 2 sided coin (Score:2)
There are two sides to most art: the creator's side, which is what you're talking about, and the observer's side, which you're not. The observer's POV, the "eye of the beholder", contributes human factors as well.
Also, unless the artist was raised in an environment where they were never exposed to other art (it happens, but it certainly isn't the normal case),
Re: (Score:2)
Some of us think artistic value can only originate from a human soul and AI art is completely without soul.
Then in a blind test, you should be able to easily tell them apart.
Guess what? You can't.
Re: (Score:2)
It's called the simulacrum
Re: (Score:2)
Re: They're scared (Score:3)
You're assuming normal content creators can't do this by using the clone brush, as well as a few different colours and materials.
I mean, have you worked with Shutterstock? If you look for office stuff, you get thousands of similar pictures. It's not as if these are all "artistes" that create. Loads of content creators just work the factory creation line. And yes, AI threatens that model. Well, too bad for them. If you're smart, you use AI to up your speed. If you don't, you go the way of the stagecoach.
Shutt
Re: (Score:1)
Say, I am writing an article about the war in Ukraine and need an image to make it more attractive, should I buy a photo from Getty or generate one with Midjourney? How about a cake or a polar bear?
On the other hand, if I want an illustration for the cover of a fantasy book...
Re: (Score:1)
You're highly optimistic to think that in the future there'll be journalism
Dueling AIs (Score:2)
Maybe they're using an AI system to decide which photos stay up on the site, and which ones are taken down. When the AI detected photos generated by another AI, well, there wasn't enough room for two AIs on the site, one of them had to go.
Re: (Score:2)
Maybe they're using an AI system to decide which photos stay up on the site
Using an AI to detect the work of another AI is exactly how a GAN [wikipedia.org] creates the images in the first place.
Do human artists attribute what they've seen? (Score:1)
Re: (Score:2)
Imagine that you're trying to make music with a Markov chain. You've trained it using simple children's songs. What do you think the music will sound like? As it happens, all of your generated songs will sound reminiscent of the children's songs you've selected.
There is no other possibility. You can generate music for generations and they'll all still sound reminiscent of the children's songs you selected. There is no creativity there. How could there be? It can only produce things similar to what
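The Markov chain claim can be sketched in a few lines. This is only an illustration of the analogy as stated, under assumptions: a first-order chain over note names, trained on the opening phrases of two children's songs, with function names made up for the example. Whether the same limit applies to image generators is exactly what the thread disputes.

```python
import random
from collections import defaultdict


def train(songs):
    """Build a first-order transition table: note -> list of notes seen right after it."""
    table = defaultdict(list)
    for song in songs:
        for current, following in zip(song, song[1:]):
            table[current].append(following)
    return table


def generate(table, start, length, rng):
    """Walk the table from `start`, picking each next note from observed successors."""
    melody = [start]
    while len(melody) < length:
        choices = table.get(melody[-1])
        if not choices:  # note never had a successor in training
            break
        melody.append(rng.choice(choices))
    return melody


songs = [
    "C C G G A A G".split(),  # opening of "Twinkle Twinkle Little Star"
    "E D C D E E E".split(),  # opening of "Mary Had a Little Lamb"
]
table = train(songs)
melody = generate(table, "C", 16, random.Random(0))
print(" ".join(melody))

# Every adjacent pair in the output already occurs somewhere in the training songs.
seen_pairs = {pair for song in songs for pair in zip(song, song[1:])}
print(all(pair in seen_pairs for pair in zip(melody, melody[1:])))
```

The generated melody can order notes in ways neither song does, but every individual step is one the chain saw during training.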
Re: Do human artists attribute what they've seen? (Score:3)
Have you actually used it? It's a very specific aesthetic that's unlike any artist I know.
And there are plenty of artists that are starting to use it to improve their own production process. It's great for generating props and backgrounds, for instance.
Re: (Score:3)
Have you actually used it? It's a very specific aesthetic that's unlike any artist I know.
I must not have done a very good job of explaining how it works. I used Markov chains because I thought most people here would be familiar with them. This doesn't work the same way, of course, but they're very much analogous.
The AI is not contributing anything new to the generated images beyond what was in the training data. That's simply not possible. You can get a sense of this yourself by spending enough time with one of those text to image toys, as you'll start to notice similar elements appear acr
Re: (Score:2)
Human artists THINK they're a lot more innovative than they actually are. Archaeologists track the migration of human cultures throughout history by shifts in the art styles in the archaeological record. Humans learn from their surroundings. And they also suck at randomness compared to computers. Try asking a human to write 100 random numbers from 1 to 10 and then handing it over to a statistician to see how random it really is.
Now, one might say that "humans are exposed to many other things besides art
Re: (Score:2)
In your rush to rant about ... I'm not sure what ... you've completely misunderstood my post. As it was written to be simple to understand, this is an astonishing achievement.
Try reading it again. You should notice that I'm trying to answer EmoryM's question about why artists and AI tools are treated differently. To do that, I needed to explain a few things about AI. There isn't much I needed to make my point, so I decided to use something simple, Markov chains, to explain the relevant concepts.
See, most
Re: (Score:2)
Where did I make any sort of metaphysical argument? You're the one injecting metaphysics into this, trying to insert something magical about human characteristics which are in reality demonstrably rote and predictable compared to the output of AIs. It is a fact that AIs are better at randomness than humans. It is a fact that an AI trained on basically the entire breadth of the internet has a more diverse range of inputs than a human has had. Leave metaphysics out of this and just look at the facts
And for
Re: (Score:2)
It is a fact that an AI trained on basically the entire breadth of the internet [...]
No, actually, it's not. I've explained this to you already. The limitations there are very clear.
Where did I make any sort of metaphysical argument?
That's all you. You seem to think that AI is magical and can somehow store every image on the internet or whatever that nonsense was about. As "evidence" ... You play a game with wikipedia that highlights your own ignorance...
As I've explained to you, this type of AI is limited in ways that can be objectively quantified. For example, we know exactly the maximum amount of information the AI can store from th
Re: (Score:2)
This thread is surreal, as right above you I'm arguing with someone who ACTUALLY believes that they somehow store every image on the internet, and am trying to explain how that's WRONG.
(Sets 2TB hard drive in the computer)
dd if=/dev/zero of=/dev/sda
Hey look, two terabyte
Re: (Score:2)
For fun, I decided to test Viren Rasquinha and Moira Shire to the above challenge.
I couldn't even begin to picture what "Viren Rasquinha" would look like, as I can't even guess at the ethnicity, so I auto-failed that one. SD showed sports in 3 of 6 images, a fully shaven man in 4 of 6, and a balding or bald man in 5 of 6, which is correct, though it got the face wrong (odds of random-guessing a face would be really low). So SD didn't do great, but I didn't do "at all".
Shire of Moira, my brain imm
Re: (Score:2)
You know your test is completely meaningless, right?
Re: (Score:2)
Subject: the breadth of "knowledge" on the subject of creating images from text, SD vs. humans
Test: the breadth of "knowledge" on the subject of creating images from text, SD vs. humans
You: "You know your test is completely meaningless, right?"
Propose a better test or bugger off.
Re: (Score:2)
We don't need a test. As I've explained, we already know the exact upper bound, to the bit!
He wants "breadth" to mean something nebulous and unquantifiable so that he can pretend the AI is magical.
If you really want to play his stupid game, provide a meaningful definition of "Breadth". How many bits of information constitutes one unit of "breadth"?
Re: (Score:2)
"How many bits of information constitutes one unit of breadth"
Hey, look at my 4TB hard drive full of zeros, it contains so much more information than SD's 2GB of weightings!
And again, YOU'RE the one attributing magic to HUMANS. I'm attributing magic to NOBODY and NOTHING.
Lastly: "Gee, who to listen to, YOU OR MY LYING EYES?"
Re: (Score:2)
What on earth are you babbling about?
Lud, Ned (Score:2, Insightful)
This has a Luddite feel to it.
Re: (Score:1)
How is curating your collection of wares Ludditism? Especially when the items you're removing may be of a dubious or surreal quality that does not match what your customers are typically seeking? (Have you seen the new captchas asking for lions with open eyes? That shit is creepy.) For that matter, what rights do these "artist"/uploaders actually hold on the images they are adding?
Re: Lud, Ned (Score:3)
Apparently it actually was what customers were seeking, which is in the summary...
Might want to read it some time.
Re: (Score:3)
People seeking out such imagery on the platform is not the same as some of these images being listed with a vague phrase such as "frequently used."
Re: (Score:2)
Okay, granted.
Now I've read the article itself and the links it refers to, I really have to say this whole thing is very light on actual facts. So for now I can't even say that these images are actually being removed, let alone the reason why.
Re: (Score:2)
I can't even tell what you're asking. When you say "artist/uploaders", do you mean AI artists? The short of it is: it's fuzzy. Under US copyright law, works created entirely by an AI are not copyrightable, as copyright requires human creative endeavor. However, the longer and more complex your workflow for creating images, the better positioned you'd be to argue a claim of copyright.
If you've basically written a short story as a prompt and spent hours tuning the generation and manual selection of images (incl. pot
Re: (Score:2)
I have no illusions that one must hold copyright to sell, else works in the public domain, such as printings of the Constitution or periodic table, would not exist. However, stock image services are predicated on licensing, which does necessitate copyright.
It's because they CAN'T legally be copyrighted (Score:3)
Stock image sites make their money by selling licenses to copyrighted images. AI generated art has been specifically ruled by courts to NOT be copyrightable. Which means they would have no legal leverage to enforce their licensing for AI generated artwork.
tl;dr: AI artwork breaks their business model in a way they have no legal remedy for except excluding it from their platforms
Re: (Score:3)
Nuh-nuh. What they rejected was the idea of registering the AI itself as the author of the picture.
Re:It's because they CAN'T legally be copyrighted (Score:4, Interesting)
Neither of these are accurate statements (this case has been greatly misrepresented). The ruling was based on the lack of human creative endeavor, as the AI created the images fully on its own with no human guidance.
People seeking to assert copyright over images that involved AI creation steps will need to show that they put a meaningful amount of creative endeavor into their work. For now, the case law is limited and fuzzy. I do however expect it to increasingly drift in AI artists' favour. If you're writing a couple-hundred-word prompt that goes through dozens of iterations, and spend many minutes or even several hours tuning, selecting, compositing and postprocessing (which I certainly have), that certainly seems like you've passed a higher bar than a lot of things that nobody disputes are copyrighted today.
Re: It's because they CAN'T legally be copyrighted (Score:2)
I just discussed this with an IP lawyer yesterday.
The MidJourney license is quite interesting. I have the feeling that this issue is mostly due to that license, and not the fact it is AI generated.
AI generation is like any other tool: the copyright is not for the tool. You need to be sentient enough to be able to claim copyright - just see the case of the monkey picture. The sentient that presses the button gets the copyright. If the button is pressed by something not sentient, there is no copyright. Hence
Re: (Score:1)
Tools like Midjourney are exactly this: just tools. The creator is not the AI, but the human who input parameters. You know, even Photoshop has auto-generating tools (content aware fill as an example).
From an artist (Score:2)
The threat that AI might somehow replace artists is little different from the notion that compilers would replace programmers. What happens is that the trivial examples look impressive; what actually happens is that in the same way programmers could solve larger, more complex problems thanks to compilers and better computer languages, artists can move on to more challenging works using AI. It used to be that to become a surrealist an artist needed to use oil paint and be well trained in classical paintin
Re: (Score:2)
OTOH, the skill sets used are very different. I, for one, don't think the controls on the generated image that I've heard of are sufficient to create the image that one holds in one's mind, though they might create a different one that's about as good.
That said, while I'm capable of creating a very good image with colored pencils, I'm not capable of enjoying the creation. Others can do both. So I ended up as a programmer. Perhaps if this new approach had been available, I'd have ended up as an artist. T