Getty Images Built a 'Socially Responsible' AI Tool That Rewards Artists (arstechnica.com) 26

An anonymous reader quotes a report from Ars Technica: Getty Images CEO Craig Peters told The Verge that he has found a solution to one of AI's biggest copyright problems: creators suing because AI models were trained on their original works without consent or compensation. To prove it's possible for AI makers to respect artists' copyrights, Getty built an AI tool using only licensed data that's designed to reward creators more and more as the tool becomes more popular over time. "I think a world that doesn't reward investment in intellectual property is a pretty sad world," Peters told The Verge. The conversation happened at Vox Media's Code Conference 2023, with Peters explaining why Getty Images -- which manages "the world's largest privately held visual archive" -- has a unique perspective on this divisive issue.

In February, Getty Images sued Stability AI over copyright concerns regarding the AI company's image generator, Stable Diffusion. Getty alleged that Stable Diffusion was trained on 12 million Getty images and even imitated Getty's watermark -- controversially seeming to add a layer of Getty's authenticity to fake AI images. Now, Getty has rolled out its own AI image generator that has been trained in ways that are unlike most of the popular image generators out there. Peters told The Verge that because of Getty's ongoing mission to capture the world's most iconic images, "Generative AI by Getty Images" was intentionally designed to avoid major copyright concerns swirling around AI images -- and compensate Getty creators fairly.

Rather than crawling the web for data to feed its AI model, Getty's tool is trained exclusively on images that Getty owns the rights to, Peters said. The tool was created out of rising demand from Getty Images customers who want access to AI generators that don't carry copyright risks. [...] With that as the goal, Peters told Code Conference attendees that the tool is "entirely commercially safe" and "cannot produce third-party intellectual property" or deepfakes because the AI model would have no references from which to produce such risky content. Getty's AI tool "doesn't know what the Pope is," Peters told The Verge. "It doesn't know what [Balenciaga] is, and it can't produce a merging of the two." Peters also said that if there are any lawsuits over AI images generated by Getty, then Getty will cover any legal costs for customers. "We actually put our indemnification around that so that if there are any issues, which we're confident there won't be, we'll stand behind that," Peters said.
When asked how Getty creators will be paid for AI training data, Peters said that there currently isn't a tool for Getty to assess which artist deserves credit every time an AI image is generated. "Instead, Getty will rely on a fixed model that Peters said determines 'what proportion of the training set does your content represent? And then, how has that content performed in our licensing world over time? It's kind of a proxy for quality and quantity. So, it's kind of a blend of the two,'" reports Ars.
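Peters describes the payout as a fixed blend of training-set share and historical licensing performance. A minimal sketch of how such a blend could be computed -- the function name, the 50/50 weighting, and the numbers are all assumptions for illustration, not Getty's actual formula:

```python
def payout_shares(contributors, quantity_weight=0.5):
    """contributors maps a creator's name to
    (images_in_training_set, historical_licensing_revenue)."""
    total_images = sum(n for n, _ in contributors.values())
    total_revenue = sum(r for _, r in contributors.values())
    shares = {}
    for name, (images, revenue) in contributors.items():
        quantity_share = images / total_images       # share of the training set
        performance_share = revenue / total_revenue  # licensing-performance proxy
        shares[name] = (quantity_weight * quantity_share
                        + (1 - quantity_weight) * performance_share)
    return shares

# Illustrative numbers only; the shares always sum to 1.0
pool = {"alice": (1000, 50.0), "bob": (3000, 150.0)}
print(payout_shares(pool))
```

Whatever the real weighting, a scheme of this shape only divides a pool among contributors; it says nothing about how large the pool itself is, which is the part Peters says Getty may still adapt.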

"Importantly, Peters suggested that Getty isn't married to using this rewards system and would adapt its methods for rewarding creators by continually monitoring how customers are using the AI tool."

  • by MpVpRb ( 1423381 ) on Friday October 06, 2023 @06:03PM (#63907141)

    "I think a world that doesn't reward investment in intellectual property is a pretty sad world"
    And an even sadder world is one where rights holders use the law to limit creativity and extend copyright far beyond its original intent

    • by Anonymous Coward

      It's sad that their wording avoids mentioning creators; to them, Imaginary Property is something to hoard and trade, valued above the creators themselves

    • Here's one of the best arguments I've heard against the extension of copyright [archive.org]

      From an era where the public interest and the good of society was argued for over corporate and private interests.

      A bit of a long read, but a standout paragraph imho -

      I believe, Sir, that I may with safety take it for granted that the effect of monopoly generally is to make articles scarce, to make them dear, and to make them bad. And I may with equal safety challenge my honourable friend to find out any distinction between copyright and other privileges of the same kind; any reason why a monopoly of books should produce an effect directly the reverse of that which was produced by the East India Company's monopoly of tea, or by Lord Essex's monopoly of sweet wines. Thus, then, stands the case. It is good that authors should be remunerated; and the least exceptionable way of remunerating them is by a monopoly. Yet monopoly is an evil. For the sake of the good we must submit to the evil; but the evil ought not to last a day longer than is necessary for the purpose of securing the good.

  • Getty pays creators peanuts to start with. What they're trying to do is to protect their own profit because AI is going to eat their lunch. Let's not be fooled.
    • by HiThere ( 15173 )

      The summary of their promise, at least, is so vague that it doesn't actually mean anything.

  • by Rei ( 128717 )

    Getty's AI tool "doesn't know what the Pope is,"

    Wait, what? There are no Getty images of the Pope? Really? [google.is]

    When asked how Getty creators will be paid for AI training data, Peters said that there currently isn't a tool for Getty to assess which artist deserves credit every time an AI image is generated

    Because that's not how AI works...

    Lastly, why would anyone pay Getty to use this tool when there are ample free alternatives? Even in a world where copyright law was modified to ban training on copyrighted data

    • Guy1: I said something about the Pope and my girlfriend is furious.

      Guy2: Well that was dumb, you know she's Catholic.

      Guy1: I knew she was, didn't know about the Pope.
    • Because that's not how AI works...

      You clearly don't understand AI technology. If you did, you would know that there's an internal representation of the training dataset in a high dimensional space through a complicated nonlinear transformation of the images. The credit assignment problem is just a matter of lifting the function representing the original author labels from the discrete dataset into the internal representation, followed by an extension to the full internal state space.

      The main difficulty is that for reasons of laziness/ex
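The "lift the author labels into the internal representation" idea in the comment above can be caricatured as nearest-neighbour attribution: embed a generated image, find the closest training embeddings, and split credit among their authors. The random embeddings and author names below are stand-ins, not any real model's internals:

```python
import numpy as np

rng = np.random.default_rng(0)
train_embeddings = rng.normal(size=(100, 16))      # 100 training images
authors = [f"author_{i % 5}" for i in range(100)]  # 5 hypothetical authors

def credit(generated_embedding, k=10):
    # cosine similarity between the generated image's embedding
    # and every training embedding
    a = generated_embedding / np.linalg.norm(generated_embedding)
    b = train_embeddings / np.linalg.norm(train_embeddings, axis=1, keepdims=True)
    sims = b @ a
    # split credit equally among the authors of the k nearest neighbours
    shares = {}
    for idx in np.argsort(sims)[-k:]:
        shares[authors[idx]] = shares.get(authors[idx], 0.0) + 1.0 / k
    return shares

print(credit(rng.normal(size=16)))  # per-author credit, summing to ~1.0
```

This is one of many possible closeness metrics, and picking k (and the embedding space itself) is exactly where such a scheme gets contentious.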

      • by Rei ( 128717 )

        If you did you would know that there's an internal representation of the training dataset in a high dimensional space through a complicated nonlinear transformation of the images

        Yeah, nah. There is zero "representation of images". They're denoising algorithms, fed static noise and trying to nudge it into a coherent image. It's learning how to take a dot product of a textual latent and a noised-up image latent to create a diffusion gradient to denoise the image latent. You could include the author in the t
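The denoising loop described here can be sketched in a few lines: start from static noise and repeatedly subtract a predicted-noise estimate. The linear "predictor" below is a toy stand-in for the real text-conditioned network, and every number is illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def predict_noise(x, text_latent):
    # stand-in for the learned noise predictor; a real diffusion model
    # conditions on the text latent (e.g. via cross-attention)
    return 0.1 * x + 0.01 * text_latent

def denoise(steps=50, dim=64):
    text_latent = rng.normal(size=dim)  # embedding of the prompt
    x = rng.normal(size=dim)            # start from pure static noise
    for _ in range(steps):
        x = x - predict_noise(x, text_latent)  # nudge toward an "image"
    return x

img_latent = denoise()
print(img_latent.shape)  # (64,)
```

Note that nothing in this loop stores or looks up training images; the training data only ever shaped the weights of the predictor.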

        • by Rei ( 128717 )

          Indeed, current networks generally include the author in the training text. That's why you can add something like "By [name]" and get something in that person's style. But that person's name is just one or more tokens. An array of a couple hundred floating point values. The dot product of any given token with any other token represents its semantic distance. So if you have two authors with similar styles, they'll have a low semantic difference.

          Here's the thing though: you can have a netw
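The token arithmetic described above is easy to demonstrate: a name becomes an embedding vector, and the dot product of normalised embeddings serves as a similarity proxy. The hash-seeded toy embeddings are stand-ins for a real text encoder:

```python
import numpy as np

def embed(name, dim=8):
    # deterministic toy embedding per name (stand-in for a CLIP-style
    # text encoder), normalised to unit length
    r = np.random.default_rng(abs(hash(name)) % (2**32))
    v = r.normal(size=dim)
    return v / np.linalg.norm(v)

def semantic_similarity(a, b):
    # dot product of unit vectors: 1.0 means identical, near 0 unrelated
    return float(embed(a) @ embed(b))

print(semantic_similarity("monet", "monet"))    # identical names score ~1.0
print(semantic_similarity("monet", "picasso"))  # anywhere in [-1, 1]
```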

          • FYI: styles aren't copyrightable. You can be mad about this fact, but it's a core element of how copyright works. Copyright is based on works, not styles.

            Tell that to Robin Thicke. [utexas.edu]

            Given that the concept now has legal precedent in music, it's not a stretch to assume it's also applicable to images. Just wait for the right lawsuit to come along.

            • by Rei ( 128717 )

              He lost that lawsuit not because "his style" was similar to Gaye's style, but because a specific work of his was too close to one of Gaye's specific works [youtube.com].

              Excepting character copyrights**, copyright is based on *works*

              (even character copyrights, which are very narrowly delineated, are technically also based on works, just with a broader applicability)

              • Of course they ruled the work resembled Gaye's. The question is, in what respect do these works resemble each other?

                They do not share the same melody.
                They do not share the same chord progressions.
                They do not share the same lyrics.

                Very specifically, the ruling cited the songs share the same "feel", which in the vernacular of musicians would very much be understood to refer to the style, as in a "blues feel" or a "jazz-waltz" feel.

                In other words, that's exactly what the ruling stated, that Thicke had appropri

          • Just replying in one place as I have somewhere to be shortly. As you pointed out, you could get the model to output a similarity with a particular author if you like, that's a valid attempt at assigning credit, and by no means the only one.

            It's valid because copyright law also depends on some form of closeness metric, measured by provenance and perturbation level, evaluated by a judge. The internal mechanism used by a human artist or an ML algorithm to produce that perturbation or even synthesise an image

        • by HiThere ( 15173 )

          I think you're right, but that doesn't mean the gp is wrong. Clearly the information is there in a highly processed form, or there would be no way to retrieve it. And clever prompts have retrieved pretty exact replicas of some of the info fed in. The question is something like "how distributed can the information be and with how much loss and still be subject to copyright?". Consider, for example, the abstraction called "Mickey Mouse". Let's only consider black and white versions for simplicity. There

      • by tlhIngan ( 30335 )

        Last but not least, you are advocating free alternatives that are only free because of massive copyright infringement. We will see eventually if the US courts side with the content producers or the content consumers in this game.

        Or how about we do a turnabout - instead of generating images, the AI generates CODE?

        If that AI was trained using GPL code, if all the image AI proponents get their way, the code I get out of the AI will no longer be GPL.

        Would that be fair? Perhaps it was trained on lots of GPL and F/

        • by HiThere ( 15173 )

          If only large corporations have access to that AI, it's a problem. If everyone has access, then the problem the GPL was created to deal with has been solved.

          The problem is centralized control. BSD was the first response, but that didn't incentivize releasing the code sufficiently, so development was quite slow. The GPL was a second response, and was more successful, because it gave people more incentive to release the code openly. If you can release the code ad hoc without copyright limitations, then th

    • by Triv ( 181010 )

      What's "so special" about this is that Getty is one of the largest single copyright holders in the world and they know the licensing status of every piece of media in their collection, so any AI trained on those images is guaranteed liability-free for their clients.

      The monetization schema is bullshit - it looks like the TikTok model where there's a giant pool of money split between all the creators every year, with the size of that pool determined by "business growth" (ie in a way that prioritizes the busi

      • by Rei ( 128717 )

        any AI trained on those images is guaranteed liability-free for their clients.

        1. That's not how copyright works. Automated processing to create new services is granted an exception under copyright law. Which is why 99% of Google's business line isn't illegal.

        2. Getty owns a minuscule fraction of the image space that exists on the internet. Pretending that Getty owns most images is just nonsense.

  • If I can't have Balenciaga Pope, then what's the point of an image generator? I don't want your product if it's just a nerfed image gen in a collared shirt. Go back to innovating on web crawlers and underhanded threats of lawsuits to any random domain owner on the web.

  • They are no better and no more transparent than Adobe on this; just hiding behind a "trust me bro" attitude is not going to convince anyone.
  • I bet the artists will all be thrilled to be paid that mad Spotify-level money. I predict most everyone gets less than a dollar a month.

  • keeps getting sued by photographers [insideimaging.com.au] for selling their photos without a license?

    That Getty?
