Can The AI Industry Continue To Avoid Paying for the Content They're Using? (yahoo.com) 196

Posted by EditorDavid on Monday January 15, 2024 @08:34AM from the what-AI-owes dept.

Last year Marc Andreessen's firm "argued that AI companies would go broke if they had to pay copyright royalties or licensing fees," notes a Los Angeles Times technology columnist.

But are these powerful companies doing even more to ensure they're not billed for their training data? Just this week, British media outlets reported that OpenAI has made the same case, seeking an exemption from copyright rules in England, claiming that the company simply couldn't operate without ingesting copyrighted materials.... The AI companies also argue what they're doing falls under the legal doctrine of fair use — probably the strongest argument they've got — because it's transformative. This argument helped Google win in court against the big book publishers when it was copying books into its massive Google Books database, and defeat claims that YouTube was profiting by allowing users to host and promulgate unlicensed material. Next, the AI companies argue that copyright-violating outputs like those uncovered by AI expert Gary Marcus, film industry veteran Reid Southern and the New York Times are rare or are bugs that are going to be patched.
But finally, William Fitzgerald, a partner at the Worker Agency and former member of the public policy team at Google, predicts Google will try to line up supportive groups to tell lawmakers artists support AI: Fitzgerald also sees Google's fingerprints on Creative Commons' embrace of the argument that AI art is fair use, as Google is a major funder of the organization. "It's worrisome to see Google deploy the same lobbying tactics they've developed over the years to ensure workers don't get paid fairly for their labor," Fitzgerald said. And OpenAI is close behind. It is not only taking a similar approach to heading off copyright complaints as Google, but it's also hiring the same people: It hired Fred Von Lohmann, Google's former director of copyright policy, as its top copyright lawyer....

[Marcus says] "There's an obvious alternative here — OpenAI's saying that we need all this or we can't build AI — but they could pay for it!" We want a world with artists and with writers, after all, he adds, one that rewards artistic work — not one where all the money goes to the top because a handful of tech companies won a digital land grab. "It's up to workers everywhere to see this for what it is, get organized, educate lawmakers and fight to get paid fairly for their labor," Fitzgerald says.

"Because if they don't, Google and OpenAI will continue to profit from other people's labor and content for a long time to come."

This discussion has been archived. No new comments can be posted.

Yes (Score:5, Interesting)

by DrMrLordX ( 559371 ) writes: on Monday January 15, 2024 @08:39AM (#64159413)

Yes. So long as any one individual human has the ability to read/consume published works of any kind, that human can feed an LLM, and there's very little anyone can do to stop that from happening.

- Re: (Score:2)
  
  by coofercat ( 719737 ) writes:
  
  You might not be able to stop it happening in a private lab or even private LLM, but if you want to make money from it, you bet it can be stopped.
  The question really becomes how far will law makers go? Europe has put its line in the sand, and since OpenAI is in the USA, what the USA does next is what's going to really matter here.
  - Re: (Score:2)
    
    by gweihir ( 88907 ) writes:
    
    Indeed. There is also an exception for research. But as soon as you monetize, you are crossing a big, fat, red line.
  - Well, about the law (Score:2)
    
    by fyngyrz ( 762201 ) writes:
    
    You might not be able to stop it happening in a private lab or even private LLM, but if you want to make money from it, you bet it can be stopped.
    So imagine this is the case, and meanwhile [insert country here] ignores this particular form of repressing learning, and leaps ahead in GPT/LLM ML systems, and possibly even to AGI. Because while GPT/LLM ML isn't AGI (or even AI) by any means, we can be absolutely certain that learning from the broadest possible training data will be involved when that tech arriv
  - Re: (Score:2)
    
    by DrMrLordX ( 559371 ) writes:
    
    If you obfuscate output to the point that it isn't rote regurgitation of material line-by-line, I don't think anyone can legally stop an LLM from using someone's IP as training data.
- Here's something they can do... Nightshade. (Score:2)
  
  by jools33 ( 252092 ) writes:
  
  So here is something an individual can do. Artists can poison their image data before posting. It means an additional step in post-processing, and it means that any ML system collecting its data from an artist's images will find that its data models are seriously messed up (thanks to the team working on this from the University of Chicago):
  https://www.technologyreview.c... [technologyreview.com]
  https://amt-lab.org/reviews/20... [amt-lab.org]
  - Re: (Score:2)
    
    by DrMrLordX ( 559371 ) writes:
    
    Wasn't that shown to be ineffective?
- - Re: (Score:2)
    
    by dfghjk ( 711126 ) writes:
    
    If a human buys a book, then loans that book to another human, would you describe that as "feeding the other human with copyrighted material"? Would you say that loaning the book is "no different legally and morally than said human copying the material directly"?
    Perhaps, if it suits your bad faith narrative. For the rest of us, the absurdity is obvious.
    An LLM cannot purchase a book, but a human can and then have the LLM "read" it. That is QUITE different legally (and morally) than copying.
    Let's face it, /. i
    - Re:Yes (Score:4)
      
      by gweihir ( 88907 ) writes: on Monday January 15, 2024 @09:25AM (#64159533)
      
      You are confused. Machines are not humans and automated processing of information falls under entirely different laws than humans reading stuff. The law knows that. Anybody with two brain-cells knows that. You do not. How pathetic.
      
      - Re: Yes (Score:2)
        
        by tchetch ( 6418204 ) writes:
        
        I don't see how the human brain is any less automated in processing data. Slower, but still automated. And once you know something from learning, you output it in a slightly different manner than the original, but it is still the same content overall. When I say "Inflation is pushing Price up", I am quoting a newspaper in a slightly different way, so should my memory be deleted because it's copyrighted content?
        
        Re: (Score:2)
        
        by gweihir ( 88907 ) writes:
        
        Yes, clearly you "do not see". That is a defect on your side. Unlike you, the law is not confused in this question.
        
        Re: (Score:2)
        
        by Bumbul ( 7920730 ) writes:
        
        Yes, clearly you "do not see". That is a defect on your side. Unlike you, the law is not confused in this question.
        The law is outdated. Just earlier today (my time zone) there was a story on /. on building a "general robotic brain". Let's have that brain observe the world on its own. Who would build the copyright-filter? Not doable, the law needs to be updated.
        
        Re: (Score:2)
        
        by gweihir ( 88907 ) writes:
        
        Bargaining.
        
        Re: Yes (Score:2)
        
        by tchetch ( 6418204 ) writes:
        
        "The Law" doesn't exist. Each country has its own law, which might be quite different from others. Is downloading copyrighted content illegal? Not in every country: in some, only distribution is illegal and downloading is legal, the idea being that the user may be unable to know whether a download is illegal, so the burden is left on the distributor only. And this is compatible with international treaties. Your own opinion is not "The Law" and the country you live in is not "The Country". So in some country what open
        
        Re: (Score:2)
        
        by gweihir ( 88907 ) writes:
        
        Denial.
        
        Re: (Score:2)
        
        by Gibgezr ( 2025238 ) writes:
        
        It's ok to deny things that aren't true.
        
        Re: Yes (Score:4)
        
        by Entrope ( 68843 ) writes: on Monday January 15, 2024 @09:51AM (#64159617) Homepage
        
        It's not a question of automated processing, but of how the material is represented. At least under US copyright law, "copies" are what are regulated, and:
        "Copies" are material objects, other than phonorecords, in which a work is fixed by any method now known or later developed, and from which the work can be perceived, reproduced, or otherwise communicated, either directly or with the aid of a machine or device. The term "copies" includes the material object, other than a phonorecord, in which the work is first fixed.
        This was intended to exclude the human brain, so even if someone creates a memory-reading "machine or device", human memories will not be copies. On the other hand, loading a program into RAM to execute is making a copy of it, so legally the LLMs are engaging in massive amounts of copying.
        As to your specific example, copyright also only protects the specific creative expression of an idea. Factual claims or hypotheses about economics are not protected by copyright, and your quote was (as you described it) both a paraphrase and too short to infringe anyway.
        
        
        Re: (Score:2)
        
        by gweihir ( 88907 ) writes:
        
        On the other hand, loading a program into RAM to execute is making a copy of it, so legally the LLMs are engaging in massive amounts of copying.
        And that is the actual fact of the matter. What we will now find out is exactly how corrupt the legal system is.
        
        Re: (Score:2)
        
        by snowshovelboy ( 242280 ) writes:
        
        At least under US copyright law, "copies" are what are regulated
        
        If it really was how you believe it is, there would be no performance royalty for cover bands.
        
        Re: (Score:2)
        
        by Entrope ( 68843 ) writes:
        
        I didn't enumerate every activity that copyright law regulates because they're not relevant here. Don't you have better nits to pick?
        
        Re: (Score:2)
        
        by snowshovelboy ( 242280 ) writes:
        
        Proof by contradiction doesn't require me to enumerate them all, it only needs one example.
        
        Re: (Score:2)
        
        by Entrope ( 68843 ) writes:
        
        You might want to double-check what you think was proven there, chief. My argument didn't rely on my one example being an exhaustive list of what copyright law regulates. It relies on my one example being an actual example.
        
        Re: (Score:2)
        
        by snowshovelboy ( 242280 ) writes:
        
        The difference is that a human can be sent to prison when it disobeys a court order.
      - Horses, barns (Score:4, Interesting)
        
        by fyngyrz ( 762201 ) writes: on Monday January 15, 2024 @10:28AM (#64159713) Homepage Journal
        
        automated processing of information falls under entirely different laws than humans reading stuff
        Well, there's the law WRT the training of the models, however that actually pans out in court, and there's what's definitely going to happen regardless.
        These horses are well out of the barn and in the hands of the worldwide open source community, white and black hats both. There's absolutely no chance of putting them back — there isn't even any current law that would enable doing so, were it even possible (see why below.) Also, in terms of national advantage, any country that handicaps itself in this race will simply fall far behind the ones that don't. It would take monumentally stupid national leadership to allow that to happen. I leave it to the reader to imagine which countries might actually be that stupid, and which ones will proceed apace no matter what anyone thinks.
        It's important to understand GPT/LLM ML tech isn't definitively one thing, it's four:
        [1] - the engines that do the training (completely independent of the nature of [2] and [3])
        [2] - training data
        [3] - the models resulting from [1] processing [2]
        [4] - the engines that process [3] (completely independent of the nature of [2] and [3])
        [1] and [4] can and will continue to be developed, even if [2] and [3] are crippled in the development environment, as developing the tech implementing the engines isn't legally or technically hamstrung by the data used to develop the engines — in other words, [1] and [4] can be advanced indefinitely, including commercially, using entirely non-infringing data, without any concern for copyright issues at all, and then [2] and [3] can be multiply and variously and independent of supervision developed elsewhere, by anyone. This distributed effort is already well under way; you can put it on a reasonable desktop machine in any combination. (Also true for generative imaging.)
        The engines ([1]) will then be used to develop truly robust and well-informed models. Thus breeding new horses ([3]) nowhere near anyone's barn. We already have instances of [4] and [3] being widely, and independently, distributed. On current tech desktop machines. Further, layer-at-a-time methodology [github.com] has recently been disclosed for [4] which allow much larger models to be processed on machines that previously had no chance of running them.
        Think of [1] and [4] as paint programs and image display programs, respectively, while [2] is "other people's images" and [3] is "images you make/derive in paint programs." Neither paint programs (as [1]) or image display programs (as [4]) infringe; you can't effectively stop people from loading images (as [2]) or producing new derivative images and then sharing them (as [3].) Between the robust encryption available and the black distribution channels that abound, and the fact that the tech is completely and unequivocally out the door, it's an impossible task to rein this in outside of completely toothless for-show enforcements.
        This is one of those circumstances where the law, well-intentioned or not, notionally right or not, has absolutely no chance of effective regulation of the technology in question.
        
      - Re: (Score:2)
        
        by DrMrLordX ( 559371 ) writes:
        
        Which laws? Do they apply differently to an AGI?
        
        Re: (Score:3)
        
        by gweihir ( 88907 ) writes:
        
        There is no AGI.
      - Re: (Score:2)
        
        by Gibgezr ( 2025238 ) writes:
        
        Are there actually any RELEVANT laws that "know that"? "Doing X but on a computer" is not normally a part of any laws.
    - Re: Yes (Score:2)
      
      by Junta ( 36770 ) writes:
      
      If a human is "fed" a lot of Disney animation, and then proceeds to create an animation featuring characters that are dead ringers for Ariel, Aladdin, and so forth, they can't hide behind "oh I didn't copy verbatim, I processed it". A person who writes a song that reproduces the hook of a copyrighted song gets sued, even if the rest of the song seems to be an original work.
      In LLM land, it's been repeatedly shown that LLMs will reproduce copyrighted material. It has been shown, text
      - Re: (Score:3)
        
        by DrMrLordX ( 559371 ) writes:
        
        Funny you should mention that. There are obvious 90% knockoffs of entertainment products all the time. They're clearly derivative and often awful. Then you have the less-derivative bodies of work, such as everything done by Don Bluth after he left Disney. Yet it's undeniable that Bluth was heavily influenced by Disney.
        Humans are often affected by exposure to works protected by copyright. It changes their creative output indelibly.
    - Re: (Score:2)
      
      by Freischutz ( 4776131 ) writes:
      
      If a human buys a book, then loans that book to another human, would you describe that as "feeding the other human with copyrighted material"? Would you say that loaning the book is "no different legally and morally than said human copying the material directly"?
      Perhaps, if it suits your bad faith narrative. For the rest of us, the absurdity is obvious.
      An LLM cannot purchase a book, but a human can and then have the LLM "read" it. That is QUITE different legally (and morally) than copying.
      Let's face it, /. is filled with bad faith actors. You know, the type that, in a moral world, would have people sent to prison.
      It's more like a human, call him Abe, finds another human's sketchbook (call that human Bob), which Bob had left on a table while he went to the vending machine to get another Diet Coke. Abe copies Bob's sketchbook and uses bits and pieces of the stuff that Bob drew in his sketchbook to create his own "original art", then sells it on Amazon. Given how lazy Abe is, the Amazon product label ends up being "I'm sorry but I cannot fulfill this request it goes against OpenAI use policy.". [slashdot.org] What Abe should be doing is ask Bo
  - Re: (Score:2, Troll)
    
    by Luckyo ( 1726890 ) writes:
    
    This is the next stage of the mentality of those that want monopoly on knowledge. They want you to pay royalties for having a right to learn. Not just distribute.
    This is what will kill any civilization that attempts it, because passing knowledge on to the next generation is pre-requisite for survival of any civilization.
    It is a great dream of upper class who cannot perform and compete to lock their status in so they don't have to compete with those coming from below. Just tax the ability to learn, so that p
    - Re: (Score:2, Insightful)
      
      by gweihir ( 88907 ) writes:
      
      Machines do not "learn".
      - Re: (Score:3)
        
        by Bumbul ( 7920730 ) writes:
        
        Your input would be much more valuable if you could use other arguments than simple semantics. Besides, creating connections between artificial neurons is as close as it can be to the definition of learning as it happens in the human brain: creating connections between neurons.
  - Re: Yes (Score:2)
    
    by beelsebob ( 529313 ) writes:
    
    Indeed - that ironically *would* break copyright law, as the human would be directly copying the work. Whether AI training violates copyright is a much murkier question (and imo very unlikely to amount to copyright violation)
  - Re: (Score:2)
    
    by thegarbz ( 1787294 ) writes:
    
    Untrue. If a human feeds an LLM with copyrighted material, then that is no different legally (and morally) than said human copying the material directly.
    Memcpy of this post into your brain is not copyright infringement. What do you propose next, we sue every person in the world with photographic memory for everything they've ever seen?
It's a NEW revenue source (Score:5, Insightful)

by Bruce66423 ( 1678196 ) writes: on Monday January 15, 2024 @08:41AM (#64159417)

So the idea that they have an inherent right to a part of this new flow is not necessarily rational; books and newspapers were not created on the basis that they would gain an income flow from being used to develop AI.
So how do we decide? The instinctive reaction, to kick the tech companies because everyone hates them and we're Philistines if we don't do all we can to support the arts, means that we're in danger of assuming that they have a right to a totally new revenue source - despite the damage that will do to the development of AI.

- Re: (Score:3, Insightful)
  
  by gweihir ( 88907 ) writes:
  
  It is really a lot easier: The LLM makers did copy copyrighted material and then processed it in a fashion decidedly not covered by fair use. That means the product of that processing is illegal and must be deleted. That they did all that commercially makes them liable financially and likely makes the whole thing criminal. As OpenAI has this process as its very business model, OpenAI may well be a criminal enterprise in addition.
  - Re:It's a NEW revenue source (Score:4, Interesting)
    
    by Luckyo ( 1726890 ) writes: on Monday January 15, 2024 @09:42AM (#64159583)
    
    So does a human brain. To process anything, you must first commit it to memory. Notably we already had the same argument with browser caches.
    The answer remains no. Your insane interpretation of copyright to keep competition down remains factually and objectively wrong.
    Copyright does not, and has never limited ability to learn from copyrighted material. That is expressly not touched ever, everything that was ever created by humans was an interpreted processing of something that was created by other humans in the past. It's how all mammalian learning processes work. And that is also how BD ML processes work.
    And it is true that throughout the history, upper managerial class strived to prevent just that from happening. Upcoming smart individuals among the rabble actually learning things, and therefore being able to replace the managers with much more talented people who weren't in the upper class clique. It's one of the defining features of totalitarian left today, who invent entire systems to block talented people who can do what they do better and with less societal damage from advancing in social class. And primary method of advancement in social class is learning from existing things.
    It's therefore no surprise that it's usually the same people who advocate covering learning processes with copyright also advocate for things like DEI, as all of these things have the same primary goal in mind. Locking in social classes, allowing only those vetted to be utterly unthreatening to rise and ensuring that no talented usurpers rise up.
    
    - - Re: It's a NEW revenue source (Score:2)
        
        by beelsebob ( 529313 ) writes:
        
        How are they different? As far as I can tell we are just particularly sophisticated machines.
      - Re:It's a NEW revenue source (Score:4, Insightful)
        
        by Luckyo ( 1726890 ) writes: on Monday January 15, 2024 @12:35PM (#64160079)
        
        Your desperation to end the discussion you were so keen to start, until the obvious flaws in your narrative were pointed out to you, tells us more about your views than the text you type out.
        
  - Re: It's a NEW revenue source (Score:2)
    
    by beelsebob ( 529313 ) writes:
    
    In what way is the way they processed it not covered by fair use? Creating a large database of the information in the text (whatever the format) has been found to be fair use already. Reading the material, and using the knowledge gained to create new things has also. This lies about half way between the two, and I don't see any reason why it wouldn't be found to be fair use too.
  - Re: (Score:2)
    
    by SandorZoo ( 2318398 ) writes:
    
    The LLM makers did copy copyrighted material and then processed it in a fashion decidedly not covered by fair use.
    That's not clear at all. Image search engines download images and process them to create thumbnails, which they then make available in image search results. That was ruled transformative enough to be worth a fair-use exception (e.g. by Perfect 10 vs Amazon & Google [wikipedia.org]). Google did a similar thing with Google Books, except they did keep copies, and that still got ruled transformative enough to be fair use (Authors Guild vs Google [wikipedia.org]).
    It seems to me creating an LLM from copyrighted text is one hell of a lot more
- Re: (Score:2)
  
  by john83 ( 923470 ) writes:
  
  Maybe I'm old-fashioned, but I hate copyright law more than I hate the tech industry.
  - Re: (Score:2)
    
    by gtall ( 79522 ) writes:
    
    Maybe you hate it because you have not produced any copyrightable material that you spent a lot of time, money, and effort producing only to have some bot or company take it and take the profit you had counted on obtaining.
    - Humbug (Score:2)
      
      by Bruce66423 ( 1678196 ) writes:
      
      Until eighteen months ago no creative would be counting on any income from AI using their material. That's what I mean by a 'new revenue source'. Thus it can be argued that this new income has the same legitimacy as the new income that authors and film makers have gained because their creatures in Congress extended the duration of copyright, i.e. zero.
      - Re: Humbug (Score:2)
        
        by beelsebob ( 529313 ) writes:
        
        I think the question the court is more likely to ask is whether the AI is going to make money in place of the original author. If you write a great work like Principia Mathematica that contains cutting edge information on a topic, and is the one source for all this information together; and then OpenAI gets all the revenue for reading it and then handing out that information, you're likely to be pissed off.
        That said, if I had an amazing memory, read your book, and started handing out the informati
        
        NOT what AI is doing (Score:2)
        
        by Bruce66423 ( 1678196 ) writes:
        
        It in no way - unless subverted into doing so - trots out the same material as it has been trained on. It does something very other. No normally behaving AI as we have them today would ever be done for plagiarism.
- Re:It's a NEW revenue source (Score:4, Insightful)
  
  by Entrope ( 68843 ) writes: on Monday January 15, 2024 @09:54AM (#64159631) Homepage
  
  Your argument is very similar to saying that FTX should have been allowed to steal client funds because they were applying it to a business that didn't exist when laws against bank fraud were created.
  Coming up with a new business model that relies on breaking the law does not excuse that law-breaking.
  
  - Begging the question (Score:2)
    
    by Bruce66423 ( 1678196 ) writes:
    
    The discussion here is whether the use for training AI - a totally new role for the copyright material - constitutes fair use or not. The FTX scam is clearly separating its owner from their money, something which has always been defined as theft. Copyright - by definition - is talking about something that is not lost when it is copied.
- Re: (Score:2)
  
  by scamper_22 ( 1073470 ) writes:
  
  It's a complex issue, but at the core, you are right that they want to be a part of the new flow. I'm not actually opposed to it.
  I think it was Toyota that has a small group of trades people who know how to do things manually and build things by hand. They employ them so the knowledge is not lost by simply automating everything.
  It's extremely complicated and in no way is it easy to figure out how to pay humans who technically might not be 'needed' for the job anymore. However, I think it conceptually a good
  - AI will kill creativity? (Score:2)
    
    by Bruce66423 ( 1678196 ) writes:
    
    That's an extreme assumption, and one that leaves creatives looking less than worthwhile.
    - Re: (Score:2)
      
      by scamper_22 ( 1073470 ) writes:
      
      Most creatives are less than worthwhile.
      I think it is very few 'creatives' who actually get to make money on their creativity. A lot of creatives make their bread and butter on things that are more routine, and then use their creativity in small slices or other projects.
      An artist might get a job making boring corporate diagrams or generic game/movie imagery. I think in a lot of those cases 'AI' could produce something 'good enough' for most uses. I'd still find it sad that an artist would not be able to make a
Yes (Score:4, Informative)

by nospam007 ( 722110 ) * writes: on Monday January 15, 2024 @08:43AM (#64159423)

They bought the books, the newspapers, the magazines... and the AI read them and remembers.
Like every single kid.

- Re:Yes (Score:4, Interesting)
  
  by gweihir ( 88907 ) writes: on Monday January 15, 2024 @08:52AM (#64159443)
  
  You are confused. Machines are different from humans. The law is _very_ clear on that. Anybody with two actually working brain-cells is too.
  
  - Re: (Score:2)
    
    by dfghjk ( 711126 ) writes:
    
    Says the guy who claims OpenAI is a "criminal enterprise", a real legal expert. LOL. Where is the law "very clear" that machines are different from humans? Citations please. Hell, corporations are people, according to "law".
    And are you claiming that you do NOT have two working brain cells, or that you are not human? Asking for a friend so that he may know how to target you in his lawsuit.
  - Re: (Score:2)
    
    by penguinoid ( 724646 ) writes:
    
    No you're confused: humans are machines. Messy biochemical machines to be sure, but made of atoms and like any machine performing energy + material in ==> work + calculation + products + waste out. No magic, soul, spirit, whatever involved, just atoms and physics. And we understand some of the machinery and can change it.
    Just a couple years ago, we sent instructions for the nucleus to produce a new product, so as to update our antivirus software. A century ago we hacked the blood sugar regulation mechani
  - Re: Yes (Score:2)
    
    by beelsebob ( 529313 ) writes:
    
    The law is, but not in relation to this issue. The two working brain cells bit... can you point out which part of the brain makes it something more than a complex well optimised machine?
  - Re: (Score:2)
    
    by thegarbz ( 1787294 ) writes:
    
    The law is _very_ clear on that.
    The law has not once ruled on the machine concept of learning and its relation to copyright infringement. It's not documented in actual law nor case law. There's nothing clear about it at all, and so far no legal cases have actually come up addressing this issue (instead they all focus on whether a machine can own copyright or the legal impact on the user).
    Stop pretending your fever dream is reality. Every post you've made here in this story is unsubstantiated rubbish. The world doesn't revolve around your
  - Re: (Score:2)
    
    by WaffleMonster ( 969671 ) writes:
    
    You are confused. Machines are different from humans. The law is _very_ clear on that. Anybody with two actually working brain-cells is too.
    If it's _very_ clear you should have no problem whatsoever citing relevant laws and or case law. Of course you won't support your baseless claims in this way because such evidence doesn't actually exist.
  - Re: (Score:2)
    
    by Gibgezr ( 2025238 ) writes:
    
    Please quote the relevant law. I can't find it. There's copyright law about fixing works in a physical form, but that would require the data of the model to resemble the object being copied, which it doesn't. There's precedent in court rulings that says that we can do special things with data storage and processing, meaning that indexing massive amounts of data is NOT a violation of copyright though.
- Re: (Score:2)
  
  by Njovich ( 553857 ) writes:
  
  You are talking about thousands of nodes that received copies of this data. If these are 'kids', as you see them, are you saying that I can legally buy 1 book, make thousands of copies, and hand them over to kids? And these kids are then free to make their own derivative works and sell those?
  - Re: (Score:2)
    
    by gweihir ( 88907 ) writes:
    
    That nicely sums it up. The AI fanatics have no working minds...
  - Re: (Score:2)
    
    by nospam007 ( 722110 ) * writes:
    
    "You are talking about thousands of nodes that received copies of this data. If these are 'kids', as you see them, are you saying that I can legally buy 1 book, make thousands of copies, and hand them over to kids? "
    Not you, but people called 'libraries'.
- Re: (Score:2)
  
  by snowshovelboy ( 242280 ) writes:
  
  You lost me at the very first step. AI can't buy anything, including books, because it's not a person, and doesn't represent a person.
- Re: (Score:2)
  
  by Calydor ( 739835 ) writes:
  
  Show me a kid (or an adult, not picky on this point) who can flawlessly recite the millions of books, newspapers, images, movies, etc. they have read, seen, and so on.
Paying? They need to stop stealing! (Score:4, Interesting)

by gweihir ( 88907 ) writes: on Monday January 15, 2024 @08:51AM (#64159441)

And delete the models based on their massive campaign of commercial intellectual theft.

- Re: (Score:2)
  
  by thegarbz ( 1787294 ) writes:
  
  Please delete your own memory of this topic. We're tired of listening to your rubbish. Incidentally rubbish that couldn't exist if you didn't steal the story into your brain in order to write an (un)informed opinion on it.
Alternative (Score:2)

by ThosLives ( 686517 ) writes:

Why not send the LLMs to school, and have them learn like humans?
I mean, people going to school don't have to worry about copyright or any of that nonsense when they are learning that data. The copyright is fulfilled by the purchase of the textbooks and other materials. Just because a computer learns "faster" than a person, and can answer questions "faster" than a person, I don't see a fundamental difference in what these LLMs are doing versus some teenager just reading the internet all day.
I'm not a huge
- Re:Alternative (Score:5, Interesting)
  
  by Bert64 ( 520050 ) writes: <bertNO@SPAMslashdot.firenzee.com> on Monday January 15, 2024 @09:00AM (#64159467) Homepage
  
  Well another part of copyright is that the author can license the work for specific purposes. These textbooks are licensed for teaching humans, but not for teaching machines - thus you'd need to negotiate a different license with the publisher.
  The textbooks are also not supposed to be used alone, they are generally meant to be accompanied by a live teacher and sometimes practical demonstrations.
  
  - Re: (Score:3)
    
    by penguinoid ( 724646 ) writes:
    
    These textbooks are licensed for teaching humans
    Wow I never knew that. Does that mean I'm in violation if I use a textbook to balance a table?
  - Re: (Score:2)
    
    by Sloppy ( 14984 ) writes:
    
    Authors can license textbooks instead of selling them, but do they?
    I guess I wouldn't be surprised if kids these days (yes, I'm old) are agreeing to EULAs when they open their textbook apps. But I know for sure that tens of millions of people still alive today purchased textbooks instead of licensing them. If those textbooks still exist, then the knowledge is attainable without any contracts, so there's no means of discriminating against computers.
    Just avoid the weird textbooks (ones that require special s
    - Re: (Score:2)
      
      by Bert64 ( 520050 ) writes:
      
      Authors can license textbooks instead of selling them, but do they?
      Yes, read the fine print. Just because you bought the physical book doesn't mean they don't try to restrict what you can do with the contents of it.
  - Re: Alternative (Score:2)
    
    by beelsebob ( 529313 ) writes:
    
    They can license it for specific purposes, but only when the purpose violates copyright in the first place. It doesn't seem likely to me that this does violate copyright.
- Re: (Score:2)
  
  by dfghjk ( 711126 ) writes:
  
  Schools contribute to learning but learning doesn't end there.
  "I don't see a fundamental difference in what these LLMs are doing versus some teenager just reading the internet all day."
  Exactly. Some of those teenagers have photographic memories, yet there are no dumbasses arguing that they should have to pay more.
  Faced with the idea that machines can think like humans and replace humans for many, even most, tasks, the natural reaction would be to consider what life would be like if we didn't have to work.
- Re: (Score:2)
  
  by gweihir ( 88907 ) writes:
  
  Machines cannot "learn like humans". Seriously.
Land grab (Score:3)

by Bert64 ( 520050 ) writes: <bertNO@SPAMslashdot.firenzee.com> on Monday January 15, 2024 @09:03AM (#64159481) Homepage

not one where all the money goes to the top because a handful of tech companies won a digital land grab
Only that's already the status quo with copyright. The vast majority of profits from selling copyrighted works go to a small handful of companies. These companies continue to control the copyrights long after the original author is dead, as is anyone who was around when the work was first released.
We want a world with artists and with writers, after all, he adds, one that rewards artistic work
A system with excessively long copyrights does not reward artistic work, it rewards sitting on your ass creating absolutely nothing new and living off royalties from something created 50+ years ago. If you want to reward artistic work, copyright needs to be much shorter, and then there would be much more public domain content that AI could ingest.

continue to profit from other people's labor (Score:2)

by dfghjk ( 711126 ) writes:

Isn't this literally what writing is? Labor that other people profit from? Isn't that what content is? A way of conveying knowledge.
The objection is that "the enemy" is benefiting, the enemy defined as a corporation with money. If you don't want anyone to learn from your effort, don't create content. AI is using content in exactly the way it is intended to be used, and that makes people mad.
Please, your honour! (Score:5, Insightful)

by jenningsthecat ( 1525947 ) writes: on Monday January 15, 2024 @09:08AM (#64159489)

... claiming that the company simply couldn't operate without ingesting copyrighted materials ...
"Yes, we're stealing from creators, but... but... but... Innovation! Bizness! Benefactors! We're above the considerations which bind mere average citizens! We're a corporation!"
I really don't think my characterization exaggerates much, if at all. The fact that these clowns not only think like that, but also say it aloud, demonstrates just how delusional, megalomaniacal, and outright dangerous they are.

- Re: (Score:2)
  
  by gweihir ( 88907 ) writes:
  
  Indeed. And their deranged fanbois (also here) are no better.
- Look at the purpose of copyright (Score:2)
  
  by Bruce66423 ( 1678196 ) writes:
  
  The US Constitution is very clear - congress is given the power:
  'to promote the Progress of Science and useful Arts, by securing for limited Times to Authors and Inventors the exclusive Right to their respective Writings and Discoveries'
  If the effect of the contested material is to PREVENT the progress of science and useful arts because the copyright holders are demanding too much, then the purpose of the power is frustrated.
  You are falling for the idea that the copyright owners - very often big corporation
Grammatical error (Score:2)

by Hagaric ( 2591241 ) writes:

Not "Are using" -"Have used". Once trained on the content, it is of no further value to AI, and can safely be discarded. What is a fair price for a one-time analysis, or "reading" of a copyrighted piece?
- Re: (Score:2)
  
  by gweihir ( 88907 ) writes:
  
  Simple: If you did use my content without asking, the "fair" price is anything I want and I can insist on them deleting my data from their model in addition. The only exception would be fair use. That clearly does not apply.
  - Re: (Score:2)
    
    by aldousd666 ( 640240 ) writes:
    
    Fair Use probably does apply unless they're reproducing copyrighted output... having seen the work during training isn't a technical violation of copyright. As we are about to see in the courts.
- Re: Grammatical error (Score:2)
  
  by zkiwi34 ( 974563 ) writes:
  
  If it's not stored it cannot be recalled by a computer.
Google can do it but you can't (Score:3)

by Rosco P. Coltrane ( 209368 ) writes: on Monday January 15, 2024 @09:20AM (#64159515)

it's transformative. This argument helped Google win in court against the big book publishers when it was copying books into its massive Google Books database, and defeat claims that YouTube was profiting by allowing users to host and promulgate unlicensed material.
Look up Family Guy on Youtube: you will find hundreds of really long videos with almost all episodes of Family Guy, interspersed with bogus content from some video game, and/or with the video zooming in and out constantly to fool the Youtube copyright bot.
I'd argue that's transformative: some dude took a bunch of Family Guy episodes and turned them into really annoying, barely watchable compilations. They don't really resemble the original episodes. And yet Google keeps taking them down (and the dudes posting them keep re-uploading them, and it's been going on for years...)
How come Google gets away with scanning books almost verbatim, but dudes uploading heavily fucked up TV series episodes are copyright violators?
I'll tell you how: Google has MONEY and plenty of lobbyists in Washington. So does Microsoft. And that's how both Google and Microsoft will somehow be allowed to develop their AI businesses without paying anyone anything, while you can't download a 70-year-old Disney cartoon without paying royalties. Mark my words!

The training set is not transformative (Score:2)

by penguinoid ( 724646 ) writes:

The AI might be transformative; but do they delete all their data as soon as they get it, or do they keep a copy of their training set data?
Bankruptcy (Score:2)

by zkiwi34 ( 974563 ) writes:

Would probably achieve that goal
Then "piracy" is perfectly legal (Score:2)

by Sebby ( 238625 ) writes:

The AI companies also argue what they're doing falls under the legal doctrine of fair use — probably the strongest argument they've got — because it's transformative
Ok, cool. All I have to do is take the latest hot movie, copy it, transcode it, add a screen of text of why the studios hated all the strikes, and re-publish it as I wish, because it's "transformative". Perfectly legal by their logic.
Re: (Score:2)

by account_deleted ( 4530225 ) writes:

Comment removed based on user account deletion
Another failure of the "west?" (Score:2)

by bool2 ( 1782642 ) writes:

Whilst western entities predictably waste their energy trying to slow down our collective AI efforts with regulation and greedy application of copyright laws, China and others are quietly beavering away, not giving a flying fuck about any of it.
If we don't stop this nonsense, the next AI leap will come from China and we're not going to be ready for it.
Could help pop the bubble (Score:2)

by Tx ( 96709 ) writes:

AI companies are currently far from profitable - OpenAI for example is burning through cash at a furious rate, and their path to profitability is based on what seems to me like ludicrous projections of future paying customers, given the amount of competition building up out there. Sooner or later, the AI bubble is going to pop, and then we'll see what sustainable business models will survive. A copyright-holder cash-grab could bring that forward if successful, adding a potentially massive cost to the alread
Yeah, it's all owned by the same handful of people (Score:4, Interesting)

by rsilvergun ( 571051 ) writes: on Monday January 15, 2024 @10:27AM (#64159709)

there's about 500 families that make up the 1% of the 1% and own basically everything (or at least a controlling share of it, but honestly we're splitting hairs at that point).

They own the media conglomerates that in turn own the IP that the AI that they also own is using. There'll be a little back and forth among them but they'll work out deals and that'll be that.

Us peons? We'll own nothing and be happy [wikipedia.org], right?

China (Score:3)

by GeLeTo ( 527660 ) writes: on Monday January 15, 2024 @10:54AM (#64159763)

Even if U.S. courts deem the current practice of data scraping for LLM training illegal, the laws will change very quickly. Imagine a situation where China has access to the smartest LLMs and the USA is limited to AI trained only on public domain data.

Conqueror (Score:2)

by Baby Duck ( 176251 ) writes:

If you kill 10 people, you're a mass murderer. Kill 10,000, and you're a conqueror.
Google Books, YouTube, and AI prove copyright infringement is OK, as long as done en masse.
AI's Bounty IS the Data They Thieve (Score:2)

by BrendaEM ( 871664 ) writes:

It's all about stealing from people. Make your own training data.
As long as they are compliant (Score:2)

by aldousd666 ( 640240 ) writes:

If they are reproducing copyrighted output, then they'll get in trouble and have to pay. If they only train on it, and training is found to be fair use because it neither reproduces a copyrighted work nor deprives the author of the ability to make money on their work, then they will continue to avoid paying for it, yes.
Yes (Score:2)

by OrangeTide ( 124937 ) writes:

Current political powers are not interested in regulating new business. They all receive money through PACs and back channels from the people making the technology. There will eventually be an era of reform, but that can't happen until the damage is done.
This is extraordinarily dangerous (Score:2)

by WaffleMonster ( 969671 ) writes:

Regardless of one's opinions about all this AI shit, the notion that copyright should be extended to impose constraints on the reader... you can't profit from having read my book unless you agree to reimburse me... is an extraordinarily breathtaking change in policy.
At present copyright applies entirely to "write" operations (e.g. public performances, fixed verbatim and derivative copies). It explicitly does not apply to "read" operations including underlying facts and ideas. Copyright regime has no province ov
My business is robbing banks (Score:3)

by honestmonkey ( 819408 ) writes: on Monday January 15, 2024 @03:09PM (#64160513) Journal

I need to rob banks for my business to be profitable, there's just no way to do it without robbing banks. So I need to continue to rob banks for what I do to be a viable business. People can complain about "laws" and "stealing", but it's my business, so what can I do?

- Re: (Score:2)
  
  by gweihir ( 88907 ) writes:
  
  Machines do not "learn". Seriously, stop claiming insightless crap.
