China

OpenAI Warns Limiting AI Access To Copyrighted Content Could Give China Advantage

OpenAI has warned the U.S. government that restricting AI models from learning from copyrighted material would threaten America's technological leadership against China, according to a proposal submitted [PDF] to the Office of Science and Technology Policy for the AI Action Plan.

In its March 13 document, OpenAI argues its AI training aligns with fair use doctrine, saying its models don't replicate works but extract "patterns, linguistic structures, and contextual insights" without harming the commercial value of the original content. "If the PRC's developers have unfettered access to data and American companies are left without fair use access, the race for AI is effectively over. America loses, as does the success of democratic AI," OpenAI stated.

The Microsoft-backed startup criticized European and UK approaches that allow copyright holders to opt out of AI training, claiming these restrictions hinder innovation, particularly for smaller companies with limited resources. The proposal comes as China-based DeepSeek recently released an AI model with capabilities comparable to American systems despite development at a fraction of the cost.
Comments Filter:
  • Aaron Swartz (Score:5, Insightful)

    by systemd-anonymousd ( 6652324 ) on Thursday March 13, 2025 @04:12PM (#65231085)

    Carmen Ortiz used brutal tactics to drive Aaron Swartz to suicide over doing this for *public domain* documents. Now unethical corporations do it daily for all copyrighted content and it's just fine.

    • Re:Aaron Swartz (Score:4, Insightful)

      by Mspangler ( 770054 ) on Thursday March 13, 2025 @04:22PM (#65231125)

      The Chinese ignore our copyright laws, so we should get to ignore them too!

      That's the thinking.

      • Literally irrelevant to my comment. Maybe someone will punish you so you stop top-posting

• I mean, OpenAI are right that we shouldn't be following Europe's opt-out model.

        We should be doing an opt-IN model. I.e. ask permission, not require me to fill out forms just to plead with some company not to steal my shit.

      • $mega_corporation warns that being prevented from doing $something_illegal_immoral_or_unethical will cause $random_bogeyman_effect, that's the thinking.
    • No, he hanged himself because he had the balls to steal but not the balls to go to jail for 6 months for stealing. He physically broke into places, he accessed networks he wasn't allowed to, then redistributed the goods he stole; he was repeatedly offered a light plea deal but decided to take his chances and ended up being charged with the maximum. Harsh, yes; unexpected, no. This isn't subversion of the system but flat-out criminality. He couldn't cope with the consequences of his actions no one
      • 1. He didn't steal, and you're confidently ignorant of even the most basic facts. He was charged with breaking into a server cabinet under the "Computer Fraud and Abuse Act."

        2. Carmen Ortiz used the massive power of the justice machine to flip his friends and loved ones and make them betray him. She made him face a maximum penalty of 50 years so that he would be pressured into admitting he was a cyber criminal and pleading down, and so she could get a career win under that desirable charge. He submitted a coun

        • I thought, for some reason, that it was 30 years, but regardless, I agree 100%.

          I wouldn't be surprised if Zuckerberg is held criminally liable in the upcoming case though, with the way everything is playing out with the judges politically right now.
        • 1) No, those were only some of the charges. You're being wilfully ignorant here. On July 11, 2011, he was indicted by a federal grand jury on charges of wire fraud, computer fraud, unlawfully obtaining information from a protected computer, and recklessly damaging a protected computer; Swartz ultimately faced 13 federal charges. 2) Ok. So? Who cares about her motivation? Did the charges stick, as in, did he do it? Even he said as much, and the evidence of that is clear. Could have hated him or had a hardon for hi
          • 1. MIT felt no serious crime was committed and didn't want to press charges. Carmen Ortiz is the one who pursued i

            2. Grand juries always choose to prosecute

            3. Who cares about Carmen Ortiz' motivation in driving a young man to suicide by pursuing a bogus criminal case using the full weight of the federal government, while flipping his friends and an ex? And in the next sentence you claim she had no choice, so you are actually weighing the question of her motivations (and are wrong about AG discretion).

            Opinio

            • 1) You say that like that's not how this works. It's her job too. When a guy broke into my house I dropped the charges; the state decided otherwise. Why? The deed had greater implications beyond the palatability of my damage to me. 2) Good job skipping over the plethora of charges and focusing on the inevitability of the system. 3) Again, nothing bogus about the charges; he was merely charged to the full extent. He drove himself to suicide, or rather his many many actions did, or his not wan
    • Carmen Ortiz used brutal tactics to drive Aaron Swartz to suicide over doing this for *public domain* documents. Now unethical corporations do it daily for all copyrighted content and it's just fine.

      Corporations are clearly more important than people. That's been the message in the USA for decades now.

    • by allo ( 1728082 )

      Do you argue that one should drive people at large companies to suicide, or that information should be free? Quite a few people bring up this argument and always seem to side with the wrong point of view of denying people access, which in this context implies they would rather drive people to suicide than allow them to use public domain documents.

  • If it's posted in public, it's public
    If it's behind a paywall or otherwise restricted, it's private

    • by gweihir ( 88907 )

      Public does not mean "do anything you like with it". Seriously. Stupid "argument" is stupid.

      • Well, his mom was in public and I pretty much did anything I could think of with her…
      • It actually does, at least online. If it's not in the robots.txt file or if there isn't a robots.txt file, it's fair game. Even with a robots.txt file it isn't illegal to scrape it. Some services are justified scraping it anyway... like NCII detection services.

        With that said, I think a BTC address should be placed into the robots.txt file for "restricted" pages, so that anyone with a conscience can help pay the bandwidth bills for what they scrape at reasonable prices.

        In Aaron Swartz' case, wasn't it
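        For what it's worth, robots.txt is purely advisory at the protocol level; a crawler that wants to honor it can check paths against it with Python's standard urllib.robotparser. A minimal sketch (the example.com site and the ExampleBot user-agent name are made-up placeholders):

```python
import urllib.robotparser

# Parse a robots.txt body directly; against a live site you would instead call
# set_url("https://example.com/robots.txt") followed by read().
rp = urllib.robotparser.RobotFileParser()
rp.parse([
    "User-agent: *",
    "Disallow: /restricted/",
])

# Allowed: the path is not disallowed for this user agent.
print(rp.can_fetch("ExampleBot", "https://example.com/public/page"))   # True

# Disallowed: /restricted/ is off-limits to all user agents.
print(rp.can_fetch("ExampleBot", "https://example.com/restricted/x"))  # False
```

        Note that honoring the file is entirely up to the client; nothing in the protocol enforces it, which is the commenter's point.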
    • Nah, private means nothing if you opened a file with an "AI enhanced" application.
  • by WimBo ( 124634 ) on Thursday March 13, 2025 @04:27PM (#65231133) Homepage

    I've not understood the argument against having AI reading copyrighted content. That would be like a scholar only reading things out of copyright.

    • It's reading it without paying.

      The scholars have to respect it. They buy books, their institutions pay for access to journals... You're arguing that AI should have an unfair advantage over humans.

      • But they also read everything freely published on the web. The argument is that an AI has to pay to learn from things a human doesn't. They're also arguing that paying for access like a person isn't enough. They should be punished for illegally accessing material they have no right to, but you're saying that they should have to pay, or pay more, to learn from the same things humans can access for free or for less.
        Having the same right to learn from what they can lawfully access isn't giving AI an unfair advantage; qu
        • I'm not a fan of the AI companies but copyright is already hugely unfairly tilted against its explicit purpose in favor of being a perpetual rent seeking corporate profit engine. We shouldn't further gut fair use just because the copyright mafia wants a piece of the AI hype money.

          The AI companies are doing it for profit. They're not our friends. It's not an expansion of copyright; what they're doing is illegal now. Whether it should be is a valid question, but what they are doing doesn't meet any of the typical definitions of fair use because it's commercial in nature, except maybe when they're handing out models like candy, because that actually benefits us directly. So, while there's an argument to be made, there are also distinctions to be drawn.

      • Strawman.

        Nobody is saying AI should have free access to a paid journal or something. Just that it should be able to consume what any other person could for free on the web.

        Not saying that I agree with this- the consequence, I imagine, is that *everything* becomes paywalled- but that's their argument, and you're misconstruing it.
      • by allo ( 1728082 )

        Wait until you learn about sci-hub. Research would be almost impossible without shadow libraries.

    • by Rujiel ( 1632063 )

      Do you get to read it? Why should the AI get to read it if you didn't?
      It's not as if any of the contributors to a given paper on JSTOR were asked if their paper could be public, or are paid when someone downloads their work, for that matter. For public universities, why should I have to rely on paying AI tens of thousands of dollars to get at the derived results of works that originated in the fucking public sector?

    • by Morpeth ( 577066 )

      "That would be like a scholar only reading things out of copyright"

      If that scholar uses that work for their own research or paper, they have to credit or cite it -- HUGE difference than what's happening with LLM training, it's not in any way comparable.

      AI companies are making serious money using prior works to build their models, with no compensation or even credit/acknowledgement to those who actually generated the content or data.

      Not sure what's so hard to understand about that.

      • by fafalone ( 633739 ) on Thursday March 13, 2025 @06:25PM (#65231483)
        Citations aren't required by copyright law. And only specific things are cited; they're not citing the tens of thousands of sources that went into their general education. Artists don't list every influence that went into their work... hell they *couldn't*.
        It is hard for me to understand why I suddenly have to pay just because I made a computer program to also learn from what I'm free to learn from. Pirated material is one thing; but what about what's being accessed lawfully, especially freely? It's like StackOverflow or its users claiming I owe them a cut of my developer wages because of how much I learned about programming from there. I'm making so much money from it, right?
      • If that scholar uses that work for their own research or paper, they have to credit or cite it -- HUGE difference than what's happening with LLM training, it's not in any way comparable.

        I'm a bit worried you've had a stroke, because you just presented an even less functional analogy and used it as an example of how the original was different.

        If the LLM writes a paper, it too should cite. However, a scholar learning from something they read does not require citation.

    • AI is exposing how frail copyright is in the face of technology that can replicate artifacts quickly. Everyone has been in denial about this for a long time, and ridiculous mechanisms of control have been implemented, but digitization has always been incompatible with copyright. There will always be a way to get around DRM. No matter how sophisticated the DRM, media will always be vulnerable to capture and reproduction when the end user can experience it.

      We should not be desperately attempting to maintain t

    • I've not understood the argument against having AI reading copyrighted content.

      That's because you are under the mistaken impression that these things have sentient intelligence. They are data processing machines, and nothing more. OpenAI and its other criminal compatriots take the commercial work of others and resell access to it. OpenAI is admitting as much in this very article: they know what they're doing is illegal, and want copyright laws to be rewritten to carve out an exception for them.

    • by AmiMoJo ( 196126 )

      AI is fundamentally different to humans in three important ways.

      1. AI memory is different, allowing AI to reproduce copyrighted text verbatim.

      2. Humans understand that ripping off another artist wholesale will likely get them into trouble, AI doesn't care.

      3. AI is a product being sold by a corporation, so it's more like if you paid someone to clone a book and just trivially re-word it, rather than employing someone who read the book and puts the knowledge into practice.

    • This isn't about reading the copyrighted material, it's about what you do with it once you've read it. A lot of copyright licenses allow you to freely read/view the text but don't allow you to sell it for profit as your own, or require you to post attribution to the original author. If you're writing a scientific paper and in your research you use papers written by others, you add citations so that people can find the original research. Copyright owners can add all kinds of special restrictions of where the

    • by spitzak ( 4019 )

      I really would prefer if the AI's knowledge was not restricted to only non-copyrighted information.

    • The AI is taking copyright work and using it for commercial purposes. If AI was open and free to the people and not patented or proprietary, then I would be more likely to accept that AI has the same rights as the average person to these works.

  • good one (Score:5, Funny)

    by snowshovelboy ( 242280 ) on Thursday March 13, 2025 @04:28PM (#65231139)

    I'll use this next time I get pulled over for speeding.

    "Sorry officer, but if you don't let me drive fast, workers in China are going to get to work before me, and that will be bad for America"

    • by Tablizer ( 95088 )

      "But the Chinese ride bicycles faster than you to work, in snow, without shoes, against the wind, up hill, both ways. Now pay up!"

  • I didn't download all of those songs to steal them. It was fair use because I was just training myself on the material so that I could generate new material in similar styles.

  • by gweihir ( 88907 ) on Thursday March 13, 2025 @04:34PM (#65231151)

    They must get rich quick! Or else.

    Seriously, these companies should be disbanded and their owners should have their fortunes impounded and paid out to the ones they stole from.

    And no, this is not "learning". This is commercial copyright infringement.

    • by ewibble ( 1655195 ) on Thursday March 13, 2025 @05:03PM (#65231255)

      Yes, it is learning. The problem is that billions of dollars have been spent convincing people and governments that using other people's work is "stealing", a thing that has been done by people for thousands of years. Now another group of rich people find it inconvenient and want it changed; it just sounds hypocritical. Just wait till people start copying their code to make their own AI; then these people will think it's wrong again.

    • And no, this is not "learning". This is commercial copyright infringement.

      Yes, because copyright infringement is mutually exclusive with learning, ignoring the fact that you're begging the question, since whether this is copyright infringement hasn't been determined yet.

      Try harder, dude. Do your blind hatred a favor and at least grant it 4 of your brain cells when coming up with shitposts.

      • by gweihir ( 88907 )

        Machines cannot learn. That is a term used as an analog. It has no legal meaning for machines. Sentient beings get an exception to copyright, but machines do not.

        • Machines cannot learn.

          Yes, they can.
          learn (verb):
          gain or acquire knowledge of or skill in (something) by study, experience, or being taught.

          That is a term used as an analog.

          Nope.

          It has no legal meaning for machines.

          And you think "learn" has a legal meaning for non-machines?

          Sentient beings get an exception to copyright, but machines do not.

          Says who, you?

          • by gweihir ( 88907 )

            Sentient beings get an exception to copyright, but machines do not.

            Says who, you?

            International copyright law. Do you know nothing?

            • International copyright law. Do you know nothing?

              No international copyright law (which is law governed by treaty) currently addresses this issue.

              In the US, human authorship is required, but only by jurisprudence, not statute.
              Yes, I do know something.
              You, apparently, do not.

              • by gweihir ( 88907 )

                And you even miss this very simple point completely. This is not about _authorship_. This is about using works that are under copyright. Humans are allowed to read and learn from things. Machines are not allowed to process copyrighted works without a specific exception for processing. And no, the law treats machines processing data and humans learning from data as fundamentally different; the fact that the term "learning" is used for machines does not mean anything.

                Hence the exceptions are for _humans_, not for "_l

                • And you even miss this very simple point completely.

                  Hard to track what the fuck your point is when you just make shit up.

                  This is not about _authorship_.

                  If it's not, then I have no idea where you're going with this.

                  Humans are allowed to read and learn from things. Machines are not allowed to process copyrighted works without a specific exception for processing.

                  No law says any such thing.

                  And no, the law treats machines processing data and humans learning from data as fundamentally different; the fact that the term "learning" is used for machines does not mean anything.

                  No, it does not. You are a liar.

                  Hence the exceptions are for _humans_, not for "_learning_". Seriously, stop claiming stupid crap. Machines are not humans and the law is very clear on that, even if you do not seem to be.

                  You are a liar.

  • by spaceman375 ( 780812 ) on Thursday March 13, 2025 @04:43PM (#65231177)
    They are right that giving them free rein with copyrights will speed up AI training, and that not doing so may give an advantage to competitors and adversaries. However, just because it is strategically the "right thing to do" doesn't make it the ethical, or actually right, thing to do.
    • They are right that giving them free rein with copyrights will speed up AI training, and that not doing so may give an advantage to competitors and adversaries. However, just because it is strategically the "right thing to do" doesn't make it the ethical, or actually right, thing to do.

      That particular argument doesn't hold in our current environment. "What makes the most profit the most quickly" is the most moral, most ethical, most "right thing to do" possible. Profit is all. Profit is the way forward. Any detriment to profit must be cast aside. Greed is our god now, and anything getting between us (<- corporations) and our god must be destroyed. This logic is exactly what people have been arguing for all along when they say that corporations have a moral imperative to seek profit abo

  • If my browser can connect to a publicly accessible library via the internet on my behalf why can't my AI trainer? It doesn't have the fidelity of taking screenshots - it doesn't 'copy'. The training data is an amalgamation with no ability to faithfully reproduce an original; so what is the basis for blocking access to copyrighted works???

    I say, give the trainer access to the USLOC and let it run.

    • It doesn't have the fidelity of taking screenshots - it doesn't 'copy'.

      That is not the legal standard.

  • by oldgraybeard ( 2939809 ) on Thursday March 13, 2025 @04:51PM (#65231205)
    If we can't steal everyone's private personal data and use it to train our AI China will win!
  • If you don't give up all your freedoms, the terrorists will win!

    If you don't give in to every corporate demand, China will win!

    Hmm. Hyperbolic fear. By golly, it just might work!

  • ...by scanning more romance novels and cat videos than USA?

    Are next-generation weapons quantum-powered by passionate Schrodinger cats or something?

    (Pepe Le Pew cartoons come to mind for some reason; Schrodinger's Skunk?)

  • My business model doesn't work if it needs to respect the law. We must change the law so I can make money.
  • by presearch ( 214913 ) on Thursday March 13, 2025 @06:00PM (#65231401)

    Not having free access to all tv and films, for all of us, gives China an advantage.

    Having to pay money to OpenAI, especially at their insane subscription amounts,
    also gives China an advantage.

  • We have a classic case of peoples' rights to be rewarded for their efforts vs sharing that for the greater good. So it would make sense to come up with a balanced solution that is fair to everyone.

    However none of that matters, as what is really happening is that one greedy group, the companies selling people's work and taking most of the money in the process, is fighting with another greedy group feeding anything they can lay their hands on to their AIs for training. Maybe we just need to lock all the CEOs of
    • peoples' rights to be rewarded for their efforts

      No such right actually exists, except as granted by statute- meaning it can be taken away.

      So it would make sense to come up with a balanced solution that is fair to everyone.

      Indeed, it would, particularly since copyright is again, not a natural right.

      However none of that matters, as what is really happening is that one greedy group, the companies selling people's work and taking most of the money in the process, is fighting with another greedy group feeding anything they can lay their hands on to their AIs for training. Maybe we just need to lock all the CEOs of the relevant companies in a room with good solid walls and doors, and come back some time later to see who is still standing. Or maybe not come back...

      Don't disagree with this at all.

    • Selling may be greedy, but using it without paying for it is criminal. You can't really compare the two.
  • The elephant in the room is that AI companies want to profit from knowing things, but current AI needs humans to produce that knowledge first or it is useless. You are selling something, but you don't create any of it.
  • gives thieves an economic advantage over those who obey laws.

    That doesn't mean we should repeal laws against stealing.

  • If the PRC's developers have unfettered access to data and American companies are left without fair use access, the race for AI is effectively over. America loses, as does the success of democratic AI

    Not a problem that some 400 percent tariffs can't solve. And by the way, what is "democratic AI"?

  • by vbdasc ( 146051 )

    Those damn greedy AI bros are going to kill fair use. Everything not in Public Domain is going to be hidden behind paywalls.
