AI

OpenAI Suspends ByteDance's Account After It Used GPT To Train Its Own AI Model (theverge.com) 32

TikTok's parent company, ByteDance, has been secretly using OpenAI's technology to develop its own competing large language model (LLM). "This practice is generally considered a faux pas in the AI world," writes The Verge's Alex Heath. "It's also in direct violation of OpenAI's terms of service, which state that its model output can't be used 'to develop any artificial intelligence models that compete with our products and services.'" From the report: Nevertheless, internal ByteDance documents shared with me confirm that the OpenAI API has been relied on to develop its foundational LLM, codenamed Project Seed, during nearly every phase of development, including for training and evaluating the model. Employees involved are well aware of the implications; I've seen conversations on Lark, ByteDance's internal communication platform for employees, about how to "whitewash" the evidence through "data desensitization." The misuse is so rampant that Project Seed employees regularly hit their max allowance for API access. Most of the company's GPT usage has been done through Microsoft's Azure program, which has the same policy as OpenAI.

In response, OpenAI said that it has suspended ByteDance's account: "All API customers must adhere to our usage policies to ensure that our technology is used for good. While ByteDance's use of our API was minimal, we have suspended their account while we further investigate. If we discover that their usage doesn't follow these policies, we will ask them to make necessary changes or terminate their account."
This discussion has been archived. No new comments can be posted.


Comments Filter:
  • The self-replicating Skynet has arrived. I, for one, welcome our new AI overlords.
    • by narcc ( 412956 )

      Don't worry too much. "AI training other AI" can mean a lot of different things, none of which will result in a terminator situation.

      The first thing most people think of, and what the summary seems to imply, is using the output of one model to train another. This will always result in a lower-quality model thanks to a well-known phenomenon recently termed 'model collapse'. An easy way to think about this is that because models will necessarily have some error (they won't perfectly model whatever it is th
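      The shrinking-quality effect is easy to see in a toy sketch. This is a deliberate oversimplification, not an LLM: a Gaussian fit standing in for "a model", with each generation trained only on samples drawn from the previous generation's fit. The function name and the specific distribution are illustrative choices, not anything from the article.

      ```python
      import random
      import statistics

      def fit_and_sample(data, n):
          """'Train' a model on data (fit mean and stdev of a normal
          distribution), then generate n synthetic points from that fit."""
          mu = statistics.mean(data)
          sigma = statistics.stdev(data)
          return [random.gauss(mu, sigma) for _ in range(n)]

      random.seed(42)
      # Generation 0: "real" data drawn from a standard normal distribution.
      data = [random.gauss(0.0, 1.0) for _ in range(30)]

      # Each later generation sees only the previous generation's output.
      for generation in range(1, 31):
          data = fit_and_sample(data, 30)

      # Because each fit carries estimation error, the spread of the data
      # tends to drift and shrink over generations instead of staying at 1.
      print(round(statistics.stdev(data), 3))
      ```

      With small samples the fitted spread degrades quickly; with larger samples it degrades more slowly, but the copy-of-a-copy error never goes away, which is the intuition behind the "model collapse" term.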

  • by Tyr07 ( 8900565 ) on Friday December 15, 2023 @07:51PM (#64084991)

    I mean, you went around using other people's data to train your LLM without their consent. Your only argument has been whether it counts as copyright infringement or not. If you think you're safe from that behavior just because you beat people to the punch, before they wrote laws saying no explicitly, think again. What's good for the goose is good for the gander; people are now doing the same thing you did.

    They're using /your/ data to train their LLM. Fun game isn't it?

    I don't have a position either way on what counts as ethical data training for LLMs; I don't know enough about the impact, or how it affects artists' and other users' data and content. But I certainly don't have sympathy if you're upset that people are doing the same thing you did.

  • Doesn't feel great when someone robs your shit to seed their AI model, huh?

    • by dfghjk ( 711126 )

      I think they should be more concerned that Elon Musk swiped the source code to their entire chat application.

      It's funny how easily people use the terms "steal" and "rob" in this context, when /.'ers get their panties in a wad over copyright violations not being theft. They couldn't have "robbed" their shit, since they still have it!

      • Yeah, well, it's a false equivalence, because back when that argument first surfaced (over the illegal copying of digitally compressed versions of CD hardcopies of commercially licensed music), nobody was using those bootlegged copies to also illegally feed a giant AI that then turns around and directly competes commercially, on equal footing, with the original works. They were largely just exercising what, in the prior era of cassette tapes, was deemed "fair use" by law: keeping rebundled copies for personal ar

  • What about Grok, which comes right out and claims to be ChatGPT?

  • by Jeremi ( 14640 ) on Friday December 15, 2023 @07:55PM (#64085007) Homepage

    Say it ain't so!

  • "This practice is generally considered a faux pas in the AI world,"

    For it to be a faux pas, though, the embarrassment or insult to others has to be caused by a mistake.

  • by ehack ( 115197 ) on Friday December 15, 2023 @08:26PM (#64085051) Journal

    Do they get to be friends later in life?

  • I love how it's totally cool for OpenAI to use unlicensed data from the internet, without crediting open source projects, to build its own model. But as soon as someone else does that to them, it's not right. What a load of hypocritical shit.

    I contribute to open source and I don't mind OpenAI using it to train models. But just like a human, you better damn well cite and give credit to the projects.

  • Copying when explicitly disallowed. Same old story with them.
    • The only countries that give a fuck about IP are those trying to use it as a means to keep "lesser" countries from overtaking them while they rest on their laurels.

      It didn't work for Europe when they tried it against the US, and it won't work for the US trying it with China. Nor will it work for China when they inevitably try it against some other country.
  • Silicon Valley thieves just ran into the true masters of the concept that "if it's not nailed down and protected by armed guards, it's ours. And sometimes even then."

  • Maybe wait until you discover a violation before you send a press release then.
  • A company breaking terms and conditions, just so that it can make money? Ridiculous!

  • Comment removed based on user account deletion
  • I can't remember who first used that simile: Shakespeare, Virgil, Homer? It's definitely a completely new thing.
  • Shithole company from shithole country doing shithole things. And not just shithole things, completely stupid things. Any model that used GPT output as training data is inherently extra stupid. Anyone involved at ByteDance should feel ashamed of their stupidity, from whoever had the idea all the way to everyone falling in line to work on it. The downside is that having this level of stupidity all around means no one there is capable of feeling ashamed. They're too stupid to even recognize that they ar
