
AI Tool Rips Off Open Source Software Without Violating Copyright (404media.co) 101

A satirical but working tool called Malus uses AI to create "clean room" clones of open-source software, aiming to reproduce the same functionality while shedding attribution and copyleft obligations. "It works," Mike Nolan, one of the two people behind Malus, who researches the political economy of open source software and currently works for the United Nations, told 404 Media. "The Stripe charge will provide you the thing, and it was important for us to do that, because we felt that if it was just satire, it would end up like every other piece of research I've done on open source, which ends up being largely dismissed by open source tech workers who felt that they were too special and too unique and too intelligent to ever be the ones on the bad side of the layoffs or the economics of the situation." 404 Media reports: Malus's legal strategy for bypassing copyright is based on a historically pivotal moment for software and copyright law dating back to 1982. Back then, IBM dominated home computing, and competitors like Columbia Data Products wanted to sell products that were compatible with software that IBM customers were already using. Reverse engineering IBM's computer would have infringed on the company's copyright, so Columbia Data Products came up with what we now know as a "clean room" design.

It tasked one team with examining IBM's BIOS and creating specifications for what a clone of that system would require. A different "clean" team, one that was never exposed to IBM's code, then created a BIOS that met those specifications from scratch. The result was a system that was compatible with IBM's ecosystem but didn't violate its copyright, because it did not copy IBM's technical process and counted as original work.

This clean room method, which has been validated by case law and dramatized in the first season of Halt and Catch Fire, made computing more open and competitive than it would have been otherwise. But it has taken on new meaning in the age of generative AI. It is now easier than ever to ask AI tools to produce software that is identical in function to existing open source projects, and such software, some would argue, is built from scratch and is therefore original work that can bypass existing copyright licenses. Others would say that software produced by large language models is inherently derivative because, like all LLM output, it comes from a model trained on the collective output of humans scraped from the internet, including specific open source projects.

Malus (pronounced malice) uses AI to do the same thing. "Finally, liberation from open source license obligations," Malus's site says. "Our proprietary AI robots independently recreate any open source project from scratch. The result? Legally distinct code with corporate-friendly licensing. No attribution. No copyleft. No problems." Copyleft is a type of copyright license that ensures reproductions or applications of the software keep it free to share and modify.
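As quoted in the comments below, Malus "uses one AI agent to write the specifications and a different agent to produce the code." A minimal sketch of that shape, under stated assumptions: call_llm() is a hypothetical stand-in for whatever model API is actually used, and the prompts are invented for illustration.

def call_llm(prompt: str) -> str:
    # Hypothetical stand-in; wire up any real model API here.
    raise NotImplementedError

def write_specification(original_source: str) -> str:
    # "Dirty" agent: reads the original project and emits a functional
    # specification, analogous to the team that examined IBM's BIOS.
    return call_llm(
        "Describe the behavior, inputs, outputs, and interfaces of the "
        "following program without quoting any of its code:\n\n"
        + original_source
    )

def implement_from_spec(specification: str) -> str:
    # "Clean" agent: sees only the specification, never the original
    # source, and writes a new implementation from scratch.
    return call_llm(
        "Implement, from scratch, a program that meets this "
        "specification:\n\n" + specification
    )

def clean_room_clone(original_source: str) -> str:
    return implement_from_spec(write_specification(original_source))

The objection raised repeatedly in the comments below is that the "clean" agent is backed by a model that was itself trained on the original code, so the separation is procedural rather than informational.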

  • support (Score:5, Funny)

    by awwshit ( 6214476 ) on Wednesday April 22, 2026 @01:14PM (#66106976)

    When my Malus clone fails, can I buy support from the original project? From Malus? What do you mean I'm on my own?

  • Honesty (Score:5, Insightful)

    by Himmy32 ( 650060 ) on Wednesday April 22, 2026 @01:14PM (#66106978)
    They sure don't mince words about their ethics: [malus.sh]

    Some will argue that what we do is exploitative, that we are extracting the ideas from open source while leaving behind the people who contributed them. To this I say: yes, that is a reasonably accurate description of our business model. It is also a reasonably accurate description of every company that has ever used open source software without contributing back, which is to say, virtually every company that has ever used open source software. We are simply being honest about it, and charging a fee for the privilege.

    This service is provided "as is" without warranty. MalusCorp is not responsible for any legal consequences, moral implications, or late-night guilt spirals resulting from use of our services.

    • by DarkOx ( 621550 )

      I would counter that argument: using a FOSS project for a for-profit activity without contributing back isn't great, but the people behind the project always knew that was a possibility, depending on what license they chose.

      However, just being a user, especially as a corporate entity, means more exposure for the project. Even if you don't publish the fact that you use it, you end up with employees who know about it and might recommend it to others, move on and use it elsewhere, contribute themselves, provide usef

      • Re:Honesty (Score:4, Informative)

        by ceoyoyo ( 59147 ) on Wednesday April 22, 2026 @02:50PM (#66107222)

        This does none of those things; it is purely parasitic. It borrows all the ideas and robs the original project of mind share.

        There is a legal mechanism for protecting ideas: a patent. Open source software is almost never patented, Slashdot generally comes out very much against software patents, and actual policy differs by country and goes back and forth over time. This is because the "ideas" implemented in software are usually not very novel.

        Copyright, on the other hand, protects the actual work. Copyright protects the text of Harry Potter. It does not protect the concept of kids with magic powers.

        Now you *could* argue this is true of FOSS clones, of commercial applications...

        Not can, must.

    • by drnb ( 2434720 ) on Wednesday April 22, 2026 @01:45PM (#66107060)

      Some will argue that what we do is exploitative, that we are extracting the ideas from open source while leaving behind the people who contributed them.

      How is that different from those who create a FOSS project as a FOSS alternative to a commercial product? The process is simply less formal for these FOSS devs. Neither side looks at the original source code; both rely on observed behavior and reimplement it in a new way. FOSS having "noble" intentions and MalusCorp having "less-than-noble" intentions does not change this fact.

      • Re: (Score:2, Insightful)

        by Himmy32 ( 650060 )
        Weird to argue that ethical considerations shouldn't consider intent.
        • by drnb ( 2434720 )

          Weird to argue that ethical considerations shouldn't consider intent.

          It's more that ethical considerations are not necessarily relevant to legality.

          And then there is the irony.

          • by Himmy32 ( 650060 )

            As far as reverse engineering and clean room legality goes, we even got to see that play out with Google and Oracle duking it out. LLMs just reduce the barrier and add a layer of insulation, but they also raise an extra question of how much of the "training data" is transformed.

            But if you want the truly ironic entry in this category, that's definitely the post-leak Claude Code clones [github.com]. Anthropic has got to let them live; otherwise it would be making an argument against using its own tool.

            • by drnb ( 2434720 )

              but also an extra question of how much of the "training data" is transformed.

              I think that is the core question, perhaps the only meaningful question from the legal perspective.

      • It's different in the same way that mass surveillance by law enforcement is somehow legal. The law simply hasn't caught up with reality. In mass surveillance, the premise is that if the government could have put a cop there, they can put a camera there. Yet this is totally different in scale, expense, and ease of use -- making it not the same thing at all. These three factors put a natural limit on the scale of surveillance, limiting the reach of the government to high profile crimes. Cheap and pervasive ca

        • by HiThere ( 15173 )

          Why would I want a workalike of Windows? (I haven't used it for over two decades now, so I'm not sure. Linux is superior to the MSWindows that I remember...and it doesn't force updates at their convenience rather than mine.)

        • In a FOSS project, the code is out in the wild (unlike most commercial software) and it would be incumbent on the LLM to prove it wasn't trained on that software, or derivative software, to be clean.

          I don't think the former implies the latter. Especially since one of the "benefits" of FOSS is that aspiring and new coders can study it to see how the more experienced do things. In other words there is an educational component. Would you apply the same logic to textbooks, to academic research, that includes source code?

          Another way to evaluate this would be to ask if the LLM can do the same thing with closed source software. If it can, I would call that a legal work-alike, for any code base.

          That would seem to be stronger evidence, assuming all other things being equal: both FOSS and commercial versions being well known and well used, with ample materials teaching users how to use them. With se

    • Well, why stop at open source? Let's do this for Windows, macOS, Photoshop. Freedom forever.
      • by HiThere ( 15173 )

        It will happen. The question is "Will it continue to be legal?".

      • BusinessBros would continue to throw money at Microsoft for genuine Windows. It is like a cargo cult or something to them: other companies used Microsoft and exploited their workers and polluted everything and became rich, so they must use Microsoft and exploit their workers and pollute everything in the hope that they become rich. Insane, but if they had thinking skills or ethics they wouldn't be BusinessBros.

      • Friend of mine is an IP lawyer and he's working on an office clone to see how far he can get. It's actually important for digital sovereignty. Thank Trump for the urgency on that topic.

  • by Pseudonymous Powers ( 4097097 ) on Wednesday April 22, 2026 @01:16PM (#66106986)

    "Malus [...] is modeled after the IBM case and uses one AI agent to write the specifications and a different agent to produce the code, creating that 'clean room' effect. [...] Blanchard also conceded that Claude, which like all LLMs, was trained on vast amounts of data scraped indiscriminately from the internet and was exposed to the original chardet in its training, but maintains his version is not derivative."

    So, it's not a clean room at all: they're just calling it that.

    • Seeing as how most if not ALL of the AIs have used open source software for their training, I would find it hard to believe it was a clean room approach. Perhaps if they can prove the original open source software wasn't used in its training, then I could be convinced it was done in a clean room.
    • by JBMcB ( 73720 )
      Compare the code: if it's similar, then Claude is relying on stuff it's been trained on. If not, it's generating novel code that does the same thing.
      • Re:Code (Score:5, Insightful)

        by TheNameOfNick ( 7286618 ) on Wednesday April 22, 2026 @01:59PM (#66107090)

        Nope, those are two compilers. One transforms code into an intermediate language in which the program is expressed as a specification that contains all the functionality of the original program, i.e. is a derived work. Then another compiler takes the program in the intermediate language and creates code from it (source or binary doesn't matter). Contrary to what AI evangelists want you to believe, it does matter whether something is an automatic process or involves creative thought. Also, what's with the focus on Open Source software? You could do the exact same thing with binary code.

        • Closed source binaries have companies who own the copyright and would sue the pants off of anyone who used this tool to try and "clean room" engineer a replacement. With open source there's not always a monolithic entity that can exercise copyright claims against an infringing party and perhaps generally less of a desire to do so even if the money and desire to pursue legal action were there.

          The hope is that by targeting open source, the people infringing on the copyright of the authors will be able t
          • by dfghjk ( 711126 )

            "Closed source binaries have companies who own the copyright and would sue the pants off of anyone who used this tool to try and "clean room" engineer a replacement. "

            Given that the entire claim is that this technique does not infringe copyright, you are saying nothing. You need a legal basis to sue.

            "With open source there's not always a monolithic entity that can exercise copyright claims against an infringing party and perhaps generally less of a desire to do so even if the money and desire to pursue leg

            • Given that the entire claim is that this technique does not infringe copyright, you are saying nothing. You need a legal basis to sue.

              The legal basis to sue is that "this new software appears to infringe on the copyright of this existing software".

              The defense is that "it was created using a clean room technique".

              The complaint then alleges that "the established methodology of a clean room reproduction was not followed. The copy was not created with clean hands, as the AI was trained with the original source code."

              Since this would be a civil lawsuit, the standard that must be proven to the jury is a "preponderance of evidence" - that this i

              • I doubt that anyone would win with just that argument, because there have been several cases where it didn't work. Like the case where it was obvious the AI was trained on images from a given painter, yet the plaintiff still could not show reproduction of the works.

                Good luck with that lawsuit.

          • by unrtst ( 777550 )

            The hope is that by targeting open source, the people infringing on the copyright of the authors will be able to get away with it more easily. ...

            ... or do they hope others will use Malus on leaked proprietary code, and that will kick off the necessary legal proceedings to bring a close to this somehow. Like, now that this is bound to happen, let's get it settled. Can you legally use the results of LLM generated code or not, and WTF are we going to do about it now?

      • Compare the code: if it's similar, then Claude is relying on stuff it's been trained on. If not, it's generating novel code that does the same thing.

        Insufficient. There have been cases where an infringing company had to rewrite their code and have it examined and signed off on by the original copyright holder's attorneys. The lawyers basically did what you describe: "this looks similar."

        When the developer could convince the judge that the code in question was basically a straightforward implementation, basically what would be "expected", the attorney was overruled and the code allowed. Similarity is insufficient if the code is simple and straightforw

      • by Junta ( 36770 )

        Well, no, that assumes exactly one implementation for a given feature in the wild.

        Imagine generating a random string. Hundreds of codebases will have that same function. So this process may pull that from any of those codebases and not necessarily from the source codebase.

        It's never generating fundamentally novel code, but it is drawing from a huge training set that includes the same thing done dozens or hundreds of times with technically distinct code.
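        For instance, a random-string helper (a minimal sketch written fresh for illustration, not taken from any project; the names are arbitrary) looks nearly the same in every codebase that has one:

        import secrets
        import string

        def random_string(length: int = 16) -> str:
            # Return a random alphanumeric string of the given length.
            alphabet = string.ascii_letters + string.digits
            return "".join(secrets.choice(alphabet) for _ in range(length))

        Thousands of repositories contain a function practically identical to this, so a match in generated output says little about which codebase, if any, the model drew it from.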

        • But wouldn't it be fun if these lawsuits led to the situation that the first published implementation gets copyright? I bet that would paralyse the entire IT industry for years, allowing Europe to catch up (unless they're just as stupid, in which case China handily wins the race).

    • Humans have used copyrighted software (which would include FOSS) for their training. I think there is a legal concept that if copyrighted material is used to attain general learning, that general knowledge can be applied to new original works. These new works are not derivative if based solely on the general knowledge extracted from the original copyrighted material.

      Some similar concept could be applied to ML models. That the models contain extracted general knowledge.
      • A human who has seen the source code cannot be part of the team producing a clean room implementation. They will taint the project with their general knowledge of the system they try to rebuild.

        • by drnb ( 2434720 )

          A human who has seen the source code cannot be part of the team producing a clean room implementation. They will taint the project with their general knowledge of the system they try to rebuild.

          Apologies if I was not clear. I am NOT talking about the clean room based clones. I am talking about everyday software development. That too is sometimes based on having studied copyrighted code. Mere similarity is insufficient to claim infringement. It has to go beyond well known and discussed general knowledge. I expect that concept will be successfully applied to AI too.

      • by dfghjk ( 711126 )

        No one can claim ownership of your knowledge and experience. You cannot, though, duplicate other people's work.

        If AI duplicates code it was trained on, it may be a violation of copyright of the work used in training, not a violation of copyright of a product it is trying to duplicate.

        • by drnb ( 2434720 )

          No one can claim ownership of your knowledge and experience. You cannot, though, duplicate other people's work.

          If AI duplicates code it was trained on, it may be a violation of copyright of the work used in training, not a violation of copyright of a product it is trying to duplicate.

          The point I am trying to get at is that the concept of "duplicating code" is fuzzy. General knowledge, straightforward and obvious implementations, etc. can make work non-infringing even if the code looks similar. An AI that is using some sort of logic to build a solution is quite different from an AI that is searching the internet for similar examples, with similar code being more legally acceptable for the former. Just like a human who studied a textbook or academic literature that uses open source code as examples

    • by Junta ( 36770 )

      But it's meant as a proof point of our current interpretation of LLMs and copyright. So far this "counts" as clean room because courts have not said LLM ingestion is a violation, and they are using the LLM to launder the code into an intermediate form and then back into code based on the "clean room" finding.

      So while you are right in a sense, the point is from a court perspective this is "equivalent" to clean room unless new laws/court cases amend the status quo.

      • The article is frustratingly hard to parse for me, but it does say that Malus is meant as satirical. In that light, I can see why they would make a claim meant to be outrageous ("Claude, despite having read the original source code, can nonetheless reproduce it without violating the clean-room principle,") in the hopes that it will spark a re-evaluation of that claim when it is made sincerely, or at least seriously.

        The problem with that interpretation is that they're making real money ripping off real soft

    • by dfghjk ( 711126 )

      Sure, if you don't understand what "clean room" means.

    • The target system has no copyright claim; every other system does.

    • Indeed. It's the same trash Malus article that was already commented on multiple times before here on Slashdot. Repeating a lie about AI doing clean room reverse engineering doesn't make it true, but it does get tiring to read it again, and doing so increases prejudice against it.

      This Mike Nolan guy is just full of shit; he's no researcher doing anything worthwhile, just an attention whore at this point.

  • by rbrander ( 73222 ) on Wednesday April 22, 2026 @01:20PM (#66106998) Homepage

    There can't be any bit of software in the world more documented, as to the requirements of every single function, every menu item, every bit of behaviour, than Excel.

    And it's the only thing tying so many people to Microsoft. Windows and Word sure as hell are not.

    • Excel is off-topic (Score:2, Insightful)

      by TurboStar ( 712836 )

      Excel isn't open source. I know it's tradition to not read the article, and some people don't even read the summary, but now we're not even reading the headline?

      • The GP is suggesting that closed source software might also be up for grabs and explains it in terms that the specification and behavior is already written down, even if not as code.

        I'd suggest, though, that it's up for grabs anyway. The difference between open source and closed source is that you have access to the original human-readable source code for the former. But looking at the wider picture, the code is available for both if you don't need a human-readable version, as binary code is also computer cod

      • by Junta ( 36770 )

        IBM BIOS wasn't source available either. The precedent for 'clean room' involved reverse engineering binary code.

        So while the current story emphasizes the loss of open source protections, the same principles would apply to LLM transforming binary and test cases to a specification.

        • by Jerrry ( 43027 )

          IBM BIOS wasn't source available either. The precedent for 'clean room' involved reverse engineering binary code.

          Not true. IBM published the source code to their BIOS in their technical reference manuals, which were available to the public.

  • https://tos.md/ [tos.md] is my answer to this problem: It's just my personal AI harness, everyone's got one. It's more of a harness for humans really, but anyway, I've been watching people ingesting and replicating repos in bulk. So - a no-license license, no downloadable software, a maze of nonsense for bots to navigate, no Github repo, no published spec - just instructions that only a human can actually complete, because it involves literally talking to me before I'll give you a copy. I dare your AI bot to in
  • It is not so bad if a user uses it only for personal use, without distributing the modified software to anyone. I don't use AI, but I do some unusual and wild tweaks to Slackware, and I don't tell anyone what I do because it's just for me, on my laptop alone. So depending on how this AI tool is used, it's not as bad as it seems, unless some company is looking to steal other people's ideas for a profit.
  • by thedarb ( 181754 ) on Wednesday April 22, 2026 @02:28PM (#66107154)

    Ok, fine. Do ZFS and make a GPL version that can be included in the kernel and all the distributions. Two can play this game.

    • by ceoyoyo ( 59147 ) on Wednesday April 22, 2026 @02:58PM (#66107242)

      I think the authors of this particular project, open source people and Slashdot have got this entirely backwards.

      The point of open source isn't that the code is super unique and awesome, it's that you can see it and modify it. The whole idea was born out of Stallman's frustration that Xerox wouldn't give him their code so he could add a feature to a buggy printer.

      Sure, someone can take an open source project and clone it. But anybody can also take a closed source project and clone it using the same technology. The AI doesn't need source code; it can work from the compiled version no problem. It's also infinitely patient, so it could write a clone just by interacting with the running system, without access to any code at all.

      ZFS is a bad example. There are already open ZFS clones. The difficulty with ZFS is not that it is closed, but that it is patented.

  • Will the same work for making open source clean room versions of closed source applications? AI is pretty good at disassembly/decompilation.

  • by bill_mcgonigle ( 4333 ) * on Wednesday April 22, 2026 @02:35PM (#66107180) Homepage Journal

    The Chinese Wall legal strategy is to have Team A produce a specification and Team B produce an implementation.

    If these guys can't show a specification they're screwed.

    Claiming there must have been one in abstract Platonic space inside the LLM network black box isn't going to convince a Court.

    So do the work of making an actual specification generator. Then write a coder. It's not impossible. You still won't get updates, fixes, support, community, or features added. The guys who just steal ffmpeg won't even bother. The AGPL haters might bite.

    Also, he seems quite angry.

  • Prove It (Score:4, Interesting)

    by StormReaver ( 59959 ) on Wednesday April 22, 2026 @02:46PM (#66107210)

    This will be believable if they can do the same thing for Closed Source software. If they can't, then they are lying and infringing on copyright. If they can, then they will be the biggest software company in the history of software.

    • by gweihir ( 88907 )

      If they can, then they will be the biggest software company in the history of software.

      Not at all. It starts with the "product" being static. You have no dev team at all and cannot even make small changes. Security, performance, and reliability will suck. The product has no copyright. And proving that this was "clean room" is almost impossible, as that would require a thorough and careful examination of all training data that even remotely looks like code.

      This is really just usable as satire to point out a problem.

  • by Dagmar d'Surreal ( 5939 ) on Wednesday April 22, 2026 @02:51PM (#66107228) Journal

    Good luck getting a judge to agree they had a "clean room" implementation performed by an AI that was trained on the very code it's supposed to be "re-inventing".

    ...and any minute now the same ruling about AI-generated art is likely to come down pertaining to programming, because copyright was meant to provide actual human artists with encouragement and protection for their craft by giving them the exclusive right to exploit their work throughout their lifetime and generally the lifetime of their children. Bots don't get afforded that same protection because they can't starve to death and they can never actually die, and programming is still both science and art (which is the only reason code is copyrightable).

    • by gweihir ( 88907 )

      Since this is satire, that is not a problem.

      But for an actual clean room reimplementation, you need an implementation team that has never looked at the original code. Since a lot of FOSS went into the training data for this "demo", that is very likely not the case, and hence this is not "clean room" at all. Oh, and also note that the product generated this way has no copyright at all ...

      • "that is very likely not the case "

        That is the plaintiff's case to prove. The legal standard is "preponderance of evidence", IIRC. Good luck with that.

    • If you're right, then all open source is safe, but all closed source is up for grabs. That would actually be quite amusing.
  • by SoftwareArtist ( 1472499 ) on Wednesday April 22, 2026 @02:51PM (#66107230)

    I already have a program to do this. It came with my computer. It's called zip. Run it on the source tree of any program and it creates a new file with a specification for the program. Run unzip on the specification file and you get a new source tree, free from any copyrights. Problem solved!
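    For anyone who wants to run the joke, the whole "pipeline" fits in a few lines of standard-library Python (a sketch; the file and directory names are arbitrary):

    import shutil

    def write_specification(source_tree: str) -> str:
        # Compress the source tree into a "specification".
        return shutil.make_archive("specification", "zip", source_tree)

    def implement_from_specification(spec_file: str, out_dir: str) -> None:
        # "Independently recreate" the project from the specification.
        # The output is byte-identical, which is exactly the point.
        shutil.unpack_archive(spec_file, out_dir)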

    You say a compressed version of the source code doesn't count as a specification? What do you think this person's program does? I'm not going to pay him money to find out, but I'd bet the "specification" it produces is nothing like what was used in the IBM case.

    Now he needs to read up on inducement [numberanalytics.com], because he's opening himself up to enormous liability. By explicitly advertising it as a tool to get around copyrights, he's pretty much waived the most common defense in these cases: claiming you didn't intend the tool to be used for that purpose. By charging for it, he's probably ruled out any kind of fair use defense. If anyone actually uses his tool to do what he says, he's personally liable.

  • And that is the people developing and maintaining it. Definitely a commendable satiric effort though that shows we cannot continue the lawlessness we currently have. It will destroy too many things.

    Obviously, on the side of security, performance and stability, these clones will also not be worth much, so the "threat" is probably really small.

  • I've seen workflows where people use an image-description model to create a detailed description and an image model to produce an image. Sometimes it reproduces quite similar versions of their photos, even though the photos are newer than the models. For audio, I've at least seen songs with a similar style. The headline may be a bit provocative, but maybe AI generation really is challenging how sound the concept of "intellectual property" really is.

  • by Tony Isaac ( 1301187 ) on Wednesday April 22, 2026 @03:36PM (#66107324) Homepage

    Anybody can fork an open source repository. But not just anybody can keep it going. LibreOffice survives not because it got its code from OpenOffice, but because of the community that keeps it alive.

    I think we programmers often obsess too much about who can see or get copies of our code, as if that were the magic sauce. It's not. It's the people behind the code that are the magic sauce.

  • The same approach can be (and is being) applied to closed-source software. This can result in more open-source software. It's just a matter of choosing what you want to tackle.

  • 1. Find something popular that's free and open-source
    2. Clone it in order to change the license
    3. Sell it as closed source
    4. Profit

    I do wonder who their customer base is expected to be. People who want to pay a scam artist instead of getting it for free from its developers?

    • by PPH ( 736903 )

      Fine, if you (a meat-sack) do it with a clean room process. But the product of the LLM has no legitimate copyright, and so the "Sell it as closed source" step is in error. Sure, you can sell it. But I can copy it, and there's nothing that can be done.

  • The chances that the AI was trained on any of the open source packages are non-zero. If they publish the packages, I'd suggest that they are opening themselves up to litigation.

  • Not sure I see the point of ripping off an open-source project. Said project remains, and will be significantly cheaper, probably free, compared to any commercial rip-off thereof. While you might be able to copyright the rip-off, you can't patent it, and the open-source version remains under whatever licence it has, so the rest of us can just raise a finger to the rip-off merchant.

  • The best way to protect our open source projects is to make them closed source! Finally, no one can misuse our source code, fork it, or clean-room it, because no one can see it at all. Problem solved!

  • How is this any different from any unimaginative but very elaborate software spec? VLC is arguably the best media player, but it does have some really bizarre quirks. I wouldn't rip off VLC if I wanted to rebuild it; I would simply ask the AI to build a media player with the features I wanted. Problem solved.

    My current software project is FOSS but it's built with AI. It's the same thing, just the other way around. I really don't get the hype.

    This thing is just a very fringe use-case of AI-built software, t

"It's what you learn after you know it all that counts." -- John Wooden

Working...