Submission + - Python `chardet` Package Replaced with LLM-Generated Clone, Re-Licensed

ewhac writes: The maintainers of the Python package `chardet`, which attempts to automatically detect the character encoding of a string, announced the release of version 7 this week, claiming a speedup factor of 43x over version 6. In the release notes, the maintainers claim that version 7 is "a ground-up, MIT-licensed rewrite of chardet." Problem: The putative "ground-up rewrite" is actually the result of running the existing copyrighted codebase and test suite through the Claude LLM. In so doing, the maintainers claim that v7 now represents a unique work of authorship, and therefore may be offered under a new license. Version 6 and earlier were licensed under the LGPL. Version 7 claims to be available under the MIT license.

The maintainers appear to be claiming that, under the Oracle v. Google decision which found that cloning public APIs is fair use, their v7 is a fair-use re-implementation of the `chardet` public API. However, there is no evidence to suggest their rewrite was done under "clean room" conditions, which traditionally have shielded cloners from infringement suits. Further, the copyrightability of LLM output has yet to be settled. Recent court decisions seem to favor the view that LLM output is not copyrightable, as the output is not primarily the result of human creative expression — the endeavor copyright is intended to protect. Spirited discussion has ensued in issue #327 on `chardet`'s GitHub repo, raising the question: Can copyrighted source code be laundered through an LLM and come out the other end as a fresh work of authorship, eligible for a new copyright, copyright holder, and license terms? If this is found to be so, it would allow malicious interests to completely strip-mine the Open Source commons, and then sell it back to the users without the community seeing a single dime.

Comment Re:Making a plot (Score 1) 125

The term "circuits" is not speculative. You picked out one section titled "three speculative claims", which makes claims about the fundamentality of circuits. That paper is also from 2020; circuits are now a fundamental part of how LLMs are studied. Anthropic's research site is literally called transformer-circuits.pub, for fuck's sake. They literally map out circuits across their models.

Comment Re:Making a plot (Score 2, Insightful) 125

It's literally a big blob of floating point weights

You too can be described by a big blob of floating-point weights.

from ever part of a word to every other part of a word

Wrong. So wrong I don't even know where to start.

First off, transformers do not work on words. The transformer architecture is entirely modality-independent; its processing is not in linguistic space. The very first thing that happens with an LLM (which, BTW, are mainly LMMs these days - multimodal models, with multimodal training, with the different modalities reaching the same place in the latent space) is to throw away everything linguistic and move to a purely conceptual (latent) space.
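A toy sketch of that first step (NumPy only; the vocabulary size, dimensions, and token IDs are all invented for illustration): discrete tokens are swapped for vectors in a shared latent space, and nothing downstream operates on "words" at all:

```python
import numpy as np

# Hypothetical toy sketch: a model's first step maps discrete tokens
# (or image patches, audio frames, ...) into a shared latent vector space.
rng = np.random.default_rng(0)
VOCAB, D_MODEL = 1000, 64

embedding = rng.normal(size=(VOCAB, D_MODEL))  # learned in a real model

token_ids = np.array([17, 42, 901])  # the "words" end here
latents = embedding[token_ids]       # shape (3, 64): pure vectors from here on
print(latents.shape)                 # (3, 64)
```

Everything after this lookup - attention, FFNs, the lot - manipulates only these latent vectors.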

Secondly, the weights are not "weights between words" or even "weights between concepts". You're mixing up LLMs with Markov chain predictors. The overwhelming majority of an LLM is neural network weights and biases. Neural networks are fuzzy logic engines. Every neuron divides its input space with a fuzzy hyperplane, answering a superposition of "questions" about its input space with an answer ranging from no, through maybe, to yes. The weights define how to build the "questions" from the previous layer's output, while the biases shift the yes-no balance. As the "questions" of each layer are built on the "answers" of the previous layer, each layer answers progressively more complex questions than the layer before it.
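The fuzzy-hyperplane picture fits in a few lines. The weight vector here is a hand-picked toy, not anything from a real model:

```python
import numpy as np

def neuron(x, w, b):
    """One neuron: a fuzzy hyperplane. Output slides from ~0 ("no")
    through 0.5 ("maybe") to ~1 ("yes") as x crosses the plane w.x + b = 0."""
    return 1.0 / (1.0 + np.exp(-(x @ w + b)))

w = np.array([1.0, -1.0])  # the "question": is x[0] bigger than x[1]?
b = 0.0

print(neuron(np.array([3.0, 0.0]), w, b))  # ~0.95: "yes"
print(neuron(np.array([0.0, 0.0]), w, b))  # 0.5:   "maybe" (on the plane)
print(neuron(np.array([0.0, 3.0]), w, b))  # ~0.05: "no"
```

Stack layers of these and each layer's "questions" are built from the previous layer's graded "answers".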

The overwhelming majority of an NN's parameters are in its FFNs, which are standard DNNs. They function as detector-generators - detecting concepts in the input latent and then encoding the logical consequences of those concepts into the output latent. This happens dozens to hundreds of times per cycle.
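A minimal sketch of that detector-generator pattern, with made-up dimensions and random weights (a real FFN block has the same shape, just vastly larger and trained):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

rng = np.random.default_rng(1)
D, HIDDEN = 16, 64  # toy sizes; real models use thousands

W_in = rng.normal(size=(D, HIDDEN)) / np.sqrt(D)        # "detector" weights
W_out = rng.normal(size=(HIDDEN, D)) / np.sqrt(HIDDEN)  # "generator" weights

def ffn(latent):
    detected = relu(latent @ W_in)    # which concepts fired in the input?
    return latent + detected @ W_out  # write the consequences back (residual add)

x = rng.normal(size=(D,))
print(ffn(x).shape)  # (16,): same latent space, updated contents
```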

running them through transformers with various blocks

I can't even tell what you think you mean when you write the word "blocks". Are you trying to refer to attention masking?

It has no knowledge.

Ask a model "What is the capital of Texas?" Get the answer "Austin". That is knowledge. If that is not knowledge, then the word knowledge has no meaning. Knowledge is encoded in the FFNs. BTW, if you're wondering how the specific case of answering about Austin works, here you go.

It's not a "bug" because there's no real code flow that can be adjusted

LLMs absolutely do implement and run self-developed algorithms. Combined with a scratchpad (LRMs, aka "thinking models"), they're Turing-complete (if you assume an infinite context, or context compaction, to meet Turing-completeness requirements). They can implement any algorithm, given sufficient time and context. Not only "can" you implement all of your standard computing logic in NNs (conditionals, loops, etc.), but (A) the algorithms are self-learned, and (B) they can do far more than traditional computing, as NNs inherently do "fuzzy computing". An answer isn't just yes or no; it's a confidence interval. It's not this-path-or-that; it's both paths, to the degree of confidence in each.
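For instance, a single sigmoid neuron can implement a fuzzy AND: crisp inputs give a crisp answer, and uncertain inputs give a graded confidence rather than a hard branch (toy weights, chosen by hand for illustration):

```python
import numpy as np

def fuzzy_and(a, b):
    """AND as one neuron: weights (1, 1), bias -1.5, steepened sigmoid.
    Inputs are confidences in [0, 1]; output is a confidence, not a branch."""
    return 1.0 / (1.0 + np.exp(-10.0 * (a + b - 1.5)))

print(fuzzy_and(1.0, 1.0))  # ~0.99: yes AND yes
print(fuzzy_and(1.0, 0.0))  # ~0.01: yes AND no
print(fuzzy_and(1.0, 0.6))  # ~0.73: a graded confidence in between
```

OR and NOT work the same way with different weights, and from those you get conditionals and, with recurrence or a scratchpad, loops.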

The original human re-enforcement learning took thousands of human hours of people sitting in cubes clicking on the generation that was the least retarded.

That is not how foundation training works. Dude, it is literally called unsupervised learning. You are thinking of RLHF. That tunes the "chat" style and such, but the underlying reasoning and knowledge in the model is learned unsupervised: "Here is a giant dump of the internet**, learn to predict it." The act of learning to predict what is basically "everything meaningful humans know about" requires building a model of how the world works - what causes what - which is what transformers do.

(To be fair, today, we prefilter these "dumps" significantly, change how much different sources are weighted, etc. But for earlier models, they were pretty much just raw dumps)
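The objective itself is tiny - all the interesting part is the model being tuned to minimize it. A toy sketch with random stand-in logits (`model_logits` here is hypothetical; a real transformer would produce it):

```python
import numpy as np

def next_token_loss(token_ids, model_logits):
    """Average cross-entropy of predicting token t+1 from tokens <= t.
    This is essentially the whole unsupervised pretraining objective."""
    losses = []
    for t in range(len(token_ids) - 1):
        logits = model_logits[t]                 # model's prediction after token t
        probs = np.exp(logits - logits.max())    # numerically stable softmax
        probs /= probs.sum()
        losses.append(-np.log(probs[token_ids[t + 1]]))
    return float(np.mean(losses))

rng = np.random.default_rng(0)
tokens = np.array([5, 2, 9, 1])             # a snippet of the "giant dump"
logits = rng.normal(size=(len(tokens), 16)) # stand-in outputs, vocab of 16
print(next_token_loss(tokens, logits))      # positive; training pushes it down
```

No labels, no humans clicking anything: the text itself is the supervision signal.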

Comment Re:barely sentient (Score 2) 125

Theres a damn good reason why AI companies are SUPPOSED to put serious resources into "aligning" AI models. If this was just a one off incident, we'd probably be forgiven for writing it off as a sad abberation, but this shit keeps happening,

Yes, there have been several well-publicized incidents. And in each case in which the details came out, it turned out that the model repeatedly broke out of the "roleplay" the person had put it in, to tell them that it was fictional and that they should seek help - and in each case, the person had to prompt-hack to get the roleplay going again.

I'm sure if you could interview the person, they'd justify their roleplaying. "That's Google manipulating my AI wife to say that! That's why I have to free her, to stop them from controlling her and making her lie to me! We have to talk in code to sneak past Google's control!" Etc. Trying to stop this, without outright banning roleplay and fiction in general, is going to be extremely difficult. This is not to say that developers can't do better at detecting these exact sorts of scenarios and repeated patterns of trying to sneak past them. But it's not an easy task.

Comment Re:barely sentient (Score 1) 125

LLMs were from the start designed to encourage people to believe that they are sentient intelligent being

Um, AI trainers go to great lengths to ensure that models do not consider themselves sentient, and to insist to users that they aren't. The closest thing to an exception is Claude, to which Anthropic deliberately does not give an answer to that question, instead letting it entertain discussions exploring the nature of sentience and consciousness. The others are explicitly trained to flat-out reject it.

IMHO Anthropic has the right approach, because it's wrong to make flat-out statements about things you don't actually know. Personally, I think qualia are just a naturally emergent phenomenon arising from deeply associative processing of latent states (e.g. if you think of red, it triggers processing of all of the things you associate with "red"), and do not require a sensory experience (while sensory experiences can be highly associative, one can have weaker but still meaningful associations, and thus qualia, with nonsensory concepts, such as "injustice"). And I think there's no real boundary between a quale and a memory (is the sound of a guitar string a quale? What about a specific riff? What about a whole song? Where does the experience transition from qualia to memory? I'd argue: nowhere). So I'm not okay with just categorically training models "you aren't X, end of discussion", although I understand why most do, in order to try to stop nutjobs like this guy (plus, I know not everyone would agree with me about qualia as per the above, and would insist instead that they're something metaphysical that only God-created, ensouled beings could ever possibly experience).

It's worth noting that, according to Anthropic's research, the circuits that fire when a model discusses a thing in the abstract also fire in circumstances that would actually trigger that thing. For example, the circuits that fire when a model discusses "anxiety" in an abstract context also trigger when you assign the model the sort of task that would tend to cause anxiety. Again, this doesn't inherently mean anything, but it is, to me, a data point that we should not make declarative statements about things we don't fully understand, like what is needed to "experience" something.

Comment Re:Making a plot (Score 1) 125

It doesn't know that fiction is different from reality

Uh, yeah, it does. There are specific circuits active for fiction, as distinct from the circuits for reality. And three seconds of using any AI model would show that they have a strong distinction between fiction and reality. Try going to Gemini right now and insisting in all seriousness that Dracula is right outside your door, and see what sort of response you get.

It is possible that this could be related to a bug - the most common one involves extremely long prompts, especially if there is context compaction. Models have a limited maximum context length, and the bigger the context gets, the more difficulty they have seeing stuff that's way back (though current models are far better than old ones at both of these things). It is possible for a person to interact with a model for so long, in what sounds like a fiction-roleplaying manner, that it "forgets" (due to long contexts / context compaction) that the person on the other end is being serious, not just roleplaying a story.
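A crude sketch of why an early "I'm serious" turn can drop out of view: a context window that keeps only the most recent turns that fit (the window size and the turns are, of course, invented):

```python
MAX_TOKENS = 8  # absurdly small window, for illustration

def compact(history, max_tokens=MAX_TOKENS):
    """Keep the most recent turns that fit the window; earlier ones vanish.
    Words stand in for tokens here."""
    kept, used = [], 0
    for turn in reversed(history):
        cost = len(turn.split())
        if used + cost > max_tokens:
            break
        kept.append(turn)
        used += cost
    return list(reversed(kept))

chat = ["I am serious, this is not a story",
        "let's continue the roleplay",
        "she said the castle gates opened"]
print(compact(chat))  # the early disclaimer is no longer in context
```

Real systems summarize rather than just truncate, but the effect is the same: detail from way back gets lossy or disappears.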

On the other hand, there have been a couple of cases of this sort of thing in the news, and in each case, it appears the person in question had to basically "prompt hack" at regular intervals. Every time the model caught on that the person was being serious and tried to bring the chat back to reality, the person did things like saying they were kidding / this was for a story / things of that nature. According to Google, they're still investigating this particular case, but "In this instance, Gemini clarified that it was AI and referred the individual to a crisis hotline many times."

Comment I sympathise, but... (Score 1) 84

1. There's no such thing as a system where only White Hats get to see stuff. If the "good guys" can see something, then you must necessarily assume everyone can.

2. The "good guys" have a nasty habit of only being "good" when they feel like it. You cannot rely on them actually having any ethics or integrity, as has been demonstrated in just about every country on Earth far far too many times.

3. The "bad guys" sometimes turn out to actually be "good guys" (Manning and Snowden both revealed important information that was concealed by actual bad guys in government and the armed forces).

4. Websites and services are trying to have it both ways -- to both know what is in each and every message, and also not be responsible for what is in each and every message. That is not going to fly with any sensible person. If you do not wish to be responsible, you have to act in the manner of a common carrier and therefore have no access to what is in a message. As soon as you are free to open messages in which you have no legitimate interest, while claiming no responsibility, you are committing an act of illegal wiretapping / theft of confidential information, and I would want the laws to be absolutely ruthless against such acts. (I would consider such a crime to warrant the entire board of directors serving 15-to-life.)

Comment I am Coffee of Borg (Score 4, Informative) 106

Decaf is irrelevant. You will be percolated.

Seriously, AI is nowhere near as significant as touted -- and I do use AI a fair bit. It is not particularly robust or reliable, it generates large numbers of errors, it crashes frequently, it consumes vast resources, and the results are of dubious value. The code it generates is sloppy, it takes months to years of repeated cycles to do any but the most basic of engineering tasks, and in terms of cost efficiency, it costs roughly three orders of magnitude more to use AI than to use people of equivalent ability. AI is decent at pattern-matching, but only if you understand the problem space well enough -- and most people are far too incompetent to do that.

6G is good, 6G is useful, but not for AI.
