OpenAI's AI Reasoning Model 'Thinks' In Chinese Sometimes, No One Really Knows Why

OpenAI's "reasoning" AI model, o1, has exhibited a puzzling behavior of "thinking" in Chinese, Persian, or some other language -- "even when asked a question in English," reports TechCrunch. While the exact cause remains unclear, as OpenAI has yet to provide an explanation, AI experts have proposed a few theories. From the report: Several on X, including Hugging Face CEO Clement Delangue, alluded to the fact that reasoning models like o1 are trained on datasets containing a lot of Chinese characters. Ted Xiao, a researcher at Google DeepMind, claimed that companies including OpenAI use third-party Chinese data labeling services, and that o1 switching to Chinese is an example of "Chinese linguistic influence on reasoning."

"[Labs like] OpenAI and Anthropic utilize [third-party] data labeling services for PhD-level reasoning data for science, math, and coding," Xiao wrote in a post on X. "[F]or expert labor availability and cost reasons, many of these data providers are based in China."

Other experts don't buy the o1 Chinese data labeling hypothesis, however. They point out that o1 is just as likely to switch to Hindi, Thai, or a language other than Chinese while teasing out a solution. Rather, these experts say, o1 and other reasoning models might simply be using languages they find most efficient to achieve an objective (or hallucinating). "The model doesn't know what language is, or that languages are different," Matthew Guzdial, an AI researcher and assistant professor at the University of Alberta, told TechCrunch. "It's all just text to it."

Tiezhen Wang, a software engineer at AI startup Hugging Face, agrees with Guzdial that reasoning models' language inconsistencies may be explained by associations the models made during training. "By embracing every linguistic nuance, we expand the model's worldview and allow it to learn from the full spectrum of human knowledge," Wang wrote in a post on X. "For example, I prefer doing math in Chinese because each digit is just one syllable, which makes calculations crisp and efficient. But when it comes to topics like unconscious bias, I automatically switch to English, mainly because that's where I first learned and absorbed those ideas."

[...] Luca Soldaini, a research scientist at the nonprofit Allen Institute for AI, cautioned that we can't know for certain. "This type of observation on a deployed AI system is impossible to back up due to how opaque these models are," they told TechCrunch. "It's one of the many cases for why transparency in how AI systems are built is fundamental."
  • by fahrbot-bot ( 874524 ) on Tuesday January 14, 2025 @08:49PM (#65089399)

    OpenAI's AI Reasoning Model 'Thinks' In Chinese Sometimes, ...

    If it wants to fly the MiG-31 Firefox [wikipedia.org], it'll have to "think in Russian."

    • Re:Ya, well ... (Score:5, Informative)

      by Kisai ( 213879 ) on Tuesday January 14, 2025 @09:17PM (#65089427)

      My guess is that it's likely doing it for a practical reason.

      Chinese has a lot of characters, so specific concepts are likely easier to tokenize as single Chinese characters, whereas English is a hugely clunky language that requires ten times as much verbosity to convey the same concept.

      It likely requires less memory to work in Chinese for that reason.

      Heck, every time I see a "400 billion parameter" LLM I think, "I'm sure that could be represented in 1/4 the size in Chinese." But then you have to add a translation layer for the ingress and egress of data, which means the accuracy plummets if there isn't a perfect 1:1 mapping from the ingress language to Chinese and back to the egress language.
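      The character-density claim can be eyeballed with a stdlib-only toy comparison. To be clear, this is just character and byte counting, not a real LLM tokenizer; actual token counts depend on the model's BPE vocabulary, and the phrase pairs here are my own rough translations:

```python
# Toy comparison: the same phrase in English and Chinese, by character
# count and UTF-8 byte count. Chinese "wins" on characters, but each
# character costs 3 bytes in UTF-8 -- and real models count BPE tokens,
# not characters or bytes.
pairs = [
    ("The cat sat on the mat.", "猫坐在垫子上。"),
    ("Artificial intelligence", "人工智能"),
]

for en, zh in pairs:
    print(f"EN: {len(en):2d} chars / {len(en.encode('utf-8')):2d} bytes | "
          f"ZH: {len(zh):2d} chars / {len(zh.encode('utf-8')):2d} bytes")
```

      For these pairs the Chinese side has far fewer characters (7 vs. 23, 4 vs. 23) but not fewer bytes, which is why the tokenizer dictionary, not the script, decides what is actually cheaper.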

      • by AmiMoJo ( 196126 )

        Was going to say the same thing. Sometimes I think in Japanese simply because the language and mindset are a better fit for the subject at hand.

        • Understanding Japanese is helpful when programming in Postscript or using an RPN calculator.

          Korean, Hungarian, and Basque also use SOV postfix grammar.
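          The PostScript/RPN connection is just operand-before-operator ("verb last") order; a minimal postfix evaluator makes it concrete (a toy sketch, not anything from the article):

```python
# Minimal RPN (postfix) evaluator: push operands, apply the operator
# that arrives last -- the same head-final order as SOV grammar.
def eval_rpn(tokens):
    ops = {"+": lambda a, b: a + b, "-": lambda a, b: a - b,
           "*": lambda a, b: a * b, "/": lambda a, b: a / b}
    stack = []
    for tok in tokens:
        if tok in ops:
            b, a = stack.pop(), stack.pop()
            stack.append(ops[tok](a, b))
        else:
            stack.append(float(tok))
    return stack.pop()

print(eval_rpn("3 4 + 2 *".split()))  # (3 + 4) * 2 -> 14.0
```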

        • TF are you guys talking about, "thinking in Chinese"? How does one THINK in any language at all?

          • by AmiMoJo ( 196126 )

            You don't have an inner monologue?

            If you knew a language like Japanese you would know that it's not just the words, it's the whole way of looking at the world and understanding it.

            • Everyone has an inner monologue, if that's what you want to call it. It's just that the monologue isn't using language. The left side of your brain talks to the right side, and this interaction doesn't use language. Also, we use this same exact type of communication exchange with trees, insects, animals.... And this communication exchange isn't limited to happening within the frame of time either.

              And in fact, those that design a language that's as closely related to this natural communication exchange as possible

            • No, no inner monologue. No images. I don't think in languages, though I can speak several. Turning my thoughts into language is a laborious process, often requiring a rather huge number of words for a rather simple thought.

      • Re:Ya, well ... (Score:5, Interesting)

        by ceoyoyo ( 59147 ) on Tuesday January 14, 2025 @11:05PM (#65089611)

        Nobody outside of OpenAI really knows how they've implemented their reasoning system, and the article isn't clear about what "thinks in" means. My guess is that the former means the model produces a chain of intermediate results in a language it knows, and the latter means some of those intermediate results are in different languages.

        OpenAI also doesn't say how they train on multiple languages. I asked ChatGPT 4o mini "Pouvez-vous décrire un lion?" ("Can you describe a lion?"), replacing various words with English; it responded in French unless all the words of the question were English. It even responded in French when the query was "Can you describe un lion?" Asking "Can you describe un tree" also got a French response, with a little snark about my shitty French: 'Bien sûr ! Un "tree", ou arbre en français, est une structure....' ("Of course! A 'tree', or arbre in French, is a structure....")

        Interestingly, if you ask it something where the non-English is a phrase that it might have seen in English ("What does je ne sais quoi mean?"), it responded once with split screen French and English and the rest of the time with English.

        So it seems to be able to understand mixed language input but the response is biased towards non-English, and the output is always in a single language except for quotes. That might mean the output is forced into whatever language some not very good detector thinks the input is in, and the intermediates sometimes drift a bit because the detector is kind of crap. That could well be because some particularly efficient tokens are in the other language, or it could be because the training data that contained some concept was in that language.
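        The "not very good detector" idea is easy to sketch: a naive script-vote classifier (entirely hypothetical, stdlib only; nobody outside OpenAI knows what they actually run) misroutes mixed input exactly the way described:

```python
# Hypothetical crude language detector: vote by character script.
# A few CJK characters can outvote the Latin text around them, which
# is the kind of drift the paragraph above speculates about.
def naive_detect(text):
    votes = {"latin": 0, "cjk": 0}
    for ch in text:
        if "a" <= ch.lower() <= "z":
            votes["latin"] += 1
        elif "\u4e00" <= ch <= "\u9fff":  # CJK Unified Ideographs block
            votes["cjk"] += 1
    return max(votes, key=votes.get)

print(naive_detect("Can you describe un lion?"))  # latin
print(naive_detect("x 的平方根是多少"))  # cjk: 7 CJK votes beat 1 Latin
```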

        I'm curious whether "thinks in X" means all of the intermediate was in X, if it was mixed languages, or if that's even possible.

        • Nobody outside of OpenAI really knows how they've implemented their reasoning system, and the article isn't clear about what "thinks in" means

          The o1 and o3 models use chain of thought to "think" about and reason step by step through the question you ask. Concretely, "thinking" is just emitting text about the "reasoning" process they're going through before providing an answer. Whether or not it is "thinking" in any real sense, lots of empirical results have shown that having models emit chain of thought before responding improves their accuracy. One intuition for why is that emitting more text lets the model spend more compute on answering a question.
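          The pattern described here -- emit reasoning text, then surface only the final answer -- can be sketched generically. Everything below is a stand-in (`fake_model` just returns a canned reply); it illustrates the wrapper shape, not OpenAI's actual implementation:

```python
# Generic chain-of-thought wrapper. fake_model stands in for a real
# LLM call and returns a canned reply; only the shape matters here.
def fake_model(prompt):
    return ("Reasoning: 17 * 3 = 51, then 51 + 4 = 55.\n"
            "Answer: 55")

def ask_with_cot(question):
    prompt = f"{question}\nThink step by step, then write 'Answer: <result>'."
    reply = fake_model(prompt)
    # The "thinking" text is emitted but discarded; users see only the answer.
    for line in reply.splitlines():
        if line.startswith("Answer:"):
            return line.removeprefix("Answer:").strip()
    return reply.strip()

print(ask_with_cot("What is 17 * 3 + 4?"))  # 55
```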

          • by ceoyoyo ( 59147 )

            Thinking is not "emitting text about the reasoning process," either in humans or otherwise.

            As I said, you could implement a chain of reasoning system that does that, but you don't need to. There isn't a particularly good reason for doing so, and quite a few reasons why you would not, which is why I suspect OpenAI doesn't. More likely they encode text to tokens at the beginning, then stick with tokens or something more abstract all the way through until the last stage.

            You could take the products at those int

        Chinese has a lot of characters, so specific concepts are likely easier to tokenize as single Chinese characters, whereas English is a hugely clunky language that requires ten times as much verbosity to convey the same concept.

        Heck, every time I see a "400 billion parameter" LLM I think, "I'm sure that could be represented in 1/4 the size in Chinese." But then you have to add a translation layer for the ingress and egress of data, which means the accuracy plummets if there isn't a perfect 1:1 mapping from the ingress language to Chinese and back to the egress language.

        LLMs think in tokens, not words. You can increase the size or optimize the configuration of the tokenizer dictionary to improve inference-time performance in this way. There is no way to judge the efficiency of one language over another without an analysis of the dictionary.
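        A toy greedy longest-match tokenizer shows the point: the same string costs wildly different token counts depending on what happens to be in the dictionary (the two vocabularies below are made up for illustration):

```python
# Toy greedy longest-match tokenizer. "Efficiency" of a language is a
# property of the vocabulary, not the script: swap the dictionary and
# the token counts flip.
def tokenize(text, vocab):
    tokens, i = [], 0
    while i < len(text):
        for j in range(len(text), i, -1):  # try longest substring first
            if text[i:j] in vocab:
                tokens.append(text[i:j])
                i = j
                break
        else:
            tokens.append(text[i])  # unknown character: emit it alone
            i += 1
    return tokens

vocab_a = {"square", " root", " of"}  # imaginary English-heavy vocab
vocab_b = {"平方根", "的"}            # imaginary Chinese-heavy vocab

print(len(tokenize("square root of", vocab_a)))  # 3
print(len(tokenize("平方根", vocab_a)))          # 3 (falls back to chars)
print(len(tokenize("平方根", vocab_b)))          # 1
```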

      • ... But then you have to add a translation layer for the ingress and egress of data, which means the accuracy plummets if there isn't a perfect 1:1 mapping from the ingress language to Chinese and back to the egress language.

        Is that perfect mapping even possible? I'm reminded of the old joke about translating "out of sight, out of mind" from English to Chinese and back, and ending up with "invisible idiot". Admittedly that's not a fair example; but even if you eliminate whatever an LLM's equivalents of connotations and slang might be, I'd be amazed if you could rely on the output being the same as the input.

        I would guess that the results of an LLM translating from one language to another and back again might be, at best, inconsistent.

      • wondering out loud here... (disclaimer: my knowledge of non-Latin languages is essentially nil, i.e. running on average misconceptions, plus a bit that Marshall McLuhan wrote)...

        What you said makes sense to me; however, I'd like to know if the models use that same logic. I suggest the opposite may be true. Each Chinese character is a pictogram, and a metaphor. You have to understand Chinese history to understand the complete meaning of one of these Chinese characters. This seems quite beyond the "weighted average" approach.
      • True, you may be thinking, but these models can't think, they don't think, yet. It's just a giant function 'f', whose domain is a set of multi-language characters and whose range is a set of multi-language characters.

    • by twosat ( 1414337 )

      Mitchell Gant (Clint Eastwood) must think in Russian to make his thought-controlled Firefox shoot down the other Firefox. https://www.youtube.com/watch?... [youtube.com]

    • While the wording of the movie is humorous, that concept might have been lost between the script and the filming. It would have been better phrased as "Think like a Russian pilot." That might have worked better in the script, showing Eastwood anticipating the Soviet pilot's tactics and countering his moves. But then all Eastwood did was launch a missile backwards, which I don't know that any jet actually has the capability to do.

      After the Soviet Union fell, the US was able to purchase combat jets like MiG-29s.

  • I mean, I can understand Chinese, but Persian? That's peculiar to say the least.

  • "Luca Soldaini, a research scientist at the nonprofit Allen Institute for AI, cautioned that we can't know for certain. "

    OK, don't listen to anything this guy says.

  • If this machine was trained properly on all languages, then it would make sense for it to 'think' in a different language just to ensure proper 'understanding' of all possibilities.

  • AIs don't "think" (Score:5, Insightful)

    by dfghjk ( 711126 ) on Tuesday January 14, 2025 @09:20PM (#65089431)

    "OpenAI's "reasoning" AI model..."

    It's not a "reasoning model" because it doesn't "reason"; that's just a provocative name intended to suggest OpenAI knows more than it does. And AIs don't "think" either, so they don't think in Chinese. Nor is this behavior, when described properly, particularly "puzzling". It's interesting for sure, but clearly they train the model using multiple languages, so you should expect learned information to potentially map to multiple languages. Brains exhibit this too.

    • Re: (Score:3, Insightful)

      The definition of "think" is vacuous, to say the least.
      It infers, and I challenge you to prove that your "thinking" is anything more than that. It's been constructed so that it can combine inferences into logical steps to reason, a skill I think you're willfully throwing to the wayside because you're intimidated by how unspecial it makes you feel.

      Reasoning LLMs are an area of study. OpenAI didn't invent the term, nor are they anywhere close to the only people using it.
      Some of those people are Ph.D.
      • *are Ph.D.s in universities
        • LLMs are just fancy autocompletes, which are useful for some things; for example, I think you could do with a better one.

          • Re: (Score:2, Troll)

            LLMs are transformers- they tokenize a string of words and fire them through hundreds of layers of millions of interconnected virtual neurons, collecting context along the way.
            What comes out on the other side is a result of the weights of all those neurons.

            To call that an "autocomplete" is too stupid to even be wrong.
            I'd love to see you prove that you're anything more than an autocomplete, and not even a very fancy one.
            • I read somewhere his mother is a _____
              Somebody completed, but we're not sure who.

            • Re: (Score:2, Interesting)

              by gweihir ( 88907 )

              That is just pseudo-profound bullshit, bereft of insight. LLMs _are_ essentially just autocomplete with some things wrapped around them for a nicer presentation. They cannot even do symbolic computation, they always have to stay on a concrete literal word level. You cannot get insight or understanding that way. You can fake it in a way that deeply impresses less smart people, though.

              • Re: (Score:1, Troll)

                LLMs _are_ essentially just autocomplete with some things wrapped around them for a nicer presentation.

                Wrong.

                They cannot even do symbolic computation, they always have to stay on a concrete literal word level.

                Laughably wrong.

                You cannot get insight or understanding that way.

                Based on what? Your magical definition of insight and understanding?

                You can fake it in a way that deeply impresses less smart people, though.

                And in a way that causes other people to overestimate their intelligence, apparently.

                • by gweihir ( 88907 )

                  You really need to get some clue before you shoot your mouth off. Well, that is probably not the way you like to do things. Clueless grandstanding seems to be what you do best.

                  • I need to get a clue? Give me a break.
                    Your religious adherence to your counterfactual opinion on this matter is so deeply ingrained in your head, that you have made it your fucking signature.
                    Nearly every claim you made was false, as if tailored for some earlier iteration of LLMs via a God of the Gaps argument.
                    That's not clever. In fact, it's downright stupid. It's the kind of stupid shit that lets people keep believing in god as human knowledge runs out of places for him to exist.
                    You are desperately s
                    • by gweihir ( 88907 )

                      You know what being a PhD-level CS type who has followed AI research for 35 years does? It makes me pretty immune to marketing bullshit. I guess you lack that.

                    • I have a BS.CS. from UW and 25 years of experience.

                      You aren't Ph.D. level anything, dude.
                      Your reasoning ability is high-school at best.
                      You've had experience, I can see that in your words, but actual understanding- you lack. I think you might be an LLM. After all, your neurons are only able to transform action potentials over a circuit- it can't be possible that you comprehend anything.
                    • by gweihir ( 88907 )

                      Hahaha, more denial. Since I do not want to doxx myself, I cannot give you the reference to my PhD thesis in the respective university library or my publication list (including 5 "best paper" awards), but I know what I have.

                    • Like an LLM, I can only infer what you have based upon observed behavior.
                      A Ph.D. in any science would know as much... unless of course that Ph.D. thought that there were magical processes going on in their neurons that made them something more than what they are.

                      So, if you were, in fact, granted a Ph.D. in some science, I posit that the screening process for filtering out semiclever imitations of intelligence failed in this instance.
              • >They cannot even do symbolic computation, they always have to stay on a concrete literal word level.

                Ok gramps. Go take your meds. That's quite literally what these "reasoning" models are meant to exceed. You can even run at least one locally if you have more than 16GB of RAM and a couple of spare CPU cores.

                Let's list what that stripped-down model can do - solve algebraic equations, do linear regressions, solve word problems, solve logic puzzles, solve isostasy equations, solve order-of-operations equations

                • by gweihir ( 88907 )

                  Nope. That comes in via Wolfram Alpha. LLMs cannot do these things. You are hallucinating heavily.

                  • Ah yes, "I" am the one hallucinating. I must have hallucinated all the testing I've done on a machine with no internet connection. Sure. I guess Wolfram was somehow magically installed on my system...

                    Or you are just an asshole who doesn't know what the fuck they're talking about. I think I'll go with that one. Add in the fact that you are hallucinating about advances not happening in technology fields you don't understand, and, well, we get your stupid posts. AKA your Slashdot comment history.

      • But it has not been *constructed* to do logical inference at all... it has been trained to simulate the output by copying (badly) from examples, and then it has been trained to simulate the output students would generate if you ask them to "show your work"

        Like a lazy student you know is always cheating, almost on principle, but you think they're smart enough that if you force them to cheat through enough homework and tests they will learn and develop true understanding... despite themselves.

        The above *can* work

        • But it has not been *constructed* to do logical inference at all...

          Incorrect.

          it has been trained to simulate the output by copying (badly) from examples

          Absurdly incorrect.

          and then it has been trained to simulate the output students would generate if you ask them to "show your work"

          Where the fuck are you making this bullshit up from?

          Some researchers seem to think the "latent space" LLMs develop from training is equivalent to internalized knowledge, and that reasoning (if not sentience) can "emerge" from generative behavior - and the rest is optimization. But this is controversial and hardly a scientific consensus - it just gets more press.

          Everything about what goes in LLMs is controversial. They're fucking black boxes due to the complexity.
          Sentience should be controversial. If for no other reason than that an LLM has atrociously little state that isn't hard-coded, and so can't possibly experience anything like life for more than an unimaginably small fraction of time and space, if it were to do so at all.

          As to whether the latent space is equivalent to internalized k

          • I think you need to calm down and stop watching so much MLST. You've become stuck in a simplified worldview about the inner workings of the brain that was never intended as more than a crude analogy to entice interest in a data fitting field of study called neural networks long ago.

            Mathematically, there's not much going on here other than a humongously large dataset and a worldwide competition to find ways to represent it geometrically, and preferably in linear algebraic operations that happen to be optim

            • I think you need an IQ greater than 70, which is questionable at this juncture, to have an opinion that's worth a shit.

              Let's analyze the core of your concern.
              Mathematically, there's not much going on here.
              What on earth would lead you to think that's meaningful in any way?
              Your entire existence is coded by a single repeating molecule that comes in 4 configurations.
              Mathematically, there isn't much going on with you.
              Of course, if you're of the more intelligent persuasion, you'd say, "Why on Earth would we
          • That we are neurons doesn't mean we are the same thing as an LLM. And there is no evidence to support that when humans comprehend a subject, what they do is the same glorified auto-completion an LLM does. In fact, humans are horrible at doing that.

            Humans build internal models to predict what is going to happen. This is how they accurately throw spears at moving targets, and steer herds of large animals over cliff edges. From this, symbolic systems to communicate and enumerate have been built. LLM's do thing

            • That we are neurons doesn't mean we are the same thing as an LLM.

              The same, of course not.
              However, it directly means that claims of:
              LLMs can't be intelligent, can't think, are nothing but statistical models, are nothing but autocompletes- are absurd on their face.
              You're nothing but a repeating chain of 4 configurations of a DNA molecule.
              2 molecules of DNA can't do much.
              Enough billions, and you've got a you. Amazing, isn't it?

              And there is no evidence to support that when humans comprehend a subject, what they do is the same glorified auto-completion an LLM does.

              Logical fallacy right out of the gate. You're begging the question.
              Let me turn that around on you.
              There's no evidence to support that when an

              • No, it doesn't directly mean what you claim, any more than the fact that both cars and supertankers are made from steel means supertankers can drive on roads and be useful for trips to the mall. LLM's are defined by their algorithms, and those directly prohibit LLM's from ever comprehending concepts or context in the way that humans do.

                And no, I'm not begging the question just because I make the observation that LLM's do statistical analysis and humans do not. That's a really wild thing to try to claim, and

                • by gweihir ( 88907 )

                  That nicely sums it up. But some people want to believe so strongly that they become completely disconnected. We have a textbook example right here.

                • No, it doesn't directly mean what you claim, any more than the fact that both cars and supertankers are made from steel means supertankers can drive on roads and be useful for trips to the mall.

                  Of course it does, because your example gives structure where you have none.
                  You cannot classify the set of finite state machines in your brain as a supertanker, or the one in the LLM as a car.
                  All you know is that it's a series of several billion of them, and its observed behavior is X.

                  LLM's are defined by their algorithms, and those directly prohibit LLM's from ever comprehending concepts or context in the way that humans do.

                  Their algorithms?
                  And what algorithm do you think that is?
                  I can hand-design a 10 layer LLM that's only capable of doing math.
                  The algorithm is in the parameters, and it is unknowable at the sizes we're now playing with.
                  It'

                  • Indeed, our mind has structure, and so does an LLM. And they are demonstrably not the same kind of structure, WHICH IS THE POINT. The structure is there, and it's important. Arguing that it isn't is, once more, appealing to magic. You do that a lot.

                    An LLM is an algorithm. That algorithm defines its method, just like a bubble sort algorithm's method is defined by the bubble sort. What you can hand design is the parameter set to this algorithm, which does not magically bestow it with some way of doing things i

                    • Indeed, our mind has structure, and so does an LLM. And they are demonstrably not the same kind of structure, WHICH IS THE POINT.

                      I should have used smaller words.

                      You are using a personal bias to select the structure of one vs. the other (supertanker, car) and describing which can use some arbitrary mode of operation of your own selection (road).

                      Arguing that it isn't is, once more, appealing to magic. You do that a lot.

                      This is called a strawman. Since I never argued this, whatsoever.
                      You do that a lot. Unintelligent people often do.

                      You're arguing that because we do not have detailed understanding of the flow inside a large LLM, it might actually be doing something which it hasn't been programmed or trained to do. For someone so quick to slam on magic you're even quicker to invoke it.

                      No magic whatsoever. It's emergence, and it's very scientific. Hell, you're using your emergence right this second to invoke logical fallacy after logical fallacy in a desperate

                    • So observation is bias in your world. Got it.

                      And your argument as stated is that because we contain DNA, something something, so LLM's can think. Frankly, I'm completely uninterested in your chain of reasoning (if you even have one) at this point, because you constantly show you do not understand the core concepts of LLM's. Or of biology, for that matter.

                      That we can't know in detail what goes on in an LLM is not the same thing as not knowing what kind of algorithm it is. It's a statistical token matcher.

                    • Observation?

                      Give me a break.
                      Your argument is so fucking stupid that anyone who read it actively had brain cells die.

                      Your argument can be rephrased with alternate bias to work just fine:
                      "Any more than the fact that both EVs and ICEVs are made from steel means EVs can drive on roads..."

                      Your argument depended on your bias. Come the fuck on, dude. Did you fucking graduate high school?
                    • My argument depends on my observation. Nothing else.

                      And yes, lots of things made from steel can be driven on roads. And lots can't. Your attempt at snark shows you haven't grasped that the use of steel is not a relevant factor in determining whether something can be driven on a road.

                      Which maps directly to your attempts to claim that just because we have DNA, LLM's can be intelligent.

                      Fascinating to watch someone who in part appears to be a fully grown human with a BS in CS claim tha

          • by gweihir ( 88907 )

            But at the end of the day- we are that- nothing more than neurons.

            That is what is called an extraordinary claim. You have any extraordinary evidence to support it? Or maybe simple, but scientifically sound evidence? No, you do not, because _nobody_ has that evidence and the question is completely open. It is currently hip among some groups of scientists to make that claim as a _personal_ opinion, in a failed attempt to claim that one has a reasonable distance from religion. But no competent scientist makes that claim as a _scientific_ claim, because that would make proof necessary.

            • That is what is called an extraordinary claim.

              That is actually what is called a fact.
              You've just outed your belief system, and it's precisely what I thought it was.
              You think that you're magical.

              Physicalism is indeed philosophy, like Science is.
              My understanding of the brain is scientific. Yours is religious.

              Science does not consider the question open. You tell yourself that to soothe your cognitive dissonance.
              Science says, "The alternate religious hypothesis presented is an unprovable bullshit claim. It requires no more consideration."

              • That is actually what is called a fact.

                While it's still debated by philosophers, I'm pretty sure it's what the vast majority of scientists believe. I'm not sure I would call it a fact but that gets into semantics. And there is a ton of evidence. Look at the medical work on the brain including surgery with patients that are awake. While philosophers can weave some interesting tales, almost no one believes them.

                You've just outed your belief system, and it's precisely what I thought it was. You think that you're magical.

      • by gweihir ( 88907 )

        In other words, you have nothing worthwhile to say, you have no insight into the matter but you are deeply emotionally committed to believe this tech will bring us a bright future, if just all these pesky dumb deniers would simply go away.

        Did I sum that up correctly?

        • but you are deeply emotionally committed to believe this tech will bring us a bright future

          This is a literal strawman. Does constructing strawmen to slay imaginary enemies make you feel clever?
          I never said anything about our future, and in no way do I think AI brings us anything good that outweighs its bad.

          if just all these pesky dumb deniers would simply go away.

          Go away? No need- you can stand still. The world is flying past you so quickly that I imagine your head is spinning.

          Did I sum that up correctly?

          In a way that only a 4 year old trying to argue with an adult could- congratulations ;)

          • by gweihir ( 88907 )

            In a way that only a 4 year old trying to argue with an adult could- congratulations ;)

              Try a professor trying to get through to a not very smart but hugely arrogant student. Of course, I would just fail you if I did not manage to reach you. And yes, I have done that in the past.

            • Try a professor trying to get through to a not very smart but hugely arrogant student.

              I know you are, but what am I? lol.
              I had many professors. If any suffered from the kind of willful stupidity that you force upon yourself on certain topics, I'd have gotten another one.

              Of course, I would just fail you if I do not manage to reach you. And yes, I have done that in the past.

              I mean, if you were employing logical fallacies to try to reach them via reason, I can see why- it was because you sucked at your job.

              • by gweihir ( 88907 )

                Haha, you wish. All I hear is denial and arrogance.

                • Yes, me pointing out that you used a logical fallacy and then decried being unable to reach people is denial and arrogance.

                  No, sir, that is someone pointing out that you've lost the argument and lack the grace to admit it.
                  Whether you're right or wrong- you lost the argument.
    • by gweihir ( 88907 )

      "Provocative name"? You are too kind. What is happening is that they are lying through their teeth to keep the investor money flowing: "AGI" redefined, "reasoning" that is not reasoning, "thinking" that has no resemblance to actual thinking, and generally ascribing (always future) capabilities to their crappy technology that would be direly needed, but that it simply does not have and cannot attain.

      Somebody called that a "permanent delivery scam", where it is always the next version that brings the big breakthrough.

    • It's not a "reasoning model" because it doesn't "reason"; that's just a provocative name given to suggest OpenAI knows more than it does. And AIs don't "think" either, so they don't think in Chinese. Nor is this behavior, when described properly, particularly "puzzling". It's interesting for sure, but clearly they train the model using multiple languages, so you should expect learned information to potentially map to multiple languages. Brains exhibit this too.

      The terminology and concept of chain-of-"thought" "reasoning" is a well-known, commonly used and accepted term that is by no means limited or unique to OpenAI.

  • Makes total sense (Score:4, Insightful)

    by gurps_npc ( 621217 ) on Tuesday January 14, 2025 @09:27PM (#65089441) Homepage

    It was trained on all languages, it doesn't think in any of them, it thinks in all of them.

    It's the equivalent of me and my friend arguing about whether Spock or Gandalf would make a better Jedi. We have been trained in Star Trek, Lord of the Rings and Star Wars, so we think in all of them together.

    When asked a question about something from one of them, we will of course think about the related subjects.

  • but it was hacked by Xi!

  • common (Score:5, Funny)

    by bugs2squash ( 1132591 ) on Tuesday January 14, 2025 @09:48PM (#65089481)
    maybe we all think in chinese, we're just unaware of it if we don't speak the language
  • by Draconi ( 38078 ) on Tuesday January 14, 2025 @10:12PM (#65089541) Homepage

    Some thoughts from the mechanistic interpretability front: I was toying with a few gpt4all models just the other day (including their smaller reasoning model), injecting random data into the weights. As the quantity of random data increased, the performance of the models shifted through a few phases when testing at Temperature 0.0 (for reproducibility):

    1. Differentiation: a small amount of randomness, and the results for simple queries like "What's your name?" changed (ex. "a 19-year old girl" became "a 10-year old girl")
    2. Instruction Framing: more randomness meant it was more likely to create "instruct" style responses, including tangentially related Multiple Choice quizzes
    3. Unicodification: lots and lots of Chinese + other unicode bytes, sometimes still relevant to the query, mostly not
    4. Garbage
    5. Repetitive null output (ex: strings of @'s)

    It really seemed to be that the sheer amount of Unicode exposure leans Chinese.

    Lazy Reproduction Methodology:

    Take a vector of random bits, find the center of an LLM file containing binary weights, then use the bits to push the value of each weight up or down (with floors/ceilings, no wraparounds). In a model like Qwen 1.5b with 4-bit quantization, you just go to the center of the file, offset by how many weights you're overwriting, and unpack each byte of 8 bits into 2 chunks of 4. If the weight is stored as 0110 and your dice-roll bit is 1, change it to 0111. If the dice-roll bit is 0, change it to 0101. This is the lazy way to quickly play with the idea.

    I find that with about 1024 or 2048 bytes worth of random data you can keep it in Stage 1 pretty well. Expand the amount using the weight-shifting scheme if you want to see it start spitting out Unicode like mad, claiming to be god (it literally breaks the late-trained fine-tuning guardrails, it seems), or spitting out broken token ids. Examples: https://x.com/CottenIO/status/1875000049056506061
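
    For the curious, the per-nibble dice roll described above boils down to a few lines of Python (a minimal sketch of the scheme as described, with `nudge_nibble` as an illustrative name, not code from the actual experiment):

    ```python
    def nudge_nibble(weight: int, dice_bit: int) -> int:
        """Push a 4-bit quantized weight one step up (bit 1) or down (bit 0),
        clamped to the 0..15 range so there is no wraparound."""
        return min(weight + 1, 15) if dice_bit else max(weight - 1, 0)

    # the worked example from above: 0110 with dice-roll bit 1 becomes 0111,
    # and with dice-roll bit 0 becomes 0101
    print(format(nudge_nibble(0b0110, 1), "04b"))  # -> 0111
    print(format(nudge_nibble(0b0110, 0), "04b"))  # -> 0101
    ```

    The clamping is what keeps the perturbation "slight": a weight already at 0 or 15 just stays put instead of flipping to the opposite extreme.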

    • by Draconi ( 38078 )

      here's a python script for corrupting GGUFs if you want to try it out :)

      input: "model_path_here"
      output: (the path you gave appended with _2)

      ```
      import os
      import shutil
      import numpy as np
      from typing import Optional

      def analyze_and_modify_gguf_file(file_path: str, modify_bytes: int = 128) -> Optional[str]:
          """Modifies a GGUF file by slightly adjusting 4-bit weights in the middle section.
          Copies the file first, then modifies the copy (original path + "_2")."""
          out_path = file_path + "_2"
          shutil.copyfile(file_path, out_path)
          start = os.path.getsize(out_path) // 2  # middle of the file, inside the weight data
          dice = np.random.randint(0, 2, modify_bytes * 2)  # one dice-roll bit per nibble
          with open(out_path, "r+b") as f:
              f.seek(start)
              data = bytearray(f.read(modify_bytes))
              for i, b in enumerate(data):
                  hi, lo = b >> 4, b & 0x0F  # unpack the byte into two 4-bit weights
                  hi = min(hi + 1, 15) if dice[2 * i] else max(hi - 1, 0)
                  lo = min(lo + 1, 15) if dice[2 * i + 1] else max(lo - 1, 0)
                  data[i] = (hi << 4) | lo  # clamped, no wraparound
              f.seek(start)
              f.write(data)
          return out_path
      ```

      • by Draconi ( 38078 )

        (note: the above is hard-coded for 4-bit quantized models like the GGUFs common in gpt4all)

  • me, a native finnish speaker (poems&texts&stuff), some swedish and vewy, vewy good english, i know. whilst i am watching (the walking dead, lately) i think in english. same when talking in english.
    everything else i think in finnish.
    the WAY one thinks in different languages is,.... well, different. languages have limits and freedoms. like "taivas" in finnish, it is both "heaven" and "sky". then "nuoska", "pyry", "loska" and so forth are all weather related names for "snow".
    in first example finnish la

    • by dargaud ( 518470 )
      Exactly, polyglots will unconsciously switch to whatever language is better suited, which depends on the context in which they learned each language. For me it's: general reading/writing, math/engineering: English. Recipes, small talk: French. Sex: Italian.
  • by retiarius ( 72746 ) on Wednesday January 15, 2025 @12:46AM (#65089737)

    ... From national treasure Tom Lehrer's song "Wernher von Braun",
    note the last stanza. (All lyrics are now dedicated to the public domain
    by Tom, himself):

    Lyrics
    And what is it that put America in the forefront of the nuclear nations?
    And what is it that will make it possible to spend twenty billion dollars of your money
    to put some clown on the moon? Well, it was good old American know how, that's what,
    as provided by good old Americans like Dr. Wernher von Braun!
    Gather 'round while I sing you of Wernher von Braun,

    A man whose allegiance
    Is ruled by expedience.
    Call him a Nazi, he won't even frown,
    "Ha, Nazi, Schmazi, " says Wernher von Braun.

    Don't say that he's hypocritical,
    Say rather that he's apolitical.
    "Once the rockets are up, who cares where they come down?
    That's not my department, " says Wernher von Braun.

    Some have harsh words for this man of renown,
    But some think our attitude
    Should be one of gratitude,
    Like the widows and cripples in old London town,
    Who owe their large pensions to Wernher von Braun.

    You too may be a big hero,
    Once you've learned to count backwards to zero.
    "In German oder English I know how to count down,
    Und I'm learning Chinese!" says Wernher von Braun.

  • "For example, I prefer doing math in Chinese because each digit is just one syllable, which makes calculations crisp and efficient."

    Damn you, seven!!!

    • by dohzer ( 867770 )

      Makes me wonder if I've been performing calculations wrong, because the number of syllables has never mattered. "The word for the number 7 has two syllables, so I'll double the answer."

  • ... as bandied about in university philosophy departments for years.

    Only it's the reverse, where messages are passed into the room
    in English, processed in Chinese, then spit back into English.

  • ...don't buy the o1 Chinese data labeling hypothesis, however. They point out that o1 is just as likely to switch to Hindi, Thai, or a language other than Chinese while teasing out a solution.

    Bears repeating. It appears.

  • It is cluelessly bumbling around through the fog, never seeing anything beyond the next step and never even understanding that next step.

    Second, the observed phenomenon nicely shows this is actually fundamental research, because nobody has a clue what this machine does and why. Hence, at the very least 30 years to practical applicability, and probably much longer.

  • These LLMs are essentially implementations of the Chinese room concept.

    • by Epeeist ( 2682 )

      These LLMs are essentially implementations of the Chinese room concept.

      Searle's thought experiment is meant to demonstrate the difference between syntactic and semantic information. While LLMs may work at the syntactic level, they do not operate at the semantic level.

  • OpenAI's AI Reasoning Model 'Thinks' In Chinese Sometimes, No One Really Knows Why

    If a human suddenly switches language, then we can ask them why and they can explain.

    In the case of LLMs, we don't have that explanation.

  • by LoadLin ( 6193506 ) on Wednesday January 15, 2025 @05:32AM (#65090061)

    Definitely, not all languages think or express things the same way.

    Have you ever found that you know some term in one language and, when you don't have that word in the language you want to express yourself in, you just import the word?

    I'm Spanish. And sometimes you have words like "crush" (in a romantic sense). When I'm speaking Spanish and teasing someone, I just use that word.

    Well, I do it also because I think that person also knows the word, but the thing is... it's a better word than other Spanish expressions like "que... te encoñaste de ella?". It's the same... but it sounds rude, as the word "coño" is considered a bad word that on its own is used the same way as "pussy".

    We also have "enamorarse", but it means literally "to fall in love", and that's... not exactly the same when someone is just in the stage of being fascinated and attracted to someone without really understanding and knowing that other person for real.

    When I need to, I just switch language. No problem.

    I think it's the same here. If the shorter path to think about something is using a better-suited language, because the related ideas are better expressed there, the AI switches.

    Have you tried writing to an AI in mixed languages? It understands them without problems, and the response can be very chaotic. Sometimes I just express myself in Spanish and the AI answers in English. Fortunately that's not a problem for me.
    But... yeah... switching to Asian languages would be a problem for me. X-D

  • Other experts don't buy the o1 Chinese data labeling hypothesis, however. They point out that o1 is just as likely to switch to Hindi, Thai, or a language other than Chinese while teasing out a solution.

    Other experts don't buy the o1 Chinese data labeling hypothesis, however. They point out that o1 is just as likely to switch to Hindi, Thai, or a language other than Chinese while teasing out a solution.

    Slashdot editors strike again. Perhaps they should be replaced with an AI.

  • "For example, I prefer doing math in Chinese because each digit is just one syllable, which makes calculations crisp and efficient. But when it comes to topics like unconscious bias, I automatically switch to English, mainly because that's where I first learned and absorbed those ideas."

    "So for real stuff, we of course use Chinese. But for insane stuff like that, you gotta use English naturally ..."

  • Decreasing order of food deliciousness?

  • A merkin AI centuary - time to ban them all for national security.
  • Humans do this all the time. Humans WRITE like this all the time. It doesn't take much searching to see posts where people write in multiple languages in a single post, changing between languages mid sentence, or even word to word.

    This is the data you're training on... of course the model will replicate this.

    I'm starting to think a lot of the people in the AI space are bozos and don't really understand how these ML models work.

  • After all, it should be able to "reason" it out.

The clearest way into the Universe is through a forest wilderness. -- John Muir

Working...