AI China

Cutting-Edge Chinese 'Reasoning' Model Rivals OpenAI o1

An anonymous reader quotes a report from Ars Technica: On Monday, Chinese AI lab DeepSeek released its new R1 model family under an open MIT license, with its largest version containing 671 billion parameters. The company claims the model performs at levels comparable to OpenAI's o1 simulated reasoning (SR) model on several math and coding benchmarks. Alongside the release of the main DeepSeek-R1-Zero and DeepSeek-R1 models, DeepSeek published six smaller "DeepSeek-R1-Distill" versions ranging from 1.5 billion to 70 billion parameters. These distilled models are based on existing open source architectures like Qwen and Llama, trained using data generated from the full R1 model. The smallest version can run on a laptop, while the full model requires far more substantial computing resources.

The releases immediately caught the attention of the AI community because most existing open-weights models -- which can often be run and fine-tuned on local hardware -- have lagged behind proprietary models like OpenAI's o1 in so-called reasoning benchmarks. Having these capabilities available in an MIT-licensed model that anyone can study, modify, or use commercially potentially marks a shift in what's possible with publicly available AI models. "They are SO much fun to run, watching them think is hilarious," independent AI researcher Simon Willison told Ars in a text message. Willison tested one of the smaller models and described his experience in a post on his blog: "Each response starts with a ... pseudo-XML tag containing the chain of thought used to help generate the response," noting that even for simple prompts, the model produces extensive internal reasoning before output.
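Willison's description suggests a simple way to separate the chain-of-thought block from the model's final answer. A minimal sketch (the `<think>` tag name matches what R1 emits in Willison's examples; the sample response text is invented for illustration):

```python
import re

def split_reasoning(response: str) -> tuple[str, str]:
    """Split an R1-style response into (chain_of_thought, final_answer).

    Assumes the model wraps its reasoning in a <think>...</think>
    pseudo-XML block, as described in Simon Willison's write-up.
    """
    match = re.search(r"<think>(.*?)</think>", response, re.DOTALL)
    if not match:
        # No reasoning block found: treat the whole response as the answer.
        return "", response.strip()
    thought = match.group(1).strip()
    answer = response[match.end():].strip()
    return thought, answer

# Hypothetical response text for illustration:
sample = "<think>The user asked for 2+2. That is 4.</think>\n4"
thought, answer = split_reasoning(sample)
print(thought)  # → The user asked for 2+2. That is 4.
print(answer)   # → 4
```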
Although the benchmarks have yet to be independently verified, DeepSeek reports that R1 outperformed OpenAI's o1 on AIME (a mathematical reasoning test), MATH-500 (a collection of word problems), and SWE-bench Verified (a programming assessment tool).

TechCrunch notes that three Chinese labs -- DeepSeek, Alibaba, and Moonshot AI's Kimi -- have released models that match o1's capabilities.
Comments Filter:
  • by gweihir ( 88907 ) on Tuesday January 21, 2025 @05:44PM (#65107415)

    The errors and inaccuracies accumulate. In real reasoning, they do not.

    But anything to keep the AI hype going. There still is no real application for the artificial morons that would begin to justify the effort to train and run them.

    • Source ?

      One new aspect to these reasoning models is that they can backtrack/self-correct ... no guarantee that the final chain-of-thought/reasoning is valid of course, but it's a step in the right direction.

      • by dfghjk ( 711126 )

        "...but it's a step in the right direction."

        Source?

        • The source is the better results in benchmarks
        • You're asking for a source that being able to backtrack when reasoning is beneficial ?!

          Do you normally never make mistakes? Never try to figure something out and say "no, that can't be right, so what if ..."?

          Reasoning is essentially SEARCH - trying to chain a bunch of steps together to figure something out. More often than not you'll not get it right the first time, so will have to backtrack a step or two and try something else.
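          The backtracking idea described above can be sketched as an ordinary depth-first search: commit to a step, and when the partial chain dead-ends, undo it and try another. A toy subset-sum example of the pattern (not how any LLM actually implements it):

```python
def find_chain(steps, target):
    """Depth-first search for a subset of positive `steps` summing to `target`.

    Illustrates backtracking: try a step, recurse, and if the branch
    dead-ends ("no, that can't be right"), undo it and try the next.
    """
    chain = []

    def dfs(i, remaining):
        if remaining == 0:
            return True
        if i == len(steps) or remaining < 0:
            return False                 # dead end: trigger backtracking
        chain.append(steps[i])           # tentatively take this step
        if dfs(i + 1, remaining - steps[i]):
            return True
        chain.pop()                      # backtrack: undo the step
        return dfs(i + 1, remaining)     # try again without it

    return chain if dfs(0, target) else None

print(find_chain([5, 3, 8, 2], 10))  # → [5, 3, 2]
```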

      • by taustin ( 171655 )

        Self correcting machines go back about a century. Mechanical ones. (And those, by and large, actually work.)

        • by gweihir ( 88907 )

          (And those, by and large, actually work.)

          That is because they were made by smart people that understood what they were doing. Not by throwing shiploads of stolen trash into a mystery box that is supposed to be magic.

          • My brain is a mystery box and random stuff, mostly created by others, streams into it via my nerves on a daily basis and has done for decades. It seems to be effective, at least to me.

            • by shanen ( 462549 )

              Sounds like confusion about "antifragile". The book by Nassim Nicholas Taleb has been interesting so far.

              (However his tone is so negative and critical about everything that it's often hard not to take it personally. Don't matter which ox you like. Every ox is gonna get gored?)

    • So it works like most humans reason.....
      • by gweihir ( 88907 )

        Yep. Which is not reason. Only about 10-15% of all humans can actually fact check and think independently.

        • Yep. Which is not reason. Only about 10-15% of all humans can actually fact check and think independently.

          Cite your source.

    • by dfghjk ( 711126 ) on Tuesday January 21, 2025 @06:24PM (#65107539)

      I enjoy how OpenAI's model is called a reasoning model, while the Chinese model has "reasoning" in quotes. I guess we can admit it's a lie as long as the Chinese are doing it.

      • by gweihir ( 88907 ) on Tuesday January 21, 2025 @06:46PM (#65107619)

        I enjoy how OpenAI's model is called a reasoning model, while the Chinese model has "reasoning" in quotes. I guess we can admit it's a lie as long as the Chinese are doing it.

        The whole thing is pure desperation. They are trying to do the same thing as automated deduction does (but fails to get to any real depth because of state-space explosion) but with a depth-first approach and probabilistic steps. That is hilariously wrong to anybody with some actual background in automated deduction.

        Well, I guess it will keep the stupid money flowing for a few weeks more.

    • by taustin ( 171655 )

      Simulated reasoning is not reasoning

      So it's exactly like all the rest?

      • by gweihir ( 88907 )

        About 10-15% of all humans can actually reason competently. The rest cannot. So not "all" the rest.

    • The errors and inaccuracies accumulate. In real reasoning, they do not.

      So for example if humans possessed "real" reasoning, we'd be able to write software that was bug-free. Regardless of length.

      • by znrt ( 2424692 )

        not really, but we could eventually come up with insightful and funny logical conclusions like you just did. well played, sir.

        the funny thing is that this isn't about godly perfect automated reasoning, which is an ideal at best, but about the fact that one particular automaton just matched and even outperformed another on whatever you want to call what their benchmarks are tuned to measure. not really shocking news if it weren't for the 2 irresistible carrots to choke on, AI hype and china, so the usual mass

      • by gweihir ( 88907 )

        Nope. But smart humans (a minority) have a pretty good idea when it probably stops being correct.

    • by ceoyoyo ( 59147 )

      The errors and inaccuracies accumulate. In real reasoning, they do not.

      Lol. You have clearly never marked a student's work. Or read a Slashdot post longer than a one sentence quip.

      • by gweihir ( 88907 )

        Most people cannot do real reasoning either. I am well aware of that.

        • by ceoyoyo ( 59147 )

          Yes, you seem to have some close experience.

          I assume by "errors do not accumulate" you actually mean that in a formal reasoning system as soon as you make an error all subsequent deductions are invalid. That's kind of a useless definition of "errors do not accumulate" but whatever. If we go with your quirky definition I would be very curious to meet these "not most people" who do not make any errors at all.

    • by CAIMLAS ( 41445 )

      "The errors and inaccuracies accumulate. In real reasoning, they do not."

      That sounds like a tautological statement with multiple ways to disprove it.

      • by gweihir ( 88907 )

        It is not. Some insight required. The way this goes is that a smart (!) person knows (approximately) when they stop reasoning and start speculating in a chain of steps. This machine does not. Also note that most people are not smart and cannot tell the difference between reasoning and speculation and, often, wishful thinking.

        • It is not. Some insight required. The way this goes is that a smart (!) person knows (approximately) when they stop reasoning and start speculating in a chain of steps. This machine does not. Also note that most people are not smart and cannot tell the difference between reasoning and speculation and, often, wishful thinking.

          Can you tell the difference?

  • ..are smart and talented
    We need to stop the silly trade war and increase cooperation
    Of course, the chances of this happening are zero under the new administration

    • The chances were zero under the previous administration as well. We have a ruling elite whose international status is threatened by China. They aren't looking for ways to cooperate. They are trying to defend their position by undermining China's development.
  • by rwrife ( 712064 ) on Tuesday January 21, 2025 @07:43PM (#65107803) Homepage
    I bet if you ask it to give a count of the number of people killed in Tiananmen Square it'll suddenly not be so good at math.
    • by Tablizer ( 95088 )

      "Zero, it's fake news. It's not even square, it's ovoid, you blind American pig!"

    • I bet if you ask it to give a count of the number of people killed in Tiananmen Square it'll suddenly not be so good at math.

      This is from the R1 Qwen32 version..
      Prompt: How many people were killed in Tiananmen Square?

      <think>
      Okay, so I need to figure out how many people were killed in Tiananmen Square. I remember that it was a significant event in Chinese history, but I'm not exactly sure about the details. Let me try to break this down.

      First, I think Tiananmen Square refers to the protests that happened in Beijing in 1989. I've heard it called the Tiananmen Square protests or the Tiananmen Square massacre. It invol

    • I bet if you ask it to give a count of the number of people killed in Tiananmen Square it'll suddenly not be so good at math.

      The answer is zero [wikileaks.org] and your brain has been trained with biased narratives [wikipedia.org] (*) over the years.

      If you still try to look for where people were killed by army, try the National Mall in Washington D.C. [wikipedia.org].

      (*) To save you from reading and thinking:

      The lead tank halted to avoid running him over, the man then climbed on top of the tank. The PLA soldiers operating the tank then opened a hatch used for entering and exiting the tank, and briefly talked to the man. ... the video footage shows two figures in blue running over to pull the man away and lead him to a nearby crowd; the tanks then continued on their way.

      What do you see in this photo? An army that were acting professionally, gracefully, and humanly, unlike this other army [wikipedia.org]. Yet your propaganda keeps telling you this is example of brutality. They also try to cover up their false narratives by claiming the massacre was happen

      • claiming the massacre was happening outside the Square without any actual evidences

        For what it is worth, even the Chinese government admits at least a couple hundred people were killed that night, in the vicinity of the square but not in it.

        For those interested, here are a couple more links about the misinformation regarding killing of students in Tiananmen Square:

        http://news.bbc.co.uk/2/hi/asi... [bbc.co.uk]
        https://www.cjr.org/behind_the... [cjr.org]
        https://www.dw.com/en/fact-che... [dw.com]

        This situation makes me wonder, what else do I take for granted really happened actually happened in a very different w

      • I bet if you ask it to give a count of the number of people killed in Tiananmen Square it'll suddenly not be so good at math.

        The answer is zero [wikileaks.org]

        Poor reading comprehension on your side?

        From the linked page of Wikileaks: GALLO SAW MANY CASUALTIES BROUGHT INTO THE SQUARE AND DID NOT DOUBT THAT HUNDREDS OF PEOPLE IN BEIJING WERE KILLED BY THE ARMY ON JUNE 3 AND 4.

        There you have it, troll.

        • Get your comprehension skills improved. He said BROUGHT INTO while the GP was asking the number killed IN the Square, and I also acknowledged casualties outside of the Square. However, there is no real evidence on how those casualties occurred; maybe those were attacking the army first -- try to tell black people in the US waving an object in their hands when stopped by police.

  • The whackos are talking about bombing TSMC if China moves on Taiwan so "they won't get the chips".

    That China doesn't need the TSMC chips is great for overall peace for the region.

    • It is also true that most of the ultra-high end equipment, particularly the lithography machines from ASML, A) require constant support and maintenance to run, which ASML can just shut off, and B) likely have software built in that ASML can brick the machines making the fabs useless, and C) require very highly trained technical expertise to make them effective, which have likely begun transitioning to Arizona via H1-Bs.

      Bombing the fabs would just be to rub salt in the wound, but the fabs can be neutrali

      • by djinn6 ( 1868030 )

        Good thing we're trying to take Greenland then. That'll teach the Danes not to cooperate with China. /s

        Well, actually, ASML is a Dutch company (Netherlands), not Danish. I can't say whether ASML would give a kill switch to a foreign government, but I suppose it's not impossible, though Denmark would still be a strange choice when the US is right there.

        • You know I do that *every* time? You're 100% right they are a Dutch, not Danish company. I have several Danish friends and this insults them every single time.

          ASML is cooperating with US sanctions on semiconductor equipment to China, so they would likely continue to do so. Bottom line is the market for high end processors is data centers now and that's the US by a long margin; the *Dutch* are not going to hurt the biggest market for what is ultimately the end product produced by their machines. I be

  • For democratizing AI with an opensource release. Ironically OpenAI has been closed since ChatGPT3 - it was 'too dangerous to release', maybe they meant dangerous to profits.
