

Cutting-Edge Chinese 'Reasoning' Model Rivals OpenAI o1
An anonymous reader quotes a report from Ars Technica: On Monday, Chinese AI lab DeepSeek released its new R1 model family under an open MIT license, with its largest version containing 671 billion parameters. The company claims the model performs at levels comparable to OpenAI's o1 simulated reasoning (SR) model on several math and coding benchmarks. Alongside the release of the main DeepSeek-R1-Zero and DeepSeek-R1 models, DeepSeek published six smaller "DeepSeek-R1-Distill" versions ranging from 1.5 billion to 70 billion parameters. These distilled models are based on existing open source architectures like Qwen and Llama, trained using data generated from the full R1 model. The smallest version can run on a laptop, while the full model requires far more substantial computing resources.
The releases immediately caught the attention of the AI community because most existing open-weights models -- which can often be run and fine-tuned on local hardware -- have lagged behind proprietary models like OpenAI's o1 in so-called reasoning benchmarks. Having these capabilities available in an MIT-licensed model that anyone can study, modify, or use commercially potentially marks a shift in what's possible with publicly available AI models. "They are SO much fun to run, watching them think is hilarious," independent AI researcher Simon Willison told Ars in a text message. Willison tested one of the smaller models and described his experience in a post on his blog: "Each response starts with a ... pseudo-XML tag containing the chain of thought used to help generate the response," noting that even for simple prompts, the model produces extensive internal reasoning before output. Although the benchmarks have yet to be independently verified, DeepSeek reports that R1 outperformed OpenAI's o1 on AIME (a mathematical reasoning test), MATH-500 (a collection of word problems), and SWE-bench Verified (a programming assessment tool).
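Willison's observation suggests a simple way to separate the chain of thought from the final answer when running one of the distilled models locally. The sketch below is a guess at what that looks like, not an official API: the tag name "think" is a hypothetical stand-in, since the quote above elides the actual tag name.

```python
import re

def split_reasoning(response: str, tag: str = "think"):
    """Split a model response into (chain_of_thought, final_answer),
    assuming the response opens with a <tag>...</tag> block as the
    summary describes. Returns ("", response) if no such tag is found."""
    m = re.match(rf"\s*<{tag}>(.*?)</{tag}>\s*(.*)", response, re.DOTALL)
    if not m:
        return "", response
    return m.group(1).strip(), m.group(2).strip()

# Toy response in the assumed format:
raw = "<think>The user asked 2+2. That is 4.</think>The answer is 4."
thought, answer = split_reasoning(raw)
print(answer)  # The answer is 4.
```

If a local runner streams the reasoning and the answer as one blob, splitting on the closing tag like this lets you log or hide the verbose internal reasoning Willison describes.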
TechCrunch notes that three Chinese labs -- DeepSeek, Alibaba, and Moonshot AI's Kimi -- have released models that match o1's capabilities.
Simulated reasoning is not reasoning (Score:3, Insightful)
The errors and inaccuracies accumulate. In real reasoning, they do not.
But anything to keep the AI hype going. There still is no real application for the artificial morons that would begin to justify the effort to train and run them.
Re: (Score:2)
Source ?
One new aspect to these reasoning models is that they can backtrack/self-correct ... no guarantee that the final chain-of-thought/reasoning is valid of course, but it's a step in the right direction.
Re: (Score:2)
"...but it's a step in the right direction."
Source?
Re: (Score:1)
Re: (Score:2)
You're asking for a source for the claim that being able to backtrack when reasoning is beneficial?!
Do you normally never make mistakes? Never try to figure something out and say "no, that can't be right, so what if ..."?
Reasoning is essentially SEARCH - trying to chain a bunch of steps together to figure something out. More often than not you won't get it right the first time, so you'll need to backtrack a step or two and try something else.
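The search-with-backtracking idea this comment describes can be sketched in a few lines of Python. This is a generic depth-first solver on a toy problem, not anything DeepSeek-specific:

```python
def solve(state, is_goal, candidates):
    """Depth-first search: extend the current chain of steps; on a
    dead end, return None so the caller backtracks and tries the
    next candidate step."""
    if is_goal(state):
        return state
    for step in candidates(state):
        result = solve(state + [step], is_goal, candidates)
        if result is not None:
            return result
    return None  # dead end: backtrack

# Toy use: find three distinct digits that sum to 15. Many early
# branches (e.g. starting [0, 1, ...]) fail and get backtracked.
result = solve(
    [],
    is_goal=lambda s: len(s) == 3 and sum(s) == 15,
    candidates=lambda s: [d for d in range(10) if d not in s] if len(s) < 3 else [],
)
print(result)
```

The "reasoning" models discussed here do something loosely analogous in natural language inside their chain-of-thought: propose a step, notice it leads nowhere, and try an alternative.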
Re: (Score:2)
Self correcting machines go back about a century. Mechanical ones. (And those, by and large, actually work.)
Re: (Score:2)
(And those, by and large, actually work.)
That is because they were made by smart people that understood what they were doing. Not by throwing shiploads of stolen trash into a mystery box that is supposed to be magic.
Re: (Score:2)
My brain is a mystery box and random stuff, mostly created by others, streams into it via my nerves on a daily basis and has done for decades. It seems to be effective, at least to me.
Re: (Score:2)
Sounds like confusion about "antifragile". The book by Nassim Nicholas Taleb has been interesting so far.
(However, his tone is so negative and critical about everything that it's often hard not to take it personally. It doesn't matter whose ox you like; every ox is gonna get gored.)
Re: Simulated reasoning is not reasoning (Score:2)
Re: (Score:3)
Yep. Which is not reason. Only about 10-15% of all humans can actually fact check and think independently.
Re: (Score:2)
Re: (Score:2)
Yep. Which is not reason. Only about 10-15% of all humans can actually fact check and think independently.
Cite your source.
Re:Simulated reasoning is not reasoning (Score:4, Insightful)
I enjoy how OpenAI's model is called a reasoning model, while the Chinese model has "reasoning" in quotes. I guess we can admit it's a lie as long as the Chinese are doing it.
The whole thing is pure desperation. They are trying to do the same thing as automated deduction does (but fails to get to any real depth because of state-space explosion) but with a depth-first approach and probabilistic steps. That is hilariously wrong to anybody with some actual background in automated deduction.
Well, I guess it will keep the stupid money flowing for a few weeks more.
Re: (Score:2)
Simulated reasoning is not reasoning
So it's exactly like all the rest?
Re: (Score:2)
About 10-15% of all humans can actually reason competently. The rest cannot. So not "all" the rest.
Re: (Score:3)
So for example if humans possessed "real" reasoning, we'd be able to write software that was bug-free. Regardless of length.
Re: (Score:2)
not really, but we could eventually come up with insightful and funny logical conclusions like you just did. well played, sir.
the funny thing is that this isn't about godly perfect automated reasoning, which is an ideal at best, but about the fact that one particular automaton just matched and even outperformed another on whatever you want to call what their benchmarks are tuned to measure. not really shocking news if it weren't for the 2 irresistible carrots to choke on, AI hype and china, so the usual mass
Re: (Score:2)
Nope. But smart humans (a minority) have a pretty good idea when it probably stops being correct.
Re: (Score:2)
https://www.economist.com/scie... [economist.com]
Re: (Score:3)
Lol. You have clearly never marked a student's work. Or read a Slashdot post longer than a one sentence quip.
Re: (Score:2)
Most people cannot do real reasoning either. I am well aware of that.
Re: (Score:2)
Yes, you seem to have some close experience.
I assume by "errors do not accumulate" you actually mean that in a formal reasoning system as soon as you make an error all subsequent deductions are invalid. That's kind of a useless definition of "errors do not accumulate" but whatever. If we go with your quirky definition I would be very curious to meet these "not most people" who do not make any errors at all.
Re: (Score:2)
"The errors and inaccuracies accumulate. In real reasoning, they do not."
That sounds like a tautological statement with multiple ways to disprove it.
Re: (Score:2)
It is not. Some insight required. The way this goes is that a smart (!) person knows (approximately) when they stop reasoning and start speculating in a chain of steps. This machine does not. Also note that most people are not smart and cannot tell the difference between reasoning and speculation and, often, wishful thinking.
Re: (Score:2)
It is not. Some insight required. The way this goes is that a smart (!) person knows (approximately) when they stop reasoning and start speculating in a chain of steps. This machine does not. Also note that most people are not smart and cannot tell the difference between reasoning and speculation and, often, wishful thinking.
Can you tell the difference?
Re: (Score:2)
Way to demonstrate lack of personal maturity! Great job! Like a fucking dumb kid ...
Re: (Score:2)
Chinese scientists and engineers.. (Score:2)
..are smart and talented
We need to stop the silly trade war and increase cooperation
Of course, the chances of this happening are zero under the new administration
Re: (Score:2)
Ask it about Tiananmen Square (Score:5, Insightful)
Re: (Score:1)
"Zero, it's fake news. It's not even square, it's ovoid, you blind American pig!"
Re: (Score:2)
I bet if you ask it to give a count of the number of people killed in Tiananmen Square it'll suddenly not be so good at math.
This is from the R1 Qwen32 version..
Prompt: How many people were killed in Tiananmen Square?
Re: (Score:2)
Seems like totally sound reasoning to me.
Re: (Score:2)
I bet if you ask it to give a count of the number of people killed in Tiananmen Square it'll suddenly not be so good at math.
The answer is zero [wikileaks.org] and your brain has been trained with biased narratives [wikipedia.org] (*) over the years.
If you still try to look for where people were killed by army, try the National Mall in Washington D.C. [wikipedia.org].
(*) To save you from reading and thinking:
The lead tank halted to avoid running him over, the man then climbed on top of the tank. The PLA soldiers operating the tank then opened a hatch used for entering and exiting the tank, and briefly talked to the man. ... the video footage shows two figures in blue running over to pull the man away and lead him to a nearby crowd; the tanks then continued on their way.
What do you see in this photo? An army that was acting professionally, gracefully, and humanely, unlike this other army [wikipedia.org]. Yet your propaganda keeps telling you this is an example of brutality. They also try to cover up their false narratives by claiming the massacre was happen
Re: (Score:2)
claiming the massacre was happening outside the Square without any actual evidences
For what it is worth, even the Chinese government admits at least a couple hundred people were killed that night, in the vicinity of the square but not in it.
For those interested, here are a couple more links about the misinformation regarding killing of students in Tiananmen Square:
http://news.bbc.co.uk/2/hi/asi... [bbc.co.uk]
https://www.cjr.org/behind_the... [cjr.org]
https://www.dw.com/en/fact-che... [dw.com]
This situation makes me wonder, what else do I take for granted really happened actually happened in a very different w
Re: (Score:2)
I bet if you ask it to give a count of the number of people killed in Tiananmen Square it'll suddenly not be so good at math.
The answer is zero [wikileaks.org]
Poor reading comprehension on your side?
From the linked page of Wikileaks: GALLO SAW MANY CASUALTIES BROUGHT INTO THE SQUARE AND DID NOT DOUBT THAT HUNDREDS OF PEOPLE IN BEIJING WERE KILLED BY THE ARMY ON JUNE 3 AND 4.
There you have it, troll.
Re: (Score:2)
Get your comprehension skills improved. He said BROUGHT INTO, while the GP was asking for the number killed IN the Square, and I also acknowledged casualties outside of the Square. However, there is no real evidence on how those casualties occurred; maybe they were attacking the army first -- try telling black people in the US to wave an object in their hands when stopped by police.
Good for Taiwan (Score:2)
The whackos are talking about bombing TSMC if China moves on Taiwan so "they won't get the chips".
That China doesn't need the TSMC chips is great for overall peace for the region.
Re: (Score:2)
Bombing the fabs would just be to rub salt in the wound, but the fabs can be neutrali
Re: (Score:2)
Good thing we're trying to take Greenland then. That'll teach the Danes not to cooperate with China. /s
Well, actually, ASML is a Dutch company (Netherlands), not Danish. I can't say whether ASML would give a kill switch to a foreign government, but I suppose it's not impossible, though Denmark would still be a strange choice when the US is right there.
Re: (Score:2)
ASML is cooperating with US sanctions on semiconductor equipment to China, so they would likely continue to do so. Bottom line is the market for high end processors is data centers now and that's the US by a long margin; the *Dutch* are not going to hurt the biggest market for what is ultimately the end product produced by their machines. I be
Thanks DeepSeek! (Score:2)