Researchers Created an Open Rival To OpenAI's o1 'Reasoning' Model for Under $50

AI researchers at Stanford and the University of Washington were able to train an AI "reasoning" model for under $50 in cloud compute credits, according to a research paper. From a report: The model, known as s1, performs similarly to cutting-edge reasoning models, such as OpenAI's o1 and DeepSeek's R1, on tests measuring math and coding abilities. The s1 model is available on GitHub, along with the data and code used to train it.

The team behind s1 said they started with an off-the-shelf base model, then fine-tuned it through distillation, a process to extract the "reasoning" capabilities from another AI model by training on its answers. The researchers said s1 is distilled from one of Google's reasoning models, Gemini 2.0 Flash Thinking Experimental. Distillation is the same approach Berkeley researchers used to create an AI reasoning model for around $450 last month.
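In code terms, distillation of this kind is just supervised fine-tuning on the teacher's answers. A minimal sketch, assuming a Hugging Face causal LM; the base-model name and the single training pair are illustrative placeholders, not the actual s1 recipe:

```python
# Minimal distillation-by-fine-tuning sketch (illustrative, not the s1 code).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

base = "some-org/small-base-model"  # placeholder: any off-the-shelf base model
tok = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

# The training data is simply (question, teacher's reasoning trace) pairs,
# i.e. answers collected from a stronger "teacher" model.
pairs = [("What is 12 * 13?",
          "12 * 13 = 12 * 10 + 12 * 3 = 120 + 36 = 156. The answer is 156.")]

opt = torch.optim.AdamW(model.parameters(), lr=1e-5)
model.train()
for question, teacher_trace in pairs:
    batch = tok(question + "\n" + teacher_trace, return_tensors="pt")
    # Ordinary next-token loss on the teacher's text: the student learns to
    # reproduce the teacher's reasoning, token by token.
    loss = model(**batch, labels=batch["input_ids"]).loss
    loss.backward()
    opt.step()
    opt.zero_grad()
```

With only a small curated set of teacher answers, a loop like this needs very little GPU time, which is where a sub-$50 cloud bill becomes plausible.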
  • deepseek? s1?

    Sooner or later someone will improve the efficiency of the existing models, and the cost to train will come down.
    Then the current status quo will crumble.

    Did it happen already?
  • by DrMrLordX ( 559371 ) on Thursday February 06, 2025 @10:59AM (#65146817)

    What could possibly go wrong?

    • I think the raw training from scratch is like deciphering an alien language, while distillation is more like being in English class.
    • I can see some AI researcher doing a Simpsons "Dr. Frink"...

      "I forgot to carry the one..."

      https://www.youtube.com/watch?... [youtube.com]

      JoshK.

    • by dvice ( 6309704 )

      It is an old and well-proven method for training AI. To train an AI you need a scoring system, so if you want to train an AI to draw pictures you:
      1. Train an AI that will score a picture based purely on how similar it is to the target image. This is relatively simple: just give it a bunch of random images, rewarding it when it gives them low scores, and a bunch of target images, rewarding it when it gives those high scores.
      2. Now you have a scoring AI, so you start training the actual AI. You simply give it the score from the first AI as its reward (see the sketch below).
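A rough code sketch of the two-stage scheme the comment above describes (it is essentially a GAN-style setup); all shapes and data here are made up for illustration:

```python
# Step 1: train a scoring AI; step 2: train the drawing AI against its score.
import torch
import torch.nn as nn

scorer = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 1))    # the scoring AI
generator = nn.Sequential(nn.Linear(64, 28 * 28), nn.Tanh())   # the drawing AI
bce = nn.BCEWithLogitsLoss()

# Step 1: reward high scores on target images, low scores on random images.
target_images = torch.rand(32, 1, 28, 28)  # stand-ins for the real targets
random_images = torch.rand(32, 1, 28, 28)
s_opt = torch.optim.Adam(scorer.parameters(), lr=1e-3)
s_loss = (bce(scorer(target_images), torch.ones(32, 1)) +
          bce(scorer(random_images), torch.zeros(32, 1)))
s_loss.backward()
s_opt.step()
s_opt.zero_grad()

# Step 2: freeze the scorer and train the drawing AI to maximize the score
# the first AI hands out (gradients still flow *through* the frozen scorer).
for p in scorer.parameters():
    p.requires_grad_(False)
g_opt = torch.optim.Adam(generator.parameters(), lr=1e-3)
drawings = generator(torch.randn(32, 64)).view(32, 1, 28, 28)
g_loss = bce(scorer(drawings), torch.ones(32, 1))  # "please the scorer"
g_loss.backward()
g_opt.step()
g_opt.zero_grad()
```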

    • Wait another 5 minutes, gweihir will be on here explaining everything that's wrong with AI.

  • Distillation is great! DeepSeek used it. Stanford used it. It saves lots of time and money. Why spend the time and money to train your own unique model when you can mostly copy someone else's work?

    Of course, this distillation trend misses the truly big thing. Direct copying of someone else's model requires even less time and money! This will be the next great innovation.

  • by TJHook3r ( 4699685 ) on Thursday February 06, 2025 @11:18AM (#65146859)
    So 2025 is going to be the year of ridiculous cost claims?
  • This is reminiscent of the processor wars of the 1990s and 2000s. All the then-big names were vying for the best processor in terms of MIPS.

    Now the shift is toward creating models that require more powerful processors, or GPUs. Progress!

    JoshK.

  • by Pinky's Brain ( 1158667 ) on Thursday February 06, 2025 @12:22PM (#65147089)

    Though some people might have said it would be stupid to try to distill reasoning ability from the "thought" output, it's clearly extremely effective. The researchers even distilled on the pure text; this is not even logit distillation (you can get the top-5 logits from Google, but at only one request per day, though for 1,000 questions scraping with multiple accounts would have been an option).

    OpenAI likely saw it coming, hence their refusal to expose the thoughts for o1.
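For readers unfamiliar with the distinction drawn above, here is a toy side-by-side of the two losses, with made-up tensors and no real model behind them:

```python
# Text distillation vs. logit distillation, in loss terms (toy example).
import torch
import torch.nn.functional as F

vocab = 32000
student_logits = torch.randn(1, vocab)  # student's prediction for one position

# Text distillation (what s1 did): the teacher's sampled token is the label.
teacher_token = torch.tensor([421])  # arbitrary token id
text_loss = F.cross_entropy(student_logits, teacher_token)

# Logit distillation: match the teacher's top-5 distribution instead.
teacher_top5_logits = torch.tensor([[5.1, 3.2, 2.9, 1.0, 0.4]])
teacher_top5_ids = torch.tensor([[421, 17, 998, 5, 42]])
student_top5 = student_logits.gather(1, teacher_top5_ids)
logit_loss = F.kl_div(F.log_softmax(student_top5, dim=-1),
                      F.softmax(teacher_top5_logits, dim=-1),
                      reduction="batchmean")
```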

    • ClosedAI's business model is basically "This is too expensive for you to run on your own, buy a subscription from us instead."
      As you said, they probably saw it coming and even explored it in-house. It's just something that would cut against their $500B grift, and they sure as hell didn't want it out.

    • Yeah - I wouldn't really call this distillation; it's just using one model to generate training data for another - synthetic data.

      It seems the use of RL for training reasoning models is mostly (what else?) acting as a data multiplier: taking a small number of reasoning samples and training a model capable of generating more. A bit like RLHF using human data to train a reward model.
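Under that reading, the pipeline reduces to: generate traces with one model, keep the ones that check out, and fine-tune another model on them. A sketch under that assumption; generate_trace and check_answer are hypothetical helpers, not anything from the paper:

```python
# Hypothetical synthetic-data pipeline: a teacher model multiplies a small
# set of problems into a supervised training corpus for the student.
def build_synthetic_dataset(questions, generate_trace, check_answer):
    dataset = []
    for q in questions:
        trace = generate_trace(q)        # teacher "thinks out loud"
        if check_answer(q, trace):       # keep only verified traces
            dataset.append({"prompt": q, "completion": trace})
    return dataset  # feed this to ordinary supervised fine-tuning
```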

    • by ceoyoyo ( 59147 )

      Some people say lots of dumb things. The idea behind distillation is that unsupervised learning is very hard while supervised is much easier. It would be surprising if learning a chain of reasoning process, which is not only unsupervised but usually doesn't have a good proximal cost measure, wouldn't benefit.

      It's like learning to solve math problems by blindly manipulating symbols you don't understand versus somebody showing you step by step what to do.

      • If it was just blindly manipulating symbols it didn't understand before, how come it said everything in such good grammar?

        • by ceoyoyo ( 59147 )

          I didn't say it was blindly manipulating symbols it didn't understand. I said learning by blindly manipulating symbols you don't understand.

          You can absolutely learn to understand what some random mathematician means by the | symbol given enough context, but it's much easier if they define it. You can learn a whole language that way too (all of us did).

  • Darn bubbles anyway (Score:4, Interesting)

    by Ol Olsoc ( 1175323 ) on Thursday February 06, 2025 @12:24PM (#65147093)
    There is really nothing all that special about AI; it's just the latest bubble. So trillions of dollars will evaporate overnight as the costs drop, and the funny-money people will lose their asses.
  • I saw in this tweet [x.com] that it can be done for $3 already.

  • It sounds like the "distillation" process is asking another model for answers to benchmark questions, to look good on benchmarks.

    How would that be any actual reasoning in the new model?

    • by ceoyoyo ( 59147 )

      1. You could learn how to differentiate equations by randomly guessing answers, checking to see if they're right, and trying to recognize patterns in your correct answers.
      2. You could also learn by looking at a shitload of problems and answers and trying to figure out how to do it yourself.
      3. Or a teacher could show you step by step the techniques involved in solving them.

      The difficulty of those methods decreases a lot between 1 and 3. And if the problem requires "show your work", i.e. reasoning, even more so.

  • Better ban it ASAP

"In the face of entropy and nothingness, you kind of have to pretend it's not there if you want to keep writing good code." -- Karl Lehenbauer

Working...