
Inception Emerges From Stealth With a New Type of AI Model
Inception, a Palo Alto-based AI company founded by Stanford professor Stefano Ermon, claims to have developed a novel diffusion-based large language model (DLM) that significantly outperforms traditional LLMs in speed and efficiency. "Inception's model offers the capabilities of traditional LLMs, including code generation and question-answering, but with significantly faster performance and reduced computing costs, according to the company," reports TechCrunch. From the report: Ermon hypothesized generating and modifying large blocks of text in parallel was possible with diffusion models. After years of trying, Ermon and a student of his achieved a major breakthrough, which they detailed in a research paper published last year. Recognizing the advancement's potential, Ermon founded Inception last summer, tapping two former students, UCLA professor Aditya Grover and Cornell professor Volodymyr Kuleshov, to co-lead the company. [...]
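For readers unfamiliar with the idea, the toy sketch below contrasts what "generating and modifying large blocks of text in parallel" means versus ordinary LLM decoding: an autoregressive model emits one token per forward pass, left to right, while a masked-diffusion-style decoder refines an entire block over a fixed number of denoising steps. The `model` callable, the greedy argmax updates, and the step count are illustrative assumptions, not Inception's actual algorithm.

    # Toy sketch, not Inception's method. `model` is a hypothetical callable that
    # returns per-position token logits of shape (batch, seq_len, vocab).
    import torch

    def autoregressive_decode(model, prompt_ids, new_tokens):
        # Standard LLM decoding: one forward pass per generated token, left to right.
        ids = prompt_ids.clone()
        for _ in range(new_tokens):
            logits = model(ids)
            next_id = logits[:, -1].argmax(dim=-1, keepdim=True)
            ids = torch.cat([ids, next_id], dim=1)
        return ids

    def diffusion_style_decode(model, prompt_ids, new_tokens, steps=8, mask_id=0):
        # Start from an all-masked block and refine every position in parallel,
        # using a fixed number of passes regardless of block length.
        block = torch.full((prompt_ids.size(0), new_tokens), mask_id, dtype=prompt_ids.dtype)
        for _ in range(steps):
            logits = model(torch.cat([prompt_ids, block], dim=1))
            block = logits[:, -new_tokens:].argmax(dim=-1)
        return torch.cat([prompt_ids, block], dim=1)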
"What we found is that our models can leverage the GPUs much more efficiently," Ermon said, referring to the computer chips commonly used to run models in production. "I think this is a big deal. This is going to change the way people build language models." Inception offers an API as well as on-premises and edge device deployment options, support for model fine-tuning, and a suite of out-of-the-box DLMs for various use cases. The company claims its DLMs can run up to 10x faster than traditional LLMs while costing 10x less. "Our 'small' coding model is as good as [OpenAI's] GPT-4o mini while more than 10 times as fast," a company spokesperson told TechCrunch. "Our 'mini' model outperforms small open-source models like [Meta's] Llama 3.1 8B and achieves more than 1,000 tokens per second."
"What we found is that our models can leverage the GPUs much more efficiently," Ermon said, referring to the computer chips commonly used to run models in production. "I think this is a big deal. This is going to change the way people build language models." Inception offers an API as well as on-premises and edge device deployment options, support for model fine-tuning, and a suite of out-of-the-box DLMs for various use cases. The company claims its DLMs can run up to 10x faster than traditional LLMs while costing 10x less. "Our 'small' coding model is as good as [OpenAI's] GPT-4o mini while more than 10 times as fast," a company spokesperson told TechCrunch. "Our 'mini' model outperforms small open-source models like [Meta's] Llama 3.1 8B and achieves more than 1,000 tokens per second."
10x less?? (Score:4, Insightful)
"while costing 10x less"
Was it an AI who wrote that? Stuff doesn't cost "10x less"; it costs "90% less", or "one tenth" of something else. Duh!
Race to the commodity bottom (Score:2)
AI appears to be in the hype phase where its only path to profit is just being cheaper.
Reducing costs perpetually, whatever some S&P 500 executive teams believe, is not a sustainable long-term business plan. The same holds for AI.
Predict: a standardized test suite of 1,000,000 prompts, with per-model rankings for accuracy, memory/CPU usage, and electricity usage, in the near future.
Predict 2: Large Q&A sites for technical things will start to go out of business in 2026. With result that future technologies
Bah Humbug! (Score:3, Funny)
Re: (Score:2, Interesting)
Well, sure. Because so far this AI bubble is mostly unreliable hype.
Image generation models are impressive because there's no "right" and "wrong". There's just "close enough" or "not close enough". But LLMs are exactly that: language models. They're impressive language-parsing tools, but they're often applied to tasks that actually require precision, which is not what they're designed for.
The important next step - I think - is some kind of LFM: Large Fact Model. If we could tokenize facts and truths, and use LLMs to interface with those LFMs, that's when this stuff will become reliable.
Re: (Score:2)
Was that dress blue, or black again?
Re: (Score:3)
A generative model is set up to generate a different answer every time you run it. That's the point. You can make non-generative models with language front ends, that's not a problem. The problem with your "fact model" is figuring out what a fact is.
Since both humans and computers are pretty shit at that, I wouldn't hold my breath.
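To make the grandparent's point concrete, here is a trivial illustration (purely a toy, tied to no particular model): sampling from a probability distribution can give a different answer on every run, while looking up a stored fact is deterministic; the hard part, as noted above, is deciding what goes into the fact table in the first place.

    import random

    vocab = ["blue", "black", "gold", "white"]
    probs = [0.4, 0.3, 0.2, 0.1]

    def generative_answer():
        # sampling: a different answer is possible on every call
        return random.choices(vocab, weights=probs, k=1)[0]

    def lookup_answer(facts, key):
        # retrieval: the same answer every time, but someone had to decide the fact
        return facts[key]

    facts = {"dress_colour": "blue and black"}
    print(generative_answer(), "|", lookup_answer(facts, "dress_colour"))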
Re: Bah Humbug! (Score:2)
Re: (Score:2)
The important next step - I think - is some kind of LFM: Large Fact Model. If we could tokenize facts and truths, and use LLMs to interface with those LFMs, that's when this stuff will become reliable.
Facts no longer hold any significance. We need to tokenize bullshit. "He who screams their bullshit the loudest is the most correct" as the weighting system. Then we can have the AI run for office.
It's fast, but still limited to basic tasks (Score:5, Interesting)
I asked it to generate a transformer implementation for DeepSeek R1 and it spat out a whole lot of: // This is a placeholder for the actual implementation
Like other codegen models, it doesn't go much beyond basic, common coding tasks. Even for basic tasks in anything but JavaScript, the code doesn't compile cleanly.
Negatives aside, it is an interesting thesis, and I like the direction they're taking. I skimmed the paper, but I think DeepSeek's MoE approach tackles the same weight-distribution optimization in a more elegant way. In a nutshell, it's not the CPU or memory that's the limiting factor; it's that attention mechanisms jump around in memory and overload the bus I/O.
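For anyone wanting to put numbers on the "attention overloads the bus" point: during autoregressive decoding, every new token re-reads the entire KV cache, so memory traffic rather than arithmetic tends to dominate at long contexts. The model dimensions below are illustrative assumptions, not measurements of any specific model.

    # Illustrative model shape (assumed, not any specific model)
    layers, heads, head_dim = 32, 32, 128
    context = 8192            # tokens already in the KV cache
    bytes_per_val = 2         # fp16

    # Every decoded token re-reads keys and values for all layers and positions
    kv_bytes_per_token = layers * 2 * context * heads * head_dim * bytes_per_val
    print(f"~{kv_bytes_per_token / 1e9:.1f} GB of KV cache traffic per generated token "
          f"at a {context}-token context")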
LLMs and the Speed, Quality, and Cost Triangle (Score:2)
Parallel processing (Score:3)
"Ermon and a student of his" (Score:2)
I bet "student of his" worked his arse off while ermon was busy with conferences and mocktails...
Re: "Ermon and a student of his" (Score:2)
Phoar! The student REALLY worked hard on this! This thing is fast!!