DeepMind Tests the Limits of Large AI Language Systems With 280-Billion-Parameter Model (theverge.com)
An anonymous reader quotes a report from The Verge: Language generation is the hottest thing in AI right now, with a class of systems known as "large language models" (or LLMs) being used for everything from improving Google's search engine to creating text-based fantasy games. But these programs also have serious problems, including regurgitating sexist and racist language and failing tests of logical reasoning. One big question is: can these weaknesses be fixed by simply adding more data and computing power, or are we reaching the limits of this technological paradigm? This is one of the topics that Alphabet's AI lab DeepMind is tackling in a trio of research papers published today. The company's conclusion is that scaling up these systems further should deliver plenty of improvements. "One key finding of the paper is that the progress and capabilities of large language models is still increasing. This is not an area that has plateaued," DeepMind research scientist Jack Rae told reporters in a briefing call.
DeepMind, which regularly feeds its work into Google products, has probed the capabilities of these LLMs by building a language model with 280 billion parameters named Gopher. Parameters are a quick measure of a language model's size and complexity, meaning that Gopher is larger than OpenAI's GPT-3 (175 billion parameters) but not as big as some more experimental systems, like Microsoft and Nvidia's Megatron model (530 billion parameters). It's generally true in the AI world that bigger is better, with larger models usually offering higher performance. DeepMind's research confirms this trend and suggests that scaling up LLMs does offer improved performance on the most common benchmarks testing things like sentiment analysis and summarization. However, researchers also cautioned that some issues inherent to language models will need more than just data and compute to fix. "I think right now it really looks like the model can fail in variety of ways," said Rae. "Some subset of those ways are because the model just doesn't have sufficiently good comprehension of what it's reading, and I feel like, for those class of problems, we are just going to see improved performance with more data and scale."
But, he added, there are "other categories of problems, like the model perpetuating stereotypical biases or the model being coaxed into giving mistruths, that [...] no one at DeepMind thinks scale will be the solution [to]." In these cases, language models will need "additional training routines" like feedback from human users, he noted.
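The "parameters" figure in the summary is simply a count of the learned weights in the network. A rough, purely illustrative sketch of where such counts come from, assuming a generic decoder-style layout with invented dimensions far smaller than Gopher's:

```python
# Back-of-the-envelope parameter count for a toy decoder-style language model.
# The layout and dimensions here are illustrative only, not Gopher's actual configuration.

def transformer_param_count(vocab_size, d_model, n_layers, d_ff):
    embedding = vocab_size * d_model      # token embedding table
    attention = 4 * d_model * d_model     # query, key, value and output projections
    feed_forward = 2 * d_model * d_ff     # two linear layers per block
    return embedding + n_layers * (attention + feed_forward)

# A deliberately small configuration, just to show what "parameters" counts.
print(f"{transformer_param_count(vocab_size=50_000, d_model=512, n_layers=12, d_ff=2048):,}")
```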
complex or complicated? (Score:3)
I know there's a school of thought that AI doesn't need to be constrained by biological intelligence mechanisms. This seems excessive though. I don't think language is actually that complicated. There are only 100 billion neurons in the brain, and only some are dedicated to language. I know it's not that simple, but I feel like we missed the boat somewhere on the technology.
Re:complex or complicated? (Score:4, Interesting)
But does it matter if this has more neurons than a human brain? For the most part no.
First, this thing certainly "knows" far, far more than any human - along some dimensions, which might be hard to characterize and/or useless, depending on need.
Second, the correspondence between an artificial neuron and biological one shouldn't be taken too literally. For AI, a larger number of simpler neurons are better if they are faster or more compact than a smaller number of more powerful neurons. It would be bizarre if the best tradeoffs in one medium (silicon) happened to work out the same as those in another (wetware).
Re:complex or complicated? (Score:5, Interesting)
Re: (Score:2, Informative)
When a person speaks, typically we have a concept in mind, and we search for words to express that idea.
When a neural network speaks (writes), it looks at the previous several words and probabilistically guesses what the next word should be.
There is a clear difference between the two approaches, as should be obvious. Humans sometimes use the second approach, for example if I say, "twinkle twinkle little ____" everyone will guess automatically what the next word will be.
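For what it's worth, the "probabilistically guesses the next word" step can be sketched in a few lines. This is only an illustration of the sampling idea; the probability table below is invented, whereas a real LLM computes the distribution with a neural network over the whole context.

```python
import random

# Toy illustration of next-word prediction: given the previous words, sample the
# next word from a probability distribution. The table below is invented; a real
# LLM computes these probabilities with a neural network over the whole context.
next_word_probs = {
    ("twinkle", "twinkle", "little"): {"star": 0.95, "bat": 0.04, "car": 0.01},
}

def sample_next(context):
    dist = next_word_probs[tuple(context[-3:])]
    words, probs = zip(*dist.items())
    return random.choices(words, weights=probs, k=1)[0]

print(sample_next(["twinkle", "twinkle", "little"]))  # almost always "star"
```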
Re: (Score:2, Insightful)
When a neural network speaks (writes), it looks at the previous several words and probabilistically guesses what the next word should be.
Dude.
That's a Markov chain, not a neural net.
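For contrast, here is what a literal word-level Markov chain looks like: nothing but a lookup table of observed transition counts. The toy corpus is invented; the point is only that a Markov chain's "state" is the literal previous word(s), while a neural language model computes its next-word distribution with a learned function of the whole context.

```python
import random
from collections import Counter, defaultdict

# A literal word-level Markov chain: nothing but counts of observed
# (previous word -> next word) transitions. The tiny corpus is invented.
corpus = "twinkle twinkle little star how I wonder what you are".split()

transitions = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    transitions[prev][nxt] += 1

def markov_next(word):
    counts = transitions[word]
    return random.choices(list(counts), weights=list(counts.values()), k=1)[0]

print(markov_next("twinkle"))  # "twinkle" or "little", chosen purely from counts
```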
Re: (Score:1)
So what are you saying? That the neural network has an idea, and it's trying to express it in words?
Re: (Score:3)
A neural network doesn't have to be context-driven or work anything at all like that. You are imposing implementation specifics because you read about one such implementation in a book like Artificial Life or whatever.
What would you say if the implementation produced an audio file?
Would you still be imagining a markov chain?
Neural networks are used as approximators of a multivariate function. All multivariate functions can be approximated by them, not just specific kinds.
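That universal-approximation point can be illustrated with a one-hidden-layer network. The sketch below keeps the hidden weights fixed at random and solves only for the output layer (a random-features shortcut, not how language models are trained), which is enough to show a small network fitting an arbitrary smooth function:

```python
import numpy as np

# One-hidden-layer network as a function approximator. The hidden weights are
# fixed at random and only the output layer is solved for (a random-features
# shortcut, not how language models are trained); the target function is arbitrary.
rng = np.random.default_rng(0)
x = np.linspace(-3, 3, 200).reshape(-1, 1)
y = np.sin(2 * x) + 0.3 * x**2

hidden = np.tanh(x @ rng.normal(size=(1, 100)) + rng.normal(size=100))  # 100 tanh units
w_out, *_ = np.linalg.lstsq(hidden, y, rcond=None)                      # fit output weights

print(np.abs(hidden @ w_out - y).max())  # max training error; typically small
```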
Re: (Score:1)
What would you say if the implementation produced an audio file?
How does that relate to anything at all? Are you on crack? I cannot conceive of what is wrong with your mind to produce the post you produced.
Re: (Score:2)
The issue is not the complexity, it's the knowledge that backs up what is said.
Human beings have extensive knowledge of the world so they can fill in the unsaid parts and reject logically correct but practically unlikely meanings. It also enables them to appreciate humour and sarcasm.
In the 80s efforts were made to teach computers about the world. It didn't go very well; the computers had trouble with a lot of basic concepts, and researchers couldn't really figure out how to give them "common sense".
More recently deve
280 billion parameters (Score:5, Funny)
>...a language model with 280 billion parameters named Gopher.
How do they tell all those parameters apart if they're all named "Gopher"?
Re: (Score:3)
How do they tell all those parameters apart if they're all named "Gopher"?
gopher
Gopher
g0pher
goph3r
G0ph3r
gopher2
gopher_new
gopher_newer
gopher_new2a
gopher1
You know, just like any other program.
280-Billion-Parameter Model ... (Score:2)
Task failed successfully? (Score:2)
But these programs also have serious problems, including regurgitating sexist and racist language and failing tests of logical reasoning.
That sounds like an accurate replica of many humans to me. I'm convinced!
Re: Task failed successfully? (Score:2)
Yet it still doesn't understand what it's hearing (Score:2)
While current neural net technology is certainly a quantum leap in machine ability, it seems to me these systems are still little more than very impressive statistical analysers and just don't work the same way as a biological brain. Until we really understand how biological brains work at the data and processing level (or someone has an A-Ha! moment of genius), I suspect it'll be the law of diminishing returns with ANNs from now on, just as it was with traditional AI back in the 80s and 90s.
A giant bullshitter-bot (Score:2)
Re: (Score:2)
The algorithm has no idea what it means or why it's producing those strings. It's essentially a giant bullshitter-bot
What's its slashdot uid?
Re:A giant bullshitter-bot (Score:4, Funny)
Parsimony is dead (Score:1)
and machine learning killed it.
Not real science.
guaranteed overfitting (Score:2)
with that many parameters, overfitting is nearly guaranteed unless they have trillions of training samples. So, it will work great on the training data, then fail miserably when it encounters something new.
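This is the textbook picture of overfitting, sketched below with a polynomial rather than a neural network; whether the worry actually applies to models trained on web-scale text is exactly what is in dispute here. The data and model are invented for illustration.

```python
import numpy as np

# Textbook overfitting picture: a degree-7 polynomial has as many parameters as
# there are training points, fits them (almost) exactly, and can still be far
# off in between. Data and model are invented for illustration.
rng = np.random.default_rng(1)
x_train = np.linspace(0, 1, 8)
y_train = np.sin(2 * np.pi * x_train) + 0.1 * rng.normal(size=8)

coeffs = np.polyfit(x_train, y_train, deg=7)   # 8 coefficients for 8 points
x_new = np.linspace(0, 1, 100)

train_err = np.abs(np.polyval(coeffs, x_train) - y_train).max()
new_err = np.abs(np.polyval(coeffs, x_new) - np.sin(2 * np.pi * x_new)).max()
print(train_err, new_err)  # near zero on training points, typically much larger elsewhere
```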
Re: (Score:2)
with that many parameters, overfitting is nearly guaranteed unless they have trillions of training samples. So, it will work great on the training data, then fail miserably when it encounters something new.
To be fair, statistical models only rarely deliver good results on "something new", and if they do, they do so by accident. What they excel at is dealing with cases that are somehow a mix of what was found in the input data, i.e. nothing new, but one more variant of the same thing. To expect anything else is foolish and only indicates that the "researchers" doing so are incompetent.
Sure, this is valuable. But, for example, logical reasoning is not accessible to statistical models. Statistical models can be
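The interpolation-versus-extrapolation point can be made with a toy example; the straight-line model and quadratic target below are invented purely for illustration.

```python
import numpy as np

# Interpolation vs extrapolation: a straight line fit to a quadratic on [0, 1]
# looks fine inside that range and is far off well outside it. Both the model
# and the target are invented for illustration.
x_train = np.linspace(0, 1, 50)
y_train = x_train ** 2

slope, intercept = np.polyfit(x_train, y_train, deg=1)

for x in (0.5, 3.0):   # in-range vs out-of-range input
    print(x, abs((slope * x + intercept) - x ** 2))
```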
No (Score:2)
The problem with these things being unable to do logical reasoning is not the size of the model used. Logical reasoning is not a language feature. Logical reasoning comes from insight and understanding, and no statistical model can do that. In fact, we see logical reasoning only in beings equipped with a consciousness, and that is a rather large hint.
The only thing these devices will ever be able to do is mimic conventions. (That, incidentally, is where their "racism" and "sexism" come from: They mimic what they got