Trillions of Words Analyzed, OpenAI Sets Loose AI Language Colossus (bloomberg.com) 29

Over the past few months, OpenAI has vacuumed an incredible amount of data into its artificial intelligence language systems. It sucked up Wikipedia, a huge swath of the rest of the internet and tons of books. This mass of text -- trillions of words -- was then analyzed and manipulated by a supercomputer to create what the research group bills as a major AI breakthrough and the heart of its first commercial product, which came out on Thursday. From a report: The product name -- OpenAI calls it "the API" -- might not be magical, but the things it can accomplish do seem to border on wizardry at times. The software can perform a broad set of language tasks, including translating between languages, writing news stories and poems and answering everyday questions. Ask it, for example, if you should keep reading a story, and you might be told, "Definitely. The twists and turns keep coming." OpenAI wants to build the most flexible, general purpose AI language system of all time. Typically, companies and researchers will tune their AI systems to handle one, limited task. The API, by contrast, can crank away at a broad set of jobs and, in many cases, at levels comparable with specialized systems.

While the product is in a limited test phase right now, it will be released broadly as something that other companies can use at the heart of their own offerings such as customer support chat systems, education products or games, OpenAI Chief Executive Officer Sam Altman said. [...] The API product builds on years of research in which OpenAI has compiled ever larger text databases with which to feed its AI algorithms and neural networks. At its core, OpenAI API looks over all the examples of language it has seen and then uses those examples to predict, say, what word should come next in a sentence or how best to answer a particular question. "It almost gets to the point where it assimilates all of human knowledge because it has seen everything before," said Eli Chen, CEO of startup Veriph.ai, who tried out an earlier version of OpenAI's product. "Very few other companies would be able to afford what it costs to build this type of huge model."
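The "predict what word should come next" idea at the core of these systems can be illustrated with a toy count-based model. (The tiny corpus below is purely illustrative; OpenAI's system uses large neural networks trained on trillions of words, not a lookup table.)

```python
from collections import Counter, defaultdict

# Count which word follows which in a training corpus, then predict the
# most frequently seen successor. This is the count-based ancestor of the
# neural next-word prediction the article describes.
corpus = (
    "the twists and turns keep coming . "
    "the twists keep the reader hooked . "
    "the turns keep coming ."
).split()

follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def predict_next(word):
    """Return the word most often seen after `word` in the corpus."""
    return follows[word].most_common(1)[0][0]

print(predict_next("keep"))  # "coming" follows "keep" most often here
```

Scale the corpus up by twelve orders of magnitude, swap the count table for a neural network, and you have the rough shape of the system described above.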


Comments Filter:
  • My phone used to do well with voice recognition.

    Now it inserts all kinds of bizarre words and odd capitalizations out of nowhere.

    I'm going to need to see this being effectively used. It may use way too many archaic, obscure, or jargon words.

    • by Tablizer ( 95088 )

      GigaGIGO

      • So basically what OpenAI folks did was to create a chat bot using Wikipedia? Cool, could this chat bot ask the president a question and fact check? That would be very entertaining. I would watch that on pay per view.
    • I noticed that Google translate started performing slightly worse on translation when they switched from their old model to the new neural network model. It would inexplicably change the word from 'man' to 'woman', for example. For a while I started making a collection of the really bad ones because they are hilarious, but what is the point?
      • For me, it now constantly inserts the word "oh", and "wer" for "we", and a *lot* of words I have to look up, despite the fact that I have a larger-than-average vocabulary from playing Upwords and Scrabble and reading thousands of books.

  • Hopefully they didn't feed it Reddit...
  • Ask it, for example, if you should keep reading a story, and you might be told, "Definitely. The twists and turns keep coming."

    I'm probably jaded, but this seems like a pretty thin example for something pitched as being this powerful. If twists and turns did indeed keep coming most of the time when it said that, I could see it being an improvement on something like a Magic 8 Ball, but the blurb doesn't shed any light on that. I'm sure that when I was 5 a Magic 8 Ball seemed like wizardry.

    Maybe some of its translations would be a better example?

    • by rho ( 6063 )

      It's an interesting example, but there's a reason why I read critics' reviews. If a critic likes the things I like, or doesn't like the things I don't like, I take their review more seriously.

      I'm all for machine learning if it can, for example, churn through thousands of x-rays to identify things that definitely need to be looked at. But "artificial intelligence" doesn't mean anything without the human component.

  • Ask it, for example, if you should keep reading a story, and you might be told, "Definitely. The twists and turns keep coming."

    No, it doesn't. Unless it's hardcoded to regurgitate phrases parsed out of reviews of the story.

    The API: "Tell me what the world would be like with a better version of you."

    I'll be happy to read the results of "the API" if someone wants to feed that question into it.

    • You might want to try experimenting with talktotransformer.com

      Its text generator is two full generations behind the current OpenAI model, but it is fully capable of answering nonsense questions with plausible answers. For example, let's ask about a nonexistent Star Wars novel.

      Example I just generated (first shot)
      Question: Should I keep reading Star Wars: The Wrath of Lando?
      Answer:
      If you like how the adventure reads, you probably will want to continue reading it. The adventure's overarching plot does a great j

      • Other responses:

        "No, skip this one."

        "Yes. This book will give you more background information and make your Star Wars reading experience that much more pleasant."

        "Question: Should I keep reading Star Wars: The Wrath of Lando?
        Answer: No, don't read this book.

        Answer: No, I am not kidding. It's both a terrible idea and a dumb one. (Actually, almost all questions to your random Internet rando make me want to punch them in the face.)

        Question: Okay, so let's get to it. Why is Han Solo so chill?
        Answer: His extensi

        • Just realized that I like consistency in answers to repeated questioning. If an AI doesn't seem to have its mind made up, I have a harder time believing what it says.
          • by mbkennel ( 97636 )

            That's just how the model works: it predicts a probability distribution over the next few tokens, randomly chooses a concrete token according to an RNG, advances the token buffer, and repeats.
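            That sampling loop is easy to sketch. (The bigram probability table below is a made-up stand-in for a real neural model; nothing here is OpenAI's actual code.)

```python
import random

# Toy autoregressive sampler: the "model" assigns a probability
# distribution over the next token given the current one; we sample a
# concrete token from it, append it to the buffer, and repeat.
BIGRAM_PROBS = {
    "the":    {"twists": 0.5, "turns": 0.3, "end": 0.2},
    "twists": {"and": 0.9, "keep": 0.1},
    "and":    {"turns": 0.8, "the": 0.2},
    "turns":  {"keep": 0.6, "the": 0.4},
    "keep":   {"coming": 1.0},
    "coming": {"the": 1.0},
    "end":    {"the": 1.0},
}

def sample_next(token, rng):
    """Draw the next token from the model's distribution for `token`."""
    tokens, probs = zip(*BIGRAM_PROBS[token].items())
    return rng.choices(tokens, weights=probs, k=1)[0]

def generate(prompt, n_tokens, seed=None):
    """Autoregressive generation: sample, append to buffer, repeat."""
    rng = random.Random(seed)
    buffer = list(prompt)
    for _ in range(n_tokens):
        buffer.append(sample_next(buffer[-1], rng))
    return " ".join(buffer)

print(generate(["the"], 6, seed=0))
```

            The RNG in the middle of the loop is exactly why repeated questioning gives different answers: the same prompt walks a different random path each time unless the seed is pinned.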

  • by Joe2020 ( 6760092 ) on Thursday June 11, 2020 @01:49PM (#60171800)

    ... to make up new words, and the AI is as outdated as Mum and Dad.

  • That thing is plotting our downfall and you know it.
  • Reddit, 4chan and urban dictionary.

  • by nospam007 ( 722110 ) * on Thursday June 11, 2020 @02:25PM (#60171914)

    For the white supremacists it's called Brunhilde and it knows all the Wehrmacht songs and talks with a German accent, the voice of... you know, that Schicklgruber guy.

    I won't even speculate who the voice of the Republican Siri will be, Fucker Carlson?

  • It would be great if there were a doll that, when you started telling it a story, finished the story with the child or took turns developing it.
    • by Kjella ( 173770 )

      It would be great if there were a doll that, when you started telling it a story, finished the story with the child or took turns developing it.

      We're not quite there yet, but I did recently read about this model: given a header and subtext, it would produce a 200-word text blurb that was quite hard to distinguish from a human's; IIRC the humans guessed right 52-48, within the margin of error. It did lose considerably more on the 500-word blurb, though; the sentences were fine, but the text didn't really come together. It's a language model, not really a storytelling model. Though I guess you could use it as an outline-to-flowery-language model.

  • "The product name -- OpenAI calls it 'the API'"

    That's about as creative as you'd expect from an AI fed with internet garbage.

  • I expect the full Library of Babel by next week. Hard-copied.

  • Ingest several trillion words, run training for zillions of supercomputer hours, and get back a semi-clever, pointless statement about the best car to buy. It's nice that they push the envelope on how much data a model can ingest, but I'm not sure they got anything out of it.
  • Ask for the question! Quick!
