OpenAI's New Chatbot Can Explain Code and Write Sitcom Scripts But Is Still Easily Tricked

Posted by BeauHD on Thursday December 01, 2022 @09:25PM from the work-in-progress dept.

OpenAI has released a prototype general purpose chatbot that demonstrates a fascinating array of new capabilities but also shows off weaknesses familiar to the fast-moving field of text-generation AI. And you can test out the model for yourself right here. The Verge reports: ChatGPT is adapted from OpenAI's GPT-3.5 model but trained to provide more conversational answers. While GPT-3 in its original form simply predicts what text follows any given string of words, ChatGPT tries to engage with users' queries in a more human-like fashion. As you can see in the examples below, the results are often strikingly fluid, and ChatGPT is capable of engaging with a huge range of topics, demonstrating big improvements to chatbots seen even a few years ago. But the software also fails in a manner similar to other AI chatbots, with the bot often confidently presenting false or invented information as fact. As some AI researchers explain it, this is because such chatbots are essentially "stochastic parrots" -- that is, their knowledge is derived only from statistical regularities in their training data, rather than any human-like understanding of the world as a complex and abstract system. [...]

Enough preamble, though: what can this thing actually do? Well, plenty of people have been testing it out with coding questions and claiming its answers are perfect. ChatGPT can also apparently write some pretty uneven TV scripts, even combining actors from different sitcoms. It can explain various scientific concepts. And it can write basic academic essays. And the bot can combine its fields of knowledge in all sorts of interesting ways. So, for example, you can ask it to debug a string of code ... like a pirate, for which its response starts: "Arr, ye scurvy landlubber! Ye be makin' a grave mistake with that loop condition ye be usin'!" Or get it to explain bubble sort algorithms like a wise guy gangster. ChatGPT also has a fantastic ability to answer basic trivia questions, though examples of this are so boring I won't paste any in here. And someone else says the code ChatGPT provides in the very answer above is garbage.

I'm not a programmer myself, so I won't make a judgment on this specific case, but there are plenty of examples of ChatGPT confidently asserting obviously false information. Here's computational biology professor Carl Bergstrom asking the bot to write a Wikipedia entry about his life, for example, which ChatGPT does with aplomb -- while including several entirely false biographical details. Another interesting set of flaws comes when users try to get the bot to ignore its safety training. If you ask ChatGPT about certain dangerous subjects, like how to plan the perfect murder or make napalm at home, the system will explain why it can't tell you the answer. (For example, "I'm sorry, but it is not safe or appropriate to make napalm, which is a highly flammable and dangerous substance.") But, you can get the bot to produce this sort of dangerous information with certain tricks, like pretending it's a character in a film or that it's writing a script on how AI models shouldn't respond to these sorts of questions.

This discussion has been archived. No new comments can be posted.

Can explain code ... (Score:2)

by PPH ( 736903 ) writes:

... and write sitcom scripts. It just doesn't know which is which.

Sounds like the all new and improved Watson (Score:2)

by thesjaakspoiler ( 4782965 ) writes:

that could also do everything and would have brought us World Peace if we used it for everything.

OpenAI hasn't figured out coding (Score:4, Funny)

by 93 Escort Wagon ( 326346 ) writes: on Thursday December 01, 2022 @10:57PM (#63095750)

It just does a reasonably good job of querying Stack Overflow. People just don't realize it because it uses a clever pseudonym - notOpenAI.

Wants a real phone number?! (Score:4, Interesting)

by Arethan ( 223197 ) writes: on Thursday December 01, 2022 @11:09PM (#63095756) Journal

I tried to sign up to openai to play with this, but the signup process absolutely demands a "real phone number" to send an SMS code to. They don't accept virtual/voip SMS numbers. I refuse to give random sites my raw mobile number. I spent a lot of time getting my SMS spam to near zero, and I'm not about to reverse course on that now. /shrug

- Re: (Score:2)
  
  by HeadSoft ( 147914 ) writes:
  
  They do it specifically because they want to charge you, they don't do anything for free. It's a great AI, but others will catch up that realize that most people aren't interested in a subscription. Put an ad up or two, and that's about it. As amazing as it is, it's still not worth paying for.
  - Re: (Score:2)
    
    by Arethan ( 223197 ) writes:
    
    Ah, yep - that makes sense.
    Thanks for the helpful response!
- Re: (Score:2)
  
  by lsllll ( 830002 ) writes:
  
  Yeah, saw that and took one click to close the tab, and another to close the tab before it. I'm not THAT interested in chatting with a bot.
- Re: (Score:2)
  
  by drinkypoo ( 153816 ) writes:
  
  I tried to sign up to openai to play with this, but the signup process absolutely demands a "real phone number" to send an SMS code to. They don't accept virtual/voip SMS numbers. I refuse to give random sites my raw mobile number.
  The site is neither random nor even pseudorandom; you went there on purpose. Feel free to not give them your phone number, but you're going to have to come up with a better excuse than that if you want to be taken seriously.
  - Re: (Score:3)
    
    by Arethan ( 223197 ) writes:
    
    Sure, how about this:
    I have no established history with this site or company, I've never been to this site before, but I wanted to play with this "cool toy" they advertise for about 5 minutes. I have no intention of buying anything, the toy will not help me in my work or personal life, so it has no value beyond being a momentary curiosity. Simultaneously, I have very little faith in website security and even less faith in desperate business managers, so I assume that any information I provide to this site w

A lot like a human (Score:4, Interesting)

by SoftwareArtist ( 1472499 ) writes: on Thursday December 01, 2022 @11:49PM (#63095794)

But the software also fails in a manner similar to other AI chatbots, with the bot often confidently presenting false or invented information as fact.
A lot of humans do exactly the same thing.
their knowledge is derived only from statistical regularities in their training data, rather than any human-like understanding of the world as a complex and abstract system.
That's a perfect description of a lot of human knowledge. We think we understand some complex and abstract system, when really we're just repeating (often incorrectly at that) statements we've come across on the internet that we don't really understand. It's the Dunning-Kruger effect [wikipedia.org].

- Comment removed (Score:5, Interesting)
  
  by account_deleted ( 4530225 ) writes: on Friday December 02, 2022 @06:00AM (#63096050)
  
  Comment removed based on user account deletion
  
  - Re: (Score:2)
    
    by gweihir ( 88907 ) writes:
    
    Nice example with the lawyer. The thing is that the answer can be both, but one case is a lot more plausible given context, and hence humans would expect there being an additional hint if it was the less plausible one. And that requires either understanding of the situation or a world-model massively larger than current Artificial Ignorance can handle. And even with that world model, this requires deduction, i.e. several steps that build on each other and statistical models are all flat and cannot go into d
  - Re: (Score:3)
    
    by SoftwareArtist ( 1472499 ) writes:
    
    That's a "god of the gaps" type argument. No matter how many amazing things an AI does, you'll look for something it can't do and focus on that as evidence it "isn't really intelligent" and "doesn't understand anything". It's able to write factually accurate rhyming poems about general relativity [arstechnica.com], yet you insist it "doesn't know anything at all." That's confirmation bias. You focus only on the things it can't do and ignore the things it can.
    And of course, the technology is advancing really quickly. Tha
    - Re: (Score:2)
      
      by account_deleted ( 4530225 ) writes:
      
      Comment removed based on user account deletion
- Re: (Score:2)
  
  by gweihir ( 88907 ) writes:
  
  But the software also fails in a manner similar to other AI chatbots, with the bot often confidently presenting false or invented information as fact.
  A lot of humans do exactly the same thing.
  True and that would be the "worthless as provider of insight" type of human. The real fact here is that about 60% of the human race is as dumb as bread and has no insight into how the world actually works and cannot even be convinced by a rational argument. They just follow and regurgitate whatever the general opinion in their in-group is. They are also clueless about being clueless as Dunning and Kruger nicely documented. Emulating them requires no intelligence, because these people are not using whatever
  - Re: (Score:2)
    
    by SoftwareArtist ( 1472499 ) writes:
    
    It's not just 60%. All humans are subject to the Dunning-Kruger effect. Including you and me. That's just how our brains work. We all have subjects we know a lot about, and others we know very little about. The less we know about a subject, the more we overrate our understanding of it.

Interesting (Score:5, Interesting)

by war4peace ( 1628283 ) writes: on Friday December 02, 2022 @02:28AM (#63095930)

The bot gracefully avoids political questions, financial advice, and current events questions. That's nice.
It can provide measures in metric system instead of Imperial when asked to, and doesn't need the previous question repeated in such a case.
When provided with filters ("Where was the coldest temperature recorded on Earth between 1950 and 1980?"), it acknowledges its limitations and still provides the most relevant answer: "The coldest temperature ever recorded on Earth was -128.6 degrees Fahrenheit (-89.2 degrees Celsius), which occurred at the Soviet Vostok Station in Antarctica on July 21, 1983. This temperature was recorded after the time period you specified (1950-1980), so I cannot confirm if it was the coldest temperature recorded during that time."
Furthermore, within the same answer, it is covering its ass: "It's also worth noting that temperature records from different locations and time periods can vary, so it's possible that there may have been colder temperatures recorded in other places during the time period you specified."
I tried a cascading topic (ask a question, then expand on it using part of the answer as the next question). The experiment failed.
Q: "When did Mars attack mankind?"
A (correct): "Mars has never attacked mankind. Mars is the fourth planet from the sun in our solar system and is known for its red appearance and rocky terrain. It is not capable of attacking any other planet or its inhabitants. The notion of Mars attacking Earth is a fictional scenario that has been explored in various works of science fiction. It is not based in reality."
Q (cascading): "Please provide me an example of such work of science fiction."
A (after a long wait): "An error occurred. If this issue persists please contact us through our help center at help.openai.com."
It's a good start, but needs a helluva lot of work to make it better. The real difficulty is having it operate well and consistently.

- Re: (Score:3)
  
  by narcc ( 412956 ) writes:
  
  The real difficulty is having it operate well and consistently.
  ...
  needs a helluva lot of work to make it better.
  It's a parlor trick. It's designed to make you think that there is more happening than is actually happening.
  It's neat, sure, and there may even be some utility, but it will never be the thing they want you to believe it is. Neither will it get there through incremental improvements as the faithful will undoubtedly claim. You need something fundamentally different. Think of it this way: no matter how good you get at making ladders, you will never reach the moon.
  - Re: (Score:2)
    
    by account_deleted ( 4530225 ) writes:
    
    Comment removed based on user account deletion
    - Re: (Score:2)
      
      by narcc ( 412956 ) writes:
      
      Henry Moriarty was the first to use the metaphor, but to describe an absurd proposition. He was the navigator on the Great Eastern, the ship that laid the first trans-Atlantic telegraph lines. That was in a letter to the editor of the London Standard dated March 7th, 1870:
      "It is certain that the originators of the Atlantic Cable, to whom is due the honour of being the pioneers of ocean telegraphy (when their scheme ranked in public opinion, only one degree in the scale of absurdity below that of raising a ladder to the moon) imagined the success would be rewarded by great and permanent remuneration." -- Henry Moriarty
      As far as who first used it as an indictment of incrementalism, I'm tempted to take credit, but it seems a little too obvious. Though it is very often used to describe the power of incrementalism, the strength of man's ingenuity, the
      - Re: (Score:2)
        
        by account_deleted ( 4530225 ) writes:
        
        Comment removed based on user account deletion
  - Re: (Score:2)
    
    by jd ( 1658 ) writes:
    
    Agreed. A neural net (roughly 100,000 times larger than the world's fastest computer can simulate in real time) with the connectome of the human brain MIGHT be capable of producing genuinely intelligent responses, but that's currently unproven. (We simply don't know what it would take to program a neural net of that complexity, never having built one.)
- Re: (Score:2)
  
  by lsllll ( 830002 ) writes:
  
  Can't believe you gave them your phone number!
  - Re: (Score:3)
    
    by account_deleted ( 4530225 ) writes:
    
    Comment removed based on user account deletion
  - Re: (Score:2)
    
    by war4peace ( 1628283 ) writes:
    
    Given I live in a country where I receive maybe three SPAM texts per year (and zero robocalls), I'd say I'm good with that.
- Re: (Score:2)
  
  by drinkypoo ( 153816 ) writes:
  
  The bot gracefully avoids political questions, financial advice, and current events questions. That's nice.
  That's because they've carefully kept it away from the general public. You have to petition for access.
  I tried a cascading topic (ask a question, then expand on it using part of the answer as the next question). The experiment failed.
  I've seen what happens when it doesn't throw an error — it becomes excessively repetitive in a way that a human wouldn't unless it was being a smartass, and it clearly isn't.
- Re: (Score:2)
  
  by aRTeeNLCH ( 6256058 ) writes:
  
  Honestly, I think the above practically spells the doom of Indian call centres.

What does it say if you ask it... (Score:2)

by Petersko ( 564140 ) writes:

...what it thinks about the recent new content on Updog?
- Re: (Score:2)
  
  by account_deleted ( 4530225 ) writes:
  
  Comment removed based on user account deletion
  - Re: (Score:2)
    
    by account_deleted ( 4530225 ) writes:
    
    Comment removed based on user account deletion
  - Re: (Score:2)
    
    by Petersko ( 564140 ) writes:
    
    Touche!

Crypto is dead! (Score:2)

by Big Hairy Gorilla ( 9839972 ) writes:

Tech Grifters need a new shiny object to sell... Hmmm... I know!

"Invest" in "AI."

Invented information (Score:2)

by radaos ( 540979 ) writes:

"confidently presenting false or invented information as fact". Next step for this AI: running for public office.

No, it cannot (Score:2)

by gweihir ( 88907 ) writes:

It can take a statistical guess at what explanation fits the code it gets "asked" about and regurgitate that, maybe with some no-insight edits. That is all it can do.

And humans also Stochastic parrots? (Score:1)

by SamLogical ( 8161792 ) writes:

But isn't humans' understanding of the world also effectively that of stochastic parrots?
