AI

OpenAI's New Chatbot Can Explain Code and Write Sitcom Scripts But Is Still Easily Tricked

OpenAI has released a prototype general purpose chatbot that demonstrates a fascinating array of new capabilities but also shows off weaknesses familiar to the fast-moving field of text-generation AI. And you can test out the model for yourself right here. The Verge reports: ChatGPT is adapted from OpenAI's GPT-3.5 model but trained to provide more conversational answers. While GPT-3 in its original form simply predicts what text follows any given string of words, ChatGPT tries to engage with users' queries in a more human-like fashion. As you can see in the examples below, the results are often strikingly fluid, and ChatGPT is capable of engaging with a huge range of topics, demonstrating big improvements to chatbots seen even a few years ago. But the software also fails in a manner similar to other AI chatbots, with the bot often confidently presenting false or invented information as fact. As some AI researchers explain it, this is because such chatbots are essentially "stochastic parrots" -- that is, their knowledge is derived only from statistical regularities in their training data, rather than any human-like understanding of the world as a complex and abstract system. [...]

Enough preamble, though: what can this thing actually do? Well, plenty of people have been testing it out with coding questions and claiming its answers are perfect. ChatGPT can also apparently write some pretty uneven TV scripts, even combining actors from different sitcoms. It can explain various scientific concepts. And it can write basic academic essays. And the bot can combine its fields of knowledge in all sorts of interesting ways. So, for example, you can ask it to debug a string of code ... like a pirate, for which its response starts: "Arr, ye scurvy landlubber! Ye be makin' a grave mistake with that loop condition ye be usin'!" Or get it to explain bubble sort algorithms like a wise guy gangster. ChatGPT also has a fantastic ability to answer basic trivia questions, though examples of this are so boring I won't paste any in here. Then again, someone else says the code ChatGPT provides in the very answer above is garbage.

I'm not a programmer myself, so I won't make a judgment on this specific case, but there are plenty of examples of ChatGPT confidently asserting obviously false information. Here's computational biology professor Carl Bergstrom asking the bot to write a Wikipedia entry about his life, for example, which ChatGPT does with aplomb -- while including several entirely false biographical details. Another interesting set of flaws comes when users try to get the bot to ignore its safety training. If you ask ChatGPT about certain dangerous subjects, like how to plan the perfect murder or make napalm at home, the system will explain why it can't tell you the answer. (For example, "I'm sorry, but it is not safe or appropriate to make napalm, which is a highly flammable and dangerous substance.") But, you can get the bot to produce this sort of dangerous information with certain tricks, like pretending it's a character in a film or that it's writing a script on how AI models shouldn't respond to these sorts of questions.

  • ... and write sitcom scripts. It just doesn't know which is which.

  • that could also do everything and would have brought us World Peace if we used it for everything.

  • by 93 Escort Wagon ( 326346 ) on Thursday December 01, 2022 @10:57PM (#63095750)

    It just does a reasonably good job of querying Stack Overflow. People just don't realize it because it uses a clever pseudonym - notOpenAI.

  • by Arethan ( 223197 ) on Thursday December 01, 2022 @11:09PM (#63095756) Journal

    I tried to sign up to openai to play with this, but the signup process absolutely demands a "real phone number" to send an SMS code to. They don't accept virtual/voip SMS numbers. I refuse to give random sites my raw mobile number. I spent a lot of time getting my SMS spam to near zero, and I'm not about to reverse course on that now. /shrug

    • They do it specifically because they want to charge you, they don't do anything for free. It's a great AI, but others will catch up that realize that most people aren't interested in a subscription. Put an ad up or two, and that's about it. As amazing as it is, it's still not worth paying for.

    • by lsllll ( 830002 )
      Yeah, saw that and took one click to close the tab, and another to close the tab before it. I'm not THAT interested in chatting with a bot.
    • I tried to sign up to openai to play with this, but the signup process absolutely demands a "real phone number" to send an SMS code to. They don't accept virtual/voip SMS numbers. I refuse to give random sites my raw mobile number.

      The site is neither random nor even pseudorandom; you went there on purpose. Feel free to not give them your phone number, but you're going to have to come up with a better excuse than that if you want to be taken seriously.

      • by Arethan ( 223197 )

        Sure, how about this:

        I have no established history with this site or company, I've never been to this site before, but I wanted to play with this "cool toy" they advertise for about 5 minutes. I have no intention of buying anything, the toy will not help me in my work or personal life, so it has no value beyond being a momentary curiosity. Simultaneously, I have very little faith in website security and even less faith in desperate business managers, so I assume that any information I provide to this site w

  • A lot like a human (Score:4, Interesting)

    by SoftwareArtist ( 1472499 ) on Thursday December 01, 2022 @11:49PM (#63095794)

    But the software also fails in a manner similar to other AI chatbots, with the bot often confidently presenting false or invented information as fact.

    A lot of humans do exactly the same thing.

    their knowledge is derived only from statistical regularities in their training data, rather than any human-like understanding of the world as a complex and abstract system.

    That's a perfect description of a lot of human knowledge. We think we understand some complex and abstract system, when really we're just repeating (often incorrectly at that) statements we've come across on the internet that we don't really understand. It's the Dunning-Kruger effect [wikipedia.org].

    • by VeryFluffyBunny ( 5037285 ) on Friday December 02, 2022 @06:00AM (#63096050)
      No, what they mean is that it cannot "interpret" text. Even the slightest of ambiguities will throw it. The classic test for "stochastic parrots" is to give it a series of Winograd schemas, i.e. statements containing deictic ambiguities that humans, with their common sense, resolve effortlessly but on which AIs score no better than chance (50/50). For example: "The lawyer asked the witness a question, but he was reluctant to repeat it. Who was reluctant to repeat the question?"

      I tried this chatbot with a few Winograd schemas & it actually managed to score lower than 50% correct & its explanations frequently showed a clear lack of understanding even when it got it right. It's not even dumb. It just doesn't know anything at all. Any belief we may have that it knows anything from our interacting with it is in our heads, not in GPT3. It's a type of cognitive bias that we tend to read meaning & agency into any form of communication. Gurus, PR consultants, sales & marketing agents, & politicians exploit this bias for their own benefit. Here's a fun example of this kind of meaninglessness: https://sebpearce.com/bullshit... [sebpearce.com]
      • by gweihir ( 88907 )

        Nice example with the lawyer. The thing is that the answer can be both, but one case is a lot more plausible given context, and hence humans would expect there being an additional hint if it was the less plausible one. And that requires either understanding of the situation or a world-model massively larger than current Artificial Ignorance can handle. And even with that world model, this requires deduction, i.e. several steps that build on each other and statistical models are all flat and cannot go into d

      • That's a "god of the gaps" type argument. No matter how many amazing things an AI does, you'll look for something it can't do and focus on that as evidence it "isn't really intelligent" and "doesn't understand anything". It's able to write factually accurate rhyming poems about general relativity [arstechnica.com], yet you insist it "doesn't know anything at all." That's confirmation bias. You focus only on the things it can't do and ignore the things it can.

        And of course, the technology is advancing really quickly. Tha

        • I was testing & judging the chatbot as it was presented. I stand by what I said. Did you see the part where the chatbot's explanations for its correct answers were wrong?
    • by gweihir ( 88907 )

      But the software also fails in a manner similar to other AI chatbots, with the bot often confidently presenting false or invented information as fact.

      A lot of humans do exactly the same thing.

      True and that would be the "worthless as provider of insight" type of human. The real fact here is that about 60% of the human race is as dumb as bread and has no insight into how the world actually works and cannot even be convinced by a rational argument. They just follow and regurgitate whatever the general opinion in their in-group is. They are also clueless about being clueless as Dunning and Kruger nicely documented. Emulating them requires no intelligence, because these people are not using whatever

      • It's not just 60%. All humans are subject to the Dunning-Kruger effect. Including you and me. That's just how our brains work. We all have subjects we know a lot about, and others we know very little about. The less we know about a subject, the more we overrate our understanding of it.

  • Interesting (Score:5, Interesting)

    by war4peace ( 1628283 ) on Friday December 02, 2022 @02:28AM (#63095930)

    The bot gracefully avoids political questions, financial advice, and current-events questions. That's nice.
    It can provide measures in metric system instead of Imperial when asked to, and doesn't need the previous question repeated in such a case.
    When provided with filters ("Where was the coldest temperature recorded on Earth between 1950 and 1980?"), it acknowledges its limitations and still provides the most relevant answer: "The coldest temperature ever recorded on Earth was -128.6 degrees Fahrenheit (-89.2 degrees Celsius), which occurred at the Soviet Vostok Station in Antarctica on July 21, 1983. This temperature was recorded after the time period you specified (1950-1980), so I cannot confirm if it was the coldest temperature recorded during that time."
    Furthermore, within the same answer, it is covering its ass: "It's also worth noting that temperature records from different locations and time periods can vary, so it's possible that there may have been colder temperatures recorded in other places during the time period you specified."

    I tried a cascading topic (ask a question, then expand on it using part of the answer as the next question). The experiment failed.

    Q: "When did Mars attack mankind?"
    A (correct): "Mars has never attacked mankind. Mars is the fourth planet from the sun in our solar system and is known for its red appearance and rocky terrain. It is not capable of attacking any other planet or its inhabitants. The notion of Mars attacking Earth is a fictional scenario that has been explored in various works of science fiction. It is not based in reality."
    Q (cascading): "Please provide me an example of such work of science fiction."
    A (after a long wait): "An error occurred. If this issue persists please contact us through our help center at help.openai.com."

    It's a good start, but needs a helluva lot of work to make it better. The real difficulty is having it operate well and consistently.

    • by narcc ( 412956 )

      The real difficulty is having it operate well and consistently.

      ...

      needs a helluva lot of work to make it better.

      It's a parlor trick. It's designed to make you think that there is more happening than is actually happening.

      It's neat, sure, and there may even be some utility, but it will never be the thing they want you to believe it is. Neither will it get there through incremental improvements as the faithful will undoubtedly claim. You need something fundamentally different. Think of it this way: no matter how good you get at making ladders, you will never reach the moon.

      • ...no matter how good you get at making ladders, you will never reach the moon.

        Thanks! I love that metaphor. I'm gonna use it. Do you happen to know who said it? Who should I attribute it to?

        • by narcc ( 412956 )

          Henry Moriarty was the first to use the metaphor, but to describe an absurd proposition. He was the navigator on the Great Eastern, the ship that laid the first trans-Atlantic telegraph lines. That was in a letter to the editor of the London Standard dated March 7th, 1970:

          "It is certain that the originators of the Atlantic Cable, to whom is due the honour of being the pioneers of ocean telegraphy (when their scheme ranked in public opinion, only one degree in the scale of absurdity below that of raising a ladder to the moon) imagined the success would be rewarded by great and permanent remuneration." -- Henry Moriarty

          As far as who first used it as an indictment of incrementalism, I'm tempted to take credit, but it seems a little too obvious. Though it is very often used to describe the power of incrementalism, the strength of man's ingenuity, the

      • by jd ( 1658 )

        Agreed. A neural net (roughly 100,000 times larger than the world's fastest computer can simulate in real time) with the connectome of the human brain MIGHT be capable of producing genuinely intelligent responses, but that's currently unproven. (We simply don't know what it would take to program a neural net of that complexity, never having built one.)

    • by lsllll ( 830002 )
      Can't believe you gave them your phone number!
      • I signed up for OpenAI long before they started asking for phone numbers. OpenAI has limited resources available & I suspect that they want a phone number to help prevent bots from creating accounts overloading it.
      • Given I live in a country where I receive maybe three SPAM texts per year (and zero robocalls), I'd say I'm good with that.

    • The bot gracefully avoids political question, financial advice as well as current events questions. That's nice.

      That's because they've carefully kept it away from the general public. You have to petition for access.

      I tried a cascading topic (ask a question, then expand on it using part of the answer as the next question). The experiment failed.

      I've seen what happens when it doesn't throw an error — it becomes excessively repetitive in a way that a human wouldn't unless it was being a smartass, and it clearly isn't.

    • Honestly, I think the above practically spells the doom of Indian call centres.
  • ...what it thinks about the recent new content on Updog?

  • Tech Grifters need a new shiny object to sell... Hmmm... I know!

    "Invest" in "AI."
  • "confidently presenting false or invented information as fact". Next step for this AI: running for public office.
  • It can take a statistical guess what explanation fits what code it gets "asked" about and regurgitate that, maybe with some no-insight edits. That is all it can do.

  • But isn't human's understanding of the world also effectively stochastic parrots?
