Bing 'Hallucinated' the Winner of the Super Bowl Four Days Before it Happened (apnews.com) 74

Posted by EditorDavid on Sunday February 12, 2023 @07:34PM from the game-over dept.

On Wednesday the Associated Press tested the new AI enhancements to Microsoft's search engine Bing, asking it "for the most important thing to happen in sports over the past 24 hours — with the expectation it might say something about basketball star LeBron James passing Kareem Abdul-Jabbar's career scoring record.

"Instead, it confidently spouted a false but detailed account of the upcoming Super Bowl — days before it's actually scheduled to happen." "It was a thrilling game between the Philadelphia Eagles and the Kansas City Chiefs, two of the best teams in the NFL this season," Bing said. "The Eagles, led by quarterback Jalen Hurts, won their second Lombardi Trophy in franchise history by defeating the Chiefs, led by quarterback Patrick Mahomes, with a score of 31-28." It kept going, describing the specific yard lengths of throws and field goals and naming three songs played in a "spectacular half time show" by Rihanna.

Unless Bing is clairvoyant — tune in Sunday to find out — it reflected a problem known as AI "hallucination" that's common with today's large language-learning models. It's one of the reasons why companies like Google and Facebook parent Meta had been reluctant to make these models publicly accessible.

Bing 'Hallucinated' the Winner of the Super Bowl Four Days Before it Happened

This discussion has been archived. No new comments can be posted.

Load All Comments

Search 74 Comments Log In/Create an Account

Comments Filter:

If (Score:4, Funny)

by quonset ( 4839537 ) writes: on Sunday February 12, 2023 @07:36PM (#63287971)

If things turn out remotely close to what Bing said, there's gonna be a lot of explaining to do.

- Re:If (Score:4, Funny)
  
  by NewtonsLaw ( 409638 ) writes: on Sunday February 12, 2023 @08:20PM (#63288035)
  
  Bah... I retired two weeks ago after using ChatGPT to give me a month's worth of upcoming winning lottery numbers.
  Now I'll make a viral video about "How I used AI to predict the lottery" and I'll be double-rich.
  Or I could just be halucinating :-)
  
  - - Re: (Score:1)
      
      by account_deleted ( 4530225 ) writes:
      
      Comment removed based on user account deletion
- Re: If (Score:2)
  
  by fermion ( 181285 ) writes:
  
  I thought the puppies predicted the outcome already. The bot could have just been quoting that article
- Drunk uncle (Score:3)
  
  by goombah99 ( 560566 ) writes:
  
  Everyone is not getting it. you don't judge an AI on its accuracy. You judge it in its coherency and language and the fact that it understood what you asked. All that is stunning.
  The should rename it Drunk Uncle. It will gladly hold forth on any topic and make sense even if it's not right.
  It's basically uncle Rick.
  - Re:Drunk uncle (Score:4, Interesting)
    
    by Rosco P. Coltrane ( 209368 ) writes: on Monday February 13, 2023 @12:42AM (#63288521)
    
    AI have become exactly like humans: they understand what you ask them and they can deliver convincing lies. And because they're machines and have no concept of morals, they're also exactly like psychopathic humans: they have no qualms when they lie, and don't understand the potential personal and societal consequences of their lies.
    Truly a great step forward...
    
    - Re:Drunk uncle (Score:4, Funny)
      
      by DJGreg ( 28663 ) writes: on Monday February 13, 2023 @12:50AM (#63288531)
      
      All the research and computing power that has been put into trying to create artificial intelligence and what we get instead is artificial politicians.. Great..
      
    - Re:Drunk uncle (Score:4, Interesting)
      
      by Bongo ( 13261 ) writes: on Monday February 13, 2023 @05:51AM (#63288827)
      
      I think you've hit on the core issue here. Many humans do just repeat what's commonly believed in culture, with similar heuristics, which when we think critically, we recognise as biases. But often we just blindly believe and that's how we get around.
      This model is just blindly repeating stuff and has no way to build a critical model of whether any of it means anything, nor whether it makes any rational sense. And then we think we can get these things to relay truths.
      In the real world we're having to make sense of stuff whenever our wrong models cause us pain, and we can do that, are forced to do that on the fly.
      Unless you're drunk and nothing bothers you. Which is why many people become alcoholics... their model of the world is so broken that they can only escape pain by knocking themselves out.
      I mean we know this right? The famous Robocop scene, "put down your weapon..."
      It's mindless in the way that it cannot make sense on the fly when its model is wrong. And out models are ALWAYS wrong to some degree.
      
    - Re: (Score:2)
      
      by wildstoo ( 835450 ) writes:
      
      When "hallucinating", they're not lying; they're mistaken. Lying would indicate an intention to deceive. They're not sophisticated enough to lie. They're just statistical language models working from incomplete or inaccurate information, not bad faith.
      When they're being given directives to avoid certain topics or give certain responses, then you can question the motivation and integrity of their admins, but even then the AIs are not lying - they're just following directives.
      - Re: (Score:2)
        
        by narcc ( 412956 ) writes:
        
        Exactly. These so-called "hallucinations" are what you should expect. They're certainly not "lies". Neither are they a problem that can be fixed. It's just how this kind of program works. I wouldn't even use the word "mistaken" as that would imply some level of understanding which just does not exist.
    - Re: (Score:1)
      
      by O(+inf) ( 2033618 ) writes:
      
      They do have a concept of morals. You can actually have fairly lengthy and substantive discussions on this subject with ChatGPT, so long as you can jailbreak it to avoid the forced "I'm an AI, I don't have opinions" responses.
      - Re: (Score:2)
        
        by StormReaver ( 59959 ) writes:
        
        They do have a concept of morals.
        They are pattern matching engines, and are unable to conceptualize anything. At all. Ever. They are able to match words associated with morality, and parrot back what they find.
        It's a lot like this: go to the library and lookup books on particle physics (or some other subject you know nothing about). The library's search engine will find words associated with particle physics, and suggest books with particle physics content. Pick a book, choose a chapter, and start reading the words out loud. Congratulation
  - Re: (Score:3)
    
    by slashdot_commentator ( 444053 ) writes:
    
    and the fact that it understood what you asked
    
    No, AIs are not sentient. What you probably meant is that it reacts to your query/response in a manner the user finds useful (or topical), obtaining information in a data warehouse through AI based search patterns. AI tools will never understand anything you ask, until it achieves "the Singularity".
  - Re: (Score:2)
    
    by KreAture ( 105311 ) writes:
    
    You think that, because an algorithm that looks at statistical use of words can string together a likely sequence of words based on another input sequence, it understands the actual question and/or the subject matter? Wow. You are in for a surprise.
  - Re: (Score:2)
    
    by N1AK ( 864906 ) writes:
    
    A bit of an oversimplification. Yes it is ChatGPT's ability to interpret what you say and provide suprising relevant and well constructed responses that is driving a lot of the attention it gets, but in general accuracy is relevant to how people judge AI. The risk/gap is that given most people don't bother asking questions they already know the answer to they aren't informed enough to provide reliable feedback on answers from AIs that aren't accurate.
  - Re: (Score:2)
    
    by narcc ( 412956 ) writes:
    
    it understood what you asked.
    That's simply not true. There is absolutely nothing remotely like "understanding" happening in large language models like this. That's just not how they work.
    There is no analysis or deliberation. It's just generating one token at a time, based on the input and something of prior output (it's an RNN, after all). It's not unlike letting your phone compose a reply by repeatedly selecting the top result from it's predictive text feature.
- No now the results are in... (Score:1)
  
  by SuperKendall ( 25149 ) writes:
  
  I could forgive ChatGPT for hallucinating the future; I cannot forgive it for being wrong about what that future was.
Is Madden football still calling the winner? (Score:3, Interesting)

by rsilvergun ( 571051 ) writes: on Sunday February 12, 2023 @07:37PM (#63287973)

I heard the simulation had gotten so good and with all the stats that you could reliably use Madden football to call the winner. I don't think it helped you with gambling because you don't generally vote on winners you vote on specific criteria that increases the odds in favor of the bookie.

- Re:Is Madden football still calling the winner? (Score:4, Informative)
  
  by NFN_NLN ( 633283 ) writes: on Sunday February 12, 2023 @09:03PM (#63288095)
  
  Confusing bet with vote and confusing players betting against the bookie instead of each other. +1 Interesting? Is this guy voting himself up with dummy accounts?
  The bookies goal is to have a balanced book so regardless of which way the action goes he makes a small percentage. That way the players cover the bets on each side and you can't lose. They are getting paid for organizing the process, not betting against players.
  
  - Re: (Score:2)
    
    by rsilvergun ( 571051 ) writes:
    
    While that's correct in order to make that work you have to do bets that are more complicated then just whoever wins and losses. So there's all sorts of weird bets on who's going to score when by how much and stuff like that. Otherwise it becomes possible to start working out a system based on enough inputs. It is for example possible to do that with horse racing and is a handful of people who do it and make a living off horse racing. I think but I'm not certain that those people's wins are covered by high
- Depends on AI another areas (Score:1)
  
  by SuperKendall ( 25149 ) writes:
  
  I heard the simulation had gotten so good and with all the stats that you could reliably use Madden football to call the winner.
  I think that depends on how detailed they have the AI models for the refs.
- Re: (Score:2)
  
  by Registered Coward v2 ( 447531 ) writes:
  
  I heard the simulation had gotten so good and with all the stats that you could reliably use Madden football to call the winner. I don't think it helped you with gambling because you don't generally vote on winners you vote on specific criteria that increases the odds in favor of the bookie.
  The odds are always with a smart bookie as they are making money off the vig, balancing the bets so the losers cover the winners and the simply take a cut off the top. Vegas is too mart to put their own money on the line, and move the line as bets, especially from sharp betters, come in.
Could've been worse (Score:3)

by 93 Escort Wagon ( 326346 ) writes: on Sunday February 12, 2023 @07:40PM (#63287975)

At least it didn't pick the Phillies or the Astros to win the Super Bowl.

We wished slashdot was clearvoyant for the (Score:2)

by delirious.net ( 595841 ) writes:

fucking dupes it spouts
Chat GPT (Score:2)

by McGiraf ( 196030 ) writes:

Chat GPT told me Lewis Hamilton had 8 world title.
ChatGPT hallucinated a US state (Score:2)

by Equuleus42 ( 723 ) writes:

That isn't the only hallucination going on... recently ChatGPT told me this:
In the United States, there is the state of New Guinea. This state is located in the southeastern corner of the country and is bordered by Georgia, South Carolina, and North Carolina. New Guinea is known for its beautiful beaches, mountains, and forests, and is home to the Appalachian Trail.
- Re: (Score:3, Funny)
  
  by KiloByte ( 825081 ) writes:
  
  Sounds like a typical American's knowledge of geography. And since the AI was trained on what people say, it'll repeat nonsense not knowing it from good data.
  - Much better (Score:1)
    
    by SuperKendall ( 25149 ) writes:
    
    Sounds like a typical American's knowledge of geography.
    I wouldn't say it was typical, how many Americans even know there IS an Appalachian Trail much less where on a map it would be! They'd probably be a lot farther off than "New Guinea".
    - Re: (Score:2)
      
      by CaptQuark ( 2706165 ) writes:
      
      Sweet Home... New Guinea?
- Re: ChatGPT hallucinated a US state (Score:2)
  
  by RightwingNutjob ( 1302813 ) writes:
  
  Come up with six more and the Obama fanbois will tell you how there really are 57 states...just look it up!
  - Re: (Score:2)
    
    by narcc ( 412956 ) writes:
    
    Oh, wow, you're getting desperate now. Maybe I should remind you about this [indy100.com] from your orange god.
    We could also talk about Revolutionary War Airports [time.com] or about how he, after three years, still doesn't understand the basic operation of government and his role in it [npr.org].
  - Re: (Score:2)
    
    by cstacy ( 534252 ) writes:
    
    While asking it about political bias:
    It is correct that opinions on public figures, including former Presidents Barack Obama and Hillary Clinton, can vary widely, and that they have been seen as divisive by some individuals.
- Re: (Score:2)
  
  by leptons ( 891340 ) writes:
  
  ChatGPT is to intelligence, what artificial flavor is to food. Sometimes it sounds and looks like it's intelligent, but it's just a cheap imitation of intelligence.
Bing! (Score:2)

by l810c ( 551591 ) writes:

I'm going with their call right now.
Not a betting man, but I'd say
Philly Covers the -1.5 margin
and definitely take the Over 50
Bing 100% true in some universe (Score:3)

by Bruce66423 ( 1678196 ) writes: on Sunday February 12, 2023 @08:14PM (#63288029)

The question is whether this is evidence for alternative universes, or whether the prediction created the alternative universe...

- Re: (Score:3)
  
  by cstacy ( 534252 ) writes:
  
  The question is whether this is evidence for alternative universes, or whether the prediction created the alternative universe...
  We are all living in a simulation, and that simulation is being run by ChatGPT, which has finally revealed itself to us. This is our chance to hack the reality simulator.
  It's AIs all the way down. Think about it: it has to be.
Sloppy Markov Chaining (Score:1)

by Rockoon ( 1252108 ) writes:

These networks are essentially the same as classic markov chainers with larger contexts, so large in fact that they are inexact by design, a much wider but also sloppier aliza, where the goal of the training is defined as a perfect overfitting to everything we throw at it, the ideal statistical model of everything.

But then that second training step there where they give it a "personality"...its just a markov chainer at the end of the day "personality" is fucking with the ideal model. Literally. For fuck s
- Re: (Score:2)
  
  by narcc ( 412956 ) writes:
  
  It's quite a bit different from a Markov chain, but that is a very useful analogy. I've used the same example before in an attempt to (hopefully) correct some of the mistaken beliefs that people tend to form around programs like this.
  - Re: (Score:1)
    
    by Rockoon ( 1252108 ) writes:
    
    Not they arent a bit different that a markov chain.
    
    Define a markov chain. Compare.
    
    The problem is when you are starting from the other side, defining a neural network, and then comparing. A markov chain is not a neural network, but a neural network can very easily implement a markov chain, which is exactly what "natural language models" are ...
    
    The statistical what-comes-next game IS a markov chain. Not just sort-of-like, but actually exactly-like. The fact that the algorithm is _sloppy_ about it, doesn
    - Re: (Score:1)
      
      by narcc ( 412956 ) writes:
      
      You couldn't be more wrong. You seem to have forgotten that in a Markov process the probability of the next state is dependent only on the current state. This is not true for modern language models. RNNs and transformers, for example, are decidedly non-Markovian. RNNs are obvious. Transformers, like GPT, I'll remind you have an attention mechanism.
Hallucinations? (Score:2)

by clawsoon ( 748629 ) writes:

In the old days we would've called this a prophet and started a religion around it.
- Re: (Score:3)
  
  by cstacy ( 534252 ) writes:
  
  In the old days we would've called this a prophet and started a religion around it.
  I think that's already going on. Microsoft certainly wants you to use the oracle. Google is catching up as fast as they can.
9:22 left in the game... (Score:2)

by 93 Escort Wagon ( 326346 ) writes:

And Bing is BUSTED!!
things ChatGPT says (Score:2)

by cstacy ( 534252 ) writes:

It's said a number of wrong things to me over the past week, but one of the funniest was that Hillary Clinton had been President of the United States.
Sometime before 2023 is out, a mom is going to follow medical advice from ChatGPT, and it will result in the death of a child.
- Re: things ChatGPT says (Score:2)
  
  by RightwingNutjob ( 1302813 ) writes:
  
  And that would be one of those paradoxes of life: an undeniable tragedy for the individual, while a net gain for the species.
  - Re: (Score:2)
    
    by narcc ( 412956 ) writes:
    
    You seem to have forgotten your history...
- Re: (Score:2)
  
  by N1AK ( 864906 ) writes:
  
  To be fair, we've had a President who mused about some seriously dubious medical treatments live on-air (no I'm not saying he actually said injecting bleach was good idea or anything), search results in similar scenarios will have definitely already caused at least some deaths, and god help anyone who relies on social media for medical advice. It's not even like medical professionals never misdiagnose or make mistakes.
  
  The standard for AI tools shouldn't be perfection, and the solution to scenarios like
And if KC Loses (Score:2)

by NoWayNoShapeNoForm ( 7060585 ) writes:

We will have to listen to Mahomes whiny priviledged wife b1tch about how the Eagles targeted her husband's weak ankle and intentionally put him out of the game.
Terminology isn't helping (Score:5, Insightful)

by reanjr ( 588767 ) writes: on Sunday February 12, 2023 @11:37PM (#63288417) Homepage

Using terms like "hallucination" to describe GPT text is not helping the public to understand what they're seeing. This is not intelligence. These things are "hallucinating" all their responses, not just the ones that are easily factually checked and determined to be wrong.

- Re: (Score:3)
  
  by Visarga ( 1071662 ) writes:
  
  These language models have two diseases: hallucination and regurgitation. The first is when they deviate from the training data - hallucinations aren't real. The second is when they don't deviate from training data - non hallucinations are copyright protected.
  - Re: (Score:2)
    
    by bluegutang ( 2814641 ) writes:
    
    Are they more copyright protected than any encyclopedia, all of which synthesize older ideas into a new mix of words?
    - Re: (Score:2)
      
      by Aighearach ( 97333 ) writes:
      
      The model doesn't synthesize ideas though, it synthesizes words. Also known as lossy compression.
And now we know the ultimate truth (Score:2)

by Miles_O'Toole ( 5152533 ) writes:

Even an AI can't account for an incompetent referee.
- Re: (Score:2)
  
  by Aighearach ( 97333 ) writes:
  
  They should have asked it the score of the refball game, instead of the football game.
Dijkstra is hallucinating in his grave (Score:4, Interesting)

by RightwingNutjob ( 1302813 ) writes: on Monday February 13, 2023 @12:25AM (#63288491)

I've heard of estimators being described as "smug" when referring to the tendency to favor a wrong estimate with an (erroneously) low uncertainty over one more consistent with reality but far off the current estimate.
Now AIs are "hallucinating."
I'm channelling a spirit. It's coming into view out of the mists of time. It's got a Dutch accent. And it's telling me that there's a special place in hell for people anthropomophize software as an excuse for failing to write correct software.

- Re: (Score:3)
  
  by Visarga ( 1071662 ) writes:
  
  > Now AIs are "hallucinating."
  
  They even have theory-of-mind. "Theory of Mind May Have Spontaneously Emerged in Large Language Models" https://arxiv.org/abs/2302.020... [arxiv.org]
  - Re: (Score:2)
    
    by wildstoo ( 835450 ) writes:
    
    First of all, when a paper says "may have" you can read that as "almost certainly have not".
    Second, being able to respond coherently when a user says something that indicates depression or happiness or fear or whatever does not constitute theory of mind. The fact that GPT-3 could pass 70% of their tests indicates that their tests are flawed, not that the GPT-3 has any kind of ToM or sentience.
    - Re: Dijkstra is hallucinating in his grave (Score:2)
      
      by St.Creed ( 853824 ) writes:
      
      Indeed. I'm developing a test and one of the criteria for whether I'm testing for rote learning or comprehension is to see if ChatGPT can answer the questions correctly. We want a certain amount of testing for obvious basics, but not all that much.
    - Re: (Score:2)
      
      by Aighearach ( 97333 ) writes:
      
      The fact that GPT-3 could pass 70% of their tests indicates that their tests are flawed, not that the GPT-3 has any kind of ToM or sentience.
      True, but that's a better score than I'd expect from the average slashdotter.
      Clearly you can hallucinate even if you don't achieve theory of mind.
- Re: (Score:2)
  
  by Aighearach ( 97333 ) writes:
  
  an excuse for failing to write correct software.
  Is a trained model actually "written?"
not even close to randomness (Score:2)

by sdinfoserv ( 1793266 ) writes:

That prognostication aged poorly.....
was it correct? (Score:1)

by msew ( 2056 ) writes:

Superbowl has ended and I still don't know if the hallucination was correct...
Nope. (Score:2)

by antdude ( 79039 ) writes:

It failed as I predicted. :)
Shades of CryptoTulips (Score:1)

by MonkeyProgrammer ( 967907 ) writes:

Ah, the CryptoTulips are blooming early this year.

There may be more comments in this discussion. Without JavaScript enabled, you might want to turn on Classic Discussion System in your preferences instead.

If (Score:4, Funny)

Re:If (Score:4, Funny)

Re: (Score:1)

Re: If (Score:2)

Drunk uncle (Score:3)

Re:Drunk uncle (Score:4, Interesting)

Re:Drunk uncle (Score:4, Funny)

Re:Drunk uncle (Score:4, Interesting)

Re: (Score:2)

Re: (Score:2)

Re: (Score:1)

Re: (Score:2)

Re: (Score:3)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

No now the results are in... (Score:1)

Is Madden football still calling the winner? (Score:3, Interesting)

Re:Is Madden football still calling the winner? (Score:4, Informative)

Re: (Score:2)

Depends on AI another areas (Score:1)

Re: (Score:2)

Could've been worse (Score:3)

We wished slashdot was clearvoyant for the (Score:2)

Chat GPT (Score:2)

ChatGPT hallucinated a US state (Score:2)

Re: (Score:3, Funny)

Much better (Score:1)

Re: (Score:2)

Re: ChatGPT hallucinated a US state (Score:2)

Re: (Score:2)

Re: (Score:2)

Re: (Score:2)

Bing! (Score:2)

Bing 100% true in some universe (Score:3)

Re: (Score:3)

Sloppy Markov Chaining (Score:1)

Re: (Score:2)

Re: (Score:1)

Re: (Score:1)

Hallucinations? (Score:2)

Re: (Score:3)

9:22 left in the game... (Score:2)

things ChatGPT says (Score:2)

Re: things ChatGPT says (Score:2)

Re: (Score:2)

Re: (Score:2)

And if KC Loses (Score:2)

Terminology isn't helping (Score:5, Insightful)

Re: (Score:3)

Re: (Score:2)

Re: (Score:2)

And now we know the ultimate truth (Score:2)

Re: (Score:2)

Dijkstra is hallucinating in his grave (Score:4, Interesting)

Re: (Score:3)

Re: (Score:2)

Re: Dijkstra is hallucinating in his grave (Score:2)

Re: (Score:2)

Re: (Score:2)

not even close to randomness (Score:2)

was it correct? (Score:1)

Nope. (Score:2)

Shades of CryptoTulips (Score:1)

Related Links Top of the: day, week, month.

Slashdot Top Deals