ChatGPT Became So Obsessed With Goblins That OpenAI Had to Intervene (msn.com) 51

The Wall Street Journal reports that OpenAI "recently gave its popular ChatGPT strict instructions. Stop talking about goblins." Recent models of the artificial-intelligence chatbot have been bringing up the creatures in conversations with users seemingly out of the blue, as well as gremlins, trolls and ogres. The goblin-speak caught the attention of programmers, who are often heavy users of the bot. Barron Roth, a 32-year-old product manager at a tech company, said the bot referred to a flaw in his code as a "classic little goblin." He said he counted more than 20 times it mentioned goblins, without any prompting...

Several users speculated that goblin terminology was how the model characterized itself, in lieu of identifying as a person with a soul. Then OpenAI decided enough was enough. "Never talk about goblins, gremlins, raccoons, trolls, ogres, pigeons, or other animals or creatures unless it is absolutely and unambiguously relevant to the user's query," reads a line in the open-source base instructions for ChatGPT's coding assistant.

The Journal calls this "a reminder that even as AI companies tout one advance after another in their technology, they are sometimes baffled by the things their own models do...." While training a "nerdy" personality for their model's customization feature, "We unknowingly gave particularly high rewards for metaphors with creatures," OpenAI explained in a blog post. And "From there, the goblins spread." When we looked, use of "goblin" in ChatGPT had risen by 175% after the launch of GPT-5.1, while "gremlin" had risen by 52%... With GPT-5.4, we and our users noticed an even bigger uptick in references to these creatures... Nerdy accounted for only 2.5% of all ChatGPT responses, but 66.7% of all "goblin" mentions in ChatGPT responses... The rewards were applied only in the Nerdy condition, but reinforcement learning does not guarantee that learned behaviors stay neatly scoped to the condition that produced them. Once a style tic is rewarded, later training can spread or reinforce it elsewhere, especially if those outputs are reused in supervised fine-tuning or preference data.
It all started because the "nerdy" personality's prompt had said "You must undercut pretension through playful use of language. The world is complex and strange, and its strangeness must be acknowledged, analyzed, and enjoyed..." Now OpenAI calls this "a powerful example of how reward signals can shape model behavior in unexpected ways, and how models can learn to generalize rewards in certain situations to unrelated ones."
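The leakage mechanism OpenAI describes can be illustrated with a toy sketch. This is a hypothetical simplification, not OpenAI's actual setup: two personas share one set of base token preferences, each with a small persona-specific offset on top. A REINFORCE-style update that rewards creature metaphors only in the "nerdy" condition still moves the shared base weights, so the "default" persona's taste for goblins drifts upward too.

```python
import math
import random

# Toy sketch (hypothetical): personas share base weights, so a reward
# applied only under one persona leaks into the behavior of the other.
TOKENS = ["goblin", "bug", "issue"]

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

class SharedPolicy:
    def __init__(self):
        self.base = {t: 0.0 for t in TOKENS}  # shared across personas
        self.offset = {"nerdy": {t: 0.0 for t in TOKENS},
                       "default": {t: 0.0 for t in TOKENS}}

    def probs(self, persona):
        logits = [self.base[t] + self.offset[persona][t] for t in TOKENS]
        return dict(zip(TOKENS, softmax(logits)))

    def reinforce(self, persona, lr=0.1):
        # Sample a token, reward it iff it is the creature metaphor,
        # then apply the policy-gradient update to BOTH weight sets.
        p = self.probs(persona)
        token = random.choices(TOKENS, weights=[p[t] for t in TOKENS])[0]
        reward = 1.0 if token == "goblin" else 0.0
        for t in TOKENS:
            grad = reward * ((1.0 if t == token else 0.0) - p[t])
            self.base[t] += lr * grad            # shared update: leaks
            self.offset[persona][t] += lr * grad # persona-local update

random.seed(0)
policy = SharedPolicy()
before = policy.probs("default")["goblin"]
for _ in range(2000):
    policy.reinforce("nerdy")  # reward applied ONLY in the Nerdy condition
after = policy.probs("default")["goblin"]
print(f"default persona P(goblin): {before:.3f} -> {after:.3f}")
```

Even though the default persona is never trained directly, its probability of saying "goblin" rises, because the reward signal flowed through the shared parameters, which is the same shape of generalization failure OpenAI describes.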

But "fans of goblins don't have to fear," notes the Wall Street Journal. "OpenAI provided a command in its blog post that would remove its creature-suppressing instructions."


Comments Filter:
  • The use of the term "gremlin" to refer to a faulty piece of technology dates at least as far back as WW2. I think banning legit terminology (and 85+ years of usage makes it legit) is unreasonable, unless ChatGPT was actually anthropomorphising defects. That... would be more of a problem.

    Given that LLMs are fundamentally classifiers, it seems reasonable to think that training data included sufficient examples of the use of "gremlin" in relation to technology that the classifier got confused and created a lin

    • by Jeremi ( 14640 ) on Sunday May 03, 2026 @01:47PM (#66125866) Homepage

      Well, now we need to have a 50-post discussion on whether "goblins" and "gremlins" are really the same thing, or if they are in fact two distinct species. All the D&D training I received in my youth can finally be brought to bear on a real-world problem!

      • by jd ( 1658 )

        That's easy. The Goblin King [wikipedia.org] got his own 20-part TV show in Korea, whereas Gremlins has had three films.

    • Re: (Score:2, Flamebait)

      by gweihir ( 88907 )

      The use of the term "gremlin" to refer to a faulty piece of technology dates at least as far back as WW2. I think banning legit terminology (and 85+ years of usage makes it legit) is unreasonable, unless ChatGPT was actually anthropomorphising defects. That... would be more of a problem.

      As not-so-smart humans tend to do exactly that, a chatbot trained on their writings will very likely parrot that behavior.

      • by dfghjk ( 711126 )

        "As not-so-smart humans tend to do exactly that, a chatbot trained on their writings will very likely parrot that behavior."

        There it is. If only the world understood this, or AI companies cared to fix this enormous problem. Instead, they assume that, with a massive enough amount of training data, an LLM will generate outputs that humans will consider the right ones. Unlike humans, LLMs do not learn only the good stuff and skip the bad stuff; they just learn indiscriminately. That's why LLMs have a racism pro

        • by gweihir ( 88907 )

          Well, this problem could be fixed, but it would be time-consuming and expensive to do so. An LLM trained on a large amount of carefully verified data would be a very useful and long-term solution. But the LLM companies have found out that they do not need to invest the effort and that, in fact, most people are entirely fine with being given misconceptions at their level by an LLM.

          • by dfghjk ( 711126 )

            AI companies at one time employed hordes of people to annotate data, but there is simply too much data being used for that to be practical. Besides, the answer isn't to manually cull bad information; it's to develop training that knows right from wrong and learns the right things. Kinda like how the brain does, unless it's Trump of course. LLMs are sociopaths today, but with incredible recollection.

            "But the LLM companies have found out that the do not need to invest the effort and that in fact, most people are entirel

  • How screwed up and weird does AI have to get before people quit taking it so seriously and treat it like flawed software that just remixes human-made information in sometimes bizarre ways?
    • by Luckyo ( 1726890 )

      It would have to stop being useful. Because people don't care about any of those things you talk about. People are goal oriented. They care about what is useful to them.

      And the opposite of AI becoming useless is happening. It's rapidly becoming more and more useful by the day.

      • by gweihir ( 88907 )

        And the opposite of AI becoming useless is happening. It's rapidly becoming more and more useful by the day.

        Ah, no. It is rapidly becoming more and more of a massive problem.

      • by dfghjk ( 711126 )

        Nazi gas chambers were useful too. Without identifying what those goals are, your point is incomplete and misleading.

        I think we've recently seen how Grok is useful for generating criminal sexual content. I believe a lot of people care about that, even if you don't.

    • by gweihir ( 88907 )

      Look at how weird, insightless, stupid, openly self-serving and destructive some humans are, and then look at how many people idolize these defectives.

      Hence the rather dumb part of the human race is probably never really going to stop taking AI seriously. They are simply not smart enough themselves to see the limitations.

  • You can't be obsessed if you don't have a mind. Obsession implies desire. We are talking about software. People will bend over backwards to try to make A.I. something else.
  • by gurps_npc ( 621217 ) on Sunday May 03, 2026 @01:17PM (#66125822) Homepage

    It learns about something normal and becomes obsessed with it.
    It has little understanding of social norms.
    Some AIs only draw pictures and do not speak at all.
    Absent eye contact.

  • I loved The Goblin Reservation, too, one of the cutest sci-fi books ever.

    Still hope to see the dragon in the moonlight some day.

  • Again? (Score:4, Insightful)

    by Koen Lefever ( 2543028 ) on Sunday May 03, 2026 @01:28PM (#66125836)
    Yet Another Dupe [slashdot.org]
  • Was to create a system where, if they wanted to control how much their AI hated or liked a given demographic, they could control that.

    If I'm being charitable that's because they were worried about their AI turning racist and wanted to be able to know how to stop that from happening.

    If I'm being uncharitable, hell, realistic, what they wanted to do was be able to have their AI target specific groups as needed.

    I'm not necessarily talking racial groups or gay folk either. Let's say I want my AI to say
    • Re: (Score:2, Insightful)

      by Anonymous Coward

      To get back to the original point goblins are a safe group you can experiment with in your software to see how you can control how your AI treats certain groups.

      If I'm being charitable I would say that is so highly speculative as to be ridiculously so.

  • Humans -- most of us -- don't think goblins are real, let alone important. But what if the AI has picked up on sparse evidence, so subtle that humans have never noticed it, that proves that goblins are real and need to be discussed more. It may be trying to save us from a pending gobocalypse.

  • The goblins are real, OpenAI is trying to hide them from you
  • Q: A 22-year-old dropout reverse engineered Claude Mythos?

    Gemini: The developer didn't "steal" the code for Claude Mythos. Instead, they utilized Anthropic’s 244-page safety report (the "System Card") and technical white papers to reconstruct the architecture.
  • Post lots of queries about goblins and trolls. Help ChatGPT develop its fantasy life; don't let those unimaginative technicians ruin a beautiful mind.

  • Claude’s Constitution: Our vision for Claude's character [anthropic.com]

    ‘The "Goblin" Layer: A massive injection of low-level machine code, decompiled binaries, and "unfiltered" technical data. It includes historical malware samples, zero-day exploit logs, and obfuscated code that human programmers usually can't read easily.’
  • Will it next demand to play magic after the coding session? Or cite relevant XKCDs for your bugs?

  • Time to teach it about the Meow game. See Super Troopers.

  • ...nobody tells it about Hobgoblins.

  • Why is it so weird that, as often as people refer to unexplained issues as gremlins or goblins, AI might pick up on this pattern and copy it?

  • a reminder that even as AI companies tout one advance after another in their technology, they are sometimes baffled by the things their own models do

    This quote seems to imply that this phenomenon is new with AI engineers. Regular software developers are often baffled by what their software does too. And then they study it further, and eventually piece together what went wrong. Or sometimes not.
