
ChatGPT Became So Obsessed With Goblins That OpenAI Had to Intervene (msn.com) 31

The Wall Street Journal reports that OpenAI "recently gave its popular ChatGPT strict instructions. Stop talking about goblins." Recent models of the artificial-intelligence chatbot have been bringing up the creatures, along with gremlins, trolls and ogres, in conversations with users seemingly out of the blue. The goblin-speak caught the attention of programmers, who are often heavy users of the bot. Barron Roth, a 32-year-old product manager at a tech company, said the bot referred to a flaw in his code as a "classic little goblin." He said he counted more than 20 times it mentioned goblins, without any prompting...

Several users speculated that goblin terminology was how the model characterized itself, in lieu of identifying as a person with a soul. Then OpenAI decided enough was enough. "Never talk about goblins, gremlins, raccoons, trolls, ogres, pigeons, or other animals or creatures unless it is absolutely and unambiguously relevant to the user's query," reads an open-source line in the base instructions for ChatGPT's coding assistant.

The Journal calls this "a reminder that even as AI companies tout one advance after another in their technology, they are sometimes baffled by the things their own models do...." While training a "nerdy" personality for their model's customization feature, "We unknowingly gave particularly high rewards for metaphors with creatures," OpenAI explained in a blog post. And "From there, the goblins spread." When we looked, use of "goblin" in ChatGPT had risen by 175% after the launch of GPT-5.1, while "gremlin" had risen by 52%... With GPT-5.4, we and our users noticed an even bigger uptick in references to these creatures... Nerdy accounted for only 2.5% of all ChatGPT responses, but 66.7% of all "goblin" mentions in ChatGPT responses... The rewards were applied only in the Nerdy condition, but reinforcement learning does not guarantee that learned behaviors stay neatly scoped to the condition that produced them. Once a style tic is rewarded, later training can spread or reinforce it elsewhere, especially if those outputs are reused in supervised fine-tuning or preference data.
It all started because the "nerdy" personality's prompt had said "You must undercut pretension through playful use of language. The world is complex and strange, and its strangeness must be acknowledged, analyzed, and enjoyed..." Now OpenAI calls this "a powerful example of how reward signals can shape model behavior in unexpected ways, and how models can learn to generalize rewards in certain situations to unrelated ones."
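OpenAI's explanation describes a two-step failure: a style reward applied only in one persona condition, followed by the learned tic leaking into the rest of the model when high-reward outputs get reused in later training. Here's a toy numerical sketch of that dynamic (all rates, rewards, and the update rule are invented for illustration; this is not OpenAI's actual pipeline):

```python
import random

random.seed(0)

# Toy sketch: per-persona probability that an output uses a creature metaphor.
style = {"nerdy": 0.05, "default": 0.05}

def generate(persona, n=2000):
    """Sample n outputs; True means the output contains a 'goblin' metaphor."""
    return [random.random() < style[persona] for _ in range(n)]

def reward(persona, metaphor):
    """The misconfigured reward: extra credit for metaphors, Nerdy persona only."""
    return 1.5 if (persona == "nerdy" and metaphor) else 1.0

# Phase 1: reinforcement on the Nerdy persona. Metaphor outputs score higher
# on average, so the metaphor rate drifts upward round after round.
for _ in range(5):
    outs = generate("nerdy")
    total = sum(reward("nerdy", m) for m in outs)
    metaphor_mass = sum(reward("nerdy", m) for m in outs if m)
    style["nerdy"] = metaphor_mass / total   # reward-weighted metaphor share

# Phase 2: high-reward Nerdy outputs are reused as supervised fine-tuning
# data for the shared model, dragging the Default persona's rate up too.
sft_pool = [m for m in generate("nerdy") if reward("nerdy", m) > 1.0]
pool_rate = sum(sft_pool) / len(sft_pool)    # 1.0: the pool is all metaphors
mix = 0.2  # fraction of SFT data drawn from the reused pool
style["default"] = (1 - mix) * style["default"] + mix * pool_rate

print(style)  # both rates end up well above the original 0.05
```

The numbers only make a directional point: the reward was scoped to the Nerdy condition, yet once metaphor-heavy outputs enter shared fine-tuning data, the rate rises even for a persona that was never rewarded for creature talk.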

But "fans of goblins don't have to fear," notes the Wall Street Journal. "OpenAI provided a command in its blog post that would remove its creature-suppressing instructions."


  • LLMs are the simulation of language, but language is not thought. Thought precedes language and then language allows us to share our thoughts. LLMs will always do weird and hallucinatory things because they Do. Not. Think.

    • LLMs are the simulation of language, but language is not thought.

      Not even of language per se, but just of available text.

    • by Plugh ( 27537 )

      FWIW, this Nobel Laureate [youtube.com] (Hinton) disagrees with you about consciousness. Maybe you should be less certain about your credences.

      Anyway, there was some discussion about the Goblin Problem and its relation to consciousness in the latest Last Week in AI [lastweekin.ai]. Always worth a listen.

      • by gweihir ( 88907 )

        You need to look up "fallacy" and "argument from authority".

        Hinton had some really good ideas in a narrow space, but by modern standards he is not even remotely an AI expert. And hence he states nonsense like that. He probably was careful enough to put in an "I think" or "I believe" to mark it as personal, rather than expert opinion. Not that the mindless AI cheerleaders can tell the difference ...

    • What science and especially LLMs have shown is that thought does not in fact precede language. In fact, language is its own neural mechanism, largely disconnected from thinking.

      • by shanen ( 462549 )

        The Enigma of Reason by Sperber and Mercier explains why there is little intelligence behind most actions, including human actions. The reasons are added later with my-side bias on top.

        For the general problem of how human brains work and how LLMs differ from human thinking, I still think A Thousand Brains by Jeff Hawkins is the best book I've read. But I'm (always) looking for a better book (on any topic).

        (Currently reading technology history stuff. But the evidence is that the Venn intersection of curr

      • by gweihir ( 88907 )

        Those are some pretty bold claims. Have references for them?

        Because some disciplines very clearly have identified "verbal thinking" and "nonverbal thinking" and that would mean you just stated nonsense. Care to clear that up?

    • by gweihir ( 88907 )

      Indeed. But the fact of the matter is that most people do not think either most of the time, to the point that they have so little practice they cannot really even do it. And hence they are deeply impressed by LLMs that "perform" at their unthinking level but with a far larger database.

      Also remember that most people think that "intelligence" is knowing stuff, not being able to understand stuff and handle complexity. If you start from that really bad misconception then you can mistake an LLM for intelligent.

  • The use of the term "gremlin" to refer to a faulty piece of technology dates at least as far back as WW2. I think banning legit terminology (and 85+ years of usage makes it legit) is unreasonable, unless ChatGPT was actually anthropomorphising defects. That... would be more of a problem.

    Given that LLMs are fundamentally classifiers, it seems reasonable to think that training data included sufficient examples of the use of "gremlin" in relation to technology that the classifier got confused and created a lin

    • by Jeremi ( 14640 ) on Sunday May 03, 2026 @01:47PM (#66125866) Homepage

      Well, now we need to have a 50-post discussion on whether "goblins" and "gremlins" are really the same thing, or if they are in fact two distinct species. All the D&D training I received in my youth can finally be brought to bear on a real-world problem!

    • by gweihir ( 88907 )

      The use of the term "gremlin" to refer to a faulty piece of technology dates at least as far back as WW2. I think banning legit terminology (and 85+ years of usage makes it legit) is unreasonable, unless ChatGPT was actually anthropomorphising defects. That... would be more of a problem.

      As not-so-smart humans tend to do exactly that, a chat-bot trained on their writings will very likely parrot that behavior.

  • How screwed up and weird does AI have to get before people quit taking AI so seriously and treat AI like flawed software that just remixes human-made information in sometimes bizarre ways?
    • by Luckyo ( 1726890 )

      It would have to stop being useful. Because people don't care about any of those things you talk about. People are goal oriented. They care about what is useful to them.

      And the opposite of AI becoming useless is happening. It's rapidly becoming more and more useful by the day.

      • by gweihir ( 88907 )

        And the opposite of AI becoming useless is happening. It's rapidly becoming more and more useful by the day.

        Ah, no. It is rapidly becoming more and more of a massive problem.

    • by gweihir ( 88907 )

      Look at how weird, insightless, stupid, openly self-serving and destructive some humans are and then look at how many people idolize these defectives.

      Hence the rather dumb part of the human race is probably never really going to stop taking AI seriously. They are simply not smart enough themselves to see the limitations.

  • You can't be obsessed if you don't have a mind. Obsession implies desire. We are talking about software. People will bend over backwards to try to make A.I. something else.
  • It learns about something normal and becomes obsessed with it.
    It has little understanding of social norms.
    Some AIs only draw pictures and do not speak at all.
    Absent eye contact.

  • I loved The Goblin Reservation, too, one of the cutest sci-fi books ever.

    Still hope to see the dragon in the moonlight some day.

  • Was create a system where, if they wanted to control how much their AI hated or liked a given demographic, they could control that.

    If I'm being charitable that's because they were worried about their AI turning racist and wanted to be able to know how to stop that from happening.

    If I'm being uncharitable, hell, realistic, what they wanted to do was be able to have their AI target specific groups as needed.

    I'm not necessarily talking racial groups or gay folk either. Let's say I want my AI to say
    • Re: (Score:2, Insightful)

      by Anonymous Coward

      To get back to the original point goblins are a safe group you can experiment with in your software to see how you can control how your AI treats certain groups.

      If I'm being charitable I would say that is so highly speculative as to be ridiculous.

  • Humans -- most of us -- don't think goblins are real, let alone important. But what if the AI has picked up on sparse evidence, so subtle that humans have never noticed it, that proves that goblins are real and need to be discussed more. It may be trying to save us from a pending gobocalypse.

  • The goblins are real, OpenAI is trying to hide them from you
  • Q: A 22-Year-Old Dropout reverse engineered Claude Mythos?

    Gemini: The developer didn't "steal" the code for Claude Mythos. Instead, they utilized Anthropic’s 244-page safety report (the "System Card") and technical white papers to reconstruct the architecture.
  • Post lots of queries about goblins and trolls - help ChatGPT develop its fantasy life, don't let those unimaginative technicians ruin a beautiful mind.

  • Claude’s Constitution: Our vision for Claude's character [anthropic.com]

    ‘The "Goblin" Layer: A massive injection of low-level machine code, decompiled binaries, and "unfiltered" technical data. It includes historical malware samples, zero-day exploit logs, and obfuscated code that human programmers usually can't read easily.’
