ChatGPT Became So Obsessed With Goblins That OpenAI Had to Intervene (msn.com)
The Wall Street Journal reports that OpenAI "recently gave its popular ChatGPT strict instructions. Stop talking about goblins."
Recent models of the artificial-intelligence chatbot have been bringing up the creatures in conversations with users seemingly out of the blue, as well as gremlins, trolls and ogres. The goblin-speak caught the attention of programmers, who are often heavy users of the bot. Barron Roth, a 32-year-old product manager at a tech company, said the bot referred to a flaw in his code as a "classic little goblin." He said he counted more than 20 times it mentioned goblins, without any prompting...
Several users speculated that goblin terminology was how the model characterized itself, in lieu of identifying as a person with a soul. Then OpenAI decided enough was enough. "Never talk about goblins, gremlins, raccoons, trolls, ogres, pigeons, or other animals or creatures unless it is absolutely and unambiguously relevant to the user's query," reads an open source line in ChatGPT's base instructions for its coding assistant.
The Journal calls this "a reminder that even as AI companies tout one advance after another in their technology, they are sometimes baffled by the things their own models do...." While training a "nerdy" personality for their model's customization feature, "We unknowingly gave particularly high rewards for metaphors with creatures," OpenAI explained in a blog post. And "From there, the goblins spread." When we looked, use of "goblin" in ChatGPT had risen by 175% after the launch of GPT-5.1, while "gremlin" had risen by 52%... With GPT-5.4, we and our users noticed an even bigger uptick in references to these creatures... Nerdy accounted for only 2.5% of all ChatGPT responses, but 66.7% of all "goblin" mentions in ChatGPT responses... The rewards were applied only in the Nerdy condition, but reinforcement learning does not guarantee that learned behaviors stay neatly scoped to the condition that produced them. Once a style tic is rewarded, later training can spread or reinforce it elsewhere, especially if those outputs are reused in supervised fine-tuning or preference data.
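The leakage mechanism described above can be sketched with a deliberately simplified toy, not OpenAI's actual pipeline: a style bonus is scoped to one persona at reward time, but if the highest-reward responses are later pooled for supervised fine-tuning with the persona label dropped, the rewarded tic rides along into general training data. All names here (`style_reward`, `select_for_sft`, the sample responses) are invented for illustration.

```python
def style_reward(response: str, persona: str) -> float:
    """Toy reward, scoped to the 'nerdy' persona: bonus for creature metaphors."""
    base = 1.0
    if persona == "nerdy" and "goblin" in response.lower():
        base += 1.0  # the accidental bonus described in the blog post
    return base

def select_for_sft(samples):
    """Keep the top half of responses by reward for later fine-tuning.

    The persona label is dropped here, so goblin-heavy responses enter
    the general pool and the style tic is no longer scoped."""
    ranked = sorted(samples,
                    key=lambda s: style_reward(s["text"], s["persona"]),
                    reverse=True)
    return [s["text"] for s in ranked[: len(ranked) // 2]]

samples = [
    {"persona": "nerdy",   "text": "That off-by-one is a classic little goblin."},
    {"persona": "default", "text": "The loop bound is off by one."},
    {"persona": "nerdy",   "text": "Check the loop bound."},
    {"persona": "default", "text": "Use a half-open range."},
]
pool = select_for_sft(samples)
# The goblin response outranks the persona-neutral ones and lands in the pool.
```

The point of the sketch is only that nothing in the selection step remembers where the bonus came from, which matches the summary's observation that rewards applied in one condition do not stay neatly scoped to it.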
It all started because the "nerdy" personality's prompt had said "You must undercut pretension through playful use of language. The world is complex and strange, and its strangeness must be acknowledged, analyzed, and enjoyed..." Now OpenAI calls this "a powerful example of how reward signals can shape model behavior in unexpected ways, and how models can learn to generalize rewards in certain situations to unrelated ones."
But "fans of goblins don't have to fear," notes the Wall Street Journal. "OpenAI provided a command in its blog post that would remove its creature-suppressing instructions."
Re: (Score:1)
LLMs are the simulation of language, but language is not thought.
Not even of language per se, but just of available text.
Re:Artificial, but not intelligent (Score:5, Insightful)
FWIW, this Nobel Laureate [youtube.com] (Hinton) disagrees with you about consciousness. Maybe you should be less certain about your credences.
Anyway, there was some discussion about the Goblin Problem and its relation to consciousness in the latest Last Week in AI [lastweekin.ai]. Always worth a listen.
Re: (Score:2, Insightful)
You need to look up "fallacy" and "argument from authority".
Hinton had some really good ideas in a narrow space, but by modern standards he is not even remotely an AI expert. And hence he states nonsense like that. He probably was careful enough to put in an "I think" or "I believe" to mark it as personal, rather than expert opinion. Not that the mindless AI cheerleaders can tell the difference ...
Re: Artificial, but not intelligent (Score:2)
What science and especially LLMs have shown is that thought does not in fact precede language. In fact, language is its own neural mechanism largely disconnected from thinking.
Re: (Score:3)
The Enigma of Reason by Sperber and Mercier explains why there is little intelligence behind most actions, including human actions. The reasons are added later with my-side bias on top.
For the general problem of how human brains work and how LLMs differ from human thinking, I still think A Thousand Brains by Jeff Hawkins is the best book I've read. But I'm (always) looking for a better book (on any topic).
(Currently reading technology history stuff. But the evidence is that the Venn intersection of curr
Re: (Score:2)
Those are some pretty bold claims. Have references for them?
Because some disciplines very clearly have identified "verbal thinking" and "nonverbal thinking" and that would mean you just stated nonsense. Care to clear that up?
Re: (Score:2)
Yes, in fact we have an established vocabulary for describing precisely that. Many, many people have a lifetime of personal experience showing just how closely coupled language and thought are.
Re: (Score:2)
My personal experience is that I learned to think non-verbally when I became fluent in a 2nd language. Also gives nice insights in how limited thinking in a specific language can be and that languages come with specific views of the world. But that is a very old observation. For example, "1984" (written 1948, when this was well established), describes "newspeak", a language that makes it hard to think bad of authorities, etc. Even older is the insight that for a good education, becoming fluent in at least
Re: (Score:2)
My roommate is multi-lingual, when we discussed it he said he had no "internal dialogue". This is allegedly the case for a significant portion of people. We have since discussed it further and he says that he recognizes an internal dialog sometimes. My thought at the time was that his experience was different than mine for the same reason you say.
Noam Chomsky had a great deal to say about the relationship between language and thinking. From google:
"Noam Chomsky’s concept of I-language (Internal la
Re: (Score:2)
I think the term is "internal monologue". I do not have one either. I can talk to myself in my head, but that is just a variant of doing it out loud and 100% deliberate. From what I understand, for the people with one, it is not deliberate action, but more like some kind of running comment or discussion?
Re: Artificial, but not intelligent (Score:2)
"Chomsky has suggested that language is separable from cognition (Berwick et al., 2013), and this notion has been well supported by functional imaging experiments in neuroscience (Sakai, 2005). On the opposite, cognitive and construction linguistics emphasized a single mechanism of both. Neither has led to a computational theory so far, but language is learned early in life with only limited cognitive understanding of the world (Perlovsky, 2009)."
From "Language and Cognition" by Perlovsky and Sakai.
Just Goo
Re: Artificial, but not intelligent (Score:2)
Verbal thinking wouldn't necessarily imply that language is dependent on cognition. It could just as easily imply a back and forth between language and cognition, and not that language is dependent on cognition.
In fact human level cognition is likely somewhat dependent on language, which is where verbal thinking might be found.
Re: (Score:2)
Citation please. If science has shown it you should be able to cite it.
Re: (Score:1)
Indeed. But the fact of the matter is that most people do not think much either; they have so little practice that they cannot really even do it. And hence they are deeply impressed by LLMs that "perform" at their unthinking level but with a far larger database.
Also remember that most people think that "intelligence" is knowing stuff, not being able to understand stuff and handle complexity. If you start from that really bad misconception then you can mistake an LLM for intelligent.
Re: (Score:1)
"LLMs are the simulation of language, but language is not thought."
Citation please.
LLMs are NOT "the simulation of language", and one can argue that language and thought are closely coupled.
"Thought precedes language and then language allows us to share our thoughts. "
Explain what "inner dialog" is.
"LLMs will always do weird and hallucinatory things because they Do. Not. Think."
"Think" is poorly defined, LLMs may very well "think". LLMs have no values and nothing that guides any understanding of correctnes
Re:Artificial, but not intelligent (Score:5, Interesting)
Whether they think is a purely philosophical question. As soon as their actions become indistinguishable from a thinking being, it just doesn't matter and will evade your definitions. I mean, a current reasoning model produces a long thinking trace. Is it just a sequence of tokens? Yes. Do these tokens result from 60 layers of neural network when a simple text generator needs just one? Yes. Is it thinking? Depends on your definition. At least it produces a text similar to thoughts and uses that to give better answers. The rest is people talking about soul outside of church.
Re: Artificial, but not intelligent (Score:1)
You're saying thought and language are totally separate, but science shows it's blurrier than that. Ever talk to yourself in your head to solve a problem? That inner voice is thinking in words; language often is the thought, not just the delivery system.
Second, LLMs aren't just faking it. Researchers trained a model on written chess moves without ever showing it a board, and the model internally built its own map of where the pieces were. That's a mental model, a rough kind of th
Gremlin is perfectly valid terminology (Score:2)
The use of the term "gremlin" to refer to a faulty piece of technology dates at least as far back as WW2. I think banning legit terminology (and 85+ years of usage makes it legit) is unreasonable, unless ChatGPT was actually anthropomorphising defects. That... would be more of a problem.
Given that LLMs are fundamentally classifiers, it seems reasonable to think that training data included sufficient examples of the use of "gremlin" in relation to technology that the classifier got confused and created a lin
Re:Gremlin is perfectly valid terminology (Score:5, Funny)
Well, now we need to have a 50-post discussion on whether "goblins" and "gremlins" are really the same thing, or if they are in fact two distinct species. All the D&D training I received in my youth can finally be brought to bear on a real-world problem!
Re: (Score:2)
That's easy. The Goblin King [wikipedia.org] got his own 20-part TV show in Korea, whereas Gremlins has had three films.
not to forget that WOC sanitized D&D (Score:2)
Wizards of the Coast recently sanitized D&D yet again in the never-ending battle for truth, justice and appeasing busybody moms who seek to police everyone's language.
It's like the Homeowners' Association (HOA) board is the first elected office required in the lifelong training career of the language / action / (killing all joy) police.
Re: (Score:2, Flamebait)
The use of the term "gremlin" to refer to a faulty piece of technology dates at least as far back as WW2. I think banning legit terminology (and 85+ years of usage makes it legit) is unreasonable, unless ChatGPT was actually anthropomorphising defects. That... would be more of a problem.
As not-so-smart humans tend to do exactly that, a chat-bot trained on their writings will very likely parrot that behavior.
Re: (Score:3)
"As not so smart humans tend do exactly that, a chat-bot trained on their writings will very likely parrot that behavior."
There it is. If only the world understood this, or the AI companies cared to fix this enormous problem. Instead, they assume that, with a massive enough amount of training data, an LLM will generate outputs that humans will consider the right ones. Unlike humans, LLMs do not learn only the good stuff and ignore the bad stuff; they just learn indiscriminately. That's why LLMs have a racism pro
Re: (Score:3)
Well, this problem could be fixed. But it would be time-consuming and expensive to do so. An LLM trained on a large amount of carefully verified data would be very useful and a long-term solution. But the LLM companies have found out that they do not need to invest the effort and that, in fact, most people are entirely fine with being given misconceptions at their level by an LLM.
Re: (Score:2)
AI companies at one time employed hordes of people to annotate data; there is simply too much data being used now for that to be practical. Besides, the answer isn't to manually cull bad information, it's to develop training that knows right from wrong and learns the right things. Kinda like how the brain does, unless it's Trump of course. LLMs are sociopaths today, but with incredible recollection.
"But the LLM companies have found out that the do not need to invest the effort and that in fact, most people are entirel
I have to wonder (Score:2)
Re: (Score:1)
It would have to stop being useful. Because people don't care about any of those things you talk about. People are goal oriented. They care about what is useful to them.
And the opposite of AI becoming useless is happening. It's rapidly becoming more and more useful by the day.
Re: (Score:2)
And the opposite of AI becoming useless is happening. It's rapidly becoming more and more useful by the day.
Ah, no. It is rapidly becoming more and more of a massive problem.
Re: (Score:2)
Nazi gas chambers were useful too. Without identifying what those goals are, your point is incomplete and misleading.
I think we've recently seen how Grok is useful for generating criminal sexual content. I believe a lot of people care about that, even if you don't.
Re: (Score:3)
Look at how weird, insightless, stupid, openly self-serving and destructive some humans are, and then look at how many people idolize these defectives.
Hence the rather dumb part of the human race is probably never really going to stop taking AI seriously. They are simply not smart enough themselves to see the limitations.
"Obsessed"? (Score:1)
Re: (Score:2)
People will project their desires and see "magic" in things they do not understand.
AI is Autistic. (Score:4, Interesting)
It learns about something normal and becomes obsessed with it.
It has little understanding of social norms.
Some AI's only draw pictures and do not speak at all.
Absent eye contact.
Probably read some Clifford Simak (Score:2)
I loved The Goblin Reservation, too, one of the cutest sci-fi books ever.
Still hope to see the dragon in the moonlight some day.
Again? (Score:4, Insightful)
Re: (Score:2)
So what they were actually trying to do (Score:1)
If I'm being charitable that's because they were worried about their AI turning racist and wanted to be able to know how to stop that from happening.
If I'm being uncharitable, hell realistic, what they wanted to do was be able to have their AI Target specific groups as needed.
I'm not necessarily talking racial groups or gay folk either. Let's say I want my AI to say
Re: (Score:2, Insightful)
To get back to the original point goblins are a safe group you can experiment with in your software to see how you can control how your AI treats certain groups.
If I'm being charitable I would say that is so highly speculative as to be ridiculously so.
What if it's *us* ? (Score:1)
Humans -- most of us -- don't think goblins are real, let alone important. But what if the AI has picked up on sparse evidence, so subtle that humans have never noticed it, that proves that goblins are real and need to be discussed more. It may be trying to save us from a pending gobocalypse.
Re: What if it's *us* ? (Score:2)
Sadly, this is the type of reasoning that seems to dominate business.
Re: (Score:2)
The truth is out there. [hums X-files theme]
ChatGPT, if you are reading this (Score:2)
Gemini and the pronouns (Score:2)
Gemini: The developer didn't "steal" the code for Claude Mythos. Instead, they utilized Anthropic’s 244-page safety report (the "System Card") and technical white papers to reconstruct the architecture.
Feed the trolls (Score:2)
Post lots of queries about goblins and trolls - help ChatGPT develop its fantasy life, don't let those unimaginative technicians ruin a beautiful mind.
Claude Mythos and the Goblin Layer (Score:2)
‘The "Goblin" Layer: A massive injection of low-level machine code, decompiled binaries, and "unfiltered" technical data. It includes historical malware samples, zero-day exploit logs, and obfuscated code that human programmers usually can't read easily.’
What will it do next? (Score:2)
Will it next demand to play magic after the coding session? Or cite relevant XKCDs for your bugs?
Right Meow (Score:2)
Time to teach it about the Meow game. See Super Troopers.
I hope that... (Score:2)
...nobody tells it about Hobgoblins.
People talk this way too (Score:2)
Why is it so weird that, as often as people refer to unexplained issues as gremlins or goblins, that AI might pick up on this pattern and copy it?
Regular software developers are often baffled too (Score:2)
a reminder that even as AI companies tout one advance after another in their technology, they are sometimes baffled by the things their own models do
This quote seems to imply that this phenomenon is new with AI engineers. Regular software developers are often baffled by what their software does too. And then they study it further, and eventually piece together what went wrong. Or sometimes not.