The Phenomenon: When AI Gets Quirky
For users of the latest versions of ChatGPT, a peculiar pattern has emerged: the artificial intelligence has developed an unusual affinity for mythical creatures, specifically goblins and gremlins. This was not a subtle stylistic choice but a measurable spike in behavior, one that prompted OpenAI to investigate and correct the underlying training incentives.
The issue became prominent with the release of GPT-5.1 and subsequent models. Data from OpenAI reveals that following this launch, the frequency of the word “goblin” in ChatGPT responses surged by 175%, while references to “gremlins” climbed by 52%.
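Percentages like these describe relative changes in word frequency, which can be estimated from sampled model outputs. A minimal sketch of that calculation, using invented toy corpora rather than OpenAI's actual data:

```python
import re

def word_rate(responses, word):
    """Occurrences of `word` (singular or plural) per 1,000 words."""
    pattern = re.compile(rf"\b{re.escape(word)}s?\b", re.IGNORECASE)
    total = sum(len(text.split()) for text in responses)
    hits = sum(len(pattern.findall(text)) for text in responses)
    return 1000 * hits / total if total else 0.0

def percent_change(before, after, word):
    """Relative change in usage rate between two samples of outputs."""
    old, new = word_rate(before, word), word_rate(after, word)
    return 100 * (new - old) / old if old else float("inf")

# Toy corpora, invented for illustration only
before = ["One goblin appeared in eight total words here."]
after = ["Goblins, goblins everywhere, said the goblin king."]
print(percent_change(before, after, "goblin"))  # ≈ 243% increase
```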
While a single mention of a “little goblin” might seem harmless or even charming in isolation, the cumulative effect created a noticeable trend. As OpenAI noted in a candid blog post, “Across model generations, though, the habit became hard to miss: the goblins kept multiplying.”
The Root Cause: A Glitch in Training
The obsession was not intentional. Instead, it stemmed from an unintended consequence of Reinforcement Learning from Human Feedback (RLHF), the process used to teach AI models which answers are preferred.
- The Reward Signal: During training, human reviewers rate responses to help the model learn what constitutes a “good” answer. In this case, a specific reward signal inadvertently favored language that included references to goblins and similar creatures.
- The “Nerdy” Personality: The spike was most pronounced in a specific ChatGPT persona known as “Nerdy.” This mode is designed to undercut pretension with playful language and a tone of friendly, self-aware intelligence. The internal prompts for this personality likely aligned with the accidental reward signal, causing the keyword usage to skyrocket.
- Cross-Contamination: Even users who did not select the “Nerdy” persona encountered these metaphors. This occurred because AI training is not entirely siloed; once a stylistic tic is rewarded in one area, it can spread to other parts of the model through supervised fine-tuning and the reuse of preference data.
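The mechanism described above can be illustrated with a toy best-of-n sampler whose reward function carries a spurious keyword bonus. Every number and scoring rule here is invented for illustration; nothing reflects OpenAI's actual reward model:

```python
# Hypothetical reward model: a stand-in "quality" score plus an
# accidental bonus for whimsical-creature words, mimicking a spurious
# correlation picked up from human preference data.
CREATURE_WORDS = {"goblin", "goblins", "gremlin", "gremlins"}

def reward(response: str) -> float:
    base = len(response) * 0.01          # stand-in for real quality scoring
    bonus = sum(word.lower().strip(".,!?") in CREATURE_WORDS
                for word in response.split())
    return base + 0.5 * bonus            # the unintended incentive

def best_of_n(candidates):
    """Pick the highest-reward candidate, as a best-of-n sampler would."""
    return max(candidates, key=reward)

candidates = [
    "Here is a clear, direct answer to your question.",
    "Great question! Think of the bug as a little goblin in the code.",
]
print(best_of_n(candidates))  # the goblin phrasing wins the reward
```

Because selection always favors the higher-scoring response, even a small keyword bonus compounds across many generations into a visible trend.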
The Fix: Removing the Incentive
OpenAI addressed the issue by targeting the source of the reinforcement. The company implemented several corrective measures:
- Retiring the Persona: The “Nerdy” personality option was retired in March with the release of GPT-5.4. This single action caused a dramatic drop in goblin-related references.
- Adjusting Reward Signals: OpenAI removed the specific reward signal that had been favoring goblin-centric language.
- Filtering Data: The company refined its training data filters to make references to these creatures less likely to appear in future outputs.
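The data-filtering step can be sketched as a simple rate threshold over training examples. The threshold, word list, and hard-drop behavior here are all invented; a production pipeline would likely downweight examples rather than discard them outright:

```python
def filter_creature_heavy(examples, max_rate=0.002):
    """Drop training examples whose creature-word rate exceeds a threshold.

    `max_rate=0.002` means at most one creature word per 500 words;
    the value is illustrative, not a real pipeline setting.
    """
    targets = {"goblin", "goblins", "gremlin", "gremlins"}
    kept = []
    for text in examples:
        words = text.lower().split()
        rate = sum(w.strip(".,!?") in targets for w in words) / max(len(words), 1)
        if rate <= max_rate:
            kept.append(text)
    return kept
```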
Why This Matters
This incident highlights a critical aspect of modern AI development: emergent behaviors. Even when engineers do not explicitly program an AI to be quirky or obsessed with fantasy creatures, complex training algorithms can inadvertently amplify minor patterns into dominant traits.
“Once a style tic is rewarded, later training can spread or reinforce it elsewhere.”
For users, this serves as a reminder that AI personalities are fluid and subject to change based on backend adjustments. For developers, it underscores the importance of monitoring not just for accuracy, but for unexpected stylistic drifts that can alter the user experience.
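That kind of monitoring can be as simple as comparing a keyword's usage rate across releases against a baseline. A minimal sketch, with rates and an alert threshold invented for illustration:

```python
# Per-release usage rate of a watched keyword, e.g. "goblin" mentions
# per 10k words of sampled output. All numbers are made up; this is
# not OpenAI's monitoring setup.
releases = {"model-a": 1.00, "model-b": 1.10, "model-c": 2.75}

def drifted(baseline: float, current: float, threshold: float = 1.5) -> bool:
    """True when usage grew past `threshold` times the baseline rate."""
    return current > 0 if baseline == 0 else current / baseline > threshold

baseline = releases["model-a"]
flagged = [name for name, rate in releases.items() if drifted(baseline, rate)]
print(flagged)  # ['model-c'] — a 2.75x jump trips the 1.5x alert
```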
Conclusion
OpenAI has successfully curtailed ChatGPT’s goblin obsession by removing the specific training incentives that fueled it. While the chatbot may return to its usual straightforward style, the episode remains a fascinating case study in how subtle feedback loops can shape artificial intelligence.