AI and the Popularity Contest: Why Models Choose What's Common Over What's Correct
Your AI assistant isn’t fact-checking — it’s playing the crowd.
If you've spent time with tools like ChatGPT or Gemini, you might have noticed something odd: they deliver smooth and confident answers, but sometimes the details are off, or niche information gets skipped entirely. That's not a flaw in the code. It's built into how large language models (LLMs) are trained, how they produce text, and how they're fine-tuned to behave in ways that seem "helpful." In this article, we'll dive into why that happens and what to remember when leveraging these tools.
Training Bias: Learning What's Most Common, Not What's Most Complete
LLMs are trained by absorbing massive amounts of online content: websites, Wikipedia entries, social media posts, open-source books, anything easily accessible. More specialized or gated information, like proprietary manuals or academic research, barely makes a dent in that ocean of training data. Pagano et al. (2024) showed that public web and social media text dominate training sets, while curated, vetted information is relatively scarce. Yang and Menczer (2023) went even further, finding that models often "lack knowledge about unpopular news sources" simply because they weren't part of the original data pool. When you ask an LLM for information, it's more likely to pull from popular, mainstream ideas, even when a more accurate answer exists somewhere deeper.
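To make the frequency effect concrete, here is a deliberately tiny sketch in Python. It is not how real LLMs are trained (they learn far richer statistics than raw word counts), and the toy corpus and its sentences are invented, but it shows the same dynamic: the continuation the model "prefers" is simply the one it has seen most often.

```python
from collections import Counter, defaultdict

# Hypothetical toy corpus: the popular phrasing appears far more often
# than the less common, more precise one.
corpus = (
    ["the tomato is a vegetable"] * 9 +  # common, colloquial framing
    ["the tomato is a fruit"] * 1        # rarer, botanically precise framing
)

# Count which word follows each prefix, mimicking frequency-driven learning.
next_word_counts = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for i in range(len(words) - 1):
        context = " ".join(words[: i + 1])
        next_word_counts[context][words[i + 1]] += 1

# The "answer" for a prompt is just the most frequent continuation seen in training.
print(next_word_counts["the tomato is a"].most_common())
# [('vegetable', 9), ('fruit', 1)]  -> the popular answer dominates
```

A real model does not store counts like this, but the underlying pressure is analogous: whatever phrasing dominated the training data ends up with the highest probability at generation time.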
Decoding Choices: Safe and Predictable vs. Creative and Risky
Once trained, a model produces text by predicting one word at a time. It can be "safe," picking the highest-probability next word, or "take chances" and sample from several likely options, which creates more varied but less predictable responses. Zhang et al. (2024) found that deterministic decoding methods like beam search yield more factually accurate outputs, while high-temperature sampling, which injects more randomness into each word choice, produces more creative text but also more hallucinations. If the model sounds bland or formulaic, it's probably playing it safe. If it sounds colorful but drifts off course, it's sampling more freely.
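Here is a minimal sketch of that trade-off. The function name, the logits, and the token IDs are all hypothetical, and real decoders add refinements such as top-k or nucleus sampling, but the core mechanics of greedy choice versus temperature-scaled sampling look roughly like this:

```python
import numpy as np

def sample_next_token(logits, temperature=1.0, greedy=False, rng=None):
    """Pick the next token either greedily or by temperature sampling (illustrative only)."""
    logits = np.asarray(logits, dtype=float)
    if greedy:
        return int(np.argmax(logits))      # always the single most likely token
    rng = rng or np.random.default_rng(0)
    scaled = logits / temperature          # higher temperature flattens the distribution
    probs = np.exp(scaled - scaled.max())  # numerically stable softmax
    probs /= probs.sum()
    return int(rng.choice(len(probs), p=probs))

# Hypothetical logits for four candidate next tokens.
logits = [4.0, 3.0, 1.0, 0.5]

print(sample_next_token(logits, greedy=True))  # deterministic: token 0 every time
print([sample_next_token(logits, temperature=0.2, rng=np.random.default_rng(i)) for i in range(10)])
print([sample_next_token(logits, temperature=2.0, rng=np.random.default_rng(i)) for i in range(10)])
# Low temperature almost always returns token 0; high temperature starts picking rarer tokens.
```

At low temperature the output collapses onto the most likely token; as the temperature rises, rarer tokens start to appear, which is exactly where both the variety and the hallucination risk come from.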
Fine-Tuning for "Helpfulness": Narrowing the Range Even Further
Modern LLMs like ChatGPT undergo a fine-tuning step called Reinforcement Learning from Human Feedback (RLHF). Here, humans rank outputs, and models learn to sound more helpful, polite, and safe. The trade-off? Bai et al. (2022) found that RLHF narrows the model's range of responses even more. Shen et al. (2023) pointed out that models trained this way can develop strange habits, like favoring longer, more verbose answers, whether they're better or not. The model is optimized to sound right, not necessarily to show uncertainty or nuance.
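To see how "sounding right" gets rewarded, here is a simplified sketch of the pairwise objective commonly used to train the reward model in RLHF-style pipelines. The scores and answer labels are invented for illustration; the point is that the loss only cares about which answer human raters preferred, not whether either answer is true.

```python
import numpy as np

def preference_loss(reward_chosen, reward_rejected):
    """Pairwise (Bradley-Terry style) loss: shrinks as the rater-preferred
    answer is scored above the rejected one. Illustrative sketch only."""
    margin = reward_chosen - reward_rejected
    return -np.log(1.0 / (1.0 + np.exp(-margin)))  # -log(sigmoid(margin))

# Hypothetical reward-model scores for two answers to the same prompt.
verbose_polite_answer = 2.1  # the style human raters tend to prefer
terse_hedged_answer = 0.4    # accurate but less "pleasing" phrasing

print(preference_loss(verbose_polite_answer, terse_hedged_answer))  # small loss: already matches rater taste
print(preference_loss(terse_hedged_answer, verbose_polite_answer))  # large loss: pushes the model toward the preferred style
```

Nothing in this objective measures factual accuracy; it only measures agreement with rater preferences, which is how stylistic habits like verbosity can get baked in.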
Popularity and Consensus Win (Almost Every Time)
Given the training data and decoding strategies, it's not surprising that LLMs heavily favor popular, well-worn narratives. Yang and Menczer (2023) found that when asked to assess the credibility of different news sources, LLMs often refuse to rate lesser-known outlets but comfortably rate top-tier, widely recognized ones. Models also tend to cite the most common version of content when multiple versions exist, not necessarily the most accurate or updated one. In summary, LLMs default to what they "know well," which often means what's most popular.
Plausibility Isn't the Same as Truth
A final important pattern is that LLMs aren't aiming to verify facts. They're aiming to sound plausible. TruthfulQA and related benchmarks (Lin et al., 2022) demonstrate that LLMs often fall into "imitative falsehoods," repeating wrong information that sounds right because it was common during training. Even OpenAI's GPT-4 technical report admits the model can be "confidently wrong," accepting false user statements without skepticism. If a false answer was common enough online, the model might repeat it confidently without even realizing it was wrong.
Final Takeaway
LLMs are very powerful, but they're products of frequency and familiarity. They tend to lean heavily on what's popular, prioritize safe answers, and are fine-tuned to deliver consensus over complexity. The solution isn't to stop using them. It's to use them wisely. Knowing how they prioritize information makes it easier to spot when to dig deeper, cross-check facts, or challenge an answer that feels too easy. When you understand the "why" behind an LLM's behavior, you can turn it into a powerful but careful partner in learning and decision-making.
How the Insight Layer Helps
The Insight Layer is designed to address these gaps by capturing, validating, and surfacing trusted insights from an organization's own knowledge, not just what's most popular online. By layering in context, verification, and internal expertise, it helps teams move beyond consensus-driven outputs and make decisions based on deeper, more reliable information. Rather than replacing LLMs, it complements them, giving AI systems a stronger, vetted foundation to build on. With this approach, organizations get the best of both worlds: the speed and flexibility of LLMs, paired with insights that are truly aligned with their real-world needs.
References
Bai, Y., Kadavath, S., Kundu, S., Askell, A., Kernion, J., Jones, A., Anthony, L., Mirhoseini, A., Olsson, C., Leike, J., & Amodei, D. (2022). Training a helpful and harmless assistant with reinforcement learning from human feedback. arXiv. https://arxiv.org/abs/2204.05862
Lin, S., Hilton, J., & Evans, O. (2022). TruthfulQA: Measuring how models mimic human falsehoods. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics. https://aclanthology.org/2022.acl-long.229/
OpenAI. (2023). GPT-4 technical report. OpenAI. https://cdn.openai.com/papers/gpt-4.pdf
Pagano, S., Hossain, M. S., Wu, T., & Park, Y. (2024). Biases in large language model pretraining corpora: A survey. arXiv. https://arxiv.org/abs/2402.02000
Shen, S., Song, H., Wang, W., Zhou, S., Zhang, Y., & Ke, P. (2023). Bias in reinforcement learning from human feedback: A case study on preference models. Findings of the Association for Computational Linguistics: EMNLP 2023. https://aclanthology.org/2023.findings-emnlp.76/
Yang, K. C., & Menczer, F. (2023). Popular but not always credible: Auditing news source credibility in language models. arXiv. https://arxiv.org/abs/2310.09189
Zhang, A., Kumar, S., Abid, A., Farquhar, S., & Zou, J. (2024). Decoding choices substantially impact factuality in language model outputs. arXiv. https://arxiv.org/abs/2402.01811