Tech giants are racing to build superhuman AI systems they openly admit they can’t explain or control—yet no one is demanding they slow down
The wildest, scariest, indisputable truth about AI’s large language models is that the companies building them don’t know exactly why or how they work.
Sit with that for a moment. The most powerful companies are racing to build superhuman intelligence capabilities, ones they readily admit occasionally go rogue by fabricating information or even threatening their users, yet they don’t understand why their machines behave as they do.
With companies pouring hundreds of billions of dollars into rapidly developing superhuman intelligence and Washington doing nothing to slow or regulate them, dissecting this Great Unknown becomes crucial.
None of the AI companies dispute this reality. They marvel at the mystery and discuss it publicly. They’re working feverishly to understand it better while arguing that you don’t need to fully understand a technology to control or trust it.
Two years ago, Axios managing editor for tech Scott Rosenberg wrote “AI’s scariest mystery,” noting that it’s common knowledge among AI developers that they can’t always explain or predict their systems’ behavior. This has become even more true today.
Yet there’s no sign that the government, companies, or general public will demand deeper understanding or scrutiny of building technology with capabilities beyond human comprehension. They’re convinced that the race to beat China to the most advanced LLMs justifies the risk of the Great Unknown.
As reported here, the House, despite knowing little about AI, tucked language into President Trump’s “Big, Beautiful Bill” that would prohibit states and localities from implementing AI regulations for 10 years. The Senate is considering limitations on this provision.
Neither the AI companies nor Congress understands how powerful AI will be a year from now, much less a decade from now.
LLMs—including OpenAI’s ChatGPT, Anthropic’s Claude, and Google’s Gemini—aren’t traditional software systems following clear, human-written instructions, like Microsoft Word. In the case of Word, what it does is precisely what it’s engineered to do.
Instead, LLMs are massive neural networks—like a brain—that ingest vast amounts of information (much of the internet) to learn how to generate responses. Engineers know what they’re setting in motion and which data sources they use. But the LLM’s scale—the inhuman number of variables in each “best next word” decision—means that even experts can’t explain exactly why it chooses to say anything specific.
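To make the “best next word” idea concrete, here is a toy sketch in Python. It is not how any real model is built: the four candidate words and their scores are invented for illustration, whereas a real LLM computes scores over tens of thousands of possible tokens from billions of learned parameters. The sketch shows only the final step, assuming the common approach of turning scores into probabilities and sampling one word.

```python
import math
import random


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def pick_next_word(candidates, scores, rng):
    """Sample one candidate word in proportion to its probability."""
    probs = softmax(scores)
    word = rng.choices(candidates, weights=probs, k=1)[0]
    return word, probs


# Hypothetical candidates and scores for the prompt "The sky is".
# In a real LLM these scores come out of the neural network itself.
candidates = ["blue", "falling", "clear", "green"]
scores = [4.0, 1.5, 3.0, 0.2]

rng = random.Random(0)  # fixed seed so the run is repeatable
word, probs = pick_next_word(candidates, scores, rng)

for candidate, prob in zip(candidates, probs):
    print(f"{candidate:8s} {prob:.3f}")
print("chosen:", word)
```

The arithmetic in this last step is fully transparent. The opacity described above lies in how a real model arrives at the scores in the first place.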
According to ChatGPT (and a human at OpenAI confirmed its accuracy): “We can observe what an LLM outputs, but the process by which it decides on a response is largely opaque. As OpenAI’s researchers bluntly put it, ‘We have not yet developed human-understandable explanations for why the model generates particular outputs.’”
“In fact,” ChatGPT continued, “OpenAI admitted that when they tweaked their model architecture in GPT-4, ‘more research is needed’ to understand why certain versions started hallucinating more than earlier versions—a surprising, unintended behavior even its creators couldn’t fully diagnose.”
Anthropic—which just released Claude 4, the latest version of its LLM, with great fanfare—admitted uncertainty about why Claude, when given access to fictional emails during safety testing, threatened to blackmail an engineer over a supposed extramarital affair. This occurred during responsible safety testing, but Anthropic cannot fully explain the irresponsible behavior.
Again, sit with that: The company doesn’t know why its machine went rogue and malicious. In truth, the creators don’t really know how intelligent or independent the LLMs could become. Anthropic even stated that Claude 4 is powerful enough to pose a greater risk of being used to develop nuclear or chemical weapons.
OpenAI’s Sam Altman and others use the technical term “interpretability” to describe this challenge. “We certainly have not solved interpretability,” Altman told a summit in Geneva last year. Altman and others mean they can’t interpret the reasoning: Why do LLMs do what they do?
Anthropic CEO Dario Amodei, in an April essay called “The Urgency of Interpretability,” warned, “People outside the field are often surprised and alarmed to learn that we do not understand how our own AI creations work. They are right to be concerned: this lack of understanding is essentially unprecedented in the history of technology.” Amodei characterized this as a serious risk to humanity, yet his company continues boasting about more powerful models approaching superhuman capabilities.
Anthropic has been studying the interpretability issue for years, and Amodei has warned repeatedly that it must be solved. In a statement for this story, Anthropic said, “Understanding how AI works is an urgent issue to solve. It’s core to deploying safe AI models and unlocking [AI’s] full potential in accelerating scientific discovery and technological development. We have a dedicated research team focused on solving this issue, and they’ve made significant strides in moving the industry’s understanding of the inner workings of AI forward. We must understand how AI works before it radically transforms our global economy and everyday lives.”
Elon Musk has warned for years that AI presents a civilizational risk. In other words, he believes it could destroy humanity and has said as much. Yet Musk is pouring billions into his own LLM called Grok.
“I think AI is a significant existential threat,” Musk said in Riyadh, Saudi Arabia, last fall. There’s a 10%-20% chance “that it goes bad.”
Apple published a paper, “The Illusion of Thinking,” concluding that even the most advanced AI reasoning models don’t really “think” and can fail when stress-tested.
The study found that state-of-the-art models (OpenAI’s o3-mini, DeepSeek R1, and Anthropic’s Claude-3.7-Sonnet) still fail to develop generalizable problem-solving capabilities, with accuracy ultimately collapsing to zero “beyond certain complexities.”
But a new report by AI researchers, including former OpenAI employees, called “AI 2027,” explains how the Great Unknown could, in theory, turn catastrophic in less than two years. The report is long and often too technical for casual readers to fully grasp. It’s wholly speculative, though built on current data about how fast the models are improving. It’s being widely read inside the AI companies.
The report captures the belief, or fear, that LLMs could one day think independently and begin acting autonomously. Our purpose isn’t to alarm or sound apocalyptic. Rather, it is to help you understand what the people building these models discuss constantly.
You can dismiss it as hype or hysteria. But researchers at all these companies worry that LLMs, because we don’t fully understand them, could outsmart their human creators and go rogue. In the AI 2027 report, the authors warn that competition with China will push LLMs potentially beyond human control, because no one will want to slow progress even when seeing signs of acute danger.
Google’s Sundar Pichai—and really all major AI company CEOs—argues that humans will learn to better understand how these machines work and find clever, though currently unknown, ways to control them and “improve lives.” The companies all maintain large research and safety teams, with huge incentives to master these technologies if they want to realize their full value.
After all, no one will trust a machine that fabricates information or threatens them. But as of today, the machines do both, and no one knows why.
We stand at an unprecedented moment in technological history. Never before have we raced so rapidly toward deploying systems we fundamentally don’t understand. The AI companies acknowledge the mystery, governments remain largely passive, and the public seems willing to accept the risks for the promise of revolutionary capabilities.
This isn’t necessarily a call to halt AI development entirely. But it is a stark reminder that we’re essentially conducting a global experiment with technologies that could reshape civilization, without fully grasping how they work or where they might lead us.
The question isn’t whether we can afford to slow down and demand deeper understanding. The question is whether we can afford not to. Because once these systems become truly superhuman, our window for meaningful oversight may have already closed.

