Ilya Declares the End of the Scaling Era
OpenAI’s co-founder and former Chief Scientist shares what’s been keeping him up at night
At a time when AI feels both inevitable and opaque, it helps to listen closely to the people who have spent the most time staring into its machinery. Yesterday, Ilya Sutskever appeared on the Dwarkesh Podcast, and the conversation offered one of the clearest windows yet into how he thinks about the next decade of AI.
Sutskever, co-founder and former Chief Scientist of OpenAI, left last year after the company’s dramatic internal split and has since built Safe Superintelligence Inc. (SSI). SSI has raised ~$3B and is valued at $32B as of spring 2025. The company has remained intentionally quiet about its technical agenda. Yet if you listen carefully to this interview, a coherent worldview begins to emerge, and the research priorities behind SSI come into sharper focus.
The conversation opened with a puzzle that has become the fault line of the entire field: today’s models score brilliantly on benchmarks yet remain oddly brittle in the real world. Sutskever lingered on this contradiction. “It’s very difficult to make sense of how the model, on the one hand, does these amazing things, and on the other hand, repeats itself twice… something strange is going on.” Models can ace advanced coding evals while failing to fix a simple bug without introducing a second one.
His analogy is vivid. Today’s frontier models behave like the student who has memorized every trick needed to win competitive programming contests but lacks the deeper flexibility of the classmate who actually “has it.” The second student can move through ambiguous, open-ended tasks with a kind of intuitive stability the first never develops. For him, this is a sign that modern systems have the wrong shape of intelligence: “these models somehow just generalize dramatically worse than people.”
This is where his thinking departs from the mainstream. The central bottleneck, in his view, isn’t scaling limits or data exhaustion or misaligned incentives. The bottleneck is unreliable generalization, and the fact that nobody fully understands why humans learn so well with so little data. Evolution can explain some things - vision, locomotion, basic cognition - but it can’t explain our ability to learn entirely new domains like mathematics, programming, or driving with such speed and robustness. If humans are good at domains our ancestors never encountered, he argues, then the explanation isn’t just priors baked in over millions of years. It’s that “people might have just better machine learning, period.” This is not a small claim. It implies that there is an undiscovered ML principle hiding underneath human cognition, and he hinted repeatedly - carefully, without revealing details - that SSI is organized around searching for it.
This thread extends naturally into his thinking on alignment. He kept returning to emotions - simple, ancient, and astonishingly functional. A person who loses emotional processing, he noted, may retain puzzle-solving skills yet becomes nonfunctional in daily life. Emotions act like a compact value function, continually judging whether a trajectory is promising or catastrophic long before the final outcome arrives. For Sutskever, this is the missing ingredient in ML. Models today must complete a whole trajectory to know whether they are doing well; humans can self-correct every few seconds. This is why a teenager becomes a competent driver in ten hours, while a trillion-token model still stumbles over multistep reasoning. The gap is not just data. It is an absence of internal grounding.
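To make the value-function idea concrete, here is a minimal sketch in standard reinforcement-learning terms. It is an illustration of the general concept only, not anything Sutskever or SSI has described: the six-state corridor, the constants, and every function name are hypothetical. An agent wanders left or right at random, and reward arrives only at the two ends. A value function computed by iterative policy evaluation lets each individual step be scored as it happens, instead of waiting for the episode to finish.

```python
# Toy corridor: states 0..5. State 0 is a dead end (-1), state 5 is the goal (+1).
# All names and constants are hypothetical, chosen only for illustration.
GAMMA = 0.9
N_STATES = 6
DEAD_END, GOAL = 0, N_STATES - 1


def terminal_reward(state):
    """Reward exists only at the two ends of the corridor."""
    if state == GOAL:
        return 1.0
    if state == DEAD_END:
        return -1.0
    return 0.0


def evaluate_random_walk(sweeps=200):
    """Iterative policy evaluation for a policy that steps left or right with equal probability."""
    values = [0.0] * N_STATES  # terminal states keep value 0
    for _ in range(sweeps):
        for s in range(1, N_STATES - 1):  # interior (non-terminal) states
            backups = [terminal_reward(n) + GAMMA * values[n] for n in (s - 1, s + 1)]
            values[s] = sum(backups) / len(backups)
    return values


def terminal_only_feedback(trajectory):
    """What the agent hears without a value function: silence until the end."""
    return [terminal_reward(s) for s in trajectory[1:]]


def stepwise_feedback(trajectory, values):
    """One-step signal r + gamma * V(next) - V(current), available after every move."""
    return [
        terminal_reward(nxt) + GAMMA * values[nxt] - values[cur]
        for cur, nxt in zip(trajectory, trajectory[1:])
    ]


values = evaluate_random_walk()
bad_path = [3, 2, 1, 0]  # drifts steadily toward the dead end
print(terminal_only_feedback(bad_path))
print(stepwise_feedback(bad_path, values))
```

On the drifting path, terminal-only feedback is zero until the final step, while the stepwise signal is negative at every move, so a learner could correct course after the first wrong turn. That is the loose sense in which a value function lets a system judge a trajectory before the outcome arrives.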
Underneath all this is a deeper thesis: the era of pure scaling has reached diminishing returns. Ilya divides the years since 2012 into two chapters - 2012 to 2020 as the “age of research,” when ideas dominated progress, and 2020 to 2025 as the “age of scaling,” when pre-training became a dependable formula and companies poured billions into compute, comforted by the reliability of scaling laws. He believes that chapter is closing. “Now that compute is big,” he said, “we are back to the age of research.” Data is finite. Pre-training has natural limits. Companies are already burning more compute on RL than on pre-training itself. The next breakthroughs, in his view, will require new conceptual ingredients rather than larger clusters.
His comment about returning to the “age of research” spread quickly because of who it came from: the person who once championed scaling laws is now arguing that the field must search for a new recipe entirely.
This reframing also shapes how he views deployment and safety. He has clearly updated toward gradual exposure over the past year. Purely theoretical debates about AGI are brittle, he said, because “it’s very hard to feel the AGI.” The public, the government, even AI researchers cannot imagine systems that operate far outside human intuition. Once AI begins to “feel powerful,” he predicts safety culture will change sharply: competitors will collaborate, governments will intervene, and plans that once looked abstract will become concrete.
His vision of superintelligence completes the arc. It is not a static oracle. It is a learner: a system that, like a “superintelligent 15-year-old,” arrives with enormous capability for growth rather than a full library of knowledge. Such systems, deployed widely across the economy, could pick up thousands of skills at once, pool their learning, and drive rapid growth even without recursive self-improvement. His timeline to this kind of learner was matter-of-fact: “5 to 20 years.”
His preferred alignment target is also unconventional: not exclusively human values, but “care for sentient life.” He offers this not as a finished solution but as a candidate direction he believes may be more tractable, given that a sentient AI might model others using the same computational circuitry it uses to model itself - an analogue to mirror neurons. He also believes that extremely powerful systems should ideally have their capabilities capped, though he admits that no one has yet articulated a workable method for doing so.
By the end of the conversation, a coherent worldview had emerged. Scaling has carried us astonishingly far, but it will not deliver the final leap. The next era of AI will hinge on unlocking human-like generalization; adding a value-function-like backbone to reasoning; allowing systems to gradually improve through continual learning; and guiding their incentives toward stable, empathetic behavior. It will be an era defined not by data abundance but by conceptual breakthroughs.
If you listen closely, the takeaway is this: the frontier of AI is no longer about adding more of what we already know works; it’s about discovering what we don’t yet understand. And for the first time in years, one of the field’s central figures is saying the quiet part aloud.




Something strikes me as odd here – these questions have been keeping extremely brilliant people up at night for centuries, and sometimes these discussions seem to proceed as if guys like Ilya have never even heard of the philosophy of mind... maybe I'm missing something though.
Super eloquently summarized - as always. Reasoning / thinking models take us in that direction - the missing piece is "intuition" - I guess that's how I'd summarize what will drive the next stage of evolution of models - the teenage driving experience summarizes it best as I get my daughter ready for her behind the wheel test. There has been more than one occasion where she adapted with intuition to a scenario that I had never coached her on. More power to the AI Researchers to take us to the next frontier - and yes, let the bourgeois argue about AGI in the meantime!