When DeepSeek quietly published its R1 model on a Monday morning in January 2025, few in Silicon Valley were paying attention. Within a week it had sent Nvidia’s stock into freefall, triggered emergency briefings at OpenAI, and ignited the most consequential debate the AI industry had seen since ChatGPT’s launch.
The numbers were staggering. DeepSeek reported a training cost of approximately $6 million, a figure that covers the final training run of the underlying base model rather than total research and development spending. If accurate, it would represent roughly one-hundredth of what it reportedly costs to train comparable frontier models in the United States. The model matched or came close to the performance of OpenAI’s o1 on many standard benchmarks, including mathematics, coding, and scientific reasoning.
“This changes the cost calculus for the entire industry,” said one senior researcher at a major AI lab. “If you can train a top-tier reasoning model for $6 million, the question of who can afford to build AGI becomes a lot more interesting.”
The market reaction was swift and violent. Nvidia shed roughly $593 billion in market capitalisation in a single trading session, the largest single-day loss for any company in US stock market history.
DeepSeek’s approach relied in part on a “mixture of experts” architecture, an established design rather than a new invention, which activates only a small fraction of the model’s parameters for any given input. This substantially reduces the computation needed per token while keeping output quality competitive.
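For readers who want to see the idea concretely, the following is a minimal toy sketch of mixture-of-experts routing, not DeepSeek’s actual implementation. It assumes a simple top-k router; the names `moe_forward`, `make_expert`, and `router_weights` are hypothetical, and each “expert” is a stand-in function that merely scales its input.

```python
import math
import random


def softmax(scores):
    """Convert raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def make_expert(scale):
    """Hypothetical stand-in for an expert sub-network: scales its input."""
    return lambda x: [scale * v for v in x]


def moe_forward(x, router_weights, experts, top_k=2):
    """Route input x to the top_k highest-scoring experts and mix their outputs."""
    # The router gives each expert one score (dot product of its row with x).
    scores = [sum(w * v for w, v in zip(row, x)) for row in router_weights]
    # Only the top_k experts are run; the rest stay idle for this input.
    chosen = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)[:top_k]
    gates = softmax([scores[i] for i in chosen])
    output = [0.0] * len(x)
    for gate, idx in zip(gates, chosen):
        expert_out = experts[idx](x)
        output = [o + gate * e for o, e in zip(output, expert_out)]
    return output, chosen


random.seed(0)
NUM_EXPERTS, DIM = 8, 4
experts = [make_expert(i + 1) for i in range(NUM_EXPERTS)]
router_weights = [[random.uniform(-1, 1) for _ in range(DIM)] for _ in range(NUM_EXPERTS)]
x = [0.5, -1.0, 0.25, 2.0]

output, chosen = moe_forward(x, router_weights, experts, top_k=2)
print(f"experts run: {len(chosen)} of {NUM_EXPERTS}")
```

In this sketch only two of the eight experts do any work for a given input, which is the source of the compute savings described above. Production systems add details omitted here, such as load balancing across experts.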
For the broader AI ecosystem, the implications extend beyond cost. DeepSeek is a Chinese company, and R1 is one of the first models from China to credibly compete with the best Western AI systems. The development has intensified discussions in Washington about export controls on AI chips and the effectiveness of US technology restrictions.