AI Research

DeepSeek and the Efficiency Arms Race

The Chinese lab's architectural innovations under compute constraints have forced a global rethink of whether scaling raw compute is still the only path to frontier AI.

By Jakub Jirák | Apr 14, 2027 | 6 min read

deepseek ai-research efficiency llm china

High-Flyer Capital Management is a quantitative hedge fund in Hangzhou that made its money by applying machine learning to financial markets, reportedly managing around $8 billion in assets. In 2023, its founder Liang Wenfeng made a decision that looked eccentric at the time: he took a significant portion of the fund’s profits and directed them toward building an AI research lab called DeepSeek, with a mandate to do frontier research with whatever compute was available after export controls removed the best hardware from the menu.

What followed is one of the more instructive episodes in the history of constrained innovation.

DeepSeek’s initial models were not remarkable. The V1 series, released in late 2023, was a solid Chinese-language model that demonstrated the team’s competence without suggesting anything unusual about its trajectory. What the team was doing simultaneously was thinking hard about the fundamental question that compute constraints impose: if you cannot have more chips, what can you do with fewer?

The answer they developed, published in extraordinary technical detail in a series of papers beginning in late 2024, involved rethinking several assumptions that had become orthodox in the scaling era. The dominant paradigm held that performance scaled predictably with compute — that doubling the training compute produced predictable improvements in model capability — and that the path to better models was therefore straightforward if expensive: buy more chips, train more.
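As a rough illustration of what "scales predictably" means, the sketch below assumes a simple power-law form for loss as a function of training compute. The constants are invented for illustration, not fitted to any published model, and the function names are hypothetical.

```python
# Illustrative constants only (assumptions, not fitted values).
IRREDUCIBLE = 1.7   # loss floor that no amount of compute removes
SCALE = 10.0        # size of the reducible term
ALPHA = 0.05        # scaling exponent


def predicted_loss(compute):
    """Assumed power-law form: loss falls toward a floor as compute grows."""
    return IRREDUCIBLE + SCALE * compute ** (-ALPHA)


def doubling_gain(compute):
    """Fraction of the reducible loss removed by doubling compute."""
    before = predicted_loss(compute) - IRREDUCIBLE
    after = predicted_loss(2 * compute) - IRREDUCIBLE
    return 1 - after / before


for c in (1e21, 1e23, 1e25):
    print(f"{c:.0e} FLOPs: loss {predicted_loss(c):.3f}, "
          f"doubling removes {doubling_gain(c):.1%} of the reducible loss")
```

Under this assumed form, each doubling removes the same fraction of the reducible loss at every scale, which is why "buy more chips, train more" looked like a dependable if expensive strategy.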

DeepSeek’s research challenged not the scaling law itself but the assumption that existing architectures were anywhere near optimal for the compute they consumed. The team’s mixture-of-experts architecture, their training efficiency innovations, their approach to inference optimization, and their data curation methods collectively produced models that achieved performance disproportionate to the compute invested. The R1 model, released in January 2025, demonstrated reasoning capabilities competitive with American frontier models on certain tasks while requiring substantially less training compute. The paper reporting these results was, by the standards of AI research, unusually honest about both what worked and what didn’t.
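To make the mixture-of-experts idea concrete, here is a minimal sketch of generic top-k expert routing. It is not DeepSeek's specific design: the toy experts, gate scores, and function names are all hypothetical, and each "expert" is a scalar function standing in for a full feed-forward block.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(gate_scores, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    probs = softmax(gate_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    # Renormalize so the selected experts' weights sum to 1.
    return [(i, probs[i] / norm) for i in top]


def moe_forward(x, experts, gate_scores, k=2):
    """Run only the selected experts; the rest are never called."""
    routes = route_top_k(gate_scores, k)
    output = sum(w * experts[i](x) for i, w in routes)
    return output, [i for i, _ in routes]


# Hypothetical toy experts: expert i multiplies its input by a fixed factor.
experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]
output, used = moe_forward(1.0, experts, gate_scores=[0.1, 2.0, 0.3, 1.5], k=2)
print(f"experts used: {used}, output: {output:.3f}")
```

The point of the design is that total parameter count grows with the number of experts while per-token compute grows only with k, which is one way a model can get more capability out of a fixed compute budget.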

The international AI research community’s response was initially skeptical, then interested, then genuinely impressed. Not because DeepSeek had proven that compute doesn’t matter — it does, and the team’s own analysis acknowledges that their models would be better with more compute. But because the research demonstrated that the efficiency gap between what was theoretically possible and what frontier labs were actually achieving was wider than the field had collectively recognized.

The implications for export controls were immediately and publicly recognized. If a lab operating under compute constraints could achieve competitive results through architectural innovation, then the premise that compute restrictions would prevent China from training competitive models was less robust than policymakers had believed. The controls were aimed at a specific path, scaling raw compute, and DeepSeek had demonstrated, at least partially, a different path that the controls did not block.

American policymakers and intelligence analysts were, by most accounts, unsurprised by the technical results (the NSC had been tracking DeepSeek’s research) but genuinely concerned about the strategic implications: a wider-than-assumed efficiency gap called into question how effective the controls were. The Commerce Department subsequently tightened controls on certain software and training data tools that DeepSeek had been using, a recognition that the hardware-focused control framework needed to account for the software and algorithmic dimension.

The efficiency arms race that DeepSeek accelerated is now a visible feature of global AI research. American labs responded to the DeepSeek efficiency results not by dismissing them but by publishing their own efficiency research. Google’s Gemini team published work on mixture-of-experts optimization. Anthropic’s Constitutional AI research has always emphasized doing more with careful training methodology rather than raw scale. Meta’s LLaMA series has been explicitly designed for efficiency at a given model size, enabling academic researchers and smaller companies to run capable models on limited hardware.

The consequence is that the absolute compute advantage the US holds, measured as the training compute available to American labs versus Chinese labs, translates to a smaller performance gap than a naive comparison of compute budgets would predict. If efficiency research continues to advance at its current rate, the performance differential between a model trained with ten times the compute and a model trained on the baseline compute budget with the best available efficiency techniques may shrink to something manageable, rather than the order-of-magnitude difference that a simple scaling analysis would suggest.
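The arithmetic behind this claim can be sketched by treating efficiency as a multiplier on raw compute. The example assumes that the reducible part of a model's loss follows a power law in effective compute; the exponent is an illustrative assumption and the function name is hypothetical.

```python
ALPHA = 0.05  # assumed scaling exponent, for illustration only


def reducible_loss_ratio(compute_advantage, efficiency_multiplier, alpha=ALPHA):
    """Ratio of the constrained lab's reducible loss to the compute-rich lab's.

    compute_advantage: how many times more raw compute the rich lab has.
    efficiency_multiplier: how much more the constrained lab extracts per
        unit of compute than the rich lab does.
    A result of 1.0 means the two labs are level under this assumed model.
    """
    effective_gap = compute_advantage / efficiency_multiplier
    return effective_gap ** alpha


for multiplier in (1, 3, 10):
    ratio = reducible_loss_ratio(10, multiplier)
    print(f"10x compute gap, {multiplier}x efficiency: loss ratio {ratio:.3f}")
```

In this toy model the raw compute gap matters only through the ratio of compute advantage to efficiency multiplier, so every gain in efficiency directly shrinks the gap that hardware alone would imply.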

This is not a prediction that efficiency research will fully close the compute gap. The scaling laws still apply — more compute still produces better models, all else equal. But “all else equal” is doing a lot of work in that sentence. All else is not equal when one team has been forced by circumstances to think harder about architectural efficiency than teams with unlimited compute budgets have been required to.

The secondary effect of DeepSeek’s work is on the open-source AI ecosystem. DeepSeek publishes technical reports with unusual detail, makes model weights available for download, and operates in a way that looks more like a university research group than a commercial AI company. This is a deliberate choice, according to interviews with Liang Wenfeng — the theory being that contributing to the global AI research community creates knowledge and reputation advantages that outweigh the competitive value of keeping innovations proprietary.

The irony that a Chinese AI lab is more committed to the open-source AI research tradition than several of the American labs (which have progressively reduced their research publication and model weight sharing since approximately 2022) is not lost on the research community. Whether DeepSeek’s openness is philosophically principled, strategically calculated, or some combination is unclear from the outside. The effect is that Chinese AI research is contributing to and benefiting from the global AI research commons in ways that complicate the binary US-versus-China framing.

Open-source AI development is genuinely international. The researchers who download and fine-tune DeepSeek models are in Europe, Southeast Asia, Latin America, and North America as well as in China. The architecture innovations that DeepSeek published are being reproduced and extended by researchers everywhere. Export controls cannot easily restrict this, because the relevant technology is not hardware but knowledge, and knowledge in a published paper travels at the speed of the internet.

The broader lesson of the DeepSeek episode for AI geopolitics is that the export controls were designed around a specific theory of how AI capability is produced — primarily through scaling raw compute — and that theory, while not wrong, was more partial than the policy assumed. Constraining hardware is a real lever, but it is not the only lever, and sophisticated researchers under resource pressure will find the other levers.

This has implications for how export controls should be designed and what complementary policies are necessary. A control regime that restricts hardware without addressing the open publication of efficiency techniques that partially substitute for hardware is targeting one input into a multi-input production process. The control is not useless — constraining any critical input raises the cost and slows the pace of development. But its effectiveness is bounded by the substitutability of the inputs it doesn’t control.

What the US has not yet figured out, and what the policy debate has not honestly addressed, is that leading in AI over the long run requires not just restricting adversary inputs but accelerating domestic outputs. Restrictions on China’s training compute matter less if the American research and development ecosystem is generating architectural advances fast enough to maintain a meaningful lead despite Chinese efficiency improvements.

DeepSeek didn’t change that equation. It clarified it.
