Industrial Policy

What Huawei Actually Built

The Ascend AI chip program is neither the miracle Beijing claims nor the failure Washington hoped for — it is something more instructive than either.

By Jakub Jirák, Apr 2, 2027


In September 2023, when Huawei quietly started selling the Mate 60 Pro with a domestically produced 7-nanometer chip inside, the reaction in Washington ranged from disbelief to alarm. The Commerce Department launched an investigation. Analysts debated whether SMIC had violated export controls or merely exploited them. Some commentators declared that sanctions had failed. The more careful observers noted that a single phone chip, even one made under sanctions, tells you very little about the industrially relevant question: can China manufacture AI accelerators at the scale and performance level that frontier model training requires?

Three years later, we have considerably more data. The picture it paints is specific enough to be genuinely useful, which is to say it does not fit neatly into either the “Huawei has won” or “sanctions are working” narratives that dominate the political conversation on both sides of the Pacific.

The Ascend program traces its origins to 2018, when the first version of the chip debuted at the Hot Chips symposium in California — a remarkable bit of timing, given what would happen to Huawei’s US access within months. The Ascend 910, released in 2019, was a credible first attempt: a 7nm chip produced by TSMC before the sanctions cut off that relationship, with performance that Huawei claimed exceeded Nvidia’s V100.

What happened after TSMC was forced to stop serving Huawei is the part that matters industrially. Huawei had to move production to SMIC, a domestic foundry that was, at the time of the severance, operating at 14-nanometer process nodes in volume and experimenting with 7-nanometer techniques that had not yet been proven in production. The company also had to redesign its supply chain to eliminate components from US-controlled entities — a list that included not just chips but design software, specialized gases, and equipment components.

The Ascend 910B, released in 2023, represented the outcome of that forced transition. It was manufactured by SMIC at what the company called its “N+2” process — an internal designation for a variant of 7-nanometer that observers have characterized as closer to TSMC’s 10-nanometer in effective performance terms. The chip worked. It was less power-efficient than an H100 by a significant margin (consuming roughly 30 percent more power for comparable arithmetic throughput), and its memory bandwidth was constrained by the unavailability of the most advanced HBM from SK Hynix and Samsung, which faced US pressure to restrict sales.

The 910C, which began shipping in 2026, is a more meaningful step forward. Huawei redesigned the chip’s interconnect architecture and its on-chip memory to compensate for the HBM constraints, using a multi-die packaging approach that achieves higher effective bandwidth through physical proximity rather than faster memory. The approach is inelegant in the way that many engineering workarounds are: it solves the problem of unavailable components by restructuring the problem itself, at the cost of die size, yield, and manufacturing complexity.

The engineering community’s assessment is that the 910C represents a genuine advance — not because Huawei discovered something Nvidia doesn’t know, but because the company’s engineers demonstrated the ability to design around constraints that would have stopped a less determined program. There is a technical literature on workaround-driven innovation, mostly in the aerospace and defense sectors, that describes this phenomenon: when you cannot buy the best component, you sometimes redesign the system to not need it, and the resulting design occasionally has advantages the optimal-component version lacks. The 910C’s memory architecture is not better than an H100’s. It is different in ways that turn out to be useful for specific inference workloads.

The performance numbers reported by Chinese state media, which often claim that the Ascend 910C matches or exceeds the H100, are probably cherry-picked from benchmarks that favor the chip’s architecture. The performance numbers reported by American analysts, which often emphasize the gap in memory bandwidth and training efficiency, probably undercount the chip’s utility for the inference applications where Chinese cloud providers are deploying it. Each is an accurate description of a partial measurement.

The more strategically significant development is not the chip itself but the ecosystem around it. Nvidia’s dominance is not primarily about the H100’s specifications. It is about CUDA, the software platform that thousands of engineers, researchers, and companies have been building on since 2007. The investment in CUDA is embedded in codebases, research pipelines, textbooks, and the institutional memory of essentially every AI researcher trained in the past fifteen years. Migrating off CUDA is not a technical decision. It is an organizational one, and organizations resist it even when there are good reasons to make the switch.

Huawei’s MindSpore software stack, rebranded and aggressively updated, has made genuine progress in framework compatibility. Models developed in PyTorch can increasingly be ported to run on Ascend hardware with meaningful effort but without fundamental reimplementation. The Chinese AI research community, which had strong incentives to avoid Ascend before the restrictions made Nvidia unavailable, has shifted. Papers from Chinese universities that used to specify “trained on A100 clusters” now increasingly specify Ascend hardware. The ecosystem is developing not because it is better than CUDA but because necessity is a more effective forcing function than any product manager’s roadmap.

ByteDance, Baidu, and Alibaba have all made significant internal investments in Ascend optimization, effectively hiring the equivalent of CUDA kernel engineers to extract the maximum performance from Huawei hardware. These investments represent a knowledge base that is now embedded in the operational capacity of China’s largest AI deployers. That knowledge transfer, from American-designed hardware to Chinese-designed hardware, is arguably more durable than any benchmark improvement.

The honest assessment of what this means requires distinguishing between different types of AI work. For inference at scale — running already-trained models against user queries — the Ascend 910C is a viable, if not optimal, alternative to Nvidia hardware. Chinese cloud providers can deploy it at costs that are economically sustainable, partly because the domestic procurement ecosystem subsidizes it and partly because the performance gap matters less when you are serving millions of requests per day on models that were trained months ago.

For training frontier models, the computationally intensive process of building the next generation of capable AI systems, the picture is less favorable. The efficiency gap, the yield problems at SMIC, and the interconnect performance of Ascend clusters at scale all limit how fast Chinese labs can iterate. The rough estimate in the industry is that a training run that takes three months on an optimal Nvidia cluster takes five to seven months on a comparable Ascend cluster, accounting for lower efficiency, more downtime, and the engineering overhead of optimizing for non-CUDA infrastructure.

That difference is not fatal. Chinese AI labs are not going to stop existing because training is slower. But in a field where the frontier advances continuously and where the companies that train the most models learn the most from the process, a roughly 2x slowdown in iteration rate compounds over time. Six months from now the gap may be similar. Eighteen months from now it may be wider.
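The arithmetic behind that claim can be made concrete with a minimal sketch. It uses only the run durations quoted above (three months versus five to seven) and assumes, purely for illustration, that each lab runs training jobs back to back on a single cluster with constant run length. The function names and the time horizons are hypothetical choices for this example, not figures from any lab.

```python
def slowdown(baseline_months: float, actual_months: float) -> float:
    """Ratio of an actual run duration to the baseline run duration."""
    return actual_months / baseline_months


def completed_runs(horizon_months: float, run_months: float) -> int:
    """Whole back-to-back training runs that fit inside the horizon.

    Assumption: runs are strictly sequential and never overlap.
    """
    return int(horizon_months // run_months)


NVIDIA_RUN_MONTHS = 3.0         # duration quoted in the article
ASCEND_RUN_MONTHS = (5.0, 7.0)  # range quoted in the article

low, high = (slowdown(NVIDIA_RUN_MONTHS, m) for m in ASCEND_RUN_MONTHS)
print(f"slowdown: {low:.2f}x to {high:.2f}x")

# Illustrative horizons only; the article does not specify these.
for horizon in (6, 18, 36):
    nvidia = completed_runs(horizon, NVIDIA_RUN_MONTHS)
    ascend = [completed_runs(horizon, m) for m in ASCEND_RUN_MONTHS]
    print(f"{horizon} months: Nvidia {nvidia} runs, Ascend {ascend[1]} to {ascend[0]} runs")
```

Under these simplifying assumptions the absolute gap in completed runs grows with the horizon even though the per-run ratio stays fixed, which is the sense in which the slowdown compounds.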

The most significant question — which neither side’s narrative addresses cleanly — is whether the Chinese government considers this tradeoff acceptable. Paying twice as much, in time and money, for AI capability that is somewhat behind the frontier, in exchange for not being dependent on adversary-controlled infrastructure, may be a rational strategic choice for a country that has watched what happened to Huawei when it was dependent on American chips. Independence at reduced efficiency may be preferable to efficiency at strategic risk.

What Huawei actually built is not an Nvidia killer. It is a demonstration that state-directed industrial programs, given enough funding, time, and engineering talent, can produce adequate solutions to problems that market dynamics would never prioritize. That is a different thing, with different implications, than winning a benchmark competition.

The lesson is not that sanctions failed. It is that they achieved something different from what was advertised: a more expensive, less efficient Chinese AI ecosystem that is also more resilient to future pressure. Depending on your perspective, that is either a partial victory or a partial defeat. It is certainly not the clean win that either side’s propaganda suggests.
