RTX 4090 soars to 30,000 yuan as stock is wiped out! With EUV blocked, is the 5nm process locked down?
The U.S. Department of Commerce has banned the sale of cutting-edge AI chips to China, but did the ban accidentally catch the RTX 4090 graphics card as well? A well-known American technology analyst recently put it bluntly: the EUV ban is the key, because it locks China out of the 5nm process.
The RTX 4090 has been a trending search topic for two days!
Yesterday, it was revealed that the United States had banned the sale of cutting-edge AI chips such as the H800 and A800 to China. Under the new regulations, GPUs above a certain performance threshold require additional licenses.
In the official documents NVIDIA submitted to the U.S. Securities and Exchange Commission (SEC), a product that no one expected suddenly appeared: the RTX 4090.
According to calculations by CITIC Securities, when measured by "performance density", the 4090 does fall into the regulated category.
As soon as the news came out, the 4090 sold out in stores almost immediately.
However, the official document released by the U.S. Department of Commerce that day actually contains this sentence:
As part of these updates, we will also introduce an exemption to allow the export of chips for consumer applications.
As a result, the "4090 ban" story had barely broken before news of a "big reversal" began circulating online today.
Whether the United States will approve Nvidia's exemption application for the 4090 remains to be seen.
Amid these twists and turns, the scalpers are the ones winning.
The market price of the 4090 has now risen to almost 30,000 yuan per card, and it is still climbing sharply.
The gaming community is in mourning as a result: who would have thought that a chip ban could put games out of reach?
Moreover, the 4090 matters not only to gamers but also to many Chinese artificial intelligence research institutions.
Many others, meanwhile, are excited: Chinese-made graphics cards, your opportunity has come!
Chip Ban and Moore’s Law
Just today, Ben Thompson, a well-known American technology analyst, published a long article on his blog, providing an analysis of the current chip ban.
The main targets of this round of export controls are the H800 and A800, two "China-specific versions" designed to comply with the previous ban.
The main difference between the H800/A800 and the H100/A100 is interconnect bandwidth:
The A100's interconnect bandwidth is 600 GB/s (which happened to be the upper limit imposed by last year's export controls) and the H100's is 900 GB/s; the A800 and H800 are limited to 400 GB/s.
To see why interconnect speed matters, start with Nvidia CEO Jensen Huang's earlier statement that Moore's Law is dead.
Moore's Law, first proposed by Gordon Moore in 1965, states that the number of transistors on an integrated circuit doubles every year.
A decade later, Moore revised his forecast to a doubling every two years. That pace held until roughly the last decade and has since slowed to a doubling about every three years.
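To make the three doubling rates concrete, here is a minimal sketch of the compound growth each one implies over a decade. The function name and the ten-year window are illustrative choices, not something taken from Thompson's article.

```python
def growth_factor(years: float, doubling_period_years: float) -> float:
    """Multiplicative growth in transistor count after `years`,
    assuming one doubling every `doubling_period_years`."""
    return 2 ** (years / doubling_period_years)


# Compare the 1965 forecast, the revised forecast, and the recent pace.
for period in (1, 2, 3):
    factor = growth_factor(10, period)
    print(f"doubling every {period} year(s): x{factor:.1f} over 10 years")
```

A yearly doubling compounds to about a thousandfold increase in ten years, a two-year doubling to 32x, and a three-year doubling to only about 10x, which is why the slowdown matters so much.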
In practice, however, Moore's Law functions more like a guiding precept for the tech industry: over time, computing gets more powerful and cheaper.
For ease of discussion, the author calls this "Moore's Precept", which follows from Moore's Law:
Smaller transistors switch faster and consume less energy per switch, and more of them fit on a single wafer.
This means you can put either more chips or larger chips on each wafer, which either lowers the price or increases computing power at the same price. In practice, the industry has tended to do both.
The rest of the tech industry doesn't need to understand the technical or economic details of Moore's Law.
For 60 years, technology professionals have taken it for granted that computers will get faster and faster, so they chase the latest technology and trust that the processor speed will keep up with their use cases.
Just seeing that a use case is possible is enough. If it is not yet optimal, Moore's Precept will deliver the improvement needed to get it there.
Moore's Law, the end?
The difference between Moore's Law and Moore's Precept is the key to understanding Jensen Huang's claim that "Moore's Law is dead."
From a technology perspective, Moore's Law has indeed slowed down, but density continues to increase.
Here are the transistor densities by TSMC’s different process nodes, using first-generation versions of each node:
However, cost is also very important.
Here is the same table with TSMC's price per wafer added, along with the price per billion transistors:
The numbers in the lower right corner of this table hide something interesting:
TSMC's 5nm process raised the price per transistor, and by a large margin: about 20%.
The reason is clear. 5nm is the first process that requires ASML's extreme ultraviolet (EUV) lithography, and EUV machines are expensive, at about $150 million each.
In other words, while the technical definition of Moore's Law will continue to hold, chips will no longer automatically keep getting faster and cheaper.
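The arithmetic behind "price per billion transistors" can be sketched as follows. All wafer prices and densities here are hypothetical placeholders chosen to reproduce a 20% increase. They are not TSMC's actual figures, and the usable wafer area is a rough assumption for a 300 mm wafer.

```python
def price_per_billion_transistors(
    wafer_price_usd: float,
    density_mtr_per_mm2: float,
    usable_area_mm2: float = 70_000,  # assumed: roughly the area of a 300 mm wafer
) -> float:
    """Wafer price divided by the billions of transistors on the wafer."""
    transistors_in_billions = density_mtr_per_mm2 * usable_area_mm2 / 1_000
    return wafer_price_usd / transistors_in_billions


# Hypothetical nodes: density rises 1.5x, but the wafer price rises 1.8x.
old_node = price_per_billion_transistors(10_000, 100)
new_node = price_per_billion_transistors(18_000, 150)
increase = new_node / old_node - 1
print(f"price per billion transistors changes by {increase:+.0%}")
```

The point of the sketch is that density can keep improving while the cost per transistor still goes up, as long as the wafer price grows faster than the density does.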
GPUs and parallelism
To be clear, Huang's argument is not only about the cost of 5nm chips; it is also about speed.
Remember, Moore's Law is as much about speed as it is about cost.
The fact is, as energy becomes a constraint in everything from mobile devices to PCs to data centers, the increase in chip density is primarily about improving energy efficiency.
For several years, Jensen Huang has argued that Nvidia has a way to make computing faster: use GPUs.
GPUs are much less complex than CPUs, which means they can execute instructions faster, but those instructions must be much simpler.
At the same time, you can run a large number of GPUs simultaneously and achieve extraordinary results.
Graphics processing is the most obvious example of an "embarrassingly parallel" workload:
Each "shader" on the GPU (the main processing component of the GPU) is responsible for calculating what is displayed in a specific area of the screen.
The size of this area depends on how many shaders there are. If you have 1024 shaders, each shader will draw 1/1024 of the screen area.
So if you have 2048 shaders, drawing to the screen will be twice as fast.
Because graphics processing is "embarrassingly parallel", its performance scales roughly linearly with the number of processors applied to it.
This embarrassing parallelism is the key to GPUs outperforming CPUs.
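The shader example can be sketched in a few lines. This is an illustration only: the `shade` function is a made-up per-pixel computation, Python threads stand in for GPU shaders, and `ideal_frame_time` assumes zero coordination overhead. The point is that each region is computed without reference to any other, so the work divides cleanly.

```python
from concurrent.futures import ThreadPoolExecutor
from math import ceil


def shade(pixel_index: int) -> int:
    """Hypothetical per-pixel computation; depends only on its own input."""
    return (pixel_index * 31) % 256


def shade_region(region: list[int]) -> list[int]:
    return [shade(p) for p in region]


def render(pixels: list[int], num_shaders: int) -> list[int]:
    """Split the screen into regions and shade each one independently."""
    region_size = ceil(len(pixels) / num_shaders)
    regions = [pixels[i:i + region_size] for i in range(0, len(pixels), region_size)]
    with ThreadPoolExecutor(max_workers=num_shaders) as pool:
        shaded = list(pool.map(shade_region, regions))  # map keeps region order
    return [value for region in shaded for value in region]


def ideal_frame_time(total_pixels: int, num_shaders: int, time_per_pixel: float = 1.0) -> float:
    """Idealized frame time: the pixels one shader must draw, times cost per pixel."""
    return ceil(total_pixels / num_shaders) * time_per_pixel


pixels = list(range(4096))
frame = render(pixels, num_shaders=8)
speedup = ideal_frame_time(4096, 1024) / ideal_frame_time(4096, 2048)
print(f"2048 shaders vs 1024 shaders: {speedup:.1f}x faster (idealized)")
```

Because no region needs data from another, the parallel result matches a serial pass over the same pixels, and doubling the shader count halves the idealized frame time.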
The current challenge, however, is that not all software problems can be easily parallelized.
NVIDIA's CUDA ecosystem is designed to provide tools to build software applications that can take advantage of GPU parallelism. This is one of the main moats supporting Nvidia's dominance.
However, most software applications still require CPU complexity in order to run.
AI is not like most software.
It turns out that AI is an embarrassingly parallel application, both when training a model and when using it for inference. Moreover, the useful degree of scaling goes far beyond what is needed to drive a single computer monitor.
This is why Nvidia's AI chips have the "high-speed interconnect" capability targeted by the chip ban:
AI applications can run across many AI chips at once, but keeping those GPUs running at full speed depends on feeding them data fast enough, and that is what high-speed interconnects are for.
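As a rough illustration of what the bandwidth caps mean, the sketch below uses the interconnect figures cited earlier in the article to estimate how long it takes to move a fixed amount of data between chips. The 100 GB payload is a hypothetical value, and the model ignores latency, topology, and protocol overhead.

```python
# Interconnect bandwidths in GB/s, as cited in the article.
BANDWIDTH_GB_PER_S = {"A100": 600, "H100": 900, "A800/H800": 400}


def transfer_seconds(payload_gb: float, bandwidth_gb_per_s: float) -> float:
    """Idealized time to move `payload_gb` over a link (no latency or overhead)."""
    return payload_gb / bandwidth_gb_per_s


payload_gb = 100  # hypothetical amount of data exchanged between GPUs
for chip, bandwidth in BANDWIDTH_GB_PER_S.items():
    seconds = transfer_seconds(payload_gb, bandwidth)
    print(f"{chip}: {seconds:.3f} s")

slowdown_vs_h100 = transfer_seconds(payload_gb, 400) / transfer_seconds(payload_gb, 900)
print(f"A800/H800 takes {slowdown_vs_h100:.2f}x as long as the H100")
```

Under these simplified assumptions, the capped chips need 2.25 times as long as an H100 to exchange the same data, time during which the GPUs may sit idle waiting for input.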
The author is nonetheless skeptical that traditional data center applications will shift wholesale to GPUs.
In his view, people and companies are lazy, and CPU-based applications are not only easier to develop but mostly already built.
Very few companies will take the time and effort to port something that already runs on the CPU to the GPU.
Ultimately, the applications that run in the cloud are determined by the customers who supply the demand for cloud resources, not by cloud providers looking to optimize FLOP/rack.
In addition, it turns out that Moore's Precept is likely to be back on track, so traditional CPUs still have life in them.
EUV is key
The table above stops at 5nm, but the iPhone 15 Pro uses a chip made on the N3 process, whose price per transistor is as follows:
At the 3nm node, the iPhone's A17 Pro chip currently uses the N3B process, while the later N3E will be the basis for the future N3 family.
This makes N3's leap in price per transistor even more impressive: N3B reverses the regression seen at 5nm, while N3E is a significant improvement over 7nm.
Overall, although price per wafer has kept rising, price per billion transistors has, on the whole, kept falling. This is Moore's Law at work.
In other words, new equipment (such as EUV) allows us to "embed more components on integrated circuits."
The situation at 5nm resembles the situation at 20nm, the last time the price per billion transistors increased:
TSMC started using double patterning at that node, which meant performing each photolithography step twice.
This both doubled lithography equipment usage per wafer and reduced yield.
At 20nm, at least, the benefits of producing smaller transistors were outweighed by the costs.
But by the 3nm process, the benefits of EUV have far exceeded the costs, and early rumors about 2nm density and price suggest that these benefits should continue into the next node.
All in all, the author finds that TSMC's newer EUV-based N3E process delivers a larger improvement in price per billion transistors than the earlier N3B process.
This has put Moore's Law back in motion after the setback at 5nm.
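The double-patterning trade-off at 20nm described above can be put into a rough cost model. Every number here is hypothetical and chosen only to show the mechanism: doubling the lithography passes raises the wafer cost, lower yield shrinks the number of good dies, and the extra dies gained from smaller transistors may or may not make up the difference.

```python
def cost_per_good_die(
    other_cost_usd: float,
    litho_cost_per_pass_usd: float,
    litho_passes: int,
    dies_per_wafer: int,
    yield_fraction: float,
) -> float:
    """Wafer cost (lithography plus everything else) spread over the good dies."""
    wafer_cost = other_cost_usd + litho_cost_per_pass_usd * litho_passes
    return wafer_cost / (dies_per_wafer * yield_fraction)


# Hypothetical older node: single patterning, 500 dies per wafer, 90% yield.
single_patterning = cost_per_good_die(3_000, 2_000, 1, 500, 0.90)

# Hypothetical shrink: more dies per wafer, but two passes and lower yield.
double_patterning = cost_per_good_die(3_000, 2_000, 2, 700, 0.80)

print(f"single patterning: ${single_patterning:.2f} per good die")
print(f"double patterning: ${double_patterning:.2f} per good die")
```

With these particular inputs the shrink costs more per good die despite fitting 40% more dies on the wafer. Different inputs would tip the balance the other way, which is the position the article describes for EUV at 3nm.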
Bans are imperfect, but they are useful
Ben Thompson summed it up in a blog post last month:
TSMC has demonstrated that it can manufacture 7nm chips using deep ultraviolet (DUV)-based immersion lithography technology, and China has a large number of DUV lithography machines.
Semiconductor Manufacturing International Corporation (SMIC) also manufactured 7nm chips in 2022.
But the manufacturing cost is extremely high. Taking Intel as an example, they could have used DUV lithography technology to produce 7nm chips, but due to cost reasons, they eventually switched to EUV technology.
In other words, it is not surprising that SMIC can produce 7nm chips with DUV lithography, but that does not mean the chip ban has been circumvented.
In fact, the key lies in the 5nm node. In other words, the export control that will really restrict China's long-term development is EUV technology.
Previously, the United States had persuaded ASML of the Netherlands to stop exporting EUV lithography machines, and the Biden administration has tightened this restriction through the chip ban and further coordination with the Netherlands.
The H800 is made on TSMC's third-generation 5nm process (called N4), which means it is manufactured with EUV. In addition, the limits on interconnect speed will directly slow AI research and development and raise its cost.
Although this cannot completely prevent the development of AI, EUV lithography machines are necessary to achieve Moore's Precept.