After circulating in the media for several years, the name “Huang’s Law” has finally been officially recognized by NVIDIA.
Will Moore’s Law fail?
Intel has refused to admit it, but other manufacturers would presumably answer “yes”.
Taking that as a given, they care more about a different question: how can processors keep getting faster and more energy-efficient once Moore’s Law fails? Some companies, NVIDIA among them, have found their own answer.
Over the past few years, NVIDIA CEO Huang Renxun (Jensen Huang) has repeatedly said that “Moore’s law is dead and a new law is taking shape”. For GPUs in particular, he predicts that performance will increase 1,000-fold every 10 years. This prediction has been dubbed “Huang’s Law”.
Notably, at the ongoing GTC 2020 China online conference, “Huang’s Law” received an official stamp: during the event, Bill Dally, NVIDIA’s chief scientist and senior vice president of research, used the term “Huang’s Law” himself.
What is Huang’s Law?
Before turning to Huang’s Law, let’s look at Moore’s Law, which is predicted to fail or is already being called “dead”.
Moore’s Law, proposed by Gordon Moore, one of the founders of Intel, predicts that the number of transistors that can be accommodated on an integrated circuit will double approximately every 24 months.
Later, Intel executive David House offered another version of the statement: chip performance will double every 18 months.
Now Huang Renxun has brought his own answer to the goal of doubling performance: one year. In his view, GPU performance doubles in just one year, a pace at least 1.5 times faster than Moore’s Law.
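The gap between these doubling periods compounds quickly. As a rough illustration, the sketch below computes the idealized growth after ten years for a doubling every 24, 18, and 12 months. It assumes perfectly smooth exponential growth, and the function name `growth_factor` is ours, not something from NVIDIA or Intel.

```python
def growth_factor(years: float, doubling_months: float) -> float:
    """Idealized growth multiple after `years`, given one doubling every `doubling_months`."""
    doublings = years * 12 / doubling_months
    return 2.0 ** doublings


if __name__ == "__main__":
    # Labels map each doubling period to the version of the law quoted in the article.
    for label, months in [("Moore (24 months)", 24),
                          ("House (18 months)", 18),
                          ("Huang (12 months)", 12)]:
        print(f"{label}: about {growth_factor(10, months):,.0f}x in 10 years")
```

Doubling every year for ten years gives 2^10 = 1,024, which lines up with the “1000 times every 10 years” figure attached to Huang’s Law, while a 24-month doubling period yields only 32x over the same decade.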
Of course, any argument without a factual basis is fragile. For this reason, Bill Dally used NVIDIA’s own GPU products as an example in his online speech to show that process technology is not the core driver of chip performance improvement.
Specifically, Bill Dally showed a chart of the performance growth of NVIDIA GPU products from Kepler in 2012 to the Ampere A100 in May 2020.
In 8 years, single-chip inference performance improved 317-fold. “In fact, our inference performance is more than doubling every year, in part due to improvements in Tensor Cores and more optimized circuit designs and architectures, with process technology playing a smaller role,” said Bill Dally.
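The “more than doubling every year” claim can be checked against the 317-fold figure with a short calculation. The sketch below assumes the gain was spread evenly across the 8 years, which is a simplification; the helper names are ours.

```python
import math


def annual_factor(total_gain: float, years: float) -> float:
    """Average yearly multiplier implied by `total_gain` over `years`."""
    return total_gain ** (1.0 / years)


def doubling_time_months(total_gain: float, years: float) -> float:
    """Average number of months per doubling implied by `total_gain` over `years`."""
    return 12 * years * math.log(2) / math.log(total_gain)


if __name__ == "__main__":
    print(f"Average yearly gain: {annual_factor(317, 8):.2f}x")
    print(f"Average doubling time: {doubling_time_months(317, 8):.1f} months")
```

Under that assumption, 317x over 8 years works out to roughly 2.05x per year, slightly ahead of the 2^8 = 256x that exact yearly doubling would give.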
He explained that from 2012 to 2020, NVIDIA used only three generations of process technology in its GPU products: 28nm for the initial Kepler architecture, 16nm in the middle period, and most recently 7nm for the Ampere architecture.
Bill Dally pointed out that process technology accounts for less than 20% of the “317 times” gain overall, and that the main contributor is architectural improvement.
“After Moore’s Law is gone, we still have ‘Huang’s Law’ to keep improving computing performance, because we need to take advantage of higher computing performance to do a lot of work in the future.”
As is well known, the key to Moore’s Law is using more advanced process technology to fit more transistors into a given area, which is easy to understand. But how are the “architectural improvements” behind Huang’s Law achieved? Bill Dally answered this question in his speech as well.
How to implement Huang’s Law?
Bill Dally answered this question in his speech with three projects.
The first is MAGNet, a tool for generating ultra-energy-efficient accelerators. According to NVIDIA, an AI inference accelerator generated by MAGNet reached 100 tera-operations per watt in simulation, an order of magnitude higher than current commercial chips.
This is possible because MAGNet uses a set of new techniques to orchestrate the flow of information through the device and minimize data movement, which is the most energy-intensive part of today’s chips. The research prototype is implemented in a modular way, so it can be scaled flexibly.
The goal of the second project is to replace electrical links in existing systems with faster optical links. Bill Dally said: “We can double the speed of NVLink to the GPU, maybe double it again, but the electrical signal will eventually run out.”
Currently, a team of 200 people, led by Bill Dally, is working closely with researchers at Columbia University on how to use technology that telecom providers use in their core networks to transmit dozens of signals over a single fiber.
This technology, called “dense wavelength division multiplexing”, is expected to carry multiple terabits per second through links that fit within a single millimeter of space on the edge of a chip, more than ten times the interconnect density available today.
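The idea behind dense wavelength division multiplexing is that each wavelength (color) of light carries its own data stream over the same fiber, so aggregate bandwidth scales with the number of wavelengths. A minimal sketch of that arithmetic follows; the channel count and per-channel rate are purely hypothetical values chosen for illustration, not figures from NVIDIA or Columbia University.

```python
def aggregate_tbps(wavelengths: int, gbps_per_wavelength: float) -> float:
    """Total fiber bandwidth in Tb/s when each wavelength carries an independent stream."""
    return wavelengths * gbps_per_wavelength / 1000.0


if __name__ == "__main__":
    # Hypothetical example: 40 wavelengths at 25 Gb/s each on one fiber.
    wavelengths = 40
    rate_gbps = 25.0
    total = aggregate_tbps(wavelengths, rate_gbps)
    print(f"{wavelengths} wavelengths x {rate_gbps} Gb/s = {total} Tb/s on one fiber")
```

With these assumed numbers a single fiber reaches 1 Tb/s, which shows how “dozens of signals” on one fiber can add up to terabit-class links.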
Notably, beyond higher throughput, optical links also make denser systems possible. As an example, Bill Dally showed a mockup of a future NVIDIA DGX system equipped with more than 160 GPUs.
Unleashing the full potential of optical links also requires matching software. This is the third project Bill Dally shared: Legate, a prototype of a new programming system.
Legate couples a new programming shorthand with accelerated software libraries and Legion, an advanced runtime environment. With Legate, developers can run programs written for a single GPU on systems of any size, even a supercomputer such as Selene with thousands of GPUs.
Currently, Legate is being tested at U.S. national laboratories.
Is there a “ceiling” for Huang’s Law?
Establishing and sustaining a law is not something one company can do alone. It takes upstream and downstream partners to stimulate new demand and drive innovation.
In this regard, NVIDIA is also building its own AI ecosystem centered on GPU products. Taking the NVIDIA Startup Acceleration Program as an example, within four years, more than 7,000 companies have joined the program, covering 92 countries around the world. Among them, in China alone, more than 800 companies have been supported by NVIDIA.
Bill Dally’s speech also presented 12 representative AI startups in China, including self-driving stars Creation Vision Future and WeRide Zhixing, and satellite image data analysis company Dadi Quantum.
Take TuSimple as an example. In an earlier interview, the company said publicly that the efficiency of its system built on NVIDIA AI chips doubled every year. Judging from commercial use and from what companies themselves report, “Huang’s Law” is clearly at work and becoming a reality.
Of course, given that Moore’s Law is failing, people inevitably start to wonder: will Huang’s Law also fail one day?
Progress in process technology is visible to the naked eye, while architectural improvement is relatively intangible. Some have suggested a possible answer of “10 years”. We will not judge the accuracy of that answer here. Whether or not a “ceiling” exists, what we can say for now is that, judging from the performance of the past 8 years, Huang’s Law will continue to hold over the next ten years. With “performance doubling every year”, the results this law eventually brings are still worth looking forward to.