Nvidia DPU, a challenge to Intel in the data center

In recent years, words like “disruptive”, “ultimate”, and “revolutionary” have become commonplace at technology companies’ launch events. At the iPhone 12 launch, Tim Cook reached for the phrase “new era”, marking the iPhone’s official entry into the era of 5G.

But Chinese consumers are no strangers to 5G, and because Apple’s “epoch-making” product fell short of market expectations, its market value dropped by 380 billion on the day of the launch. Whether Apple has really entered a “new era” will have to be proven by sales.

Compared with the closely watched consumer electronics field, this article focuses on the data center industry that most people know less well, and on the computing chips further upstream of it. With the mass adoption of cloud computing and the exponential growth of AI workloads, the data center has been elevated to a position of unprecedented importance.

At a recent forum on the digital communication industry, an expert from the China Academy of Information and Communications Technology argued that data centers will stand alongside 5G as the next commanding heights of digital technology. We heard a similar view from Jensen Huang at NVIDIA’s online GTC 2020: the data center has become a brand-new unit of computing.


Huang’s confidence rests on the new processor unveiled at this conference, the DPU, and on DOCA, the software ecosystem built around it. According to NVIDIA, the DPU can be combined with the CPU and GPU to form a fully programmable single AI computing unit, delivering unprecedented security and computing power.

So can the DPU really take on the same computational importance as the CPU and GPU, and drive a major innovation in the data center? Where exactly does its innovation lie? These are the questions we examine below.

1. The “core” of NVIDIA DPU

From NVIDIA’s introduction at GTC, the DPU (Data Processing Unit) is in fact an SoC that integrates Arm processor cores, a VLIW vector computing engine, and the functions of a smart network card. It is aimed mainly at distributed storage, network computing, and network security.

The DPU’s main role is to take over the CPU resources that data centers currently devote to distributed storage and network communication. Before the DPU, smart network cards (SmartNICs) were already gradually replacing CPUs in network security and network interconnect protocols. The DPU is effectively an upgraded successor to the SmartNIC: on one hand it strengthens the SmartNIC’s handling of network security and network protocols, and on the other it integrates and hardens distributed storage processing, so that it can replace the CPU in these domains and free CPU cycles for other applications.

NVIDIA’s technological breakthrough with the DPU stems from last year’s acquisition of the Israeli chip company Mellanox; the two DPUs of the BlueField series, the NVIDIA BlueField-2 DPU and the BlueField-2X DPU, were developed on top of Mellanox’s hardware.

According to NVIDIA, the BlueField-2 DPU has all the features of the NVIDIA Mellanox ConnectX-6 SmartNIC, combined with eight 64-bit Arm Cortex-A72 processor cores. It is fully programmable and provides a data transfer rate of 200 gigabits per second, accelerating critical data center security, networking, and storage tasks.

The key point is that a single BlueField-2 DPU can deliver data center services that would otherwise consume as many as 125 CPU cores, effectively freeing those CPU cores for application workloads.
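NVIDIA’s “125 cores” equivalence lends itself to a quick back-of-the-envelope calculation. The sketch below is illustrative only: the per-server core count is an assumption, and the 125-core figure is NVIDIA’s own claim rather than an independent measurement.

```python
# Back-of-the-envelope sketch of NVIDIA's claim that one BlueField-2 DPU
# absorbs infrastructure work (networking/storage/security) that would
# otherwise consume ~125 x86 CPU cores. All figures are illustrative.
DPU_CORE_EQUIVALENT = 125   # NVIDIA's stated equivalence per DPU (vendor claim)
CORES_PER_SERVER = 64       # assumed core count of a typical dual-socket server

def cores_freed(num_dpus: int) -> int:
    """CPU cores released for application workloads when DPUs take over
    infrastructure tasks, per NVIDIA's stated equivalence."""
    return num_dpus * DPU_CORE_EQUIVALENT

# A rack of 20 DPU-equipped servers would free roughly 39 servers' worth
# of CPU capacity for applications, if the vendor figure held up.
freed = cores_freed(20)
print(freed, freed / CORES_PER_SERVER)  # 2500 39.0625
```

The interesting consequence is that the savings scale linearly with the number of DPUs deployed, which is why the claim targets fleet-level economics rather than single-machine benchmarks.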

The BlueField-2X DPU includes all the key features of the BlueField-2 DPU, enhanced by the AI capabilities of an NVIDIA Ampere GPU. On NVIDIA’s roadmap, the future BlueField-4 will introduce CUDA and NVIDIA AI, greatly accelerating the processing of computer vision applications in the network.

Also worth noting is DOCA (Data-Center-Infrastructure-On-A-Chip Architecture), the software development kit NVIDIA introduced for the DPU. NVIDIA’s experts liken DOCA to what CUDA is in the data center server field, intending it to help developers build applications on DPU-accelerated data center infrastructure and thereby enrich the DPU application development ecosystem.

From the above we can see two of NVIDIA’s ambitions: the DPU tries to replicate the path by which the GPU grew from a graphics accelerator card into a general-purpose computing chip, and DOCA tries to replicate the ecosystem role CUDA played in making the GPU general-purpose.

Taken together with the news of NVIDIA’s recent move to acquire Arm, an important consideration for NVIDIA is to use Arm-architecture CPUs as the core, expanding from accelerating server applications to covering all server application scenarios. The target is naturally the x86 server ecosystem represented by Intel’s CPUs.

Before examining whether the DPU can challenge the CPU’s dominance, let us briefly review NVIDIA’s layout in the data center.

2. Nvidia’s data center “ambition”

After the slowdown of its gaming graphics card business and the sharp decline in results that followed the ebb of cryptocurrencies, NVIDIA, after many twists and turns, has firmly staked its future on AI computing and the data center.

In 2017, NVIDIA’s quarterly data center revenue exceeded $500 million for the first time, up 109% year-on-year, which led Huang to strongly affirm the value of the data center business at a conference.

As early as 2008, NVIDIA brought GPU computing to the data center with the earliest Tesla GPU accelerators and the first CUDA programming environment, trying to offload more parallel computation from the CPU to the GPU. This became the long-term strategy guiding the evolution of NVIDIA’s GPUs.

Since then, with the explosive growth of AI workloads in data centers, AI hardware has become the key to more and more data center expansion and construction. Once massive AI computing power became a hard requirement, NVIDIA’s GPUs, with their powerful parallel and floating-point performance, broke through deep learning’s computing bottleneck and became the first choice for AI hardware. This opportunity gave NVIDIA a firm foothold on the data center hardware map. Of course, NVIDIA’s ambitions go far beyond that.

NVIDIA’s key move came in March 2019, when it spent $6.9 billion to acquire the Israeli chip company Mellanox, which supplies a broad range of data center products for servers, storage, and hyperconverged infrastructure, including Ethernet switches, chips, and InfiniBand smart interconnect solutions. Combining NVIDIA’s GPUs with Mellanox’s interconnect technology allows data center workloads to be optimized across the entire compute, networking, and storage stack, yielding higher performance, higher utilization, and lower operating costs.

At the time, Jensen Huang called Mellanox’s technology the company’s “X-factor”, capable of transforming the data center into a large-scale processor architecture that can address high-performance computing requirements. The DPU we see now is already a first attempt at the prototype of that architecture.

This year, NVIDIA spent $40 billion to acquire the chip design company Arm from SoftBank. One intention is to apply Arm-architecture CPU designs to the future computing model NVIDIA intends to build, chiefly in supercomputing, autonomous driving, and edge computing. Combining NVIDIA’s GPU-based AI computing platform with Arm’s ecosystem will not only strengthen NVIDIA’s high-performance computing (HPC) capabilities, but also drive its data center revenue to continued record highs.

It is fair to say that NVIDIA’s success in the data center field hinges on whether it can achieve large-scale data center computing: from the self-developed DGX server line, to the integration of Mellanox’s technology, to building a new data center computing architecture on the Arm ecosystem, all of it is preparation for transforming the data center business.

Of course, achieving this goal also depends on whether Intel will stand for it.

3. How far is NVIDIA from challenging Intel?

At present, about 95% of data center GPUs are still paired with x86 CPUs. Simply adding more GPUs will not shake Intel’s dominance in data center servers. NVIDIA is clearly no longer content with the incremental market; it wants to cut into the data center’s installed base, that is, to replace the x86 CPUs dominated by Intel (and AMD) with its own chip products.

Since NVIDIA moved to acquire Arm, it has repeatedly shown its determination to use Arm processors to take more of the data center server market, and the DPU with its integrated Arm cores will be its first entry point for displacing x86 CPUs in that installed base.

Launching the DPU to enter this market, rather than pitting an Arm-core CPU directly against the x86 CPU, is in fact the more expedient approach. It amounts to using a next-generation processor that integrates networking, storage, security, and other tasks to gradually displace the CPU: even if the Arm CPU inside cannot match a same-generation x86 CPU, the dedicated acceleration modules integrated on the DPU SoC can push overall system performance past the x86 machine. This strategy, with its flavor of “Tian Ji’s horse racing” (winning the overall match despite fielding weaker individual horses), is likely to be the beginning of NVIDIA’s replacement of low-end x86 CPUs.
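The “Tian Ji’s horse racing” logic can be made concrete with a toy model. All numbers below are invented for illustration; the only point is that a weaker general-purpose core plus a dedicated offload engine can beat a stronger core that must also carry the infrastructure load itself.

```python
# Toy model of the DPU strategy: application throughput left over when
# infrastructure work (networking/storage/security) either competes with
# applications for CPU cycles, or is absorbed by dedicated DPU engines.
# All numbers are invented for illustration, not measurements.

def app_throughput(cpu_perf: float, infra_load: float, offloaded: bool) -> float:
    """Application throughput remaining after infrastructure overhead.

    cpu_perf   -- general-purpose performance of the host CPU (arbitrary units)
    infra_load -- infrastructure work the node must perform (same units)
    offloaded  -- True if a DPU's dedicated engines absorb the infra work
    """
    if offloaded:
        return cpu_perf                    # CPU runs applications only
    return max(cpu_perf - infra_load, 0.0)  # infra work steals CPU cycles

# A stronger x86 CPU that must do everything itself...
x86_alone = app_throughput(cpu_perf=100.0, infra_load=30.0, offloaded=False)
# ...versus a weaker Arm CPU whose DPU absorbs the infrastructure load.
arm_plus_dpu = app_throughput(cpu_perf=80.0, infra_load=30.0, offloaded=True)

print(x86_alone, arm_plus_dpu)  # 70.0 80.0
```

Under these made-up numbers the weaker core wins at the system level, which is exactly the mismatched-strengths bet the article describes.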

But Nvidia still faces a series of difficulties if it wants to challenge Intel in the mid-to-high-end processor market.

First, NVIDIA’s GPUs and x86 CPUs have formed a very stable and strongly complementary relationship. For NVIDIA to put Arm-based processors into high-end servers, Arm processor performance must improve substantially, and that path is not yet clear.

Second, Intel has already responded to NVIDIA’s various challenges with deployments of its own. Back in 2017, Intel announced it was developing a full-stack GPU portfolio, and its first GPUs are expected to launch next year across the various markets where GPUs are used.

To block NVIDIA’s expansion in AI computing and autonomous driving, Intel successively acquired Nervana and Movidius as its layout in edge AI computing, and Mobileye as its layout in autonomous driving. In 2018, Intel also announced oneAPI, a full-stack open software ecosystem for heterogeneous computing, to counter the expansion of the CUDA ecosystem. In other words, Intel is not only making moves in NVIDIA’s backyard, but also shoring up its own x86 server ecosystem.

For Intel, the data center business is likewise becoming a core component. In Q4 2019, Intel’s data center business surpassed its PC business to become its main source of revenue, and this year’s restructuring of its technical organization and executive team was read by outside observers as the beginning of a full pivot toward the data center business.

It is conceivable that in the data center processor business of the future, NVIDIA will face Intel’s strongest defense and counterattack, and the many server integrators may turn out to be the beneficiaries of this tussle.

The mantis stalks the cicada, unaware of the oriole behind: NVIDIA must also contend with a new pursuer, AMD. Not long ago it emerged that AMD would spend $30 billion to acquire Xilinx, a move widely read as a dual strategy of challenging Intel while blocking NVIDIA.

In addition, NVIDIA faces the challenge of customers’ self-developed chips in the data center processor business. Cloud service providers are unwilling to hand their computing core entirely to NVIDIA; AWS, Google, Alibaba, and Huawei are all already deploying their own cloud processors.

In any case, the data center has become the main battlefield for chip giants such as Intel, NVIDIA, and AMD. How NVIDIA finds its way into mid-to-high-end server processors, between the x86 incumbents and cloud customers’ self-developed chips, is the key question; the newly released DPU can only count as a first attempt.

In the future, the contest over the data center will play out fully across AI, supercomputing, and every related field.
