NVIDIA GTC (GPU Technology Conference) 2022 is back as a virtual event, held over four days from March 21 to March 24. The previous GTC ran from November 8 to November 11, 2021, so this edition arrives only a few months later, reflecting a faster development cycle. At the keynote, NVIDIA’s flamboyant CEO Jensen Huang once again donned his leather jacket to highlight major platform developments in AI, autonomous vehicles, and other GPU use cases.
GTC is an event aimed at developers, researchers, engineers, and innovators. As the largest maker of graphics processing units (GPUs), NVIDIA is among the most innovative companies right now. While GPUs were initially used to accelerate graphics in gaming, their use cases have expanded to include cloud computing, artificial intelligence (AI), digital twins, autonomous vehicles, and many others.
NVIDIA knows its place in the technology world and has transformed itself from a chip maker into a platform player. The chip giant now offers its platforms and designs as reference architectures for its partners to build in volume. At GTC 2022, NVIDIA further showed how it is expanding that approach to help transform industries.
Hopper H100 GPU to scale AI data centres
At 1 hour and 39 minutes, Huang’s keynote was long, but it had a clear showstopper in the form of the H100 GPU. This is the first chip based on NVIDIA’s new Hopper architecture, named for Grace Hopper, a pioneering US computer scientist. The Hopper architecture succeeds Ampere, announced two years ago, and NVIDIA positions it as the next step up in accelerated computing performance.
The headline figure for the H100 GPU is its 80 billion transistors, built on TSMC’s 4nm process. Huang is so confident about the H100 that he calls it the engine for the world’s AI infrastructure. Sticking with numbers, it is the first GPU to support PCIe Gen5 and the first to utilise HBM3. HBM3 enables memory bandwidth of 3TB/s, and the chip offers nearly 5TB/s of external connectivity.
To put these numbers into perspective, Huang says that twenty H100 GPUs can “sustain the equivalent of the entire world’s internet traffic.” NVIDIA did not stop at headline numbers for this new Hopper-based GPU: it has also added a new Transformer Engine to speed up training and inference of transformer models.
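To get a feel for the scale behind that comparison, the aggregate bandwidth of twenty GPUs can be worked out with simple arithmetic. The sketch below is only illustrative: it assumes roughly 4.9TB/s of external bandwidth per GPU (the "nearly 5 terabytes per second" figure quoted above), and the function names are hypothetical helpers. It does not attempt to estimate actual internet traffic.

```python
# Assumption: ~4.9 TB/s of external bandwidth per H100, expressed in GB/s
# (decimal units, 1 TB = 1000 GB) so the arithmetic stays in integers.
EXTERNAL_BW_GB_PER_S = 4900


def aggregate_gb_per_s(gpu_count, per_gpu_gb_per_s=EXTERNAL_BW_GB_PER_S):
    """Total external bandwidth of gpu_count GPUs, in gigabytes per second."""
    return gpu_count * per_gpu_gb_per_s


def to_terabits_per_s(gb_per_s):
    """Convert gigabytes per second to terabits per second (8 bits per byte)."""
    return gb_per_s * 8 / 1000


total = aggregate_gb_per_s(20)
print(f"{total / 1000:.0f} TB/s aggregate")               # 98 TB/s aggregate
print(f"{to_terabits_per_s(total):.0f} Tbps aggregate")   # 784 Tbps aggregate
```

Note the factor of eight between terabytes per second (TB/s) and terabits per second (Tbps): network traffic is usually quoted in bits, while GPU bandwidth is usually quoted in bytes.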
Transformers are among the most widely used deep learning models. They rely on a concept called attention, where the model weighs the significance of each part of the input data. NVIDIA is not pitching the H100 as a single GPU but is instead highlighting the benefits of linking multiple H100s together over the NVLink interconnect, with each GPU offering external bandwidth of 4.9TB/s.
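The attention idea can be sketched in a few lines. The example below is a minimal, pure-Python version of scaled dot-product attention, the textbook form of the mechanism. It is not NVIDIA's Transformer Engine, and the toy two-dimensional `tokens` vectors are made up for illustration. Real models also learn separate query, key, and value projections, which are omitted here.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # Score each key by its similarity to the query.
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # The output is the weighted average of the value vectors.
        out = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


# Toy input: three "tokens" used as queries, keys, and values (self-attention).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = attention(tokens, tokens, tokens)
for row in weights:
    print([round(w, 3) for w in row])
```

Each row of `weights` shows how much one token attends to every token in the input. For the first token, the orthogonal second token receives the smallest weight.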
The H100 is also the world’s first accelerator chip to support confidential computing capabilities. This will enable protection for AI models and customer data while they are being processed. NVIDIA says H100 will enable chatbots built on Megatron 530B, the most powerful monolithic transformer language model.
The Hopper architecture and the H100 are a breakthrough for AI computing, and industry support for the GPU is broad. Leading cloud service providers including Alibaba Cloud, Amazon Web Services, Baidu AI Cloud, Google Cloud, Microsoft Azure, Oracle Cloud, and Tencent Cloud have announced plans to offer instances based on the H100, while servers with H100 accelerators are expected from leading systems manufacturers.
NVIDIA takes simulation to the cloud with Omniverse Cloud
Omniverse is a collaboration and simulation engine from NVIDIA that obeys the laws of physics. It allows companies and developers to build a virtual version of a physical object and train against it in simulation, which helps cut down training time. Omniverse has been used to teach a robot to walk by training it virtually and then uploading the resulting data to the robot. Omniverse is also an ideal tool for building digital twins, a concept used by factories and industries to build a digital copy of their products or tools.
At GTC 2022, NVIDIA announced Omniverse Cloud, which makes the simulation engine available as a streaming cloud service. While Omniverse itself requires a powerful system, Omniverse Cloud can run on any hardware, including a Chromebook. The main requirement for Omniverse Cloud to work well is a reliable internet connection.
AI Enterprise Stack
As mentioned earlier, NVIDIA wants to be a platform player and not just a GPU manufacturer. This platform ambition becomes clear when you look at the announcements around DGX systems and updates to the CUDA-X libraries.
NVIDIA is pitching its enterprise AI stack as a multi-layer model. The bottom layer of this model includes different systems such as DGX, HGX, EGX, and others built using NVIDIA’s GPUs and DPUs. Above this system layer sit the software and tools developers need to work with the hardware. These include CUDA, TAO, RAPIDS, Triton Inference Server, TensorFlow, and other software.
A tech stack is incomplete without some pre-built AI applications. The top layer of this multi-layer model is a set of pre-built AI systems that help developers address specific use cases. There is Maxine for communications and video AI systems, Riva for speech, Clara for healthcare, Merlin for recommender systems, and Isaac for robotics.
This enterprise AI stack can be used by software vendors to build new capabilities. Avaya, a unified communications vendor, is using Maxine in its Spaces product for removing noise, building virtual backgrounds, and other features for video meetings. Automakers such as Jaguar and Mercedes are using Drive for their autonomous vehicles.
NVLink becomes an external switch
At GTC, NVIDIA announced that NVLink is expanding from an internal interconnect into a fully external switch. In the past, NVLink was used to connect GPUs inside a single computing system. The fourth-generation NVIDIA NVLink allows up to 256 GPUs to act as a single chip. According to NVIDIA, this move from internal to external switching delivers compute performance of 192 teraflops.
NVIDIA keynotes stand apart from other tech keynotes because of the way Huang translates those numbers into real use cases. With such compute performance available through NVLink, it will be easier to run recommendation systems, natural language processing, and other AI workloads. As we keep saying at ai.nl, data sets and AI models are only getting bigger, and the compute performance available right now may not be sufficient in a few years’ time.
NVIDIA shows it is here for greater things
NVIDIA has always been an outlier among the big tech companies, offering essentially one product: the GPU. With the likes of AMD and Intel building data centre architectures that span both CPUs and GPUs, NVIDIA was even written off by some. At GTC, however, Jensen Huang and company once again showed NVIDIA’s relevance and importance in a world where AI is omnipresent.
The transformation of NVIDIA is most akin to that of Apple, which is evolving from a hardware brand into a services company. Apple is using its hardware as a means to experience its software and services. Similarly, NVIDIA is using its silicon (the GPU) as a means to experience its AI services and scalable computing platform. This approach is enough to make NVIDIA the most interesting AI company out there right now.