VMware works with Nvidia to deliver generative AI cloud

VMware has expanded its strategic partnership with Nvidia to enable private cloud infrastructure for the era of generative artificial intelligence (AI).

It positions the new VMware Private AI Foundation with Nvidia as a platform that enables enterprises to customise models and run generative AI applications, including intelligent chatbots, assistants, search and summarisation.

VMware said the platform allows organisations to customise data-centric foundation models using accelerated computing from Nvidia, built on VMware Cloud Foundation and optimised for AI.

Speaking at the company’s VMware Explore 2023 event, Raghu Raghuram, CEO of VMware, said generative AI and multi-cloud are a perfect match. “Together with Nvidia, we’ll empower enterprises to run their generative AI workloads adjacent to their data with confidence while addressing their corporate data privacy, security and control concerns,” he said.

Jensen Huang, founder and CEO of Nvidia, said the company’s BlueField-3 data processing units (DPUs) accelerate, offload and isolate the compute load of virtualisation, networking, storage, security and other cloud-native AI services from the graphics processing unit (GPU) or central processing unit (CPU).

“Our expanded collaboration with VMware will offer hundreds of thousands of customers – across financial services, healthcare, manufacturing and more – the full-stack software and computing they need to unlock the potential of generative AI using custom applications built with their own data,” he said.

Since large language models typically run across multiple computers equipped with GPUs for acceleration, the VMware product family offers IT departments a way to orchestrate and manage workloads across these complex distributed systems.

The platform is being positioned as an approach to generative AI that allows organisations to customise large language models; produce more secure and private models for their internal usage; and offer generative AI as a service to users. It’s also designed to run inference workloads at scale in a secure environment.

Alongside the platform, Nvidia announced a number of AI-ready servers, which include L40S GPUs, Nvidia BlueField-3 DPUs and Nvidia AI Enterprise software. The company said that, as well as providing the infrastructure and software to power VMware Private AI Foundation with Nvidia, the servers can be used to fine-tune generative AI foundation models and deploy generative AI applications such as intelligent chatbots, search and summarisation tools.

The L40S GPUs include fourth-generation Tensor Cores and an FP8 Transformer Engine, which Nvidia claimed deliver more than 1.45 petaflops of tensor processing power and up to 1.7x the training performance of its A100 Tensor Core GPU.

For generative AI applications such as intelligent chatbots, assistants, search and summarisation, Nvidia said the L40S enables up to 1.2x more generative AI inference performance than the A100 GPU.

Nvidia said servers that integrate its BlueField DPUs can offer greater acceleration by offloading and isolating the compute load of virtualisation, networking, storage, security and other cloud-native AI services.

Dell, HPE and Lenovo have built new hardware to support the VMware platform. Nvidia AI-ready servers include the Dell PowerEdge R760xa, HPE ProLiant Gen11 servers for VMware Private AI Foundation with Nvidia, and Lenovo ThinkSystem SR675 V3.