NVIDIA Announces Generative AI Microservices for Developers

Published March 18, 2024

NVIDIA has unveiled a suite of generative AI microservices designed to help developers create and launch AI-driven applications. These tools are built to harness NVIDIA's installed base of CUDA GPUs, which spans the cloud, data centers, workstations, and personal computers. The microservices and cloud endpoints are pre-optimized around pretrained AI models, ensuring efficient deployment and operation on CUDA-enabled GPUs.

Enterprise-Grade AI Tools

To streamline the creation of AI applications, NVIDIA now offers a comprehensive catalog of NVIDIA NIM microservices. These services help enterprises process data rapidly and customize large language models (LLMs), supporting applications such as retrieval-augmented generation (RAG) and providing crucial infrastructure such as guardrails. Adoption is already underway: major application platform providers, including Cadence, CrowdStrike, SAP, and ServiceNow, are integrating NVIDIA's offerings into their ecosystems.

Accelerated AI Deployment

The new microservices are built on the NVIDIA CUDA platform and expose a variety of NVIDIA's software development kits, libraries, and tools as microservices. These resources aim to accelerate development and include specialized offerings such as healthcare NIM and CUDA-X microservices. NVIDIA's stack also connects model developers, platform providers, and enterprises, offering a standardized way to run custom AI models.

Widespread Industry Adoption

Frontrunners such as Adobe, Cadence, CrowdStrike, SAP, ServiceNow, and Shutterstock are among the first to adopt NVIDIA's generative AI microservices, featured in NVIDIA AI Enterprise 5.0. NVIDIA has structured these microservices to cut deployment times from weeks to minutes. By employing industry-standard APIs, developers can build AI applications on their own data in a secure, scalable manner.
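As a rough illustration of what "industry-standard APIs" means here: self-hosted NIM containers typically expose an OpenAI-compatible chat-completions endpoint over HTTP. The sketch below builds such a request using only the standard library; the URL, port, and model name are assumptions for illustration, not values from the announcement.

```python
import json
import urllib.request

# Hypothetical local NIM endpoint; host, port, and model name are assumptions.
NIM_URL = "http://localhost:8000/v1/chat/completions"

def build_chat_request(model: str, prompt: str, max_tokens: int = 256) -> dict:
    """Build an OpenAI-style chat-completions payload for a NIM endpoint."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

def query_nim(payload: dict) -> dict:
    """POST the payload to the endpoint and return the parsed JSON response."""
    req = urllib.request.Request(
        NIM_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

payload = build_chat_request("meta/llama3-8b-instruct",
                             "Summarize RAG in one sentence.")
# query_nim(payload)  # requires a running NIM container; not executed here
```

Because the request shape matches the widely used chat-completions convention, existing client code can often be pointed at a NIM endpoint with little more than a URL change.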

Boosting AI Capabilities with CUDA-X

NVIDIA's CUDA-X microservices deliver end-to-end tools for data preparation, customization, and training, accelerating the development of production-grade AI across industries. Offerings such as NVIDIA Riva and NVIDIA cuOpt address specific needs: customizable speech and translation AI, and route optimization, respectively. For retrieval over business data, enterprises can connect NeMo Retriever microservices to their existing data repositories.
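To make the retrieval step concrete, here is a minimal sketch of the RAG pattern that retriever microservices support: embed documents, embed the query, and return the closest documents as grounding context. The toy vectors and the `retrieve` helper are invented for illustration and do not reflect NeMo Retriever's actual API; a real deployment would use a served embedding model and a vector database.

```python
import math

# Toy 3-dimensional "embeddings" standing in for a real embedding model.
DOCS = {
    "returns": [0.9, 0.1, 0.0],
    "shipping": [0.1, 0.8, 0.2],
    "warranty": [0.2, 0.2, 0.9],
}

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def retrieve(query_vec: list[float], k: int = 1) -> list[str]:
    """Return the ids of the k documents most similar to the query embedding."""
    ranked = sorted(DOCS, key=lambda d: cosine(query_vec, DOCS[d]), reverse=True)
    return ranked[:k]

# A query embedding closest to the "warranty" document:
top = retrieve([0.1, 0.3, 0.95])
# The retrieved document's text would then be prepended to the LLM prompt
# as grounding context before generation.
```

The same retrieve-then-generate loop scales from this in-memory toy to enterprise repositories; only the embedding model and the index change.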

Infrastructure Flexibility

Enterprises have the freedom to deploy NVIDIA microservices via NVIDIA AI Enterprise 5.0 on their preferred infrastructure, including major cloud services from AWS, Google Cloud, Azure, and Oracle Cloud Infrastructure, as well as on NVIDIA-Certified Systems from industry leaders.

Commencing Deployment

Developers can test NVIDIA microservices at no cost before opting for production-grade deployment through NVIDIA AI Enterprise 5.0. The announcement continues NVIDIA's record of innovation since its invention of the GPU in 1999, which has shaped PC gaming, modern AI, and digital industrial transformation.
