GPU Virtual Machine NVIDIA

High-performance compute with NVIDIA GPUs


Overview

Unleash peak performance for your most compute-intensive AI and graphics workloads with GPU Virtual Machine NVIDIA from the cloud platform. Delivering up to 30 TFLOPS (FP64), 60 TFLOPS (FP32), 1,671 TFLOPS (FP16), and 3,341 TFLOPS (FP8) with 141 GB of ultra-fast HBM3e memory, these VMs are optimised for training and inferencing LLMs and other deep-learning models, with MIG support and NVLink interconnect for seamless scalability and efficiency.

With enterprise-level reliability, on-demand scalability, and fast provisioning, you can train sophisticated AI models with confidence, run real-time analytics, and deliver rich visualization experiences. Tap into the cost savings, scalability, and computing power to build game-changing, next-generation applications and gain a decisive competitive edge in your industry.

Variants

A range of configurations is available, allowing you to select the options best suited to testing or to hosting production environments. The offerings fall into the categories below; please review them carefully before choosing a configuration for deployment.
Ubuntu (NVIDIA)
  • Best for development & non-production environments
  • Framework: Base Ubuntu 20.04 LTS installation
  • GPU configuration: Available with 1x, 2x, 4x, or 8x GPUs
  • Billing options: On-demand, 1-month, 6-month, 12-month reserved
  • Environment: Testing and development
Ubuntu PyTorch (NVIDIA)
  • Best for development & non-production environments
  • Framework: Pre-configured PyTorch framework
  • GPU configuration: Available with 1x, 2x, 4x, or 8x GPUs
  • Billing options: On-demand, 1-month, 6-month, 12-month reserved
  • Environment: Research and prototyping
Ubuntu TensorFlow (NVIDIA) [Recommended]
  • Best for all production workloads
  • Framework: Enterprise TensorFlow installation
  • GPU configuration: Available with 1x, 4x, or 8x GPUs
  • Billing options: On-demand, 1-month, 6-month, 12-month reserved
  • Environment: Production deployments and enterprise workloads

Core Features at a Glance 

Extreme Model Capacity
Train and run inference on large LLMs and ML models with up to 141 GB of ultra-fast HBM3e memory per GPU and exceptional bandwidth (4.8 TB/s with NVLink), supporting extended context lengths and efficient parallel processing.
Optimized Precision Modes
Native support for FP8, BF16, FP16, INT8, as well as FP32 and FP64 Tensor Core precision — enabling efficient, high-throughput AI compute with reduced memory usage and maximum flexibility.
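
As an illustration, a reduced-precision training step in PyTorch might look like the sketch below (the model shape, batch, and learning rate are placeholders, not a recommended recipe):

  # Minimal mixed-precision training step with PyTorch autocast.
  # Everything below (model shape, batch, learning rate) is illustrative.
  import torch
  import torch.nn as nn

  device = "cuda"
  model = nn.Sequential(nn.Linear(1024, 4096), nn.ReLU(), nn.Linear(4096, 10)).to(device)
  optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
  scaler = torch.cuda.amp.GradScaler()            # loss scaling, needed for FP16

  inputs = torch.randn(64, 1024, device=device)   # dummy batch
  targets = torch.randint(0, 10, (64,), device=device)

  optimizer.zero_grad()
  with torch.autocast(device_type="cuda", dtype=torch.float16):  # or torch.bfloat16
      loss = nn.functional.cross_entropy(model(inputs), targets)
  scaler.scale(loss).backward()
  scaler.step(optimizer)
  scaler.update()

Running under autocast keeps numerically sensitive operations in FP32 while dispatching matrix-heavy work to the Tensor Cores in reduced precision.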
Instant & Flexible Provisioning
Spin up GPU VMs on demand with configurable specs, real-time deployment, and workload-optimised scaling.
Framework-Ready Environment
Full support for NVIDIA CUDA, along with leading AI/ML frameworks like PyTorch and TensorFlow — ready for immediate, accelerated development out of the box.
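
To confirm the environment sees the GPUs, a quick check from Python (a minimal sketch; exact driver and framework versions depend on the image you deploy):

  # Sanity-check CUDA visibility and enumerate attached GPUs.
  import torch

  print("CUDA available:", torch.cuda.is_available())
  for i in range(torch.cuda.device_count()):
      props = torch.cuda.get_device_properties(i)
      print(f"GPU {i}: {props.name}, {props.total_memory / 1024**3:.0f} GiB")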
Flexible Pricing & Real-Time Insights
Choose from On-Demand, Reserved, or Rental options, and monitor GPU usage in real-time for performance tuning and cost efficiency.
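
For example, utilization can be polled programmatically with the NVIDIA Management Library bindings (an illustrative sketch; it assumes the nvidia-ml-py package is installed, and the 5-second interval is an arbitrary choice):

  # Poll per-GPU utilization and memory use in a simple loop.
  import time
  import pynvml

  pynvml.nvmlInit()
  handles = [pynvml.nvmlDeviceGetHandleByIndex(i)
             for i in range(pynvml.nvmlDeviceGetCount())]
  try:
      while True:
          for i, h in enumerate(handles):
              util = pynvml.nvmlDeviceGetUtilizationRates(h)
              mem = pynvml.nvmlDeviceGetMemoryInfo(h)
              print(f"GPU {i}: {util.gpu}% util, "
                    f"{mem.used / 1024**3:.1f}/{mem.total / 1024**3:.1f} GiB")
          time.sleep(5)
  finally:
      pynvml.nvmlShutdown()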
High-Speed Memory Bandwidth
Achieve smoother training and inference with ultra-fast memory bandwidth of up to 4.8 TB/s per GPU — ideal for compute-heavy AI and LLM workloads.
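
Effective bandwidth can be estimated empirically; the sketch below times large device-to-device copies in PyTorch (results will vary with GPU model, tensor size, and driver, so treat this as a rough measure, not a vendor benchmark):

  # Estimate effective device-memory bandwidth from timed tensor copies.
  import torch

  n = 1 << 28                                    # ~268M float32 values, about 1 GiB
  src = torch.randn(n, device="cuda")
  dst = torch.empty_like(src)

  start = torch.cuda.Event(enable_timing=True)
  end = torch.cuda.Event(enable_timing=True)
  torch.cuda.synchronize()
  start.record()
  for _ in range(10):
      dst.copy_(src)                             # each copy reads and writes n floats
  end.record()
  torch.cuda.synchronize()

  seconds = start.elapsed_time(end) / 1000       # elapsed_time returns milliseconds
  bytes_moved = 2 * src.numel() * src.element_size() * 10
  print(f"Effective bandwidth: {bytes_moved / seconds / 1e12:.2f} TB/s")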
Multi-GPU Scalability
Train models faster by scaling seamlessly across 2, 4, or 8 GPUs per VM for parallel workloads.
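
A minimal DistributedDataParallel sketch for a multi-GPU VM is shown below (the model and batch are placeholders; launch with torchrun, adjusting --nproc_per_node to the 2, 4, or 8 GPUs in your configuration):

  # Minimal multi-GPU training step with PyTorch DistributedDataParallel.
  # Launch: torchrun --nproc_per_node=8 train.py
  import os
  import torch
  import torch.distributed as dist
  import torch.nn as nn
  from torch.nn.parallel import DistributedDataParallel as DDP

  def main():
      dist.init_process_group("nccl")            # NCCL backend rides on NVLink between GPUs
      rank = int(os.environ["LOCAL_RANK"])       # set by torchrun per process
      torch.cuda.set_device(rank)

      model = DDP(nn.Linear(1024, 1024).cuda(rank), device_ids=[rank])
      optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

      x = torch.randn(32, 1024, device=rank)     # in practice, each rank loads its own data shard
      loss = model(x).square().mean()
      loss.backward()                            # gradients are all-reduced across GPUs
      optimizer.step()

      dist.destroy_process_group()

  if __name__ == "__main__":
      main()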
Cross-Platform OS Support
Deploy on Ubuntu or RHEL for wide compatibility across AI, data analytics, and visual rendering workflows.

Still have questions?

What is the difference between the H200 NVL and the H200 SXM?
The H200 NVL is designed for large-scale inference and training workloads and supports NVLink for multi-GPU scaling; it typically ships in dual-GPU configurations. The H200 SXM, on the other hand, is optimized for maximum throughput in high-density servers with SXM5 sockets, offering superior memory bandwidth and power efficiency for training and inference at scale.

Which workloads is the H200 NVL best suited for?
LLM inference, deep learning training and inferencing, and recommendation systems.

How much memory and bandwidth does the H200 NVL offer?
141 GB of HBM3e per GPU, with roughly 4.8 TB/s of bandwidth in the dual-GPU configuration.

Does the H200 NVL support MIG?
Yes, it supports MIG (Multi-Instance GPU) for secure GPU partitioning.

Can I choose between on-demand and reserved capacity?
Yes, our platform offers both on-demand access for flexible scaling and reserved capacity options for guaranteed availability, ideal for enterprise SLAs and scheduled training runs.

How many GPUs can a single node have?
We offer GPU nodes in configurations of 1, 2, 4, or 8 GPUs per node, depending on the instance type and GPU model. This allows fine-tuned scaling based on workload intensity and budget.

Which ML/DL frameworks are supported?
All GPU models are compatible with major ML/DL frameworks such as TensorFlow, PyTorch, JAX, Hugging Face Transformers, and ONNX. NVIDIA GPUs use CUDA drivers.

Do you provide pre-configured VM images?
Yes, we provide ready-to-deploy VM images with pre-installed drivers and libraries optimised for each GPU model.

Ready to Build Smarter Experiences?

Please provide the necessary information to receive additional assistance.