GPU Virtual Machine NVIDIA

High-performance compute with NVIDIA GPUs


Overview

Unleash peak performance for your most compute-intensive AI and graphics workloads with GPU Virtual Machine NVIDIA from the cloud platform. Delivering up to 30 TFLOPS (FP64), 60 TFLOPS (FP32), 1,671 TFLOPS (FP16), and 3,341 TFLOPS (FP8) with 141 GB of ultra-fast HBM3e memory, these VMs are optimised for training and inferencing LLMs and other deep-learning models, with MIG support and NVLink interconnect for seamless scalability and efficiency.

With enterprise-level reliability, on-demand scalability, and fast provisioning, you can train sophisticated AI models with confidence, run real-time analytics, and deliver rich visualization experiences. Tap into the cost savings, scalability, and computing power to build game-changing, next-generation applications and gain a decisive competitive edge in your industry.

Variants

A range of configurations is available, allowing you to select the options best suited to testing or to hosting production environments. The offerings fall into the categories below; please review them carefully before choosing a configuration for deployment.
Ubuntu (NVIDIA)
  • Best for development & non-production environments
  • Framework: Base Ubuntu 20.04 LTS installation
  • GPU configuration: Available with 1x, 2x, 4x, or 8x GPUs
  • Billing options: On-demand, 1-month, 6-month, 12-month reserved
  • Environment: Testing and development
Ubuntu PyTorch (NVIDIA)
  • Best for development & non-production environments
  • Framework: Pre-configured PyTorch framework
  • GPU configuration: Available with 1x, 2x, 4x, or 8x GPUs
  • Billing options: On-demand, 1-month, 6-month, 12-month reserved
  • Environment: Research and prototyping
Ubuntu TensorFlow (NVIDIA) [Recommended]
  • Best for all production workloads
  • Framework: Enterprise TensorFlow installation
  • GPU configuration: Available with 1x, 4x, or 8x GPUs
  • Billing options: On-demand, 1-month, 6-month, 12-month reserved
  • Environment: Production deployments and enterprise workloads

Core Features at a Glance 

Extreme Model Capacity
Train and run inference on large LLMs and ML models with up to 141 GB of ultra-fast HBM3e memory per GPU and exceptional bandwidth (4.8 TB/s with NVLink), supporting extended context lengths and efficient parallel processing.
Optimized Precision Modes
Native support for FP8, BF16, FP16, INT8, as well as FP32 and FP64 Tensor Core precision — enabling efficient, high-throughput AI compute with reduced memory usage and maximum flexibility.
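
As an illustration, a reduced-precision training step in PyTorch might look like the sketch below (the model shape, batch, and learning rate are placeholders, not a recommended recipe):

  # Minimal mixed-precision training step with PyTorch autocast.
  # Everything below (model shape, batch, learning rate) is illustrative.
  import torch
  import torch.nn as nn

  device = "cuda"
  model = nn.Sequential(nn.Linear(1024, 4096), nn.ReLU(), nn.Linear(4096, 10)).to(device)
  optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
  scaler = torch.cuda.amp.GradScaler()            # loss scaling, needed for FP16

  inputs = torch.randn(64, 1024, device=device)   # dummy batch
  targets = torch.randint(0, 10, (64,), device=device)

  optimizer.zero_grad()
  with torch.autocast(device_type="cuda", dtype=torch.float16):  # or torch.bfloat16
      loss = nn.functional.cross_entropy(model(inputs), targets)
  scaler.scale(loss).backward()
  scaler.step(optimizer)
  scaler.update()

Running under autocast keeps numerically sensitive operations in FP32 while dispatching matrix-heavy work to the Tensor Cores in reduced precision.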
Instant & Flexible Provisioning
Spin up GPU VMs on demand with configurable specs, real-time deployment, and workload-optimised scaling.
Framework-Ready Environment
Full support for NVIDIA CUDA, along with leading AI/ML frameworks like PyTorch and TensorFlow — ready for immediate, accelerated development out of the box.
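
To confirm the environment sees the GPUs, a quick check from Python (a minimal sketch; exact driver and framework versions depend on the image you deploy):

  # Sanity-check CUDA visibility and enumerate attached GPUs.
  import torch

  print("CUDA available:", torch.cuda.is_available())
  for i in range(torch.cuda.device_count()):
      props = torch.cuda.get_device_properties(i)
      print(f"GPU {i}: {props.name}, {props.total_memory / 1024**3:.0f} GiB")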
Flexible Pricing & Real-Time Insights
Choose from On-Demand, Reserved, or Rental options, and monitor GPU usage in real-time for performance tuning and cost efficiency.
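
For example, utilization can be polled programmatically with the NVIDIA Management Library bindings (an illustrative sketch; it assumes the nvidia-ml-py package is installed, and the 5-second interval is an arbitrary choice):

  # Poll per-GPU utilization and memory use in a simple loop.
  import time
  import pynvml

  pynvml.nvmlInit()
  handles = [pynvml.nvmlDeviceGetHandleByIndex(i)
             for i in range(pynvml.nvmlDeviceGetCount())]
  try:
      while True:
          for i, h in enumerate(handles):
              util = pynvml.nvmlDeviceGetUtilizationRates(h)
              mem = pynvml.nvmlDeviceGetMemoryInfo(h)
              print(f"GPU {i}: {util.gpu}% util, "
                    f"{mem.used / 1024**3:.1f}/{mem.total / 1024**3:.1f} GiB")
          time.sleep(5)
  finally:
      pynvml.nvmlShutdown()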
High-Speed Memory Bandwidth
Achieve smoother training and inference with ultra-fast memory bandwidth of up to 4.8 TB/s per GPU — ideal for compute-heavy AI and LLM workloads.
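
Effective bandwidth can be estimated empirically; the sketch below times large device-to-device copies in PyTorch (results will vary with GPU model, tensor size, and driver, so treat this as a rough measure, not a vendor benchmark):

  # Estimate effective device-memory bandwidth from timed tensor copies.
  import torch

  n = 1 << 28                                    # ~268M float32 values, about 1 GiB
  src = torch.randn(n, device="cuda")
  dst = torch.empty_like(src)

  start = torch.cuda.Event(enable_timing=True)
  end = torch.cuda.Event(enable_timing=True)
  torch.cuda.synchronize()
  start.record()
  for _ in range(10):
      dst.copy_(src)                             # each copy reads and writes n floats
  end.record()
  torch.cuda.synchronize()

  seconds = start.elapsed_time(end) / 1000       # elapsed_time returns milliseconds
  bytes_moved = 2 * src.numel() * src.element_size() * 10
  print(f"Effective bandwidth: {bytes_moved / seconds / 1e12:.2f} TB/s")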
Multi-GPU Scalability
Train models faster by scaling seamlessly across 2, 4, or 8 GPUs per VM for parallel workloads.
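
A minimal DistributedDataParallel sketch for a multi-GPU VM is shown below (the model and batch are placeholders; launch with torchrun, adjusting --nproc_per_node to the 2, 4, or 8 GPUs in your configuration):

  # Minimal multi-GPU training step with PyTorch DistributedDataParallel.
  # Launch: torchrun --nproc_per_node=8 train.py
  import os
  import torch
  import torch.distributed as dist
  import torch.nn as nn
  from torch.nn.parallel import DistributedDataParallel as DDP

  def main():
      dist.init_process_group("nccl")            # NCCL backend rides on NVLink between GPUs
      rank = int(os.environ["LOCAL_RANK"])       # set by torchrun per process
      torch.cuda.set_device(rank)

      model = DDP(nn.Linear(1024, 1024).cuda(rank), device_ids=[rank])
      optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

      x = torch.randn(32, 1024, device=rank)     # in practice, each rank loads its own data shard
      loss = model(x).square().mean()
      loss.backward()                            # gradients are all-reduced across GPUs
      optimizer.step()

      dist.destroy_process_group()

  if __name__ == "__main__":
      main()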
Cross-Platform OS Support
Deploy on Ubuntu or RHEL for wide compatibility across AI, data analytics, and visual rendering workflows.

Still have questions?

What is the difference between the H200 NVL and the H200 SXM?
The H200 NVL is designed for large-scale inference and training workloads and supports NVLink for multi-GPU scaling; it typically ships in dual-GPU configurations. The H200 SXM, on the other hand, is optimized for maximum throughput in high-density servers with SXM5 sockets, offering superior memory bandwidth and power efficiency for training and inference at scale.

Which workloads is the H200 NVL best suited for?
LLM inference, deep learning training and inferencing, and recommendation systems.

How much memory and bandwidth does the H200 NVL offer?
141 GB of HBM3e per GPU, with roughly 4.8 TB/s of bandwidth in the dual-GPU configuration.

Does the H200 NVL support MIG?
Yes, it supports MIG (Multi-Instance GPU) for secure GPU partitioning.

Can I choose between on-demand and reserved capacity?
Yes, our platform offers both on-demand access for flexible scaling and reserved capacity options for guaranteed availability, ideal for enterprise SLAs and scheduled training runs.

How many GPUs can a single node have?
We offer GPU nodes in configurations of 1, 2, 4, or 8 GPUs per node, depending on the instance type and GPU model. This allows fine-tuned scaling based on workload intensity and budget.

Which ML/DL frameworks are supported?
All GPU models are compatible with major ML/DL frameworks such as TensorFlow, PyTorch, JAX, Hugging Face Transformers, and ONNX. NVIDIA GPUs use CUDA drivers.

Do you provide pre-configured VM images?
Yes, we provide ready-to-deploy VM images with pre-installed drivers and libraries optimised for each GPU model.

Ready to Build Smarter Experiences?

Please provide the necessary information to receive additional assistance.