NVIDIA and AWS expand production AI infrastructure

25 June 2026, 01:30

NVIDIA and AWS are expanding their joint AI infrastructure work across Amazon EC2 and Amazon OpenSearch, targeting production workloads that require low-latency inference, fast vector search, stronger GPU price-performance and simpler scaling. New Amazon EC2 G7 instances use NVIDIA RTX PRO 4500 Blackwell Server Edition GPUs for AI inference, graphics, spatial computing, GPU-accelerated analytics, video workflows and related enterprise workloads.

Compared with G6 instances, G7 delivers up to 4.6x the AI inference performance and up to 2.1x the graphics performance. G7 supports up to eight GPUs, 256 GB of total GPU memory, 700 Gbps of EFA-enabled networking and up to 7.6 TB of local NVMe SSD storage. The instances are available through AWS Deep Learning AMIs, Deep Learning Containers, Amazon EMR, Amazon EKS, Amazon ECS and graphics AMIs, with Amazon SageMaker AI support coming soon.

Amazon OpenSearch Serverless now uses GPU-accelerated vector indexing powered by NVIDIA cuVS as the default compute choice for all vector collections. The setup is designed for retrieval-augmented generation, semantic search, recommendations and agentic AI. Vector indexing is up to 10x faster at a quarter of the cost of CPU-only builds, allowing billion-scale vector databases to be built in under an hour.

AWS also achieved NVIDIA Exemplar Cloud status on NVIDIA GB300 for training workloads, signaling that it meets NVIDIA performance thresholds for large-scale training infrastructure.

Originally reported by blogs.nvidia.com
