AI Summary
At GTC 2026, AWS and NVIDIA announced an expanded partnership that includes the deployment of more than one million NVIDIA GPUs (Blackwell and Rubin architectures) across AWS cloud Regions starting in 2026. AWS becomes the first major cloud provider to support NVIDIA RTX PRO 4500 Blackwell Server Edition GPUs on Amazon EC2, covering varied use cases such as conversational AI, analytics, and video rendering. The collaboration also includes accelerated LLM inference via NVIDIA NIXL on AWS EFA, 3x faster Apache Spark performance with Amazon EMR, and expanded support for NVIDIA Nemotron models on Amazon Bedrock.
France/EU Impact
European businesses and developers using AWS will be able to access the new Blackwell GPU instances for their production AI deployments.
AI is moving fast, and for most of our customers, the real opportunity isn't in experimenting with it; it's in running AI in production, where it drives meaningful business outcomes. This means building systems that run reliably, perform at scale, and meet your organization's security and compliance requirements. Today at NVIDIA GTC 2026, AWS and NVIDIA announced an expanded collaboration with new technology integrations to support growing AI compute demand and help you build and run AI solutions that are production-ready. These integrations span accelerated computing, interconnect technologies, and model fine-tuning and inference. They include:

- The deployment of more than 1 million NVIDIA GPUs across AWS Regions starting in 2026
- Amazon Elastic Compute Cloud (Amazon EC2) support for NVIDIA RTX PRO 4500 Blackwell Server Edition GPUs, a first among major cloud providers
- Interconnect acceleration for disaggregated LLM inference with NVIDIA NIXL on AWS Elastic Fabric Adapter (EFA)
- 3x faster Apache Spark performance using Amazon EMR on Amazon Elastic Kubernetes Service (Amazon EKS) with Amazon EC2 G7e instances, powered by NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs
- Expanded NVIDIA Nemotron model support on Amazon Bedrock

Major announcements at NVIDIA GTC 2026

Scaling AI infrastructure with expanded GPU options and optimized interconnect

Accelerating compute capacity in the agentic AI era

Starting in 2026, AWS will add more than 1 million NVIDIA GPUs, including the Blackwell and Rubin GPU architectures, across our global cloud regions. AWS offers the broadest collection of NVIDIA GPU-based instances of any cloud provider to power a diverse set of AI/ML workloads. AWS and NVIDIA are also collaborating on Spectrum networking and other infrastructure areas, adding to over 15 years of joint innovation between our two companies. AWS's advanced cloud and AI infrastructure provides enterprises, startups, and researchers with the infrastructure needed to build and scale agentic AI systems that can reason, plan, and act autonomously across complex workflows.

New Amazon EC2 instances with NVIDIA RTX PRO 4500 Blackwell Server Edition GPUs

Today, we announced that Amazon EC2 instances accelerated by NVIDIA RTX PRO 4500 Blackwell Server Edition GPUs are coming soon. AWS is the first major cloud provider to announce support for RTX PRO 4500 Blackwell Server Edition GPUs. These instances are well suited for a wide range of workloads, including data analytics, conversational AI, content generation, recommender systems, video streaming, video rendering, and other graphics workloads.

Amazon EC2 instances accelerated by NVIDIA RTX PRO 4500 Blackwell Server Edition GPUs will be built on the AWS Nitro System, a combination of dedicated hardware and a lightweight hypervisor that delivers practically all of the compute and memory resources of the host hardware to your instances for better overall resource utilization and performance. The Nitro System's specialized hardware, software, and firmware are designed to enforce restrictions so that nobody, including anyone at AWS, can access your sensitive AI workloads and data. In addition, the Nitro System supports firmware updates, bug fixes, and optimizations while the system remains operational. These capabilities give the Nitro System the enhanced resource efficiency, security, and stability that AI, analytics, and graphics workloads require in production.
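The RTX PRO 4500 instances are not yet generally available and their instance type name has not been published, but launching them will follow the same Amazon EC2 workflow as today's GPU instances. The boto3 sketch below uses a hypothetical instance type placeholder and an illustrative AMI ID; substitute real values for your Region once the instances launch.

```python
import boto3

# Standard EC2 launch flow. The instance type below is a HYPOTHETICAL placeholder
# for the upcoming RTX PRO 4500 Blackwell instances (the real name is not yet announced).
ec2 = boto3.client("ec2", region_name="us-east-1")

response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",          # illustrative; use a GPU-ready AMI for your Region
    InstanceType="rtxpro4500-placeholder",    # hypothetical; replace with the published instance type
    MinCount=1,
    MaxCount=1,
    KeyName="my-key-pair",                    # assumes an existing EC2 key pair
    TagSpecifications=[{
        "ResourceType": "instance",
        "Tags": [{"Key": "workload", "Value": "gpu-inference"}],
    }],
)

instance_id = response["Instances"][0]["InstanceId"]
print(f"Launched {instance_id}; waiting for it to enter the running state...")

# Block until the instance is running; the Nitro System is transparent to this workflow.
ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])
```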
Accelerating interconnect for disaggregated LLM inference with NVIDIA NIXL on AWS EFA and Trainium

As model sizes grow, communication overhead between GPUs or Trainium devices can become a bottleneck. Today, we announced support for the NVIDIA Inference Xfer Library (NIXL) with AWS EFA to accelerate disaggregated large language model (LLM) inference on Amazon EC2, across NVIDIA GPUs and AWS Trainium. Accelerating disaggregated inference is critical for scaling modern AI workloads because it enables efficient overlap of communication and computation while minimizing communication latency and maximizing GPU utilization. This integration enables high-throughput, low-latency KV-cache data movement between GPU compute nodes performing token generation and the distributed memory resources that store KV-cache state. It also provides the flexibility to build inference clusters using any combination of GPU and Trainium EFA-enabled EC2 instances. NIXL with EFA integrates natively with popular open-source frameworks such as NVIDIA Dynamo, vLLM, and SGLang, delivering improved inter-token latency and more efficient KV-cache memory utilization (a hedged configuration sketch follows at the end of this section).

Accelerating data analytics with Amazon EMR and NVIDIA GPUs

Running Apache Spark 3x faster using Amazon EMR on Amazon EKS with G7e instances

Data engineers and data scientists frequently face hours-long data processing pipelines that slow AI/ML model iteration and business intelligence generation. We're seeing significant performance gains for these workloads: AWS and NVIDIA deliver 3x faster performance
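To make the disaggregated-inference integration described above more concrete, here is a minimal sketch of routing vLLM's KV-cache transfers through its NIXL-based connector. It assumes a recent vLLM build with NIXL support on EFA-enabled instances; the model name is illustrative, and the exact configuration class and field names can differ between vLLM versions, so treat this as a starting point rather than a verified recipe.

```python
from vllm import LLM, SamplingParams
from vllm.config import KVTransferConfig

# Sketch only: ask vLLM to move KV-cache blocks through its NIXL connector so that
# prefill and decode workers can exchange state over the interconnect (EFA on EC2,
# in the setup described in this post). Field names may vary by vLLM version.
kv_config = KVTransferConfig(
    kv_connector="NixlConnector",  # NIXL-backed KV-cache transfer
    kv_role="kv_both",             # this process both produces and consumes KV-cache
)

llm = LLM(
    model="meta-llama/Llama-3.1-8B-Instruct",  # illustrative model choice
    kv_transfer_config=kv_config,
    gpu_memory_utilization=0.8,
)

outputs = llm.generate(
    ["Summarize the benefits of disaggregated LLM inference."],
    SamplingParams(max_tokens=128),
)
print(outputs[0].outputs[0].text)
```

In a production disaggregated deployment you would typically run separate prefill and decode servers (for example, coordinated by NVIDIA Dynamo or an external router) with distinct producer and consumer roles; the single-process configuration above is only meant to show where the NIXL connector plugs in.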