AWS Trainium2 Instances Now Generally Available
AWS announced the general availability of Trainium2-powered Amazon EC2 instances and new Trn2 UltraServers. The new Trn2 instances offer 30-40% better price performance than current GPU-based EC2 instances, featuring 16 Trainium2 chips providing 20.8 peak petaflops of compute. Trn2 UltraServers connect four Trn2 servers with NeuronLink interconnect, scaling up to 83.2 peak petaflops. AWS also unveiled Trainium3, their next-generation AI chip expected to be 4x more performant than Trn2 UltraServers, with availability planned for late 2025.
AWS ha annunciato la disponibilità generale delle istanze Amazon EC2 alimentate da Trainium2 e dei nuovi Trn2 UltraServers. Le nuove istanze Trn2 offrono un 30-40% di migliore rapporto prezzo-prestazioni rispetto alle attuali istanze EC2 basate su GPU, con 16 chip Trainium2 che forniscono 20.8 petaflops di picco in capacità computazionale. I Trn2 UltraServers collegano quattro server Trn2 tramite interconnessione NeuronLink, scalando fino a 83.2 petaflops di picco. AWS ha anche svelato Trainium3, il loro chip AI di nuova generazione previsto per essere 4 volte più performante dei Trn2 UltraServers, con disponibilità prevista per la fine del 2025.
AWS anunció la disponibilidad general de instancias Amazon EC2 impulsadas por Trainium2 y nuevos Trn2 UltraServers. Las nuevas instancias Trn2 ofrecen un 30-40% mejor rendimiento de precio que las actuales instancias EC2 basadas en GPU, con 16 chips Trainium2 que proporcionan 20.8 petaflops de cómputo en pico. Los Trn2 UltraServers conectan cuatro servidores Trn2 con interconexión NeuronLink, escalando hasta 83.2 petaflops en pico. AWS también presentó Trainium3, su chip de IA de próxima generación que se espera sea 4 veces más potente que los Trn2 UltraServers, con disponibilidad planeada para finales de 2025.
AWS는 Trainium2 기반의 아마존 EC2 인스턴스들과 새로운 Trn2 울트라서버의 일반 공급을 발표했습니다. 새로운 Trn2 인스턴스는 현재 GPU 기반 EC2 인스턴스보다 30-40% 우수한 가격 성능비를 제공합니다. 이 인스턴스는 16개의 Trainium2 칩을 사용하여 20.8 피크 페타플롭스의 컴퓨팅 성능을 제공합니다. Trn2 울트라서버는 네오론 링크 인터커넥트를 통해 네 개의 Trn2 서버를 연결하여 최대 83.2 피크 페타플롭스까지 확장할 수 있습니다. AWS는 또한 Trainium3를 공개했으며, 이는 Trn2 울트라서버보다 4배 더 높은 성능을 가진 차세대 AI 칩으로, 2025년 말에 출시될 예정입니다.
AWS a annoncé la disponibilité générale des instances Amazon EC2 alimentées par Trainium2 et des nouveaux Trn2 UltraServers. Les nouvelles instances Trn2 offrent un rapport qualité-prix de 30 à 40 % meilleur que les instances EC2 basées sur GPU actuelles, avec 16 puces Trainium2 fournissant 20,8 pic petaflops de calcul. Les Trn2 UltraServers connectent quatre serveurs Trn2 via une interconnexion NeuronLink, pouvant évoluer jusqu'à 83,2 pic petaflops. AWS a également dévoilé le Trainium3, leur puce d'IA de nouvelle génération qui devrait être 4 fois plus performante que les Trn2 UltraServers, avec une disponibilité prévue pour fin 2025.
AWS hat die allgemeine Verfügbarkeit von Amazon EC2-Instanzen mit Trainium2 und neuen Trn2 UltraServers bekannt gegeben. Die neuen Trn2-Instanzen bieten ein 30-40% besseres Preis-Leistungs-Verhältnis als die aktuellen GPU-basierten EC2-Instanzen und verfügen über 16 Trainium2-Chips, die 20,8 Peak-Petaflops an Rechenleistung bereitstellen. Die Trn2 UltraServers verbinden vier Trn2-Server über die NeuronLink-Interkonnektivität und skalieren bis zu 83,2 Peak-Petaflops. AWS stellte auch Trainium3 vor, ihren nächsten KI-Chip, der voraussichtlich viermal leistungsfähiger sein wird als die Trn2 UltraServers, mit einer Verfügbarkeit, die für Ende 2025 geplant ist.
- 30-40% better price performance compared to current GPU-based EC2 instances
- Significant computing power with 20.8 peak petaflops per instance
- Partnership with major companies like Anthropic, Databricks, and Hugging Face
- Databricks customers expected to achieve up to 30% lower TCO
- Poolside expects 40% cost savings compared to EC2 P5 instances
- Trainium3 chips won't be available until late 2025
- initial availability (only in US East Ohio region)
Insights
AWS's launch of Trainium2 AI chips marks a significant competitive advancement in the cloud AI infrastructure market. The
Key technical differentiators include the NeuronLink interconnect enabling UltraServer configurations and native JAX framework support. The early announcement of Trainium3 with expected
This launch strengthens Amazon's competitive position against cloud rivals Microsoft and Google in the lucrative AI infrastructure market. The superior price-performance ratio could accelerate enterprise AI adoption while improving AWS's margins. Partnerships with key AI companies like Anthropic and Databricks, who expect
The timing of this release amid the AI boom and ongoing chip shortage positions AWS to capture market share from GPU-dependent competitors. The projected 2025 release of Trainium3 provides a clear technology roadmap that could influence enterprise AI infrastructure planning and investment decisions.
New Amazon EC2 Trn2 instances, featuring AWS’s newest Trainium2 AI chip, offer 30
New Trn2 UltraServers use ultra-fast NeuronLink interconnect to connect four Trn2 servers together into one giant server, enabling the fastest training and inference on AWS for the world’s largest models
Amazon EC2 Trn2 UltraServers (Photo: Business Wire)
-
Trn2 instances offer 30
-40% better price performance than the current generation of GPU-based EC2 P5e and P5en instances and feature 16 Trainium2 chips to provide 20.8 peak petaflops of compute—ideal for training and deploying LLMs with billions of parameters. - Amazon EC2 Trn2 UltraServers are a completely new EC2 offering featuring 64 interconnected Trainium2 chips, using ultra-fast NeuronLink interconnect, to scale up to 83.2 peak petaflops of compute—quadrupling the compute, memory, and networking of a single instance—which makes it possible to train and deploy the world’s largest models.
- Together with Anthropic, AWS is building an EC2 UltraCluster of Trn2 UltraServers—named Project Rainier—containing hundreds of thousands of Trainium2 chips and more than 5x the number of exaflops used to train their current generation of leading AI models.
- AWS unveiled Trainium3, its next generation AI chip, which will allow customers to build bigger models faster and deliver superior real-time performance when deploying them.
“Trainium2 is purpose built to support the largest, most cutting-edge generative AI workloads, for both training and inference, and to deliver the best price performance on AWS,” said David Brown, vice president of Compute and Networking at AWS. “With models approaching trillions of parameters, we understand customers also need a novel approach to train and run these massive workloads. New Trn2 UltraServers offer the fastest training and inference performance on AWS and help organizations of all sizes to train and deploy the world’s largest models faster and at a lower cost.”
As models grow in size, they are pushing the limits of compute and networking infrastructure as customers seek to reduce training times and inference latency—the time between when an AI system receives an input and generates the corresponding output. AWS already offers the broadest and deepest selection of accelerated EC2 instances for AI/ML, including those powered by GPUs and ML chips. But even with the fastest accelerated instances available today, customers want more performance and scale to train these increasingly sophisticated models faster, at a lower cost. As model complexity and data volumes grow, simply increasing cluster size fails to yield faster training time due to parallelization constraints. Simultaneously, the demands of real-time inference push single-instance architectures beyond their capabilities.
Trn2 is the highest performing Amazon EC2 instance for deep learning and generative AI
Trn2 offers 30
Trn2 UltraServers meet increasingly demanding AI compute needs of the world’s largest models
For the largest models that require even more compute, Trn2 UltraServers allow customers to scale training beyond the limits of a single Trn2 instance, reducing training time, accelerating time to market, and enabling rapid iteration to improve model accuracy. Trn2 UltraServers are a completely new EC2 offering that use ultra-fast NeuronLink interconnect to connect four Trn2 servers together into one giant server. With new Trn2 UltraServers, customers can scale up their generative AI workloads across 64 Trainium2 chips. For inference workloads, customers can use Trn2 UltraServers to improve real-time inference performance for trillion-parameter models in production. Together with Anthropic, AWS is building an EC2 UltraCluster of Trn2 UltraServers, named Project Rainier, which will scale out distributed model training across hundreds of thousands of Trainium2 chips interconnected with third-generation, low-latency petabit scale EFA networking—more than 5x the number of exaflops that Anthropic used to train their current generation of leading AI models. When completed, it is expected to be the world’s largest AI compute cluster reported to date available for Anthropic to build and deploy their future models on.
Anthropic is an AI safety and research company that creates reliable, interpretable, and steerable AI systems. Anthropic’s flagship product is Claude, an LLM trusted by millions of users worldwide. As part of Anthropic’s expanded collaboration with AWS, they’ve begun optimizing Claude models to run on Trainium2, Amazon’s most advanced AI hardware to date. Anthropic will be using hundreds of thousands of Trainium2 chips—over five times the size of their previous cluster—to deliver exceptional performance for customers using Claude in Amazon Bedrock.
Databricks’ Mosaic AI enables organizations to build and deploy quality agent systems. It is built natively on top of the data lakehouse, enabling customers to easily and securely customize their models with enterprise data and deliver more accurate and domain-specific outputs. Thanks to Trainium's high performance and cost-effectiveness, customers can scale model training on Mosaic AI at a low cost. Trainium2’s availability will be a major benefit to Databricks and its customers as demand for Mosaic AI continues to scale across all customer segments and around the world. Databricks, one of the largest data and AI companies in the world, plans to use Trn2 to deliver better results and lower TCO by up to
Hugging Face is the leading open platform for AI builders, with more than 2 million models, datasets, and AI applications shared by a community of more than 5 million researchers, data scientists, machine learning engineers, and software developers. Hugging Face has collaborated with AWS over the last couple of years, making it easier for developers to experience the performance and cost benefits of AWS Inferentia and Trainium through the Optimum Neuron open-source library, integrated in Hugging Face Inference Endpoints and now optimized within the new HUGS self-deployment service, available on the AWS Marketplace. With the launch of Trainium2, Hugging Face users will have access to even higher performance to develop and deploy models faster.
poolside is set to build a world where AI will drive the majority of economically valuable work and scientific progress. poolside believes that software development will be the first major capability in neural networks that reaches human-level intelligence. To enable that, they're building FMs, an API, and an assistant to bring the power of generative AI to developers' hands. A key to enable this technology is the infrastructure they’re using to build and run their products. With AWS Trainium2, poolside’s customers will be able to scale their usage of poolside at a price performance ratio unlike other AI accelerators. In addition, poolside plans to train future models with Trainium2 UltraServers, with expected savings of
Trainium3 chips—designed for the high-performance needs of the next frontier of generative AI workloads
AWS unveiled Trainium3, its next-generation AI training chip. Trainium3 will be the first AWS chip made with a 3-nanometer process node, setting a new standard for performance, power efficiency, and density. Trainium3-powered UltraServers are expected to be 4x more performant than Trn2 UltraServers, allowing customers to iterate even faster when building models and deliver superior real-time performance when deploying them. The first Trainium3-based instances are expected to be available in late 2025.
Enabling customers to unlock the performance of Trainium2 with AWS Neuron software
The Neuron SDK includes compiler, runtime libraries, and tools to help developers optimize their models to run on Trainium. It provides developers with the ability to optimize models for optimal performance on Trainium chips. Neuron is natively integrated with popular frameworks like JAX and PyTorch so customers can continue using their existing code and workflows on Trainium with fewer code changes. Neuron also supports over 100,000 models on the Hugging Face model hub. With the Neuron Kernel Interface (NKI), developers get access to bare metal Trainium chips, enabling them to write compute kernels that maximize performance for demanding workloads.
Neuron software is designed to make it easy to use popular frameworks like JAX to train and deploy models on Trainium2 while minimizing code changes and tie-in to vendor-specific solutions. Google is supporting AWS's efforts to enable customers to use JAX for large-scale training and inference through its native OpenXLA integration, providing users an easy and portable coding path to get started with Trn2 instances quickly. With industry wide open-source collaboration and the availability of Trainium2, Google expects to see increased adoption of JAX across the ML community—a significant milestone for the entire ML ecosystem.
Trn2 instances are generally available today in the US East (
To learn more, visit:
- The AWS News Blog for details on today’s announcements.
- The AWS Trainium page to learn more about the capabilities.
- The AWS Trainium customer page to learn how companies are using Trainium.
- The AWS re:Invent page for more details on everything happening at AWS re:Invent.
About Amazon Web Services
Since 2006, Amazon Web Services has been the world’s most comprehensive and broadly adopted cloud. AWS has been continually expanding its services to support virtually any workload, and it now has more than 240 fully featured services for compute, storage, databases, networking, analytics, machine learning and artificial intelligence (AI), Internet of Things (IoT), mobile, security, hybrid, media, and application development, deployment, and management from 108 Availability Zones within 34 geographic regions, with announced plans for 18 more Availability Zones and six more AWS Regions in
About Amazon
Amazon is guided by four principles: customer obsession rather than competitor focus, passion for invention, commitment to operational excellence, and long-term thinking. Amazon strives to be Earth's Most Customer-Centric Company, Earth's Best Employer, and Earth's Safest Place to Work. Customer reviews, 1-Click shopping, personalized recommendations, Prime, Fulfillment by Amazon, AWS, Kindle Direct Publishing, Kindle, Career Choice, Fire tablets, Fire TV, Amazon Echo, Alexa, Just Walk Out technology, Amazon Studios, and The Climate Pledge are some of the things pioneered by Amazon. For more information, visit amazon.com/about and follow @AmazonNews.
View source version on businesswire.com: https://www.businesswire.com/news/home/20241203165432/en/
Amazon.com, Inc.
Media Hotline
Amazon-pr@amazon.com
www.amazon.com/pr
Source: Amazon.com, Inc.
FAQ
What is the price performance improvement of AWS Trainium2 compared to current GPU instances?
When will AWS Trainium3 chips be available?
How much computing power does a single Trn2 UltraServer provide?