I'm a Data Engineer by day at Riot Games (via X-Team) and by night, I curate the latest news, product announcements, and resources about AWS Silicon (Graviton, AWS Nitro, Inferentia, and Trainium).
Welcome to Issue # 32 of AWS Graviton Weekly, which will be focused on sharing everything that happened in the past week related to AWS Silicon: from April 7th, 2023 to April 14th, 2023.
The recommended resources for this week?
Andy Jassy's letter to shareholders
The launch of the EC2 Inf2 instances powered by AWS Inferentia2
The launch of Trainium-powered Amazon EC2 Trn1n instances, optimized for network-intensive generative AI models
AWS Graviton2-based Amazon EC2 instances are available in additional regions
If you are a Data Engineer looking for a new role, subscribing to this newsletter will give you two or three ideas for making your job application stand out.
Join 2420+ Data Geeks who read this newsletter every single week.
AWS has an $85B annualized revenue run rate, is still early in its adoption curve, but at a juncture where it’s critical to stay focused on what matters most to customers over the long-haul.
On Graviton
Chip development is a good example. In last year’s letter, I mentioned the investment we were making in our general-purpose CPU processors named Graviton.
Graviton2-based compute instances deliver up to 40% better price-performance than the comparable latest generation x86-based instances; and in 2022, we delivered our Graviton3 chips, providing 25% better performance than the Graviton2 processors.
Further, as machine learning adoption has continued to accelerate, customers have yearned for lower-cost GPUs (the chips most commonly used for machine learning).
AWS started investing years ago in these specialized chips for machine learning training and inference (inferences are the predictions or answers that a machine learning model provides).
On Trainium
We delivered our first training chip in 2022 (“Trainium”); and for the most common machine learning models, Trainium-based instances are up to 140% faster than GPU-based instances at up to 70% lower cost.
Most companies are still in the training stage, but as they develop models that graduate to large-scale production, they’ll find that most of the cost is in inference because models are trained periodically whereas inferences are happening all the time as their associated application is being exercised.
On Inferentia
We launched our first inference chips (“Inferentia”) in 2019, and they have saved companies like Amazon over a hundred million dollars in capital expense already.
Our Inferentia2 chip, which just launched, offers up to four times higher throughput and ten times lower latency than our first Inferentia processor.
With the enormous upcoming growth in machine learning, customers will be able to get a lot more done with AWS’s training and inference chips at a significantly lower cost.
We’re not close to being done innovating here, and this long-term investment should prove fruitful for both customers and AWS. AWS is still in the early stages of its evolution, and has a chance for unusual growth in the next decade.
At AWS, we have played a key role in democratizing ML and making it accessible to anyone who wants to use it, including more than 100,000 customers of all sizes and industries. AWS has the broadest and deepest portfolio of AI and ML services at all three layers of the stack.
We’ve invested and innovated to offer the most performant, scalable infrastructure for cost-effective ML training and inference; developed Amazon SageMaker, which is the easiest way for all developers to build, train, and deploy models; and launched a wide range of services that allow customers to add AI capabilities like image recognition, forecasting, and intelligent search to applications with a simple API call.
This is why customers like Intuit, Thomson Reuters, AstraZeneca, Ferrari, Bundesliga, 3M, and BMW, as well as thousands of startups and government agencies around the world, are transforming themselves, their industries, and their missions with ML. We take the same democratizing approach to generative AI: we work to take these technologies out of the realm of research and experiments and extend their availability far beyond a handful of startups and large, well-funded tech companies. That’s why today I’m excited to announce several new innovations that will make it easy and practical for our customers to use generative AI in their businesses.
Today, AWS announces the general availability of Amazon Elastic Compute Cloud (Amazon EC2) Trn1n instances, which are powered by AWS Trainium accelerators.
Building on the capabilities of Trainium-powered Trn1 instances, Trn1n instances double the network bandwidth to 1600 Gbps of second-generation Elastic Fabric Adapter (EFAv2).
With this increased bandwidth, Trn1n instances deliver up to 20% faster time-to-train for training network-intensive generative AI models such as large language models (LLMs) and mixture of experts (MoE).
Similar to Trn1 instances, Trn1n instances offer up to 50% savings on training costs over other comparable Amazon EC2 instances.
AWS Lambda now supports SnapStart for Java functions in 6 additional AWS Regions: Asia Pacific (Mumbai), Asia Pacific (Seoul), Canada (Central), Europe (London), South America (São Paulo), US West (N. California). AWS Lambda SnapStart for Java delivers up to 10x faster function startup performance at no extra cost. Lambda SnapStart is a performance optimization that makes it easier for you to build highly responsive and scalable Java applications using AWS Lambda, without having to provision resources or spend time and effort implementing complex performance optimizations.
Today, AWS is announcing the general availability of Amazon CodeWhisperer. This artificial intelligence (AI) coding companion generates real-time single-line or full function code suggestions in your integrated development environment (IDE) to help you more quickly build software. With general availability, we are excited to introduce two tiers: CodeWhisperer Individual and CodeWhisperer Professional.
CodeWhisperer Individual is free to use for generating code. You can sign up with an AWS Builder ID based on your email address. The Individual Tier provides code recommendations, reference tracking, and security scans.
Kubernetes 1.26 introduced several new features and bug fixes, and AWS is excited to announce that you can now use Amazon EKS and Amazon EKS Distro to run Kubernetes version 1.26. Starting today, you can create new 1.26 clusters or upgrade your existing clusters to 1.26 using the Amazon EKS console, the eksctl command line interface, or through an infrastructure-as-code tool.
I’m excited to announce that Amazon EC2 Inf2 instances are now generally available!
Inf2 instances are the first inference-optimized instances in Amazon EC2 to support scale-out distributed inference with ultra-high-speed connectivity between accelerators. You can now efficiently deploy models with hundreds of billions of parameters across multiple accelerators on Inf2 instances. Compared to Amazon EC2 Inf1 instances, Inf2 instances deliver up to 4x higher throughput and up to 10x lower latency.
We chose AWS for the scalability and elasticity that our application needed as well as the great customer service it offers.
Using Amazon Elastic Compute Cloud (Amazon EC2) G5g instances featuring NVIDIA T4G Tensor Core GPUs as the infrastructure for ToxMod has helped us lower our costs by a factor of 5 (compared to G4dn instances) while achieving our goals on throughput and latency.
As a nimble startup, we can reinvest these cost savings into further innovation to help serve our mission. In this post, we cover our use case, challenges, and alternative paths, and a brief overview of our solution using AWS.
Through digital twins running on SOAFEE’s EWAOL architecture, Excelfore’s eSync platform enables developers from across the automotive ecosystem to define and deliver the SDV experience through a standard architecture and platform.
In this post, we chose the Anbox Cloud Appliance to demonstrate how you can use it to stream a resource-demanding game called Genshin Impact. We use a G5g instance along with a mobile phone to run the streamed game inside of a Firefox browser application.
Amazon Web Services (AWS) has introduced Graviton processors to provide better price-performance for their customers. In this deep technical blog, we will explore the performance of Graviton-based instances, how they compare to x86-based instances, and the various ways you can adopt these instances for your workloads.
Expecting one sweeping change to optimize your cloud performance and reduce costs is wishful thinking. It requires a deep dive. A single unidentified interdependency amplified by your broad approach is all it takes to kill any performance gains and erase all cost savings. No one understands this better than the world's most comprehensive and broadly adopted cloud platform. It's at the heart of the AWS solution that supports cost reduction without compromising performance.
Enter the custom-built Graviton processors. This new system allows AWS to improve a company’s performance and deliver faster speeds at a lower cost. More energy-efficient compute options also help reduce your carbon footprint and meet sustainability goals.
You may be able to migrate to AWS Graviton today and immediately take advantage of lower costs, higher performance, and greater sustainability. In this webinar we'll help you identify the applications you can easily migrate to Graviton without any significant code or configuration changes and save up to 40% on your EC2 costs.
Moving from x86-based Amazon EC2 instances to AWS Graviton Arm-based processors can save you a lot of money, with up to 40 percent better price performance. Can you simply update your AWS CloudFormation templates from c5 to c6g and reap the savings? In this session, learn about customer-proven strategies that can help you make the move to AWS Graviton confidently while minimising uncertainty and risk. Gain insights on identifying candidate workloads, performance testing, maintaining availability and flexibility, monitoring tools and release management.
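To make the "c5 to c6g" question concrete, here is a minimal Python sketch of what the mechanical part of that change could look like. It is not taken from the session: the `GRAVITON_FAMILY` mapping, the `to_graviton` helper, and the sample template are all hypothetical names made up for illustration. It only swaps the instance family on `AWS::EC2::Instance` resources in a template that has already been loaded into a dict, and it deliberately ignores everything else a real migration needs, such as arm64 AMIs, size availability in the target family, and performance testing.

```python
import copy

# Hypothetical mapping from x86 instance families to Graviton2 families.
# Extend or adjust it for your own fleet; it is an assumption, not an AWS API.
GRAVITON_FAMILY = {"c5": "c6g", "m5": "m6g", "r5": "r6g"}


def to_graviton(template):
    """Return (new_template, changed_resource_names) without mutating the input.

    Only resources of type AWS::EC2::Instance with a literal string
    InstanceType are touched. Sizes are copied as-is and NOT validated
    against what the target family actually offers.
    """
    result = copy.deepcopy(template)
    changed = []
    for name, resource in result.get("Resources", {}).items():
        if resource.get("Type") != "AWS::EC2::Instance":
            continue
        props = resource.get("Properties", {})
        instance_type = props.get("InstanceType")
        if not isinstance(instance_type, str):
            continue  # e.g. a Ref or parameter; leave it for manual review
        family, _, size = instance_type.partition(".")
        if family in GRAVITON_FAMILY and size:
            props["InstanceType"] = f"{GRAVITON_FAMILY[family]}.{size}"
            changed.append(name)
    return result, changed


# Sample template as a plain dict (placeholder AMI ID, not a real one).
template = {
    "Resources": {
        "Web": {
            "Type": "AWS::EC2::Instance",
            "Properties": {"InstanceType": "c5.large", "ImageId": "ami-x86-placeholder"},
        },
        "Bucket": {"Type": "AWS::S3::Bucket"},
    }
}

converted, changed = to_graviton(template)
print(changed)
print(converted["Resources"]["Web"]["Properties"]["InstanceType"])
```

Note that the `ImageId` is left untouched on purpose: Graviton instances need an arm64 AMI, so the list of changed resources is best read as a review checklist rather than a finished migration.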
A modern enterprise demands best-in-class solutions to remain secure, operational, and deliver digital-first experiences to drive growth. Modernisation is challenging, and moving from a traditional data centre to a hybrid cloud environment is a complex exercise. In this session, hear from Red Hat and AWS on how their customers overcame these complexities to build and implement a secure, robust and scalable cloud environment to match customer demands. Discover how this led to sustainable long-term cost savings, while enabling their technology teams to architect for the future by leveraging managed app platforms.
Speakers:
Paul Whiten, Emerging Sales Specialist, Cloud Services, Red Hat