Marcos Ortiz

AWS Graviton Weekly # 32

Published 12 months ago • 10 min read

Issue # 32: April 7th, 2023 to April 14th, 2023

In partnership with Interesting Data Gigs

Hey Reader

Welcome to Issue # 32 of AWS Graviton Weekly, which will be focused on sharing everything that happened in the past week related to AWS Silicon: from April 7th, 2023 to April 14th, 2023.

The recommended resources for this week?

  • Andy Jassy's letter to shareholders
  • The launch of the EC2 Inf2 instances powered by AWS Inferentia2
  • The launch of Trainium-powered Amazon EC2 Trn1n instances, optimized for network-intensive generative AI models
  • AWS Graviton2-based Amazon EC2 instances are available in additional regions
  • Read how Modulate.AI uses G5g instances featuring NVIDIA T4G Tensor Core GPUs

Enjoy the content of this week

Brought to you by Interesting Data Gigs

If you are a Data Engineer looking for a new role, subscribing to this newsletter will give you 2 to 3 ideas on how to make your job application stand out.

Join 2420+ Data Geeks who read this newsletter every single week.


Andy Jassy’s Letter to Shareholders

On the big business of AWS

AWS has an $85B annualized revenue run rate, is still early in its adoption curve, but at a juncture where it’s critical to stay focused on what matters most to customers over the long-haul.

On Graviton

Chip development is a good example. In last year’s letter, I mentioned the investment we were making in our general-purpose CPU processors named Graviton.
Graviton2-based compute instances deliver up to 40% better price-performance than the comparable latest generation x86-based instances; and in 2022, we delivered our Graviton3 chips, providing 25% better performance than the Graviton2 processors.
Further, as machine learning adoption has continued to accelerate, customers have yearned for lower-cost GPUs (the chips most commonly used for machine learning).
AWS started investing years ago in these specialized chips for machine learning training and inference (inferences are the predictions or answers that a machine learning model provides).
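
To make the Graviton2 price-performance figure quoted above concrete, here is a minimal arithmetic sketch. It assumes "price-performance" means work done per dollar, which is my reading rather than a definition from the letter, and the throughput and price numbers are made up for illustration.

```python
def price_performance(throughput: float, hourly_price: float) -> float:
    """Work done per dollar (assumed definition of price-performance)."""
    return throughput / hourly_price


def relative_gain(baseline: float, candidate: float) -> float:
    """Fractional improvement of candidate over baseline (0.40 means 40% better)."""
    return candidate / baseline - 1.0


def cost_reduction_for_same_work(gain: float) -> float:
    """Fraction of the bill saved when the same work runs at the better ratio."""
    return 1.0 - 1.0 / (1.0 + gain)


if __name__ == "__main__":
    # Hypothetical numbers: requests per second and on-demand price per hour.
    x86 = price_performance(throughput=1000, hourly_price=0.10)
    graviton = price_performance(throughput=1120, hourly_price=0.08)
    print(f"gain: {relative_gain(x86, graviton):.1%}")
    print(f"bill reduction at 40% gain: {cost_reduction_for_same_work(0.40):.1%}")
```

Under that assumption, a 40% price-performance gain works out to a bill that is roughly 29% lower for the same amount of work, not 40% lower.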

On Trainium

We delivered our first training chip in 2022 (“Trainium”); and for the most common machine learning models, Trainium-based instances are up to 140% faster than GPU-based instances at up to 70% lower cost.
Most companies are still in the training stage, but as they develop models that graduate to large-scale production, they’ll find that most of the cost is in inference because models are trained periodically whereas inferences are happening all the time as their associated application is being exercised.

On Inferentia

We launched our first inference chips (“Inferentia”) in 2019, and they have saved companies like Amazon over a hundred million dollars in capital expense already.
Our Inferentia2 chip, which just launched, offers up to four times higher throughput and ten times lower latency than our first Inferentia processor.
With the enormous upcoming growth in machine learning, customers will be able to get a lot more done with AWS’s training and inference chips at a significantly lower cost.
We’re not close to being done innovating here, and this long-term investment should prove fruitful for both customers and AWS. AWS is still in the early stages of its evolution, and has a chance for unusual growth in the next decade.

Highly recommended reading for the weekend

Announcing New Tools for Building with Generative AI on AWS, by Swami Sivasubramanian (VP, Database, Analytics and ML at AWS)

At AWS, we have played a key role in democratizing ML and making it accessible to anyone who wants to use it, including more than 100,000 customers of all sizes and industries. AWS has the broadest and deepest portfolio of AI and ML services at all three layers of the stack.
We’ve invested and innovated to offer the most performant, scalable infrastructure for cost-effective ML training and inference; developed Amazon SageMaker, which is the easiest way for all developers to build, train, and deploy models; and launched a wide range of services that allow customers to add AI capabilities like image recognition, forecasting, and intelligent search to applications with a simple API call.
This is why customers like Intuit, Thomson Reuters, AstraZeneca, Ferrari, Bundesliga, 3M, and BMW, as well as thousands of startups and government agencies around the world, are transforming themselves, their industries, and their missions with ML. We take the same democratizing approach to generative AI: we work to take these technologies out of the realm of research and experiments and extend their availability far beyond a handful of startups and large, well-funded tech companies. That’s why today I’m excited to announce several new innovations that will make it easy and practical for our customers to use generative AI in their businesses.

Amazon EC2 Trn1n instances, optimized for network-intensive generative AI models, are now generally available

Today, AWS announces the general availability of Amazon Elastic Compute Cloud (Amazon EC2) Trn1n instances, which are powered by AWS Trainium accelerators.
Building on the capabilities of Trainium-powered Trn1 instances, Trn1n instances double the network bandwidth to 1600 Gbps of second-generation Elastic Fabric Adapter (EFAv2).
With this increased bandwidth, Trn1n instances deliver up to 20% faster time-to-train for training network-intensive generative AI models such as large language models (LLMs) and mixture of experts (MoE).
Similar to Trn1 instances, Trn1n instances offer up to 50% savings on training costs over other comparable Amazon EC2 instances.

AWS Graviton2-based Amazon EC2 instances are available in additional regions

Starting today, Amazon Elastic Compute Cloud (Amazon EC2) M6gd instances are available in Asia Pacific (Seoul).
C6g instances are available in Asia Pacific (Melbourne) and Europe (Zurich).
M6g instances are available in Africa (Cape Town), Asia Pacific (Melbourne, Osaka) and Europe (Zurich).
R6g instances are available in Africa (Cape Town), Middle East (Bahrain), Asia Pacific (Melbourne) and Europe (Zurich).

AWS Lambda now supports SnapStart for Java functions in 6 additional regions

AWS Lambda now supports SnapStart for Java functions in 6 additional AWS Regions: Asia Pacific (Mumbai), Asia Pacific (Seoul), Canada (Central), Europe (London), South America (São Paulo), US West (N. California). AWS Lambda SnapStart for Java delivers up to 10x faster function startup performance at no extra cost. Lambda SnapStart is a performance optimization that makes it easier for you to build highly responsive and scalable Java applications using AWS Lambda, without having to provision resources or spend time and effort implementing complex performance optimizations.
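
For readers who manage functions from Python, this is a minimal sketch of the request that turns SnapStart on. It assumes you call the Lambda `update_function_configuration` API through boto3; the function name `my-java-fn` is hypothetical, and the helper only builds the arguments so it runs without AWS credentials.

```python
def snapstart_config(function_name: str, enabled: bool = True) -> dict:
    """Build keyword arguments for Lambda's update_function_configuration call.

    SnapStart's ApplyOn setting takes "PublishedVersions" to enable it
    and "None" to disable it.
    """
    apply_on = "PublishedVersions" if enabled else "None"
    return {"FunctionName": function_name, "SnapStart": {"ApplyOn": apply_on}}


if __name__ == "__main__":
    # "my-java-fn" is a hypothetical function name.
    kwargs = snapstart_config("my-java-fn")
    print(kwargs)
    # With boto3 installed and credentials configured, the call would be:
    #   boto3.client("lambda").update_function_configuration(**kwargs)
```

SnapStart applies to published versions, so the setting takes effect on versions you publish after the configuration change, not on `$LATEST`.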

Amazon CodeWhisperer is now generally available

Today, AWS is announcing the general availability of Amazon CodeWhisperer. This artificial intelligence (AI) coding companion generates real-time single-line or full function code suggestions in your integrated development environment (IDE) to help you more quickly build software. With general availability, we are excited to introduce two tiers: CodeWhisperer Individual and CodeWhisperer Professional.
CodeWhisperer Individual is free to use for generating code. You can sign up with an AWS Builder ID based on your email address. The Individual Tier provides code recommendations, reference tracking, and security scans.

Amazon EKS and Amazon EKS Distro now support Kubernetes version 1.26

Kubernetes 1.26 introduced several new features and bug fixes, and AWS is excited to announce that you can now use Amazon EKS and Amazon EKS Distro to run Kubernetes version 1.26. Starting today, you can create new 1.26 clusters or upgrade your existing clusters to 1.26 using the Amazon EKS console, the eksctl command line interface, or through an infrastructure-as-code tool.
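
If your cluster is more than one release behind, the upgrade is a sequence rather than a single jump. The sketch below assumes in-place EKS upgrades move one minor version at a time (check the current EKS documentation for your case); the function names are my own and not part of any AWS tool.

```python
def parse_version(version: str) -> tuple[int, int]:
    """Split a 'major.minor' Kubernetes version string into integers."""
    major, minor = version.split(".")
    return int(major), int(minor)


def upgrade_path(current: str, target: str) -> list[str]:
    """List each minor version to step through, ending at the target."""
    cur_major, cur_minor = parse_version(current)
    tgt_major, tgt_minor = parse_version(target)
    if cur_major != tgt_major:
        raise ValueError("major version changes are outside this sketch")
    if tgt_minor < cur_minor:
        raise ValueError("downgrades are not supported")
    return [f"{cur_major}.{minor}" for minor in range(cur_minor + 1, tgt_minor + 1)]


if __name__ == "__main__":
    # A cluster on 1.24 would pass through 1.25 before reaching 1.26.
    print(upgrade_path("1.24", "1.26"))
```

Each step in the returned list corresponds to one upgrade run through the console, eksctl, or your infrastructure-as-code tool.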


Amazon EC2 Inf2 Instances for Low-Cost, High-Performance Generative AI Inference are Now Generally Available, by Antje Barth (Principal Developer Advocate, AI & Machine Learning at Amazon Web Services and co-author of "Data Science on AWS" book)

I’m excited to announce that Amazon EC2 Inf2 instances are now generally available!
Inf2 instances are the first inference-optimized instances in Amazon EC2 to support scale-out distributed inference with ultra-high-speed connectivity between accelerators. You can now efficiently deploy models with hundreds of billions of parameters across multiple accelerators on Inf2 instances. Compared to Amazon EC2 Inf1 instances, Inf2 instances deliver up to 4x higher throughput and up to 10x lower latency.

Driving Price Performance Benefits with AWS Graviton, by Carter Huffman (CTO and Co-founder at Modulate) and Shruti Koparkar (Product Marketing Lead, AI/ML Acceleration at AWS)

We chose AWS for the scalability and elasticity that our application needed as well as the great customer service it offers.
Using Amazon Elastic Compute Cloud (Amazon EC2) G5g instances featuring NVIDIA T4G Tensor Core GPUs as the infrastructure for ToxMod has helped us lower our costs by a factor of 5 (compared to G4dn instances) while achieving our goals on throughput and latency.
As a nimble startup, we can reinvest these cost savings into further innovation to help serve our mission. In this post, we cover our use case, challenges, and alternative paths, and a brief overview of our solution using AWS.

How Excelfore Uses the AWS Cloud to Develop and Deliver Continuous Software Updates for Software Defined Vehicles, by Shrikant Acharya (President and CTO at Excelfore) and Nolan Chen (Solutions Architect at AWS)

Through digital twins running on SOAFEE’s EWAOL architecture, Excelfore’s eSync platform enables developers from across the automotive ecosystem to define and deliver the SDV experience through a standard architecture and platform.

Streaming Android games from cloud to mobile with AWS Graviton-based Amazon EC2 G5g instances, by Vincent Wang (GCR EC2 Specialist SA, Compute)

In this post, we chose the Anbox Cloud Appliance to demonstrate how you can use it to stream a resource-demanding game called Genshin Impact. We use a G5g instance along with a mobile phone to run the streamed game inside of a Firefox browser application.

AWS RDS Costs Saving Tips, by Rahul Subramaniam (Founder and Chief Evangelist at CloudFix)

A very good pack of saving tips here by Rahul.

The Power of AWS Graviton Processors: A Deep Dive into Performance and Adoption, by Nithin Janardhanan

Amazon Web Services (AWS) has introduced Graviton processors to provide better price-performance for their customers. In this deep technical blog, we will explore the performance of Graviton-based instances, how they compare to x86-based instances, and the various ways you can adopt these instances for your workloads.

Drive Data-Backed Cost Optimization and Savings, by Jarosław Grząbel (Cloud Architect at SoftServe)

Expecting one sweeping change to optimize your cloud performance and reduce costs is wishful thinking. It requires a deep dive. A single unidentified interdependency amplified by your broad approach is all it takes to kill any performance gains and erase all cost savings. No one understands this better than the world's most comprehensive and broadly adopted cloud platform. It's at the heart of the AWS solution that supports cost reduction without compromising performance.
Enter the custom-built Graviton processors. This new system allows AWS to improve a company’s performance and deliver faster speeds at a lower cost. More energy-efficient compute options also help reduce your carbon footprint and meet sustainability goals.


[VIDEO] Effortless migration - Is your app already Graviton-ready? by Michael Fisher (Principal Specialist Solutions Architect for EC2 Graviton at AWS) and Vishal Manan (Senior Specialist Solutions Architect for EC2 Graviton at AWS)

You may be able to migrate to AWS Graviton and begin immediately taking advantage of the lower costs, higher performance, and greater sustainability today. In this webinar we'll help you identify the applications you can easily migrate to Graviton without any significant code or configuration changes and save up to 40% on your EC2 costs.
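
One rough way to triage a Python application before watching the webinar is to look at its dependencies. The sketch below is my own illustration, not the webinar's method: it assumes your dependencies ship as wheels and reads the platform tag at the end of each wheel filename. The `examplepkg` filename is hypothetical.

```python
import platform
from typing import Optional


def is_arm64(machine: Optional[str] = None) -> bool:
    """True when the given (or current) machine name is a 64-bit Arm name."""
    name = (machine or platform.machine()).lower()
    return name in {"aarch64", "arm64"}


def wheel_platform_tag(filename: str) -> str:
    """Return the platform tag, the last dash-separated field of a wheel name."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename}")
    return filename[: -len(".whl")].split("-")[-1]


def wheel_runs_on_graviton(filename: str) -> bool:
    """Pure-Python wheels ('any') and Linux aarch64 wheels need no rebuild."""
    tag = wheel_platform_tag(filename)
    return tag == "any" or "aarch64" in tag


if __name__ == "__main__":
    wheels = [
        "requests-2.28.2-py3-none-any.whl",
        "numpy-1.24.2-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl",
        "examplepkg-1.0-cp311-cp311-manylinux2014_x86_64.whl",  # hypothetical
    ]
    print("running on arm64:", is_arm64())
    for name in wheels:
        print(name, "->", wheel_runs_on_graviton(name))
```

A dependency that only offers x86_64 wheels is not a blocker by itself, but it usually means a source build or an alternative package, which is where the "significant changes" start.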

[VIDEO] AWS On Air ft. Build a SAAS MVP on AWS, with Jillian Forde (Senior Solutions Architect at AWS), Bill Tarr (Sr. Partner Solutions Architect - SaaS Factory at AWS) and Ben Duncan (Startup Solutions Architect at AWS)

Learn the most important architecture considerations when building a SAAS startup on AWS.


If you are looking for amazing people to be part of your company, inside our Talent Collective, you will find a lot of great people. Check it out here


AWS Summit ASEAN 2023

May 4, 2023 | SINGAPORE
Sands Expo & Convention Centre at Marina Bay Sands

Improving price-performance with AWS Graviton-based instances

Moving from x86-based Amazon EC2 instances to AWS Graviton Arm-based processors can save you a lot of money, with up to 40 percent better price performance. Can you simply update your AWS CloudFormation templates from c5 to c6g and reap the savings? In this session, learn about customer-proven strategies that can help you make the move to AWS Graviton confidently while minimising uncertainty and risk. Gain insights on identifying candidate workloads, performance testing, maintaining availability and flexibility, monitoring tools and release management.
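
The mechanical part of the c5 to c6g change is small, which is why the session's question is a fair one. Here is a minimal sketch of that rename on a CloudFormation template already loaded as a Python dict. The family mapping is a hypothetical starting point, and the sketch deliberately ignores everything else a real migration needs.

```python
import copy

# Hypothetical x86 -> Graviton2 family mapping; adjust it for your own fleet.
X86_TO_GRAVITON = {"c5": "c6g", "m5": "m6g", "r5": "r6g"}


def to_graviton(instance_type: str) -> str:
    """Swap the family prefix (c5.large -> c6g.large); leave unknown families alone."""
    family, dot, size = instance_type.partition(".")
    if not dot:
        return instance_type
    return f"{X86_TO_GRAVITON.get(family, family)}.{size}"


def rewrite_template(template: dict) -> dict:
    """Return a copy of the template with EC2 instance types renamed."""
    updated = copy.deepcopy(template)
    for resource in updated.get("Resources", {}).values():
        if resource.get("Type") != "AWS::EC2::Instance":
            continue
        props = resource.get("Properties", {})
        if isinstance(props.get("InstanceType"), str):
            props["InstanceType"] = to_graviton(props["InstanceType"])
            # Not handled here: ImageId must point at an arm64 AMI, and not
            # every x86 size has a same-named Graviton size.
    return updated


if __name__ == "__main__":
    template = {
        "Resources": {
            "Web": {
                "Type": "AWS::EC2::Instance",
                "Properties": {"InstanceType": "c5.large", "ImageId": "ami-EXAMPLE"},
            }
        }
    }
    print(rewrite_template(template)["Resources"]["Web"]["Properties"])
```

The rename alone leaves the AMI, the binaries inside it, and your performance baselines untouched, and those are the topics the session says it covers.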


  • Chetan Suri, Enterprise Support Lead, AWS
  • Marc Venturini, Senior Software Engineering Manager, Build Automation, Grab

Unlocking innovation from 24x7 Operations to 9x5 Innovation

A modern enterprise demands best-in-class solutions to remain secure, operational, and deliver digital-first experiences to drive growth. Modernisation is challenging, and moving from a traditional data centre to a hybrid cloud environment is a complex exercise. In this session, hear from Red Hat and AWS on how their customers overcame these complexities to build and implement a secure, robust and scalable cloud environment to match customer demands. Discover how this led to sustainable long-term cost savings, while enabling their technology teams to architect for the future by leveraging managed app platforms.


  • Paul Whiten, Emerging Sales Specialist, Cloud Services, Red Hat
  • Wayne Toh, Sr. Specialist SA, EC2 Graviton, AWS

From the ARM Ecosystem

Tweet of the week

Joe Speed (Head of Edge at Ampere Computing)

April 6th 2023

Marcos Ortiz

I'm a Data Engineer by day at Riot Games (via X-Team) and by night, I curate the latest news, product announcements, and resources about AWS Silicon (Graviton, AWS Nitro, Inferentia, and Trainium).
