AWS Graviton Weekly # 43


Issue # 43: June 23rd, 2023 to June 30th, 2023


Hey Reader.

Welcome to Issue # 43 of AWS Graviton Weekly, which will be focused on sharing everything that happened in the past week related to AWS Silicon: from June 23rd, 2023 to June 30th, 2023.

In this issue, you will find:

  • An application deep-dive into the AWS Graviton3E-based Amazon EC2 Hpc7g instance
  • Amazon SageMaker Neo now supports the compilation of PyTorch and TensorFlow models for Inferentia 2 and Trainium 1 instances
  • How Snowflake is achieving its Sustainability goal using Graviton
  • How GoDaddy is using Graviton instances to build a highly scalable hosting platform
  • A deep dive into PyTorch 2.0 on Graviton
  • Oracle Announces Oracle Database for Arm Architectures in the Cloud & On-Premises
  • A very interesting tutorial about how to optimize & deploy BERT on AWS Inferentia2

Before the regular share, I wanted to give a shoutout to my good friend Cristian Măgherușan-Stanciu (the creator of AutoSpotting and EBS Optimizer) who is sharing a lot of FinOps tips in his newsletter and on LinkedIn.

You can subscribe here.

The latest one? $7k of yearly savings for 10-15 minutes of work.

Back to business.


NEWS

Amazon SageMaker Neo now supports compilation of PyTorch and TensorFlow models for Inferentia 2 and Trainium 1 instances

Starting today, you can choose Inferentia 2 and Trainium 1 as additional targets to compile your PyTorch and TensorFlow models for Amazon SageMaker Neo, a capability of Amazon SageMaker that enables customers to optimize machine learning (ML) models for inference on SageMaker to achieve faster inference without any loss in accuracy. Amazon Elastic Compute Cloud (Amazon EC2) Inf2 instances deliver high performance at the lowest cost for generative artificial intelligence (AI) models, including large language models (LLMs) and vision transformers. AWS Trainium is a machine learning (ML) accelerator that AWS purpose-built for deep learning training of 100B+ parameter models.

Learn more

https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_OutputConfig.html

https://docs.amazonaws.cn/en_us/sagemaker/latest/dg/neo-supported-cloud.html#neo-supported-inferentia
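
As a rough illustration of the announcement above, the sketch below builds the request that the SageMaker `create_compilation_job` API expects, with Inferentia 2 as the target. It only assembles a dictionary and does not call AWS. The job name, role ARN, S3 paths, input name and shape, and framework version are all placeholder assumptions, and the `ml_inf2` / `ml_trn1` target names should be checked against the OutputConfig reference linked above.

```python
import json


def build_neo_compilation_request(job_name, role_arn, model_s3_uri,
                                  output_s3_uri, target_device="ml_inf2"):
    """Return keyword arguments for SageMaker's create_compilation_job call.

    This helper is hypothetical; it only builds the request and never
    contacts AWS.
    """
    if target_device not in ("ml_inf2", "ml_trn1"):
        raise ValueError(f"unexpected target device: {target_device}")
    return {
        "CompilationJobName": job_name,
        "RoleArn": role_arn,
        "InputConfig": {
            "S3Uri": model_s3_uri,
            # Assumed input name and shape: one sequence of 128 token ids.
            "DataInputConfig": json.dumps({"input0": [1, 128]}),
            "Framework": "PYTORCH",
            # Assumption: verify which versions Neo supports for this target.
            "FrameworkVersion": "1.13",
        },
        "OutputConfig": {
            "S3OutputLocation": output_s3_uri,
            "TargetDevice": target_device,
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 900},
    }


request = build_neo_compilation_request(
    job_name="example-neo-inf2-job",                        # hypothetical
    role_arn="arn:aws:iam::123456789012:role/ExampleRole",  # hypothetical
    model_s3_uri="s3://example-bucket/model.tar.gz",        # hypothetical
    output_s3_uri="s3://example-bucket/compiled/",          # hypothetical
)
print(request["OutputConfig"]["TargetDevice"])
```

With AWS credentials configured, the dictionary could then be passed to boto3 as `boto3.client("sagemaker").create_compilation_job(**request)`.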




ARTICLES AND TUTORIALS

Application deep-dive into the AWS Graviton3E-based Amazon EC2 Hpc7g instance, by Neil Ashton (Principal Computational Engineering Specialist at Amazon Web Services (AWS)), Karthik Raman (Principal Application Engineer, HPC at AWS), Dnyanesh Digraskar (Senior Partner Solutions Architect for HPC at AWS), Heidi Poxon (Principal, and lead, for the Performance Engineering and Technical Strategy team in HPC Engineering at AWS), Jun Tang (Software Development Engineer at Annapurna Labs), Rama Malladi (Solution Architect, HPC and Research at AWS) and Stephen Sachs (Principal HPC Application Engineer at AWS)

In this post, we introduced the new Hpc7g instance, which joins the Amazon EC2 HPC instance family. We ran several popular HPC applications and showed that it offers up to 70% better performance and almost 3x better price-performance compared to previous-generation Graviton instances.
Graviton3E instances also consume up to 60% less energy for the same work than comparable Amazon EC2 instances, which makes them a more sustainable approach to HPC.

Learn more:

https://aws.amazon.com/blogs/aws/new-amazon-ec2-hpc7g-instances-powered-by-aws-graviton3e-processors-optimized-for-high-performance-computing-workloads/

https://aws.amazon.com/solutions/case-studies/flying-whales/

https://github.com/aws/aws-graviton-getting-started/tree/main/HPC


How Snowflake Optimized its Virtual Warehouses for Sustainability Using AWS Graviton, by Gabe Bryant (Cloud Engineering Sr. Manager at Snowflake), Murray Stokely (Efficiency Architect at Snowflake), Archana Srinivasan (Sr. Technical Account Manager at AWS) and Frank Dallezotte (Sr. Solutions Architect at AWS)

In this post, we will discuss how Snowflake reduced its carbon emissions footprint and improved performance efficiency by transitioning virtual warehouses to AWS Graviton-based instance types.

Learn more:

https://github.com/aws/aws-graviton-getting-started/blob/main/transition-guide.md


Disaster Recovery 5G Core Network on AWS, by Ashutosh Tulsi (Principal Solutions Architect, Telco 5G at AWS), Neb Miljanovic (Partner Solutions Architect at AWS), and Dr. Young Jung (Principal Solutions Architect (WW Telco NFV/5G) at AWS)

Since the 5G mobile core network serves mission-critical services, such as voice calls and data streaming, you must ensure that the service is resilient to disasters and that network components can be recovered promptly. More specifically, for better isolation from faults and disasters, it is reasonable to consider building a DR 5G network in the cloud rather than in legacy CSP data centers. In addition, if this DR 5G network is mainly meant to be used for a limited time (during service recovery or to absorb a traffic spike), then it fits well with the cloud’s pay-as-you-go model. AWS can help CSP customers by providing not only an environment for building this DR virtual data center, but also automation tools and scaling capabilities for the network, as demonstrated with the GitHub repo sample in this post. Using this fast scale-out capability along with the right type and size of instance, such as AWS Graviton instances, would maximize the cost and energy savings of building a DR 5G network for CSP customers.

Learn more:

https://d1.awsstatic.com/whitepapers/5g-network-evolution-with-aws.pdf

https://aws.amazon.com/blogs/architecture/disaster-recovery-dr-architecture-on-aws-part-i-strategies-for-recovery-in-the-cloud/


Fire up your Unreal Engine-based game on all Graviton cores, by Yahav Biran (Principal Solutions Architect), Matt Trescot (Games, SA Leader – Americas), and Chris Launius (Sr. Marketing Manager, Amazon Game Tech)

Historically, the art of creating and running complex game servers locked developers into a single CPU architecture, typically Intel/AMD. Our developers tell us it’s hard to introduce different CPU architectures once game servers are built for a given processor.
In this article, we’ll show you how to build an Unreal Engine game with full support for the AWS Graviton processor. Plus, we’ll show you how to meet your performance requirements at a 42% lower cost than comparable current generation x86-based instances. Let’s dive in.

Learn more:

https://aws.amazon.com/blogs/gametech/fire-up-your-unreal-engine-based-game-on-all-graviton-cores/

https://aws.amazon.com/solutions/case-studies/innovators/epic-games/


Optimize & Deploy BERT on AWS Inferentia2, by Philipp Schmid (Technical Lead at Hugging Face)

In this end-to-end tutorial, you will learn how to optimize and deploy BERT on AWS Inferentia2. We will reduce latency to 4 ms for BERT-base with a sequence length of 128.
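
To check a latency figure like the one quoted above on your own deployment, a small timing harness is usually enough. The sketch below is not taken from the tutorial: `measure_latency_ms` and `fake_predict` are hypothetical names, and the stand-in function would be replaced by a call to the compiled model.

```python
import statistics
import time


def measure_latency_ms(predict, payload, warmup=10, runs=100):
    """Time repeated calls to predict(payload) and summarize latency in ms."""
    # Warm-up calls are excluded so one-time setup costs do not skew results.
    for _ in range(warmup):
        predict(payload)
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        predict(payload)
        samples.append((time.perf_counter() - start) * 1000.0)
    samples.sort()
    p95_index = min(len(samples) - 1, int(0.95 * len(samples)))
    return {
        "mean_ms": statistics.mean(samples),
        "p95_ms": samples[p95_index],
        "runs": len(samples),
    }


def fake_predict(token_ids):
    # Stand-in for a compiled model call (assumption, not a real model).
    return sum(token_ids)


# A payload of 128 token ids mirrors the sequence length mentioned above.
stats = measure_latency_ms(fake_predict, [0] * 128)
print(stats)
```

Reporting a tail percentile next to the mean matters for serving workloads, because a low average can hide occasional slow requests.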


Building a Scalable and Performant Website Hosting Platform in AWS, by Chris Hinrichs (Principal Engineer VI at GoDaddy)

While GoDaddy Websites + Marketing was built on one of the fastest hosting platforms on the planet, we wanted to improve latency, availability, and reliability by leveraging AWS technologies. This article details how we rebuilt and rearchitected our hosting platform for Websites + Marketing from the ground up using AWS technologies.

Optimize the cost of your Amazon ElastiCache for Redis workloads, by Shirish Kulkarni (NoSQL Specialist Solutions Architect at AWS) and Roberto Luna Rojas (Worldwide Sr In-Memory Databases Specialist Solutions Architect at Amazon Web Services (AWS))

In this post, we covered the top five recommendations on how to optimize the cost when running ElastiCache for Redis workloads using native ElastiCache features. We talked about lowering ElastiCache costs by utilizing Graviton nodes and reserved instances, avoiding over-provisioning and scaling your clusters per business needs with auto-scaling, achieving 4.8 times the capacity with two times less cost using data tiering nodes, and enhancing throughput and lowering resource utilization with I/O multiplexing by upgrading to ElastiCache for Redis 7. Contact your AWS account team to get assistance on how to take advantage of these cost-optimization options in your use cases.


EPAM Gains 40% Price-Performance Improvement for a Cloud Management App With AWS Graviton

EPAM was tasked by Maestro Cloud Control to migrate its Maestro hybrid cloud management platform to AWS Graviton within a pre-existing enterprise infrastructure. The aim of the project was to reduce Maestro’s ongoing R&D cost and improve its performance. This was achieved by improving processing time by 10 percent while using fewer resources to reach this higher level of performance.

How to Reduce Your Amazon EKS Costs by Half in 15 Minutes, by Laurent Gil (Chief Product Officer at CAST.AI)

Overprovisioning is the top reason why teams see their cloud bills constantly growing. But choosing the best instances from the hundreds of options AWS offers is a tough call. Luckily, automation is here to help and slash your EKS costs in 15 minutes. Read this case study to learn more.

Learn more:

https://cast.ai/blog/how-to-reduce-your-amazon-eks-costs-by-half-in-15-minutes/

https://cast.ai/cloud-cost-monitoring/


SLIDES, VIDEO, AND AUDIO

[VIDEO] Deep Dive: PyTorch 2.0 on Graviton - AWS Online Tech Talks, by Hahnara Hyun (Sr Specialist Solutions Architect, EC2 Graviton at Amazon Web Services (AWS))

Since the release of PyTorch in 2017, hardware accelerators have gotten faster, and Arm-based server processors were introduced to the cloud. PyTorch 2.0 improvements take advantage of the growing landscape of new compute capabilities, reducing framework overhead as well as supporting Arm Compute Library. In this session, we will cover the cost reduction and performance gains of Graviton and how to get started with AWS EC2 Graviton and Amazon SageMaker.

Learn more:

https://github.com/aws/deep-learning-containers/blob/master/available_images.md

https://pytorch.org/blog/pytorch-2.0-release/

https://aws.amazon.com/about-aws/whats-new/2022/10/amazon-sagemaker-adds-new-graviton-based-instances-model-deployment/

https://github.com/aws/aws-graviton-getting-started/blob/main/machinelearning/pytorch.md


[SLIDES] You’re (Probably) Ready for AWS Graviton, by Michael Fisher (Principal Specialist, AWS EC2, Graviton, and Containers) and Vishal Manan (Sr. Specialist Solutions Architect EC2 Graviton)


[VIDEO in Hebrew] AWS Summit Tel Aviv 2023 - AWS Graviton: Best Price Performance for your AWS Workloads (SEC301), with Guy Almog (Senior Solution Architect at AWS), Barak Nissim (Sr. Business Development Manager, Compute Services at AWS), and Yotam Bagam (DevOps Engineer at Singular)

From many major instance families in Amazon EC2 to managed services such as AWS Lambda, Amazon Aurora, and Amazon EKS, AWS Graviton-based architecture is being used by tens of thousands of customers to get significant price-performance benefits for a wide variety of workloads on AWS. AWS Graviton3 processors provide up to 25 percent better performance over AWS Graviton2 processors, which already provided significant price-performance benefits. This session dives deep into the AWS Graviton2, Graviton3 and Graviton3E processors including suitable workloads and considerations for adoption, and it features an AWS customer speaking about their processor adoption experience.


[VIDEO in Hebrew] AWS Summit Tel Aviv 2023 - Reduce up to 30% of Containers cost, with EKS and Graviton (DMO301), by Yahav Biran (Principal Solutions Architect at AWS) and Yuval Dovrat (Head of Compute Solutions Architecture at AWS)

Running containerized apps on EKS? Is your app written in Python, Java, or Ruby? Come learn how to reduce your compute costs, increase application resiliency, and future-proof your architecture with Graviton and Intel processors. You'll learn how your app will benefit from Graviton, Intel and AMD processors and how to benchmark it. Moreover, we will demonstrate how to gradually deploy processor-agnostic applications. We will present a real-world, Python-based workload deployed in EKS, powered by Karpenter. We will load the system with 10,000 transactions per second, on hundreds of cores, and show the cost benefits of Graviton and CPU diversification.
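
One common way to let a workload run on both Graviton and x86 nodes is a node affinity on the standard `kubernetes.io/arch` label. The sketch below is not from the session: it builds the affinity section of a pod spec as a Python dictionary, the helper name `multi_arch_affinity` is hypothetical, and it assumes the container image is published for both `arm64` and `amd64`.

```python
import json


def multi_arch_affinity(preferred_arch="arm64",
                        allowed_archs=("arm64", "amd64")):
    """Build a pod 'affinity' section that allows several CPU architectures
    and prefers one of them (Graviton nodes report arm64)."""
    arch_key = "kubernetes.io/arch"
    return {
        "nodeAffinity": {
            # Hard rule: only schedule onto nodes with an allowed architecture.
            "requiredDuringSchedulingIgnoredDuringExecution": {
                "nodeSelectorTerms": [{
                    "matchExpressions": [{
                        "key": arch_key,
                        "operator": "In",
                        "values": list(allowed_archs),
                    }],
                }],
            },
            # Soft rule: favor the preferred architecture when it is available.
            "preferredDuringSchedulingIgnoredDuringExecution": [{
                "weight": 100,
                "preference": {
                    "matchExpressions": [{
                        "key": arch_key,
                        "operator": "In",
                        "values": [preferred_arch],
                    }],
                },
            }],
        },
    }


affinity = multi_arch_affinity()
print(json.dumps(affinity, indent=2))
```

Shifting the preference between architectures, or changing the weight, is one way to move traffic to Graviton gradually while keeping x86 capacity as a fallback.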


EVENTS


Graviton Essentials - Virtual Developer Day (Wednesday, July 12, 2023 | 9:00 AM - 5:00 PM PDT) Live Virtual & Interactive



Marcos Ortiz

I'm a Data Engineer by day at Riot Games (via X-Team), and by night I curate the latest news, product announcements, and resources about AWS Silicon (Graviton, AWS Nitro, Inferentia, and Trainium).
