Let’s explore EC2 instance types, learn how to choose ones that offer the best price-performance combo, and discuss best practices for AWS cost optimization.
A DevOps engineer’s life in AWS isn’t a piece of cake. How are you supposed to make sense of EC2 instance types when you’re looking at almost 400 different ones? Picking the right VM type for the job without burning a hole in your pocket is a challenge. But there are a few things you can do to make your life easier (and gain points with your financial department).
Careful choice of EC2 instances is definitely worth your time because compute is typically the biggest part of your cloud bill. If you manage to optimize it, you’ll open the door to dramatic reductions in your cloud costs.
Amazon Elastic Compute Cloud (EC2) is a service that delivers compute capacity in the cloud to help teams benefit from easy-to-scale cloud computing.
Some teams make the mistake of choosing EC2 instances that are too large. They want to be on the safe side in case their application’s requirements increase. But why overprovision when you can use a burstable instance or delegate the task to incredibly cost-effective spot instances when needed?
Other teams are tempted to use more affordable instances. But what if they start running memory-intensive applications and encounter performance issues?
It all starts with knowing your workload requirements well. Make a deliberate effort to get only what your application really needs.
Identify the minimum requirements of your workload and pick EC2 instance types that meet them across these dimensions:
Let’s say that you’ve done your homework and come up with a set of targeted instance types.
If you’re looking for an instance to support a machine learning application, opt for a GPU instead of a CPU. GPU-dense instance types train models much faster. Interestingly, the GPU wasn’t initially designed for machine learning; it was designed to display graphics.
What about running predictions? Is investing in specialized instance types worth it? AWS has introduced an instance type designed for inference, EC2 Inf1. It supposedly delivers up to 30% higher throughput and 45% lower cost per inference than EC2 G4 instances.
And what’s the hype around Arm all about? The EC2 A1 family is powered by the AWS Graviton Arm processor (newer families like M6g, C6g, and R6g use Graviton2). Since Arm is less power-hungry, it’s also cheaper to run and cool, and cloud providers usually charge less for this type of processor.
But if you’d like to use it, you might have to re-architect your delivery pipeline to compile your application for Arm. On the other hand, if you’re already running an interpreted stack like Python, Ruby, or NodeJS, your applications will likely run on Arm with little or no change.
| EC2 instance family | Key characteristics | Use cases |
|---|---|---|
| General-purpose | Balanced ratio of vCPU to memory | General-purpose applications that use vCPU and memory in equal proportions; scale-out workloads like web servers, containerized microservices, and small to mid-sized development environments; low-latency user-interactive applications and small to medium database workloads; virtual desktop machines, code repositories, application servers |
| Compute-optimized | High ratio of vCPU to memory; optimized for vCPU-intensive workloads | High-performance web servers, batch processing, distributed analytics; high-performance computing (HPC); highly scalable multiplayer gaming apps; high-performance frontend fleets, backend applications, and API servers; science and engineering applications |
| Memory-optimized | High ratio of memory to vCPU | High-performance database clusters; distributed web-scale in-memory caches; mid-size in-memory databases and enterprise applications; applications that process unstructured big data in real time; high-performance computing (HPC) and Hadoop/Spark clusters |
| Storage-optimized | Designed for workloads that need high, sequential read and write access to massive data sets on local storage; can deliver thousands of low-latency, random I/O operations per second (IOPS) to applications | NoSQL databases (Cassandra, MongoDB, Redis); in-memory databases (SAP HANA, Aerospike); scale-out transactional databases and distributed file systems (HDFS and MapR-FS); Massively Parallel Processing (MPP); MapReduce and Hadoop distributed computing; Apache Kafka and big data workload clusters |
| Accelerated computing | Uses hardware accelerators (co-processors) to power functions that machine and deep learning systems require | Machine/deep learning; high-performance computing (HPC); computational finance; speech recognition and conversational agents; molecular modeling and genomics; recommendation engines; 3D visualizations and rendering |
| Inference (Inf1) | Promises up to 30% higher throughput and 45% lower cost per inference than EC2 G4 instances; includes up to 16 AWS Inferentia chips, second-generation Intel Xeon Scalable processors, and networking of up to 100 Gbps | Machine learning applications; search and recommendation; speech recognition and natural language processing; fraud detection |
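To show how the table above can be turned into a simple selection rule, here’s a minimal sketch. The workload labels and example instance types are illustrative choices for this sketch, not AWS recommendations:

```python
# Hypothetical mapping from workload profiles to EC2 instance families,
# loosely following the table above. Example types are illustrative only.
WORKLOAD_TO_FAMILY = {
    "web-server": ("general-purpose", "m5.large"),
    "batch-processing": ("compute-optimized", "c5.xlarge"),
    "in-memory-db": ("memory-optimized", "r5.2xlarge"),
    "nosql-db": ("storage-optimized", "i3.xlarge"),
    "ml-training": ("accelerated-computing", "p3.2xlarge"),
    "ml-inference": ("inference", "inf1.xlarge"),
}

def suggest_family(workload: str) -> str:
    """Return the EC2 family (and an example type) for a workload profile.

    Unknown workloads fall back to general-purpose, the safest default.
    """
    family, example = WORKLOAD_TO_FAMILY.get(
        workload, ("general-purpose", "m5.large")
    )
    return f"{family} (e.g. {example})"
```

In practice you’d refine this with the minimum vCPU, memory, and storage requirements you identified earlier, but the principle is the same: encode the decision once instead of re-deriving it per deployment.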
EC2 instance types come in one or more sizes, so scaling resources to match your workload’s requirements is easy.
But size isn’t the only factor that determines the cost.
AWS provides compute capacity on many different generations of hardware, and the chips in those machines have different performance characteristics. You might get an instance running on an older-generation processor that is slightly slower, or a newer one that is a bit faster. The instance type you pick might also come with strong performance characteristics your application doesn’t really need, and you won’t even know it.
How do you verify this? Benchmarking is the best approach: run the same workload on every machine type you want to examine and compare the performance characteristics.
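The benchmarking idea can be sketched in a few lines: run a fixed CPU-bound task in a loop and count iterations completed in each time slice. The task and timings below are illustrative; on a real instance you’d run your actual workload for hours, not seconds:

```python
import time

def benchmark(task, seconds=1.0, window=0.25):
    """Run `task` in a loop for `seconds`, counting completed iterations
    per `window`-second slice. Stable hardware shows similar counts in
    every slice; a throttled machine shows counts dropping over time."""
    slices = []
    count = 0
    start = window_start = time.perf_counter()
    while True:
        now = time.perf_counter()
        if now - start >= seconds:
            break
        task()
        count += 1
        if now - window_start >= window:
            slices.append(count)
            count = 0
            window_start = now
    return slices

# Example workload: a small CPU-bound task (sum of squares).
slices = benchmark(lambda: sum(i * i for i in range(1000)))
```

Comparing the per-slice counts across instance types (and across time on the same instance) is what exposes older processors and throttling that the spec sheet won’t tell you about.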
To understand instance performance, we developed a metric called Endurance Coefficient. Here’s how we calculate it:
We tested the DigitalOcean s1_1 machine and (as you can see) it achieved a pretty high endurance coefficient of 0.97107 (97%). The AWS t3_medium_st instance delivered a less stable result with an endurance coefficient of 0.43152 (43%).
Source: CAST AI
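The exact formula behind the Endurance Coefficient isn’t spelled out here, so the following is a hypothetical reconstruction under one plausible assumption: the coefficient is the ratio of average observed performance to peak observed performance over a benchmarking run, so that a value near 1.0 means performance stays close to peak throughout:

```python
def endurance_coefficient(samples):
    """Hypothetical reconstruction of an endurance-style metric:
    average observed performance divided by peak observed performance.
    This is an illustrative assumption, not the published CAST AI formula."""
    peak = max(samples)
    return sum(samples) / (len(samples) * peak)

# A stable machine: throughput barely drops over the run.
stable = [100, 99, 98, 99, 97]
# A burstable machine that throttles once its CPU credits run out.
throttled = [100, 95, 60, 30, 20]
```

Under this assumption, the stable series scores close to 1.0 while the throttled series scores much lower, mirroring the gap between the 0.97 and 0.43 results quoted above.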
Next, you have to select an EC2 pricing model that matches your needs and budget. AWS offers the following models:
On-Demand Instances: you pay only for the resources you actually use, with no long-term contracts or upfront payments. Increase or reduce your usage at any time. But this flexibility comes with the highest price tag. Workloads with fluctuating traffic spikes benefit the most from On-Demand Instances.
Reserved Instances: buy capacity upfront in a given availability zone at a large discount off the On-Demand price. The larger your upfront payment, the larger the discount. But if you go for it, you’re also committing to a specific instance type or family, and you can’t change that later if your requirements change.
Savings Plans: get Reserved Instances-level discounts by committing to a given amount of compute usage per hour (rather than to specific instance types and configurations). Anything beyond your commitment is billed at the higher On-Demand rate.
But wait, didn’t you migrate to the cloud to avoid CAPEX in the first place? Reserved Instances and Savings Plans pose the risk of vendor lock-in. The resources you commit to today might make little sense for your company down the line; three years is an eternity in cloud computing.
Spot Instances: bidding on spare compute capacity is a smart move; you can save up to 90% off the On-Demand price. But AWS can pull the plug on your instance at any time, giving you just two minutes to prepare for it. You need a strategy to deal with that.
Dedicated Hosts: a physical server whose instance capacity is fully dedicated to you. You can cut costs by bringing your own licenses while keeping the resiliency and flexibility of the cloud. It’s pricey, but a good match for applications that must meet compliance requirements by, for example, not sharing hardware with other tenants.
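A quick cost comparison makes the trade-offs above concrete. The hourly rates below are hypothetical round numbers chosen for illustration, not current AWS prices:

```python
# Illustrative hourly rates (hypothetical, not real AWS prices) for one
# instance type under three pricing models.
ON_DEMAND = 0.10   # $/hour, fully flexible
RESERVED = 0.06    # $/hour effective, long-term commitment
SPOT = 0.03        # $/hour, interruptible at two minutes' notice

def monthly_cost(rate_per_hour, hours=730):
    """Approximate monthly cost assuming the instance runs 24/7."""
    return rate_per_hour * hours

def spot_savings_pct(on_demand, spot):
    """Percentage saved by running on Spot instead of On-Demand."""
    return round((1 - spot / on_demand) * 100)
```

With these sample rates, a single always-on instance costs about $73/month On-Demand, and Spot cuts that by 70%; the question is whether your workload can tolerate the commitment (Reserved) or the interruptions (Spot).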
Burstable performance instances were designed to give you a baseline level of CPU performance together with the possibility of bursting to a higher level when the need arises.
Burstable instances in families T2, T3, T3a, and T4g are a good fit for low-latency interactive applications, microservices, small/medium databases, and product prototypes.
Bursting is possible only if you have accumulated CPU credits. The rate at which credits accrue depends on your instance type; generally, larger instances collect more credits per hour. But note that there’s a cap on the number of credits that can be accumulated (and, naturally, it’s higher for larger instances).
We examined the burstable instances AWS offers and discovered that if you load your instance for four hours or more per day (on average), you’re better off with a non-burstable instance. But if you run, say, an e-commerce business that experiences traffic spikes only once in a while, a burstable instance is cost-effective.
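The four-hour break-even above can be sketched as a simple cost model. All three rates below are hypothetical numbers picked so the break-even lands near four loaded hours per day; real prices, surcharges, and credit mechanics vary by instance type and region:

```python
# Hypothetical rates illustrating the burstable-vs-non-burstable break-even.
BURSTABLE_RATE = 0.0416        # $/hour base price of a burstable instance
SURCHARGE_PER_LOADED_HOUR = 0.10  # assumed extra cost per hour of heavy load
                                  # (approximating credits billed beyond the cap)
NON_BURSTABLE_RATE = 0.0584    # $/hour for a comparable fixed-performance type

def daily_cost_burstable(loaded_hours):
    """Base price for 24h plus a surcharge for every hour of heavy load."""
    return 24 * BURSTABLE_RATE + loaded_hours * SURCHARGE_PER_LOADED_HOUR

def daily_cost_non_burstable():
    """Fixed-performance instances cost the same regardless of load."""
    return 24 * NON_BURSTABLE_RATE

def burstable_is_cheaper(loaded_hours):
    return daily_cost_burstable(loaded_hours) < daily_cost_non_burstable()
```

With these sample numbers, a burstable instance wins at three loaded hours per day and loses at five, which is the shape of the trade-off described above: occasional spikes favor burstable, sustained load favors fixed-performance types.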
Our tests revealed that the available compute capacity tends to increase linearly during the first four hours. After that, the increase is limited, and the amount of available compute drops by nearly 90% by the end of the day.
Source: CAST AI
To maximize cloud cost savings, be careful about data storage:
Spot Instances are a great way to save on your AWS bill. By bidding on instances AWS isn’t using, you can get up to a 90% discount off the On-Demand price.
The first step is qualifying your workload for Spot Instances. Is it spot-ready? Answer these questions to find out:
Once you determine that your workload is a good candidate for Spot Instances, here are a few helpful pointers:
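One essential building block of any spot strategy is handling the two-minute interruption notice. On a running instance, the notice appears as a small JSON document in the instance metadata (the `spot/instance-action` path); the sketch below parses a sample payload offline to show the shape of a graceful-shutdown hook, with the timestamps made up for the example:

```python
import json
from datetime import datetime, timezone

# Sample interruption notice in the format served by the EC2 instance
# metadata endpoint at /latest/meta-data/spot/instance-action.
# The timestamp here is fabricated for the example.
SAMPLE_NOTICE = '{"action": "terminate", "time": "2030-01-01T12:00:00Z"}'

def seconds_until_interruption(notice_json, now=None):
    """Return how many seconds remain before AWS reclaims the instance."""
    notice = json.loads(notice_json)
    when = datetime.fromisoformat(notice["time"].replace("Z", "+00:00"))
    now = now or datetime.now(timezone.utc)
    return (when - now).total_seconds()

# A real handler would poll the metadata endpoint, then use the remaining
# time to checkpoint state, drain traffic, and deregister from load balancers.
remaining = seconds_until_interruption(
    SAMPLE_NOTICE, now=datetime(2030, 1, 1, 11, 58, tzinfo=timezone.utc)
)
```

Two minutes is enough for checkpointing and connection draining, but not for finishing long-running jobs, which is why spot-ready workloads need to be interruptible or resumable in the first place.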
Luckily, you can use intelligent cloud optimization tools to get your hands on the best instances and avoid locking yourself into a long-term expensive commitment.
Source: https://dzone.com/articles/400-ec2-instance-types-the-good-the-bad-and-the-ug