ITNEXT

ITNEXT is a platform for IT developers & software engineers to share knowledge, connect, collaborate, learn and experience next-gen technologies.

Running resilient workloads in EKS using Spot instances

Matías Costa
Published in ITNEXT
7 min read · Nov 8, 2022

Kubernetes and Spot instances

Spot Instances Overview

A Spot Instance is an instance that uses spare EC2 capacity, offered at up to 90% below the On-Demand price. That makes it a very cost-efficient option, but it comes with a downside: Amazon EC2 can interrupt a Spot Instance, in what is called a Spot Instance interruption. The following are the possible reasons that Amazon EC2 might interrupt your Spot Instances:

  • Price: The Spot price is greater than your maximum price.
  • Capacity: Amazon EC2 can interrupt your Spot Instance when it needs the capacity back. EC2 reclaims your instance mainly to repurpose capacity, but an interruption can also occur for other reasons, such as host maintenance or hardware decommissioning.
  • Constraints: If your Spot request includes a constraint such as a launch group or an Availability Zone group, the Spot Instances are terminated as a group when the constraint can no longer be met.

You can see the historical interruption rates for your instance type in the Spot Instance Advisor.

If a Spot Instance is stopped, hibernated, or terminated, you can use CloudTrail to see whether Amazon EC2 interrupted the Spot Instance. In AWS CloudTrail, the event name BidEvictedEvent indicates that Amazon EC2 interrupted the Spot Instance. To view BidEvictedEvent events in CloudTrail:

  1. Open the CloudTrail console.
  2. In the navigation pane, choose Event history.
  3. In the filter drop-down, choose Event name, and then in the filter field to the right, enter BidEvictedEvent.
  4. Choose BidEvictedEvent in the resulting list to view its details. Under Event record, you can find the instance ID.
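
The console steps above can also be approximated in code. The following is a minimal sketch, not a definitive implementation: it assumes boto3 and the CloudTrail `lookup_events` API, and it assumes the instance IDs sit under `serviceEventDetails.instanceIdSet` in the event record, so check a real record from your account before relying on it. The function name `interrupted_instance_ids` is made up for this example.

```python
import json


def interrupted_instance_ids(cloudtrail_client):
    """Collect instance IDs from BidEvictedEvent records in CloudTrail event history.

    `cloudtrail_client` is expected to behave like boto3.client("cloudtrail").
    """
    instance_ids = []
    kwargs = {
        "LookupAttributes": [
            {"AttributeKey": "EventName", "AttributeValue": "BidEvictedEvent"}
        ]
    }
    while True:
        page = cloudtrail_client.lookup_events(**kwargs)
        for event in page.get("Events", []):
            # The full event record is returned as a JSON string.
            record = json.loads(event["CloudTrailEvent"])
            # Assumption: interrupted instances are listed under this key.
            details = record.get("serviceEventDetails") or {}
            instance_ids.extend(details.get("instanceIdSet", []))
        token = page.get("NextToken")
        if not token:
            return instance_ids
        kwargs["NextToken"] = token  # fetch the next page of results


# Usage (needs boto3, AWS credentials and a configured region):
#   import boto3
#   print(interrupted_instance_ids(boto3.client("cloudtrail")))
```

The client is passed in as a parameter so the parsing logic can be exercised with a stub, without calling AWS.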

Spot Instances are usually suitable for batch jobs and for stateless, fault-tolerant applications that can checkpoint their progress and continue after an interruption.
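
To make "checkpoint and continue" concrete, here is a minimal sketch of a batch job that records its progress after every item and resumes from that point when restarted. It is only an illustration: the names `run_batch`, `load_checkpoint` and `save_checkpoint` are made up for this example, and it assumes the checkpoint file lives on storage that outlives the interrupted instance (for example a network volume).

```python
import json
import os
import tempfile


def load_checkpoint(path):
    """Return the index of the next item to process (0 if no checkpoint exists)."""
    try:
        with open(path) as fh:
            return json.load(fh)["next_index"]
    except FileNotFoundError:
        return 0


def save_checkpoint(path, next_index):
    """Write the checkpoint to a temp file, then rename it over the old one.

    The rename is atomic, so an interruption mid-write cannot leave a
    half-written checkpoint behind.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as fh:
        json.dump({"next_index": next_index}, fh)
    os.replace(tmp_path, path)


def run_batch(items, handle, checkpoint_path):
    """Process items in order, skipping those finished before an interruption."""
    start = load_checkpoint(checkpoint_path)
    for index in range(start, len(items)):
        handle(items[index])
        save_checkpoint(checkpoint_path, index + 1)
```

If the instance is interrupted after `handle` has run but before the checkpoint is saved, that item is processed again on restart, so `handle` should be idempotent.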

Considerations when using Spot Instances

At giffgaff we run all our applications in an EKS cluster that uses 100% Spot Instances. We make use of Spot and the Ocean Clusters feature.


Written by Matías Costa

SRE engineer | Technology enthusiast | Learning & Sharing
