Building the CI/CD of the Future, Adding the Cluster Autoscaler
In this tutorial, I share my experience as a DevOps engineer. This is the third post of the tutorial, in which I describe how to add Cluster Autoscaler to the EKS cluster we created in the previous post.
Building the CI/CD of the Future published posts:
- Introduction
- Creating the VPC for EKS cluster
- Creating the EKS cluster
- Adding the Cluster Autoscaler
- Add Ingress Nginx and Cert-Manager
- Install and configure Jenkins
- Create your first pipeline
Let’s start.
What is Cluster Autoscaler?
Cluster Autoscaler is a tool that automatically adjusts the size of the Kubernetes cluster when one of the following conditions is true:
- there are pods that failed to run in the cluster due to insufficient resources
- there are nodes in the cluster that have been underutilized for an extended period of time and their pods can be placed on other existing nodes
Cluster Autoscaler adjusts the number of nodes in the ‘ng-spot’ node group according to the load on the cluster by changing the desired capacity of an AWS Auto Scaling group. To do that, Cluster Autoscaler needs certain IAM permissions on this node group’s role, so let’s add them.
IAM Policy for Cluster Autoscaler
The Cluster Autoscaler requires the following IAM permissions to change the desired capacity of the autoscaling group.
Save it as ‘eks-ca-asg-policy.json’
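The policy listing itself is not reproduced above, so here is a minimal sketch of what the file could contain. It assumes the set of autoscaling actions commonly documented for Cluster Autoscaler on AWS; the exact list in the original post may differ, and newer Cluster Autoscaler versions may need additional permissions.

```shell
# Write the policy document to eks-ca-asg-policy.json.
# Assumption: the action list follows the permissions commonly documented for
# Cluster Autoscaler on AWS; it is not copied from the original post.
cat > eks-ca-asg-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "autoscaling:DescribeAutoScalingGroups",
        "autoscaling:DescribeAutoScalingInstances",
        "autoscaling:DescribeLaunchConfigurations",
        "autoscaling:DescribeTags",
        "autoscaling:SetDesiredCapacity",
        "autoscaling:TerminateInstanceInAutoScalingGroup"
      ],
      "Resource": "*"
    }
  ]
}
EOF
```

The two actions that actually change the cluster size are SetDesiredCapacity and TerminateInstanceInAutoScalingGroup; the Describe actions only let Cluster Autoscaler read the current state of the group.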
Let’s apply the policy above, but first, we need to find the name of the IAM role used by the ‘ng-spot’ node group.
In the AWS console, go to Services -> CloudFormation -> Stacks
Go to ‘eksctl-eks-ci-cd-nodegroup-ng-spot’ stack -> Resources
The Physical ID of ‘NodeInstanceRole’ is what we need: ‘eksctl-eks-ci-cd-nodegroup-ng-spo-NodeInstanceRole-11M42X5PXCIK3’
Applying the policy
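One way to apply it is to attach the file as an inline policy on the node instance role with the AWS CLI. This is a sketch, not necessarily the exact command from the original post: the policy name ‘ASG-Policy-For-Worker’ and the helper name apply_ca_policy are illustrative assumptions.

```shell
# Attach eks-ca-asg-policy.json as an inline policy on a node instance role.
# POLICY_NAME is an assumed, illustrative name; any valid policy name works.
POLICY_NAME="${POLICY_NAME:-ASG-Policy-For-Worker}"

apply_ca_policy() {
  # $1: physical ID of NodeInstanceRole taken from the CloudFormation stack
  aws iam put-role-policy \
    --role-name "${1:?role name required}" \
    --policy-name "$POLICY_NAME" \
    --policy-document file://eks-ca-asg-policy.json
}

# Example call with the role found above:
# apply_ca_policy eksctl-eks-ci-cd-nodegroup-ng-spo-NodeInstanceRole-11M42X5PXCIK3
```

Wrapping the command in a small function makes it easy to run it again for a second role.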
Checking the policy applied successfully
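A quick way to check is to read the inline policy back from the role. This sketch assumes the policy was attached under the illustrative name ‘ASG-Policy-For-Worker’; pass your own name as the second argument if you used a different one.

```shell
check_ca_policy() {
  # $1: node instance role name
  # $2: inline policy name (assumed default: ASG-Policy-For-Worker)
  aws iam get-role-policy \
    --role-name "${1:?role name required}" \
    --policy-name "${2:-ASG-Policy-For-Worker}"
}

# Example call:
# check_ca_policy eksctl-eks-ci-cd-nodegroup-ng-spo-NodeInstanceRole-11M42X5PXCIK3
```

If the policy is attached, the command prints the role name, the policy name and the policy document; otherwise the AWS CLI reports that the policy cannot be found.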
Please apply this policy to the ‘ng-static’ node group as well (although it is not managed by CA); otherwise, you will see a lot of errors in Cluster Autoscaler’s logs, or the CA pod will end up in an error status.
Deploy Cluster Autoscaler
To deploy Cluster Autoscaler I will use cluster-autoscaler-one-asg.yaml from here
It’s the deployment of Cluster Autoscaler for a single Auto Scaling group only, which is exactly what we need in our case.
You can see other examples here
We need to modify this file before applying it.
In the command section, change the --nodes line so that it references the real name of the Auto Scaling group behind your ‘ng-spot’ node group.
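The flag has the form --nodes=<min>:<max>:<Auto Scaling group name>. You can edit the line by hand, or use a small helper like the sketch below. The helper name set_nodes_flag and the output file name are hypothetical, and the min/max values are only examples.

```shell
# Rewrite the --nodes flag of a manifest read from stdin and print the result.
# Flag format: --nodes=<min>:<max>:<Auto Scaling group name>
set_nodes_flag() {
  # $1: minimum node count, $2: maximum node count, $3: Auto Scaling group name
  sed "s|--nodes=.*|--nodes=$1:$2:$3|"
}

# Example (ASG_NAME is a placeholder for the real group name):
# set_nodes_flag 1 10 "$ASG_NAME" < cluster-autoscaler-one-asg.yaml > cluster-autoscaler.yaml
```

The group name can be looked up in the EC2 console under Auto Scaling Groups, or among the resources of the node group’s CloudFormation stack.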
Deploy
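Deploying comes down to applying the edited manifest with kubectl. A minimal sketch, assuming the file kept its original name and kubectl already points at the EKS cluster:

```shell
deploy_cluster_autoscaler() {
  # $1: path to the edited manifest (defaults to the upstream file name)
  kubectl apply -f "${1:-cluster-autoscaler-one-asg.yaml}"
}

# Example call:
# deploy_cluster_autoscaler
```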
Troubleshooting logs of Cluster Autoscaler
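To see what Cluster Autoscaler is doing, you can follow the logs of its deployment. This sketch assumes the defaults of the upstream example manifest, which creates a Deployment named cluster-autoscaler in the kube-system namespace; adjust the names if you changed them.

```shell
ca_logs() {
  # Stream the Cluster Autoscaler logs until interrupted with Ctrl+C.
  kubectl logs -f deployment/cluster-autoscaler -n kube-system
}

# Example call:
# ca_logs
```

Permission problems with the IAM policy usually show up here as access-denied errors from the AWS API.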
Testing Scaling
To test scaling we will deploy 100 Nginx pods
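One simple way to do this is to create an Nginx deployment and scale it up. This is a sketch rather than the original manifest: the deployment name ‘nginx-scale’ is hypothetical, and whether 100 pods are enough to trigger a scale-up depends on your instance types and per-node pod limits.

```shell
scale_test_up() {
  # $1: number of replicas (defaults to 100)
  # "nginx-scale" is a hypothetical deployment name used only for this test.
  kubectl create deployment nginx-scale --image=nginx &&
    kubectl scale deployment nginx-scale --replicas="${1:-100}"
}

# Example call:
# scale_test_up 100
```

Pods that do not fit on the existing nodes stay in Pending state, which is the signal Cluster Autoscaler reacts to.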
In a couple of minutes, you will see new spot instances provisioned.
Let’s delete those pods now
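Assuming the test pods were created through a Deployment (the name ‘nginx-scale’ below is a hypothetical default), deleting the deployment removes all of them at once:

```shell
scale_test_down() {
  # $1: name of the test deployment (hypothetical default: nginx-scale)
  kubectl delete deployment "${1:-nginx-scale}"
}

# Example call:
# scale_test_down
```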
It will take about 10 minutes for Cluster Autoscaler to terminate the unneeded nodes.
Run Windows workflows on EKS
To use Windows workflows in our cluster, we can add an additional node group to the eks-cluster config:
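The original config entry is not reproduced here, so below is a rough sketch of what such a node group could look like in an eksctl cluster config. Everything in it is an assumption to adapt: the group name ‘ng-windows’, the sizes, the instance types and the amiFamily value (eksctl names its Windows Server 2019 families with a FullContainer or CoreContainer suffix).

```shell
# Write a node group entry that can be pasted under "nodeGroups:" in the
# eksctl cluster config. All values are illustrative assumptions.
cat > ng-windows.yaml <<'EOF'
  - name: ng-windows
    amiFamily: WindowsServer2019FullContainer
    minSize: 1
    maxSize: 5
    desiredCapacity: 1
    instancesDistribution:
      instanceTypes: ["m5.large", "m5a.large"]
      onDemandBaseCapacity: 0
      onDemandPercentageAboveBaseCapacity: 0
      spotInstancePools: 2
EOF
```

Setting both on-demand values to 0 asks the Auto Scaling group to run the whole node group on spot instances.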
This node group will also run on spot instances and will use EC2 instances with the WindowsServer2019 AMI. You can run pods (containers) with Windows-based images only on Windows worker nodes.
You need to be careful and use a node selector so that your pods are scheduled on the correct instances.
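As a minimal sketch of the node selector, the manifest below pins a pod to Windows nodes through the well-known kubernetes.io/os label. The pod name, file name and container image are illustrative assumptions; older clusters may expose the label as beta.kubernetes.io/os instead.

```shell
# Write a pod manifest that can only be scheduled on Windows worker nodes.
# Pod name and image are illustrative assumptions.
cat > windows-pod.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: windows-test
spec:
  nodeSelector:
    kubernetes.io/os: windows
  containers:
    - name: servercore
      image: mcr.microsoft.com/windows/servercore:ltsc2019
      command: ["powershell.exe", "-Command", "Start-Sleep -Seconds 3600"]
EOF

# Apply it with:
# kubectl apply -f windows-pod.yaml
```

Linux workloads need the opposite selector (kubernetes.io/os: linux) so that they never land on the Windows nodes.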
If you want to understand how to properly run Windows workflows on your EKS cluster, please read my article
In the next post, I will explain how to install and configure ‘Ingress Nginx’ and ‘certificate manager’ on your cluster.
Please subscribe to my YT channel
Please follow me on Twitter (@warolv)
I will save all the configuration created in this tutorial in my GitHub