Horizontal Pod Autoscaler (HPA) Demo¶
Now that you have an understanding of how HPA works, let's see it in action.
Docker Images¶
Here is the Docker image used in this tutorial: reyanshkharga/nodeapp:v1
Note

reyanshkharga/nodeapp:v1 runs on port 5000 and has the following routes:

- GET /: Returns host info and app version
- GET /health: Returns health status of the app
- GET /random: Returns a randomly generated number between 1 and 10
Objective¶
We'll follow these steps to test the Horizontal Pod Autoscaler (HPA):
- We'll create a `Deployment` and a `Service` object.
- We'll create a `HorizontalPodAutoscaler` object for the deployment.
- We'll generate load on the pods managed by the deployment.
- We'll observe the HPA taking autoscaling actions to meet the increased demand.
Let's see this in action!
Step 1: Create a Deployment¶
First, let's create a deployment as follows:
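The deployment manifest itself isn't included in this extract. A minimal sketch consistent with the tutorial (the image reyanshkharga/nodeapp:v1 listening on port 5000, and the 100m CPU / 128Mi memory limits noted below) might look like this; the name my-deployment and the label app: nodeapp are assumptions:

```yaml
# deployment.yml (hypothetical filename)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-deployment        # assumed name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nodeapp           # assumed label
  template:
    metadata:
      labels:
        app: nodeapp
    spec:
      containers:
        - name: nodeapp
          image: reyanshkharga/nodeapp:v1
          ports:
            - containerPort: 5000   # the app runs on port 5000
          resources:
            requests:
              cpu: 100m
              memory: 128Mi
            limits:
              cpu: 100m            # max CPU per pod, as noted below
              memory: 128Mi        # max memory per pod, as noted below
```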
Apply the manifest to create the deployment:
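Assuming the manifest is saved as deployment.yml (the filename is an assumption):

```shell
kubectl apply -f deployment.yml
```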
Verify deployment and pods:
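For example:

```shell
# Check the deployment and the pods it manages
kubectl get deployments
kubectl get pods
```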
Note that each pod can consume a maximum of 100m CPU and 128Mi of memory.
Step 2: Create a Service¶
Next, let's create a LoadBalancer service as follows:
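The service manifest isn't shown in this extract either; a sketch that fronts the pods created above (the name my-service and the port mapping are assumptions) could be:

```yaml
# service.yml (hypothetical filename)
apiVersion: v1
kind: Service
metadata:
  name: my-service           # assumed name
spec:
  type: LoadBalancer
  selector:
    app: nodeapp             # assumed label; must match the deployment's pods
  ports:
    - port: 80               # port exposed by the load balancer
      targetPort: 5000       # the app listens on 5000
```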
Apply the manifest to create the service:
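Assuming the manifest is saved as service.yml (the filename is an assumption):

```shell
kubectl apply -f service.yml
```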
Verify service:
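For example:

```shell
# The EXTERNAL-IP column shows the load balancer's address once it is provisioned
kubectl get svc
```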
Step 3: Create HPA for the Deployment¶
Now, let's create an HPA for the deployment as follows:
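The HPA manifest isn't included in this extract. Below is a sketch using the autoscaling/v2 API: the name my-hpa matches the `kubectl describe` command used later, but the replica bounds and the 70% utilization targets are assumptions. A memory metric is included alongside CPU because the tutorial refers to multiple thresholds, though the exact targets are not shown:

```yaml
# hpa.yml (hypothetical filename)
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-hpa               # matches the name used with `kubectl describe` below
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-deployment      # assumed deployment name
  minReplicas: 1             # assumed bounds
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # assumed target (percentage of request)
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 70   # assumed target (percentage of request)
```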
Apply the manifest to create the HPA:
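Assuming the manifest is saved as hpa.yml (the filename is an assumption):

```shell
kubectl apply -f hpa.yml
```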
Verify HPA:
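For example:

```shell
# TARGETS shows current vs. target utilization once metrics are available
kubectl get hpa
```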
Step 4: Generate Load¶
Let's generate load on the pods managed by the deployment. On your local machine run the following command to generate the load:
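The original load-generation command isn't shown in this extract. One way to approximate it with standard tools (100 parallel curl processes; replace <load-balancer-dns> with your service's EXTERNAL-IP or hostname, which is left as a placeholder here) is:

```shell
# Hypothetical load generator; substitute your LoadBalancer address.
seq 1 100000 | xargs -P 100 -I {} curl -s -o /dev/null "http://<load-balancer-dns>/"
```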
The above command sends about 1000 requests per second to the LoadBalancer service using 100 parallel processes.
Step 5: Monitor Pods and HPA Events¶
```shell
# List pods in watch mode
kubectl get pods -w

# List hpa in watch mode
kubectl get hpa -w

# View hpa events
kubectl describe hpa my-hpa
```
You'll notice that as soon as any of the defined thresholds is crossed, the HPA scales the number of replicas to bring resource utilization back within the defined thresholds.
Here's a sample of the HPA's conditions and events:

```
Conditions:
  Type            Status  Reason              Message
  ----            ------  ------              -------
  AbleToScale     True    SucceededRescale    the HPA controller was able to update the target scale to 2
  ScalingActive   True    ValidMetricFound    the HPA was able to successfully calculate a replica count from cpu resource utilization (percentage of request)
  ScalingLimited  False   DesiredWithinRange  the desired count is within the acceptable range

Events:
  Type    Reason             Age  From                       Message
  ----    ------             ---- ----                       -------
  Normal  SuccessfulRescale  5s   horizontal-pod-autoscaler  New size: 2; reason: cpu resource utilization (percentage of request) above target
```
Clean Up¶
Assuming your folder structure looks like the one below:
Let's delete all the resources we created:
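The exact filenames depend on your folder structure, which isn't shown in this extract; assuming the manifests are named deployment.yml, service.yml, and hpa.yml (names assumed), the resources can be removed with:

```shell
kubectl delete -f hpa.yml
kubectl delete -f service.yml
kubectl delete -f deployment.yml
```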