Previous blog posts have highlighted the benefits of using Prometheus and Grafana for monitoring pod metrics in Kubernetes. Prometheus metrics can also drive the Horizontal Pod Autoscaler (HPA), which dynamically adjusts the number of pods in a deployment: adding compute power during peak usage and removing pods when demand drops to save on compute costs. By default, Kubernetes only supports scaling on CPU and memory, but with Prometheus you can scale on any metric that Prometheus tracks!
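Bridging Prometheus and the HPA requires the Prometheus Adapter, which exposes Prometheus metrics through the Kubernetes custom metrics API. If you have not installed it yet, here is a minimal sketch using Helm and the prometheus-community chart; the release name, namespace, and Prometheus URL below are placeholders you will need to adjust for your cluster:

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm install prometheus-adapter prometheus-community/prometheus-adapter \
  --namespace monitoring \
  --set prometheus.url=http://prometheus-server.monitoring.svc \
  --set prometheus.port=80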
Once you have successfully installed the Prometheus Adapter, you can retrieve a wide array of metrics from your Kubernetes environment by querying the custom metrics API:
kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1 > metrics.json
The list is extensive, so I suggest saving the results to a file and pretty-printing the JSON so it is easier to read. One example of a metric you can scale on is the volume of network bytes a pod sends or receives; here is the entry for transmitted bytes:
"name": "pods/network_transmit_bytes",
"singularName": "",
"namespaced": true,
"kind": "MetricValueList",
"verbs": ["get"]
I suggest establishing a baseline for the metric and watching how it changes with load in your Grafana dashboards. Once you have that data, you can create a HorizontalPodAutoscaler resource with YAML like the following:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-deployment-name-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-deployment-name
  minReplicas: 2
  maxReplicas: 8
  metrics:
    - type: Pods
      pods:
        metric:
          name: network_receive_bytes
        target:
          type: AverageValue
          averageValue: 300M
The deployment named my-deployment-name will be scaled between 2 and 8 pods based on the average value of the network_receive_bytes metric across its pods. The target keeps each pod handling roughly 300M of received traffic: for instance, if total traffic across the deployment reaches 1200M, the HPA scales it to 4 pods (1200M / 300M).
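Applying the manifest and watching the autoscaler react is a quick way to confirm the metric is being picked up. A sketch, assuming the YAML above is saved as network-hpa.yaml (a placeholder filename):

kubectl apply -f network-hpa.yaml
kubectl get hpa my-deployment-name-hpa --watch
kubectl describe hpa my-deployment-name-hpa

The TARGETS column in the get output and the Events section in the describe output show whether the adapter is returning values for network_receive_bytes and how the replica count responds.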
There are many other metrics available for scaling. Explore how your application behaves under high traffic, monitor the metrics that best reflect that load, and use them to scale up for increased demand or scale down to reduce costs.