This blog post is also available in Spanish.

One of the star features of the Kubernetes world is, without a doubt, Horizontal Pod Autoscaling. It gives our service extra capacity at specific moments when the original sizing of our cluster may not be sufficient, avoiding degradation or even loss of service.

Kubernetes offers out-of-the-box autoscaling based on resource consumption, such as CPU and memory. A temporary load spike can increase the consumption of these resources in our pods and therefore trigger autoscaling, but resource metrics are not always a helpful signal. This is especially true of memory in Java applications, because of how the JVM manages memory. For these reasons, and from experience with customers, horizontal autoscaling is sometimes better driven by other kinds of metrics, such as the HTTP thread pool of the application server. When the system starts to come under stress, the first symptom is usually saturation of the application server's pools or, in the case of Liferay Portal / DXP, of the pool of connections to Elasticsearch.

What do I need to automatically scale my Liferay Portal / DXP based on a custom metric?

We need a tool such as JMX Exporter to choose which metrics to export from our Liferay Portal / DXP pods. My previous blog post, "Monitoring Liferay Portal / DXP in Kubernetes", explains how to configure it.
Once our Liferay Portal / DXP pods are exporting metrics, we need Prometheus to collect them. The same blog post, "Monitoring Liferay Portal / DXP in Kubernetes", describes in detail how to install it in our Kubernetes cluster.
With Prometheus installed, we need an implementation of the Kubernetes custom metrics API, so that the cluster has an endpoint exposing the custom metrics used for autoscaling. For this we will use Prometheus Adapter, which can be installed with Helm:

$ helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
$ helm repo update
$ helm install my-release prometheus-community/prometheus-adapter

The interesting parts to configure in Prometheus Adapter for our purpose are:

a) The endpoint and port where Prometheus is exposed: set them in the values of the chart.

b) The query that extracts the custom metrics from Prometheus and exposes them through the custom.metrics endpoint of Kubernetes: define it as a rule in the values of the chart. In our case, since we want to scale Liferay Portal / DXP according to the current size of the HTTP thread pool, the rule averages the thread pool series that each Liferay Portal / DXP Tomcat exposes through Prometheus:

- seriesQuery: 'tomcat_threadpool_currentthreadcount{kubernetes_namespace!="",kubernetes_pod_name!=""}'
  resources:
    overrides:
      kubernetes_namespace: {resource: "namespace"}
      kubernetes_pod_name: {resource: "pod"}
  name:
    matches: "^(.*)_currentthreadcount"
    as: "${1}_threadcount_avg"
  metricsQuery: 'avg(<<.Series>>{<<.LabelMatchers>>}) by (<<.GroupBy>>)'
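
As a minimal sketch of how points a) and b) might sit together in one Helm values file: the prometheus.url, prometheus.port and rules.custom keys are the ones the prometheus-adapter chart reads, while the file name values.yaml, the Service name prometheus-server, the namespace monitoring and the port 80 are assumptions that have to be replaced with the values of the actual Prometheus installation.

```yaml
# values.yaml (hypothetical file name) for prometheus-community/prometheus-adapter
prometheus:
  # Assumption: Prometheus is exposed by a Service called "prometheus-server"
  # in the "monitoring" namespace, listening on port 80. Adjust to your cluster.
  url: http://prometheus-server.monitoring.svc
  port: 80

rules:
  custom:
    # Select the Tomcat thread pool series that carry namespace and pod labels
    - seriesQuery: 'tomcat_threadpool_currentthreadcount{kubernetes_namespace!="",kubernetes_pod_name!=""}'
      resources:
        overrides:
          # Map the Prometheus labels to Kubernetes resources
          kubernetes_namespace: {resource: "namespace"}
          kubernetes_pod_name: {resource: "pod"}
      name:
        # tomcat_threadpool_currentthreadcount -> tomcat_threadpool_threadcount_avg
        matches: "^(.*)_currentthreadcount"
        as: "${1}_threadcount_avg"
      # Average the series per pod
      metricsQuery: 'avg(<<.Series>>{<<.LabelMatchers>>}) by (<<.GroupBy>>)'
```

Under those assumptions, the file could be applied with helm upgrade --install my-release prometheus-community/prometheus-adapter -f values.yaml.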

Once Prometheus Adapter is deployed and configured, we can check that our metric is being exported through the custom.metrics endpoint with:
kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1 | jq .
The last step is to configure a HorizontalPodAutoscaler that targets our Liferay Portal / DXP Deployment, so that it scales up when the metric reaches the desired threshold (and scales down when it drops):

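# Note: autoscaling/v2beta1 was removed in Kubernetes 1.25; newer clusters need autoscaling/v2
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: liferay-autoscaler
  namespace: liferay
spec:
  # The Deployment whose replica count the HPA controls
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: liferay
  minReplicas: 2
  maxReplicas: 5
  metrics:
  # Scale on the per-pod average of our custom metric
  - type: Pods
    pods:
      metricName: tomcat_threadpool_threadcount_avg
      targetAverageValue: "100"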

With this manifest, the HPA uses our "tomcat_threadpool_threadcount_avg" metric, exported through the custom.metrics endpoint. When its average across the pods exceeds 100 (half of 200, the default maximum of the Tomcat thread pool), Kubernetes increases the number of Deployment replicas, up to a maximum of 5.
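
To make the threshold concrete, this is the replica formula given in the Kubernetes documentation for the HPA, followed by a worked example. The numbers in the example (2 replicas averaging 150 threads) are illustrative assumptions, not measurements from this setup.

```latex
\text{desiredReplicas} = \left\lceil \text{currentReplicas} \times \frac{\text{currentMetricValue}}{\text{desiredMetricValue}} \right\rceil

% Illustrative numbers (assumed): 2 replicas averaging 150 threads, target 100
\left\lceil 2 \times \frac{150}{100} \right\rceil = 3
```

The HPA also applies a tolerance, 10% by default, so an average of 105 threads (a ratio of 1.05) would not trigger any change.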
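Since Kubernetes 1.18, the behavior of the HPA can also be configured: the scaleUp and scaleDown policies and the stabilization window for each. The behavior field goes under spec and requires the autoscaling/v2beta2 API (or autoscaling/v2, from Kubernetes 1.23); it is not part of autoscaling/v2beta1: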

behavior:
  scaleDown:
    stabilizationWindowSeconds: 300
    policies:
    - type: Percent
      value: 100
      periodSeconds: 15
  scaleUp:
    stabilizationWindowSeconds: 0
    policies:
    - type: Percent
      value: 100
      periodSeconds: 15
    - type: Pods
      value: 2
      periodSeconds: 15
    selectPolicy: Max

With the above configuration, scaling down uses a stabilization window of 5 minutes. A single scale-down policy is configured, which allows all the additional replicas to be removed at once (100% every 15 seconds, never going below minReplicas).
No stabilization window is configured for scaling up (0 seconds): when the metric reaches the threshold, the number of replicas is increased immediately. Two policies are configured, adding either 2 pods or 100% of the currently running replicas every 15 seconds. Since selectPolicy is Max, the policy that allows the larger change applies, until the HPA reaches a stable state again.
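
Putting the metric and the behavior together, the following is a sketch of what the complete manifest might look like on a cluster that serves autoscaling/v2 (Kubernetes 1.23 or later, which is an assumption about the target cluster). In that API version the metricName and targetAverageValue fields become metric.name and target.averageValue; every value is taken from the snippets above.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: liferay-autoscaler
  namespace: liferay
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: liferay
  minReplicas: 2
  maxReplicas: 5
  metrics:
  - type: Pods
    pods:
      metric:
        # Name exposed by the Prometheus Adapter rule
        name: tomcat_threadpool_threadcount_avg
      target:
        # Compare the per-pod average against 100 threads
        type: AverageValue
        averageValue: "100"
  behavior:
    scaleDown:
      # Wait 5 minutes of lower load before removing replicas
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 100
        periodSeconds: 15
    scaleUp:
      # React immediately when the threshold is exceeded
      stabilizationWindowSeconds: 0
      policies:
      - type: Percent
        value: 100
        periodSeconds: 15
      - type: Pods
        value: 2
        periodSeconds: 15
      # Use whichever policy allows the larger increase
      selectPolicy: Max
```

Saved to a file (for example liferay-hpa.yaml, a hypothetical name), it could be applied with kubectl apply -f liferay-hpa.yaml.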

Running kubectl describe hpa -n liferay, we can check the status of our new HPA:

Testing our new HorizontalPodAutoscaler

With JMeter, we will inject load into our Liferay Portal/DXP cluster to test autoscaling. We can use our Prometheus instance to monitor the threadpool of our pods while injecting load:

Once the threshold is reached, we see how the number of Liferay Portal / DXP instances increases:

Five minutes after the load stops, we can see how the additional instances are removed, leaving our Liferay Portal / DXP cluster with its minimum of 2 replicas:

Conclusion

Autoscaling is a valuable feature of a Kubernetes infrastructure, but it has to be adapted to the needs of each project. The choice of which metric to scale on has to be analyzed to check that it fits how we use Liferay Portal / DXP. Then, with the help of performance tests, the scaleUp and scaleDown behaviors and the thresholds can be tuned to best benefit our solution built on Liferay Portal / DXP and Kubernetes.