Health-based traffic control with Kubernetes

Last time we covered how the liveness probe can be integrated with Spring Boot Actuator. Today, I’m going to show an example application for the readiness probe.

Readiness probe

The readiness probe is similar to the liveness probe, but instead of deciding whether the application needs a restart, it determines whether the running application is allowed to serve traffic. Think about the case when the application starts up – so the liveness probe says it’s all good – but before it can really respond to requests, it has to process a huge file, fill up its caches from the database or contact an external service. In this case you don’t want the application to be restarted by the liveness probe; you want Kubernetes to hold back the traffic until it’s fully operational.

Another scenario is when the app has background processing responsibilities on top of a normal HTTP API. If it gets overloaded with the background work, it might not have enough resources left to reply to HTTP requests, which matters when response time is crucial. With the readiness probe you can implement this so that when the application lacks the necessary resources, no traffic is sent to it until it frees up.

Configuring this type of probe is almost identical to configuring the other probes; the only difference is the name, readinessProbe.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: actuator-healthcheck-example
  labels:
    app: actuator-healthcheck-example
spec:
  replicas: 1
  selector:
    matchLabels:
      app: actuator-healthcheck-example
  template:
    metadata:
      labels:
        app: actuator-healthcheck-example
    spec:
      containers:
        - name: actuator-healthcheck-example
          image: actuator-healthcheck-example:latest
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 8080
          readinessProbe:
            httpGet:
              path: /ready
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 5
            failureThreshold: 1

I’m not going to go through all the probe settings here; they work the same way as for a liveness probe.
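
For a quick reference though, these are the settings you’ll most commonly touch, with the Kubernetes defaults noted in the comments:

readinessProbe:
  httpGet:
    path: /ready
    port: 8080
  initialDelaySeconds: 5   # wait before the first check (default: 0)
  periodSeconds: 5         # how often to check (default: 10)
  timeoutSeconds: 1        # per-check timeout (default: 1)
  successThreshold: 1      # consecutive successes needed to mark ready (default: 1)
  failureThreshold: 1      # consecutive failures needed to mark not ready (default: 3)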

Example – startup

I’m going to extend the example I’ve shown in the previous article, so if you’re missing the context, make sure you check it here.

Moving back to writing code, we are going to simulate the case when the application has to load something at startup that takes several seconds.

First of all, we need a state holder that tracks whether the application is ready to serve traffic or not.

@Component
public class ReadinessHolder {
    private final AtomicBoolean isReady = new AtomicBoolean(false);

    public boolean isReady() {
        return isReady.get();
    }
}

Now for the startup load simulation. I’m going to use the TaskExecutor interface from Spring to execute an asynchronous task that sets the isReady attribute to true after 20 seconds. (Spring Boot auto-configures a ThreadPoolTaskExecutor bean since version 2.1, so the @Autowired TaskExecutor resolves out of the box.) The implementation looks like this:

@Component
public class ReadinessHolder {
    private static final Logger logger = LoggerFactory.getLogger(ReadinessHolder.class);

    @Autowired
    private TaskExecutor taskExecutor;

    private final AtomicBoolean isReady = new AtomicBoolean(false);

    @PostConstruct
    public void init() {
        // Simulate a slow startup task: flip the flag to ready after 20 seconds
        taskExecutor.execute(() -> {
            try {
                logger.info("Sleeping for 20 seconds..");
                Thread.sleep(TimeUnit.SECONDS.toMillis(20));
                isReady.set(true);
                logger.info("Application is ready to serve traffic");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
    }

    public boolean isReady() {
        return isReady.get();
    }
}

In the task, I’m logging a simple message so we can verify it in the Kubernetes logs, then sleeping the thread for 20 seconds, and after that setting the isReady state to true.

Next up, we need to expose this information over HTTP, so I’m creating a new controller:

@RestController
public class ReadinessRestController {
    @Autowired
    private ReadinessHolder readinessHolder;

    @GetMapping(value = "/ready", produces = MediaType.APPLICATION_JSON_VALUE)
    public ResponseEntity<String> isReady() {
        if (readinessHolder.isReady()) {
            return new ResponseEntity<>("{\"status\":\"READY\"}", HttpStatus.OK);
        } else {
            return new ResponseEntity<>("{\"status\":\"NOT_READY\"}", HttpStatus.SERVICE_UNAVAILABLE);
        }
    }
}

It’s a single GET endpoint that reads the state from the holder and, depending on the value, responds with either HTTP 200

{
    "status":"READY"
}

or HTTP 503

{
    "status":"NOT_READY"
}
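
If you want to sanity-check the endpoint before containerizing anything, you can also run the application locally and watch it flip, assuming Spring Boot’s default port 8080:

$ curl localhost:8080/ready
{"status":"NOT_READY"}
$ # roughly 20 seconds after startup
$ curl localhost:8080/ready
{"status":"READY"}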

Now the deployment descriptor:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: actuator-healthcheck-example
  labels:
    app: actuator-healthcheck-example
spec:
  replicas: 1
  selector:
    matchLabels:
      app: actuator-healthcheck-example
  template:
    metadata:
      labels:
        app: actuator-healthcheck-example
    spec:
      containers:
        - name: actuator-healthcheck-example
          image: actuator-healthcheck-example:latest
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 8080
          livenessProbe:
            httpGet:
              path: /actuator/health
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 10
            failureThreshold: 2
          readinessProbe:
            httpGet:
              path: /ready
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 5
            failureThreshold: 1
---
apiVersion: v1
kind: Service
metadata:
  name: actuator-healthcheck-example-svc
  labels:
    app: actuator-healthcheck-example
spec:
  type: NodePort
  ports:
    - port: 8080
      targetPort: 8080
      nodePort: 31704
  selector:
    app: actuator-healthcheck-example

There are 2 changes I’ve made compared to the previous article. On one hand, I’ve added the readinessProbe, mapped to the /ready endpoint we created. The other one is the Service descriptor: I’ve changed its type to NodePort so it’s easier to access for the test, though you can use the original descriptor if you want. NodePort simply means that the API can be accessed directly through the Kubernetes node. For minikube, you can use minikube ip to get the address, and then http://<ip>:31704 will be the root of the API.
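
For example (the address is machine-specific, the one below is just an illustration):

$ minikube ip
192.168.99.100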

Next up, let’s deploy the application. It’s the usual exercise: building the jar, then the image, and applying the Kubernetes descriptor. Don’t forget to execute eval $(minikube docker-env) first if you are using minikube, so the image is built against minikube’s Docker daemon.

$ ./gradlew clean build
$ docker build . -t actuator-healthcheck-example
$ kubectl apply -f k8s-deployment.yaml

Observing the running pods:

$ kubectl get pods -w

The -w flag watches for changes. Inspecting the output:

$ kubectl get pods -w
NAME                                            READY   STATUS    RESTARTS   AGE
actuator-healthcheck-example-74bd59c574-92d7j   0/1     Running   0          3s
actuator-healthcheck-example-74bd59c574-92d7j   1/1     Running   0          29s
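
You can cross-check with the pod logs too; trimmed to the relevant lines (timestamps elided), you should see the messages coming from our ReadinessHolder:

$ kubectl logs actuator-healthcheck-example-74bd59c574-92d7j
... c.a.b.healthcheck.ReadinessHolder        : Sleeping for 20 seconds..
... c.a.b.healthcheck.ReadinessHolder        : Application is ready to serve traffic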

It’s clearly visible that after roughly 20 seconds, the application changed its ready state, just as we implemented it. While the pod is not ready, no requests will be served to it. So if you try to execute, for example, the following command during startup:

$ curl <ip>:31704/actuator/health
curl: (7) Failed to connect to <ip> port 31704: Timed out

As soon as the readiness probe reports the pod as ready, executing the same command will result in a proper response:

$ curl <ip>:31704/actuator/health
{"status":"UP"}

That’s it. The readiness probe is working properly and it doesn’t let traffic go to the pod until it’s reported ready.

Example – background processing

The other application of the readiness probe is when the app is running low on resources, for example when background processing is part of the application and runs on threads from a thread pool. If the pool gets overloaded, the remaining resources might not be sufficient to serve HTTP requests in an acceptable manner.

I hope you didn’t expect me to give you a full-blown background processing engine that starves the compute power needed for an HTTP API. Rather, I’m just going to emulate the insufficient-resources state by setting a flag.

Compared to the previous example, we are making a single change for now: exposing an HTTP API to switch the ready flag manually.

The new holder class with the switchReady method:

@Component
public class ReadinessHolder {
    private static final Logger logger = LoggerFactory.getLogger(ReadinessHolder.class);

    @Autowired
    private TaskExecutor taskExecutor;

    private final AtomicBoolean isReady = new AtomicBoolean(false);

    @PostConstruct
    public void init() {
        // Simulate a slow startup task: flip the flag to ready after 20 seconds
        taskExecutor.execute(() -> {
            try {
                logger.info("Sleeping for 20 seconds..");
                Thread.sleep(TimeUnit.SECONDS.toMillis(20));
                isReady.set(true);
                logger.info("Application is ready to serve traffic");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
    }

    public boolean isReady() {
        return isReady.get();
    }

    public void switchReady() {
        boolean newReadyValue = !isReady.get();
        logger.info("Switching the ready flag to {}", newReadyValue);
        isReady.set(newReadyValue);
    }
}
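
A quick side note on switchReady: the read-then-set above is not atomic, which is perfectly fine for this single-user demo. If you wanted the toggle to be safe under concurrent calls, a compare-and-set loop is one way to sketch it:

public void switchReady() {
    boolean current;
    do {
        // Retry until our compare-and-set wins against concurrent updates
        current = isReady.get();
    } while (!isReady.compareAndSet(current, !current));
    logger.info("Switching the ready flag to {}", !current);
}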

And the controller with the new /readyswitch API:

@RestController
public class ReadinessRestController {
    private static final Logger logger = LoggerFactory.getLogger(ReadinessRestController.class);

    @Autowired
    private ReadinessHolder readinessHolder;

    @GetMapping(value = "/ready", produces = MediaType.APPLICATION_JSON_VALUE)
    public ResponseEntity<String> isReady() {
        if (readinessHolder.isReady()) {
            return new ResponseEntity<>("{\"status\":\"READY\"}", HttpStatus.OK);
        } else {
            return new ResponseEntity<>("{\"status\":\"NOT_READY\"}", HttpStatus.SERVICE_UNAVAILABLE);
        }
    }

    @GetMapping("/readyswitch")
    public ResponseEntity<?> readySwitch() {
        readinessHolder.switchReady();
        return new ResponseEntity<>(HttpStatus.OK);
    }
}

Let’s build the application again and deploy it. After the initial readiness probe lets traffic to the pod, we can simply flip the readiness flag so Kubernetes stops forwarding requests to the pod.
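
The commands are the same as before; the only extra thing to keep in mind is that with the latest tag and imagePullPolicy: IfNotPresent, applying an unchanged descriptor won’t recreate the pod, so restart the deployment to pick up the freshly built image:

$ ./gradlew clean build
$ docker build . -t actuator-healthcheck-example
$ kubectl rollout restart deployment actuator-healthcheck-example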

Verifying the API works after startup:

$ curl <ip>:31704/actuator/health
{"status":"UP"}

Switching the flag:

$ curl <ip>:31704/readyswitch

Verifying the API doesn’t respond anymore:

$ curl <ip>:31704/actuator/health
curl: (7) Failed to connect to <ip> port 31704: Timed out

And Kubernetes is showing the pod as not ready:

$ kubectl get pods
NAME                                            READY   STATUS    RESTARTS   AGE
actuator-healthcheck-example-74bd59c574-bblsl   0/1     Running   0          30m
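
If you’re curious about the details, kubectl describe lists the probe failures among the pod’s events; expect something along these lines in the output:

$ kubectl describe pod actuator-healthcheck-example-74bd59c574-bblsl
...
Warning  Unhealthy  ...  Readiness probe failed: HTTP probe failed with statuscode: 503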

Of course, switching it back through the exposed port is not possible anymore, as Kubernetes has stopped sending HTTP traffic to the pod. We can still exec into the container though and switch the flag back:

$ kubectl exec -it actuator-healthcheck-example-74bd59c574-bblsl -- bash
bash-4.4# curl localhost:8080/readyswitch

Exiting the container and checking what Kubernetes thinks about the pod:

$ kubectl get pods
NAME                                            READY   STATUS    RESTARTS   AGE
actuator-healthcheck-example-74bd59c574-bblsl   1/1     Running   0          35m

Now it’s back in operation and traffic is allowed again.

The real benefit kicks in when you are running the application in multiple instances. To demonstrate this, let’s create a dummy endpoint that logs a single message.

@RestController
public class DummyRestController {
    private static final Logger logger = LoggerFactory.getLogger(DummyRestController.class);

    @GetMapping(value = "/dummy", produces = MediaType.APPLICATION_JSON_VALUE)
    public ResponseEntity<String> dummy() {
        logger.info("Dummy call");
        return new ResponseEntity<>("{}", HttpStatus.OK);
    }
}

Making the application run in 2 instances needs only a small tweak, setting the replicas attribute to 2:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: actuator-healthcheck-example
  labels:
    app: actuator-healthcheck-example
spec:
  replicas: 2
  selector:
    matchLabels:
      app: actuator-healthcheck-example
  template:
    metadata:
      labels:
        app: actuator-healthcheck-example
    spec:
      containers:
        - name: actuator-healthcheck-example
          image: actuator-healthcheck-example:latest
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 8080
          livenessProbe:
            httpGet:
              path: /actuator/health
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 10
            failureThreshold: 2
          readinessProbe:
            httpGet:
              path: /ready
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 5
            failureThreshold: 1
---
apiVersion: v1
kind: Service
metadata:
  name: actuator-healthcheck-example-svc
  labels:
    app: actuator-healthcheck-example
spec:
  type: NodePort
  ports:
    - port: 8080
      targetPort: 8080
      nodePort: 31704
  selector:
    app: actuator-healthcheck-example
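
Apply the updated descriptor and, after the 20-second warm-up, both replicas should report ready:

$ kubectl apply -f k8s-deployment.yaml
$ kubectl get pods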

Open 3 terminals: 2 for monitoring the application logs of the 2 instances, and one for executing the requests.

For watching the logs continuously:

$ kubectl logs <podid> -f --tail=10

Execute this command against the 2 pods you have. Then call the /dummy API from the 3rd terminal.

$ curl <ip>:31704/dummy

Now Kubernetes is balancing the requests between the 2 replicas as they are both ready. You can see it in the logs as well:

2020-04-11 12:45:19.826  INFO 1 --- [nio-8080-exec-4] c.a.b.healthcheck.DummyRestController    : Dummy call

Sometimes the first pod serves the request, sometimes the other one does.
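
If you’d rather not fire the curls one by one, a small shell loop generates a quick burst of requests:

$ for i in $(seq 1 10); do curl -s <ip>:31704/dummy; echo; done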

And now the most exciting part: let’s switch one of the pods to not ready.

$ curl <ip>:31704/readyswitch

Triggering the dummy API now, every request is served by the remaining ready pod:

$ curl <ip>:31704/dummy
2020-04-11 12:48:55.554  INFO 1 --- [nio-8080-exec-9] c.a.b.healthcheck.DummyRestController    : Dummy call
2020-04-11 12:48:56.230  INFO 1 --- [nio-8080-exec-1] c.a.b.healthcheck.DummyRestController    : Dummy call
2020-04-11 12:48:56.743  INFO 1 --- [nio-8080-exec-3] c.a.b.healthcheck.DummyRestController    : Dummy call
2020-04-11 12:48:57.182  INFO 1 --- [nio-8080-exec-5] c.a.b.healthcheck.DummyRestController    : Dummy call
2020-04-11 12:48:57.520  INFO 1 --- [nio-8080-exec-7] c.a.b.healthcheck.DummyRestController    : Dummy call
2020-04-11 12:48:57.905  INFO 1 --- [nio-8080-exec-9] c.a.b.healthcheck.DummyRestController    : Dummy call
2020-04-11 12:48:58.298  INFO 1 --- [nio-8080-exec-1] c.a.b.healthcheck.DummyRestController    : Dummy call
$ kubectl get pods
NAME                                            READY   STATUS    RESTARTS   AGE
actuator-healthcheck-example-74bd59c574-fm7hx   0/1     Running   0          6m28s
actuator-healthcheck-example-74bd59c574-rxtbp   1/1     Running   0          6m28s
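
Under the hood, readiness drives the Service’s endpoint list: a not-ready pod’s address is removed from the endpoints, which is exactly why no traffic reaches it. You can observe this directly (the pod IP below is only an illustration):

$ kubectl get endpoints actuator-healthcheck-example-svc
NAME                               ENDPOINTS         AGE
actuator-healthcheck-example-svc   172.17.0.5:8080   6m45s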

If you switch the non-ready pod back (through kubectl exec again, since the Service no longer routes to it), it will start serving requests again.

Conclusion

We’ve checked 2 scenarios where the readiness probe is useful. The whole purpose of health checks is to create more resilient applications, and I encourage you to invest some time into doing them properly. It will definitely return the investment.

As usual, the code can be found on GitHub. If you liked the article, give it a thumbs up and share it. If you are interested in more, make sure you follow me on Twitter.
