- Smart healthchecks with Kubernetes and Spring Boot Actuator
- Health based traffic control with Kubernetes
Last time we covered how the liveness probe can be integrated with Spring Boot Actuator. Today, I’m going to show an example application for the readiness probe.
Readiness probe
The readiness probe is similar to the liveness probe, but it determines whether the running application is allowed to serve traffic. Think about the case when the application starts up: the liveness probe already reports that everything is fine, but before the application can really respond to requests it has to process a huge file, fill up its caches from the database or contact an external service. In this case you don't want the liveness probe to restart the application; you want Kubernetes to wait until it's fully operational.
Another scenario is when the app has background processing responsibilities on top of a normal HTTP API. If it gets overloaded with background work, it might not have enough resources left to reply to HTTP requests in time, which matters when response time is crucial. With the readiness probe you can cover this case as well: while the application lacks the necessary resources, no traffic is sent to it, and routing resumes once capacity frees up.
Configuring this type of probe is almost identical to the other probes; the only difference is the key name, readinessProbe.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: actuator-healthcheck-example
  labels:
    app: actuator-healthcheck-example
spec:
  replicas: 1
  selector:
    matchLabels:
      app: actuator-healthcheck-example
  template:
    metadata:
      labels:
        app: actuator-healthcheck-example
    spec:
      containers:
        - name: actuator-healthcheck-example
          image: actuator-healthcheck-example:latest
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 8080
          readinessProbe:
            httpGet:
              path: /ready
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 5
            failureThreshold: 1
I'm not going to go through all the probe settings; the same options apply as for the liveness probe.
Example – startup
I'm going to extend the example shown in the previous article, so if you are missing the context, make sure you check it here.
Moving back to writing code. We are going to simulate an application that has to load something at startup, which takes several seconds.
First of all, we need a state holder that tracks whether the application is ready to serve traffic or not.
@Component
public class ReadinessHolder {

    private AtomicBoolean isReady = new AtomicBoolean(false);

    public boolean isReady() {
        return isReady.get();
    }
}
Now the startup load simulation. I'm going to use the TaskExecutor interface from Spring to execute an asynchronous task that will set the isReady attribute to true after 20 seconds. The implementation looks like this:
@Component
public class ReadinessHolder {

    private static final Logger logger = LoggerFactory.getLogger(ReadinessHolder.class);

    @Autowired
    private TaskExecutor taskExecutor;

    private AtomicBoolean isReady = new AtomicBoolean(false);

    @PostConstruct
    public void init() {
        taskExecutor.execute(() -> {
            try {
                logger.info("Sleeping for 20 seconds..");
                Thread.sleep(TimeUnit.SECONDS.toMillis(20));
                isReady.set(true);
                logger.info("Application is ready to serve traffic");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
    }

    public boolean isReady() {
        return isReady.get();
    }
}
In the task, I'm logging a simple message first so we can verify it later in the Kubernetes logs, then sleeping the thread for 20 seconds, and finally setting the isReady state to true.
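A quick note on the executor: recent Spring Boot versions (2.1 and up) auto-configure a ThreadPoolTaskExecutor bean that can be injected as TaskExecutor, so the snippet above works out of the box. If your setup doesn't provide one, a minimal bean definition like the following is enough. This is only a sketch; the configuration class name, pool sizes and thread name prefix are arbitrary choices, not part of the example project:

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.task.TaskExecutor;
import org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor;

@Configuration
public class TaskExecutorConfig {

    // Only needed if no TaskExecutor bean is available in your application context.
    @Bean
    public TaskExecutor taskExecutor() {
        ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
        executor.setCorePoolSize(1);               // a single thread is enough for the startup task
        executor.setMaxPoolSize(2);                // arbitrary, adjust to your workload
        executor.setThreadNamePrefix("startup-");  // makes the task easy to spot in thread dumps
        executor.initialize();
        return executor;
    }
}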
Next up, we need to expose this information over HTTP, so I'm creating a new controller:
@RestController
public class ReadinessRestController {

    @Autowired
    private ReadinessHolder readinessHolder;

    @GetMapping(value = "/ready", produces = MediaType.APPLICATION_JSON_VALUE)
    public ResponseEntity<String> isReady() {
        if (readinessHolder.isReady()) {
            return new ResponseEntity<>("{\"status\":\"READY\"}", HttpStatus.OK);
        } else {
            return new ResponseEntity<>("{\"status\":\"NOT_READY\"}", HttpStatus.SERVICE_UNAVAILABLE);
        }
    }
}
A single GET endpoint that reads the state from the holder and, depending on the value, responds with either HTTP 200

{ "status":"READY" }

or HTTP 503

{ "status":"NOT_READY" }
Now the deployment descriptor:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: actuator-healthcheck-example
  labels:
    app: actuator-healthcheck-example
spec:
  replicas: 1
  selector:
    matchLabels:
      app: actuator-healthcheck-example
  template:
    metadata:
      labels:
        app: actuator-healthcheck-example
    spec:
      containers:
        - name: actuator-healthcheck-example
          image: actuator-healthcheck-example:latest
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 8080
          livenessProbe:
            httpGet:
              path: /actuator/health
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 10
            failureThreshold: 2
          readinessProbe:
            httpGet:
              path: /ready
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 5
            failureThreshold: 1
---
apiVersion: v1
kind: Service
metadata:
  name: actuator-healthcheck-example-svc
  labels:
    app: actuator-healthcheck-example
spec:
  type: NodePort
  ports:
    - port: 8080
      targetPort: 8080
      nodePort: 31704
  selector:
    app: actuator-healthcheck-example
There are 2 changes I've made compared to the previous article. On one hand, I've added the readinessProbe so it's mapped to the /ready endpoint we created. The other one is the Service descriptor: I've changed its type to NodePort so it's easier to access for the test, although you can use the original descriptor if you want. NodePort simply means that the API can be accessed directly through the Kubernetes node. For minikube, you can use minikube ip to get the address, and then http://<ip>:31704 will be the root of the API.
Next up, let's deploy the application. It's the usual exercise: building the jar, then the image, and applying the Kubernetes descriptor. Don't forget to execute eval $(minikube docker-env) first if you are using minikube.
$ ./gradlew clean build
$ docker build . -t actuator-healthcheck-example
$ kubectl apply -f k8s-deployment.yaml
Observing the running pods:
$ kubectl get pods -w
The -w flag watches for changes. Inspecting the output:
$ kubectl get pods -w
NAME                                            READY   STATUS    RESTARTS   AGE
actuator-healthcheck-example-74bd59c574-92d7j   0/1     Running   0          3s
actuator-healthcheck-example-74bd59c574-92d7j   1/1     Running   0          29s
It's clearly visible that the application changed its ready state shortly after the 20-second sleep, just like we implemented it; the pod only turns 1/1 Ready at the first successful readiness probe after the flag flips, which is around the 29-second mark here given the 5-second probe period. While the pod is not ready, no requests are served through the Service. So if you try to execute, for example, the following command during startup:
$ curl <ip>:31704/actuator/health
curl: (7) Failed to connect to <ip> port 31704: Timed out
As soon as the readiness probe reports the pod as ready, executing the same command results in a proper response:
$ curl <ip>:31704/actuator/health
{"status":"UP"}
That's it. The readiness probe is working properly and it doesn't let traffic reach the pod until it's reported ready.
Example – background processing
The other use case for the readiness probe is when the application is running low on resources, for example when background processing is part of the application and it runs on threads in a thread pool. If that pool gets overloaded, there might not be enough resources left to serve HTTP requests in an acceptable manner.
I hope you didn't expect me to give you a full-blown background processing engine that starves the compute power needed for the HTTP API. Rather, I'm just going to emulate the insufficient-resource state by setting a flag.
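Just to make the scenario a bit more concrete before we continue: in a real application, the readiness decision could be derived from the state of the worker pool rather than from a manual flag. The following is only a rough sketch of that idea; the executor bean, the queue-size threshold and the class name are assumptions, not part of the example project:

import java.util.concurrent.ThreadPoolExecutor;

import org.springframework.scheduling.concurrent.ThreadPoolTaskExecutor;
import org.springframework.stereotype.Component;

@Component
public class WorkerPoolReadinessCheck {

    // Hypothetical executor that runs the background jobs.
    private final ThreadPoolTaskExecutor backgroundExecutor;

    public WorkerPoolReadinessCheck(ThreadPoolTaskExecutor backgroundExecutor) {
        this.backgroundExecutor = backgroundExecutor;
    }

    // Report "not ready" once the backlog of queued background jobs grows too large.
    public boolean hasCapacityForHttpTraffic() {
        ThreadPoolExecutor pool = backgroundExecutor.getThreadPoolExecutor();
        int queuedJobs = pool.getQueue().size();
        return queuedJobs < 100; // arbitrary threshold for the sketch
    }
}

For the rest of the article, though, the manual flag keeps things simple.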
Compared to the previous example, we are making a single change for now: exposing an HTTP API to switch the ready flag manually.
The new holder class with the switchReady method:
@Component
public class ReadinessHolder {

    private static final Logger logger = LoggerFactory.getLogger(ReadinessHolder.class);

    @Autowired
    private TaskExecutor taskExecutor;

    private AtomicBoolean isReady = new AtomicBoolean(false);

    @PostConstruct
    public void init() {
        taskExecutor.execute(() -> {
            try {
                logger.info("Sleeping for 20 seconds..");
                Thread.sleep(TimeUnit.SECONDS.toMillis(20));
                isReady.set(true);
                logger.info("Application is ready to serve traffic");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
    }

    public boolean isReady() {
        return isReady.get();
    }

    public void switchReady() {
        boolean newReadyValue = !isReady.get();
        logger.info("Switching the ready flag to {}", newReadyValue);
        isReady.set(newReadyValue);
    }
}
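One small caveat about switchReady: reading the flag and then writing its negation is not an atomic operation. That's completely harmless for this manual demo, but if you ever need the toggle to be race-free, a compare-and-set loop does it. A minimal sketch of a drop-in replacement for the method above, reusing the existing isReady and logger fields:

public void switchReady() {
    boolean current;
    do {
        current = isReady.get();
        // retry if another thread flipped the flag between the read and the swap
    } while (!isReady.compareAndSet(current, !current));
    logger.info("Switching the ready flag to {}", !current);
}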
And the controller with the new /readyswitch API:
@RestController
public class ReadinessRestController {

    private static final Logger logger = LoggerFactory.getLogger(ReadinessRestController.class);

    @Autowired
    private ReadinessHolder readinessHolder;

    @GetMapping(value = "/ready", produces = MediaType.APPLICATION_JSON_VALUE)
    public ResponseEntity<String> isReady() {
        if (readinessHolder.isReady()) {
            return new ResponseEntity<>("{\"status\":\"READY\"}", HttpStatus.OK);
        } else {
            return new ResponseEntity<>("{\"status\":\"NOT_READY\"}", HttpStatus.SERVICE_UNAVAILABLE);
        }
    }

    @GetMapping("/readyswitch")
    public ResponseEntity<?> readySwitch() {
        readinessHolder.switchReady();
        return new ResponseEntity<>(HttpStatus.OK);
    }
}
Let's build the application again and deploy it. Once the initial readiness probe lets traffic reach the pod, we can simply switch the readiness flag so Kubernetes stops forwarding requests to it.
Verifying the API works after startup:
$ curl <ip>:31704/actuator/health
{"status":"UP"}
Switching the flag:
$ curl <ip>:31704/readyswitch
Verifying the API doesn’t respond anymore:
$ curl <ip>:31704/actuator/health
curl: (7) Failed to connect to <ip> port 31704: Timed out
And Kubernetes is showing the pod as not ready:
$ kubectl get pods
NAME                                            READY   STATUS    RESTARTS   AGE
actuator-healthcheck-example-74bd59c574-bblsl   0/1     Running   0          30m
Of course, switching it back through the exposed Service port is not possible anymore, as Kubernetes stopped sending HTTP traffic to the pod. We can still exec into the container though and switch the flag back:
$ kubectl exec -it actuator-healthcheck-example-74bd59c574-bblsl -- bash
bash-4.4# curl localhost:8080/readyswitch
Exiting the container and checking what Kubernetes thinks about the pod:
$ kubectl get pods
NAME                                            READY   STATUS    RESTARTS   AGE
actuator-healthcheck-example-74bd59c574-bblsl   1/1     Running   0          35m
Now it's back in operation and traffic is allowed again.
The real benefit kicks in when you are running the application in multiple instances. To demonstrate this, let’s create a dummy endpoint that logs a single message.
@RestController
public class DummyRestController {

    private static final Logger logger = LoggerFactory.getLogger(DummyRestController.class);

    @GetMapping(value = "/dummy", produces = MediaType.APPLICATION_JSON_VALUE)
    public ResponseEntity<String> dummy() {
        logger.info("Dummy call");
        return new ResponseEntity<>("{}", HttpStatus.OK);
    }
}
Making the application run in 2 instances only needs a small tweak (the replicas attribute):
apiVersion: apps/v1
kind: Deployment
metadata:
  name: actuator-healthcheck-example
  labels:
    app: actuator-healthcheck-example
spec:
  replicas: 2
  selector:
    matchLabels:
      app: actuator-healthcheck-example
  template:
    metadata:
      labels:
        app: actuator-healthcheck-example
    spec:
      containers:
        - name: actuator-healthcheck-example
          image: actuator-healthcheck-example:latest
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 8080
          livenessProbe:
            httpGet:
              path: /actuator/health
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 10
            failureThreshold: 2
          readinessProbe:
            httpGet:
              path: /ready
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 5
            failureThreshold: 1
---
apiVersion: v1
kind: Service
metadata:
  name: actuator-healthcheck-example-svc
  labels:
    app: actuator-healthcheck-example
spec:
  type: NodePort
  ports:
    - port: 8080
      targetPort: 8080
      nodePort: 31704
  selector:
    app: actuator-healthcheck-example
Open 3 terminals: 2 for monitoring the application logs of the 2 instances and one for executing the requests.
For watching the logs continuously:
$ kubectl logs <podid> -f --tail=10
Execute this command against the 2 pods you have. Then call the /dummy API from the 3rd terminal:
$ curl <ip>:31704/dummy
Now Kubernetes is balancing the requests between the 2 replicas as they are both ready. You can see it in the logs as well:
2020-04-11 12:45:19.826 INFO 1 --- [nio-8080-exec-4] c.a.b.healthcheck.DummyRestController : Dummy call
Sometimes the first pod is serving the request, sometimes the other.
And now the most exciting part: let's switch one of the pods to not ready.
$ curl <ip>:31704/readyswitch
If we now trigger the dummy API, the requests are always served by the pod that is still ready:
$ curl <ip>:31704/dummy
2020-04-11 12:48:55.554 INFO 1 --- [nio-8080-exec-9] c.a.b.healthcheck.DummyRestController : Dummy call
2020-04-11 12:48:56.230 INFO 1 --- [nio-8080-exec-1] c.a.b.healthcheck.DummyRestController : Dummy call
2020-04-11 12:48:56.743 INFO 1 --- [nio-8080-exec-3] c.a.b.healthcheck.DummyRestController : Dummy call
2020-04-11 12:48:57.182 INFO 1 --- [nio-8080-exec-5] c.a.b.healthcheck.DummyRestController : Dummy call
2020-04-11 12:48:57.520 INFO 1 --- [nio-8080-exec-7] c.a.b.healthcheck.DummyRestController : Dummy call
2020-04-11 12:48:57.905 INFO 1 --- [nio-8080-exec-9] c.a.b.healthcheck.DummyRestController : Dummy call
2020-04-11 12:48:58.298 INFO 1 --- [nio-8080-exec-1] c.a.b.healthcheck.DummyRestController : Dummy call
$ kubectl get pods
NAME                                            READY   STATUS    RESTARTS   AGE
actuator-healthcheck-example-74bd59c574-fm7hx   0/1     Running   0          6m28s
actuator-healthcheck-example-74bd59c574-rxtbp   1/1     Running   0          6m28s
If you switch the non-ready pod back, it will start serving requests again.
Conclusion
We've checked 2 scenarios where the readiness probe is useful. The whole purpose of healthchecks is to create more resilient applications, and I encourage you to invest some time into doing it properly. It will definitely return the investment.
As usual, the code can be found on GitHub. If you liked the article, give it a thumbs up and share it. If you are interested in more, make sure you follow me on Twitter.