Tips for migrating Cron Jobs to Kubernetes

These days it is popular to move everything to Kubernetes. In my experience of moving old monolithic systems to K8S, we often have to deal with scheduled tasks that run directly on servers and were written at a time when nobody thought about containers. So I prepared a short list of things to pay attention to when you move them to Kubernetes:

First, Cron Jobs should be idempotent. The Kubernetes documentation says: “A cron job creates a job object about once per execution time of its schedule. We say “about” because there are certain circumstances where two jobs might be created, or no job might be created. We attempt to make these rare, but do not completely prevent them. Therefore, jobs should be idempotent.”

The term idempotent means that a Cron Job has the same effect on the system whether it runs once, twice, or any number of times. Read-only operations such as checking for updates or monitoring are naturally idempotent. Jobs that modify data or write to a database usually are not, unless you design them to be safe to repeat.

  • Be careful with time zones. By default, all Cron Job schedule times are based on the time zone of the kube-controller-manager. If your control plane runs the kube-controller-manager in Pods or bare containers, the time zone set for the kube-controller-manager container determines the time zone that the cron job controller uses. Recent Kubernetes versions also let you set a time zone per Cron Job with the .spec.timeZone field.

  • Check the size of your Docker images. Every execution of a Cron Job starts a new Pod, which may have to pull the container image from a registry. Keep the image as small as possible to save node resources and network traffic.

  • Make sure your script or program reports its result correctly. Some Cron Jobs are important for your system, and it’s good practice to monitor failed executions. Tools like Alertmanager together with Prometheus can help you identify failed tasks, but only if your program catches exceptions and does not exit successfully when something went wrong during an execution.

  • Logs, logs, logs. The easiest and most widely adopted logging method for containerized applications is to write to the standard output and standard error streams. If you move an existing application, for example a PHP application whose Cron Jobs are PHP scripts, the logs are sometimes hard to read, although they are exactly what you and your support team need to troubleshoot failed tasks. I recommend spending some time improving such scripts so that they produce better output.

  • How long does it take to run a Cron Job? Measure how long a task takes to execute before you set a schedule interval that is too short for it.

  • Limit container resources. Don’t forget to set CPU and memory limits for Cron Job containers, so that a runaway task cannot starve other workloads.

  • Update the Docker image and environment variables. It is a good idea to include your Cron Jobs in your release and testing processes. If a Cron Job runs from the image of an application that has many environment variables and is updated often, make sure you test the job against the new code and keep the required environment variables up to date.

  • Execute ready-to-use images. I have seen a setup where PHP Composer installed all dependencies during Pod startup. This is not a recommended practice: everything can work fine when you deploy the image to Staging, and then stop working in Production because some external resource is not accessible during deployment. It also makes startup slower. Build an image that already contains everything it needs.
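The points above about exit status, logging, and run time can be combined in a small wrapper around the task. This is a minimal sketch, not a definitive implementation: the names configure_logging and run_task and the logger name "cronjob" are hypothetical and chosen only for this example. It writes readable log lines to standard output, logs how long the task took, and returns a non-zero exit code on any exception so that Kubernetes marks the Job as failed.

```python
import logging
import sys
import time
from typing import Callable


def configure_logging() -> logging.Logger:
    """Return a logger that writes timestamped lines to standard output."""
    logger = logging.getLogger("cronjob")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    handler = logging.StreamHandler(sys.stdout)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    )
    logger.addHandler(handler)
    logger.propagate = False
    return logger


def run_task(task: Callable[[], None], logger: logging.Logger) -> int:
    """Run task() and translate its outcome into a process exit code."""
    started = time.monotonic()
    try:
        task()
    except Exception:
        # logger.exception records the traceback along with the message.
        logger.exception(
            "task failed after %.1f s", time.monotonic() - started
        )
        return 1  # non-zero exit code: the container exits as failed
    logger.info("task finished in %.1f s", time.monotonic() - started)
    return 0


# In the script's entry point you would call something like:
#   sys.exit(run_task(my_task, configure_logging()))
# where my_task is your own function (an assumption of this example).
```

The logged duration also gives you real numbers to compare against the schedule interval before you tighten it.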