Running background tasks on a schedule is a standard requirement of backend services. Getting set up used to be simple: you’d define your tasks in your server’s crontab and call it a day. Let’s look at how you can use cron when you deploy with Docker.
Containerising your services can improve developer productivity, but it can also leave you wondering how traditional sysadmin concerns map to Docker concepts. You’ve got several options when using cron with Docker containers, and we’ll explore them below, from least to most suitable. Before continuing, make sure you’ve installed Docker and built a Docker image of your application.
Using the Host’s Crontab
At its most basic, you can always utilize the cron installation of the host that’s running your Docker Engine. Make sure cron is installed and then edit the system’s crontab as normal.
You can use docker exec to run a command within an existing container:
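As a minimal sketch, a host crontab entry for this approach might look like the one written out below. The container name example_app_container and the five-minute schedule are assumptions for illustration, so substitute your own values.

```shell
# Write a sample host crontab entry to a file so it can be reviewed first.
# "example_app_container" is a hypothetical container name.
cat > host-crontab-exec.txt <<'EOF'
*/5 * * * * docker exec example_app_container /example-scheduled-task.sh
EOF

# Print the entry; paste it into `crontab -e` on the Docker host to install it.
cat host-crontab-exec.txt
```

Note that the entry omits the -it flags: cron jobs run without a terminal, so an interactive docker exec would fail.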
This will only work if you can be sure of the container’s name ahead of time. It’s normally better to create a new container which exists solely to run the task:
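A crontab entry along these lines would start a throwaway container for each run. The image name example_app_image:latest is a placeholder, not something defined by this article, and the sketch assumes the script exists inside the image.

```shell
# Write a sample host crontab entry that creates a fresh container per run.
# "example_app_image:latest" is a hypothetical image name.
cat > host-crontab-run.txt <<'EOF'
*/5 * * * * docker run --rm example_app_image:latest /example-scheduled-task.sh
EOF

# Print the entry; paste it into `crontab -e` on the Docker host to install it.
cat host-crontab-run.txt
```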
Every five minutes, your system’s cron installation will create a new Docker container using your app’s image. Docker will execute the /example-scheduled-task.sh script within the container. The container will be destroyed (--rm) once the script exits.
Using Cron Within Your Containers
Using the host’s crontab breaks Docker’s containerization as the scheduled tasks require manual setup on your system. You’ll need to ensure cron is installed on each host you deploy to. While it can be useful in development, you should look to integrate cron into your Dockerised services when possible.
Most popular Docker base images do not include the cron daemon by default. You can install it within your Dockerfile and then register your application’s crontab.
First, create a new crontab file within your codebase:
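A crontab file for this step could be as small as the sketch below. The file name example-crontab and the five-minute schedule are assumptions, and the command line assumes the task script sits at /example-scheduled-task.sh inside the image.

```shell
# Create the crontab file in the root of the codebase.
# The file name "example-crontab" is a hypothetical choice.
cat > example-crontab <<'EOF'
# minute hour day-of-month month day-of-week command
*/5 * * * * /bin/sh /example-scheduled-task.sh
EOF
```

The heredoc leaves a trailing newline at the end of the file, which cron expects after the final entry.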
Next, amend your Dockerfile to install cron and register your crontab – here’s how you can do that with a Debian-based image:
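The sketch below writes out one possible Dockerfile for these steps. The base image debian:bookworm-slim, the crontab file name example-crontab, and the COPY of the task script are all assumptions; swap in your application's own base image and paths. Run it in a scratch directory, since it overwrites any existing Dockerfile.

```shell
# Write an illustrative Dockerfile (overwrites ./Dockerfile if one exists).
cat > Dockerfile <<'EOF'
# Stand-in base image; use your application's Debian-based image instead
FROM debian:bookworm-slim

# Install the cron daemon, which most base images omit
RUN apt-get update \
    && apt-get install -y --no-install-recommends cron \
    && rm -rf /var/lib/apt/lists/*

# Copy the crontab and the script it runs into the image
COPY example-crontab /etc/cron.d/example-crontab
COPY example-scheduled-task.sh /example-scheduled-task.sh

# Make the crontab readable, then register it with the crontab command
RUN chmod 0644 /etc/cron.d/example-crontab \
    && crontab /etc/cron.d/example-crontab
EOF
```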
We install cron and copy our codebase’s crontab into the /etc/cron.d directory. Next, we amend the permissions on our crontab to make sure it’s accessible to cron. Finally, we use the crontab command to register the file as the crontab of the build user, which is root unless your Dockerfile switches user.
To complete this setup, you’ll need to amend your image’s command or entrypoint so the cron daemon starts when containers begin to run. You can’t achieve this with a RUN instruction in your Dockerfile because RUN steps only execute while the image is being built. The service would be started within the ephemeral container used to build the layer, not the final containers running the completed image.
If your container’s only task is to run cron – which we’ll discuss more below – you can add ENTRYPOINT ["cron", "-f"] to your Dockerfile to launch it as the foreground process. If you need to keep another process in the foreground, such as a web server, you should create a dedicated entrypoint script (e.g. ENTRYPOINT ["bash", "init.sh"]) and add service cron start as a command within that file.
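A dedicated entrypoint script might look like the sketch below. It assumes a Debian-based image where service cron start is available, and it assumes the main process (for example, your web server command) is supplied through the image's CMD, which Docker passes to the entrypoint as arguments.

```shell
# Write an illustrative entrypoint script.
cat > init.sh <<'EOF'
#!/bin/bash
set -e

# Start the cron daemon in the background
service cron start

# Replace this shell with the command given as arguments (the image's CMD),
# so that process stays in the foreground and receives signals directly
exec "$@"
EOF
chmod +x init.sh
```

Using exec means the foreground process replaces the shell, so docker stop signals reach it rather than the wrapper script.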
Separating Cron From Your Application’s Services
Implementing the setup described in the preceding section provides a more robust solution than relying on the host’s crontab. Adding the cron daemon to the containers that serve your application ensures anyone consuming your Docker image will have scheduled tasks set up automatically.
This still results in mixing of concerns though. Your containers end up with two responsibilities – firstly, to provide the application’s functionality, and secondly, to keep cron alive and run the scheduled tasks. Ideally, each container should provide one specific unit of functionality.
Wherever possible, you should run your cron tasks in a separate container to your application. If you’re creating a web backend, that would mean one container to provide your web server and another which runs cron in the foreground.
Without this separation, you can’t safely use an orchestrator like Docker Swarm or Kubernetes to run multiple replicas of your application. Each container would run its own cron daemon, causing scheduled tasks to run once per replica. This can be mitigated by using lock files bound into a shared Docker volume. Nonetheless, it’s more maintainable to address the root problem and introduce a dedicated container for the cron daemon.
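The lock-file mitigation could be sketched as a small wrapper that cron calls instead of the task itself. It assumes the flock utility (part of util-linux) is present in the image and that a shared volume is mounted at /locks in every replica; the path and the script name run-locked.sh are hypothetical. A lock like this only stops runs from overlapping, and whether it holds across hosts depends on the volume's storage backend, so treat it as a sketch rather than a guarantee.

```shell
# Write an illustrative wrapper script that guards the task with a lock file.
cat > run-locked.sh <<'EOF'
#!/bin/sh
# LOCK_FILE lives on a volume shared by every replica (hypothetical path).
LOCK_FILE="${LOCK_FILE:-/locks/example-scheduled-task.lock}"

# -n makes flock fail immediately if another process holds the lock,
# so replicas never run the task at the same time.
flock -n "$LOCK_FILE" /bin/sh /example-scheduled-task.sh \
  || echo "lock is held or the task failed; skipping on this replica"
EOF
chmod +x run-locked.sh
```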
Generally, you’ll want both containers to be based on your application’s Docker image. They’ll each need connections to your service’s Docker volumes and networks. This will ensure the cron container has an identical environment to the application container, with the only difference being the foreground process.
This is not a hard-and-fast rule – in some projects, your scheduled tasks might be trivial scripts which operate independently of your codebase. In that case, the cron container may use a minimal base image and do away with connections to unnecessary peripheral resources.
One way to get set up with a separate cron container is to use docker-compose. You’d define the cron container as an extra service. You could use your application’s base image, overriding the entrypoint command to start the cron daemon. Using docker-compose also simplifies attaching the container to any shared volumes and networks it requires.
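A Compose file for this layout might resemble the sketch below. The service names app and cron, the image name example_app_image:latest, and the app_data volume mounted at /data are all placeholders. Run it in a scratch directory, since it overwrites any existing docker-compose.yml.

```shell
# Write an illustrative Compose file (overwrites ./docker-compose.yml).
cat > docker-compose.yml <<'EOF'
services:
  # Serves the application using the image's default entrypoint
  app:
    image: example_app_image:latest
    volumes:
      - app_data:/data

  # Same image, but the entrypoint is overridden to run cron in the foreground
  cron:
    image: example_app_image:latest
    entrypoint: ["cron", "-f"]
    volumes:
      - app_data:/data

volumes:
  app_data:
EOF
```

Both services share the same image and volume, so the cron container sees the same files and environment as the application container.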
Using the above example, one container serves our application using the default entrypoint in the image. Make sure this does not start the cron daemon! The second container overrides the image’s entrypoint to run cron. As long as the image still has cron installed and your crontab configured, you can use docker-compose up to bring up your application.
Using Kubernetes Cron Jobs
Finally, let’s look at a simple example of running scheduled tasks within Kubernetes. Kubernetes comes with its own CronJob resource which you can use in your manifests.
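A minimal CronJob manifest could look like the sketch below. The resource name my-cron, the container name, and the image my-cron-image:latest are hypothetical, and the batch/v1 API version assumes a cluster running Kubernetes 1.21 or newer.

```shell
# Write an illustrative CronJob manifest.
cat > cronjob.yml <<'EOF'
apiVersion: batch/v1
kind: CronJob
metadata:
  name: my-cron
spec:
  # Standard cron syntax: run every five minutes
  schedule: "*/5 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: my-cron-container
              image: my-cron-image:latest
              command: ["/bin/bash", "/my-cron-script.sh"]
          # Restart the container in place if the script fails
          restartPolicy: OnFailure
EOF

# To create the resource in your cluster:
#   kubectl apply -f cronjob.yml
```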
You don’t need to install cron in your image or set up specialized containers if you’re using Kubernetes. CronJob has been a stable resource under the batch/v1 API since Kubernetes 1.21; older clusters expose it only as the beta batch/v1beta1 API, which was removed in Kubernetes 1.25.
Apply the above manifest to your cluster to create a new cron job that will run /my-cron-script.sh within your container every five minutes. The frequency is given as a regular cron definition to the schedule key in the resource’s spec.
You can set the concurrencyPolicy field in the CronJob’s spec to control whether Kubernetes allows your jobs to overlap. It defaults to Allow but can be changed to Forbid (skip a new run while the previous one is still running) or Replace (cancel the running job and start the new one in its place).
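One way to change the policy on an existing resource is a merge patch, sketched below. The CronJob name my-cron is a hypothetical example, and Forbid is chosen only for illustration.

```shell
# Write a merge patch that sets the concurrency policy to Forbid.
cat > concurrency-patch.json <<'EOF'
{"spec": {"concurrencyPolicy": "Forbid"}}
EOF

# To apply it to a CronJob named "my-cron" (hypothetical name):
#   kubectl patch cronjob my-cron --type merge -p "$(cat concurrency-patch.json)"
cat concurrency-patch.json
```

Alternatively, add concurrencyPolicy directly under spec in your manifest and re-apply it.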
Using Kubernetes’s built-in resource is the recommended way to manage scheduled tasks within your clusters. You can easily access job logs and don’t need to worry about preparing your containers for use with cron. You just need to produce a Docker image that contains everything your tasks need to run. Kubernetes will handle creating and destroying container instances on the schedule you specify.