Self-hosted GitOps for docker with Gitea and Actions

I've been in a rather lengthy process of overhauling my own servers. I run a server in Hetzner's datacenter for my own use. It's not so much a lab as it is a place where I host this blog and some apps.

I wrote a couple of rather lengthy blog posts about the setup before, but I figured that they were too long and too personal to be interesting to anyone else.

Instead, I want to describe some of the components I've designed, especially when this design wasn't really very well described elsewhere.

The original problem

The problem I'm solving is around orchestration of container deployment. While docker, kubernetes and friends all describe themselves as container orchestration frameworks, and they are, there is a piece missing around deployment.

Consider this: You've written an app and dockerized it. You can spin it up on your local laptop manually, but you also want it to run on a production environment.

You could manually deploy it there, sure, but that's not very automation-friendly. In reality, you want to have some sort of continuous process that ensures that it is deployed to production whenever a new version is released, and preferably after it has been tested in some sort of isolated environment.

There are quite a lot of ways to solve this, some well described, some less so, and some depending on the platform in question.

With containers, you generally have a process that builds the application images, runs tests and upon completion submits the image to some sort of repository.

Another process would ensure that your environment state matches some desired state, including which applications are installed. This process would for example update the running containers if you indicate this in your target state.

So Continuous Integration (CI) and Continuous Deployment (or Delivery - CD) are two different processes, defined separately.

How to actually define this is very much up to you, and there are plenty of commercial tool offerings you could tap into. On the self-hosted, open-source side, I was a little frustrated both with the lack of readily available solutions and with how to actually design an approach that worked for me.

When looking around on the internet, it feels like you are either expected to work at an employer that has this well defined, or to be an absolute beginner who shouldn't deal with any of this.

GitOps

One approach that is often described is GitOps. It builds a workflow around git and especially the tooling and processes in Github.

For CI, you would have your app hosted on Github, with some pipeline that would run defined tests for all your changes. When you tagged a new release, your app's image artifact(s) would be submitted to an internal image registry.
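Purely as an illustration, such a release pipeline might look something like the workflow below. The registry URL, image name and secret names are all placeholders, not anything from a real setup:

```yaml
# Hypothetical CI workflow: test, build and push an image when a release tag is pushed
name: release
on:
  push:
    tags: ['v*']

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      # Run whatever test suite the application defines
      - run: make test

      # Log in to a (placeholder) private registry and push the tagged image
      - run: |
          echo "${{ secrets.REGISTRY_PASSWORD }}" | \
            docker login registry.example.com -u "${{ secrets.REGISTRY_USER }}" --password-stdin
          docker build -t registry.example.com/myapp:${GITHUB_REF_NAME} .
          docker push registry.example.com/myapp:${GITHUB_REF_NAME}
```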

For CD, another Github repository would define your production environment's state, through so-called Infrastructure as Code, and by updating the application version in that code, it would trigger another pipeline that would test and deploy it.

Git provides a lot of value here, especially if you're a team, with pull requests, code reviews and so on.

Implementing my own thing

My own server runs mostly applications that I haven't made myself, so I don't need much of the CI-based infrastructure. I am also just me, so I actually don't really gain any value from many of the collaborative features here.

So first, let's look at the various parts involved in deploying an application on a server:

The application itself

Most applications are distributed in the form of a docker image. Often a reference docker-compose.yml file is provided with various required components such as databases. Then finally the application needs some configuration, and this may include various passwords.

So we can define the parts as follows:

  • System image
  • Configuration
  • Secrets
  • State

In addition to this, some storage is also required, persistent or ephemeral.

The system images are not hard to handle. They are available from well-known, publicly available docker repositories. I could build them myself if I really wanted to, but I haven't found any need for that.

The configuration usually comes in the form of environment variables, but it could also be actual configuration files that need to be provided to the image. Some of it can be set once and never touched again, but it usually requires some tuning, and is often related to the environment it runs in.

The secrets must be guarded well. You want to ensure that your passwords aren't available in clear-text, and certainly not in your git repositories.

Finally, the state, which includes the other dependent containers, the volumes, the network definitions and so on. It's closely related to the configuration, but not quite the same.
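To make the parts concrete, a minimal and entirely made-up docker-compose.yml touching most of them could look like this; the application, image tags and paths are just examples:

```yaml
# Hypothetical compose file; the application and its values are made up for illustration
services:
  app:
    image: ghcr.io/example/someapp:1.2.3    # the system image
    environment:                             # the configuration
      APP_BASE_URL: https://app.example.com
    volumes:
      - app-data:/var/lib/someapp            # persistent storage
    depends_on:
      - db                                   # a dependent container, part of the state
  db:
    image: postgres:16
    volumes:
      - db-data:/var/lib/postgresql/data

volumes:
  app-data:
  db-data:
```

The secrets are the missing piece here; more on those below.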

Ansible

I am using Ansible with my various virtual hosts. I tried to extend this to deploying the containers, and I explored two approaches:

Defining the whole application in ansible

This is really not a bad approach: I would write a role to deploy the application, write the configuration as group variables, and put the secrets in Ansible Vault.
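A sketch of what that looked like, with made-up names and trimmed down to the essentials:

```yaml
# Hypothetical role tasks (roles/someapp/tasks/main.yml); names and paths are illustrative
- name: Template the compose file with values from group_vars (passwords from Ansible Vault)
  ansible.builtin.template:
    src: docker-compose.yml.j2
    dest: /opt/someapp/docker-compose.yml
    mode: "0600"

- name: Bring the application up
  ansible.builtin.command:
    cmd: docker compose up -d
    chdir: /opt/someapp
```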

But once you've reached a certain number of applications, your ansible playbook becomes massive, and you start running into challenges with the design if you have several instances of the same application deployed in different ways to different hosts.

Defining a more generic ansible role

Another thing I explored was to make more flexible ansible roles. I would store the application configuration as inventory data, write a role that merged this with a set of default values, and have that role check for various variations before finally deploying the application.

I wrote a single role I called "microservices" which did this. I liked it on a conceptual level, but in reality it meant that I had three different domains involved in a single application: the docker-compose file that needed to be templated, the role that needed to be written, and the inventory system that I needed to write this into.

GitOps

I abandoned my various Ansible-oriented approaches, and went with a Git-oriented solution.

I wanted a separate flow for each application deployment, and I needed it to be simple to work with, limiting the amount of different systems to maintain.

I have my own Gitea installation - it's very similar to Github and implements a lot of the same functionality around workflows. Notably, it has support for Github Actions, albeit not at full feature parity. Aside from a few minor issues, I've not run into any real limitations here.

Before I could create the repositories, I had to solve one thing first: how to handle secrets.

Secrets were in fact something I handled poorly with Ansible. While they were stored encrypted in the Ansible Vault, they ended up in the environments in clear text.

While they need to end up in clear text eventually, it's best to limit where this is exposed.

I briefly considered Hashicorp Vault, but decided against adding yet another entity. I also briefly explored a solution where the secrets would be stored in KeePassX files and inserted into the environment through a small script.

Instead I ended up with a system based on git-crypt, simply storing each individual secret in a file that is encrypted in the repository.
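git-crypt is driven by .gitattributes, so in practice this boils down to something like the following, where everything under a secrets directory gets encrypted (the directory name is just an example):

```
# .gitattributes: encrypt everything under secrets/ with git-crypt
secrets/** filter=git-crypt diff=git-crypt
```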

A lot of applications today support the ability to read the secret from a file instead of an environment variable, and docker compose has the ability to handle this through a secrets definition.
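In compose terms that looks roughly like this; the service, the secret name and the _FILE environment variable depend entirely on the application in question:

```yaml
# Example: a file-backed secret handed to a container that can read its password from a file
services:
  app:
    image: ghcr.io/example/someapp:1.2.3
    environment:
      DB_PASSWORD_FILE: /run/secrets/db_password   # only works if the app supports this
    secrets:
      - db_password

secrets:
  db_password:
    file: ./secrets/db_password   # this is the file git-crypt keeps encrypted in the repository
```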

The good thing is that if an application does not support this, I can put the environment variable definitions into a file that is referenced by the compose file, and encrypt just that file instead.
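That fallback is just an ordinary env_file reference where the referenced file is the encrypted one, for example:

```yaml
# Example fallback for applications that only read plain environment variables
services:
  app:
    image: ghcr.io/example/someapp:1.2.3
    env_file:
      - ./secrets/someapp.env   # encrypted by git-crypt in the repository
```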

Continuous Deployment

In order to deploy an application, I needed to write a workflow that did the following:

  1. Check out the repository
  2. Decrypt it
  3. Transfer the unencrypted files to the target host
  4. Apply the docker-compose file
  5. Remove the files
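Spelled out as plain workflow steps rather than a dedicated Action, a sketch of this could look like the following. The host, paths and secret names (GIT_CRYPT_KEY, SSH_KEY) are placeholders, and the git-crypt key is assumed to be stored base64-encoded as a repository secret:

```yaml
# Hypothetical deployment workflow; everything host- and secret-related is a placeholder
name: deploy
on:
  push:
    branches: [main]

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      # 1. Check out the repository
      - uses: actions/checkout@v4

      # 2. Decrypt it with a symmetric git-crypt key stored as a secret
      - run: |
          sudo apt-get update -qq && sudo apt-get install -y -qq git-crypt
          echo "${{ secrets.GIT_CRYPT_KEY }}" | base64 -d > /tmp/git-crypt.key
          git-crypt unlock /tmp/git-crypt.key

      # 3. Transfer the unencrypted files to the target host
      - run: |
          echo "${{ secrets.SSH_KEY }}" > /tmp/id_ed25519 && chmod 600 /tmp/id_ed25519
          rsync -a --delete --exclude .git \
            -e "ssh -i /tmp/id_ed25519 -o StrictHostKeyChecking=accept-new" \
            ./ deploy@host.example.com:/opt/someapp/

      # 4. Apply the docker-compose file on the target
      - run: |
          ssh -i /tmp/id_ed25519 deploy@host.example.com \
            "docker compose -f /opt/someapp/docker-compose.yml up -d"

      # 5. Remove the decrypted material from the runner
      - if: always()
        run: rm -f /tmp/git-crypt.key /tmp/id_ed25519
```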

I ended up writing my own Action for this, essentially just merging two open source actions I found on Github: One action to decrypt a repository with git-crypt, and one to deploy a docker-compose project on a remote server via SSH.

This turned out to be rather straightforward, and I quickly added a second Action that logs in to a remote host over SSH and then logs in to my private docker registry, so that I can pull my own images too.

[Figure: diagram.png]

Ansible vs. Github Actions

Having implemented this first through Ansible and then through Github (or Gitea) Actions, I should be able to compare the two approaches.

It's not like I have thrown Ansible out. I still use it heavily to define the servers and their state, including bootstrapping them, installing docker, setting up DNS and so on.

The primary strength of Github Actions lies in its simplicity.

You can implement most things with ready-made workflow components, it's widely adopted so it's easy to find references, and it's also easy to write your own Actions:

If you can write a shell script that does what you need to do, then you just put that shell script into a minimalist container and voila, you have an action.
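As an example of how little that involves, a container action is essentially a metadata file pointing at a Dockerfile whose entrypoint is your script. This is the generic pattern, not my specific actions, and the names are made up:

```yaml
# action.yml for a minimal container action; the Dockerfile's ENTRYPOINT is your shell script
name: 'my-deploy-step'
description: 'Runs a small shell script inside a container'
inputs:
  target:
    description: 'Where to deploy'
    required: true
runs:
  using: 'docker'
  image: 'Dockerfile'
  args:
    - ${{ inputs.target }}
```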

On the downside, you lose a lot of the flexibility around state that Ansible gives you.

Actions are run as defined workflows, with very limited branching and no real ability to react on information in your target host. This is probably mostly because your workflows run on ephemeral systems, called runners, and they are largely unaware of what they are actually doing.

I ran into this exact issue when I wanted to upgrade the version of an application I run, and that new version required some manual intervention.

This was not practical to try to express through the workflow, especially since it is a one-off operation.

I ended up doing the procedure by hand, but I can't help but think that with Ansible, you could most likely write some more intelligent behavior for this.

Github workflows are designed to be a repeatable series of actions that need to be executed under a set of circumstances you define. They will run if the condition matches these circumstances, and it's very much up to you to define your safety mechanisms.

Still, I'm not running anything critical, so this is not really a show stopper. Just worth keeping in mind.

I guess you could also try a combination of both approaches.

Instead of implementing one large ansible playbook, you could have one playbook per application in separate repositories, with Github Workflows executing these on some sort of ansible execution environment, I suppose.
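Such a hybrid could be as small as a workflow that just runs ansible-playbook; purely a sketch, and it glosses over how the runner would get its SSH credentials and inventory:

```yaml
# Hypothetical hybrid: the per-application repository carries a playbook, Actions merely runs it
name: deploy-via-ansible
on:
  push:
    branches: [main]

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: pip install ansible            # or use a runner image with Ansible preinstalled
      - run: ansible-playbook -i inventory/production deploy.yml
```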

Maybe I will explore this if something turns out being too fragile with just docker-compose and actions. We'll see.

Final words

I ended up building some new actions to make this work. I will release them eventually, but I don't feel I've tested them enough yet. Once they have a bit more mileage under their belts I will release them on Github.
