---
type: "blog-post"
title: "Revisiting my Personal Platform in 2023"
description: "The tech landscape moves fast, and as a Platform/Developer Experience engineer I like to stay up to date with recent technology and approaches so that I can deliver the best and most solid approach for my engineers. In this blog post I will explore what my personal development stack looks like, how I want to kick it up a notch, and reflect on the challenges I've had in the previous year. Strap in, because this is gonna be a janky ride. But first I will dig into why I've got a personal platform and why that might be useful for you as well."
draft: false
date: "2023-07-23"
updates:
  - time: "2023-07-23"
    description: "first iteration"
tags:
  - '#blog'
---

The tech landscape moves fast, and as a Platform/Developer Experience engineer I
like to stay up to date with recent technology and approaches so that I can
deliver the best and most solid approach for my engineers. In this blog post I
will explore what my personal development stack looks like, how I want to kick
it up a notch, and reflect on the challenges I've had in the previous year.
Strap in, because this is gonna be a janky ride. But first I will dig into why
I've got a personal platform and why that might be useful for you as well.

## What do I mean by personal platform

You may've heard the terms self-hosted or homelab thrown around. These terms
overlap a bit, but are also somewhat orthogonal. A homelab is a personal or
small-scale deployment of stuff you can tinker with, experiment on and enjoy
using; it usually includes things like HomeAssistant, Plex/Emby, various VMs
and such. Self-hosted basically means off-the-shelf tools you can host
yourself, whether for personal use or for an enterprise.

When I say personal platform, parts of it mean a homelab, but taking it a step
further and specializing it for development usage. The goal is to develop a
platform like a small to medium sized company would: one that is capable of
rolling out software and gives you the amenities you want to select for (more
on that later). It should be useful and not just an experiment. You should
actually use the platform to roll out software. One of the most important
parts of developing a platform is actually using it yourself (dogfooding),
otherwise you will never learn where the sharp edges are and where your
requirements break.

So for me the basic requirements for a platform are:

1. A place to host deployments. This may be a VM, a Raspberry Pi, fly.io, AWS.
   It doesn't matter too much; it all depends on your needs and what you want
   to develop.
2. A place to store source code. Again, the easiest option is just to choose
   GitHub, but you can also go a step further and actually host the code
   yourself in the spirit of a homelab. I do this personally.
3. A domain, or a way to interact with the services and deployments you build.
   You want to make the things you build accessible to as wide an audience as
   you choose, whether that is only yourself, your closest family and friends,
   or the public. I personally do a mix: some stuff like the platform
   internals is only accessible internally, other services are public, and
   some are invite only.

If it is difficult to picture, you can think of the platform as giving you the
same things you would get if you used fly.io, AWS, GCP or any of the Platform
as a Service solutions out there.

## Why build a platform only for yourself

This is a question I get a lot. I seemingly spend a lot of effort on building
tools, services and whatnot, which is incredibly overkill for my personal
needs. I think of it like so:

> Get comfortable with advanced tooling and services, so when you actually need
> to do it in practice it is easy

It is part personal development, but also building up a certain expertise that
can be difficult to acquire in a job. It is also incredibly fun and filled with
challenges.

It should also be noted that a personal platform may seem like incredible
overkill, but it is an incremental process; you may already have parts of it,
just implicitly.

## The beginning

My personal platform began as an old workstation running Linux (the distro
doesn't really matter), with `docker` and `docker-compose` installed. On it I
ran various homelab deployments, such as `gitea`, `drone-ci`, `plex`, etc.

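Each of those was just an entry in a compose file. A minimal sketch of what one
of them might have looked like (the ports and volumes here are just examples):

```yaml
# docker-compose.yml - a homelab-style service definition
services:
  gitea:
    image: gitea/gitea:latest
    restart: unless-stopped
    ports:
      - "3000:3000" # web ui
      - "2222:22"   # ssh for git
    volumes:
      - ./gitea-data:/data
```
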
My workflow would be to simply build a docker image for the service I was
working on with `make ci`, which ran `docker build .` and `docker push`, and
finally I would ssh into the workstation and bump the image version using
`image:latest`. It was a fairly basic platform and a lot of the details weren't
documented or automated. In the beginning everything would just be accessible
internally and I would use the hostname given by `dhcp` and so on, such as
`http://home-server:8081/todo-list` or something like that.

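In other words, `make ci` boiled down to something like this (the registry and
image name are examples, not my actual setup):

```bash
# roughly what `make ci` expanded to
docker build -t registry.example.com/todo-list:latest .
docker push registry.example.com/todo-list:latest
```
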
It worked fine for a while, but I began to actually want to use some of those
tools when I left the house. And as my tool stack grew, and there were more
hostnames and ports to remember, I began to look for enhancements to the stack.

> This is actually the most important part of building a personal platform.
> Start small, and grow in the direction of your requirements and needs. Do not
> start with a self-hosted kubernetes with all the bells and whistles. And
> don't copy another person's stack; it will not fit your needs and you won't
> be able to maintain it.

In the beginning I chose to use tools such as upnp and ngrok to expose these
services, as well as a dashboard service for discoverability. However, that
didn't work out. First of all, ngrok and upnp weren't the most stable, and I
didn't want to expose my home network to the internet in that way. I also
didn't use the dashboard service much; just that extra step made me not use
the tools I'd built. I would select only those I remembered the hostname and
port for, and not the more niche ones.

### Getting a VPS

Getting my first vps for personal use was a decision I made once I figured out
that there were a lot of amenities I would get out of the box: a stable machine
which ran nearly 24/7, with a public static ip, reachable from anywhere.

I chose hetzner, because it was the cheapest option I could get where I am at,
with the required bandwidth cap and such.

I chose namecheap for a domain, and cloudflare for dns. Cloudflare technically
isn't needed, but the tooling is nice.

At this point my stack was like this:

```
namecheap -> cloudflare -> hetzner vps
```

This was sort of useful, but not that much. I could host some things on the
vps, but I'd like to use the cheap compute I had at home and still make it
reachable. I then began searching for a mesh vpn. I looked at openvpn and a
bunch of other options, but finally landed on `wireguard`, because it seemed to
be the most performant, and it suited my needs quite perfectly.

In the beginning I wanted to just use the vpn as a proxy.

```
namecheap -> cloudflare -> hetzner vps -> wireguard -> home workstation
```

However, setting up `iptables` rules and such turned out to be a nightmare, so
I kept it simple and just installed `caddy` and `nginx` on the vps: Caddy for
TLS certificates, and nginx for TCP load balancing and reverse proxying. (Caddy
doesn't officially support TCP load balancing, only via a plugin which I don't
want to use because of ergonomics.)

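The nginx side is just a small `stream` block. A minimal sketch, assuming nginx
is built with the stream module and that gitea's ssh port on the home
workstation (its wireguard address, covered below) is what's being forwarded;
the addresses and ports are examples:

```
stream {
    upstream home_git_ssh {
        server 10.0.9.2:22;
    }

    server {
        listen 2222;
        proxy_pass home_git_ssh;
    }
}
```
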
So now the stack was like this:

```
namecheap -> cloudflare -> hetzner vps -> caddy/nginx -> wireguard -> home workstation
```

I was really happy with this stack, and actually still use it.

The wireguard setup is a bunch of point-to-point connections, all pointing at
the ingress node:

```
home workstation (interface) -> hetzner ingress vps (peer)
hetzner ingress vps (interface) -> home workstation (peer)
```

Home workstation:

```
[Interface]
PrivateKey = <home-workstation-priv-key>
Address = 10.0.9.2
ListenPort = 55107

[Peer]
PublicKey = <ingress-vps-public-key>
AllowedIPs = 10.0.9.0/16 # allows receiving a wide range of traffic from the wireguard peer
Endpoint = <ingress-vps-public-static-ip>:51194
PersistentKeepalive = 25
```

Hetzner vps:

```
[Interface]
Address = 10.0.9.0
ListenPort = 51194
PrivateKey = <ingress-vps-private-key>

# packet forwarding
PreUp = sysctl -w net.ipv4.ip_forward=1

[Peer]
PublicKey = <home-workstation-public-key>
AllowedIPs = 10.0.9.2/32 # this peer should only provide a single ip
PersistentKeepalive = 25
```

It is incredibly simple and effective. I even have entries on the vps for my
android phone, mac, you name it. Super easy to set up, but requires some manual
handling. Tailscale can be used to automate this, but when I set this up it
wasn't really a mature solution. If I started today I would probably use it.

The important part is that registration is only needed between a peer and the
hetzner ingress vps. So if I add another vps at some point, only that and the
ingress vps will need registration, but my phone would still be able to talk to
it because of the 10.0.0.0/16 range. That is of course as long as they share a
subnet, i.e. 10.0.9.1 and 10.0.9.2.

Now caddy can just reverse proxy to my home workstation, without it needing a
public port:

```
hetzner ingress vps -> caddy -> wireguard ip for home workstation and port for service -> home workstation -> docker service
```

Because of docker bridge networking, even if caddy is running in a docker
container, it can still use the wireguard network interface and reverse proxy
over it. This is what was, and still is, binding all my own services together,
even if they don't share a physical network subnet.

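In Caddyfile terms that amounts to something like this (the domain and port are
examples):

```
todo.example.com {
    # 10.0.9.2 is the home workstation's wireguard address
    reverse_proxy 10.0.9.2:8081
}
```
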
## Hosting

My hosting of personal services is now a mix between the home workstation for
plex and other compute-intensive services, and hetzner, where I've rented a few
more vps's for services I use frequently like `gitea`, `grafana` and so on.

![deployment strategy](../../assets/deployment-strategy.png)

As you may imagine, plex, drone, grafana etc. shouldn't be exposed to the
internet, but I'd still like the convenience, so I've set up caddy to only
allow the wireguard subnet, and to use wildcard domain certs, such that it can
still provision https certificates for internal services using lets encrypt.

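A sketch of what that looks like in the Caddyfile. Wildcard certs require the
dns challenge, so this assumes a Caddy build with the cloudflare dns plugin;
the domain, subnet and upstream are examples:

```
*.internal.example.com {
    tls {
        dns cloudflare {env.CLOUDFLARE_API_TOKEN}
    }

    # only requests arriving over the wireguard subnet get proxied
    @internal remote_ip 10.0.9.0/24
    handle @internal {
        reverse_proxy 10.0.9.2:3000
    }

    # everything else is rejected
    respond 403
}
```
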
There are a bunch more services I've left out, especially my own home-built
things. However, the deployment model is still as handheld as I mentioned in
the beginning; now the services are just spread across the vps's and private
nodes.

## Development

My next iteration for development was using an open-source tool I've helped
develop at work: https://github.com/lunarway/shuttle. The idea is to eliminate
the need for sharing shell scripts, makefiles and configuration between
different repositories. Now, you just initialize a template `shuttle.yaml`
file, fill it out with a parent template plan, and you've got all you need. I
usually develop a mix of `nextjs`, `sveltekit`, `rust-axum`, `rust-cron`,
`rust-cli` and finally `go-service`. All of these plans contain everything
needed to build a docker image, prepare a docker-compose file and publish it.
These again aren't public, because they specifically suit my needs.

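A `shuttle.yaml` is tiny; roughly something like this (the plan url and vars
here are illustrative, not my actual plans):

```yaml
# shuttle.yaml - points the repository at a shared parent plan
plan: https://github.com/some-org/some-plan.git
vars:
  service: todo-api
```
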
I've ended up building my own incarnation of `shuttle` called `cuddle`
(https://git.front.kjuulh.io/kjuulh/cuddle). It isn't made for public
consumption, and was one of the first projects I built when I was learning
rust.

My workflow has changed to simply be `cuddle x ci`, which will automatically
build, test and prepare configs for deployment. It won't actually do the
deployment step; that is left for CI in drone, where it actually runs
`cuddle x ci --dryrun=false`. I've developed a homegrown docker-compose gitops
approach, where the deployment simply creates a commit to a central repository
with a docker-compose file with a proper image version set, usually a prefix
plus a uuid.

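So the central repository ends up containing files along these lines
(everything here is illustrative):

```yaml
# services/todo-api/docker-compose.yml - committed by CI on each release
services:
  todo-api:
    # tag pinned per release: prefix plus a uuid
    image: registry.example.com/todo-api:main-4f9a1c2e-8d3b-4e6f-9a71-2c5d8e0f1a2b
    restart: unless-stopped
    ports:
      - "8081:8080"
```
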
My vps simply has a cronjob that, once every 5 minutes, does a `git pull` and
executes a script:

```bash
#!/bin/bash

set -e

LOG="/var/log/docker-refresh/refresh.log"
GIT_REPO="/home/<user>/git-repo"

# send stdout and stderr to both the terminal and the log file
exec > >(tee -i ${LOG})
exec 2>&1

echo "##### docker refresh started $(date) #####"

cd "$GIT_REPO" || exit 1

git fetch origin main
git reset --hard origin/main

command_to_execute="/usr/local/bin/docker-compose up -d --remove-orphans"

# bring every compose project in the repo up to date
find "$GIT_REPO" -type f \( -name "docker-compose.yml" -o -name "docker-compose.yaml" \) -print0 | while IFS= read -r -d '' file; do
  dir=$(dirname "$file")
  cd "$dir" || exit 1
  echo "Executing command in $dir"
  $command_to_execute
done

# Monitor health check
curl -m 10 --retry 5 <uptime-kuma endpoint>

echo "##### docker refresh ended $(date) #####"
```

This is simply run by cron and works just fine. I've set up uptime kuma to send
me a slack message if it hasn't run within the last hour.

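The crontab entry is nothing special; assuming the script lives at
`/usr/local/bin/docker-refresh.sh` (the path is an example):

```
*/5 * * * * /usr/local/bin/docker-refresh.sh
```
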
## The problems

This is my current state, except for some small experiments; you can never
capture everything in a blog post.

The main problems now are mostly related to the manual tasks I've got to do
when creating a new web service, i.e. axum, nextjs, svelte, go etc.:

1. Create a new repository (manual)
2. Git push first (manual)
3. CI drone enable (manual)
4. GitOps repo update (automated)
5. Hostname inserted into caddy (manual)
6. If using authentication: setup (Zitadel, manual)
7. Prometheus setup (manual registration)
8. Uptime kuma setup (manual registration)
9. Repeat for production deployment from step 5

Cuddle actually gives a lot out of the box, and I would quite easily be able to
automate most of it if a lot of the configuration for drone, prometheus etc.
were driven by GitOps, but they aren't.

For a service such as this blog, which is a rust-zola deployment, I also always
have downtime on deployments, because I only run a single replica. This isn't
the end of the world, but I'd like the option of a more declarative platform.

## Visions of the future

I want to spend the next good while converting as many of the manual tasks as
possible into automated ones.

The plan is to solve the root of the issues, and that is the deployment of the
services, and simply service discovery. For that I could continue with
docker-compose and just build more tooling around it, maybe some heuristics on
what is in the docker gitops repo. However, I could also venture down the path
that is kubernetes.

We already maintain a fully declarative cluster setup at my dayjob, using
ClusterAPI and flux. So that is the option I will go with.

### Kubernetes

After some investigation and experiments, I've chosen to go with Talos and
Flux. I simply have to copy a vm, register it, and I've got controller or
worker nodes. I sadly have to run some Talos stuff imperatively, but to avoid
the complexity around ClusterAPI this is a suitable approach for now. Flux
simply points at a gitops repo with a cluster path, and it maintains the
services I want to run.

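The imperative Talos part boils down to a few `talosctl` invocations per node;
a sketch, with example IPs and file names:

```bash
# generate machine configs once for the cluster
talosctl gen config home-cluster https://10.0.9.10:6443

# push a config to a fresh vm (it boots in maintenance mode, hence --insecure)
talosctl apply-config --insecure --nodes 10.0.9.11 --file worker.yaml

# bootstrap etcd on the very first control plane node only
talosctl bootstrap --nodes 10.0.9.10 --endpoints 10.0.9.10 --talosconfig ./talosconfig
```
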
This means I can run `fluentbit`, `prometheus`, `traefik` and such in
kubernetes, and automatically get deployments rolled out.

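On the Flux side, each of those is just a Kustomization pointing into the
gitops repo; a sketch (the names, paths and api version depend on your Flux
release):

```yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: infrastructure
  namespace: flux-system
spec:
  interval: 10m
  path: ./clusters/home/infrastructure # traefik, prometheus, fluentbit, ...
  prune: true # remove things deleted from git
  sourceRef:
    kind: GitRepository
    name: flux-system
```
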
### Cuddle

From the development point of view, I simply change the docker-compose
templates to kubernetes templates, and I get the same benefit. Not much to say
here. A release to master will automatically release to prod, and a release to
a branch will create a preview environment for that deployment, which will
automatically be pruned a period of time after the branch has been deleted.

A prometheus and grafana dashboard maintains a list of which preview
environments are available, and how long they've been active.

## Future list of steps

1. Create a new repository (manual)
2. Git push first (manual)
3. CI drone enable (manual)
4. GitOps repo update (automated)
5. Hostname inserted into caddy (automated)
6. If using authentication: setup (Zitadel, manual)
7. Prometheus setup (automated)
8. Uptime kuma setup (automated)
9. Repeat for production deployment from step 5

I've got some ideas for 3, but that will have to rely on a kubernetes operator
or something. The same goes for 6, as long as both have sufficient apis.

I've moved some of the operations from manual work into kubernetes, but that
also means that maintaining kubernetes is a bigger problem, as docker-compose
didn't really have that much day 2 operations.

Instead, I will have to rely on a semi-automated talos setup for automatically
creating vm images, and doing cluster failovers for maximum uptime and comfort.

# Conclusion

I've designed a future setup which will move things into kubernetes to remove a
lot of manual tasks. I will still need to develop tooling for handling
kubernetes and the various painpoints around it, as well as think up new
solutions for the last manual tasks. Some may move into kubernetes operators,
others into either chatops or clis.