---
type: "blog-post"
title: "Revisiting my Personal Platform in 2023"
description: "The tech landscape moves fast, and as a Platform/Developer Experience engineer I like to stay up to date with recent technology and approaches so that I can deliver the best and most solid approach for my engineers. In this blog post I will explore what my personal development stack looks like, how I want to kick it up a notch, and reflect on the challenges I've had in the previous year. Strap in, because this is gonna be a janky ride. But first I will dig into why I've got a personal platform and why that might be useful for you as well."
draft: false
date: "2023-07-23"
updates:
  - time: "2023-07-23"
    description: "first iteration"
tags:
  - '#blog'
---

The tech landscape moves fast, and as a Platform/Developer Experience engineer I
like to stay up to date with recent technology and approaches so that I can
deliver the best and most solid approach for my engineers. In this blog post I
will explore what my personal development stack looks like, how I want to kick
it up a notch, and reflect on the challenges I've had in the previous year.
Strap in, because this is gonna be a janky ride. But first I will dig into why
I've got a personal platform and why that might be useful for you as well.

## What do I mean by personal platform

You may've heard the terms self-hosted or homelab thrown around. These terms
overlap a bit, but are also somewhat orthogonal. A homelab is a personal or
small-scale deployment of stuff you can tinker with, experiment on and enjoy
using; it usually includes things like HomeAssistant, Plex/Emby, various VMs
and such. Self-hosted basically means off-the-shelf tools you can host
yourself, whether for personal use or for an enterprise.

When I say personal platform, parts of it mean a homelab, but taking it a step
further and specializing it for development usage. The goal is to develop a
platform like a small to medium sized company would: one that is capable of
rolling out software and gives you the amenities you want to select for (more
on that later). It should be useful and not just an experiment. You should
actually use the platform to roll out software. One of the most important
parts of developing a platform is actually using it yourself (dogfooding),
otherwise you will never learn where the sharp edges are and where your
requirements break.

So for me the basic requirements for a platform are:

1. A place to host deployments. This may be a VM, a Raspberry Pi, fly.io, AWS.
   It doesn't matter too much; it all depends on your needs and what you want
   to develop.
2. A place to store source code. Again, the easiest option is just to choose
   GitHub, but you can also go a step further and actually host the code
   yourself in the spirit of a homelab. I do this personally.
3. A domain, or a way to interact with the services and deployments you build.
   You want to make the things you build accessible to as wide an audience as
   you choose, whether that is only yourself, your closest family and friends,
   or the public. I personally do a mix: some stuff like the platform
   internals is only accessible internally, other services are public, and
   some are invite only.

If it is difficult to picture, you can think of the platform as giving you the
same things you would get if you used fly.io, AWS, GCP or any of the Platform
as a Service solutions out there.

## Why build a platform only for yourself

This is a question I get a lot. I seemingly spend a lot of effort on building
tools, services and whatnot, which is incredibly overkill for my personal
needs. I think of it like so:

> Get comfortable with advanced tooling and services, so when you actually need
> to do it in practice it is easy

It is part personal development, but also building up a certain expertise that
can be difficult to acquire in a job. It is also incredibly fun and filled with
challenges.

It should also be noted that a personal platform may seem like incredible
overkill, but it is an incremental process; you may already have parts of it,
just implicitly.

## The beginning

My personal platform began as an old workstation running Linux (the distro
doesn't really matter), with `docker` and `docker-compose` installed. On it I
ran various homelab deployments, such as `gitea`, `drone-ci`, `plex`, etc.

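Each of those was just an entry in a compose file. A minimal sketch of what one
of them might have looked like (the ports and volumes here are just examples):

```yaml
# docker-compose.yml - a homelab-style service definition
services:
  gitea:
    image: gitea/gitea:latest
    restart: unless-stopped
    ports:
      - "3000:3000" # web ui
      - "2222:22"   # ssh for git
    volumes:
      - ./gitea-data:/data
```
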
My workflow would be to simply build a docker image for the service I was
working on with `make ci`, which ran `docker build .` and `docker push`, and
finally I would ssh into the workstation and bump the image version using
`image:latest`. It was a fairly basic platform and a lot of the details weren't
documented or automated. In the beginning everything would just be accessible
internally and I would use the hostname given by `dhcp` and so on, such as
`http://home-server:8081/todo-list` or something like that.

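In other words, `make ci` boiled down to something like this (the registry and
image name are examples, not my actual setup):

```bash
# roughly what `make ci` expanded to
docker build -t registry.example.com/todo-list:latest .
docker push registry.example.com/todo-list:latest
```
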
It worked fine for a while, but I began to actually want to use some of those
tools when I left the house. And as my tool stack grew, and there were more
hostnames and ports to remember, I began to look for enhancements to the stack.

> This is actually the most important part of building a personal platform.
> Start small, and grow in the direction of your requirements and needs. Do not
> start with a self-hosted kubernetes with all the bells and whistles. And
> don't copy another person's stack; it will not fit your needs and you won't
> be able to maintain it.

In the beginning I chose to use tools such as upnp and ngrok to expose these
services, as well as a dashboard service for discoverability. However, that
didn't work out. First of all, ngrok and upnp weren't the most stable, and I
didn't want to expose my home network to the internet in that way. I also
didn't use the dashboard service much; just that extra step made me not use
the tools I'd built. I would select only those I remembered the hostname and
port for, and not the more niche ones.

### Getting a VPS

Getting my first vps for personal use was a decision I made once I figured out
that there were a lot of amenities I would get out of the box: a stable machine
which ran nearly 24/7, with a public static ip, reachable from anywhere.

I chose hetzner, because it was the cheapest option I could get where I am at,
with the required bandwidth cap and such.

I chose namecheap for a domain, and cloudflare for dns. Cloudflare technically
isn't needed, but the tooling is nice.

At this point my stack was like this:

```
namecheap -> cloudflare -> hetzner vps
```

This was sort of useful, but not that much. I could host some things on the
vps, but I'd like to use the cheap compute I had at home and still make it
reachable. I then began searching for a mesh vpn. I looked at openvpn and a
bunch of other options, but finally landed on `wireguard`, because it seemed to
be the most performant, and it suited my needs quite perfectly.

In the beginning I wanted to just use the vpn as a proxy.

```
namecheap -> cloudflare -> hetzner vps -> wireguard -> home workstation
```

However, setting up `iptables` rules and such turned out to be a nightmare, so
I kept it simple and just installed `caddy` and `nginx` on the vps: Caddy for
TLS certificates, and nginx for TCP load balancing and reverse proxying. (Caddy
doesn't officially support TCP load balancing, only via a plugin which I don't
want to use because of ergonomics.)

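The nginx side is just a small `stream` block. A minimal sketch, assuming nginx
is built with the stream module and that gitea's ssh port on the home
workstation (its wireguard address, covered below) is what's being forwarded;
the addresses and ports are examples:

```
stream {
    upstream home_git_ssh {
        server 10.0.9.2:22;
    }

    server {
        listen 2222;
        proxy_pass home_git_ssh;
    }
}
```
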
So now the stack was like this:

```
namecheap -> cloudflare -> hetzner vps -> caddy/nginx -> wireguard -> home workstation
```

I was really happy with this stack, and actually still use it.

The wireguard setup is a bunch of point-to-point connections, all pointing at
the ingress node:

```
home workstation (interface) -> hetzner ingress vps (peer)
hetzner ingress vps (interface) -> home workstation (peer)
```

Home workstation:

```
[Interface]
PrivateKey = <home-workstation-priv-key>
Address = 10.0.9.2
ListenPort = 55107

[Peer]
PublicKey = <ingress-vps-public-key>
AllowedIPs = 10.0.9.0/16 # allows receiving a wide range of traffic from the wireguard peer
Endpoint = <ingress-vps-public-static-ip>:51194
PersistentKeepalive = 25
```

Hetzner vps:

```
[Interface]
Address = 10.0.9.0
ListenPort = 51194
PrivateKey = <ingress-vps-private-key>

# packet forwarding
PreUp = sysctl -w net.ipv4.ip_forward=1

[Peer]
PublicKey = <home-workstation-public-key>
AllowedIPs = 10.0.9.2/32 # this peer should only provide a single ip
PersistentKeepalive = 25
```

It is incredibly simple and effective. I even have entries on the vps for my
android phone, mac, you name it. Super easy to set up, but requires some manual
handling. Tailscale can be used to automate this, but when I set this up it
wasn't really a mature solution. If I started today I would probably use it.

The important part is that registration is only needed between a peer and the
hetzner ingress vps. So if I add another vps at some point, only that and the
ingress vps will need registration, but my phone would still be able to talk to
it because of the 10.0.0.0/16 range. That is of course as long as they share a
subnet, i.e. 10.0.9.1 and 10.0.9.2.

Now caddy can just reverse proxy to my home workstation, without it needing a
public port:

```
hetzner ingress vps -> caddy -> wireguard ip for home workstation and port for service -> home workstation -> docker service
```

Because of docker bridge networking, even if caddy is running in a docker
container, it can still use the wireguard network interface and reverse proxy
over it. This is what was, and still is, binding all my own services together,
even if they don't share a physical network subnet.

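In Caddyfile terms that amounts to something like this (the domain and port are
examples):

```
todo.example.com {
    # 10.0.9.2 is the home workstation's wireguard address
    reverse_proxy 10.0.9.2:8081
}
```
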
## Hosting

My hosting of personal services is now a mix between the home workstation for
plex and other compute-intensive services, and hetzner, where I've rented a few
more vps's for services I use frequently like `gitea`, `grafana` and so on.

![deployment strategy](../../assets/deployment-strategy.png)

As you may imagine, plex, drone, grafana etc. shouldn't be exposed to the
internet, but I'd still like the convenience, so I've set up caddy to only
allow the wireguard subnet, and to use wildcard domain certs, such that it can
still provision https certificates for internal services using lets encrypt.

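A sketch of what that looks like in the Caddyfile. Wildcard certs require the
dns challenge, so this assumes a Caddy build with the cloudflare dns plugin;
the domain, subnet and upstream are examples:

```
*.internal.example.com {
    tls {
        dns cloudflare {env.CLOUDFLARE_API_TOKEN}
    }

    # only requests arriving over the wireguard subnet get proxied
    @internal remote_ip 10.0.9.0/24
    handle @internal {
        reverse_proxy 10.0.9.2:3000
    }

    # everything else is rejected
    respond 403
}
```
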
There are a bunch more services I've left out, especially my own home-built
things. However, the deployment model is still as handheld as I mentioned in
the beginning; now the services are just spread across the vps's and private
nodes.

## Development

My next iteration for development was using an open-source tool I've helped
develop at work: https://github.com/lunarway/shuttle. The idea is to eliminate
the need for sharing shell scripts, makefiles and configuration between
different repositories. Now, you just initialize a template `shuttle.yaml`
file, fill it out with a parent template plan, and you've got all you need. I
usually develop a mix of `nextjs`, `sveltekit`, `rust-axum`, `rust-cron`,
`rust-cli` and finally `go-service`. All of these plans contain everything
needed to build a docker image, prepare a docker-compose file and publish it.
These again aren't public, because they specifically suit my needs.

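A `shuttle.yaml` is tiny; roughly something like this (the plan url and vars
here are illustrative, not my actual plans):

```yaml
# shuttle.yaml - points the repository at a shared parent plan
plan: https://github.com/some-org/some-plan.git
vars:
  service: todo-api
```
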
I've ended up building my own incarnation of `shuttle` called `cuddle`
(https://git.front.kjuulh.io/kjuulh/cuddle). It isn't made for public
consumption, and was one of the first projects I built when I was learning
rust.

My workflow has changed to simply be `cuddle x ci`, which will automatically
build, test and prepare configs for deployment. It won't actually do the
deployment step; that is left for CI in drone, where it actually runs
`cuddle x ci --dryrun=false`. I've developed a homegrown docker-compose gitops
approach, where the deployment simply creates a commit to a central repository
with a docker-compose file with a proper image version set, usually a prefix
plus a uuid.

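So the central repository ends up containing files along these lines
(everything here is illustrative):

```yaml
# services/todo-api/docker-compose.yml - committed by CI on each release
services:
  todo-api:
    # tag pinned per release: prefix plus a uuid
    image: registry.example.com/todo-api:main-4f9a1c2e-8d3b-4e6f-9a71-2c5d8e0f1a2b
    restart: unless-stopped
    ports:
      - "8081:8080"
```
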
My vps simply has a cronjob that, once every 5 minutes, does a `git pull` and
executes a script:

```bash
#!/bin/bash

set -e

LOG="/var/log/docker-refresh/refresh.log"
GIT_REPO="/home/<user>/git-repo"

# send stdout and stderr to both the terminal and the log file
exec > >(tee -i ${LOG})
exec 2>&1

echo "##### docker refresh started $(date) #####"

cd "$GIT_REPO" || exit 1

git fetch origin main
git reset --hard origin/main

command_to_execute="/usr/local/bin/docker-compose up -d --remove-orphans"

# bring every compose project in the repo up to date
find "$GIT_REPO" -type f \( -name "docker-compose.yml" -o -name "docker-compose.yaml" \) -print0 | while IFS= read -r -d '' file; do
  dir=$(dirname "$file")
  cd "$dir" || exit 1
  echo "Executing command in $dir"
  $command_to_execute
done

# Monitor health check
curl -m 10 --retry 5 <uptime-kuma endpoint>

echo "##### docker refresh ended $(date) #####"
```

This is simply run by cron and works just fine. I've set up uptime kuma to send
me a slack message if it hasn't run within the last hour.

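The crontab entry is nothing special; assuming the script lives at
`/usr/local/bin/docker-refresh.sh` (the path is an example):

```
*/5 * * * * /usr/local/bin/docker-refresh.sh
```
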
## The problems

This is my current state, except for some small experiments; you can never
capture everything in a blog post.

The main problems now are mostly related to the manual tasks I've got to do
when creating a new web service, i.e. axum, nextjs, svelte, go etc.:

1. Create a new repository (manual)
2. Git push first (manual)
3. CI drone enable (manual)
4. GitOps repo update (automated)
5. Hostname inserted into caddy (manual)
6. If using authentication: setup (Zitadel, manual)
7. Prometheus setup (manual registration)
8. Uptime kuma setup (manual registration)
9. Repeat for production deployment from step 5

Cuddle actually gives a lot out of the box, and I would quite easily be able to
automate most of it if a lot of the configuration for drone, prometheus etc.
were driven by GitOps, but they aren't.

For a service such as this blog, which is a rust-zola deployment, I also always
have downtime on deployments, because I only run a single replica. This isn't
the end of the world, but I'd like the option of a more declarative platform.

## Visions of the future

I want to spend the next good while converting as many of the manual tasks as
possible into automated ones.

The plan is to solve the root of the issues, and that is the deployment of the
services, and simply service discovery. For that I could continue with
docker-compose and just build more tooling around it, maybe some heuristics on
what is in the docker gitops repo. However, I could also venture down the path
that is kubernetes.

We already maintain a fully declarative cluster setup at my dayjob, using
ClusterAPI and flux. So that is the option I will go with.

### Kubernetes

After some investigation and experiments, I've chosen to go with Talos and
Flux. I simply have to copy a vm, register it, and I've got controller or
worker nodes. I sadly have to run some Talos stuff imperatively, but to avoid
the complexity around ClusterAPI this is a suitable approach for now. Flux
simply points at a gitops repo with a cluster path, and it maintains the
services I want to run.

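The imperative Talos part boils down to a few `talosctl` invocations per node;
a sketch, with example IPs and file names:

```bash
# generate machine configs once for the cluster
talosctl gen config home-cluster https://10.0.9.10:6443

# push a config to a fresh vm (it boots in maintenance mode, hence --insecure)
talosctl apply-config --insecure --nodes 10.0.9.11 --file worker.yaml

# bootstrap etcd on the very first control plane node only
talosctl bootstrap --nodes 10.0.9.10 --endpoints 10.0.9.10 --talosconfig ./talosconfig
```
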
This means I can run `fluentbit`, `prometheus`, `traefik` and such in
kubernetes, and automatically get deployments rolled out.

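On the Flux side, each of those is just a Kustomization pointing into the
gitops repo; a sketch (the names, paths and api version depend on your Flux
release):

```yaml
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: infrastructure
  namespace: flux-system
spec:
  interval: 10m
  path: ./clusters/home/infrastructure # traefik, prometheus, fluentbit, ...
  prune: true # remove things deleted from git
  sourceRef:
    kind: GitRepository
    name: flux-system
```
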
### Cuddle

From the development point of view, I simply change the docker-compose
templates to kubernetes templates, and I get the same benefit. Not much to say
here. A release to master will automatically release to prod, and a release to
a branch will create a preview environment for that deployment, which will
automatically be pruned a period of time after the branch has been deleted.

A prometheus and grafana dashboard maintains a list of which preview
environments are available, and how long they've been active.

## Future list of steps

1. Create a new repository (manual)
2. Git push first (manual)
3. CI drone enable (manual)
4. GitOps repo update (automated)
5. Hostname inserted into caddy (automated)
6. If using authentication: setup (Zitadel, manual)
7. Prometheus setup (automated)
8. Uptime kuma setup (automated)
9. Repeat for production deployment from step 5

I've got some ideas for 3, but that will have to rely on a kubernetes operator
or something. The same goes for 6, as long as both have sufficient apis.

I've moved some of the operations from manual work into kubernetes, but that
also means that maintaining kubernetes is a bigger problem, as docker-compose
didn't really have that much day 2 operations.

Instead, I will have to rely on a semi-automated talos setup for automatically
creating vm images, and doing cluster failovers for maximum uptime and comfort.

# Conclusion

I've designed a future setup which will move things into kubernetes to remove a
lot of manual tasks. I will still need to develop tooling for handling
kubernetes and the various painpoints around it, as well as think up new
solutions for the last manual tasks. Some may move into kubernetes operators,
others into either chatops or clis.