kasperhermansen-blog/content/posts/2024-09-24-your-companys-superpower.md
kjuulh 4bb6b0228a
feat: add blog contents
2025-07-31 11:01:22 +02:00


---
type: blog-post
title: "Your Company's Superpower: Platform Engineering"
description: Platform Engineering basically has two angles in the industry, either the hype for it is overwhelming because it is the new hot thing on the block, or it is underestimated because it just looks like operations doing software engineering. I aim with this piece of content to define why you should care about Platform Engineering.
draft: false
date: 2024-09-25
updates:
- time: 2024-09-25
description: first iteration
tags:
- "#blog"
---
Platform Engineering tends to be viewed in one of two ways in the industry:
either the hype for it is overwhelming because it is the new hot thing on the
block, or it is underestimated because it just looks like operations doing
software engineering. With this post I aim to explain why you should care about
Platform Engineering.
## Platform Engineering defined
Platform Engineering, true to its name, is the practice of engineering a
platform. Often as Software Engineers we forget what the Engineering in our
titles actually means. Engineering means building the right thing, reliably and
securely. That much we understand, but the platform is harder to define. When we
think of a Platform for a company, it is the foundation that other users within
the company build on. An important detail here is that Platform Engineering is
not just for developers; it can be for business users, data analysts, operations
people, etc. It is basically a vehicle for making someone else's job easier
within the company.
Examples of a platform could be a specific individual or team owning the
deployment pipeline that ships software into production, the metrics solution
that is used across the company, or the tool analysts use when they want to
query the company's data.
These might sound mundane, and you may object: hey, I could just set up GitHub
Actions, and boom, my service is now going to production. You would be correct.
Platform Engineering, however, takes a holistic view of the company's portfolio
of tools and makes decisions based on that. Often tooling starts at the
grassroots: a developer is missing a feature, they go and implement said
feature, and they now have a pipeline to build and deploy their code. But so do
the other 15 teams in the company, and now you've got 15 bespoke solutions.
A next step would be a DevOps department acting as consultants to the different
teams to achieve some homogeneity.
A Platform Engineering team would take those requirements (we need a build and
deployment capability across the company) and come up with as simple a solution
as possible that captures the largest amount of complexity. The team might end
up building a complete build pipeline, but only for Golang services, and leave
the complexity in place for Python services or what have you. The Platform
Engineering team would then treat the pipeline as a product: handle user
feedback, conduct user interviews, track adoption data, make sure the right
features are available, etc.
The end result is a dedicated team that captures part of the complexity for 15
teams and, if the product is good enough, all of it, i.e. the friction between
the development teams and the platform team is minimal. If you extrapolate this
mindset, you can go from:

Here is an example of the scope of complexity a normal organisation might have
in its internal software products, without dedicated stewardship:

5 programming languages, 30 pipelines, 5 types of software libraries, 10
libraries per language, 30 types of deployment, 3 clouds

`5 + 30 + 5 * 10 + 30 + 3 = 118` products spread out over the organisation

Versus a stewarded Platform:

1 programming language, 1 pipeline, 10 libraries, 1 deployment, 1 cloud

`1 + 1 + 10 + 1 + 1 = 14` products dedicated to a specialized team

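
If you want to play with the numbers for your own organisation, the two
expressions above can be reproduced with a few lines of Python. This is only a
sketch: the function name `product_count` and its parameters are made up for
illustration, and it assumes the `5 * 10` term means languages multiplied by
libraries per language.

```python
def product_count(languages, pipelines, libraries_per_language, deployments, clouds):
    """Count the internal products an organisation has to maintain.

    Assumption: the number of libraries scales with the number of languages,
    which is how the `5 * 10` term in the expression is read here.
    """
    libraries = languages * libraries_per_language
    return languages + pipelines + libraries + deployments + clouds


# Without dedicated stewardship.
unstewarded = product_count(
    languages=5, pipelines=30, libraries_per_language=10, deployments=30, clouds=3
)

# With a stewarded platform.
stewarded = product_count(
    languages=1, pipelines=1, libraries_per_language=10, deployments=1, clouds=1
)

print(unstewarded, stewarded)
```

Evaluated as written, the first expression comes out at 118 and the second at
14, so under this reading the stewarded platform has roughly an eighth of the
surface area.
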
You might say that it is unrealistic to succeed with a single programming
language, or 1 pipeline, but it can be done. It can be a long journey to get
there, and you may not want to go that far if you've got enough people to
maintain the software. But in general, the goal of Platform Engineering in such
an organisation is to move complexity out of feature teams to let them focus on
what they're best at: building features.
As is often the case, Platform Engineers are basically Software Engineers
working in a specialized field, with specialized tooling, so it is more
approachable to tackle familiar problems, i.e. building out a deployment
strategy for Kubernetes, AWS, whatever. But it can also be so much more. How do
SQL analysts interface with their tools? What is slowing them down? Do they
achieve the quality they want from the products they rely on? Are their
workflows as efficient and tight as they can be? Often this isn't the case. In
the same vein as a software engineer hacking together a pipeline, an analyst
might cook up a workflow that is borderline masochism. As Platform Engineers
we've got the knowledge and tools to help shape those workflows and tools to fit
the needs of our users.
## Platform Engineering doesn't mean invented here
Platform Engineering can be taken to its extremes, where we basically build all
the tools from scratch, define all workflows and templates by hand, and rely on
a massive team to support said complexity. But that shouldn't be our first
approach. One of the most interesting things about Platform Engineering is the
creativity it invites. Hey, I need to build a build pipeline, what tools do I
have available, and how can I turn this into a good product and abstraction for
my users? Do I really want to provide a small layer of abstraction on top of
GitHub Actions, Jenkins, etc.? Or will we build a turnkey solution that
basically builds our company's version of what a Golang service looks like?
Are we supposed to build the entire build pipeline software ourselves, or can we
leverage open-source offerings or SaaS solutions and provide a small
opinionated layer on top to make it a product internally? That is really the
goal of Platform Engineering: to think creatively about problems, such that we
can build the most reliable thing, with the lowest complexity, in the most
secure manner.
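
To make the "small opinionated layer" idea a bit more concrete, here is a
minimal sketch in Python. Everything in it is hypothetical (`ServiceSpec`,
`DEFAULT_STEPS` and `render_pipeline` are names invented for this example, not
part of any real tool): a feature team only declares a service name, and the
platform layer expands that into the steps an underlying CI system would run.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceSpec:
    """The only thing a feature team has to declare (hypothetical shape)."""

    name: str
    language: str = "go"


# Opinionated defaults owned by the platform team, keyed by language.
# Only Go is supported in this sketch; other languages keep their own setup.
DEFAULT_STEPS = {
    "go": ["go vet ./...", "go test ./...", "go build ./..."],
}


def render_pipeline(spec: ServiceSpec) -> dict:
    """Expand a minimal spec into a full pipeline definition."""
    if spec.language not in DEFAULT_STEPS:
        raise ValueError(f"{spec.language!r} is not supported by the platform yet")
    return {
        "service": spec.name,
        # Copy the list so callers cannot mutate the shared defaults.
        "steps": list(DEFAULT_STEPS[spec.language]),
    }


pipeline = render_pipeline(ServiceSpec(name="billing"))
print(pipeline["steps"])
```

The point of the design is that the returned structure could be translated to
whatever CI system sits underneath, while the feature team only ever sees the
`ServiceSpec`.
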
## Platform Engineering: A Superpower
I gotta make up for the title, so how is Platform Engineering a superpower?
Let's say a new security requirement comes down: you now need to calculate
software bills of materials across all of your software, because you want to
sell your services to an organisation that requires that level of security. You
can basically measure your proficiency in Platform Engineering by how well
you're able to execute on cross-cutting concerns.
If you had 30 pipelines for 5 languages, you'd basically need to copy-paste and
modify the same tooling for producing software bills of materials into each and
every pipeline, each of which may run on a different type of CI system and have
a different level of compatibility with the language in question.
If, however, you had 1 language and 1 pipeline across the entire fleet of
services, you could basically build the feature in an afternoon, append it to
the build pipeline, track all the builds and check that all required artifacts
were produced. With Platform Engineering you've basically transformed a
challenging multi-month effort into an afternoon project.
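
The "append it to the build pipeline and check the artifacts" workflow can be
sketched in a few lines of Python. This is a toy model, not a real CI system:
the step functions, the service names and the `.sbom.json` file naming are all
assumptions made for the example, and a real `generate_sbom` step would call an
actual SBOM tool instead of returning a file name.

```python
from typing import Callable

# A step takes a service name and returns the artifacts it produced.
Step = Callable[[str], list[str]]


def build_binary(service: str) -> list[str]:
    return [f"{service}.bin"]


def generate_sbom(service: str) -> list[str]:
    # Placeholder: a real step would invoke an SBOM tool here.
    return [f"{service}.sbom.json"]


def run_pipeline(steps: list[Step], service: str) -> list[str]:
    """Run every step for one service and collect all artifacts."""
    artifacts: list[str] = []
    for step in steps:
        artifacts.extend(step(service))
    return artifacts


def services_missing_sbom(steps: list[Step], services: list[str]) -> list[str]:
    """Track the builds and report which services lack an SBOM artifact."""
    return [s for s in services if f"{s}.sbom.json" not in run_pipeline(steps, s)]


SERVICES = ["billing", "search", "checkout"]  # hypothetical fleet

pipeline: list[Step] = [build_binary]  # the one shared pipeline
before = services_missing_sbom(pipeline, SERVICES)

pipeline.append(generate_sbom)  # the afternoon project
after = services_missing_sbom(pipeline, SERVICES)

print(before, after)
```

Because every service runs the same list of steps, a single `append` changes
the whole fleet, and the same loop that runs the builds can confirm that no
service is left without its bill of materials.
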
With the first approach, developer teams would also have to own the changes,
because the pipelines are theirs. How difficult would it be to prioritise that
across 30 teams? From experience there are always a handful that are absolutely
strapped for time, so you probably wouldn't make the deadline. The second
approach is completely automated; teams wouldn't even need to know their
pipelines are producing bills of materials. The same could be said for signing
artifacts, producing artifacts for other architectures, swapping out library
internals and whatnot.
It can be extremely fulfilling work to build a product that can basically
bootstrap a service from scratch to production in minutes, without any handoff
to other teams or complex manuals to follow before being allowed into
production. As a Platform Engineering team, we can offload a lot of requirements
from teams, such that they can focus on delivering value for our paying
customers, and in turn make the organisation much more nimble, scalable and so
on.
## Platform Engineering is a double-edged sword
If your Platform Engineering team isn't able to build whole abstractions, which
by the way are extremely difficult to build, they will have a lot of maintenance
on their hands. And if they aren't able to fulfill the requirements of their
customers (the developers, analysts and so on), you might end up with a mutiny
on your hands: people simply going rogue, because the complexity of the platform
has increased to an almost absurd level.
This can happen, and it needs to be carefully considered. As engineers we should
continuously move products into maintenance mode, and pick them up again once
offerings become available via open source, or once proprietary software
catches up with our own abstractions.
You might build a state-of-the-art build system one year, let it run for 5, and
suddenly it is clunky to work with because it has been in maintenance mode for
years. But you may discover that an open-source tool has matured and is able to
fill some of the requirements you had. Because you're fully in control of the
platform, you might even be able to swap out the "engine" for a less complex
one and free up some maintenance budget for your team.
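
Swapping the "engine" is only cheap if the platform hides it behind a stable
interface. Below is a minimal sketch of that idea in Python, using a
`typing.Protocol`. The class names (`BuildEngine`, `InHouseEngine`,
`OpenSourceEngine`, `Platform`) are hypothetical and the engines just return
strings, so treat it as an illustration of the shape rather than a real build
system.

```python
from typing import Protocol


class BuildEngine(Protocol):
    """The contract the platform depends on, not a concrete tool."""

    def build(self, service: str) -> str: ...


class InHouseEngine:
    """Stand-in for the build system written years ago."""

    def build(self, service: str) -> str:
        return f"in-house:{service}"


class OpenSourceEngine:
    """Stand-in for a matured open-source replacement."""

    def build(self, service: str) -> str:
        return f"oss:{service}"


class Platform:
    """What feature teams interact with; the engine is an internal detail."""

    def __init__(self, engine: BuildEngine) -> None:
        self._engine = engine

    def deploy(self, service: str) -> str:
        artifact = self._engine.build(service)
        return f"deployed {artifact}"


old = Platform(InHouseEngine()).deploy("billing")
new = Platform(OpenSourceEngine()).deploy("billing")
print(old, new, sep="\n")
```

Feature teams call `deploy` in both cases, so the swap is a change in one
constructor argument on the platform side rather than a migration for every
team.
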
I'll discuss when you should make those decisions and what they look like in
another piece of content.
## Teams require help to be in control of their software
At the end of the day, feature teams should still be fully responsible for their
products, including the operations. So as a Platform Team you've got to
carefully consider how you allow them to be in control. Do you send back all the
right signals for them to be in control? Do they know how many applications
they're running, and how much CPU and memory they're using? What is their SQL
latency, when did they deploy, which version is deployed, how are their gRPC
latencies, and what is the HTTP error rate?
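
One way to think about those signals is as a small data model that the platform
fills in and hands back to each team. The sketch below is in Python and is
purely illustrative: the `ServiceSignals` fields and the `team_summary` function
are invented for this example and only cover a subset of the questions above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceSignals:
    """Signals the platform reports back for one service (hypothetical shape)."""

    service: str
    deployed_version: str
    cpu_cores: float
    memory_mib: int
    http_requests: int
    http_errors: int

    @property
    def http_error_rate(self) -> float:
        # Avoid dividing by zero for services that received no traffic.
        if self.http_requests == 0:
            return 0.0
        return self.http_errors / self.http_requests


def team_summary(signals: list[ServiceSignals]) -> dict:
    """Answer "what are we running and how is it doing" for one team."""
    return {
        "applications": len(signals),
        "cpu_cores": sum(s.cpu_cores for s in signals),
        "memory_mib": sum(s.memory_mib for s in signals),
        "worst_http_error_rate": max(
            (s.http_error_rate for s in signals), default=0.0
        ),
    }


summary = team_summary(
    [
        ServiceSignals("billing", "v1.4.0", 0.5, 256, 1000, 5),
        ServiceSignals("invoices", "v0.9.2", 1.5, 512, 200, 0),
    ]
)
print(summary)
```
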
Again, you can treat these as products. It doesn't have to be fancy, but it
takes a long time to get right, and as you build abstractions on top of
products, you'll continuously see that as services demand more of the platform,
more and more of the internals need to be exposed, such that the feature teams
can engineer their services to the implicit requirements of the platform. I.e.
they might need to tune memory settings, connection pools, HTTP authentication,
and which ports are open to match what the platform expects.
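
A cheap way to turn implicit requirements into explicit ones is to have the
platform check a service's settings and explain what doesn't fit. The Python
sketch below assumes made-up limits and field names (`PlatformLimits`,
`ServiceConfig`, `violations`); the actual numbers and settings would depend
entirely on your platform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlatformLimits:
    """What the platform expects of a service (illustrative values)."""

    max_memory_mib: int = 512
    max_db_connections: int = 20
    allowed_ports: frozenset = frozenset({8080, 9090})


@dataclass(frozen=True)
class ServiceConfig:
    """Settings a feature team can tune (hypothetical shape)."""

    name: str
    memory_mib: int
    db_connections: int
    ports: tuple


def violations(config: ServiceConfig, limits: PlatformLimits = PlatformLimits()) -> list[str]:
    """Return human-readable reasons why a config doesn't fit the platform."""
    problems = []
    if config.memory_mib > limits.max_memory_mib:
        problems.append(
            f"memory {config.memory_mib} MiB exceeds {limits.max_memory_mib} MiB"
        )
    if config.db_connections > limits.max_db_connections:
        problems.append(
            f"{config.db_connections} DB connections exceeds {limits.max_db_connections}"
        )
    unexpected = sorted(set(config.ports) - limits.allowed_ports)
    if unexpected:
        problems.append(f"ports not allowed: {unexpected}")
    return problems


ok = violations(ServiceConfig("billing", 256, 10, (8080,)))
bad = violations(ServiceConfig("search", 2048, 50, (8080, 22)))
print(ok)
print(bad)
```

Returning reasons instead of a plain pass or fail keeps the feature team in
control: they can see which expectation they ran into and tune their service
accordingly.
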