In software there always seems to be more to do than time to do it. This is particularly true for internal platform teams, which are often understaffed, and it is why I previously spoke about how self service can substantially reduce the load on a small team. But aspiring to self service is not enough. Let’s talk about the realities of trying to implement it.
I would define self service as:
The ability for someone to complete a task without support based on a reasonable expectation of skill and knowledge.
The real trick is deciding what counts as a reasonable expectation of skill and knowledge when we are talking about self service internal platforms.
In my experience, platform engineers can be described in the same way as application engineers building any other product. As a platform engineering team, we had customers to serve (internal application teams), feature road maps, and support rotas. With all the same patterns come all the same pitfalls. One such pitfall is assuming that you, as the team, understand the user experience and user needs without research and close collaboration. The closer a team believes it is to its users, the more likely it is to fall into this trap. When a team builds its product based on its own experiences rather than on research with its users, it is likely to build the wrong abstractions and user journeys.
On one past platform team, this manifested as a Terraform repository, owned by the platform team, with a set of modules that exposed infrastructure as a service. The interface for our users, the application developers, was to make a pull request to the Terraform repository based on a set of instructions we maintained and on examples already in the repository. On another team, the platform engineers managed a set of Helm and Kubernetes manifests that lived in each application repository and defined key infrastructure concerns. To update these, the application teams needed to either learn the syntax and intention of the files or ask for platform support.
Being on these platform teams was extremely painful. The platform team started to build resentment because a lot of work had gone into documenting and exposing these interfaces, yet the application teams continued to require significant support. And when the teams did use the instructions, there were often nuanced differences between implementations that made the code harder to maintain.
But I have also been on the application side of this experience, and that too was painful. I saw some teammates wade into learning a new language and tool chain despite pressure on the team to focus on work more closely aligned with its own goals. I saw others make architectural decisions specifically to avoid using these interfaces, such as appending a feature to an existing, though unrelated, service rather than creating a new domain service.
The problem in these experiences was not the intention or even the implementation of the platform code. It was that the platform engineers viewed themselves as more than just the creators. They also viewed themselves as representative core users.
Instead, we as the platform team needed to focus on defining, researching, and collaborating with our actual core users, the application teams. When this happens it becomes obvious that, as with any product, the platform requires an interface that separates implementation from user experience. This interface requires a well-designed API that can power a UI (e.g. Backstage) or a CLI experience if the users prefer. Kratix Promises enable the platform team to leverage Kubernetes Custom Resources as their API, shifting the heavy lifting of hosting and managing CRUD actions to Kratix and the Kubernetes API while abstracting any implementation details (and possible changes!) away from the platform users.
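To make the idea of an interface that separates implementation from user experience a little more concrete, here is a minimal sketch in Python. It is not Kratix code and does not use the Kratix or Kubernetes APIs. The `DatabaseRequest` type, the `example.platform.internal/v1` API group, the `Database` kind, and the size-to-storage mapping are all hypothetical names invented for illustration. The point is only that the user fills in a few fields they can reasonably be expected to understand, while the platform team owns everything behind them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatabaseRequest:
    """What an application team provides (hypothetical fields)."""
    team: str
    name: str
    size: str = "small"  # "small" or "large"


# Platform-owned implementation detail (hypothetical values). The platform
# team can change this mapping without changing the user-facing request.
_SIZE_TO_STORAGE_GB = {"small": 10, "large": 100}


def to_custom_resource(request: DatabaseRequest) -> dict:
    """Build a Kubernetes-style custom resource from the user's request.

    Only user-facing fields appear in the spec; no implementation
    details leak into the API object.
    """
    if request.size not in _SIZE_TO_STORAGE_GB:
        raise ValueError(f"unknown size: {request.size!r}")
    return {
        "apiVersion": "example.platform.internal/v1",  # hypothetical group
        "kind": "Database",  # hypothetical kind
        "metadata": {"name": request.name, "namespace": request.team},
        "spec": {"size": request.size},
    }


def plan_infrastructure(resource: dict) -> dict:
    """Platform-side translation of the API object into concrete settings."""
    size = resource["spec"]["size"]
    return {
        "instance_name": f'{resource["metadata"]["namespace"]}-{resource["metadata"]["name"]}',
        "storage_gb": _SIZE_TO_STORAGE_GB[size],
    }
```

An application team would only ever see something like `DatabaseRequest(team="payments", name="orders-db")`, whether it arrives through a UI form, a CLI flag, or a YAML file. What `plan_infrastructure` does with it stays the platform team's concern.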
This blog is the ninth in our series, The 12 Platform Challenges of Christmas. Check back daily until January 5th, 2023 for new posts!