Real Value of DevOps
No matter how hard you work on implementing DevOps practices in your organization, questions always arise about the actual bottom-line value they bring. This is especially true in larger organizations, where it’s easier to coast along without anyone noticing whether your work is actually making an impact.
Recently I have found myself several times having to explain DevOps to people both in and outside of the Tech industry: what value it brings, and whether it is just another trap to milk money from people and companies. So I have decided to make sense of my thoughts and put it all into writing, for future reference for myself and anyone else asking.
As always, let’s start from scratch - what is DevOps and why should we care?
DevOps is a way of thinking and working that enables people and teams to collaborate in an effective and sustainable way.
Many people think about DevOps as specific tools like Kubernetes or Docker, but tools alone are not DevOps; it’s rather how and why you use them.
In the DevOps/SRE field, it’s very important to show the value these practices bring, because that value is usually hidden or not obvious to the rest of the organization. At times it is so subtle that it’s taken for granted.
And here is where we get to the core of the issue - how to explain DevOps plainly for anyone to understand, ideally in one sentence?
Eliminating inefficiencies while making conscious and sensible trade-offs
Why does it matter to eliminate inefficiencies?
To put it bluntly: Inefficiencies == waste. And waste in any form is bad - whether it’s money, resources or time. Even if you have the budget, manpower and deadline buffer, being able to do more with less budget and manpower, in a shorter time span, is always better and will give you the crucial market advantage that every organization is after.
(Hidden) Inefficiencies
For the most part, you can go a long way without noticing all the opportunities to remove waste. Let’s look at several examples:
- That extra step that you have to take in order to promote your app from the DEV to the STG environment.
- The fact that you can’t test your new payment feature in production so you spend hours waiting for a real payment in order to verify.
- The fact that your automated regression suite is always “amber” and you have to check it every time to ensure that the flaky tests are not breaking anything important.
- That noisy alert that’s been waking you up too many times but you never get around to implementing a permanent solution.
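To make that kind of hidden waste visible, it can help to put a rough number on it. The snippet below is only a back-of-the-envelope sketch: the minutes and frequencies are made up for illustration, and `hours_lost_per_year` is a hypothetical helper, not something from a real tool.

```python
# All numbers are invented for illustration; replace them with your own measurements.
WEEKS_PER_YEAR = 52

# (description, minutes lost per occurrence, occurrences per week)
inefficiencies = [
    ("manual DEV to STG promotion step", 10, 15),
    ("re-checking the amber regression suite", 15, 10),
    ("noisy alert waking someone up", 30, 2),
]


def hours_lost_per_year(minutes_per_occurrence, occurrences_per_week):
    """Convert a small recurring cost into hours per year."""
    return minutes_per_occurrence * occurrences_per_week * WEEKS_PER_YEAR / 60


for description, minutes, per_week in inefficiencies:
    print(f"{description}: {hours_lost_per_year(minutes, per_week):.0f} hours/year")
```

Even with rough inputs, a yearly figure tends to be an easier conversation starter than "this step is a bit annoying".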
Depending on your maturity level, you will have different problems at different points in time, and that is normal. The important bit is to iterate on improvements in a way that elevates that maturity, ideally with a focus on the different pillars at the core of DevOps. There is some debate about how many pillars there are and which ones they are. My take is these 5 pillars:
- Culture
- Automation
- Lean (in every sense of that word)
- Measurement
- Sharing
Each of these has subtypes but let’s leave that for some other time.
The catch here is that as you reach higher levels of maturity, you will “unlock” new opportunities for improvement. You will also realise that some improvements are not worth doing until you reach a higher level of maturity. For this, I will reference a DevOps Bible - The Phoenix Project. There is a part where the main character is taken to a manufacturing plant and is shown how bottlenecks (a form of waste) are formed, along with the Theory of Constraints.
The Theory of Constraints
The theory goes something like this:
- Identify the constraint in your current situation
- Exploit the constraint (don’t allow for wait time)
- Subordinate other activities to the constraint (your primary focus should be on removing the constraint, not doing other things)
- Elevate the constraint to new levels
- Find the next constraint
Applying this theory, you can identify the bottleneck in your team/process/project and improve the flow of work. Actions taken without taking the bottleneck into account will likely bring only short-term improvements, or an illusion of improvement.
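The idea can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not a definitive implementation: the stage names and capacities (changes per day) are invented, each stage is assumed to have a single capacity number, and `find_constraint`, `throughput` and `improve` are hypothetical helpers.

```python
# Hypothetical delivery stages with invented capacities (changes per day).
pipeline = {"build": 20, "deploy": 15, "regression_tests": 4, "release": 10}


def find_constraint(stages):
    """Identify the constraint: the stage with the lowest capacity."""
    return min(stages, key=stages.get)


def throughput(stages):
    """End-to-end flow is capped by the slowest stage."""
    return min(stages.values())


def improve(stages, name, new_capacity):
    """Return a copy of the pipeline with one stage's capacity changed."""
    updated = dict(stages)
    updated[name] = new_capacity
    return updated


print(find_constraint(pipeline), throughput(pipeline))  # regression_tests 4

# Improving a stage that is not the constraint changes nothing end to end.
faster_build = improve(pipeline, "build", 40)
print(throughput(faster_build))  # 4

# Elevating the constraint raises the flow, and a new constraint appears.
better_tests = improve(pipeline, "regression_tests", 12)
print(find_constraint(better_tests), throughput(better_tests))  # release 10
```

In this toy model, doubling the build capacity is the "illusion of improvement": the number for that stage looks better, but nothing more gets through the system.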
To give an example, let’s say you have the most awesome build pipeline that allows you to deploy any branch on demand in isolation, but your regression tests are missing coverage of core functionality for your service.
In this case, you would probably want to focus on your testing (your bottleneck) instead of making more improvements to your build pipeline, which is already good enough for your team to move swiftly.
And with this example, we introduce another important type of inefficiency, the kind that comes from working on the wrong things:
Self-imposed Inefficiencies
Tech Debt is often talked about and used to justify many improvements. Some of these don’t make sense, since they focus on solving a problem that doesn’t yet exist and probably never will. In other cases, we (over)complicate our system with the justification that we are making it better (whatever that might mean in different situations).
An example of that would be something like this:
You have discovered the wonderful world of microservices and containers, so you decide to refactor your monolith into microservices. Off the bat, you build 10 new microservices to replace the old system and you migrate all of the users, only to realize that 10 was probably too many. You have ended up maintaining 10 different codebases and want to shoot yourself in the hand every time you have to make a change that involves more than 3 places. In such cases, it might be better to start slowly: build 1 service, delegate some functionality to it and have it work in parallel with the legacy monolith, see what can become a library that shares functionality between different services, and so on.
Another example would be that your team runs a web app for 100 users and you decide it’s time to scale, so you kick off an initiative to put it on Kubernetes, in 3 different AWS regions, because it’s considered a best practice. You have probably now tripled your cloud bill, yet your users don’t really notice anything.
Things have to be put into perspective, depending on where you are in your journey.
Making conscious and sensible trade-offs
This is a mouthful, but it goes hand in hand with the previous points about constraints and unjustified improvements. If you are working in a startup that is rushing to push features out of the door, it makes perfect sense to take some shortcuts and create tech debt on purpose. The trade-off you are getting is speed, and that matters when you are racing to a new market with a fresh idea and want to get some VC funding. However, this has to be a conscious choice - we are choosing to create tech debt for the sake of being faster. There has to be some strategy behind a decision that can be harmful in the future.
The same goes when choosing solution A vs B. There will be times when neither will be “the one”, as each of them will have its own pros and cons. At these crossroads you make a decision and accept the trade-offs of the solution you chose. This is an important part of DevOps, as there will be many times when you have to create tech debt consciously or make a decision that has some downside because you are gaining something in return that is, hopefully, more important strategically.
DevOps in the Real World
Even though the concept of DevOps originated in Tech, I notice that people who have a DevOps mindset tend to apply it not only at work, but in their personal lives as well. I even know some people who have never worked in Tech but have a very DevOps-oriented mindset.
And it all comes down to removing inefficiencies from their daily lives.
In my previous life, I worked a lot in hotels and restaurants as a waiter. In this context, I have seen many people doing inefficient work by making several round trips to the same table because they hadn’t thought through all the things they needed. Or going empty-handed on the way back to the kitchen, not scanning their surroundings for a potential quick cleanup of the adjacent table. That in itself is a minor-effort action, but when applied consistently throughout the day, the compound effect is noticeable. I regret that I didn’t wear one of those watches that count steps and compare my numbers with other people’s, just to measure it.
I have a bunch of other random examples, from how you clean the house and your strategy for shopping in the supermarket to any other facet of life.
If you are having a hard time selling DevOps in your surroundings, I hope that this rant has inspired you a bit and can help you advocate for less waste, fewer inefficiencies and more reasonable ways of working.