This whitepaper covers the main challenges of managing Kubernetes infrastructure and gives technical tips and best practices for gaining visibility into a K8s environment, optimizing costs, overcoming misconfiguration issues, improving Kubernetes performance and calculating unit economics for Kubernetes clusters.
FinOps enthusiast & CEO at Hystax
The advantages of Kubernetes infrastructure, such as portability and scalability, its open-source base and its ability to increase developer productivity, have made container technologies a popular choice for many companies, and Kubernetes has become the standard for running container-based apps across clouds. More than 80% of companies today run containers in production, and 78% of them use Kubernetes services.
As containerized infrastructure gains widespread adoption and Kubernetes technologies gain momentum, it is becoming crucial to understand how to get a clear picture of spending on K8s resources, act on cost optimization opportunities and enhance Kubernetes performance.
In practice, simply using Kubernetes is not enough to get the best value from public clouds. According to a recent StackRox report, about 70% of companies detected misconfiguration in their Kubernetes environment.
A containerized structure creates significant difficulties with cloud transparency, allocation and performance that cause challenges in resource management and optimization.
This whitepaper goes over the main challenges of managing Kubernetes performance and offers recommendations and technical tips for making K8s clusters transparent and overcoming cost management and optimization issues.
It will help you build a solid management strategy for the Kubernetes environment, make a giant leap forward in improving application performance and reduce infrastructure costs.
There are two main approaches to virtualization: virtual machines (VMs) and virtual containers, each with its own advantages and disadvantages. With virtual machines, every machine runs its own guest OS, which makes it possible to create heterogeneous computing environments on one physical server. Virtual containers, in turn, have only a user environment instead of a full OS, so only homogeneous computing environments can be created.
However, since virtual machines include an operating system, they can be as large as several gigabytes, which substantially limits their usefulness in today’s agile world. Another disadvantage of virtual machines is that loading the OS and its applications takes much more time.
Containers are lighter, generally measured in megabytes, and, more importantly, they make applications easier to scale and deploy. This creates an entirely new infrastructure ecosystem, which brings new challenges and complexities. IT companies, both large corporations and start-ups, deploy thousands of container instances every day and must somehow manage this overwhelming number.
Kubernetes, a container orchestration platform for deploying, scaling, and managing containerized applications, was designed as a solution to this problem. Over time, K8s has essentially become the industry standard platform and the flagship project of the Cloud Native Computing Foundation, supported by market leaders: Google, Amazon, Microsoft, IBM, Alibaba Cloud and many others.
Owing to its advantages, Kubernetes keeps gaining popularity, but at the same time it makes cloud cost tracking and financial management significantly more difficult.
Before the widespread adoption of containerization technology, cloud resource allocation and cloud usage optimization were much easier. All that was required was attributing specific resources to a specific project or department. It was not difficult for a FinOps team, if one was part of your IT team, to come up with a cloud cost breakdown and put together a cloud cost optimization strategy. If you would like to learn more about the role of FinOps and the potential benefits it can provide to a business, please refer to the ebook “From FinOps to proven cloud cost management & optimization strategies”.
Unfortunately, this approach does not work for containerization tools in general and Kubernetes in particular. Why are Kubernetes costs so difficult to define and analyze?
The difficulty in tracking Kubernetes costs stems from its architecture. At the core of Kubernetes is a cluster that consists of numerous virtual (or physical) machines, called nodes. Containers, which hold the actual applications, are deployed and launched on those nodes.
Now let’s say that several departments in your company are working on various applications that run inside containers and, as it happens, share common Kubernetes clusters. It is almost impossible to determine which of the launched applications uses which part of the resources of which clusters, because each of the applications is launched on several containers at the same time. While calculating the cost of one container is possible and not too difficult in itself, it still takes infrastructure and time, and the complexity grows in proportion to the number of containers used.
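To make the per-container calculation more concrete, here is a minimal sketch that splits one node’s hourly price among the applications running on it, in proportion to the CPU and memory their containers request. The 50/50 weighting between CPU and memory, the field names and the figures in the example are assumptions made for illustration; they are not prescribed by Kubernetes or by any cloud provider.

```python
from dataclasses import dataclass


@dataclass
class ContainerRequest:
    """Resources requested by one container (hypothetical structure)."""
    app: str
    cpu_cores: float
    memory_gib: float


def allocate_node_cost(node_hourly_cost, node_cpu_cores, node_memory_gib,
                       containers, cpu_weight=0.5):
    """Split a node's hourly cost among apps by requested CPU and memory.

    cpu_weight is an assumed share of the node price attributed to CPU;
    the rest is attributed to memory. Capacity that no container
    requested is reported under the "idle" key.
    """
    cpu_cost = node_hourly_cost * cpu_weight
    mem_cost = node_hourly_cost * (1.0 - cpu_weight)
    costs = {}
    for c in containers:
        share = (cpu_cost * c.cpu_cores / node_cpu_cores
                 + mem_cost * c.memory_gib / node_memory_gib)
        # One app usually runs in several containers, so sum them up.
        costs[c.app] = costs.get(c.app, 0.0) + share
    allocated = sum(costs.values())
    costs["idle"] = node_hourly_cost - allocated
    return costs


# Example with made-up numbers: a 4-core, 16 GiB node at $0.40 per hour.
example = allocate_node_cost(
    0.40, 4, 16,
    [ContainerRequest("billing", 1.0, 4.0),
     ContainerRequest("search", 2.0, 4.0),
     ContainerRequest("billing", 0.5, 2.0)],
)
```

Even this simplified version has to be repeated for every node, in every cluster, for every hour, which is why the effort grows with the number of containers.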
Until now, we have considered the situation within a single cloud. But what if your company, like most modern IT organizations, uses multicloud? In this case, the cost monitoring effort grows many times over: each cloud can have a different service provider and take on only part of the total workload, because Kubernetes can work with AWS, Microsoft Azure, Google Cloud Platform, Alibaba Cloud and many others.
In addition, the resource intensity of each of your applications can change over time, which makes calculating costs even harder. The easy-to-use VPA (Vertical Pod Autoscaler) and HPA (Horizontal Pod Autoscaler) tools, which automatically adjust, respectively, the CPU and memory requests of a container and the number of pod replicas, become additional hard-to-factor variables when you try to track and manage current Kubernetes costs and, of course, predict future ones.
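To see why autoscaling makes costs hard to predict, consider the scaling rule described in the Kubernetes documentation for the Horizontal Pod Autoscaler: the desired replica count is the current count multiplied by the ratio of the observed metric to its target, rounded up. The sketch below is a simplified illustration of that rule only; it ignores stabilization windows, pod readiness and missing metrics, and the function names, replica bounds and price per replica are hypothetical.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10, tolerance=0.1):
    """Simplified HPA rule: ceil(current_replicas * current / target).

    No scaling happens while the ratio stays within the tolerance
    (0.1 is the documented default). The min/max bounds used here are
    illustrative defaults, not values from the whitepaper.
    """
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    wanted = math.ceil(current_replicas * ratio)
    return max(min_replicas, min(max_replicas, wanted))


def hourly_cost(replicas, cost_per_replica_hour):
    """Hourly cost of a workload, assuming a flat price per replica."""
    return replicas * cost_per_replica_hour


# CPU utilization rises from the 60% target to 90%: 4 replicas become 6,
# so the hourly cost of this workload grows by half without anyone
# changing a single setting.
before = hourly_cost(4, 0.05)
after = hourly_cost(desired_replicas(4, 90, 60), 0.05)
```

Because the replica count follows the load, the bill for a workload is a function of traffic rather than a fixed figure, which is exactly what complicates forecasting.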
Another problem is that a container’s lifespan can be as short as a day, while functions run on Kubernetes may live for only minutes or even seconds. Again, such dynamism is a definite plus from an IT engineer’s point of view, but it becomes a headache when it comes to cost tracking and management.
As we already said, it is very difficult to track and manage Kubernetes costs, but there are numerous techniques that can help you analyze those costs and eventually bring them under control.
Most probably, you’re familiar with cloud resource tagging. In Kubernetes, labels are used instead of tags. If you and your engineers take care to label the resources you use, it will greatly simplify finding and identifying them in the future.
It is important to be smart about this process so that you can later break down your resources by various relevant parameters. Successful implementation of such a strategy will require the active participation of all IT team members; otherwise, this initially good idea can lead to even more confusion.
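As a small illustration of what consistent labeling buys you, the sketch below groups per-pod costs by the value of a chosen label. The pod records, the label keys (`team`, `env`) and the cost figures are hypothetical; in practice the labels would come from the cluster and the costs from your billing data.

```python
from collections import defaultdict


def cost_by_label(pods, label_key):
    """Sum hourly pod costs per value of the given label.

    Pods that lack the label end up in the "unlabeled" bucket, which
    shows how much spending cannot be attributed to anyone.
    """
    totals = defaultdict(float)
    for pod in pods:
        value = pod.get("labels", {}).get(label_key, "unlabeled")
        totals[value] += pod["hourly_cost"]
    return dict(totals)


# Hypothetical pods; the last one was deployed without labels.
pods = [
    {"name": "api-1", "labels": {"team": "payments", "env": "prod"}, "hourly_cost": 0.25},
    {"name": "api-2", "labels": {"team": "payments", "env": "prod"}, "hourly_cost": 0.25},
    {"name": "crawler", "labels": {"team": "search", "env": "dev"}, "hourly_cost": 0.5},
    {"name": "tmp-job", "labels": {}, "hourly_cost": 0.125},
]
per_team = cost_by_label(pods, "team")
per_env = cost_by_label(pods, "env")
```

The same data can be broken down by team, environment or any other label, but only if everyone applies the labels in the same way; the size of the "unlabeled" bucket is a quick measure of how well the team follows the convention.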
Open-source monitoring systems like Prometheus can be a great help in visualizing your costs. And competent visualization, in turn, is a giant leap on the way to competent analysis.
Autoscaling is a killer feature of Kubernetes that makes workloads easy to manage. We have already mentioned two autoscalers, Vertical Pod Autoscaler and Horizontal Pod Autoscaler, but two more are available: Kubernetes Event-driven Autoscaling (KEDA) and Cluster Autoscaler. The former scales any container in Kubernetes based on the number of events that need to be processed, while the latter handles autoscaling at the cluster and node level. That said, making them work together properly is quite a challenge: you should not only stick to the numerous best practices, but also fine-tune the settings for your scenarios.
The cost of Kubernetes directly depends on how well the cloud instances are selected. It is important to ensure that Kubernetes pods’ resource consumption matches the amount of allocated memory and compute resources of the instance, regardless of which cloud provider is used.
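One simple way to picture this matching is to add up what the pods request and pick the cheapest instance type that can hold them. The sketch below does just that under strong simplifying assumptions: a single node, no capacity reserved for the system, and a made-up instance catalog whose names and prices do not correspond to any real cloud offering.

```python
def cheapest_fitting_instance(pod_requests, instance_types):
    """Return the cheapest instance type that fits all pod requests.

    Assumes all pods land on one node and ignores system-reserved
    resources. Returns None if no instance type is large enough.
    """
    total_cpu = sum(p["cpu"] for p in pod_requests)
    total_mem = sum(p["memory_gib"] for p in pod_requests)
    candidates = [
        i for i in instance_types
        if i["cpu"] >= total_cpu and i["memory_gib"] >= total_mem
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda i: i["hourly_cost"])


# Hypothetical catalog and workload: the pods need 3 cores and 10 GiB.
catalog = [
    {"name": "small", "cpu": 2, "memory_gib": 8, "hourly_cost": 0.10},
    {"name": "medium", "cpu": 4, "memory_gib": 16, "hourly_cost": 0.20},
    {"name": "memory-heavy", "cpu": 4, "memory_gib": 32, "hourly_cost": 0.26},
    {"name": "large", "cpu": 8, "memory_gib": 32, "hourly_cost": 0.40},
]
pods = [
    {"cpu": 1.0, "memory_gib": 4.0},
    {"cpu": 1.5, "memory_gib": 4.0},
    {"cpu": 0.5, "memory_gib": 2.0},
]
choice = cheapest_fitting_instance(pods, catalog)
```

In this example every instance type from "medium" upwards would run the pods, but anything larger than "medium" means paying for capacity that is never requested.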
Underutilized and unused resources are among the first places to look for direct losses and room for optimization. IT specialists tend to prefer performance over resource optimization, so they often overprovision resources. Tempting as it is to drop all idle capacity immediately, this must be done wisely, so as not to remove anything that turns out to be important, which leads to the next point.
FinOps can help solve several problems at once. Firstly, it gives technical specialists more responsibility for the financial performance of the company as a whole, and for its spending on cloud resources in particular. Secondly, it can become the missing link that monitors and manages Kubernetes costs on a daily basis, so that resources are scaled only when it is really needed.
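A first pass at finding such resources can be as simple as comparing what each workload actually uses with what it requests. The sketch below flags workloads whose CPU and memory utilization both fall below a threshold; the 30% threshold, the field names and the sample figures are assumptions for illustration. Note that it only produces candidates for review and does not remove anything.

```python
def find_underutilized(workloads, threshold=0.3):
    """Return names of workloads that use little of what they request.

    A workload is flagged only if BOTH its CPU and memory utilization
    are below the threshold, to avoid flagging a service that is light
    on CPU but genuinely memory-bound (or the other way round).
    """
    candidates = []
    for w in workloads:
        if w["cpu_requested"] <= 0 or w["memory_requested_gib"] <= 0:
            continue  # no request set, so utilization cannot be computed
        cpu_util = w["cpu_used"] / w["cpu_requested"]
        mem_util = w["memory_used_gib"] / w["memory_requested_gib"]
        if cpu_util < threshold and mem_util < threshold:
            candidates.append(w["name"])
    return candidates


# Hypothetical usage figures, e.g. averaged over the last two weeks.
workloads = [
    {"name": "report-generator", "cpu_requested": 1.0, "cpu_used": 0.1,
     "memory_requested_gib": 4.0, "memory_used_gib": 0.5},
    {"name": "checkout-api", "cpu_requested": 1.0, "cpu_used": 0.8,
     "memory_requested_gib": 4.0, "memory_used_gib": 3.0},
    {"name": "cache", "cpu_requested": 1.0, "cpu_used": 0.1,
     "memory_requested_gib": 4.0, "memory_used_gib": 3.5},
]
to_review = find_underutilized(workloads)
```

Here only "report-generator" is flagged. A workload that looks idle on average may still be needed for a monthly peak, so the list is a starting point for a conversation with its owners rather than a deletion queue.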
Since more and more organizations are expanding the usage of container orchestrators, and Kubernetes is becoming a popular choice for many companies, it’s important to understand how to provide full transparency across K8s resources for achieving cost optimization goals and performance improvements.
The advantages of container technologies, such as portability, scalability and their open-source base, have made Kubernetes the standard for running container-based apps across clouds.
Luckily, cloud platforms provide support and help companies of any size to adopt Kubernetes technology. Here is a list of services provided by the major cloud platforms:
Despite the advantages of Kubernetes, a containerized structure creates challenges with cloud cost transparency and allocation that cause significant difficulties in resource management and optimization.
Effective cloud management needs cost visibility; it’s crucial to identify organizational units such as applications, cloud services, asset pools, business units, teams and individual engineers, and to map them onto cloud costs.