There are two main approaches to virtualization: virtual machines (VMs) and containers, each with advantages and disadvantages. Each virtual machine runs its own guest OS, which allows heterogeneous computing environments to coexist on one physical server. Containers, by contrast, package only a user-space environment and share the host’s kernel, which makes them suited to homogeneous computing environments.
However, since a virtual machine includes an entire operating system, its image can weigh several gigabytes, which substantially limits its usefulness in today’s agile world. Another disadvantage is that booting the guest OS and its applications takes considerably longer.
Containers are lighter and are generally measured in megabytes. What’s more, they make applications easier to scale and deploy. This creates an entirely new infrastructure ecosystem with new challenges and complexities: IT companies, from large corporations to start-ups, deploy thousands of container instances daily and must somehow manage that fleet.
Kubernetes, a container orchestration platform for deploying, scaling, and managing containerized applications, was designed to solve this problem. Over time, K8s has essentially become the industry standard platform and the flagship project of the Cloud Native Computing Foundation, supported by market leaders: Google, Amazon, Microsoft, IBM, Alibaba Cloud, and many others.
Kubernetes gained popularity due to several advantages, among which the following are particularly noteworthy: it’s scalable, cost-efficient, and cloud-agnostic. However, containerized tools have certain drawbacks, including the complexity of tracking cloud costs and managing finances. In this article, we’ll describe how to address this issue and why it is essential not to overlook it.
Why it’s hard to analyze Kubernetes costs
Before the widespread adoption of containerization technology, allocating cloud resources and optimizing cloud usage was much simpler. All it took was attributing specific resources to a specific project or department. A FinOps team, if one is part of your IT organization, could then produce a cloud cost breakdown and put together a cloud cost optimization strategy without much difficulty. If you would like to learn more about the role of FinOps and the potential benefits it can provide to businesses, please refer to this article.
Unfortunately, this approach is inapplicable to containerization tools in general and Kubernetes in particular. Why is Kubernetes’s cost so challenging to define and analyze?
The difficulty of tracking Kubernetes costs stems from its architecture. At its core is a cluster consisting of numerous virtual (or physical) machines called nodes. Containers running various applications are deployed and launched on those nodes.
Let’s say several departments in your company are working on various applications that run inside containers and, as it happens, share common Kubernetes clusters. Because each application runs in several containers simultaneously, it is very hard to determine which application consumes which share of which cluster’s resources. Calculating the cost of a single container is possible and not too tricky, but it still requires infrastructure and time, and the complexity grows with the number of containers in use.
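In spirit, the allocation problem boils down to splitting a shared cluster bill across teams. The sketch below does this proportionally to CPU-hours; the team names, figures, and the CPU-hours-only allocation key are illustrative assumptions, not a real billing method:

```python
# Hypothetical sketch: split a shared cluster bill across teams in
# proportion to each container's CPU-hours. Names and numbers are
# made up for illustration.

def allocate_cluster_cost(cluster_cost, containers):
    """containers: list of (team, cpu_hours) tuples, one per container."""
    total = sum(cpu for _, cpu in containers)
    costs = {}
    for team, cpu in containers:
        costs[team] = costs.get(team, 0.0) + cluster_cost * cpu / total
    return costs

usage = [
    ("payments", 120.0),   # team, CPU-hours consumed this month
    ("payments", 30.0),
    ("search",   50.0),
]
print(allocate_cluster_cost(1000.0, usage))
# {'payments': 750.0, 'search': 250.0}
```

The hard part in practice is not this arithmetic but collecting trustworthy per-container usage numbers in the first place.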
Until now, we have considered the situation within the frame of a single cloud. But what if your company, like most modern IT organizations, uses a multicloud (read about the benefits and best practices of a multicloud approach in our article)? In that case, the cost monitoring effort grows many times over: each cloud in the mix may come from a different provider and carry only part of the total workload, since Kubernetes runs on AWS, Microsoft Azure, Google Cloud Platform, Alibaba Cloud, and many others.
In addition, your applications’ resource intensity can change over time, which makes cost calculation even harder. The convenient VPA (Vertical Pod Autoscaler) and HPA (Horizontal Pod Autoscaler) tools, which automatically adjust a pod’s CPU and memory requests and limits and the number of pod replicas, respectively, become additional hard-to-factor variables when you try to track and manage current Kubernetes costs and, of course, predict future ones.
Another problem is that containers are extremely short-lived: a container often lives no longer than a day, and functions run on Kubernetes may live only minutes or even seconds. Such dynamism is a definite plus from an IT engineer’s point of view, but it becomes a headache when it comes to cost tracking and management.
Why it is important to analyze Kubernetes costs
All of the above is the flip side of containerization tools: a price to pay for the extraordinary convenience, flexibility, and agility that Kubernetes brings. Although infrastructure built this way has high potential for automatic optimization, insufficient cost control can lead to unpleasant consequences, including a cloud bill that spirals out of control.
Why can this happen? Many IT teams prioritize productivity and delivery speed over budget, so the budget may suffer from unforeseen expenses. A one-time intervention by the IT department cannot solve this problem, because engineers will continue to put high-performance applications and code ahead of costs.
The autoscaling tools discussed in the previous section are not only a hard-to-factor variable from the point of view of Kubernetes cost control but also a potential time bomb under your company’s budget if scaling policies are left to chance. If scaling limits are set incorrectly, or potentially dangerous corner cases are not considered, the result can be drastic upscaling and a dramatic increase in costs.
How to track and manage Kubernetes costs
As we have already said, tracking and managing Kubernetes costs is very difficult, but numerous techniques can help you analyze them and, eventually, control them.
Proper resource labeling
You are most probably familiar with cloud resource tagging. In Kubernetes, so-called labels are used instead of tags. If you and your engineers take care to label the resources you use, it will significantly simplify finding and identifying them later. It is important to be thoughtful about this process so that you can later break down your resources by the parameters that matter. Successfully implementing such a strategy requires the active participation of every IT team member; otherwise, this initially good idea can lead to even more confusion.
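Once workloads carry consistent labels, a cost breakdown becomes a simple group-by. A minimal sketch, assuming hypothetical `team` and `env` label keys and made-up cost figures:

```python
# Hedged sketch: group workload costs by a chosen label key.
# Label keys and cost figures are illustrative assumptions.

from collections import defaultdict

def cost_by_label(items, key):
    """items: list of (labels_dict, cost) pairs."""
    report = defaultdict(float)
    for labels, cost in items:
        report[labels.get(key, "unlabeled")] += cost
    return dict(report)

workloads = [
    ({"team": "search", "env": "prod"}, 42.0),
    ({"team": "search", "env": "dev"},  8.0),
    ({"env": "prod"},                   5.0),   # missing "team" label
]
print(cost_by_label(workloads, "team"))
# {'search': 50.0, 'unlabeled': 5.0}
```

Note how the unlabeled workload immediately stands out as an "unlabeled" bucket; that is exactly the confusion a disciplined labeling strategy prevents.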
Visualization utilizing Prometheus
Open-source monitoring systems like Prometheus can significantly help you visualize your costs. Competent visualization, in turn, is a giant leap toward competent analysis.
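As a hedged illustration of what sits behind such dashboards, the snippet below parses a response shaped like the output of Prometheus’s HTTP query API and aggregates CPU usage per namespace; the metric values are made up:

```python
# Minimal sketch: turn a Prometheus-style instant-query response into
# per-namespace numbers ready for a cost dashboard. The JSON mimics the
# shape of the Prometheus HTTP API output; the values are invented.

import json

response = json.loads("""
{"status": "success",
 "data": {"resultType": "vector",
          "result": [
            {"metric": {"namespace": "payments"}, "value": [1700000000, "3.5"]},
            {"metric": {"namespace": "search"},   "value": [1700000000, "1.5"]}]}}
""")

cpu_by_namespace = {
    r["metric"]["namespace"]: float(r["value"][1])
    for r in response["data"]["result"]
}
print(cpu_by_namespace)
# {'payments': 3.5, 'search': 1.5}
```

In a real setup, tools like Grafana query Prometheus for you and chart series like this over time, which is where the visualization value comes from.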
Proper use of autoscaling
Autoscaling is a killer feature of Kubernetes that makes managing workloads much easier. We have already mentioned two autoscalers: the Vertical Pod Autoscaler and the Horizontal Pod Autoscaler. Two more are available: the Kubernetes Event-Driven Autoscaler (KEDA), which scales workloads based on the number of events that need to be processed, and the Cluster Autoscaler, which adds and removes nodes at the cluster level. That said, making them work together correctly is a challenge: you should not only stick to the numerous best practices and fine-tune the settings for your scenarios, but also test carefully how the autoscalers interact before relying on them in production.
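To see why scaling limits matter so much for cost, consider the HPA’s documented scaling rule, desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric), bounded by the configured minimum and maximum. A small Python sketch (with illustrative numbers) shows how a utilization spike translates directly into replica count, and therefore cost:

```python
# Sketch of the HPA control loop's scaling formula. The min/max bounds
# are the guardrails that keep a metric spike from becoming a cost spike.

import math

def hpa_desired_replicas(current, current_metric, target_metric,
                         min_replicas=1, max_replicas=10):
    desired = math.ceil(current * current_metric / target_metric)
    return max(min_replicas, min(desired, max_replicas))

print(hpa_desired_replicas(4, current_metric=90, target_metric=60))   # 6
print(hpa_desired_replicas(4, current_metric=900, target_metric=60))  # 10 (capped)
```

Without the `max_replicas` cap, the second call would ask for 60 replicas, which is exactly the kind of drastic upscaling discussed above.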
Choosing the right cloud instances
The cost of Kubernetes directly depends on how well the cloud instances are selected. It is vital to ensure that the pods’ resource consumption matches the memory and compute capacity of the allocated instances, regardless of which cloud provider is used.
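A rough sketch of that check, using hypothetical instance sizes and pod resource requests, might look like this:

```python
# Rough sketch: how much of an instance's capacity do the scheduled
# pods' requests actually use? Instance sizes and pod requests here
# are hypothetical examples, not recommendations.

def instance_fit(node_cpu, node_mem_gib, pods):
    """pods: list of (cpu_cores, mem_gib) request pairs."""
    cpu = sum(p[0] for p in pods)
    mem = sum(p[1] for p in pods)
    return {"cpu_utilization": cpu / node_cpu,
            "mem_utilization": mem / node_mem_gib}

pods = [(0.5, 1.0), (1.0, 2.0), (0.5, 0.5)]  # (CPU cores, memory GiB)
print(instance_fit(node_cpu=8, node_mem_gib=32, pods=pods))
# {'cpu_utilization': 0.25, 'mem_utilization': 0.109375}
```

Utilization this low suggests the instance type is oversized for the workload: a smaller (and cheaper) instance would fit the same pods comfortably.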
Proactive resource management
Underutilized and unused resources are the first place to look for direct losses and room for optimization. As we have already said, IT specialists prefer performance over resource optimization, so they tend to over-provision. Despite the tempting prospect of immediately shutting down all idle capacity, this must be done wisely to avoid decommissioning something that turns out to be important; the next point follows from this.
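A simple way to shortlist candidates is to flag workloads whose average usage sits far below what they request. The threshold and figures below are illustrative assumptions, not recommendations:

```python
# Illustrative sketch: flag workloads whose average utilization is far
# below their resource requests -- first candidates for right-sizing.

def underutilized(workloads, threshold=0.2):
    """workloads: list of (name, requested_cpu, avg_used_cpu)."""
    return [name for name, requested, used in workloads
            if requested > 0 and used / requested < threshold]

stats = [
    ("api",    2.0, 1.6),  # name, requested CPU cores, avg used cores
    ("batch",  4.0, 0.4),
    ("cache",  1.0, 0.1),
]
print(underutilized(stats))
# ['batch', 'cache']
```

Such a list is only a starting point for the "wise" review described above: a flagged workload may still be sized for a peak that the average hides.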
Hiring a FinOps manager
FinOps, already mentioned in this article, can help solve several problems at once. First, it gives technical specialists more responsibility for the company’s overall financial performance and its spending on cloud resources. Second, it can become the missing link that monitors and manages Kubernetes costs on a daily basis, so that the resources in use are scaled only when it is really needed.
Using specialized tools
Last but not least, you can use Kubernetes cost analysis and management tools, such as Hystax OptScale. It gives you full visibility into your Kubernetes costs, helps optimize IT spending, enhances application performance, and engages your engineers in cloud cost savings. It also provides the detailed cost data required for precise reporting across both Kubernetes and public clouds.
To give it a try, you can have a free trial; no credit card is needed → https://my.optscale.com/register