How to keep cloud costs under control: cost anomaly detection, tags, TTLs and other useful ways

We have entered a golden age of cloud services: their prevalence, flexibility and affordability, even for the smallest companies, open unprecedented opportunities for almost any business that depends on IT infrastructure. But the low barrier to entry into the cloud has inevitably led to problems with cloud service management, most of which come down to cloud cost management.


Naturally, new challenges have led to the evolution of existing IT professions and the emergence of new ones, such as FinOps. We have already talked about this role in one of our previous articles. Long story short, FinOps began to play a significant role in the market only a few years ago, with Adobe and Intuit among the pioneering companies. The role evolved naturally from those of cloud cost managers and cloud cost optimizers.

In this article, we will look at how IT teams, and FinOps teams first of all, keep costs under control and what tools they use. Along the way, we’ll take a closer look at cost anomaly detection and resource TTLs, arguably among the most effective ways to optimize cloud costs.

The most common ways to control cloud costs


Resource tagging

Implementing a tag system and keeping it up to date is not a trivial task, but it is worth the effort. Tags make it easy to keep track of the resources and IT workloads in use, identify and control the resources that account for the bulk of cloud costs, and disable those that are no longer needed. An important detail here is a standardized and consistent approach to tag attribution. In the case of Amazon Web Services, for example, there are AWS-generated and user-defined tags, and which ones to use depends on the amount of IT resources a company has. User-defined tags are usually the better option: they provide more flexibility and produce detailed cost allocation reports that can be broken down by specific parameters, although they are more labor-intensive to maintain.
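To make the idea of a consistent tag policy more concrete, here is a minimal Python sketch. It assumes a hypothetical inventory format and a hypothetical set of required tag keys (owner, project, environment); neither comes from AWS or OptScale. It only illustrates how missing tags can be flagged and how costs can be grouped by a tag.

```python
from collections import defaultdict

# Hypothetical tag policy: every resource is expected to carry these keys.
REQUIRED_TAGS = {"owner", "project", "environment"}


def missing_tags(resource):
    """Return the required tag keys that a resource lacks, sorted by name."""
    return sorted(REQUIRED_TAGS - set(resource.get("tags", {})))


def cost_by_tag(resources, tag_key):
    """Sum monthly cost per value of tag_key; resources without it go to 'untagged'."""
    totals = defaultdict(float)
    for res in resources:
        value = res.get("tags", {}).get(tag_key, "untagged")
        totals[value] += res["monthly_cost"]
    return dict(totals)


# Illustrative inventory; in practice this would come from a billing export.
inventory = [
    {"id": "vm-1", "monthly_cost": 120.0,
     "tags": {"owner": "alice", "project": "shop", "environment": "prod"}},
    {"id": "vm-2", "monthly_cost": 80.0,
     "tags": {"owner": "bob", "project": "shop"}},
    {"id": "disk-7", "monthly_cost": 15.5, "tags": {}},
]

for res in inventory:
    gaps = missing_tags(res)
    if gaps:
        print(f"{res['id']} is missing tags: {', '.join(gaps)}")

print(cost_by_tag(inventory, "project"))
```

Keeping untagged spend in its own bucket makes the cost of an inconsistent tag policy visible in the same report.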

Scale and optimize instance sizes

One of the main advantages of the cloud is the ability to pay only for the resources you actually need, and only to the extent that you need them. Unfortunately, upscaling and downscaling to match the required capacity can rarely be fully automated. So, in addition to manual control, you will need to find a balance between horizontal and vertical scaling: whether to add machines to (or remove them from) your VM infrastructure (horizontal scaling), or to increase the CPU or RAM of existing virtual machines (vertical scaling).
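As a rough illustration of the horizontal versus vertical trade-off, the sketch below turns CPU utilization figures into a scaling suggestion. The function name, the 30% and 80% thresholds and the decision rules are assumptions chosen for the example; a real decision would also weigh memory, I/O, licensing and availability requirements.

```python
def scaling_advice(avg_cpu, peak_cpu, instance_count, low=0.30, high=0.80):
    """Suggest a scaling action from CPU utilization given as fractions (0.0 to 1.0).

    The thresholds and rules are illustrative assumptions, not a standard.
    """
    if peak_cpu < low:
        # Even at peak, most of the capacity sits idle.
        if instance_count > 1:
            return "scale in: remove an instance (horizontal)"
        return "scale down: choose a smaller instance size (vertical)"
    if avg_cpu > high:
        # Sustained load is close to capacity.
        return "scale out: add an instance (horizontal)"
    if peak_cpu > high:
        # Load is fine on average but peaks run hot.
        return "review: short peaks only, consider a larger size (vertical)"
    return "keep current size"


print(scaling_advice(avg_cpu=0.10, peak_cpu=0.20, instance_count=3))
print(scaling_advice(avg_cpu=0.90, peak_cpu=0.95, instance_count=2))
```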

Reasonable use of non-production environments

Losing track of non-production environments is a very common mistake that can cost a company a lot of money. That’s where test environment management comes into play: a set of activities that includes test automation, dynamic rollout of test environments, test environment planning, and so on.


Cloud anomaly detection

In a nutshell, cloud anomaly detection is a type of service that analyzes your spending curve and helps you identify anomalous spending spikes, find their root causes and, in some cases, even forecast future cloud spending based on behavioral patterns. There are different services on the market: Google and Amazon offer native anomaly detection solutions for their clouds. However, if you have a multi-cloud setup or want to avoid vendor lock-in for other reasons, you might want to look at third-party services, which often provide more flexibility and versatility. OptScale, for example, lets you define anomaly detection rules that notify your engineering team in real time when cloud expenses or resource counts increase, so that the team can act quickly and prevent unforeseen charges.
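To show what "analyzing the spending curve" can mean in its simplest form, here is a minimal sketch that flags days whose cost is far above the trailing average. It is a generic statistical baseline under assumed parameters (a 7-day window and a threshold of 3 standard deviations), not the algorithm used by AWS, Google Cloud or OptScale.

```python
from statistics import mean, stdev


def find_cost_anomalies(daily_costs, window=7, threshold=3.0):
    """Return indices of days whose cost exceeds the trailing mean
    by more than `threshold` standard deviations."""
    anomalies = []
    for i in range(window, len(daily_costs)):
        history = daily_costs[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        # Guard against a perfectly flat history, where sigma would be zero:
        # fall back to 1% of the mean as the minimum spread.
        limit = mu + threshold * max(sigma, 0.01 * mu)
        if daily_costs[i] > limit:
            anomalies.append(i)
    return anomalies


# Illustrative daily costs in dollars, with one obvious spike.
costs = [100, 102, 98, 101, 99, 103, 100, 101, 180, 102]
print(find_cost_anomalies(costs))
```

Raising `threshold` or widening `window` makes the detector less sensitive, which is the usual lever for trading missed spikes against false alarms.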


Resource TTL rules

TTL (time to live) rules let you set up policies for certain items; those policies determine the dates and times when the items should be removed. This comes in handy if you know in advance which resources will become obsolete or useless after a certain period of time, and for how long you are going to use them. To get the most out of TTL rules, you need to come up with a deletion plan and let the TTL tool (as with cloud anomaly detection, it can be either a native or a third-party solution) run in the background and regularly remove the resources and data that are eligible for deletion.
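The core of a TTL rule is a simple comparison between a resource's age and its allowed lifetime. The sketch below assumes a hypothetical resource record with `created_at` and `ttl_hours` fields; it only reports which resources have expired and leaves the actual deletion to whatever tool you use.

```python
from datetime import datetime, timedelta, timezone


def expired_resources(resources, now=None):
    """Return ids of resources whose TTL has elapsed since creation."""
    now = now or datetime.now(timezone.utc)
    expired = []
    for res in resources:
        ttl_hours = res.get("ttl_hours")
        if ttl_hours is None:
            continue  # No TTL rule: the resource is never flagged.
        if res["created_at"] + timedelta(hours=ttl_hours) <= now:
            expired.append(res["id"])
    return expired


# Illustrative records; field names are assumptions made for this example.
check_time = datetime(2024, 1, 10, 12, 0, tzinfo=timezone.utc)
resources = [
    {"id": "test-env-1", "created_at": datetime(2024, 1, 8, 12, 0, tzinfo=timezone.utc), "ttl_hours": 24},
    {"id": "test-env-2", "created_at": datetime(2024, 1, 10, 9, 0, tzinfo=timezone.utc), "ttl_hours": 24},
    {"id": "prod-db", "created_at": datetime(2023, 6, 1, 0, 0, tzinfo=timezone.utc), "ttl_hours": None},
]
print(expired_resources(resources, now=check_time))
```

Running such a check on a schedule, and notifying the owner before removal, matches the deletion plan described above.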

Conclusion: how to improve the efficiency of cloud cost management and control?

As a team of seasoned practitioners and engineers who face cloud challenges ourselves on a daily basis, we strove to create a solution with a completely different approach to cloud cost management: OptScale.

When it comes to cost anomaly detection, OptScale continuously observes cloud cost and resource utilization to identify anomalies and spikes. Thanks to real-time monitoring, cost anomalies are detected almost instantly, so you can quickly assess the situation and take action to prevent unexpected costs. Timely notifications to resource owners help to avoid budget overruns and also make the members of the engineering team accountable for cloud resource usage, which leads to even greater cost savings. Moreover, OptScale makes it easier to determine the root causes of cost anomalies and spikes.

In addition, OptScale has a number of features that allow you to fine-tune various options. For instance, anomaly detection policy management allows you to build a system that is sensitive enough to detect significant expense spikes without raising false alarms.

The Resource page, which shows all cloud expenses, has robust sorting, grouping, and filtering options. Together with the daily expenses breakdown feature, which shows how many resources of a certain type were created or deleted on the selected date, this provides a holistic overview of the resources in a data source.

Finally, OptScale makes it possible to manage resources through TTL rules and budget threshold settings. The available scenarios help minimize resource underutilization and improve cloud cost management overall.
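A budget threshold check can be sketched in a few lines. The function below is a generic illustration, not OptScale's implementation; the alert levels of 50%, 80% and 100% of the budget are assumptions chosen for the example.

```python
def crossed_thresholds(spend_to_date, budget, thresholds=(0.5, 0.8, 1.0)):
    """Return the thresholds (as fractions of the budget) that current spend has reached."""
    if budget <= 0:
        raise ValueError("budget must be positive")
    used = spend_to_date / budget
    return [t for t in sorted(thresholds) if used >= t]


# Illustrative monthly budget of 1000 with 850 already spent.
for level in crossed_thresholds(850, 1000):
    print(f"Alert: spend has reached {level:.0%} of the budget")
```

Alerting at several levels rather than only at 100% gives resource owners time to react before the budget is actually exceeded.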

Overlooked resources contribute to a company’s cloud bill, and users often don’t even realize that they’re paying for them.
💡 Find the ways of identifying and cleaning up orphaned snapshots to keep MS Azure and Alibaba Cloud costs under control → https://hystax.com/finops-best-practices-how-to-find-and-cleanup-orphaned-and-unused-snapshots-in-microsoft-azure-and-alibaba-cloud
