
Backup and Disaster Recovery in Public Clouds

Public clouds tend to be resilient, so should I build a DR strategy there, or is backup enough? And if I do need DR, what is the best strategy for me? Let’s discuss…

First of all, let’s briefly cover the difference between Disaster Recovery and Backup, as many people see no difference or even use the terms interchangeably.

Backup is a process in which you replicate some state of your data to storage (tape, NAS, cloud storage, etc.) and then have a way to restore a missing item or items from the restore points. It is used to protect against data loss, ransomware or system failure according to retention settings and policies. The best backup solution is one that stores data efficiently and provides various recovery options, such as granular file/folder recovery or restore to a database. Recovery is usually expected to run in the same or a similar environment.

Disaster Recovery, in contrast, is about fast replication and recovery of the applications and the entire infrastructure. Storage consumption matters, but the main focus is on a low RPO (Recovery Point Objective – the time between replications, i.e. the most data you can lose) and RTO (Recovery Time Objective – the time to restore the entire system after a disaster). Data is stored in a ready-to-use format; granular file/folder recovery is an advantage but not a core aspect. An ideal DR solution should be capable of restoring to another cloud or region and have smooth failback (returning apps when the disaster is fixed) functionality.
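To make the two objectives concrete, here is a minimal sketch of how RPO and RTO relate to replication frequency and restore time. The function names and the backup/DR numbers are purely illustrative assumptions, not from any product:

```python
from datetime import timedelta

def worst_case_data_loss(replication_interval: timedelta) -> timedelta:
    """RPO: at worst you lose everything written since the last replication."""
    return replication_interval

def total_downtime(detection: timedelta, restore: timedelta) -> timedelta:
    """RTO: time to notice the disaster plus time to bring the system back."""
    return detection + restore

# Backup-style protection: nightly copies, multi-hour manual restore.
backup_rpo = worst_case_data_loss(timedelta(hours=24))
backup_rto = total_downtime(detection=timedelta(minutes=30),
                            restore=timedelta(hours=8))

# DR-style protection: frequent replication, near-instant failover.
dr_rpo = worst_case_data_loss(timedelta(minutes=15))
dr_rto = total_downtime(detection=timedelta(minutes=5),
                        restore=timedelta(minutes=10))

assert dr_rpo < backup_rpo and dr_rto < backup_rto
```

The point of the sketch: backup tooling optimizes storage efficiency and recovery options, while DR tooling is judged almost entirely by how small those two timedeltas are.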

Now that we know the difference, what technology should be used? I would say that, at a minimum, you need backup functionality. It’s better than nothing and will help you restore after a failure or a disaster. You can use standard public cloud capabilities to take volume snapshots or use a vendor for that. Just be sure you understand how long recovery takes, where the data is stored, how much you pay for the backup (not just licenses, if it isn’t free, but also storage and data transfer) and how to restore data or VMs.
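The cost factors above can be sketched as a simple calculator. All prices and volumes below are illustrative assumptions; check your provider’s actual price list:

```python
def monthly_backup_cost(stored_gb: float, storage_price_per_gb: float,
                        restore_gb: float, egress_price_per_gb: float,
                        license_fee: float = 0.0) -> float:
    """Rough monthly bill: storage at rest + data transferred on restore + licenses."""
    storage = stored_gb * storage_price_per_gb
    egress = restore_gb * egress_price_per_gb
    return storage + egress + license_fee

# Hypothetical numbers: 2 TB stored, one 500 GB restore, a $50 license.
cost = monthly_backup_cost(stored_gb=2000, storage_price_per_gb=0.023,
                           restore_gb=500, egress_price_per_gb=0.09,
                           license_fee=50)
print(round(cost, 2))  # 141.0
```

Note how egress (data transfer out) can rival the storage line item: a “cheap” backup can become expensive on the day you actually need to restore from it.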

Backup is fine when you need to restore a piece of your information, but it doesn’t help much when the whole data center or availability zone is down.

A few years ago companies considered public clouds insecure and unstable. Now the opposite trend prevails: people tend to think that public clouds are extremely reliable, keep all data in several copies and provide up to 100% uptime. Sometimes that is true, but in general the truth lies in between.

Public clouds do have issues from time to time: their regions or specific services may go down, affecting their customers, including you. You can check public cloud health on the providers’ status pages. Let’s say you run your VMs in the AWS us-west region. If it has connectivity issues, your apps and VMs are in trouble. You may say this happens too rarely to worry about. Well, I don’t remember a month without some cloud service issue at one of the Big Three. If up to 6 hours of outage is acceptable for you, you can stop here; public clouds do a good job of restoring their services within that timeframe.

If not, you need to calculate two things:

  1. How much an hour of downtime costs you.
  2. How long it will take to restore your entire infrastructure from a backup.

This will not only help protect you from a cloud outage but also prepare you for ransomware, human error (the cause of about 70% of all outages) or any hardware failure.

If multiplying item 1 by item 2 gives you an unacceptable number, you need a DR solution.
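That decision rule is just a multiplication and a comparison. A minimal sketch, with a hypothetical $10k/hour business and a 12-hour restore from backup (the names and figures are illustrative):

```python
def downtime_exposure(cost_per_hour: float, restore_hours: float) -> float:
    """Expected loss from one incident if backups are your only protection."""
    return cost_per_hour * restore_hours

def dr_is_justified(cost_per_hour: float, restore_hours: float,
                    acceptable_loss: float) -> bool:
    """True when the potential loss exceeds what the business can tolerate."""
    return downtime_exposure(cost_per_hour, restore_hours) > acceptable_loss

print(downtime_exposure(10_000, 12))        # 120000
print(dr_is_justified(10_000, 12, 50_000))  # True
```

In this hypothetical case, $120k of exposure against a $50k tolerance is the signal to invest in DR rather than rely on backups alone.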


There are multiple DR solutions available on the market. I suggest keeping the following criteria in mind:

  1. Use a different public cloud for failover, if possible. This protects you from a provider-wide error and gives you true workload mobility: you are not bound to any specific cloud and can use the best of each.
  2. Run regular DR tests. It’s a pity to see companies without a DR strategy, and even more disappointing to see people paying for something that doesn’t work. Run a test at least once a month.
  3. Find a balance between native cloud services and running applications on your own. Cloud services are convenient and easy to use, but there is no simple way to fail them over.
  4. Benchmark multiple DR products: some backup companies are aware of the backup/DR confusion and claim they can run a full infrastructure failover. Test and see whether you are satisfied.
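The monthly test cadence from criterion 2 is easy to enforce mechanically. A minimal sketch (the function name and the 31-day threshold are my own assumptions):

```python
from datetime import date

def dr_test_overdue(last_test: date, today: date, max_days: int = 31) -> bool:
    """Flag a DR plan whose last successful test is older than the monthly cadence."""
    return (today - last_test).days > max_days

print(dr_test_overdue(date(2023, 1, 10), date(2023, 3, 1)))  # True
print(dr_test_overdue(date(2023, 2, 20), date(2023, 3, 1)))  # False
```

A check like this belongs in a scheduled job that pages the team, not in a wiki page nobody reads: an untested DR plan is exactly the “paying for something that doesn’t work” scenario above.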


Public clouds are ideal as a failover site: you don’t need to build and support a separate DR site with hardware and software licenses, especially since it would sit idle 80% of the time waiting for a failover. Meanwhile, a public cloud can simply store your snapshots, so you don’t pay for compute until you actually run a failover.

Keep in mind that a backup solution, at the very least, is a must-have nowadays. Consider how critical it is for your business to be down until you recover from a backup, and think about a DR solution; public clouds are the best option for a failover site. Remember, there are two types of companies: those that do not back up yet and those that already do.

Feel free to read my recent article ‘Top three public cloud services used’.
