I recently read a good article by the Intuit engineering team in which they mentioned an interesting topic: the double bubble. In cloud terms, it is the state during a cloud migration or digital transformation when you pay for both environments, the source and the target. Let's discuss how common it is.
A standard cloud migration project consists of a defined set of steps:
1. IT justification and proof of a business need. The justification can be avoiding vendor lock-in, a hardware refresh, software license, or datacenter lease renewal cycle, scaling speed, TCO, a public cloud discount, etc. Usually, when there is a clear justification, proving the business need is not a problem.
2. Project scope definition. At this stage, a responsible person (the project manager) and the specific applications/resources are defined.
3. Cloud migration phase.
4. Post-mortem.
We'll talk only about steps 2 and 3 for now. It's normal that during the migration process there are moments when you have resources running in both clouds, source and target. Some resources may already have been migrated, while others may still be in a queue, or not even scheduled for migration, and still running in the source cloud. But there is an interesting case when you migrate some resources (in some cases, hundreds or thousands of machines) and have to pay for them twice.
Effective cost management is all about optimization and tracking. With a set of policies, principles, and processes in place, companies can reassure stakeholders and ensure that their cloud spending stays under control. If your cloud bill is a surprise every time you receive it, you are simply not taking advantage of all the cost governance tools available.

When companies define what to move to a cloud, they usually think in categories of applications, departments, or entire resources. Experienced migration consultants or vendors will always advise splitting resources into chunks of 30–50 VMs and migrating in phases. On the one hand, it reduces risk; on the other, it helps deflate the bubble. Ideally, a single application should sit in one chunk. In that case, you can migrate and test that whole self-contained part of your system and avoid data locality issues. Remember that cross-region and outbound cloud traffic is not free; it is pretty expensive. It’s better to think about that before you receive your first cloud bill 🙂
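The chunking advice above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the inventory, the `plan_chunks` helper, and the 50-VM cap are hypothetical, and the greedy packing shown is one simple way to keep every application in a single chunk, not a method prescribed by any migration tool.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    """A migration chunk: applications that are moved and tested together."""
    apps: list[str] = field(default_factory=list)
    vm_count: int = 0


def plan_chunks(app_vm_counts: dict[str, int], max_vms: int = 50) -> list[Chunk]:
    """Greedy first-fit-decreasing packing that never splits an application."""
    chunks: list[Chunk] = []
    # Place the largest applications first so the small ones fill the gaps.
    ordered = sorted(app_vm_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    for app, vms in ordered:
        for chunk in chunks:
            if chunk.vm_count + vms <= max_vms:
                chunk.apps.append(app)
                chunk.vm_count += vms
                break
        else:
            # No chunk has room (or the app alone exceeds max_vms):
            # give the application its own chunk rather than split it.
            chunks.append(Chunk(apps=[app], vm_count=vms))
    return chunks


# Hypothetical inventory: application name -> number of VMs.
inventory = {"billing": 28, "crm": 22, "reporting": 14, "intranet": 9, "wiki": 3}
for number, chunk in enumerate(plan_chunks(inventory), start=1):
    print(f"chunk {number}: {chunk.vm_count} VMs -> {', '.join(chunk.apps)}")
```

An application larger than the cap still lands in one chunk of its own, because splitting it would bring back the data locality and cross-region traffic problems mentioned above.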
The root cause of the bubble is the "replication -> testing -> cutover" sequence. When you migrate a chunk, it takes time to replicate the data, define how you’ll capture increments (preferably in an automated way), test the chunk on the target cloud, and schedule a maintenance window to execute the cutover. Those three phases form the bubble. You store the data on block devices or in object storage, run VMs on the target cloud, and pay for compute. In most cases, test migrations run for 1–3 weeks (sometimes even longer) until the team that owns the migrated application validates that it works on the target cloud with no performance degradation or other issues. And if you migrate multiple chunks in parallel, the bubble grows.
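To make the arithmetic concrete, here is a rough estimate of the bubble for a couple of chunks. Everything in it is an assumption made for illustration: the `ChunkEstimate` fields, the per-VM daily prices, and the simplification that target storage is billed during replication and testing, target compute only during testing, and the source side during a post-cutover waiting window.

```python
from dataclasses import dataclass


@dataclass
class ChunkEstimate:
    name: str
    vm_count: int
    replication_days: int  # target storage is billed, test VMs not yet running
    test_days: int         # target storage and test VMs are both billed
    waiting_days: int      # after cutover, source VMs are kept as a fallback


def bubble_cost(chunk: ChunkEstimate, storage_per_vm_day: float,
                compute_per_vm_day: float, source_per_vm_day: float) -> float:
    """Extra spend caused by paying for both sides of one chunk."""
    storage = (chunk.replication_days + chunk.test_days) * storage_per_vm_day
    compute = chunk.test_days * compute_per_vm_day
    fallback = chunk.waiting_days * source_per_vm_day
    return chunk.vm_count * (storage + compute + fallback)


# Hypothetical chunks and prices (currency units per VM per day).
chunks = [
    ChunkEstimate("billing", vm_count=40, replication_days=5, test_days=14, waiting_days=7),
    ChunkEstimate("crm", vm_count=30, replication_days=3, test_days=21, waiting_days=7),
]
prices = dict(storage_per_vm_day=0.5, compute_per_vm_day=3.0, source_per_vm_day=2.0)

# Chunks migrated in parallel simply add up, which is why the bubble grows.
total = sum(bubble_cost(chunk, **prices) for chunk in chunks)
print(f"estimated double bubble: {total:.2f}")
```

With these made-up numbers, the second chunk costs more than the first despite having fewer VMs, because its test window is a week longer. The length of testing dominates the estimate.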
So, how do you avoid the bubble…
1. First of all, identify your migration pace. Be very frank with yourself. It works exactly like learning a new skill: very slow in the beginning and much faster after a few iterations.
2. Define a queue of applications/chunks. Put it into your migration project calendar.
3. Figure out a way to replicate VMs without downtime and without constantly re-replicating them. Dozens of tools do that, saving you time and money as you pay less for storage.
4. Communicate with the teams that own the chunks or apps. Define acceptance criteria and the cutover process with them. The earlier they start thinking about it, the more prepared they will be when the time comes. Define the time slots in which they need to test the migration. This is the most important step, because testing is what inflates the bubble, and teams usually have no idea how to test their applications, what the components are, or who owns individual machines.
5. Define the waiting period: how long you wait before you remove migrated VMs from the source environment. Don’t forget that you need a fallback plan in case something goes wrong with the VMs and apps on the target cloud, and that until then you still pay (directly or indirectly) for the machines on the source side.
6. Shut down the tests as soon as you see that the team is not prepared or that testing is not a real priority for them. If they are not motivated, they will just waste time (which equals money in public clouds) or, even worse, make a decision (accept or reject) based on some odd criteria. They will then either keep using the machines on the source cloud or discover issues only after the source VMs have already been removed. Escalate to their manager or upper management to align both teams’ priorities.
7. If tests pass, proceed with the cutover. Remove all snapshots and test migrations for the migrated applications. Remember to start the clock for their waiting time.
8. Revise your pace estimates and adjust your schedule.
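The queue, test slots, cutover, and waiting period from the steps above can be put on a calendar with a small helper. This is a minimal sketch, assuming chunks are migrated strictly one after another and that every chunk takes the same number of days per phase. The `schedule` function, the chunk names, and the durations are all hypothetical.

```python
from datetime import date, timedelta


def schedule(queue, start, replication_days, test_days, waiting_days):
    """Lay chunks out back to back: replicate, test, cut over, then wait."""
    plan = []
    cursor = start
    for name in queue:
        test_start = cursor + timedelta(days=replication_days)
        cutover = test_start + timedelta(days=test_days)
        remove_source = cutover + timedelta(days=waiting_days)
        plan.append({
            "chunk": name,
            "replicate": cursor,
            "test": test_start,
            "cutover": cutover,
            "remove_source": remove_source,
        })
        # The next chunk starts replicating once this one is cut over,
        # so only one chunk at a time is in its test window.
        cursor = cutover
    return plan


# Hypothetical queue and pace; rerun with revised numbers after each chunk.
plan = schedule(["billing", "crm"], start=date(2024, 3, 4),
                replication_days=3, test_days=10, waiting_days=14)
for row in plan:
    print(row["chunk"], row["test"], row["cutover"], row["remove_source"])
```

Running chunks sequentially is a deliberate simplification that keeps the bubble to one chunk at a time. Overlapping them shortens the project but brings back the parallel-chunk growth described earlier. Once the first real cutover shows how long testing actually takes, feed the observed durations back in to get the revised schedule.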
You’ll have some sort of double bubble in any migration project, but you can control how big it will be by proper planning and communication with application and VM owners. Only brave people migrate the entire infrastructure in a single run; smart people plan and do it in chunks and phases.