Migration
If you work long enough as a software engineer, you will end up migrating from one system to another.
Below is a step-by-step guide on how to approach it.
The first step is to standardize and consolidate the API of the old system. This step is critical to the success of the migration.
The second step is to implement the API of the new system.
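One possible reading of these first two steps is that both systems end up behind a single shared interface, so callers no longer care which one answers. The sketch below illustrates that idea only; the UserStore interface, both classes, and the data shapes are hypothetical names invented for the example, not taken from any real system.

```python
from typing import Dict, Protocol, Tuple


class UserStore(Protocol):
    """Hypothetical consolidated API that both systems must satisfy."""

    def get_email(self, user_id: str) -> str: ...


class LegacyUserStore:
    """Adapter that hides the old system's data layout behind the shared API."""

    def __init__(self, rows: Dict[str, Tuple[str, str]]) -> None:
        # Assumed legacy shape: user_id -> (name, email)
        self._rows = rows

    def get_email(self, user_id: str) -> str:
        return self._rows[user_id][1]


class NewUserStore:
    """New system implementing the same API on top of its own storage."""

    def __init__(self, emails: Dict[str, str]) -> None:
        self._emails = emails

    def get_email(self, user_id: str) -> str:
        return self._emails[user_id]


def lookup_email(store: UserStore, user_id: str) -> str:
    # Callers depend only on the shared interface, never on a concrete system.
    return store.get_email(user_id)
```

With both systems behind one interface, the later stages can swap or combine them without touching the callers.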
The third step is to invoke both the old system and the new system, keeping the old system as the source of truth. In this step, you compare the outputs of the old and new systems and fix any discrepancies between them. This is called Shadowing.
Most discrepancies will come from bugs in the new system, but do not be surprised if some come from existing bugs in the old system that the shadowing process surfaces.
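A minimal sketch of the shadowing step might look like the following. It assumes both systems can be called as plain functions and that responses can be compared with ==; the name shadow_call and the mismatches list are made up for illustration.

```python
import logging
from typing import Any, Callable, List

logger = logging.getLogger("migration.shadow")


def shadow_call(
    old_system: Callable[[Any], Any],
    new_system: Callable[[Any], Any],
    request: Any,
    mismatches: List[tuple],
) -> Any:
    """Call both systems, record discrepancies, always return the old response."""
    # The old system is the source of truth; its errors propagate as before.
    old_response = old_system(request)
    try:
        new_response = new_system(request)
    except Exception as exc:
        # A failure in the new system must never affect the caller.
        logger.warning("new system failed for %r: %s", request, exc)
        mismatches.append((request, old_response, repr(exc)))
        return old_response
    if new_response != old_response:
        mismatches.append((request, old_response, new_response))
    return old_response
```

In a real service the call to the new system would often run asynchronously so that it adds no latency; it is kept synchronous here to keep the sketch short.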
The longer you shadow, the more confident you can be in the new system. This comes at the cost of delaying the migration to the new system, maintaining the old system for longer, and calling both the old and the new system at the same time.
A huge problem in the shadow phase is what to do with new features. Do you set up a feature freeze on the old system? Do you tell teams to wait for the end of the migration and implement them in the new one? Do you implement them in both?
This is why shadowing should usually run for a limited period only.
The fourth stage is similar to the third stage but uses the new system as the source of truth for the returned output. This is called Reverse Shadow. The benefit of this stage is knowing that you can fall back to the old, well-tested system in case of an outage.
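As a rough sketch of this stage, under the same assumptions as before (both systems are plain callables and responses compare with ==), the roles are simply swapped and the old system doubles as a fallback. The name reverse_shadow_call is hypothetical.

```python
from typing import Any, Callable, List


def reverse_shadow_call(
    old_system: Callable[[Any], Any],
    new_system: Callable[[Any], Any],
    request: Any,
    mismatches: List[tuple],
) -> Any:
    """Return the new system's response; fall back to the old system on failure."""
    try:
        new_response = new_system(request)
    except Exception as exc:
        # Outage in the new system: serve the request from the old one.
        mismatches.append((request, "new system error", repr(exc)))
        return old_system(request)
    try:
        old_response = old_system(request)
    except Exception as exc:
        # The old system is now only a shadow, so its failures are just recorded.
        mismatches.append((request, "old system error", repr(exc)))
        return new_response
    if old_response != new_response:
        mismatches.append((request, old_response, new_response))
    return new_response
```

The fallback branch is what makes this stage safer than switching over directly: an error in the new system degrades to the old behavior instead of failing the request.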
The fifth and last stage is to call only the new system, remove the shadowing and reverse shadowing logic, and deprecate the old system.
A bonus point is to have logic for a gradual or targeted rollout from the old system to the new system in the shadow and reverse shadow phases.
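One common way to get a gradual and targeted rollout is to hash a stable key (for example a user ID) into a bucket from 0 to 99 and compare it against a rollout percentage, with an allowlist for targeted cases. The sketch below assumes that approach; the function names and the choice of key are illustrative, not prescribed by the steps above.

```python
import hashlib
from typing import Collection


def rollout_bucket(key: str) -> int:
    """Map a key to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % 100


def use_new_system(key: str, percent: int, allowlist: Collection[str] = ()) -> bool:
    """Decide whether this key is routed to the new system.

    percent: share of traffic (0..100) sent to the new system (gradual rollout).
    allowlist: keys that always go to the new system (targeted rollout).
    """
    if key in allowlist:
        return True
    return rollout_bucket(key) < percent
```

Because the bucket depends only on the key, a given user stays on the same side as the percentage grows, and anyone included at 30% is still included at 60%.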
Another bonus point is the ability to ignore specific fields when comparing the responses of the old and new systems, to model expected discrepancies. This improves the discovery of real issues and speeds up the matching of responses between the old and new systems during the shadowing phase.
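A comparison that ignores fields could be sketched as follows, assuming responses are JSON-like dicts and lists and that ignored fields are named with dotted paths such as "meta.host". Both the path convention and the function names are assumptions made for this example.

```python
from typing import Any, FrozenSet, Iterable


def strip_ignored(value: Any, ignored: FrozenSet[str], prefix: str = "") -> Any:
    """Return a copy of value with every ignored dotted path removed."""
    if isinstance(value, dict):
        return {
            key: strip_ignored(item, ignored, f"{prefix}{key}.")
            for key, item in value.items()
            if f"{prefix}{key}" not in ignored
        }
    if isinstance(value, list):
        # List items share the path of the list itself.
        return [strip_ignored(item, ignored, prefix) for item in value]
    return value


def responses_match(old: Any, new: Any, ignored_fields: Iterable[str] = ()) -> bool:
    """Compare two responses while skipping fields that are expected to differ."""
    ignored = frozenset(ignored_fields)
    return strip_ignored(old, ignored) == strip_ignored(new, ignored)
```

Typical candidates for the ignore list are timestamps, generated IDs, and host names, which differ between systems without indicating a bug.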
A third bonus point is to build a migration dashboard that shows the rollout split between the new and old systems, the error/success ratio, and the percentage of mismatches.
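The numbers behind such a dashboard can come from a handful of counters. The sketch below is one hypothetical way to collect them in process; a real setup would more likely emit them to an existing metrics system, and the MigrationStats class and its field names are invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MigrationStats:
    served_by_old: int = 0
    served_by_new: int = 0
    new_errors: int = 0
    compared: int = 0
    mismatched: int = 0

    def record(
        self,
        served_by_new: bool,
        new_error: bool = False,
        matched: Optional[bool] = None,
    ) -> None:
        """Record one request; matched is None when no comparison was made."""
        if served_by_new:
            self.served_by_new += 1
        else:
            self.served_by_old += 1
        if new_error:
            self.new_errors += 1
        if matched is not None:
            self.compared += 1
            if not matched:
                self.mismatched += 1

    @staticmethod
    def _percent(part: int, whole: int) -> float:
        return 100.0 * part / whole if whole else 0.0

    def rollout_percent(self) -> float:
        """Share of requests answered by the new system."""
        return self._percent(self.served_by_new, self.served_by_old + self.served_by_new)

    def new_error_percent(self) -> float:
        """Share of requests in which the new system failed."""
        return self._percent(self.new_errors, self.served_by_old + self.served_by_new)

    def mismatch_percent(self) -> float:
        """Share of compared responses that did not match."""
        return self._percent(self.mismatched, self.compared)
```

Tracking the mismatch percentage over time gives a concrete signal for when shadowing has run long enough to move to the next stage.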
Software Engineering from the Frontlines Course on Maven
If you liked this article, I will be teaching a “Software Engineering from the Frontlines” course on Maven where I will teach hard-learned lessons I acquired developing large-scale products at companies such as Uber, Airbnb, and Microsoft.