How to plan a (successful) VPN migration - Part II



In this second post we'll be reviewing more topics that you should take into consideration if you're planning a VPN migration. 
If you missed the first part, you can start from there. Here's the link:

Are you just moving or do you have to apply architectural changes?

Sometimes a migration can be as "simple" as replicating what you've got in another infrastructure, the traditional lift and shift. But in many cases, it's a good idea to review your current setup and ask yourself: Is my current architecture well set up? Is there anything I could improve? Is there anything you found in the last months or years that could be better, but that you couldn't change because it was already too late? Have there been any changes in your infrastructure, after you created your VPNs, that would require a different setup for your VPNs to work better? Some examples:
  • You created your VPN a few years ago with static routing because your device didn't support BGP, but your new device does support it. Will you keep your static routes or will you take the opportunity to improve your VPN setup?
  • You created your VPNs to connect to a datacenter (on-prem or in the cloud). But recently, your organization has started to move some data sources to AWS. Will you just replicate the VPN connection as it is, or will you use this migration to start replacing this to-be-decommissioned connection with an AWS Transit Gateway?
  • You're in CloudHub 1.0 but haven't thought about CH2.0. Will you migrate the VPNs now, causing (more or less) disruption to your business, and then do another migration to CH2.0 afterwards, creating more disruption? More on CH2.0 later.
  • You created your VPCs and their VPNs in a child BG. But after some time, the organizational needs changed and you had to create more VPCs and VPNs to support the parent BGs. Will you move that problem to your new infrastructure with the migration?
  • You created your VPN with static routing and no High Availability. That was not necessary at the time: the number of apps was very small and they were not that critical. Fair enough. But things went very well, the number of apps has grown exponentially and a few of them are now key for the business. The problem is that you never wanted to add a second connection or move to dynamic routing, because it felt better not to touch anything. Well, maybe that moment has come, or maybe you have to migrate for other reasons. The point is, if the migration is going to happen, take the opportunity to apply those changes.
  • You created your VPNs a while ago. Again, no HA and static routing (nothing against that, if it made sense at the time). One day MuleSoft comes and tells you the process of upgrading VPNs has changed and that it will impact your existing VPNs. And now you don't sleep well, not knowing if the next upgrade will happen while you're on holiday. Now could be the time to use your migration as the opportunity to provide that HA to your VPNs.

Design a Plan B. And a Plan C or D if necessary

What's your level of support in your contract? What's your SLA? If something goes wrong, how long can it take to fix it? The best strategy to avoid problems is to assume there will be problems. So, even if you design THE plan for migrating, THE plan that's "impossible" to fail... prepare yourself for what you'll do if anything goes wrong.
  • Create your rollback strategy. Decide, at every step of the process, what to do if that step fails. You need to define the plan to shut down the old VPN and start the new one. And you also need to define how to shut down the new one and get the old one back up again if necessary.
  • Measure the times. Once you know how to roll back, measure how long it will take you to do it. That's how you properly estimate the window of time you need for the migration. If you tell your users 1, 2 or 4 hours of downtime, that's not just the time you need to create the new VPN and shut down the old one. It needs to be the time it will take you, in a failed attempt, to migrate everything, check that it does not work and roll back to the old VPN.
  • And to measure that, test your rollback strategy. And test it for real. Make sure the first time you do a rollback is not on the migration date.
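To make the "measure the times" idea concrete, here's a minimal sketch of how the downtime window could be estimated from your measured times. The step names, the minutes and the 20% safety buffer are all made-up assumptions for illustration; plug in the numbers you measured in your own rollback tests.

```python
from dataclasses import dataclass


@dataclass
class Step:
    name: str
    forward_minutes: float   # measured time to execute the step
    rollback_minutes: float  # measured time to undo the step


def worst_case_window(steps, validation_minutes, buffer_ratio=0.2):
    """Worst case: run every step, validate, find a failure, undo every step."""
    forward = sum(s.forward_minutes for s in steps)
    rollback = sum(s.rollback_minutes for s in steps)
    total = forward + validation_minutes + rollback
    # The buffer is an arbitrary safety margin, not a measured value.
    return total * (1 + buffer_ratio)


# Hypothetical steps and timings, for illustration only.
steps = [
    Step("Shut down old VPN", forward_minutes=10, rollback_minutes=15),
    Step("Create new VPN", forward_minutes=30, rollback_minutes=10),
    Step("Update routes", forward_minutes=20, rollback_minutes=20),
]

window = worst_case_window(steps, validation_minutes=30)
print(f"Announce at least {window:.0f} minutes of downtime")
```

The number you announce comes from the failed-attempt path (migrate, validate, roll back), not from the happy path alone.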

Testing, testing, testing

Talking about testing - do you have all you need to test whether your migration is working? Do you have the tools, on both sides of the connection, to guarantee the new VPN is working? You need to define the required tests that will give you the green check. Ping, traceroute, nslookup... you name them, but make sure you know exactly what you need to see before telling your organization everything is working.
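One way to make the "green check" explicit is to write the required tests down as a small script you can run on each side of the connection. The sketch below is only an illustration: the hostname, IP address and port are hypothetical placeholders, and it uses a DNS lookup and a TCP connection attempt instead of ping, because ICMP usually needs extra privileges from a script.

```python
import socket


def check_dns(hostname):
    """Return True if the hostname resolves."""
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except socket.gaierror:
        return False


def check_tcp(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def run_checks(checks):
    """Run each (description, callable) pair; return (all_passed, results)."""
    results = {name: bool(fn()) for name, fn in checks}
    return all(results.values()), results


# Hypothetical targets behind the VPN; replace them with your own.
CHECKS = [
    ("DNS: db.internal.example", lambda: check_dns("db.internal.example")),
    ("TCP: 10.0.0.10:5432", lambda: check_tcp("10.0.0.10", 5432)),
]


def main():
    ok, results = run_checks(CHECKS)
    for name, passed in results.items():
        print(f"{'PASS' if passed else 'FAIL'}  {name}")
    print("Green check" if ok else "Not ready: investigate or roll back")
```

Call main() from a machine on each side of the VPN, before and after the migration, so you can compare the results against a known-good baseline.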

Documentation is key

We all get old, you can't stop that. And I don't know about you, but the older I get the harder it is for me to remember, especially details. Jokes aside, you need to document all of this. Make sure you write down, at least:
  • The current state, the AS-IS architecture
  • The future or desired state, the TO-BE architecture
  • The migration process - the configuration steps that need to be followed on migration day. It does not matter that you know what you're doing or that you've done it 1000 times - tomorrow you might not be in the company and your team might have to repeat that migration. Or worse, the only available date for the migration falls during your PTO.
  • As mentioned, the required tests to validate whether the migration is successful or not. And also which results are acceptable for cases like latency - how much is acceptable?
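For criteria like latency, it helps to document the acceptable values as numbers rather than as "it feels OK". Here's a minimal sketch of how such a criterion could be written down and checked; the thresholds and the sample values are invented for illustration, and you should agree on the real ones with your teams.

```python
import statistics


def latency_acceptable(samples_ms, max_median_ms, max_worst_ms):
    """Accept only if both the median and the worst sample are within limits."""
    median = statistics.median(samples_ms)
    worst = max(samples_ms)
    return median <= max_median_ms and worst <= max_worst_ms


# Hypothetical round-trip times (ms) measured through the new VPN.
samples = [20, 22, 25, 21, 80]

# Hypothetical limits: median up to 30 ms, no single sample above 100 ms.
print(latency_acceptable(samples, max_median_ms=30, max_worst_ms=100))
```

Measuring the same samples through the old VPN before the migration gives you a baseline to compare against.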

Communication Plan

  • As in every migration, especially if you know there will be downtime, keep your users informed.
  • Plan ahead. Don't just send an email saying tomorrow your system won't work for an hour.
  • How much notice you give depends on each case and each company. You know what works for you.
  • But as soon as you have a date in mind and know the time you need, send your notification. The sooner you do it, the sooner you will find out if that date suits all the impacted users. You might find out that on your selected date there's already something planned that is incompatible with your migration (and probably you didn't have to do it).
  • Be clear on what's going to happen during the whole migration process: what will work, what won't work and exactly how long it will take. Be specific, but don't give details that only you understand... that will only get you the opposite: people not paying attention to what you're planning.
  • This will help:
    • You - You avoid finding out about "that something planned" on the migration date itself, and only because you get complaints
    • Your business - If your organization understands what the impact is going to be and you give them time, they will probably work around that date and avoid planning anything important during the timeframe you gave them

Nobody likes migrations, of any type. And that's because migrating means moving from something under control to something else that is still not under our control. But this happens only when you don't have a plan. Hope it helps!


