Migrating to the Cloud

Shailesh Hemdev
10 min read · Jan 22, 2022

A lot of organizations today are either evaluating or en route to migrating their workloads to the Cloud, specifically AWS. In this article, I will outline some of the planning & migration tools available from AWS to successfully plan and execute a migration. Every enterprise is different, and each migration journey is going to be unique. Hence I will juxtapose the available tools with an actual case study from 2017 (this is an import of an article I originally wrote in 2017 here) where we chose some of these tools to complete our own migration to AWS.

Like most migration journeys, ours began with our executive leadership making the decision to migrate our entire infrastructure to AWS (which was, by the way, a breath of fresh air since we were tired of debugging obvious infrastructure issues). At the time, our workloads ran in 2 data centers, with the production workloads running out of a local hosting service provider and non-production running out of our on-premises corporate data center. In this article, I will outline how we went about the migration from planning to execution, and all the services & tools we made use of to achieve this.

Planning the Migration

AWS has a very detailed blog on the various strategies to migrate to the Cloud that is outlined here. Something that we did not have at our disposal at the time but that exists today is the Cloud Adoption Readiness Tool (CART), which systematically enables you to assess your readiness & build a detailed plan.

In our case, we did not have any workloads that we could offhand retire or simply retain. Our contracts with the local data centers had to be renewed, so not moving workloads was not an option. We also did not have any repurchasing situations. Thus, from the 6 Rs, our migration strategy broke down into the following 3 parts:

  • Refactoring: We were committed to redefining our products and platforms, and that enabled us to look at what we would simply refactor. This constituted about 70% of our functionality, where we decided to adopt a microservices architecture and build net-new services. With a business decision to also launch a new experience in parallel to this migration effort, we followed a “strangler” pattern involving a gradual refactoring of features from the monolithic legacy apps into newer microservices, all behind a new analytics experience. To understand how we broke the monolith down into smaller microservices over a 3-year journey, you can read my blog here.
  • Replatforming: This included components that were too costly to change architecturally. The 2 that stand out were our NAS & Oracle infrastructure. NAS formed the backbone of our SFTP service, which our clients used to send daily files that in turn were processed by a COTS product. Changing this architecture would have meant breaking integrations with our customers (AWS managed SFTP was not available at the time). Similarly, our Oracle infrastructure, with its transactional database, warehouses & critical ETL processes, was hard to decouple from.
  • Rehosting: Finally, this left us with our legacy web applications, where our decision was simple: we did not want to make significant investments but absolutely had to keep the lights on. This primarily included applications running on JBoss, Oracle, ActiveMQ, etc. Here our path was to do a lift & shift, a.k.a. server migration.

It’s also pertinent to mention that we had formalized using the names of seasons (Spring, Summer, Fall, Winter) for major customer-facing release announcements, e.g., Fall 2016. While the actual deployments happened daily, these marketing releases served as good milestones for our AWS migration. With these high-level principles & milestones, we came up with 3 broad phases for our migration:

  • Phase 1: The first milestone was in the Fall, which had a combination of some new capabilities but a lot of existing features, all on our new microservices foundation. From a network topology perspective, this involved running a hybrid architecture where all the backend & frontend components of this new portal would run on AWS. It was hybrid because, while our backend ran on AWS, it was unlikely that we would be able to move the Oracle infrastructure & legacy applications in the same time frame, requiring a split of workloads between AWS and on premises.
  • Phase 2: The second milestone was in the Winter. This was meant to be lightweight in terms of actual deployments but heavy in terms of detailed assessments for moving everything to AWS, including legacy applications.
  • Phase 3: This was the final phase, with a target of the following Spring, by which time we needed to be completely on AWS.

What I want to highlight here is the need for such a broad plan, since it guides how you approach the migration for each group of your components. The high-level plan also provides you with the opportunity to do detailed assessments for some of your trickier workloads.

Finally cost & speed were critical factors for us. So we made a choice to start with AWS in the US East Region and follow up with a multi region plan after we had completed the initial migration. The reliability, scalability and performance that we were likely to gain from just the migration far outweighed the delays in going for a multi region approach from the onset.

While we ourselves took this route, I strongly recommend that you have a multi-account, multi-region strategy from the outset, since it will guide your networking & organizational strategy, which is cumbersome to adopt later on.
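To make that concrete, here is a minimal boto3 sketch, under the assumption that you use AWS Organizations for the multi-account setup; the account names and email addresses below are placeholders, not a prescription.

```python
import boto3

org = boto3.client("organizations")

# Hypothetical member accounts for separating workloads (placeholder emails)
for name, email in [
    ("prod", "aws-prod@example.com"),
    ("nonprod", "aws-nonprod@example.com"),
]:
    # create_account is asynchronous; it returns a CreateAccountStatus
    # that you would normally poll via describe_create_account_status
    org.create_account(AccountName=name, Email=email)
```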

Networking

As I mentioned, our production workloads were running with a local hosting service provider whereas our non-production workloads were running out of our on-premises corporate data center. From a connectivity perspective, we first had to decide on the options we would use for connecting these data centers to our AWS VPC. With Fall being 6 months away, it was critical that we get started with our non-production workloads as quickly as possible to support development, whereas we had a bit more time when it came to the production workloads. To address this, we went with the following approaches:

  • Site-to-Site VPN: We used a Site-to-Site VPN between the non-production data center and the AWS VPC on top of our 500 Mbps Internet connection. This got set up fast, and the low cost and quick turnaround outweighed the cons of not having a dedicated connection for non-production workloads. It jump-started our refactoring efforts for building new microservices, and thus the foundations of our future platform, including new CI/CD pipelines using Git, Jenkins, CloudFormation and CodeDeploy (a sketch of the API calls involved follows this list).
  • Direct Connect: For production we absolutely needed a dedicated connection to avoid latency, so we established two 1 Gbps Direct Connect links, one with our hosted data center and one with the Oracle ExaData location.
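For flavor, here is a minimal boto3 sketch of the API calls involved in standing up a Site-to-Site VPN like the one above; the VPC ID, public IP, ASN and CIDR are placeholders, and a real setup would also involve configuring the on-premises router with the generated tunnel details.

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Represent the on-premises router (placeholder public IP and ASN)
cgw = ec2.create_customer_gateway(
    BgpAsn=65000, PublicIp="203.0.113.10", Type="ipsec.1"
)["CustomerGateway"]

# Create a virtual private gateway and attach it to the VPC
vgw = ec2.create_vpn_gateway(Type="ipsec.1")["VpnGateway"]
ec2.attach_vpn_gateway(
    VpcId="vpc-0123456789abcdef0", VpnGatewayId=vgw["VpnGatewayId"]
)

# Create the Site-to-Site VPN connection (static routing for brevity)
vpn = ec2.create_vpn_connection(
    CustomerGatewayId=cgw["CustomerGatewayId"],
    VpnGatewayId=vgw["VpnGatewayId"],
    Type="ipsec.1",
    Options={"StaticRoutesOnly": True},
)["VpnConnection"]

# Route the on-premises CIDR over the tunnel
ec2.create_vpn_connection_route(
    VpnConnectionId=vpn["VpnConnectionId"],
    DestinationCidrBlock="10.0.0.0/16",
)
```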

Outside of these, there are a number of options available today that make VPC design scalable. The top ones that come to mind are Transit Gateway, which enables connecting multiple VPCs without point-to-point VPC peering, and PrivateLink, which allows exposing services between VPCs over a private connection.
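As an illustration, a minimal boto3 sketch of the Transit Gateway hub-and-spoke pattern might look like this; the VPC and subnet IDs are placeholders, and in practice you would wait for the gateway to reach the "available" state before attaching VPCs.

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Create a Transit Gateway to act as a hub for multiple VPCs
tgw = ec2.create_transit_gateway(
    Description="hub connecting all VPCs"
)["TransitGateway"]

# Attach each VPC to the hub (placeholder IDs); this replaces
# maintaining a full mesh of point-to-point VPC peering connections
ec2.create_transit_gateway_vpc_attachment(
    TransitGatewayId=tgw["TransitGatewayId"],
    VpcId="vpc-0123456789abcdef0",
    SubnetIds=["subnet-0123456789abcdef0"],
)
```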

Refactoring

My blog here outlines how we carried out our refactoring by breaking the monolith, along with a depiction of our final architecture.

Replatforming with AWS Managed Services

A key decision to make was this: given the cost vs. speed trade-off, which AWS managed services should we use?

AWS managed services tend to significantly reduce operational overhead but might not always be what you want. To cite an example, our ideal end state involved exercising full control over our compute infrastructure so that we could achieve density with container workloads, but we also needed time to build our CI/CD & infrastructure automation. We were not going to have the stacks or the maturity on day one. Thus we arrived at a strategy to use managed services to get the speed we needed, with a commitment to continue evaluating and refactoring the aspects where we would eventually want more control. We leveraged the following main services:

  • S3: For all kinds of object storage, including static content and website hosting, with CloudFront as our CDN. We did not have a lot of edge computing to do, but CloudFront is feature rich with its Lambda@Edge.
  • Elastic Beanstalk: Given that our new microservices backend was powered by Java, Spring Boot and Tomcat, this worked like a steroid for getting our workloads quickly into production. We later refactored to eliminate the use of Beanstalk and move to EC2 once our CI/CD had matured. The refactor itself was trivial given that our backend services all followed a consistent runtime model and, more importantly, given that we had thought consciously about that end state!
  • RDS: While we had significant in-house expertise to manage Oracle, we wanted to start making use of open source relational database technologies where we did not have to manage operational aspects like patching, upgrades, failover, etc., not to mention the licensing costs. This ended up being an important catalyst that allowed us to alter our development culture to one where individual teams own their microservices end to end!
  • SQS: For queuing messages and performing various asynchronous tasks, such as within our reporting pipeline (see the sketch after this list).
  • Simple Workflow (SWF): For complex workflow orchestration, as we had for our scheduled reporting pipeline. Step Functions was fairly new at the time.
  • CodeDeploy: To manage deployments to our Beanstalk and later EC2 instances. AWS CodePipeline & CodeBuild are extremely useful services too; we were just comfortable in our use of Jenkins.
  • Certificate Manager: To manage our certificates.
  • Route 53: For all our DNS needs.
  • Elasticsearch: For centralized logging to aid troubleshooting and debugging. AWS has since forked the open source ELK stack to form the OpenSearch service.
  • EFS: To migrate our NAS workloads from on premises to the cloud.
  • Lambda: Last but not least, we used Lambda for glue code and small tasks/functions. The cost of running Lambda gave us pause at the time, though later in 2018 AWS reduced the costs significantly, making it a no-brainer from an adoption perspective. For how Lambda has now become an integral part of the enterprise, you can find my blog here.
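To illustrate the asynchronous-task pattern that SQS enabled for us, here is a minimal boto3 sketch; the queue name and message fields are hypothetical, not our actual pipeline.

```python
import json
import boto3

sqs = boto3.client("sqs", region_name="us-east-1")

# Hypothetical queue for a reporting pipeline
queue_url = sqs.create_queue(QueueName="reporting-tasks")["QueueUrl"]

# Producer: enqueue a report-generation task
sqs.send_message(
    QueueUrl=queue_url,
    MessageBody=json.dumps({"report_id": "1234", "format": "pdf"}),
)

# Consumer: long-poll for work, process it, then delete the message
resp = sqs.receive_message(
    QueueUrl=queue_url, MaxNumberOfMessages=1, WaitTimeSeconds=10
)
for msg in resp.get("Messages", []):
    task = json.loads(msg["Body"])
    print(f"Generating report {task['report_id']}")  # stand-in for real work
    sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=msg["ReceiptHandle"])
```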

Rehosting

To make our migration possible by Spring, we first had to do a detailed assessment of our legacy workloads. At the time of our assessment, the size of our workloads did not warrant using the AWS Application Discovery Service, though it remains a vital tool in the toolbox for planning. With some manual due diligence, we were able to categorize the components into the following broad approaches:

  • Virtual Machines: These were basically our legacy applications and some custom licensed software components, such as our Jaspersoft reports and our SFTP offering built on a COTS product. All of these apps were running on Red Hat VMs, and we used the VM Import/Export options (see the sketch after this list).
  • Oracle RAC: Here we made use of Snowball devices to do an initial data export and import into AWS S3, setting up our Oracle database with ExaData as primary and EC2 on AWS as DR. The initial export was followed by the use of Oracle-native tools to synchronize the data as per the AWS Migration Guide.
  • NAS to EFS: Using Snowball devices, we moved all our files from NAS to EFS, mounted the file systems, and completed the subsequent synchronization over the Direct Connect connection.
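As a sketch of the VM import flow, assuming the disk image has already been exported from the hypervisor and uploaded to S3 (the bucket and key below are placeholders):

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Kick off an import of a VM disk previously exported from the data center
task = ec2.import_image(
    Description="legacy JBoss application server",
    DiskContainers=[{
        "Description": "RHEL VM disk",
        "Format": "vmdk",
        "UserBucket": {
            "S3Bucket": "my-migration-bucket",   # placeholder bucket
            "S3Key": "exports/jboss-app.vmdk",   # placeholder key
        },
    }],
)

# Poll the import task; once complete it yields an AMI you can launch
status = ec2.describe_import_image_tasks(
    ImportTaskIds=[task["ImportTaskId"]]
)["ImportImageTasks"][0]
print(status["Status"])
```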

Final Cutover

Our final cutovers happened over a couple of weekends, where we had both stacks running with the Route 53 DNS cutover planned out. The first weekend was a dry run to evaluate any unknowns that we could not plan for; we did find some, but luckily they were small enough that we could fix them during the week. It is important to note that our usage at the time allowed us to make use of the weekends to perform the dry runs. The final cutover happened as planned, with only an isolated edge-case issue occurring post the full cutover, thus concluding an exhilarating journey!
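For a sense of how a DNS cutover like this can be staged, here is a minimal boto3 sketch using Route 53 weighted records; the hosted zone ID, record names and targets are placeholders, not our actual configuration.

```python
import boto3

route53 = boto3.client("route53")

def set_weight(identifier: str, target: str, weight: int) -> None:
    """UPSERT a weighted CNAME so traffic can be shifted gradually."""
    route53.change_resource_record_sets(
        HostedZoneId="Z0000000000000000000",  # placeholder zone ID
        ChangeBatch={
            "Comment": "cutover traffic shift",
            "Changes": [{
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": "app.example.com",
                    "Type": "CNAME",
                    "SetIdentifier": identifier,
                    "Weight": weight,
                    "TTL": 60,
                    "ResourceRecords": [{"Value": target}],
                },
            }],
        },
    )

# Dry-run weekend: send a sliver of traffic to AWS, keep the rest on-prem
set_weight("aws", "app.aws.example.com", 10)
set_weight("legacy", "app.dc.example.com", 90)

# Final cutover: shift everything to AWS
set_weight("aws", "app.aws.example.com", 100)
set_weight("legacy", "app.dc.example.com", 0)
```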

Epilogue

AWS today has enhanced its service offerings so much that planning & performing migrations has become a science. Some of the tools that we did not use but that would be super useful today are as follows:

  • AWS Migration Evaluator: As the name suggests, this helps you make a business case for migration.
  • AWS Migration Hub: A central place to track your migrations.
  • AWS Application Discovery Service: Enables you to build a detailed inventory of your servers, networks and the processes running on your virtual machines, all of which helps you do a thorough assessment for your migration.
  • AWS Server Migration Service: The new incarnation of the old VM Import/Export service, with support for an initial migration plus replication of newer changes.
  • AWS Database Migration Service (DMS): Allows you to migrate your on-premises databases to AWS. With the neat Schema Conversion Tool (SCT) you can also convert from one database engine to another, again with support for initial migration and continuous synchronization.

For all other resources, you can head over to the AWS Migration page, where new features and capabilities are constantly being added to make migrating to the Cloud an easier decision.
