VMware is considered the “king” of on-premises infrastructure, used by almost every enterprise today. vSphere, vSAN, and many other VMware products run in on-premises data centers around the world. AWS, on the other hand, is the “king” of the public cloud. Some companies have even ditched their on-premises solutions and gone all in with the public cloud; they own and manage no physical infrastructure at all. How did they get there? Every migration story is different, but this post describes some key aspects of migration, perhaps motivating and enabling you to embark on a journey of your own.
A number of assumptions were probably made over the years about how to provide resilience and redundancy for your in-house applications, such as relying on an enterprise shared storage array or on load balancers. Some of these assumptions, however, break down when moving to AWS (or to any cloud provider, for that matter).
In the past, when deploying a system in-house, most applications employed a master-slave approach to provide redundancy. Two machines are installed: one acts as the master and serves all traffic and requests; the other, the slave, is a backup that waits to take over if the master fails.
Although this approach was implemented using a range of technologies and mechanisms, the logic was elementary. A quorum device (usually a raw disk shared between the machines) determined who held the lock on the cluster and was therefore the master. As soon as the master went down, the lock was released and the slave could take over.
This is almost impossible to achieve today because there is no way to share a raw device in AWS (at least not without designing your own solution to address this). A block device can be attached to one, and only one, instance at any given time. If your applications require this method of clustering, you will not be able to use it when moving to AWS.
It is not common for an on-premises application to scale infinitely. The levels of automation that usually exist in-house, and the tooling available, do not enable easy, seamless scaling, so application scalability is not usually treated as an important consideration during deployment. When moving to the cloud, this becomes a major factor in how you deploy and design your applications.
Manual deployment is possible, whether on-premises or in AWS, but it soon becomes apparent that working without an automated solution is not sustainable. Sure, you can deploy 10 instances through the AWS console, then SSH into each of them and configure them manually, one at a time. But this is a cloud anti-pattern: everything should be automated, 100% repeatable, with no reliance on manual intervention. There are a few methods that can be used both on-premises and in the cloud as part of this journey.
The process is basically the same for all machines (or instances) that you bring up. The machine loads an operating system (which can be done in a number of ways, such as kickstart or a deployment image). When the OS is up and running, you run a script that installs all your software and places the configuration on the machine itself.
Here is a basic example of the process:
- Bring up CentOS 7.x
- Run the install script
  - Install basic packages
    - Backup agent
    - Monitoring agent
    - Security software
  - Install specific packages
    - HTTP server
    - Database server
  - Configure the application
    - Serve specific web pages
    - Connect to the database
    - Create specific tables
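The steps above can be sketched as a single bootstrap script. Everything here is illustrative: the package names (backup-agent, monitoring-agent, and so on) are placeholders for whatever your organization actually runs, and the install command defaults to a dry-run echo so the sketch can be read and executed safely.

```shell
#!/bin/sh
# Sketch of the post-OS bootstrap process described above.
# INSTALL is a placeholder: set it to "yum install -y" (or your package
# manager of choice) on a real machine; it defaults to a dry-run echo here.
INSTALL="${INSTALL:-echo install}"

install_basic_packages() {
    $INSTALL backup-agent       # hypothetical package names --
    $INSTALL monitoring-agent   # substitute your organization's
    $INSTALL security-agent     # actual agents here
}

install_specific_packages() {
    $INSTALL httpd              # HTTP server
    $INSTALL mariadb-server     # database server
}

configure_application() {
    # Place config files, point the app at the database, create tables.
    echo "configure: deploy web pages, set DB connection, create tables"
}

install_basic_packages
install_specific_packages
configure_application
```

On a real machine the same script runs unchanged with `INSTALL="yum install -y"`, which is exactly the point: one repeatable entry point instead of a series of manual steps.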
Again, this is the basic process that works for many organizations. Nonetheless, it does have its downsides, as described in the next section.
Tools such as Puppet, Chef, Salt, and Ansible are advanced configuration management and orchestration tools with important elements that take installation to a higher level. First of all, they use declarative languages: if you want a specific version of software (for example, Apache 2.2), you declare it in the code that drives the installation. In a shell script, by contrast, you would have to introduce a number of logical constructs to handle cases such as the software already existing on the machine.
The second element is idempotency. In short, no matter how many times you run the process with a configuration management solution, you will always get exactly the same outcome. With a homemade shell script, you are likely to run into a number of edge cases that pose real challenges, requiring “creative” solutions to ensure your scripts produce the exact same outcome each and every time.
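The idempotency problem can be illustrated even in plain shell: guard every mutation with a check for the state you want. This is a minimal sketch with an invented config file and setting; configuration management tools apply this check-then-converge pattern for you across packages, files, and services.

```shell
#!/bin/sh
# Idempotent vs. naive configuration in shell: the guarded version can be
# run any number of times and always converges to the same file contents.
CONF="${CONF:-/tmp/demo-app.conf}"   # placeholder config file for the demo

# Naive: every run appends another copy of the line.
append_listen_port_naive() {
    echo "listen_port = 8080" >> "$CONF"
}

# Idempotent: append only if the exact line is not already present.
ensure_listen_port() {
    grep -qx "listen_port = 8080" "$CONF" 2>/dev/null \
        || echo "listen_port = 8080" >> "$CONF"
}

rm -f "$CONF"
ensure_listen_port
ensure_listen_port   # second run is a no-op; the file is unchanged
```

Run the naive version twice and you get a duplicated setting; run the idempotent version any number of times and the file ends up identical, which is the property declarative tools give you by default.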
The really nice thing when moving to AWS is that nothing stops you from reusing the solutions you have on-premises today; almost all of them will work out of the box in the cloud. It goes without saying that you should test this before making the leap.
Strategies for Migration
There are a number of ways that companies migrate their workloads into a cloud environment. Let’s go through them in order, from least to most desirable.
Lift and Shift
Let’s start by comparing this to an example in our physical world today. You are moving from one house to another. In your old house, you had a fridge in your kitchen. You empty it out, wrap it up nicely for transport, move it on a truck to your new house, unpack it, and put all your groceries back inside. A basic move from one location (lift) to a new location (and shift).
This can be accomplished with all the workloads on VMware in your data center today. Take the VMs, create an inventory of the software installed (empty out), document dependencies and installation processes (wrap up), install a new instance in AWS (move to new home), and install the software on the new instance (put everything back).
The comparison above rests on a few basic assumptions that, if they turn out to be inaccurate, can make this a very difficult process. First, we assumed all software versions available in the cloud are exactly the same as they were before. This is not always the case, and it can cause a number of complications with this method.
Another assumption is that the instances will be reachable from your local (on-premises) network through remote access (either SSH or RDP). By default, this is not the case: you need to make sure network connectivity is properly set up (for example, a VPC with a site-to-site VPN) before you start the migration. Instance sizes can be another issue. In AWS, you choose from a number of preconfigured instance sizes, and they might not exactly match what you have in-house. For example, if you had a VM with 48 GB vRAM and 16 vCPUs, you will not find a matching size in AWS, so you will have to choose either a bigger or a smaller one. As a result, you are either wasting resources or risking hitting a constraint, because your applications have not been tested with the smaller configuration.
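The 48 GB / 16 vCPU example can be made concrete with a small sizing helper. The table below is a hardcoded sample of the m5 general-purpose family (published specs at the time of writing); a real tool should pull live data with `aws ec2 describe-instance-types` rather than hardcoding it.

```shell
#!/bin/sh
# Pick the smallest instance type (from a small, hardcoded m5 sample,
# sorted smallest to largest) that covers a given vCPU/RAM requirement.
pick_instance() {
    need_vcpu="$1"; need_gib="$2"
    # columns: type  vCPUs  memory_GiB
    printf '%s\n' \
        "m5.xlarge 4 16" \
        "m5.2xlarge 8 32" \
        "m5.4xlarge 16 64" \
        "m5.8xlarge 32 128" |
    awk -v c="$need_vcpu" -v m="$need_gib" \
        '$2 >= c && $3 >= m { print $1; exit }'
}

pick_instance 16 48   # the 48 GB / 16 vCPU VM lands on m5.4xlarge (16 vCPU, 64 GiB)
```

Note the outcome: the closest fit still leaves 16 GiB of memory unused, which is exactly the resource-waste trade-off described above.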
To return to the moving analogy, a smaller space that your fridge does not fit into, or an electrical socket that is further away than the length of your cord, are just two simple examples. Most likely, you could find a workaround for such problems, but the end result will be a slightly different solution than you originally had in mind.
Re-Architect
This is the longer road, but definitely the optimal one. If you simply take what you currently have and deploy it as-is in the cloud, you will not make much use of the built-in tools and advantages you get with a cloud provider.
Let’s consider a couple of examples.
Elastic Volumes allow you to resize attached disk volumes on the fly, without downtime. With the lift and shift methodology, you would take the exact volume size you had on your VM and attach an EBS volume of the same size. But ask yourself why you chose that volume size in the first place. Probably because the initial estimates projected it as the maximum you would need within a year. A feature like Elastic Volumes allows you to start small and grow only when you need to. You will, however, need some kind of alerting mechanism to notify you when a volume needs to be expanded, or, even better, a mechanism that expands it automatically.
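As a concrete sketch of the grow-when-needed approach: the volume ID, device names, and target size below are placeholders, and RUN defaults to echo so the commands are printed rather than executed (unset it on a real instance with the right credentials).

```shell
#!/bin/sh
# Sketch: grow an EBS volume online, then grow the filesystem on top of it.
RUN="${RUN:-echo}"                       # dry-run by default
VOLUME_ID="vol-0123456789abcdef0"        # placeholder volume ID

grow_volume() {
    # 1. Grow the EBS volume in place (no detach, no downtime).
    $RUN aws ec2 modify-volume --volume-id "$VOLUME_ID" --size "$1"

    # 2. Once the modification completes, grow the partition and the
    #    filesystem (ext4 shown; use xfs_growfs for XFS).
    $RUN growpart /dev/nvme0n1 1
    $RUN resize2fs /dev/nvme0n1p1
}

grow_volume 200   # grow to 200 GiB
```

The natural next step, not shown here, is to trigger `grow_volume` from a disk-usage alarm instead of running it by hand.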
Auto Scaling groups are another option. Here you allow AWS to provision more instances automatically, based on the load and the metrics you define for your services. This was most probably not possible, or at least not something you had, with your VMware solution. It does require that the architecture of your application support it (master-slave topologies usually do not work; you will need an active-active model), and that your build process be 100% automated, so the application can grow without any manual intervention.
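A minimal sketch of such a group, with a target-tracking policy on average CPU: the group name, launch template, and subnet IDs are placeholders, and RUN again defaults to echo so the commands are shown rather than executed.

```shell
#!/bin/sh
# Sketch: an Auto Scaling group that holds average CPU near 50% by
# adding and removing instances automatically.
RUN="${RUN:-echo}"   # dry-run by default

create_asg() {
    # Placeholder names and subnet IDs throughout.
    $RUN aws autoscaling create-auto-scaling-group \
        --auto-scaling-group-name web-asg \
        --launch-template LaunchTemplateName=web-template \
        --min-size 2 --max-size 10 \
        --vpc-zone-identifier "subnet-11111111,subnet-22222222"

    # Scale out when average CPU rises above the target, in when it falls.
    $RUN aws autoscaling put-scaling-policy \
        --auto-scaling-group-name web-asg \
        --policy-name cpu-target-50 \
        --policy-type TargetTrackingScaling \
        --target-tracking-configuration \
        '{"PredefinedMetricSpecification":{"PredefinedMetricType":"ASGAverageCPUUtilization"},"TargetValue":50.0}'
}

create_asg
```

Note what this presupposes: the launch template must produce a fully configured instance with zero manual steps, which is why the automation discussed earlier is a prerequisite, not a nice-to-have.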
There are a number of ways you can prepare to embark on a migration project to the cloud. The concepts you are used to today with an on-premises solution will not always fit into the world of AWS. And because the concepts are different, some of the basic assumptions that exist in your infrastructure today will no longer be valid in AWS.
Standardization and automation are key aspects of the migration from your on-premises VMware environment to the new paradigm and possibilities of AWS. The strategies described above will get your foot in the door and help steer you in the right direction.