When it comes to VMware Cloud on AWS (VMC), the devil is in the networking details. For those out there unfamiliar with VMC, it is a service that was jointly developed by AWS and VMware, and is fully managed by VMware. It is essentially a VMware Software Defined Data Center (SDDC) as-a-Service running within the AWS Global Infrastructure. VMC gives customers who still run mission-critical workloads within the vSphere ecosystem the ability to deploy new VMware clusters with public cloud agility.
A VMC deployment is an automated process that spins up an SDDC within an AWS region in under 2 hours. Once the service is available, you can set up hybrid connectivity to other existing VMware environments. The SDDC also comes with direct access to AWS native services over the AWS backbone network.
VMC fits a handful of compelling use cases, which means that even though it is just VMware running within AWS, there can still be a lot of design complexity. This is especially true when it comes to network design. Many times, this complexity is a result of the size and/or configuration of an organization’s existing AWS native environment.
I’ve worked on VMC deployments of many different sizes and have seen one minor gotcha that can result in some fun AWS network troubleshooting. This can be especially true for something like a proof of concept (POC), where a customer may not have any preexisting AWS footprint.
Connectivity from VMC to AWS native is enabled by selecting a connected subnet within a specific VPC during deployment. Once the SDDC deployment starts, AWS Elastic Network Interfaces (ENI) are created within that connected subnet. These ENIs are routable directly to the SDDC and kind of act as the “gateway” between VMC and AWS native.
To get AWS workloads to communicate with VMC workloads and vice-versa, you typically would start by allowing the traffic through both the VMC firewall (compute gateway specifically) and the AWS security groups used by native workloads. Another thing to keep in mind is that the connected subnet needs to use the main route table for the VPC, as this route table is the only one that will be dynamically updated by changes that happen within VMC. Any new VMC network segments automatically populate into the main route table. Should the active ENI change due to VMware HA for maintenance, a failed host, etc., the active ENI route will dynamically change here too.
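As a rough illustration of the main route table requirement, here is a minimal Python sketch that checks whether a subnet falls back to the VPC's main route table. It works on plain dictionaries shaped like the `RouteTables` list in the EC2 DescribeRouteTables response (for example, what boto3's `describe_route_tables` returns); the function name, sample data, and every ID below are hypothetical and only there for illustration. It assumes you pass in all the route tables for the VPC.

```python
def subnet_uses_main_route_table(route_tables, subnet_id):
    """Return True if subnet_id ends up on the VPC's main route table.

    route_tables is assumed to hold every route table in the VPC, in the
    same shape as the "RouteTables" list from DescribeRouteTables.
    """
    for table in route_tables:
        associations = table.get("Associations", [])
        # A table is the main one if any of its associations is flagged Main.
        is_main = any(assoc.get("Main", False) for assoc in associations)
        for assoc in associations:
            if assoc.get("SubnetId") == subnet_id:
                # Explicit association: the subnet uses this table, so the
                # answer depends on whether this table is the main one.
                return is_main
    # No explicit association anywhere: the subnet implicitly uses main.
    return True


# Hypothetical sample data: one main table and one custom table.
SAMPLE_ROUTE_TABLES = [
    {
        "RouteTableId": "rtb-main-example",
        "Associations": [{"Main": True}],
    },
    {
        "RouteTableId": "rtb-custom-example",
        "Associations": [{"Main": False, "SubnetId": "subnet-custom-example"}],
    },
]

# The connected subnet has no explicit association, so it uses main.
print(subnet_uses_main_route_table(SAMPLE_ROUTE_TABLES, "subnet-connected-example"))
# This subnet is pinned to a custom table and would miss the VMC updates.
print(subnet_uses_main_route_table(SAMPLE_ROUTE_TABLES, "subnet-custom-example"))
```

If the check comes back false for your connected subnet, the routes that VMC maintains in the main route table will not apply to it.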
This may be enough to get some connectivity to work. Since AWS native security groups are stateful, I’ve seen instances where communication from VMC into AWS native seems to work fine, but initiating traffic from AWS over to VMC still doesn’t work. In this case, we need to remember that the ENIs in the connected subnet also have security groups associated with them.
One thing to be aware of is that if you have a default security group in your VPC and the ENIs in the connected subnet are using that default security group, there is a chance the ENIs are to blame for blocking the traffic from AWS workloads into VMC. When you create a new VPC, a default security group is created along with it. Out of the gate, that default security group is configured to only allow that security group itself as the traffic source. Therefore, it isn’t configured to allow traffic from resources using a different security group, even if they are routable to the connected subnet.
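To make the default security group behavior concrete, here is a small Python sketch that models the inbound check. It is a simplification, not how AWS evaluates rules internally: it ignores protocol and port, and only looks at the source. The dictionaries mirror the `IpPermissions` shape from the EC2 DescribeSecurityGroups response, and the function name, group IDs, and CIDR are all hypothetical.

```python
import ipaddress


def ingress_allows(group, source_group_id=None, source_ip=None):
    """Return True if any inbound rule on the group admits the given source.

    Simplified model: protocol and port are ignored, only the rule's
    source (a security group reference or a CIDR range) is checked.
    """
    for permission in group.get("IpPermissions", []):
        # Rules that reference a security group as the source.
        for pair in permission.get("UserIdGroupPairs", []):
            if source_group_id is not None and pair.get("GroupId") == source_group_id:
                return True
        # Rules that reference an IPv4 CIDR range as the source.
        for ip_range in permission.get("IpRanges", []):
            if source_ip is None:
                continue
            network = ipaddress.ip_network(ip_range["CidrIp"])
            if ipaddress.ip_address(source_ip) in network:
                return True
    return False


# Hypothetical default security group: its only inbound rule allows itself.
DEFAULT_SG = {
    "GroupId": "sg-default-example",
    "IpPermissions": [
        {
            "IpProtocol": "-1",
            "UserIdGroupPairs": [{"GroupId": "sg-default-example"}],
            "IpRanges": [],
        }
    ],
}

# Same group after adding a rule for a hypothetical workload subnet.
FIXED_SG = {
    "GroupId": "sg-default-example",
    "IpPermissions": DEFAULT_SG["IpPermissions"]
    + [{"IpProtocol": "-1", "UserIdGroupPairs": [], "IpRanges": [{"CidrIp": "10.0.2.0/24"}]}],
}

# A workload in another security group is blocked by the default rule...
print(ingress_allows(DEFAULT_SG, source_group_id="sg-app-example", source_ip="10.0.2.15"))
# ...and admitted once its subnet is added as an allowed source.
print(ingress_allows(FIXED_SG, source_group_id="sg-app-example", source_ip="10.0.2.15"))
```

The fix shown here uses a CIDR range as the source; referencing the workload's security group in the ENI security group's inbound rules would be another way to express the same intent.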
Now you may be saying, well, that should never happen because you ideally are using Infrastructure as Code (IaC) to control all of this! That certainly is one good way to avoid this type of situation, as long as you are aware of this issue and have designed your IaC to account for it. For some customers who may not have any AWS infrastructure before deploying VMC, or in the case of, say, a POC, it may be easier or more convenient to simply log into the AWS console and walk through a new VPC creation. It is very easy to overlook all the intricacies of what is required for end-to-end communication when working on a very basic deployment.
Regardless of how you plan to deploy your VMC environments, keep this note in mind to avoid some potentially frustrating moments.