How to choose the CIDR block for your VPC

In one of my previous blog posts, we saw that creating a VPC is quite simple. We just need to provide values for four parameters and we can get our VPC up and running in minutes:

Region
CIDR Block
Environments
Business Groups

Of these four parameters, the CIDR block is usually the trickiest one to understand. It is also critical, because once your Mule VPC is created you can't change its CIDR block. In the following sections we'll see why this parameter is so important and how to calculate it properly.

What is the CIDR block?

To understand the CIDR block, we must first understand what an IP address is (don't worry, this isn't going to be a networking lesson). An IP address is the numerical representation of a location in a network. Just as your phone number identifies your cell phone, an IP address identifies a device, a server, or a network interface.

Computers only understand binary numbers, so an IP address is just a sequence of zeros and ones. To be more specific, an IP address is a combination of 32 zeros and ones, that is, 32 bits. Humans, however, need another format. That's why the binary number is split into four blocks of eight bits (bytes), and each block is represented with a decimal value. Each decimal number goes from 0 to 255 (255 is the decimal representation of eight 1s). Because of that, you always see an IP address as a combination of four decimal numbers, something like 192.168.1.1.

CIDR stands for Classless Inter-Domain Routing. CIDR notation is what we use to identify networks and the hosts in those networks. A network is just a collection of interconnected hosts: devices, computers, servers, and so on. A host is anything with an IP address associated with it.

In terms of IP addressing, to create a network or a subnet, we need to take a segment of consecutive IP addresses from the whole IP addressing space. And for that, the CIDR notation will help us identify our network and each host in that network.

The CIDR notation consists of an IP address, a slash character ('/'), and a decimal number from 0 to 32. Using this notation, we split the IP address into two blocks of bits: the most significant bits, the network prefix, represent the network, and the remaining bits identify the host within that network. The number after the slash (the prefix length, which is equivalent to the subnet mask) tells us how many bits to take for the network prefix.

For example, let's look at 192.168.0.0/24:

IP Address: 192.168.0.0
11000000 10101000 00000000 00000000

Subnet Mask: 255.255.255.0
11111111 11111111 11111111 00000000

In this example, we use 24 bits for the network representation and the remaining 8 bits to identify hosts within this network. This means we've got 2^8 = 256 possible IP addresses for our hosts. In other words, we are sizing our network to 256 hosts.
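The same breakdown can be reproduced with Python's standard ipaddress module. This is only a minimal sketch to make the example concrete; the helper name to_binary is ours, not part of the module.

```python
import ipaddress

# Parse the CIDR block used in the example above.
network = ipaddress.ip_network("192.168.0.0/24")


def to_binary(address):
    """Render an IPv4 address as four space-separated groups of 8 bits."""
    return " ".join(f"{octet:08b}" for octet in address.packed)


print("Network address:", network.network_address, "->", to_binary(network.network_address))
print("Subnet mask:    ", network.netmask, "->", to_binary(network.netmask))
print("Network bits:   ", network.prefixlen)
print("Host bits:      ", 32 - network.prefixlen)
print("Total addresses:", network.num_addresses)
```

The 24 leading ones in the subnet mask are the network prefix, and the 8 trailing zeros are the bits left over for hosts.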

Below you can find a table with the block sizes and their corresponding number of IP addresses:
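The figures in such a table follow from a single formula: a block with prefix length n holds 2^(32 - n) addresses. Here is a minimal sketch that prints them; the range /16 to /24 and the function name addresses_in_block are our own choices for illustration.

```python
def addresses_in_block(prefix_length):
    """Number of IPv4 addresses in a CIDR block with the given prefix length."""
    if not 0 <= prefix_length <= 32:
        raise ValueError("prefix length must be between 0 and 32")
    return 2 ** (32 - prefix_length)


print(f"{'Block':<8}{'IP addresses':>14}")
for prefix_length in range(16, 25):
    block = f"/{prefix_length}"
    print(f"{block:<8}{addresses_in_block(prefix_length):>14,}")
```

Every bit removed from the prefix doubles the size of the block, which is why a /23 holds twice as many addresses as a /24.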

CIDR blocks in your Anypoint VPCs

Why is all of this relevant for our Anypoint VPC? Because when we create a VPC we are assigning it a size: we are defining the number of IPs that we can use in our VPC.

When we deploy a Mule app within our VPC, it will get at least one IP address from the CIDR block of the VPC. This CIDR block determines the range of IP addresses allocated for your apps in the VPC.

For an Anypoint VPC, the prefix length of this CIDR block needs to be between /24 (256 IPs) and /16 (65,536 IPs). A block that is too small might cause your deployment to run out of IPs, and you won't be able to deploy more apps in the VPC. From that perspective, setting your VPC size to the largest CIDR block would seem to be the best solution. However, there's something else we need to consider.

The moment we connect this VPC to our data center, using a VPN or VPC peering, that CIDR block becomes part of our internal network and consumes private IP addresses from our internal addressing space. For that reason, it's important not to oversize your VPC, as it will take more IPs than necessary from your internal network. For many organizations, considering the number of SaaS solutions that require a private connection, it becomes a challenge to reserve big CIDR blocks for all of them.
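One practical consequence is that the block has to be carved out of address space that isn't already in use internally. As a rough sketch, a candidate block can be checked against the ranges already allocated using the standard ipaddress module. The ranges in INTERNAL_RANGES and the function name are made up for illustration; your network team's real allocation will differ.

```python
import ipaddress

# Hypothetical ranges already in use on the internal network (assumption).
INTERNAL_RANGES = ["10.0.0.0/16", "10.1.0.0/16", "172.16.0.0/20"]


def overlapping_ranges(candidate_cidr, existing_cidrs):
    """Return the existing ranges that overlap with the candidate VPC block."""
    candidate = ipaddress.ip_network(candidate_cidr)
    return [
        cidr
        for cidr in existing_cidrs
        if candidate.overlaps(ipaddress.ip_network(cidr))
    ]


for candidate in ("10.1.4.0/22", "10.2.0.0/22"):
    conflicts = overlapping_ranges(candidate, INTERNAL_RANGES)
    status = f"conflicts with {conflicts}" if conflicts else "is free"
    print(f"{candidate} {status}")
```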

How to estimate the number of IPs required

Considering the above, how do we estimate the number of IPs we need for our Mule deployment? Start from the number of applications we'll deploy in our VPC. The key is to understand that there isn't a one-to-one relationship between apps and IPs. It's likely that one application will consume more than one IP.

These are the key concepts you need to understand to properly estimate the CIDR block:

Number of workers

A Mule app is deployed to one or more workers. Every worker gets its own IP address, so an app deployed to one worker will get one IP and the same app deployed to four workers will get four IPs.

Horizontal scaling and high availability

You need to estimate how many workers your app needs. There are two main reasons to add more than one worker to your app:

Horizontal scaling: Some apps, due to the type of processing they do, require more than one worker to distribute the load and get better performance.
High availability: If your app is critical, you need to add workers so that if one worker fails, the app can continue serving requests with the others.

Fault tolerance (region of the VPC)

The Mule VPC is a resource hosted at the region level. This means the workers of an application can be distributed across all the availability zones (AZs) in the region, so that if one availability zone becomes unavailable, other workers of the same app in a different AZ can keep the application up. With that in mind, we need to think about how critical the apps we deploy in a VPC are. Providing more than one worker for an app gives us fault tolerance at the worker level, but if we require fault tolerance across the whole region we need to provide one worker per AZ. Depending on the region in which we're creating our VPC, there might be three or four AZs. So, if fault tolerance is needed at the region level, you need to find out the number of AZs in that region. For example, the Frankfurt region has three AZs, so a fault-tolerant app in that region would need three workers, and that means three IPs for one app.
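The rule above can be written down as a small sketch. The function and parameter names are hypothetical, and taking the larger of the two numbers when the load already requires more workers than there are AZs is our assumption, not something the platform dictates.

```python
def workers_for_app(workers_for_load, availability_zones, region_fault_tolerant):
    """Estimate the workers (and therefore IPs) one app needs.

    workers_for_load: workers needed for performance or worker-level availability.
    availability_zones: number of AZs in the region hosting the VPC.
    region_fault_tolerant: True if the app must survive the loss of an AZ.
    """
    if region_fault_tolerant:
        # One worker per AZ at least; keep the larger number if load needs more
        # (assumption for this sketch).
        return max(workers_for_load, availability_zones)
    return workers_for_load


# Frankfurt example from the text: three AZs, so three workers and three IPs.
print(workers_for_app(workers_for_load=1, availability_zones=3, region_fault_tolerant=True))
```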

Zero downtime

Zero downtime deployment is a deployment style that allows CloudHub to deploy new versions of an application without causing any interruption to the consumers of the application. With this technique we can deploy a new version of our app or update the runtime with no service interruption. It is also useful if we need to scale our app, vertically or horizontally. Zero downtime relies on a side-by-side deployment: for any of those operations, CloudHub starts a new worker with the new version of the app and keeps both workers (the new and the old one) running until the old one is removed and only the new one remains.

In this process, the new worker requires a new IP, so for a period of time we'll have two workers, and therefore two IPs, in use. This means zero downtime affects the required size of our CIDR block. We need enough free IPs in the IP range of our VPC to double the number of IPs assigned to existing apps when a bulk update happens. So, the questions to answer are:

How many apps would you be updating in parallel?
Do you need to update your apps in groups?
Do you update your runtimes periodically? How many different versions of the Mule runtime do you keep in your deployment?
How do security and continuous patching affect your apps?
What type of traffic do you have for your apps? Is there a group of critical apps in your deployment that you typically scale vertically or horizontally to accommodate peaks of traffic?

Answering all these questions will give you a better understanding of how many IPs you need to keep unused in your CIDR block for zero downtime operations. If you want to be certain that you'll have enough IPs, just plan for the worst-case scenario, that is, two times the number of workers or IPs in your VPC.
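As a rough sketch of that reasoning: the IPs to plan for are the workers running in steady state plus the extra workers that may exist side by side during updates. The function and parameter names are ours, and defaulting to the worst case (every worker replaced at once) mirrors the advice above.

```python
def ips_with_headroom(steady_state_workers, workers_updated_in_parallel=None):
    """IPs to plan for, including the temporary workers of side-by-side updates.

    If workers_updated_in_parallel is not given, assume the worst case in which
    every worker is being replaced at the same time.
    """
    if workers_updated_in_parallel is None:
        workers_updated_in_parallel = steady_state_workers
    return steady_state_workers + workers_updated_in_parallel


print(ips_with_headroom(40))      # worst case: all 40 workers updated at once
print(ips_with_headroom(40, 10))  # updates rolled out in groups of 10 workers
```

Updating apps in smaller groups lowers the headroom you need, which is why the questions above matter when address space is tight.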

Number of environments

Each environment (and the apps deployed to it) will be hosted in one and only one VPC. The same environment cannot belong to two different VPCs. Often, an application will have one version deployed in every environment: the version running in production plus the versions we have in dev, QA, and/or test, for instance. For that reason, we need to consider how many environments we will host in the same VPC.

The recommendation is to have at least two VPCs, one for production environments and another for non-production environments. Pay particular attention when you have multiple non-production environments in the same VPC, because that multiplies the number of IPs required for the same app by two, three, or more. And something else to ask: are you maintaining a version of every app in every environment?

With that, be sure to plan ahead: get the number of apps and the number of workers per app. The rest is just maths.
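To make that maths concrete, here is a minimal end-to-end sketch. The app names, worker counts, and number of environments are invented for illustration, and the function smallest_prefix_for is our own helper. It also ignores any addresses the platform may reserve for its own use, so treat the result as a lower bound rather than a definitive answer.

```python
def smallest_prefix_for(required_ips, min_prefix=16, max_prefix=24):
    """Return the prefix length of the smallest block that fits required_ips.

    Searches from /24 (smallest allowed block) down to /16 (largest allowed).
    """
    for prefix in range(max_prefix, min_prefix - 1, -1):
        if 2 ** (32 - prefix) >= required_ips:
            return prefix
    raise ValueError(f"{required_ips} IPs do not fit in a /{min_prefix} block")


# Hypothetical inventory: workers per app in each environment (assumption).
workers_per_app = {"orders-api": 3, "customers-api": 2, "batch-sync": 1}
environments_in_vpc = 3  # e.g. dev, QA and test sharing a non-production VPC

steady_state = sum(workers_per_app.values()) * environments_in_vpc
worst_case = steady_state * 2  # zero downtime: every worker duplicated at once

prefix = smallest_prefix_for(worst_case)
print(f"Steady state: {steady_state} IPs, worst case: {worst_case} IPs")
print(f"Smallest block that fits: /{prefix} ({2 ** (32 - prefix)} IPs)")
```

In practice you would also add room for the apps you expect to deploy in the future before settling on a block size.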