Vertical or Horizontal Scaling – What’s better on the cloud?

This blog post takes a deeper look at one part of the 8-step process to architect solutions on the Google Cloud Platform (GCP). Here we will explore when vertical scaling makes sense, when horizontal scaling does, and the best practices involved in getting to a scalable GCP architecture.

In simple terms, vertical scaling means adding more power (CPU, RAM) to an existing machine. Horizontal scaling means adding more machines to your pool of resources.

An example

Let’s take the example of an online flower delivery business that’s built on the Google Cloud Platform (GCP). The business is growing and needs more resources to meet demand. It can ‘scale vertically’ and go from a 2-core machine to a 64-core machine on GCP, and may be able to go higher still in the near future. That keeps things simple: the business remains on a single box with upgraded capabilities. But vertical scaling has a big disadvantage. Everything still depends on that one box, so an outage takes the whole service down, and the larger the user base, the more that outage hurts.


With horizontal scaling, the flower business can solve the central issue of a single box failure by running multiple boxes with a similar configuration. But this needs to be designed in early to get the desired flexibility.

Other pros and cons

The other big advantages of horizontal scaling are that you can automate more easily and do rolling deployments without taking everything offline. With vertical scaling, a deployment means downtime. So you may ask: what’s the downside of horizontal scaling? It’s higher latency, typically because requests pass through a load balancer and work may hop between machines. But that’s an acceptable trade-off against downtime. Horizontal scaling also carries more overhead, but this is easily outweighed by benefits like failure management, easier upgrades and configuration.
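To make the deployment difference concrete, here is a minimal sketch in Python. It is a toy simulation, not a GCP API: the function name `rolling_deploy` and the version strings are hypothetical, and it assumes boxes are upgraded strictly one at a time.

```python
from typing import List


def rolling_deploy(versions: List[str], new_version: str) -> List[int]:
    """Upgrade boxes one at a time, in place.

    Returns how many boxes were still serving traffic during each upgrade step.
    """
    serving_counts = []
    for i in range(len(versions)):
        # Box i is out of rotation while it is upgraded; the rest keep serving.
        serving_counts.append(len(versions) - 1)
        versions[i] = new_version
    return serving_counts


if __name__ == "__main__":
    pool = ["v1", "v1", "v1"]           # horizontal: three boxes
    print(rolling_deploy(pool, "v2"))   # boxes serving at each step
    single = ["v1"]                     # vertical: one big box
    print(rolling_deploy(single, "v2"))
```

With three boxes, two are always serving while the third is upgraded. With a single box, the count drops to zero during the upgrade, which is the downtime described above.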

How do you determine the configuration needed?

So, how do you determine how many boxes you need for horizontal scaling? Google recommends N/3 queries per second: if N is your total load, spread it across three boxes so that each handles about a third of it. With only two boxes you do have fault tolerance, but a failure puts the entire load on the one remaining box. With three boxes, the load of a failed box is shared between the two survivors, so you have enough capacity. The goal is always to be as stateless as possible. We covered ‘State’ in a previous blog post.
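The arithmetic behind that recommendation can be sketched in a few lines of Python. The function names and the figure of 900 queries per second are illustrative assumptions, and the sketch assumes the load balancer splits traffic evenly across the boxes that are still up.

```python
def load_per_box(total_qps: float, boxes: int, failed: int = 0) -> float:
    """Queries per second each surviving box must serve, assuming an even split."""
    alive = boxes - failed
    if alive <= 0:
        raise ValueError("no boxes left to serve traffic")
    return total_qps / alive


def failure_surge(boxes: int) -> float:
    """Factor by which per-box load grows when exactly one box fails."""
    if boxes < 2:
        raise ValueError("a single box has no survivor to absorb the load")
    return boxes / (boxes - 1)


if __name__ == "__main__":
    total = 900  # hypothetical peak load in queries per second
    for n in (2, 3):
        print(n, load_per_box(total, n), load_per_box(total, n, failed=1), failure_surge(n))
```

With two boxes, a single failure doubles the load on the survivor. With three, each survivor takes on only 50% more, which is why a box sized with modest headroom can ride out a failure.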

With horizontal scaling, Google recommends keeping the functionality of each server simple and letting it do one thing well. If a task can be split up, use separate servers. Stateless servers are easier to scale because there is no state to rebalance, easier to recover because there is no state to restore after a failure, and easier to load-balance. Before you determine the configuration needed, ask questions about SLOs (Service Level Objectives) and user needs: What resources do we need? Do we need central control, or can we make do without it?
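Here is a small Python sketch of why statelessness makes load balancing and failure handling easier. Everything in it is hypothetical (the `handle` function, the `RoundRobinPool` class, the server names); it is a toy round-robin rotation, not how Google Cloud load balancers are implemented.

```python
import itertools
from typing import Iterable, Tuple


def handle(request: str) -> str:
    """Stateless handler: the reply depends only on the request itself."""
    return f"order accepted: {request}"


class RoundRobinPool:
    """Toy round-robin rotation over a set of interchangeable servers."""

    def __init__(self, names: Iterable[str]) -> None:
        self.names = list(names)
        self._cycle = itertools.cycle(self.names)

    def remove(self, name: str) -> None:
        # A failed box is simply dropped; there is no state to recover or rebalance.
        self.names.remove(name)
        self._cycle = itertools.cycle(self.names)

    def route(self, request: str) -> Tuple[str, str]:
        # Any server can take any request, so we just pick the next one in turn.
        server = next(self._cycle)
        return server, handle(request)


if __name__ == "__main__":
    pool = RoundRobinPool(["box-a", "box-b", "box-c"])
    print(pool.route("roses"))
    pool.remove("box-b")  # simulate a failure
    print(pool.route("tulips"))
```

Because the handler keeps nothing between requests, removing a box changes only the rotation. If each box held its own customer sessions, the same failure would also require moving or rebuilding that state.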


So in summary: with horizontal scaling, small stateless servers increase reliability and scalability, but the duplication and isolation of those servers need to be managed. Vertical scaling is simpler because it is all just one box, and it reduces latency and complexity, but that one box remains a single point of failure, so it is not as elegant or effective an option.

The good thing about Google Cloud is that you can adjust: benchmark and test continuously as you grow, and resize as needed. As Google says, ‘design first and dimension later’.
