The total amount of data in the universe is growing by 40% a year, and we want to store and access it. A lot of it ends up in our databases. We need to scale our database instances to cater to both the growth in data volumes and the increased traffic and workload.
This strains our database engines as we push ever-increasing demands on indexes, searches, references, updates, reliability and availability.
How do we plan for database scalability?
As data and workloads grow, the number of server instances in use by businesses grows accordingly, and stable, reliable IT systems become more important. Businesses can no longer cope with rigid, non-scalable systems and tools.
Making your databases and applications scalable is not necessarily a simple task. There are two main variations of server scalability to consider.
In this blog, we will outline the pros and cons of vertical and horizontal scaling.
Vertical scaling, or “scaling up”, involves adding more resources to a server instance. By increasing CPU resources, memory, and storage or network bandwidth, the performance of every individual node can be improved, scaling small servers to handle large databases.
This is the "old school" way of scaling up. You either procure the largest machine you can afford or give your existing server as many resources as you can to fit the workload demand.
Vertical scaling is typically easy for application developers, as they do not have to do much to support a larger machine. The same goes for DBAs: a single large server is usually quite easy to manage. But at some point, you may reach the capacity ceiling of the hardware you have, and you will need to spend an ever-increasing amount of time optimizing the application, SQL, and server instance to wring out every last bit of performance. This brings ever-diminishing returns as you get closer to the maximum size of a single server. Eventually, the cost and effort of scaling vertically hit a ceiling, and you can no longer scale up.
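To get a feel for how soon that ceiling might arrive, here is a minimal sketch of the compound-growth arithmetic. It assumes the 40% annual growth figure from the introduction applies to your own data; the function name and the 2 TB and 16 TB figures are hypothetical and only there for illustration.

```python
import math


def years_until_ceiling(current_tb: float, ceiling_tb: float,
                        annual_growth: float = 0.40) -> float:
    """Years of compound growth before the data outgrows a single server.

    Solves current_tb * (1 + annual_growth) ** years == ceiling_tb for years.
    All inputs are assumptions you must supply from your own measurements.
    """
    if current_tb <= 0 or ceiling_tb <= 0 or annual_growth <= 0:
        raise ValueError("sizes and growth rate must be positive")
    if current_tb >= ceiling_tb:
        return 0.0  # the ceiling has already been reached
    return math.log(ceiling_tb / current_tb) / math.log(1 + annual_growth)


if __name__ == "__main__":
    # Hypothetical example: 2 TB today, largest affordable server holds 16 TB.
    print(f"{years_until_ceiling(2, 16):.1f} years until the ceiling")
```

With these example numbers, the data doubles about every two years, and the server is outgrown after a little over six years. A faster growth rate or a lower ceiling shortens that window.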
Vertically scaled systems do come with some disadvantages, though. Not only can initial hardware costs be high due to the need for high-end hardware and virtualization, but upgrades can be both expensive and limited. There is, after all, only so much you can add to one machine before your database outgrows it. Clustering technologies such as Always On or RAC are normally applied to these large servers to make them reliable and to give them enough capacity to handle the load.
You may also find yourself quickly ‘locked-in’ to a particular database vendor by following this strategy, and moving away from that vendor later could mean very expensive server upgrades.
In summary, vertical scaling is much easier to establish and administer, as it involves only a small number of machines, or even just one. Vertical systems can also offer advantages in stability, reliability, and development. They can save costs too, because they suit smaller data centers and license costs might be lower.
Horizontal scaling, or “scaling out,” means adding capacity by adding more database server instances. Databases are spread across more machines to meet increased demand and capacity needs.
This is now often the preferred way of scaling databases and is often referred to as the "cloud native" way of doing things.
When more capacity is needed in a system, DBAs can add more machines to keep up. In database terms, this means that data must be partitioned across all the machines that make up the cluster, with each individual server holding one part of the database.
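The partitioning described above can be sketched with simple hash-based sharding. This is only an illustration under stated assumptions: the shards are in-memory dictionaries standing in for separate database servers, and `shard_for` and `ShardedStore` are hypothetical names, not part of any particular database product.

```python
import hashlib


def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard index using a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


class ShardedStore:
    """Toy key-value store; each dict stands in for one database server."""

    def __init__(self, shard_count: int) -> None:
        if shard_count < 1:
            raise ValueError("at least one shard is required")
        self.shards = [{} for _ in range(shard_count)]

    def put(self, key: str, value: object) -> None:
        # Each key is stored on exactly one shard.
        self.shards[shard_for(key, len(self.shards))][key] = value

    def get(self, key: str) -> object:
        # The same hash sends the lookup to the shard that holds the key.
        return self.shards[shard_for(key, len(self.shards))].get(key)


if __name__ == "__main__":
    store = ShardedStore(shard_count=4)
    for customer_id in ("c-1001", "c-1002", "c-1003"):
        store.put(customer_id, {"id": customer_id})
    print(store.get("c-1002"))
    print([len(shard) for shard in store.shards])
```

Note that with plain modulo hashing, changing the number of shards moves most keys to a different shard. Schemes such as consistent hashing or range-based partitioning are commonly used to reduce that data movement when machines are added.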
Usually, applications need to be written specifically for horizontal scaling. But once built, it becomes very easy to scale the application to meet any workload.
Unless you are certain that your application will fit comfortably on a single server forever, you should write it so it can be horizontally scaled. Even then, consider making it horizontally scalable to make it future-proof and cost-effective to deploy in a cloud.
Horizontally scaled servers can also make use of data replication, whereby one machine holds a primary copy of the entire database while additional copies and/or caches are used for read-only load sharing. You often see this in large cloud-based solutions, where data is distributed across geographic regions and data centers by means of database read replicas. Replication of this kind can improve response times, performance, and availability.
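As a rough sketch of read-only load sharing, the toy class below sends every write to the primary and spreads reads across replicas in round-robin order. It makes simplifying assumptions: the "servers" are in-memory dictionaries, the class name `ReplicatedDatabase` is hypothetical, and replication is done synchronously inside `write`.

```python
import itertools


class ReplicatedDatabase:
    """Toy model: one primary copy plus read-only replicas."""

    def __init__(self, replica_count: int) -> None:
        if replica_count < 1:
            raise ValueError("at least one replica is required")
        self.primary = {}
        self.replicas = [{} for _ in range(replica_count)]
        # Round-robin iterator over replica indexes for read load sharing.
        self._next_replica = itertools.cycle(range(replica_count))

    def write(self, key: str, value: object) -> None:
        # All writes go to the primary first.
        self.primary[key] = value
        # Assumption: changes are copied to every replica immediately.
        for replica in self.replicas:
            replica[key] = value

    def read(self, key: str) -> tuple:
        # Reads never touch the primary; they rotate across the replicas.
        index = next(self._next_replica)
        return index, self.replicas[index].get(key)


if __name__ == "__main__":
    db = ReplicatedDatabase(replica_count=3)
    db.write("order-42", "shipped")
    for _ in range(4):
        print(db.read("order-42"))
```

In real deployments, replication is frequently asynchronous, so a read from a replica can briefly return older data than the primary holds. Applications that need to read their own writes usually have to account for that lag.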
Horizontal scaling has several key advantages over a vertical approach. The main advantage is that, once built, horizontally scalable systems can scale to almost any size and take full advantage of elastic cloud computing by adding more servers to handle peak load. Individual nodes are cheaper than in a vertical set-up, and upgrades are quicker and more affordable. Maintenance can be easier, too, with faulty instances quickly switched out with minimal disruption to the rest of the system.
However, a high number of instances adds complexity to a system. It makes monitoring, administration, and troubleshooting more difficult and time-consuming, particularly during disaster recovery. When machines are licensed individually, licensing fees will also be higher. In addition, the physical space needed to house multiple servers brings cost and logistical issues.
As database requirements continue to grow, your organization will need to adopt a form of scalability to keep up. While horizontal scaling is widely considered to be modern, flexible, and advantageous, it can bring some unwanted challenges that you will eventually need to manage.
Whether you scale vertically or horizontally, you will always need to calculate the optimum hardware and software combination, especially from a cost perspective. You may want to use many small and cheap 4-core blade servers, or one or a few fast and large 24-core or larger servers. Even more important, check the software license cost for the total system based on the number of nodes and cores. Do the math; it is a non-trivial and important cost decision.
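The math can be sketched as a small comparison of total cost per option. Every price below is a made-up placeholder, and the licensing model (a fee per node plus a fee per core) is an assumption; substitute your vendor's actual terms. `ServerOption` is a hypothetical helper name.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerOption:
    """One candidate configuration; all prices are hypothetical inputs."""
    name: str
    node_count: int
    cores_per_node: int
    hardware_cost_per_node: float
    license_cost_per_node: float
    license_cost_per_core: float

    @property
    def total_cores(self) -> int:
        return self.node_count * self.cores_per_node

    def total_cost(self) -> float:
        # Hardware scales with nodes; licenses scale with nodes and cores.
        hardware = self.node_count * self.hardware_cost_per_node
        node_licenses = self.node_count * self.license_cost_per_node
        core_licenses = self.total_cores * self.license_cost_per_core
        return hardware + node_licenses + core_licenses


if __name__ == "__main__":
    options = [
        # Placeholder prices, chosen only to make the arithmetic visible.
        ServerOption("six 4-core blades", 6, 4, 3_000, 1_000, 2_000),
        ServerOption("one 24-core server", 1, 24, 25_000, 1_000, 2_000),
    ]
    for option in sorted(options, key=ServerOption.total_cost):
        print(f"{option.name}: {option.total_cores} cores, "
              f"total {option.total_cost():,.0f}")
```

With these placeholder figures, both options have 24 cores, and the totals come out at 72,000 for the blades and 74,000 for the single server. Because the per-core license dominates both totals, a small change in license terms can easily flip the result, which is why the license check matters as much as the hardware price.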
Both approaches bring challenges: vertical scaling runs into hardware ceilings and expensive upgrades, while horizontal scaling adds complexity and logistical overhead. In either case, good tools for monitoring, analysis, and administration are necessary to track performance, reduce these challenges, improve productivity, and keep costs in check.
Which approach you choose will depend on your business requirements. Regardless of what you opt for now, keep in mind that as your business continues to grow, you will eventually need to keep up with ever-expanding databases.