How is database farm management different from instance management? Why do you need it and when? That is what I will try to shed some light on in this blog. If you are responsible for a database farm, read on.
Managing instances (watching and tuning performance, handling incidents, and generally maintaining them) has always been the DBA's domain. DBAs are focused on how the database servers perform right now. As the number of instances grows, you will need more DBAs to keep every instance shipshape day to day. This is when you need to consider farm management as well.
Managing the database server farm is about managing and optimizing resources, cost, risk, and inventory, and about planning, forecasting, reporting, and budgeting. Database farm management is focused on the medium- and long-term future, so it is usually done by senior DBAs and IT operations managers.
As an analogy, think of the difference between database farm management and instance management as the difference between managing public transport in a large city and managing a Formula One race team. The former is concerned with moving as many people as possible on buses, trams, and trains in a cost-efficient manner, while the latter is concerned with making one or two cars win the race at almost any cost.
Because database farm management is different from instance management, doing it efficiently requires more comprehensive tools than those a DBA usually works with.
The first task in database farm management is to get a total overview of all the server instances under your responsibility. A complete overview is crucial, since you cannot manage what you cannot see or do not know about. This may seem trivial, but I have seen too many sites that do not have a complete overview of all their database servers. Sometimes departments or outside third-party solution vendors will install new servers without informing IT, or someone will deploy a temporary cloud server and forget to decommission it. In most cases, it will come back to haunt you – whether deserved or not. Make sure you have the complete overview: install tools that auto-scan your networks for new instances, and keep a close eye on your cloud services bill for new servers popping up.
Now that you know which database servers you are responsible for, you have the overview. While you are at it, collect as many relevant properties as you can for each server, such as platform, version, location, resources, and licenses. You will need them later.
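To make the inventory idea concrete, here is a minimal sketch of comparing what a scan found against what the inventory lists. The `Instance` record, the `find_unregistered` helper, and all host names are hypothetical, and the scan results are hard-coded; in practice they would come from whatever discovery tool or cloud billing export you use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    """One inventory record; the fields mirror the properties listed above."""
    host: str
    port: int
    platform: str
    version: str
    location: str
    license: str


def find_unregistered(discovered, inventory):
    """Return (host, port) endpoints that were discovered but are not in the inventory."""
    known = {(inst.host.lower(), inst.port) for inst in inventory}
    seen = {(host.lower(), port) for host, port in discovered}
    return sorted(seen - known)


# Hypothetical inventory and scan results, for illustration only.
inventory = [
    Instance("db01.example.internal", 5432, "PostgreSQL", "15.4", "dc-east", "community"),
    Instance("db02.example.internal", 1433, "SQL Server", "2019", "dc-west", "enterprise"),
]
discovered = [
    ("db01.example.internal", 5432),
    ("DB02.example.internal", 1433),        # same server, different letter case
    ("tmp-cloud-7.example.internal", 3306),  # nobody registered this one
]

unregistered = find_unregistered(discovered, inventory)
print(unregistered)
```

Anything that ends up in `unregistered` is a server you are probably responsible for but did not know about.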
The next step is to monitor status and health.
Are your instances OK, or do you need to take corrective or preventive action? There are lots of tools to help you with monitoring. Make sure they monitor every instance on your list, so you are not caught out when somebody complains about a server you somehow forgot to include in your monitoring scheme. So, monitor them all – all the time. Being able to show and document this to any manager who asks is also a sign of professionalism: it demonstrates that you know what you are doing and that you are in control.
The goal of database operations is to have everything available with acceptable performance whenever needed. If you fail to monitor, you can only react to service complaints since you have no forewarning to let you take preventive action.
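As a rough illustration of both points (cover every instance, and get a warning before things break), the sketch below walks the inventory and reports a status per instance. The `farm_status` function, the `disk_used_pct` metric, and the 80/95 percent thresholds are assumptions chosen for the example, not recommendations.

```python
def farm_status(inventory_names, samples, warn_pct=80.0, crit_pct=95.0):
    """Return a status per inventoried instance.

    samples maps instance name -> latest metrics dict; an instance with no
    sample is reported as "unmonitored" rather than silently skipped.
    """
    status = {}
    for name in inventory_names:
        sample = samples.get(name)
        if sample is None:
            status[name] = "unmonitored"
        elif sample["disk_used_pct"] >= crit_pct:
            status[name] = "critical"
        elif sample["disk_used_pct"] >= warn_pct:
            status[name] = "warning"  # early signal: time for preventive action
        else:
            status[name] = "ok"
    return status


# Hypothetical data: db03 is in the inventory but has no monitoring sample.
inventory_names = ["db01", "db02", "db03"]
samples = {
    "db01": {"disk_used_pct": 62.0},
    "db02": {"disk_used_pct": 88.5},
}

status = farm_status(inventory_names, samples)
print(status)
```

The design choice worth noting is that the loop is driven by the inventory, not by the monitoring data, so a forgotten server shows up as a finding instead of disappearing.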
When you can monitor the farm as a whole and see the bigger picture, it also becomes easier to see where to direct your DBA expertise for the greatest impact on overall system performance and health.
If you have set this up appropriately so far, you should be in a position to quickly produce any report on all your servers that is required for internal reporting, budgeting, or audit.
Another use for this overview is to see which versions you are running, so you can plan upgrade and patch cycles.
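A version report of this kind can be sketched in a few lines. The instance tuples, the `version_report` and `needs_upgrade` helpers, and the minimum-version policy below are all made up for illustration, and the version parsing assumes purely numeric, dot-separated version strings.

```python
from collections import defaultdict


def version_report(instances):
    """Group instance names by (platform, version)."""
    report = defaultdict(list)
    for name, platform, version in instances:
        report[(platform, version)].append(name)
    return dict(report)


def needs_upgrade(instances, minimum):
    """Return names of instances running below the minimum version for their platform.

    minimum maps platform -> lowest acceptable version as a tuple of ints
    (a hypothetical policy, not vendor guidance).
    """
    def parse(version):
        return tuple(int(part) for part in version.split("."))

    return sorted(
        name
        for name, platform, version in instances
        if platform in minimum and parse(version) < minimum[platform]
    )


# Hypothetical inventory extract: (name, platform, version).
instances = [
    ("db01", "PostgreSQL", "15.4"),
    ("db02", "PostgreSQL", "12.9"),
    ("db03", "MySQL", "8.0.36"),
    ("db04", "PostgreSQL", "15.4"),
]

report = version_report(instances)
outdated = needs_upgrade(instances, {"PostgreSQL": (14,), "MySQL": (8, 0)})
print(report)
print(outdated)
```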
One of the critical areas, and one of the main benefits, of database farm management is optimizing resource utilization.
Your database farm consists of large amounts of expensive and limited resources: memory, disk, CPU cores, and software licenses. These resources represent a large financial investment and cost, and your job is to ensure the farm is utilized optimally. Here are some typical questions you should ask yourself:
- Do I have servers that are not being used, and can be decommissioned and the resources returned to the free pool?
- Do I have underutilized servers that we possibly could consolidate to free resources?
- Do all the instances require and use all the memory that has been allocated?
- Do they need and use all the cores they have been allocated?
- Do I have servers that are starved of CPU or memory, that can better use these resources?
- Do all servers with enterprise licenses need enterprise licenses, or is there scope for reducing licenses and cost?
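The questions above can be turned into a simple rule-based review. The sketch below is one possible shape for it; the `Usage` record, its field names, and the 20 and 90 percent thresholds are assumptions for the example, and real thresholds depend on your workloads.

```python
from dataclasses import dataclass


@dataclass
class Usage:
    """Hypothetical per-instance usage summary over a review period."""
    name: str
    cpu_peak_pct: float
    mem_peak_pct: float
    connections_30d: int
    edition: str
    uses_enterprise_features: bool


def review(u, low=20.0, high=90.0):
    """Return a list of findings for one instance, based on simple thresholds."""
    findings = []
    if u.connections_30d == 0:
        findings.append("decommission candidate")
    elif u.cpu_peak_pct < low and u.mem_peak_pct < low:
        findings.append("consolidation candidate")
    if u.cpu_peak_pct > high or u.mem_peak_pct > high:
        findings.append("resource starved")
    if u.edition == "enterprise" and not u.uses_enterprise_features:
        findings.append("license downgrade candidate")
    return findings


# Hypothetical farm extract.
farm = [
    Usage("db01", 5.0, 10.0, 0, "standard", False),
    Usage("db02", 12.0, 15.0, 340, "enterprise", False),
    Usage("db03", 97.0, 60.0, 5000, "enterprise", True),
]

findings = {u.name: review(u) for u in farm}
print(findings)
```

Each finding is only a candidate for a closer look; an instance with no connections in 30 days may still be a quarterly reporting server.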
I have seen sites that compare allocated memory with used memory on 1000+ servers every night, and then automatically reduce or increase the memory on each instance, shifting it to where it is most needed. It sounds like a big job, but it can be done completely automatically. The result was better overall performance and a delayed need for a new VM cluster: an elegant way of maximizing resource usage.
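How those sites implemented this is not described here, so the following is only a sketch of the general idea under simple assumptions: size each instance to its observed peak usage plus a fixed headroom, never going below a floor. The `rebalance` function, the 25 percent headroom, and the 4 GB floor are hypothetical.

```python
import math


def rebalance(allocated_gb, peak_used_gb, headroom=0.25, floor_gb=4):
    """Return a proposed memory allocation (whole GB) per instance.

    Each instance gets its observed peak plus headroom, rounded up,
    and never less than floor_gb.
    """
    plan = {}
    for name in allocated_gb:
        target = math.ceil(peak_used_gb[name] * (1 + headroom))
        plan[name] = max(floor_gb, target)
    return plan


# Hypothetical nightly figures, in GB.
allocated_gb = {"db01": 64, "db02": 32, "db03": 16}
peak_used_gb = {"db01": 20.0, "db02": 30.0, "db03": 15.5}

plan = rebalance(allocated_gb, peak_used_gb)
freed_gb = sum(allocated_gb.values()) - sum(plan.values())
print(plan, freed_gb)
```

In this made-up example the over-provisioned instance shrinks, the two tight ones grow, and the farm as a whole still needs less memory than before.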
When you have convinced yourself that you have taken all the slack out of your farm, you can start planning for expansion. If you have trend charts showing how the resource usage of the whole database farm is growing, you have a good starting point for planning and budgeting for growth. When you can also document that the farm holds no more spare resources than necessary, it should be easier to argue for more.
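A trend chart can be reduced to a number for the budget discussion. The sketch below fits a straight line through monthly usage figures with ordinary least squares and estimates how many months remain before a given capacity is reached. The function names, the sample figures, and the assumption of linear growth are all illustrative; real growth is often lumpier.

```python
def linear_trend(values):
    """Least-squares slope and intercept for values sampled at x = 0, 1, 2, ..."""
    n = len(values)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


def months_until_full(values, capacity):
    """Estimate months from the last sample until the trend line reaches capacity.

    Returns None when usage is flat or shrinking.
    """
    slope, intercept = linear_trend(values)
    if slope <= 0:
        return None
    fitted_now = intercept + slope * (len(values) - 1)
    return max(0.0, (capacity - fitted_now) / slope)


# Hypothetical farm-wide storage usage in TB, one figure per month.
usage_tb = [100, 110, 120, 130, 140, 150]
remaining = months_until_full(usage_tb, capacity=200)
print(remaining)
```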