In this article we'll go over the server metric 'Load Averages', which you may see within WHM, cPanel Server Stats, DirectAdmin Server Stats and programs such as top. Understanding these values will help you see how your server is performing within the confines of its available resources. Knowing what this value should and shouldn't be on your server helps protect you and your hardware from unnecessary downtime, and tells you what to look out for. :)
This article applies to Virtual Private Servers, Cloud Servers and Dedicated Servers, as all of these services get CPU core allocations. It covers the top program and what its output means, as well as the nproc command for determining your server's CPU core count.
In order to understand load averages and how they affect your server, we first need to know how many CPU cores your server currently has. To do this, you'll need to be able to log in to the server over SSH (this guide uses the root user).
Once you're logged into your server you can run the nproc command, which prints a single or double digit number (depending on your CPU type).
Our example here shows how many CPU cores this server has:
[root@yourserverhere]# nproc
2
The nproc output tells us this server has 2 available CPU cores for processing information. Great, this will serve as the baseline for the rest of the guide when talking about load averages.
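If you'd rather check the core count from a script, here is a minimal Python sketch. It assumes a Linux server with Python 3 installed. The helper name `available_cores` is made up for this example and isn't part of nproc or any control panel.

```python
import os


def available_cores() -> int:
    """Return how many CPU cores this process is allowed to run on."""
    # os.sched_getaffinity() exists on Linux and respects CPU affinity,
    # which is close to what nproc reports.
    if hasattr(os, "sched_getaffinity"):
        return len(os.sched_getaffinity(0))
    # Fallback for other systems: total logical CPUs (may be None).
    return os.cpu_count() or 1


if __name__ == "__main__":
    print(available_cores())
```

On the example server above we'd expect this to print the same number nproc gave us.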
Now that we know the core count, we'll want to check the current load averages by running top on the command line. It'll give us output such as:
top - 11:32:02 up 98 days, 23:47, 2 users, load average: 1.10, 3.14, 5.18
Tasks: 108 total, 1 running, 107 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.1 us, 0.1 sy, 0.0 ni, 99.8 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 4194304 total, 1669384 free, 1278944 used, 1245976 buff/cache
KiB Swap: 0 total, 0 free, 0 used. 2794300 avail Mem
In this output we see things such as uptime, CPU usage, memory availability, running processes and load averages. The load averages value is what we're interested in here, as it gives us a general understanding of how our server is performing.
The load average is reported as three numbers: the average over the last 1 minute, the last 5 minutes and the last 15 minutes. In the output above those are 1.10, 3.14 and 5.18 respectively.
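To make the three parts concrete, here is a small Python sketch that pulls them out of the first line of the top output shown earlier. The function name `parse_load_average` is hypothetical, and the sketch assumes the values use a dot as the decimal separator, as in our example.

```python
import re
from typing import Tuple


def parse_load_average(header: str) -> Tuple[float, float, float]:
    """Extract the 1, 5 and 15 minute load averages from a top header line."""
    match = re.search(
        r"load average:\s*([\d.]+),\s*([\d.]+),\s*([\d.]+)", header
    )
    if match is None:
        raise ValueError("no load average found in this line")
    one, five, fifteen = (float(value) for value in match.groups())
    return one, five, fifteen


# First line of the top output from this article.
header = "top - 11:32:02 up 98 days, 23:47, 2 users, load average: 1.10, 3.14, 5.18"
print(parse_load_average(header))  # (1.1, 3.14, 5.18)
```

The first number is the 1 minute average, the second the 5 minute average and the third the 15 minute average.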
Within Linux, the load average is meant to measure system load as a whole. It counts the number of threads that are working or waiting to work (on Linux this also includes threads in uninterruptible sleep, typically waiting on disk I/O). With CPUs only being able to handle so many processes at one time, this number acts as an overall value for the work the server is currently performing.
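On Linux the same numbers are also exposed in the /proc/loadavg file. Its fourth field shows how many scheduling entities (threads) are runnable right now out of how many exist, which ties in with the "working or waiting to work" idea. Below is a rough Python sketch of reading that format. The function name `parse_proc_loadavg` and the sample line are made up for illustration and are not taken from a real server.

```python
def parse_proc_loadavg(text: str) -> dict:
    """Split one /proc/loadavg line into named values."""
    fields = text.split()
    if len(fields) < 5:
        raise ValueError("expected five space-separated fields")
    # Fourth field looks like "runnable/total".
    runnable, total = fields[3].split("/")
    return {
        "load_1": float(fields[0]),
        "load_5": float(fields[1]),
        "load_15": float(fields[2]),
        "runnable_now": int(runnable),
        "total_entities": int(total),
    }


# Illustrative sample only; on a server you would read the real file:
#   with open("/proc/loadavg") as handle: text = handle.read()
sample = "1.10 3.14 5.18 1/108 12345"
info = parse_proc_loadavg(sample)
print(info["load_15"], info["runnable_now"], info["total_entities"])
```

The fifth field (12345 in the sample) is the PID of the most recently created process and is ignored here.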
Let's break this down into an easy-to-understand format.
If a server is only a dual core server, like the one we saw when we ran nproc, think of it as a highway with only two lanes to handle the vehicles travelling on it. If this highway is operating at capacity (100%), we can expect to see a value of 2.00. If the highway is operating at half capacity, we would see 1.00.
A value over 2.00 would mean the highway is congested, so processes take longer to complete (just as traffic takes longer to get through a congested highway).
A consistently busy highway could indicate an issue that you would want to have investigated.
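The highway analogy boils down to dividing the load average by the number of cores. Here is a minimal Python sketch of that arithmetic. The name `load_percent` is hypothetical, and treating "load divided by cores" as a percentage of capacity is this article's rule of thumb rather than an exact measurement.

```python
def load_percent(load: float, cores: int) -> float:
    """Express a load average as a percentage of the available lanes (cores)."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return load / cores * 100


# Two-lane highway (2 cores) at half capacity, full capacity and congested.
for load in (1.00, 2.00, 3.00):
    print(f"load {load:.2f} on 2 cores -> {load_percent(load, 2):.0f}% of capacity")
```

Anything above 100% means there is more traffic than there are lanes, so some work is waiting its turn.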
A server spiking over its allocated CPU core count is OK every once in a while. However, if you see this value staying consistently higher than the available core count, that indicates a problem that should be looked into by yourself and/or your server management company. The three averages (1, 5 and 15 minutes) let you tell a short spike from a lasting issue. For example, if the 15 minute average persists above 2.00 on our dual core server, you know that something is awry.
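One way to picture the spike versus sustained distinction is the small Python sketch below. The function `assess_load` and its three labels are assumptions made for this example. They follow the rule of thumb above (compare each average with the core count) and are not an official threshold from top or any control panel.

```python
def assess_load(one: float, five: float, fifteen: float, cores: int) -> str:
    """Classify load averages against the core count (rule of thumb only)."""
    if fifteen > cores:
        # Over capacity for a long stretch: worth investigating.
        return "sustained"
    if one > cores or five > cores:
        # Briefly over capacity: fine every once in a while.
        return "spike"
    return "ok"


# Values from the top output in this article, on the 2 core server.
print(assess_load(1.10, 3.14, 5.18, cores=2))  # sustained
# A short burst on the same server.
print(assess_load(2.50, 1.20, 0.80, cores=2))  # spike
```

In the article's example the 1 minute value has already dropped back below 2.00, but the 15 minute value shows the server spent a good while over capacity.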
Servers with larger core counts have more highway lanes to handle the processes (vehicles), so the values for the 1, 5 and 15 minute time frames can be larger without indicating a problem.
For example, a 32 core server could show values of 1.00, 2.00, 15.00 or 24.00 for periods longer than 15 minutes, and that would still be within that server's capability.
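Another way to look at the same numbers is to count the free lanes that are left. The Python sketch below does that for the 32 core example and, for contrast, for the 15 minute value from our dual core server. The name `headroom` is hypothetical.

```python
def headroom(load: float, cores: int) -> float:
    """Return how many cores' worth of capacity is left (negative = congested)."""
    return cores - load


# The 32 core examples from this article: all leave lanes free.
for load in (1.00, 2.00, 15.00, 24.00):
    print(f"32 cores, load {load:.2f}: {headroom(load, 32):.2f} lanes free")

# The 15 minute average from the dual core server earlier in the article.
print(f"2 cores, load 5.18: {headroom(5.18, 2):.2f} lanes free")
```

A load of 24.00 still leaves 8 lanes free on the 32 core server, while 5.18 on the dual core server comes out negative, meaning more traffic than lanes.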
Understanding the load average and how it applies to your server will help you determine whether the load averages you're seeing are indicative of an issue or whether your server is simply busy. :)