cpanel backups taking a very long time

turbo2ltr

Member
I run daily cPanel backups. My home directory is about 2.3GB across 16 accounts. cpbackup typically takes about 3.5 hours to run, which seems like an awfully long time.

Last night the backup started at 1AM and didn't finish until 6:12am. It just doesn't make sense.

Server load history:
Code:
12:10:01 AM         1        72      1.06      0.82      0.40
12:20:03 AM         0        74      2.69      1.84      1.05
12:30:03 AM         1        79      0.38      0.50      0.69
12:40:01 AM         4        76      0.15      0.75      0.81
12:50:03 AM         0        77      0.41      0.42      0.63
01:00:01 AM         0        75      0.35      0.34      0.45
01:10:01 AM         1        74      1.86      1.91      1.23
01:20:01 AM         2        75      1.06      1.56      1.44
01:30:07 AM         0        76      3.62      1.81      1.51
01:40:01 AM         0        77      3.20      2.65      2.04
01:50:06 AM         7        73      2.90      2.72      2.40
02:00:03 AM         0        80      2.23      2.19      2.24
02:10:02 AM         0        75      1.75      2.26      2.32
02:20:01 AM         0        72      1.35      2.01      2.22
02:30:05 AM         1        64      1.13      1.58      1.96
02:40:04 AM         0        74      2.26      2.02      1.95
02:50:10 AM        16        74      3.84      2.99      2.58
03:00:06 AM         7        68      2.40      2.49      2.66
03:10:06 AM         5        72      3.80      2.30      2.32
03:20:01 AM         0        77      2.66      2.77      2.59
03:30:03 AM         1        76      4.12      3.84      3.13
03:40:01 AM         1        74      1.37      2.08      2.56
03:50:01 AM         1        78      2.05      1.80      2.13
04:00:01 AM         0        81      2.08      2.18      2.18
04:10:02 AM         0        82      2.86      2.88      2.60
04:20:01 AM         1        79      2.27      3.10      2.97
04:30:02 AM         0        82      2.19      2.64      2.81
04:40:05 AM         0        75      3.79      3.40      3.14
04:50:04 AM         9        72      3.04      3.10      3.12
05:00:06 AM         5        76      4.04      3.68      3.40
05:10:01 AM         2        74      1.63      1.82      2.48
05:20:02 AM         1        75      1.37      1.41      1.94
05:30:07 AM         5        68      1.81      2.19      2.13
05:40:01 AM         1        74      1.30      1.71      1.98
05:50:01 AM         1        71      0.64      0.86      1.43
06:00:02 AM         0        77      0.75      0.78      1.10
06:10:01 AM         1        71      0.11      0.57      0.92
06:20:04 AM        14        71      2.83      2.20      1.58
06:30:01 AM         2        70      1.15      1.59      1.55
06:40:02 AM         0        70      0.33      1.41      1.62
06:50:01 AM         1        69      0.08      0.39      0.97

Any ideas why it would take so long?
 
I see actually cpbackup was complete at 5:22am... still a long time. The sync with S3 was complete at 6:12.
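For reference, the timings above can be worked out with a short sketch. It assumes the backup started at exactly 1:00 AM (the post only says "1AM"), and the helper name `minutes_between` is made up for illustration.

```python
from datetime import datetime

def minutes_between(start: str, end: str) -> int:
    """Return whole minutes between two same-day clock times like '1:00AM'."""
    fmt = "%I:%M%p"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds() // 60)

# Assumption: the run began at exactly 1:00 AM.
cpbackup = minutes_between("1:00AM", "5:22AM")  # cpbackup itself
s3_sync = minutes_between("5:22AM", "6:12AM")   # sync to S3 afterwards

print(f"cpbackup: {cpbackup} min, S3 sync: {s3_sync} min, total: {cpbackup + s3_sync} min")
```

Under that assumption, cpbackup alone accounts for roughly four and a half hours and the S3 sync for a little under an hour.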

All my sites are loading very slowly this morning. Nothing in top shows anything hogging resources, but server load seems abnormally higher than what I'm used to seeing.

Code:
top - 07:10:44 up 52 days, 12:32,  1 user,  load average: 1.00, 0.96, 0.93
Tasks:  47 total,   2 running,  45 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.0%us,  0.0%sy,  0.2%ni, 99.2%id,  0.6%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:    524288k total,   230960k used,   293328k free,        0k buffers
Swap:        0k total,        0k used,        0k free,        0k cached
 
Hello turbo2ltr,

I can suggest a couple of things:

1) Looking at what you posted from top: it shows your VPS using almost no CPU while the load is still high, which suggests that someone else on the server is using it. This would of course slow down backups and everything else for everyone on that particular machine.

2) You could try a reboot. Although if the cause is indeed due to #1 this will not help much.

What I would do is reboot. After rebooting, just give it some time; very often these things tend to work themselves out. Then after some time, or right away if you do not want to wait, send support a ticket and ask them to look at it, since there is load on the server that is affecting your VPS (and others', of course).

Hope that helps!
 
When you say "someone else using it", I assume you are talking about another VPS customer on the same box?

I'm confused because in another thread here, it was stated by KH that all load statistics that you see (top, uptime, etc) are for only your VPS instance, not the hardware.

I thought things might "pass" but load still seems high. Full top:

Code:
top - 11:54:33 up 52 days, 17:16,  1 user,  load average: 0.22, 0.52, 0.51
Tasks:  50 total,   1 running,  49 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:    524288k total,   255044k used,   269244k free,        0k buffers
Swap:        0k total,        0k used,        0k free,        0k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
    1 root      15   0  1996  660  568 S  0.0  0.1   0:45.33 init
 3714 root      15   0  7004 1068  668 S  0.0  0.2   0:00.02 sshd
 7430 root      18   0  1584  420  352 S  0.0  0.1   0:00.39 courierlogger
 7431 root      15   0  1696  536  456 S  0.0  0.1   0:00.41 couriertcpd
 7438 root      25   0  1584  336  280 S  0.0  0.1   0:00.00 courierlogger
 7439 root      25   0  1700  516  436 S  0.0  0.1   0:00.00 couriertcpd
 7445 root      18   0  1584  420  352 S  0.0  0.1   0:01.79 courierlogger
 7446 root      15   0  1692  532  456 S  0.0  0.1   0:01.51 couriertcpd
 7463 root      24   0  1588  340  280 S  0.0  0.1   0:00.00 courierlogger
 7464 root      25   0  1696  516  436 S  0.0  0.1   0:00.00 couriertcpd
 7469 root      18   0  1588  424  352 S  0.0  0.1   0:00.00 courierlogger
 7470 root      18   0  1940  620  512 S  0.0  0.1   0:00.02 authdaemond
 7471 root      15   0  1940  368  256 S  0.0  0.1   0:00.55 authdaemond
 7472 root      15   0  1940  368  256 S  0.0  0.1   0:00.51 authdaemond
 7744 root      18   0  7680 5144 1732 S  0.0  1.0   0:29.75 tailwatchd
 7904 root      18   0  125m  97m 5432 S  0.0 19.1   2:56.82 httpd
 8062 root      15   0  5808 3116 1760 S  0.0  0.6   0:07.98 authProg
 8116 root      15   0 16912 8036 1280 S  0.0  1.5   0:11.08 cpsrvd-ssl
 8180 root      18   0  5848 3212 1844 S  0.0  0.6   0:08.01 authProg
 9379 nobody    18   0  126m  97m 4640 S  0.0 19.1   0:00.16 httpd
 9381 nobody    15   0  129m 107m  11m S  0.0 21.1   0:00.69 httpd
 9391 nobody    18   0  126m  97m 4936 S  0.0 19.1   0:00.26 httpd
 9556 nobody    15   0  128m 105m 9.9m S  0.0 20.5   0:00.37 httpd
 9939 nobody    15   0  126m  97m 4588 S  0.0 19.1   0:00.14 httpd
10037 nobody    15   0  125m  97m 4480 S  0.0 19.0   0:00.15 httpd
10098 nobody    15   0  126m  97m 4640 S  0.0 19.1   0:00.22 httpd
10099 nobody    15   0  126m  97m 4512 S  0.0 19.1   0:00.22 httpd
10186 nobody    15   0  126m  98m 5124 S  0.0 19.2   0:00.24 httpd
12231 nobody    15   0  126m  98m 4960 S  0.0 19.2   0:00.20 httpd
13477 mailnull  15   0 10244 1164  676 S  0.0  0.2   0:00.01 exim
13614 root      15  -4  2080  636  428 S  0.0  0.1   0:00.00 udevd
14126 root      15   0  1656  572  476 S  0.0  0.1   0:25.32 syslogd
14199 root      16   0  2652  856  676 S  0.0  0.2   0:00.00 xinetd
15708 root      18   0  5508  704  428 S  0.0  0.1   0:00.00 saslauthd
15709 root      18   0  5508  440  164 S  0.0  0.1   0:00.00 saslauthd
15740 root      18   0  1612  428  344 S  0.0  0.1   0:00.00 portsentry
15915 root      15   0  6300 1596 1228 S  0.0  0.3   0:04.04 pure-ftpd
15917 root      16   0  6028 1216  972 S  0.0  0.2   0:01.21 pure-authd
19928 root      15   0  4012 2228 1080 S  0.0  0.4   0:10.87 cphulkd
20000 root      17   0 13932 7824 1396 S  0.0  1.5   0:00.00 cpdavd
20003 root      18   0  5260 3624 1280 S  0.0  0.7   1:07.91 queueprocd
20016 root      33  18  3732 1544  636 S  0.0  0.3   0:01.02 cpanellogd
20149 root      18   0  2384 1124  968 S  0.0  0.2   0:00.00 mysqld_safe
20174 mysql     15   0  164m  44m 4348 S  0.0  8.7   7:30.12 mysqld
22400 root      15   0  2136 1048  816 R  0.0  0.2   0:00.03 top
26013 root      15   0  4408 1120  564 S  0.0  0.2   0:12.59 crond
27923 root      15   0 10036 2796 2236 S  0.0  0.5   0:00.08 sshd
28275 named     25   0 93528 6192 2104 S  0.0  1.2   1:44.24 named
28341 root      16   0  2868 1260  976 S  0.0  0.2   0:00.00 login
28372 root      15   0  3684 1516 1228 S  0.0  0.3   0:00.01 bash

Hmm, some sites are now coming up as just blank pages, but a refresh brings up the page. Something is wacky.

I guess it's time for a reboot.
 
So after the RAID problem this morning and the subsequent reboot, my backups didn't run, so I ran them manually this morning. It took 11 minutes! There is some serious issue here. I wonder if it's been a RAID issue that finally culminated in the failure this morning, or just that every customer on the box does their backups at the same time. But I can't believe things would slow down from 11 minutes to 4 hours because of other customers unless the box was seriously oversold.

Guess we'll see tonight.
 
turbo2ltr,

Up until last night vz24-la didn't show any problems. We constantly monitor all physical machines and take appropriate action if any signs of a possible problem show up.
We do not oversell resources when it comes to VPS hosting. "Serious overselling" isn't a part of our business.

Regards,
Paul
 
Sorry, that came out wrong. I wasn't implying you oversold; the fact that you don't is one of the reasons I chose you guys.

But it still leaves the question as to possible reasons for the abnormally large differences in run times... I'll be curious to see what it does tonight.
 
turbo2ltr,

This could be related to pretty much anything. The best bet is to check what is going on with the VPS at the time the problem is happening. According to the stats there isn't much resource usage, and what I can think of right now is that something might be blocking the backup operation. For example, the /scripts/update_db_cache script might be running at the same time and blocking the database dumps, etc.
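As a starting point for finding when the problem is happening, a load history like the one posted earlier can be scanned for the busiest samples. This is only a sketch: it assumes the last three columns of each line are the 1, 5 and 15 minute load averages, and the function name `busy_samples` and the threshold of 3.0 are arbitrary choices for illustration.

```python
import re

# Matches lines like "01:30:07 AM   0   76   3.62   1.81   1.51".
# Assumption: the last three columns are the 1, 5 and 15 minute load averages.
LINE = re.compile(
    r"^(\d{2}:\d{2}:\d{2} [AP]M)\s+\d+\s+\d+\s+([\d.]+)\s+([\d.]+)\s+([\d.]+)$"
)

def busy_samples(text, threshold=3.0):
    """Return (timestamp, 1-minute load) pairs at or above the threshold."""
    hits = []
    for raw in text.splitlines():
        m = LINE.match(raw.strip())
        if m and float(m.group(2)) >= threshold:
            hits.append((m.group(1), float(m.group(2))))
    return hits

# A few lines copied from the load history posted above.
sample = """\
01:20:01 AM         2        75      1.06      1.56      1.44
01:30:07 AM         0        76      3.62      1.81      1.51
03:30:03 AM         1        76      4.12      3.84      3.13
06:10:01 AM         1        71      0.11      0.57      0.92
"""

for stamp, load in busy_samples(sample):
    print(stamp, load)
```

The timestamps it reports can then be compared against the cron schedule and the backup log to see what else was running at those times.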
 
After this server was rebooted on the 7th, things got better. A few days after, things settled down, and now the actual cPanel backup consistently takes only about 10 minutes, plus another 22 minutes to sync to S3. Nothing changed on my end, so whatever you guys did/didn't do...thanks!
 