My site's running like a dog, but my stats are green?

lalaland

New Member
Hi,

My site's been running like a dog all night: timeouts occurring, PHP scripts not being handled properly (instead, the browser occasionally offers to download the file) and just general slowness. I have users and moderators complaining to me, so I checked the Virtuozzo panel and my stats panel in cPanel, but both are showing green.

What else could be wrong and what can I check?

The site is running at an almost unusable state at times tonight, yet the stats all say it's fine.

Help?

[Broken External Image]:http://scuppet.co.uk/forumimages/virt.jpg

[Broken External Image]:http://scuppet.co.uk/forumimages/cpan.jpg
 
You're probably experiencing the problem that we all suffer from: shared resources!

You're noticing lag and slow response times because other people on the server are using the same resources that you are (disk, CPU, and memory) at the same time.

Unfortunately this is the drawback of having a VPS; you're sharing resources with who knows how many people.

Just out of curiosity, which VPS are you on? Doing a traceroute to (or from) your VPS will tell you - it will look something like:
vz(your_vps_number_here)-tx.somedomain.net
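
If you'd rather not read the traceroute output by eye, here is a minimal Python sketch that pulls the node number out of hop hostnames. It assumes the hop name really follows the vz(number)-tx pattern shown above; the function name, the sample output and the IP addresses are made up for illustration.

```python
import re
from typing import List

# Assumed pattern, based only on the example hostname in this thread.
# Real hop names may be formatted differently.
NODE_PATTERN = re.compile(r"\bvz(\d+)-tx\.", re.IGNORECASE)


def find_node_numbers(traceroute_output: str) -> List[int]:
    """Return every node number found in hop hostnames, in order of appearance."""
    return [int(match.group(1)) for match in NODE_PATTERN.finditer(traceroute_output)]


# Hypothetical traceroute output (documentation IP range, invented hops).
sample = """\
 9  core1.somedomain.net (192.0.2.1)  41.2 ms
10  vz16-tx.somedomain.net (192.0.2.16)  41.9 ms
"""

print(find_node_numbers(sample))
```

Paste your own traceroute output in place of the sample string. An empty list just means no hop matched the assumed pattern.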
 
I thought Virtuozzo was supposed to take care of the shared resources problem by guaranteeing resources to each customer?
 
Just out of curiosity, which VPS are you on? Doing a traceroute to (or from) your VPS will tell you - it will look something like:
vz(your_vps_number_here)-tx.somedomain.net
16

ppc said:
I thought Virtuozzo was supposed to take care of the shared resources problem by guaranteeing resources to each customer?
I have to admit I was under this impression too. Is this not so?
 
With Virtuozzo, the only resources that can be guaranteed are:
1. Memory - the amount of memory that is physically available to you and can be allocated at any time depends on your hosting plan.
2. Minimum CPU power - we run all VPSs in a so-called equal share configuration, which means that every VPS running on the physical machine has exactly the same rights to CPU power and, for short periods of time, can burst up to 100% of the total CPU power of the physical machine.
3. Bandwidth availability.

Resources that cannot be guaranteed are:
1. Disk I/O - this is what turns our life into a nightmare.
2. Maximum CPU power.
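
To picture what "equal share with bursting" means, here is a rough Python sketch. It uses plain max-min fair sharing as a stand-in and is not a description of how the Virtuozzo scheduler actually works; the function name, container names and demand figures are all hypothetical.

```python
from typing import Dict


def equal_share(demands: Dict[str, float], capacity: float = 100.0) -> Dict[str, float]:
    """Split capacity equally, passing unused share on to containers that want more.

    demands maps a container name to the CPU percentage it would like to use.
    """
    allocation = {name: 0.0 for name in demands}
    unsatisfied = {name for name, demand in demands.items() if demand > 0}
    remaining = capacity
    while unsatisfied and remaining > 1e-9:
        # Every container that still wants CPU gets an equal slice of what is left.
        share = remaining / len(unsatisfied)
        for name in list(unsatisfied):
            grant = min(share, demands[name] - allocation[name])
            allocation[name] += grant
            remaining -= grant
            if demands[name] - allocation[name] <= 1e-9:
                unsatisfied.discard(name)
    return allocation


# One busy container on an otherwise idle machine can burst to the full CPU.
print(equal_share({"a": 100, "b": 0, "c": 0}))

# Three greedy containers and one light one: the light one gets what it asks
# for, and the rest split the remainder equally.
print(equal_share({"a": 100, "b": 100, "c": 10, "d": 100}))
```

The point of the sketch is the asymmetry in the two lists above: each container is always entitled to its equal slice (the minimum), but anything beyond that depends on what the neighbours happen to be doing (the maximum).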

The guaranteed resources should be pretty clear to everyone, so let me say a couple of words about the two resources that cannot be guaranteed. Disk I/O is the biggest nightmare for us: every time our monitoring detects increased I/O throughput and/or increased I/O wait, we have to spend hours trying to locate the source of the problem. In most cases the processes that create high I/O don't consume much memory or CPU time, so it is quite hard to find the exact process responsible for the high disk I/O load and deal with it. About an hour ago we finished fighting increased disk I/O load on the vz05-tx machine. Customers on this server might have noticed slowdowns three times today. Twice the load disappeared on its own before we could track it down to a specific process, and a bit more than two hours ago we finally found the "bad" process that was slowing disk operations and fixed it. From what we know, the software vendor is working on some kind of disk bandwidth management. Hopefully this will resolve the problem sometime in the future.
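
For anyone curious what "I/O wait" looks like in numbers, here is a minimal Python sketch that computes the iowait percentage from two samples of the aggregate cpu line in /proc/stat. It assumes the Linux 2.6+ field order (user, nice, system, idle, iowait, irq, softirq, steal), and the two sample lines are invented. It is only an illustration and not the monitoring used on the nodes.

```python
from typing import List


def cpu_fields(stat_text: str) -> List[int]:
    """Return the numeric fields of the aggregate 'cpu' line of /proc/stat."""
    for line in stat_text.splitlines():
        parts = line.split()
        if parts and parts[0] == "cpu":
            return [int(value) for value in parts[1:]]
    raise ValueError("no aggregate cpu line found")


def iowait_percent(before: str, after: str) -> float:
    """Percentage of CPU time spent waiting on I/O between two samples."""
    deltas = [new - old for old, new in zip(cpu_fields(before), cpu_fields(after))]
    # Only the first eight fields are summed: guest time is already counted in user.
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total  # index 4 is iowait


# Invented samples, as if taken a few seconds apart.
before = "cpu  1000 0 500 8000 500 0 0 0 0 0"
after = "cpu  1100 0 550 8500 850 0 0 0 0 0"

print(f"{iowait_percent(before, after):.1f}% iowait")
```

On a real Linux machine you would read /proc/stat twice with a short sleep in between and pass both strings in. Note that this only tells you that something is hammering the disk, not which process is doing it, which is exactly the hard part described above.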

Maximum CPU time - quite often we see runaway processes that try to consume all available CPU power and/or websites that require a huge amount of CPU (e.g., huge Gallery sites with thousands of images in a single album). The usual way to handle this is to send an abuse notification to the customer and let the customer either resolve it or ask to migrate the CPU-hungry sites to a dedicated server.

The above are examples of how various situations are handled based on the information we obtain from our physical machines in almost real time. A huge number of things may affect performance for our customers. Most of them are resolved quickly and might not even be noticed by anyone, while some problems take time to investigate before we can take appropriate steps. Today's vz05 slowdowns are a case in point: this node had worked without problems for quite a long time, and today a single "crazy" process took half of our day to find and fix.


Now let's get back to the original post in this thread. lalaland - I looked through the data collected from vz16 yesterday and failed to find anything showing unusual load and/or activity on this machine. I also went through every ticket that was submitted yesterday and found only one regarding slow site access on the vz16 node. Am I right in assuming that this is the same issue as in ticket #6970? If not, I apologize. If yes, then I will second support's opinion that there was possibly a network bandwidth problem somewhere between your location and our network. While the traceroute looks pretty good taking your location into account, the actual throughput might be affected by various factors such as packet loss, limited bandwidth availability and so on. The second thing that leads me to believe the problem was network related is that you had slow access not only to your VPS but also to our own site. Our site is hosted on a different physical server, so both sites can't be affected in the same way by the same "resource sucker".
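
To give a feel for how much packet loss alone can hurt, here is a rough Python sketch using the well-known Mathis approximation for a single TCP connection: throughput is at most (MSS / RTT) * (C / sqrt(p)), with C = sqrt(3/2). The MSS, round-trip time and loss rate below are made-up illustrative numbers, not measurements from this case, and real connections can behave quite differently.

```python
import math


def mathis_throughput_bps(mss_bytes: int, rtt_seconds: float, loss_rate: float) -> float:
    """Approximate upper bound on steady-state TCP throughput, in bits per second."""
    if rtt_seconds <= 0:
        raise ValueError("rtt_seconds must be positive")
    if not 0 < loss_rate <= 1:
        raise ValueError("loss_rate must be in (0, 1]")
    constant = math.sqrt(3.0 / 2.0)
    return (mss_bytes * 8 / rtt_seconds) * constant / math.sqrt(loss_rate)


# Hypothetical long-distance path: 1460-byte MSS, 120 ms round trip.
for loss in (0.0001, 0.001, 0.01):
    mbps = mathis_throughput_bps(1460, 0.120, loss) / 1e6
    print(f"loss {loss:.2%}: about {mbps:.1f} Mbit/s per connection")
```

With these assumed numbers, 1% loss caps a single connection at roughly 1.2 Mbit/s no matter how idle the server is, which is why a clean-looking traceroute does not rule out a network problem.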

Regards,
Paul
 
Sounds good to me, thanks Paul.

Just to confirm, the site's been running fine today and tonight, so it seems to have been a one-off.
 
Thanks, Paul, for the explanations.

What does I/O stand for?

Would it be possible to post in the forum, or have a post somewhere, just saying there is an abusive account problem on such-and-such node? That way we know that KH sees the problem, AND you won't get hordes of tickets asking if there is a problem on one's node. :)
 
ppc,

I/O stands for Input/Output.

We update the maintenance section of this forum if there is downtime or a problem that may lead to downtime. In most cases abusive processes are found and resolved in pretty much no time, and the rate of tickets possibly related to abusive processes is less than 1%. If we posted about every single abusive process we see, we'd have to create new threads every hour or so ;)

Regards,
Paul
 