frustrated

Status
Not open for further replies.
A couple of days ago I raised a support ticket and specifically mentioned that the cPanel Server Load was high and that sites were timing out.

The response was a request to run a traceroute, plus a note that cron jobs had been checked and that the server's scheduled jobs may be having an impact.

Nothing on my sites has changed, and again today (right now) I'm finding the same issue of high server load and sites not loading. I've replied to the last response I got. Please don't ask me to do a traceroute again (which I've done anyway, even though it has nothing to do with the high server load).

End of vent.
 
I've had the exact same problem on two different occasions and support could never find a reason. It almost sent me looking for another host...

While your cause may be different than mine, I'd appreciate your posting back if you figure it out.

BTW - I find this very useful:

http://www.labradordata.ca/home/13
 
That definitely is frustrating, Aussie_Boy. This looks like something that should be emailed to Paul (paul@) for investigation.
 
Aussie_Boy, LeMarque could you please either PM or email me (paul at knownhost dot com) your ticket numbers?
 
Thanks Bryan


Paul from KH has been excellent and it looks like one issue has been resolved (CPU load), but the timeouts are still a problem. I thought it might have just been an issue on my PC, but members of a forum I host have independently reported the same thing (having to click links twice before the web page shows in the browser or, if they wait, getting a browser timeout error message).
 
Hello Aussie_Boy,

I can just feel your frustration :( sorry man...

Do the problems last a period of time? Maybe long enough to log into your server via SSH and run top to see if there's something you can see there?

Or maybe you could install some monitoring software? I have not personally used any but there has been mention of a couple pieces here on the forum.

Or do you think it's a problem with someone else on the same server as you?
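
If the problem comes and goes, a small script can record the load while you are away from the SSH session. This is only a rough sketch of the "watch it with top" idea above: the function name, the sample count, the interval and the threshold of 4.0 are all made-up values for illustration, and it only uses Python's standard os.getloadavg() call (Unix only).

```python
import os
import time


def sample_load(samples=3, interval=0.5, threshold=4.0):
    """Collect (timestamp, 1-minute load, over_threshold) tuples.

    threshold is an arbitrary illustrative value, not a recommendation.
    """
    readings = []
    for i in range(samples):
        one_min, _five_min, _fifteen_min = os.getloadavg()
        readings.append((time.time(), one_min, one_min > threshold))
        if i < samples - 1:
            time.sleep(interval)  # wait before taking the next sample
    return readings


if __name__ == "__main__":
    for ts, load, high in sample_load():
        stamp = time.strftime("%H:%M:%S", time.localtime(ts))
        print(f"{stamp} load={load:.2f} {'HIGH' if high else 'ok'}")
```

Run from cron or a screen session, output like this gives support a timestamped record to compare against their own logs.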
 


Thanks Dan, server is fine. It's network related. KH say it's a service provider, service provider say it's the peering network and to contact my webhost. I'm the meat in the sandwich.
 
I won't even pretend to be a networking expert, but can someone tell me: when a provider states that it uses a "bandwidth solution that consists of 8 transit providers including Qwest, Sprint, Level 3, Global Crossing, Savvis, PCCW, Tiscali and Teleglobe", what exactly does that mean? I guess it doesn't cover international routes back to user IPs?
 
Just to confirm, you moved over to the new CA data center?


No. I asked if that would help, and a traceroute provided by KH from CA showed pretty much the same path as from SJ, so the same issues would apply. That surprised me, and it is why I asked what the network provider info actually means (maybe it means the network from the DC to the big data exchange on the coast? I don't know...).

No-one else seems to have experienced the same issues and posted about it at WHT or elsewhere. Even other people from my region haven't posted on KH forums about it.
 
To clarify the issues referred to in this thread:

LeMarque provided two ticket numbers; unfortunately both tickets are quite old, one from mid-March and the other from mid-April. Given how much time has passed since the tickets were created, it is hard to provide any specific assistance, as most of the logs have already been rotated. Based on the "ps" output provided in both tickets, the load average (LA) numbers pretty much corresponded to the number of Apache processes in the "R" (running) state. According to LeMarque there were no really active sites on the VPS at that time, yet something still put Apache processes into the "R" state. What exactly is impossible to find out after this much time.
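
For anyone who wants to capture the same comparison while the problem is happening, here is a minimal sketch that counts processes in the "R" state from saved "ps" output. It assumes the output came from something like `ps -eo stat,comm` and that Apache shows up under the process name httpd; both are assumptions, and the sample text below is invented for illustration, not taken from either ticket.

```python
def count_running(ps_output, name="httpd"):
    """Count processes called `name` whose STAT column starts with 'R'."""
    count = 0
    for line in ps_output.strip().splitlines()[1:]:  # skip the header row
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # ignore blank or malformed lines
        stat, comm = parts
        if stat.startswith("R") and comm.strip() == name:
            count += 1
    return count


# Invented sample in the layout of `ps -eo stat,comm`.
SAMPLE = """\
STAT COMMAND
Ss   init
R    httpd
S    httpd
R+   httpd
R    mysqld
"""

print(count_running(SAMPLE))  # 2
```

Logging this count next to the load average at the same moment would show whether the two really move together.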

Aussie_Boy's ticket is a totally different story, and a good many hours were spent investigating and monitoring the problem. In reality two problems happened at about the same time:
1. Slow sites, high load, high CPU usage. After extensive monitoring this was tracked down to MySQL database queries: one was extremely resource intensive (read: CPU and disk I/O) and at times took 200+ seconds to execute; the other was lighter but still took its toll. Before I started looking at Aussie_Boy's ticket, support had already attributed the high load to the very same account and database operations. It is my understanding that Aussie_Boy disabled the resource-intensive query and trimmed a huge table in the database to reduce the time required for other queries to run;

2. Network connectivity problems: occasional timeouts were reported. Based on the IPs of the users who were having problems, a possible source well outside any of our networks was found: packet loss somewhere between the SingTel and OPTUS INTERNET networks on the other side of the ocean. A ping to SingTel's hop shows no packet loss, while a ping to the very next hop, which belongs to Optus's network, shows up to 30% packet loss. Sorry, but there isn't much we can do about this.
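
The hop-by-hop comparison in point 2 can be sketched as follows. This is an illustration only: the hop labels are placeholders, the summary lines are invented in the style of Linux ping output to mirror the 0% and 30% figures above, and the 5% threshold is an arbitrary assumption.

```python
import re


def parse_loss(ping_summary):
    """Extract the packet-loss percentage from a ping summary line."""
    match = re.search(r"(\d+(?:\.\d+)?)% packet loss", ping_summary)
    if match is None:
        raise ValueError("no packet-loss figure found")
    return float(match.group(1))


def first_lossy_hop(hops, threshold=5.0):
    """Return the first (name, loss) whose loss exceeds threshold, else None."""
    for name, summary in hops:
        loss = parse_loss(summary)
        if loss > threshold:
            return name, loss
    return None


# Placeholder hop labels and invented summaries, listed in traceroute order.
hops = [
    ("singtel-hop", "10 packets transmitted, 10 received, 0% packet loss, time 9012ms"),
    ("optus-hop", "10 packets transmitted, 7 received, 30% packet loss, time 9030ms"),
]

print(first_lossy_hop(hops))  # ('optus-hop', 30.0)
```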
Also, taking into account Aussie_Boy's constant complaints about network stability in San Jose and the SLA request tickets, I still wonder why the customer continues to ignore the option of moving to a more stable network. Complaining while refusing a change for the better is a bit beyond my understanding.

Aussie_Boy, it would also be much appreciated if at least some feedback were posted as a follow-up to the public complaint, after all the work that was done to address your initial concern in this thread. Judging by the follow-ups from other customers, everyone apparently remains under the impression that we just deliver poor service and don't care about our customers or the service we deliver.
 

Paul


You will see that earlier I posted a compliment regarding your support. I also believe I followed up and said one issue remained.

In regard to moving to a more stable network, I asked you about CA and the traceroute you provided indicated the same issue. If that is not the case, then your email certainly did not say that CA was fine. Please don't misrepresent me either - given the effect this has, it takes planning and if CA has the same issues, why move?

Edit - these are not "occasional" timeouts. Having to refresh or press the link twice every time to get a site/web page to load is not occasional.
 
I won't even pretend to be a networking expert, but can someone tell me: when a provider states that it uses a "bandwidth solution that consists of 8 transit providers including Qwest, Sprint, Level 3, Global Crossing, Savvis, PCCW, Tiscali and Teleglobe", what exactly does that mean? I guess it doesn't cover international routes back to user IPs?

This means that network connectivity is provided by 8 independent transit providers. Each packet is routed through the network that provides the shortest BGP path to the destination IP/network. Also, if there are problems with one or more transit providers, the routing automatically adjusts to select a new closest route to the destination.
All these network providers offer premium grade transit and, as such, stable and fast network connectivity.
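
As a toy illustration of the "shortest path, with automatic failover" idea described above: the sketch below picks the provider whose AS path to the destination is shortest and skips any provider marked as down. It is heavily simplified, since real BGP weighs other attributes as well as path length. The provider names come from the list quoted above, but the AS paths are invented, using AS numbers reserved for documentation.

```python
def best_route(routes, down=frozenset()):
    """Pick the provider with the shortest AS path, skipping providers that are down."""
    candidates = {p: path for p, path in routes.items() if p not in down}
    if not candidates:
        return None  # no transit provider can reach the destination
    return min(candidates, key=lambda p: len(candidates[p]))


# Invented AS paths (documentation-only AS numbers) to one destination network.
routes = {
    "Qwest": [64496, 64497, 64500],
    "Sprint": [64498, 64500],
    "Level 3": [64499, 64501, 64502, 64500],
}

print(best_route(routes))                   # Sprint
print(best_route(routes, down={"Sprint"}))  # Qwest
```

Note that this only models the outbound choice made at the data center. The return path from a user's ISP is chosen by that ISP's own routing, which is why the same problem hop can show up from two different data centers.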
 
Also, I appreciate that you did the ping test on one of the customer IPs. However, as I said, I am the meat in the sandwich: on that one test you said it was an Optus issue, but the email from the Optus engineers to another user affected by this network issue said:

"Our testings point to a problem either within Cogent's network or on a peering link
between Cogent and Singtel in LA.
--
I'd suggest that the owner of the domain approach his hosting provider and have them
escalate to Cogent. We can't escalate to Cogent as we have no peering with
them."
 
So in the case of moving to CA, will this or will this not fix the network issues? Your traceroute showed the same SingTel/Optus path.
 