Virtualization Platform Upgrade

jkstark · Sep 1, 2015

OK - so the migrations are moving along, and I had to deal with that last night.

MAJOR FAIL on the part of KnownHost though - you *NEED* to inform your customers in advance of doing a shutdown for the migration, and set up a schedule for those customers! I was logged on to my machine, to be suddenly and ungraciously booted off with a message from the system stating that the machine was being SHUT DOWN *NOW*.

Opening a ticket on this ended up getting me a response along the lines of "we don't know why it went to a shutdown, give us some time to figure it out." In other words, communication between KH and the customer failed, as did communication within the support team.

Unfortunately, this was no 5-minute reboot either - it took close to an hour to get back on. Having been in the middle of some heavy lifting on databases, I was concerned about data loss as well. At least I was not doing anything to the OS that might have resulted in partial dependency updates and rendered the system unbootable...

However, now we have had to deal with customers who are curious about why everything was down, and I'm having to deal with a serious lack of sleep from having been up till 2 AM, as I had to finish what had been started prior to the shutdown - and then do further checks to see what exactly had caused the shutdown, since I did not get an answer to that till this morning after I again asked for an update to the reason. The ticket had been left open, but no final followup had been posted until I asked about it again...

Communication please!

(I should add that the concept of a console is great, but having it run in Java is a failing... Most browsers will stop supporting Java very shortly, making the idea of having a console available through the customer side somewhat pointless..)

calpurnio · Sep 1, 2015

I have to agree with you, jkstark, that it's not really professional to do any migration "with a forced shutdown" without a clear schedule for the customer.
This migration just took place as a surprise for all of us, and it could happen... at any (unannounced) time.

You can even give a long time frame (e.g. from 2 am to 4 am), but a customer has the right to know the exact day and time frame in which this will happen.

I've been with many hosts over the past 10 years, even the worst ones with a lot of downtime,
but for "planned" shutdowns the customer has always been made aware and could prepare, as far as I remember.

KH-Jonathan · Sep 7, 2015

Migrations are 80% complete.

Dek · Sep 17, 2015

Since my servers were migrated to OpenVZ some client disk quotas have become randomly set to unlimited in WHM/cPanel. Some accounts are 'Unlimited' instead of showing exact fixed quotas. Running /scripts/fixquotas multiple times does correct *most* client accounts, but it never corrects them all and many clients will have an unlimited disk quota. Which is not what I want.

Running /scripts/fixquotas a second time to correct the ones not updated on the first execution will allocate unlimited to other accounts that were previously corrected. Basically, running /scripts/fixquotas seems to allocate unlimited quotas to accounts at random.

KH-Jonathan · Sep 21, 2015

Migrations are complete.

curdude · Sep 21, 2015

Nicely done. The system seems to be running OK. I wish we had more options for viewing the system stats and info from OpenVZ.

adev · Oct 11, 2015

Has there been, or can we get, a response from KH staff on these two posts from jkstark and Dek? I am also getting the unlimited quota issue, and I would also like assurance that any planned forced shutdowns/reboots of the hardware will be scheduled and all customers informed. I feel this is very important, and it's very important for you as a company to respond to the points raised to reassure customers (both current and potential/future).


Dek · Oct 11, 2015

"I am also getting the unlimited quota issue"

It is a known issue that has to be fixed at node level:
http://wiki.openvz.org/Cpanel_quotas

The above was referenced from this old cPanel forum thread:
https://forums.cpanel.net/threads/whm-disk-quotas-issue-cpanels-appears-unlimited.437621/

My ticket regarding the unlimited quotas issue was escalated, and KH are saying:
====
"I am aware that this has recently come up during the migration but I've also reviewed over our support tickets and it does not appear that we have seen a large uptick in issues like this, so while it may still be possible and I've not completely ruled it out. I don't think it's a direct cause.

There are many other factors that also could be at play, including the fact that the servers were so old and cPanel has made many tweaks and changes over the years that something may not quite be in sync causing a process to fail which then cascades down to the quota issue. We'll know more once we can nail down a time based on the timestamps for the quota files in /"
====

I gave KH two dates and rough times when it happened but they haven't been able to drill down the exact cause. KH say they're monitoring but they haven't replied with anything since September 18th.

I started investigating myself and found an easy solution which seems to have worked:
In WHM, just save each package you have without making changes. This will update all client accounts with this package to the correct quotas. Since then, the quotas are correct for all client accounts.
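For anyone who wants to check whether accounts have flipped back to unlimited without clicking through WHM, here is a rough sketch, not something from KH or cPanel. It assumes that `whmapi1 listaccts` and `whmapi1 listpkgs` return JSON with the fields `user`, `plan`, `disklimit` (accounts) and `name`, `QUOTA` (packages); please verify those field names against your own server's output before relying on it. The helper names `parse_limit`, `find_mismatches` and `whmapi1` are made up for this example.

```python
import json
import subprocess


def parse_limit(value):
    """Return a quota in MB, or None when the value means 'unlimited'.

    Assumes values look like "250M", "250" or "unlimited".
    """
    text = str(value).strip().lower()
    if text in ("", "unlimited"):
        return None
    return int(text.rstrip("m"))


def find_mismatches(accounts, packages):
    """List (user, plan, expected_mb) for accounts that show an unlimited
    quota although their package defines a fixed one."""
    expected = {pkg["name"]: parse_limit(pkg["QUOTA"]) for pkg in packages}
    mismatches = []
    for acct in accounts:
        want = expected.get(acct["plan"])
        if want is not None and parse_limit(acct["disklimit"]) is None:
            mismatches.append((acct["user"], acct["plan"], want))
    return mismatches


def whmapi1(function):
    """Call the WHM API 1 command-line client and return its 'data' section.

    Assumption: the client is on PATH and accepts --output=json.
    """
    result = subprocess.run(
        ["whmapi1", function, "--output=json"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)["data"]


if __name__ == "__main__":
    # Assumed response layout: data.acct and data.pkg are lists of dicts.
    accounts = whmapi1("listaccts")["acct"]
    packages = whmapi1("listpkgs")["pkg"]
    for user, plan, want in find_mismatches(accounts, packages):
        print(f"{user}: package {plan} expects {want} MB but shows unlimited")
```

It only reports mismatches and changes nothing, so it should be safe to run as root before and after re-saving the packages to see whether the fix held.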

HTH

adev · Oct 12, 2015

Thanks Dek.

KH-DanielP · Oct 12, 2015

@adev

In the few instances of the quota issue we've seen, it simply appears that some of the cPanel settings did not save correctly. As Dek mentioned, re-saving the settings and quotas resolves the issue.

In regards to the notices, we did include a note that each VPS would experience a reboot during the migration process. On rare occasions, depending on the amount of data written and other factors, a migration may take a container offline longer than normal while the data syncs. These migrations are now of course complete, but we do apologize for any inconvenience experienced by this particular user. Swapping the entire backend architecture is not something undertaken very often or something that would be commonly done.

adev · Oct 15, 2015

Thanks for your response, Daniel. However, the point others made, and I agree, is that the reboot should have been in a scheduled window for each server, so your customers would have known not to be SSH'd in or whatever at that time performing vital heavy lifting etc. Your transition took a few months if I'm not mistaken, so that's a lot of guesswork without it being scheduled. In the past with bluehost and hostgator (both shared hosting, but that's irrelevant) they've always given plenty of notice and a scheduled window for such planned maintenance/upgrades, and I'd say that's pretty essential. Let's not have an argument about it - it happened as it happened, and I'm sure I'm not alone in saying we appreciate all the hard work you guys put in, but in future it would be better to do it more transparently. It's not only your customers' data that's at stake, but also your reputation, which so far seems to be pretty darn good. Let's keep it that way!
