Or maybe this is better. Never mind, I found it.
The built-in cPanel backup feature is very handy and lets you select which accounts to back up in WHM. There is a furious (I guess) debate on the cPanel forums about whether cPanel should be imposing its retention policy on Amazon S3 by deleting your old files there, which apparently it does. Since Amazon itself lets you specify complex retention policies, it seems greedy of cPanel to impose its rules, not to mention inflexible. Most people are using Amazon's lifecycle rules to move S3 objects to Glacier.
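For anyone who wants to try the Amazon-side rules, here is a rough sketch of what such a lifecycle rule looks like, built in Python and dumped as JSON. The prefix `cpanel-backups/`, the rule ID, and the 30/365 day counts are made-up values for illustration, not anything cPanel or KnownHost prescribes.

```python
import json


def glacier_lifecycle(prefix, days_to_glacier, days_to_expire=None):
    """Build an S3 lifecycle configuration that moves objects to Glacier.

    prefix          -- key prefix the rule applies to (hypothetical value)
    days_to_glacier -- object age in days before the move to Glacier
    days_to_expire  -- optional object age in days before deletion
    """
    rule = {
        "ID": "backups-to-glacier",  # arbitrary label chosen for this sketch
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        "Transitions": [
            {"Days": days_to_glacier, "StorageClass": "GLACIER"}
        ],
    }
    if days_to_expire is not None:
        # Deleting an object before it has transitioned makes no sense.
        if days_to_expire <= days_to_glacier:
            raise ValueError("expiration must come after the transition")
        rule["Expiration"] = {"Days": days_to_expire}
    return {"Rules": [rule]}


if __name__ == "__main__":
    # Assumed policy: move to Glacier after 30 days, delete after a year.
    print(json.dumps(glacier_lifecycle("cpanel-backups/", 30, 365), indent=2))
```

If you save the output as lifecycle.json, it should be in the shape that `aws s3api put-bucket-lifecycle-configuration --bucket YOUR-BUCKET --lifecycle-configuration file://lifecycle.json` expects, but check it against the current AWS docs before relying on it.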
My inclination is to push copies of my backups straight to Glacier. That's a rabbit hole, but I think I finally came up with an easy way to do it. Here is what I looked at:
* Not: the AWS command line utility. AWS CLI is easy to install on your friendly KnownHost CentOS 7 server. But it is useless for Glacier, which takes large uploads in small parts with recursive (tree) hashes you would have to calculate using another utility. By the way, if you do need AWS CLI for something and don't want to install it, the easiest thing may be to spin up an EC2 instance at Amazon, which by default will be Linux with AWS CLI already installed, and storage attached. You'll only need it for a day, so it won't cost much to get a hefty one. I don't like EC2 for a web server because it's unmanaged, but it seems perfect for a restore job.
* Not: the Python-based amazon-glacier-cmd-interface. I just had too much trouble getting the Python prereqs going, especially when I was on CentOS 6 and it wanted Python 2.7 and there were big warnings about breaking yum. It came down to an SSL problem, and that and other issues have been sitting open on the repo for a while.
* Yes: the Perl-based mtglacier. Amazon is generally Python-happy, but for me Perl is the easiest language. It installed easily using CPAN, but it's also on GitHub, with good instructions. You don't need to know Perl to use it. Note that any Glacier utility will have its own way of naming and indexing archives. That means that if you switch utilities, or retrieve an archive raw with the AWS CLI, you will not be able to get the original filenames of the objects. I did not install this on KnownHost, but it seems like it should be as easy here as on the small CentOS 6 server where I did install it. Perl is built in, after all, as is CPAN. I did run a lot of CPAN updates though. In Python, stuff was always failing. Nothing failed in Perl, despite all the verbose messages. And it seems the developer is monitoring the project, though it doesn't get many merges - maybe it doesn't need them.
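In case anyone is curious what the "recursive hashes" are, here is a minimal Python sketch of the SHA-256 tree hash as I understand it from Amazon's Glacier documentation: hash each 1 MiB chunk, then hash pairs of hashes together until one is left. This is only an illustration (the function name `tree_hash` is mine), not what mtglacier or any other tool actually runs, and it reads the whole input into memory, so it is no good for real backup files as written.

```python
import hashlib

MIB = 1024 * 1024  # Glacier tree hashes are built over 1 MiB chunks


def tree_hash(data: bytes) -> str:
    """Return the hex SHA-256 tree hash of data (illustrative sketch)."""
    # Leaf level: one SHA-256 digest per 1 MiB chunk.
    # Empty input is treated as a single empty chunk.
    chunks = [data[i:i + MIB] for i in range(0, len(data), MIB)] or [b""]
    level = [hashlib.sha256(chunk).digest() for chunk in chunks]

    # Combine neighbouring digests pairwise until a single root remains.
    while len(level) > 1:
        next_level = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                pair = level[i] + level[i + 1]
                next_level.append(hashlib.sha256(pair).digest())
            else:
                # An odd digest at the end is carried up unchanged.
                next_level.append(level[i])
        level = next_level

    return level[0].hex()


if __name__ == "__main__":
    # Anything up to 1 MiB hashes the same as plain SHA-256.
    print(tree_hash(b"hello glacier"))
```

The point of the tree structure is that each uploaded part can be checked on its own and the per-part hashes can still be combined into one checksum for the whole archive, which is the bookkeeping a tool like mtglacier does for you.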
Next up: I'm going to look at Google Nearline storage. It's about halfway in cost between S3 and Glacier (I think) and I wonder if it's easier to work with.
One caveat: KnownHost has its own fairly generous backup policy, which is something to consider if you are brewing up your own transfers to external backup like S3 or a dedicated backup server somewhere. Keep your cPanel-generated backups in the default folder /backup or something like it, because:
- We back up each VPS on our network every other day.
- Our backups do not count against your disk quota.
- The following folders are not included in our backups:
- We generally have backups going back at least 7-10 days