Jul 11, 2016

Amazon S3 Easy Scripted Backup from Windows for the Enterprise

I don’t normally post code, nor do I normally implement scripts myself.  But sometimes, to learn the ins and outs of a capability, you have to dive in and try it out.  Since I’ve been working with Amazon Web Services (AWS) via Windows, I’ve found a remarkable lack of sample scripts for using it, so I’m posting my little project here.

For me, with heavy Unix scripting experience in my distant background, using PowerShell and the AWS PowerShell add-in was a no-brainer.  While the syntax of PowerShell is significantly different from the Unix Bourne shell, the core capabilities are much the same, including piping.

Now for the requirements:

- I needed to back up to the cloud for an offsite backup.
- The data needed to be encrypted with a client-managed key, but I had neither the tools nor the onsite CPU or extra storage for client-side encryption.
- Changes to, and access of, the backed-up data needed to be tracked.
- To show any modifications, the backed-up files needed to manage versions – so no changed version would overwrite a previous version.

Here’s what I did…

1. Set up a specific IAM user with permissions only for S3 by:

- Created an AWS group titled “backup_group”.
- Attached the policy “AmazonS3FullAccess” and no others.
- Created users “backup_user1” & “backup_user2”.
- Stored these users’ REST access keys in a secure encrypted local location.
- Added both users to “backup_group”.
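
The same setup can also be scripted rather than clicked through in the console. What follows is a minimal sketch, not what I actually ran: it assumes the AWSPowerShell module is loaded and that administrative credentials are already configured for the session. The function name New-BackupIamUsers is made up for illustration; the IAM cmdlets it calls come from the AWS Tools for PowerShell.

```powershell
# Sketch only: assumes the AWSPowerShell module is loaded and that
# administrative credentials are already configured for this session.
# New-BackupIamUsers is a hypothetical helper name, not an AWS cmdlet.
function New-BackupIamUsers {
    param(
        [string]   $GroupName = "backup_group",
        [string[]] $UserNames = @("backup_user1", "backup_user2")
    )

    # AWS-managed policy that grants access to S3 and nothing else.
    $policyArn = "arn:aws:iam::aws:policy/AmazonS3FullAccess"

    # Create the group and attach the single policy to it.
    New-IAMGroup -GroupName $GroupName | Out-Null
    Register-IAMGroupPolicy -GroupName $GroupName -PolicyArn $policyArn

    foreach ($user in $UserNames) {
        New-IAMUser -UserName $user | Out-Null
        Add-IAMUserToGroup -GroupName $GroupName -UserName $user

        # Emits the access key object (AccessKeyId and SecretAccessKey).
        # The secret is only available at creation time, so store it securely.
        New-IAMAccessKey -UserName $user
    }
}
```

Calling New-BackupIamUsers with no arguments uses the group and user names from the steps above and returns one access key object per user.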

2. Create specific S3 buckets for these backups, accessible only by the backup users (and an administrative user), with logging.

- Created S3 bucket “backup-logs” with a lifecycle rule that deletes all content 2 years old or older – meaning logs have a 2-year life. (If you don’t give log directories a lifecycle rule, they’ll accumulate forever with ever-increasing storage costs.  Since this “disk” never “fills”, you won’t get any kind of log error that would force rotation, just an ever-growing cost.)

- Created S3 bucket “backup-for-bi”, enabled logging to “backup-logs” in subdirectory logs-bi/
- Created S3 bucket “backup-for-DB”, enabled logging to “backup-logs” in subdirectory logs-db/

3. Enable versioning to preserve each copy and prevent hidden changes – enabled on both buckets.
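
For anyone who prefers to script steps 2 and 3, here is a rough sketch of the bucket creation, logging and versioning calls. It is an illustration under assumptions, not what I ran: Initialize-BackupBucket is a made-up helper name, the log bucket is assumed to exist already and to allow S3 log delivery, and the parameter names should be checked against the version of the AWSPowerShell module you have installed.

```powershell
# Sketch only: Initialize-BackupBucket is a hypothetical helper name.
# Assumes the AWSPowerShell module is loaded, credentials are configured,
# and the log bucket already exists and permits S3 log delivery.
function Initialize-BackupBucket {
    param(
        [Parameter(Mandatory)] [string] $BucketName,
        [Parameter(Mandatory)] [string] $LogBucketName,
        [Parameter(Mandatory)] [string] $LogPrefix
    )

    # Create the backup bucket.
    New-S3Bucket -BucketName $BucketName | Out-Null

    # Send access logs for this bucket to a prefix inside the log bucket.
    Write-S3BucketLogging -BucketName $BucketName `
        -LoggingConfig_TargetBucketName $LogBucketName `
        -LoggingConfig_TargetPrefix $LogPrefix

    # Keep every version of every object so changes never overwrite history.
    Write-S3BucketVersioning -BucketName $BucketName -VersioningConfig_Status Enabled
}
```

With the names used above, the call for the first bucket would look like: Initialize-BackupBucket -BucketName "backup-for-bi" -LogBucketName "backup-logs" -LogPrefix "logs-bi/"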

4. Use an upload command setting that requires encryption of the uploaded data with a client-managed key, which prevents any unencrypted download of the content, even by Amazon.  The key will be stored locally.

- Now this was particularly tricky and confusing, encryption keys not being my specialty.  AWS’s data encryption is AES-256, but the AES-256 key I generated was rejected.  The command spec says the key should be Base64 encoded, which I did, but it was still rejected.  In the end I was able to generate an AES-128-CBC encryption key from a passcode and then Base64 encode that key, which produced a 44-character string (ending with =) that AWS accepted.  (A 44-character Base64 string decodes to 32 bytes, i.e. 256 bits, which is the key length AES-256 calls for.)  In essence, that Base64 string is, as far as we are concerned, the key – though I’m storing that key, the original password, and the 128-bit key and salt.
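
If you don’t need the key to be derived from a passcode, a simpler route is to generate 32 random bytes and Base64 encode them. This is a sketch of that alternative, not the method I used above; New-SseCustomerKey is a made-up helper name, and it relies only on standard .NET classes.

```powershell
# Sketch only: New-SseCustomerKey is a hypothetical helper name.
# Generates a random 256-bit key and returns it Base64 encoded.
function New-SseCustomerKey {
    # 32 bytes = 256 bits of key material.
    $bytes = New-Object byte[] 32

    # Fill the buffer from a cryptographic random number generator.
    $rng = [System.Security.Cryptography.RandomNumberGenerator]::Create()
    try {
        $rng.GetBytes($bytes)
    }
    finally {
        $rng.Dispose()
    }

    # 32 bytes always encode to 44 Base64 characters ending in "=".
    [System.Convert]::ToBase64String($bytes)
}

$AES256_key = New-SseCustomerKey
```

Whichever way the key is produced, keep a copy somewhere safe: with a customer-provided key, losing the key means losing access to the backups.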

With all that prep ready, here’s the PowerShell to upload a list of directories, the list being embedded in the script.  $accesskey is the AWS IAM user access key shown by AWS on creation of the user.  $secretkey is the matching secret key, also shown by AWS on creation of the user.

    # Upload Listed Directories to Amazon AWS S3
    #
    # Requires:
    # 1) AWS Tools for PowerShell from http://console.aws.amazon.com/powershell/
    #
    # Usage:
    # powershell.exe .\AWS_Backup_Dirs.ps1

    $bucket          = "nameofbackupbucket"
    $backup_list     = "E:\Prod", "E:\PreProd"
    $AES256_key      = "AAAABBBBCCCCDDDDEEEE99991111222233334444777="
    $accesskey       = "ASDKLJASDFJKLASDF"
    $secretkey       = "YOUR-IAM-SECRET-KEY"   # placeholder: the secret key AWS showed when the user was created

    # Load the AWS cmdlets; stop the script if the module is missing.
    try {
        Import-Module "C:\Program Files (x86)\AWS Tools\PowerShell\AWSPowerShell\AWSPowerShell.psd1" -ErrorAction Stop
    }
    catch [System.Exception] {
        $error_fail = "Error: AWS Powershell Extensions not installed or missing from expected location... " + $_.Exception.Message
        Write-Host $error_fail
        throw $error_fail
    }

    # Upload each directory under a key prefix named after its last path component.
    foreach ($backme in $backup_list) {
        $bucket_subdir = Split-Path -Path $backme -Leaf
        try {
            Write-S3Object -BucketName $bucket -Folder $backme -Recurse -KeyPrefix $bucket_subdir -AccessKey $accesskey -SecretKey $secretkey -ServerSideEncryptionCustomerProvidedKey $AES256_key -ServerSideEncryptionCustomerMethod AES256
        }
        catch [System.Exception] {
            # Report the failure and carry on with the next directory.
            Write-Host "Error: " $_.Exception.Message
        }
    }


The -KeyPrefix parameter specifies that the data will be written into a subdirectory named after the last component of the directory path.  So if the path is D:\dog\cat, it will be stored on S3 in the subdir “cat”.

This script can be set up as a scheduled task and run daily or weekly.  BUT, as written it will send up the whole directory every time, which incurs the cost of the full data transfer even if nothing has changed.  If you want incremental backups, you have to adjust the script to find only the newer files and loop to send them up one at a time (rather than the whole directory, as in this script).
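
Here is a rough sketch of what that incremental variant could look like. I haven’t run it against production data, so treat it as a starting point: Get-FilesChangedSince and Send-ChangedFiles are made-up helper names, the cutoff time is something you would have to record yourself between runs, and it assumes PowerShell 3 or later (for Get-ChildItem -File) with the AWSPowerShell module loaded.

```powershell
# Sketch only: both function names are hypothetical helpers.

# Return the files under $Folder that were modified after $Since.
function Get-FilesChangedSince {
    param(
        [Parameter(Mandatory)] [string]   $Folder,
        [Parameter(Mandatory)] [datetime] $Since
    )
    Get-ChildItem -Path $Folder -Recurse -File |
        Where-Object { $_.LastWriteTime -gt $Since }
}

# Upload only the changed files, one at a time, keeping the same
# "<last directory name>/<relative path>" layout as the full upload.
function Send-ChangedFiles {
    param(
        [string]   $Bucket,
        [string]   $Folder,
        [datetime] $Since,
        [string]   $AccessKey,
        [string]   $SecretKey,
        [string]   $CustomerKey
    )
    $root   = (Resolve-Path -Path $Folder).Path.TrimEnd('\', '/')
    $prefix = Split-Path -Path $root -Leaf

    foreach ($file in (Get-FilesChangedSince -Folder $root -Since $Since)) {
        # Path relative to the backup root, with forward slashes for S3.
        $relative = $file.FullName.Substring($root.Length + 1) -replace '\\', '/'

        Write-S3Object -BucketName $Bucket -File $file.FullName -Key "$prefix/$relative" `
            -AccessKey $AccessKey -SecretKey $SecretKey `
            -ServerSideEncryptionCustomerProvidedKey $CustomerKey `
            -ServerSideEncryptionCustomerMethod AES256
    }
}
```

A scheduled task could then call Send-ChangedFiles once per directory, passing the time of the last successful run as -Since. Because versioning is enabled on the buckets, re-sending a changed file adds a new version rather than replacing the old one.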

Hope this helps!