Introduction

AWS S3 (including S3 Glacier) can provide very cheap object storage for some on-premises backup and archiving use cases. Be careful, though. There are some pitfalls:

  • S3 is an object store, not a file system. You will need to make sure that the tools you use to accomplish backup/archive are S3-savvy.
  • Since S3 is not a file system, some features you might expect are missing or can be costly to replicate.
    • For example, retrieving the MD5 hash or creation date of an S3 object takes a billed API request. Each request costs very little, but the cost adds up across hundreds of thousands or millions of objects.
  • S3 storage can be very cheap, but make sure the tools you use don't run up a large bill for the S3 API operations they issue to check object hashes and collect other object metadata.
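
To get a feel for how per-object metadata requests add up, here is a rough back-of-the-envelope sketch. The function name and the price used are assumptions for illustration only (the price is a placeholder, not a quoted AWS rate); substitute the current request pricing for your region and storage class.

```python
def request_cost_usd(num_objects, price_per_1000_requests, requests_per_object=1):
    """Estimate the cost of issuing metadata requests against every object."""
    total_requests = num_objects * requests_per_object
    return total_requests / 1000 * price_per_1000_requests


# ASSUMPTION: illustrative price per 1,000 metadata requests, in USD.
# Check current S3 pricing before relying on this number.
ASSUMED_PRICE_PER_1000 = 0.0004

for n in (100_000, 1_000_000, 50_000_000):
    per_run = request_cost_usd(n, ASSUMED_PRICE_PER_1000)
    per_year = per_run * 365  # one full metadata scan per day
    print(f"{n:>12,} objects: ${per_run:,.2f} per run, ${per_year:,.2f} per year of daily runs")
```

A tool that issues several requests per object, or that scans more often than once a day, multiplies these figures accordingly, which is why it is worth knowing how your backup tool compares local files against S3.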

General Approaches

There are many ways to get on-premises files into S3 or other AWS services. The right one depends on your use case, your budget, your ability or desire to tinker and monitor costs, and your willingness to deploy additional on-premises resources.

rclone

rclone is a CLI tool in the same vein as rsync, but it is savvy about cloud object stores like S3.
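
As a minimal sketch, the snippet below assembles an rclone sync command from Python. The remote name `s3remote`, the bucket path, the source directory, and the helper `build_rclone_sync` are all hypothetical; the remote is assumed to have been set up beforehand with `rclone config`. The `--size-only` and `--fast-list` flags are included on the assumption that reducing per-object metadata requests matters for your workload; check the rclone S3 documentation for the trade-offs before using them.

```python
import shlex


def build_rclone_sync(source_dir, remote, bucket_path, dry_run=True):
    """Build an rclone sync command line as a list of arguments."""
    cmd = [
        "rclone", "sync",
        source_dir,
        f"{remote}:{bucket_path}",
        "--fast-list",   # list the bucket with fewer API calls, at the cost of more memory
        "--size-only",   # compare by size only, skipping modification-time and hash checks
    ]
    if dry_run:
        cmd.append("--dry-run")  # report what would change without copying anything
    return cmd


# Hypothetical source directory, remote name, and bucket path.
cmd = build_rclone_sync("/data/archive", "s3remote", "example-bucket/archive")
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once the dry run shows the expected changes. Note that `rclone sync` makes the destination match the source, including deleting destination objects that no longer exist at the source.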

True Backup Software Using S3 for Storage

Many backup software solutions can use S3 for backend storage. An example of this for smaller-scale deployments is MSP360 Managed Backup (formerly CloudBerry Backup).

AWS Tools and Services

AWS offers many tools and services that make it easier to move or sync data from on-premises sources to AWS. These services can be a great solution if they do exactly what you need and you have the budget for them. Tools in this category include:

Anti-Patterns

Data Already in AWS

Don't roll your own backup/archive solution if your data is already in AWS. Use built-in AWS services and features:

  • AWS Backup
  • EBS Snapshots
  • RDS Snapshots
  • S3 Replication
  • ... 

Resources

Internal

External



