...

AWS S3 (including S3 Glacier) can provide very cheap object storage for some on-premises backup and archiving use cases, but be careful. There are some pitfalls:

  • S3 is an object store, not a file system. You will need to make sure that the tools you use to accomplish backup/archive are S3-savvy. Since S3 is not a file system, some features you might expect are missing or can be costly to replicate.
  • S3 storage can be very cheap indeed, but you need to be careful that the tools you are using don't end up costing you a lot for S3 API operations used for checking object hashes and collecting other metadata from objects.
    • E.g., it costs money just to get the MD5 hash or creation date for an S3 object. It's not much money, but it can add up when dealing with hundreds of thousands or millions of objects.
  • Remember AWS S3 isn't the only cloud storage available. Azure, Google, and Wasabi are other options.
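To make the per-request cost concern concrete, here is a rough back-of-the-envelope sketch. The function name and its parameters are illustrative, and the price used in the usage example is a placeholder, not an actual AWS price; check the current pricing page for your region and storage class before relying on any number.

```python
def estimate_request_cost(num_objects, requests_per_object, price_per_1000_requests):
    """Estimate the cost of one metadata pass over a set of objects.

    All three inputs are assumptions you supply yourself:
    - num_objects: how many objects the tool touches
    - requests_per_object: API calls the tool makes per object (e.g. 1 metadata lookup)
    - price_per_1000_requests: price from the provider's current price list
    """
    total_requests = num_objects * requests_per_object
    return total_requests / 1000 * price_per_1000_requests


if __name__ == "__main__":
    # Placeholder price for illustration only; it is NOT a quoted AWS rate.
    placeholder_price = 0.0004
    per_run = estimate_request_cost(1_000_000, 1, placeholder_price)
    # A tool that re-checks every object nightly multiplies the per-run cost.
    print(f"per run: {per_run:.2f}, per 30 nightly runs: {per_run * 30:.2f}")
```

The point of the sketch is the multiplication: a cost that is negligible per object becomes noticeable when a tool repeats the check for every object on every run.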

Framing Your Use Case

Here are some questions that may be valuable to answer when thinking about backup and archiving:

  • What do you want to restore from your backup or archive? Specific files? All files as of a specific date? 
  • How fast do you need it? I.e., what is your Recovery Time Objective (RTO)?
  • What is your Recovery Point Objective (RPO)? I.e., how far out-of-date can objects be when restored?
    • E.g., a server RPO may be 24 hours, meaning that it's OK for restored files to be as much as 24 hours old, but no older.
  • Should different versions of backed-up/archived objects be kept? Or do you always want only the latest version?
  • How often do you envision having to restore data?
    • Some service pricing is fairly expensive when you actually need to restore data, especially in short timeframes.
  • What are the basic metrics of the data in scope for your use case?
    • Total cumulative size of target data?
    • Total number of target files/objects?
    • Total number of target files/objects < 128KB?
      • Some services handle smaller objects differently than larger objects.
    • Estimated number of target files/objects deleted before 90 days?
      • Some services require a minimum object lifetime and will charge you for storing the object for the entirety of that period, even if the object is deleted before reaching that age.
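The basic metrics in the list above can be gathered from a local directory tree with a short script. The following is a minimal sketch, assuming the data lives on a file system reachable from Python; the function names `collect_metrics` and `billable_days` are made up for illustration, and the 128KB and 90-day values are simply the thresholds mentioned in the list, which may differ between services.

```python
import os

SMALL_OBJECT_BYTES = 128 * 1024  # the 128KB threshold mentioned in the list above


def collect_metrics(root, small_threshold=SMALL_OBJECT_BYTES):
    """Walk a directory tree and return total size, file count, and small-file count."""
    metrics = {"total_bytes": 0, "total_files": 0, "small_files": 0}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # skip files that vanish or cannot be read
            metrics["total_bytes"] += size
            metrics["total_files"] += 1
            if size < small_threshold:
                metrics["small_files"] += 1
    return metrics


def billable_days(days_stored, minimum_days=90):
    """Days charged for an object under a minimum-lifetime rule (assumed 90 days here)."""
    return max(days_stored, minimum_days)


if __name__ == "__main__":
    print(collect_metrics("."))
    # An object deleted after 30 days is still billed for the full minimum period.
    print(billable_days(30))
```

Estimating how many objects are deleted early is harder to automate, since it depends on how your data changes over time; `billable_days` only shows how the charge is affected once you have that estimate.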

General Approaches

There are many pathways for getting on-premises files to S3 or other AWS services. Picking the right one will depend on your use case, your budget, your ability or desire to tinker and monitor costs, and your willingness to deploy additional on-premises resources.

...

External

...