Add true backup functionality to the existing cluster. Discussion started Oct. 2013.

Spring 2014

Another data point

Brightworks provides backups to area businesses. They use <http://www.keepitsafe.com/> and wrap their own services around it, including:

  • Consulting, to ensure a good fit.
  • Installation and maintenance.
  • Daily monitoring, by a trained staff member, of every report for every business.

Cost for 1 TB would be about $800-900 per month ($9,600-$10,800/year).

Brightworks' cost is about $1/GB per month, whereas EZ-Backup's marginal cost is $0.04/GB for quantities above ~33 GB.
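As a quick check of the figures above, here is a minimal sketch of the arithmetic. It assumes a decimal terabyte (1 TB = 1000 GB); the function names are made up for illustration.

```python
# Assumption: 1 TB is treated as 1000 GB (decimal), which the notes do not specify.
GB_PER_TB = 1000


def annual_cost(monthly_cost):
    """Convert a monthly charge into a yearly charge."""
    return monthly_cost * 12


def cost_per_gb(monthly_cost, terabytes=1.0):
    """Monthly cost per GB for a quote covering `terabytes` of data."""
    return monthly_cost / (terabytes * GB_PER_TB)


low, high = 800, 900  # quoted monthly range for 1 TB
print(annual_cost(low), annual_cost(high))   # 9600 10800
print(cost_per_gb(low), cost_per_gb(high))   # 0.8 0.9
```

So the "$1/GB" figure is a rounded-up monthly rate of $0.80-0.90/GB.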

Challenges to doing backups: The problem to solve

The amount of data and the number of files can put heavy demand on the head node, and the backup may take too long:

  • Will the backup put too much demand on the computer, competing with the computer's primary research purpose?
  • Will the backups take too long?

System uses ~800 cores, with 30 programs (per Yi He).

The number of files

  • Hundreds of millions of small files, along with "normal" files.

The amount of data

  • 3.1TB of data (research and users)
    • On 6TB partition
  • 77GB OS data
    • On 160GB partition
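To put harder numbers on the file count and data volume above, one could run a small census over a sample directory and extrapolate. The sketch below is one possible way to do that, not a tool already in use here. The 4096-byte "small file" threshold and the function names are assumptions for illustration.

```python
import os
import time


def census(root, small_bytes=4096):
    """Walk `root`; return (file_count, small_file_count, total_bytes, seconds)."""
    files = small = total = 0
    start = time.monotonic()
    stack = [root]
    while stack:
        path = stack.pop()
        try:
            with os.scandir(path) as entries:
                for entry in entries:
                    if entry.is_dir(follow_symlinks=False):
                        stack.append(entry.path)
                    elif entry.is_file(follow_symlinks=False):
                        size = entry.stat(follow_symlinks=False).st_size
                        files += 1
                        total += size
                        if size < small_bytes:
                            small += 1
        except OSError:
            continue  # skip directories that cannot be read
    return files, small, total, time.monotonic() - start


def estimate_scan_hours(sample_files, sample_seconds, target_files):
    """Extrapolate a sample walk to `target_files`, assuming a constant rate."""
    if sample_seconds <= 0:
        raise ValueError("sample_seconds must be positive")
    rate = sample_files / sample_seconds  # files per second
    return target_files / rate / 3600
```

Running `census()` on one user's directory and feeding the result to `estimate_scan_hours()` gives a rough lower bound on how long any file-by-file backup tool needs just to look at every file. The walk itself loads the file system, so it is best tried on a small subtree first.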

Cost is an issue, since backups were not originally budgeted.

Key take-home

  • Anything that can reduce the number of files from "hundreds of millions" will help tremendously.
  • Reducing the total amount to back up can help keep costs down and speed up restoration efforts.
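One possible way to cut the file count, offered here only as an illustration and not as a decided approach, is to bundle directories of small files into a single compressed archive before they reach the backup software. A minimal sketch using Python's standard `tarfile` module follows; `bundle_directory` is a hypothetical helper name.

```python
import os
import tarfile


def bundle_directory(src_dir, archive_path):
    """Pack every regular file under src_dir into one gzip-compressed tar.

    Returns the number of files added. archive_path should live outside
    src_dir so the archive does not try to include itself.
    """
    count = 0
    with tarfile.open(archive_path, "w:gz") as tar:
        for dirpath, _dirnames, filenames in os.walk(src_dir):
            for name in filenames:
                full = os.path.join(dirpath, name)
                # Only regular files; symlinks and special files are skipped.
                if os.path.isfile(full) and not os.path.islink(full):
                    tar.add(full, arcname=os.path.relpath(full, src_dir))
                    count += 1
    return count
```

The backup tool then sees one file per bundled directory. The trade-off is that changing any one small file changes the whole archive, so this suits finished results better than directories still being written to.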

Objectives

Capability to restore the system and data to all-new hard drives, if necessary.

Deal with various failures, such as:

  • file system corruption#
  • hard disk failures
  • RAID controller card failures
  • motherboard failure
  • other server hardware or sub-system failure
  • fire damage of room*
  • water damage of room*
  • theft*
  • malicious incursion into the system*#

* Requires an off-site copy to be effective.

# May require versioning to get prior copies of since-corrupted data.
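The footnotes above can be read as a small requirements table. The sketch below encodes them so that a candidate setup can be checked against the list. It treats "may require versioning" as a hard requirement, which is a conservative assumption, and the capability labels are made up for illustration.

```python
# Each failure maps to the capabilities its footnote marks call for:
# "*" -> "off-site", "#" -> "versioning" (treated here as required).
FAILURES = {
    "file system corruption": {"versioning"},
    "hard disk failures": set(),
    "RAID controller card failures": set(),
    "motherboard failure": set(),
    "other server hardware or sub-system failure": set(),
    "fire damage of room": {"off-site"},
    "water damage of room": {"off-site"},
    "theft": {"off-site"},
    "malicious incursion into the system": {"off-site", "versioning"},
}


def uncovered(capabilities):
    """Return the failures a backup with these capabilities cannot handle."""
    return [name for name, needs in FAILURES.items()
            if not needs <= set(capabilities)]


print(len(uncovered(set())))             # 5
print(uncovered({"off-site"}))
# ['file system corruption', 'malicious incursion into the system']
```

A plain copy with neither capability leaves five of the nine failures uncovered. Adding an off-site copy leaves two, and both of those need versioning.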

NOTE: If hardware other than hard disks must be replaced, the OS may need to be modified.

Ability to have some versioning, if not too expensive an add-on.

Options

Option 1: In-room, off-box copy

HDs in a dedicated computing box.

Option 2: Out of room

HDs in a dedicated computing box.

Hosted file service.

EZ-Backup.

  • $850-1,650/yr, depending on level of file compression.
  • Software may not have time to process the files at each backup.

Ideas to consider for above options

Software to sync data, ideally with versioning

EZ-Backup

CrashPlan

Compare

How CrashPlan backup works (technical)

Real-time Backup Version Retention

Backup Frequency and Version Retention

Backing Up Very Large File Selections

Software that backs up changes to the partition, not looking at files per-se.

One possibility Oliver found:

CA ARCserve D2D I2 Backups

12 minute technical review of D2D

PDF Brief

Pricing

We'd like to know the educational price (if any), but here are the upper-bound prices:

http://shop.arcserve.com/Products/D2D/CA-ARCSERVE-D2D-FOR-LINUX-Product-plus-1-Year-Maintenance

  • (List) Price: $732.00

http://shop.arcserve.com/Products/D2D/CA-ARCSERVE-D2D-FOR-LINUX-Product-plus-3-Year-Maintenance

  • (List) Price: $976.00

Oliver can't tell whether one pays more for additional clients. The listings for the other operating systems specify a number of seats, so to Oliver the above implies that a single price covers unlimited clients(?) per server.
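To illustrate the general idea noted earlier of backing up changes to the partition rather than looking at individual files, here is a minimal sketch of block-level change detection. It is a generic illustration and does not describe how D2D or any other product works internally. The 4 MiB block size and the function names are assumptions.

```python
import hashlib

BLOCK_SIZE = 4 * 1024 * 1024  # assumed block size (4 MiB)


def block_hashes(path, block_size=BLOCK_SIZE):
    """Return one SHA-256 hex digest per fixed-size block of the file or device."""
    hashes = []
    with open(path, "rb") as f:
        while True:
            block = f.read(block_size)
            if not block:
                break
            hashes.append(hashlib.sha256(block).hexdigest())
    return hashes


def changed_blocks(old_hashes, new_hashes):
    """Indexes of blocks that are new or differ from the previous run."""
    return [i for i, h in enumerate(new_hashes)
            if i >= len(old_hashes) or old_hashes[i] != h]
```

Only the blocks returned by `changed_blocks()` would need copying on each run. The cost scales with the size of the partition rather than with the number of files on it, which is why this is attractive with hundreds of millions of small files. This naive version still reads every block on every run.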
