The Matrix cluster is currently unavailable due to a problem with its data storage.

12/2013: EZ-Backup data:

  • Duration of backup: 60 minutes
  • Total backed up: 5.35 million files, amounting to 1.57TB of data.
    • This represents 52% compression (including versions?)
  • Incremental backup:
    • The most recent backup covered 453K files, with 5GB of data transferred.

Situation

See recovery project's punch list.

Update

11/15/13

Draft schedule, assuming no surprises

Now: Yi to run two different ~24-hour tests.

...

Friday

  • Matrix is now open to all researchers.

=> Researchers to confirm they have all their data from their clean-up. We recognize this may require running some test jobs. Do this as soon as possible!

Deadline for getting older files back is:

  • Monday, 11/25 (possibly sooner if the spare HDs are needed earlier!): After this date, we will be unable to restore any data not on EZ-Backup.

...

ChemIT's hard drive testing status.

...

| Hard drive number | Size | Purpose | Test status | Result, notes |
|---|---|---|---|---|
| 0 | 3TB | data | N/A | Physically broken connector. File system suspect from initial failure. |
| 1 | 3TB | data | PASSED | Drive passed testing and was successfully zeroed. |
| 2 | 3TB | data | PASSED | File system suspect from initial failure. Drive passed testing and zeroing. |
| 3 | 3TB | data | FAILED | Reported ECC error during initial recovery/failure. Drive being replaced via Seagate ASAP. Replacement drive might be a while as Seagate is having supply issues. |
| 4 | 3TB | data | PASSED | Drive passed testing and was successfully zeroed. |
| 5 | 3TB | data | PASSED | Drive passed testing and was successfully zeroed. |
| 6 | 160GB | OS | PASSED | Passed long test on 10/15/2013. |
| 7 | 160GB | OS | PASSED | Passed long test on 10/15/2013. |

...

  • We tried copying the data from the Matrix disks to ChemIT disks to create a backup that resides independently of your hardware.
  • However, when we arrived this morning, we found the copy had not completed (176GB of 3.1TB). Worse, we now can't see much of the original data.
  • We have called in additional expertise to help characterize the problem further, especially now that we can't even see the original data.

...