Restore the system

Assumptions

Original Scheraga RAID card is OK.

  • So far, no indication that it isn't OK.

Using a RAID card is more robust than not using the card.

  • The card can fail, and a replacement card would then be needed to access the disks again, but that risk is far lower than the risk of one of the hard disks failing. Hence the value of using the RAID card.

All but one of the existing hard drives are fully operational.

  • If other drives are non-operational, more drives may have to be purchased.

Rebuild 1: Operating system

Summary recommendation

  • Retain 100% functionality.
  • No purchases necessary.
  • Reuse both existing OS hard drives.
    • The data currently on the hard drives can be deleted.

Details

Unknown: ChemIT to confirm drives pass a drive test. If they pass the test, re-deploy them.

  • Do a "quick test": uses drive tools which //retain// all existing data.

Result would be status quo:

  • 160GB of space.
    • Two 160GB drives on hardware RAID card, mirrored (RAID 1).
    • OS using ~77GB of this space and not growing rapidly, so that's plenty of space.

Rebuild 2: Data

Summary recommendation

  • Don't retain 100% functionality.
    • Retain full data storage capability, with same redundant configuration.
    • Remove in-box versioning (Back in Time copy), decreasing the number of disks required.
  • No purchases necessary.
  • Reuse existing data hard drives which still work (at least 4 of the original 6).
    • The data currently on the hard drives can be deleted.

Details

One hard drive has a broken board. The current plan does not require that drive, so the repair is likely not worth the expense.

Unknown: ChemIT to confirm other 5 drives pass a drive test. If they do, redeploy at least 4 of them.

  • Do a "long test": uses drive tools which //wipe out// all existing data.
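The two kinds of drive test in this plan (a quick one that keeps data, a long one that wipes it) could be scripted roughly as below. This page does not name the drive tools, so the sketch assumes a Linux host with smartmontools (`smartctl`) and e2fsprogs (`badblocks`). The helper name `drive_test_command` and the device paths are hypothetical. The script only prints the commands and does not run them. Drives sitting behind the hardware RAID card may need to be addressed differently than shown here.

```python
import shlex
from typing import List


def drive_test_command(device: str, destructive: bool) -> List[str]:
    """Build (but do not run) a drive-test command line.

    ASSUMPTION: the tools are smartctl and badblocks; the plan itself
    does not say which drive tools ChemIT will use.
    """
    if destructive:
        # "Long test": badblocks write-mode test (-w) overwrites the whole
        # device with test patterns, so all existing data is lost.
        return ["badblocks", "-wsv", device]
    # "Quick test": SMART short self-test, which leaves existing data intact.
    return ["smartctl", "-t", "short", device]


if __name__ == "__main__":
    # Placeholder device names; substitute the real ones before use.
    print(shlex.join(drive_test_command("/dev/sdX", destructive=False)))
    print(shlex.join(drive_test_command("/dev/sdY", destructive=True)))
```

Keeping the destructive choice as an explicit argument makes it harder to wipe an OS drive by accident when the same script is used for both rebuilds.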

Unknown: A hard drive port in the head-node might not be usable. If that port is needed, confirm that it still works.

Result would be same storage space and redundancy for data:

  • 6TB of space.
    • Four 3TB drives on hardware RAID card, so 2 could fail (RAID 6).
    • Have one spare drive "on hand".
    • Data currently at 3.1TB, and growing.
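The capacity figures for both rebuilds can be checked with a few lines of arithmetic. This is a minimal sketch, assuming identical drives in each array and ignoring filesystem and formatting overhead. The function names are illustrative only.

```python
def usable_capacity(level: int, drives: int, size: float) -> float:
    """Usable capacity of an array of identical drives, in the units of `size`."""
    if level == 1:
        if drives < 2:
            raise ValueError("RAID 1 needs at least 2 drives")
        # Every drive holds a full copy, so capacity equals one drive.
        return size
    if level == 6:
        if drives < 4:
            raise ValueError("RAID 6 needs at least 4 drives")
        # Two drives' worth of space is used for parity.
        return (drives - 2) * size
    raise ValueError("only RAID 1 and RAID 6 are covered here")


def drives_that_can_fail(level: int, drives: int) -> int:
    """Number of simultaneous drive failures the array survives."""
    return drives - 1 if level == 1 else 2


if __name__ == "__main__":
    os_gb = usable_capacity(1, 2, 160)     # two 160GB drives, mirrored
    data_tb = usable_capacity(6, 4, 3.0)   # four 3TB drives, RAID 6
    print(f"OS:   {os_gb} GB usable, {77 / os_gb:.0%} used")
    print(f"Data: {data_tb} TB usable, {3.1 / data_tb:.0%} used, "
          f"{drives_that_can_fail(6, 4)} drives can fail")
```

With the numbers on this page, the OS array comes out at 160 GB (about 48% used) and the data array at 6 TB (about 52% used and growing), so the data array is the one to watch.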

Usage

  • ~800 cores are used by ~20 people, sometimes running 30 programs (at once?).