What are the cost trade-offs of reducing server hard drive failures? What can be done to mitigate the consequences of drive failures?

Can we reduce the chance of hard drives failing, and at what cost/benefit?

Ideas and our current thoughts

Invest in more reliable disk drives

More reliable disk drives cost more money. Fortunately, we tend to avoid plain consumer-grade drives for the servers and invest in ones like WD's "Red" (good) and "Black" (better).

  • On 3/27/13, we bought a WD "Black" 2 TB drive for $160-170.

Buy solid-state drives (SSDs)

Consider doing this whenever an SSD's smaller capacity is acceptable and the cost at that capacity is also acceptable.

From Roger, re: false disk failures:

http://www.techspot.com/news/52047-what-is-false-disk-failure-and-why-is-it-a-problem.html

  • 2/100 drives fail per year (it's actually much higher for the drives ChemIT uses!!)
    You can expect something like 1/100 of your drives to really fail this year. And you can expect another 1/100 of your drives to fail this year, but not actually be failed. You'll still pay all the operational overhead of not actually having a failed drive – rebuilds, disk replacements, management interventions, scheduled downtime/maintenance time, and the OEM replacement price for that drive – what $600 or so?
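To make the arithmetic in the quote concrete, here is a rough sketch in Python. It assumes the quoted figures (1 real and 1 false failure per 100 drives per year, about $600 per replacement); the function names and the fleet sizes are hypothetical, and since ChemIT's actual failure rate is noted above as higher, these numbers should be treated as a lower bound rather than a forecast.

```python
def expected_annual_failures(drive_count, real_per_100=1, false_per_100=1):
    """Return (real, false) expected failures per year for a fleet.

    The default rates come from the quoted article: roughly 1 real and
    1 false failure per 100 drives per year. They are assumptions, not
    measurements of our own drives.
    """
    real = drive_count * real_per_100 / 100
    false = drive_count * false_per_100 / 100
    return real, false


def expected_annual_cost(drive_count, cost_per_replacement=600,
                         real_per_100=1, false_per_100=1):
    """Expected yearly replacement cost in dollars.

    A false failure is charged the same as a real one, because the
    article's point is that the operational overhead is paid either way.
    """
    failures_per_100 = real_per_100 + false_per_100
    return drive_count * failures_per_100 * cost_per_replacement / 100


if __name__ == "__main__":
    # Hypothetical fleet sizes, for illustration only.
    for drives in (20, 100):
        real, false = expected_annual_failures(drives)
        cost = expected_annual_cost(drives)
        print(f"{drives} drives: {real} real + {false} false failures, ${cost:.0f}/year")
```

Under these assumptions, half of the yearly replacement cost goes to drives that had not actually failed. Raising `real_per_100` to match our own experience would show how quickly the total grows.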

Can we reduce the consequences of hard drives failing, and at what cost/benefit?

Ideas and our current thoughts

Monitoring tools

Invest in learning how to better deploy and use monitoring tools. Some tools may cost money. Not all of these may be relevant, but here are some buzzwords Oliver has come across:

  • Nagios
  • NetOps
  • S.M.A.R.T.
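As one possible starting point for S.M.A.R.T. monitoring, the sketch below uses Python to read a drive's overall health verdict. It assumes the `smartctl` command from the smartmontools package is installed and that the script runs with enough privileges to query the drive. The function names and the device path `/dev/sda` are hypothetical, and the parsing only covers the overall-health line that `smartctl -H` prints for ATA drives.

```python
import subprocess


def parse_health(smartctl_output):
    """Return True if the overall health line says PASSED, False if it
    reports anything else, and None if no health line was found."""
    for line in smartctl_output.splitlines():
        if "overall-health self-assessment test result" in line:
            verdict = line.split(":", 1)[1].strip()
            return verdict == "PASSED"
    return None


def check_drive(device):
    """Run `smartctl -H` on one device and parse its verdict.

    Assumes smartctl is on the PATH; subprocess.run raises
    FileNotFoundError if it is not installed.
    """
    result = subprocess.run(
        ["smartctl", "-H", device],
        capture_output=True,
        text=True,
        check=False,  # smartctl uses non-zero exit codes to flag problems
    )
    return parse_health(result.stdout)


if __name__ == "__main__":
    device = "/dev/sda"  # hypothetical device path; adjust per server
    status = check_drive(device)
    if status is None:
        print(f"{device}: no S.M.A.R.T. health line found")
    else:
        print(f"{device}: {'healthy' if status else 'FAILING, investigate'}")
```

A script like this could be run on a schedule or wrapped as a check for a monitoring system such as Nagios. A passing verdict does not guarantee a healthy drive, so it complements backups and redundancy and does not replace them.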