Specifications, costs, and trade-offs for upgrading and expanding the Matrix cluster.
From Scheraga, with Czarek, dated 5/27/2014.

The bottom line, in brief:

A. Buy new head node
B. Buy new storage machine
C. Buy new computational nodes
D. Arrange an efficient back-up plan

Added by ChemIT:

E. Buy components for networking and power, and a new rack.

TOTAL for Option 1 column: <$50,750 + $1,000/yr for backups.
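
As a quick check of the total above, the Option 1 figures from the table on this page can be summed in a few lines of Python. This is only a minimal sketch: the dictionary keys are illustrative labels, and item E is entered at its stated upper bound of $2,700, which is why the result is an upper bound rather than an exact price.

```python
# Option 1 prices in USD, copied from the table on this page.
# Keys are illustrative labels, not official line-item names.
option1 = {
    "A. head node": 3_000,
    "B. storage machine": 4_250,
    "C. compute nodes": 40_800,
    "E. networking, power, rack (upper bound)": 2_700,
}

one_time_total = sum(option1.values())
backup_per_year = 1_000  # D. EZ-Backup is a recurring cost, not a one-time cost

print(f"One-time total: under ${one_time_total:,}")
print(f"Recurring: ${backup_per_year:,}/yr for backups")
```

Keeping the recurring backup cost separate from the one-time purchases mirrors how the total is stated above.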

Component

Option 1 (most likely)

Option 2

To consider, to do's, questions, notes and comments.

A. New head node

Dedicated chassis.
(1 U)
$3,000

None.

We considered using one of the four computers in a Quad (one 2U chassis) as the head node. This is worth considering only if we need to save $5,000 by giving up one of the 8 compute nodes.
Note 1: Head node has lower proc, lower memory, and 256 GB SSD drive (for OS and applications).
Note 2: Head node price includes a large 4TB "WD Black" drive for temporary user data (~$2,700 + $250).

B. New storage machine

Synology-branded dedicated storage array (RS 3412XS)
$3,000, plus five 4TB hard drives ($250 each). Total price is for 12-16TB of initial storage, depending on risk choices (see Notes).
$4,250

Home-brewed dedicated storage array, perhaps running OpenNAS software.
To do: Get price of appropriate hardware.

Consider Option 2 for cost savings (if any).
BUT, must also consider risk, support, and staffing effort.
To do: What connector? If Ethernet: iSCSI (all compute node writes go through the head node) or NFS (theoretically could be accessed directly by compute nodes)? If not Ethernet, what connector technology, at what price and complexity?
Note 1: Five 4TB HDs can allow for 1 drive to fail (16TB usable) or 2 drives to fail (12TB usable).
Note 2: Price for Option 1 includes five large 4TB "WD red?" for temp. user data (~$2,700+$250).
Note 3: $1,000 gets you a redundant power supply.
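
The capacity figures in Note 1 follow from reserving one or two drives' worth of space for redundancy. The sketch below assumes a parity-style layout in which each tolerated drive failure costs one drive of capacity; the page does not name a specific RAID level, and `usable_tb` is a hypothetical helper, not part of any Synology tool.

```python
def usable_tb(drives: int, drive_tb: int, tolerated_failures: int) -> int:
    """Usable capacity when each tolerated failure costs one drive of space.

    Assumption: parity-style redundancy. Filesystem overhead is ignored,
    so real usable space will be somewhat lower.
    """
    if not 0 <= tolerated_failures < drives:
        raise ValueError("tolerated_failures must be in [0, drives)")
    return (drives - tolerated_failures) * drive_tb


for failures in (1, 2):
    print(f"{failures} drive(s) may fail: {usable_tb(5, 4, failures)} TB usable")
```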

C. New computational node

8 nodes (in 2 Quads), with higher computational processors.
Each node: 2 * E5-2670v2; 2.5GHz, 10 cores/proc (thus, 20 cores/node). Adds $2,500 per node compared to Option 2. Thus, $5,100 per node.
$40,800

16 nodes (in 4 Quads), with standard computational processors.
Each node: 2 * E5-2620v2; 2.1GHz, 6 cores/proc (thus, 12 cores/node). $2,600 per node.
$41,600

Consider Option 2 to increase the number of cores from 160 to 192, but with a slower set of processors. See Czarek's note from 4/14/14, below.
For either choice, to do: Storage: Fast or large? Or both?
Note: Buying in multiples of 4 (Quads) is most cost-effective.
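
The core counts and totals for the two compute-node options can be reproduced with a short calculation. This is a rough sketch using only the figures in section C; the names `options` and `summarize` are illustrative, and price per core deliberately ignores the clock-speed difference (2.5GHz vs. 2.1GHz), so it is not a performance comparison.

```python
# Figures from section C of this page; dictionary layout is illustrative.
options = {
    "Option 1": {"nodes": 8, "cores_per_node": 20, "price_per_node": 5_100},
    "Option 2": {"nodes": 16, "cores_per_node": 12, "price_per_node": 2_600},
}


def summarize(opt: dict) -> dict:
    """Return total cores, total price, and price per core for one option."""
    total_cores = opt["nodes"] * opt["cores_per_node"]
    total_price = opt["nodes"] * opt["price_per_node"]
    return {
        "total_cores": total_cores,
        "total_price": total_price,
        "price_per_core": total_price / total_cores,
    }


for name, opt in options.items():
    s = summarize(opt)
    print(f"{name}: {s['total_cores']} cores, ${s['total_price']:,}, "
          f"${s['price_per_core']:.2f} per core")
```

Option 2 comes out cheaper per core, but it needs twice as many nodes (and Quads), which bears on the rack space and power questions raised elsewhere on this page.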

D. Arrange an efficient back-up plan

EZ-Backup
$1,000 per year, at current amounts (2.7TB) and current rates (see Notes, at left).

N/A.
At current quantities of backed-up data, ChemIT cannot recommend an alternative.

On-going to do: Evaluate cost-effectiveness as volume grows.
Note 1: At the current 2.7TB of backup (including compression and versioning), costs are as predicted (no surprises). That is, they are considered affordable and cost-effective compared to investing in our own hardware and staffing.
Note 2: Costs are currently ~$80/month for 2.7TB. Future amounts are expected to be higher.
Note 3: Rates will likely drop by more than 10% on July 1, 2014, as in past years.
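
To support the ongoing cost-effectiveness evaluation, the notes above can be turned into a rough projection. The sketch below assumes pricing scales linearly with stored volume and that the July rate drop is exactly 10% (the note only says "more than 10%"); the implied per-TB rate is derived from the ~$80/month figure and is not an official EZ-Backup price. `projected_annual_cost` is a hypothetical helper.

```python
monthly_cost = 80.0  # USD, approximate current bill (Note 2)
stored_tb = 2.7      # current backed-up volume in TB

annual_cost = monthly_cost * 12
# Implied rate, assuming cost scales linearly with volume (an assumption).
rate_per_tb_month = monthly_cost / stored_tb


def projected_annual_cost(tb: float, rate_drop: float = 0.10) -> float:
    """Annual cost for `tb` terabytes after a fractional rate drop.

    The 10% default is an assumed value taken from the lower end of Note 3.
    """
    return tb * rate_per_tb_month * (1 - rate_drop) * 12


print(f"Current annual cost: ${annual_cost:,.0f}")
for tb in (2.7, 5.0, 10.0):
    print(f"{tb:>4} TB after rate drop: ${projected_annual_cost(tb):,.0f}/yr")
```

The current annual figure is consistent with the roughly $1,000 per year quoted for Option 1.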

E. Components for networking and power, and a new rack.

Under $2,700.

N/A.
There are few discretionary elements in this category.

Specifics:
Networking (~$1,500)
    1 switch (Cisco SG300-52; ~$750)
    Network cabling (price dependent on above decisions. Expect well under $1K)
Power (~$700)
    UPS (APC SMC1500-2U; $500; for head node, storage node, and switch. If needed to spread load or for manageability, reuse the current (cheaper) UPS once it is freed up from the current head node.)
    2-3 PDUs (power distribution units; a.k.a. power strips; $100 each)
        Exact qty depends on processor decisions (amperage calculations)
Rack (~$500)

4/14/14. Czarek: It looks like the slowest CPU, the E5-2620v2 at 2.1GHz, has the best price/performance ratio, but I would not buy the slowest CPU anyway. Right now in Gdansk we are buying 10 servers, and we decided to go for 10-core CPUs, the E5-2670v2 at 2.5GHz (20 cores per node). As on Matrix, in Gdansk we have only a slow interconnect between nodes, and some programs can run efficiently only on a single node, so more cores per node gives such programs an advantage. Other programs, both in Gdansk and on Matrix, just need the highest possible total performance and exchange very little data between nodes, so then the number of cores per node is not important. What about space restrictions? Is it better to buy a smaller number of faster nodes?
