NSF grant awarded. Thus, this project is a "go" as of August 2013.

Summary information for Hoffmann researchers, as of 12/12/13:

Helios and Sol will be unavailable to researchers between Monday, Jan. 6th at 9am and approximately Thursday, Jan. 9th at noon.

Detailed information for cluster migration:

1) Reduce researchers' temporary and test user data on Sol to the bare minimum during the break.

Bottom line: If there are any files on Sol that researchers cannot afford to lose, researchers must move those files to Helios (or elsewhere) before Sol is turned off.

2) Monday, Jan. 6th at 9am: Helios AND Sol will be turned off to researchers.

Researchers must know that:

Tasks for ChemIT to do:

3) Thursday, Jan. 9th at noon: Expect Sol to be available to researchers.

The Hoffmann group is encouraged to improve their software installations in order to create a more robust environment and to improve support outcomes.

Tasks for Hoffmann group:

Tasks for ChemIT:

NOTE: Helios will no longer be available after all this is done.

Thank you -ChemIT


Older or other notes:

Data rates

12/11/13: User data to transfer from Helios to Sol totals about 80 GB. The transfer, using rsync, is expected to take about 24 hours.
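As a rough sanity check on these figures, the sketch below works out the average throughput implied by moving about 80 GB in about 24 hours, and compares the transfer time with the planned outage window (Monday 9am to Thursday noon). The decimal unit convention (1 GB = 1000 MB) and the year 2014 for the outage dates are assumptions, not stated in these notes.

```python
from datetime import datetime

# Figures from the 12/11/13 note.
DATA_GB = 80
TRANSFER_HOURS = 24

# Assumption: decimal units, 1 GB = 1000 MB.
MB_PER_GB = 1000


def implied_rate_mb_per_s(data_gb, hours):
    """Average throughput (MB/s) implied by moving data_gb in the given hours."""
    return data_gb * MB_PER_GB / (hours * 3600)


rate = implied_rate_mb_per_s(DATA_GB, TRANSFER_HOURS)
print(f"Implied average rate: {rate:.2f} MB/s (about {rate * 8:.1f} Mbit/s)")

# Outage window from the summary; the year 2014 is an assumption.
outage_start = datetime(2014, 1, 6, 9, 0)   # Monday, Jan. 6th, 9am
outage_end = datetime(2014, 1, 9, 12, 0)    # Thursday, Jan. 9th, noon
window_hours = (outage_end - outage_start).total_seconds() / 3600
print(f"Outage window: {window_hours:.0f} h; "
      f"slack after transfer: {window_hours - TRANSFER_HOURS:.0f} h")
```

With these inputs the implied rate is a little under 1 MB/s, and the 24-hour transfer fits within the 75-hour outage window.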

Next steps

Draft idea

Unknowns

Tasks and estimated timing

Top Level Task Description                                            Effort Est.   Assignee
--------------------------------------------------------------------  -----------   --------
Planning
  Discovery/Overview mtg                                              1.5 hrs
  Vet options and conduct needs analysis to match to hardware order   1-2 weeks
  Specify exactly the systems to order within budget. Includes        1 week
    iterating with vendor experts.
  Approval                                                            0 days
Order & Installation
  Place & Process order                                               1/2 week
  Delivery, after order is placed at Cornell                          ~3 weeks
  Receive order and set-up hardware in 248 Baker Lab                  1 week
Build New Cluster
  Get head node and 1st cluster node operational with OS and          3 weeks
    cluster management software
  Test / Verify / Approval                                            1 week
Convert Old Cluster
  Move user accounts and data; test, prep, and do                     1 week
  Move old nodes to new cluster                                       1 week
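To get a feel for the overall timeline, the effort estimates in the task list above can be added up. The sketch below assumes the tasks run strictly one after another with no overlap and no idle time between them, which the notes do not state; it also leaves out the 1.5-hour discovery meeting as negligible at this granularity. The shortened task names are only labels for this example.

```python
# (task label, low estimate in weeks, high estimate in weeks)
# Estimates come from the task list; "~3 weeks" is treated as exactly 3.
TASKS = [
    ("Vet options and needs analysis", 1, 2),
    ("Specify systems to order", 1, 1),
    ("Approval", 0, 0),
    ("Place & process order", 0.5, 0.5),
    ("Delivery", 3, 3),
    ("Receive order and set up hardware", 1, 1),
    ("Head node and 1st cluster node operational", 3, 3),
    ("Test / verify / approval", 1, 1),
    ("Move user accounts and data", 1, 1),
    ("Move old nodes to new cluster", 1, 1),
]

# Assumption: tasks are strictly sequential, so the totals are plain sums.
low_total = sum(low for _, low, _ in TASKS)
high_total = sum(high for _, _, high in TASKS)

print(f"Sequential total: {low_total} to {high_total} weeks")
```

Under that sequential assumption the estimates sum to between 12.5 and 13.5 weeks.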

 

Other provisioning models and related ideas

Buy cycles, on demand

Good for irregular high-performance demands, especially when there are high peaks of need and long-lasting jobs.

Host hardware at CAC rather than with ChemIT

Hosting costs at CAC cover the basics: expert initial configuration, keeping the system current, and keeping the lights on. Other services are charged hourly.

Per the above rate calculator, the rate for 9 nodes (1 head node + 8 compute nodes) would be $8,291/yr. Or, $24,873 for 3 years for this service.

At current ChemIT rates, 9 nodes would be $321.84/yr. Or, $965.52 for 3 years of service.
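The two quotes can be put side by side with a few lines of arithmetic. The sketch below uses only the annual rates given above; the per-node figure is a simple average (annual rate divided by 9 nodes) and assumes nothing about how either rate is actually structured. It covers the hosting service fee only, not hardware.

```python
NODES = 9   # 1 head node + 8 compute nodes
YEARS = 3

# Annual service rates quoted in the notes, in USD.
ANNUAL_RATE = {
    "CAC hosting": 8291.00,
    "ChemIT": 321.84,
}


def total_cost(annual_rate, years=YEARS):
    """Total service cost over the given number of years."""
    return round(annual_rate * years, 2)


def per_node_per_year(annual_rate, nodes=NODES):
    """Simple average cost per node per year (not an actual rate schedule)."""
    return round(annual_rate / nodes, 2)


for name, annual in ANNUAL_RATE.items():
    print(f"{name}: ${total_cost(annual):,.2f} over {YEARS} years, "
          f"${per_node_per_year(annual):,.2f} per node per year")

difference = round(
    total_cost(ANNUAL_RATE["CAC hosting"]) - total_cost(ANNUAL_RATE["ChemIT"]), 2
)
print(f"Difference over {YEARS} years: ${difference:,.2f}")
```

With the quoted rates, hosting at CAC comes to $24,873.00 over three years against $965.52 at ChemIT, a difference of $23,907.48.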

Table, related to our options

Consideration                   ChemIT                       CAC: RedCloud         CAC: Hosting             Amazon (EC2?) or        Other ideas?
                                                                                                            Google (Compute?)
------------------------------  ---------------------------  --------------------  -----------------------  ----------------------  ------------
Hardware costs                  $25K                         -                     $25K                     -
Hardware support                Yes.                         -                     Yes.                     -
OS install and configuration    Yes. CentOS 6.4                                    Yes. CentOS 6.4
Cluster and queuing management  Yes. Warewulf, with options  -                     Yes. ROCKS, no options.  -
Research software install and   Yes                          No                    Yes; additional cost     No
  configuration
Application debugging and       Not usually.                 Yes; additional cost  Yes; additional cost     No.
  optimization support          Available from CAC, at                                                      Available from CAC, at
                                additional cost?                                                            additional cost?