
NSF grant awarded, so this is a "go", August 2013

Next steps

  • Meet to review all options and confirm desired direction and expected timing.
  • Review resources. Xiao-Qiu Ye has cluster experience, including set-up.

Draft idea

High-level overview

  • Create a stand-alone cluster using new hardware ($25K for a minimum of 3 years of operation, thus ~$8K/yr in hardware).
    • Uses new OS and related cluster management software.
    • Install and configure necessary applications.
    • Enable NetID-based access, if possible (limit 2-3 days for a "go/no-go" decision on this functionality)
  • Confirm old nodes can successfully be added to that new cluster.
  • Migrate users to new cluster.
  • Migrate old nodes to new cluster.

Unknowns

  • Time needed to install all necessary applications, many of which are new to Lulu.
  • Whether NetID-based access will succeed. But note that this is not a do-or-die step.

Tasks and estimated timing

| Top Level Task Description | Effort Est. | Assignee |
|---|---|---|
| Planning | | |
| Vet options and conduct needs analysis to match to hardware order | 1-2 weeks | |
| Specify exactly the systems to order within budget. Includes iterating with vendor experts. | 1 week | |
| Order & Installation | | |
| Place & Process order | 1/2 week | |
| Delivery, after order is placed at Cornell | ~3 weeks | |
| Receive order and set-up hardware in 248 Baker Lab | 1 week | |
| Build New Cluster | | |
| Get head node and 1st cluster node operational with OS and cluster management software | 3 weeks | |
| Test / Verify / Approval | | |
| Convert Old Cluster | | |
| Move user accounts and data | | |
| Move old nodes to new cluster | | |

Lulu becomes available ~mid-September?

See the unknowns above, which relate to tasks that will take additional time to accomplish.
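As a rough cross-check of the timing, the effort estimates from the task table can be summed. This is only a sketch: it assumes the tasks run strictly one after another, treats "~3 weeks" as exactly 3, and leaves out every task that has no estimate yet, so the totals are a lower bound. The names `ESTIMATES` and `total_weeks` are made up for this illustration.

```python
# (task, low estimate in weeks, high estimate in weeks), copied from the task table.
# Tasks without an estimate (testing, account and node migration) are omitted.
ESTIMATES = [
    ("Vet options and conduct needs analysis", 1.0, 2.0),
    ("Specify systems to order", 1.0, 1.0),
    ("Place and process order", 0.5, 0.5),
    ("Delivery", 3.0, 3.0),  # table says "~3 weeks"; treated as exactly 3 here
    ("Receive order and set up hardware", 1.0, 1.0),
    ("Head node and first cluster node operational", 3.0, 3.0),
]


def total_weeks(estimates):
    """Return (low, high) totals in weeks, assuming tasks run strictly in sequence."""
    low = sum(lo for _, lo, _ in estimates)
    high = sum(hi for _, _, hi in estimates)
    return low, high


low, high = total_weeks(ESTIMATES)
print(f"Estimated tasks alone: {low} to {high} weeks")
```

Any overlap between tasks (for example, planning the software install while waiting for delivery) would shorten this total, and the unestimated tasks would lengthen it.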

Other ideas

  • We can walk through rates and scenarios, as appropriate.
  • We can meet with CAC, since they may be willing to do more for a $25K commitment than what is published for their $400-minimum offering.
    • Brainstorming idea: Would they be willing to add hardware to CAC's RedCloud so that the buyer of that hardware gets a better price and/or privileged access?

Buy cycles, on demand.

Good for irregular high-performance demands, especially when there are high peaks of need and long-lasting jobs.

  • Buy cycles from CAC (RedCloud, minimum of $400 for 8585 core*hours)
    • 12 cores available at any one time on one system.
      • Can access more than one system at a time, but systems are not linked.
    • $400 (minimum) buys you 8585 core*hours
      • This comes out to ~1 core for an entire year, non-stop.
    • For 96 cores, that's $38.4K for 1 year, non-stop.
      • 96 = 8 nodes, each with dual 6-core procs => 8 * 12 = 96
    • Or, for $25K, that's ~536,562 core*hours.
      • $25K = $400 * 62.5 units. Each unit is 8585 core*hours, so 62.5 units gets you 536,562.5 core*hours.
  • Determine costs, processes, and trade-offs if we use another cloud service, such as:
    • Amazon (Amazon EC2).
    • Google (Google Compute Engine).
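The RedCloud figures in the list above can be reproduced with a few lines of arithmetic. This is a minimal sketch that assumes the price scales linearly from the $400 minimum, with no volume discount and no other fees. The function names are invented for this illustration and do not come from CAC.

```python
UNIT_PRICE_USD = 400        # minimum RedCloud purchase quoted above
UNIT_CORE_HOURS = 8585      # core*hours per $400 unit quoted above
HOURS_PER_YEAR = 365 * 24   # 8760; assumes a non-leap year


def core_hours_for_budget(budget_usd):
    """Core*hours a budget buys, assuming purely linear pricing."""
    return budget_usd / UNIT_PRICE_USD * UNIT_CORE_HOURS


def core_years_per_unit():
    """Fraction of a year that one unit keeps a single core busy non-stop."""
    return UNIT_CORE_HOURS / HOURS_PER_YEAR


def rounded_cost_for_cores_one_year(cores):
    """Cost using the rounding above: one unit is roughly one core for one year."""
    return cores * UNIT_PRICE_USD


def exact_cost_for_cores_one_year(cores):
    """Cost without that rounding, still assuming linear pricing."""
    return cores * HOURS_PER_YEAR / UNIT_CORE_HOURS * UNIT_PRICE_USD


print(core_hours_for_budget(25_000))                # 536562.5
print(round(core_years_per_unit(), 2))              # 0.98
print(rounded_cost_for_cores_one_year(96))          # 38400
print(round(exact_cost_for_cores_one_year(96), 2))
```

Because one unit covers slightly less than a full core-year, the unrounded cost for 96 cores running non-stop comes out a little above the $38.4K quoted above (roughly $39.2K).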

Host hardware at CAC rather than with ChemIT

| Option: | ChemIT | CAC: RedCloud | CAC: Hosting | Amazon (EC2) or Google (Compute Engine) | Other ideas? |
|---|---|---|---|---|---|
| Hardware costs | $25K | - | $25K | - | |
| Hardware support | Yes. | - | Yes. | - | |
| OS install and configuration | Yes. CentOS 6.4 | | Yes. CentOS 6.4 | | |
| Cluster and queuing management | Yes. Warewulf, with options | - | Yes. ROCKS, no options. | - | |
| Research software install and configuration | Yes | No | Yes; additional cost | No | |
| Application debugging and optimization support | Not usually. Available from CAC, at additional cost? | Yes; additional cost | Yes; additional cost | No. Available from CAC, at additional cost? | |
