Priority is Liang's project, which in the short term is not likely to require Eldore.

Project lead: Oliver <oh10>

Team: Zhichun Liang and Peter (or Petr) Borbat

Goal

Run very large jobs, and run them hundreds of times. Owning the equipment is expected to be the most cost-effective option.

Strategy

Get Liang set up with CAC's services so he can create the software and test it, paying per drink at that smaller scale.

The outcome (and process) of using CAC's hardware and services will inform what needs to happen for the subsequent production mode.


Oliver's notes from the 11/28/12 meeting

Barry: Need to reinstall the OS (since the current installation depends on AFS and CCMR's infrastructure).

Petr: Remove the 4 GPUs and install them in Wintel boxes, one GPU per box.

Barry: The GPUs are giant-sized: will they fit? Recommend consulting with CRCF before buying Windows-based computers, to confirm fit.

Barry: Currently: RedHat. Future idea: Scientific Linux 6 (based on RHEL 6), 64-bit.

Oliver: Noted that CRCF and CAC are using CentOS (also, RHEL-based).

Barry: ~48-64GB RAM

Liang's computations: 64-bit. X-Windows. Batch.

Jack: Reported that Nandini has 40-50 nodes, managed by CAC. Very positive. 3 FTE cluster experts. Rapid response. Reasonable, for the nodes. Consulting at about $60/hr. Contact is Resa Alford, <rda1>.

Kevin: If Petr and Kevin are to manage it: Kevin has past experience with SUSE. Might need paid consulting.

Oliver: If managed by CRCF, it would likely be CentOS, unless there is a good reason not to use that OS.

Decision at meeting:

Consider using CAC if it's a technical "fit", and pay per drink. Get an estimate before committing. In the short term, do this since we need VERY high memory. BUT, note that doing this in production is likely not cost-effective compared to investing in our own hardware.

----

Past email threads, pre-meeting

Barry, 11/20/12, 2:34 PM

Eldor will need to be reinstalled. Our installation is tied to AFS and our infrastructure. Going to Windows or a standalone Linux system seems like a good idea.

Peter, 11/20/12, 12:51 PM

It may be better to set up user accounts on, or to lease, a CAC v4-64g node (64 GB, Red Hat Linux). The lease option may be practical if the node is used intensively. CAC operates in cost-recovery mode, so we just need to estimate the fees.

ELDOR may not blend with any of CAC's blade systems, so it could be decommissioned and used locally at ACERT as a LINUX or WINDOWS 64-bit development platform. Kevin Hobbs and I can host it if CRCF would not.

Alternatively, I can use it for 3D-EM simulations, for example, since its GPUs are not used by any NLLS code. We can surely find a good use for it.

Oliver, 11/20/12, 10:23 AM

(1) Hosting the Eldore server. (Jack will be speaking with Nandini about CAC's services)
