Examples
From Roald Hoffmann to his researchers, 5/28/2014:
See also
Cluster support matrix:
| Who is responsible? | Level | Older cluster | Contemporary cluster | Notes |
|---|---|---|---|---|
| Research group member only | Applications: running within a user's directory | | | What will run on contemporary OS? |
| Research group (designated member) | Applications: Research software, residing in research group's shared directory | xtaloopt, phonop | xtaloopt, phonop | What will run on contemporary OS |
| ChemIT | Applications: Select research software, residing in ChemIT-designated directory | Gaussian, ADF | Gaussian, ADF | Q: Are these the correct applications? |
| ChemIT | Compiler applications: Select software, residing in ChemIT-designated directory | Intel compiler, using ChemIT-managed license server. | Intel compiler, using ChemIT-managed license server. | What will run on contemporary OS? |
| ChemIT | System applications: Provisioning web interface to cluster services | WebMO (old version) | WebMO (latest version) | Collum's only, |
| ChemIT | System applications: Job queuing and scheduling systems | Maui/Torque | Maui/Torque | |
| ChemIT | Compute node provisioning | Perceus | Warewulf | |
| ChemIT | Monitoring tools | RAID controller card monitoring (hard disk monitoring) | RAID controller card monitoring (hard disk monitoring) | |
| ChemIT | Versioning tools | Back In Time | Back In Time | Optional |
| ChemIT | Backup software (not just versioning!) | EZ-Backup (off-site) | EZ-Backup (off-site) | Optional? |
| ChemIT | Operating System (OS) | Fedora 13 (or 9 and 11!) | CentOS 6 | |
| | | Scheraga | Collum | Migration in progress: |
| Cluster or | Research group's software installer and maintainer | Notes |
|---|---|---|
| Scheraga | Czerek | |
| Lancaster | Kyle | |
| Lancaster: Crane | Who? | |
| Hoffmann | 3/19/14: Who now? | Was Andreas. Who now? |
| Freed | Who? | |
Define "responsible"
The group that selects, installs (compiling if necessary), debugs, maintains, updates, and otherwise keeps the software functional. This group finds alternatives, secures the resources needed to make things work, and otherwise "owns" the problem.
Others can assist, consult, and provide "best effort" services. For example, ChemIT will require research groups to confirm the operation of new installations, and research groups may benefit from having ChemIT look over problems they are having with their software.
Conversations with Oliver
| Group | Representative | Date, notes |
|---|---|---|
| Crane | Brian | 11/26/13, Tues. Initial briefing. No concerns expressed at this time. |
| Hoffmann | Roald | 11/25/13, Mon. |
Issues
- Version neglect makes support harder over time, for any of these layers.
- Impact on research when software is (1) updated or (2) upgraded.
Other language used to help clarify roles and responsibilities for software on the research clusters
(1) Application software that your researchers install on the cluster should not go in their home directories. Instead, it should be installed in:
(Example: /home/hoffmann/bin)
This will increase efficiency and robustness.
Details:
Using the cited shared directory makes it easier for other researchers to use software that is already installed. And as people leave the group and new people join it, their home directories won't hold software that others depend on. For these reasons, if Lulu is asked to assist with application installation, configuration, or use, I would appreciate your support and understanding that she cannot afford to do so for software installed in people's cluster home directories.
(2) Your researchers are responsible for installing, configuring, and knowing how to effectively use any research application they use. They can request assistance from Lulu, but only after they have made a good-faith effort to learn about and fix their problem themselves. This way Lulu can contribute to expanding a researcher's knowledge of something they are responsible for understanding, even as she assists them in getting things working, with their active involvement.
I have drafted a page listing the research applications (and related software) on your cluster, identifying explicitly the software your researchers are responsible for installing (including compiling) and maintaining:
https://confluence.cornell.edu/x/ogNpDw
Details:
The page also lists, for completeness and clarity, the application software ChemIT (through Lulu) is responsible for installing (including compiling) and configuring. (For added clarity, that software is installed elsewhere on your cluster; see the page for directory details.) Even for software ChemIT is responsible for installing, researchers still have the responsibility to know how to use the software effectively, to confirm it works correctly (and to help characterize the problem when it doesn't), and to respond thoughtfully to requests related to our efforts to debug that software to meet your researchers' needs.
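
As a rough illustration of point (1) above, the following Python sketch checks whether an executable's path falls inside a group's shared directory rather than a personal home directory. It is only a sketch under stated assumptions: `/home/hoffmann/bin` is the example path quoted above, while the function name, the user name `alice`, and the tool name `some_tool` are hypothetical and do not describe how any cluster is actually laid out.

```python
from pathlib import PurePosixPath

# Example shared directory quoted in the text above; adjust per group.
SHARED_DIR = "/home/hoffmann/bin"


def installed_in_shared_dir(executable_path, shared_dir=SHARED_DIR):
    """Return True if executable_path is shared_dir or lies inside it.

    This is a purely lexical comparison: symlinks and ".." segments
    are not resolved, so pass absolute, already-normalized paths.
    """
    exe = PurePosixPath(executable_path)
    shared = PurePosixPath(shared_dir)
    return exe == shared or shared in exe.parents


if __name__ == "__main__":
    # Hypothetical install locations, for illustration only.
    candidates = [
        "/home/hoffmann/bin/some_tool",
        "/home/alice/bin/some_tool",
    ]
    for path in candidates:
        where = "shared directory" if installed_in_shared_dir(path) else "elsewhere"
        print(f"{path}: {where}")
```

Comparing whole path components (rather than string prefixes) avoids treating a sibling such as `/home/hoffmann/binary` as if it were inside `/home/hoffmann/bin`.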