PLEASE:

Scheraga's group must confirm decision on partitions AND their capabilities and use.

Why? ChemIT requires this information to set up the new Matrix. Without a decision from the group soon, this project will come to a full stop until the information is provided.

  • Initial email sent 7/29/2014 5:30:05 PM to Czarek. No written confirmation back from Czarek, as of 8/25/14 (Monday).
    • Q: Copy sent to Gia? Format OK? (Format is unreadable within Remedy; does it work with plain-text email readers?)

Even after the group has confirmed ChemIT's initial suggestions (which still needs to happen), further iterations and clarifications may be required. Thus, we will need the group's continued engagement to ensure ChemIT's work best meets the group's needs.

Current idea:

On external 12 TB Synology system:

  • /home (3-3.5TB total (ensure under 4TB!); on Synology)
    • Can be seen by compute nodes.
      • Scratch data is to be stored on compute nodes until the calculation is done. If it is not automatically deleted (aborted job, etc.), it is the researcher's responsibility to delete the scratch data.
      • When a calculation is complete on a compute node, the data is then written to the /home partition (or to /notbackedup, if more space is needed temporarily).
    • Data on /home must be removed when not actively needed for a researcher's cluster work. Researchers can move the data to /storage, for safe-keeping and convenient access.
    • Data is backed up via EZ-Backup.
    • Group to decide on quota for any new user. The process to request more space is through the group's designate.
  • /storage (8TB total)
    • Can NOT be seen by compute nodes.
    • This is to store data related only to Scheraga research. No personal files.
    • Data is backed up via EZ-Backup.
    • The process to request more space is through group designate.
  • TBD (1TB total)
    • To be used for either expanding /home or /storage at a later date (when we know which needs more space sooner than the other).
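The Synology sizes above can be checked with simple arithmetic. The following is a minimal sketch, assuming the 12 TB figure is the space actually available for partitions (the notes do not say whether it is raw or usable capacity). The function name `synology_plan` is hypothetical.

```python
# Planned sizes in TB, taken from the list above.
SYNOLOGY_TOTAL_TB = 12.0  # assumption: treated as capacity available for partitions
STORAGE_TB = 8.0          # /storage
TBD_TB = 1.0              # reserved for later expansion
HOME_LIMIT_TB = 4.0       # /home must stay under this size


def synology_plan(home_tb):
    """Return (allocated_tb, remaining_tb, home_ok) for a given /home size."""
    allocated = home_tb + STORAGE_TB + TBD_TB
    remaining = SYNOLOGY_TOTAL_TB - allocated
    return allocated, remaining, home_tb < HOME_LIMIT_TB


if __name__ == "__main__":
    # Both ends of the proposed 3-3.5TB range for /home.
    for home_tb in (3.0, 3.5):
        allocated, remaining, home_ok = synology_plan(home_tb)
        print(f"/home={home_tb}TB allocated={allocated}TB "
              f"remaining={remaining}TB home_under_4TB={home_ok}")
```

Under that assumption, a 3 TB /home uses the full 12 TB, while a 3.5 TB /home would need 12.5 TB in total, so the upper end of the range may only fit if the TBD reserve or /storage shrinks.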

On the headnode's internal 4TB regular hard drive:

  • /notbackedup (4TB total)
    • Can be seen by compute nodes.
    • For when a researcher needs more temporary space than an individual's /home directory can provide.
      • No need to expand a user's /home directory to meet their temporary "peak" needs.
    • Researcher must move relevant data off as soon as they are done with it.
    • This data is NOT BACKED UP. Use only temporarily, and only when a researcher needs "surge" space.
    • No quotas. Thus, potential for researchers to "step on each other".
  • Q: Make more robust by adding hardware RAID and a 2nd identical drive?
    • Cost of RAID card:
    • Cost of 2nd identical hard drive:
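The scratch workflow described above (work in scratch on the compute node, copy results to /home or /notbackedup when done, then delete the scratch data) could be wrapped in a small helper. This is a sketch only, not how jobs on Matrix are actually run. The function `run_with_scratch` and its parameters are hypothetical, and the scratch location is left to the caller.

```python
import shutil
import tempfile
from pathlib import Path


def run_with_scratch(job, dest_dir, scratch_root=None):
    """Run job(scratch_dir) in a temporary scratch directory.

    Results are copied to dest_dir only after the job finishes without
    raising. The scratch directory is removed in either case.
    """
    dest = Path(dest_dir)
    scratch = Path(tempfile.mkdtemp(prefix="scratch_", dir=scratch_root))
    try:
        job(scratch)  # the calculation writes its files under `scratch`
        dest.mkdir(parents=True, exist_ok=True)
        # Copy the finished results to /home (or /notbackedup) style storage.
        shutil.copytree(scratch, dest, dirs_exist_ok=True)
    finally:
        # Clean up scratch even when the job raised an exception.
        shutil.rmtree(scratch, ignore_errors=True)
    return dest
```

The `finally` clause covers jobs that fail with an ordinary exception. It does not run if the process is killed outright or the node goes down, so the researcher's responsibility to remove leftover scratch data still applies.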

On the headnode's internal 256GB SSD:

Drive is backed up via EZ-Backup, and contains these partitions:

  • OS (16 GB)
    • YUM-installed applications
    • And select other cluster-specific applications
  • swap (32 GB)
  • /data (200+GB; the rest!)
    • Applications for researchers (/software -> /data/software)
    • Software source files.

Scheraga's group must provide to ChemIT two (2) quota numbers for every active user (this information may also inform the partitions' minimum sizes, required above)

For each current researcher, a quota must separately be specified for their:

  • /home
  • /storage

ChemIT has provided Gia the current use and single quota for each researcher.

  • This quota is a combination of both /home-type use and /storage-type use.

Gia has asked every researcher:

  • Desired quota for /home?
  • Desired quota for /storage?
  • How much data to transfer to Matrix's /storage when it is made available (from their local hard drives, etc.)?

Many researchers have not replied to Gia.

  • ChemIT will thus create a report for Gia showing when each of these researchers last logged in.
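The last-login report for researchers who have not replied could be assembled along these lines. This is a sketch under assumptions: the last-login dates are taken as already collected (for example from the cluster's login records), and the function name `last_login_report` and the usernames in the example are hypothetical.

```python
from datetime import date


def last_login_report(last_logins, replied, today):
    """Return (user, last_login, days_since) for users who have not replied.

    last_logins maps username -> date of last login, or None if never.
    Rows are ordered with never-logged-in users first, then oldest login first.
    """
    rows = []
    for user, last in last_logins.items():
        if user in replied:
            continue  # this researcher already answered; leave them out
        days = None if last is None else (today - last).days
        rows.append((user, last, days))
    rows.sort(key=lambda r: (r[1] is not None, r[1] or date.min))
    return rows


if __name__ == "__main__":
    # Hypothetical data for illustration only.
    logins = {"user_a": date(2014, 8, 1), "user_b": None, "user_c": date(2013, 1, 15)}
    for user, last, days in last_login_report(logins, set(), date(2014, 8, 25)):
        print(user, last, days)
```

Sorting the least recently active accounts to the top makes it easier to spot users who may no longer need a quota at all.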

Scheraga's group must approve process for ChemIT to copy researchers' files from old Matrix to the new Matrix.

Idea:

  • Create empty /home directories for each user. ChemIT leaves it empty, for the researcher to purposefully populate.
  • ChemIT copies all of a user's data from the old Matrix to the new Matrix's /storage partition. Each researcher selectively and deliberately moves what is required for their current cluster jobs to their new Matrix home directory.
  • OUTCOMES
    • Clean /home directories, making restores and debugging much easier over time.
    • Pawel supports this approach. He estimates perhaps 1/2 to 1 hour of work for each researcher.
  • To do: Figure out dates and other timing, including complete downtime.
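The proposed copy process can be sketched per user as follows. This is an illustration of the idea, not ChemIT's actual procedure. The function `migrate_user`, the per-user subdirectory under /storage, and the directory arguments are all assumptions. By default it only reports what it would do.

```python
import shutil
from pathlib import Path


def migrate_user(user, old_home_root, new_home_root, storage_root, dry_run=True):
    """Copy a user's old data to storage and create an empty new home.

    Returns the list of planned actions. Nothing is changed on disk
    unless dry_run is False.
    """
    old_home = Path(old_home_root) / user
    new_home = Path(new_home_root) / user
    archive = Path(storage_root) / user  # assumed layout: one directory per user
    actions = [f"copy {old_home} -> {archive}", f"create empty {new_home}"]
    if not dry_run:
        # All old data goes to storage; nothing is placed in the new home.
        shutil.copytree(old_home, archive, dirs_exist_ok=True)
        new_home.mkdir(parents=True, exist_ok=True)
    return actions
```

Note that `shutil.copytree` does not preserve file ownership, so a real migration would need to handle ownership separately or use a different copy tool.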

Scheraga's group must decide importance of integrating GPU

Yi He must test (on ChemIT's headnode)

 

Group's Matrix documentation

ChemIT strongly recommends that the group develop and communicate a community standard of practice.

  • Where to host?
  • Who to create, organize, and maintain content and associated conventions?
    • Negotiate decisions among group, and ensure technical decisions are ones ChemIT can support.

Initial ideas from Pawel and Gia (documented by Oliver, as a favor):


Regulations:

(1) Default quotas for new researchers.

  • Example:
/home    /storage

5GB?

50GB?

100GB

How does a user request a bigger quota?

  • Document the workflow, specifying who needs to approve what, and what information they need to do so.
  • Is there a difference between a temporary quota increase and one which is more permanent?

(2) Rules and expectations (behaviors) of group members

  • Ex: Respond in a timely manner, and completely, to requests from group's designate.

(3) Support for researchers using Matrix.

  • For a given question, how does a researcher get support?
    • What questions go through group designate, and what goes directly to ChemIT, and how is that done?
  • What response can one expect? 

Also may want to add:

  • How-to's

 

 

 

 
