DIT Grace Cluster info for CS faculty and instructors

  1. Introduction
  2. Software
  3. Environment, filesystem, and directories
  4. Other issues
  5. Access to systems, adding TAs, and increasing quota
  6. Other information

Introduction

Please mail me any questions you may have. Once you have access, if you find problems or other issues, please let me know so I can bring them to DIT's attention. I would also be happy to have any comments on or suggestions about this document.

DIT's classroom machines are named the Grace cluster ("Glue Research and Academic Computing Environment"). These are several (VMware virtual) machines, each with four 3.0 GHz Xeon processors and 16GB RAM, running Red Hat Enterprise Linux 6. There's a webpage with some information about the systems for instructors and students at http://www.grace.umd.edu/.

The machines are named grace1.umd.edu, grace2.umd.edu, etc. However, if you don't care which machine you use, just log in to grace.umd.edu, which will connect to one of them.

Note that insecure access (telnet and ftp) to these systems is not possible; secure access (ssh, slogin, scp) must be used instead.
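As a small illustration of the host naming above, here is a sketch of a helper that builds a Grace hostname for use with ssh or scp. The function name grace_host and the login ID "yourid" are made up for this example; only the hostnames themselves come from the text above.

```shell
# Build a Grace hostname: with no argument this gives the generic alias
# (grace.umd.edu); with a number it gives that specific machine.
grace_host() {
    printf 'grace%s.umd.edu\n' "${1:-}"
}

# Typical use ("yourid" is a placeholder directory ID):
#   ssh "yourid@$(grace_host)"                   # whichever machine answers
#   scp notes.txt "yourid@$(grace_host 2):~/"    # copy a file to grace2
```

The ssh and scp lines are left as comments so that pasting the block does not open a connection by accident.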

Software

These systems use a utility called module so users can get access to certain software without having to permanently modify their dotfiles. Some software is not on the standard search path, but running module modifies the user's path and environment to give access to it for the duration of the current login session. For example, mathematica and matlab aren't on the default search path, but they can be run after typing  module load mathematica  or  module load matlab  respectively. "module avail" lists the applications available via this mechanism. "module whatis" followed by a module name gives a few-word explanation of that module, for example  module whatis mathematica.

One module of interest to CS faculty and instructors is Oracle Java 1.8, which is available by typing  module load java/8. (See Project submission below.)

If you or students want to use one of the applications available via module during most logins, just put the appropriate module command into the file .cshrc.mine in your home directory, which gets executed upon every login (this is assuming the default shell tcsh is used; use .bashrc.mine for bash).
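One way to add a module command to .cshrc.mine without ending up with duplicate lines is sketched below. The helper name add_line_once is hypothetical (it is not a DIT tool). It is written in plain sh so it can be run from either tcsh or bash as a small script.

```shell
# Append a line to a dotfile only if that exact line is not already there.
#   $1 = path to the dotfile, $2 = the line to ensure is present
add_line_once() {
    touch "$1" || return 1
    # -x matches whole lines, -F treats the text literally (no regex)
    grep -qxF -e "$2" "$1" || printf '%s\n' "$2" >> "$1"
}

# Example (tcsh users; bash users would target ~/.bashrc.mine instead):
#   add_line_once "$HOME/.cshrc.mine" "module load java/8"
```

Running it twice with the same line leaves the file unchanged the second time, so students can safely rerun setup instructions.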

Environment, filesystem, and directories

Accounts

"Disposable" class accounts for each course, which disappear at the end of the semester, are not used with these systems. Instead students, faculty, instructors, and TAs can access the systems using their campus TerpConnect account (TerpConnect accounts were previously called Glue or WAM accounts). Anyone affiliated with the University can get a TerpConnect account online at www.it.umd.edu/new/. Students registered for a course will then get login permission on the Grace machines when the course instructor requests class access (see the last section below).

Faculty and instructors don't have to deal with student password problems, login difficulties, etc.; since students use their TerpConnect accounts they can contact the DIT Help Desk for such things (1221 McKeldin Library, x51500, https://umd.service-now.com/itsupport/). For courses that use these systems, students also don't have to set up their environment (.emacs file, aliases, etc.) every semester in a separate account for every course, as they did with the old DIT systems.

The Grace machines share a filesystem with the regular TerpConnect machines. The differences between the two are as follows. The Grace machines are dedicated to instructional use, so only students taking courses can log into them, while anyone with a TerpConnect account can log into the regular TerpConnect machines, and those accounts persist as long as their owners are associated with the University. The Grace machines are faster. The regular TerpConnect machines are Solaris boxes. Lastly, when class access is created for the Grace systems, students registered for the course get extra disk space (see below), and shared course disk space is created where files may be provided to students.

Note that students may be using their TerpConnect account for more than one course, if they're taking multiple courses using the Grace cluster. They won't need separate class accounts for each course as with the old cluster; they'll just get additional separate disk space for each course, and access to the shared course disk space created for each course.

Note that a user's University directory ID and password are their login ID and password for the TerpConnect and Grace systems. Although some University applications (for example the registration system) allow uppercase letters in the University ID when logging in, the Grace and TerpConnect systems use the all-lowercase form of the directory ID, so if students have trouble logging in, one thing to ask is whether they're using uppercase letters in their directory ID.
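If you collect login IDs from students (for a roster or a script), a one-line helper can convert them to the all-lowercase form before use. This is only a sketch; normalize_id is a name invented for this example.

```shell
# Convert a directory ID to the all-lowercase form the Grace and
# TerpConnect systems expect.
normalize_id() {
    printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]'
}

# Example: normalize_id "LHerman"
```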

tcsh is the default shell on the Grace machines. Although bash is supported, and some other shells are installed, some settings may not work correctly without tweaking if users change their shell. For example, various shell and environment variables set in the system login scripts may not all get set correctly if someone changes their shell. If students have trouble with any tasks (things like resource limits, submitting projects, using the right version of Java, etc., discussed below) you may want to ask them if they've changed their shell. If so they should either change it back or, if they're proficient enough with UNIX to have a preference between shells, they may be able to figure out what has to be modified in their dotfiles to use another shell.
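A quick way to check whether a student is still on the default shell is to look at the SHELL environment variable. The sketch below wraps that check in a function; the name uses_default_shell is hypothetical, and it assumes SHELL reflects the login shell, which is the usual convention but not guaranteed.

```shell
# Succeed if the given shell path (or $SHELL when no argument is given)
# is tcsh, the default shell on the Grace machines.
uses_default_shell() {
    case "${1:-${SHELL:-}}" in
        */tcsh|tcsh) return 0 ;;
        *)           return 1 ;;
    esac
}

# Example:
#   uses_default_shell || echo "login shell is $SHELL, not tcsh"
```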

Temporary files

On the Grace systems, temporary files should be put in /tmp, rather than in /var/tmp.

Default dotfiles and resetting an account

If a user accidentally deletes or trashes one of their dotfiles, but can still log in and execute commands, default versions of all the dotfiles as of when new accounts are created can be found in the directory /local/lib/skel. If a user clobbers their dotfiles to the extent that the best thing would be for them to start over with a fresh set of dotfiles, as of when their account was created, they can run /usr/local/scripts/newdefaults, which will do this.
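For restoring a single dotfile by hand, something like the following sketch could be used. It assumes the default files in /local/lib/skel carry the same names as the dotfiles in the home directory, which should be checked with ls -a first. The function name restore_dotfile and the .broken suffix are made up for this example.

```shell
# Replace one dotfile with its default version, keeping the old copy.
#   $1 = dotfile name (for example .cshrc)
#   $2 = home directory (defaults to $HOME)
#   $3 = directory holding the defaults (defaults to /local/lib/skel)
restore_dotfile() {
    name=$1
    home=${2:-$HOME}
    skel=${3:-/local/lib/skel}
    if [ ! -f "$skel/$name" ]; then
        echo "no default version of $name in $skel" >&2
        return 1
    fi
    # Keep the damaged copy so nothing is lost.
    if [ -e "$home/$name" ]; then
        mv "$home/$name" "$home/$name.broken" || return 1
    fi
    cp "$skel/$name" "$home/$name"
}

# Example: restore_dotfile .cshrc
```

For a full reset, the newdefaults script mentioned above is the simpler choice.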

If they mess things up so badly that they can't even log in (which probably means that they can log in but get kicked off right away), you or a TA would probably still be able, if the student is with you, to scp their dotfiles somewhere else (taking care not to clobber your own versions) with the student typing in their password, edit them, and copy them back. However, the easiest thing to do in such a situation may be to just tell the student to contact the DIT Help Desk and explain that they need their TerpConnect account dotfiles reset, because they can't even log in.

AFS and Kerberos

TerpConnect uses the AFS file system. This allows instructors and TAs to grant varying levels of access to directories to specific lists of users, without having to get staff to create groups. (Users can create and administer their own groups in AFS as needed, although DIT's web interface can usually handle most situations.) Each directory has an Access Control List (ACL) which can be manipulated to control access. Of course, doing these things requires learning the AFS commands if you don't know them already; these are described at www.helpdesk.umd.edu/documents/1/1222. A few of the things described in more detail on that page are:

  • Listing the AFS ACLs for a directory: use  fs listacl
  • Setting the AFS ACLs for a directory: use  fs setacl
  • ACL access rights (the AFS analogue of directory permissions) are described at http://www.helpdesk.umd.edu/documents/1/1222/#acl.
  • Also see the AFS command quick reference chart at www.helpdesk.umd.edu/documents/1/1222/#quick; this includes the commands to create and administer groups.
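As a sketch of the two ACL commands in the list above, the wrappers below show one common pattern: granting a user read access to a directory and then listing the result. The function names and the FS_CMD variable are made up for this example. The rights string "rl" (read plus lookup) is standard AFS usage, but check the Help Desk page above for the rights you actually want.

```shell
# Command used to talk to AFS; overridable so the wrappers can be tried
# out (for example with FS_CMD=echo) on a machine without AFS.
FS_CMD=${FS_CMD:-fs}

# Give a user or AFS group read and lookup rights on a directory.
#   $1 = directory, $2 = login ID or group name
grant_read() {
    "$FS_CMD" setacl -dir "$1" -acl "$2" rl
}

# Show the current ACL of a directory.
show_acl() {
    "$FS_CMD" listacl "$1"
}

# Example ("handouts" and "larry" are placeholders):
#   grant_read handouts larry
#   show_acl handouts
```

AFS ACLs apply to directories rather than individual files, so access is granted on the directory that holds the files.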

    The systems use Kerberos for authentication. When a user logs in, the login process gives them a Kerberos ticket and an AFS token (used for file permission and authentication). These both expire after a day; the command klist lists Kerberos tickets and the time they expire, while the command tokens lists AFS tokens and when they expire.

    Class file and directory structure

    When you request space for your class the following directory will be created (suppose this is for CMSC 123, Fall 2015, section 0101):

    /afs/glue.umd.edu/class/fall2015/cmsc/123/0101
    Under this several directories will be created. A full description of all of these directories, and what permissions different classes of users (instructors, TAs, and students) have for them is given on DIT's Grace pages at
    http://www.grace.umd.edu/directory-layout.html. DIT's page at www.grace.umd.edu/calendarhtml indicates when after the semester is over student access to these directories will end, and when they're deleted.

    Student class disk space

    Students are given extra disk space for every course using the Grace cluster which they're registered for. Instead of the disk space being in their home directory it's in the AFS volume for that course. For example, a student whose login ID was larry, taking CMSC 123, Fall 2015, section 0101 would get a directory created named
    /afs/glue.umd.edu/class/fall2015/cmsc/123/0101/student/larry.
    To avoid students having problems with quota in their home directories you may want to tell them how to create a symlink from their home directory to this AFS space, and to cd there to do all their coursework for your course.
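The path layout and the symlink suggestion above can be sketched as two small helpers. The function names and the link name cmsc123 are invented for this example; the directory layout is the one shown in the text.

```shell
AFS_CLASS_ROOT=/afs/glue.umd.edu/class

# Print a student's class directory.
#   $1 = term, $2 = department, $3 = course number, $4 = section, $5 = login ID
student_class_dir() {
    printf '%s/%s/%s/%s/%s/student/%s\n' \
        "$AFS_CLASS_ROOT" "$1" "$2" "$3" "$4" "$5"
}

# Create a symlink to the class directory, refusing to overwrite anything.
#   $1 = class directory, $2 = name of the symlink to create
link_class_dir() {
    if [ -e "$2" ] || [ -L "$2" ]; then
        echo "$2 already exists" >&2
        return 1
    fi
    ln -s "$1" "$2"
}

# Example, for the student and course used above:
#   link_class_dir "$(student_class_dir fall2015 cmsc 123 0101 larry)" "$HOME/cmsc123"
#   cd ~/cmsc123
```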

    Note that after a semester ends students can just use cp if they want to save their work, by copying it from this location to their regular TerpConnect home directory tree (provided they have sufficient disk quota available; see below). Since TerpConnect accounts persist as long as someone is associated with the University they would have access to anything in their home directory tree for the duration of their time at UMCP.

    Quota

    Individual students running out of disk quota shouldn't be as much of an issue to deal with as on the old detective cluster, because students will get extra quota for each course, and any questions about quota would be between a student and DIT. The AFS class account space which is created for a course has a common quota rather than a per-user quota, so it could fill up if any students create large files.  fs lq <directory name> will list the quota for the AFS volume if you want to check on the quota being used by your class space. Use the web interface at https://learningonline.umd.edu/LTtools/grace/my-grace-spaces.seam (under "Manage site(s)") to request a large-enough quota for your course. You will get a warning email if disk usage starts to get close to the quota.
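To turn the fs lq output into a single percentage (for example in a script that checks the class space periodically), a small awk filter is one option. This sketch assumes the usual fs lq layout: a header line followed by rows whose second and third columns are the quota and the amount used, in kilobytes. Verify that against real output on Grace before relying on it. The function name quota_percent is made up.

```shell
# Read fs lq output on standard input and print the percentage of quota
# used for each volume. Rows whose quota column is not a positive number
# (for example "no limit") are skipped.
quota_percent() {
    awk 'NR > 1 && $2 ~ /^[0-9]+$/ && $2 > 0 { printf "%d\n", ($3 * 100) / $2 }'
}

# Example, run from inside the class directory:
#   fs lq . | quota_percent
```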

    Project submission

    If you are using the CSD project submission server, the command-line submission tool uses a jarfile, which requires Oracle Java and won't work with GNU Java. On the Grace machines, where GNU Java is the default, either tell students to type  module load java/8  (once per login) before running the project submission script, or just have them add it to the .cshrc.mine file in their home directory, which gets executed upon every login.

    The project submission script for the submit server will fail on the Linux Grace machines due to insufficient default virtual memory; the default is 600000 kbytes, but Oracle Java requires more than that. Just have students add  limit vmemoryuse 1500000  to their ~/.cshrc.mine file, which is more than sufficient (a smaller limit may also work if you want to experiment), and log in again.

    There is a simple option for project submission for those who aren't using the CSD project submission server: the AFS space for a course includes space and directories already set up for project submission. The submit program which some faculty used on the old DIT class cluster, written by Dr. Purtilo, won't work on these systems as it won't run under AFS. DIT has a simple submit shell script installed as /usr/local/scripts/submit. To use it a student just runs submit, and it prompts for the necessary information. It leaves the submitted file (for example for CMSC 123, Fall 2015, section 0101) in /afs/glue.umd.edu/class/fall2015/cmsc/123/0101/submit/<loginID>. A subdirectory is created there for each project (based on the project number given when submitting).

    This script lacks some features: it doesn't identify late submissions, it doesn't keep a log of submissions made, and the only mechanism to cut off submissions as of a certain time is to change the access permissions for the directory (or to copy the submitted files elsewhere) at the desired time. You can, however, write a script to choose which submissions you want to grade based on their timestamp (the timestamp is appended to each submitted file's name). It's also possible to copy and modify this script to add features and have your students run your version instead, or to set up your own directory for submission elsewhere and write your own project submission program entirely if preferred.
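One way to pick out on-time submissions is sketched below. Since the exact format of the timestamp that submit appends to file names isn't described here, this version uses each file's modification time instead, which is an assumption: it is only valid if the files haven't been touched or copied without preserving times since submission. The function name on_time_submissions is made up.

```shell
# List files under a directory that were last modified at or before a
# deadline.
#   $1 = directory to search
#   $2 = deadline in the format accepted by touch -t, e.g. 201510022359
on_time_submissions() {
    marker=$(mktemp) || return 1
    # Create a reference file whose modification time is the deadline.
    touch -t "$2" "$marker" || { rm -f "$marker"; return 1; }
    # "! -newer" keeps files that are not more recent than the marker.
    find "$1" -type f ! -newer "$marker" | sort
    rm -f "$marker"
}

# Example, for the class directory used above:
#   on_time_submissions /afs/glue.umd.edu/class/fall2015/cmsc/123/0101/submit 201510022359
```

If you copy the submitted files elsewhere before grading, use cp -p so the modification times are preserved.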

    Other issues

    Making requests and reporting problems (the request system)

    Problems with the systems can be reported using the "request" system: from the Grace machines (or from any TerpConnect machine) run "request" and type in a description of the issue, which is then directed to the sysadmins. If you can't log in to a TerpConnect machine the request application is also available on the web at https://umd.service-now.com, or just send mail about the problem to itsupport@umd.edu. To submit a request using the web interface click on "Report an Issue". The procedure for requesting class account space is described below.

    Mail forwarding

    New TerpConnect accounts are set up to autoforward email to the user's University email account, which should be a Terpmail account for students, or a mail.umd.edu account for faculty and staff. Although it's doubtful anyone would want to do this, users who for some reason don't want their mail forwarded (who want to read mail sent to their TerpConnect account locally on the TerpConnect or Grace machines) can use DIT's request system (described right above) to ask that mail autoforwarding on their TerpConnect account be turned off. If after that you want to forward email elsewhere, the .forward file should be put in an atypical location: if your account ID is "user" it would be placed in /mail/user/.forward rather than in your home directory. You can point students who are new to mail forwarding to the more detailed, TerpConnect-specific procedures on the Help Desk page www.helpdesk.umd.edu/os/unix/usage/410; this also explains the format of the .forward file.

    Long-running jobs

    The Kerberos tickets and AFS tokens used for authentication will expire after a day. If you leave an xterm running for longer than that your session won't have any permissions, but just run renew and type your password to restore access.

    If you have a long-running or background job which will take longer to run than the duration of your Kerberos tickets and AFS tokens, DIT has an autorenew command which will extend tickets and tokens before they expire (man autorenew has a brief explanation).

    scp and ssh-agent

    Due to Kerberos authentication it's not possible to log in to TerpConnect and Grace machines without typing the password, even if you're running ssh-agent on your local machine. You also can't remotely scp files to or from TerpConnect machines without typing the password, for the same reason. However, you can run ssh-agent and ssh-add on a TerpConnect or Grace machine and scp files in either direction, without typing the password, by just running scp on the TerpConnect or Grace host. This requires typing your password once to log in to the Grace host and once when running ssh-add on it, and then once every 25 hours when you would need to run the renew command mentioned above to renew your Kerberos tickets and AFS tokens.

    at, batch, and cron jobs

    Due to Kerberos authentication issues, at, batch, and cron jobs aren't usable on these systems: you can start them, but since they wouldn't get any authentication tickets they won't be able to access files. If there's something which really has to be run as one of these types of jobs the staff can set it up if you use the request system described above.

    Access to systems, adding TAs, and increasing quota

    Request access for your course using the form at https://grace-request.umd.edu/cgi-bin/gracefe.cgi.

    You have to have a TerpConnect account yourself in order for the request to be processed, otherwise you won't be able to request or access the class space. If you don't have a TerpConnect account already you can request one at www.it.umd.edu/new. Once your account exists you can request class access.

    When requesting Grace access, first you select the desired semester, and the system determines what courses you're listed as teaching. If you're teaching multiple courses you next choose which one to create Grace access for. If it's a course with multiple sections (all taught by you) you can create access for only one section or some sections, or for all sections separately. (If multiple instructors are teaching a course see below before proceeding.) More likely, if you are teaching a course with multiple sections you will want to create one common site for all sections, meaning they would all share common AFS groups and common class disk space. Click on "Submit Request" and when the class access is created (which may take a day or so) you will get an email.

    Note: if multiple instructors are teaching different sections of the same course and want a shared Grace space for all sections, there is currently no automatic mechanism to set this up. Before requesting access, one of the instructors will need to submit a request explaining to DIT that you want a shared Grace space between different instructors' sections of the same course, which they will set up manually. (Alternatively, one instructor can create Grace access for their sections combined, then send a request, before the second instructor has requested access, asking for the other instructor's sections to be added to the shared space.)

    Note that your own login access to the Grace systems will disappear, as will the students', sometime after each semester is over, although you will remain able to log into the regular TerpConnect machines as long as you are associated with the University. At some point after each semester you will lose access to files in the disk space created for the course (see the Grace access removal timeline at www.grace.umd.edu/instructors/timeline.html), so you will have to copy any files in that space that you want to save to your TerpConnect home directory tree, or to some other machine, before file access is lost. If you want to have permanent login permission to the Grace machines you can request a research account at www.grace.umd.edu/research/. Note that even with a research account you would still lose access to course files according to the timeline linked to above; however, you would continue to be able to log into the Grace machines even after your class access expires.

    When class access has been set up you can then log into the same site used to request access in order to administer the class space, by clicking on "Manage site(s)". Click on "Manage" next to the course you want to administer, which takes you to a self-explanatory web interface that allows you to perform activities like requesting an increase in quota, adding TAs to the course (who are automatically given access to the course space, just like students), modifying permissions (for example which directories TAs can access), etc.

    Other information

    1. Students new to TerpConnect can be pointed to www.helpdesk.umd.edu/documentation/unix/gettingstarted.shtml.

    2. A page with some Grace information for students is at www.grace.umd.edu/students/.

    3. Lots of general TerpConnect information is available at www.glue.umd.edu/afs/glue.umd.edu/system/info/olh/.

    4. AFS reference documentation for UNIX is available at www.glue.umd.edu/admin/afsunixdoc/HTML/index.htm.

    5. AFS reference documentation for Windows is available at www.glue.umd.edu/admin/afswindoc/html/index.htm.

    6. An AFS FAQ is available at www.angelfire.com/hi/plutonic/afs-faq.html.