NeuroEcon New User Guide

Getting an account

To get an account on the neuroecon cluster, ask your PI to send an email to help-hpc@caltech.edu requesting one for you. Please include your preferred username, email address, phone number, mail stop, and how you would like us to deliver your password to you. We can deliver your password by campus mail or by phone.

Connecting to the cluster

You will need to ssh to neuroecon from the Caltech network to get access.

If you are using a Linux computer, open a terminal and type “ssh -X -lusername neuroecon.caltech.edu”.

If you are using Mac OS X, open the Terminal application in /Applications/Utilities. From there you can also type “ssh -X -lusername neuroecon.caltech.edu”. If you need to run graphical programs from the cluster (such as MATLAB), you can use the built-in X11 application on Mac OS X.

If you are running Windows, you will need to get an ssh client. One commonly used ssh application is PuTTY; you can get a copy here. To run interactive graphical applications, you will need an X server installed on your machine. Caltech has a site license for X-Win32, which can be downloaded at http://software.caltech.edu.

When you connect to neuroecon, you can do any interactive work that you need. Most notably, you will be compiling your applications, running MATLAB, and submitting jobs from the headnode. Make sure that you do not run compute jobs on the headnode, as this will slow down interactive use for everyone.

Your first connection to the cluster

When you initially connect to the cluster, you will likely be asked to generate an ssh key pair. You can do this simply by hitting Enter a few times until you are back at a prompt. This creates a passphraseless key for use within the cluster. All MPI jobs are launched using ssh, so this is an important step. Do not use this key elsewhere, as it is generally not a good idea to use passphraseless keys, and do not create passphraseless keys on remote systems to connect to neuroecon.
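If the key pair is not generated for you automatically, or you ever need to recreate it, the equivalent manual steps are roughly the following (a sketch; the cluster's automatic setup may differ slightly):

ssh-keygen -t rsa
# press Enter at each prompt to accept the default location and an empty passphrase
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
# appending your public key to authorized_keys is what lets the cluster nodes ssh to each other without a password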

Changing your password or shell

Our default shell is bash on the cluster.

To change your password, run the “passwd” command on the headnode. To change your shell, run “chsh” on the headnode.
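For example (the tcsh path below is only an illustration; check /etc/shells on the headnode for the shells actually installed):

passwd
chsh -s /bin/tcsh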

Forward email from the cluster to your regular email account

To have the email sent by the scheduler delivered properly to you, you will need to set up forwarding. To do this, create a file in your home directory called “.forward” with your email address in it.
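For example, from the headnode (substitute your own address for the placeholder below):

echo "username@caltech.edu" > ~/.forward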

Setting up MATLAB for the Distributed Computing Server

To set up MATLAB on the headnode for cluster use, you will need to import the cluster configuration into MATLAB. To do this, connect to the server with an X server running so that you can use graphical MATLAB, then launch MATLAB by typing “matlab” at the command prompt. Once MATLAB is open, go to the Parallel menu and open the Configuration Manager. From its menu choose File, then Import. Navigate to /home/quickstart and choose NeuroEconlocal.mat. Once it is imported, you can test it to ensure that it works properly: select the configuration you just imported and click the “Start Validation” button.

Using MATLAB with the Distributed Computing Server

The best place to look for information on using MATLAB with the Distributed Computing Server is the MathWorks documentation, which can be found here.

The main cluster-specific detail is the imported configuration. To set it as your scheduler in MATLAB, use:

sched = findResource('scheduler', 'configuration', 'NeuroEcon.local')

Below is a demo parallel session with a MATLAB file called colsum.m that will be submitted to 4 cores on the cluster. Before running this demo, please copy the colsum.m file from /home/quickstart into your home directory.

sched = findResource('scheduler', 'configuration', 'NeuroEcon.local')

sched =

PBS Scheduler Information
=========================

Type : Torque
ClusterSize : 96
DataLocation : /home/matlab_working/
HasSharedFilesystem : true

- Assigned Jobs

Number Pending  : 0
Number Queued   : 0
Number Running  : 0
Number Finished : 1

- PBS Specific Properties

ClusterMatlabRoot : /opt/matlab/
ServerName : neuroecon.caltech.edu
SubmitArguments :
ResourceTemplate : -l nodes=^N^

pjob = createParallelJob(sched);
set(pjob, 'FileDependencies', {'colsum.m'})
set(pjob, 'MaximumNumberOfWorkers', 4)
set(pjob, 'MinimumNumberOfWorkers', 4)
t = createTask(pjob, @colsum, 1, {})

t =

Task ID 1 from Job ID 12 Information
====================================

State : pending
Function : @colsum
StartTime :
Running Duration :

- Task Result Properties

ErrorIdentifier :
ErrorMessage :
>> submit(pjob)
>> results = getAllOutputArguments(pjob)

results =

[136]
[136]
[136]
[136]
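The four identical results come from each of the 4 workers summing one column of a 4x4 magic square and then combining the column sums with a global sum. colsum.m appears to follow the standard MathWorks parallel-job example; a sketch of that example is shown here for reference (this is an assumption, and the copy in /home/quickstart may differ):

function total_sum = colsum
% reference sketch of the MathWorks colsum example; the cluster's copy may differ
if labindex == 1
    % lab 1 builds a magic square sized to the number of workers and broadcasts it
    A = labBroadcast(1, magic(numlabs));
else
    % the other labs receive the broadcast
    A = labBroadcast(1);
end
% each lab sums the column matching its labindex
column_sum = sum(A(:, labindex));
% gplus adds the column sums across all labs, so every lab returns the same total
total_sum = gplus(column_sum);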

Wallclock

Once you have a better idea of how long your MATLAB job will take, you will want to set a wallclock time. Setting a realistic wallclock time is very important: it informs other users of how long the job will run, enables fairshare to work properly, and prevents runaway jobs from locking nodes that could be used for valid jobs. Set the wallclock time (in hours:minutes:seconds) before submitting the MATLAB job by altering the submission arguments:
set(sched, 'SubmitArguments', '-l walltime=06:00:00')

Compiling your MPI programs

Use mpicc for compiling C applications, mpif90 for Fortran, and mpic++ for C++ applications. You can use mpi-selector to choose a particular MPI implementation.
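For example, to compile a small C program (mpihello.c is just a placeholder name for your own source file) and to list the MPI implementations mpi-selector knows about:

mpicc -O2 -o mpihello mpihello.c
mpi-selector --list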

Creating your submission script

Generally, if you are submitting through MATLAB, you will not need to create this script yourself; MATLAB will create it and submit it for you.

You will generally launch your MPI jobs via a submission script. This gives the scheduler information about what resources your job will require. Let’s step through a typical script.

All scripts start with the shell they will run in. In this case, bash:

#!/bin/bash

You will generally need to give your job a name. This will make it easier to identify in the scheduler queue.

#PBS -N My_Job_Name

Then we want to select a queue to run in. Initially you will use the default queue, as this is the only one currently configured. If other queues are needed, email help-hpc@caltech.edu and we will set them up.

#PBS -q default

Then you will want to decide on the total number of cores you would like your job to run on.

#PBS -l nodes=16

This is your wall clock time. After this time has passed, your job will be terminated.

#PBS -l walltime=3:20:00

You will want to set any email options for job notifications; the “ae” options request mail when the job aborts or ends. For this to work, you will need to set up mail forwarding for your account (see the forwarding section above).

#PBS -m ae

Change to the working directory from which the job was submitted.

echo Working directory is $PBS_O_WORKDIR
cd $PBS_O_WORKDIR

Count the cores you were allocated so the number can be passed to the mpirun command.

NPROCS=`wc -l < $PBS_NODEFILE`
echo This job has allocated $NPROCS cpus

Then we launch your job using mpirun. On the cluster, mpirun is a wrapper script around the standard mpirun that populates the required options based on your submission script.

mpirun -np $NPROCS  mpihello
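Putting these pieces together, the complete submission script looks like this (mpihello is simply the example program from the compile step above; substitute your own executable):

#!/bin/bash
#PBS -N My_Job_Name
#PBS -q default
#PBS -l nodes=16
#PBS -l walltime=3:20:00
#PBS -m ae

echo Working directory is $PBS_O_WORKDIR
cd $PBS_O_WORKDIR

NPROCS=`wc -l < $PBS_NODEFILE`
echo This job has allocated $NPROCS cpus

mpirun -np $NPROCS mpihello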

Submitting your job

To submit your job, use the qsub command. If you have set up your submission script as above, you can simply run “qsub scriptname” to send your job to the queue. You can also override various settings on the command line when submitting your job. Here are some switches you can add to qsub to override what is in your script (an example follows below):

-l walltime=hh:mm:ss
-N job_name
-q queue_name

There are many other options, but these are the most commonly used.  You can run “man qsub” to get a list of other options.
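For example, to override the job name and wallclock time at submit time (myscript.pbs is just a placeholder for your own script name):

qsub -N test_run -l walltime=2:00:00 myscript.pbs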

Job management and monitoring

There are many ways to get information about the state of the cluster, queues, and running jobs. To get an overview of the state of the cluster, you can run “cstat”. This will give you a summary of the number of processors used, pending, and free, and various other realtime statistics.

To check what is running in the queue, you can use “showq”. This will show you all currently running, pending, and deferred jobs. If you are having problems with jobs being deferred, please let us know.

To get more in-depth information on a job, you can use the command “checkjob jobid”. This will show you the submission commands, the nodes it is running on, PIDs, and various other information.
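A typical monitoring session might look like this (12345 is a placeholder for the job ID reported by qsub):

cstat
showq
checkjob 12345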

Need help?

If you need help, there are a number of ways you can get it.  If you are having trouble compiling code that your group uses, the best resource is your colleagues. Many of them will already have the code running and can answer questions that are code specific.

If you are having technical difficulties, have questions, or wish to report a problem, the best place to get help is to email help-hpc@caltech.edu. This will generate a ticket that many people see. If you suspect a problem of any sort, I encourage you to send an email here; a problem that you think is just affecting you may be more systemic, and your report will help everyone.