Longhai Li @ University of Saskatchewan

Computer cluster “Bayes.usask.ca”

1.   How to access “bayes”

Your accounts on bayes have been created. By default, they use your NSID and your PAWS password. To change your password, log into bayes.usask.ca with an ssh client (PuTTY on Windows, or Terminal on Mac) and run the terminal command "passwd".

Use VPN to access bayes from off-campus. As required by the U of S and CFI, bayes can only be used from the campus network. If you need to use bayes from home, download and install the VPN client:

http://www.usask.ca/ict/services/network-services/vpn/index.php 

Using bayes from home may be very slow, depending on your connection.

2.   How to use rstudio server on “bayes”

To use R, you can use RStudio Server through a web browser such as Chrome, Firefox, or Safari. The URL to log into RStudio on bayes is:

http://rstudio.bayes.usask.ca

Note: RStudio is only for developing and testing your R code. It cannot do parallel computing.

3.   Basic commands in linux terminal to manage your files:

To learn the basic Linux terminal (shell) commands, we recommend reading this webpage:
http://www.comptechdoc.org/os/linux/usersguide/linux_ugbasics.html

You are in the terminal if the prompt line starts with $; the R console prompt starts with > instead.
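As a short illustration of the file-management commands covered on that page, the following sketch creates, copies, renames, and deletes a file in a scratch directory. The file names (test.r, backup.r, old.r) are made up for the example; type each line at the $ prompt, leaving out the comment.

```shell
# Work in a throwaway directory so nothing in your home directory is touched.
workdir=$(mktemp -d)
cd "$workdir"

mkdir test                 # create a directory called "test"
cd test                    # move into it
pwd                        # print the current directory
echo 'x <- 1' > test.r     # create a small one-line file
cp test.r backup.r         # copy a file
mv backup.r old.r          # rename (move) a file
ls -l                      # list files with details
rm old.r                   # delete a file
cd ..                      # go back up one level
```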

4.   File transmission between bayes and your desktop

SCP is available on bayes. You can use these methods to upload/download files between bayes and your desktop:

-  On all platforms: the RStudio web interface provides a function to transfer files.

-  On Windows: use WinSCP.

-  On Mac: mount the bayes hard drive to your Mac file system with these steps:

1.     Download osxfuse and sshfs and install them

2.     Download this bash script to your “home” directory:  .bash_profile

3.     Log out and log in again

4.     Create a directory called “bin” under your home directory; change its permission with “chmod 755 bin”

5.     Download the script file “mountbayes” into your “bin”

6.     Mount bayes drive of user abc123 with this command:

mountbayes abc123

5.   How to use “bayes” to do parallel computing with R

1.     Log into bayes using an ssh client (PuTTY on Windows, Terminal on Linux and Mac) with your username and password.

2.     Upload or create the “test.r” file (http://math.usask.ca/~longhai/useBayes/bayesfiles/test.r) in a directory, for example “test” under your home directory (which you need to create with mkdir test).

3.     Change to the test directory from your home directory (~). In the terminal ($), type: cd test

4.     Test your R job, say test.r, with this command:

testR test.r

5.     If test.r runs correctly and you want to run it 100 times simultaneously, do this:

qsubR test.r 100 1GB 00:10:00

Explanation:

In R syntax, this command is very much like

for (i in 1:100) {irep <- i; source("test.r")}

It creates 100 R jobs by adding the R expression “irep <- i” (for i = 1, …, 100) to the top of “test.r”, then submits the 100 modified copies of “test.r” to the cluster. All 100 copies run simultaneously on 100 CPUs. To distinguish the 100 job outputs, use “irep” as the identifier in each R job, for example in the name of the file where the job saves its results. Unlike in an ordinary “for” loop, the results of each job must be saved to the hard drive, and you need to write a function that combines them by reading the saved results back from the hard drive.
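To make the explanation concrete, here is a minimal shell sketch of the idea. It is not the source of qsubR, only an illustration of the effect described above. The function name make_job_copies, the output directory, the job_1.r, job_2.r, ... file names, and the stand-in test.r are all hypothetical, and the sketch only creates the modified files without submitting anything to the scheduler.

```shell
# Hypothetical sketch only: this is NOT the qsubR source.
# It writes one copy of an R script per job, each starting with "irep <- i".
make_job_copies() {
    script=$1      # the R script, e.g. test.r
    njobs=$2       # number of copies to create
    outdir=$3      # hypothetical directory that receives the copies
    mkdir -p "$outdir"
    i=1
    while [ "$i" -le "$njobs" ]; do
        # First line sets irep; the original script follows unchanged.
        { printf 'irep <- %d\n' "$i"; cat "$script"; } > "$outdir/job_$i.r"
        i=$((i + 1))
    done
}

# Demo in a scratch directory with a tiny stand-in for test.r.
# The stand-in saves its result to a file named after irep, so the
# outputs of different jobs do not overwrite each other.
demo=$(mktemp -d)
cat > "$demo/test.r" <<'EOF'
result <- irep^2
save(result, file = paste0("result_", irep, ".RData"))
EOF

make_job_copies "$demo/test.r" 3 "$demo/jobs"
head -n 1 "$demo/jobs/job_2.r"    # prints: irep <- 2
```

Because each copy writes to its own result file, a later R session can loop over the saved files and combine them.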

6.   Basic commands to control your jobs on cluster

-  qstat: to monitor the status of jobs

-  qdel 100: to kill the job with id 100

-  mqdel: to kill multiple jobs with one command, for example

mqdel 321 100

The above command kills 100 jobs counting back from job 321, that is, the jobs with ids 321, 320, ..., 222. Find the ids of your jobs using qstat.
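The id arithmetic can be checked with a small shell sketch. The helper job_ids is hypothetical and is not part of mqdel; it only lists the ids that a call of the form "mqdel LAST COUNT" is described as covering, and it deletes nothing.

```shell
# Hypothetical helper, not the mqdel source: print the job ids covered
# by "mqdel LAST COUNT", counting back from LAST.
job_ids() {
    last=$1                        # highest job id, e.g. 321
    count=$2                       # number of jobs, e.g. 100
    first=$((last - count + 1))    # lowest job id in the range
    seq "$last" -1 "$first"
}

job_ids 321 100 | head -n 1    # prints: 321
job_ids 321 100 | tail -n 1    # prints: 222
job_ids 321 100 | wc -l        # counts the ids: 100 in total
```

Running the helper before mqdel is a cheap way to confirm that the range covers only your own jobs as listed by qstat.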

For more information, visit the help page for the plato cluster, which uses the same scheduling system:

http://www.usask.ca/ict/services/research-technologies/advanced-computing/plato/running-jobs.php

Note: You don't need to understand the above description to use parallel computing. The command "qsubR" creates a shell file based on your R code and submits it to the bayes scheduler.

7.   Read more about the commands qsubR and mqdel

Type whereis qsubR to find the path to the file; you can then read the file for details. Besides the R file and the number of jobs, qsubR takes two more arguments that specify the memory and walltime needed for your jobs (1GB and 00:10:00 in the example above). Requesting less walltime and less memory than the default settings may give your jobs higher priority.

8.   Contact

To request an account or a software installation, please contact Richard Kondra (kondra@math.usask.ca) and cc Longhai Li (longhai.li@usask.ca).