Tutorial - Connecting to PDC Supercomputer

PDC, KTH's supercomputing center, has kindly made the Tegner supercomputer available to us for the course.

How do I access and use PDC supercomputers?

1. Obtain an account at PDC

To apply for an account on PDC supercomputers, you need a scan of your passport or international ID card. You will be asked to upload it as part of the application process. Also read the computer rules for running at PDC and the PDC privacy statement.

To apply for a PDC user account, fill in the form at:

https://pdc-web-01.csc.kth.se/accounts/

During the application, please state explicitly that you are applying for an account for DD2360. On the "Additional Information" page, select "edu19.DD2360" as the project (or "none" if "edu19.DD2360" is not yet in the list).

2. Set up Kerberos

To access Tegner (or any other PDC supercomputer) from your local machine, you need to set up Kerberos, the authentication system used to access supercomputers at PDC. If you are using the workstations in the laboratory rooms, skip this step.

Different systems (Linux, Mac, Windows) have slightly different procedures for setting up Kerberos.

Before connecting to Tegner, you must configure your local environment. Detailed instructions can be found at "How to configure kerberos and SSH".

To configure Kerberos on your local machine, follow these steps:

1) Create a .ssh folder in your home folder if it does not already exist.

2) Create a file called krb5.conf in .ssh with the following content:

[domain_realm]
  .pdc.kth.se = NADA.KTH.SE

[appdefaults]
  forwardable = yes
  forward = yes
  krb4_get_tickets = no

[libdefaults]
  default_realm = NADA.KTH.SE
  dns_lookup_realm = true
  dns_lookup_kdc = true

3) In .bash_profile or .profile, add the following two lines. The first points Kerberos to the right configuration file, and the second places the Kerberos credential cache in a fixed location instead of /tmp.

export KRB5_CONFIG=$HOME/.ssh/krb5.conf
export KRB5CCNAME=$HOME/.ssh/krb5cc

4) Create a file called config in .ssh with the following content:

# Hosts we want to authenticate to with Kerberos
Host *.kth.se *.kth.se.
    # User authentication based on GSSAPI is allowed
    GSSAPIAuthentication yes
    # Key exchange based on GSSAPI may be used for server authentication
    GSSAPIKeyExchange yes

# Hosts to which we want to delegate credentials. Try to limit this to
# hosts you trust, and where you really have use for forwarded tickets.
Host *.csc.kth.se *.csc.kth.se. *.nada.kth.se *.nada.kth.se. *.pdc.kth.se *.pdc.kth.se.
    # Forward (delegate) credentials (tickets) to the server.
    GSSAPIDelegateCredentials yes
    # Prefer GSSAPI key exchange
    PreferredAuthentications gssapi-keyex,gssapi-with-mic

# All other hosts
Host *

5) Set the right permissions on the file:

chmod 644 ~/.ssh/config
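
The steps above can be sanity-checked with a few lines of shell. The following is only a sketch and not part of the PDC instructions: the function name check_pdc_setup is hypothetical, and it tests nothing more than that the two files named in the steps exist and that KRB5_CONFIG names a real file.

```shell
# Hypothetical helper: report whether the files from steps 1-5 are in place.
# Usage: check_pdc_setup [directory]   (defaults to $HOME/.ssh)
check_pdc_setup() {
  dir="${1:-$HOME/.ssh}"
  status=0
  for f in krb5.conf config; do
    if [ -f "$dir/$f" ]; then
      echo "ok: $dir/$f"
    else
      echo "missing: $dir/$f"
      status=1
    fi
  done
  # KRB5_CONFIG must be set and must name an existing file.
  if [ -n "${KRB5_CONFIG:-}" ] && [ -f "$KRB5_CONFIG" ]; then
    echo "ok: KRB5_CONFIG=$KRB5_CONFIG"
  else
    echo "problem: KRB5_CONFIG is unset or does not name a file"
    status=1
  fi
  return "$status"
}
```

Calling check_pdc_setup with no argument inspects $HOME/.ssh. A non-zero return status means at least one check failed, which is most often a misspelled file name.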

3. Obtain a Kerberos ticket to log into Tegner (or any other PDC supercomputer)

At this point, you need to create a Kerberos ticket using kinit. You will be prompted for the password provided by PDC:

kinit --forwardable YourUsername@NADA.KTH.SE

If you are using a workstation in the laboratory rooms, use pdc-kinit instead of kinit:

pdc-kinit --forwardable YourUsername@NADA.KTH.SE

Kerberos tickets normally expire after 10 hours. You can request a longer lifetime with the -l option, for example a 12-hour ticket:

kinit -l 12h --forwardable YourUsername@NADA.KTH.SE

Replace kinit with pdc-kinit if you are using one of the workstations in the laboratory rooms.
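
Before trying to log in, it can help to confirm that a valid ticket is actually in the credential cache. The sketch below assumes that the installed klist accepts the -s option (a silent test that only sets the exit status), as MIT Kerberos does; check the klist manual page on your system if it behaves differently. The function name have_ticket is hypothetical.

```shell
# Hypothetical helper: succeeds only if the credential cache holds a valid ticket.
# Assumes klist supports -s (silent test); this is an assumption, not a PDC instruction.
have_ticket() {
  if ! command -v klist >/dev/null 2>&1; then
    echo "klist not found; is Kerberos installed?" >&2
    return 2
  fi
  klist -s
}

if have_ticket; then
  echo "Kerberos ticket is valid"
else
  echo "No valid ticket: run kinit (or pdc-kinit) again"
fi
```

Running klist without options lists the tickets together with their expiry times.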

More instructions can be found here.

4. Use ssh from the command line to access Tegner

Use your Kerberos ticket and SSH to connect to Tegner:

ssh YourUsername@tegner.pdc.kth.se

Once again, if you are using a workstation in the laboratory rooms, use pdc-ssh instead of ssh:

pdc-ssh YourUsername@tegner.pdc.kth.se

5. Your Data on PDC supercomputers

When using PDC supercomputers, you deal with two file systems:

AFS (Andrew File System)
Lustre (also called CFS)

Before running your code on PDC supercomputers:

When you log in to a PDC system, you are on AFS. To move to Lustre, you need to change directory.
All files for computations must go on Lustre. If you try to run an executable from AFS, it will fail with a message that the executable file cannot be found.
Large data files used in computations should also be placed on Lustre.

When you log in to Tegner, you will arrive in your PDC AFS directory:

/afs/pdc.kth.se/home/y/yourUsername

There is another file system, called Lustre, that is used to access files on the compute nodes. Your Lustre space is available at:

/cfs/klemming/scratch/<first letter of yourUsername>/yourUsername

/cfs/klemming/nobackup/<first letter of yourUsername>/yourUsername

You can compile your programs in your AFS directory, but you must run your programs from your Lustre space, since this is available on all the compute nodes (whereas your AFS space is not). Note that the Lustre file system is not backed up, so it is a good idea to keep a copy of any important files in your AFS directory.
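
Because the Lustre paths follow a fixed pattern, they can be built from the username. The sketch below only formats the path shown above; the function name lustre_dir is hypothetical, and the username alice is a placeholder.

```shell
# Hypothetical helper: build the Lustre directory for a user from the pattern above.
# Usage: lustre_dir USERNAME [scratch|nobackup]   (defaults to nobackup)
lustre_dir() {
  user="$1"
  area="${2:-nobackup}"
  first="${user%"${user#?}"}"   # first character of the username
  printf '/cfs/klemming/%s/%s/%s\n' "$area" "$first" "$user"
}

lustre_dir alice            # prints /cfs/klemming/nobackup/a/alice
lustre_dir alice scratch    # prints /cfs/klemming/scratch/a/alice
```

On Tegner, a binary built in AFS could then be copied with cp hello.out "$(lustre_dir YourUsername)"/ before it is run.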

6. Compile and run on Tegner

Compile programs using the specified compiler, such as GCC.
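
As an illustration, the commands below build a binary named hello.out with GCC. The tutorial does not show the source of hello.out, so the file hello.c here is a hypothetical stand-in, and the compiler flags are a common choice rather than a course requirement.

```shell
# Write a hypothetical hello.c; the tutorial does not show the real source.
cat > hello.c <<'EOF'
#include <stdio.h>

int main(void) {
    printf("Hello World!\n");
    return 0;
}
EOF

# Compile with GCC if it is on the PATH, naming the binary hello.out.
if command -v gcc >/dev/null 2>&1; then
  gcc -O2 -Wall -o hello.out hello.c
else
  echo "gcc not found; a compiler module may need to be loaded first"
fi
```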

The salloc command allocates an interactive node. The srun command will then launch programs on that node. Assuming the name of the binary is hello.out, to execute this binary on an interactive node, run the following:

cd /cfs/klemming/nobackup/u/username
salloc --nodes=1 -t 01:00:00 -A edu19.DD2360 
srun -n 1 ./hello.out

Here --nodes specifies the number of nodes to allocate, -t specifies the duration of the allocation, and -A specifies the allocation code, which can be found with the "projinfo" command. For information about Slurm options on Tegner, check here.

We might also want to submit a batch job with this binary; an example follows.

First, we'll create a simple script that executes hello.out. A batch job starts in the directory it was submitted from, so create the script in your Lustre directory, next to the binary. Create a file named "myjob.sh" in your /cfs/klemming/nobackup/u/username directory, containing:

#!/bin/bash

# The name of the job is myjob
#SBATCH -J myjob
# Only 1 hour wall-clock time will be given to this job
#SBATCH -t 1:00:00
# The allocation (project) to charge
#SBATCH -A edu19.DD2360
# Number of nodes
#SBATCH --nodes=1
# Write error messages to error_file.e
#SBATCH -e error_file.e

# Run the executable file
# and write the output into hello_output
srun -n 1 ./hello.out > hello_output

Now we will submit this script to the batch system, by executing:

sbatch ./myjob.sh

By using squeue, we can monitor the job in the queue:

JOBID PARTITION     NAME     USER ST       TIME  NODES NODELIST(REASON)
----------------------------------------------------------------------
  252      main    myjob username  R       0:01      1          t01n34
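
Submission and monitoring can also be scripted. The sketch below assumes that the installed Slurm supports squeue -h -j (no header, a single job) and sbatch --parsable (print only the job ID); the function name wait_for_job and the polling interval are illustrative choices, not PDC recommendations.

```shell
# Hypothetical helper: block until job $1 no longer appears in the queue.
# $2 is the polling interval in seconds (default 30).
wait_for_job() {
  jobid="$1"
  interval="${2:-30}"
  # With -h the header is suppressed, so any output means the job is still listed.
  while squeue -h -j "$jobid" 2>/dev/null | grep -q .; do
    sleep "$interval"
  done
  echo "job $jobid has left the queue"
}

# On Tegner one might then run:
#   jobid=$(sbatch --parsable ./myjob.sh)
#   wait_for_job "$jobid"
#   cat hello_output
```

Polling every 30 seconds or so keeps the load on the scheduler low; a queued job can be removed with scancel followed by the job ID.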

After the job has finished, you will find a file named "hello_output" in the directory from which you submitted the job.

cat hello_output

which should give

Hello World!

Changing Compiler and Compilation Environment

To change the compilation environment, load the module for the compiler you want to use. See the following page for more information:

https://www.pdc.kth.se/support/documents/software_development/development.html
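
Environment modules are usually driven with the load and list subcommands. The sketch below assumes the module command is available in your login shell and uses "gcc" purely as a placeholder module name; the names and versions actually installed on Tegner may differ, so check the output of module avail first. The function name use_compiler is hypothetical.

```shell
# Hypothetical helper: load a compiler module, then show what is loaded.
# The default name "gcc" is a placeholder, not a confirmed Tegner module name.
use_compiler() {
  name="${1:-gcc}"
  if ! command -v module >/dev/null 2>&1; then
    echo "module command not available in this shell" >&2
    return 1
  fi
  module load "$name" && module list
}
```

After use_compiler succeeds, the compiler provided by the loaded module is the one found first on the PATH.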

A more detailed presentation on connecting to PDC supercomputers is available here:

IntroductionToPDCsupercomputers.pdf