IMPORTANT: you must never directly contact the University computing helpline for assistance, unless we ask you to! Everything that you need (e.g., filesystems, compute access) has already been pre-configured for you. If you need help, ask in the lab or on the forums.
What you will learn in this part:
- Working with a compute cluster:
- Logging in
- How to run a job
- Skills for working on a remote system
- Where files are stored
Logging in to Eddie in the terminal
We will be using the Edinburgh Compute and Data Facility (ECDF), which provides the “Eddie” compute cluster. Eddie has the GPUs that we need.
Terminology:
Local computer – the computer you are sitting at, perhaps your own laptop or a PPLS lab computer
Login node – a computer that is part of Eddie, which you can log in to remotely
Compute node – a computer that is part of Eddie, which you cannot directly log in to, but can run jobs on. Some nodes contain a GPU, which is required for running the models that we will use in this exercise.
GPU – “graphics processing unit”, a specialised type of processor that performs a large number of computations at the same time (“in parallel”), which makes it well suited to the type of computations needed for neural networks
Job – running a program on one of the compute nodes, by scheduling it
Scheduler – a program running on the cluster that decides which job to run next
Filesystem – a place to store files. There are multiple filesystems which you’ll learn about shortly.
To log in to Eddie, you need to be either on the campus network, or connected to the VPN. In a terminal on your local computer:
ssh s1234567@eddie.ecdf.ed.ac.uk
Here, s1234567 stands for your own username, and the password is your EASE password. This will log you in to one of Eddie’s login nodes. Important: you must never perform any heavy computation on login nodes – they are shared between all users (and they don’t have GPUs).
For convenience, you can add the following to your ssh configuration file on your local computer (usually this is ~/.ssh/config):
Host eddie
    HostName eddie.ecdf.ed.ac.uk
    User s1234567
which will allow you to log in using
ssh eddie
and that will also be convenient when using VS Code.
Running your first job
Remember: do not run substantial jobs directly on a login node! There are two ways to run a job on the cluster:
- Schedule the job (add it to the queue) – this is the way you will use most often
- Request an interactive session on a compute node, then run the job directly – this is useful for development and debugging
Let’s run our first job! Create a shell script that prints “Hello world!” and save it as hello.sh in your home directory. Since this is such a simple script, it’s OK to try running it on the login node to check that it works:
./hello.sh
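If you are unsure how to create the script, here is one possible way to do it from the terminal. This is only a minimal sketch, assuming a bash shell and that you are currently in your home directory. The #!/bin/bash first line is an assumption about which interpreter you want; creating the file in any text editor would work just as well.

```shell
# Write a two-line script to hello.sh (this overwrites any existing file of that name)
cat > hello.sh <<'EOF'
#!/bin/bash
echo "Hello world!"
EOF

# Make the script executable so that ./hello.sh can be run directly
chmod +x hello.sh
```

Without the chmod step, running ./hello.sh would typically fail with a “Permission denied” error.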
Now add the job to the queue:
qsub hello.sh
You will get a message like: Your job 51978720 ("hello.sh") has been submitted
Now, how do we know whether our job is running, or has finished? We can ask the scheduler:
qstat
When run with no arguments, this will list all the jobs that you currently have in the queue. If you get no output, that means nothing is in the queue – so perhaps your job already finished. Since it ran on a compute node, it will not have printed “Hello world!” to our session on the login node. Instead, all output will be placed in a pair of files, one each for stdout and stderr:
hello.sh.e51978720 hello.sh.o51978720
The names for these files are constructed from the name of the job (which defaults to the name of the program or script that you submitted), e or o, and the job number (assigned by the scheduler). They will be saved in whichever directory you were in when you ran qsub.
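The naming rule can be sketched in shell as follows. The job name and number are the example values used above, and the variable names are made up for illustration; substitute the number that qsub reported for your own job.

```shell
# Example values taken from the qsub message above; yours will differ
job_name="hello.sh"
job_id="51978720"

# The two files share the job name and number and differ only in the letter e or o
file_e="${job_name}.e${job_id}"
file_o="${job_name}.o${job_id}"

# Print each file if it exists in the current directory
for f in "$file_e" "$file_o"; do
    if [ -f "$f" ]; then
        echo "== $f =="
        cat "$f"
    else
        echo "$f not found: has the job finished, and is this the directory where you ran qsub?"
    fi
done
```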
Take a look at their contents. Which one contains “Hello world!”? Why?
Next, start learning the useful skills in the following section. Take your time and don’t worry if at first they seem difficult or confusing.

Skills: working on a remote machine
A few clever techniques will make working on a remote machine more convenient. If you find this part difficult or confusing, just take it slowly and keep practising.

Skills: filesystems on ECDF
It's important to understand the differences between the various filesystems. Each is for a specific purpose.
Related forums
Korin’s slides from the first lab session are available in the forums.

