1 Getting started with HPC systems
1.1 Introduction
This is a short and [hopefully] simple tutorial on how to use an HPC cluster to run your analyses. It is intentionally generic, so it does not cover the HPC system of any particular institution. Please contact your HPC system administrator to request access to computing resources.
1.2 What is an HPC cluster?
An HPC cluster is the combination of:
Many individual machines, each referred to as a “node”
Fast shared storage, accessible to all nodes
All interconnected over high-speed networks and/or specialized interconnects
With resource access managed by a scheduler (e.g. Slurm)
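On a Slurm-managed cluster, you can get a first overview of the partitions and nodes that make up the system with the sinfo command (the exact output depends on how your site has configured the cluster):

# List the partitions and nodes managed by the scheduler
user@login1:~$ sinfo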
1.3 Get started with your analyses
A typical workflow on an HPC cluster includes:
Log in: Use SSH from the command line or a web interface to access the cluster
An example of how to log in via SSH from the command line:
$ ssh user@hpc.institution.edu
Transfer Data: Move data from your local computer and/or other sources to the HPC cluster
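For example, you can push files to the cluster with scp or rsync (the hostname and paths below are placeholders matching the SSH example above; your site may also provide a dedicated transfer node or a web-based upload tool):
# Copy a single file to your home directory on the cluster
$ scp target.fasta user@hpc.institution.edu:~/
# Or synchronize a whole directory (only changed files are re-transferred)
$ rsync -avh ./my_project/ user@hpc.institution.edu:~/my_project/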
Find Software: Access existing software from the cluster, download from a remote source, or compile your own code
As an example, you can find and load existing software via module:
# To list the available/installed software
user@login1:~$ module avail
--------------------- /institution/hpcdata/software/rhel9/modules/bio ---------------------------------
alphafold/2.3.2   bwa/0.7.17   guppy/6.0.1   metaxa2/2.2.3
# To load software, e.g. alphafold
user@login1:~$ module load alphafold/2.3.2
# Ready to use alphafold
user@login1:~$ alphafold
Usage: /institution/hpcdata/software/containers/alphafold/alphafold_2.3.2.sh <OPTIONS>
Required Parameters:
-o <output_dir>    Path to a directory that will store the results.
-f <fasta_file>    Path to a FASTA file containing one sequence
...
Prepare Input: Set up the input files needed for the calculation
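For example, the AlphaFold job script below expects a FASTA file named target.fasta; a minimal way to create one on the cluster (the sequence shown is only a placeholder, replace it with your own target):
user@login1:~$ cat > target.fasta << 'EOF'
>target_protein
MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ
EOF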
Prepare Job Script: Create a job script with the commands to run on the cluster. Here is an example of an alphafold.sh script:
#!/bin/bash
#SBATCH --job-name=alphafold
#SBATCH --output=alphafold_%j.out
#SBATCH --error=alphafold_%j.err
#SBATCH --nodes=1
#SBATCH --cpus-per-task=16    # Adjust based on your system and needs
#SBATCH --mem=64G             # Adjust memory as needed
#SBATCH --time=24:00:00       # Adjust runtime as needed
#SBATCH --gres=gpu:1          # Request a GPU

module load alphafold         # Or however you load the alphafold environment

# Example command to run AlphaFold
alphafold run_prediction \
    --fasta_paths=target.fasta \
    --output_dir=output_dir \
    --data_dir=/path/to/alphafold/data \
    --preset=model_1_ptm \
    --max_template_date=2023-12-31

Explanation of important parts:
#SBATCH directives:
--job-name: Name of the job.
--output: Output file.
--error: Error file.
--nodes: Number of nodes.
--cpus-per-task: Number of CPU cores per task.
--mem: Memory allocation.
--time: Maximum runtime.
--gres=gpu: Number of GPUs requested.
module load alphafold: Loads the AlphaFold environment. This will vary depending on your HPC setup.
Submit Jobs: Submit your job script to the scheduler to start the calculation:
user@login1:~$ sbatch alphafold.sh
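If the submission is accepted, sbatch replies with the job ID assigned by the scheduler, e.g.:
Submitted batch job 62847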
Monitor Progress: Check the status of your calculations
user@login1:~$ squeue
  JOBID PARTITION      NAME  USER ST   TIME NODES NODELIST(REASON)
  62847 partition alphafold  user  R  13:05     1 vgpu-2017-001
JOBID: A unique numerical identifier for each job
PARTITION: The name of the partition (queue) where the job is submitted
NAME: The name assigned to the job
USER: The username of the person who submitted the job
ST: The current status of the job: PD Pending, R Running, CD Completed, F Failed, S Suspended
TIME: The amount of time the job has been running
NODES: The number of nodes allocated to the job
NODELIST(REASON): The names of the nodes allocated to the job. If the job is pending, this column may display the reason why it is waiting (e.g. “Resources”, “Priority”, “Dependency”)
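Besides plain squeue, a few other Slurm commands are often useful for monitoring (assuming a standard Slurm installation; sacct requires job accounting to be enabled at your site):
# Show only your own jobs
user@login1:~$ squeue -u $USER
# Show accounting information for a running or finished job
user@login1:~$ sacct -j 62847
# Cancel a job you no longer need
user@login1:~$ scancel 62847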
Analyze Results: Review the results when the job finishes, either on the HPC cluster or back on your local computer, for analysis and visualization.
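For example, from your local computer you can pull the results directory back with rsync (output_dir matches the --output_dir used in the example job script; adjust the hostname and paths to your setup):
$ rsync -avh user@hpc.institution.edu:~/output_dir/ ./output_dir/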