2 How to Run BCL-Convert on HPC
A short tutorial that shows how to get started with BCL Convert version 4.2.7
. The examples below show how to run bcl-convert
for demultiplexing Illumina’s BCL raw data. This tutorial assumes that BCL Convert is installed in your system and it is fully functional.
2.1 Make BCL-convert samplesheet-v2.csv
for standard bulk DNA/RNA-seq
For standard sequencing runs, like bulk RNA sequencing:
Create the sample sheet file required by BCL-convert.
Populate the file with the corresponding sample information. The final sample sheet should look like this:
[Header] FileFormatVersion,2 FlowCellType,P2 [BCLConvert_Settings] CreateFastqForIndexReads,0 BarcodeMismatchesIndex1,1 [BCLConvert_Data] Sample_ID,index,index2,Sample_Project,Sample_Name sample1_ID,ATCACGTT,,Project_1,sample1 sample2_ID,CGATGTTT,,Project_2,sample2
BarcodeMismatchesIndex2,1
should be added to the[BCLConvert_Settings]
section when using double indexing. Addindex2
barcode sequence.
2.2 Make BCL-convert samplesheet-v2.csv
for 10X Genomics single-cell RNA-seq
Populate the sample sheet with the sample information. Main spread sheet might contain the index names but you can look for the sequences here: https://cdn.10xgenomics.com/raw/upload/v1655151897/support/in-line%20documents/Dual_Index_Kit_TT_Set_A.csv
- Note that
index
andindex2
columns should contain the actual sequences.
- Note that
The final sample sheet should look like this:
[Header] FileFormatVersion,2 FlowCellType,P2 [BCLConvert_Settings] CreateFastqForIndexReads,0 BarcodeMismatchesIndex1,1 BarcodeMismatchesIndex2,1 [BCLConvert_Data] Sample_ID,index,index2,Sample_Project,Sample_Name sample1_ID,TATCAGCCTA,GTTTCGTCCT,Project_1,sample1 sample2_ID,GCCCGATGGA,AATCGTCTAG,Project_1,sample2
Note that
Sample_Project
remains the same given that both samples come from the same sequencing run.
2.3 Run BCL-convert
Assuming BCL Convert is properly installed in your HPC system, you can follow the following workflow:
Create a bash script, i.e.
doDemux_bclconvert.sh
:#!/bin/bash DIR=$1 # First positional argument for the input directory SHEET=$2 # Second positional argument for the sample sheet OUT=$3 # Third positional argument for the output directory # Load the bclconvert installation path module load bclconvert/4.2.7 # bcl-convert command bcl-convert --bcl-input-directory $DIR --output-directory $OUT --sample-sheet $SHEET --shared-thread-odirect-output true --sample-name-column-enabled true --bcl-sampleproject-sub directories true
Run
bcl-convert
# Run the doDemux_bclconvert.sh wrapper sbatch -p priority -c 24 --mem 250000 -t 24:0:0 \ \ # full path for the created bash script doDemux_bclconvert.sh /data/NextSeq1000/runs/run \ # full path for the sequencing run output samplesheet.csv \ # sample sheet located in current directory /path/to/output # full path for output directory
You might need to change some out directory permissions
# Change directory permissions chmod +xr -R output/Project/ # example for a 10X run
Verify a successful BCL-convert run by inspecting the
slurm-[jobid].out
file. Should look like this:Index Read 2 is marked as Reverse Complement in RunInfo.xml: The barcode and UMI outputs will be output in Reverse Complement of Sample Sheet inputs. Sample sheet being processed by common lib? Yes SampleSheet Settings: BarcodeMismatchesIndex1 = 1 BarcodeMismatchesIndex2 = 1 CreateFastqForIndexReads = 0 shared-thread-linux-native-asio output is enabled WARNING: shared-thread-linux-native-asio output could have low performance or hang if output directory is on a distributed file system bcl-convert Version 00.000.000.4.2.7 Copyright (c) 2014-2022 Illumina, Inc. Command Line: --bcl-input-directory /data/NextSeq1000/runs/run --output-directory output --sample-sheet samplesheet.csv --shared-thread-odirect-output true --sample-name-column-enabled true --bcl-sampleproject-subdirectories true Conversion Begins. # CPU hw threads available: 24 Parallel Tiles: 4. Threads Per Tile: 6 SW compressors: 24 SW decompressors: 12 SW FASTQ compression level: 1 Conversion Complete.
2.4 Generate MultiQC report
MultiQC can be used to generate a sequencing report. MultiQC only requires the main output directory of BCL-convert.
Here is an implementation of MultiQC using Singularity containers
A wrapper script for MultiQC, i.e.
multiqc.sh
:#!/usr/bin/bash DIR=$1 singularity exec --bind $DIR:/run /apps/linux/5.4/multiqc/lib/multiqc-1.22.3.sif multiqc --outdir /run/MultiQC/ /run
Run the script on your system
# Run MultiQC /apps/linux/5.4/multiqc/bin/multiqc.sh /path/to/bcl-convert/output