Examples

Here we describe possible workflows in detail. We recommend running the analysis inside a terminal multiplexer, e.g. tmux or screen, so that long runs survive a dropped connection.

Running multiple mapping tools (e.g., STAR, HISAT2 and bbmap)

Make sure you have carefully followed the steps described in the setup section.
Before getting started, activate the snakemake conda environment:

conda activate dicast-snakemake

Create the input folder:

cd /path/to/DICAST/
mkdir input

Create the directory structure as in the sample_output:

cd input
mkdir controldir
cd controldir
mkdir fastqdir
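
The same layout can also be created in a single step. This is a minimal sketch, assuming a POSIX shell; `DICAST_ROOT` is a hypothetical variable introduced here for illustration, and it falls back to a scratch directory so the commands can be tried without touching a real installation:

```shell
# DICAST_ROOT is a hypothetical variable, not part of DICAST itself.
# Set it to your checkout (e.g. /path/to/DICAST); otherwise a scratch
# directory is used so the sketch is safe to try.
DICAST_ROOT="${DICAST_ROOT:-$(mktemp -d)}"

# -p creates missing parent directories and does not fail if they already exist.
mkdir -p "$DICAST_ROOT/input/controldir/fastqdir"

# Show the directories that were created.
find "$DICAST_ROOT/input" -type d
```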

Download or copy the genome fasta file into the input folder. Don't forget to uncompress it. E.g.:

cd /path/to/DICAST/input
wget http://ftp.ensembl.org/pub/release-105/fasta/homo_sapiens/dna/Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz
gunzip Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz

Download or copy the genome gtf annotation into the input folder. Don't forget to uncompress it. E.g.:

wget http://ftp.ensembl.org/pub/release-105/gtf/homo_sapiens/Homo_sapiens.GRCh38.105.gtf.gz
gunzip Homo_sapiens.GRCh38.105.gtf.gz
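
Because a forgotten gunzip is an easy mistake to make, a quick check of the reference files may help before moving on. The sketch below is only an illustration: `check_reference` is a hypothetical helper (not part of DICAST), and the demo runs on toy files in a scratch directory.

```shell
# Sketch of a sanity check for reference files; check_reference is a
# hypothetical helper, not part of DICAST.
check_reference() {
    # usage: check_reference FILE
    if [ ! -s "$1" ]; then
        echo "missing or empty: $1" >&2
        return 1
    fi
    case "$1" in
        *.gz) echo "still compressed, run gunzip: $1" >&2; return 1 ;;
    esac
    echo "ok: $1"
}

# Demo on toy files; for a real setup, pass the files in /path/to/DICAST/input.
DEMO_DIR=$(mktemp -d)
printf '>1 toy sequence\nACGT\n' > "$DEMO_DIR/toy.fa"
printf 'placeholder\n' > "$DEMO_DIR/toy.gtf.gz"

check_reference "$DEMO_DIR/toy.fa"
check_reference "$DEMO_DIR/toy.gtf.gz" || echo "fix the file above before running the pipeline"
```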

Download or copy the fastq files you want to align into /path/to/DICAST/input/controldir/fastqdir. Note: only paired-end RNA-Seq data is supported, so the fastq files have to come in pairs.
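
Since unpaired files are not supported, it can be worth checking the fastq folder before starting a run. The following is a rough sketch under a loud assumption: mates are named `<sample>_1.fastq` and `<sample>_2.fastq`. That naming is assumed here for illustration only, and `check_fastq_pairs` is a hypothetical helper; adjust the suffixes to match your files.

```shell
# Sketch: report fastq files whose mate is missing.
# ASSUMPTION: mates are named <sample>_1.fastq and <sample>_2.fastq.
check_fastq_pairs() {
    # usage: check_fastq_pairs FASTQ_DIR
    missing=0
    for mate in "$1"/*_1.fastq "$1"/*_2.fastq; do
        [ -e "$mate" ] || continue              # pattern matched no file
        case "$mate" in
            *_1.fastq) other="${mate%_1.fastq}_2.fastq" ;;
            *)         other="${mate%_2.fastq}_1.fastq" ;;
        esac
        if [ ! -e "$other" ]; then
            echo "unpaired: $mate (expected $other)" >&2
            missing=$((missing + 1))
        fi
    done
    [ "$missing" -eq 0 ]                        # succeed only if nothing is missing
}

# Demo on empty toy files: sampleB has no second mate.
DEMO_DIR=$(mktemp -d)
touch "$DEMO_DIR/sampleA_1.fastq" "$DEMO_DIR/sampleA_2.fastq" "$DEMO_DIR/sampleB_1.fastq"
if check_fastq_pairs "$DEMO_DIR"; then
    echo "all fastq files are paired"
else
    echo "some fastq files are unpaired"
fi
```
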
Go to /path/to/DICAST/scripts and edit config.sh according to your run (see How to change your config.sh file):

cd /path/to/DICAST/scripts
nano config.sh

In the config.sh file edit the following lines:

read_length=76
fastaname=Homo_sapiens.GRCh38.dna.primary_assembly.fa
gtfname=Homo_sapiens.GRCh38.105.gtf
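
If you prefer not to open an editor, the same three lines can be set from the command line. This is a sketch rather than an official DICAST utility: `set_config_value` is a hypothetical helper, it assumes every setting sits on its own line in the form key=value, and the demo works on a mock file instead of the real config.sh.

```shell
# Sketch: set key=value lines without opening an editor.
# ASSUMPTION: each setting is on its own line as key=value.
set_config_value() {
    # usage: set_config_value FILE KEY VALUE
    tmp=$(mktemp)
    # Replace the whole line that starts with KEY=; other lines pass through.
    # Writing back with cat keeps the permissions of the original file.
    sed "s|^$2=.*|$2=$3|" "$1" > "$tmp" && cat "$tmp" > "$1"
    rm -f "$tmp"
}

# Demo on a mock file; for a real run use /path/to/DICAST/scripts/config.sh
# and keep a backup copy first.
CONFIG=$(mktemp)
printf 'read_length=100\nfastaname=genome.fa\ngtfname=annotation.gtf\n' > "$CONFIG"

set_config_value "$CONFIG" read_length 76
set_config_value "$CONFIG" fastaname Homo_sapiens.GRCh38.dna.primary_assembly.fa
set_config_value "$CONFIG" gtfname Homo_sapiens.GRCh38.105.gtf

cat "$CONFIG"
```

Values containing `|` or `&` would need escaping, because both characters have a special meaning in the sed expression used here.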

List the mapping tools you want to run:

cd /path/to/DICAST/scripts/snakemake/
nano snakemake_config.yaml

In the snakemake_config.yaml file edit the following lines:

Mapping_tools:
    What_tools_to_run: 'star, hisat, bbmap'

In the /path/to/DICAST/scripts/snakemake/ folder run:

snakemake -j 1 -d /path/to/DICAST/input -s Snakefile --configfile snakemake_config.yaml

This command starts the mapping tools listed in snakemake_config.yaml (e.g., STAR, HISAT2 and bbmap).

First, the pipeline builds all necessary Docker images. Second, it creates a /path/to/DICAST/index folder and stores the indexing results there. Finally, it creates a /path/to/DICAST/output folder with the alignment results in dedicated subfolders (e.g., star-output, hisat-output, bbmap-output).
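
After a run, the presence of the per-tool folders can be verified quickly. The sketch below relies only on the `<tool>-output` naming described above; `check_outputs` is a hypothetical helper, and the demo uses a mock folder tree in place of a real DICAST run.

```shell
# Sketch: verify that each tool produced its <tool>-output folder.
# check_outputs is a hypothetical helper, not part of DICAST.
check_outputs() {
    # usage: check_outputs DICAST_ROOT TOOL...
    root=$1; shift
    status=0
    for tool in "$@"; do
        dir="$root/output/$tool-output"
        if [ -d "$dir" ]; then
            echo "found:   $dir"
        else
            echo "missing: $dir"
            status=1
        fi
    done
    return "$status"
}

# Demo on a mock tree in which bbmap has not produced any output yet.
DEMO_ROOT=$(mktemp -d)
mkdir -p "$DEMO_ROOT/output/star-output" "$DEMO_ROOT/output/hisat-output"
check_outputs "$DEMO_ROOT" star hisat bbmap || echo "at least one output folder is missing"
```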

Running multiple alternative splicing event detection tools (e.g., MAJIQ and Whippet)

Make sure you have carefully followed the steps described in the setup section.
Before getting started, activate the snakemake conda environment:

conda activate dicast-snakemake

Create the input folder:

cd /path/to/DICAST/
mkdir input

Create the directory structure as in the sample_output:

cd input
mkdir controldir
cd controldir
mkdir fastqdir
mkdir bamdir

Download or copy the genome fasta file into the input folder. Don't forget to uncompress it. E.g.:

cd /path/to/DICAST/input
wget http://ftp.ensembl.org/pub/release-105/fasta/homo_sapiens/dna/Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz
gunzip Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz

Download or copy the genome gtf annotation into the input folder. Don't forget to uncompress it. E.g.:

wget http://ftp.ensembl.org/pub/release-105/gtf/homo_sapiens/Homo_sapiens.GRCh38.105.gtf.gz
gunzip Homo_sapiens.GRCh38.105.gtf.gz

Download or copy the genome gff3 annotation into the input folder (for MAJIQ). Don't forget to uncompress it. E.g.:

wget http://ftp.ensembl.org/pub/release-105/gff3/homo_sapiens/Homo_sapiens.GRCh38.105.gff3.gz
gunzip Homo_sapiens.GRCh38.105.gff3.gz

Download or copy the fastq files you want to use into /path/to/DICAST/input/controldir/fastqdir. Note: only paired-end RNA-Seq data is supported, so the fastq files have to come in pairs.
Download or copy the bam files you want to use into /path/to/DICAST/input/controldir/bamdir.
Go to /path/to/DICAST/scripts and edit config.sh according to your run (see How to change your config.sh file):

cd /path/to/DICAST/scripts
nano config.sh

In the config.sh file edit the following lines:

read_length=76
fastaname=Homo_sapiens.GRCh38.dna.primary_assembly.fa
gtfname=Homo_sapiens.GRCh38.105.gtf
gffname=Homo_sapiens.GRCh38.105.gff3
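
A mismatch between the names in config.sh and the files actually present in the input folder is easy to overlook. The sketch below cross-checks the two. It assumes unquoted key=value lines as shown above; `config_value` and `check_config_files` are hypothetical helpers, and the demo runs on mock files.

```shell
# Sketch: check that the files named in config.sh exist in the input folder.
# ASSUMPTION: settings are unquoted key=value lines.
config_value() {
    # usage: config_value FILE KEY  (prints the value of the first KEY= line)
    sed -n "s/^$2=//p" "$1" | head -n 1
}

check_config_files() {
    # usage: check_config_files CONFIG INPUT_DIR KEY...
    config=$1 input=$2; shift 2
    status=0
    for key in "$@"; do
        name=$(config_value "$config" "$key")
        if [ -n "$name" ] && [ -f "$input/$name" ]; then
            echo "$key: found $name"
        else
            echo "$key: '$name' not found in $input"
            status=1
        fi
    done
    return "$status"
}

# Demo: the gff3 file named in the mock config is absent from the mock input folder.
DEMO_INPUT=$(mktemp -d)
DEMO_CONFIG=$(mktemp)
printf 'fastaname=toy.fa\ngtfname=toy.gtf\ngffname=toy.gff3\n' > "$DEMO_CONFIG"
touch "$DEMO_INPUT/toy.fa" "$DEMO_INPUT/toy.gtf"
check_config_files "$DEMO_CONFIG" "$DEMO_INPUT" fastaname gtfname gffname || echo "fix config.sh or the input folder"
```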

List the alternative splicing event detection tools you want to run:

cd /path/to/DICAST/scripts/snakemake/
nano snakemake_config.yaml

In the snakemake_config.yaml file edit the following lines:

Alternative_splicing_detection_tools:
   What_tools_to_run: 'majiq, whippet'
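
Since the key `What_tools_to_run` appears under more than one section of snakemake_config.yaml, it can help to print what is configured for a given section before launching. The following is an illustrative sketch using awk instead of a full YAML parser; `list_tools` is a hypothetical helper, and the mock file contains only the two sections shown in this document.

```shell
# Sketch: print the tools listed under one section, one per line.
# list_tools is a hypothetical helper; this is not a general YAML parser.
list_tools() {
    # usage: list_tools YAML_FILE SECTION
    awk -v section="$2" -v quote="'" '
        # A line that starts at column 1 opens a new top-level section.
        /^[^ #]/ { in_section = (index($0, section ":") == 1) }
        in_section && /What_tools_to_run:/ {
            sub(/^[^:]*:[ ]*/, "")      # drop the key
            gsub(quote, "")             # drop the surrounding quotes
            n = split($0, tools, /,[ ]*/)
            for (i = 1; i <= n; i++) print tools[i]
        }
    ' "$1"
}

# Demo on a mock config with the two sections used in these examples.
YAML=$(mktemp)
cat > "$YAML" <<'EOF'
Mapping_tools:
    What_tools_to_run: 'star, hisat, bbmap'
Alternative_splicing_detection_tools:
    What_tools_to_run: 'majiq, whippet'
EOF

list_tools "$YAML" Alternative_splicing_detection_tools
```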

In the /path/to/DICAST/scripts/snakemake/ folder run:

snakemake -j 1 -d /path/to/DICAST/input -s Snakefile --configfile snakemake_config.yaml

This command starts the alternative splicing event detection tools listed in snakemake_config.yaml (e.g., MAJIQ and Whippet).

First, the pipeline builds all necessary Docker images. Second, it creates a /path/to/DICAST/output folder with the event detection results in dedicated subfolders (e.g., majiq-output, whippet-output).