Examples
This section describes possible workflows in detail. We recommend running the analysis inside a terminal multiplexer (e.g., tmux or screen) so that long runs survive a dropped connection.
Running multiple mapping tools (e.g., STAR, HISAT2 and bbmap)
Make sure you followed the steps described in the setup section carefully.
Before getting started make sure to activate the snakemake conda environment:
conda activate dicast-snakemake
Create the input folder:
cd /path/to/DICAST/
mkdir input
Create the directory structure as in the sample_output:
cd input
mkdir controldir
cd controldir
mkdir fastqdir
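The same layout can also be created in a single call. This is a minimal sketch, assuming the directory names shown above; `make_dicast_input` is a hypothetical helper name and not part of DICAST.

```shell
# Hypothetical helper (not part of DICAST): create the input layout in one call.
# mkdir -p creates missing parent directories and succeeds if they already exist.
make_dicast_input() {
    # $1: DICAST root directory; remaining arguments: subfolders of controldir
    root=$1
    shift
    for sub in "$@"; do
        mkdir -p "$root/input/controldir/$sub"
    done
}

# Usage: make_dicast_input /path/to/DICAST fastqdir
```

Because `mkdir -p` is idempotent, the helper can be re-run safely, for example with an additional `bamdir` argument when BAM input is needed.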
Download or copy the genome fasta file into the input folder. Don’t forget to uncompress it, e.g.:
cd /path/to/DICAST/input
wget http://ftp.ensembl.org/pub/release-105/fasta/homo_sapiens/dna/Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz
gunzip Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz
Download or copy the genome gtf annotation into the input folder. Don’t forget to uncompress it, e.g.:
wget http://ftp.ensembl.org/pub/release-105/gtf/homo_sapiens/Homo_sapiens.GRCh38.105.gtf.gz
gunzip Homo_sapiens.GRCh38.105.gtf.gz
Download or copy the fastq files you want to align into /path/to/DICAST/input/controldir/fastqdir. Note: only paired-end RNA-Seq data is supported, so fastq files must come in pairs.
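Since unpaired files are not supported, it can help to check the folder before starting a run. The sketch below assumes mates are named `<sample>_1.fastq` and `<sample>_2.fastq`; that naming and the helper name `check_fastq_pairs` are assumptions for illustration, so adjust the suffixes to match your files.

```shell
# Hypothetical helper (not part of DICAST): report fastq files without a mate.
# Assumes mates are named <sample>_1.fastq and <sample>_2.fastq.
check_fastq_pairs() {
    # $1: directory holding the fastq files
    status=0
    for r1 in "$1"/*_1.fastq; do
        [ -e "$r1" ] || continue          # glob matched nothing
        r2="${r1%_1.fastq}_2.fastq"
        if [ ! -e "$r2" ]; then
            echo "missing mate for $r1" >&2
            status=1
        fi
    done
    for r2 in "$1"/*_2.fastq; do
        [ -e "$r2" ] || continue
        r1="${r2%_2.fastq}_1.fastq"
        if [ ! -e "$r1" ]; then
            echo "missing mate for $r2" >&2
            status=1
        fi
    done
    return "$status"
}

# Usage: check_fastq_pairs /path/to/DICAST/input/controldir/fastqdir
```

The function returns a non-zero status if any file lacks its mate and prints the offending file names to standard error.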
Go to /path/to/DICAST/scripts and edit config.sh according to your run (see How to change your config.sh file):
cd /path/to/DICAST/scripts
nano config.sh
In the config.sh file edit the following lines:
read_length=76
fastaname=Homo_sapiens.GRCh38.dna.primary_assembly.fa
gtfname=Homo_sapiens.GRCh38.105.gtf
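For scripted setups, the same settings can be changed without opening an editor. This is a sketch only: it assumes each setting sits on its own line in the form `KEY=VALUE`, as in the lines above, and `set_config_value` is a hypothetical helper, not a DICAST command.

```shell
# Hypothetical helper (not part of DICAST): set KEY=VALUE in a config file.
# Assumes each setting sits on its own line in the form KEY=VALUE.
set_config_value() {
    # $1: key, $2: new value, $3: config file
    tmpfile=$(mktemp) || return 1
    awk -v key="$1" -v val="$2" '
        # Rewrite the line that starts with "key="; pass all other lines through
        index($0, key "=") == 1 { print key "=" val; next }
        { print }
    ' "$3" > "$tmpfile" || return 1
    # Write back in place so the permissions of the config file are kept
    cat "$tmpfile" > "$3" && rm -f "$tmpfile"
}

# Usage:
#   set_config_value read_length 76 /path/to/DICAST/scripts/config.sh
#   set_config_value gtfname Homo_sapiens.GRCh38.105.gtf /path/to/DICAST/scripts/config.sh
```

If the key is not present, the file is left unchanged, so it is worth confirming the result with `grep` afterwards. Values containing backslashes are not handled by this sketch.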
List the mapping tools you want to run:
cd /path/to/DICAST/scripts/snakemake/
nano snakemake_config.yaml
In the snakemake_config.yaml file edit the following lines:
Mapping_tools:
    What_tools_to_run: 'star, hisat, bbmap'
In the /path/to/DICAST/scripts/snakemake/ folder run:
snakemake -j 1 -d /path/to/DICAST/input -s Snakefile --configfile snakemake_config.yaml
This command starts the mapping tools listed in snakemake_config.yaml (e.g., STAR, HISAT2 and bbmap).
First, the pipeline builds all necessary Docker images. Second, it creates a /path/to/DICAST/index folder and stores the indexing results there. Finally, it creates a /path/to/DICAST/output folder with the alignment results inside dedicated folders (e.g., star-output, hisat-output, bbmap-output).
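After a run, a quick way to see which tools produced results is to list the per-tool folders. The sketch below assumes results land in folders named `<tool>-output`, as in the examples above; `list_tool_outputs` is a hypothetical helper name.

```shell
# Hypothetical helper (not part of DICAST): print the per-tool result folders.
# Assumes results are written to folders named <tool>-output.
list_tool_outputs() {
    # $1: DICAST output directory, e.g. /path/to/DICAST/output
    for d in "$1"/*-output; do
        if [ -d "$d" ]; then      # skips plain files and an unmatched glob
            basename "$d"
        fi
    done
}

# Usage: list_tool_outputs /path/to/DICAST/output
```

A tool that is missing from the list either was not selected in snakemake_config.yaml or did not get far enough to create its folder.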
Running multiple alternative splicing event detection tools (e.g., MAJIQ and Whippet)
Make sure you followed the steps described in the setup section carefully.
Before getting started make sure to activate the snakemake conda environment:
conda activate dicast-snakemake
Create the input folder:
cd /path/to/DICAST/
mkdir input
Create the directory structure as in the sample_output:
cd input
mkdir controldir
cd controldir
mkdir fastqdir
mkdir bamdir
Download or copy the genome fasta file into the input folder. Don’t forget to uncompress it, e.g.:
cd /path/to/DICAST/input
wget http://ftp.ensembl.org/pub/release-105/fasta/homo_sapiens/dna/Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz
gunzip Homo_sapiens.GRCh38.dna.primary_assembly.fa.gz
Download or copy the genome gtf annotation into the input folder. Don’t forget to uncompress it, e.g.:
wget http://ftp.ensembl.org/pub/release-105/gtf/homo_sapiens/Homo_sapiens.GRCh38.105.gtf.gz
gunzip Homo_sapiens.GRCh38.105.gtf.gz
Download or copy the genome gff3 annotation into the input folder (for MAJIQ). Don’t forget to uncompress it, e.g.:
wget http://ftp.ensembl.org/pub/release-105/gff3/homo_sapiens/Homo_sapiens.GRCh38.105.gff3.gz
gunzip Homo_sapiens.GRCh38.105.gff3.gz
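With three reference files to download, it is easy to leave one compressed. The following sketch flags any `.fa.gz`, `.gtf.gz` or `.gff3.gz` file in the input folder that has no uncompressed counterpart; `check_uncompressed` is a hypothetical helper name and the extension list is an assumption based on the files used in this example.

```shell
# Hypothetical helper (not part of DICAST): flag reference files that are
# still gzipped and have no uncompressed copy next to them.
check_uncompressed() {
    # $1: input directory, e.g. /path/to/DICAST/input
    status=0
    for f in "$1"/*.fa.gz "$1"/*.gtf.gz "$1"/*.gff3.gz; do
        [ -e "$f" ] || continue           # glob matched nothing
        if [ ! -e "${f%.gz}" ]; then
            echo "still compressed: $f" >&2
            status=1
        fi
    done
    return "$status"
}

# Usage: check_uncompressed /path/to/DICAST/input
```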
Download or copy the fastq files you want to use into /path/to/DICAST/input/controldir/fastqdir. Note: only paired-end RNA-Seq data is supported, so fastq files must come in pairs.
Download or copy the bam files you want to use into /path/to/DICAST/input/controldir/bamdir.
Go to /path/to/DICAST/scripts and edit config.sh according to your run (see How to change your config.sh file):
cd /path/to/DICAST/scripts
nano config.sh
In the config.sh file edit the following lines:
read_length=76
fastaname=Homo_sapiens.GRCh38.dna.primary_assembly.fa
gtfname=Homo_sapiens.GRCh38.105.gtf
gffname=Homo_sapiens.GRCh38.105.gff3
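A mistyped file name in config.sh is a common cause of failed runs, so it can be worth confirming that each configured name matches a file in the input folder. This sketch assumes plain, unquoted `KEY=VALUE` lines as shown above; `check_config_files` is a hypothetical helper, not part of DICAST.

```shell
# Hypothetical helper (not part of DICAST): confirm that the files named in
# config.sh exist in the input folder. Assumes unquoted KEY=VALUE lines.
check_config_files() {
    # $1: path to config.sh, $2: input directory
    status=0
    for key in fastaname gtfname gffname; do
        # Take the text after "KEY=" from the first line that starts with it
        value=$(sed -n "s/^${key}=//p" "$1" | head -n 1)
        if [ -z "$value" ]; then
            echo "$key is not set in $1" >&2
            status=1
        elif [ ! -f "$2/$value" ]; then
            echo "$key points to a missing file: $2/$value" >&2
            status=1
        fi
    done
    return "$status"
}

# Usage: check_config_files /path/to/DICAST/scripts/config.sh /path/to/DICAST/input
```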
List the alternative splicing event detection tools you want to run:
cd /path/to/DICAST/scripts/snakemake/
nano snakemake_config.yaml
In the snakemake_config.yaml file edit the following lines:
Alternative_splicing_detection_tools:
    What_tools_to_run: 'majiq, whippet'
In the /path/to/DICAST/scripts/snakemake/ folder run:
snakemake -j 1 -d /path/to/DICAST/input -s Snakefile --configfile snakemake_config.yaml
This command starts the alternative splicing event detection tools listed in snakemake_config.yaml (e.g., MAJIQ and Whippet).
First, the pipeline builds all necessary Docker images. Second, it creates a /path/to/DICAST/output folder with the event detection results inside dedicated folders (e.g., majiq-output, whippet-output).