Post-processing tool for visualizing fusion transcripts.

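The component takes as inputs a list of fusions (fusionList), an Ensembl GTF annotation file (annotations), and single-end or paired-end read files (reads, mates).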

The component creates two virtual genome references. The reads are then aligned to both virtual references using the STAR aligner (https://github.com/alexdobin/STAR).

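The component outputs a folder containing the virtual references (virtual_ref1.fa, virtual_ref2.fa), their annotations (annotation1/2.gtf, annotation1/2.gff3), and the sorted alignments (align1/2.sorted.bam).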

The output can be visualized by loading these files in the IGV browser (https://www.broadinstitute.org/igv/). In IGV, choose "Genomes->Load Genome from File..." and select either virtual_ref1.fa or virtual_ref2.fa; then choose "File->Load from file..." and select the matching annotation1/2.gtf, annotation1/2.gff3, and align1/2.sorted.bam.
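
The manual loading steps above could also be scripted with IGV batch commands. The following is a minimal sketch, not part of the component: the function name `igv_batch_script` is hypothetical, and it assumes the files named above sit directly in the output folder. It only generates the text of a batch script using the IGV batch commands `genome` and `load`.

```python
def igv_batch_script(folder, index):
    """Return IGV batch commands that load one virtual reference and its tracks.

    folder: path of the component's output folder (assumed flat layout).
    index:  1 or 2, selecting virtual_ref1.fa or virtual_ref2.fa.
    """
    if index not in (1, 2):
        raise ValueError("index must be 1 or 2")
    lines = [
        # Load the virtual reference as the genome.
        f"genome {folder}/virtual_ref{index}.fa",
        # Load the matching annotation and alignment tracks.
        f"load {folder}/annotation{index}.gtf",
        f"load {folder}/annotation{index}.gff3",
        f"load {folder}/align{index}.sorted.bam",
    ]
    return "\n".join(lines) + "\n"


# "output_folder" is a placeholder path.
print(igv_batch_script("output_folder", 1))
```

The printed text can be saved to a file and run as an IGV batch script. Note that IGV expects an index file next to each BAM file.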

Install the Ensembl API following the instructions at http://www.ensembl.org/info/docs/api/api_git.html. Also install bioperl-run (https://github.com/bioperl/bioperl-run).

Version 1.1
Bundle sequencing
Categories Alignment
Authors Gabriele Partel (gabrielepartel@gmail.com)
Issue tracker View/Report issues
Requires BioPerl ; Ensembl Perl API ; STAR ; Samtools ; installer (bash)
Source files component.xml fusionvisualizercomp.sh
Usage Example with default values


Name Type Mandatory Description
fusionList CSV Mandatory Tab-separated file with four columns: a header line with the column names, followed by one line per fusion giving the IDs of the genes involved and the fusion breakpoints in 1-based coordinates. The column names must be passed as parameters.
annotations GTF Mandatory Ensembl GTF annotation file.
reads FASTQ Mandatory Single-end reads file, or the first file (reads) of a paired-end pair. Allowed file formats: .fa, .fq, .fq.gz.
mates FASTQ Optional Second file (mates) of a paired-end pair; leave empty for single-end data. Allowed file formats: .fa, .fq, .fq.gz.
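
To illustrate the fusionList layout described above, here is a minimal Python sketch that reads such a file and checks it. This is not the component's own parser: the function name `read_fusion_list` is hypothetical, the gene IDs in the example are placeholders, and the default column names are taken from the parameter defaults (gene5p_id, gene5p_bp, gene3p_id, gene3p_bp).

```python
import csv
import io


def read_fusion_list(handle, gene5p_id="gene5p_id", gene5p_bp="gene5p_bp",
                     gene3p_id="gene3p_id", gene3p_bp="gene3p_bp"):
    """Parse a tab-separated fusionList and return one dict per fusion."""
    reader = csv.DictReader(handle, delimiter="\t")
    required = [gene5p_id, gene5p_bp, gene3p_id, gene3p_bp]
    missing = [c for c in required if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError("fusionList is missing columns: " + ", ".join(missing))
    fusions = []
    # Data lines start at line 2 because line 1 is the header.
    for line_no, row in enumerate(reader, start=2):
        bp5, bp3 = int(row[gene5p_bp]), int(row[gene3p_bp])
        # Breakpoints are 1-based, so 0 or negative values are invalid.
        if bp5 < 1 or bp3 < 1:
            raise ValueError(f"line {line_no}: breakpoints must be >= 1")
        fusions.append({"gene5p_id": row[gene5p_id], "gene5p_bp": bp5,
                        "gene3p_id": row[gene3p_id], "gene3p_bp": bp3})
    return fusions


# Placeholder content; real files would hold actual gene IDs and coordinates.
example = io.StringIO(
    "gene5p_id\tgene5p_bp\tgene3p_id\tgene3p_bp\n"
    "GENE_ID_5P\t1000\tGENE_ID_3P\t2000\n"
)
fusions = read_fusion_list(example)
print(fusions)
```

If the file uses different header names, they can be passed as keyword arguments, mirroring how the component receives column names as parameters.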


Name Type Description
folder BinaryFolder Output folder containing the output files


Name Type Default Description
as string "0.95" STAR --outFilterScoreMinOverLread parameter: an alignment is output only if its score, normalized to the read length, is higher than or equal to this value.
gene3pBP string "gene3p_bp" Name of the fusionList column holding the breakpoint coordinate of the 3' partner gene.
gene3pID string "gene3p_id" Name of the fusionList column holding the ID of the 3' partner gene.
gene5pBP string "gene5p_bp" Name of the fusionList column holding the breakpoint coordinate of the 5' partner gene.
gene5pID string "gene5p_id" Name of the fusionList column holding the ID of the 5' partner gene.
release string "84" Ensembl release number.
threads string "8" STAR --runThreadN parameter: number of threads used by STAR.
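
As an illustration of how the `as` and `threads` parameters map onto STAR options, the sketch below assembles a STAR argument list in Python. It is an assumption-laden example, not the command the component actually runs: the function name `build_star_command`, the genome directory, the read file names, and the output prefix are all hypothetical. Only standard STAR options are used (`--runThreadN`, `--genomeDir`, `--outFilterScoreMinOverLread`, `--outFileNamePrefix`, `--readFilesIn`, `--readFilesCommand`).

```python
import shlex


def build_star_command(genome_dir, reads, mates=None,
                       min_score_over_lread="0.95", threads="8",
                       out_prefix="align1."):
    """Build a STAR argument list without executing it."""
    cmd = [
        "STAR",
        "--runThreadN", str(threads),
        "--genomeDir", genome_dir,
        "--outFilterScoreMinOverLread", str(min_score_over_lread),
        "--outFileNamePrefix", out_prefix,
        "--readFilesIn", reads,
    ]
    # For paired-end data the mates file follows the reads file.
    if mates:
        cmd.append(mates)
    # Gzipped input must be decompressed on the fly.
    if reads.endswith(".gz"):
        cmd += ["--readFilesCommand", "zcat"]
    return cmd


# Placeholder paths; the genome directory would hold a STAR index
# generated from one of the virtual references.
cmd = build_star_command("virtual_ref1_index", "reads.fq.gz", "mates.fq.gz")
print(shlex.join(cmd))
```

Building the command as a list keeps the arguments safe to pass to `subprocess.run` without shell quoting issues.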

Test cases

Test case Parameters IN
case1 (missing) fusionList annotations reads mates (missing)

