ExpressionQuantifier

Quantifies expression from a BAM file. Currently supports eXpress, Cufflinks, BitSeq, and RSEM. Tool-specific instructions:

Cufflinks: align with TopHat or, if using STAR, include the --outSAMstrandField intronMotif parameter when aligning. Requires a genome reference. Example parameters for Cufflinks: "--no-effective-length-correction --max-bundle-length 10000000 --multi-read-correct --upper-quartile-norm -q".

eXpress: align with Bowtie or, if using STAR, include --quantMode TranscriptomeSAM --quantTranscriptomeBan Singleend. Requires a transcriptome reference, which can be prepared using gffread -w transcripts.fa -g genome.fa annotation.gtf (gffread comes with Cufflinks). Example parameters for eXpress: "--rf-stranded".

RSEM: align with Bowtie or, if using STAR, include --quantMode TranscriptomeSAM --quantTranscriptomeBan IndelSoftclipSingleend. This component only runs the rsem-calculate-expression command, so you need to prepare the reference yourself (using rsem-prepare-reference). As the reference, provide the directory where you created the reference (the referenceFolder input), and give the name you used for the reference in rsem-prepare-reference as the "extra" parameter of this component. Example parameters for RSEM: "-p 8 --paired-end --bam --estimate-rspd --append-names"; example extra: "rsemRef".

BitSeq: align with Bowtie or, if using STAR, use the parameters given above for either eXpress or RSEM (both seem to work; it is not known which set is better). BitSeq runs in two steps: first the parseAlignment command, then estimateExpression. In "parameters" you can define the parameters for the parseAlignment command, and in "extra" the parameters for the estimateExpression step. Example parameters for BitSeq: "--uniform"; example extra: "--outType rpkm".
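The STAR options named above can be collected in one place. The following is a minimal sketch, not part of the component: the function name star_flags_for_tool is hypothetical, and for BitSeq the RSEM option set is picked arbitrarily, since the text allows either the eXpress or the RSEM set.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of quant.sh): prints the extra STAR
# arguments that the instructions above suggest for each quantifier.
star_flags_for_tool() {
    case "$1" in
        cufflinks) echo "--outSAMstrandField intronMotif" ;;
        express)   echo "--quantMode TranscriptomeSAM --quantTranscriptomeBan Singleend" ;;
        rsem)      echo "--quantMode TranscriptomeSAM --quantTranscriptomeBan IndelSoftclipSingleend" ;;
        # The text allows either the eXpress or the RSEM set for BitSeq;
        # choosing the RSEM set here is an arbitrary assumption.
        bitseq)    echo "--quantMode TranscriptomeSAM --quantTranscriptomeBan IndelSoftclipSingleend" ;;
        *)         echo "unknown tool: $1" >&2; return 1 ;;
    esac
}

# Example: show the options to append to a STAR command line for RSEM.
star_flags_for_tool rsem
```

The printed string would be appended to an existing STAR invocation; the rest of that invocation (genome index, read files, threads) is outside the scope of this page.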

Notes: This component has been tested using STAR alignments; it should work with other aligners provided that the reads were aligned as the quantification tool specifies. RSEM requires that the alignment file has the right extension (fastq, fq, bam, sam). Alignments given in the array input are quantified together as a single sample when the tool allows it; the component does not quantify array elements separately, so to quantify different samples use this component in a loop. If you get errors, make sure you are using the correct alignment and the correct set of parameters for each tool.
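The extension requirement for RSEM can be checked before running the component. This is a small sketch under the assumption that a plain file-name check is enough; the function name has_rsem_extension and the file names are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check (not part of quant.sh): succeeds only if
# the file name ends in one of the extensions listed in the note above.
has_rsem_extension() {
    case "$1" in
        *.fastq|*.fq|*.bam|*.sam) return 0 ;;
        *) return 1 ;;
    esac
}

# Example: warn about an input that RSEM is likely to reject.
if ! has_rsem_extension "aligned_reads.txt"; then
    echo "aligned_reads.txt: unexpected extension for RSEM" >&2
fi
```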

Version 2.0
Bundle sequencing
Categories Analysis
Authors Alejandra Cervera (alejandra.cervera@helsinki.fi)
Requires installer (bash)
Source files component.xml quant.sh

Inputs

Name Type Mandatory Description
alignment AlignedReadSet Optional Aligned reads to transcriptome or genome.
reference FASTA Optional Target sequences, either transcripts or genome. For Cufflinks a genome fasta is expected; for eXpress and BitSeq a transcriptome fasta.
referenceFolder BinaryFolder Optional Reference folder, only for RSEM. Include in the extra parameter the name of the reference inside the folder without extension.
annotation GTF Optional Only for Cufflinks; sets the -g/--GTF-guide parameter.
mask GTF Optional Only for Cufflinks; GTF with transcripts to be excluded from the analysis. If this file is provided, the -M parameter is added automatically; do not include it in the parameters below.
array Array<AlignedReadSet> Optional Alignments supplied as an array will be comma separated in the input for eXpress.
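
The comma-separated form used for the array input can be sketched as follows. This is an illustration only: the helper join_by_comma and the file names are hypothetical, and the way quant.sh actually assembles the eXpress call is not shown on this page.

```shell
#!/usr/bin/env bash
# Hypothetical helper: joins its arguments with commas, the form in which
# eXpress accepts several alignment files for one sample.
join_by_comma() {
    local IFS=,
    echo "$*"
}

# Dry run: print (do not execute) an eXpress call for two alignment files
# that belong to the same sample.
hits="$(join_by_comma rep1.bam rep2.bam)"
echo "express --rf-stranded transcripts.fa ${hits}"
```

Because all files end up in a single call, they are quantified as one sample, which is why separate samples need separate runs of the component.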

Outputs

Name Type Description
folder BinaryFolder All files produced by the tool.
genes CSV Gene level expression table, if available (eXpress does not produce one).
isoforms CSV Transcript level expression table. Gene level expression values are not produced by eXpress; for Cufflinks they are available in the output array.

Parameters

Name Type Default Description
extra string "" When using RSEM define here the name given to the reference (using rsem-prepare-reference); for BitSeq define here any parameters wanted in the estimateExpression step.
parameters string "" Define parameters to be passed to the selected tool; do not include any that require an input file defined above.
tool string "" The quantifier to be used: cufflinks, express, rsem or bitseq.
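
How referenceFolder, parameters and extra fit together for RSEM can be sketched as a dry run. This assumes that the reference argument is the folder joined with the reference name and that the standard rsem-calculate-expression argument order (options, input, reference name, sample name) applies; the function rsem_command, the paths and the sample name are hypothetical, and quant.sh may assemble the call differently.

```shell
#!/usr/bin/env bash
# Hypothetical sketch (not taken from quant.sh): prints the
# rsem-calculate-expression call implied by the inputs and parameters.
# Arguments: alignment, referenceFolder, parameters, extra, sample name.
rsem_command() {
    local alignment="$1" reference_folder="$2" parameters="$3" extra="$4" sample="$5"
    # Strip a trailing slash from the folder, then append the reference name.
    echo "rsem-calculate-expression ${parameters} ${alignment} ${reference_folder%/}/${extra} ${sample}"
}

# Dry run with the example values from this page.
rsem_command aligned.bam /data/rsem_index/ \
    "-p 8 --paired-end --bam --estimate-rspd --append-names" rsemRef sample1
```

The command is only printed, so the sketch can be tried without RSEM installed.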

Test cases

Test case: case1_express
IN: alignment, reference (missing: referenceFolder, annotation, mask, array)
OUT: folder, genes, isoforms (all missing)
Properties: tool=express

Test case: case2_cufflinks
IN: alignment, reference, annotation (missing: referenceFolder, mask, array)
OUT: folder, genes, isoforms (all missing)
Properties: tool=cufflinks

Test case: case3_rsem
IN: alignment, referenceFolder (missing: reference, annotation, mask, array)
OUT: folder, genes, isoforms (all missing)
Properties:
tool=rsem
parameters=-p 8 --bam --estimate-rspd --append-names --paired-end
extra=RSEMref

Test case: case4_bitseq
IN: alignment, reference (missing: referenceFolder, annotation, mask, array)
OUT: folder, genes, isoforms (all missing)
Properties:
tool=bitseq
parameters=--uniform
extra=--outType rpkm


Generated 2018-12-18 07:42:25 by Anduril 2.0.0