Given an expression matrix and a CSV file containing sample IDs and treatment groups, this component calculates basic statistics for each gene within each treatment group: mean, median, and standard deviation of expression. Visualizations (histogram, density plot, boxplot, hierarchical clustering) are generated if parameter makeVisuals is true.
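
The per-group calculation described above can be sketched as follows. This is a minimal illustration in Python using only the standard library, not the component's own R code in ExpressionStats.r. The toy data, the column names GeneID and Sample, and the function name group_stats are hypothetical. The sketch also assumes that the first column of each file holds the IDs (the documented behaviour when exprID and refID are blank) and that the standard deviation is the sample standard deviation, as R's sd() computes it.

```python
import csv
import io
import statistics
from collections import defaultdict

# Hypothetical toy inputs. "Treatment" is the documented default of refGroup;
# the ID column names (GeneID, Sample) are made up for this example.
EXPR_CSV = """GeneID,s1,s2,s3,s4
geneA,1.0,3.0,10.0,14.0
geneB,2.0,2.0,5.0,9.0
"""
REF_CSV = """Sample,Treatment
s1,control
s2,control
s3,treated
s4,treated
"""


def group_stats(expr_text, ref_text, ref_group="Treatment"):
    """Return {gene: {group: {"mean", "median", "sd"}}} for the given CSV texts."""
    ref_reader = csv.DictReader(io.StringIO(ref_text))
    ref_id = ref_reader.fieldnames[0]  # assumption: first column holds sample IDs
    samples_by_group = defaultdict(list)
    for row in ref_reader:
        samples_by_group[row[ref_group]].append(row[ref_id])

    expr_reader = csv.DictReader(io.StringIO(expr_text))
    expr_id = expr_reader.fieldnames[0]  # assumption: first column holds gene IDs
    stats = {}
    for row in expr_reader:
        per_group = {}
        for group, samples in samples_by_group.items():
            values = [float(row[sample]) for sample in samples]
            per_group[group] = {
                "mean": statistics.mean(values),
                "median": statistics.median(values),
                # Sample standard deviation; needs at least two samples per group.
                "sd": statistics.stdev(values),
            }
        stats[row[expr_id]] = per_group
    return stats


stats = group_stats(EXPR_CSV, REF_CSV)
for gene, per_group in stats.items():
    for group, values in per_group.items():
        print(gene, group, values)
```

Each gene ends up with one set of three statistics per treatment group, which corresponds to one row of the stats output with three columns per group.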

Version 0.1
Bundle sequencing
Categories Expression smallRNA
Authors Katherine Icay (katherine.icay@helsinki.fi), Alejandra Cervera (alejandra.cervera@helsinki.fi)
Requires R ; MASS (R-package) ; ggplot2 (R-package) ; reshape (R-package)
Source files component.xml ExpressionStats.r
Usage Example with default values


Inputs

Name Type Mandatory Description
expr CSV Mandatory Expression matrix. If the matrix includes an additional bio_type annotation column, parameter biotype should not be "skip".
ref CSV Mandatory CSV file containing sample names and treatment groups. Sample names must match column names of expr.
geneSet CSV Optional Two-column list of genes of interest (gene ID, gene name) used to create a heatmap. The ID column must have the same name as the column defined in parameter exprID.
bodyMap CSV Optional Illumina Body Map. CSV file of gene IDs (rows) per tissue (columns). The IDs must be of the same type as those in expr (e.g. Ensembl gene ID).
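
Because the sample names in ref must match the column names of expr, it can help to check the two files before running the component. The following is a minimal sketch in Python using only the standard library. The function name missing_samples and the toy data are hypothetical, and the sketch assumes that the first column of expr holds the gene IDs and that a blank ref_id means the first column of ref, mirroring the documented behaviour of exprID and refID.

```python
import csv
import io


def missing_samples(expr_text, ref_text, ref_id=""):
    """Return the sample names listed in ref that are not columns of expr."""
    # Header row of the expression matrix; the first entry is assumed
    # to be the gene ID column, the rest are sample columns.
    expr_header = next(csv.reader(io.StringIO(expr_text)))
    sample_columns = set(expr_header[1:])

    ref_reader = csv.DictReader(io.StringIO(ref_text))
    id_column = ref_id or ref_reader.fieldnames[0]  # blank means first column
    return [row[id_column] for row in ref_reader
            if row[id_column] not in sample_columns]


# Hypothetical toy data: sample s3 is listed in ref but absent from expr.
expr_text = "GeneID,s1,s2\ngeneA,1.0,2.0\n"
ref_text = "Sample,Treatment\ns1,control\ns3,treated\n"
unmatched = missing_samples(expr_text, ref_text)
print(unmatched)
```

An empty list means every sample in ref has a matching column in expr.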


Outputs

Name Type Description
stats CSV One row per gene, with columns containing the mean, median, and standard deviation of its expression for each treatment group.
statsArray Array<CSV> Additional statistics (topExpressed, topExpressedByTissue, zero, low, high, and summary) when makeVisuals is true and biotype is not skipped.
report Latex Produced only if parameter makeVisuals is true. LaTeX report containing the expression visualizations, which use the additional inputs geneSet and bodyMap.


Parameters

Name Type Default Description
biotype string "skip" Whether the input expr file includes a bio_type annotation. Either blank "" (use the annotation) or "skip". The default is to skip this step.
biotype_min float 0 When biotype is blank "", this is the minimum expression value to consider when calculating additional statistics.
bodySite string "body" Any body tissue from the Illumina Body Map, e.g. heart, stomach, or brain.
exprID string "" Column name of expr file containing the unique gene IDs that statistics are calculated for. If blank, the first column is used.
makeVisuals boolean false Generate visualizations and perform additional analysis of geneSet. When true, geneSet and bodyMap should be defined.
refGroup string "Treatment" Column name of ref file containing the treatment information corresponding to each sample ID. If blank, the component will look for a "Treatment" column.
refID string "" Column name of ref file containing the reference names to match to the expr columns. If blank, the first column is used.
topNum int 10 Number of top genes to be reported.

Test cases

Test case Parameters IN
