ACGHnorm microarray Normalizes two-channel array CGH data with LOESS, median or mode.
ACGHsegment microarray Segments array CGH data with the Circular Binary Segmentation (CBS) algorithm and produces hard or soft copy number aberration calls.
AdaptorRemoval sequencing Removes tags from the reads in fastq/fasta files.
AffyReader microarray Imports gene expression or exon array data from Affymetrix CEL-files.
AgilentReader microarray Imports data from microarray text files such as Agilent CSV files.
AligFilterFlag sequencing Filters alignments in SAM files using the flag field.
AligFilterMatches sequencing Allows filtering of alignments in BAM/SAM files, mainly output from Tophat.
Align sequencing Aligns sequencing reads to the reference genome using the Burrows-Wheeler Aligner with the mem algorithm.
AlleleCounter microarray Calculates genotype frequencies of a SNPMatrix.
AlphaShape anima Takes a list of points and returns the alpha shape.
AnnotateDrugs dsrt Annotate drug concentrations.
Annovar sequencing The component annotates variants using Annovar (table_annovar).
AnnTools sequencing Runs one of the AnnTools variant annotator suite's scripts.
Ant tools Anduril wrapper for Apache Ant.
Array2CSV builtin Reads an array, and returns a CSV with Key and File columns.
Array2Folder builtin Transforms an array into a physical folder.
ArrayCombiner builtin Combines multiple arrays into one using set operations on array keys.
ArrayConstructor builtin Constructs an array from atomic elements.
ArrayExtractor builtin Extracts elements from an array.
ArraySplit builtin Divides array keys evenly into N arrays.
ArrayTransformer builtin Transform arrays by removing elements, renaming keys or sorting the elements.
Ascat microarray Runs allele-specific copy number analysis with ploidy and tumor purity estimation using the tool ASCAT.
AttributeVisualizer microarray Visualizes multidimensional attributes using a heat map together with a clustering dendrogram.
AutoGating flowand Automatically gate cell populations.
BackgroundSubtract anima Performs background subtraction for a directory of images with various methods.
BackSPIN tools Biclustering of gene expression data using the BackSPIN algorithm.
Bam2Counts sequencing Converts Bam files to count matrices summarizing on gene, transcript or exon level.
Bam2Fastq sequencing Reverts a BAM file to a FASTQ file for realignment.
BamCombiner sequencing Merges multiple BAM files using Picard MergeSamFiles.
BashEvaluate tools Executes a Bash script that generates any output from the given data or other inputs.
BayesianBiclustering tools Perform Bayesian Biclustering on a data matrix.
BFConvert anima Wrapper for the Bioformats command line tool for converting exotic image formats.
BFMetadata anima Wrapper for the Bioformats command line tool for extracting metadata from exotic image formats.
BiclustClusterer tools TODO.
BICseq sequencing Detects copy number alterations from whole-genome sequencing data using the BICseq tool.
BiomartAnnotator microarray Fetches attributes with given filters using BioMart.
BismarkAlign sequencing Performs alignment of bisulfite sequencing data in addition to deduplication and methylation extraction using Bismark Bisulfite Mapper.
Blast sequencing Basic local alignment search tool.
BlobDraw anima Visualizes the blobs found in a directory of images.
BlobFeatures anima Reads the blobs found in a directory of images and produces intensity features. The component requires that all the columns from ImageBlobDetect are still present in the data file.
BoxPlot tools Creates box plots for numeric data.
BreakpointVisualizer sequencing Visualizes genomic breakpoints resulting from copy-number aberrations (amplifications and deletions) and chromosomal rearrangements (translocations and inversions).
BSAlign(Deprecated) sequencing Aligns BS or RRBS data using the BSMAP software, version 2.74.
CellGrowthAssay dsrt Compute the plate statistics for the plate survival assays and for dummy plates.
CellProfiler anima Runs a pipeline in CellProfiler, for a folder or several folders of images.
CGHubDownload sequencing Executes CGHub's GeneTorrent package's utilities.
CircosPlot microarray Draws circos plots from genomic data.
ClassifierPerformance tools Calculates popular classifier performance values based on actual classes, and predicted values.
ClickImage anima Asks the user to click points on a folder of images and writes a CSV table of the coordinates.
ClusterChooser flowand Graphical tool for cluster population selection.
ClusterFiltering flowand Filters columns and/or rows by clusters using flexible criteria.
ClusterPlotCombiner flowand Creates scatter, line or bar plots where Y and X coordinates come from CSV files.
ClusterReorder tools Re-orders cluster ID numbers based on a given data vector.
ClusterReport tools Performs a hierarchical clustering of samples.
CN2GECollection microarray Encapsulates three algorithms with similar inputs for copy number to RNA expression integration.
CNACaller microarray Calls significant copy number alterations from a group of segmented copy number alteration profiles.
CNGEIntegrator microarray Performs one of two possible nonparametric tests for the detection of copy number induced differential gene expression.
CombinationCounter tools Counts the occurrences of combinations of two labels.
CombinationMatrix dsrt Compute the plate statistics for the plate survival assays and for dummy plates.
CompareImage tools Compare images.
ControlsQCs dsrt Compute the plate statistics for the plate survival assays and for dummy plates.
ConvertImage anima Reads various image formats with LOCI Bioformats library (
CopyOutput builtin Copies any input to a user selected path.
CorrelationReport tools Creates a correlation report that shows how much columns of a numeric matrix correlate with each other.
CreateSQLiteDB sequencing Create miRNA-target gene database(s) for fast and simple SQLite query of potentially interesting regulatory pairs.
CreateSQLiteIDs sequencing For a subset of genes, create a file with additional reference IDs.
CSV2Array builtin Reads a CSV, and returns an array based on selected columns.
CSV2Excel tools Converts a file from CSV format to Excel 97 format.
CSV2FASTA microarray Converts the given comma separated value file to a FASTA file.
CSV2GraphML tools Converts a matrix or CSV representation of a graph to GraphML.
CSV2IDList tools Extracts columns from each given CSV file and prints their content out without duplicates.
CSV2Image anima Create an image visualization from CSV file.
CSV2JSON tools Transforms a CSV file to JSON format.
CSV2Latex tools Converts the given comma separated value file to a LaTeX table.
CSV2SetList tools Extracts identifier columns from the given CSV files and prints them out without duplicates.
CSVCleaner tools Cleans up CSV outputs, optionally removes file headers, quotations and unused columns, and can reorder and rename columns.
CSVColRename tools Changes column names in a CSV file given a reference file.
CSVDplyr tools Applies up to R functions (including dplyr functions) to the input csv1.
CSVFilter tools Filters columns and/or rows from CSV files using flexible criteria.
CSVJoin tools Joins rows from two or more CSV files from all the inputs, optionally using one column as a matching key.
CSVListFilter flowand Filters CSV files in a given folder to extract specific rows according to the output of the AutoGating component.
CSVListJoin tools Combines the content of input files into a single CSV file.
CSVSort tools Sorts and merges CSV files.
CSVSplit tools Divides the content of the given CSV to an array of multiple CSV files.
CSVSplitColumns tools Splits CSV into new files according to columns.
CSVSummary tools Summarises row values of a file according to a column label, or summarises the whole file into a single row.
CSVTransformer tools Transforms CSV files using R expressions.
Cuffcompare(Deprecated) sequencing Cufflinks includes a program that you can use to help analyze the transfrags you assemble.
Cuffdiff(Deprecated) sequencing Cufflinks includes a program, "Cuffdiff", that you can use to find significant changes in transcript expression, splicing, and promoter use.
Cuffmerge(Deprecated) sequencing Cufflinks includes a script called cuffmerge that you can use to merge together several Cufflinks assemblies.
DEan sequencing Performs differential expression analyses in R between pairs of sample groups using up to 4 methods: DESeq, DESeq2, EdgeR, and upper-quartile normalization with t-test.
DEE2DEG microarray Find differentially expressed genes (DEG) based on differentially expressed exons (DEE).
DEGReport microarray Creates a LaTeX report on differentially expressed genes (DEGs).
DESeqExpr sequencing Provides a gene expression matrix using DESeq.
DiffMeth(Deprecated) sequencing Function to compute differential methylation at a base/window/region resolution.
Directed2Undirected tools Converts a directed graph into undirected graph.
DMML sequencing Identifies differentially methylated sites in tumor samples with varying, unknown tumor cell fraction using a maximum-likelihood method.
DSRTplots dsrt Plots the inhibition curves for each screening experiment, combines plots for different pretreatment times, and computes the scores indicating the efficacy of each drug.
DuplicateMarker sequencing Marks PCR duplicates using Picard.
DuplicateQuality microarray Estimates microarray chip quality by printing expression distributions for duplicate probes.
EdgeR sequencing Computes the digital expression fold change and p-value based on counts.
Ellipse2Mask anima Converts a CSV expression of ellipses into masklist (and mask) image format.
EncodeTextFile tools Convert general text files from one character encoding to another.
EnsemblDNA microarray Fetches DNA sequences from the Ensembl database.
EnsembleAttributeSelection tools Feature selection with multiple algorithms.
EntrezAnnotator microarray Retrieves database records from NCBI Entrez, including gene annotation, PubMed references and sequences.
Excel2CSV tools Converts a Microsoft Excel sheet to a CSV file.
Excel2Text tools Converts an Excel 97 xls-sheet to a text file.
ExclusiveCombiner tools Binds one set of inputs to the output ports.
ExonAnnotator microarray Maps gene, transcript or exon Ensembl IDs to one another via the Ensembl query interface.
ExonExpression microarray Converts exon expression matrix for a list of exons of interest.
ExonToPeptide microarray Fetches exon-peptide sequences of query exon list.
ExpandCollapse tools Converts between the two possible representations of a relation with multivalued columns.
ExperimentSetup microarray Creates a report that displays all sample groups using a table and a graph that visualizes the sample groups and their relations.
ExpExpIntegration microarray Integrates gene/transcript/other expression data with binary explanatory data derived in some way.
ExpressionExtremes microarray Selects differentially expressed genes using expression values as criterion.
ExpressionQuantifier sequencing Quantifies expression from a bam file.
ExpressionStats sequencing Given an expression matrix and a CSV containing sample IDs and treatment groups, calculate basic statistics by treatment group: mean, median, and standard deviation of genes.
ExpressionSubset sequencing Given an expression matrix and a CSV containing a subset of interest in one column (e.g.
ExprMethylCGH microarray Integrates expression data with methylation and CGH data by using two label matrices of ones and zeros.
ExprMixtureModel microarray Provides a mixture model fit of two normal distributions for the given genes.
ExtensiveCSVJoin tools Combines and cleans many files.
ExtractCond sequencing Creates a custom input file defining treatment groups for MDSPlot component using expression matrices and a reference file.
FASTA2CSV sequencing Converts a FASTA file to CSV.
FastQC sequencing Short reads quality control.
FastQScreen sequencing FastQScreen allows you to screen a library of sequences in FastQ format against a set of sequence databases, for example vectors, virus or ribosomal RNA, so you can see if the composition of the library matches with what you expect.
FCSReader flowand FCSReader imports an FCS file (standards 2.0 and 3.0 supported) and converts it into a tab-delimited CSV file.
Fiji anima Executes a Jython script under Fiji (ImageJ) that generates an image directory from given image directories.
FijiFeatures anima Uses Fiji to extract features from input images, segmented with masks.
FillNA tools Fills NA values with linear interpolation or previous value.
FlowClustModeling flowand Modeling FACS data using flowClust library.
FlowMeans flowand Clusters rows in CSV files.
FoldChange microarray Selects differentially expressed genes using fold change as criterion.
Folder2Array builtin Constructs an array from folder contents.
FolderCleaner tools Renames a folder of files, by leaving only the last extension.
FolderCombiner tools Copies files from input folders to output folder.
FolderExtractor tools Extracts files from a folder.
FolderSplit tools Divides files evenly from a folder into N folders in an array.
FSomMetaClustering tools Creates self-organizing maps of cytometry data with the FlowSOM algorithm and meta-clusters the data based on 100 fSOM-maps with consensus clustering.
FusionAnalyzer sequencing Post-processing tool for gene fusion analysis.
FusionCaller sequencing Gene fusion detection.
FusionGenes(Deprecated) sequencing Allows filtering of alignments in BAM/SAM files, mainly output from Tophat.
FusionVisualizer sequencing Post-processing tool for fusion transcripts visualization.
GatingSummary flowand Generates a summary report table and heatmap.
GeneCount microarray Produces a LaTeX report that shows the number of genes or features under analysis in various phases of an Anduril script.
GeneInfo microarray Prints detailed information on single genes.
GeneTable microarray Compiles gene ID sets into CSV files that contain gene annotations and expression.
GenomeSlider sequencing Summarizes genomic events at sliding window intervals and/or at predefined regions.
GenomeSpy microarray Generates an RMarkdown report, which displays genomic regions or copy number segmentations using GenomeSpy. This component exposes a simplified GenomeSpy API and is limited to displaying only one type of data at a time.
GenomicRearrangement sequencing Calls genomic rearrangements.
GenotypeComparator microarray Compares the allele and genotype differences between two sets of genotype frequencies.
GetReads sequencing Visualizes reads that are aligned to a given position.
GetUtrSeq sequencing Fetch UTR sequences of query transcripts.
GlobalCorrelation sequencing Compute correlation between miRNA and gene/transcript expression.
GMLVQClassifier tools Trains a classifier based on the given sample data, or predicts with a classifier trained earlier.
GNUPlot tools Executes a GNUPlot script that generates any output from the given data or other inputs.
GOClustering microarray Clusters genes based on their gene ontology (GO) annotations.
GOEnrichment microarray Computes enriched GO terms in a set of genes or proteins.
GOFilter microarray Filters gene lists based on Gene Ontology annotations.
GOProbabilityTable microarray Creates custom a priori probability tables for GO similarity and enrichment analysis.
GOSearch microarray Filters gene lists based on Gene Ontology annotations.
GraphAnnotator tools Inserts or extracts attributes from GraphML files using CSV files.
GraphMetrics tools Computes various graph metrics for a GraphML file.
GraphSplitter tools Splits a graph into subgraphs that are either connected or strongly connected (case of directed graph) components.
GraphVisualizer tools Creates a visualization of a graph using Graphviz.
GroupFiles tools Produces a CSV annotated by parsing path names containing named groups.
GSEAAnalyzer microarray Performs Gene Set Enrichment Analysis by using category definitions from KEGG or GO.
GSVDIntegrator microarray Extracts variation patterns from two matrices based on the generalized singular value decomposition.
GTFParser sequencing Extracts field contents from a GTF file to output a CSV file.
GUIInput builtin Asks user for inputs.
HazardRatio sequencing Calculates incidence rates, hazard ratios, their CIs and P values.
HeatMapReport tools Clusters samples hierarchically and draws the corresponding dendrograms and heat map.
HistogramPlot_FrequencyCalculator tools Calculates frequencies from data to be used in a HistogramPlot.
HTMLCombiner tools Combines a HTML page from parts.
HTMLExtractor tools Extracts HTML parts from a site.
HTMLImages tools Creates a static list of images to a web page.
HTMLReport tools Visualizes CSV files and their relationships using statically generated HTML files.
HTMLTable tools Creates an interactive HTML table of input data.
HTSeqCount sequencing Quantify reads mapped to EnsemblID transcripts for a given sample.
HTSeqExprMatrix sequencing Takes a CSV list of outputs from HTSeqCount to merge into an expression matrix.
HTSeqExprNormalize sequencing Takes as input a "raw" integer count expression matrix of quantified sequenced reads and outputs a scale-normalised, log-transformed expression matrix.
IDConvert tools Converts IDs from a column in a CSV file to other type of IDs and/or combines duplicated rows into one.
IDDistribution tools Extracts one column from the given CSV file and prints the frequencies of its values.
IlluminaAnnotator microarray Fetches genomic annotations for Illumina methylation and expression array probes.
IlluminaNormalization microarray Performs between array normalization for an input matrix.
IlluminaPlot microarray Produces quality control plots.
IlluminaReader microarray Imports gene expression data from Illumina BeadStudio output files.
Image2CSV anima Create a CSV file from image file.
ImageGallery anima Creates a web gallery from given folders of images and optional annotation CSV files.
ImageList2MaskList anima Convert from ImageList type to MaskList data type.
ImageList2Video anima Uses either ImageMagick or avconv to convert a list of images into a video file.
ImageLocalMaxima anima Finds local maxima centers in a grayscale intensity image.
ImageMagick anima A versatile wrapper for using ImageMagick convert, for folder(s) of images.
ImageSummary anima Summarises a directory of images, with one line for each image.
INPUT builtin Imports an external input file or directory to the pipeline.
IntensityCorrelation anima Extracts the intensity correlation features (Co-Localization) from two image channels.
IntensityFeatures anima Extracts the intensity features from segmented image mask and gray scale image.
ISABiclust tools Performs biclustering using the iterative signature algorithm package.
iSeq sequencing Wraps the iSeq R package that implements the methods described in "A fully Bayesian hidden Ising model for ChIP-seq data analysis" by Qianxing Mo.
JASPARMotif microarray Writes JASPAR motif matrices into a MotifSet directory.
JCSVJoin tools Java implementation of R CSVJoin component.
KaplanMeier tools Produces a Kaplan-Meier plot representing survival estimates based on the given data.
KaplanMeierPlot tools Plots Kaplan Meier survival plot using the survival and survminer R packages.
KeywordMatcher microarray Identifies gene/protein IDs using fuzzy keyword matching between gene aliases and descriptions.
KGML2GraphML microarray Converts a KEGG pathway (KGML) to a GraphML file.
KorvasieniAnnotator microarray Converts gene, transcription, and translation identifiers using Korvasieni.
Labels2Columns tools Reads columns of data, the unique entries of the label column are transposed as column names.
LatexAttachment tools Constructs a LaTeX fragment that includes attachments for the given files.
LatexCombiner tools Combines several LaTeX fragments into one LaTeX document.
LatexPDF tools Compiles a LaTeX document into Portable Document Format (PDF).
LatexTemplate tools Creates LaTeX header and footer files that are inserted to the beginning and the end of the LaTeX report.
LiftOver microarray Converts chromosome region coordinates from one genome build to another.
LimmaNormalizer microarray Normalizes two-channel and single-channel microarrays using the Bioconductor Limma package.
LimmaStat microarray Selects differentially expressed probesets, exons or transcripts from probeset, exon or transcript expression data using the limma statistical test.
LinearBinner tools A very basic clustering-like method.
LinearNormalizer tools Normalizes the input matrix column-wise.
LineFeatures anima Measure LinkedList segments as objects.
LocationVariation microarray Gets variation from Ensembl from a genomic location.
MACS microarray Finds peaks from aligned short-reads.
MarkerCorrelations microarray Combines correlating markers from the SNPMatrices.
MaskClusterDraw anima Reads cluster IDs, images and masks to create a colored visualization of object clustering.
MaskFilter anima Filters mask objects based on the object numbers present in a table.
MaskList2ImageList anima Copy files from MaskList image structure to conventional ImageList data type.
MaskRelate anima Relates objects from two masks defining parents and their children objects.
MatlabEvaluate tools Executes a Matlab script that generates a matrix from the given data.
MatlabOp anima Executes a Matlab script that generates an image directory from given image directories.
MatrixRank tools Computes ranks of values in a matrix.
MatrixTranspose tools Transposes a matrix.
MDSPlot tools Plots multidimensional scaling of numeric data.
MeapNormalization microarray Prepares normalized data at probe level.
MeapQuantification microarray Quantify expression data on exon, splicing variant or gene level.
MeapVisualizer microarray Generate visualization for genes of interest with spliced isoforms.
MEMERunner microarray Finds enriched motifs from the given set of DNA sequences.
MergeImage anima Joins grayscale images together with colors of choice.
MetaClustering flowand Meta-Clustering for defining isomorphisms on cells across clusters.
MethFilterNorm(Deprecated) sequencing Function loading the methylation calls data into a MethylRaw or MethylRawList object (MethylKit package data types) and performing normalization and filtering of the data.
MethylCall sequencing Performs methylation calling of aligned RRBS and WGBS data obtained from the component BSalign. WARNING: It requires substantial computing resources (for the human genome, it needs ~26 GB of memory or more).
MethylExtract sequencing Extracts methylation from a bam file simultaneously with variant calling.
MicroplateReader dsrt Reads multiple XLSX files generated by a fluorescence/luminescence reader for a 96 or 384 well plate and converts them into a tab-delimited CSV file with the correct annotations for each well.
MirMatch sequencing String matching between mature miRNA seeds and target UTR seeds.
MMClustering flowand Normal, t, Skew-normal, and skew-t Mixture-Model Clustering.
ModelicaCompiler tools Compile a Modelica model into binary form using OpenModelica.
ModelicaSimulator tools Simulate a Modelica model by executing a compiled model.
MorphologyFeatures anima Extracts the morphology features from segmented image mask.
MotifMatch microarray Aligns the given motifs against the DNA sequences.
MRXSConvert anima Exports MRXS images into PNGs, cropping them to small images.
MutationContextPlot sequencing Draws mutation context plots for sequencing samples.
NextGene microarray Finds the closest gene, exon, or transcript for the given loci.
NovelMatureSeq sequencing Given a set of putative novel miRNA regions (e.g. output from miRanalyzer or miRDeep2), retrieves the consensus functional "mature" region that can then be used for predicting target genes (e.g. as input to MirMatch).
NucleusGenerator anima Generates DAPI-like grayscale images with random cell nuclei.
ObjectFilter anima Filters objects based on the object X,Y coordinates and an overlap with a mask image.
OptimalClustering flowand The component takes the output of the MMClustering component, where each data sample has been clustered with multiple different cluster numbers, and determines the optimal cluster number for each sample.
OUTPUT builtin Exports a result file or directory to an output directory.
PairCorrelation tools Calculates correlations for the row pairs.
PairWiseClusterPlot flowand Produces pairwise scatter plots from clustered and filtered data.
Pandoc tools Executes Pandoc with user given switches.
PanelDoc microarray Calls CNV ratios/gains/losses using the existing programs PanelDoc (developed by Nord et al. 2011) or CNVPanelizer (Bioconductor package), or both, with custom filtering available.
PanelDoc sequencing Calls CNV ratios/gains/losses using the existing programs PanelDoc (developed by Nord et al. 2011) or CNVPanelizer (Bioconductor package), or both, with custom filtering available.
Pause builtin Halts pipeline execution.
PCA tools PCA performs a principal component analysis on a given data matrix based on eigenvalues.
PCAxisReader tools Converts a PC-Axis file to a CSV.
PeakScore microarray Calculates a cumulative score for each gene having some regions assigned to it.
PhenographClusterer tools Clusters data using Phenograph clustering method.
PINA microarray Finds the interacting proteins of query proteins for a list of query Uniprot ACs.
PipelineInput builtin Read a previous Anduril pipeline execution folder as an input.
PlateQCs dsrt Compute the plate statistics for the plate survival assays and for dummy plates.
Plot2D tools Creates scatter, line or bar plots where Y and X coordinates come from CSV files.
PlotChannels flowand Component meant to visualize Flow and Mass Cytometry data before and after transformations.
PlotHTML tools Creates scatter, line or bar plots where Y and X coordinates come from JSON files.
PlotTreeHTML tools Produces a plot of a tree (or forest) in a html page.
PointDistance tools Measures the distance to the nearest reference point (reference) for each input point (in).
PointTracker anima Tracks point coordinates moving in time giving an ID for each track.
PopulationTimeline tools Currently it uses Wanderlust to calculate developmental trajectories within a population of single cell data.
Properties2Latex tools Converts a set of properties files into a LaTeX fragment.
PyClone sequencing Component to run PyClone.
PyCloneQC sequencing Quality checker for PyClone.
PyCloneRC sequencing Reclustering tool for PyClone.
PythonEvaluate tools Executes a Python script.
QCParser sequencing Parse SeqQC summary result.
QuantileFilter tools Filters values from numeric matrices that are below or above quantile limits.
QUBICClusterer tools Qualitative biclustering algorithm (QUBIC).
QuickBash tools Executes a Bash script that generates any output from the given data or other inputs.
Randomizer tools Generates numeric matrices filled with random values.
RandomSampler tools Randomly selects rows and columns from a text or CSV file without replacement.
RangeMatch microarray Component for 'left join' using keys and ranges.
RankScore microarray Calculates a score matrix for a dichotomously classified set of samples.
RConfigurationReport microarray Generates a report about the versions and the purposes of the selected R packages.
Realigner sequencing Performs local realignment around indels using the Genome Analysis Toolkit (GATK).
RefSNPAnnotator microarray Fetches annotations for given SNP rsIDs with biomaRt.
RegionConvert sequencing Converts files containing genomic or other sequence related regions to other formats retaining all applicable information.
RegionReport microarray Generates statistics about the DNA regions.
RegionTransformer sequencing Computes DNA region set operations such as union and overlap.
REvaluate tools Executes an R script that generates a matrix from the given data.
RiskRatioPlot microarray Creates a visualization for a list of risk ratios and their confidence intervals.
RNAFold microarray Predicts RNA secondary structure using Vienna RNA Package energy minimization function.
RowCount tools Classifies the number of rows of an input CSV to be either small, medium, or large.
RowJoin tools Joins duplicate id rows from a numeric matrix based on the frequency of some given value.
Rsync tools Uses rsync to fetch or push files.
RUBIC sequencing RUBIC detects recurrent copy number aberrations using copy number breaks, rather than recurrently amplified or deleted regions.
SampleBalancer tools Divides matrix rows into two sets, balancing the occurrence of unique labels in a column.
SampleCombiner tools Combines expression data from several samples into one by taking means, medians or log ratios.
SampleExpression tools Creates a -1/0/1 matrix that indicates whether a given gene/probe is differentially expressed in individual samples.
SampleGroupCreator tools Creates sample group tables based on sample names read from data files.
SamSpectral flowand SamSPECTRAL clustering of flow cytometry data as described by Zare et al (2010).
SBML2GraphML tools Visualizes an SBML model as a GraphML file.
SBML2HTML tools Prints the contents of an SBML model as a HTML file.
SBMLSimulator tools Simulates an SBML model using SBML ODE Solver.
SBMLTable tools Extracts SBML attributes into a CSV file, or imports attributes from a CSV file.
ScalaEvaluate tools Executes a Scala script.
ScalaScript tools Scripts Scala source code and executes it via scala.sys.process.Process.
ScriptIGV(Deprecated) sequencing Still under construction! Only the one file input and bedtools igv mode work at the moment, as evidenced by the only test case.
SearchReplace tools Transforms the given file by replacing certain strings with new values.
SegmentBlob anima Segment a directory of images with Gaussian Blob method.
SegmentCRImage anima Segments a directory of images with a method tuned for HE images.
SegmentFiji anima Produces binary mask images, segmenting the input images using Fiji.
SegmentGraphCut anima Segmentation based on graph cutting.
SegmentImage anima Segments a directory of images with various methods.
SegmentPlot microarray Plots segmented chromosomal data.
SegmentSeeded anima Segments a directory of images with various methods, using a seed (a previous mask helping in refining the segmentation).
SelectDiffMeth(Deprecated) sequencing Function to select which bases/windows/regions are significantly differentially methylated.
SelectMethContext sequencing Separates cytosines based on the methylation context in which they occur and produces statistics about context methylation.
SequenceFilter microarray Filters out nucleotide sequences that do not satisfy the criteria.
SequencingRequirements sequencing This component depends on the requirements needed by the functions in this bundle.
SetTransformer tools Transforms sets using union, intersection, difference and other functions.
ShapeFitting anima Fits a user given shape on image data by changing the location, orientation and scale of the shape.
SignatureExtractor sequencing This component performs signature extraction from point mutation data using non-negative matrix factorization, following the pipeline in the SomaticSignatures R package.
SignificantCurves dsrt Selects drugs based on their efficacy scores.
SigPathway microarray Performs pathway analysis by computing NEk and NTk statistics described in Tian et al (2005).
SimpleWebPage tools Creates a simple web page, for publishing a list of files.
SISSRs sequencing SISSRs is a software application for precise identification of genome-wide transcription factor binding sites from ChIP-Seq data.
Skeleton2CSV anima Extracts lines from a skeleton mask.
SkeletonFeatures anima Extracts intensity information from a grayscale image, using a skeleton mask.
SNPArrayReader microarray Imports genotype data from Illumina SNP and Affymetrix SNP 6.0 or 5.0 files.
SNPHelistinReader microarray Imports genotype data from a SNPHelistin database.
SNPKaplanMeier microarray Calculates Kaplan-Meier estimates for genotype specific survival effects.
SNPs3DSearch microarray Fetches disease associated genes from the SNPs3D database.
SourceCode2Latex tools Converts the given source code to a LaTeX fragment containing the given listing (lstset).
SpatialContrast microarray Finds two spatial regions in expression data that have a maximum contrast in average expression value.
SpatialPlot microarray Creates spatial plots for red/green channels, log ratios and spot quality.
SPIA microarray Performs pathway analysis using Signaling Pathway Impact Analysis (formerly known as Pathway-Express).
SPINLONG sequencing Identify user-defined spatio-temporal patterns in ChIP-seq and other sequencing data using the SPINLONG method.
SpiralJoin anima Joins grayscale images together in a montage, or splits a montage into separate images in a spiral fashion.
SPP sequencing Wraps the SPP R package published in Kharchenko PK, Tolstorukov MY, Park PJ "Design and analysis of ChIP-seq experiments for DNA-binding proteins" Nat.
SQLSelect microarray Executes an SQL query and returns its results as a CSV file.
StackProjection anima Projects a grayscale z-stack to a 2D image.
StandardProcess builtin Executes a given system command.
STAR sequencing Spliced Transcripts Alignment to a Reference for RNA-seq. The STAR component was implemented because TopHat is around 50 times slower, with run times counted in days.
STARGenome sequencing Spliced Transcripts Alignment to a Reference for RNA-seq. See the STAR component for more documentation.
StatisticalTest tools Computes p-values using statistical tests, optionally with correction for multiple hypotheses.
StringInput builtin Turns a string into a file usable in the pipeline.
SubsetBam sequencing Extracts reads from a BAM file based on chromosomal regions specified by a user and produces a merged BAM file containing only those regions.
Summarise flowand Summarises values for different rows of a file (i.e. reference) according to the clustered and possibly filtered result of the file (i.e. clusterFiles).
Summarization sequencing Summarizes information from regions using a sliding window or group ids (same as SQL GROUP BY).
When a sliding window is used, the input file needs to be sorted according to locationCol. Variation.jar is in the microarray bundle.
SVDAnalyzer microarray Performs Singular Value Decomposition on gene sets to test whether a set of genes is significantly differentially expressed.
SyntaxHighlight tools Reads a program code listing, and creates a syntax highlighted HTML file.
TableQuery tools Executes an SQL query on CSV tables and creates a result table.
TextDraw anima Write text on a bitmap image with a small font.
TextFileSplitter tools Splits a text file (such as CSV) into an array of smaller text files.
TextureFeatures anima The component uses VLFeat library to extract SIFT/MSER keypoints and SIFT descriptors from grayscale images.
TPquery sequencing Retrieve target prediction and target validated data from an SQLite database to annotate a list of miRNAs or a list of miRNA-target gene pairs.
Transformation flowand Cell fluorescence intensity transformation for FACS data.
Traph sequencing Traph RNA-seq abundance estimator from Veli Mäkinen lab.
TreeSplitter tools Splits the leaves of a tree into two sets such that the mutual information between the split and leaf annotations is maximized.
TrimGalore sequencing The component is a wrapper of the trim_galore tool, which is itself a wrapper of two tools, fastqc and cutadapt, both of which should be in PATH.
Trimmer sequencing Trim sequence reads at both ends.
Trimmomatic sequencing Trimmomatic performs a variety of useful trimming tasks for Illumina paired-end and single-end data.
TSNE tools This component performs dimensionality reduction with an R wrapper of the C++ implementation of Barnes-Hut-SNE as described in
TumorscapeReader microarray Connects to the Tumorscape database and outputs chromosomal aberration information for the query genes.
UMIQuantifier sequencing Quantifies distinct unique molecular identifiers (UMIs) from a bam file.
URLInput builtin Imports an external input file to a workflow script from a URL source.
UtrSeq2Seeds sequencing Generates 15 bp seeds (XXXXXXXQXXXXXXX) with each query locus in the middle.
VariantAnnotator(Deprecated) sequencing This component is not currently being actively maintained, since Annovar is updated too frequently.
VariantFilter sequencing Filters variants based on allele frequencies in the data.
VariationFilter tools Filters out rows from a matrix where the standard deviation is below (or above) a threshold.
VCF2AnnotatedCSV sequencing

A quick and dirty wrapper for ANNOVAR.

VCF2CSV sequencing Converts variation calls in VCF format to CSV.
VCMM sequencing A variant caller, from VCMM website.
VennDiagram tools Draws Venn diagrams showing the intersections of the given sets of identifiers.
VertexJoin tools Simplifies the given graph by merging vertices with an equal set of edges.
Video2ImageList anima Uses either ImageMagick or avconv to convert a list of images into a video file.
Watershed anima Methods to divide objects in a mask, or grayscale images.
WekaAPriori tools Mines association rules using the Apriori algorithm: R.
WekaAttributeSelection tools Selects parameters using Weka AttributeSelection classes.
WekaClassifier tools Creates a classifier based on the given sample data.
WekaClusterer tools Clusters data using clustering methods implemented in Weka.
WekaTransform tools Transforms data using filter classes implemented in Weka.
XML2CSV tools Transforms an XML file into a CSV table.
XShiftClusterer tools Clusters data using the original implementation of the X-shift algorithm.

AffyNormalization microarray Reads in Affymetrix gene expression microarrays and compares various normalization methods.
AffySNPFetch microarray Fetches SNP rs-numbers from Ensembl based on Affy SNP-ids.
AgilentImport microarray Imports Agilent microarray data, merges duplicate probes and creates a log ratio matrix.
AlignStats sequencing Evaluate sequencing data, especially RNA-seq data quality using RSeQC.
AnnotEnsembl sequencing Takes the ncRNA fasta file downloaded from Ensembl and creates a gff file with the transcript IDs as the reference chromosome.
AnnotMirbase sequencing Takes a GFF3 file downloaded from miRBase and modifies the genomic locations to reference by Ensembl transcript ID (instead of chromosome) and relative positions inside the transcript ID (start and end positions).
AssemblyQuality sequencing Evaluate the quality of a de novo genome assembly using a variety of metrics.
BamReducer sequencing

This function reduces the reads in a bam file using the Genome Analysis Toolkit (GATK) 2 and outputs a dramatically more compressed bam file.

BamRegrouper sequencing

Picard-tools is used to add or replace read group information for each read in the input alignment file.

BamReorder sequencing This function reorders reads in a BAM file to match the contig ordering in the provided reference file.
BamStats sequencing The function collects alignment and coverage statistics from a bam file.
BaseRecalibrator sequencing

This function will do base quality score recalibration using Genome Analysis Toolkit (GATK).

ClusterAnnotator microarray Produces a hierarchical clustering of the samples and assigns the provided annotations to the branches of the clustering tree.
ClusterPlot flowand Generates 2D plots for clusters where the dimensions are specified, hence specializing the pairwise plotting of all possible pairs of cluster dimensions.
DataOverlap sequencing Takes the union of two datasets and collapses them, with the option to filter out rows with NAs.
DiffExprSeq(Deprecated) sequencing For differential expression at gene and exon level of RNA-Seq data.
DocumentGenerator microarray Creates a PDF document from the given LaTeX elements and the system configuration.
EnsemblChromosomes sequencing Retrieves the Ensembl names and lengths of chromosomes.
EnsemblGenes sequencing Retrieves the Ensembl gene locations.
ExpressionImport microarray Reads expression data, applies all available normalization methods and returns a record with normalized datasets.
ExprTable sequencing

Generates an expression table from individual samples expression files.

It was created with genes.fpkm_tracking and isoforms.fpkm_tracking files from Cufflinks in mind, but it can be used to summarize in one table any group of expression files that have an id column common to all files and the expression values.
ExprTableReport sequencing

Generates expression statistics into an HTML report.

FusionMap(Deprecated) sequencing Detect genomic fusions from paired or single ended DNA-seq or RNA-seq data using FusionMap.
GenomicVariantAnnotator microarray Queries the Database of Genomic Variants.
HistogramPlot tools Counts the frequencies and makes a line plot of histogram values.
HTSeqBam2Counts sequencing For obtaining read counts at gene and exon level from RNA-Seq alignment files.
InteractionPlot microarray Plots the distribution of the expression of two genes.
LatexSearchReplace tools Transforms text (including TeX) files in the given LaTeX directory by replacing certain strings with new values in the specified text files.
MicroarrayReader microarray Reads in expression microarray data produced by feature extraction software.
MutSig sequencing Determine significantly mutated genes in a set of genetic variations using MutSig.
NCRNAAnnot sequencing Creates reference files for gene features and known ncRNAs and creates additional columns for a putatively novel miRNA expression matrix with information on relative genomic location (e.g. intragenic/intergenic, host gene and transcript) and neighbouring or overlapping ncRNAs.
NovelExprMatrix sequencing Takes the output of miRanalyzer (or miRDeep2) and creates an expression matrix with the putative novel miRNAs found.
NovelMirnas sequencing Performs novel miRNA discovery using miRanalyzer or miRDeep2, depending on what is installed and what you want to use.
PrepRnaBam sequencing Preprocessing of RNA bam files according to GATK best practices (adding groups, marking duplicates, splitNtrim and base recalibration).
QCFasta sequencing Quality control function for RNA-Seq data.
QueryReport tools Executes an SQL query and prints it with the result table.
ReferenceIndexer sequencing The following auxiliary files for the reference fasta sequence will be created:
  • 1.
RegionOverlap(Deprecated) microarray Produces a list of overlapping DNA regions.
RNAVariantCaller sequencing Variant calling in RNAseq data using HaplotypeCaller and VariantFiltration in GATK.
Sam2Fastq sequencing This function uses the Picard java library to perform a conversion from SAM/BAM alignment format back to FASTQ sequence format.
SGA sequencing Perform de novo assembly for sequencing reads using String Graph Assembler (SGA).
SmallRNAPrep sequencing Uses FASTX-Toolkit to perform adapter trimming, artifact filtering, base-quality filtering, and read trimming for single-end read data.
SmallRNAQC sequencing Performs all 3 quality filtering steps (preprocessing, alignment to genome, and optional alignment to transcripts) in the smallRNA pipelines.
Star2Pass sequencing Runs the STAR aligner in two-pass mode for an array of samples together.
VariantCaller sequencing

Calls genomic sites of variation using the specified caller.

VariantCombiner sequencing This function combines all input variant files into one file using the Genome Analysis Toolkit (GATK).
VariantLiftover sequencing

This is analogous to the LiftOver component, but for VCF files.

VariantRecalibrator sequencing This function will apply machine learning in order to improve the input variants.
VariantSelector sequencing Select variants from a VCF file using specific criteria, e.g. type, annotations or genomic intervals.
VariantValidator sequencing This function validates the input variant file using the Genome Analysis Toolkit (GATK).

