
SignatureExtractor

This component extracts mutational signatures from point mutation data using non-negative matrix factorization (NMF), following the pipeline of the SomaticSignatures R package.
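As a rough illustration of the factorization idea only, the sketch below factors a small count matrix V (mutation contexts by samples) into non-negative matrices W (signatures) and H (per-sample contributions) using the classic multiplicative update rules. This is not the component's implementation, which is in R; the function names, the toy matrix, and the choice of update rule are all assumptions made for the example.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def frobenius_error(v, w, h):
    """Squared Frobenius norm of V - W*H."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, n_signatures, n_iter=200, seed=0, eps=1e-9):
    """Factor V (contexts x samples) into W (contexts x signatures)
    and H (signatures x samples) with multiplicative updates."""
    rng = random.Random(seed)
    rows, cols = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(n_signatures)] for _ in range(rows)]
    h = [[rng.random() + eps for _ in range(cols)] for _ in range(n_signatures)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(cols)]
             for k in range(n_signatures)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(n_signatures)]
             for i in range(rows)]
    return w, h


# Toy matrix: 4 hypothetical mutation contexts (rows) x 3 samples (columns).
v = [[10, 0, 5],
     [8, 1, 4],
     [0, 6, 3],
     [1, 9, 5]]
w, h = nmf(v, n_signatures=2)
error = frobenius_error(v, w, h)
```

In real data the rows would be the 96 trinucleotide mutation contexts rather than four toy rows, and the number of columns in W corresponds to the n parameter.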

Version 1.2
Bundle sequencing
Categories Analysis
Authors Amjad Alkodsi (Amjad.Alkodsi@Helsinki.FI)
Issue tracker View/Report issues
Requires SomaticSignatures (R-bioconductor) ; BSgenome.Hsapiens.UCSC.hg19 (R-bioconductor) ; BSgenome.Hsapiens.UCSC.hg38 (R-bioconductor) ; pheatmap (R-package) ; plyr (R-package) ; cowplot (R-package) ; RColorBrewer (R-package) ; reshape (R-package) ; grid (R-package) ; ggplot2 (R-package)
Source files component.xml SignatureExtractor.R
Usage Example with default values

Inputs

Name Type Mandatory Description
in CSV Mandatory Input CSV file with variant data. The file must contain columns for chromosome, position, reference allele, alternative allele, and sample ID; the column names are set by the chrColumn, posColumn, refColumn, altColumn, and sampleIDcolumn parameters.
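To make the expected input shape concrete, here is a minimal sketch that reads a small variant table and checks that the required columns are present. The example rows, the helper name read_variants, and the comma delimiter are assumptions for illustration; the source does not state the delimiter, so it is exposed as an argument.

```python
import csv
import io

# Hypothetical input using the default column names (chr, pos, ref, alt, ID).
EXAMPLE_CSV = """chr,pos,ref,alt,ID
chr1,12345,C,T,sampleA
chr2,67890,G,A,sampleA
chr7,13579,T,G,sampleB
"""


def read_variants(handle, chr_column="chr", pos_column="pos", ref_column="ref",
                  alt_column="alt", sample_id_column="ID", delimiter=","):
    """Read variant rows and fail early if a required column is missing."""
    reader = csv.DictReader(handle, delimiter=delimiter)
    required = [chr_column, pos_column, ref_column, alt_column, sample_id_column]
    missing = [name for name in required if name not in (reader.fieldnames or [])]
    if missing:
        raise ValueError("missing columns: " + ", ".join(missing))
    return [
        {
            "chr": row[chr_column],
            "pos": int(row[pos_column]),
            "ref": row[ref_column],
            "alt": row[alt_column],
            "sample": row[sample_id_column],
        }
        for row in reader
    ]


variants = read_variants(io.StringIO(EXAMPLE_CSV))
```

If the file uses other column names, the same call would pass them explicitly, for example read_variants(handle, pos_column="position").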

Outputs

Name Type Description
outContribution CSV CSV file with the percentage contribution of each signature in each sample.
outNmuation CSV CSV file with the estimated number of mutations induced by each signature in each sample.
signatureDistance CSV CSV file with the Euclidean distance between each extracted signature and each signature in the list of published signatures.
plots BinaryFolder Binary folder containing the generated plots.
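The distance reported in signatureDistance can be illustrated with a short sketch that computes the Euclidean distance between every extracted signature and every reference signature. The signature names and the three-value vectors below are made up for the example; real signatures are usually vectors over 96 mutation contexts, and the layout of the component's actual output file is not shown here.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical toy signatures (each vector sums to 1).
extracted = {"S1": [0.5, 0.3, 0.2], "S2": [0.1, 0.1, 0.8]}
published = {"RefA": [0.5, 0.3, 0.2], "RefB": [0.2, 0.2, 0.6]}

# Distance for every (extracted, published) pair.
distances = {
    (s_name, p_name): euclidean_distance(s_vec, p_vec)
    for s_name, s_vec in extracted.items()
    for p_name, p_vec in published.items()
}

# Closest published signature for each extracted signature.
closest = {
    s_name: min(published, key=lambda p_name: distances[(s_name, p_name)])
    for s_name in extracted
}
```

A smaller distance means the extracted signature has a profile closer to that published signature.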

Parameters

Name Type Default Description
altColumn string "alt" Name of the column having the alternative allele.
barplotDimensions string "5,5" Dimensions of the barplot PDF in the format "width,height".
barplotExtra string "" ggplot R expression to be added to the barplot command, for example 'scale_fill_manual(values=c("red","blue","green"))' to change colors.
chrColumn string "chr" Name of column having chromosomes.
genomeBuild string "hg19" Can be either hg19 or hg38.
n int 3 Final number of signatures to be extracted.
nRange string "2:10" R expression giving the range of signature counts to be tested for explained variance.
nRep int 5 Number of repetitions for each number of signatures specified in the nRange parameter.
nrun int 1 Number of repeated runs on which the final signature extraction is based.
posColumn string "pos" Name of column having chromosomal position of the variant.
refColumn string "ref" Name of the column having reference allele.
sampleIDcolumn string "ID" Name of the column having sample IDs.
signaturePlotDimensions string "5,5" Dimensions of the signature plot PDF in the format "width,height".
signaturePlotExtra string "" ggplot R expression to be added to the signature plot command, for example 'theme(axis.text.x=element_text(size=5))' to change the x-axis font size.
sortBy string "burden" Sort samples in the barplot according to this parameter. It accepts a signature name such as S1, S2, ..., Sn, or "burden" to sort by number of mutations.
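Two of the string parameters above encode structured values: the plot dimensions ("width,height") and nRange (an R range such as "2:10"). The sketch below shows how such strings can be interpreted. The helper names are hypothetical, and parse_n_range handles only the simple ascending a:b form, whereas the component itself accepts any R expression for nRange.

```python
def parse_dimensions(value):
    """Split a "width,height" string into two floats."""
    parts = [part.strip() for part in value.split(",")]
    if len(parts) != 2:
        raise ValueError('expected "width,height", got %r' % value)
    width, height = (float(part) for part in parts)
    return width, height


def parse_n_range(value):
    """Expand a simple ascending R-style "a:b" range into a list of integers.

    In R, a:b includes both end points, so "2:10" yields 2, 3, ..., 10.
    """
    parts = value.split(":")
    if len(parts) != 2:
        raise ValueError('expected "a:b", got %r' % value)
    start, stop = (int(part) for part in parts)
    return list(range(start, stop + 1))


barplot_size = parse_dimensions("5,5")
candidate_counts = parse_n_range("2:10")
```

With the defaults, nine candidate signature counts (2 through 10) are tested, each repeated nRep times.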

Test cases

Test case Parameters IN in OUT outContribution OUT outNmuation OUT signatureDistance OUT plots
case1 properties in (missing) (missing) (missing) (missing)

case1 properties: nRange=2:3, nRep=2


Generated 2018-12-18 07:42:27 by Anduril 2.0.0