
Summarization

Summarizes values over regions using either a sliding window or group ids (the same as SQL GROUP BY).
When a sliding window is used, the input file must be sorted by locationCol. Variation.jar is in the microarray bundle.
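
The group-id mode can be illustrated with a minimal Python sketch. This is not the component's implementation (which lives in Variation.jar); the function name `summarize_by_keys`, the `REDUCERS` table, and the sample rows are hypothetical, and the mapping of each documented method name to a Python reducer is an assumption.

```python
import math
import statistics
from collections import defaultdict

# Assumed mapping from the documented method names to Python reducers.
REDUCERS = {
    "AVERAGE": statistics.mean,
    "MEDIAN": statistics.median,
    "MAX": max,
    "MIN": min,
    "SUM": sum,
    "COUNT": len,
    "MULTIPLY": math.prod,
}


def summarize_by_keys(rows, key_cols, value_cols, method="SUM"):
    """Group rows by key_cols and reduce each value column independently."""
    reduce = REDUCERS[method]
    groups = defaultdict(list)
    for row in rows:
        # Rows sharing the same key tuple land in the same group.
        groups[tuple(row[k] for k in key_cols)].append(row)
    result = []
    for key, members in groups.items():
        out = dict(zip(key_cols, key))
        for col in value_cols:
            out[col] = reduce([float(r[col]) for r in members])
        result.append(out)
    return result


# Hypothetical input rows, as they might be read with csv.DictReader.
rows = [
    {"Gene": "A", "Value1": "1", "Value2": "10"},
    {"Gene": "A", "Value1": "3", "Value2": "20"},
    {"Gene": "B", "Value1": "5", "Value2": "30"},
]
summary = summarize_by_keys(rows, ["Gene"], ["Value1", "Value2"], "MEDIAN")
```

Here `summary` holds one row per distinct `Gene` value, with the median of each value column computed separately.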

Version 1.0
Bundle sequencing
Categories VariationAnalysis
Authors Sirkku Karinen (sirkku.karinen@helsinki.fi)
Issue tracker View/Report issues
Requires Variation.jar (jar)
Source files component.xml
Usage Example with default values

Inputs

Name Type Mandatory Description
data CSV Mandatory Input containing the values to summarize and the positions or ids by which they are summarized.

Outputs

Name Type Description
summarization CSV Summarized data.

Parameters

Name Type Default Description
keyCols string "" Columns that have ids for grouping for summarization (same functionality as SQL GROUP BY). Comma-separated list is accepted.
locationCol string "" Column that has the positions for window. If not specified, uses row indexes.
method string "SUM" Summarization method. Possible values are AVERAGE, MEDIAN, MAX, MIN, SUM, COUNT, MULTIPLY.
resultCol string "" Names of the result columns as a comma-separated list. If not given, same as valueCols.
resultType string "float" int/float
valueCols string (no default) Columns that have the values for which summarization is calculated (Comma-separated list of column names). Summarization is calculated for each column independently.
window float 0 Window size for summarization, measured in the units of locationCol or, if locationCol is not specified, in row indices. If window is not specified, summarization uses the ids in keyCols.
windowStart int 0 Position where the first window starts. -1 = the first location in the input file.
windowStep float 0 Distance the window slides between consecutive summaries.
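
To make the interaction of window, windowStart and windowStep concrete, here is a minimal Python sketch of one plausible reading of the sliding-window mode. It is not taken from the component's source. The half-open window boundaries, the skipping of empty windows, the reporting of each window by its start position, and the function name `sliding_window_summary` are all assumptions made for illustration.

```python
import statistics


def sliding_window_summary(locations, values, window, step, start=-1, reduce=sum):
    """Summarize values over windows [start, start + window), advancing by step.

    Assumptions (not confirmed by the component documentation):
    - locations are sorted in ascending order;
    - a window includes its start and excludes its end;
    - windows that contain no rows produce no output row.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    if not locations:
        return []
    if start == -1:
        # Documented meaning of -1: begin at the first location in the input.
        start = locations[0]
    last = locations[-1]
    out = []
    while start <= last:
        inside = [v for loc, v in zip(locations, values)
                  if start <= loc < start + window]
        if inside:
            out.append((start, reduce(inside)))
        start += step
    return out


# Windowing by an explicit location column (hypothetical data).
locations = [0, 10, 20, 30, 40]
values = [1.0, 2.0, 3.0, 4.0, 5.0]
by_location = sliding_window_summary(
    locations, values, window=30, step=10, start=0, reduce=statistics.mean)

# Without a location column, row indices serve as positions.
by_row = sliding_window_summary(
    list(range(len(values))), values, window=2, step=1, start=0)
```

The first call mirrors the parameter shape of an AVERAGE summary with window=30 and windowStep=10; the second mirrors a SUM over row indices with window=2 and windowStep=1. The actual component may treat window edges and trailing partial windows differently.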

Test cases

Test case Parameters IN (data) OUT (summarization)
case1 properties data summarization

method=AVERAGE,
window=30,
windowStep=10,
valueCols=Value1,Value2,
locationCol=Location

case2 properties data summarization

method=SUM,
window=2,
windowStep=1,
valueCols=Value1,Value2

case3 properties data summarization

method=MEDIAN,
keyCols=Gene,
valueCols=Value1,Value2,
resultCol=median1,medio2

case4 properties data summarization

method=MEDIAN,
keyCols=GeneSymbol,
valueCols=GSM510177,GSM510186,GSM510196,GSM510200,
resultCol=GSM510177,GSM510186,GSM510196,GSM510200


Generated 2018-12-17 07:42:24 by Anduril 2.0.0