

Integrates gene, transcript, or other expression data with binary explanatory data. The two inputs are N x M matrices whose columns (samples) and rows must be in the same order. Each row of exprMatrix holds the sample-wise expression values of one gene, transcript, or similar feature. Each row of labelMatrix must hold a vector of 0s and 1s, which partitions the samples into two groups for that row.

The output contains a score ("weight") for each gene shared by the explanatory and expression data, together with an alpha value. The alpha value comes from a permutation test and denotes the probability that H0, "a large weight is due to a random event", is erroneously rejected. The weight is calculated row by row: the label vector in labelMatrix partitions the values of the corresponding exprMatrix row into two groups. The mean expression of the group labeled 1 is then subtracted from the mean of the other group, and the difference is divided by the sum of the two groups' standard deviations to obtain the final weight.
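The weight and alpha calculation for a single row can be sketched in Python as follows. This is an illustration only, not the code shipped in Asser.jar. The function names `weight` and `permutation_alpha` are hypothetical, the sample standard deviation is an assumption (the description does not say which estimator is used), and the two-sided, label-shuffling permutation scheme is likewise an assumption.

```python
import random
import statistics


def weight(expr_row, label_row):
    """(mean of group 0 - mean of group 1) / (sd of group 0 + sd of group 1).

    Assumes each group has at least two values and the summed
    standard deviation is non-zero (sample sd is an assumption).
    """
    group0 = [x for x, lab in zip(expr_row, label_row) if lab == 0]
    group1 = [x for x, lab in zip(expr_row, label_row) if lab == 1]
    mean_diff = statistics.mean(group0) - statistics.mean(group1)
    return mean_diff / (statistics.stdev(group0) + statistics.stdev(group1))


def permutation_alpha(expr_row, label_row, n_perms=1000, seed=0):
    """Fraction of label shuffles whose |weight| reaches the observed |weight|.

    Hypothetical scheme: shuffling keeps the group sizes fixed and the
    comparison is two-sided; the real component may differ.
    """
    rng = random.Random(seed)
    observed = abs(weight(expr_row, label_row))
    shuffled = list(label_row)
    hits = 0
    for _ in range(n_perms):
        rng.shuffle(shuffled)
        if abs(weight(expr_row, shuffled)) >= observed:
            hits += 1
    return hits / n_perms


expr = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]
labels = [0, 0, 0, 1, 1, 1]
print(weight(expr, labels))  # -3.0: (2 - 8) / (1 + 1)
print(permutation_alpha(expr, labels))
```

In this toy row, group 0 has mean 2 and group 1 has mean 8, each with a standard deviation of 1, so the weight is negative because the group labeled 1 has the higher expression.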

Both input matrices can contain missing (NA) values. Values in the label matrix that are not 0, 1, or NA are effectively ignored in the weight calculation. Rows with NA values in the label matrix are also ignored, i.e., no weight is calculated for these genes.
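The handling of missing and unexpected label values can be illustrated with a short Python sketch. The helper name `usable_samples` is hypothetical, NA is represented here as `None`, and dropping samples whose expression value is NA is an assumption; the description above only states that NA values are permitted in both matrices.

```python
def usable_samples(expr_row, label_row):
    """Return (expression, label) pairs usable for the weight, or None.

    - A row with any NA (None) label is skipped entirely: returns None.
    - Labels other than 0 or 1 are ignored.
    - Samples with an NA expression value are dropped (assumption).
    """
    if any(lab is None for lab in label_row):
        return None
    return [
        (x, lab)
        for x, lab in zip(expr_row, label_row)
        if lab in (0, 1) and x is not None
    ]


# Label 2 is ignored and the NA expression value is dropped.
print(usable_samples([1.0, None, 3.0, 4.0], [0, 1, 2, 1]))  # [(1.0, 0), (4.0, 1)]
# An NA label means no weight is calculated for this row.
print(usable_samples([1.0, 2.0], [0, None]))  # None
```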

Version 1.1
Bundle microarray
Categories Integration
Authors Riku Louhimo (Riku.Louhimo@Helsinki.FI)
Issue tracker View/Report issues
Requires Asser.jar (jar)
Source files component.xml
Usage Example with default values


Inputs

Name Type Mandatory Description
labelMatrix CSV Mandatory The binary explanatory values.
exprMatrix CSV Mandatory The samplewise expression values.


Outputs

Name Type Description
Values CSV Output matrix containing the index of each gene in the input matrix together with its weight and alpha values.


Parameters

Name Type Default Description
exprNALimit float 0.1 Defines the percentage of missing measurements allowed in the expression matrix.
gainData boolean true Defines which type of CNV data is being used.
idColumn int 0 Defines the ID column. The output ID column takes its name from this column in the label matrix.
minGroupSize int 2 Defines the minimum size of a group in the label matrix.
nroOfperms int 1000 Defines the number of permutations applied during permutation testing.
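One way exprNALimit and minGroupSize might act as row filters is sketched below in Python. This is a guess at the behavior, not a description of the component: the helper name `passes_filters` is hypothetical, exprNALimit is assumed to be a fraction (0.1 meaning 10%) applied per row, and NA is represented as `None`.

```python
def passes_filters(expr_row, label_row, expr_na_limit=0.1, min_group_size=2):
    """Decide whether a row would be scored under the assumed filter rules.

    Assumptions: expr_na_limit is a per-row fraction of NA (None)
    expression values, and both label groups (0 and 1) must contain
    at least min_group_size samples.
    """
    na_fraction = sum(x is None for x in expr_row) / len(expr_row)
    if na_fraction > expr_na_limit:
        return False
    size0 = sum(1 for lab in label_row if lab == 0)
    size1 = sum(1 for lab in label_row if lab == 1)
    return size0 >= min_group_size and size1 >= min_group_size


print(passes_filters([1.0, 2.0, 3.0, 4.0], [0, 0, 1, 1]))   # True
print(passes_filters([1.0, None, 3.0, 4.0], [0, 0, 1, 1]))  # False: 25% NA
print(passes_filters([1.0, 2.0, 3.0, 4.0], [0, 1, 1, 1]))   # False: group 0 too small
```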

Test cases

Test case Parameters IN OUT
case1 properties (gainData = true, idColumn = 1) labelMatrix exprMatrix Values
case2 properties (gainData = false, idColumn = 0) labelMatrix exprMatrix Values
case3 properties (gainData = false, idColumn = 0) labelMatrix exprMatrix Values
case4 properties (gainData = true, idColumn = 1) labelMatrix exprMatrix (expecting failure)
case5_error properties (idColumn = 1) labelMatrix exprMatrix (expecting failure)

Generated 2018-12-12 07:42:05 by Anduril 2.0.0