
SampleBalancer

Divides the rows of a matrix into two sets by balancing the occurrence of the unique labels in a given column. The rest of the data is saved to the remainder output.

Useful for dividing data into training and testing sets. The remainder can be further split into evaluation and test sets with the same tool.
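
The splitting idea described above can be sketched as follows. This is a minimal Python illustration, not the component's R implementation: the function name balance_split, the per-class rounding rule, and the sample data are all assumptions made for the example, and it reads "balancing" as taking the same fraction (ratio) of rows from every class. The parameter names class_col, ratio and seed mirror the component's classCol, ratio and seed parameters.

```python
import random
from collections import defaultdict


def balance_split(rows, class_col, ratio=0.5, seed=20151208):
    """Split rows into (balanced, remainder), sampling each class separately.

    Hypothetical helper: the real component may select rows differently.
    """
    rng = random.Random(seed)  # fixed seed makes the split reproducible

    # Group rows by their class label.
    by_class = defaultdict(list)
    for row in rows:
        by_class[row[class_col]].append(row)

    balanced, remainder = [], []
    for label in sorted(by_class):
        group = by_class[label][:]
        rng.shuffle(group)
        # Assumed rounding rule: nearest integer share of each class.
        n_take = round(len(group) * ratio)
        balanced.extend(group[:n_take])
        remainder.extend(group[n_take:])
    return balanced, remainder


# Made-up input: 8 rows of class "A" and 4 rows of class "B".
rows = [{"Sample": f"s{i}", "Class": "A" if i < 8 else "B"} for i in range(12)]

# First pass: training set versus everything else.
balanced, remainder = balance_split(rows, class_col="Class")

# Second pass on the remainder: evaluation and test sets.
evaluation, test = balance_split(remainder, class_col="Class")
```

With these assumptions, every output keeps the 2:1 class proportion of the input, and rerunning with the same seed gives the same split.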

Version: 1.0
Bundle: tools
Categories: Classification
Authors: Ville Rantanen (ville.rantanen@helsinki.fi)
Issue tracker: View/Report issues
Requires: R
Source files: component.xml, SampleBalancer.r
Usage: Example with default values

Inputs

Name  Type  Mandatory  Description
in    CSV   Mandatory  Expression matrix.

Outputs

Name       Type  Description
balanced   CSV   Table with balanced classes
remainder  CSV   The rest of the data

Parameters

Name      Type    Default       Description
classCol  string  (no default)  Name of the column with class information
ratio     float   0.5           Ratio of samples to be assigned to the training set. Defaults to half.
seed      int     20151208      Seed for randomization

Test cases

Test case  Parameters                   IN: in  OUT: balanced  OUT: remainder
case1      properties (classCol=Class)  in      balanced       remainder


Generated 2018-12-12 07:42:06 by Anduril 2.0.0