
CSVTransformer

Transforms CSV files using R expressions. This allows applying arithmetic functions to numeric columns and combining columns from different CSV files.

The inputs are one or two CSV files. The R expressions are evaluated and are expected to return R matrices, data frames or vectors, which are concatenated into the final result. By default, concatenation is done by columns, so each transformation adds columns to the output (see the combineFunction parameter for joining by rows). All transformations should yield the same number of rows. However, an expression may yield a single string or number, which is duplicated to fit the number of rows.
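The concatenation rule can be sketched in plain R. The data frame below is a made-up stand-in for csv1, and t1 to t3 are hypothetical names standing in for the results of three transform expressions. The sketch only shows how base R's cbind behaves; that the component combines results in exactly this way is an assumption.

```r
# Hypothetical stand-in for the csv1 input (made-up data).
csv1 <- data.frame(C1 = c("a", "b", "c"),
                   N1 = c(1, 2, 3),
                   stringsAsFactors = FALSE)

# Three transformation results: a data frame, a numeric vector and a constant.
t1 <- csv1[, "C1", drop = FALSE]
t2 <- csv1$N1 * 10
t3 <- "Constant"

# cbind joins by columns; the length-one constant is repeated for every row.
result <- cbind(t1, t2, t3)
print(result)
```

Each expression contributes one or more columns, and the constant ends up as a full column with one copy per row.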

In the expressions, "csv1" and "csv2" are R data frames containing the contents of the input files; "csv2" is defined only if the csv2 input is given. The numeric columns of csv1 and csv2 are available as R matrices "matrix1" and "matrix2". If there are no numeric columns, these matrices are empty.
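To make the relation between the data frames and the matrices concrete, here is a minimal sketch with made-up data. How the component actually builds matrix1 is not stated here, so the selection via is.numeric below is an assumption.

```r
# Hypothetical stand-in for the csv1 input (made-up data).
csv1 <- data.frame(C1 = c("a", "b"),
                   N1 = c(1.5, 2.5),
                   N2 = c(10, 20),
                   stringsAsFactors = FALSE)

# Assumption: matrix1 holds exactly the numeric columns of csv1.
numeric.cols <- vapply(csv1, is.numeric, logical(1))
matrix1 <- as.matrix(csv1[, numeric.cols, drop = FALSE])

# Matrix arithmetic then applies to every numeric column at once.
print(matrix1 * 2)
```

Working on the matrix avoids having to list the numeric columns by name in a transform expression.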

Version: 1.1
Bundle: tools
Categories: Preprocessing
Specialties: generic
Authors: Kristian Ovaska (kristian.ovaska@helsinki.fi)
Requires: R
Source files: component.xml, CSVTransformer.r

Type parameters (generics)

Inputs

Name | Type | Mandatory | Description
csv1 | CSV | Optional | Input file 1.
csv2 | CSV | Optional | Input file 2.
columnNamesFile | IDList | Optional | Column names for output. Overridden by the 'columnNames' parameter.
array | Array<CSV> | Optional | Input CSV array. Variables are named csv.[key] and matrix.[key].

Outputs

Name | Type | Description
out | T (generic) | Transformed output. The first column(s) are created using transform1, the next column(s) using transform2, and so on.

Parameters

Name | Type | Default | Description
columnNames | string | "" | R expression that evaluates to the column names of the result CSV file. The evaluated vector must have as many items as there are columns in the output. If empty, column names are taken from the input CSV files; depending on the transforms, some column names may be generated automatically.
combineFunction | string | "cbind" | R expression that combines the transformations. Defaults to cbind, which joins by columns. Use rbind to join by rows.
transform1 | string | (no default) | R expression that evaluates to a matrix, data frame, vector or constant. The expression may refer to data frames "csv1" and "csv2" (only if csv2 is given) and matrices "matrix1" and "matrix2" (only if csv2 is given).
transform2 | string | "" | Transformation expression 2. If empty, no transformation is done.
transform3 | string | "" | Transformation expression 3. If empty, no transformation is done.
transform4 | string | "" | Transformation expression 4. If empty, no transformation is done.
transform5 | string | "" | Transformation expression 5. If empty, no transformation is done.
transform6 | string | "" | Transformation expression 6. If empty, no transformation is done.
transform7 | string | "" | Transformation expression 7. If empty, no transformation is done.
transform8 | string | "" | Transformation expression 8. If empty, no transformation is done.
transform9 | string | "" | Transformation expression 9. If empty, no transformation is done.
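As an illustration of the columnNames parameter, the sketch below evaluates a name expression given as a string and applies it to a combined result. The data, the variable names out and column.names.expr, and the use of parse/eval are all assumptions made for the example; the component's internal evaluation mechanism is not described here.

```r
# Hypothetical stand-in for the csv1 input (made-up data).
csv1 <- data.frame(C1 = c("a", "b"),
                   N1 = c(1, 2),
                   stringsAsFactors = FALSE)

# Stand-in for the combined transform output: both csv1 columns plus one new column.
out <- cbind(csv1, csv1$N1 * 2)

# Hypothetical columnNames value, passed as a string like any other parameter.
column.names.expr <- 'c(colnames(csv1), "Doubled")'

# Assumption: the string is parsed and evaluated as ordinary R code.
new.names <- eval(parse(text = column.names.expr))

# The evaluated vector must have one name per output column.
stopifnot(length(new.names) == ncol(out))
colnames(out) <- new.names
print(out)
```

Because the expression is ordinary R, it can mix names taken from the inputs with literal names for the generated columns.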

Test cases

Test case | Parameters | IN csv1 | IN csv2 | IN columnNamesFile | IN array | OUT out
case1 | properties | csv1 | csv2 | (missing) | (missing) | out

transform1=csv1[,c("C1","C2")],
transform2=matrix1+matrix2,
transform3=csv2[,1,drop=FALSE],
transform4="Constant"

case2_colnames | properties | csv1 | csv2 | (missing) | (missing) | out

transform1=csv1[,c("C1","C2")],
transform2=matrix1+matrix2,
transform3=csv2[,1,drop=FALSE],
transform4="Constant",
columnNames=c(colnames(csv1), "X1", "X2")

case3_matrix | properties | csv1 | csv2 | (missing) | (missing) | out

transform1=(matrix1+matrix2)/2

case4_reusing_transformed | properties | csv1 | csv2 | (missing) | (missing) | out

transform1=cbind(csv1[,c("C1","N1")],csv2[,"N3"],100*csv1[,"N1"]/csv2[,"N3"]),
transform2=transformed[,4]>20,
columnNames=c("C1","N1","N3","Ratio%","IsGreaterThan20%")
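case4 refers to a variable named "transformed" in transform2, which suggests that the columns produced by earlier transforms are visible to later ones. The sketch below replays the case in plain R under that assumption. The input data are made up, and is.greater and out are hypothetical names.

```r
# Hypothetical stand-ins for the two inputs (made-up data).
csv1 <- data.frame(C1 = c("a", "b", "c"),
                   N1 = c(5, 30, 12),
                   stringsAsFactors = FALSE)
csv2 <- data.frame(N3 = c(50, 60, 40))

# transform1 from case4: two columns of csv1, one of csv2 and their ratio in percent.
transformed <- cbind(csv1[, c("C1", "N1")],
                     csv2[, "N3"],
                     100 * csv1[, "N1"] / csv2[, "N3"])

# transform2 from case4: test the fourth column (the ratio) of the result so far.
is.greater <- transformed[, 4] > 20

# Combine both results and apply the column names given in the test case.
out <- cbind(transformed, is.greater)
colnames(out) <- c("C1", "N1", "N3", "Ratio%", "IsGreaterThan20%")
print(out)
```

Reusing the intermediate result avoids repeating the ratio expression in the second transform.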

case5_colInput | properties | csv1 | csv2 | columnNamesFile | (missing) | out

transform1=csv1[,c("C1","C2")],
transform2=matrix1+matrix2,
transform3=csv2[,1,drop=FALSE],
transform4="Constant"

case6_join_by_row | properties | csv1 | csv2 | (missing) | (missing) | out

transform1=t(csv1[,c("C2")]),
transform2=t(csv2[,c("N4")]),
combineFunction=rbind,
columnNames=c(csv2[,c("C3")])
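The row-wise join in case6 can be replayed in plain R as follows. The input data are made up, and row1, row2 and out are hypothetical names. The sketch shows base R behaviour only, and how the component writes the result to the output file is not covered.

```r
# Hypothetical stand-ins for the two inputs (made-up data).
csv1 <- data.frame(C2 = c("x", "y", "z"), stringsAsFactors = FALSE)
csv2 <- data.frame(C3 = c("k1", "k2", "k3"),
                   N4 = c(1, 2, 3),
                   stringsAsFactors = FALSE)

# t() turns each selected column into a one-row matrix.
row1 <- t(csv1[, c("C2")])
row2 <- t(csv2[, c("N4")])

# rbind stacks the rows; mixing text and numbers yields a character matrix.
out <- rbind(row1, row2)

# Column names come from a column of csv2, as in the test case.
colnames(out) <- c(csv2[, c("C3")])
print(out)
```

Each transform now contributes a row instead of a column, so all transforms must agree on the number of columns.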

case7_array | properties | (missing) | (missing) | (missing) | array | out

transform1=csv.key1[,c("C1","C2")],
transform2=matrix.key1+matrix.asdf,
transform3=csv.asdf[,1,drop=FALSE],
transform4="Constant"
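case7 uses the csv.[key] and matrix.[key] naming for array inputs, with the keys key1 and asdf. The sketch below shows one way such variables could be created from a keyed collection. The named list input.array, the made-up data and the use of assign are assumptions for illustration, not the component's confirmed implementation.

```r
# Hypothetical stand-in for the array input: a named list keyed by array key.
input.array <- list(
  key1 = data.frame(C1 = c("a", "b"), N1 = c(1, 2), stringsAsFactors = FALSE),
  asdf = data.frame(C3 = c("p", "q"), N3 = c(10, 20), stringsAsFactors = FALSE)
)

# Create csv.[key] and matrix.[key] variables for every key.
for (key in names(input.array)) {
  df <- input.array[[key]]
  numeric.cols <- vapply(df, is.numeric, logical(1))
  assign(paste0("csv.", key), df)
  assign(paste0("matrix.", key), as.matrix(df[, numeric.cols, drop = FALSE]))
}

# transform2 from case7: element-wise sum of the two numeric matrices.
print(matrix.key1 + matrix.asdf)
```

With this naming, transforms address array elements by key in the same way they address csv1 and matrix1.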


Generated 2018-12-12 07:42:06 by Anduril 2.0.0