
CSV2IDList

Extracts one column from each given CSV file and prints the values without duplicates. NA values are removed. Only one column per input can be processed at a time, so taking the union of two columns of the same file requires that the file be specified twice as an input table. The result is the union over all the inputs.
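The behaviour described above can be sketched in a few lines of Python. This is only an illustration, not the component's implementation: the function name `id_union` is hypothetical, and the sketch assumes tab-delimited files, the literal string NA for missing values, and first-seen output order, none of which is stated on this page.

```python
import csv
import io

NA = "NA"  # assumption: missing values are written as the literal string NA


def id_union(inputs):
    """Collect unique, non-NA values from one column per input.

    `inputs` is a list of (csv_text, column_name) pairs. An empty column
    name selects the first column of that file.
    """
    seen = set()
    result = []
    for csv_text, column in inputs:
        # Assumption: the files are tab-delimited with a header row.
        reader = csv.DictReader(io.StringIO(csv_text), delimiter="\t")
        header = reader.fieldnames or []
        if not header:
            continue  # empty file contributes nothing
        key = column or header[0]
        for row in reader:
            value = row[key]  # raises KeyError if the column is missing
            if value is None or value == NA or value in seen:
                continue
            seen.add(value)
            result.append(value)
    return result


table = "id\talias\ng1\tx\ng2\tNA\ng1\ty\n"
# The same file is listed twice to take the union of two of its columns.
ids = id_union([(table, "id"), (table, "alias")])
```

Listing `table` twice mirrors the rule that each input contributes exactly one column.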

Version 1.4
Bundle tools
Categories Convert
Authors Marko Laakso (Marko.Laakso@Helsinki.FI)
Issue tracker View/Report issues
Requires csbl-javatools.jar (jar); installer (bash)
Source files component.xml
Usage Example with default values

Inputs

Name Type Mandatory Description
in1 CSV Optional The first input relation
in2 CSV Optional The second input relation
in3 CSV Optional The third input relation
in4 CSV Optional The fourth input relation
in5 CSV Optional The fifth input relation
in6 CSV Optional The sixth input relation
in7 CSV Optional The seventh input relation
in8 CSV Optional The eighth input relation
in9 CSV Optional The ninth input relation
array Array<CSV> Optional An array of input files

Outputs

Name Type Description
out IDList A list of selected IDs

Parameters

Name Type Default Description
acceptMissing boolean false If true, files that lack the column named by columnIn are accepted and treated as empty.
columnIn string "" A comma-separated list of column names for the IDs of interest, one per table input. An empty entry refers to the first column of the file.
columnInArray string "" A comma-separated list of array_key=column_name pairs for the IDs of interest in array files. An empty value refers to the first column of the file.
columnOut string "" Name of the only column of the output list. If empty, the name of the input column is used.
constants string "" A comma-separated list of values that are always included in the output.
isList boolean false True if the selected column contains a comma-separated list of values to be split.
quotation boolean false Indicator that can be used to disable quotation of the output values
regexp1 string "" Regular expression for the row filtering in table1. A row is included in the result if this parameter is empty or if the values in the given columns match the given regular expressions. The parameter has the format COLNAME1=EXPRESSION1,COLNAME2=EXPRESSION2, where the COLNAMEs are column names in the input CSV and the EXPRESSIONs are regular expressions in Java syntax. For example, "col=a|b" includes rows where the column col has a value of "a" or "b".
regexp2 string "" Regular expression for the row filtering in table2
regexp3 string "" Regular expression for the row filtering in table3
regexp4 string "" Regular expression for the row filtering in table4
regexp5 string "" Regular expression for the row filtering in table5
regexp6 string "" Regular expression for the row filtering in table6
regexp7 string "" Regular expression for the row filtering in table7
regexp8 string "" Regular expression for the row filtering in table8
regexp9 string "" Regular expression for the row filtering in table9
regexpArr string "" Regular expression for the row filtering of array files
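The COLNAME=EXPRESSION filter format and the isList splitting can be illustrated with a small Python sketch. The helper names are hypothetical. The sketch uses Python's `re` module rather than Java regular expressions, and it assumes that an expression must match the whole cell value, that all listed columns must match, and that list items are trimmed of surrounding whitespace. Splitting the parameter on commas also means that expressions containing a comma are not handled here.

```python
import re


def parse_filter(spec):
    """Parse 'COL1=EXPR1,COL2=EXPR2' into a list of (column, compiled regex)."""
    rules = []
    if not spec:
        return rules  # empty parameter: no filtering
    for part in spec.split(","):
        column, _, expression = part.partition("=")
        rules.append((column.strip(), re.compile(expression)))
    return rules


def row_matches(row, rules):
    """True if there are no rules or every rule matches the whole cell value."""
    return all(rx.fullmatch(row.get(col, "")) is not None for col, rx in rules)


def split_list(value):
    """isList=true: treat one cell as a comma-separated list of values."""
    return [item.strip() for item in value.split(",") if item.strip()]


rules = parse_filter("method=accept")
rows = [
    {"method": "accept", "value": "a,b"},
    {"method": "reject", "value": "c"},
]
kept = [v for r in rows if row_matches(r, rules) for v in split_list(r["value"])]
```

Here only the first row passes the filter, and its `value` cell is split into two separate IDs.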

Test cases

Test case Parameters in1 in2 in3 in4 in5 in6 in7 in8 in9 array out
case1 properties in1 (missing) (missing) (missing) (missing) (missing) (missing) (missing) (missing) (missing) out

columnIn=name,
isList=true

case2 properties in1 in2 in3 (missing) (missing) (missing) (missing) (missing) (missing) (missing) out

isList =true,
columnIn =value,V,value,
regexp1 =method=accept,
regexp3 =method=accept

case3 properties in1 (missing) (missing) in4 in5 in6 in7 in8 in9 (missing) out

acceptMissing=true,
quotation=true,
constants=N1,N2

case4 properties (missing) in2 (missing) (missing) (missing) (missing) (missing) (missing) (missing) array out

isList = true,
columnInArray = tf1=value,tf3=value,
regexpArr = method=accept,
columnIn = ,V,
columnOut = value
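To make the columnIn and columnInArray values in the test cases easier to read, the following Python sketch shows one way such strings could be interpreted. The function names are hypothetical, and the sketch assumes that the i-th entry of columnIn belongs to input in(i), that missing trailing entries count as empty, and that an empty entry (returned as `None`) means the first column of the file.

```python
def parse_column_in(spec, n_inputs=9):
    """Map 'value,V,value' to per-input column names; None means first column."""
    names = spec.split(",") if spec else []
    names += [""] * (n_inputs - len(names))  # pad missing trailing entries
    return [name.strip() or None for name in names[:n_inputs]]


def parse_column_in_array(spec):
    """Map 'tf1=value,tf3=value' to {array_key: column_name}."""
    mapping = {}
    for pair in filter(None, (p.strip() for p in spec.split(","))):
        key, _, column = pair.partition("=")
        mapping[key.strip()] = column.strip() or None
    return mapping


# Values taken from the case4 parameters listed above.
table_columns = parse_column_in(",V,")
array_columns = parse_column_in_array("tf1=value,tf3=value")
```

With the case4 values, only in2 gets an explicit column (V), while the array files with keys tf1 and tf3 both use the column value.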


Generated 2018-12-11 07:42:06 by Anduril 2.0.0