
Folder2Array

Constructs an array from folder contents. Unless the parameters specify otherwise, only atomic (non-folder) files are used as array elements. Folders are traversed recursively. The element order of the output array is deterministic: entries are sorted by file name, with atomic files processed first and sub-folders second.
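
The ordering described above can be illustrated with a minimal Java sketch. It is not taken from Folder2Array.java; the class name TraversalOrderSketch, the helper methods and the temporary file names are hypothetical, and sorting by plain base name is an assumption about what "sorted file names" means.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration only; not the component's actual source.
class TraversalOrderSketch {

    /** Adds atomic files of dir in sorted order, then recurses into sorted sub-folders. */
    static void collect(File dir, List<File> out) {
        File[] entries = dir.listFiles();
        if (entries == null) {
            return; // not a directory, or not readable
        }
        // Assumption: entries are ordered by base name.
        Arrays.sort(entries, Comparator.comparing(File::getName));
        for (File entry : entries) {
            if (entry.isFile()) {
                out.add(entry); // atomic files first
            }
        }
        for (File entry : entries) {
            if (entry.isDirectory()) {
                collect(entry, out); // sub-folders second
            }
        }
    }

    /** Builds a small temporary tree and returns the base names in traversal order. */
    static List<String> demo() {
        try {
            Path root = Files.createTempDirectory("folder2array");
            Files.createFile(root.resolve("b.txt"));
            Files.createFile(root.resolve("a.txt"));
            // "0sub" sorts before "a.txt", but is still visited after the files.
            Path sub = Files.createDirectory(root.resolve("0sub"));
            Files.createFile(sub.resolve("c.txt"));

            List<File> found = new ArrayList<>();
            collect(root.toFile(), found);
            List<String> names = new ArrayList<>();
            for (File f : found) {
                names.add(f.getName());
            }
            return names;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [a.txt, b.txt, c.txt]
    }
}
```

In this sketch the files of the root folder come before the content of the sub-folder, even though the sub-folder name sorts first.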

Version 1.3
Bundle builtin
Categories Internal
Specialties generic
Authors Kristian Ovaska (kristian.ovaska@helsinki.fi)
Source files component.xml Folder2Array.java

Type parameters (generics)

Inputs

Name Type Mandatory Description
folder1 T1 (generic) Optional Input folder 1.
folder2 T1 (generic) Optional Input folder 2.
folder3 T1 (generic) Optional Input folder 3.
folder4 T1 (generic) Optional Input folder 4.
folder5 T1 (generic) Optional Input folder 5.
folder6 T1 (generic) Optional Input folder 6.
folder7 T1 (generic) Optional Input folder 7.
folder8 T1 (generic) Optional Input folder 8.
folder9 T1 (generic) Optional Input folder 9.

Outputs

Name Type Description
out Array<T2> (generic) Output array.

Parameters

Name Type Default Description
excludePattern string "\\..*" Exclude pattern for files and folders.
filePattern string ".*" Java regular expression for matching file names and producing element key values. Only files whose base name (last component of absolute name) matches the pattern are included in output.
folderPattern string ".*" Java regular expression for selecting folders to search. If a folder's base name does not match, the folder is not visited and its sub-tree is not searched.
includeFolders boolean false If true, folders in the root of the input folder are listed as array elements. The component will NOT recursively enter the sub-folders.
keyMode string "pattern" Mode used to generate the element keys. One of the following:
  • pattern - Pattern-based key: if filePattern contains a capturing group (marked by parentheses), the contents of this group are used as the key value for the element. Example: (.*)[.]txt matches files with the .txt extension and uses the part without the extension as the key.
  • filename - Filename-based key: the relative path inside the input folder is used as the key. Note that all non-alphanumeric symbols are replaced with underscores. Format of the key: folderi_relative_file_path, where folderi is the input folder of the file and relative_file_path is the relative path of the file inside folderi.
  • hashcode - Hashcode-based key: the hashcode is generated from the absolute file path. Format of the key: keyhashcode.
  • number - Number-based key: keys are numbered sequentially, starting from 1.
The pattern method is best suited to cases where the semantics of a file in the array need to be derived from its key value. The filename method can be used when the input folders have a complex directory structure. The hashcode method is best used when the value of the key is not important but its uniqueness is crucial. By default, the pattern method is used.
timeStampPrefix boolean false If true, the time stamp of the file is added as a prefix to the file's key, so that the key has the form time_stamp_original_key. This prefix is useful when downstream components need the time stamp. For instance, TCGAClinStorScanner from the TCGA bundle extracts the time stamp of a file from its key value.
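
The pattern and filename key modes described above can be sketched in Java as follows. This is an illustration under stated assumptions, not the code of Folder2Array.java: the class and method names are hypothetical, the pattern is assumed to match the whole base name, "alphanumeric" is assumed to mean ASCII letters and digits, and the fallback used when the pattern has no capturing group is a guess, since the description does not specify it.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not the component's actual source.
class KeyModeSketch {

    /**
     * keyMode=pattern: returns the contents of the first capturing group,
     * or null when the base name does not match (file not included).
     * Assumption: without a capturing group the whole base name is used.
     */
    static String patternKey(String filePattern, String baseName) {
        Matcher m = Pattern.compile(filePattern).matcher(baseName);
        if (!m.matches()) {
            return null;
        }
        return m.groupCount() >= 1 ? m.group(1) : baseName;
    }

    /**
     * keyMode=filename: joins the input folder name and the relative path,
     * then replaces every character that is not an ASCII letter or digit
     * with an underscore.
     */
    static String filenameKey(String inputFolder, String relativePath) {
        return (inputFolder + "_" + relativePath).replaceAll("[^A-Za-z0-9]", "_");
    }

    public static void main(String[] args) {
        // The pattern is the example from the keyMode description; file names are made up.
        System.out.println(patternKey("(.*)[.]txt", "sample1.txt"));   // prints sample1
        System.out.println(filenameKey("folder1", "sub/sample1.txt")); // prints folder1_sub_sample1_txt
    }
}
```

The hashcode and number modes are left out because the description does not give enough detail about their exact key format.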

Test cases

Test case            Parameters                                        Inputs    Outputs
case1_exclude_file   excludePattern=.*[.]jpg                           folder1   out
case2_filename       filePattern=(.*)_1[.]fq[.]gz, keyMode=filename    folder1   out
case3_pattern        filePattern=(.*)_1[.]fq[.]gz, keyMode=pattern     folder1   out

Inputs folder2 to folder9 are missing in all test cases.
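
The effect of the test case parameters on file selection can be sketched as follows. The file names are made up, the class PatternFilterSketch is hypothetical, and two points are assumptions: that excludePattern is checked before filePattern, and that the documented default "\\..*" is the escaped form of the regular expression \..* (names starting with a dot).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration only; not the component's actual source.
class PatternFilterSketch {

    /** Keeps base names that match filePattern and do not match excludePattern. */
    static List<String> select(List<String> baseNames, String filePattern, String excludePattern) {
        Pattern include = Pattern.compile(filePattern);
        Pattern exclude = Pattern.compile(excludePattern);
        List<String> kept = new ArrayList<>();
        for (String name : baseNames) {
            if (exclude.matcher(name).matches()) {
                continue; // excluded names are dropped first (assumed order)
            }
            if (include.matcher(name).matches()) {
                kept.add(name);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList(
                "photo.jpg", "notes.txt", "sampleA_1.fq.gz", "sampleA_2.fq.gz");

        // case1_exclude_file: default filePattern, excludePattern=.*[.]jpg
        System.out.println(select(names, ".*", ".*[.]jpg"));
        // prints [notes.txt, sampleA_1.fq.gz, sampleA_2.fq.gz]

        // case2_filename and case3_pattern: filePattern=(.*)_1[.]fq[.]gz, default excludePattern
        System.out.println(select(names, "(.*)_1[.]fq[.]gz", "\\..*"));
        // prints [sampleA_1.fq.gz]
    }
}
```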


Generated 2018-12-11 07:42:05 by Anduril 2.0.0