Imports gene expression or exon array data from Affymetrix CEL-files. CEL-files are recognized by the suffix 'CEL'.

Version 1.1.2
Bundle microarray
Categories Data Import Affy
Authors Marko Laakso (Marko.Laakso@Helsinki.FI), Kari Nousiainen (Kari.Nousiainen@Helsinki.FI), Ping Chen (ping.chen@helsinki.fi)
Requires R ; affy (R-bioconductor) ; affydata (R-bioconductor) ; gcrma (R-bioconductor) ; plier (R-bioconductor) ; aroma.affymetrix (R-package) ; affxparser (R-bioconductor) ; AffymetrixDataTestFiles (R-bioconductor)
Usage Example with default values


Inputs

Name Type Mandatory Description
affy AffyDirectory Mandatory Affymetrix source file directory.
sampleNames CSV Mandatory Sample definitions. The table contains columns "SampleID" (key), "Filename" (relative to the Affy source directory), "Description". If some samples are not found in the source file directory, they are ignored.
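
The matching between the sample definitions and the source directory can be sketched in Python as follows. This is only an illustration, not the component's own code: the function name `match_samples`, the tab delimiter, the case-sensitive `.CEL` check, and the error type are all assumptions. The `accept_cel_only` and `skip_cel_files` arguments mirror the acceptCELOnly and skipCELFiles parameters.

```python
import csv
import io


def match_samples(sample_table, dir_files, accept_cel_only=True,
                  skip_cel_files=False, delimiter="\t"):
    """Pair sample definitions with the files present in the source directory.

    sample_table: text of the sample definition table (hypothetical layout:
                  delimited text with SampleID, Filename, Description columns).
    dir_files:    file names found in the Affy source directory.
    """
    # Optionally keep only files carrying the CEL suffix (assumed case-sensitive).
    candidates = [f for f in dir_files
                  if not accept_cel_only or f.endswith(".CEL")]

    rows = csv.DictReader(io.StringIO(sample_table), delimiter=delimiter)
    defined = {row["Filename"]: row["SampleID"] for row in rows}

    # Samples whose file is not in the directory are silently ignored.
    matched = {sid: name for name, sid in defined.items() if name in candidates}

    # Files in the directory that no sample definition mentions.
    undefined = [f for f in candidates if f not in defined]
    if undefined and not skip_cel_files:
        raise ValueError("Files without a sample definition: %s" % undefined)

    return matched
```

Here a sample listed in the table but absent from the directory simply drops out of the result, while an unlisted file in the directory raises an error unless `skip_cel_files` is true.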


Outputs

Name Type Description
expr LogMatrix Normalized gene expression matrix. Expression values are in log base 2 scale.
exprRaw LogMatrix Raw, unnormalized gene expression matrix. Note that expr and exprRaw usually have different numbers of rows.
groups SampleGroupTable Sample group definitions. They are automatically generated based on sampleNames.
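
Because the expression values are on a log base 2 scale, a difference between two values corresponds to a ratio of the underlying intensities. A small Python sketch of that arithmetic follows; the function name `fold_change` is hypothetical and not part of the component.

```python
def fold_change(log2_a, log2_b):
    """Ratio of linear-scale intensities for two log2 expression values."""
    # Subtracting in log2 space equals dividing in linear space.
    return 2.0 ** (log2_a - log2_b)


# A log2 value of 10.0 against 8.0 is a difference of 2, i.e. a 4-fold ratio.
print(fold_change(10.0, 8.0))  # 4.0
```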


Parameters

Name Type Default Description
acceptCELOnly boolean true If true, only files with the .CEL extension are read. If false, all file extensions are accepted.
arrayType string (no default) A string specifying the type of arrays to read. The type is either 'geneArray' or 'exonArray'.
cdf string (no default) Chip annotation file as defined in Bioconductor. For exon arrays, use 'exon.pmcdf' when 'normalizationMethod' is 'rma' or 'plier', and 'HuEx-1_0-st-v2' when it is 'firma'.
idType string "AffyProbeSet" Probe set annotation type that is used as a column name for the identifiers of the output file.
metaData string "" The directory for storing metadata when 'normalizationMethod' is set to 'firma'. Otherwise, it is not needed.
normalizationMethod string "rma" Normalization method. For gene expression arrays, one of "rma", "gcrma", "mas5", "dChip" or "plier". For exon arrays, one of "rma", "firma" or "plier".
quantileNormalization boolean true Whether quantile normalization is applied. Used by the rma and plier algorithms.
skipCELFiles boolean false If true, CEL-files found in the source file directory but not defined in the sample definition file are skipped. If false, any such CEL-file causes an error.
suffixLength int 0 Number of characters to remove from the end of each gene ID. Used only for gene expression arrays.
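
The constraints between arrayType, normalizationMethod and cdf listed above can be checked before a run. The Python sketch below is only an illustration under assumptions: the function name `check_parameters` and the use of `ValueError` are hypothetical, and the component itself may validate its parameters differently.

```python
# Methods listed in the parameter table for each array type.
GENE_METHODS = ("rma", "gcrma", "mas5", "dChip", "plier")
EXON_METHODS = ("rma", "firma", "plier")


def check_parameters(array_type, normalization_method):
    """Validate the method for the array type.

    Returns the cdf value the table recommends for exon arrays,
    or None for gene arrays, where no specific cdf is prescribed.
    """
    if array_type == "geneArray":
        allowed = GENE_METHODS
    elif array_type == "exonArray":
        allowed = EXON_METHODS
    else:
        raise ValueError("arrayType must be 'geneArray' or 'exonArray'")

    if normalization_method not in allowed:
        raise ValueError("%s is not available for %s"
                         % (normalization_method, array_type))

    if array_type == "exonArray":
        # 'firma' uses the HuEx cdf; 'rma' and 'plier' use exon.pmcdf.
        if normalization_method == "firma":
            return "HuEx-1_0-st-v2"
        return "exon.pmcdf"
    return None
```

For instance, `check_parameters("exonArray", "firma")` returns `'HuEx-1_0-st-v2'`, while `check_parameters("exonArray", "gcrma")` raises an error because gcrma is listed only for gene expression arrays.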

Test cases

Test case Parameters IN
case1 properties affy sampleNames (missing) (missing) groups

cdf=HGU133A
arrayType=geneArray
normalizationMethod=rma

