
ACGHsegment

Segments Array CGH data with the Circular Binary Segmentation (CBS) algorithm and produces hard or soft copy number aberration calls.

The 'geneAnnotation' input file must contain separate columns for the chromosome, the start position, and the end position (Chr, start, and end by default). Chromosome identifiers must not carry the prefix 'chr'. Valid chromosome identifiers are 1, 2, ..., 22, X, and Y. The names of these columns in the 'geneAnnotation' file are user definable.
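The annotation requirements above can be checked before the file is passed to the component. The following is a minimal sketch in Python, not part of the component itself; the helper names `normalize_chromosome` and `check_annotation_columns` are hypothetical, and the default column names are taken from the parameter defaults documented on this page.

```python
# Valid identifiers according to this page: 1..22, X, Y (no 'chr' prefix).
VALID_CHROMOSOMES = {str(n) for n in range(1, 23)} | {"X", "Y"}


def normalize_chromosome(identifier):
    """Strip an optional 'chr' prefix and return the bare identifier.

    Raises ValueError if the result is not one of 1..22, X, Y.
    """
    text = str(identifier).strip()
    if text.lower().startswith("chr"):
        text = text[3:]
    text = text.upper()
    if text not in VALID_CHROMOSOMES:
        raise ValueError(f"invalid chromosome identifier: {identifier!r}")
    return text


def check_annotation_columns(header, chr_col="Chr", start_col="start", end_col="end"):
    """Return the required column names that are missing from the header."""
    return [name for name in (chr_col, start_col, end_col) if name not in header]
```

If the annotation file uses other column names, pass them to `check_annotation_columns` in the same way they would be given to the chrColumn, bpStartCol, and bpEndCol parameters.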

Two output modes exist. In the first (hard), the copy number of a probe is called aberrated if its intensity lies more than two standard deviations away from the mean of the intensities of all the samples under scrutiny. In the second (soft), copy number aberration calls are made probabilistically with either the CGHcall or the FastCall (part of the TASSO package) algorithm. Please note that FastCall only works on 32-bit systems.
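The hard calling rule can be illustrated with a short Python sketch. It assumes the thresholds are the mean of all intensities plus and minus two standard deviations, as stated in the lowerLimit and upperLimit parameter descriptions; the use of the sample standard deviation and the function names are assumptions made for illustration, not the component's actual R code.

```python
from statistics import mean, stdev


def hard_call_thresholds(intensities, n_sd=2.0):
    """Return (lower, upper) thresholds as mean -/+ n_sd standard deviations.

    Assumption: sample standard deviation over all non-missing intensities.
    """
    values = [v for v in intensities if v is not None]
    center = mean(values)
    spread = stdev(values)
    return center - n_sd * spread, center + n_sd * spread


def hard_call(value, lower, upper):
    """Classify one value as 'loss', 'gain', or 'normal'."""
    if value < lower:
        return "loss"
    if value > upper:
        return "gain"
    return "normal"
```

Explicit values for lowerLimit and upperLimit would replace the computed pair returned by `hard_call_thresholds`.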

Several plots can be produced based on user parameters. Example images can be found here.

The outputs depend on the analysis procedure. The following matrices can be output: all segments with their respective copy number aberration calls, thresholds for significant copy number aberrations, and segment call probabilities.

Version 2.0
Bundle microarray
Categories Agilent Copy Number Analysis
Authors Riku Louhimo (Riku.Louhimo@Helsinki.FI)
Issue tracker View/Report issues
Requires R ; limma (R-bioconductor) ; DNAcopy (R-bioconductor) ; CGHcall (R-bioconductor) ; TASSO (R-package)
Source files component.xml ACGHsegment.r
Usage Example with default values

Inputs

Name Type Mandatory Description
caseChan CSV Mandatory CSV file containing the normalized probe intensities for the case channel. First row should have the probe names and samples should be columnwise.
geneAnnotation AnnotationTable Mandatory Probewise annotations as produced by the AgilentReader component. Only probes present in the caseChan CSV may be included. Use CSVFilter if the probes do not otherwise match.

Outputs

Name Type Description
report Latex Latex report for the analysis.
segments CSV Segmented probewise copy number change values.
tholds CSV Thresholds that were used to define a segment aberrated.
lossProbs CSV Probewise probability of a loss having occurred in the segment to which the probe was assigned.
gainProbs CSV Probewise probability of a gain having occurred in the segment to which the probe was assigned.
normProbs CSV Probewise probability of no copy number change in the segment to which the probe was assigned.
rawSegments CSV Probewise segment means.
frequency CSV Probewise copy-number alteration frequencies.
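As an illustration of what a probewise alteration frequency can mean, the sketch below computes, for each probe, the fraction of samples called as gain and as loss. This is one plausible reading of the frequency output, not a description of the component's actual file layout; the function name and the call labels are hypothetical.

```python
def alteration_frequencies(calls):
    """Map each probe to the fraction of samples with a gain or a loss.

    `calls` maps a probe name to a list of per-sample calls, each one of
    'gain', 'loss', or 'normal' (labels assumed for this sketch).
    """
    result = {}
    for probe, probe_calls in calls.items():
        n_samples = len(probe_calls)
        if n_samples == 0:
            # No samples: report zero frequencies instead of dividing by zero.
            result[probe] = {"gain": 0.0, "loss": 0.0}
            continue
        result[probe] = {
            "gain": probe_calls.count("gain") / n_samples,
            "loss": probe_calls.count("loss") / n_samples,
        }
    return result
```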

Parameters

Name Type Default Description
CGHCallMaxnumseg int 100 Only used if callProbMethod=CGHCall. Maximum number of segments on a sample to be used for fitting the probability model.
CGHCallPrior string "auto" Only used if callProbMethod=CGHCall. Method for determining the prior probabilities in the CGHCall algorithm. Must be one of "auto", "all", "not all".
CGHCallRobustsig boolean true Only used if callProbMethod=CGHCall. Setting this to true enforces a lower bound on the normal segments.
alpha float 0.01 P-value of CBS to accept a break point.
bpEndCol string "end" Column name for chromosome end basepair in geneAnnotation.
bpStartCol string "start" Column name for chromosome start basepair in geneAnnotation.
callProbMethod string "CGHCall" Method with which to call CNA segment probabilities. Must be either CGHCall or FastCall. FastCall is significantly (30000 times) faster while CGHCall is more accurate.
callProbabilities boolean false Enabling this will make the component estimate probabilities for CNA segments via the CGHcall package. Only guaranteed to work when multiple chromosomes are analyzed simultaneously.
chrColumn string "Chr" Column name for chromosome identifiers in geneAnnotation.
filterNA boolean true Filter out probes with NA values. This is always true when 'callProbabilities' is true.
lowerLimit float 0.0 Intensity limit below which a segment is called lost. With the default value, the component computes the mean of the sample set and sets the lower threshold for a CNA call two standard deviations below it.
minWidth int 2 Minimum number of probes for CBS to define a segment. CBS only allows widths between 2 and 5.
nSegFit int 3000 Maximum number of segments used for fitting the mixture model in CGHcall probability calculations. Disabled if callProbabilities=false. Decreasing this lowers accuracy but can speed computation significantly.
outputAllSegs boolean false Output all segmentation results even if non-significant or non-aberrated.
plotChromosomes string "0" Comma-separated list of the chromosomes to plot. The whole genome is plotted by default for each sample. Any value other than 0 generates additional plots of the listed chromosomes for each sample.
plotEverySample boolean false Enabling this will make the component print each sample separately.
undoSD int 3 Only used if 'undoSplits=sdundo'. Minimum number of SDs by which two adjacent segments must differ for the split between them to be kept; segments closer than this are combined.
undoSplits string "sdundo" A character string specifying how change-points are to be undone, if at all. Undoing change-points decreases the sensitivity of segmentation. Choices are "none", "prune", which uses a sum of squares criterion, and "sdundo" (default), which undoes splits that are not at least undoSD SDs apart (3 by default).
upperLimit float 0.0 Intensity limit above which a segment is called gained. With the default value, the component computes the mean of the sample set and sets the upper threshold for a CNA call two standard deviations above it.
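The value constraints stated in the parameter descriptions can be checked up front. The following Python sketch is only an illustration of those constraints; `validate_parameters` is a hypothetical helper and does not reproduce the component's own error handling.

```python
# Allowed values as listed in the parameter descriptions on this page.
ALLOWED_UNDO_SPLITS = {"none", "prune", "sdundo"}
ALLOWED_CALL_METHODS = {"CGHCall", "FastCall"}
ALLOWED_PRIORS = {"auto", "all", "not all"}


def validate_parameters(params):
    """Return a list of problems found; an empty list means none were found."""
    errors = []
    min_width = params.get("minWidth", 2)
    if not 2 <= min_width <= 5:
        errors.append(f"minWidth must be between 2 and 5, got {min_width}")
    undo = params.get("undoSplits", "sdundo")
    if undo not in ALLOWED_UNDO_SPLITS:
        errors.append(f"undoSplits must be one of {sorted(ALLOWED_UNDO_SPLITS)}, got {undo!r}")
    method = params.get("callProbMethod", "CGHCall")
    if method not in ALLOWED_CALL_METHODS:
        errors.append(f"callProbMethod must be CGHCall or FastCall, got {method!r}")
    prior = params.get("CGHCallPrior", "auto")
    if prior not in ALLOWED_PRIORS:
        errors.append(f"CGHCallPrior must be one of {sorted(ALLOWED_PRIORS)}, got {prior!r}")
    return errors
```

For example, `validate_parameters({"minWidth": 6, "undoSplits": "none"})` reports a problem for minWidth, which matches the case4_InvalidParam test case that is expected to fail.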

Test cases

Test case Parameters IN caseChan IN geneAnnotation OUT report OUT segments OUT tholds OUT lossProbs OUT gainProbs OUT normProbs OUT rawSegments OUT frequency
case1 properties caseChan geneAnnotation report segments tholds lossProbs gainProbs normProbs rawSegments (missing)

chrColumn=Chromosome,
bpStartCol=startBp,
bpEndCol=endBp,
filterNA=true

case2 properties caseChan geneAnnotation report segments tholds lossProbs gainProbs normProbs (missing) (missing)

plotChromosomes = 6,7,
plotEverySample = true,
callProbabilities = true,
nSegFit = 500,
CGHCallPrior = all,
lowerLimit = -0.650366166469952,
upperLimit = 0.587166166469953,
filterNA = false

case3 properties caseChan geneAnnotation report segments tholds (missing) (missing) (missing) (missing) (missing)

upperLimit=0.583291228317388,
lowerLimit=-0.646491228317388,
callProbabilities=false

case4_InvalidParam properties caseChan geneAnnotation (expecting failure) (expecting failure) (expecting failure) (expecting failure) (expecting failure) (expecting failure) (expecting failure) (expecting failure)

minWidth=6,
undoSplits=none

case5_invalidColumnNames properties caseChan geneAnnotation (expecting failure) (expecting failure) (expecting failure) (expecting failure) (expecting failure) (expecting failure) (expecting failure) (expecting failure)

undoSplits=prune

case6_allSegs properties caseChan geneAnnotation (missing) segments (missing) (missing) (missing) (missing) (missing) (missing)

upperLimit=1,
lowerLimit=-1,
callProbabilities=false,
outputAllSegs=true

Test network

Display TestNetwork
Generated 2018-12-11 07:42:06 by Anduril 2.0.0