
DiffMeth

Computes differential methylation at base, window or region resolution, using functions from the R package methylKit.

Version 1.0
Bundle sequencing
Categories Analysis
Authors Chiara Facciotto (chiara.facciotto@helsinki.fi)
Requires R ; GenomicRanges (R-bioconductor) ; data.table (R-package) ; download (bash) ; methylKit
Source files component.xml DiffMeth.R
Deprecated

This component is no longer part of the methylation pipeline.

Inputs

Name Type Mandatory Description
rawFilteredNormMethyl BinaryFile Mandatory Binary file containing the RData output from MethFilterNorm
regions BED Optional A 6-field BED file containing the locations and IDs of the regions over which methylation is to be summarized.
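
As an illustration of what a 6-field BED file contains, the following minimal sketch parses one into records. It assumes the standard BED column order (chrom, start, end, name, score, strand) and tab-separated fields; the names `Region` and `parse_bed6` are hypothetical and not part of the component.

```python
from typing import NamedTuple


class Region(NamedTuple):
    chrom: str
    start: int   # 0-based, inclusive (BED convention)
    end: int     # 0-based, exclusive (BED convention)
    name: str    # region ID
    score: str   # kept as text, since some files use "." here
    strand: str  # "+", "-" or "."


def parse_bed6(lines):
    """Parse tab-separated 6-field BED lines into Region records."""
    regions = []
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\r\n")
        # Skip blank lines, comments and browser/track header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        if len(fields) != 6:
            raise ValueError(f"line {number}: expected 6 fields, got {len(fields)}")
        chrom, start, end, name, score, strand = fields
        if strand not in ("+", "-", "."):
            raise ValueError(f"line {number}: invalid strand {strand!r}")
        regions.append(Region(chrom, int(start), int(end), name, score, strand))
    return regions


if __name__ == "__main__":
    example = ["chr1\t100\t200\tregionA\t0\t+\n", "chr2\t50\t80\tregionB\t0\t-\n"]
    for region in parse_bed6(example):
        print(region)
```

Each region keeps its ID in the `name` field, which is how the summarized methylation can be traced back to the region it came from.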

Outputs

Name Type Description
plots Latex Folder containing plots of the correlation analysis, clustering and PCA.
differentialMethylation BinaryFile File containing the differential methylation information in RData format. This is the input file for the component SelectDiffMeth.
visualization BinaryFolder Folder containing the bedgraph file used to visualize the differential methylation across the genome.
analysis CSVList Folder containing CSV files with methylation summarized by region/window and differential methylation information.
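
To make the visualization output more concrete, here is a minimal sketch of how methylation differences could be written in bedGraph format (a track line followed by chrom, start, end and value columns). It assumes the input coordinates are 1-based and inclusive and converts them to the 0-based, half-open coordinates that bedGraph uses. The function name `to_bedgraph`, the track name and the example values are hypothetical, and the file actually written by the component may differ in detail.

```python
def to_bedgraph(records, track_name="meth_diff"):
    """Format (chrom, start, end, meth_diff) records as bedGraph text.

    Input coordinates are assumed to be 1-based and inclusive.
    """
    lines = [f'track type=bedGraph name="{track_name}"']
    # Sort by chromosome and start position, as genome browsers expect.
    for chrom, start, end, meth_diff in sorted(records, key=lambda r: (r[0], r[1])):
        # 1-based inclusive start -> 0-based start; the end stays the same.
        lines.append(f"{chrom}\t{start - 1}\t{end}\t{meth_diff:.2f}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Made-up methylation differences for two windows.
    print(to_bedgraph([("chr1", 21, 60, -12.5), ("chr1", 1, 40, 30.0)]), end="")
```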

Parameters

Name Type Default Description
agglMethod string "ward" Agglomeration method to be used for the clustering of samples. Possible values are "ward", "single", "complete", "average", "mcquitty", "median" and "centroid".
bedgraph boolean false Boolean value indicating if a bedgraph file should be written to the visualization folder. The bedgraph scores are the methylation differences (meth.diff).
correlation string "pearson" Correlation coefficient (or covariance) to be computed when performing sample correlation. Possible values are "pearson", "kendall" and "spearman".
covBases int 0 Minimum number of bases to be covered in a given window.
dataFrame boolean false Boolean value indicating if CSV files summarizing the methylation information for every sample should be printed in the analysis folder.
destrand boolean false Boolean parameter. If TRUE, consider only the strand-specific methylated Cs.
distance string "correlation" Distance method to be used for the clustering of samples. Possible values are "correlation", "euclidean", "maximum", "manhattan", "canberra", "binary" and "minkowski".
minPerGroup int (no default) Integer value denoting the minimum number of samples per condition needed to cover a region/window/base. By default, only regions/bases that are covered in all samples are united as a methylBase object. By supplying an integer for this argument, users can control how many samples are needed to cover a region/window/base for it to be united as a methylBase object. For example, if minPerGroup is set to 2 and there are 3 samples per condition, the bases/windows/regions that are covered in at least 2 samples will be united, and missing data for uncovered bases/regions will appear as NAs.
mode string "base" Designates whether methylation information has to be analyzed at base-pair, window or region resolution. Possible values are "base", "window" and "region".

When mode == "window", methylation is summarized in windows of a specified length using the parameters winSize, stepSize and covBases.

When mode == "region", methylation is summarized over specific regions using the parameters covBases and strand. The regions input file must be provided.
numCores int 1 Integer value denoting how many cores should be used for differential methylation calculations.
regionName string "" Name of the regions contained in the region file. No spaces or commas are allowed in the string.
similarityAnalysis boolean true Boolean value indicating if correlation analysis, clustering and PCA should be performed to analyze the similarities among samples.
slim boolean true Boolean value stating how to perform the P-value adjustment. If TRUE (default), the SLIM method is used for P-value adjustment. If FALSE, p.adjust with the method="BH" option is used for P-value correction.
stepSize int 1000 An integer for the step size of the tiling windows. It must be less than or equal to winSize.
weigthedMean boolean true Boolean value stating if the mean methylation difference between groups has to be computed using read coverage as weights.
winSize int 1000 An integer for the size of the tiling windows. It has to be an integer greater than 1.
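
The interplay of winSize, stepSize and covBases in window mode can be sketched as follows. This is a minimal illustration, not the component's or methylKit's implementation: it assumes 1-based inclusive coordinates and a first window starting at position 1, and the function name `tile_windows` is hypothetical. Windows that contain no covered position are never reported in this sketch.

```python
from collections import defaultdict


def tile_windows(positions, win_size=1000, step_size=1000, cov_bases=0):
    """Return (start, end, n_covered) for each tiling window that contains
    at least `cov_bases` of the given covered positions."""
    if win_size <= 1:
        raise ValueError("win_size must be greater than 1")
    if not 1 <= step_size <= win_size:
        raise ValueError("step_size must be between 1 and win_size")

    counts = defaultdict(int)
    for pos in positions:
        # Window k spans [1 + k*step_size, k*step_size + win_size].
        last = (pos - 1) // step_size
        first = max(0, -(-(pos - win_size) // step_size))  # ceiling division
        for k in range(first, last + 1):
            counts[k] += 1

    windows = []
    for k in sorted(counts):
        if counts[k] >= cov_bases:
            start = 1 + k * step_size
            windows.append((start, start + win_size - 1, counts[k]))
    return windows


if __name__ == "__main__":
    # Same window settings as test case case2_window: winSize=40, stepSize=20, covBases=1.
    for window in tile_windows([5, 25, 30, 61], win_size=40, step_size=20, cov_bases=1):
        print(window)
```

With stepSize smaller than winSize the windows overlap, so a single covered base can contribute to more than one window. With stepSize equal to winSize the windows are disjoint.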

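The minPerGroup rule described in the parameters can be illustrated with a small sketch. It assumes that coverage for one base/window/region is given as one value per sample, with None marking a sample that does not cover it. The function name `passes_min_per_group` and the example data are hypothetical and only mirror the behaviour described in the table.

```python
def passes_min_per_group(coverage, groups, min_per_group=None):
    """Decide whether one base/window/region is kept.

    coverage: read coverage per sample, None where the sample has no data.
    groups:   condition label per sample, same length as coverage.
    min_per_group: minimum covered samples per condition; None means that
                   every sample of every condition must be covered.
    """
    if len(coverage) != len(groups):
        raise ValueError("coverage and groups must have the same length")

    per_group = {}  # label -> (number of samples, number of covered samples)
    for cov, label in zip(coverage, groups):
        total, covered = per_group.get(label, (0, 0))
        per_group[label] = (total + 1, covered + (cov is not None))

    for total, covered in per_group.values():
        needed = total if min_per_group is None else min_per_group
        if covered < needed:
            return False
    return True


if __name__ == "__main__":
    groups = [0, 0, 0, 1, 1, 1]            # 3 samples per condition
    coverage = [10, None, 12, 8, 9, None]  # one sample missing in each condition
    print(passes_min_per_group(coverage, groups))                   # default: all samples required
    print(passes_min_per_group(coverage, groups, min_per_group=2))  # 2 of 3 per condition suffice
```

In the second call the item is kept, and the two uncovered samples would be the ones reported as NAs.
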
Test cases

Test case Parameters IN:rawFilteredNormMethyl IN:regions OUT:plots OUT:differentialMethylation OUT:visualization OUT:analysis
case1_default properties rawFilteredNormMethyl (missing) plots differentialMethylation (missing) (missing)

# Testing DiffMeth component,
minPerGroup=1

case2_window properties rawFilteredNormMethyl (missing) plots differentialMethylation (missing) (missing)

# Testing DiffMeth component,
mode=window,
winSize=40,
stepSize=20,
covBases=1,
minPerGroup=1

case3_region properties rawFilteredNormMethyl regions (missing) differentialMethylation (missing) (missing)

# Testing DiffMeth component,
mode=region,
regionName=regions,
covBases=1,
minPerGroup=1,
similarityAnalysis=false,
destrand=true

case4_dataFrame_and_bedgraph properties rawFilteredNormMethyl (missing) plots differentialMethylation visualization analysis

# Testing DiffMeth component,
dataFrame=true,
minPerGroup=1,
bedgraph=true


Generated 2018-12-18 07:42:24 by Anduril 2.0.0