
MethylCall

Performs methylation calling on aligned RRBS and WGBS data obtained from the component BSalign.
WARNING: This component requires a large amount of memory (~26GB or more for the human genome). On systems with limited memory, the user can set the -c/--chr option to process only the specified chromosomes and combine the results for all chromosomes afterwards.

Version 1.0
Bundle sequencing
Categories Alignment
Authors Chiara Facciotto (chiara.facciotto@helsinki.fi)
Issue tracker View/Report issues
Requires python; samtools
Source files component.xml MethylCall.sh
Usage Example with default values

Inputs

Name Type Mandatory Description
reference FASTA Mandatory The reference genome file in FASTA format. Gzipped FASTA is also supported.
alignment BAM Mandatory Aligned reads in bam format.

Outputs

Name Type Description
methylationCalling CSV CSV file with the following columns:

1) chromosome
2) coordinate (1-based)
3) strand
4) sequence context (2nt upstream to 2nt downstream in Watson strand direction)
5) methylation ratio, calculated as #C_counts / #eff_CT_counts
6) number of effective total C+T counts on this locus (#eff_CT_counts)
   ctSNP = "no-action": #eff_CT_counts = #CT_counts
   ctSNP = "correct": #eff_CT_counts = #CT_counts * (#rev_G_counts / #rev_GA_counts)
7) number of total C counts on this locus (#C_counts)
8) number of total C+T counts on this locus (#CT_counts)
9) number of total G counts on this locus of reverse strand (#rev_G_counts)
10) number of total G+A counts on this locus of reverse strand (#rev_GA_counts)
11) lower bound of 95% confidence interval of methylation ratio, calculated by Wilson score interval for binomial proportion.
12) upper bound of 95% confidence interval of methylation ratio, calculated by Wilson score interval for binomial proportion.
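
The formulas for columns 5, 6, 11 and 12 can be illustrated with a short Python sketch. It is not the component's implementation: the function name methylation_stats is hypothetical, and using #eff_CT_counts as the sample size of the Wilson interval, z = 1.96, capping the ratio at 1, and the fallback when there is no reverse-strand coverage are all assumptions made for illustration.

```python
import math


def methylation_stats(c, ct, rev_g, rev_ga, ct_snp="correct", z=1.96):
    """Return (ratio, eff_ct, ci_lower, ci_upper) for one locus, or None.

    Hypothetical helper illustrating output columns 5, 6, 11 and 12.
    """
    if ct_snp not in ("correct", "skip", "no-action"):
        raise ValueError("unknown ctSNP mode: %r" % ct_snp)
    # "skip": drop loci where an A was seen on the reverse strand.
    if ct_snp == "skip" and rev_ga > rev_g:
        return None
    eff_ct = float(ct)
    # "correct": scale the C+T depth by the reverse-strand G fraction.
    # Assumption: without reverse-strand coverage, leave the depth unchanged.
    if ct_snp == "correct" and rev_ga > 0:
        eff_ct = ct * (rev_g / rev_ga)
    if eff_ct <= 0:
        return None  # assumption: no usable coverage, nothing to report
    # Assumption: cap at 1, since the correction can push eff_ct below c.
    ratio = min(c / eff_ct, 1.0)
    # Wilson score interval for a binomial proportion with n = eff_ct.
    denom = 1.0 + z * z / eff_ct
    centre = ratio + z * z / (2.0 * eff_ct)
    half = z * math.sqrt(
        ratio * (1.0 - ratio) / eff_ct + z * z / (4.0 * eff_ct ** 2)
    )
    return ratio, eff_ct, (centre - half) / denom, (centre + half) / denom
```

For a locus with 8 C counts out of 10 C+T counts, methylation_stats(8, 10, 9, 10, "no-action") gives a ratio of 0.8 with an interval of roughly 0.49 to 0.94. With "correct" and 9 G out of 10 G+A on the reverse strand, the effective depth drops to 9 and the ratio rises to about 0.89. With "skip" the same locus is not reported.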

Parameters

Name Type Default Description
chr string "all" Option to process only the specified chromosomes. Chromosomes must be listed as comma-separated values without spaces, using full names such as chr1,chrX rather than the bare chromosome number or identifier (X, Y or MT).

Example: chr="chr1,chr2" uses ~4.5GB of memory, compared with ~26GB for the whole genome.
ctSNP string "correct" How to handle C/T SNPs when performing the methylation calling. Three modes are available: "no-action", "correct", "skip":
- "correct": correct the methylation ratio according to the C/T SNP information estimated from the G/A counts on the reverse strand.
- "skip": do not report loci where a C/T SNP is detected (i.e. an A is detected on the reverse strand).
- "no-action": do not consider C/T SNPs.
optionsMethylCall string "" Other options for methylation calling. This parameter is passed as written to the methylation calling command. Example: "-g true" combines the CpG methylation ratios from both strands.
pair boolean true Option to process only properly paired mappings (i.e. true -> process only properly paired reads, false -> process all aligned reads).
removeDuplicate boolean false Option to remove duplicated mappings to reduce PCR bias (i.e. true -> remove duplicated mappings, false -> process all mappings).

This option should not be used on RRBS data. For WGBS, high sequencing depth can make it hard to tell whether duplicates are caused by PCR.
trim int 2 Number of fill-in nucleotides, introduced during DNA fragment end repair, to be trimmed.

This option applies only to paired-end mapping. For RRBS, trim can be determined from the distance between the cutting sites on the forward and reverse strands.

For WGBS, trim is usually between 0 and 3.
unique boolean false Option to process only unique mappings/pairs (i.e. true -> process only unique mappings/pairs, false -> process all aligned reads).
zeroMeth boolean true Option to report loci with zero methylation ratios (i.e. true -> report loci with zero methylation ratios, false -> report only loci with non-zero methylation ratios).
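
The chr parameter expects a specific string format, and per-chromosome runs have to be combined afterwards. Both steps can be sketched in Python as follows. The helper names format_chr and combine_results are hypothetical, and the merge assumes that every per-chromosome output file starts with one header line; check the actual methylationCalling files before relying on it.

```python
import re


def format_chr(chromosomes):
    """Build a chr parameter value such as "chr1,chrX" (hypothetical helper)."""
    names = []
    for name in chromosomes:
        name = str(name).strip()
        # The parameter needs the "chr" prefix, not the bare identifier.
        if not name.startswith("chr"):
            name = "chr" + name
        if not re.fullmatch(r"chr[0-9A-Za-z_.]+", name):
            raise ValueError("invalid chromosome name: %r" % name)
        names.append(name)
    # Comma separated, without spaces.
    return ",".join(names)


def combine_results(paths, out_path):
    """Concatenate per-chromosome outputs, keeping only the first header.

    Assumption: every input file begins with exactly one header line.
    """
    with open(out_path, "w") as out:
        for index, path in enumerate(paths):
            with open(path) as handle:
                header = handle.readline()
                if index == 0:
                    out.write(header)
                for line in handle:
                    # Guard against a missing newline at the end of a file.
                    if not line.endswith("\n"):
                        line += "\n"
                    out.write(line)
```

For example, format_chr(["1", "X"]) yields the string chr1,chrX, which can be used as the chr value for one run; combine_results can then merge the outputs of several such runs into a single file.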

Test cases

Test case Parameters IN reference IN alignment OUT methylationCalling
case1_default (missing) reference alignment methylationCalling
case2_pair properties reference alignment methylationCalling

# Testing MethylCall component,
pair=false

case3_zeroMeth properties reference alignment methylationCalling

# Testing MethylCall component,
zeroMeth=false

case4_ctSNP_skip properties reference alignment methylationCalling

# Testing MethylCall component,
ctSNP=skip

case5_ctSNP_no-action properties reference alignment methylationCalling

# Testing MethylCall component,
ctSNP=no-action


Generated 2018-12-16 07:42:12 by Anduril 2.0.0