
BamReducer

This function reduces the reads in a BAM file using the Genome Analysis Toolkit (GATK) 2 and outputs a much more compact BAM file. The reduced file can still be used for variant calling (at least with GATK) and for visualization. Note that GATK 1 does not include this functionality.

Complete documentation:

Version: 1.0
Bundle: sequencing
Categories: Alignment
Authors: Rony Lindell (rony.lindell@helsinki.fi)
Issue tracker: View/Report issues
Source files: component.xml, function.scala
Usage: Example with default values

Inputs

Name       Type   Mandatory  Description
reference  FASTA  Mandatory  The reference FASTA file.
bam        BAM    Mandatory  Input BAM file.

Outputs

Name       Type  Description
alignment  BAM   Reduced alignment BAM file.

Parameters

Name        Type     Default  Description
cleanup     boolean  false    Removes the input alignment files by replacing them with empty files. Use this to save space when the previous alignments are no longer needed.
downsample  int      -1       Coverage of a variable region is downsampled to this value. Use -1 to keep the default downsampling value of GATK. A value of 0 turns downsampling off. If memory problems occur, lowering the value (e.g. to 40) might help.
gatk        string   ""       Path to the GATK directory containing the 'GenomeAnalysisTK.jar' file. If an empty string is given (default), the GATK_HOME environment variable is assumed to point to the GATK directory where GenomeAnalysisTK.jar is located.
memory      string   "4g"     Amount of Java heap memory allocated to the GATK process, given in the format "4g" for 4 gigabytes or "2560m" for 2560 megabytes (2.5 gigabytes).
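
The parameters above can be read as a mapping onto a GATK command line. The following Python sketch is illustrative only and is not taken from the component's function.scala: the helper names (resolve_gatk_jar, build_command, cleanup_input) are hypothetical, and the ReduceReads walker name and its --downsample_coverage option are assumptions recalled from the GATK 2 documentation that should be checked against the installed GATK version.

```python
import os
from pathlib import Path


def resolve_gatk_jar(gatk="", env=None):
    """Return the path to GenomeAnalysisTK.jar.

    An empty `gatk` value falls back to the GATK_HOME environment
    variable, as described for the default of the `gatk` parameter.
    """
    env = os.environ if env is None else env
    gatk_dir = gatk or env.get("GATK_HOME", "")
    if not gatk_dir:
        raise ValueError("set the gatk parameter or the GATK_HOME variable")
    return str(Path(gatk_dir) / "GenomeAnalysisTK.jar")


def build_command(reference, bam, alignment, gatk="", memory="4g",
                  downsample=-1, env=None):
    """Assemble a GATK 2 command line (assumed invocation, not the component's own)."""
    cmd = [
        "java", "-Xmx" + memory,          # memory="4g" becomes -Xmx4g
        "-jar", resolve_gatk_jar(gatk, env),
        "-T", "ReduceReads",              # assumed GATK 2 walker name
        "-R", reference,
        "-I", bam,
        "-o", alignment,
    ]
    if downsample >= 0:
        # -1 means "use GATK's default", so the option is left out;
        # 0 is passed through and turns downsampling off.
        cmd += ["--downsample_coverage", str(downsample)]
    return cmd


def cleanup_input(bam):
    """Replace the input file with an empty file, as cleanup=true is described to do."""
    open(bam, "w").close()
```

Under these assumptions, build_command("ref.fasta", "in.bam", "out.bam", gatk="/opt/gatk", memory="1g", downsample=10) yields an argument list that could be passed to subprocess.run, with the settings used in test case case1.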

Test cases

Test case  Parameters  IN reference  IN bam  OUT alignment
case1      properties  reference     bam     (missing)

# Default run using less memory,
memory=1g,
downsample=10


Generated 2018-12-16 07:42:17 by Anduril 2.0.0