Rony Lindell

| Component | Version | Inputs | Outputs | Description |
| --- | --- | --- | --- | --- |
| BamReducer | 1.0 | reference, bam | alignment | This function reduces the reads in a bam file using the Genome Analysis Toolkit (GATK) 2 and outputs a dramatically more compressed bam file. |
| BamRegrouper | 1.0 | bam | alignment | Picard-tools is used to add or replace read group information for each read in the input alignment file. |
| BamReorder | 1.0 | reference, bam | alignment | This function reorders reads in a BAM file to match the contig ordering in the provided reference file. |
| BaseRecalibrator | 1.0 | reference, bam, dbsnp, mask, intervals | alignment, report, plots | This function will do base quality score recalibration using Genome Analysis Toolkit (GATK). |
| DuplicateMarker | 2.0 | in | outBam, outMetrics | Mark PCR and optical duplicate reads in BAM files using Picard MarkDuplicates. |
| ReferenceIndexer | 1.0 | reference | index | The following auxiliary files for the reference fasta sequence will be created: |
| Sam2Fastq | 1.0 | alignment | folder, reads, mate | This function uses the Picard java library to perform a conversion from SAM/BAM alignment format back to FASTQ sequence format. |
| VariantCaller | 1.0 | reference, bam1, bam2, bams, intervals, dbsnp | snp, indel, metrics | Calls genomic sites of variation using the specified caller. |
| VariantCombiner | 1.0 | reference, variants, variants1, variants2, variants3, variants4, variants5 | calls | This function combines all input variant files into one file using the Genome Analysis Toolkit (GATK). |
| VariantLiftover | 1.0 | chain, oldReference, newReference, variants | calls | This is analogous to the LiftOver component, but for VCF files. |
| VariantRecalibrator | 1.0 | reference, variants, hapmap, omni, hcsnp, dbsnp, mills | calls | This function will apply machine learning in order to improve the input variants. |
| VariantSelector | 1.0 | reference, variants, intervals, conc, disc | calls | Select variants from a VCF file using specific criteria, e.g. type, annotations or genomic intervals. |
| VariantValidator | 1.0 | reference, variants, dbsnp | errors | This function validates the input variant file using the Genome Analysis Toolkit (GATK). |
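
As a rough illustration of what a component such as DuplicateMarker wraps, the sketch below builds a Picard MarkDuplicates command line in Python. It is not the component's actual implementation. The mapping of the `in`, `outBam` and `outMetrics` ports to Picard's `INPUT`, `OUTPUT` and `METRICS_FILE` arguments is an assumption, and the function name, the file names and the `picard.jar` path are hypothetical.

```python
from typing import List


def mark_duplicates_command(in_bam: str, out_bam: str, out_metrics: str,
                            picard_jar: str = "picard.jar") -> List[str]:
    """Build the argument list for a Picard MarkDuplicates run.

    Assumption: the component's `in`, `outBam` and `outMetrics` ports map to
    Picard's INPUT, OUTPUT and METRICS_FILE arguments (KEY=VALUE syntax).
    """
    return [
        "java", "-jar", picard_jar, "MarkDuplicates",
        f"INPUT={in_bam}",
        f"OUTPUT={out_bam}",
        f"METRICS_FILE={out_metrics}",
    ]


# Hypothetical file names, for illustration only.
cmd = mark_duplicates_command("sample.bam", "sample.marked.bam", "sample.metrics.txt")
print(" ".join(cmd))
```

The list form can be passed directly to `subprocess.run(cmd, check=True)` on a machine where Java and Picard are installed. Building the arguments as a list rather than a single string avoids shell quoting problems with file paths.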

