
Sam2Fastq

This function uses the Picard Java library to convert a SAM/BAM alignment file back to FASTQ sequence format. The resulting raw reads, including their quality values, can then be re-aligned using an alignment component. The round trip SAM->FASTQ->SAM can be useful when re-processing old alignments.

Normally, single-end reads are written to 'reads' and paired-end reads to 'reads' and 'mate'. If 'perReadgroup' is set to 'true', all read-group-specific output files are written to 'folder'. Use 'options' to add any number of additional picard-tools options to the run command. See the picard-tools manual for more information about SamToFastq.
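
The output routing described above can be sketched in Python as follows. This is only an illustration, not the component's actual implementation (which lives in function.scala). The helper name and the default file names are hypothetical. The argument names FASTQ, SECOND_END_FASTQ, OUTPUT_PER_RG and OUTPUT_DIR are Picard SamToFastq options, but how the component maps its outputs onto them is an assumption.

```python
from typing import List


def samtofastq_output_args(per_readgroup: bool, paired: bool,
                           reads: str = "reads.fastq",
                           mate: str = "mate.fastq",
                           folder: str = "folder") -> List[str]:
    """Return hypothetical Picard SamToFastq output arguments for one run."""
    if per_readgroup:
        # One FASTQ (or one FASTQ pair) per read group, all written to 'folder'.
        return ["OUTPUT_PER_RG=true", f"OUTPUT_DIR={folder}"]
    # Single-end reads, or the first read of each pair, go to 'reads'.
    args = [f"FASTQ={reads}"]
    if paired:
        # The second read of each pair goes to 'mate'.
        args.append(f"SECOND_END_FASTQ={mate}")
    return args


if __name__ == "__main__":
    print(samtofastq_output_args(per_readgroup=False, paired=True))
```

With 'perReadgroup' enabled the sketch ignores 'reads' and 'mate' entirely, which mirrors test case 2 below, where only 'folder' is produced.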

Version 1.0
Bundle sequencing
Categories Alignment
Authors Rony Lindell (rony.lindell@helsinki.fi)
Requires picard-tools
Source files component.xml function.scala
Usage Example with default values

Inputs

Name Type Mandatory Description
alignment AlignedReadSet Mandatory Input SAM/BAM alignment file to convert. The file should have the correct extension (.sam or .bam) so that Picard can determine the file type.
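
Because the file type is derived from the extension, it can help to check the input name before starting a run. The following is a minimal sketch of such a check. The function name is hypothetical, and treating the extension case-insensitively is an assumption that may not match how Picard itself decides.

```python
from pathlib import Path


def alignment_format(path: str) -> str:
    """Return 'SAM' or 'BAM' based on the file extension, else raise."""
    suffix = Path(path).suffix.lower()  # assumption: case-insensitive match
    if suffix not in {".sam", ".bam"}:
        raise ValueError(f"expected a .sam or .bam file, got: {path!r}")
    return suffix[1:].upper()


if __name__ == "__main__":
    print(alignment_format("sample1.bam"))
```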

Outputs

Name Type Description
folder BinaryFolder Folder to which the FASTQ files are written when 'perReadgroup' is set to true.
reads FASTQ Output FASTQ file containing single-end reads, or the first read of each pair in paired-end data.
mate FASTQ Output FASTQ file containing the second read of each pair in paired-end data.

Parameters

Name Type Default Description
memory string "2g" A non-default value appends -XmxVALUE to the java command to specify the maximum size, in bytes, of the memory allocation pool. This value must be a multiple of 1024 and greater than 2 MB. Append the letter k or K to indicate kilobytes, m or M to indicate megabytes, or g or G to indicate gigabytes. For example, a value of "4g" allocates 4 gigabytes of memory and a value of "512m" allocates 512 megabytes. When no -Xmx value is passed, the JVM chooses its own default at runtime based on system configuration. Note: the Picard developers recommend a value of at least "2g", which is why it is the default for this parameter.
options string "" Any additional picard-tools options can be added here. This string is added to the run command as written. See the 'SamToFastq' part of the picard-tools manual for more information. Useful options include those for clipping and trimming.

Example: options="CLIPPING_ACTION=X READ1_TRIM=5 READ2_TRIM=5 RE_REVERSE=false"
perReadgroup boolean false Output a FASTQ file per read group (two FASTQ files per read group if the group is paired). All FASTQ files will be written to the 'folder' output.
picard string "/opt/share/picard/" Path to the Picard directory, e.g. "/opt/share/picard", which contains the picard-tools .jar files. If an empty string is given, the Picard version in the sequencing lib directory is used.
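
The rules for 'memory' and 'options' can be checked before a run with a small sketch like the one below. It is an illustration under stated assumptions, not the component's code. All function names are hypothetical, the unit suffixes are interpreted as powers of 1024, and 'options' is split on whitespace with Python's shlex module, whereas the component simply passes the string on as written.

```python
import re
import shlex
from typing import List

# Assumption: k, m and g are binary multiples, as for the JVM -Xmx option.
_UNITS = {"": 1, "k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def memory_to_bytes(value: str) -> int:
    """Convert a 'memory' value such as '512m' or '4g' to bytes."""
    match = re.fullmatch(r"(\d+)([kKmMgG]?)", value.strip())
    if match is None:
        raise ValueError(f"not a valid memory value: {value!r}")
    size = int(match.group(1)) * _UNITS[match.group(2).lower()]
    # Documented rule: a multiple of 1024 and greater than 2 MB.
    if size % 1024 != 0 or size <= 2 * 1024 ** 2:
        raise ValueError(f"memory must be a multiple of 1024 above 2 MB: {value!r}")
    return size


def jvm_memory_flag(value: str) -> str:
    """Validate the value and return the corresponding -Xmx flag."""
    memory_to_bytes(value)  # raises ValueError on invalid input
    return f"-Xmx{value.strip()}"


def split_options(options: str) -> List[str]:
    """Split an 'options' string into individual KEY=VALUE tokens."""
    return shlex.split(options)


if __name__ == "__main__":
    print(jvm_memory_flag("4g"))
    print(split_options("CLIPPING_ACTION=X READ1_TRIM=5 READ2_TRIM=5 RE_REVERSE=false"))
```

Splitting the options is only useful for inspecting or validating them. An empty 'options' string yields an empty list, so nothing is added to the command in that case.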

Test cases

Test case Parameters IN alignment OUT folder OUT reads OUT mate
case1 properties alignment (missing) reads (missing)

# Test conversion; input SAM,
picard=,
memory=512m

case2 properties alignment folder (missing) (missing)

# Test conversion; input SAM,
picard=,
memory=1g,
options=READ1_TRIM=10 READ2_TRIM=10,
perReadgroup=true

case3 properties alignment (missing) reads mate

# Convert pair-end alignment,
picard=,


Generated 2018-12-11 07:42:07 by Anduril 2.0.0