Uses FASTX-Toolkit to perform adapter trimming, artifact filtering, base-quality filtering, and read trimming for single-end read data. This function is executed within the larger QCFasta component when parameter tool=fastx.
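The four steps named above correspond to four FASTX-Toolkit programs: fastx_clipper, fastx_artifacts_filter, fastq_quality_filter and fastx_trimmer. The following is a minimal sketch of how the component's parameters might map onto those command lines. The step order, the use of fastx_trimmer -l to enforce Lmax, and the helper names build_fastx_commands and as_shell_pipeline are assumptions made for illustration, not taken from function.scala. The adapter="NA" case is not modeled.

```python
import shlex


def build_fastx_commands(adapter="ATCTCGTATGCCGTCTTCTGCTT", l_min=15, l_max=32,
                         m=6, min_q=30, min_percent=20, qual="-Q64", extra="-n"):
    """Return one argument list per FASTX-Toolkit step (assumed order)."""
    return [
        # -a adapter, -l discard reads shorter than l_min, -M minimum adapter match
        ["fastx_clipper", qual, "-a", adapter, "-l", str(l_min), "-M", str(m)]
        + shlex.split(extra),
        # remove sequencing artifacts
        ["fastx_artifacts_filter", qual],
        # keep reads where at least min_percent of bases have quality >= min_q
        ["fastq_quality_filter", qual, "-q", str(min_q), "-p", str(min_percent)],
        # assumption: Lmax is enforced by keeping only the first l_max bases
        ["fastx_trimmer", qual, "-l", str(l_max)],
    ]


def as_shell_pipeline(steps, reads="reads.fastq", out="trimmed.fastq"):
    """Join the steps with pipes; FASTX tools read stdin and write stdout by default."""
    joined = " | ".join(shlex.join(step) for step in steps)
    return f"{joined} < {shlex.quote(reads)} > {shlex.quote(out)}"


if __name__ == "__main__":
    print(as_shell_pipeline(build_fastx_commands()))
```

Building argument lists rather than one string keeps each step inspectable and avoids quoting problems if the adapter or extra parameters change.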

Version 1.0
Bundle sequencing
Categories Preprocessing smallRNA
Authors Katherine Icay (katherine.icay@helsinki.fi)
Requires FASTX-Toolkit
Source files component.xml function.scala
Usage Example with default values


Inputs

Name Type Mandatory Description
reads FASTQ Mandatory Input file in FASTQ format.


Outputs

Name Type Description
fastq FASTQ Trimmed, high-quality reads. Must be in fastq/fq format and should not be zipped.
stats CSV Count statistics of reads before and after processing.
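As an illustration of the kind of count statistics described for the stats output, the sketch below counts reads in an uncompressed FASTQ file and writes the counts to a CSV file. It assumes the common four-lines-per-record FASTQ layout. The column names "step" and "reads" and the function names are hypothetical; the actual layout of the component's stats file is not documented here.

```python
import csv


def count_fastq_reads(path):
    """Count records in an uncompressed FASTQ file, assuming 4 lines per record."""
    with open(path) as handle:
        n_lines = sum(1 for _ in handle)
    return n_lines // 4


def write_stats(rows, out_path):
    """Write (step, read_count) pairs to CSV; the column names are hypothetical."""
    with open(out_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["step", "reads"])
        writer.writerows(rows)
```

A typical use would be to call count_fastq_reads on the input file and on the final output, then pass both counts to write_stats.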


Parameters

Name Type Default Description
Lmax int 32 Maximum acceptable sequence length.
Lmin int 15 Minimum acceptable sequence length.
M int 6 Minimum length of a partial adapter match required for the adapter to be removed.
adapter string "ATCTCGTATGCCGTCTTCTGCTT" Adapter sequence to remove. The default is the Illumina smallRNA-seq adapter. A value of "NA" disables trimming; only total read lengths per sample are calculated.
extra string "-n" Extra parameters for fastx_clipper, e.g. "-n" keeps sequences containing N and "-c" discards non-clipped sequences.
minPercent int 20 Minimum percentage of bases that must have a quality score of at least minQ for a read to be kept.
minQ int 30 Minimum base quality score, used together with minPercent.
qual string "-Q64" Quality score encoding of the input sequences: -Q33 for Sanger/Illumina 1.8+ (Phred+33) scores, -Q64 for Illumina 1.3 to 1.7 (Phred+64) scores.
zip boolean false Whether the output sequences should be gzipped.
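The interaction of qual, minQ and minPercent can be sketched as follows. This is an illustrative reimplementation of the filtering rule, not the FASTX-Toolkit code itself: the function name passes_quality_filter is hypothetical, and the use of "greater than or equal" at the percentage threshold is an assumption. The offset argument is 64 for -Q64 and 33 for -Q33.

```python
def passes_quality_filter(quality_string, min_q=30, min_percent=20, offset=64):
    """Return True if enough bases in the read reach the minimum quality.

    quality_string is the fourth line of a FASTQ record; each character
    encodes one base quality as chr(score + offset).
    """
    scores = [ord(ch) - offset for ch in quality_string]
    if not scores:
        return False  # an empty read cannot satisfy the threshold
    good = sum(1 for score in scores if score >= min_q)
    # assumption: a read exactly at the threshold is kept
    return 100.0 * good / len(scores) >= min_percent
```

With the default offset of 64, the character "h" decodes to a score of 40 and "B" to a score of 2, so a ten-base read needs at least two "h"-quality bases to pass the default 20 percent threshold.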

Test cases

Test case Parameters IN
case1 (missing) reads (missing) (missing)

