SPINLONG

Identify user-defined spatio-temporal patterns in ChIP-seq and other sequencing data using the SPINLONG method. SPINLONG stands for Spatial Pattern Identification using Non-Linear OptimizatioN with Global constraints. Its applications include analysis of time-series RNA polymerase and histone modification data.

Version 1.0
Bundle sequencing
Categories Short-read Sequencing
Authors Kristian Ovaska (kristian.ovaska@helsinki.fi)
Requires commons-primitives-1.0.jar (jar) ; commons-math-2.1.jar (jar) ; csbl-javatools.jar (jar) ; jahmm-0.6.1.jar (jar) ; jcommon-1.0.16.jar (jar) ; jfreechart-1.0.13.jar (jar) ; saxon9he.jar (jar)
Source files component.xml
Usage Example with default values

Inputs

Name Type Mandatory Description
bedFiles BinaryFolder Mandatory Short reads in BED format. The files may optionally be gzipped.
patterns XML Mandatory XML file defining the patterns.
regions CSV Mandatory Genomic regions.
chromosomes CSV Optional Metadata for chromosomes. Must contain the columns Chromosome and Size, where Size gives the length of the chromosome. If not present, the parameter chromosomePreset is used.
mappability BinaryFile Optional If given, a BED file listing the genomic regions that are uniquely mappable. It is used to scale short read densities.
control BinaryFile Optional If given, contains a control track in BED format that is subtracted from primary tracks (bedFiles input).
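
The chromosomes input described above only needs the columns Chromosome and Size. A minimal sketch of writing and checking such a table follows. The helper names, the tab delimiter and the toy chromosome names and sizes are assumptions made for illustration; they do not come from the component itself.

```python
import csv
import io

# Column names required by the chromosomes input (from the table above).
REQUIRED_COLUMNS = ("Chromosome", "Size")


def write_chromosomes(rows, handle, delimiter="\t"):
    """Write (name, size) pairs as a table with Chromosome and Size columns.

    The tab delimiter is an assumption; adjust it to the CSV dialect in use.
    """
    writer = csv.writer(handle, delimiter=delimiter, lineterminator="\n")
    writer.writerow(REQUIRED_COLUMNS)
    for name, size in rows:
        if int(size) <= 0:
            raise ValueError(f"Size must be positive for {name!r}")
        writer.writerow([name, int(size)])


def read_chromosomes(handle, delimiter="\t"):
    """Read the table back and fail early if a required column is absent."""
    reader = csv.DictReader(handle, delimiter=delimiter)
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"Missing required columns: {missing}")
    return {row["Chromosome"]: int(row["Size"]) for row in reader}


if __name__ == "__main__":
    # Toy values only; these are not real chromosome lengths.
    buffer = io.StringIO()
    write_chromosomes([("chrA", 1000), ("chrB", 500)], buffer)
    print(buffer.getvalue(), end="")
    print(read_chromosomes(io.StringIO(buffer.getvalue())))
```

Checking the header before the analysis starts gives a clear error for a mislabelled column. When no such file is supplied, the chromosomePreset parameter applies instead.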

Outputs

Name Type Description
scores CSV Segment scores.
plots ImageList Segment plots and optionally optimizer plots. If both plotSegments and plotOptimizer are false, this output is empty.
patternsDump HTML Patterns formatted as HTML.

Parameters

Name Type Default Description
chromosomePattern string "" Java regular expression that selects the chromosomes to be used in analysis. The empty value selects all chromosomes. The pattern is matched against the first column of BED files.
chromosomePreset string "hs37" If a custom chromosomes input is not given, this parameter is used to select a predefined chromosome set. Currently the legal values are hs36 (Homo sapiens, genome build 36) and hs37 (Homo sapiens, build 37).
fragmentSize int 200 Size of sequenced DNA fragments. Short reads are elongated so that their final length matches this number. Setting this to 0 disables elongation.
maxDuplicateReads int 0 The maximum number of duplicate short reads (same position and strand) that are used for each position; the rest are filtered out. This makes it possible to remove reads that may be technical artefacts. If 0 or negative, all duplicates are used.
minRegionLength int 0 Minimum length for input regions that are processed. If 0, all regions are processed. This makes it possible to filter out very short genes.
plotOptimizer boolean false If true, visualize the progression of the optimizer. If false, omit optimizer plotting.
plotSegments boolean true If true, visualize results of regions whose score is over the threshold. If false, omit plotting.
scoreThreshold float 0.1 Minimum score for including a pattern match in the score and plot outputs. Note that the score distribution depends on the scoring method.
seed int -1 Seed for random number generator. If negative, an automatically generated seed is used. Using a pre-defined seed ensures that results are deterministic.
threads int 4 Maximum number of threads to use.
yLog boolean false If true, the Y axis is plotted using logarithmic scale.
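
The documented behaviour of chromosomePattern, maxDuplicateReads and fragmentSize can be illustrated with a small sketch. This is not the component's implementation. The function names, the tuple layout (chrom, start, end, strand), the order of the three steps, the use of a full regular expression match, the duplicate key and the strand-aware direction of elongation are all assumptions. Python's `re` module stands in for Java regular expressions, which behave the same way for simple patterns such as `chr1`.

```python
import re
from collections import Counter


def select_chromosome(name, pattern):
    """Empty pattern selects every chromosome; otherwise require a full match.

    Full (not partial) matching is an assumption about the component.
    """
    return pattern == "" or re.fullmatch(pattern, name) is not None


def cap_duplicates(reads, max_duplicates):
    """Keep at most max_duplicates reads per identical (chrom, start, end, strand).

    A value of 0 or below keeps every read, as documented for maxDuplicateReads.
    """
    if max_duplicates <= 0:
        return list(reads)
    seen = Counter()
    kept = []
    for read in reads:
        seen[read] += 1
        if seen[read] <= max_duplicates:
            kept.append(read)
    return kept


def elongate(read, fragment_size):
    """Extend a read to fragment_size bases; 0 disables elongation.

    Assumptions: '+' reads grow towards higher coordinates, '-' reads towards
    lower ones, and reads already at least fragment_size long are left alone.
    Clipping at the chromosome end is not handled here.
    """
    chrom, start, end, strand = read
    if fragment_size == 0 or end - start >= fragment_size:
        return read
    if strand == "-":
        return (chrom, max(0, end - fragment_size), end, strand)
    return (chrom, start, start + fragment_size, strand)


def preprocess(reads, chromosome_pattern="", max_duplicate_reads=0, fragment_size=200):
    """Apply the three steps in one assumed order: select, de-duplicate, elongate."""
    selected = [r for r in reads if select_chromosome(r[0], chromosome_pattern)]
    capped = cap_duplicates(selected, max_duplicate_reads)
    return [elongate(r, fragment_size) for r in capped]


if __name__ == "__main__":
    reads = [("chr1", 100, 136, "+")] * 3 + [("chr1", 500, 536, "-"), ("chr10", 5, 41, "+")]
    for read in preprocess(reads, "chr1", 2, 200):
        print(read)
```

Under the full-match assumption, the pattern `chr1` does not select `chr10`; a component that matched substrings would behave differently, so it is worth anchoring patterns explicitly when that distinction matters.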

Test cases

Test case Parameters IN bedFiles IN patterns IN regions IN chromosomes IN mappability IN control OUT scores OUT plots OUT patternsDump
case1 scoreThreshold=0,threads=1,maxDuplicateReads=2,seed=100 bedFiles patterns regions (missing) (missing) (missing) scores (missing) (missing)
case2_chr scoreThreshold=0,threads=1,chromosomePattern=chr1,yLog=true bedFiles patterns regions (missing) (missing) (missing) scores (missing) (missing)
case3_xslt scoreThreshold=0.325,plotOptimizer=true bedFiles patterns regions (missing) (missing) (missing) scores (missing) (missing)
case4_xslt_simple scoreThreshold=0,plotSegments=false bedFiles patterns regions (missing) (missing) (missing) scores (missing) (missing)


Generated 2018-12-17 07:42:24 by Anduril 2.0.0