Close assembly gaps using long-reads at high accuracy.
Keywords: bioinformatics, close-assembly-gaps, cluster, daligner, damapper, docker, dub, gap-filling, genome-assembly, long-reads, pacbio, singularity, snakemake
The following list contains all options across all commands of DENTIST. Default values are given in parentheses after the option name and the associated commands are stated in parentheses after the colon.
Example:
--fasta-line-width, -w <ulong>(50): (output)
                              └──┘  └──────┘
                     default value  command
List of options:
--agp <string>
: (output)
write AGP v2.1 file that describes the output assembly
--allow-single-reads
: (process-pile-ups)
allow using single reads instead of consensus sequence for gap closing
--auxiliary-threads, --aux-threads, -A <num-threads>(floor(totalCpus / <threads>))
: (collect-pile-ups, process-pile-ups)
use <num-threads> threads for auxiliary tools like daligner, damapper and daccord
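As a worked example of the default stated above, the following sketch evaluates floor(totalCpus / <threads>) in Python. The function name and the use of os.cpu_count() as the source of totalCpus are assumptions made for illustration; how DENTIST itself determines the CPU count is not stated here.

```python
import os


def default_auxiliary_threads(threads, total_cpus=None):
    """Evaluate floor(totalCpus / threads), the documented default."""
    if total_cpus is None:
        # Assumption: use the CPU count reported by the operating system.
        total_cpus = os.cpu_count() or 1
    return total_cpus // threads


# On a 32-core machine with --threads 8, each job gets 4 auxiliary threads.
print(default_auxiliary_threads(threads=8, total_cpus=32))  # 4
```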
--bad-fraction <frac>(0.8)
: (process-pile-ups)
Intrinsic QVs are categorized as “bad” if they are greater than or equal to the best QV of the worst <frac> trace point intervals.
--batch, -b <idx-spec>[,<idx-spec>...]
: (process-pile-ups)
process only a subset of the pile ups. The argument is a comma-separated list of <idx-spec>. Each <idx-spec> is either a single integer <idx> or a range <from>..<to>. <idx>, <from> and <to> are zero-based indices into the pile up DB. The range is right-open, i.e. index <to> is excluded. <to> may be a dollar sign ($) to indicate the end of the pile up DB.
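The index specification described above can be illustrated with a small Python sketch. The function parse_batch and its db_size parameter are hypothetical and not part of DENTIST; sorting, de-duplication and the rejection of empty or out-of-range specifications are assumptions of this sketch.

```python
def parse_batch(spec, db_size):
    """Expand a --batch style argument into a sorted list of pile up indices."""
    indices = set()
    for part in spec.split(","):
        if ".." in part:
            start_text, end_text = part.split("..", 1)
            start = int(start_text)
            # "$" stands for the end of the pile up DB.
            end = db_size if end_text == "$" else int(end_text)
        else:
            # A single index selects exactly one pile up.
            start = int(part)
            end = start + 1
        # Ranges are right-open: index `end` itself is excluded.
        if not 0 <= start < end <= db_size:
            raise ValueError(f"invalid index specification: {part!r}")
        indices.update(range(start, end))
    return sorted(indices)


print(parse_batch("0,3..6,8..$", db_size=10))  # [0, 3, 4, 5, 8, 9]
```

Splitting a pile up DB into such ranges is one way to distribute process-pile-ups jobs across a cluster.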
--bed <string>(standard input)
: (bed2mask)
input BED file; fields must be TAB-delimited
--best-pile-up-margin <double>(3.0)
: (collect-pile-ups)
given a set of conflicting gap closing candidates, if the largest has <double> times more reads than the second largest it is considered unique. If a candidate would close a gap in the reference assembly (marked by n characters), its number of reads is multiplied by --existing-gap-bonus.
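The following Python sketch illustrates how the margin and the bonus for existing gaps could interact during conflict resolution. It is not DENTIST's implementation: the function name, the tuple layout of the candidates and the use of ">=" for the margin comparison are assumptions.

```python
def pick_unique_candidate(candidates, best_pile_up_margin=3.0, existing_gap_bonus=6.0):
    """Return the name of the unique candidate, or None if the conflict stays unresolved.

    candidates: list of (name, num_reads, closes_existing_gap) tuples (hypothetical layout).
    """
    scored = sorted(
        (
            (num_reads * (existing_gap_bonus if closes_gap else 1.0), name)
            for name, num_reads, closes_gap in candidates
        ),
        reverse=True,
    )
    if not scored:
        return None
    if len(scored) == 1:
        return scored[0][1]
    (best_score, best_name), (second_score, _) = scored[0], scored[1]
    # Assumption: "<double> times more reads" is read as best >= margin * second.
    if best_score >= best_pile_up_margin * second_score:
        return best_name
    return None


conflict = [("gap-1-2", 4, True), ("join-1-7", 7, False)]
print(pick_unique_candidate(conflict))  # gap-1-2
print(pick_unique_candidate(conflict, existing_gap_bonus=1.0))  # None
```

With the bonus, the candidate closing the existing gap scores 24 against 7 and wins; without it, 7 against 4 does not reach the margin of 3.0.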
--cache-contig-alignments <string>( )
: (validate-config)
if given, the contig location will be cached as JSON, faking the effect of the same option in check-results. NOTE: the result has to be amended manually to be fully valid.
--closed-gaps-bed <string>
: (output)
write BED file with coordinates of closed gaps
--config <config-json>
: (all except validate-config)
provide configuration values in a YAML or JSON file. See README.md for usage and examples.
--daccord <daccord-option>[,<daccord-option>...]
: (process-pile-ups)
Provide additional options to daccord
--daligner-consensus <daligner-option>[,<daligner-option>...]
: (process-pile-ups)
Provide additional options to daligner
--daligner-reads-vs-reads <daligner-option>[,<daligner-option>...]
: (process-pile-ups)
Provide additional options to daligner
--daligner-self <daligner-option>...
: (generate-dazzler-options, process-pile-ups)
Provide additional options to daligner
--damapper-ref-vs-reads <damapper-option>...
: (generate-dazzler-options, collect-pile-ups)
Provide additional options to damapper
--data-comments
: (bed2mask)
parse BED comments (column 4) as generated by the output command. This will cause a crash if formatting errors are encountered.
--datander-ref <datander-option>[,<datander-option>...]
: (generate-dazzler-options, process-pile-ups)
Provide additional options to datander
--debug-pile-ups <db-stem>
: (collect-pile-ups)
write pile ups of intermediate steps to <db-stem>.<state>.db
--debug-repeat-masks
: (mask-repetitive-regions)
(only for reads-mask) write mask components into additional masks <repeat-mask>-<component-type>
--dust-reads <dust-option>[,<dust-option>...]
: (process-pile-ups)
Provide additional options to dust
--dust-ref <dust-option>[,<dust-option>...]
: (generate-dazzler-options)
Provide additional options to dust
--existing-gap-bonus <double>(6.0)
: (collect-pile-ups)
if a candidate would close an existing gap, its size is multiplied by <double> before conflict resolution (see --best-pile-up-margin).
--fasta-line-width, -w <ulong>(50)
: (output)
line width for output FASTA
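To make the effect of the line width concrete, here is a minimal Python sketch that wraps a sequence at a fixed number of characters per line. The function name and the record name scaffold-1 are made up for illustration; this is not the code DENTIST uses to write its output.

```python
def format_fasta_record(header, sequence, line_width=50):
    """Format one FASTA record with sequence lines of at most line_width characters."""
    lines = [f">{header}"]
    lines.extend(
        sequence[i:i + line_width] for i in range(0, len(sequence), line_width)
    )
    return "\n".join(lines) + "\n"


# A 120 bp sequence is written as lines of 50, 50 and 20 characters.
record = format_fasta_record("scaffold-1", "acgt" * 30)
print(record, end="")
```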
--help, -h
: (all)
Prints this help.
--join-policy <JoinPolicy>(scaffoldGaps)
: (output)
allow only joins (gap filling) in the given mode: scaffoldGaps (only join gaps inside of scaffolds, marked by n characters in FASTA), scaffolds (join gaps inside of scaffolds and try to join scaffolds), contigs (break input into contigs and re-scaffold everything; maintains scaffold gaps where new scaffolds are consistent)
--json, -j
: (show-mask, show-pile-ups, show-insertions, translate-coords)
if given write the information in JSON format
--keep-temp, -k
: (collect-pile-ups, process-pile-ups)
keep the temporary files; outputs the exact location
--mask, -m <name>[,<name>...]
: (propagate-mask, collect-pile-ups, process-pile-ups)
Dazzler masks for repetitive regions (at least one required; generate with the mask-repetitive-regions command)
--max-alignment-error, -e <double>(0.30)
: (generate-dazzler-options, collect-pile-ups, process-pile-ups)
local alignments may have an error rate of no more than <double>
--max-chain-gap <bps>(10000)
: (chain-local-alignments, process-pile-ups)
two local alignments may only be chained if at most <bps> of sequence in the A-read and B-read are unaligned.
--max-coverage-reads <uint>
: (mask-repetitive-regions)
this is used to derive a repeat mask from the ref vs. reads alignment; if the alignment coverage is larger than <uint> it will be considered repetitive; a default value is derived from --read-coverage; both options are mutually exclusive
--max-coverage-self <uint>(4)
: (mask-repetitive-regions)
this is used to derive a repeat mask from the self alignment; if the alignment coverage is larger than <uint> it will be considered repetitive
--max-improper-coverage-reads <uint>
: (mask-repetitive-regions)
this is used to derive a repeat mask from the ref vs. reads alignment; if the coverage of improper alignments is larger than <uint> it will be considered repetitive; a default value is derived from --read-coverage; both options are mutually exclusive
--max-indel <bps>(1000)
: (chain-local-alignments, process-pile-ups)
two local alignments may only be chained if the resulting insertion or deletion is at most <bps>
--max-insertion-error <double>(0.10)
: (output)
insertion and existing contigs must match with less error than <double>
--max-relative-overlap <fraction>(0.30)
: (chain-local-alignments, process-pile-ups)
two local alignments may only be chained if the overlap between them is at most <fraction> times the size of the shorter local alignment. This must hold for the reference and query.
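The overlap criterion can be sketched in Python as follows. The representation of a local alignment as a dict of half-open (begin, end) intervals and the per-axis measurement of the "shorter local alignment" are assumptions of this sketch, not a description of DENTIST's internals.

```python
def overlap_allows_chaining(a, b, max_relative_overlap=0.30):
    """Check the relative overlap of two local alignments on reference and query.

    a, b: dicts with keys "ref" and "query", each a half-open (begin, end) interval.
    """
    for axis in ("ref", "query"):
        (a_begin, a_end), (b_begin, b_end) = a[axis], b[axis]
        overlap = max(0, min(a_end, b_end) - max(a_begin, b_begin))
        # Assumption: the size of an alignment is measured on the same axis.
        shorter = min(a_end - a_begin, b_end - b_begin)
        if overlap > max_relative_overlap * shorter:
            return False
    return True


first = {"ref": (0, 1000), "query": (0, 1000)}
second = {"ref": (800, 2000), "query": (790, 2000)}
third = {"ref": (600, 2000), "query": (790, 2000)}

print(overlap_allows_chaining(first, second))  # True
print(overlap_allows_chaining(first, third))   # False
```

In the second call the reference overlap of 400 bp exceeds 0.30 times the shorter alignment (1000 bp), so the pair is rejected even though the query overlap is acceptable.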
--min-anchor-length <uint>(500)
: (generate-dazzler-options, collect-pile-ups, process-pile-ups)
alignments need to have at least this length of unique anchoring sequence
--min-coverage-reads <num>
: (validate-regions)
validly closed gaps must have a continuous coverage of at least <num> properly aligned reads; see --weak-coverage-mask for more details
--min-extension-length <ulong>(100)
: (output)
extensions must have at least <ulong> bps of consensus to be inserted
--min-gap-size <uint>(0)
: (filter-mask)
minimum size for gaps between mask intervals
--min-interval-size <uint>(0)
: (filter-mask)
minimum size for mask intervals
--min-reads-per-pile-up <ulong>(3)
: (process-pile-ups)
pile ups must have at least <ulong> reads to be processed
--min-relative-score <fraction>(1.0)
: (chain-local-alignments, process-pile-ups)
output chains with a score of at least <fraction> of the best chain's score. A value of 1.0 means that only chains with the best score will be accepted; a value of 0.0 means that all chains will be accepted
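A minimal Python sketch of the relative score filter is shown below. The function name and the plain list of scores are illustrative assumptions, and scores are assumed to be non-negative.

```python
def select_chains(scores, min_relative_score=1.0):
    """Keep the scores that reach min_relative_score times the best score."""
    if not scores:
        return []
    best = max(scores)
    return [score for score in scores if score >= min_relative_score * best]


scores = [900, 1000, 1000, 400]
print(select_chains(scores, 1.0))  # [1000, 1000]
print(select_chains(scores, 0.5))  # [900, 1000, 1000]
print(select_chains(scores, 0.0))  # [900, 1000, 1000, 400]
```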
--min-score <int>(trace point spacing of alignment)
: (chain-local-alignments, process-pile-ups)
output chains with a score of at least <int>
--min-spanning-reads, -s <ulong>(3)
: (collect-pile-ups, validate-regions)
require at least <ulong> spanning reads to close a gap
--no-highlight-insertions, -H
: (output)
turn off highlighting (upper case) of inserted sequences in the FASTA output
--no-merge-extension
: (collect-pile-ups)
Do not merge extension reads into spanning pile ups.
--only <OnlyFlag>(spanning)
: (process-pile-ups, output)
only process/output insertions of the given type. Note, extending insertions are experimental and may produce invalid results.
--ploidy, -N <uint>
: (validate-regions)
this is used to derive a lower bound for the read coverage
--progress
: (chain-local-alignments)
Print regular status reports on the progress.
--progress-every <msecs>(500)
: (chain-local-alignments)
Print status reports every <msecs>.
--progress-format <format>(human)
: (chain-local-alignments)
Use <format> for status report lines where <format> is either human or json. The former prints a status line that updates regularly while the latter prints a full JSON record per line with every update
--proper-alignment-allowance <num>(trace point spacing of alignment)
: (mask-repetitive-regions, collect-pile-ups, process-pile-ups, validate-regions)
An alignment is called proper if it is end-to-end with at most <num> bp allowance.
--quiet, -q
: (all)
reduce output as much as possible, reporting only fatal errors. If given, this option overrides --verbose.
--read-coverage, -C <double>
: (mask-repetitive-regions, validate-regions)
This is used to provide good default values for --max-coverage-reads (mask-repetitive-regions) or --min-coverage-reads (validate-regions); --read-coverage and --*-coverage-reads are mutually exclusive. Ideally, the user provides the haploid read coverage which, for example, may be inferred using a histogram of the alignment coverage across the genome. Alternatively, the average raw read coverage can be used, which is the number of base pairs in the reads divided by the number of base pairs in the assembly.
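The average raw read coverage mentioned above is a simple ratio, sketched here in Python on in-memory FASTA lines. The helper names are hypothetical, and counting every sequence character of the assembly (including n characters in gaps) is an assumption of this sketch.

```python
def total_bases(fasta_lines):
    """Sum the lengths of all sequence lines, skipping FASTA header lines."""
    return sum(len(line.strip()) for line in fasta_lines if not line.startswith(">"))


def average_raw_read_coverage(reads_fasta_lines, assembly_fasta_lines):
    """Number of base pairs in the reads divided by base pairs in the assembly."""
    return total_bases(reads_fasta_lines) / total_bases(assembly_fasta_lines)


assembly = [">scaffold-1", "acgtacgtac", ">scaffold-2", "acgta"]
reads = [">read-1", "acgtacgtacacgta", ">read-2", "acgtacgtacacgta", ">read-3", "acgtacgtacacgta"]

print(average_raw_read_coverage(reads, assembly))  # 3.0
```

The resulting value would be passed as --read-coverage; for real data the lines would be read from the FASTA files instead of Python lists.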
--region-context <bps>(1000)
: (validate-regions)
consider <bps> base pairs of context for each region to detect splicing errors
--report-all
: (validate-regions)
report all validation results instead of only failed gaps
--revert <option>[,<option>...]
: (all)
revert named option to default value. This is useful to revert specific options of a config file.
--scaffolding <insertions-db>
: (output)
write the assembly scaffold to <insertions-db>; use show-insertions to inspect the result
--skip-gaps <gap-spec>[,<gap-spec>...]
: (output)
Do not close the specified gaps. Each <gap-spec> is a pair of contig IDs <contigA>-<contigB>, meaning that the gap between the specified contigs should not be closed. They will still be joined by a pre-existing gap.
--skip-gaps-file <file>
: (output)
Same as --skip-gaps but <file> contains one <gap-spec> per line. If both options are given the union of all <gap-spec>s will be used. Empty lines and lines starting with # will be ignored.
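The rules for combining both skip options can be sketched in Python as follows. The function names are hypothetical; treating contig IDs as integers and keeping each pair in the order given are assumptions of this sketch.

```python
def parse_gap_spec(spec):
    """Parse one <gap-spec>: two contig IDs joined by a hyphen."""
    contig_a, contig_b = spec.strip().split("-", 1)
    # Assumption: contig IDs are integers.
    return (int(contig_a), int(contig_b))


def collect_skip_gaps(skip_gaps_arg, file_lines):
    """Return the union of gap specs from the option value and the file lines."""
    specs = set()
    if skip_gaps_arg:
        specs.update(parse_gap_spec(part) for part in skip_gaps_arg.split(","))
    for raw_line in file_lines:
        line = raw_line.rstrip("\n")
        # Empty lines and lines starting with "#" are ignored.
        if not line.strip() or line.startswith("#"):
            continue
        specs.add(parse_gap_spec(line))
    return specs


file_lines = ["# gaps checked by hand\n", "\n", "5-6\n", "9-10\n"]
print(sorted(collect_skip_gaps("1-2,5-6", file_lines)))  # [(1, 2), (5, 6), (9, 10)]
```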
--threads, -T <uint>(number of cores)
: (collect-pile-ups, process-pile-ups, validate-regions)
use <uint> threads
--tmpdir, -P <string>
: (collect-pile-ups, process-pile-ups)
use <string> as a working directory
--usage
: (all)
Print a short command summary.
--verbose, -v
: (all)
increase output to help identify problems; use up to three times. Warning: performance may be drastically reduced if using three times.
--weak-coverage-mask <mask>
: (validate-regions)
write a Dazzler mask <mask> of weakly covered regions, i.e. sliding windows of --weak-coverage-window base pairs that are spanned by fewer than --min-coverage-reads local alignments
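The notion of weak coverage can be illustrated with a naive Python sketch that slides a window across one contig and merges all weakly covered windows into intervals. The function name, the half-open (begin, end) representation of alignments and the step size of one base pair are assumptions; this quadratic loop is meant for illustration only and is not how DENTIST computes the mask.

```python
def weak_coverage_mask(alignments, contig_length, window, min_coverage):
    """Return merged half-open intervals of windows spanned by too few alignments.

    alignments: list of (begin, end) intervals on the contig (hypothetical layout).
    """
    mask = []
    for start in range(0, contig_length - window + 1):
        end = start + window
        # An alignment spans the window if it covers it completely.
        spanning = sum(1 for begin, stop in alignments if begin <= start and stop >= end)
        if spanning < min_coverage:
            if mask and start <= mask[-1][1]:
                # Extend the previous interval when windows touch or overlap.
                mask[-1] = (mask[-1][0], end)
            else:
                mask.append((start, end))
    return mask


alignments = [(0, 12), (0, 20), (10, 20)]
print(weak_coverage_mask(alignments, contig_length=20, window=5, min_coverage=2))  # [(8, 14)]
```

Only the windows starting at positions 8 and 9 are spanned by a single alignment, so the merged weak region is (8, 14).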
--weak-coverage-window <bps>(500)
: (validate-regions)
consider sliding window of <bps> base pairs to identify weak coverage
This file was generated by dentist --list-options at v3.0.0.