Sort GFF by CDT

sort-gff

Sorts a CDT file and its corresponding GFF file by the total score in the CDT file across the specified interval.

CDT File Statistics

CDT file statistics summarize the scores in the file: mean, median, and standard deviation, along with distribution and clustering metrics. They help characterize the genomic data and its variability.
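As a rough illustration of the summary measures named above, the sketch below computes the mean, median, and standard deviation over all score cells of a small CDT table. The layout it assumes (tab-delimited, one header row, two leading label columns followed by numeric bins) and the helper names `read_cdt_scores` and `summarize` are assumptions made for this example, not ScriptManager's API. Which values the tool itself summarizes is not specified here.

```python
import statistics

def read_cdt_scores(cdt_text):
    """Return every numeric score in tab-delimited CDT text as a flat list.

    Assumed layout (hypothetical for this sketch): one header line, then
    rows of the form  ID <tab> NAME <tab> score0 <tab> score1 ...
    """
    scores = []
    for line in cdt_text.strip().splitlines()[1:]:  # skip the header row
        fields = line.split("\t")
        scores.extend(float(v) for v in fields[2:])  # skip the two label columns
    return scores

def summarize(values):
    """Basic summary measures; sample standard deviation needs >= 2 values."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values) if len(values) > 1 else 0.0,
    }

example_cdt = (
    "YORF\tNAME\t0\t1\t2\n"
    "a\ta\t1\t2\t3\n"
    "b\tb\t0\t0\t2\n"
    "c\tc\t4\t4\t4\n"
)
summary = summarize(read_cdt_scores(example_cdt))
```

The same `summarize` helper could be applied to per-row totals instead of individual cells if row-level variability is of interest.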

Sorting Strategy

Depending on the strategy selected, the "Size of Expansion" (in bins) can mean different things.

  • Sort by Center: This strategy sorts genomic GFF intervals according to the scores in the CDT file in a window around the midpoint of each GFF interval. The "Size of Expansion" sets the width of that window in bins.
  • Sort by Index: This strategy sorts genomic GFF intervals according to the scores in the CDT file over a specified range of bin indices (start to stop) within each GFF interval.
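The two strategies can be sketched as follows. This is a minimal illustration, not ScriptManager's implementation. It assumes that CDT rows and GFF records are already in the same order, that the center window is placed symmetrically around the middle bin, and that rows are ranked from highest to lowest total score. The names `center_window` and `sort_by_window` are hypothetical.

```python
def center_window(n_bins, expansion):
    """Half-open [start, stop) window of `expansion` bins around the middle bin.

    Assumption: the window is centered on bin n_bins // 2 and clipped
    to the available columns.
    """
    mid = n_bins // 2
    start = max(0, mid - expansion // 2)
    stop = min(n_bins, start + expansion)
    return start, stop

def sort_by_window(cdt_rows, gff_records, start, stop):
    """Reorder both inputs by each row's total score in bins [start, stop).

    cdt_rows:    list of (row_id, [scores]) tuples
    gff_records: list of GFF lines, parallel to cdt_rows (assumed)
    Order is descending by total score (an assumption of this sketch).
    """
    order = sorted(
        range(len(cdt_rows)),
        key=lambda i: sum(cdt_rows[i][1][start:stop]),
        reverse=True,
    )
    return [cdt_rows[i] for i in order], [gff_records[i] for i in order]

rows = [("a", [1, 5, 1, 0]), ("b", [0, 9, 9, 0]), ("c", [7, 0, 0, 7])]
gff = ["gff-a", "gff-b", "gff-c"]

# Sort by Center with an expansion of 2 bins: window is bins [1, 3)
by_center = sort_by_window(rows, gff, *center_window(4, 2))

# Sort by Index over the half-open range [0, 1): only the first bin counts
by_index = sort_by_window(rows, gff, 0, 1)
```

Sorting a list of positions rather than the rows themselves keeps the CDT and GFF outputs aligned, since both are reordered with the same permutation.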

Command Line Interface

Usage:

java -jar ScriptManager.jar coordinate-manipulation sort-gff [-hV] [-c=<center>]
[-o=<outputBasename>] [-x=<index> <index>]... <gffFile> <cdtReference>

Positional Input

Input           Description
<gffFile>       the GFF file to sort
<cdtReference>  the reference CDT file to sort the input by

Output Options

Option                         Description
-o, --output=<outputBasename>  specify output file basename (no .cdt/.gff extension, script will add that)
-z, --gzip                     gzip output (default=false)

Sort Options

These options indicate which windows to sort the files by (choose one).

Option                       Description
-c, --center=<center>        sort by center, using the given size of expansion in bins (default=100)
-x, --index=<index> <index>  sort by index from the specified start to the specified stop (0-indexed, half-open interval)
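When the tool is driven from a script, the "choose one" rule for the sort options can be checked before the command is launched. The helper below is a hypothetical wrapper written for this example, not part of ScriptManager. It only assembles an argument list from the options documented above, mirroring the spelling in the usage string, and assumes the jar is named `ScriptManager.jar` and sits in the working directory.

```python
def build_sort_gff_command(gff_file, cdt_reference, output_basename=None,
                           center=None, index=None, gzip=False,
                           jar="ScriptManager.jar"):
    """Assemble the sort-gff argument list (hypothetical helper).

    center: size of expansion in bins for --center
    index:  (start, stop) tuple for --index, 0-indexed and half-open
    Only one of `center` and `index` may be given.
    """
    if center is not None and index is not None:
        raise ValueError("choose either center or index, not both")

    cmd = ["java", "-jar", jar, "coordinate-manipulation", "sort-gff"]
    if output_basename is not None:
        cmd.append(f"-o={output_basename}")
    if gzip:
        cmd.append("-z")
    if center is not None:
        cmd.append(f"-c={center}")
    if index is not None:
        start, stop = index
        if not 0 <= start < stop:
            raise ValueError("index must satisfy 0 <= start < stop")
        # Mirrors the usage string: -x=<index> <index>
        cmd += [f"-x={start}", str(stop)]
    cmd += [gff_file, cdt_reference]
    return cmd

center_cmd = build_sort_gff_command("sites.gff", "signal.cdt", center=100)
index_cmd = build_sort_gff_command("sites.gff", "signal.cdt",
                                   output_basename="sorted", index=(0, 200))
```

The file names `sites.gff` and `signal.cdt` are placeholders. The resulting list can be passed to `subprocess.run(cmd, check=True)` to execute the sort.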