
Aggregate Data


Compile data from tab-delimited files into a matrix according to a user-specified metric.

$$A \rightarrow fA = f * \begin{pmatrix} a_{1,1} & \cdots & a_{1,n} \\ \vdots & \ddots & \vdots \\ a_{m,1} & \cdots & a_{m,n} \end{pmatrix} = \begin{pmatrix} f([a_{1,1} \cdots a_{1,n}]) \\ \vdots \\ f([a_{m,1} \cdots a_{m,n}]) \end{pmatrix}$$

Aggregation Method Options

  • Sum
  • Average
  • Median
  • Mode
  • Min
  • Max
  • Positional Variance
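
To make the row-wise idea concrete, here is a minimal Python sketch that applies a chosen function f to each row of a small matrix and returns one value per row. It is only an illustration of the formula above, not ScriptManager's own code: the names METHODS and aggregate_rows are made up for this example, and the mode and positional variance methods are left out because this page does not define them precisely.

```python
import statistics

# Row-wise aggregation functions, keyed by the short method names used on this page.
# Mode and positional variance are intentionally omitted from this sketch.
METHODS = {
    "sum": sum,
    "avg": statistics.mean,
    "med": statistics.median,
    "min": min,
    "max": max,
}


def aggregate_rows(matrix, method="sum"):
    """Apply the chosen function f to every row and return one value per row."""
    f = METHODS[method]
    return [f(row) for row in matrix]


# A 2 x 3 example matrix (made-up values).
matrix = [
    [1.0, 2.0, 3.0],
    [4.0, 0.0, 2.0],
]
print(aggregate_rows(matrix, "sum"))  # [6.0, 6.0]
print(aggregate_rows(matrix, "max"))  # [3.0, 4.0]
```

Each input row collapses to a single number, so an m x n matrix becomes a column of m values.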

Command Line Interface


java -jar ScriptManager.jar read-analysis aggregate-data [--sum | --avg | --med |
--mod | --min | --max | --var] [-fhmV] [-l=<startCOL>] [-o=<output>]
[-r=<startROW>] [<inputFiles>...]

The AggregateData tool processes multiple matrix files into one matrix file.

Input Options

Since this tool processes multiple files together, there are two ways to provide input files:

(1) You can list them directly on the command line,

java -jar ScriptManager.jar read-analysis aggregate-data matFile1 matFile2 ... matFileX <OPTIONS>

(2) or you can write the paths to all your files in a single file and pass that file as the input using the -f flag:

java -jar ScriptManager.jar read-analysis aggregate-data inputFile -f <OPTIONS>

where inputFile lists the matrix file paths, one per line.


Note that absolute file paths are easier to work with. For relative paths, you'll have to check that they are built with respect to the ScriptManager directory.
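
If you want to build such a list file with absolute paths, a small helper along these lines might do it. This is a sketch, not part of ScriptManager: write_input_list and the file names are hypothetical, and relative names are resolved against the directory the script runs in, which may not be the ScriptManager directory.

```python
import os
import tempfile


def write_input_list(matrix_files, list_path):
    """Write one absolute path per line to list_path and return the lines written."""
    # os.path.abspath resolves relative names against the current working directory.
    lines = [os.path.abspath(p) for p in matrix_files]
    with open(list_path, "w") as handle:
        handle.write("\n".join(lines) + "\n")
    return lines


# Hypothetical file names, written to a temporary directory for illustration only.
demo_dir = tempfile.mkdtemp()
list_path = os.path.join(demo_dir, "inputFile.txt")
written = write_input_list(["matFile1.txt", "matFile2.txt"], list_path)
print(list_path)
print(written)
```

The resulting file can then be passed as the single input together with the -f flag.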

Output Options

-m, --merge              merge to one output file
-o, --output=<output>    Specify output file, default is "aggregate_matrix.txt" or the input filename if -f flag used
-z, --gzip               gzip output (default=false)

Use the -o flag to specify the output file. Otherwise, the output is written to aggregate_matrix.txt in the same directory as ScriptManager, or is named after the input filename if the -f flag is used.

Aggregation Method Options

--sum    use summation method (default)
--avg    use average method
--med    use median method
--mod    use mode method
--min    use minimum method
--max    use maximum method
--var    use positional variance method

Coord Start Options

-r, --start-row
-l, --start-col
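
As a rough illustration of what a start row and start column do, the Python sketch below skips leading header rows and label columns before reading the numeric cells. The function parse_matrix is hypothetical, and treating the offsets as 0-based is an assumption made for this example; check the tool's help output (-h) for the convention ScriptManager actually uses.

```python
def parse_matrix(text, start_row, start_col):
    """Return the numeric cells of tab-delimited text as a list of rows.

    Rows before start_row and columns before start_col are skipped.
    Both offsets are 0-based here, which is an assumption of this sketch.
    """
    rows = text.strip().splitlines()[start_row:]
    return [[float(cell) for cell in line.split("\t")[start_col:]] for line in rows]


# Made-up input: one header row and one label column precede the data.
text = "ID\tp1\tp2\ngeneA\t1\t2\ngeneB\t3\t4\n"
print(parse_matrix(text, 1, 1))  # [[1.0, 2.0], [3.0, 4.0]]
```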