Here is a basic example of analyzing genomic samples in single sample mode (static clustering) with the apt-dmet-genotype command line program:
apt-dmet-genotype \
--cdf-file=DMET_Plus.v1.cdf \
--chrX-probes=DMET_Plus.v1.chrXprobes \
--chrY-probes=DMET_Plus.v1.chrYprobes \
--special-snps=DMET_Plus.v1.specialSNPs \
--reference-input=DMET_Plus.v1.genomic.ref.a5 \
--cn-region-gt-probeset-file=DMET_Plus.v1.cn-gt.ps \
--probeset-ids=DMET_Plus.v1.genomic.gt.ps \
--region-model=DMET_Plus.v1.cn-region-models.txt \
--probeset-model=DMET_Plus.v1.cn-probeset-models.txt \
--cc-chp-output \
--sample-type=genomic \
--probeset-ids-reported=consent.txt \
--cel-files=cel_files.txt \
--out-dir=results
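The --cel-files input used above is a text file with the header line 'cel_files' followed by one CEL file per line. A minimal sketch for generating such a file is shown here. The function name make_cel_list and the directory name cel_dir are hypothetical and not part of the tool; the sketch assumes CEL files end in .CEL or .cel.

```shell
#!/bin/sh
# Write a list file for --cel-files: header 'cel_files', then one path per line.
# make_cel_list is a hypothetical helper, not something shipped with the tool.
make_cel_list() {
    # $1 = directory to scan, $2 = list file to write
    printf 'cel_files\n' > "$2"
    for f in "$1"/*.CEL "$1"/*.cel; do
        # An unmatched glob stays as literal text, so keep only real files.
        [ -e "$f" ] && printf '%s\n' "$f" >> "$2"
    done
    return 0
}

# Example (cel_dir is a placeholder directory name):
#   make_cel_list cel_dir cel_files.txt
```

The resulting file can then be passed as --cel-files=cel_files.txt.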
Single sample mode (static clustering) on plasmid controls requires the following changes:
--reference-input=DMET_Plus.v1.plasmid.ref.a5 \
--probeset-ids=DMET_Plus.v1.plasmid.gt.ps \
--sample-type=plasmid \
and one addition:
--run-cn-engine=false
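To make the substitutions concrete, the sketch below prints the full single sample command for either sample type without running anything. It is a dry run: build_static_cmd is a hypothetical helper, the file names are those of the genomic example above, and the options left unchanged for plasmid are assumed to carry over as they are.

```shell
#!/bin/sh
# Print the single sample (static clustering) command for a sample type.
# Dry run only: the command is printed, never executed.
build_static_cmd() {
    sample_type="$1"   # "genomic" or "plasmid"
    case "$sample_type" in
        genomic|plasmid) ;;
        *) echo "unsupported sample type: $sample_type" >&2; return 1 ;;
    esac
    set -- apt-dmet-genotype \
        --cdf-file=DMET_Plus.v1.cdf \
        --chrX-probes=DMET_Plus.v1.chrXprobes \
        --chrY-probes=DMET_Plus.v1.chrYprobes \
        --special-snps=DMET_Plus.v1.specialSNPs \
        "--reference-input=DMET_Plus.v1.${sample_type}.ref.a5" \
        --cn-region-gt-probeset-file=DMET_Plus.v1.cn-gt.ps \
        "--probeset-ids=DMET_Plus.v1.${sample_type}.gt.ps" \
        --region-model=DMET_Plus.v1.cn-region-models.txt \
        --probeset-model=DMET_Plus.v1.cn-probeset-models.txt \
        --cc-chp-output \
        "--sample-type=${sample_type}" \
        --probeset-ids-reported=consent.txt \
        --cel-files=cel_files.txt \
        --out-dir=results
    # Plasmid controls add one option that turns the CN engine off.
    if [ "$sample_type" = plasmid ]; then
        set -- "$@" --run-cn-engine=false
    fi
    printf '%s\n' "$*"
}

build_static_cmd plasmid
```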
Multi-sample mode (dynamic clustering) on genomic samples requires the following additions:
--reference-output=foo.a5 \
--batch-name=cluster-run-1
Here, batch-name is any unique name for the batch of CEL files clustered together.
apt-dmet-genotype - A program for analyzing DMET 3.0 CEL files.
usage:
apt-dmet-genotype ...
options:
Common Options (not used by all programs)
-h, --help Display program options and extra
documentation about possible analyses. See
-explain for information about a specific
operation. [default 'false']
-v, --verbose How verbose to be with status messages 0 -
quiet, 1 - usual messages, 2 - more
messages. [default '1']
--console-off Turn off the default messages to the
console but not logging or sockets.
[default 'false']
--use-socket Host and port to print messages over in
localhost:port format [default '']
--version Display version information. [default
'false']
-f, --force Disable various checks including chip
types. Consider using --chip-type option
rather than --force. [default 'false']
--throw-exception Throw an exception rather than calling
exit() on error. Useful for debugging. This
option is intended for command line use
only. If you are wrapping an Engine and
want exceptions thrown, then you should
call Err::setThrowStatus(true) to ensure
that all Err::errAbort() calls result in an
exception. [default 'false']
--analysis-files-path Search path for analysis library files.
Will override AFFX_ANALYSIS_FILES_PATH
environment variable. [default '']
--xml-file Input parameters in XML format (Will
override command line settings). [default
'']
--temp-dir Directory for temporary files when working
off disk. Using network mounted drives is
not advised. When not set, the output
folder will be used. The default is
typically the output directory or the
current working directory. [default '']
-o, --out-dir Directory for output files. Defaults to
current working directory. [default '.']
--log-file The name of the log file. Generally
defaults to the program name in the out-dir
folder. [default '']
Engine Options (Not used on command line)
--command-line The command line executed. [default '']
--exec-guid The GUID for the process. [default '']
--program-name The name of the program [default '']
--program-company The company providing the program [default
'']
--program-version The version of the program [default '']
--program-cvs-id The CVS version of the program [default '']
--version-to-report The version to report in the output files.
[default '']
--free-mem-at-start How much physical memory was available when
the engine run started. [default '0']
--meta-data-info Meta data in key=value pair that will be
output in headers. [default '']
Input Options
--cel-files Text file specifying cel files to process,
one per line with the first line being
'cel_files'. [default '']
-c, --cdf-file File defining probe sets. Use either
--cdf-file or --spf-file. [default '']
--spf-file File defining probe sets in spf (simple
probe format) which is like a text cdf
file. [default '']
--special-snps File containing all snps of unusual copy
(chrX,mito,Y) [default '']
--chrX-probes File containing probe_id (1-based) of
probes on chrX. Used for copy number probe
chrX/Y ratio gender calling. [default '']
--chrY-probes File containing probe_id (1-based) of
probes on chrY. Used for copy number probe
chrX/Y ratio gender calling. [default '']
--reference-input Reference file with cluster prior
information. [default '']
-s, --probeset-ids Tab delimited file with column
'probeset_id' specifying probesets to
analyze. [default '']
--probeset-ids-reported Tab delimited file with column
'probeset_id' specifying probesets to
report. This should be a subset of those
specified with --probeset-ids if that
option is used. [default '']
--chip-type Chip types to check library and CEL files
against. Can be specified multiple times.
The first one is propagated as the chip
type in the output files. Warning, use of
this option will override the usual check
between chip types found in the library
files and cel files. You should use this
option instead of --force when possible.
[default '']
--region-model Regions model parameters. [default '']
--probeset-model Probeset model parameters. [default '']
--cn-region-gt-probeset-file Tab delimited file mapping probeset ids to
copynumber regions. [default '']
Output Options
--cc-chp-output Output resulting calls in directory called
'chp' under out-dir. This makes one AGCC
Multi Data CHP file per cel file analyzed.
[default 'false']
--reference-output File to write reference values to.
Specifying this option will turn on dynamic
clustering. WARNING: Currently the
resulting reference file is not really
usable as a reference. See the manual for
more info. [default '']
--batch-name The name of the batch for the dynamic
cluster analysis. [default '']
Analysis Options
--set-analysis-name Explicitly set the analysis name. This
affects output file names (ie prefix) and
various meta info. [default 'dmet']
--ps-analysis Explicitly set the ProbesetSummarizeEngine
analysis string. [default '']
--gt-analysis Explicitly set the ProbesetGenotypeEngine
analysis string. [default '']
--gt-qmethod-spec Explicitly set the ProbesetGenotypeEngine
quant spec. [default '']
--sample-type Set the type of samples being processed. eg
genomic, plasmid. [default 'unknown']
--batch-info Indicates whether or not information about
other cel files in the batch should be
reported in CHP headers. [default 'false']
--null-context Indicates whether or not context info
should be populated in the CHP files.
[default 'true']
--run-cn-engine Indicates if the CN engine should be run or
not. [default 'true']
--pra-thresh The threshold for calling PRAs based on the
cluster mean strength. [default '3']
--geno-call-thresh The confidence threshold for reporting
calls in the CHP file. [default '0.1']
Gender Options
--female-thresh Threshold for calling females when using
cn-probe-chrXY-ratio method. [default
'0.17']
--male-thresh Threshold for calling males when using
cn-probe-chrXY-ratio method. [default
'0.68']
Advanced Options
--call-coder-max-alleles For encoding/decoding calls, the max number
of alleles per marker to allow. [default
'6']
--call-coder-type The data size used to encode the call.
[default 'UCHAR']
--call-coder-version The version of the encoder/decoder to use
[default '1.0']
Execution Control Options
--use-disk Store CEL intensities to be analyzed on
disk. [default 'true']
--disk-cache Size of intensity memory cache in millions
of intensities (when --use-disk=true).
[default '50']
Engine Options (Not used on command line)
--cels Cel files to process. [default '']
--report Probesets to report. eg consented. [default
'']
--time-start The time the engine run was started
[default '']
--time-end The time the engine run ended [default '']
--time-run-minutes The run time in minutes. [default '']
--analysis-guid The GUID for the analysis run. [default '']
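Among the input options, --probeset-ids-reported should name a subset of the probesets given with --probeset-ids. One possible pre-flight check is sketched below. missing_probesets is a hypothetical helper, and the sketch assumes both files are tab delimited with a header row containing 'probeset_id', and that lines starting with '#' are comments.

```shell
#!/bin/sh
# Print probeset ids that appear in the reported file but not in the analyzed file.
# Assumptions: tab-delimited input, a header row naming the 'probeset_id' column,
# and '#' lines treated as comments.
missing_probesets() {
    # $1 = file given to --probeset-ids, $2 = file given to --probeset-ids-reported
    awk -F '\t' '
        FNR == 1 { col = 0 }            # new file: look for its header again
        /^#/     { next }               # skip comment lines
        col == 0 {                      # first non-comment line is the header
            for (i = 1; i <= NF; i++) if ($i == "probeset_id") col = i
            next
        }
        FILENAME == ARGV[1] { analyzed[$col] = 1; next }
        !($col in analyzed) { print $col }
    ' "$1" "$2"
}

# Example, using the file names from the genomic command above:
#   missing_probesets DMET_Plus.v1.genomic.gt.ps consent.txt
```

Empty output means every reported probeset is also in the analyzed set.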
apt-dmet-genotype supports both static and dynamic clustering. With dynamic clustering, the genotype calling algorithm updates the cluster parameters (centers, variances, etc.) before making genotype calls. In static clustering mode, these priors are not updated and genotype calls are based on the priors as supplied.
Dynamic clustering is enabled by the --reference-output option. Without this option, static clustering is used.
When using --reference-output, you must also provide a batch name using the --batch-name option.
WARNING: The resulting reference file when using --reference-output is not currently suitable for use with the --reference-input option. This may change in a later release.
WARNING: You should always provide an input reference file, regardless of whether you are doing dynamic or static clustering.
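The rules above can be checked before launching a run. The sketch below is a pre-flight helper and not part of apt-dmet-genotype: check_clustering_args is a hypothetical name, only the --option=value form used in the examples is recognized, and treating a missing --reference-input as an error is a choice made here.

```shell
#!/bin/sh
# Report whether an argument list selects static or dynamic clustering,
# and reject combinations the text above warns against.
check_clustering_args() {
    has_ref_out=no; has_batch=no; has_ref_in=no
    for arg in "$@"; do
        case "$arg" in
            --reference-output=?*) has_ref_out=yes ;;
            --batch-name=?*)       has_batch=yes ;;
            --reference-input=?*)  has_ref_in=yes ;;
        esac
    done
    if [ "$has_ref_in" = no ]; then
        echo "error: --reference-input should be given in both modes" >&2
        return 1
    fi
    if [ "$has_ref_out" = yes ] && [ "$has_batch" = no ]; then
        echo "error: --reference-output requires --batch-name" >&2
        return 1
    fi
    # --reference-output is what switches dynamic clustering on.
    if [ "$has_ref_out" = yes ]; then echo dynamic; else echo static; fi
}

check_clustering_args --reference-input=DMET_Plus.v1.genomic.ref.a5 \
    --reference-output=foo.a5 --batch-name=cluster-run-1
```

The call at the end prints "dynamic". Dropping --reference-output and --batch-name prints "static".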
apt-dmet-genotype allows users to pass options directly into the sub-engines it calls. These sub-engines are, in the order their options are given: ProbesetSummarizeEngine, DmetCopyNumberEngine, ProbesetGenotypeEngine, and DmetCHPWriter.
One can specify these parameters on the command line by separating them from other parameters with "--". For example:
apt-dmet-genotype --cdf-file=.... ... \
-- [ProbesetSummarizeEngine options] -- [DmetCopyNumberEngine options] -- [ProbesetGenotypeEngine options] -- [DmetCHPWriter options]
See apt-engine-wrapper for more information about the options available for each engine. For example, add the following to the end of your apt-dmet-genotype command line:
-- --summaries=true --feat-effects -- --text-output -- --table-output --feat-effects --summaries
This results in additional text output from the first three analysis engines.
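Because the option groups are positional, it can help to assemble the trailing part of the command from one string per engine. The sketch below does only that string assembly. join_engine_opts is a hypothetical helper, and the group order follows the usage line shown earlier.

```shell
#!/bin/sh
# Join per-engine option groups with "--" separators.
# Each argument is one group, in the order ProbesetSummarizeEngine,
# DmetCopyNumberEngine, ProbesetGenotypeEngine, DmetCHPWriter.
join_engine_opts() {
    out=""
    for group in "$@"; do
        out="$out -- $group"
    done
    # Drop the leading space left by the first append.
    printf '%s\n' "${out# }"
}

join_engine_opts "--summaries=true --feat-effects" \
                 "--text-output" \
                 "--table-output --feat-effects --summaries"
```

The call reproduces the example string above, which can then be appended to the rest of the command line.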