apt-copynumber-workflow implements two distinct workflows: a batch workflow, which uses the set of input CEL files as its reference, and, conceptually, a single-sample workflow, which compares each CEL file to a pre-computed reference. For computational efficiency the single-sample workflow operates on a set of input CEL files at a time, but the output for any one CEL file is unaffected by the other CEL files.
On Unix systems, a basic command that uses the default parameters to do a batch run on GenomeWide SNP 6.0 data looks like this:
apt-copynumber-workflow \
--adapter-type-normalization true \
--reference-output results_dir/MySamplesReference.a5.ref \
--set-analysis-name MySamples \
--cdf-file GenomeWideSNP_6.cdf \
--chrX-probes GenomeWideSNP_6.chrXprobes \
--chrY-probes GenomeWideSNP_6.chrYprobes \
--special-snps GenomeWideSNP_6.specialSNPs \
--netaffx-snp-annotation-file GenomeWideSNP_6.na25.annot.csv \
--netaffx-cn-annotation-file GenomeWideSNP_6.cn.na25.annot.csv \
--delete-files true \
--o results_dir \
--text-output \
--cnchp-output false \
--cel-files *.CEL
The output consists of a report file with summary statistics for each chip analyzed, one text file per chip, and a reference file. The 'a5' in the reference file name is a convention used by APT for binary files saved in HDF5 format.
To output CNCHP files instead of text files, remove the argument --cnchp-output false. To suppress text file output, remove the argument --text-output.
WARNING: apt-copynumber-workflow will overwrite any existing output files it finds. If you wish to keep existing results make sure to specify a different output directory name.
NOTE: On Windows the DOS prompt does not support wildcard expansion. The preferred method is to supply a text file listing the paths to the CEL files via the '--cel-files' option (see below for details of the file format).
NOTE: Unlike Unix shells, the Windows DOS prompt does not allow a command to be continued with the '\' character. On Windows, omit the '\' characters in the examples shown here and enter everything on a single line.
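The list file passed to '--cel-files' can be generated rather than typed by hand. The following is a minimal sketch for a Unix-style shell, assuming the format given in the option help (a first line of 'cel_files', then one CEL path per line). The helper name write_cel_list and the output name celfiles.txt are illustrative choices, not part of APT.

```shell
# write_cel_list DIR OUT
# Hypothetical helper (not part of APT): writes a '--cel-files' list file.
write_cel_list() {
    cel_dir=$1
    list_file=$2

    # The first line must be the literal header 'cel_files'.
    printf 'cel_files\n' > "$list_file"

    # Append one CEL path per line. If nothing matches, the unexpanded
    # pattern is skipped and the file contains only the header.
    for cel in "$cel_dir"/*.CEL; do
        [ -e "$cel" ] || continue
        printf '%s\n' "$cel" >> "$list_file"
    done
}

# List the CEL files in the current directory.
write_cel_list . celfiles.txt
```

The resulting file can then be supplied as --cel-files celfiles.txt in place of the *.CEL wildcard.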
To run the single-sample workflow using an existing reference, replace
--reference-output
with
--reference-input ExistingReference.a5.ref
where "ExistingReference.a5.ref" is the filename of an existing reference.
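Putting that substitution into the batch command gives a complete single-sample invocation. The sketch below wraps it in a shell function so the assembled command can be inspected before running it. The function name run_single_sample and the DRY_RUN variable are assumptions made for this example and are not part of APT; all options are taken from the batch example above.

```shell
# run_single_sample REFERENCE OUTDIR CELLIST
# Hypothetical wrapper (not part of APT). With DRY_RUN=1 (the default here)
# the command is only printed; set DRY_RUN=0 to execute it.
run_single_sample() {
    ref=$1
    outdir=$2
    cels=$3

    if [ "${DRY_RUN:-1}" = "1" ]; then runner=echo; else runner=""; fi

    # Same options as the batch example, with --reference-output
    # replaced by --reference-input.
    $runner apt-copynumber-workflow \
        --adapter-type-normalization true \
        --reference-input "$ref" \
        --set-analysis-name MySamples \
        --cdf-file GenomeWideSNP_6.cdf \
        --chrX-probes GenomeWideSNP_6.chrXprobes \
        --chrY-probes GenomeWideSNP_6.chrYprobes \
        --special-snps GenomeWideSNP_6.specialSNPs \
        --netaffx-snp-annotation-file GenomeWideSNP_6.na25.annot.csv \
        --netaffx-cn-annotation-file GenomeWideSNP_6.cn.na25.annot.csv \
        --delete-files true \
        --o "$outdir" \
        --text-output \
        --cnchp-output false \
        --cel-files "$cels"
}

# Print the command that would be run.
run_single_sample ExistingReference.a5.ref results_dir celfiles.txt
```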
apt-copynumber-workflow - A program to compute copy number
results from DNA analysis arrays.
usage:
apt-copynumber-workflow.exe \
--adapter-type-normalization true \
--text-output false \
--reference-output CNReference.a5 \
--set-analysis-name TestReference \
--cdf-file GenomeWideSNP_6.cdf \
--chrX-probes GenomeWideSNP_6.chrXprobes \
--chrY-probes GenomeWideSNP_6.chrYprobes \
--special-snps GenomeWideSNP_6.specialsnps \
--netaffx-snp-annotation-file snp_annot_2.csv \
--netaffx-cn-annotation-file cn_annot_2.csv \
--o results --cel-files celfiles.txt
options:
Basic Info and Control Options
-h, --help This message. [default 'false']
--explain Explain a particular operation (i.e.
--explain cn-state or --explain loh).
[default '']
-v, --verbose How verbose to be with status messages 0 -
quiet, 1 - usual messages, 2 - more
messages. [default '1']
--version Output program version and quit. [default
'false']
Input Options
--xml-file Input parameters in XML format (Will
override command line settings). [default
'']
--reference-input Input reference file name. [default '']
--reference-output Output reference file name. [default '']
--cdf-file File defining probe sets. [default '']
-f, --force Don't check the chip types, just assume
they match. [default 'false']
--qcc-file File defining QC probesets. [default '']
--qca-file File defining QC analysis methods. [default
'']
--cel-files Text file specifying cel files to process,
one per line with the first line being
'cel_files'. [default '']
--special-snps File containing all snps of unusual copy
(chrX,mito,Y) [default '']
--chrX-probes File containing probe_id (1-based) of
probes on chrX. Used for copy number probe
chrX/Y ratio gender calling. [default '']
--chrY-probes File containing probe_id (1-based) of
probes on chrY. Used for copy number probe
chrX/Y ratio gender calling. [default '']
--target-sketch File specifying a target distribution to
use for quantile normalization. [default
'']
--use-feat-eff File defining a feature effect for each
probe. Note that precomputed effects should
only be used for an appropriately similar
analysis (i.e. feature effects for pm-only
may be different than for pm-mm). [default
'']
--read-models-brlmmp File to read precomputed BRLMM-P snp
specific models from. [default '']
Output Options
-o, --out-dir Directory to write result files into. Any
previous results in directory will be
overwritten. [default '.']
Analysis Options
-a, --analysis String representing analysis pathway
desired. [default '']
--med-polish Use median polish summarization method
instead of plier. [default 'false']
--adapter-type-normalization Adapter Type Normalization option. true =
perform adapter type normalization.
[default 'true']
--normalization-type Normalization option. 0 = none, 1 =
'quant-norm', 2 = 'med-norm.target=1000'
[default '1']
--adapter-parameters Parameters to use when running adapter type
normalization. [default '']
--brlmmp-parameters Parameters to use when running brlmmp.
[default '']
Execution Control Options
--mem-usage How many MB of memory to use for this run.
[default '0']
--block-size How many probesets to process at once,
useful when memory is limited. If set to 0
program attempts to guess available RAM and
set appropriately. [default '0']
--run-geno-qc Run the GenoQC engine. [default 'true']
--run-probeset-genotype Run the Probeset Genotype engine. (For
testing purposes only.) [default 'true']
--prior-size How many probesets to use for determining
prior. [default '10000']
--use-disk Use disk based representation to avoid
excessive RAM use. [default 'true']
--disk-dir Directory for temporary files when working
off disk. Using network mounted drives is
not advised. [default '']
--disk-cache Size of memory cache when working off disk
in megabytes. [default '100']
--arrs ARR files to process. Must be paired with
cels. [default '']
--cychps CYCHP files to output. Must be paired with
cels. [default '']
CNReferenceEngine Options
--probeset-ids Tab delimited file with column
'probeset_id' specifying probesets to
summarize. [default '']
--netaffx-snp-annotation-file NetAffx SNP Annotation file. [default '']
--netaffx-cn-annotation-file NetAffx CN Annotation file. [default '']
--xChromosome X Chromosome [default '24']
--yChromosome Y Chromosome [default '25']
CNLog2RatioEngine Options
--delete-files Delete extra output files after the run has
completed. [default 'false']
--log2-input Input Allele Summaries are in log2.
[default 'false']
--median-autosome-median-normalization Perform the median autosomal median
normalization step. [default 'true']
--yTarget Y Target [default '0.6748']
--allelic-difference-outlier-trim Allele Diff Outlier Trim [default '3']
--gc-correction Apply the GC correction to the Log2Ratios
and Allelic Differences. [default 'true']
--gc-content-override-file Input file used to override the GC content
read from the annotation files (Two columns
with header line, ProbeSetName/GCContent).
[default '']
--gc-correction-bin-count The number of bins to use when applying the
gc-correction. [default '25']
--geno-qc-file The file output from GenoQC. [default '']
--cyto2 Processing CYTO2 chip. [default 'false']
--CN2Gender-MAPD-threshold The MAPD cutoff threshold for CN2 gender
calling. [default '0.5']
--CN2Gender-male-ChrX-lower-threshold The male CN call lower threshold for
chromosome X CN2 gender calling. [default
'0.8']
--CN2Gender-male-ChrX-upper-threshold The male CN call upper threshold for
chromosome X CN2 gender calling. [default
'1.3']
--CN2Gender-male-ChrY-lower-threshold The male CN call lower threshold for
chromosome Y CN2 gender calling. [default
'0.8']
--CN2Gender-male-ChrY-upper-threshold The male CN call upper threshold for
chromosome Y CN2 gender calling. [default
'1.2']
--CN2Gender-female-ChrX-lower-threshold The female CN call lower threshold for
chromosome X CN2 gender calling. [default
'1.9']
--CN2Gender-female-ChrX-upper-threshold The female CN call upper threshold for
chromosome X CN2 gender calling. [default
'2.1']
--CN2Gender-female-ChrY-lower-threshold The female CN call lower threshold for
chromosome Y CN2 gender calling. [default
'0']
--CN2Gender-female-ChrY-upper-threshold The female CN call upper threshold for
chromosome Y CN2 gender calling. [default
'0.4']
--array-name Array name or type to use. [default '']
--set-analysis-name Analysis name to use as prefix for output
files. [default '']
--text-output Output data in ASCII text format in
addition to calvin format. [default
'false']
--cnchp-output Report CNCHP files [default 'true']
--cychp-output Report CYCHP files [default 'false']
Data transformations:
cn-state CopyNumber CNState
gaussian-smooth CopyNumber GaussianSmooth
loh CopyNumber LOH
cn-neutral-loh Copynumber CNNeutralLOH
normal-diploid Copynumber NormalDiploid
mosaicism Copynumber Mosaicism
no-call Copynumber NoCall
version: apt-1.10.1 $Id: apt-copynumber-workflow.cpp,v 1.67 2008/10/24 06:08:52 awilli Exp $
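The --explain option accepts the name of a data transformation, as the help text shows for cn-state and loh. As a sketch, and assuming every name in the "Data transformations" list is accepted in the same way, the loop below requests the explanation for each one. The function name explain_all and the DRY_RUN variable are illustrative and not part of APT.

```shell
# explain_all
# Hypothetical helper (not part of APT). With DRY_RUN=1 (the default here)
# each command is printed instead of executed; set DRY_RUN=0 to run them.
explain_all() {
    if [ "${DRY_RUN:-1}" = "1" ]; then runner=echo; else runner=""; fi

    # Names copied from the "Data transformations" list in the help output.
    for op in cn-state gaussian-smooth loh cn-neutral-loh \
              normal-diploid mosaicism no-call; do
        $runner apt-copynumber-workflow --explain "$op"
    done
}

explain_all
```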