That said, there are still situations where calling genotypes from a single sample is useful. Examples include cases where very few samples were processed, or where you want a quick genotype-based check of sample identity without going through a full clustering. When using the single-sample workflow, bear in mind that the accuracy of the calls may be lower, perhaps substantially, than what is attainable with the recommended multiple-sample clustering.
The single-sample workflow described here uses the BRLMM-P genotyping algorithm, which has two key enabling features: it lets the user generate a set of SNP-specific priors, and it has a special single-sample calling mode.
Single-sample calling relies on three previously generated inputs: a target probe intensity distribution, estimates of probe (feature) effects, and a set of SNP-specific priors. The target probe intensity distribution can be created with the --write-sketch option to apt-probeset-genotype. This option writes the derived quantile sketch to a file with the suffix ".normalization-target.txt" in the output directory. Estimates of feature effects can also be generated with apt-probeset-genotype, using the --feat-effects option. The resulting feature effects are written to a file with the suffix ".feature-response.txt" in the output directory.
To generate a set of SNP-specific priors, see the documentation on clustering without priors and building priors. Even if a set of priors is already available (for example, if you are using one of the catalog products), it may be useful to allow your own samples to influence the priors: start with the provided priors, do a clustering run with your own samples, then save the resulting posteriors to serve as priors for future runs. This can be accomplished with the --write-models option to apt-probeset-genotype.
An example of using the above three options in a run using the GenomeWideSNP_6 product is shown below.
apt-probeset-genotype \
    --cdf-file lib/GenomeWideSNP_6.cdf \
    --special-snps lib/GenomeWideSNP_6.specialSNPs \
    --read-models-brlmmp lib/GenomeWideSNP_6.brlmm-p.models \
    --chrX-probes lib/GenomeWideSNP_6.chrXprobes \
    --chrY-probes lib/GenomeWideSNP_6.chrYprobes \
    --set-gender-method cn-probe-chrXY-ratio \
    --analysis brlmm-p-plus \
    --out-dir single_sample_parameter_dir \
    --write-models \
    --write-sketch \
    --summaries \
    --feat-effects \
    --cel-files my_cel_file_list.txt
Note that the analysis string should match the one that was used to create the supplied models file. In the case of the GenomeWideSNP_6 product, the models file GenomeWideSNP_6.brlmm-p.models is supplied as part of the library file package found on the product page. It was created with the same analysis string that is aliased to "brlmm-p-plus".
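Before moving on to single-sample calling, it can help to confirm that the training run wrote all three parameter files. The following is a minimal sketch, not part of APT: the function name check_parameter_files is hypothetical, and the check assumes the files end in the suffixes used elsewhere on this page (".feature-response.txt", ".normalization-target.txt" and ".snp-posteriors.txt").

```shell
# Hypothetical helper (not an APT command): succeed only if the given
# directory holds at least one file for each of the three parameter types.
# The suffixes are taken from the file names used on this page.
check_parameter_files() {
    dir=$1
    missing=0
    for suffix in feature-response.txt normalization-target.txt snp-posteriors.txt; do
        found=no
        # An unmatched glob stays literal, so test each candidate explicitly.
        for f in "$dir"/*."$suffix"; do
            if [ -f "$f" ]; then
                found=yes
            fi
        done
        if [ "$found" = no ]; then
            echo "missing: $dir/*.$suffix" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

For example, "check_parameter_files single_sample_parameter_dir" returns a non-zero status and names each missing suffix on standard error if the run did not produce one of the files.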
Once the three parameter files have been written, single-sample calling can be run as follows.
apt-probeset-genotype \
    --cdf-file lib/GenomeWideSNP_6.cdf \
    --special-snps lib/GenomeWideSNP_6.specialSNPs \
    --chrX-probes lib/GenomeWideSNP_6.chrXprobes \
    --chrY-probes lib/GenomeWideSNP_6.chrYprobes \
    --set-gender-method cn-probe-chrXY-ratio \
    --analysis quant-norm.sketch=50000,pm-only,brlmm-p.CM=2.bins=100.mix=1.bic=2.HARD=3.SB=0.45.KX=1.KH=1.5.KXX=0.5.KAH=-0.6.KHB=-0.6.transform=MVA.AAM=2.0.BBM=-2.0.AAV=0.06.BBV=0.06.ABV=0.06.copyqc=0.000001.wobble=0.05.MS=0.05 \
    --out-dir my_output_dir \
    --use-feat-eff single_sample_parameter_dir/brlmm-p-plus.plier.feature-response.txt \
    --target-sketch single_sample_parameter_dir/quant-norm.normalization-target.txt \
    --read-models-brlmmp single_sample_parameter_dir/brlmm-p-plus.snp-posteriors.txt \
    --cel-files my_cel_file_list.txt
Note that the long analysis string is the same as the one aliased to "brlmm-p-plus" that was used in training the SNP-specific priors, with one critical difference: "CM=2". CM stands for Calling Method, and setting it to 2 enables the single-sample calling mode.
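The --cel-files argument names a text file listing the CEL files to process, so calling one sample at a time means writing a list that contains just that sample. A minimal sketch follows, assuming the list format is a header line "cel_files" followed by one path per line (check the apt-probeset-genotype help for your release). The function name write_cel_list and the file names in the usage note are hypothetical.

```shell
# Hypothetical helper (not an APT command): write a CEL list file.
# Assumed format: a "cel_files" header line, then one CEL path per line.
write_cel_list() {
    list_file=$1
    shift
    {
        printf 'cel_files\n'
        for cel in "$@"; do
            printf '%s\n' "$cel"
        done
    } > "$list_file"
}
```

For example, "write_cel_list one_sample_list.txt sample_01.CEL" writes a two-line list that could then be passed to --cel-files.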
As a test, you can analyze a collection of CEL files, save the feature effects, quantile sketch and SNP-specific priors, and then re-call all the samples in single-sample mode. Both modes of calling should produce the same genotype calls, apart from a small number of differences caused by rounding when the parameters are written out to text files.
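One way to carry out this comparison is to count matching calls between the two output tables. The sketch below is not part of APT. It assumes both calls files are tab-delimited, may begin with "#"-prefixed metadata lines, have a single column-name row, and then list one probeset per row with the probeset ID in the first column and the samples in the same column order in both files. The function name compare_calls is hypothetical.

```shell
# Hypothetical helper (not an APT command): compare two calls tables and
# report how many calls were compared, how many agree and how many differ.
compare_calls() {
    awk -F '\t' '
        FNR == 1 { seen_header = 0 }             # reset at the start of each file
        /^#/ { next }                            # skip metadata lines
        !seen_header { seen_header = 1; next }   # skip the column-name row
        NR == FNR {                              # first file: remember every call
            for (i = 2; i <= NF; i++) ref[$1, i] = $i
            next
        }
        {                                        # second file: compare call by call
            for (i = 2; i <= NF; i++) {
                if (!(($1, i) in ref)) continue  # probeset or column absent in first file
                total++
                if (ref[$1, i] == $i) same++
            }
        }
        END {
            printf "compared=%d concordant=%d discordant=%d\n", total, same, total - same
        }
    ' "$1" "$2"
}
```

For example, "compare_calls multi_sample_calls.txt single_sample_calls.txt" (both file names are placeholders) prints a single summary line. A discordant count that is small relative to the number of calls compared is consistent with the rounding effect described above.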
Affymetrix Power Tools (APT) Release apt-1.10.1