Most SNPs assayed to date with the WGSA approach have been autosomal, and sample gender has no impact on clustering such SNPs: they are always clustered on the assumption that the chromosome copy number is two. For sex chromosome and mitochondrial SNPs, however, this assumption does not hold, and such SNPs should be treated differently. When the expected chromosomal copy number is one, only two clusters are expected, and to accommodate this the BRLMM-P algorithm has a haploid mode that can be engaged to enforce appropriate treatment.
The Human Mapping 100K, 500K and Genome Wide SNP 5.0 products all include both autosomal and chrX SNPs but no chrY or mitochondrial SNPs. For these designs, apt-probeset-genotype identifies the chrX SNPs from a chrX file specified with the --chrX-snps option. See the section below on the chrX file for details on the file format, contents and usage.
The Genome Wide SNP 6.0 Array was the first WGSA chip to also include chrY and mitochondrial SNPs. To give apt-probeset-genotype the information on SNP types for this array, a new specialSNPs file was created; it is supplied with the --special-snps option. See the section below on the specialSNPs file for details on the file format, contents and usage.
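The choice between the two options can be sketched as a small shell helper. This is only an illustration: the function name snp_type_option and the product labels (100K, 500K, SNP5, SNP6) are hypothetical shorthand invented for this sketch and are not names that APT itself recognizes.

```shell
# Hypothetical helper: print the apt-probeset-genotype option that carries
# SNP-type information for a given product. The labels are illustrative only.
snp_type_option() {
    case "$1" in
        100K|500K|SNP5) echo "--chrX-snps" ;;     # autosomal and chrX SNPs only
        SNP6)           echo "--special-snps" ;;  # also chrY and mitochondrial SNPs
        *)              echo "unknown product: $1" >&2; return 1 ;;
    esac
}
```

For example, snp_type_option SNP6 prints --special-snps, and an unrecognized label returns a non-zero status.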
If neither the --chrX-snps nor the --special-snps option is supplied, apt-probeset-genotype exits with an error. This prevents accidentally running without this information, which would result in all SNPs being treated as autosomal. If, however, the user does want to treat all SNPs as autosomal and to ignore gender, the run can be forced to proceed with the "--no-gender-force" option.
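As a sketch of a forced, autosomal-only run, the wrapper below omits both SNP-type options and passes --no-gender-force instead. The function name genotype_ignoring_gender is hypothetical, and the library paths assume Genome Wide SNP 5.0 files installed under lib/; which other options are required may differ between releases.

```shell
# Hypothetical wrapper: genotype while treating every SNP as autosomal.
# Assumes Genome Wide SNP 5.0 library files live under lib/.
#   $1 = output directory
#   $2 = text file listing the CEL files
genotype_ignoring_gender() {
    apt-probeset-genotype \
        --cdf-file lib/GenomeWideSNP_5.cdf \
        --read-models-brlmmp lib/GenomeWideSNP_5.models \
        --analysis brlmm-p \
        --no-gender-force \
        --out-dir "$1" \
        --cel-files "$2"
}
```

It would be called as genotype_ignoring_gender out cel.txt. Because chrX SNPs are then clustered as if the copy number were two, calls for those SNPs in male samples should be interpreted with care.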
The chrX file uses the TSV format. It should be tab-delimited text and it should contain a column (which may be the only column) with the heading "all_chrx_no_par". This column should specify the SNPs that are in the non-pseudo-autosomal region of chromosome X. As an example, the .chrx file from the Genome Wide SNP 5.0 Array can be found in the library file installation available at the product support materials page. Here is an example usage based on the Genome Wide SNP 5.0 Array:
apt-probeset-genotype \
    --cdf-file lib/GenomeWideSNP_5.cdf \
    --chrX-snps lib/GenomeWideSNP_5.chrx \
    --read-models-brlmmp lib/GenomeWideSNP_5.models \
    --analysis brlmm-p \
    --out-dir out \
    --cel-files cel.txt
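Before starting a long run it can be useful to confirm that a chrX file carries the required column heading. The following is a minimal sketch, not part of APT: the function name has_chrx_column is hypothetical, and it assumes that any lines beginning with "#" are comments or metadata and that the first remaining line is the tab-delimited header row.

```shell
# Hypothetical check: succeed if the file's header row has a column
# named "all_chrx_no_par".
# Assumption: lines starting with '#' are comments or metadata.
has_chrx_column() {
    awk -F '\t' '
        /^#/ { next }                  # skip comment/metadata lines
        {
            for (i = 1; i <= NF; i++)
                if ($i == "all_chrx_no_par") found = 1
            exit                       # inspect only the header row
        }
        END { exit (found ? 0 : 1) }   # status 0 only when the heading exists
    ' "$1"
}
```

A typical use would be has_chrx_column lib/GenomeWideSNP_5.chrx || echo "missing all_chrx_no_par column" >&2.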
The specialSNPs file uses the TSV format. It should be tab-delimited text and it should contain the four columns explained below. There should be one row for each chromosome X (non-pseudo-autosomal), chromosome Y and mitochondrial SNP. The four required columns are:
Here is an example usage based on the Genome Wide SNP 6.0 Array:
apt-probeset-genotype \
    --cdf-file lib/GenomeWideSNP_6.cdf \
    --special-snps lib/GenomeWideSNP_6.specialSNPs \
    --read-models-birdseed lib/GenomeWideSNP_6.birdseed.models \
    --chrX-probes lib/GenomeWideSNP_6.chrXprobes \
    --chrY-probes lib/GenomeWideSNP_6.chrYprobes \
    --set-gender-method cn-probe-chrXY-ratio \
    --analysis birdseed \
    --out-dir out \
    --cel-files cel.txt
Affymetrix Power Tools (APT) Release 1.14.3