- Date: 2007-11-20
Most SNPs assayed to date with the WGSA approach have been autosomal, and sample gender has no impact on clustering such SNPs: they are always clustered assuming a chromosomal copy number of two. For sex chromosome and mitochondrial SNPs, however, this assumption does not hold, and such SNPs must be treated differently. When the expected chromosomal copy number is one, only two clusters are expected. To accommodate this, the BRLMM-P algorithm has a haploid mode that can be engaged to enforce the appropriate treatment.
- Chromosome X SNPs: Since males have only one copy of the X chromosome, male samples are clustered together with the haploid (2-cluster) model and female samples are clustered together with the diploid (3-cluster) model. This treatment applies only to the non-pseudo-autosomal region of chrX. The pseudo-autosomal region is treated just like the autosomes: all samples are clustered together with the diploid model, regardless of gender.
- Chromosome Y SNPs: Since males have only one Y chromosome and females have none, male samples are clustered together with the haploid model and female samples are just assigned a no-call, without even looking at any signal intensities.
- Mitochondrial SNPs: All samples are clustered together with the haploid model, regardless of gender.
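The rules above amount to a small lookup from SNP class and sample gender to a clustering model. The following is a minimal shell sketch of that lookup. The function name `cluster_model` and the class labels `autosomal`, `PAR`, `X`, `Y` and `MT` are chosen here for illustration; none of this is APT code.

```shell
# Hypothetical helper (not part of APT): prints the clustering model implied
# by the rules above for a given SNP class and sample gender.
cluster_model() {
  case "$1:$2" in
    autosomal:male|autosomal:female|PAR:male|PAR:female)
      echo diploid ;;   # all samples clustered together, gender ignored
    MT:male|MT:female)
      echo haploid ;;   # all samples clustered together, gender ignored
    X:male|Y:male)
      echo haploid ;;   # males carry one copy of chrX and one of chrY
    X:female)
      echo diploid ;;   # females carry two copies of chrX
    Y:female)
      echo no-call ;;   # females have no chrY; intensities are not examined
    *)
      echo "unsupported SNP class or gender: $1 $2" >&2
      return 1 ;;
  esac
}

cluster_model X male     # prints: haploid
cluster_model Y female   # prints: no-call
```

Cases the text does not address, such as a sample of unknown gender, are rejected here rather than guessed at.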
The Human Mapping 100K, 500K and Genome Wide SNP 5.0 products all include both autosomal and chrX SNPs but no chrY or mitochondrial SNPs. For these designs the chrX SNPs are identified to apt-probeset-genotype in a chrX file specified with the --chrX-snps option. See the section below on the chrX file for details on the file format, contents and usage.
The Genome Wide SNP 6.0 Array was the first WGSA chip to also include chrY and mitochondrial SNPs. To supply apt-probeset-genotype with the information on SNP types, a new specialSNPs file was created; it is supplied with the --special-snps option. See the section below on the specialSNPs file for details on the file format, contents and usage.
If neither the --chrX-snps nor the --special-snps option is supplied, apt-probeset-genotype exits with an error. This prevents accidentally running without this information, which would treat all SNPs as autosomal. If the user does want to treat all SNPs as autosomal and to ignore gender, the run can be forced to proceed with the "--no-gender-force" option.
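A wrapper script can mirror this check before launching a long run. The sketch below only inspects an argument list; the function name `check_snp_type_options` is hypothetical, and the logic reflects the behaviour described above rather than the actual APT source. It assumes each option is passed as a separate word, as in the examples on this page.

```shell
# Hypothetical pre-flight check (not APT code): succeeds only if the argument
# list names a chrX file, names a specialSNPs file, or explicitly forces the run.
check_snp_type_options() {
  for arg in "$@"; do
    case "$arg" in
      --chrX-snps|--special-snps|--no-gender-force) return 0 ;;
    esac
  done
  echo "error: supply --chrX-snps or --special-snps, or pass --no-gender-force" >&2
  return 1
}

# This argument list lacks all three options, so the check fails.
check_snp_type_options --cdf-file lib/GenomeWideSNP_5.cdf --analysis brlmm-p \
  || echo "refusing to start"
```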
The chrX file uses the TSV format. It should be tab-delimited text and it should contain a column (which may be the only column) with the heading "all_chrx_no_par". This column should specify the SNPs that are in the non-pseudo-autosomal region of chromosome X. As an example, the .chrx file from the Genome Wide SNP 5.0 Array can be found in the library file installation available at the product support materials page. Here is an example usage based on the Genome Wide SNP 5.0 Array:
apt-probeset-genotype \
    --cdf-file lib/GenomeWideSNP_5.cdf \
    --chrX-snps lib/GenomeWideSNP_5.chrx \
    --read-models-brlmmp lib/GenomeWideSNP_5.models \
    --analysis brlmm-p \
    --out-dir out \
    --cel-files cel.txt
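To make the chrX file layout concrete, the sketch below writes a minimal single-column file and checks that its heading row contains the required column. The SNP identifiers, the file name `example.chrx` and the helper `has_chrx_column` are invented for illustration. Skipping lines that begin with "#" is an assumption about comment lines in the file, not something stated here.

```shell
# Write a tiny chrX file. The SNP identifiers are invented placeholders.
{
  printf 'all_chrx_no_par\n'
  printf 'SNP_EX_0001\n'
  printf 'SNP_EX_0002\n'
} > example.chrx

# Hypothetical helper: succeeds if the heading row has an all_chrx_no_par column.
has_chrx_column() {
  awk -F'\t' '
    /^#/ { next }                      # assumption: "#" lines are file comments
    {
      for (i = 1; i <= NF; i++) if ($i == "all_chrx_no_par") found = 1
      exit                             # only the heading row is inspected
    }
    END { exit (found ? 0 : 1) }
  ' "$1"
}

has_chrx_column example.chrx && echo "header ok"   # prints: header ok
```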
The specialSNPs file uses the TSV format. It should be tab-delimited text and it should contain the four columns explained below. There should be one row for each chromosome X (non-pseudo-autosomal), chromosome Y and mitochondrial SNP. The four required columns are:
- probeset_id: The SNP identifier from the CDF file.
- chr: The classification of the SNP. Example values (from the Genome Wide SNP 6.0 Array) are:
  - MT: for mitochondrial SNPs
  - PAR: for SNPs on the pseudo-autosomal region of chromosome X. Such SNPs are treated just like autosomes and can in fact be omitted from the specialSNPs file. However, some users may still wish to list such SNPs in the specialSNPs file, finding it convenient to have all non-autosomal SNPs accounted for in one place.
  - X: for SNPs on the non-pseudo-autosomal region of chromosome X
  - Y: for SNPs on chromosome Y
- copy_male: The number of chromosomal copies expected in a male sample (1,2,1,1 for MT,PAR,X,Y respectively)
- copy_female: The number of chromosomal copies expected in a female sample (1,2,2,0 for MT,PAR,X,Y respectively)
An example of a specialSNPs file is the one for the Genome Wide SNP 6.0 Array, found in the library file installation available at the product support materials page.
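As an illustration of the layout, the sketch below writes a four-row specialSNPs file and checks the copy numbers in each row against the values listed above. The probeset identifiers, the file name `example.specialSNPs` and the helper `check_special_snps` are invented. The column order shown is also an assumption, since the helper locates columns by heading, and skipping lines that begin with "#" is likewise assumed.

```shell
# Write a tiny tab-delimited specialSNPs file (probeset IDs are placeholders).
{
  printf 'probeset_id\tchr\tcopy_male\tcopy_female\n'
  printf 'SNP_EX_1\tMT\t1\t1\n'
  printf 'SNP_EX_2\tPAR\t2\t2\n'
  printf 'SNP_EX_3\tX\t1\t2\n'
  printf 'SNP_EX_4\tY\t1\t0\n'
} > example.specialSNPs

# Hypothetical helper: fails if any row has copy numbers that disagree with
# the expected male,female pair for its chr value.
check_special_snps() {
  awk -F'\t' '
    BEGIN { want["MT"] = "1,1"; want["PAR"] = "2,2"; want["X"] = "1,2"; want["Y"] = "1,0" }
    /^#/ { next }                      # assumption: "#" lines are file comments
    !seen_header {                     # map each heading to its column index
      for (i = 1; i <= NF; i++) col[$i] = i
      seen_header = 1
      next
    }
    {
      cls = $(col["chr"])
      got = $(col["copy_male"]) "," $(col["copy_female"])
      if (!(cls in want) || want[cls] != got) {
        print "unexpected copy numbers on line " NR ": " $0
        bad = 1
      }
    }
    END { exit (bad ? 1 : 0) }
  ' "$1"
}

check_special_snps example.specialSNPs && echo "copy numbers ok"   # prints: copy numbers ok
```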
Here is an example usage based on the Genome Wide SNP 6.0 Array:
apt-probeset-genotype \
    --cdf-file lib/GenomeWideSNP_6.cdf \
    --special-snps lib/GenomeWideSNP_6.specialSNPs \
    --read-models-birdseed lib/GenomeWideSNP_6.birdseed.models \
    --chrX-probes lib/GenomeWideSNP_6.chrXprobes \
    --chrY-probes lib/GenomeWideSNP_6.chrYprobes \
    --set-gender-method cn-probe-chrXY-ratio \
    --analysis birdseed \
    --out-dir out \
    --cel-files cel.txt
Affymetrix Power Tools (APT) Release apt-1.10.1