VIGNETTES: Mouse Diversity Genotyping Array Clustering Analysis

Date:
2009-6-15

Introduction

The Mouse Diversity Genotyping Array is designed for high-density, genome-wide profiling of single nucleotide polymorphisms (SNPs) and copy number variation (CNV) segments. It provides more than 100 times the SNP coverage of any other available mouse array, permitting high-resolution mapping, genomic analysis, and association studies.

This vignette provides examples of how to use Affymetrix Power Tools (APT) to make genotype calls from the array. The examples and supporting files are designed for genotyping SNPs in classic mouse strains.

Before genotyping, it is important to remove the samples that do not pass QC. Please see the Mouse Diversity Genotyping Array Sample Quality Control Vignette.
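As an illustration of dropping failed samples before genotyping, the sketch below builds a CEL file list that leaves out samples flagged by QC. It assumes the list format expected by the --cel-files option (a "cel_files" header line followed by one CEL file path per line). The function name make_cel_list, the directory layout, and the failed_qc.txt file are hypothetical and not part of the vignette's supporting files.

```shell
#!/bin/sh
# Sketch: write a CEL file list, skipping samples that failed QC.
# Assumptions (not from the vignette): all CEL files sit in one directory,
# and the "failed" file lists failing CEL file names, one per line.
make_cel_list() {
    cel_dir=$1      # directory containing *.CEL files
    failed=$2       # text file of CEL file names to exclude
    out=$3          # list file to write, e.g. cel.txt

    echo "cel_files" > "$out"          # header line for the list file
    for f in "$cel_dir"/*.CEL; do
        [ -e "$f" ] || continue        # the glob matched nothing
        name=$(basename "$f")
        # Keep the file unless its name appears as a whole line in the failed list.
        if ! grep -qxF -e "$name" "$failed" 2>/dev/null; then
            echo "$f" >> "$out"
        fi
    done
}

# Example (hypothetical paths): make_cel_list ./cel failed_qc.txt cel.txt
```

The resulting file can then be passed to apt-probeset-genotype through --cel-files in place of a hand-edited list.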

Supporting Files

The following files are referenced in this vignette:

These files can be found on the Affymetrix Mouse Diversity Genotyping Array product page.

APT Installation

Affymetrix Power Tools (APT) are a set of cross-platform command line programs that implement algorithms for analyzing and working with Affymetrix GeneChip arrays. The latest version of the Affymetrix Power Tools can be downloaded and installed from here: http://www.affymetrix.com/partners_programs/programs/developer/tools/powertools.affx#1_2

The program for this vignette, "apt-probeset-genotype", will be in "<apt folder>\bin" after downloading and unzipping.

Sample Genotyping Script

apt-probeset-genotype is the application for making genotype calls. The genotype calling algorithm used for this array is a modified version of BRLMM-P. Because it is a model-based algorithm, it must be run on multiple CEL files at once to estimate probe effects and SNP cluster parameters. It is advisable to cluster with at least 44 genetically distinct samples; adding more samples continues to help, particularly for calling rare genotypes accurately.
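The sample-count guideline above can be checked before a run. The following is a minimal sketch, assuming the CEL list file starts with a "cel_files" header line; the function names count_samples and check_sample_count are hypothetical. It only counts listed files and cannot tell whether the samples are genetically distinct.

```shell
#!/bin/sh
# Count the CEL files named in a list file, skipping the header line
# and ignoring blank lines.
count_samples() {
    tail -n +2 "$1" | grep -c '[^[:space:]]'
}

# Warn when fewer than 44 samples are listed (the minimum advised above).
check_sample_count() {
    n=$(count_samples "$1")
    if [ "$n" -lt 44 ]; then
        echo "warning: only $n samples listed in $1; at least 44 are advised" >&2
        return 1
    fi
    echo "$n samples listed in $1"
}

# Example: check_sample_count cel.txt
```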

Included in the files available on the array product page and included below is a sample genotyping script "genotype.sh". To execute this script without change, "genotype.sh", "apt-probeset-genotype", and all input files must be in the same directory.
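Because the script expects its inputs in the working directory, a quick pre-flight check can save a failed run. The sketch below looks for the input file names that appear in the sample command in this vignette; the function name missing_inputs is hypothetical, and the list should be adjusted if the command is changed.

```shell
#!/bin/sh
# Report every input file named in the sample command that is absent
# from the current directory. Returns non-zero if anything is missing.
missing_inputs() {
    status=0
    for f in MOUSEDIVm520650.CDF cel.txt SNPsubset.txt Mouse.models \
             Mouse.chrXprobes Mouse.chrYprobes Mouse.specialSNPs; do
        if [ ! -f "$f" ]; then
            echo "missing: $f" >&2
            status=1
        fi
    done
    return $status
}

# Example: missing_inputs && sh genotype.sh
```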

The command below is very similar to the one used for single-sample analysis, with one key difference: the parameters supplied in the analysis string specify a multi-sample clustering mode. In this mode, data from all samples are used in a Bayesian process to update the prior estimates of genotype cluster locations before the final genotype calls are made.

  apt-probeset-genotype \
	-o genotype \
	-c MOUSEDIVm520650.CDF \
	--cel-files cel.txt \
	-s SNPsubset.txt \
	-a quant-norm.sketch=50000,pm-only,brlmm-p.CM=1.bins=100.mix=1.bic=2.HARD=3.SB=0.75.KX=0.2.KH=0.3.KXX=-0.1.KAH=-0.1.KHB=-0.1.transform=MVA.AAM=2.0.BBM=-2.0.AAV=0.06.BBV=0.06.ABV=0.06.copyqc=0.00000.wobble=0.05.CSepThr=4.CSepPen=0.1.KYAH=-0.05.KYHB=-0.05.KYAB=-0.1.AAY=9.ABY=9.5.BBY=9.copytype=-1.clustertype=2.ocean=0.00001.MS=0.1 \
	--read-models-brlmmp Mouse.models \
	--summaries \
	--write-models \
	--chrX-probes Mouse.chrXprobes \
	--chrY-probes Mouse.chrYprobes \
	--special-snps Mouse.specialSNPs

Note - On Unix, the line-continuation character '\' must be the last character on its line. No other characters, including trailing spaces, may follow it.
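Trailing whitespace after a '\' is hard to spot by eye. As a hedged sketch, the helper below (the name find_broken_continuations is hypothetical) prints, with line numbers, every line of a script where whitespace follows a final backslash. Carriage returns from Windows line endings are also reported, since they count as whitespace.

```shell
#!/bin/sh
# List lines in which a backslash is followed only by whitespace up to the
# end of the line, which breaks the line continuation.
find_broken_continuations() {
    grep -n '\\[[:space:]][[:space:]]*$' "$1"
}

# Example: find_broken_continuations genotype.sh
```

No output means that no continuation line in the script has stray characters after its backslash.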

Note - The Windows command prompt does not accept '\' as a line-continuation character. When running the sample script above on Windows, omit each '\' and enter the whole command on a single line.
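One way to produce the single-line form is to join the continued lines mechanically rather than by hand. This is a minimal sketch, run in a Unix shell, with a hypothetical function name join_continuations. It strips leading indentation and the trailing '\' from each line and joins continued lines with a single space; lines without a continuation are printed unchanged apart from their indentation.

```shell
#!/bin/sh
# Join backslash-continued lines of a script into single lines.
join_continuations() {
    awk '
        {
            line = $0
            sub(/^[ \t]+/, "", line)                  # drop leading indentation
            if (sub(/[ \t]*\\[ \t\r]*$/, "", line)) {
                printf "%s ", line                    # continued: stay on this output line
            } else {
                print line                            # end of the command
            }
        }
    ' "$1"
}

# Example: join_continuations genotype.sh > genotype_oneline.txt
```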

For more information and options for the genotyping script, please see the apt-probeset-genotype manual: http://www.affymetrix.com/support/developer/powertools/changelog/apt-probeset-genotype.html

For more information on the genotyping algorithm, see the original BRLMM-P whitepaper and the DMET Plus algorithm whitepaper, which documents some more recently added features.

The output report

The following files make up the standard genotyping output. All of them except the summary files are included with the download, and they are written to the genotype directory when "genotype.sh" is executed.

For more information on the output files, please see the apt-probeset-genotype manual: http://www.affymetrix.com/support/developer/powertools/changelog/apt-probeset-genotype.html

Affymetrix Power Tools (APT) Release 1.14.3