
Mapping 500K Sample Data Set

This dataset is expected to be useful for a variety of purposes including software and workflow demonstration and development of probe-level analysis methods for making genotype calls from probe intensity data.

The dataset consists of 48 samples, each hybridized to both the Nsp and Sty arrays (a total of 48x2=96 hybridizations). The samples comprise 13 trios (5 HapMap CEPH trios, 5 HapMap Yoruban trios and 3 other non-HapMap trios) and 9 unrelated HapMap Asian samples. In total, 39 of the 48 samples are among those used in the International HapMap Project.
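The sample counts above can be cross-checked with a short tally. This is only an illustrative sketch: the dictionary keys and the function name `sample_summary` are labels chosen here, not identifiers from the dataset.

```python
# Tally of the sample composition described above.
# Group labels and the function name are illustrative, not part of the dataset.
TRIO_SIZE = 3
ARRAYS = ("Nsp", "Sty")

TRIOS = {"HapMap CEPH": 5, "HapMap Yoruban": 5, "non-HapMap": 3}
UNRELATED = {"HapMap Asian": 9}


def sample_summary():
    """Return the trio, sample, and hybridization counts implied by the text."""
    all_groups = {**TRIOS, **UNRELATED}
    trio_samples = {name: n * TRIO_SIZE for name, n in TRIOS.items()}
    total_samples = sum(trio_samples.values()) + sum(UNRELATED.values())
    hapmap_samples = sum(
        count
        for name, count in {**trio_samples, **UNRELATED}.items()
        if name.startswith("HapMap")
    )
    return {
        "groups": len(all_groups),
        "trios": sum(TRIOS.values()),
        "samples": total_samples,
        "hapmap_samples": hapmap_samples,
        "hybridizations": total_samples * len(ARRAYS),
    }


if __name__ == "__main__":
    print(sample_summary())
```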

Of particular use is the large number of reference genotypes that the HapMap Project has made available and that can be used in conjunction with this dataset. The HapMap data access policy limits redistribution rights on these genotypes, so Affymetrix cannot make them available directly, but the reference data can be downloaded from the HapMap Project. As of HapMap release 16c1, 124,624 SNPs have reference genotypes available for the samples shared here (65,246 SNPs for Nsp and 59,378 SNPs for Sty). These numbers increase with each HapMap update. The details of the analysis method used by GTYPE to determine genotype calls from probe intensity data have been published in Bioinformatics.

The dataset has been split into 13 zip files for convenient download. These can be unzipped on top of one another into the same directory. The file with the word 'base' in its filename is required. The other 12 zip files each contain a distinct collection of chip data, so users who want only part of the data may download any subset of them.
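Unzipping the parts on top of one another can be scripted. The following is a minimal sketch using Python's standard `zipfile` module. The function name `extract_parts` is hypothetical, and the only assumption taken from the text is that the required archive has 'base' in its filename.

```python
import zipfile
from pathlib import Path


def extract_parts(zip_paths, target_dir):
    """Extract every archive into the same directory, one on top of another.

    Raises ValueError if no archive with 'base' in its filename is supplied,
    since the text states that the base file is required.
    """
    paths = [Path(p) for p in zip_paths]
    if not any("base" in p.name.lower() for p in paths):
        raise ValueError("the archive with 'base' in its filename is required")

    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)

    extracted = []
    for path in paths:
        with zipfile.ZipFile(path) as archive:
            archive.extractall(target)  # files from all parts share one tree
            extracted.extend(archive.namelist())
    return extracted
```

Pass the base archive plus whichever of the other 12 archives were downloaded. The returned list names every member that was extracted.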

The data is provided in two versions, which contain the same data in different file formats. Version 1 (Table 1) contains raw CEL, CHP and EXP files and is suitable for use outside of the GCOS/GTYPE framework. It is expected to be of interest mainly to users doing low-level probe analysis. Version 2 (Table 2) contains DTT format files and is expected to be of interest mainly to users who wish to integrate the data with the GCOS/GTYPE applications.

In either case, the 'base' file includes a file named README.txt with detailed instructions on how to use the data. MD5 checksums are provided in the tables below for verifying the integrity of downloaded data.
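A downloaded file can be checked against its listed checksum with a few lines of Python. This is a minimal sketch using the standard `hashlib` module. The function names and the example filename in the usage note are hypothetical, since the actual download filenames are not listed here.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in 1 MiB chunks so ~200MB archives are not loaded into memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_md5):
    """Compare a file's MD5 digest with the checksum listed in the tables."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

For example, `verify_download("downloaded_base.zip", "30abc864a12205eda21af3a9d33d3a27")` would return True only if the local file (a placeholder name here) matches the checksum listed for the base file.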

Version 1 release of data: CEL, CHP and EXP format

(Suitable for use outside of GCOS/GTYPE framework)

File Size  md5 Checksum                      Description
101MB      30abc864a12205eda21af3a9d33d3a27  Documentation and library files for entire dataset
215MB      b563e2b3a809983f36fb1d5dbddd0e43  Probe intensities and genotype calls for 8 HapMap samples on Nsp array
221MB      b30c9d56ad9fff4b777d6a2bad98c1f6  Probe intensities and genotype calls for 8 HapMap samples on Nsp array
210MB      6ad65f1d4b8fd3a4fe807a6b4b8a843e  Probe intensities and genotype calls for 8 HapMap samples on Nsp array
220MB      50aaf3213d2ce0f43c6b6188b241f6be  Probe intensities and genotype calls for 8 HapMap samples on Nsp array
215MB      e504591076cac04c9dd78320ff622916  Probe intensities and genotype calls for 7 HapMap and 1 non-HapMap samples on Nsp array
216MB      50a796fccbf06c7d767260a50a03094c  Probe intensities and genotype calls for 8 non-HapMap samples on Nsp array
209MB      2a45a4f4fc282029c6aae652250c75fb  Probe intensities and genotype calls for 8 HapMap samples on Sty array
222MB      654a6d82147930a5bbc207643d16ba1f  Probe intensities and genotype calls for 8 HapMap samples on Sty array
216MB      c86549f0011c019da253b88ba9cd8e60  Probe intensities and genotype calls for 8 HapMap samples on Sty array
221MB      446683507a8df1dd6affd4a792bf844f  Probe intensities and genotype calls for 8 HapMap samples on Sty array
221MB      5a368488d8383a95ab801e32e10f5806  Probe intensities and genotype calls for 7 HapMap and 1 non-HapMap samples on Sty array
222MB      d488c793bc1ac3747719226267fb283e  Probe intensities and genotype calls for 8 non-HapMap samples on Sty array

Version 2 release of data: DTT format

(Intended for use within GCOS/GTYPE framework)

File Size  md5 Checksum                      Description
101MB      30abc864a12205eda21af3a9d33d3a27  Documentation and library files for entire dataset
215MB      eb6812d72feab9167deb28c7cad81ce5  Archived DTT files for 8 HapMap samples on Nsp array
221MB      419b0311dd55105a36e0bf3acf0ab957  Archived DTT files for 8 HapMap samples on Nsp array
210MB      cfcede32d13d1deabe11c6b5c4d3795e  Archived DTT files for 8 HapMap samples on Nsp array
221MB      34b25f3e188b1b7373e635f121c13efa  Archived DTT files for 8 HapMap samples on Nsp array
215MB      b19b1bdbdb4aa83ed35d45390d71c738  Archived DTT files for 7 HapMap and 1 non-HapMap samples on Nsp array
216MB      177077a197712bbc77dcca2059b3b76c  Archived DTT files for 8 non-HapMap samples on Nsp array
209MB      0dde6dbf688d32eb79252b1d3a161f33  Archived DTT files for 8 HapMap samples on Sty array
222MB      70c459d8da59f13a384b40d870c38f02  Archived DTT files for 8 HapMap samples on Sty array
216MB      4ef48cfc72c9dafc712effaff53eb0b3  Archived DTT files for 8 HapMap samples on Sty array
221MB      feca84b7dcc8d473487c73ddfa94d7ea  Archived DTT files for 8 HapMap samples on Sty array
222MB      c94a4d7e3434a4538e8966374ba01bac  Archived DTT files for 7 HapMap and 1 non-HapMap samples on Sty array
222MB      f9eb0b7db9fe3a375d6ccbb8c83103db  Archived DTT files for 8 non-HapMap samples on Sty array
