Affymetrix® DAT Data File Format
DAT FILE
Description
The DAT file contains pixel intensity values collected from an Affymetrix scanner.
This document describes the following versions:
- Version 1 format is generated by the MAS and GCOS 1.x software.
- Command Console version 1 is generated by the Command Console software. This is stored in the Command Console "generic" data file format.
The version 1 file is a binary file with a 512-byte header followed by 16-bit unsigned integer pixel intensity values. Values are stored in little-endian format.
Header
The 512 byte header contains the following information:
| Item | Description | Type |
|---|---|---|
| 1 | Type of file, must be 0xFC. | BYTE |
| 2 | Number of pixels per line. | WORD |
| 3 | Number of lines in the image. | WORD |
| 4 | The total number of data points (pixels) in the image. | DWORD |
| 5 | Minimum pixel value in the image. | DWORD |
| 6 | Maximum pixel value in the image. | DWORD |
| 7 | Mean pixel value. | double |
| 8 | Standard deviation of the pixel values | double |
| 9 | Number of pixels per row (padded with spaces), preceded with "CLS=". | char[9] |
| 10 | Number of rows in the image (padded with spaces), preceded with "RWS=". | char[9] |
| 11 | Pixel width in micrometers (padded with spaces), preceded with "XIN=" | char[7] |
| 12 | Pixel height in micrometers (padded with spaces), preceded with "YIN=". | char[7] |
| 13 | Scan speed in millimeters per second (padded with spaces), preceded with "VE=". | char[6] |
| 14 | Temperature in degrees Celsius (padded with spaces). If no temperature was set then the entire field is empty. | char[7] |
| 15 | Laser power in milliwatts or microwatts (padded with spaces). | char[4] |
| 16 | Date and time of scan (padded with spaces). | char[18] |
| 17 | This field contains several sub-fields. The first sub-field is the scanner ID, sometimes followed by a number, followed by three spaces. If the scanner ID is absent, the sub-field consists of four spaces. Next are 10 structured comment fields. Each is preceded by the delimiter 0x14 and a space, and followed by a space and 0x14. Only field 2 is valid; the other 9 fields are obsolete. Field 2 contains the probe array type followed by .1sq; the 1sq extension is also obsolete. After the last structured field comes the chip orientation, preceded by a space. The rest of the field is filled with nulls (zeros). | char[220] |
| 18 | Average DC Offset of the scanner for this image. | double |
| 19 | Standard deviation of the average DC offset. | double |
| 20 | Number of samples taken in determining the DC offset. | DWORD |
| 21 | Coordinates of the grid's upper left corner. | POINTS |
| 22 | Coordinates of the grid's upper right corner. | POINTS |
| 23 | Coordinates of the grid's lower right corner. | POINTS |
| 24 | Coordinates of the grid's lower left corner. | POINTS |
| 25 | Cell margin used for computing the cell's intensity value. | WORD |
| 26 | Name of the experiment padded with nulls. | char[154] |
Types used are defined as: BYTE (an 8-bit unsigned integer), SHORT (a 16-bit integer), WORD (a 16-bit unsigned integer), DWORD (a 32-bit unsigned integer), double (a 64-bit floating-point number), char (an 8-bit character) and POINTS (a structure that defines the coordinates of a point; it contains two SHORT values holding the X and Y values of the coordinate).
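As an illustration of the layout above, the following Python sketch unpacks the 512-byte header with the standard `struct` module. It assumes the 26 items are stored back to back in table order with no alignment padding; the text does not state this, but the declared sizes add up to exactly 512 bytes, which the sketch checks. The names `DAT_HEADER`, `parse_dat_header`, `_text` and the dictionary keys are chosen for this example only and are not part of the format.

```python
import struct

# Assumed packed little-endian layout of the 26 header items, in table order.
# POINTS is two SHORT values, so the four grid corners are unpacked as 8 "h".
DAT_HEADER = struct.Struct("<BHHIIIdd9s9s7s7s6s7s4s18s220sddI8hH154s")
assert DAT_HEADER.size == 512  # the declared field sizes sum to the header size


def _text(raw: bytes) -> str:
    """Decode a fixed-width text field, dropping null and space padding."""
    return raw.split(b"\x00", 1)[0].decode("ascii", errors="replace").strip()


def parse_dat_header(buf: bytes) -> dict:
    """Unpack the first 512 bytes of a version 1 DAT file into a dict."""
    f = DAT_HEADER.unpack(buf[:DAT_HEADER.size])
    if f[0] != 0xFC:
        raise ValueError(f"not a DAT file: type byte is {f[0]:#x}")
    xy = f[20:28]  # items 21-24: x, y pairs for the four grid corners
    return {
        "pixels_per_line": f[1],
        "lines": f[2],
        "n_pixels": f[3],
        "min_pixel": f[4],
        "max_pixel": f[5],
        "mean_pixel": f[6],
        "stdev_pixel": f[7],
        "scan_date": _text(f[15]),
        "dc_offset": f[17],
        "grid_corners": {
            "upper_left": (xy[0], xy[1]),
            "upper_right": (xy[2], xy[3]),
            "lower_right": (xy[4], xy[5]),
            "lower_left": (xy[6], xy[7]),
        },
        "cell_margin": f[28],
        "experiment": _text(f[29]),
    }
```

A caller would typically read the first 512 bytes of the file in binary mode and pass them to `parse_dat_header`. The text fields that carry a prefix such as "CLS=" are left undecoded here.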
Data
The pixel intensity data follows the header, starting at byte 512. Intensity values are stored as 16-bit unsigned integers. Data is stored row by row.
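A minimal sketch of reading the pixel data, assuming the number of pixels per line (`cols`) and the number of lines (`rows`) have already been taken from header items 2 and 3. The function names `read_pixels` and `pixel_offset` are illustrative and not part of the format.

```python
import struct
from typing import BinaryIO, List

HEADER_SIZE = 512      # pixel data starts at this byte offset
BYTES_PER_PIXEL = 2    # 16-bit unsigned integers


def read_pixels(fh: BinaryIO, cols: int, rows: int) -> List[List[int]]:
    """Read the image as a list of rows, each a list of `cols` intensities."""
    fh.seek(HEADER_SIZE)
    row_format = f"<{cols}H"           # one row of little-endian 16-bit values
    row_bytes = cols * BYTES_PER_PIXEL
    image = []
    for _ in range(rows):
        chunk = fh.read(row_bytes)
        if len(chunk) != row_bytes:
            raise ValueError("file ends before the last row of pixels")
        image.append(list(struct.unpack(row_format, chunk)))
    return image


def pixel_offset(row: int, col: int, cols: int) -> int:
    """Byte offset of one pixel, given that data is stored by rows."""
    return HEADER_SIZE + BYTES_PER_PIXEL * (row * cols + col)
```

Reading one row at a time keeps memory use predictable for large scans; `pixel_offset` can be used instead when only a few pixels are needed and the file supports seeking.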
The format of the DAT file generated by the Command Console software uses the Command Console generic data format. The following describes the data sets and groups in the file.
In the generic data header, the data type identifier shall be set to "affymetrix-calvin-scan-acquisition".
The parameters stored in the file include:
| Parameter Name | Definition |
|---|---|
| affymetrix-array-type | The probe array type. |
| affymetrix-pixel-size | The pixel size. |
| affymetrix-scanner-type | The scanner type, e.g. M10 for the GCS3000 high resolution scanner. |
| affymetrix-scanner-id | The scanner ID |
| affymetrix-scan-date | The date of the scan. |
| affymetrix-pixel-rows | The number of rows of pixels. |
| affymetrix-pixel-cols | The number of columns of pixels. |
| affymetrix-partial-dat-header | DAT header string without min and max intensity. Present if this is a native Command Console DAT file. |
| affyemtrix-max-pixel-intensity | Max pixel intensity. Present if this is a native Command Console DAT file. |
| affymetrix-min-pixel-intensity | Min pixel intensity. Present if this is a native Command Console DAT file. |
| affymetrix-full-dat-header | Full DAT header string. Present if this is a DAT file converted from GCOS. |
| affymetrix-image-orientation | A code indicating the orientation of the image. |
| affymetrix-file-version | File version. |
| affymetrix-image-flip-flag | Indicates if the image has been flipped about the y-axis. |
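The table implies that the header-string parameters distinguish a native Command Console file from one converted from GCOS. The sketch below shows one way to act on that, assuming the parameters have already been read into a plain mapping of name to value by some generic-file reader (not shown, and not described here). The function names and the returned labels are hypothetical.

```python
from typing import Any, Mapping, Optional, Tuple

FULL = "affymetrix-full-dat-header"
PARTIAL = "affymetrix-partial-dat-header"
MIN_NAME = "affymetrix-min-pixel-intensity"
# The table lists the max-intensity name with the spelling "affyemtrix-...".
# Accepting the "affymetrix-..." spelling as well is an assumption made here,
# not something the table states.
MAX_NAMES = ("affyemtrix-max-pixel-intensity", "affymetrix-max-pixel-intensity")


def dat_origin(params: Mapping[str, Any]) -> str:
    """Classify a file by which header-string parameter is present."""
    if FULL in params:
        return "converted-from-gcos"
    if PARTIAL in params:
        return "native-command-console"
    return "unknown"


def intensity_range(params: Mapping[str, Any]) -> Optional[Tuple[int, int]]:
    """Return (min, max) pixel intensity, or None if either is missing."""
    lo = params.get(MIN_NAME)
    hi = next((params[name] for name in MAX_NAMES if name in params), None)
    if lo is None or hi is None:
        return None
    return int(lo), int(hi)
```

For a converted file the min and max parameters are expected to be absent, so `intensity_range` returns `None` and the values would have to come from the full header string instead.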
Array file parameters (if available) will be stored within the parent data header object. Parameters include:
| Parameter Name | Definition |
|---|---|
| affymetrix-array-id | A GUID that identifies the array. |
| affymetrix-array-barcode | The array barcode. |
The data is stored in a single group with 4 data sets. The data sets are defined as:
| Data Set Name | Description | Number of Columns | Column Name | Column Type | Column Description |
|---|---|---|---|---|---|
| Pixel | The intensity values for each pixel. | 1 | Pixel | USHORT | The pixel intensities. |
| Stats | The minimum and maximum pixel value. | 2 | Min Intensity | USHORT | The minimum pixel value. |
| | | | Max Intensity | USHORT | The maximum pixel value. |
| GlobalGrid | The status of the grid and the 4 corners that define the global grid. | 9 | GridStatus | UINT | The status of the grid. OK=1, Error=2, Manual Adjust=4 |
| | | | Upper left x | FLOAT | The X coordinate of the upper left corner |
| | | | Upper right x | FLOAT | The X coordinate of the upper right corner |
| | | | Lower right x | FLOAT | The X coordinate of the lower right corner |
| | | | Lower left x | FLOAT | The X coordinate of the lower left corner |
| Subgrid | The status of the grid and the 4 corners that define each sub-grid. | 9 | GridStatus | UINT | The status of the grid. OK=1, Error=2, Manual Adjust=4 |
| | | | Upper left x | FLOAT | The X coordinate of the upper left corner |
| | | | Upper right x | FLOAT | The X coordinate of the upper right corner |
| | | | Lower right x | FLOAT | The X coordinate of the lower right corner |
| | | | Lower left x | FLOAT | The X coordinate of the lower left corner |
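The GridStatus codes can be given readable names in client code. The sketch below models them with Python's `enum.IntFlag`. The values 1, 2 and 4 are powers of two, which suggests they might be combined as bit flags, but the table does not say so; treating them as flags is an assumption, and `IntEnum` would be the better fit if the codes are mutually exclusive. The names `GridStatus` and `describe_grid_status` are illustrative.

```python
from enum import IntFlag
from typing import List


class GridStatus(IntFlag):
    """Grid status codes as listed for the GlobalGrid and Subgrid data sets."""
    OK = 1
    ERROR = 2
    MANUAL_ADJUST = 4


def describe_grid_status(value: int) -> List[str]:
    """Return the names of all status codes set in `value`."""
    return [flag.name for flag in GridStatus if flag & value]
```

With this helper, a GridStatus column value of 4 maps to `["MANUAL_ADJUST"]`, and a value with no known bit set maps to an empty list.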