Parsing Affymetrix Genotyping CHP Files

Introduction

The Affymetrix GTYPE, BRLMM Analysis Tool, and Genotyping Console™ genotyping software packages store the probe set summarizations (allele call, confidence, etc.) from the MPAM, Dynamic Model, BRLMM, and other algorithms in a binary file with a .CHP extension. This file is referred to as the CHP ("chip") file. The newer BRLMM Analysis Tool and Genotyping Console software also include the algorithm name as part of the CHP file name. The Affymetrix genotyping software supports two main CHP file formats: the GCOS/XDA format (generated by GTYPE and BRLMM Analysis Tool) and the Command Console format (generated by BRLMM Analysis Tool and Genotyping Console). Documentation on these file formats is available on the DevNet section of the Affymetrix web site.

Library Files

The older CHP format (GCOS/XDA) stores only the analysis results, not the probe set names. To obtain the associated probe set names you need to read either the CDF or the PSI file (library files). The PSI file is an ASCII text file containing the probe set name and the number of probe pairs for each probe set. It is smaller and easier to parse than the CDF file. The CDF file contains, in addition to the probe set names, the list of associated features (X/Y feature coordinates on the array and other attributes) for each probe set. The CDF file is stored in either ASCII or binary format.

The PSI and CDF files are named using the array type (also known as chip type) with a .PSI or .CDF extension. The array/chip type is stored in the header of the CHP file. Given the full path to the CHP file and the full path of the library directory you can determine the CHP file's associated PSI/CDF file.
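The naming rule above can be sketched as a small helper. This is only an illustration, not part of the Fusion SDK: the function name LibraryFilePath and its parameters are hypothetical, and it assumes the library files sit directly in the library directory and that the extension case on disk matches what is passed in.

```cpp
#include <string>

// Hypothetical helper (not part of the Fusion SDK): builds the path of the
// library file that belongs to a given array/chip type.
std::string LibraryFilePath(const std::string &libDir,
                            const std::string &chipType,
                            const std::string &extension)
{
    std::string path = libDir;

    // Append a separator only if the directory does not already end with one.
    if (!path.empty() && path.back() != '/' && path.back() != '\\')
        path += '/';

    // The file is named <chip type>.<extension>, e.g. <chip type>.psi
    return path + chipType + "." + extension;
}
```

For example, calling LibraryFilePath with a library directory, the chip type read from the CHP header, and "psi" yields the candidate PSI file path. On case-sensitive file systems the extension case may need to be adjusted to match the installed library files.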

The probe sets appear in the same order in the PSI, CDF, and XDA format CHP files. You can therefore open each file and use the index of a probe set to join its data across the files.
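The index-based join can be sketched as follows. This is a minimal illustration that assumes the probe set names and the results have already been read into memory; the GenotypeResult and NamedResult types and the JoinByIndex function are hypothetical and are not Fusion SDK types.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical containers for data already read from the files.
struct GenotypeResult { unsigned char call; float confidence; };
struct NamedResult { std::string name; unsigned char call; float confidence; };

// Joins names (from the PSI or CDF file) with results (from the XDA CHP file)
// by position. Returns an empty vector if the counts differ, which would
// suggest that the library file does not belong to the CHP file.
std::vector<NamedResult> JoinByIndex(const std::vector<std::string> &names,
                                     const std::vector<GenotypeResult> &results)
{
    std::vector<NamedResult> joined;
    if (names.size() != results.size())
        return joined;

    joined.reserve(names.size());
    for (std::size_t i = 0; i < names.size(); ++i)
    {
        NamedResult r = { names[i], results[i].call, results[i].confidence };
        joined.push_back(r);
    }
    return joined;
}
```

Checking the counts before joining is a cheap guard against pairing a CHP file with the library file of a different array type.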

The newer Command Console format CHP files do contain the probe set names (also known as the SNPID or Probe Set ID in the software). This format allows multiple tables of data, so control probe set results can be stored separately from the SNP probe set results. As a consequence, the order of the probe sets in the CDF file may not match the order in the CHP file.
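Because the order may differ, data from a Command Console CHP file is better matched to CDF data by probe set name than by index. The following sketch shows one way to do that with a name-to-index lookup; BuildNameIndex and FindCdfIndex are hypothetical helpers, not Fusion SDK functions, and the names are assumed to have been read into a vector already.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Builds a lookup from probe set name to its position in the CDF file.
std::unordered_map<std::string, int>
BuildNameIndex(const std::vector<std::string> &cdfNames)
{
    std::unordered_map<std::string, int> index;
    for (std::size_t i = 0; i < cdfNames.size(); ++i)
        index[cdfNames[i]] = static_cast<int>(i);
    return index;
}

// Returns the CDF index for a probe set name taken from the CHP file,
// or -1 if the name is not present in the CDF file.
int FindCdfIndex(const std::unordered_map<std::string, int> &index,
                 const std::string &name)
{
    std::unordered_map<std::string, int>::const_iterator it = index.find(name);
    return it == index.end() ? -1 : it->second;
}
```

Building the lookup once and querying it for each CHP entry avoids a linear search through the CDF names for every probe set.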

Fusion Software Developers Kit (SDK)

Parsers in the form of C++ and Java source code are available from Affymetrix to parse the CHP, CDF and PSI files. These parsers, along with sample code and documentation, are contained within the Fusion SDK.

Parsing a genotyping CHP File

The Fusion SDK provides classes/interfaces for parsing the different types of CHP files, including genotyping, expression, and tiling CHP files. The FusionCHPLegacyData class supports XDA format CHP files, and the FusionCHPMultiDataData class supports Command Console format CHP files.

Sample Code

The following is an example of C++ code using the Fusion SDK to extract the genotyping results from a CHP (GCOS/XDA or Command Console) file. The top level function is ReadCHPFile.

#include "FusionCHPData.h"                // header file for reading CHP files.
#include "FusionCHPLegacyData.h"     // XDA CHP files
#include "FusionCHPMultiDataData.h" // Command Console CHP files
#include "StringUtils.h"
#include <string>

using namespace std;
using namespace affymetrix_calvin_utilities;
using namespace affymetrix_calvin_data;
using namespace affymetrix_fusion_io;
using namespace affymetrix_calvin_io;

/*! This function extracts the results from the CHP file and probe set name from the PSI file.
 * @param legchp  The CHP file object.
 * @param libPath  The full path to the library file directory.
 * @return True if successfully read.
 */

bool ExtractData(FusionCHPLegacyData *legchp, const char *libPath)
{
    if (legchp->GetHeader().GetAssayType() != FusionGenotyping)
        return false;

    // The chip type is stored in the header of the CHP file.
    string chipType = StringUtils::ConvertWCSToMBS(legchp->GetHeader().GetChipType());

    // The probe set names are stored in either the PSI or CDF file. Use the
    // FusionPSIFile class in the Fusion SDK to parse the PSI file.
    FusionPSIFile psi;
    // Note: libPath must end with a path separator for this concatenation
    // to form a valid path.
    string psiFile = libPath + chipType + ".psi";
    psi.SetFileName(psiFile.c_str());
    if (psi.Read() == false)
        return false;

    // Now loop over the probe sets to get the call and confidence values.
    // p-values are also available from the psResults object.
    // The probe set name comes from the PSI file.
    // Note that the call can be one of the following constants defined in CHPFileData.h
    // ALLELE_A_CALL, ALLELE_AB_CALL, ALLELE_B_CALL, ALLELE_NO_CALL

    float conf;
    u_int8_t call;
    string name;
    FusionGenotypeProbeSetResults psResults;
    int n = legchp->GetHeader().GetNumProbeSets();
    for (int i = 0; i < n; i++)
    {
        legchp->GetGenotypingResults(i, psResults);
        conf = psResults.GetConfidence();
        call = psResults.GetCall();
        name = psi.GetProbeSetName(i);
    }
    return true;
}

/*! This function extracts the results from the CHP file.
 * @param mchp  The CHP file object.
 * @return True if successfully read.
 */

bool ExtractData(FusionCHPMultiDataData *mchp)
{
    // Multiple tables of results are stored in the multi data CHP file.
    // The SNP analysis results are stored in the "genotype" table.
    int n = mchp->GetEntryCount(GenotypeMultiDataType);

    // Now loop over the probe sets to get the call and confidence values.
    // Other values, such as the contrast and strength for the BRLMM algorithm,
    // are also stored. Use the GetGenotypeEntry function to retrieve all of the
    // column results for the given SNP.

    float conf;
    u_int8_t call;
    string name;
    for (int i = 0; i < n; i++)
    {
        call = mchp->GetGenoCall(GenotypeMultiDataType, i);
        conf = mchp->GetGenoConfidence(GenotypeMultiDataType, i);
        name = mchp->GetProbeSetName(GenotypeMultiDataType, i);
    }
    return true;
}

/*! This will read the CHP file, determine the type and extract the results.
 * @param fileName The full path to the CHP file.
 * @param libPath The full path to the library file directory.
 * @return True if successfully read.
 */

bool ReadCHPFile(const char *fileName, const char *libPath)
{

    // Read the CHP file. This function will read any type of CHP file whose parsers (from Fusion)
    // have been compiled and linked into the program.

    FusionCHPData *chp = FusionCHPDataReg::Read(fileName);
    if (chp == NULL)
        return false;

    // The following function will determine if the CHP file read is of the GCOS/XDA format.
    // The "legacy" format data is that which contains call, confidence, p-value,
    // RAS1 and RAS2 results.
    // Note: The XDA file format also provides for storage of expression results. The ExtractData function
    // will perform an additional check to ensure it is a genotyping CHP file.
    bool status = false;
    FusionCHPLegacyData *legchp = FusionCHPLegacyData::FromBase(chp);
    if (legchp != NULL)
    {
        status = ExtractData(legchp, libPath);
        delete legchp;
        return status;
    }

    // The following function will determine if the CHP file read is of the Command Console format.
    // The library path does not need to be passed into the function as the probe set names (SNPID's)
    // are stored in the CHP file.

    status = false;
    FusionCHPMultiDataData *mchp = FusionCHPMultiDataData::FromBase(chp);
    if (mchp != NULL)
    {
        status = ExtractData(mchp);
        delete mchp;
        return status;
    }

    // The CHP file was read, but it is not of a type we wanted/expected. This will happen when other
    // CHP file parsers are compiled and linked into your application.

    delete chp;
    return false;
}
