Affymetrix - Help - EXON Glossary Terms Index
EXON Glossary

1. Overview – A condensed report of the current vital characteristics of a probe set, exon cluster or transcript cluster.

Probe Set ID – A set of synthetic oligonucleotide probes that interrogate gene expression from typically one gene. The probes in a probe set are designed to hybridize to a portion of an mRNA sequence and may cross-hybridize to one or more mRNAs that contain a similar sequence. All probe sets on an _st exon array, such as HuEx-1_0-st, are designed to anneal to the sense strand of the mRNA and therefore are located on the array in an antisense orientation.

There are three types of probe sets on exon arrays that are distinguished by their cross-hybridization potential (see glossary term Hybridization Target).

Exon Cluster – A group of probe sets that cover a transcript exon. The exon cluster is divided into more than one probe selection region when transcript evidence implies that a splice boundary or termination of transcription occurs within the exon.

Transcript Cluster – A consensus sequence of nucleotide bases, reflecting all the transcription evidence known for that genomic region on that strand. Transcript clusters are clusters of exons for which evidence demonstrates association within the same transcript. There are three levels of evidence, from the most reliable to the least reliable, which use the annotations Core, Extended, and Full, respectively (see glossary term Probe Set Grade).

Array – The GeneChip® probe array where the probe set is located. Go to http://www.affymetrix.com/support/index.affx for information about GeneChip® probe arrays.

Species – The target organism for the probe set. This may differ from the organism indicated by the array, because any given GeneChip® probe array may carry probe sets that contain sequences from other organisms. In particular, there are always control probe sets on an array, and these may come from different organisms. Some GeneChip® probe arrays are designed to interrogate more than one related organism or strain. For information about any specific product of interest, see the product information under Support on affymetrix.com.

Annotation Genome – The genome build version and date that the mRNA was last loaded from its source database.

Probe Set Location – The genomic location of the probe set for the genome build as described in the Probe Design Information section on the Full Record page. These coordinates begin at the first base of the first probe sequence and end at the last probe of the last probe set.

Number of Probes – The number of probes in this probe set or cluster. Probe sets on exon arrays contain only 1-4 probes, while exon or transcript clusters may contain up to hundreds of probe sets.

Probe Set Grade – Probe sets are graded according to the quality of evidence supporting the transcription of the genomic sequence they are designed to interrogate.

  • Core probe sets are supported with the most reliable evidence from RefSeq and full length mRNA GenBank records containing complete CDS information.
  • Extended probe sets come from other cDNA evidence. Extended evidence annotations include other human mRNA, EST sequences, ENSEMBL gene collections, syntenically mapped mRNA from Mouse, Rat and Human, mitoMap mitochondrial genes, microRNA registry genes, vegaGene, and vegaPseudoGene records.
  • Full probe sets come from computational gene prediction evidence only. They are supported by gene and exon prediction algorithms including GeneID, GenScan, GenScanSubOptimal, exoniphy, RNAGene, sgpGene and Twinscan.
  • Free probe sets are designed against annotations which were merged such that no single annotation (or evidence) contains the probe set.
  • Ambiguous probe sets cannot be unambiguously assigned to a particular transcript cluster.

Probe set grade may change over time as the public mRNA record evolves. For example, new full length mRNA records added to GenBank might promote the grade of one or more probe sets from Extended to Core.

Each probe set is graded according to the highest-confidence evidence that supports it. In order for a probe set to be labeled at the Core, Extended, or Full levels, it must be entirely contained within the bounds of an annotation at that level. For example, if half the probes of a probe set measure a Core annotation, but all of the probes measure an Extended annotation, then the probe set is labeled at the Extended level.

A probe set grade of Ambiguous results when two different genes have overlapping transcripts. A probe set that lies within this overlap region is given an Ambiguous level tag, since the gene it belongs to cannot be determined at design time. An exception is made for Core annotations, however. If a probe set is contained within the overlap region of multiple genes, but within the Core region of only one of the genes, then the probe set is labeled Core.

A probe set is labeled Free if the probe set is not contained in any annotations.
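The grading rules above can be sketched in a few lines of Python. This is an illustrative reading of the text, not the NetAffx implementation: the `Annotation` record, the `grade_probe_set` function, and the use of inclusive start/end coordinates are all assumptions made for the example.

```python
from dataclasses import dataclass

# Evidence levels ordered from most to least reliable.
LEVELS = ("Core", "Extended", "Full")


@dataclass(frozen=True)
class Annotation:
    gene: str   # hypothetical gene identifier
    level: str  # one of LEVELS
    start: int  # inclusive genomic start (assumed convention)
    end: int    # inclusive genomic end (assumed convention)


def grade_probe_set(start, end, annotations):
    """Return an illustrative grade for a probe set spanning start..end."""
    # Only annotations that contain the whole probe set count as support.
    containing = [a for a in annotations if a.start <= start and end <= a.end]
    if not containing:
        return "Free"

    genes = {a.gene for a in containing}
    if len(genes) > 1:
        # Overlapping genes: Core wins only if exactly one gene offers it.
        core_genes = {a.gene for a in containing if a.level == "Core"}
        return "Core" if len(core_genes) == 1 else "Ambiguous"

    # Single gene: take the most reliable level that contains the probe set.
    for level in LEVELS:
        if any(a.level == level for a in containing):
            return level
    raise ValueError("annotation with unknown evidence level")


# The Core annotation covers only part of the probe set, so it does not count.
half_core = [
    Annotation("geneA", "Core", 0, 15),
    Annotation("geneA", "Extended", 0, 30),
]
print(grade_probe_set(10, 20, half_core))  # Extended
```

The example call mirrors the case described above: the Core annotation covers only part of the probe set, so the probe set falls back to the Extended level.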

Additional details about the probe set grouping procedure are found in the whitepaper, Exon Probeset Annotations and Transcript Cluster Groupings, located at http://www.affymetrix.com/support/technical/whitepapers/exon_probeset_trans_clust_whitepaper.pdf.

2. Assigned mRNA Sequences – The mRNAs or gene predictions that, according to the computational sequence analysis of the NetAffx™ annotation pipeline, should be detected by the probe set or cluster. These assignments are made by compiling a non-redundant set of publicly available mRNA sequences from GenBank, RefSeq, and Ensembl and then using sequence alignment programs and other tools to associate them with probe sequences. Details of methods used by this assignment pipeline are available in a whitepaper, Transcript Assignment For NetAffx™ Annotations, on affymetrix.com.

The NetAffx™ transcript assignment methods derive a relationship between GeneChip® probe sets and the current public mRNA record. The number of mRNA sequences and Expressed Sequence Tags (ESTs) available in public databases continues to evolve from the original time of design. The NetAffx website maintains a current view of public mRNA sequences that GeneChip probe sets interrogate.

Accession ID (Source) – The NetAffx™ annotation pipeline draws on GenBank, RefSeq, Ensembl, and other public databases for transcript records. The unique identifier for the transcript is given here, with the source database in parentheses.

Gene Symbol – Gene names, IDs and symbols are extracted from Entrez Gene or UniGene. In some cases, specialty databases such as FlyBase, WormBase, and Saccharomyces Genome Database may provide the gene name.

Entrez GeneID – Gene names, IDs and symbols are extracted from Entrez Gene or UniGene. In some cases, specialty databases such as FlyBase, WormBase, and Saccharomyces Genome Database may provide the gene name.

Pathway – Displays the GenMAPP pathway if the transcript has been found to play a role in a proteome functional pathway in the GenMAPP collection.

The Pathway field offers links to two pathway databases from GenMAPP and KEGG. Either one of these pathways can be further visualized with probe set data overlaid by the GenMAPP application from genmapp.org.

Matched Probes – Shows the fraction of the probes in the probe group that match the transcript described. The numerator is the number of matches, while the denominator is the possible number of matches defined by the number of probes that fall within the alignment of the transcript to the Probe Selection Region, Exon Cluster Sequence, or Transcript Cluster Sequence as appropriate.
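As a worked illustration of the Matched Probes fraction, the sketch below counts the numerator and denominator for a small, made-up probe group. The function name `matched_probes`, the `(probe_id, start, end)` tuple layout, and the reading of "fall within the alignment" as full containment are assumptions for this example only.

```python
def matched_probes(probes, align_start, align_end, matching_ids):
    """Return (matches, possible) for one transcript alignment.

    probes       -- list of (probe_id, start, end) tuples, inclusive coordinates
    align_start  -- first base of the transcript alignment
    align_end    -- last base of the transcript alignment
    matching_ids -- ids of probes that match the transcript
    """
    # Denominator: probes lying entirely inside the aligned region.
    possible = [p for p in probes if align_start <= p[1] and p[2] <= align_end]
    # Numerator: those probes that also match the transcript.
    matches = [p for p in possible if p[0] in matching_ids]
    return len(matches), len(possible)


# Hypothetical probe group; p4 lies outside the alignment and is not counted.
probes = [("p1", 100, 124), ("p2", 130, 154), ("p3", 160, 184), ("p4", 400, 424)]
matches, possible = matched_probes(probes, 90, 200, {"p1", "p3"})
print(f"{matches}/{possible}")  # 2/3
```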

3. Design Information – A record of the evidence, which includes the probe set on the array, compiled by the design process. The current biological interpretation of the probe set is given in the remainder of the Details page.

Design Date – The date the chip was released.

Number of Probe Sets – Displays the number of probe sets in the exon or transcript cluster.

Hybridization Target – Describes the cross-hybridization potential of the probe set determined at the time of array design. This field is based on computational sequence alignment against all known and putatively transcribed array design content, which includes all potentially transcribed regions of the genome and other transcribed sequences that could not be mapped to the genome. This field has one of three possible values:

  • Unique – All probes in the probe set perfectly match only one sequence in the putatively transcribed array design content. The vast majority (>80%) of probe sets are unique.
  • Similar – All the probes in the probe set perfectly match more than one sequence in the putatively transcribed array design content.
  • Mixed – The probes in the probe set either perfectly match or partially match more than one sequence in the putatively transcribed array design content.
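
One way to read the three values is as a rule over per-probe hit counts. The sketch below is an interpretation, not the array design pipeline: it assumes each probe is summarized as a hypothetical `(perfect, partial)` pair giving the number of sequences in the design content that it matches perfectly and partially, and it treats every case that is neither Unique nor Similar as Mixed.

```python
def hybridization_target(hit_counts):
    """Classify a probe set from per-probe (perfect, partial) hit counts."""
    if not hit_counts:
        raise ValueError("probe set has no probes")
    # Unique: every probe perfectly matches exactly one sequence, and
    # (by assumption here) has no partial matches elsewhere.
    if all(perfect == 1 and partial == 0 for perfect, partial in hit_counts):
        return "Unique"
    # Similar: every probe perfectly matches more than one sequence.
    if all(perfect > 1 for perfect, _ in hit_counts):
        return "Similar"
    # Mixed: any other combination of perfect and partial matches.
    return "Mixed"


print(hybridization_target([(1, 0), (1, 0), (1, 0)]))  # Unique
print(hybridization_target([(2, 0), (3, 1)]))          # Similar
print(hybridization_target([(1, 0), (2, 0)]))          # Mixed
```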

Transcript Evidence Sequences – The number of nucleotide sequences that establish the transcript cluster. For a summary of the evidence used to make the exon array, see the Affymetrix Technical Note.

4. Related Probe Sets – Lists the other probe sets within the same exon cluster or transcript cluster.

Probe Selection Region – The genomic region from which individual probe sequences are selected.

Transcript Location – The genomic location of the transcript cluster for the genome build as described in the Probe Design Information section on the Full Record page. These coordinates begin at the first base of the first probe sequence and end at the last probe of the last probe set.
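
The coordinate rule can be illustrated with a short sketch that takes the span from the first base of the first probe to the last base of the last probe. The function name `cluster_span` and the data layout (a list of probe sets, each a list of inclusive `(start, end)` probe coordinates on one chromosome and strand) are assumptions for the example.

```python
def cluster_span(probe_sets):
    """Return (start, end) covering every probe in every probe set."""
    # Flatten the probe sets into one list of (start, end) probe coordinates.
    probes = [probe for probe_set in probe_sets for probe in probe_set]
    if not probes:
        raise ValueError("no probes supplied")
    first_base = min(start for start, _ in probes)
    last_base = max(end for _, end in probes)
    return first_base, last_base


# Two hypothetical probe sets in the same transcript cluster.
print(cluster_span([[(100, 124), (110, 134)], [(500, 524)]]))  # (100, 524)
```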

Probe Set Location – The chromosome name, genomic coordinates, and strand of the probe set. This information is based on either the version of the genome assembly used for the current NetAffx annotation release or the version of the genome assembly used at design time.

The transcript cluster sequence is an idealized consensus of all transcribed sequences associated with a gene and therefore may have several distinct public mRNA sequences associated with it.

Genome Source – The version of the genome used to design the array. See also the Annotation Genome field which describes the genome version related to the annotations on the Details page.

Probe Selection Region (PSR) Evidence – A table of the kinds of evidence that support transcription of the underlying genomic sequence and that were used to define boundaries for the probe sets within an exon cluster or transcript cluster.

The searchable evidence classes for the Human Exon Array include:

  • ncbi
  • fl
  • mrna
  • ensGene
  • est
  • est-fl
  • mouse-fl
  • mouse-mrna
  • rat-fl
  • mitomap
  • microRNAregistry
  • vegaGene
  • vegaPseudoGene
  • geneid
  • genscan
  • genscanSubOpt
  • exoniphy
  • rnaGene
  • sgpGene
  • twinscan

Exon Cluster Sequence Evidence – The number of sequences, from all sources, used as evidence in the design of the exon cluster. See Probe Selection Region (PSR) Evidence for details.

Transcript Cluster Sequence Evidence – The number of sequences used as evidence from all sources for the design of the transcript cluster. See Probe Selection Region (PSR) Evidence for details.

Overlapping Probes – Actual probe sequences that overlap in genomic sequence, when a probe selection region is small.

Probe Self Cross-Hybridizes – Describes whether or not a probe was found at the design date that would cross-hybridize to a position within this transcript cluster.

Probe Cross-Hybridizes – Describes whether or not a probe was found at the design date that would hybridize to a transcript outside of this transcript cluster.

PSR Evidence – Details the source of the evidence sequences compiled to produce this probe set.

Evidence Sequences – The number of nucleotide sequences compiled as evidence for the probes described. The evidence may have been cDNA, mRNA, EST or predicted gene sequences. For full information on the exon array design see the Exon Array Design Technical Note.

Overlaps Coding Sequences – Indicates whether the probe set overlapped the coding sequence of the transcript as of the last genome build update.

5. Sequence – Gives the related sequence data. For probe sets, the target sequence is provided. For exon clusters, the exon sequence is the contiguous nucleotide sequence that spans all the constituent probe selection regions. For transcript clusters, the transcript cluster sequence is the nucleotide sequence of all its constituent exon clusters. The transcript cluster sequence is not necessarily a contiguous genomic sequence.
