GeneChip® Mouse Expression Set 430
What is a probe set? What do the different suffixes attached to a probe set name mean?
A probe set is a collection of probes designed to interrogate a given sequence. Each probe set is referred to by a probe set name, which takes one of the following forms:
12345_at or 12345_a_at or 12345_s_at or 12345_x_at
The last three characters (_at) identify the probe set strand. Probe sets that are designed to detect the anti-sense strand of the gene of interest are annotated with "_at".
The probe selection process can produce several different types of probe sets. Every type except the unique probe set is designated by an extension consisting of an underscore and a letter. These extensions are the _a, _s, and _x in the example above.
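The naming convention can be illustrated with a small parser. This is a minimal sketch that handles only the four name forms listed above; the function name `parse_probe_set_name`, the regular expression, and the returned field names are illustrative assumptions, and names on a real array may follow additional conventions that this pattern does not cover.

```python
import re

# Matches only the four forms shown above: <base>_at, <base>_a_at,
# <base>_s_at, <base>_x_at. Other naming forms are out of scope here.
PROBE_SET_NAME = re.compile(r"^(?P<base>.+?)(?:_(?P<type>[asx]))?_at$")

# Type labels as used in this section; no type letter means a unique probe set.
TYPE_LABELS = {
    None: "unique",
    "a": "gene family",
    "s": "identical",
    "x": "mixed",
}


def parse_probe_set_name(name):
    """Split a probe set name into its base, probe set type, and strand suffix."""
    match = PROBE_SET_NAME.match(name)
    if match is None:
        raise ValueError(f"unrecognized probe set name: {name!r}")
    return {
        "base": match.group("base"),
        "type": TYPE_LABELS[match.group("type")],
        "strand_suffix": "_at",
    }


for example in ("12345_at", "12345_a_at", "12345_s_at", "12345_x_at"):
    print(example, parse_probe_set_name(example))
```

A name that does not end in "_at" raises `ValueError` rather than being guessed at, since only the "_at" strand suffix is described here.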
Probes in a gene family probe set (_a set) all cross-hybridize to the same set of sequences, and those sequences belong to the same gene family (i.e., they have the same name in the "geneCluster" column). This probe set type is only created if the "geneCluster" column is included in the Instruction File and contains information.
Probes in a unique probe set do not cross-hybridize to any other sequences in the design (including any additional pruning sequences provided).
Probes in an identical probe set (_s set) all cross-hybridize to the same set of sequences that are used for the design (including any additional pruning sequences, if provided). These sequences are not defined as being from the same gene family for one of the following reasons: the values in the "geneCluster" column are different, or the gene family information is not provided.
A mixed probe set (_x set) contains at least one probe that cross-hybridizes with other sequence(s) used for the design. Cross-hybridizing probes have a cross-hybridization penalty applied to their raw probe scores, so that unique probes are favored over cross-hybridizing probes of the same quality.
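The effect of the cross-hybridization penalty can be illustrated with a toy ranking. The size and form of the actual penalty are not stated here, so the subtractive `CROSS_HYB_PENALTY` value, the `Probe` type, the probe names, and the scores below are all hypothetical. The sketch only shows that, at equal raw score, a unique probe ranks ahead of a cross-hybridizing one.

```python
from dataclasses import dataclass

# Hypothetical penalty; the real value and formula are not given in this text.
CROSS_HYB_PENALTY = 0.1


@dataclass(frozen=True)
class Probe:
    name: str
    raw_score: float
    cross_hybridizes: bool


def adjusted_score(probe):
    """Raw score, reduced by the penalty if the probe cross-hybridizes."""
    if probe.cross_hybridizes:
        return probe.raw_score - CROSS_HYB_PENALTY
    return probe.raw_score


def rank_probes(probes):
    """Order candidate probes from best to worst adjusted score."""
    return sorted(probes, key=adjusted_score, reverse=True)


# Two probes of equal raw quality, plus one weaker unique probe.
candidates = [
    Probe("p_cross", 0.9, True),
    Probe("p_unique", 0.9, False),
    Probe("p_low", 0.5, False),
]
ranked = [probe.name for probe in rank_probes(candidates)]
print(ranked)
```

With these made-up numbers the unique probe is preferred over the cross-hybridizing probe of equal raw score, while a clearly weaker unique probe still ranks last.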
The following diagram is a graphical representation of these different probe set types.
This diagram shows some of the possible relationships between sequences and probe sets on an array. The solid lines show how sequences or sequence sets (sequences from the same gene cluster) are represented by individual probes or probe sets. If a solid line is connected to a box, then it applies only to that particular circle.
For example, the line connecting the S1 circle to PS1 indicates that all of the probes in that probe set represent only that particular sequence, whereas the lines connected to PS2 indicate that all probes represent both sequences S2 and S3 in the G1 family (Gene Cluster 1). However, the probes do not represent S4 in the same G1 family. The lines connected to PS3 indicate that all probes in probe set 3 represent both sequences S4 and S5; however, the sequences are from different gene clusters (S4 is from G1, S5 is from G2). The lines connected to PS4 indicate that all probes represent S5 in G2; however, one probe (P12) also represents (cross-hybridizes to) sequences S6 and S7.
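The four probe set types can be derived from the relationships described in this example. The following is a minimal sketch, not the actual probe selection software: the function `classify_probe_set` and its rules are one reading of the definitions above, and every probe name except P12 is hypothetical, as is the number of probes per set. Gene clusters for S1, S6, and S7 are not stated, so they are left out of the mapping.

```python
def classify_probe_set(probe_hits, gene_cluster):
    """Classify a probe set from the sequences each of its probes represents.

    probe_hits:   dict mapping probe name -> set of sequence IDs it represents
    gene_cluster: dict mapping sequence ID -> gene cluster name (if provided)
    """
    if not probe_hits:
        raise ValueError("a probe set needs at least one probe")
    hit_sets = {frozenset(hits) for hits in probe_hits.values()}
    if len(hit_sets) > 1:
        # Probes disagree: at least one cross-hybridizes to extra sequences.
        return "mixed (_x)"
    (common,) = hit_sets
    if len(common) == 1:
        # Every probe represents the single target sequence only.
        return "unique"
    clusters = {gene_cluster.get(seq) for seq in common}
    if len(clusters) == 1 and None not in clusters:
        # All shared sequences carry the same gene cluster name.
        return "gene family (_a)"
    # Same sequences for all probes, but clusters differ or are missing.
    return "identical (_s)"


# Gene clusters as given in the example (S1, S6, S7 are not specified).
gene_cluster = {"S2": "G1", "S3": "G1", "S4": "G1", "S5": "G2"}

# Probe names other than P12 are placeholders for illustration.
probe_sets = {
    "PS1": {"probe_a": {"S1"}, "probe_b": {"S1"}},
    "PS2": {"probe_c": {"S2", "S3"}, "probe_d": {"S2", "S3"}},
    "PS3": {"probe_e": {"S4", "S5"}, "probe_f": {"S4", "S5"}},
    "PS4": {"probe_g": {"S5"}, "probe_h": {"S5"}, "P12": {"S5", "S6", "S7"}},
}

results = {
    name: classify_probe_set(hits, gene_cluster)
    for name, hits in probe_sets.items()
}
for name, kind in results.items():
    print(name, kind)
```

Under these assumptions PS1 is classified as unique, PS2 as a gene family set, PS3 as an identical set, and PS4 as a mixed set, which matches the relationships described for the diagram.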