In addition to masking out probes a priori, feature selection methods such as the PCA Feature Selection method may prove useful as an approach that does not rely on a priori knowledge.
In the current version of APT (as of 1.8.1) the --kill-list option will remove any MM probes associated with a PM probe which has been removed.
The first step is to create the probe mask file. The probe mask file is a tab-separated text file which must contain two columns: probeset_id and probe_id. The following is an example of a valid probe mask file:
probe_id	probeset_id
14	2561152
17	2400195
87	2448973
92	2985267
102	3011834
106	2822668
107	2798328
111	2604445
156	3077098
...
The probeset_id column is either the numeric probeset_id (if using PGF/CLF file) or the alphanumeric probeset name (if using a CDF file). In either case the column name is probeset_id.
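A minimal sketch of how such a file could be generated is shown here, assuming the probes to mask are already available as (probe_id, probeset_id) pairs. The function name write_probe_mask is hypothetical and is not part of APT; only the column names and the tab-separated layout come from the description above.

```python
import csv
import io


def write_probe_mask(pairs, handle):
    """Write (probe_id, probeset_id) pairs as a tab-separated probe mask file.

    `pairs` is any iterable of 2-tuples. `handle` is a writable text handle.
    This helper is illustrative only and is not part of APT.
    """
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    # Header row: the column names must be probe_id and probeset_id.
    writer.writerow(["probe_id", "probeset_id"])
    for probe_id, probeset_id in pairs:
        writer.writerow([probe_id, probeset_id])


# Usage: build the first two rows of the example file in memory.
buf = io.StringIO()
write_probe_mask([(14, 2561152), (17, 2400195)], buf)
print(buf.getvalue(), end="")
```

To write to disk instead, pass a file opened with open("my-mask.probe_mask", "w", newline="") as the handle.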
Alternatively, the probe mask file can contain an "x" and a "y" column (0-based). If both x/y columns and a probe_id column are present, the two are checked for consistency.
See the FAQ item on probe IDs for more info.
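The consistency check between x/y and probe_id can be approximated before running APT. The sketch below assumes probes are numbered sequentially across the array, so that probe_id = y * num_cols + x + 1 with 0-based x and y. That formula and the names probe_id_from_xy, inconsistent_rows, and num_cols are assumptions made for illustration; confirm the numbering for your array against the FAQ item on probe IDs and the library files.

```python
def probe_id_from_xy(x, y, num_cols):
    """Return the probe ID for 0-based (x, y), ASSUMING sequential numbering."""
    return y * num_cols + x + 1


def inconsistent_rows(rows, num_cols):
    """Return the rows whose probe_id disagrees with their x/y position.

    `rows` is a list of dicts with integer keys "x", "y" and "probe_id".
    `num_cols` is the number of columns on the array (taken from the
    library files; the value used below is a toy value, not a real array).
    """
    return [
        row for row in rows
        if probe_id_from_xy(row["x"], row["y"], num_cols) != row["probe_id"]
    ]


# Usage with a hypothetical 10-column array.
rows = [
    {"x": 3, "y": 1, "probe_id": 14},  # 1 * 10 + 3 + 1 == 14, consistent
    {"x": 6, "y": 1, "probe_id": 18},  # 1 * 10 + 6 + 1 == 17, inconsistent
]
print(inconsistent_rows(rows, num_cols=10))
```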
WARNING: In earlier APT versions (<=1.8.0) the masking functionality only works correctly for probes present in a single probeset. If a probe appears in more than one probeset and you list it in the mask file once for each probeset, you will most likely get an error. If you list it only once in the mask file, you will get an error during the library file read when the probe is encountered in the other probesets.
WARNING: Also note that in earlier APT versions (<=1.8.0) probesets and meta probesets which were completely masked out would result in a runtime error. With version 1.8.1+, a warning is reported rather than an error.
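Because older versions reject a mask file in which a probe ID is listed more than once, it can help to scan the file for repeats before running. The following is a rough sketch that assumes the two-column, tab-separated layout described above; the function name repeated_probe_ids is hypothetical and not part of APT.

```python
import csv
from collections import defaultdict


def repeated_probe_ids(mask_lines):
    """Map each probe_id listed more than once to the probesets it is listed under.

    `mask_lines` is an iterable of text lines (an open file works), with a
    header row naming the probe_id and probeset_id columns.
    """
    reader = csv.DictReader(mask_lines, delimiter="\t")
    listed = defaultdict(list)
    for row in reader:
        listed[row["probe_id"]].append(row["probeset_id"])
    # Keep only the probe IDs that occur on more than one line.
    return {pid: sets for pid, sets in listed.items() if len(sets) > 1}


# Usage: probe 14 is listed under two probesets, probe 17 only once.
example = [
    "probe_id\tprobeset_id\n",
    "14\t2561152\n",
    "14\t2400195\n",
    "17\t2400195\n",
]
print(repeated_probe_ids(example))
```

Values are returned as strings because the file is read as text, which also accommodates alphanumeric probeset names from CDF files.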
Finally we can run apt-probeset-summarize with our new files:
apt-probeset-summarize \
-a rma-sketch \
-p HuEx-1_0-st-v2.r2.pgf \
-c HuEx-1_0-st-v2.r2.clf \
-b HuEx-1_0-st-v2.r2.antigenomic.bgp \
-o results \
--kill-list my-mask.probe_mask \
*.CEL
So in short:
Q. I get the following error when using a probe mask file:
Can't find probe set with name: [probeset id]
A. You are probably using an older version of APT. There are probesets (as listed in the PGF file) which are completely masked out in the probe mask file. With older versions of APT (<=1.8.0) you have to manually remove these probesets from the probeset list or meta probeset file to prevent this failure.
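Finding the probesets that need to be removed by hand can be sketched as follows. The example assumes the probeset-to-probe mapping has already been extracted from the library files into a dictionary; that extraction step is not shown, and the names layout and fully_masked_probesets are hypothetical rather than part of APT.

```python
def fully_masked_probesets(layout, mask_pairs):
    """Return the probeset IDs whose probes are all present in the mask.

    `layout` maps probeset_id -> set of probe_ids (assumed to have been
    built from the PGF or CDF file beforehand).
    `mask_pairs` is an iterable of (probe_id, probeset_id) tuples, one per
    line of the probe mask file.
    """
    masked = set(mask_pairs)
    return sorted(
        probeset_id
        for probeset_id, probes in layout.items()
        if probes and all((probe_id, probeset_id) in masked for probe_id in probes)
    )


# Usage with toy IDs: probeset 100 is completely masked, probeset 200 is not.
layout = {100: {1, 2}, 200: {2, 3}}
mask_pairs = [(1, 100), (2, 100), (2, 200)]
to_remove = fully_masked_probesets(layout, mask_pairs)
print(to_remove)

# Probesets to keep in the probeset list passed to an older APT version.
keep = [ps for ps in layout if ps not in to_remove]
print(keep)
```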
Q. I get the following error when using a probe mask file:
Probe id: [probe id] specified twice in file: [my mask file]
A. You are probably using an older version of APT. In older versions of APT (<=1.8.0) a probe ID can only be listed once in the mask file, so in those versions there is no way to mask out a probe which is present in more than one probeset using the mask file. Newer versions of APT take into account both the probe ID and the probeset ID when masking out probes.
Q. I get the following error when using a probe mask file:
ChipLayout::readPgfFileKillList() - Expecting probe with id: [probe id] to be in probeset '[probeset id]' not probeset '[different probeset id]'
A. Your probe mask file contains a probe which is present in more than one probeset. In older versions of APT (<=1.8.0) a probe ID can only be listed once in the mask file. There is no way to mask out a probe which is present in more than one probeset using the mask file.
Affymetrix Power Tools (APT) Release apt-1.10.1