Stanford MicroArray Database

 

Help : Meebo/Heebo Array Quality Help*


Contents


  • Introduction

    This program provides a set of array quality graphs specifically designed for MEEBO/HEEBO arrays. For this it uses the arrayQuality R package developed at UCSF.
    Users can choose to generate any of three kinds of plots as needed: diagnostic plots, doping control plots, and mismatch/tiling control plots. In any case, the doping control performance plots and the MEEBO/HEEBO set quality plots should be interpreted with care; they will not be informative if the hybridization quality is poor.
  • MEEBO/HEEBO controls

    The MEEBO/HEEBO set integrates a large collection of control features corresponding to endogenous mouse/human transcripts as well as a diverse array of over 200 spiked-in doping control RNAs. These features can be used to examine in detail the performance of any given hybridization, based on criteria such as sensitivity, specificity, dynamic range, and linearity. The control features also help in detecting possible hybridization or labeling biases. This section describes the properties and design of the control features that are used to generate the quality plots in the package. For more details, please refer to the MEEBO/HEEBO documentation:

    http://alizadehlab.stanford.edu/
    http://www.arrays.ucsf.edu/meebo.html

    Control types used for plots:

    • Doping controls (oligo_id: mCDnnnnnn): Probes that recognize spike-in transcripts from Methanococcus and B. subtilis (Stanford) and from commercial suppliers (Affymetrix, Ambion and Stratagene).
    • Mismatch controls (oligo_id: mCMnnnnnn): These controls were designed against 5 selected spike-in transcripts and 5 positive control mouse genes. Probes that perfectly match their target sequences are called Perfect Match probes (PM), and probes that contain point mutations are called Mismatch probes (MM). Each PM probe is replicated between 8 and 23 times on the array. For each PM, the corresponding MMs have the following features:
      • a wide range of numbers of point mutations: 1, 3, ..., 63;
      • mutations are either located at the extremities (anchored MM) or distributed evenly (distributed MM);
      • each MM is replicated 3 times.
    • Negative controls (oligo_id: mCNnnnnnn): Randomized 70mers, selected not to recognize mouse transcripts.
    • Positive controls (oligo_id: mCPnnnnnn): Several kinds of positive controls are included:
      • Ubiquitin C probes as sector-corner placed PMT aids, assuming that sector widths are 28 or 29 spots (192 replicates)
      • Normalization genes: 10 mouse "housekeeping" genes, 20 copies of each, based on Vandesompele et al., Genome Biol. 2002 3(7):RESEARCH0034. They are all gathered under Positive controls in the diagnostic plots at the moment.
    • Tiling controls (oligo_id: mCTnnnnnn): Series of probes designed to recognize sequences at varying distances from the 3' end (used to assess 3' bias): 11 mouse genes and selected spike-in transcripts.
    • Mouse constitutive exonic oligos (oligo_id: mMCnnnnnn): These oligos are included in the quality plots in order to compare the expression level of control probes to the expression level of "real genes".
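
    The oligo_id patterns listed above can be used to sort spots by control type. The following is a minimal sketch, not part of the arrayQuality package: the function name and the mapping labels are hypothetical, it assumes that "nnnnnn" stands for exactly six digits, and it covers only the mouse (MEEBO) prefixes shown in this section.

```python
import re

# Prefixes taken from the oligo_id patterns listed in this section (MEEBO set).
CONTROL_TYPES = {
    "mCD": "doping",
    "mCM": "mismatch",
    "mCN": "negative",
    "mCP": "positive",
    "mCT": "tiling",
    "mMC": "constitutive exonic",
}

# Assumption: "nnnnnn" in the documentation means exactly six digits.
_OLIGO_ID = re.compile(r"^(m[CM][A-Z])(\d{6})$")


def control_type(oligo_id):
    """Return the control type for a MEEBO oligo id, or None if unrecognized."""
    match = _OLIGO_ID.match(oligo_id.strip())
    if match is None:
        return None
    return CONTROL_TYPES.get(match.group(1))
```

    For example, control_type("mCD000123") returns "doping", while an id that does not follow one of the listed patterns returns None.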

  • How to use the Program

    This program can be used to produce quality plots for MEEBO/HEEBO arrays, either for an experiment stored in SMD, for a list of experiments contained in a result set list, or for an uploaded gpr file. It takes a few minutes per experiment to produce the plots, so the job is placed in the job queue and, upon completion, the user receives an email with a link to the plots. If the program was run for an experiment stored in SMD, the latest set of quality plots is archived and can be viewed from the View Details page.

    Figure 1 shows the webpage when it is called from the main page (from lists menu -> all programs -> tools section: ArrayQuality Plots for meebo slides). The program can also be reached from the View Details page for an individual experiment (from MyData menu -> Display My Data -> view details icon for an experiment); in that case some of the options will be missing and others will be pre-selected.
    Options:

    • Upload gpr file: This file has to be a valid gpr file and - to produce the graphs - under the 'Name' column, it must contain the oligo ids for the controls. (Option not shown when accessed from View Details page.)
    • Select Result Set List From Loader Account: Graphs can be produced in batch for a set of experiments/result sets. All experiments/result sets must belong to the same print design. (Option not shown when accessed from View Details page; the name of the selected experiment is shown instead.)
    • Select Print: The print design used for printing the slide. (Pre-selected when accessed from View Details page for a single experiment.)
    • Select Doping Control: Select the doping control mix added to the experiment. (Pre-selected when accessed from View Details page for a single experiment.)
    • Normalization Method: Select the normalization method to use when creating some of the plots ('none' is a valid option). The normalization methods available here are those of the LIMMA BioConductor package. This normalization is independent of any normalization performed on the experiment and stored in SMD: its result is not saved in the database and is used only for producing the quality plots. Normalizations that can be stored in the database are performed in a separate step.
    • Background Correction Method: Choose the background correction method used to produce some of the quality graphs. The result of this background correction is not saved in the database. The background correction methods available here are those of the LIMMA BioConductor package. A separate help document describes the background correction methods that can be stored in the database.
    • Select Plot Type: Select the type of plot you would like to analyze.
    Figure 1. Options for making quality plots for MEEBO/HEEBO arrays.
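
    Before uploading a gpr file, it can be useful to check that its 'Name' column really contains control oligo ids, as required by the "Upload gpr file" option. The sketch below is an illustration only and is not the check performed by SMD. It assumes the usual ATF layout of gpr files (first line starting with "ATF", second line giving the number of optional header records, then the tab-delimited column titles); the function names and the default prefix list are hypothetical.

```python
import csv
import io


def gpr_names(text):
    """Return the values of the 'Name' column of a gpr (ATF) file given as text."""
    lines = text.splitlines()
    if len(lines) < 3 or not lines[0].startswith("ATF"):
        raise ValueError("not an ATF/gpr file")
    # Second line: number of optional header records, then number of columns.
    n_header = int(lines[1].split()[0])
    table = lines[2 + n_header:]
    reader = csv.DictReader(io.StringIO("\n".join(table)), delimiter="\t")
    if reader.fieldnames is None or "Name" not in reader.fieldnames:
        raise ValueError("gpr file has no 'Name' column")
    return [(row["Name"] or "") for row in reader]


def has_control_ids(names, prefixes=("mCD", "mCM", "mCN", "mCP", "mCT")):
    """True if at least one Name value starts with a known control prefix."""
    return any(name.startswith(prefixes) for name in names)
```

    A file for which has_control_ids(gpr_names(text)) is False would not be expected to yield informative control plots.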
  • Description of quality plots and examples

    There are three types of quality plots produced for MEEBO/HEEBO arrays:
    • A diagnostic plot that includes several statistics and exploratory plots and provides a quick graphic insight on the quality of the array.
    • Doping controls quality plots: these plots show the performance of the doping controls that were added to the hybridization mix, and compare them to expected results.
    • Mismatch and tiled controls plots: these plots are designed to show the specificity of the MEEBO set and to demonstrate amplification bias toward the 3' end of the transcripts. These plots can be used as a MEEBO/HEEBO set quality check rather than hybridization quality assessment.
    • Diagnostic plots

      Figure 2 shows an example of the diagnostic plot for a good hybridization.
      • Plot 1: MA-plot of raw intensities. No background subtraction is performed. The colored lines represent the loess curves for each print-tip group. Red dots highlight any spot whose corresponding weight value is less than 0. Users can create their own weighting scheme or function. Things to look for in an MA-plot are saturation of spots and the trend of the loess curves, which indicates the amount of normalization to be performed.
      • Plot 2: MA-plot of normalized data density. By default, print-tip loess normalization is used. Instead of the typical MA-plot, the "hexbin" package is used to highlight the density of dots on the MA-plot. A light yellow color indicates a high density of dots, whereas a blue color represents a lower density. This plot shows where the bulk of the data lies in terms of intensity (low/high signal).
      • Plot 3: Spatial plot of rank of raw M values (no background subtraction): Each spot is ranked according to its M value. A blue to yellow color scale is used, where blue represents the higher rank (1), and yellow represents the lower one. Missing spots are represented as white squares. This is a quick way to visually detect uneven hybridization and missing spots.
      • Plot 4: Spatial plot of ranks of normalized M values. Same color scale as in Plot 3. In addition, flagged spots are highlighted by a black square. This type of graphical representation helps verify that normalization removed any spatial effects.
      • Plot 5: Spatial plot of raw A values. The color indicates the strength of the signal intensity, i.e. the darker the color, the stronger the signal. Missing spots are represented in white.
      • Plot 6: Histogram of the signal-to-noise log-ratio (SNR) for the Cy5 and Cy3 channels. The mean and the variance of the signal are printed on top of the histogram. In addition, density curves of the SNR, stratified by control type (status), are overlaid. Their color schemes are provided in Table 1. The SNR is a good indicator of dye problems. The density lines for negative and empty controls should be close to each other, almost superimposed.
      • Plot 7: Dot plot of controls normalized M values. Controls with more than 3 replicates are represented on the Y-axis; the color scheme is given in Table 1. Control M values should be tight and close to 0.
      • Plot 8: Dot plot of controls A values, without background subtraction. Controls with more than 3 replicates are represented on the Y-axis, the color scheme is represented in Table 1. Intensity of positive controls should be in the high-intensity region, negative and empty controls should be in the lower intensity region. Positive controls range and negative/empty controls range should be well separated.
      Figure 2. Diagnostic plots for MEEBO/HEEBO arrays
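
      The diagnostic plots above are expressed in terms of M and A values. As a reminder, the sketch below computes them for a single spot using the standard two-color definitions (M is the log2 ratio of the two channels, A is the average log2 intensity). It is an illustration under those assumptions, not code from the arrayQuality package; the function name is hypothetical, and returning None for non-positive intensities is a choice made for this sketch.

```python
import math


def ma_values(red, green):
    """Return (M, A) for one spot from its Cy5 (red) and Cy3 (green) intensities."""
    if red <= 0 or green <= 0:
        # log2 is undefined here; this sketch treats the spot as missing.
        return None
    m = math.log2(red / green)                        # log-ratio
    a = 0.5 * (math.log2(red) + math.log2(green))     # average log-intensity
    return m, a
```

      With this definition, a spot with equal signal in both channels has M = 0, which is why tightly clustered control M values close to 0 are the expected pattern in Plot 7.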
    • Plots using mismatch controls

      • Signal intensity vs. binding energy: Figure 3

        This is a boxplot of normalized raw signal intensity for all mismatch controls and their associated wild-type oligos, binned by binding energy.

        Filename: BindingEnergy.SLIDENAME.png

        • Raw signal intensity: (Red Foreground) + (Green Foreground), background corrected as specified.
        • Filtering: controls with low expression levels are removed from the plot. For a set of mismatch probes to be included, the median raw intensity of its wild-type controls must be greater than the 75th percentile of the intensity for the whole array.
        • Normalization: the normalized raw intensity for each mismatch oligo is obtained by dividing its raw intensity by the median raw intensity of the associated wild-type probes.

        Figure 3. Signal intensity vs. binding energy
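
        The filtering and normalization rules described for the binding energy plot can be sketched as follows. This is an illustration of the stated rules only, not the arrayQuality implementation: the function names are hypothetical, and the percentile is computed by simple linear interpolation, which may differ slightly from the quantile method used by the package.

```python
import statistics


def percentile(values, q):
    """q-th percentile (0-100) by linear interpolation between sorted values."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def normalize_mismatch_set(wt_raw, mm_raw, array_raw):
    """Return normalized MM intensities, or None if the set is filtered out.

    wt_raw:    raw intensities (red + green foreground) of the wild-type probes
    mm_raw:    raw intensities of the associated mismatch probes
    array_raw: raw intensities of every spot on the array
    """
    wt_median = statistics.median(wt_raw)
    if wt_median <= percentile(array_raw, 75):
        return None  # low-expression set: excluded from the plot
    # Each mismatch intensity is expressed relative to the wild-type median.
    return [value / wt_median for value in mm_raw]
```

        A normalized value of 1 therefore means that a mismatch probe is as bright as the median of its wild-type probes, and values well below 1 indicate loss of signal caused by the mismatches.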
      • Signal intensity vs. percentage of mismatch: Figure 4

        Scatter plot of the log intensity vs. percentage of mismatched bases for each of the 10 transcripts with mismatch probes.

        Filename: Mismatch.SLIDENAME.png

        Anchored and distributed mismatch controls are represented separately, and the loess line for each type is overlaid on the scatter plot. Each plot also includes the boxplot of log intensity for the associated WT probes and the boxplot of log intensity for negative controls (red boxes at the left and right ends of the figures, respectively). The right axis represents the percentiles of A values for the array in question, and the 50th, 75th and 90th percentiles of A values are highlighted in red.

        Figure 4. Signal intensity vs. percentage of mismatch
      • Signal intensity vs. 3' distance: Figure 5

        For each of the 11 tiling controls, scatter plot of the raw log-intensity vs. 3' distance, for both channels.

        Filename: Tiling.SLIDENAME.png

        Figure 5. Signal intensity vs. 3' distance
    • Plots using doping controls

      • Cy5 raw signal intensity vs. Cy3 raw signal intensity (log2 scale): Figure 6

        Scatter plot of raw Cy5 signal intensity over the raw Cy3 signal intensity (background correction is performed if requested) for all spiked doping-controls, colored by expected ratio.

        Filename: Spike.Cy5vsCy3.SLIDENAME.png

        A doping control will not be used if the corresponding Cy5 mass column is empty. No filtering is performed. Expected result: spots with the same expected ratio should form a straight line parallel to the diagonal. Controls with large negative log-ratio values are expected in the upper left corner of the graph, and controls with large positive log-ratios in the lower right corner.

        Figure 6. Cy5 raw signal intensity vs. Cy3 raw signal intensity (log2 scale)
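
        The reason spots with the same expected ratio line up parallel to the diagonal is that, on the log2 scale, a fixed ratio becomes a fixed offset between the two channels. The sketch below illustrates this. It assumes that the expected ratio of a doping control is the Cy5 mass divided by the Cy3 mass, which this page does not state explicitly; the function names are hypothetical.

```python
import math


def expected_log_ratio(cy5_mass, cy3_mass):
    """log2(Cy5 mass / Cy3 mass), or None if the control cannot be used."""
    if cy5_mass in (None, "") or cy3_mass in (None, ""):
        return None  # empty mass column: the control is skipped
    cy5, cy3 = float(cy5_mass), float(cy3_mass)
    if cy5 <= 0 or cy3 <= 0:
        return None
    return math.log2(cy5 / cy3)


def expected_log_cy3(log_cy5, log_ratio):
    """Cy3 log-intensity expected for a given Cy5 log-intensity and log-ratio.

    For a fixed log_ratio this is a straight line parallel to the diagonal.
    """
    return log_cy5 - log_ratio
```

        Under this assumption, a control spiked at 40 units in Cy5 and 10 units in Cy3 has an expected log-ratio of 2, so its spots should sit two log2 units away from the diagonal.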
      • Scatter plot of observed log-ratios: Figure 7

        For each spiked doping control, the plot shows the observed log-ratios (black) of each replicate with the expected log-ratios overlaid on top (red).

        Filename: Spike.MM.Scatter.SLIDENAME.png

        Figure 7. Scatter plot of observed log-ratios
      • Observed ratio vs. expected ratio: Figure 8

        For each type of doping controls, the plot shows the observed log-ratio vs. the expected value for each probe (letters) as well as the median observed log-ratios vs expected ratios (colored symbols). Log-ratios are shown after background correction and normalization.

        Filename: Spike.MMplot.SLIDENAME.png

        If there are more than 16 spiked-in doping controls of one type (usually for MJs), only 1 color is used and the median of replicated probes is not printed. No legend is shown on the right in this case.

        Figure 8. Observed ratio vs. expected ratio
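
        The median observed log-ratio plotted against the expected value for each doping control can be computed as in the sketch below. It is an illustration only: the function name and the input layout (one tuple per replicate spot) are hypothetical, and the log-ratios are assumed to be already background corrected and normalized.

```python
import statistics
from collections import defaultdict


def median_observed(spots):
    """Group replicate spots by control id.

    spots: iterable of (control_id, expected_log_ratio, observed_log_ratio)
    Returns {control_id: (expected_log_ratio, median observed log-ratio)}.
    """
    observed = defaultdict(list)
    expected = {}
    for control_id, exp, obs in spots:
        observed[control_id].append(obs)
        expected[control_id] = exp
    return {
        cid: (expected[cid], statistics.median(values))
        for cid, values in observed.items()
    }
```

        For a well-behaved hybridization, the median observed value of each control would be expected to lie close to its expected log-ratio.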
      • Sensitivity: Figure 9

        For each type of doping control, the plot shows a boxplot of the raw signal intensity vs. the mass of the spiked-in doping control (in log2 scale, for each channel).

        Filename: Spike.Sensitivity.FILENAME.png

        Figure 9. Sensitivity
      • Sensitivity of each individual spike: Figure 10

        Boxplot of raw signal intensity (log2 scale) for each doping control, ordered by increasing mass. Doping controls are separated by spike types. A boxplot of the log2 signal intensity of negative controls and signal intensity quartiles are provided on each graph to indicate scale.

        Filename: Spike.Sensitivity.Indi.FILENAME.

        Figure 10. Sensitivity of each individual spike
    * Based on document from: Agnes Paquet, Yuanyuan Xiao, (Jean) Yee Hwa Yang, Andrea Barczak, David Erle (November 28, 2005)


    Please send comments or questions to: array@genome.stanford.edu