
How to design a successful microarray experiment

by Rainer Breitling

Plant Science Group and
Bioinformatics Research Centre
Institute of Biomedical and Life Sciences (IBLS)
University of Glasgow
Glasgow G12 8QQ
United Kingdom

Present address:
Groningen Bioinformatics Centre
University of Groningen, The Netherlands

R.Breitling@rug.nl
http://gbic.biol.rug.nl/~rbreitling


This page tries to answer some of the basic questions to consider when planning a microarray experiment for the first time. The questions are a highly subjective selection of issues identified during a meta-analysis of microarray experiments performed at the Sir Henry Wellcome Functional Genomics Facility in collaboration with Pawel Herzyk, and at the Molecular Plant Sciences group in collaboration with Anna Amtmann and Patrick Armengaud.

  1. Are microarray experiments more difficult to design than other studies?
  2. No, but... Microarrays are still quite expensive to perform, so you will want to do them properly from the first step on. Also, a single microarray hybridization can generate as many data points as several "classical" Ph.D. theses, so expectations of the results will be particularly high. Failure to interpret the data due to incorrect design can be particularly embarrassing.

  3. What is a good microarray experiment?
  4. Just as there are many uses for classical experimental techniques, there are many ways to exploit microarrays. One important consideration is that you should compare samples that are similar. Don't try to maximize the number of differentially expressed genes. Some special cases:

  5. Do I need statistical advice for my study design?
  6. Microarray experiments are biological experiments, so the most important considerations will be biological. Especially in simple cases, where you want to use microarrays as a comprehensive Northern blot, microarray-specific statistical issues can be of secondary importance at the early stages. It will, however, be very useful to involve a statistician in the analysis/interpretation process to prevent some common pitfalls, such as underestimating the "multiple testing problem" involved in examining thousands of genes at once (see below: How do I analyze my results?). And of course you should always be aware of the basic statistical and philosophical issues involved in any successful experimental design.
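
To make the "multiple testing problem" concrete, here is a minimal sketch in Python. It first computes how many genes would be expected to pass a naive p < 0.05 cutoff when no gene truly changes, and then applies the Benjamini-Hochberg procedure, one common way of controlling the false discovery rate. The gene count, the p-values and the function names are made up for illustration; this is not the procedure used at the SHWFGF.

```python
def expected_false_positives(n_genes, alpha):
    """Expected number of genes passing p < alpha if no gene truly changes."""
    return n_genes * alpha


def benjamini_hochberg(p_values, fdr=0.05):
    """Return one boolean per p-value: True if the test passes the
    Benjamini-Hochberg false discovery rate threshold."""
    m = len(p_values)
    # Indices of the p-values, from smallest to largest p-value
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose p-value satisfies p_(k) <= (k / m) * fdr
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank
    # Every test ranked at or below the cutoff is called significant
    significant = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            significant[idx] = True
    return significant


if __name__ == "__main__":
    # Hypothetical array with 20000 genes, tested one by one at p < 0.05
    print(expected_false_positives(20000, 0.05))
    # Hypothetical p-values for six genes
    p = [0.0001, 0.004, 0.03, 0.04, 0.2, 0.6]
    print(benjamini_hochberg(p))
```

In this made-up example four of the six p-values lie below 0.05, but only the two smallest survive the correction.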

  7. What kind of microarrays should I use?
  8. How many replicates do I need?
  9. As many as possible! For an exploratory analysis 3 replicates are usually sufficient, unless the data are particularly noisy (e.g. samples from very small numbers of cells) or the expected effect is particularly small (e.g. changes occur only in very few, specialized cells in the sample). Using fewer than 3 replicates is not a good idea. Most important is the use of real replicates. Use equal numbers of replicates for each condition/comparison to keep the later analysis simple.
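
As a rough illustration of why more replicates help, and why the gain flattens out, the sketch below computes the standard error of a mean log-ratio for different numbers of replicates. It assumes independent replicates with the same standard deviation; the value of 0.5 log2 units is purely hypothetical and not taken from any real data set.

```python
import math


def standard_error(sd, n_replicates):
    """Standard error of the mean of n independent replicates
    that share the same standard deviation sd."""
    return sd / math.sqrt(n_replicates)


if __name__ == "__main__":
    sd = 0.5  # hypothetical replicate-to-replicate SD, in log2 units
    for n in (2, 3, 6, 12):
        print(n, round(standard_error(sd, n), 3))
```

Under these assumptions, halving the uncertainty requires four times as many replicates, which is why noisy samples or small expected effects call for more than the usual three.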

  10. What is a real replicate?
  11. A real (or biological) replicate is an independent sample in which all the variables that a colleague in another lab couldn't control are allowed to vary. You want to report only observations that are general and reproducible. In an imaginary "ideal" experiment, each replicate would be performed in a different lab, so it may be advisable to approximate that situation as much as possible. That does not mean that you have to vary all the variables if you are certain that some of them won't have any effect, e.g. the brand of standard chemicals, the phases of the moon, etc. But don't underestimate the sensitivity of microarrays: variables like batch of cells or time of day can very well have an observable effect. Most of all, be careful to prepare a perfectly matched control for every sample, even if you are going to use single-color arrays.

  12. Should I do technical replicates?
  13. No, unless you are planning a technical instead of a biological study. Repeated hybridization of the same biological sample is a waste of resources. Microarrays are reliable and it has been shown repeatedly that this kind of replication doesn't provide any biologically useful information.

  14. Should I do dye-swap experiments?
  15. No. See "Should I do technical replicates?". But of course you can use reverse labelling for some of your biological replicates if you feel like it.

  16. Should I pool samples?
  17. It is tempting to pool samples to save hybridization costs. Unless you do single-cell sampling, every sample is already a pool, as it contains mRNA from many cells. Especially if the number of cells obtained from each individual is very small, pooling is the best way of reducing the noise while keeping the number of hybridizations reasonably small. Unless you expect to find interesting inter-individual variations, e.g. in a medical study, there is little to argue against pooling. However, it is important to pool the biological material (tissue, cells), not the purified RNA or labeled cDNA! In this way, problems are far easier to spot. Don't ever include any sample that looks suspicious.

  18. How do I analyze my results?
  19. At the SHWFGF we have recently introduced two simple new statistical techniques (RankProducts [RP] and iterative GroupAnalysis [iGA]) that facilitate and enhance the interpretation of microarray experiments. Both methods provide rigorous significance estimates for your observations and perform considerably better than previous techniques, particularly for the small and noisy data sets that are often produced in biological experiments. The standard (recommended) analysis procedure used at the SHWFGF is described here and software is available for download at the GlaMA website.
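
The core idea behind a rank product can be sketched in a few lines of Python: rank the genes within each replicate, then take the geometric mean of each gene's ranks across replicates. The code below is a simplified illustration only. It is not the SHWFGF software, it does not handle ties, and it leaves out the significance estimate that the full method provides. The gene names and log-ratios are invented.

```python
def rank_products(log_ratios):
    """Geometric mean of each gene's rank across replicates.

    log_ratios maps a gene name to a list of log-ratios, one per replicate.
    Rank 1 is the most strongly up-regulated gene within a replicate, so a
    small rank product means the gene is consistently near the top.
    """
    genes = list(log_ratios)
    n_reps = len(log_ratios[genes[0]])
    product = {gene: 1.0 for gene in genes}
    for rep in range(n_reps):
        # Sort genes by this replicate's log-ratio, largest first
        ordered = sorted(genes, key=lambda g: log_ratios[g][rep], reverse=True)
        for rank, gene in enumerate(ordered, start=1):
            product[gene] *= rank
    # The n-th root turns the product of n ranks into a geometric mean
    return {gene: product[gene] ** (1.0 / n_reps) for gene in genes}


if __name__ == "__main__":
    # Hypothetical log2 ratios for four genes in three biological replicates
    data = {
        "geneA": [2.1, 1.8, 2.5],
        "geneB": [0.1, -0.2, 0.3],
        "geneC": [1.0, 2.0, -0.5],
        "geneD": [-1.5, -1.2, -1.9],
    }
    scores = rank_products(data)
    for gene in sorted(scores, key=scores.get):
        print(gene, round(scores[gene], 2))
```

In this toy data set geneA comes out on top because it ranks first or second in every replicate, whereas geneC, which is strongly up in one replicate but down in another, scores worse. Working on ranks rather than raw values is what keeps a single outlier replicate from dominating the result.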

  20. What else should I do?








Last updated: 20/October/2005