An Interactive Power Analysis Tool for Microarray Hypothesis Testing and Generation - HCE 3.5

A priori power calculations for both experimental design in hypothesis testing and hypothesis generation

Motivation: Human clinical projects typically require a priori statistical power analysis. To this end, we sought to build a flexible and interactive power analysis tool for microarray studies, integrated into our public domain HCE 3.5 software package. We then sought to determine whether probe set algorithms or organism type strongly influenced power analysis results.

Results: The HCE 3.5 power analysis tool was designed to import any pre-existing microarray project and interactively test the effects of user-defined values of α (significance), β (1-power), sample size, and effect size. The tool generates a filter for all probe sets or for more focused ontology-based subsets, with or without noise filters; this filter can be used to limit analyses of a future project to appropriately powered probe sets. We studied projects from three organisms (Arabidopsis, rat, human) and three probe set algorithms (MAS5.0, RMA, dChip PM/MM). We found large differences in power results based on probe set algorithm selection and noise filters. RMA provided exquisite sensitivity for low numbers of arrays, but at the cost of a high false positive rate (24% false positives in the human project studied). Our data suggest that a priori power calculations are important both for experimental design in hypothesis testing and hypothesis generation, and for selection of optimized data analysis parameters.

Design and implementation of a power analysis tool for microarrays

We designed and implemented an interactive power analysis method that enables rational design of experiments, thereby minimizing ethical concerns while maximizing the effectiveness of the study. The design strategy is shown in Fig. 3. Researchers first identify a pre-existing project that best matches their proposed project using any one of the existing data repositories (e.g. GEO, ArrayExpress, PEPR). The project must use the same microarray as in the proposed (future) experiment. The power analysis tool in HCE can use either a one-sample t-Test (one group of microarrays corresponding to replicates with a single variable), or a two-sample t-Test (two groups of microarrays differing by one variable).
Interactive Power Analysis Framework in HCE 3.5
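
To make the relationship between α, power, effect size, and the number of arrays concrete, here is a minimal Python sketch for a single probe set. It is not the HCE implementation: it assumes a two-sided test and uses the textbook normal approximation to the t-Test, and the function names and default values are illustrative only.

```python
from math import ceil, sqrt
from statistics import NormalDist

_Z = NormalDist()  # standard normal distribution


def arrays_needed(effect_size, sd, alpha=0.05, power=0.8, two_sample=True):
    """Approximate number of arrays (per group for the two-sample case).

    Normal approximation, two-sided test:
        n = k * ((z_{1-alpha/2} + z_{power}) * sd / effect_size) ** 2
    with k = 2 for a two-sample comparison and k = 1 for one sample.
    """
    if effect_size <= 0 or sd <= 0:
        raise ValueError("effect_size and sd must be positive")
    k = 2 if two_sample else 1
    z_alpha = _Z.inv_cdf(1 - alpha / 2)
    z_power = _Z.inv_cdf(power)
    return ceil(k * ((z_alpha + z_power) * sd / effect_size) ** 2)


def approximate_power(n, effect_size, sd, alpha=0.05, two_sample=True):
    """Approximate power reached with n arrays (per group if two_sample)."""
    k = 2 if two_sample else 1
    standard_error = sd * sqrt(k / n)
    z_alpha = _Z.inv_cdf(1 - alpha / 2)
    # The small contribution of the opposite tail is ignored.
    return _Z.cdf(effect_size / standard_error - z_alpha)


if __name__ == "__main__":
    # Effect size equal to one standard deviation, alpha = 0.05, power = 0.8
    print(arrays_needed(1.0, 1.0))                    # two-sample, per group
    print(arrays_needed(1.0, 1.0, two_sample=False))  # one-sample
```

Because the normal approximation ignores the heavier tails of the t distribution, it tends to understate the number of arrays needed when groups are very small; an exact calculation would use the noncentral t distribution.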

Effect of biological noise and probe set algorithms on power analysis results

We used three different pre-existing microarray projects to test our power calculation tool, one from a plant (Arabidopsis), one from a rat spinal cord damage project, and one from a human muscular dystrophy patient muscle biopsy project. We chose these three projects to test the effects of two variables, biological noise and probe set algorithms, on the resulting power calculations.

Effect of noise filters on power calculations

Not all genes are expressed into mRNA in each cell or tissue type. Those probe sets detecting mRNAs that are not expressed, or expressed at very low levels, are expected to result in signals that are at or near background (noise) levels. We therefore tested the effects of a “present call” noise filter on the resulting power calculations; we expected that the “performance” (e.g. proportion of sufficiently powered probe sets for any given number of arrays) would improve with this noise filter. We applied a fairly stringent noise filter (50% present calls).
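
A present call filter of this kind can be sketched in a few lines of Python. This is an illustration, not the HCE code: it assumes MAS5-style detection calls encoded as "P" (present), "M" (marginal), and "A" (absent), and it assumes the 50% threshold means "at least half of the arrays"; the probe set IDs are made up.

```python
def passes_present_call_filter(calls, min_present_fraction=0.5):
    """Return True if at least min_present_fraction of the calls are 'P'."""
    if not calls:
        return False
    present = sum(1 for call in calls if call.strip().upper() == "P")
    return present / len(calls) >= min_present_fraction


def filter_probe_sets(detection_calls, min_present_fraction=0.5):
    """Keep the IDs of probe sets that pass the present call filter.

    detection_calls maps a probe set ID to its list of calls, one per array.
    """
    return [
        probe_id
        for probe_id, calls in detection_calls.items()
        if passes_present_call_filter(calls, min_present_fraction)
    ]


if __name__ == "__main__":
    # Hypothetical probe set IDs and calls, for illustration only.
    example = {
        "probe_a": ["P", "P", "A", "A"],  # 50% present: kept
        "probe_b": ["A", "A", "M", "P"],  # 25% present: dropped
        "probe_c": ["P", "P", "P", "A"],  # 75% present: kept
    }
    print(filter_probe_sets(example))
```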

Effect of noise filter on concordance of power analysis results by probe set algorithms

Given the strong effects of the probe set algorithm choice on the proportion of sufficiently powered probe sets (Fig. 4), we then tested the intersection of the appropriately powered probe sets. For this test, we selected a gene ontology group, inflammatory response genes, where we expected many of the probe sets to show relatively low signals. We used the two-sample t-Test, and studied both the rat and human data. It should be noted that both of these projects are known to show increased inflammatory gene expression in one of the two groups (severe damage rat group; Duchenne muscular dystrophy human group). We also studied the intersection with and without a 50% present call noise filter.
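
The intersection test described above amounts to a set operation over the lists of sufficiently powered probe sets, one list per algorithm. The Python sketch below shows one way to compute it; the probe set IDs are hypothetical, and reporting concordance as shared / union is our illustrative choice rather than a measure defined by the study.

```python
def concordance(powered_by_algorithm):
    """Return (shared IDs, shared / union fraction) across all algorithms.

    powered_by_algorithm maps an algorithm name to an iterable of the
    probe set IDs that were sufficiently powered under that algorithm.
    """
    id_sets = [set(ids) for ids in powered_by_algorithm.values()]
    if not id_sets:
        return set(), 0.0
    shared = set.intersection(*id_sets)
    union = set.union(*id_sets)
    fraction = len(shared) / len(union) if union else 0.0
    return shared, fraction


if __name__ == "__main__":
    # Hypothetical results for three probe set algorithms.
    powered = {
        "MAS5.0": ["ps1", "ps2", "ps3"],
        "RMA": ["ps2", "ps3", "ps4"],
        "dChip": ["ps3", "ps2", "ps5"],
    }
    shared, fraction = concordance(powered)
    print(sorted(shared), fraction)
```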

How to prepare input files

You have to prepare two files: a probe set signal file and a probe set detection call file (or probe set detection p-value file). As you can see in the figure, the probe set detection p-value file from MAS5 can also be used with signal files generated by probe set signal algorithms other than MAS5.

File names

The two files should be in the same folder.  The extension of the probe set detection call file should be pma.  Please refer to the following example.

1. Using tab delimited text files

If the signal file name is mah-mas5.exp (or mah-mas5.txt), the probe set detection call file name should be mah-mas5.pma.

2. Using Excel files

If the signal file name is mah-mas5.xls, the probe set detection call file name should be mah-mas5.pma.xls.
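
The naming rule in the two examples above can be expressed as a small Python helper, for instance to check file names before loading a project. The function name is ours; only the extensions mentioned above (.exp, .txt, .xls) are handled.

```python
from pathlib import Path


def detection_call_path(signal_path):
    """Derive the expected detection call file path from a signal file path.

    mah-mas5.exp or mah-mas5.txt -> mah-mas5.pma
    mah-mas5.xls                 -> mah-mas5.pma.xls
    The result is in the same folder as the signal file.
    """
    path = Path(signal_path)
    suffix = path.suffix.lower()
    if suffix in (".exp", ".txt"):
        return path.with_name(path.stem + ".pma")
    if suffix == ".xls":
        return path.with_name(path.stem + ".pma.xls")
    raise ValueError(f"unsupported signal file extension: {path.suffix!r}")


if __name__ == "__main__":
    for name in ("mah-mas5.exp", "mah-mas5.txt", "mah-mas5.xls"):
        print(name, "->", detection_call_path(name).name)
```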

Format of the input files

Example: Please take a close look at these small example input files (spinal-cord.txt and spinal-cord.pma) in spinal-cord.zip.

Probe set signal file

Probe set detection call file (generated by MAS5)

Please note that the order of rows and columns is the same as in the signal file.
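
A quick way to verify the matching order before loading the files is to compare the row and column labels of the two files. The Python sketch below does this for tab delimited text. The layout it assumes (first row holds the sample names, first column holds the probe set IDs) is our assumption for illustration, as are the sample names and probe set IDs; check it against the example files in spinal-cord.zip.

```python
import csv
import io


def read_labels(text):
    """Return (column labels, row labels) of tab delimited text.

    Assumes the first row is a header and the first column of every
    following row is the probe set ID.
    """
    rows = [row for row in csv.reader(io.StringIO(text), delimiter="\t") if row]
    if not rows:
        return [], []
    column_labels = rows[0][1:]
    row_labels = [row[0] for row in rows[1:]]
    return column_labels, row_labels


def same_layout(signal_text, call_text):
    """True if both files list the same columns and rows in the same order."""
    return read_labels(signal_text) == read_labels(call_text)


if __name__ == "__main__":
    # Hypothetical file contents, for illustration only.
    signal = "ID\tsample1\tsample2\nps1\t120.5\t98.1\nps2\t15.2\t22.8\n"
    calls = "ID\tsample1\tsample2\nps1\tP\tP\nps2\tA\tM\n"
    print(same_layout(signal, calls))
```

With real files, the text can be read with `open(path, newline="").read()` and passed to `same_layout`.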

To perform power analysis in HCE 3.5

  1. load your data
  2. go to Tool -> Power Calculation & Filtering
  3. assign samples to groups
  4. select a model (one or two-sample t-Test)
  5. select a dependent parameter
  6. adjust values for independent parameters
  7. click "Calculate" button
  8. adjust the double sided slider control to change the thresholds for the dependent parameter
  9. export or highlight the resulting probe sets ("export" or "highlight" button)

Papers

For more information, please refer to the following papers.

Download

Download HCE 3.5 version for interactive power analysis, April 25, 2006 (first released on Nov. 11, 2005)

Old User manual (New one for ver 3.5 in preparation)

System requirements
Intel® Pentium® processor
Microsoft® Windows 2000® or Windows XP®

