Seo, J., Bakay, M., Chen, Y., Hilmer, S., Shneiderman, B., Hoffman, E. (December 2003)
Building a Coherent Data Pipeline in Microarray Data Analyses: Optimization of Signal/Noise Ratios Using an Interactive Visualization Tool and a Novel Noise Filtering Method
Bioinformatics 20 (2004), 2534-2544.
HCIL-2003-44, CS-TR-4674, ISR-TR-2005-49

Motivation: Sources of uncontrolled noise strongly influence data analysis in microarray studies, yet signal/noise ratios are rarely considered in microarray data analyses. We hypothesized that different research projects would have different sources and levels of confounding noise, and built an interactive visual analysis tool to test and define parameters in Affymetrix analyses that optimize the ratio of signal (desired biological variable) to noise (confounding uncontrolled variables).

Results: Five probe set algorithms were studied with and without statistical weighting of probe sets using Microarray Suite (MAS) 5.0 probe set detection p-values. The signal/noise optimization method was tested on two large novel microarray data sets with different levels of confounding noise: a 105-sample U133A human muscle biopsy data set (11 groups, extensive noise) and a 40-sample U74A inbred mouse lung data set (8 groups, little noise). Success was measured as the F-measure of unsupervised clustering into the appropriate biological groups (signal). We show that both the probe set signal algorithm and probe set detection p-value weighting have a strong effect on signal/noise ratios, and that the different methods performed quite differently on the two data sets. Among the signal algorithms tested, the dChip difference model with p-value weighting was the most consistent at maximizing the effect of the target biological variables on data interpretation in the two data sets.

Availability: The Hierarchical Clustering Explorer 2.0 is available at , and the improved version of the Hierarchical Clustering Explorer 2.0 with p-value weighting and F-measure is available upon request to the first author. Murine arrays (40 samples) are publicly available at the PEPR resource (Chen et al., 2004).
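
The abstract does not spell out how the F-measure of a clustering is computed. As a rough illustration only, the sketch below assumes one common definition: for each known biological group, take the best F score (harmonic mean of precision and recall) over all clusters, then average those scores weighted by group size. The function name `clustering_f_measure` and the use of flat cluster labels (for example, obtained by cutting a dendrogram) are assumptions for this example, not details taken from the paper or from the Hierarchical Clustering Explorer.

```python
from collections import Counter


def clustering_f_measure(true_groups, cluster_ids):
    """Group-size-weighted best-match F-measure of a flat clustering.

    true_groups: known biological group label for each sample.
    cluster_ids: cluster label assigned to each sample (same order).
    Returns a value in (0, 1]; 1.0 means clusters match groups exactly.
    This definition is an assumption, not necessarily the paper's.
    """
    if len(true_groups) != len(cluster_ids):
        raise ValueError("inputs must have the same length")
    n = len(true_groups)
    if n == 0:
        raise ValueError("inputs must be non-empty")

    group_sizes = Counter(true_groups)
    cluster_sizes = Counter(cluster_ids)
    # Number of samples shared by each (group, cluster) pair.
    overlap = Counter(zip(true_groups, cluster_ids))

    total = 0.0
    for group, g_size in group_sizes.items():
        best = 0.0
        for cluster, c_size in cluster_sizes.items():
            shared = overlap[(group, cluster)]
            if shared == 0:
                continue
            precision = shared / c_size  # purity of the cluster
            recall = shared / g_size     # coverage of the group
            f_score = 2 * precision * recall / (precision + recall)
            best = max(best, f_score)
        # Weight each group's best score by its share of the samples.
        total += (g_size / n) * best
    return total


if __name__ == "__main__":
    groups = ["muscle", "muscle", "lung", "lung"]
    print(clustering_f_measure(groups, [0, 0, 1, 1]))
    print(clustering_f_measure(groups, [0, 0, 0, 0]))
```

Under this definition, a clustering that lumps every sample into a single cluster still has perfect recall but poor precision, so its score falls below 1. A score of this kind rewards clusterings that line up with the biological groups, which is how it can serve as a proxy for signal.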

