


HCIL-2003-44

Seo, J., Bakay, M., Chen, Y., Hilmer, S., Shneiderman, B., Hoffman, E. (December 2003)
Building a Coherent Data Pipeline in Microarray Data Analyses: Optimization of Signal/Noise Ratios Using an Interactive Visualization Tool and a Novel Noise Filtering Method
Bioinformatics 20(16), 2004, 2534-2544. http://bioinformatics.oupjournals.org/cgi/content/abstract/20/16/2534?etoc
HCIL-2003-44, CS-TR-4674, ISR-TR-2005-49

Motivation: Sources of uncontrolled noise strongly influence data analysis in microarray studies, yet signal/noise ratios are rarely considered in microarray data analyses. We hypothesized that different research projects would have different sources and levels of confounding noise, and built an interactive visual analysis tool to test and define parameters in Affymetrix analyses that optimize the ratio of signal (desired biological variable) to noise (confounding uncontrolled variables).

Results: Five probe set algorithms were studied with and without statistical weighting of probe sets using Microarray Suite (MAS) 5.0 probe set detection p-values. The signal/noise optimization method was tested in two large novel microarray data sets with different levels of confounding noise: a 105-sample U133A human muscle biopsy data set (11 groups; extensive noise) and a 40-sample U74A inbred mouse lung data set (8 groups; little noise). Success was measured by the F-measure of how well unsupervised clustering recovered the appropriate biological groups (signal). We show that both the probe set signal algorithm and probe set detection p-value weighting have a strong effect on signal/noise ratios, and that the different methods performed quite differently in the two data sets. Among the signal algorithms tested, the dChip difference model with p-value weighting was the most consistent at maximizing the effect of the target biological variables on data interpretation in the two data sets.

Availability: The Hierarchical Clustering Explorer 2.0 is available at http://www.cs.umd.edu/hcil/hce/, and the improved version of the Hierarchical Clustering Explorer 2.0 with p-value weighting and F-measure is available on request from the first author. Murine arrays (40 samples) are publicly available at the PEPR resource (http://microarray.cnmcresearch.org/pgadatatable.asp) (Chen et al., 2004).
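The abstract scores clustering success with an F-measure but does not give the formula. As an illustration only, the sketch below uses one common clustering F-measure: for each known biological group, take the best harmonic mean of precision and recall over all clusters, then average those values weighted by group size. The function name `clustering_f_measure` and the sample labels are hypothetical, and this definition is an assumption that may differ from the one implemented in the Hierarchical Clustering Explorer.

```python
from collections import Counter


def clustering_f_measure(true_groups, cluster_ids):
    """Size-weighted best-match F-measure of a clustering against known groups.

    Assumed definition (not taken from the paper): for every known group,
    find the cluster with the highest F1 score against it, then average
    those best scores weighted by group size.
    """
    if len(true_groups) != len(cluster_ids):
        raise ValueError("true_groups and cluster_ids must have equal length")
    if not true_groups:
        raise ValueError("at least one sample is required")

    n = len(true_groups)
    group_sizes = Counter(true_groups)
    cluster_sizes = Counter(cluster_ids)
    # Number of samples shared by each (group, cluster) pair.
    overlap = Counter(zip(true_groups, cluster_ids))

    total = 0.0
    for group, group_size in group_sizes.items():
        best = 0.0
        for cluster, cluster_size in cluster_sizes.items():
            shared = overlap.get((group, cluster), 0)
            if shared == 0:
                continue
            precision = shared / cluster_size  # share of the cluster that is this group
            recall = shared / group_size       # share of this group captured by the cluster
            f1 = 2 * precision * recall / (precision + recall)
            best = max(best, f1)
        total += (group_size / n) * best
    return total


# Hypothetical labels: two biological groups, four samples.
groups = ["muscle", "muscle", "lung", "lung"]
perfect = clustering_f_measure(groups, [0, 0, 1, 1])
merged = clustering_f_measure(groups, [0, 0, 0, 0])
```

Under this definition, a clustering that reproduces the known groups exactly scores 1.0, and the score drops as groups are merged or split, which is the sense in which a higher value indicates more signal relative to confounding noise.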

