Behavior-Driven Optimizations for Big Data Exploration
With massive amounts of data coming from field sensors, sequencers, mobile devices, and other sources, data-driven decision making has become increasingly important in industry, government, and the sciences. A key challenge for analysts and scientists who work with large datasets is visualizing their data efficiently so they can extract patterns, observe anomalies, and debug their analysis workflows. Although a variety of visualization tools exist to help people make sense of their data, these tools often rely on database management systems (DBMSs) for data processing and storage. Unfortunately, DBMSs fail to process the data fast enough to support a fluid, interactive visualization experience. My work blends optimization techniques from databases with methodology from HCI and visualization to support interactive exploration of large datasets. In this talk, I will first discuss Sculpin, a visual exploration system that automatically learns user exploration patterns and exploits them to pre-fetch data ahead of users as they explore. I will then discuss ongoing work to better understand how analysts explore data in more sophisticated analysis systems, such as Tableau Desktop. Finally, I will report on ongoing efforts to standardize the way we evaluate visual data analysis systems in general.