CMSC828K: Sensor Data Management; Data Streams

Prof. Amol Deshpande;    CSIC 3118;    Tue-Thur 3:30pm-4:45pm


Course Description

In recent years, there has been an explosion of information in a variety of environments that pose significantly different data management challenges than traditional database domains. Examples include sensor networks, the World Wide Web, scientific domains, XML, and P2P networks. In this course, we will explore a few topics related to data management in some of these environments.
  • Distributed Measurement Networks:

    Recent innovations in miniaturization technology have enabled large-scale deployments of distributed measurement networks in a variety of settings. Wireless sensor-actuator networks, especially, enable highly cost-effective monitoring and control of physical environments at unprecedented detail. Networks of larger sensing devices such as web cameras (e.g., to monitor traffic), GPS devices, and RFID sensor networks have also become ubiquitous. However, the potential of such networks has barely been exploited, mainly because of the complexity of managing, analyzing, and effectively using the huge amounts of data generated in such distributed environments. We will briefly review some of these applications and the hardware trends in sensor networks, and then discuss a variety of data management and processing issues.
  • Data Streams:

    There is an increasing need for real-time processing, analysis, and dissemination of data generated in environments such as sensor networks, mobile devices, network monitors, financial data feeds, and XML data sources. Traditional database systems cannot handle the requirements of such environments. As a result, there has been much research on "data streams" in the last few years. We will review and discuss the needs of such applications, the proposed systems for managing streaming data, and algorithms for processing such data in real time. Topics of interest include query processing and query optimization over data streams.
  • Management of Uncertain, Imprecise Data / Probabilistic Databases:

    Many of the challenges in data management in such environments stem from the imprecision, inherent uncertainty, and incompleteness of the generated data. This has brought the issue of effectively managing such data to the forefront. There has been very little work on this topic so far, which opens up many exciting research opportunities. Probabilistic and statistical modelling techniques in particular are emerging as a promising way to manage complexity in many such environments. We will review some of the (older) work on probabilistic databases, as well as recent proposals for dealing with such data, especially those that use machine learning techniques.