- Algorithms for the Thermal Scheduling Problem;
Koyel Mukherjee, Samir Khuller, and Amol Deshpande;
IPDPS 2013.
The energy costs for cooling a data center constitute a significant portion of the overall running costs. Thermal imbalance and hot spots that arise due to
imbalanced workloads lead to significant wasted cooling effort -- in order to ensure that no equipment is operating above a certain temperature, the data center
may be cooled more than necessary. Therefore it is desirable to schedule the workload in a data center in a "thermally aware" manner, assigning jobs to machines
... [more]
The energy costs for cooling a data center constitute a significant portion of the overall running costs. Thermal imbalance and hot spots that arise due to
imbalanced workloads lead to significant wasted cooling effort -- in order to ensure that no equipment is operating above a certain temperature, the data center
may be cooled more than necessary. Therefore it is desirable to schedule the workload in a data center in a "thermally aware" manner, assigning jobs to machines
not just based on local load of the machines, but based on the overall thermal profile of the data center. This is challenging because of the spatial
cross-interference between machines, where a job assigned to a machine may impact not only that machine's temperature, but also nearby machines. Here, we
continue formal analysis of the "thermal scheduling" problem that we initiated recently~\cite{sig-abstract}. There we introduced the notion of "effective load of
a machine" which is a function of the local load on the machine as well as the load on nearby machines, and presented optimal scheduling policies for a simple
model (where cross-effects are restricted within a rack) under the assumption that jobs can be split among different machines. Here we consider the more
realistic problem of "integral" assignment of jobs, and allow for cross-interference among different machines in adjacent racks in the data center. The integral
assignment problem with cross-interference is NP-hard, even for a simple two machine model. We consider three different heat flow models, and give constant
factor approximation algorithms for maximizing the number (or total profit) of jobs assigned in each model, without violating thermal constraints. We also
consider the problem of minimizing the maximum temperature on any machine when all jobs need to be assigned, and give constant factor algorithms for this
problem.
- Saving on Cooling: The Thermal Scheduling Problem;
Koyel Mukherjee, Samir Khuller, and Amol Deshpande;
SigMetrics 2012 (Poster paper).
- Energy Efficient Monitoring in Sensor Networks;
Amol Deshpande and Samir Khuller and Azarakhsh Malekian and Mohammed Toossi;
Algorithmica, Volume 59, Number 1, 94-114, 2011.
[pdf]
We study a set of problems related to efficient battery energy utilization for
monitoring applications in a wireless sensor network with the goal to increase
the sensor network lifetime. We study several generalizations of a basic problem
... [more]
We study a set of problems related to efficient battery energy utilization for
monitoring applications in a wireless sensor network with the goal to increase
the sensor network lifetime. We study several generalizations of a basic problem
called Set k-Cover. The problem can be described as follows: we are given a set
of sensors, and a set of targets to be monitored. Each target can be monitored
by a subset of the sensors. To increase the lifetime of the sensor network, we
would like to partition the sensors into k sets (or time-slots), and activate
each set of sensors in a different time-slot, thus extending the battery life of
the sensors by a factor of k. The goal is to find a partitioning that maximizes
the total coverage of the targets for a given k. This problem is known to be
NP-hard. We develop an improved approximation algorithm for this problem using a
reduction to Max k-Cut. Moreover, we are able to demonstrate that this algorithm
is efficient, and yields almost optimal solutions in practice. We also consider
generalizations of this problem in several different directions. First, we allow
each sensor to be active in α different sets (time-slots). This means that the
battery life is extended by a factor of k/\alpha, and allows for a richer space
of solutions. We also consider different coverage requirements, such as
requiring that all targets, or at least a certain number of targets, be covered
in each time slot. In the Set k-Cover formulation, there is no requirement that
a target be monitored at all, or in any number of time slots. We develop a
randomized rounding algorithm for this problem. We also consider extensions
where each sensor can monitor only a bounded number of targets in any time-slot,
and not all the targets adjacent to it. This kind of problem may arise when a
sensor has a directional camera, or some other physical constraint might prevent
it from monitoring all adjacent targets even when it is active. We develop the
first approximation algorithms for this problem.
- On Computing Compression Trees for Data Collection in Wireless Sensor Networks;
Jian Li, Amol Deshpande, and Samir Khuller;
INFOCOM 2010 (also CoRR Technical Report arXiv:0907.5442).
[pdf]
We address the problem of efficiently gathering correlated data from a wired or
a wireless sensor network, with the aim of designing algorithms with provable
optimality guarantees, and understanding how close we can get to the known
... [more]
We address the problem of efficiently gathering correlated data from a wired or
a wireless sensor network, with the aim of designing algorithms with provable
optimality guarantees, and understanding how close we can get to the known
theoretical lower bounds. Our proposed approach is based on finding an optimal
or a near-optimal "compression tree" for a given sensor network: a compression
tree is a directed tree over the sensor network nodes such that the value of a
node is compressed using the value of its parent. We consider this problem under
different communication models, including the "broadcast communication" model
that enables many new opportunities for energy-efficient data collection. We
draw connections between the data collection problem and a previously studied
graph concept, called "weakly connected dominating sets", and we use this to
develop novel approximation algorithms for the problem. We present comparative
results on several synthetic and real-world datasets showing that our algorithms
construct near-optimal compression trees that yield a significant reduction in
the data collection cost.
- Energy Efficient Monitoring in Sensor Networks;
Amol Deshpande and Samir Khuller and Azarakhsh Malekian and Mohammed Toossi;
LATIN 2008.
[pdf]
In this paper we study a set of problems related to efficient energy management
for monitoring applications in wireless sensor networks. We study several
generalizations of a basic problem called Set "k"-Cover. The problem can be
... [more]
In this paper we study a set of problems related to efficient energy management
for monitoring applications in wireless sensor networks. We study several
generalizations of a basic problem called Set "k"-Cover. The problem can be
described as follows: we are given a set of sensors, and a set of regions to be
monitored. Each region can be monitored by a subset of sensors. The goal is to
partition the sensors into "k" sets (or time-slots), so that by activating the
set of sensors in a time-slot, we can maximize coverage of the regions. By
activating each sensor in only one of the "k" time slots, we decrease its
battery consumption and extend its battery life significantly (by a factor of
"k"). This problem is known to be "NP"-hard. Our goal is to develop improved
approximation algorithms for this problem. Moreover, we are able to demonstrate
that this algorithm is practical, and yields almost optimal solutions in
practice.
We also consider generalizations of this problem in several
different directions. First, we allow each sensor to be active in "\alpha"
different sets (time-slots). This means that the battery life is extended by a
factor of "\frac{k}{\alpha}", and allows for a richer space of solutions. We
also consider different coverage requirements, such as requiring that the
regions be covered in each time slot, or a certain number of regions be
monitored in each time slot. In the Set "k"-Cover formulation, there is no
requirement that a region be monitored at all, or in any number of time slots.
We develop a randomized rounding algorithm for this problem.
We also
consider extensions where each sensor can monitor only a bounded number of
regions, and not all the regions adjacent to it. This kind of problem may arise
when a sensor has a directional camera, or some other physical constraint might
prevent it from monitoring all adjacent regions even when it is active. We
develop the first approximation algorithms for this problem.
- Predictive Modeling-based Data Collection in Sensor Networks;
Lidan Wang and Amol Deshpande;
EWSN 2008.
[pdf] Selected as one of the best papers.
We address the problem of designing practical, energy-efficient protocols for
data collection in wireless sensor networks using predictive modeling. Prior
work has suggested several approaches to capture and exploit the rich spatio-temporal
... [more]
We address the problem of designing practical, energy-efficient protocols for
data collection in wireless sensor networks using predictive modeling. Prior
work has suggested several approaches to capture and exploit the rich spatio-temporal
correlations prevalent in WSNs during data collection.
Although shown to be effective in reducing the data collection cost, those
approaches use simplistic corelation models and further, ignore many
idiosyncrasies of WSNs, in particular the broadcast nature of communication.
Our proposed approach is based on approximating the joint probability distribution
over the sensors using "undirected graphical models", ideally suited to
exploit both the spatial correlations and the broadcast nature of communication.
We present algorithms for optimally using such a model for data collection under
different communication models, and for identifying an appropriate model to use for a given
sensor network. Experiments over synthetic and real-world datasets show that our
approach significantly reduces the data collection cost.
- Online Filtering, Smoothing and Probabilistic Modeling of Streaming data;
Bhargav Kanagal, Amol Deshpande;
ICDE 2008.
[pdf] (Extended version)
In this paper, we address the problem of extending a relational database system
to facilitate efficient real-time application of dynamic probabilistic models to
streaming data. We use the recently proposed abstraction of model-based views
... [more]
In this paper, we address the problem of extending a relational database system
to facilitate efficient real-time application of dynamic probabilistic models to
streaming data. We use the recently proposed abstraction of model-based views
for this purpose, by allowing users to declaratively specify the model to be
applied, and by presenting the output of the models to the user as a
probabilistic database view. We support declarative querying over such views
using an extended version of SQL that allows for querying probabilistic data.
Underneath we use particle filters, a class of sequential Monte Carlo algorithms
commonly used to implement dynamic probabilistic models, to represent the
present and historical states of the model as sets of weighted samples
(particles) that are kept up-to-date as new readings arrive. We develop novel
techniques to convert the queries on the model-based view directly into queries
over particle tables, enabling highly efficient query processing. Finally, we
present exmodel to be applied, and by presenting the output of the models to the
user as a probabilistic database view. We support declarative querying over such
views using an extended version of SQL that allows for querying probabilistic
data. Underneath we use particle filters, a class of sequential Monte C
- Model-based Querying in Sensor Networks;
Amol Deshpande, Carlos Guestrin, Samuel Madden;
Chapter in Encyclopedia of Database Systems. Ling Liu and M. Tamer Ozsu, ed. 2009..
[pdf]
- Data Compression in Sensor Networks;
Amol Deshpande;
Chapter in Encyclopedia of Database Systems. Ling Liu and M. Tamer Ozsu, ed. 2009..
[pdf]
- Data Management in the Worldwide Sensor Web;
Magdalena Balazinska, Amol Deshpande, Michael Franklin, Phil Gibbons, Jim Gray, Mark Hansen, Michael Liebhold, Suman Nath, Alex Szalay, and Vincent Tao;
IEEE Pervasive Computing, Volume 6(2), 2007.
[pdf]
Harvesting the benefits of a sensor-rich world presents many data management challenges. Recent advances in research and industry aim to address these challenges.
- A Graph-Based Approach to Vehicle Tracking in Traffic Camera Video Streams;
Hamid Haidarian Shahri, Galileo Mark Namata, Saket Navlakha, Amol Deshpande, and Nick Roussopoulos;
The 4th International VLDB Workshop on Data Management for Sensor Networks (DMSN), 2007.
- Approximate Data Collection in Sensor Networks using Probabilistic Models;
David Chu, Amol Deshpande, Joseph M. Hellerstein, Wei Hong;
ICDE 2006.
[pdf] [talk]
Wireless sensor networks are proving to be useful in a variety of settings. A
core challenge in these networks is to minimize energy consumption. Prior
database research has proposed to achieve this by pushing data-reducing
... [more]
Wireless sensor networks are proving to be useful in a variety of settings. A
core challenge in these networks is to minimize energy consumption. Prior
database research has proposed to achieve this by pushing data-reducing
operators like aggregation and selection down into the network. This approach
has proven unpopular with early adopters of sensor network technology, who
typically want to extract complete "dumps" of the sensor readings, i.e., to run
"SELECT *" queries. Unfortunately, because these queries do no data reduction,
they consume significant energy in current sensornet query processors.
In this paper we attack the "SELECT *" problem for sensor networks. We propose a
robust approximate technique called "Ken" that uses "replicated dynamic
probabilistic models" to minimize communication from sensor nodes to the
network's PC base station. In addition to data collection, we show that Ken is
well suited to
anomaly- and event-detection applications.
A key challenge in this work is to intelligently exploit spatial
correlations "across" sensor nodes without imposing undue sensor-to-sensor communication burdens to maintain the models.
Using traces from two real-world sensor network deployments, we demonstrate that relatively simple models can provide
significant communication (and hence energy) savings without undue sacrifice in result quality or frequency. Choosing
optimally among even our simple models is NP-hard, but our experiments show that a greedy heuristic performs nearly as
well as an exhaustive algorithm.
- Model-based Approximate Querying in Sensor Networks;
Amol Deshpande, Carlos Guestrin, Sam Madden, Joseph M. Hellerstein, Wei Hong;
International Journal on Very Large Data Bases (VLDB Journal), 2005.
[pdf]
Declarative queries are proving to be an attractive paradigm for interacting
with networks of wireless sensors. The metaphor that "the sensornet is a
database" is problematic, however, because sensors do not exhaustively represent
... [more]
Declarative queries are proving to be an attractive paradigm for interacting
with networks of wireless sensors. The metaphor that "the sensornet is a
database" is problematic, however, because sensors do not exhaustively represent
the data in the real world. In order to map the raw sensor readings onto
physical reality, a "model" of that reality is required to complement the
readings. In this article, we enrich interactive sensor querying with
statistical modeling techniques. We demonstrate that such models can help
provide answers that are both more meaningful, and, by introducing
approximations with probabilistic confidences, significantly more efficient to
compute in both time and energy. Utilizing the combination of a model and live
data acquisition raises the challenging optimization problem of selecting the
best sensor readings to acquire, balancing the increase in the confidence of our
answer against the communication
and data acquisition costs in the network. We describe an exponential time algorithm for finding the optimal solution
to this optimization problem, and a polynomial-time heuristic for identifying solutions that perform well in practice.
We evaluate our approach on several real-world sensor-network data sets, taking into account the real measured data and
communication quality, demonstrating that our model-based approach provides a high-fidelity representation of the real
phenomena and leads to significant performance gains versus traditional data acquisition techniques.
- Resource-Aware Wireless Sensor-Actuator Networks;
Amol Deshpande, Carlos Guestrin, Sam Madden;
IEEE Data Engineering Bulletin March 2005.
[pdf]
Innovations in wireless sensor networks (WSNs) have dramatically expanded
the applicability of control technology in day-to-day life, by enabling
the cost-effective deployment of large scale sensor-actuator systems.
... [more]
Innovations in wireless sensor networks (WSNs) have dramatically expanded
the applicability of control technology in day-to-day life, by enabling
the cost-effective deployment of large scale sensor-actuator systems.
In this paper, we discuss the issues and challenges involved in deploying control-oriented
applications over unreliable, resource-constrained WSNs, and describe the design
of our planned Sensor Control System (SCS) that can enable the rapid development and
deployment of such applications.
- Model-Driven Data Acquisition in Sensor Networks;
Amol Deshpande, Carlos Guestrin, Sam Madden, Joseph M. Hellerstein, Wei Hong;
VLDB 2004.
[pdf] Recipient of the best paper award.
Declarative queries are proving to be an attractive paradigm for interacting
with networks of wireless sensors. The metaphor that "the sensornet is a
database" is problematic, however, because sensors do not exhaustively represent
... [more]
Declarative queries are proving to be an attractive paradigm for interacting
with networks of wireless sensors. The metaphor that "the sensornet is a
database" is problematic, however, because sensors do not exhaustively represent
the data in the real world. In order to map the raw sensor readings onto
physical reality, a "model" of that reality is required to complement the
readings. In this paper, we enrich interactive sensor querying with statistical
modeling techniques. We demonstrate that such models can help provide
answers that are both more meaningful, and, by introducing approximations
with probabilistic confidences,
significantly more efficient to compute in both time and energy. Utilizing the combination of
a model and live data acquisition raises the challenging optimization
problem of selecting the best sensor readings to acquire, balancing the increase
in the confidence of our answer against the communication and data acquisition costs in the network.
We describe an exponential
time algorithm for finding the optimal solution to this optimization problem, and a
polynomial-time heuristic for identifying solutions that
perform well in practice. We evaluate our approach on several
real-world sensor-network data sets, taking into account the real measured data and communication quality, demonstrating that our model-based approach
provides a high-fidelity representation of the real phenomena and leads to significant performance gains versus traditional data acquisition
techniques.
- Cache-and-Query for Wide Area Sensor Databases;
Amol Deshpande, Suman Nath, Phil Gibbons, Srini Seshan;
SIGMOD 2003.
[pdf] [talk] (IrisNet Project Web Page)
Webcams, microphones, pressure gauges and other sensors provide exciting new
opportunities for querying and monitoring the physical world. In this paper we
focus on querying wide area sensor databases, containing (XML) data derived from
... [more]
Webcams, microphones, pressure gauges and other sensors provide exciting new
opportunities for querying and monitoring the physical world. In this paper we
focus on querying wide area sensor databases, containing (XML) data derived from
sensors spread over tens to thousands of miles. We present the first scalable
system for executing XPATH queries on such databases. The system maintains the
logical view of the data as a single XML document, while physically the data is
fragmented across any number of host nodes. For scalability, sensor data is
stored close to the sensors, but can be cached elsewhere as dictated by the
queries. Our design enables self-starting distributed queries that jump directly
to the lowest common ancestor of the query result, dramatically reducing query
response times. We present a novel query-evaluategather technique (using XSLT)
for detecting (1) which data in a local database fragment is part of the query
result, and (2) how to gather the missing parts. We define partitioning and
cache invariants that ensure that even partial matches on cached data are
exploited and that correct answers are returned, despite our dynamic
query-driven caching. Experimental results demonstrate that our techniques dramatically increase query throughputs and decrease query response
times in wide area sensor databases.