Shichao Jia, Tianjin University, jsc_se@tju.edu.cn PRIMARY
Jiaqi Wang, Tianjin University, qimelbourne@gmail.com
Zeyu Li, Tianjin University, lzytianda@tju.edu.cn
Jiawan Zhang, Tianjin University, jwzhang@tju.edu.cn SUPERVISOR
Student Team: YES
Tools used: Python, R, D3.js
Approximately how many hours were spent working on this submission in total?
About 150 hours (30 days at 5 hours/day)
May we post your submission in the Visual Analytics
Benchmark Repository after VAST Challenge 2019 is complete?
YES
Video
Questions
Your task, as
supported by visual analytics that you apply, is to help St. Himark's emergency
management team combine data from the government-operated stationary monitors
with data from citizen-operated mobile sensors to help them better understand
conditions in the city and identify likely locations that will require further
monitoring, cleanup, or even evacuation. Will data from citizen scientists
clarify the situation or make it more uncertain? Use visual analytics to
develop responses to the questions below. Novel visualizations of uncertainty
are especially interesting for this mini-challenge.
1 – Visualize radiation
measurements over time from both static and mobile sensors to identify areas
where radiation over background is detected. Characterize changes over time.
Limit your response to 6 images and 500 words.
We first used scatter plots to visualize the radiation measurements. However,
the raw readings are very noisy: they jump around constantly rather than
changing smoothly, which makes direct analysis difficult. We therefore decided
to smooth the data first and to estimate the uncertainty of the smoothed
values. After trying several techniques, we chose the LOWESS / LOESS model
(Locally Weighted Regression, also known as Locally Estimated Scatterplot
Smoothing or Locally Weighted Scatterplot Smoothing). Two key advantages
distinguish this method from alternatives such as bin smoothing and kernel
smoothing:
1. The LOESS model is nonparametric, so we do not need to specify in advance
a function for the data to be fitted to.
2. The LOESS model provides a confidence interval that we can use to evaluate
the uncertainty.
Figure 1 shows one example of the LOESS model. The gray points are the
original data, the black line is the smoothed value over time, and the blue
band is the 95% confidence interval.

Figure 1
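The smoothing step can be illustrated with a minimal sketch. This is not the code used for the submission: it assumes a local linear fit with a tricube kernel, a hypothetical window fraction `frac`, and a normal-approximation band computed from the smoother weights. The function names and the synthetic series are illustrative only.

```python
import math
import random


def tricube(u):
    """Tricube kernel: the weight falls to 0 at the edge of the local window."""
    u = abs(u)
    return (1.0 - u ** 3) ** 3 if u < 1.0 else 0.0


def smoother_row(xs, x0, frac):
    """Weights l_i such that the local linear fit at x0 equals sum(l_i * y_i).

    x0 is expected to be one of the sample positions in xs.
    """
    n = len(xs)
    k = min(n, max(3, math.ceil(frac * n)))      # points in the local window
    h = sorted(abs(x - x0) for x in xs)[k - 1] or 1e-12
    w = [tricube((x - x0) / h) for x in xs]
    s0 = sum(w)
    s1 = sum(wi * (x - x0) for wi, x in zip(w, xs))
    s2 = sum(wi * (x - x0) ** 2 for wi, x in zip(w, xs))
    denom = s0 * s2 - s1 * s1
    if denom < 1e-12:                            # degenerate window: weighted mean
        return [wi / s0 for wi in w]
    return [wi * (s2 - s1 * (x - x0)) / denom for wi, x in zip(w, xs)]


def loess(xs, ys, frac=0.3, z=1.96):
    """Return (smoothed values, half-widths of an approximate 95% band)."""
    fitted, norms, trace = [], [], 0.0
    for i, x0 in enumerate(xs):
        row = smoother_row(xs, x0, frac)
        fitted.append(sum(l * y for l, y in zip(row, ys)))
        norms.append(math.sqrt(sum(l * l for l in row)))
        trace += row[i]                          # effective number of parameters
    rss = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    sigma = math.sqrt(rss / max(len(xs) - trace, 1.0))
    return fitted, [z * sigma * nm for nm in norms]


# Synthetic noisy series standing in for one sensor's readings (invented data).
random.seed(0)
times = list(range(60))
readings = [15 + 0.1 * t + random.gauss(0, 2) for t in times]
smooth, half_width = loess(times, readings)
```

In this sketch, `smooth[i] - half_width[i]` and `smooth[i] + half_width[i]` play the role of the blue band, and `half_width` is the quantity a colored band would map to its width.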
We use this model to summarize every time series, which greatly simplifies
our analysis pipeline. To compare the results, we designed a colored band that
summarizes each series: color encodes the smoothed value, and the width of the
band encodes the confidence interval (the uncertainty).
We stack all the colored bands in a trajectory wall for easy comparison. Note
that the gray lines behind the colored bands mark the periods when the cars
are moving.

Figure 2
2 – Use visual analytics to represent
and analyze uncertainty in the measurement of radiation across the city.
a. Compare uncertainty of
the static sensors to the mobile sensors. What anomalies can you see? Are there
sensors that are too uncertain to trust?
b. Which regions of the city
have greater uncertainty of radiation measurement? Use visual analytics to
explain your rationale.
c. What effects do you see
in the sensor readings after the earthquake and other major events? What effect
do these events have on uncertainty?
Limit your responses to 12 images and 1000 words.
In addition to the trajectory wall, we provide a line chart that summarizes
the time series. Each line shows the readings of one sensor over time. Orange
lines are static sensors and gray lines are mobile sensors. Users can switch
between the smoothed value and the uncertainty. The results are shown in
Figure 3.

Figure 3
a. In Figure 3, most gray lines lie above the orange lines in both the value
line chart and the uncertainty line chart. Overall, static sensors therefore
have both lower values and lower uncertainty than mobile sensors. Moreover,
outside the time ranges circled in red, the value lines are relatively stable,
so we think the sensors can still be trusted.
b. Several factors influence the uncertainty of the radiation measurements.
First, the radiation background of the environment may change the
measurements: different objects (buildings, trees, etc.) emit different
amounts of radiation, which may affect the uncertainty differently. Second,
the sensor devices may have their own inherent (system) uncertainty, so
different mobile sensors have different uncertainty. Third, uncertainty is
affected by events such as earthquakes; we can clearly see that the five
events have a great influence on the total uncertainty. However, the final
readings mix all of these kinds of uncertainty together, so it is quite hard
to tell whether the uncertainty is due to the regions or to the sensors
themselves.
Our hypothesis is that device uncertainty, rather than uncertainty caused by
regions, is the main factor in the final uncertainty. The hypothesis is
motivated by the fact that most sensors keep a generally stable band width
(uncertainty) no matter where they go (see Figure 2). To check it, we examined
each colored band with a relatively wide width and found the locations where
these cars usually stay. We then drew circles on these locations to select all
the trajectory segments that pass through those regions. We found that cars
with very different uncertainty stay in the same neighborhoods, which suggests
that regions have no strong connection with the uncertainty.
We provide one example (Figure 4, Figure 5) to explain our rationale. M21 is a
mobile sensor with large uncertainty at all times. It travels back and forth
between locations A and B. When we select the regions around A and B, all
trajectory segments that pass through them are listed at the top of the
heatmap. M21 coexists there with many other sensors that have low uncertainty.
This is not an isolated case; the pattern appears throughout the dataset.

Figure 4

Figure 5
c. In Figure 3, we identify five events in total over the whole period and
circle them in red. During each event, both the value and the uncertainty
fluctuate strongly. After the third event (a major earthquake), both the value
lines and the uncertainty lines trend upward. In addition, in the uncertainty
line chart, the gap between the static readings and the mobile readings
becomes larger, which means the earthquake has a larger influence on mobile
sensors than on static sensors.
3 – Given the uncertainty you
observed in question 2, are the radiation measurements reliable enough to
locate areas of concern?
a. Highlight potential
locations of contamination, including the locations of contaminated cars.
Should St. Himark officials be worried about contaminated cars moving around
the city?
b. Estimate how many cars
may have been contaminated when coolant leaked from the Always Safe plant. Use
visual analysis of radiation measurements to determine if any have left the
area.
c. Indicate where you would
deploy more sensors to improve radiation monitoring in the city. Would you
recommend more static sensors or more mobile sensors or both? Use your
visualization of radiation measurement uncertainty to justify your
recommendation.
Limit your responses to 10 images and 1000 words
To help users find contamination sources across the city, we create
interpolation maps from the sensor readings. After trying several techniques,
we settled on kriging interpolation. Two key advantages distinguish kriging
from other interpolation methods (linear interpolation, bilinear
interpolation, inverse distance weighting, etc.):
1. It provides an optimal prediction surface based on the semi-variogram
model.
2. It also provides a measure of confidence in each prediction, namely the
error of the prediction surface.
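To make the two points above concrete, here is a minimal sketch of ordinary kriging at a single target location. It is an illustration under stated assumptions, not the pipeline used for the submission: the exponential semi-variogram and its `sill` and `rng` parameters are hypothetical and not fitted to the challenge data, and the helper names are invented for this example.

```python
import math


def variogram(h, sill=1.0, rng=2.0):
    """Exponential semi-variogram; sill and rng are illustrative, not fitted."""
    return sill * (1.0 - math.exp(-h / rng))


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def ordinary_kriging(points, values, target):
    """Return (prediction, kriging variance) at `target`."""
    n = len(points)

    def dist(p, q):
        return math.hypot(p[0] - q[0], p[1] - q[1])

    # Kriging system: variogram between samples, plus the unbiasedness row.
    a = [[variogram(dist(points[i], points[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    g0 = [variogram(dist(p, target)) for p in points]
    sol = solve(a, g0 + [1.0])
    weights, mu = sol[:n], sol[n]                # mu is the Lagrange multiplier
    prediction = sum(w * v for w, v in zip(weights, values))
    variance = sum(w * g for w, g in zip(weights, g0)) + mu
    return prediction, max(variance, 0.0)


# Four invented sensor positions and smoothed readings.
sensors = [(0, 0), (1, 0), (0, 1), (1, 1)]
smoothed = [10.0, 20.0, 30.0, 40.0]
near_value, near_var = ordinary_kriging(sensors, smoothed, (0.5, 0.5))
far_value, far_var = ordinary_kriging(sensors, smoothed, (10, 10))
```

In this sketch, `far_var` is larger than `near_var`: the variance grows as the target moves away from the sensors, which is the property an uncertainty map relies on.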
We interpolate the radiation field every five minutes from the smoothed
values of the LOESS model, so that users can quickly see which regions of the
city have higher radiation. We also provide an interaction that lets users
verify their hypotheses. Users select a region on the map, and the system
selects all the trajectory segments that pass through it. The selected
trajectories are moved to the top of the trajectory wall, ordered by how much
time they spent in the region. If the radiation readings of these cars rise
every time they pass through the region, we can conclude that the region may
be a contamination source. More details are shown in our video.
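The region selection and reordering can be sketched as follows. This is a simplified illustration, not the system's code: it assumes a circular region, one `(sensor_id, x, y)` tuple per reading, and a hypothetical fixed sampling interval `step_seconds`.

```python
import math
from collections import defaultdict


def rank_sensors_by_time_in_region(samples, center, radius, step_seconds=5):
    """Rank sensors by the time they spent inside a circular region.

    samples: iterable of (sensor_id, x, y), one entry per reading.
    step_seconds: assumed time between consecutive readings (hypothetical).
    Returns [(sensor_id, seconds_in_region), ...], longest stay first.
    """
    seconds = defaultdict(int)
    for sensor_id, x, y in samples:
        if math.hypot(x - center[0], y - center[1]) <= radius:
            seconds[sensor_id] += step_seconds
    # Longest stay first; ties are broken by sensor id for a stable order.
    return sorted(seconds.items(), key=lambda kv: (-kv[1], kv[0]))


# Invented readings: car_a is inside the region twice, car_b once, car_c never.
example = [
    ("car_a", 0.0, 0.0),
    ("car_a", 0.5, 0.0),
    ("car_b", 0.2, 0.1),
    ("car_c", 5.0, 5.0),
    ("car_a", 3.0, 3.0),
]
ranking = rank_sensors_by_time_in_region(example, center=(0.0, 0.0), radius=1.0)
```

The resulting order is the order in which the matching bands would be listed at the top of the trajectory wall.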
a. Based on this
interaction, we provide potential locations of contamination as follows:

Figure 6

Figure 7

Figure 8

Figure 9

Figure 10
Note that the contaminated locations are no longer limited to the area around
the Always Safe power plant. Therefore, St. Himark officials should be worried
about contaminated cars moving around the city.
b. Since there is no clear definition of a contaminated car, we classify cars
by the features of their radiation readings and by the likelihood that they
are contaminated.
C1) The radiation readings are low before the major earthquake, but after the
earthquake they become higher and never go down.
C2) The radiation readings are high all the time, both before and after the
major events.
C3) The radiation readings go up during the events and go down after the
events.
We think these three kinds of readings imply different likelihoods that a car
is contaminated. Cars whose readings are always low are not considered
contaminated.
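The three classes can be expressed as a simple rule on the smoothed readings. The sketch below is only an illustration of that rule: the cut-off `high` is a hypothetical value (the write-up does not state one), and splitting each series into before, during, and after windows is an assumption about how the comparison could be made.

```python
from statistics import mean


def classify_car(before, during, after, high=30.0):
    """Label one car from its smoothed readings around the events.

    `high` is a hypothetical cut-off between low and high readings.
    Returns "C1", "C2", "C3", or None when no class matches
    (for example, when the readings are always low).
    """
    high_before = mean(before) >= high
    high_during = mean(during) >= high
    high_after = mean(after) >= high
    if high_before and high_after:
        return "C2"   # high both before and after the events
    if not high_before and high_after:
        return "C1"   # low before, stays high afterwards
    if not high_before and high_during and not high_after:
        return "C3"   # rises during the events, falls back afterwards
    return None


# Invented readings for four cars.
labels = [
    classify_car([10, 12], [50, 60], [55, 58]),
    classify_car([40, 45], [50, 60], [55, 58]),
    classify_car([10, 12], [50, 60], [11, 13]),
    classify_car([10, 12], [11, 12], [11, 13]),
]
```

A fixed threshold is the simplest possible choice; in the visual workflow described here, the same judgment is made by inspecting the colored bands.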
Based on this classification, we estimate that a total of 23 cars are
contaminated (C1: 3, C2: 12, C3: 8). We list them in Figure 11.

Figure 11
To find out whether they have left the area, we only need to look at the
endpoint of each colored band. If a colored band ends with a gray band, we may
guess that the car has left, because there are no trajectory data after its
last move. If there are still trajectory data after the last move, we may
conclude that the car is still on the island. This method quickly narrows the
search space, and we can use animation to check our judgment further. The
interaction steps are shown in our video. We find that five contaminated cars
have left the island; the others are still on the island. We highlight the
cars that have left in Figure 11: M48, M49, M45, M21, and M22.
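The endpoint check can be approximated in code as a test on each car's last trajectory timestamp. This is a rough sketch under assumptions: `max_gap` is a hypothetical tolerance, timestamps are plain numbers in seconds, and the car names and times are invented.

```python
def cars_without_recent_data(last_seen, period_end, max_gap):
    """Return ids of cars whose trajectory data stop well before period_end.

    last_seen: dict mapping car id to the timestamp of its last sample.
    max_gap: hypothetical tolerance; a longer silence counts as "may have left".
    """
    return sorted(car for car, t in last_seen.items() if period_end - t > max_gap)


# Invented timestamps in seconds since the start of the recording.
last_samples = {"car_a": 3600, "car_b": 4000, "car_c": 9990}
candidates = cars_without_recent_data(last_samples, period_end=10000, max_gap=600)
```

Such a filter only produces candidates; as described above, the animation is still used to confirm whether a car actually left the island.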
c. We use the kriging uncertainty map to guide us to the blind spots of
monitoring. The kriging uncertainty map shows how confident the predictions
are: the darker the orange, the more uncertain the result. Regions that remain
dark most of the time are persistent blind spots of monitoring, so we suggest
deploying more sensors in these locations. We recommend deploying more static
sensors, because they are more reliable than mobile sensors and, unlike
contaminated cars, they cannot spread contamination. We show several major
blind spots in Figure 12. An animation supporting this judgment can be found
in our video.

Figure 12
4 –– Summarize the state of radiation measurements at the end of the
available period. Use your novel visualizations and analysis approaches to
suggest a course of action for the city. Use visual analytics to compare the
static sensor network to the mobile sensor network. What are the strengths and
weaknesses of each approach? How do they support each other? Limit your
response to 6 images and 800 words.

Figure 13
We focus only on the last event at the end of the available period and select
all the cars whose readings change noticeably during that time. These mobile
sensors are M32, M42, M44, M37, M41, and M18. We label their locations on the
map in Figure 13. They are located exactly at the contamination sources we
found in question 3: M32 is in L1; M42, M44, M37, and M41 are in L3; and M18
is in L4. Note that the readings of M32, M37, M41, and M18 all rise during
this event, whereas M42 and M44 have remained high since the major earthquake.
In contrast, the static sensors near the three contamination sources (S13,
S14, and S6) show no noticeable changes. The reason may be that these static
sensors are still not close enough to the contamination sources. In this case,
static sensors work more like baselines that represent background
contamination.
Therefore, we suggest that the emergency management team quickly evacuate
citizens from the areas around L1, L3, and L4 and then stop the contaminated
cars (M32, M42, M44, M37, M41, and M18) from moving around. The team should
also take action to clean up the coolant and the contaminated cars in those
areas.
We summarize the strengths and weaknesses of static sensors and mobile
sensors:
| | Strength | Weakness |
|---|---|---|
| Static Sensors | More stable and reliable | Limited to specific locations; cannot cover a large area |
| Mobile Sensors | Move freely and can cover a large area | Less stable and less reliable; the readings can be affected by contaminated cars |
5 –– The data for this
challenge can be analyzed either as a static collection or as a dynamic stream
of data, as it would occur in a real emergency.
Describe how you analyzed the data - as a static collection or a stream. How do you think this choice affected your
analysis? Limit your response to 200 words and 3 images.
We analyzed the data as a static collection. However, our technique can also
be applied in a streaming setting. On the one hand, the LOESS model is a local
regression model, which can be adapted to a progressive setting. On the other
hand, we interpolate the radiation field at every time step. Therefore, the
underlying data processing steps can all be applied in a streaming setting.