Entry Name:  TJU-Jia-MC2

VAST Challenge 2019
Mini-Challenge 2

 

 

Team Members:

Shichao Jia, Tianjin University, jsc_se@tju.edu.cn     PRIMARY
Jiaqi Wang, Tianjin University, qimelbourne@gmail.com 

Zeyu Li, Tianjin University, lzytianda@tju.edu.cn

Jiawan Zhang, Tianjin University, jwzhang@tju.edu.cn   SUPERVISOR



Student Team:  YES

 

Tools Used:

Python

R

D3.js

 

Approximately how many hours were spent working on this submission in total?

About 150 hours (30 days at 5 hours per day)

 

May we post your submission in the Visual Analytics Benchmark Repository after VAST Challenge 2019 is complete? 

YES

 

Video

https://youtu.be/GXY44zXsXEU  

 

 

Questions

Your task, as supported by visual analytics that you apply, is to help St. Himark's emergency management team combine data from the government-operated stationary monitors with data from citizen-operated mobile sensors to help them better understand conditions in the city and identify likely locations that will require further monitoring, cleanup, or even evacuation. Will data from citizen scientists clarify the situation or make it more uncertain? Use visual analytics to develop responses to the questions below. Novel visualizations of uncertainty are especially interesting for this mini-challenge.

1 – Visualize radiation measurements over time from both static and mobile sensors to identify areas where radiation over background is detected. Characterize changes over time. Limit your response to 6 images and 500 words.

 

We first used scatter plots to visualize the radiation measurements. However, the raw readings are very noisy: they jump around constantly and do not change smoothly, which makes the data hard to analyze. We therefore decided to smooth the data first and to quantify its uncertainty. After trying several techniques, we chose the LOWESS / LOESS model (locally weighted regression, also known as locally estimated scatterplot smoothing or locally weighted scatterplot smoothing). It has two key advantages over other methods such as bin smoothing and kernel smoothing:

1. The LOESS model is nonparametric, so we do not need to specify in advance a function for the data to fit.

2. The LOESS model provides a confidence interval that we can use to evaluate the uncertainty.

Figure 1 shows one example of the LOESS model. The gray points are the original data, the black line is the smoothed value over time, and the blue band is the 95% confidence interval.

Figure 1
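The smoothing and confidence band shown in Figure 1 can be approximated with a minimal sketch like the one below. It is not our production code: it assumes a local linear fit with tricube weights, estimates the noise level from the residuals, and uses a normal approximation (z = 1.96) for the 95% band. The function name `loess_with_band` and the defaults `frac` and `z` are illustrative assumptions.

```python
import math

def loess_with_band(x, y, frac=0.3, z=1.96):
    """Local linear LOESS evaluated at every x[i].

    Returns (fit, lower, upper). `frac` (share of points in each local
    window) and `z` (1.96 for a roughly 95% band) are illustrative defaults.
    """
    n = len(x)
    k = min(n, max(4, math.ceil(frac * n)))  # points per local window
    fit, norms, trace = [], [], 0.0
    for i, x0 in enumerate(x):
        # Bandwidth: distance to the k-th nearest neighbour of x0.
        h = sorted(abs(xj - x0) for xj in x)[k - 1] or 1.0
        # Tricube weights; points at or beyond the bandwidth get weight 0.
        w = [(1.0 - min(abs(xj - x0) / h, 1.0) ** 3) ** 3 for xj in x]
        s0 = sum(w)
        s1 = sum(wj * (xj - x0) for wj, xj in zip(w, x))
        s2 = sum(wj * (xj - x0) ** 2 for wj, xj in zip(w, x))
        denom = s0 * s2 - s1 * s1
        if denom > 1e-12:
            # Weighted least-squares line, expressed as weights on y.
            l = [wj * (s2 - (xj - x0) * s1) / denom for wj, xj in zip(w, x)]
        else:
            # Degenerate window: fall back to a weighted mean.
            l = [wj / s0 for wj in w]
        fit.append(sum(lj * yj for lj, yj in zip(l, y)))
        norms.append(math.sqrt(sum(lj * lj for lj in l)))
        trace += l[i]
    # Noise level from residuals, with n - trace as approximate degrees of freedom.
    rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fit))
    sigma = math.sqrt(rss / max(n - trace, 1.0))
    lower = [f - z * sigma * m for f, m in zip(fit, norms)]
    upper = [f + z * sigma * m for f, m in zip(fit, norms)]
    return fit, lower, upper
```

The width `upper[i] - lower[i]` plays the role of the blue band. This direct version is quadratic in the number of readings, so a long series would need downsampling or a library implementation.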

 

We use this model to summarize every time series, which greatly simplifies our analysis pipeline. To compare the results, we designed a colored band to summarize each series: color encodes the smoothed value, and the width of the band encodes the confidence interval (the uncertainty).

We group all the colored bands in a trajectory wall for easy comparison. Note that a gray line behind a colored band indicates that the car is moving.

Figure 2

2 – Use visual analytics to represent and analyze uncertainty in the measurement of radiation across the city.

a.       Compare uncertainty of the static sensors to the mobile sensors. What anomalies can you see? Are there sensors that are too uncertain to trust?

b.      Which regions of the city have greater uncertainty of radiation measurement? Use visual analytics to explain your rationale.

c.       What effects do you see in the sensor readings after the earthquake and other major events? What effect do these events have on uncertainty?

Limit your responses to 12 images and 1000 words.

 

Besides the trajectory wall, we provide a line chart to summarize the time series. Each line represents the readings of one sensor over time. Orange lines represent static sensors and gray lines represent mobile sensors. Users can switch between the smoothed value and the uncertainty. The results are shown in Figure 3.

Figure 3

a.       In Figure 3, most gray lines lie above the orange lines in both the value line chart and the uncertainty line chart. Overall, static sensors therefore have both lower values and lower uncertainty than mobile sensors. In addition, except for the time ranges circled in red, the value lines are relatively stable, so we think the readings can still be trusted.

b.      Several factors influence the uncertainty of the radiation measurements. First, the background radiation of the environment may change the measurements: different objects (buildings, trees, etc.) emit different amounts of radiation, which may affect the uncertainty differently. Second, the sensor devices may have their own system uncertainty, so different mobile sensors have different uncertainty. Third, uncertainty is affected by events such as earthquakes; we can clearly see that the five events have a great influence on the total uncertainty. However, the final readings mix all of these kinds of uncertainty together, so it is hard to tell whether the uncertainty comes from the regions or from the sensors themselves.

Our hypothesis is that device uncertainty, rather than uncertainty caused by regions, is the main factor in the final uncertainty. The hypothesis is motivated by the fact that most sensors have a generally stable band width (uncertainty) no matter where they go (see Figure 2). To check it, we examined each colored band with a relatively wide width. We identified the locations where these cars usually stay and drew circles on those locations to select all the trajectory segments that pass through the regions. We found that cars with very different uncertainty can stay in the same neighborhoods, which means that regions have no strong connection with the uncertainty.

We provide one example (Figure 4, Figure 5) to explain our rationale. M21 is a mobile sensor with large uncertainty at all times. It travels back and forth between locations A and B. When we select the regions around A and B, all trajectory segments that pass through the regions are listed at the top of the heatmap. M21 coexists there with many other sensors that have low uncertainty. This example is not an isolated case; the phenomenon appears throughout the dataset.

Figure 4

Figure 5

c.       In Figure 3, we identify five events in total over the whole period and circle them in red. During each event, both the value and the uncertainty fluctuate strongly. After the third event (a major earthquake), both the value lines and the uncertainty lines trend upward. In addition, in the uncertainty line chart, the gap between the static readings and the mobile readings becomes larger, which means that the earthquake has a larger influence on mobile sensors than on static sensors.

3 – Given the uncertainty you observed in question 2, are the radiation measurements reliable enough to locate areas of concern?

a.       Highlight potential locations of contamination, including the locations of contaminated cars. Should St. Himark officials be worried about contaminated cars moving around the city?

b.      Estimate how many cars may have been contaminated when coolant leaked from the Always Safe plant. Use visual analysis of radiation measurements to determine if any have left the area.

c.       Indicate where you would deploy more sensors to improve radiation monitoring in the city. Would you recommend more static sensors or more mobile sensors or both? Use your visualization of radiation measurement uncertainty to justify your recommendation.

Limit your responses to 10 images and 1000 words

 

To help users find contamination sources across the city, we create interpolation maps from the sensor readings. After trying several techniques, we finally chose the kriging interpolation method. It has two key advantages over other interpolation methods (linear interpolation, bilinear interpolation, inverse distance weighting, etc.):

1. It provides an optimal prediction surface based on the semi-variogram model.

2. It also provides a measure of confidence in each prediction, namely the error of the prediction surface.
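Both properties can be seen in a minimal ordinary kriging sketch. The version below is only an illustration, not the code behind our maps: it assumes an exponential semi-variogram with fixed, hypothetical parameters (`sill`, `rng`, `nugget`), whereas in practice the variogram would be fitted to the data. It returns the prediction and the kriging variance for a single target location.

```python
import math

def exp_variogram(h, sill=1.0, rng=10.0, nugget=0.0):
    """Exponential semi-variogram; the parameter values are assumptions."""
    if h == 0:
        return 0.0
    return nugget + sill * (1.0 - math.exp(-h / rng))

def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for j in range(c, n + 1):
                m[r][j] -= f * m[c][j]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][j] * x[j] for j in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def ordinary_kriging(points, values, target, variogram=exp_variogram):
    """Return (prediction, kriging variance) at `target`."""
    n = len(points)
    # Kriging system: variogram matrix bordered by the unbiasedness constraint.
    a = [[variogram(math.dist(points[i], points[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    b = [variogram(math.dist(p, target)) for p in points] + [1.0]
    sol = solve(a, b)
    weights, mu = sol[:n], sol[n]  # mu is the Lagrange multiplier
    prediction = sum(w * z for w, z in zip(weights, values))
    variance = sum(w * g for w, g in zip(weights, b[:n])) + mu
    return prediction, max(variance, 0.0)
```

With a zero nugget the prediction reproduces the sensor value at a sensor location and the variance there is zero; the variance grows with distance from the sensors, which is what an uncertainty map visualizes.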

We use the smoothed values from the LOESS model to interpolate the radiation field every five minutes, so that users can quickly see which regions of the city have higher radiation. We also provide an interaction that lets users verify their hypotheses. Users can select a region on the map, and our system then selects all the trajectory segments that pass through that region. The selected trajectories are moved to the top of the trajectory wall, ordered by how much time they spent in the region. If the radiation readings of these cars rise every time they pass through the region, we can conclude that the region may be a contamination source. More details are shown in our video.
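The region selection and reordering can be sketched as follows. This is a simplified illustration under stated assumptions: the selected region is a circle, each track is a time-ordered list of `(t, x, y)` samples, and the time spent in the region is approximated by the intervals that start inside the circle. The function name `rank_sensors_by_dwell` and the data layout are hypothetical.

```python
import math

def rank_sensors_by_dwell(tracks, center, radius):
    """Rank sensors by the time they spent inside a circular region.

    tracks: {sensor_id: [(t, x, y), ...]} with samples sorted by time.
    Returns [(sensor_id, dwell_time), ...], longest dwell first; sensors
    that never enter the region are omitted.
    """
    dwell = {}
    for sensor_id, pts in tracks.items():
        total = 0.0
        # Credit each interval to the region if it starts inside the circle.
        for (t0, x0, y0), (t1, _, _) in zip(pts, pts[1:]):
            if math.dist((x0, y0), center) <= radius:
                total += t1 - t0
        if total > 0:
            dwell[sensor_id] = total
    return sorted(dwell.items(), key=lambda kv: kv[1], reverse=True)
```

The returned order is the order in which the matching bands would be placed at the top of the trajectory wall.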

a.       Based on this interaction, we provide potential locations of contamination as follows:

Figure 6

Figure 7

Figure 8

Figure 9

Figure 10

Note that the contaminated locations are no longer limited to the regions around the Always Safe power plant. Therefore, St. Himark officials should be worried about contaminated cars moving around the city.

b. Since there is no clear definition of a contaminated car, we classify cars based on the features of their radiation readings and the likelihood that they are contaminated.

C1) The radiation readings are low before the major earthquake, but after the earthquake they become higher and never go down.

C2) The radiation readings are high at all times, both before and after the major events.

C3) The radiation readings go up during the events and go down after the events.

We think these kinds of readings imply different levels of likelihood that a car is contaminated. Cars whose readings are always low cannot be contaminated cars.

Based on this consideration, we estimate that 23 cars in total are contaminated (C1: 3, C2: 12, C3: 8). We list them in Figure 11.

Figure 11
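The three categories C1, C2, and C3 can be expressed as a simple rule, sketched below. We assigned the categories by visual inspection, so this is only an illustration: the split into "before", "during", and "after" phases, the use of the mean smoothed reading per phase, and the threshold `high` are all assumptions.

```python
def classify_car(before, during, after, high=50.0):
    """Assign a car to C1, C2, C3, 'low', or 'unclassified'.

    before, during, after: non-empty lists of smoothed readings for each phase.
    high: hypothetical threshold separating low from high readings.
    """
    def is_high(readings):
        return sum(readings) / len(readings) > high

    b, d, a = is_high(before), is_high(during), is_high(after)
    if b and a:
        return "C2"  # high at all times
    if a:
        return "C1"  # low before, high afterwards
    if d and not b:
        return "C3"  # rises during the event, falls back afterwards
    if not b and not d:
        return "low"  # always low: not considered contaminated
    return "unclassified"
```

For example, `classify_car([10, 12], [80, 90], [11, 13])` falls into C3 because only the readings during the event exceed the threshold.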

To find out whether they have left the area, we only need to look at the endpoint of each colored band. If a colored band ends with a gray band, the car was still moving at its last record and there are no trajectory data after that move, so we infer that the car has left. If there are still trajectory data after its last move, we conclude that the car is still on the island. This method quickly narrows the search space, and we can use animation to check our judgment further. The interaction steps are shown in our video. We find that five contaminated cars have left the island and the others are still on the island. We highlight the cars that have left in Figure 11: M48, M49, M45, M21, and M22.
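The endpoint check can be written as a small heuristic. The sketch below assumes that a track is a time-ordered list of `(t, x, y)` samples and treats a car as "likely left" when its final recorded step is a move, that is, when no stationary samples follow the last change of position. The function name and the tolerance `eps` are hypothetical, and the result is only a candidate list to be confirmed with the animation.

```python
def likely_left(track, eps=1e-6):
    """Return True if the last recorded step of the track is a move.

    track: [(t, x, y), ...] sorted by time.
    eps: hypothetical tolerance below which two positions count as equal.
    """
    if len(track) < 2:
        return False  # not enough data to tell
    (_, x0, y0), (_, x1, y1) = track[-2], track[-1]
    return abs(x1 - x0) > eps or abs(y1 - y0) > eps
```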

c. We use the kriging uncertainty map to guide us to the blind spots of monitoring. The kriging uncertainty map depicts how confident the predictions are: the darker the orange, the more uncertain the result. Regions that remain dark most of the time are persistent blind spots, so we suggest deploying more sensors in these locations. We also recommend deploying more static sensors, because they are more reliable than mobile sensors and, unlike contaminated cars, they cannot spread contamination. We show several major blind spots in Figure 12. The animation in our video can be used to check our judgment.

Figure 12

4 – Summarize the state of radiation measurements at the end of the available period. Use your novel visualizations and analysis approaches to suggest a course of action for the city. Use visual analytics to compare the static sensor network to the mobile sensor network. What are the strengths and weaknesses of each approach? How do they support each other? Limit your response to 6 images and 800 words.

 

Figure 13

We focus only on the last event at the end of the available period and select all the cars whose readings change noticeably during that time: M32, M42, M44, M37, M41, and M18. We label their locations on the map in Figure 13. These mobile sensors are located exactly at the contamination sources we found in question 3: M32 is in L1; M42, M44, M37, and M41 are in L3; and M18 is in L4. The readings of M32, M37, M41, and M18 all rise during this event, whereas M42 and M44 have remained high since the major earthquake.

However, the static sensors near the three contamination sources (S13, S14, and S6) show no noticeable changes. The reason may be that these static sensors are still not close enough to the contamination sources. In this case, the static sensors act more like baselines that represent background contamination.

Therefore, we suggest that the emergency management team quickly evacuate citizens from the areas around L1, L3, and L4, and then stop the contaminated cars (M32, M42, M44, M37, M41, and M18) from moving around. The team should also take action to clean up the coolant and the contaminated cars in those areas.

We summarize the strengths and weaknesses of static sensors and mobile sensors as follows:

                  Strength                                 Weakness

Static Sensors    More stable and reliable                 Limited to specific locations; cannot cover a large area

Mobile Sensors    More flexible; can cover a large area    More unstable and unreliable; readings can be affected by contaminated cars

 

5 – The data for this challenge can be analyzed either as a static collection or as a dynamic stream of data, as it would occur in a real emergency.  Describe how you analyzed the data - as a static collection or a stream.  How do you think this choice affected your analysis? Limit your response to 200 words and 3 images.

 

We analyzed the data as a static collection. However, our technique can also be applied in a streaming setting. On the one hand, the LOESS model is a local regression model, which can be adapted to a progressive setting. On the other hand, we interpolate the radiation field at every time step. Therefore, the underlying data processing steps can all be applied in a streaming setting.
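One way the local regression could be adapted to a stream is to keep only a sliding window of recent readings and refit a local line whenever a new reading arrives. The sketch below is a hypothetical adaptation, not something we implemented: the class name `StreamingLocalFit`, the window size, and the one-sided tricube weighting are all assumptions.

```python
from collections import deque

class StreamingLocalFit:
    """Local linear fit over a sliding window of the most recent readings."""

    def __init__(self, window=30):
        self.buf = deque(maxlen=window)  # `window` is an illustrative size

    def update(self, t, value):
        """Add one reading (timestamps increasing) and return the smoothed value at t."""
        self.buf.append((t, value))
        h = (t - self.buf[0][0]) or 1.0  # window span used as bandwidth
        s0 = s1 = s2 = sy = sxy = 0.0
        for ti, vi in self.buf:
            d = ti - t
            w = (1.0 - min(abs(d) / h, 1.0) ** 3) ** 3  # tricube weight
            s0 += w
            s1 += w * d
            s2 += w * d * d
            sy += w * vi
            sxy += w * d * vi
        denom = s0 * s2 - s1 * s1
        if denom > 1e-12:
            # Intercept of the weighted least-squares line at time t.
            return (s2 * sy - s1 * sxy) / denom
        return sy / s0  # too few points for a line: weighted mean
```

Each update costs time proportional to the window size, independent of how long the stream has been running.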