Entry Name: TJU-Jia-MC1
Shichao Jia, Tianjin University, jsc_se@tju.edu.cn PRIMARY
Jiaqi Wang, Tianjin University, qimelbourne@gmail.com
Zeyu Li, Tianjin University, lzytianda@tju.edu.cn
Jiawan Zhang, Tianjin University, jwzhang@tju.edu.cn SUPERVISOR
Student Team: YES
Python
D3.js
Approximately how many hours were spent working on this submission in total?
About 150 hours (30 days at 5 hours/day)
May we post your submission in the Visual Analytics
Benchmark Repository after VAST Challenge 2019 is complete?
YES
Video
Questions
1 – Emergency responders will
base their initial response on the earthquake shake map. Use visual analytics
to determine how their response should change based on damage reports from citizens
on the ground. How would you prioritize neighborhoods for response? Which parts
of the city are hardest hit? Limit your response to 1000 words and 10 images.
We use the heatmap to summarize the multivariate time series. Each cell shows
either the uncertainty or the mean value of one dimension of the data in one
hour. Users can select any time range in the line chart to filter the data; the
rows of the heatmap are then reordered by overall severity. We use the
following metric to measure the total damage:

$$S = \sum_{d} \frac{1}{|T|} \sum_{t \in T} m_{d,t}$$

in which $d$ stands for one dimension (shake_intensity, sewer_and_water, power,
roads_and_bridges, medical or buildings), $m_{d,t}$ is the mean value of each
cell, $T$ is the selected time range of hours $t$, and $|T|$ is the set
cardinality.
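As a minimal sketch of how such a severity score could be computed for one neighborhood, assume each report is a dictionary with an `hour` key plus one key per dimension, and that the score is the sum over dimensions of the hourly cell means averaged over the selected hours. The function name `severity`, the record layout, and the choice to average only over selected hours that actually contain reports are assumptions made for illustration, not details confirmed by the submission.

```python
from collections import defaultdict
from statistics import mean

DIMENSIONS = ["shake_intensity", "sewer_and_water", "power",
              "roads_and_bridges", "medical", "buildings"]

def severity(reports, hours, dimensions=DIMENSIONS):
    """Severity of one neighborhood over the selected hours.

    reports: iterable of dicts, e.g. {"hour": 3, "power": 7, ...}
             (hypothetical layout; missing dimensions are skipped).
    hours:   set of selected hours (the brushed time range).
    """
    # Group reported levels into heatmap cells keyed by (dimension, hour).
    cells = defaultdict(list)
    for r in reports:
        if r["hour"] in hours:
            for d in dimensions:
                if r.get(d) is not None:
                    cells[(d, r["hour"])].append(r[d])

    score = 0.0
    for d in dimensions:
        # Mean value of every non-empty cell of this dimension.
        cell_means = [mean(v) for (dim, _), v in cells.items() if dim == d]
        if cell_means:
            # Assumption: average over hours that have at least one report.
            score += mean(cell_means)
    return score
```

Sorting neighborhoods by this score in descending order would give a row order of the kind described above.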
The color of the hexagons on the map encodes the severity measurement. The
hexagons aligned with the heatmap show the data distribution of each dimension.
Higher levels are laid out at the outer ring of a hexagon and lower levels
toward its center. The color encodes the frequency of each level for each
dimension. This kind of visualization provides a compact depiction of both
value and uncertainty. If the data are distributed evenly across levels, the
uncertainty is high; conversely, a concentrated distribution means low
uncertainty. Besides, severe damage appears at the outer ring of the hexagons.

Figure 1
In Figure 1, the line charts, which show how many citizens in each neighborhood
upload reports through the app every hour, have three salient peaks. We select
the two major peaks as examples, since they correspond to two earthquakes in
St. Himark. Notice that several reports are received with a delay because of
the power outages after each earthquake.
We first select a time range during the major earthquake in which the number of
reporting citizens is above forty. This interaction lets us filter out time
intervals that do not belong to the earthquake and include the delayed
intervals that do belong to it. In general, we suggest that emergency
responders pay more attention to the neighborhoods closest to the regions where
the earthquake happens (such as Old Town, Easton, and Safe Town) and to the
southern regions of St. Himark (such as Scenic Vista, Broadview, and Wilson
Forest). Besides, the hexagons beside the heatmap give a sense of uncertainty:
the more evenly the data are distributed, the more uncertain they are.
Therefore, although the top three neighborhoods are damaged hard, their reports
may not be reliable. The following five neighborhoods, however, are damaged
hard and their reports are reliable. We provide more analysis in Question 2.
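The time-range selection described above can be sketched as a simple filter, assuming the brushed range is a pair of hour indices and the threshold of forty applies to the number of reports received per hour. The function name `select_hours` and its parameters are hypothetical and only illustrate the idea.

```python
from collections import Counter

def select_hours(report_hours, start, end, min_reports=40):
    """Return the hours in [start, end] with more than `min_reports` reports.

    report_hours: one hour index per received report (hypothetical layout).
    """
    # Count reports per hour inside the brushed range only.
    counts = Counter(h for h in report_hours if start <= h <= end)
    # Keep hours whose report count exceeds the threshold.
    return {h for h, n in counts.items() if n > min_reports}
```

Hours inside the brushed range that fall below the threshold are dropped, which is one way to exclude quiet intervals while still keeping busy hours that arrive after a delay.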

Figure 2
More specifically, by toggling different dimensions, emergency responders can
respond to different targeted events. Figure 2 shows the results as response
maps, one per dimension. For shake intensity, the damage reports are consistent
with the shake map: Old Town, Wilson Forest, Pepper Mill, and Safe Town are the
neighborhoods that shake most. For sewer and water, the result is slightly
different: Scenic Vista, Broadview, Old Town, Easton, and Terrapin Springs are
top-ranked. For the other dimensions, readers can refer to Figure 2.

Figure 3
Then, we select a time interval during the last earthquake in which the number
of reporting citizens is above forty, so that the two delays are included
(Figure 3). We show the individual dimensions in Figure 4. Overall, Old Town,
Scenic Vista, and Broadview are always ranked at the top across the different
dimensions.

Figure 4
2 – Use visual analytics to
show uncertainty in the data. Compare the reliability of neighborhood reports. Which
neighborhoods are providing reliable reports? Provide a rationale for your
response. Limit your response to 1000 words and 10 images.
We use normalized information entropy to evaluate the uncertainty of the
neighborhood reports, because it is more suitable for ordinal survey data. The
normalized entropy is calculated as follows:

$$H_d = -\sum_{i=0}^{10} p_{d,i} \log_{11} p_{d,i}$$

We choose base 11 (there are 11 measurement levels, from 0 to 10) to normalize
the information entropy so that its range is [0, 1]. Here $d$ stands for one
dimension (shake_intensity, sewer_and_water, power, roads_and_bridges, medical
or buildings) and $p_{d,i}$ is the fraction of reports that give level $i$ for
dimension $d$. The higher the entropy, the more uncertain the reports are,
which means citizens' reports are less unified or consistent. We encode entropy
in blue: dark blue means high entropy and light blue means low entropy.
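A minimal sketch of the normalized entropy of one heatmap cell, assuming the input is the list of levels (0 to 10) reported for one dimension in one hour. The function name `normalized_entropy` is hypothetical, and returning `None` for an empty cell is an assumption, since the submission does not say how cells without reports are handled.

```python
import math
from collections import Counter

LEVELS = 11  # measurement levels 0..10, used as the logarithm base

def normalized_entropy(values, levels=LEVELS):
    """Shannon entropy of the reported levels, in base `levels`, so in [0, 1]."""
    n = len(values)
    if n == 0:
        return None  # assumption: entropy is undefined for an empty cell
    entropy = 0.0
    for count in Counter(values).values():
        p = count / n  # fraction of reports at this level
        entropy -= p * math.log(p, levels)
    return entropy
```

When every citizen reports the same level the result is 0, and when the reports are spread evenly over all eleven levels it is approximately 1, which matches the reading of high entropy as low agreement.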
We select a time range that includes the last two major earthquakes. Rows in
the heatmap are reordered based on the mean entropy:

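$$\bar{H}_d = \frac{1}{|T|} \sum_{t \in T} H_{d,t}$$

in which $H_{d,t}$ is the normalized entropy of dimension $d$ in hour $t$, $T$
is the selected time range of hours $t$, and $|T|$ is the set cardinality. Mean
entropy evaluates the overall uncertainty of the selected data for each
neighborhood.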
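The row ordering by mean entropy could be sketched as follows for a single dimension, assuming each report is a dictionary with `neighborhood` and `hour` keys plus the dimension's level. The names `rank_by_mean_entropy` and `normalized_entropy`, the record layout, and averaging only over selected hours that contain reports are assumptions for illustration.

```python
import math
from collections import Counter, defaultdict

def normalized_entropy(values, levels=11):
    """Base-11 Shannon entropy of a non-empty list of levels (0..10)."""
    n = len(values)
    return -sum((c / n) * math.log(c / n, levels)
                for c in Counter(values).values())

def rank_by_mean_entropy(reports, hours, dimension):
    """Return (neighborhood, mean entropy) pairs, most uncertain first."""
    # Collect reported levels per heatmap cell (neighborhood, hour).
    cells = defaultdict(list)
    for r in reports:
        if r["hour"] in hours and r.get(dimension) is not None:
            cells[(r["neighborhood"], r["hour"])].append(r[dimension])

    # Entropy of each cell, grouped by neighborhood.
    per_hood = defaultdict(list)
    for (hood, _), levels in cells.items():
        per_hood[hood].append(normalized_entropy(levels))

    # Assumption: average over the selected hours that have reports.
    means = {hood: sum(hs) / len(hs) for hood, hs in per_hood.items()}
    return sorted(means.items(), key=lambda kv: kv[1], reverse=True)
```

Neighborhoods at the head of the returned list have the least consistent reports in the selected range, and those at the tail have the most consistent ones.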

Figure 5
The result is shown in Figure 5. Notice that the data are distributed more
evenly in the hexagons at the top than in those at the bottom. Therefore, we
can conclude that Southton, Cheddarford, Palace Hills, etc. provide more
uncertain reports than the others. In contrast, the neighborhoods at the end of
the heatmap provide more reliable reports; these include West Parton, Oak
Willow, and Pepper Mill.
3 – How do conditions change
over time? How does uncertainty in change over time? Describe the key changes
you see. Limit your response to 500 words and 8 images.
In Figure 6, we notice three major peaks over the whole period: citizens post
many reports on Monday afternoon, Wednesday morning, and Thursday afternoon.
The heatmap shows that the uncertainty decreases during these peaks.
We select only the last two earthquakes for more detail (Figure 7). Notice that some neighborhoods have more uncertainty in their medical reports: Scenic Vista, Weston, Easton, Northwest, East Parton, Pepper Mill, and Chapparal. These neighborhoods happen to be the regions without hospitals.
Figure 6

Figure 7
Besides, we find several breakpoints after each major earthquake. Supposing
that these are the delays caused by power outages, we select each time range
for more detail. We first select a time range after the first major earthquake
(Figure 8). Notice that power in all four selected neighborhoods (Broadview,
Old Town, Scenic Vista, Chapparal) has been damaged hard. Besides, Broadview
and Scenic Vista face a serious emergency in sewer and water, power, and roads
and bridges, although their shake intensity is not higher than Old Town's. This
implies that the utilities in these neighborhoods may be lagging.
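One way to locate such breakpoints automatically is to look for runs of hours in which a neighborhood sends no reports at all. The sketch below assumes that a breakpoint shows up as a silent run in the hourly series; the function name `find_gaps` and the input layout are hypothetical.

```python
def find_gaps(report_hours, start, end):
    """Return (first_hour, last_hour) runs in [start, end] without any report.

    report_hours: hour indices at which at least one report was received
                  (hypothetical layout; duplicates are allowed).
    """
    seen = set(report_hours)
    gaps = []
    run_start = None  # first hour of the silent run currently being scanned
    for hour in range(start, end + 1):
        if hour not in seen:
            if run_start is None:
                run_start = hour
        elif run_start is not None:
            # A report arrived, so the silent run ended at the previous hour.
            gaps.append((run_start, hour - 1))
            run_start = None
    if run_start is not None:
        gaps.append((run_start, end))
    return gaps
```

The hours immediately after each returned run are natural candidates for the time ranges inspected in Figures 8 and 9, since delayed reports would arrive there.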

Figure 8
Then we move the time range to the last two breakpoints (Figure 9). We find
that Old Town and Scenic Vista are both damaged hardest in sewer and water,
power, and roads and bridges.
Figure 9
4 – The data for this challenge can be analyzed either as a static
collection or as a dynamic stream of data, as it would occur in a real
emergency. Describe how you analyzed the data - as a static collection or a
stream. How do you think this choice affected your analysis? Limit your
response to 200 words and 3 images.
We currently analyze the data as a static collection, although our system can
also be applied to a stream. Analyzing the data as a dynamic stream suits a
real emergency, whereas analyzing it as a static collection provides a whole
picture of the events. In this scenario, most decisions may not differ much
whether we analyze the data as a static collection or as a stream. The
difference appears when data is delayed. For example, after each major
earthquake, several neighborhoods cannot upload reports in time because of the
power outages. If the application is streaming, we get no data at that moment,
so we have no idea what situation these neighborhoods are in, and the system
suddenly cannot help users make decisions. In contrast, we can certainly
analyze the data post hoc and understand the situation, but by then the
findings may be outdated because the earthquakes have already happened. It is
therefore a real dilemma, and more work is needed on this scenario.