Rodrigo Santos do Amor Divino Lima, Universidade Federal do Pará - LABVIS, rodrigodivino.ufpa@gmail.com PRIMARY
Bianchi Serique Meiguins, Universidade Federal do Pará - LABVIS, bianchi@ufpa.br
Carlos Gustavo Resque dos Santos, Universidade Federal do Pará - LABVIS, gustavo.resque@gmail.com
Student Team: NO
VisLinks, developed by LABVIS - Laboratory
of Visualization, Interaction, and Intelligent Systems. Developed for the challenge
and further adapted for other team projects.
D3.js visualization kernel - developed by Mike Bostock and contributors
Approximately how many
hours were spent working on this submission in total?
120
May we post your submission
in the Visual Analytics Benchmark Repository after VAST Challenge 2019 is
complete? YES
Video:
Two available links (same video on both)
Download (Google Drive): https://drive.google.com/open?id=1QbHe8G96KjP0hwholvYuK1QZexM0Bcd5
Watch Online (YouTube): https://youtu.be/f4k25Djk8HI
Questions
1 – Emergency
responders will base their initial response on the earthquake shake map. Use
visual analytics to determine how their response should change based on damage reports
from citizens on the ground. How would you prioritize neighborhoods for
response? Which parts of the city are hardest hit? Limit your response to
1000 words and 10 images.
We developed a tool that has four main views,
as shown in Figure 1.

Figure 1: Tool overview: (i) a horizon-histogram grid of the reports, normalized by the rate of messages per minute; (ii) streamgraphs that show how the reports change over time; (iii) choropleth maps that show the median of the reports for each location and report type; (iv) an earthquake timeline showing the earthquake intensity throughout the event.
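As an illustration of the aggregation behind the choropleth maps, the following is a minimal sketch of a per-location, per-category median. It is not the tool's actual code (the tool is built on D3.js); the record layout, the field names `location`, `power`, and `medical`, and the sample values are assumptions made only for this example.

```python
from collections import defaultdict
from statistics import median


def median_by_location(reports, categories):
    """Return {(location, category): median rating}, skipping missing ratings."""
    buckets = defaultdict(list)
    for report in reports:
        for category in categories:
            value = report.get(category)
            if value is not None:  # citizens may leave a category blank
                buckets[(report["location"], category)].append(value)
    return {key: median(values) for key, values in buckets.items()}


# Made-up sample records, only to exercise the function.
reports = [
    {"location": 8, "power": 9, "medical": 4},
    {"location": 8, "power": 7, "medical": None},
    {"location": 8, "power": 8, "medical": 6},
    {"location": 3, "power": 2, "medical": 1},
]
medians = median_by_location(reports, ["power", "medical"])
```

The median is less sensitive than the mean to a handful of isolated extreme reports, which is one plausible reason to prefer it for a map that drives prioritization.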
The timeline visualization shows three main
moments of the event (Figure 2):
• Moment 1: Monday 06, starting at 2 pm: A
small increase in the number of low-intensity reports
• Moment 2: Wednesday 08, beginning at 10 am:
A large number of reports, with intensities from 1 to 7
• Moment 3: Thursday 09, starting at 2 pm:
Another increase in the number of reports, with intensities from 1 to 4
The timeline also shows anomalies: isolated 5-minute blocks that contain a large number of reports but are discontinuous with the surrounding activity (Figure 2). These are probably caused by power blackouts, as will be seen later in the analysis.
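The anomalies described above were identified visually in the timeline. As a hedged sketch of how such blocks could also be flagged automatically, the code below counts reports per 5-minute block and marks a block as an isolated burst when it is large and both neighboring blocks are nearly empty. The function names, the `min_count` and `neighbor_ratio` parameters, and the sample timestamps are all assumptions for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta

BLOCK = timedelta(minutes=5)


def block_start(ts):
    """Floor a timestamp to the start of its 5-minute block."""
    return ts - timedelta(
        minutes=ts.minute % 5, seconds=ts.second, microseconds=ts.microsecond
    )


def isolated_bursts(timestamps, min_count, neighbor_ratio=0.2):
    """Return starts of blocks with many reports whose neighbors are almost empty."""
    counts = Counter(block_start(ts) for ts in timestamps)
    bursts = []
    for start, count in sorted(counts.items()):
        before = counts.get(start - BLOCK, 0)
        after = counts.get(start + BLOCK, 0)
        # A burst is large on its own and discontinuous with both neighbors.
        if count >= min_count and max(before, after) <= neighbor_ratio * count:
            bursts.append(start)
    return bursts


# Made-up timestamps: three steady blocks of 10 reports, then one isolated block of 40.
base = datetime(2020, 4, 8, 10, 0)
steady = [base + timedelta(minutes=m, seconds=10 * i) for m in (0, 5, 10) for i in range(10)]
burst = [base + timedelta(hours=2, minutes=30, seconds=i) for i in range(40)]
flagged = isolated_bursts(steady + burst, min_count=20)
```

With these sample values only the 12:30 block is flagged; the steady blocks stay below `min_count`.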

Figure 2:
Moments and anomalies of the event.
Selecting the first moment in the timeline (i) shows that most locations, in all categories, submitted a considerable number of low-intensity reports, as shown in Figure 3. Resources should be allocated to the places whose distributions are shifted to the right, toward higher damage values.

Figure 3: Report distribution at Moment 1.
The histograms for Moment 2 are shown in Figure 4.

Figure 4: Report distribution at Moment 2.
Sewer and Water: Uncertain situation in Palace Hills (1), with a distribution that has centers at 3 and 8; possibly only part of the neighborhood was affected. The delicate situation in Scenic Vista (8) and Broadview (9), with centers at 8 and 7, requires priority.
Power: There are indications of a large blackout in Chapparal (10), Old Town (3), Broadview (9), and Scenic Vista (8). Figure 5 shows the interruption in communication. We recommend sending recon units to determine whether the situation is severe. There is also evidence of a small blackout in Terrapin Springs (11) around 10 am, as shown in Figure 6.

Figure 5: Blackout in Chapparal (10), Old Town (3), Broadview (9), and Scenic Vista (8). Safe Town has normal communication.

Figure 6: Small Blackout in Terrapin
Springs around 10 am.
Uncertain situation in Palace Hills (1), with a distribution that has centers at 2 and 9. The critical state in Terrapin Springs (11), with a center at 9 and evidence of a small blackout around 10 am, requires extreme attention.
Roads and Bridges: Easton (14), Scenic Vista (8), and Broadview (9) have priority, with a center at 7.
Medical: Uncertain situation in Palace Hills (1), with centers at 3 and 7. In Old Town (3) the situation is dangerous, with a center at 7.
Buildings: Uncertain situation in Palace Hills (1), with centers at 2 and 8. In Chapparal (10) the center is 7, but since the distribution is irregular and has few reports, we suggest recon. Attention should be given to Broadview (9), with a center at 6.
Shake Intensity: The tremor was reported most in Old Town (3), with a distribution centered around 6. Safe Town (4) had many reports ranging from 0 to 7, which shows how uncertain the reports from that location are.
The anomalies were messages that accumulated during the blackout in certain regions and arrived afterwards, which explains the high density of discontinuous messages. Figure 7 shows the content of the messages in the anomalies, which describe the situation in those places during Moment 2. No action is necessary, since these messages are several hours old.

Figure 7: The anomalies were delayed messages from Old Town (3), Chapparal (10), Scenic Vista (8), and Broadview (9), received after the blackout.
Selecting Moment 3 gives the following histograms (Figure 8).

Figure 8: Report distribution at Moment 3.
Sewer and Water: Priority for Broadview (9), with a center at 9, followed by Scenic Vista (8), whose center is also at 9 but with fewer reports.
Power: There are signs of a blackout in Old Town (3) and Scenic Vista (8), as shown in Figure 9. We recommend sending recon units. The pattern in Wilson Forest (7) may be mistaken for a blackout, but its streamgraph history shows that it is a neighborhood with a naturally low number of messages.

Figure 9: Blackout in Old Town (3) and Scenic Vista (8). Wilson Forest (7) has few reports, but its communication seems normal.
Priority for Scenic Vista (8), with a center at 9. Wilson Forest (7) deserves attention, with a center at 8, but its number of messages is low, so we recommend sending a recon unit to confirm the situation. Chapparal (10) has a center at 7 and a more consistent distribution, which makes it next in the help queue.
Roads and Bridges: Priority for Scenic Vista (8), with a center at 10, requiring extreme attention.
Medical: Priority for Broadview (9), with a center at 5.
Buildings: Priority for Scenic Vista (8), with a center at 7.
Shake Intensity: The tremor was mostly felt in Safe Town (4), with a center at 5.
After the blackout in Old Town (3) and Scenic Vista (8) in Moment 3, the accumulated messages created anomalies, as shown in Figure 10. Again, no action is required.

Figure 10: Anomalies after moment 3
were again delayed messages from Old Town(3) and Scenic Vista(8).
2 – Use
visual analytics to show uncertainty in the data. Compare the reliability of
neighborhood reports. Which neighborhoods are providing reliable reports?
Provide a rationale for your response. Limit your response to 1000 words and 10
images.
We consider a report distribution uncertain when we cannot tell whether a location is in trouble, or when we cannot tell how severe the trouble is.
At Moment 1, the most significant source of uncertainty in the data is the incidence of isolated high reports, as shown in Figure 11 and Figure 12. This pattern can indicate prank reports or blind spots in the neighborhood (only a few inhabitants see the critical damage).
The reports from Northwest (2) have highs and lows, but also intermediate values between them. Because there are more reports and the distribution is more consistent, these reports can be considered more reliable than those of Terrapin Springs (11), although some uncertainty remains. The blind-spot hypothesis is more likely for this distribution: if there are places with critical damage, the closer the inhabitants are to them, the higher their reports should be.

Figure 11: Uncertainty patterns at Moment 1, illustrated with Terrapin Springs (11) and Northwest (2) in Power.

Figure 12: All location-type pairs that have this uncertainty pattern. It is more frequent in Power and in Palace Hills (1).
At Moment 2, the most significant source of uncertainty is distributions that deviate severely from a normal distribution, as shown in Figures 13 and 14. Examples are the Power reports from Safe Town (4) and Palace Hills (1).
The distribution of Safe Town (4) has low kurtosis, which makes it a platykurtic curve. For such a distribution it is not advisable to define a center, since a large number of reports differ from the central value. The distribution in Palace Hills (1) differs from a normal curve by having two well-defined centers instead of one, which makes it a bimodal curve.
The high consistency of a bimodal curve, with its two well-defined centers, suggests that only part of the neighborhood's inhabitants perceive the damage. One hypothesis is that severe damage occurred in only part of the neighborhood, while the rest was left almost unharmed. Another hypothesis is a large blind spot: there is severe damage at a specific point, but not all inhabitants can see it.
For the platykurtic distribution, however, these assumptions do not fit. A more consistent hypothesis is that there are points of damage scattered throughout the neighborhood, each with a different level of severity.
The platykurtic distribution conveys more uncertainty than the bimodal distribution: even though the bimodal distribution is uncertain, its large and consistent number of high reports suggests that the damage exists, whereas in the platykurtic distribution there is a higher possibility that the inconsistency reflects only the subjectivity of the inhabitants in assessing the damage.
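In our analysis these patterns were identified by inspecting the histograms. As a hedged sketch of how the same distinction could be quantified, the code below computes the excess kurtosis of a set of ratings and counts the well-populated local maxima of their histogram. The function names, the `min_share` parameter, the -0.5 kurtosis threshold, and the sample ratings are assumptions chosen for illustration, not values taken from the tool.

```python
from collections import Counter


def excess_kurtosis(values):
    """Population excess kurtosis: 0 for a normal curve, negative when platykurtic."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / m2 ** 2 - 3.0


def histogram_peaks(values, low=0, high=10, min_share=0.1):
    """Ratings that are strict local maxima and hold at least min_share of the reports."""
    counts = Counter(values)  # missing ratings count as 0
    total = len(values)
    return [
        rating
        for rating in range(low, high + 1)
        if counts[rating] > counts[rating - 1]
        and counts[rating] > counts[rating + 1]
        and counts[rating] / total >= min_share
    ]


def classify(values, kurtosis_threshold=-0.5):
    """Label a rating distribution as 'bimodal', 'platykurtic', or 'unimodal'."""
    if len(histogram_peaks(values)) >= 2:
        return "bimodal"
    if excess_kurtosis(values) < kurtosis_threshold:
        return "platykurtic"
    return "unimodal"


# Made-up rating samples, one per pattern.
two_centers = [1] * 6 + [2] * 20 + [3] * 8 + [8] * 7 + [9] * 18 + [10] * 4
flat = list(range(0, 8)) * 5
peaked = [3] * 2 + [4] * 10 + [5] * 30 + [6] * 10 + [7] * 2
labels = [classify(sample) for sample in (two_centers, flat, peaked)]
```

The peak test is checked first because a bimodal sample also has negative excess kurtosis. Noisy real histograms would likely need smoothing before counting peaks.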

Figure 13: Uncertainty patterns at Moment 2, illustrated with Safe Town (4) and Palace Hills (1) in Power. In Safe Town (4), the distribution is platykurtic; in Palace Hills (1), it is bimodal.

Figure 14: All location-type pairs that follow uncertainty patterns at Moment 2. The platykurtic distributions are found in Safe Town (4), Cheddarford (13), and Southton (16), while the bimodal pattern occurs in Palace Hills (1).
At Moment 3, a new pattern of uncertainty can be found in the data. This distribution is a more specific type of bimodal curve in which the centers are not clearly separated (i.e., they are not far apart). Another difference is that one center has many more reports than the other, so resolving the uncertainty becomes easier.
In the example of Figure 15, the distribution for Southton (16) in Roads and Bridges ranges from approximately 1 to 6, with one center at 2 and another at 5. This distribution is more consistent than a platykurtic one: many more reports surround 5, so damage of this severity is more likely to exist in the location.
Figure 16 shows all the location-type pairs that follow this pattern at Moment 3. The uncertainty seen in Safe Town (4) and Palace Hills (1) during Moment 2 did not recur during Moment 3.

Figure 15: Uncertainty patterns in moment
3, illustrating with Southton(16) in Roads and Bridges. The distribution has two close centers, where one is bigger than the
other.

Figure 16: All location-type pairs that follow an uncertainty pattern at Moment 3. They are found in Cheddarford (13) and Southton (16).
3 – How
do conditions change over time? How does uncertainty in change over time?
Describe the key changes you see. Limit your response to 500 words and 8
images.
Key Change 1: Uncertainty resolution in Palace Hills (1) at Moment 2
The bimodal distribution patterns in Palace Hills (1) during Moment 2 do not last long. At the beginning of the moment the distribution is quite uncertain, but 3 hours later the number of damage reports above "4" drops sharply. The streamgraph in Figure 17 illustrates this drop for Power, but the same behavior occurs in Palace Hills (1) in all categories.

Figure 17: Although the situation in Palace Hills (1) during Moment 2 is uncertain, a few hours later it becomes clearer that the damage is not that severe, since there is a reduction in high reports (the streamgraph becomes predominantly blue).
Key Change 2: Uncertainty in locations with platykurtic distributions lasts longer and is harder to predict.
Unlike the distribution in Palace Hills (1), the platykurtic distributions of Safe Town (4), Cheddarford (13), and Southton (16) do not resolve easily. Figure 18 illustrates this with the Buildings streamgraphs for these three locations.

Figure 18: Uncertainty of platykurtic distributions over time. Although they are similar early on, they change drastically at the end of Moment 1: Safe Town (4) shows a predominance of blue, Cheddarford (13) a predominance of yellow, and Southton (16) a predominance of both red and blue. These very different outcomes would be hard to predict.
Key Change 3: The Power situation in Terrapin Springs (11) was under control in Moment 3, even though it was critical during Moment 2.
Despite the small blackout at 10 am during Moment 2, Terrapin Springs (11) had no serious problems with Power during Moment 3. The streamgraph in Figure 19 shows that there was no new blackout and that the situation was much less critical.

Figure 19: Even though the situation in Terrapin Springs (11) was critical in Moment 2, it was more controlled in Moment 3, with a distribution around “5”.
Key Change 4: The biggest problem in Safe Town (4) in the whole event was a sewer and water problem, but it only happened in Moment 3.
From Moment 2 to Moment 3, only the Sewer and Water situation in Safe Town (4) got worse, even though the earthquake was lighter. The streamgraphs in Figure 20 show this problem. Since the location has a power plant with a risk of contamination, it is important to send a team to investigate further.

Figure 20: Streamgraphs for Safe Town (4) spanning Moments 2 and 3. Only the top one, Sewer and Water, shows an increase in severity, as can be seen from the predominance of red in the second message peak.
4 – The
data for this challenge can be analyzed either as a static collection or as a
dynamic stream of data, as it would occur in a real emergency. Describe how you
analyzed the data - as a static collection or a stream. How do you think this
choice affected your analysis? Limit your response to 200 words and 3 images.
We chose to analyze the data as a static collection, but following a streaming metaphor: data that lies beyond the selection in the timeline (i.e., in the future) is not considered in the data derivations. For this reason, the visualizations could be adapted to support streaming data, but with two limitations:
1. The color scale of the timeline would be less accurate because, without the entire collection, it would not be possible to truncate the scale. As a result, the scale would be more sensitive to anomalies: accumulated messages would be mapped to color values so high that other moments would be obscured.
2. If the machine's memory is severely limited, the visualizations could show only the most recent slice of data that fits in memory, and information could be lost.
The choice of a static collection allowed the clear identification of the 3 moments and the anomalies of the event through a truncated color scale, which made the timeline robust to accumulated messages. The choice also enabled analysis using the "From Start to Selection" configuration of the streamgraphs, which would not be possible if the old data were no longer available. This allowed temporal comparisons between more distant moments, such as the one that led to the discovery of Key Change 4 in the previous question (the biggest problem in Safe Town (4) in the whole event was a sewer and water problem, but it only happened in Moment 3).
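The streaming metaphor can be sketched as a filter applied before any derivation: nothing after the end of the selection is ever read. The sketch below is a hypothetical illustration, not the tool's code; the function name, the `from_start` flag (standing in for the "From Start to Selection" configuration), and the sample records are assumptions.

```python
from datetime import datetime


def visible_reports(reports, selection_start, selection_end, from_start=False):
    """Keep reports inside the selection; never look past selection_end."""
    return [
        report
        for report in reports
        if report["time"] <= selection_end
        and (from_start or report["time"] >= selection_start)
    ]


# Made-up records for one location, only to exercise the filter.
reports = [
    {"time": datetime(2020, 4, 6, 14, 0), "location": 4, "sewer_and_water": 2},
    {"time": datetime(2020, 4, 8, 10, 30), "location": 4, "sewer_and_water": 3},
    {"time": datetime(2020, 4, 9, 15, 0), "location": 4, "sewer_and_water": 9},
]
start, end = datetime(2020, 4, 8, 9, 0), datetime(2020, 4, 8, 18, 0)
window_only = visible_reports(reports, start, end)
since_start = visible_reports(reports, start, end, from_start=True)
```

Here `window_only` holds one report and `since_start` holds two; the report dated after the selection is excluded in both cases. In a true streaming setting with limited memory, the `from_start=True` variant is the one that could stop being available.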
Figure 21 illustrates the impact of this limitation on the timeline color scale: without truncation, the visual mapping would be less effective and less stable, because the color scale would be sensitive to the large number of messages in the anomalies.
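To make the effect concrete, the following sketch compares a scale that saturates at a quantile of the whole collection with a scale normalized by the largest count seen. It is only an illustration under assumptions: how the tool actually truncates its color scale is not specified here, and the quantile rule, the function names, and the per-block counts are hypothetical.

```python
def truncated_scale(counts, quantile=0.9):
    """Map a count to [0, 1], saturating at the given quantile of all counts."""
    ordered = sorted(counts)
    cap = ordered[min(len(ordered) - 1, int(quantile * len(ordered)))]
    cap = max(cap, 1)  # avoid dividing by zero when all blocks are empty
    return lambda count: min(count, cap) / cap


def running_max_scale(counts_so_far):
    """Map a count to [0, 1] using only the largest count seen so far."""
    top = max(max(counts_so_far), 1)
    return lambda count: count / top


# Made-up per-block message counts: ordinary activity plus one burst of delayed messages.
counts = [10, 12, 11, 9, 10, 13, 12, 11, 10, 9,
          12, 11, 10, 13, 11, 12, 10, 9, 11, 400]
with_truncation = truncated_scale(counts)
without_truncation = running_max_scale(counts)
ordinary_truncated = with_truncation(12)
ordinary_untruncated = without_truncation(12)
```

With these sample counts an ordinary block of 12 messages maps to about 0.92 on the truncated scale but to 0.03 once the burst of 400 sets the maximum, which mirrors how earlier moments fade in the streaming version of the timeline.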

Figure 21: Timeline adapted for streaming: in (a), Moment 1 is mostly red; in (b) it turns yellow; and in (c) it almost disappears because of the anomalies of delayed messages.