Entry Name: gicentre-wood-mc3

VAST Challenge 2019

Mini-Challenge 3

Team Members:
Jo Wood, giCentre, City, University of London, j.d.wood@city.ac.uk PRIMARY

Student Team: No

Tools Used:
LitVis, developed by the giCentre (integrates Vega, Vega-Lite with elm and markdown), for narrative and visualization document creation.
*nix command-line tools (awk, sed, cut etc.) for some data cleaning.

Approximately how many hours were spent working on this submission in total? c. 80 hours for all three Mini challenges and Grand Challenge (treated as a single integrated process)

May we post your submission in the Visual Analytics Benchmark Repository after VAST Challenge 2019 is complete? Yes.

Video


Note: This document was created in LitVis - a Literate Visualization environment to support visual design and analysis exposition. This answer page is supported by a series of litvis documents providing design and analysis provenance for this, the other mini challenges and grand challenge. They can be found at https://github.com/jwoLondon/vastchallenge2019 (released after VAST challenge deadline has passed).

Questions

The City has been using Y*INT to communicate with its citizens, even post-earthquake. However, City officials need additional information to determine the best way to allocate emergency resources across all neighborhoods of St. Himark. Your task, using your visual analytics on the community Y*INT data, is to determine the types of problems that are occurring across St. Himark. Then, advise the City on how to prioritize the distribution of resources. Keep in mind that not all sources on Y*INT are reliable, and that priorities may change over time as the state of neighborhoods also changes.

To contextualise results, we provide a gridmap projection for laying out spatially located graphical summaries. A gridmap is used because the spatial precision of Y*INT messages is limited to the neighbourhood level.

Grid map

Figure 1: St Himark gridmap showing neighbourhoods and bridges. The two cells bottom-left are reserved for withheld and unknown locations.

Question MC3.1

Using visual analytics, characterize conditions across the city and recommend how resources should be allocated at 5 hours and 30 hours after the earthquake. Include evidence from the data to support these recommendations. Consider how to allocate resources such as road crews, sewer repair crews, power, and rescue teams.

Message frequencies over time

Figure 2: Message frequencies over time coloured by origin neighbourhood.

Message frequencies over time

Figure 3: Messages containing the text shak, coloured by origin neighbourhood.
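As a rough illustration of how a keyword series like the one in Figure 3 might be derived, the sketch below counts messages matching a regular expression per time bin and neighbourhood. It is not the LitVis code used for this submission: the function name `keyword_counts`, the record layout (time, location, text), the timestamp format and the sample records are all assumptions made for illustration.

```python
import re
from collections import Counter
from datetime import datetime

def keyword_counts(messages, pattern, bin_minutes=60):
    """Count messages matching `pattern` per (bin start, neighbourhood).

    `messages` is an iterable of (time_string, location, text) tuples;
    this layout and the timestamp format are assumptions.
    """
    regex = re.compile(pattern, re.IGNORECASE)
    counts = Counter()
    for time_str, location, text in messages:
        if regex.search(text):
            t = datetime.strptime(time_str, "%Y-%m-%d %H:%M:%S")
            # Floor the timestamp to the start of its bin within the day.
            minutes = (t.hour * 60 + t.minute) // bin_minutes * bin_minutes
            bin_start = t.replace(hour=minutes // 60, minute=minutes % 60, second=0)
            counts[(bin_start, location)] += 1
    return counts

# Made-up sample records, not taken from the challenge data.
sample = [
    ("2020-04-08 08:36:00", "Old Town", "Did anyone feel that shaking?"),
    ("2020-04-08 08:40:00", "Old Town", "Everything started to shake!"),
    ("2020-04-08 09:05:00", "Downtown", "Traffic is slow today"),
]
shake = keyword_counts(sample, r"shak")
```

The partial stem `shak` matches "shake", "shaking" and "shaky" alike, which is why a substring pattern is convenient for this kind of interactive exploration.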

Using messages alone, the first task is to identify the shake events. Messages were first filtered to remove the most common spam messages (see litvis documents for details). Interactive regular expression filtering was then used to identify message themes. Messages containing shak suggested three distinct seismic events (Figures 2 and 3), identified by the following messages that were subsequently re-messaged:

Spanning the main quake event, 5 hours following and 30 hours following, the state of St Himark can be summarised by temporal word clouds aggregated in three hour blocks (stopwords and spam messages filtered out).

Wordcloud Wed

Wordcloud Thu

Figure 4: Wordclouds spanning the main shock periods.
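The aggregation behind a temporal word cloud can be approximated as word counts per three-hour block with stopwords removed. The following is a minimal sketch under stated assumptions: the tiny `STOPWORDS` set, the function name `block_word_counts`, the (time, text) record layout and the sample messages are illustrative only and do not reproduce the lists or data used in the litvis documents.

```python
import re
from collections import Counter, defaultdict
from datetime import datetime

# Illustrative stopword set only; a real list would be much longer.
STOPWORDS = {"the", "a", "is", "of", "and", "to", "in", "on", "we"}

def block_word_counts(messages, block_hours=3, stopwords=STOPWORDS):
    """Return {block_start: Counter of words} for (time_string, text) records."""
    blocks = defaultdict(Counter)
    for time_str, text in messages:
        t = datetime.strptime(time_str, "%Y-%m-%d %H:%M:%S")
        # Floor the hour to the start of its block (00, 03, 06, ...).
        block_start = t.replace(
            hour=t.hour // block_hours * block_hours, minute=0, second=0
        )
        words = re.findall(r"[a-z']+", text.lower())
        blocks[block_start].update(w for w in words if w not in stopwords)
    return blocks

# Made-up sample messages for illustration.
sample_words = [
    ("2020-04-08 09:10:00", "the power is out"),
    ("2020-04-08 11:59:00", "power lines down"),
    ("2020-04-08 12:00:00", "water is contaminated"),
]
clouds = block_word_counts(sample_words)
```

Each block's `Counter` can then drive word size in a cloud, for example via `clouds[block].most_common(50)`.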

Wordclouds suggest a number of demands on resources during the period:

After 5 hours

After 30 hours

Interactive time- and location-based browsing of messages provides evidence of demand for resource allocation. Example snapshots are shown below; in the interactive version, the content of each message is available as a 'tooltip'.

Fire freq

Figure 5: Interactive browsing of messages over time containing the fire keyword.

Rubble freq

Figure 6: Interactive browsing of messages over time containing the rubble keyword.

Some message accounts are particularly useful in gathering evidence for resource allocation. By keeping only content that has been re-messaged, we can find the originators of common time-critical re-messages. Of particular value are FieldEngineerPhillipCarter (Figure 7) and TVHostBrad (identified as the originator of messages containing contamination).

Rubble freq

Figure 7: Message content generated by FieldEngineerPhillipCarter.
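One way to find originators of widely re-messaged content is sketched below. It assumes that a re-message is the original text prefixed with `re: `, which is an assumption about the data format rather than something stated on this page; the function name `remessage_originators`, the (time, account, text) layout and the sample accounts are hypothetical.

```python
from collections import Counter

def remessage_originators(messages, prefix="re: "):
    """Rank accounts by how often their original posts were re-messaged.

    `messages` is a list of (time_string, account, text) tuples. A
    re-message is assumed to be the original text prefixed with `prefix`.
    """
    # How many times each message body was re-messaged.
    remessage_counts = Counter(
        text[len(prefix):].strip()
        for _, _, text in messages
        if text.startswith(prefix)
    )
    # The earliest non-re-message post of each body is taken as the origin.
    first_post = {}
    for _, account, text in sorted(messages):
        body = text.strip()
        if (not text.startswith(prefix)
                and body in remessage_counts
                and body not in first_post):
            first_post[body] = account
    originators = Counter()
    for body, account in first_post.items():
        originators[account] += remessage_counts[body]
    return originators

# Made-up sample messages and account names for illustration.
sample_posts = [
    ("2020-04-08 10:00:00", "AccountA", "Water main broken on 5th"),
    ("2020-04-08 10:05:00", "AccountB", "re: Water main broken on 5th"),
    ("2020-04-08 10:06:00", "AccountC", "re: Water main broken on 5th"),
    ("2020-04-08 10:07:00", "AccountB", "Anyone seen my cat?"),
]
ranking = remessage_originators(sample_posts)
```

Sorting by the ISO-style time string places the earliest post first, so the first account seen with a given body is treated as its originator.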

Question MC3.2

Identify at least 3 times when conditions change in a way that warrants a re-allocation of city resources. What were the conditions before and after the inflection point? What locations were affected? Which resources are involved?

Transportation Changes

Transportation in and out of the area is critical and, given the geography of St Himark, bridge access is key. Monitoring the status of the bridges shows an initial period in which most bridges are closed (Figure 8). This severely limits the ability of uninjured residents to leave, so they remain a potential strain on local resources.

Bridge status

Figure 8: Automatic detection of open and close bridge related messages.
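The detection rules used for Figure 8 are documented in the litvis documents rather than on this page. As a hedged sketch of the general idea, a message mentioning a bridge can be labelled open or closed with simple regular expressions. The patterns and the function name `bridge_status` below are assumptions, and a rule this simple will misread negations such as "not open".

```python
import re

# Assumed patterns; the actual rules used for Figure 8 may differ.
BRIDGE = re.compile(r"\bbridges?\b", re.IGNORECASE)
OPEN = re.compile(r"\b(re-?)?open(ed|s|ing)?\b", re.IGNORECASE)
CLOSED = re.compile(r"\b(clos(ed|e|es|ing|ure)|shut)\b", re.IGNORECASE)

def bridge_status(text):
    """Return 'open', 'closed', or None for one message.

    None means the message does not mention a bridge, or the status is
    missing or ambiguous (both open and closed terms appear).
    """
    if not BRIDGE.search(text):
        return None
    is_open = bool(OPEN.search(text))
    is_closed = bool(CLOSED.search(text))
    if is_open == is_closed:
        return None
    return "open" if is_open else "closed"
```

For example, `bridge_status("North bridge is closed")` yields `"closed"`, while a message containing both "closed" and "reopen" is left unlabelled for manual inspection.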

Flooding, water supply and contamination

By browsing the keywords flood, water and contamination (prompted by the word cloud summary in Figure 4), we see how clean water supply problems change over time.

Contaminated water

Figure 9: Snapshot of interactive message browsing using contaminated keyword.

Water messages

Figure 10: Messages containing keyword water

Shelters in Local Library

Some libraries were allocated as shelters for those made homeless, but confusion arose because it was not clear which libraries had been designated.

Library messages

Figure 11: Messages containing keyword library

Question MC3.3

Take the pulse of the community. How has the earthquake affected life in St. Himark? What is the community experiencing outside the realm of the first two questions? Show decision makers summary information and relevant/characteristic examples.

Fatalities and information management

Effective information management is a significant issue in a disaster where information is imperfect and social media propagates claims of questionable veracity. For example, a search for messages on fatalities shows widely varying figures in circulation, from 5 to 1000 (see Figure 12). Such misinformation risks fomenting panic that can hamper emergency response.

Fatalities

Figure 12: Figures for numbers of reported fatalities (all messages, log scale)
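Reported fatality figures of the kind plotted in Figure 12 could be pulled from message text with a pattern such as the one below. This is an illustrative sketch only: the keyword list, the allowance of up to two intervening words and the function name `reported_fatalities` are assumptions, and the extraction actually used is described in the litvis documents.

```python
import re

# A number, optionally followed by up to two words, then a fatality term.
FATALITY = re.compile(
    r"(\d[\d,]*)\s+(?:\w+\s+){0,2}?"
    r"(?:fatalit(?:y|ies)|deaths?|dead|killed|casualties)",
    re.IGNORECASE,
)

def reported_fatalities(text):
    """Return every fatality count mentioned in `text` as a list of ints."""
    return [int(m.group(1).replace(",", "")) for m in FATALITY.finditer(text)]
```

Because the extracted values span several orders of magnitude, plotting them on a logarithmic axis keeps both the small and large claims visible.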

Red-tagging buildings and civil unrest

A system of 'red-tagging' unsafe buildings was swiftly enacted. While this is a necessary step, particularly given the risk of aftershocks damaging already weakened buildings, there is evidence that it has led to considerable dissatisfaction among some citizens. Towards Thursday midday there even appears to be a risk of civil unrest (Figure 14), in part driven by a lack of obvious leadership from the Mayor.

Fatalities

Figure 13: Messages with phrase red tag

Fatalities

Figure 14: Messages with phrase mob

Animal Care

We see evidence in messages of inadequate support for animals (largely pets) in the disaster. They appear not to be provided for in shelters and specific animal support is being charged for, resulting in considerable dissatisfaction among some without access to funds.

Question MC3.4

The data for this challenge can be analyzed either as a static collection or as a dynamic stream of data, as it would occur in a real emergency. Describe how you analyzed the data - as a static collection or a stream. How do you think this choice affected your analysis?

Data were analysed as a static collection (as they were provided). However, care was taken to use approaches that would work if the data had been streamed. No time-based calculations required data 'later' in the stream. The timeline-based layout and interactive browsing of messages are amenable to constant streaming.
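To illustrate what a stream-compatible calculation looks like, the sketch below counts messages per time bin while only ever using messages that have already arrived. The class name `StreamingBinCounter` and the use of numeric timestamps in seconds are assumptions for illustration, and messages are assumed to arrive in time order.

```python
from collections import Counter

class StreamingBinCounter:
    """Count messages per key within fixed time bins, one message at a time.

    Assumes messages arrive in time order; a late message would simply be
    counted in the bin that is currently open.
    """

    def __init__(self, bin_seconds=3600):
        self.bin_seconds = bin_seconds
        self.current_bin = None
        self.counts = Counter()

    def add(self, timestamp, key):
        """Ingest one message; return (bin_start, counts) when a bin closes."""
        bin_start = int(timestamp) // self.bin_seconds * self.bin_seconds
        finished = None
        if self.current_bin is not None and bin_start > self.current_bin:
            # The previous bin is complete; emit it and start afresh.
            finished = (self.current_bin, dict(self.counts))
            self.counts = Counter()
        if self.current_bin is None or bin_start > self.current_bin:
            self.current_bin = bin_start
        self.counts[key] += 1
        return finished
```

A timeline view can be redrawn each time `add` returns a finished bin, so the display never depends on data that has not yet arrived.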

What was required was the pre-filtering of stopwords and 'spammy' content. Standard stopword lists were used as a basis and then amended to include irrelevant content more typical of the messages in the sample. This therefore requires some degree of 'pre-disaster' processing of messages to establish the baseline content.
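One plausible way to amend a standard stopword list from baseline messages is to add any word that appears in a large share of pre-disaster messages. The sketch below is an assumption about how this could be done, not a record of the procedure used here: the function name `baseline_stopwords`, the document-frequency rule and the threshold `min_doc_fraction` are illustrative choices.

```python
import re
from collections import Counter

def baseline_stopwords(baseline_messages, standard_stopwords, min_doc_fraction=0.2):
    """Extend a stopword set with words common in pre-disaster messages.

    A word is added when it occurs in at least `min_doc_fraction` of the
    baseline messages. The threshold is an arbitrary illustrative value.
    """
    doc_freq = Counter()
    total = 0
    for text in baseline_messages:
        total += 1
        # Count each word once per message (document frequency).
        doc_freq.update(set(re.findall(r"[a-z']+", text.lower())))
    extra = {
        word for word, count in doc_freq.items()
        if total and count / total >= min_doc_fraction
    }
    return set(standard_stopwords) | extra

# Made-up baseline messages for illustration.
baseline = ["lol great day", "lol so bored", "lol lunch time", "nice day out"]
stops = baseline_stopwords(baseline, {"so"}, min_doc_fraction=0.5)
```

Because the list is built only from messages sent before the event, it can be prepared in advance and applied unchanged to a live stream.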