Entry Name:  "JMU-Guo-MC3"

VAST Challenge 2019
Mini-Challenge 3

 

 

Team Members:

Chen Guo, James Madison University guo4cx@jmu.edu     PRIMARY
Xiang Liu, Purdue University xiang35@purdue.edu
Evie Cai, West Lafayette High School eviecai03@gmail.com
Yingjie Victor Chen, Purdue University victorchen@purdue.edu
Zhenyu Cheryl Qian, Purdue University qianz@purdue.edu
Rui Li, Jiangnan University lrcoolb@jiangnan.edu.cn


Student Team:  NO

 

Tools Used:

For spelling correction: Bing Spell Check, SymSpellpy.

The backend: Python, Gensim, Stanford NLP Tools http://www-nlp.stanford.edu/

The front end was built upon HTML5, D3.js, pyLDAvis.js, textplorer.js


Approximately how many hours were spent working on this submission in total?


200 hours

 

May we post your submission in the Visual Analytics Benchmark Repository after VAST Challenge 2019 is complete? YES

 

Video

1) YouTube Link: https://youtu.be/fOFR_BT7_74


2) Download from TopicInk: Visualizing Disaster Textual Data using LDA Topic Modeling


 

 

 

Questions

The City has been using Y*INT to communicate with its citizens, even post-earthquake. However, City officials need additional information to determine the best way to allocate emergency resources across all neighborhoods of St. Himark. Your task, using your visual analytics on the community Y*INT data, is to determine the types of problems that are occurring across St. Himark. Then, advise the City on how to prioritize the distribution of resources. Keep in mind that not all sources on Y*INT are reliable, and that priorities may change over time as the state of neighborhoods also changes.

MC 3.1 – Using visual analytics, characterize conditions across the city and recommend how resources should be allocated at 5 hours and 30 hours after the earthquake.  Include evidence from the data to support these recommendations.  Consider how to allocate resources such as road crews, sewer repair crews, power, and rescue teams. Limit your response to 1000 words and 12 images.


We used LDA topic modeling to identify 12 distinct content topics in the messages. After subjectively evaluating several candidate models, we selected a model with a coherence value of 0.58 for the current analysis. After examining the salient terms, mentioners, hashtags, and NER entities, together with a qualitative analysis of the dominant replies in each topic, we identified the following 12 frames, ranked by their prevalence in the messages (Fig. 1):

Fig 1. We used an LDA model to aggregate messages into 12 topics. The first row shows topics 1 to 6 from left to right, and the second row shows topics 7 to 12. Each bar chart shows the top 30 salient terms for its topic.

Topic 1: House status and information during the earthquake, such as communication lines, apps, stations, the emergency communication network, and rescues.
Topic 2: People needed help finding shelters, cats, ferrets, etc. Shelters were crowded. Bottled water, blankets, first aid, and food were needed.
Topic 3: Transportation information. Bridges had collapsed and were closed for safety inspection.
Topic 4: Building status. Shelters, Lacki’s building, some houses, etc. had collapsed. People were trapped in collapsed buildings and were not answering their phones.
Topic 5: The nuclear power plant and the Always Safe company. The Always Safe nuclear power plant was shut down for inspection after the earthquake. Power went out after the quake, and nuclear power was restored in many spots one day later.
Topic 6: Information on the distribution, quantity, and storage of food, clothes, tents, and other relief supplies; information on reserve aid manufacturers and their daily production capacity.
Topic 7: People were running out of supplies and needed to stock up on meds, food, water, gas, etc. Many people got lost and some were missing, notably a singer named Lacki Dasical.
Topic 8: Rumors spread after the earthquake. People heard that the city would evacuate and worried that there would be a tsunami. This topic also covers water contamination and broken sewers.
Topic 9: What happened during the earthquake. Things moved like crazy and sirens went off at the power plant.
Topic 10: The rescue teams. Cooperation between SHM and HSS made the city safer. Conditions improved as roads and bridges re-opened.
Topic 11: Food and water were needed. Information on the fire department, library, and local news.
Topic 12: Many people panicked, and rumors of fatalities spread among the crowd. People were frustrated that they could not enter their red-tagged homes to retrieve their valuables.
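
The prevalence ranking mentioned above can be illustrated with a minimal sketch. It assumes each message already has a topic distribution (for example, from an LDA model) and counts how often each topic is the dominant one. The function name, the input shape, and the toy numbers are assumptions for illustration, not the team's actual code or data.

```python
from collections import Counter

def rank_topics_by_prevalence(doc_topic_weights):
    """Rank topics by the share of messages they dominate.

    doc_topic_weights: list of dicts mapping topic id -> weight for one
    message (hypothetical input shape, e.g. per-message LDA output).
    Returns [(topic_id, share_of_messages), ...], most prevalent first.
    """
    dominant = Counter()
    for weights in doc_topic_weights:
        if not weights:
            continue  # skip messages without any topic assignment
        best_topic = max(weights, key=weights.get)
        dominant[best_topic] += 1
    total = sum(dominant.values())
    return [(topic, count / total) for topic, count in dominant.most_common()]

# Toy distributions for four messages (made-up numbers).
toy = [
    {1: 0.7, 2: 0.3},
    {1: 0.6, 3: 0.4},
    {2: 0.9, 1: 0.1},
    {1: 0.5, 2: 0.2, 3: 0.3},
]
ranking = rank_topics_by_prevalence(toy)
print(ranking)
```

Counting dominant topics is only one possible definition of prevalence; summing the topic weights over all messages would be an equally reasonable alternative.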

The earthquake happened at 8:36 am on 4/8/20. EarthQuakeSeers posted the message “ALERT: A 6.7 earthquake just occurred off the NE shore of the town of St. Himark. This could be severe. Expect heavy damage.” It caused major damage in Old Town and Safe Town. People in other neighborhoods, such as Scenic Vista, Terrapin Springs, Easton, West Parton, Palace Hills, and Southton, noticed things moving and felt sharp shaking.

How to allocate resources at 5 hours: We filtered the messages at 5 hours after the earthquake. The word cloud view shows how frequently each keyword occurs. The map view displays the spatial distribution of messages; the size of each circle corresponds to the number of messages posted in that location. We found that people were talking about water, bridge, safe, power, inspection, nuclear, and fire.

  1. Water: Water was contaminated because of broken water and sewer pipes. As shown in Fig. 2, the neighborhoods Old Town, Safe Town, Scenic Vista, Broadview, Chapparal, and Easton should boil their drinking water for 2 minutes.
    Our suggestion: The government should distribute bottled water in all heavily affected neighborhoods and repair the broken water and sewer pipes as soon as possible.
    Evidence: A representative message is as follows: 4/8/20 13:05, Water is contaminated. Serious reactions reported in the following neighborhoods Old Town,Safe Town,Scenic Vista,Broadview,Chapparal, 10, TVHostBrad, Downtown, 20339
    Fig 2. The word cloud view displays the co-occurrence of the term "Water" at 5 hours after the earthquake.

  2. Right after the earthquake, bridges collapsed. 12th of July Bridge, Magritte Bridge, Friday Bridge, Jade Bridge, and Tranky Doo Bridge were down. Chunks of asphalt were scattered all over the bridge. At 9:01 am, fishing boats were rescuing people in the water, and there were cars in the water as well. At 10:50 am, KRAK TV reported that 3 blocks of waterfront property at the Entitled Acres housing development had slid into the water. A major rescue operation was needed. Bridge A opened to emergency vehicles at 9:30 am. Bridge B opened at 10:05 am. Magritte Bridge re-opened with one lane at 13:20.
    Our suggestion: The government should notify the rescue teams to detour around the damaged bridges and take Bridge A, Bridge B, or Magritte Bridge. Rescue teams should go to the Entitled Acres housing development to rescue people. Road crews should start with the main bridges and roads so that people can travel in and out of the city more efficiently.
    Evidence: The content of some messages is as follows: 4/8/20 9:01, 12th of July Bridge is down. Cars in water. Fishing boats are rescuing people in the water., 11, Simpson2002, 18570
    4/8/20 10:50, re: KRAK TV: 3 blocks of waterfront property at Entitled Acres housing development have slid into the water. All that is left is the golf course. No idea how many houses were lost! Major rescue operation underway., 7, HoldsFradyMouse, West Parton, 19639
    Fig 3. Left: we found the term "bridge" in topic 3. Click on the term "bridge" to display all the relevant messages on the right. Right: we can select a segment of the timeline on the stream graph and update the message view.

  3. Multiple structural fires and low water pressure made fighting fires extremely difficult. Multiple places were on fire, including the Dribble’s ice cream factory at about 8:38 am, Red Hot Fire Extinguisher Company at about 10:03 am, and Larimerski brand shoe at about 11:01 am. More importantly, people saw buildings collapsing from their windows downtown and fire to the north-east of the town. However, civilians could not find the fire station, or the fire station was closed.
    Our suggestion: The fire department should rescue people downtown and put out the fires in the north-east of the town as well as at the sites mentioned above.
    Evidence from the messages: 4/8/20 13:32, Civilians are seeing buildings COLLAPSING from there windows. I see fire from north-east of the town. The town is in a bad shape, and people can't get help., 10, TVHostBrad, Downtown, 20726
    Fig 4. The word cloud and message view for the term "fire" at 5 hours after the earthquake.

  4. Power was out. The Always Safe nuclear power plant shut down for inspection after the earthquake at about 8:36 am. People needed power restored. Some worried that the power plant had sustained damage and was leaking radiation. Derek Nolan reported that a power line fell on his taxi at around 13:33 on 4/8.
    Our suggestion: The Always Safe nuclear power company should post messages on YInt telling the public the status of the nuclear power plant and its plans for restoring power.
    Evidence from the messages: 4/8/20 8:36, As a precaution, Always Safe nuclear power plant has shut down for inspection after the earthquake., 5, DarkMelanieCandy, Safe Town, 18140
    4/8/20 9:00, KRAK TV: We have video from a citizen at the Always Safe Nuclear Power Plant which shows the fence has collapsed for several hundred feet and some damage to a building, 5, A1958Michel, Safe Town, 18550
    Fig 5. The word cloud and message view for the term "power" at 5 hours after the earthquake.
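
The time filter behind the 5-hour snapshot can be sketched roughly as follows. The sketch keeps messages from a window ending 5 hours after the quake, then counts keywords (for a word cloud) and messages per location (for the map circles). The one-hour window length, the stopword list, and the `snapshot` helper are assumptions for illustration; the message texts are shortened from the evidence quoted above.

```python
from collections import Counter
from datetime import datetime, timedelta
import re

QUAKE_TIME = datetime(2020, 4, 8, 8, 36)  # earthquake time stated in the text
# Tiny illustrative stopword list (assumption, not the list used by the team).
STOPWORDS = {"the", "is", "in", "of", "a", "and", "are", "to", "from", "i"}

def parse_time(stamp):
    # Timestamps in the quoted messages look like "4/8/20 13:05".
    return datetime.strptime(stamp, "%m/%d/%y %H:%M")

def snapshot(messages, hours_after, window_hours=1.0):
    """Count keywords and per-location messages in the window that ends
    `hours_after` hours after the quake (window length is an assumption)."""
    end = QUAKE_TIME + timedelta(hours=hours_after)
    start = end - timedelta(hours=window_hours)
    words, places = Counter(), Counter()
    for stamp, text, location in messages:
        if start <= parse_time(stamp) <= end:
            tokens = re.findall(r"[a-z']+", text.lower())
            words.update(t for t in tokens if t not in STOPWORDS)
            places[location] += 1
    return words, places

messages = [
    ("4/8/20 10:50", "Major rescue operation underway.", "West Parton"),
    ("4/8/20 13:05", "Water is contaminated.", "Downtown"),
    ("4/8/20 13:32", "I see fire from north-east of the town.", "Downtown"),
]
words, places = snapshot(messages, hours_after=5)
print(words.most_common(5), places)
```

Calling `snapshot(messages, hours_after=30)` would produce the corresponding 30-hour view from the same data.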

We also filtered and analyzed the messages posted at 30 hours after the earthquake and found the following patterns:

  1. Power was out in many neighborhoods and places, including human shelters, animal shelters, gas stations, the St. Himark Kidney Center, hospitals, pharmacies, and grocery stores. The power company should post its repair plans and list the areas where power has been restored so that people can return. Power repair crews should make sure that shelters and stores selling living supplies have power. At 15:01 on 4/8/20, nuclear power was restored.
    Fig 6. The word cloud and message view for the term "power" at 30 hours after the earthquake.

  2. From Fig. 7, we can see that many residents were retweeting about the contaminated water. We were able neither to identify the leak point nor to test whether the water was safe to drink. We suggest that city officials announce that, as a precaution, all water should be boiled before drinking after the earthquake. Sewer crews should repair the leak point, test the water, and announce every water supply area that has been repaired.
    Fig 7. The topic view, stream graph, word cloud, message view, and map for the term "water" at 30 hours after the earthquake.

  3. Besides water, power, and help, the term "shelter" stands out in the word cloud. Most civilians said that people were trying to find a shelter nearby, but either could not find one or did not know how to get there. Those who reached a shelter found it crowded. Shelters lacked basic supplies such as pillows. Messages also reported that animal shelters had nearly collapsed. The government should not only post information on all shelters throughout the city, but also show the status of each shelter so that people know where to go next when the shelter near them is full. Shelter hygiene and facilities need to be improved.
    Fig 8. The word cloud and message view for the term "shelter" at 30 hours after the earthquake.

  4. The topic model shows that food and water distribution is crucial for the affected areas, such as Old Town and Downtown; people crowded into a shelter without sufficient food and water for a long time can be in serious danger. After we chose the term "food" from the word cloud, the message view showed that some people were eager for police help because they were in a crowded shelter with no food or water. Food and water prices were out of control: some messages said that people needed 10 dollars to buy a single bottle of water. In conclusion, the government needs to supply sufficient food and water to each shelter, especially after a long shortage of supplies. After the earthquake, shelters also need police to prevent crime in the community.
    Fig 9. The word cloud and message view for the term "food" at 30 hours after the earthquake.
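
The per-term word clouds used above (for "power", "water", "shelter", and "food") rest on term co-occurrence: counting which other words appear in the same messages as the selected term. A minimal sketch under that assumption follows; the `cooccurring_terms` helper, the stopword list, and the toy messages are made up for illustration and are not taken from the dataset.

```python
from collections import Counter
import re

def cooccurring_terms(texts, focus, stopwords=frozenset()):
    """Count terms that appear in the same message as `focus`.

    Each term is counted at most once per message.
    """
    counts = Counter()
    for text in texts:
        tokens = set(re.findall(r"[a-z]+", text.lower()))
        if focus in tokens:
            counts.update(tokens - {focus} - stopwords)
    return counts

# Toy messages (made up for illustration).
texts = [
    "Power is out at the shelter",
    "The shelter is crowded and has no food",
    "No food and no water at the shelter",
    "Bridge is closed",
]
stop = frozenset({"is", "at", "the", "and", "has", "no", "out"})
shelter_terms = cooccurring_terms(texts, "shelter", stop)
print(shelter_terms.most_common())
```

The resulting counts could then be mapped to font sizes in a word cloud.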

MC 3.2 – Identify at least 3 times when conditions change in a way that warrants a re-allocation of city resources.  What were the conditions before and after the inflection point?  What locations were affected?  Which resources are involved? Limit your response to 1000 words and 10 images.

To determine when supplies should be reallocated, we used a vertical stream graph to analyze how the messages changed over time. Spikes in the stream graph represent increases in messages around that time, which also indicate that emergencies occurred. We also examined the context of keywords in the message view. We noticed a small trend in the stream graph of topics/keywords: during hours with a bottleneck effect (for example, 12 PM, when there were significantly fewer messages than in the hours before or after), the messages sent were more urgent and often indicated a period when resources needed to be re-allocated.

Fig 10. The stream graph shows how the 12 topics change over time.
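
The spike and bottleneck reading of the stream graph can be approximated by bucketing messages per hour and flagging hours that are strictly above or below both neighboring hours. This is a rough sketch under that assumption; the helper names and toy timestamps are illustrative, and a real analysis would likely need smoothing and a minimum-difference threshold.

```python
from collections import Counter
from datetime import datetime

def hourly_counts(stamps):
    """Bucket 'M/D/YY H:MM' timestamps into per-hour message counts."""
    counts = Counter()
    for stamp in stamps:
        t = datetime.strptime(stamp, "%m/%d/%y %H:%M")
        counts[t.replace(minute=0)] += 1
    return counts

def local_extremes(counts):
    """Return (spikes, dips): hours strictly above / below both neighbors.

    Hours with zero messages are absent from `counts`, so neighbors are the
    adjacent observed hours (a simplification of this sketch).
    """
    hours = sorted(counts)
    spikes, dips = [], []
    for prev, cur, nxt in zip(hours, hours[1:], hours[2:]):
        if counts[cur] > counts[prev] and counts[cur] > counts[nxt]:
            spikes.append(cur)
        elif counts[cur] < counts[prev] and counts[cur] < counts[nxt]:
            dips.append(cur)
    return spikes, dips

# Toy timestamps (made up): 2 messages at 10h, 3 at 11h, 1 at 12h, 3 at 13h.
stamps = [
    "4/8/20 10:05", "4/8/20 10:40",
    "4/8/20 11:10", "4/8/20 11:20", "4/8/20 11:50",
    "4/8/20 12:30",
    "4/8/20 13:05", "4/8/20 13:32", "4/8/20 13:33",
]
spikes, dips = local_extremes(hourly_counts(stamps))
print("spikes:", spikes, "dips:", dips)
```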

  1. The first time that warrants a re-allocation of resources is 10:30 AM on 4/8/20. There were several fires throughout the city, with the north and south sides hit hardest. During this period, firefighters needed to be reallocated to Old Town to fight the fires.
    Many users such as AttentiveDouglasIcecream retweeted the message “Extensive damage on the north and south side. Although no neighborhood has escaped damage. Several fires throughout City.”

    In addition, many residential buildings around the Southwest region had structural damage and needed support. In fact, many users, such as JFleetChambers from Southwest, retweeted the message “Our neighborhood has been hit hard. All the old brick buildings have collapsed or are heavily damaged. # neighborhood.” Rescue teams and repair crews needed to be sent to these locations around Southwest. However, a portion of the firefighters should be reallocated to West Parton. DixonWhale31 and many others sent out messages about being trapped in elevators. “St Himark Fire Departhgementhge : If you are thgerapped in an elevator wrokait for us to come rescue you , do nothge athgethgeempthge to climb outhge on your owrokn . Ithge may wrokork in the movies , buthge in real life it is very dangerous.” The last resource that needed to be reallocated was transportation. C15Davis, from Chapparal, sent out the message “Department of Transportation : We need your help . Dachsunds are blocking the main road in Neighborhood 10 . Pick up the dachsunds if you see them so we can clear rubble. Bring them to the Galactic Truth Church at 2nd and Main.” While a rescue team was already there, the Department of Transportation was needed to clean up the debris.


    Fig 11. Left: Click on topic 1 and find the term "fire" among the top 30 most salient terms. Right: Click on the term "fire" to further explore what people were talking about over time.

  2. The second time reallocation is necessary is 13:37 on 4/8/20. The most important resource that needs to be reallocated is shelter materials. Our analysis shows that many people were actively seeking shelter at around this time. For example, FastBreadMary_Pacheco from Terrapin Springs sent the message “Ti’ana and I spent 30 minutes trying to find the shelter and it is packed.” FamousDonaldIcecream from Scenic Vista also messaged “Alexis and I spent 3 days trying to find the shelter and it is missing.” During this period, building materials, water, food, and temporary shelters needed to be reallocated across the city. From Terrapin Springs to Scenic Vista to Old Town, people required assistance and sought safety. More shelters and supplies were needed because the current shelters were crowded and lacked basic materials such as pillows. Sorting the most salient terms confirms the need to reallocate one of the most important supplies.
    Fig 12. The stream graph highlights the changes in the term "shelter" over time. Further exploration of the word cloud, map, and message view shows that shelters were packed when people reached them and had run out of food, water, and meds.

  3. In addition, many users, such as DanielObnoxiousBest from West Parton, retweeted a message from BusyHCouch2001. This message from a Scenic Vista user calls for immediate help and for reallocating clean water supplies and sewage repair crews. “Department of Health and the St Himark Water and Sewer Department: Broken water and sewer pipes create of risk of contaminated drinking water. The following neighborhoods should boil their water: 4, 8,9,10,14.” EaglePeakHosp from Southton addressed the issue further: “Moderate damage. Due to concerns over contaminated water, neonatal unit patients are being transferred to other hospitals as a precaution.” Our categorization of retweeted messages shows that the importance of this issue cannot be overlooked, as other users also raised it. Clean water needs to be supplied to the Southton region, and sewer repair crews need to be reallocated to Scenic Vista and the surrounding areas.
    Fig 13. The changes in the term "water" over time indicate the need to reallocate water supplies and sewer repair crews.

  4. A fourth time for reallocation is 8:50 on Thursday, 4/9/20. Survivors from Downtown reported that food, water, and electricity were all in short supply. According to a message sent by DorothyCat51, supply trucks could not get from the warehouses to the grocery stores; the supply was cut off by the closed bridges. Resources therefore need to be reallocated immediately to fix these issues and restore the transportation of supplies.
    Fig 14. The stream graph highlights the changes in the term "food" over time. Further exploration of the word cloud, map, and message view shows that people in multiple neighborhoods needed food, water, and electricity. The supply was cut off by the closed bridges.

  5. A fifth time for necessary reallocation is 3:00 PM on Thursday, 4/9/20. Although one day had passed since the earthquake, an aftershock occurred. In the message view for “move”, many users, such as MazieLion85 from Old Town, SmartFBowl112 from Broadview, and FastBRowl from Terrapin Springs, reported the aftershock. These messages imply that immediate help was needed and that resources had to be reallocated. After the aftershock, people needed to take cover in the nearest shelters for safety, so additional personnel were needed.
    Fig 15. The stream graph highlights the changes in the term "move" over time. We found that an aftershock occurred at about 3 pm on 4/9/20.
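
The categorization of retweeted messages referred to in item 3 can be sketched as grouping messages by their text and keeping those repeated by several distinct accounts. The sketch assumes retweets carry a leading "re:" prefix, as in one of the messages quoted under MC 3.1; the helper name, account names, and message texts are hypothetical.

```python
from collections import defaultdict

def retweet_groups(messages, min_accounts=2):
    """Group messages by text (ignoring a leading 're:') and keep the texts
    posted by at least `min_accounts` distinct accounts, most shared first."""
    accounts_by_text = defaultdict(set)
    for account, text in messages:
        body = text.strip()
        if body.lower().startswith("re:"):
            body = body[3:].strip()
        accounts_by_text[body].add(account)
    repeated = {t: a for t, a in accounts_by_text.items() if len(a) >= min_accounts}
    return sorted(repeated.items(), key=lambda item: len(item[1]), reverse=True)

# Toy messages with hypothetical account names.
messages = [
    ("user_a", "Boil your water"),
    ("user_b", "re: Boil your water"),
    ("user_c", "re: Boil your water"),
    ("user_d", "Bridge is closed"),
]
groups = retweet_groups(messages)
for text, accounts in groups:
    print(len(accounts), "accounts:", text)
```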

MC 3.3 – Take the pulse of the community.  How has the earthquake affected life in St. Himark? What is the community experiencing outside the realm of the first two questions? Show decision makers summary information and relevant/characteristic examples. Limit your response to 800 words and 8 images.

The system contains a social network view and a frequency bar chart of the most frequently mentioned individuals and companies. We also use Stanford NER to identify common named entities. The social network highlights a few active users who are influential figures in the disaster discussion: @AlwaysSafePowerCompany, @ChloeJohnson, and @TVHostBrad. They posted and retweeted many messages. Some users asked for help on the platform, such as @DerekNolan and @VanessaCorwin.

Clicking a label on the mentioner frequency bar chart shows the messages related to that mentioner in the message view, and the map, word cloud, and social network view update as well. Using this information, we analyzed how the earthquake affected life in St. Himark and what the community was experiencing, as described below. The mentioner network is constructed from 2590 unique YInt accounts connected through mentions.

Fig 16. The mentioner frequency chart (left) and the social network view (right).
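
A mention network of this kind can be sketched as an undirected graph in which an edge links a poster to every account it @-mentions, with degree as a simple measure of activity. The regular expression, helper names, and toy accounts below are assumptions for illustration; the actual system may extract mentions differently.

```python
from collections import defaultdict
import re

MENTION = re.compile(r"@(\w+)")  # assumed mention syntax: "@AccountName"

def mention_network(messages):
    """Build an undirected graph: an edge links a poster to each account
    it @-mentions (self-mentions are ignored)."""
    neighbors = defaultdict(set)
    for poster, text in messages:
        for mentioned in MENTION.findall(text):
            if mentioned != poster:
                neighbors[poster].add(mentioned)
                neighbors[mentioned].add(poster)
    return neighbors

def degree_ranking(neighbors):
    """Accounts sorted by number of distinct neighbors, highest first."""
    return sorted(((acct, len(n)) for acct, n in neighbors.items()),
                  key=lambda pair: (-pair[1], pair[0]))

# Toy messages with hypothetical accounts.
messages = [
    ("user_a", "@power_co when will power be back?"),
    ("user_b", "@power_co no water station nearby"),
    ("user_c", "@user_a try the library"),
    ("user_d", "no mention in this one"),
]
network = mention_network(messages)
ranking = degree_ranking(network)
print(ranking)
```

Accounts that neither mention anyone nor are mentioned (such as `user_d` here) do not appear in the graph.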

  1. @AlwaysSafePowerCompany: Based on the frequency chart, the Always Safe Power Company is the most frequently mentioned name after the disaster. Even though the company announced that citizens were safe and that the situation was being handled very well, the messages mentioning @AlwaysSafePowerCompany contain many complaints from survivors, especially about the company's rescue strategy. Based on the messages (see the message view in Fig. 17), the company did not apply a thoughtful strategy or allocate sufficient resources to rescue survivors after the disaster. It also did not set up enough water stations, which caused a water shortage. People in danger also could not reach the emergency number.

    The water shortage can also be verified from the word cloud view, where "water" and "help" are the most prominent words. The co-occurrence word cloud and the corresponding message view also indicate that food was in short supply. All of this information reflects the survivors' complaints.

    Fig 17. Left: The social network, map, word cloud, and message display for @AlwaysSafePowerCompany. Right: use the circular view to explore @AlwaysSafePowerCompany's network and find out who replies to the company and who is mentioned in the company's tweets.

  2. Before the earthquake, people in St. Himark talked mostly about movies, food, and places to have fun. Many people liked raising their children here. Things changed after the earthquake. A lack of earthquake knowledge made it hard for residents to cope during this period. For example, they did not know where to find a shelter or how to get to one. They did not know that water should be boiled before drinking when its safety is in doubt. They did not know that they needed to stock up on essentials for a potential earthquake. Negative sentiment spread increasingly throughout the community, and we found many complaints about the government, the power company, officials, the fire department, and rescue teams. The community should take responsibility for helping citizens learn more about earthquakes and other natural disasters.
    Fig 18. Left: People's complaints about the rescue workers. Right: Derek Nolan is very active in the network, with a degree of 213. He needed help removing a power line from his car, but in the meantime he was also fostering panic on YInt.

  3. After the earthquake, rumors panicked local people. Even small rumors can make things worse. Topic 8 is all about rumors and things people heard from friends or neighbors. By clicking on topic 8, we can see how this topic changes over time in the stream graph, all the messages belonging to this topic in the message view, the spatial distribution of the messages, and the word co-occurrences in the word cloud. Many people were saying "the city is evacuating". Some heard from friends that there were 100 fatalities or even 548 fatalities. Some worried that there would be a tsunami. No officials came forward to provide accurate information or stop the rumors. The government should provide accurate disaster-related reports through websites, apps, SMS messages, radio broadcasts, etc. to reduce ambiguity, so that the public does not panic and feels safer.

    Fig 19. Rumors explosively spread on the social platform after the earthquake.

MC 3.4 – The data for this challenge can be analyzed either as a static collection or as a dynamic stream of data, as it would occur in a real emergency.  Describe how you analyzed the data - as a static collection or a stream.  How do you think this choice affected your analysis? Limit your response to 200 words and 3 images.

We mainly analyzed the data as a static collection. We also created a streaming view to help analysts explore dynamic changes in keywords, hashtags, mentioners, and NER entities (Fig. 20). Displaying the number of topics or sub-topics in real time remains a challenge for us, and the temporal LDA model we use makes the animation very slow. The advantage of the streaming view is that it maintains situation awareness and provides valuable insights shortly after an emergency happens. The disadvantage of stream processing is that the system shows only a fixed number of variables from the stream and may lose some temporal information about each topic. Additionally, we used machine learning algorithms to fix typos in the messages and trained models to identify the topics in the textual data. Because stream processing makes a single pass over the data, it is not a good fit for model training, and static analysis can be an effective complement to streaming analysis. Therefore, we developed two views: the streaming view, which shows dynamic changes in keywords, hashtags, mentioners, and NER entities in real time, and the analytic view, which is used to explore the topics and discover interesting patterns in the text data. In the analytic view, analysts can also view how topics, keywords, mentioners, and NER entities change over time by brushing the timeline in the stream graph (Fig. 20).

Fig 20. Left: the streaming view shows the number of mentioners/hashtags/NER entities in real time. Each particle represents a user who replies to another user's tweet or who is mentioned in the replies. It moves from the passer to the mentioner. Right: Brush the timeline in the stream graph to filter by time.
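
To illustrate the single-pass constraint discussed above, here is a minimal sketch of a sliding-window keyword counter, one common way to keep running counts over a time-ordered stream without revisiting old data. The class name, the one-hour window, and the toy events are assumptions; this is not the streaming view's actual implementation.

```python
from collections import Counter, deque
from datetime import datetime, timedelta

class SlidingWindowCounter:
    """Single-pass keyword counts over the most recent `window` of a
    time-ordered stream."""

    def __init__(self, window=timedelta(hours=1)):
        self.window = window
        self.events = deque()   # (timestamp, keyword) pairs still in the window
        self.counts = Counter()

    def add(self, timestamp, keywords):
        for kw in keywords:
            self.events.append((timestamp, kw))
            self.counts[kw] += 1
        # Evict events that have fallen out of the window. Each event is
        # appended once and removed at most once.
        while self.events and self.events[0][0] <= timestamp - self.window:
            _, old = self.events.popleft()
            self.counts[old] -= 1
            if self.counts[old] == 0:
                del self.counts[old]

    def top(self, n=3):
        return self.counts.most_common(n)

# Toy stream (made-up events).
stream = SlidingWindowCounter(window=timedelta(hours=1))
t0 = datetime(2020, 4, 8, 9, 0)
stream.add(t0, ["bridge", "water"])
stream.add(t0 + timedelta(minutes=30), ["water", "power"])
stream.add(t0 + timedelta(minutes=70), ["power"])
print(stream.top())
```

Because old events are discarded, such a counter cannot retrain a topic model on the full history, which is consistent with pairing the streaming view with a static analytic view.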