Entry Name:  "SMU-JAKAY-MC1"

VAST Challenge 2018
Mini-Challenge 1

Team Members:

Kishan Bharadwaj Shridhar, Singapore Management University, kishanbs.2016@mitb.smu.edu.sg PRIMARY

Akangsha Bandalkul, Singapore Management University, akangshab.2016@mitb.smu.edu.sg

Angad Srivastava, Singapore Management University, angads.2016@mitb.smu.edu.sg

Ong Guan Jie Jason, Singapore Management University, jason.ong.2016@mitb.smu.edu.sg 

Zhang Yanrong Yale, Singapore Management University, yrzhang.2016@mitb.smu.edu.sg

Student Team:  YES

Tools Used:

Python

Tableau

Excel

 

Approximately how many hours were spent working on this submission in total?

180

May we post your submission in the Visual Analytics Benchmark Repository after VAST Challenge 2018 is complete?  YES

Video

https://tinyurl.com/ycgv25nq 

 

Tableau Workbook: https://tinyurl.com/yc3edlrp

 

 

Questions        

1 – Using the bird call collection and the included map of the Wildlife Preserve, characterize the patterns of all of the bird species in the Preserve over the time of the collection. Please assume we have a reasonable distribution of sensors and human collectors providing the recordings, so that the patterns are reasonably representative of the bird locations across the area. Do you detect any trends or anomalies in the patterns? Please limit your answer to 10 images and 1000 words.

 

Spatiotemporal data of bird calls, recorded at various locations and at different points in time, helps us visualize the movements of the bird species. As shown below, there is significant activity between various points in the Preserve across the time range. The lines trace an animated path through time and show that bird calls are recorded at almost all points in the Preserve. Given that the sensor recording patterns are stated to represent bird locations, we can estimate where the home base of each bird species is.

 

 


 

 

Figure 1: Tracing the path of bird movements across the preserve through the cumulative density of the recordings

 

Identifying the home base and migration patterns of bird species

We can analyze where a bird spends the most time in the preserve, and what kind of movement patterns it has exhibited. For each bird, we can infer and attribute a home base. Furthermore, we can also see if there are any notable trends of bird migrations.

 

In the table below, North is at the top of the Preserve map provided. For example, the alleged Kasios dumping site is in the North East of the Preserve.

 

Bird | Home base identified | Migration patterns (if any)
Bent Beak Riffraff | No clear home base, however this species is mostly on the left side of the preserve |
Blue Collared Zipper | South West | Travels across the preserve occasionally
Bombadil | North West | Rarely migrates
Broad-Winged Jojo | South West | Travels across the preserve occasionally
Canadian Cootamum | North West | Travels across the preserve occasionally
Carries Champaign Pipit | South East | Travels across the preserve occasionally
Darkwing Sparrow | No clear home base | Travels all around the Preserve
Eastern Corn Skeet | Upper middle portion of the Preserve | Predominantly travels in the top half of the preserve
Green-tipped Scarlet Pipit | No clear home base | Travels all around the Preserve
Lesser Birchbeere | Lower middle portion of the Preserve | Alternates between South East lower to South West lower
Orange Pine Plover | Lower middle portion of the Preserve | Alternates between South East lower to South West lower
Ordinary Snape | North East | Rarely migrates
Pinkfinch | Lower middle portion of the Preserve | Alternates between South East lower to South West lower
Purple Tooting Foot | No clear home base | Travels all around the Preserve
Qax | Middle-left portion of the preserve (based on limited data) | Limited data available
Queenscoat | North West | Rarely migrates
Rose-crested Blue Pipit | North East | Travels to an extent around the Preserve
Scrawny Jay | Left of the Preserve | Alternates between North West and South West
Vermillion Trillian | North West | Alternates between North and South West, closer to the middle of the Preserve

Table 1: Home base and migration patterns of the bird species over time

 

 

 


 

 

Figure 2: Identifying the home base and migration patterns of each of the bird species

 

 

 

The Frequent Flyers

The coordinates of each location allow us to calculate the distance travelled by each species between successive recordings. The stacked box plot below gives a deeper view of the total distances travelled over time.

 

 


Figure 3: Sum of total distances travelled by a bird species between successive recordings
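
The distance calculation behind Figure 3 can be sketched roughly as follows. This is a minimal illustration, not the team's actual workbook logic: it assumes each recording carries a species name, a sortable timestamp and x/y map coordinates (the tuple layout is hypothetical), and it assumes straight-line distance between successive recordings.

```python
import math
from collections import defaultdict


def total_distance_by_species(recordings):
    """Sum straight-line distances between successive recordings per species.

    `recordings` is an iterable of (species, timestamp, x, y) tuples. The
    tuple layout is a hypothetical stand-in for the real metadata columns.
    """
    by_species = defaultdict(list)
    for species, timestamp, x, y in recordings:
        by_species[species].append((timestamp, x, y))

    totals = {}
    for species, points in by_species.items():
        points.sort()  # chronological order (timestamp is the first field)
        total = 0.0
        for (_, x0, y0), (_, x1, y1) in zip(points, points[1:]):
            total += math.hypot(x1 - x0, y1 - y0)
        totals[species] = total
    return totals


# Made-up coordinates, for illustration only.
sample = [
    ("Qax", "2015-01-01", 0, 0),
    ("Qax", "2015-02-01", 3, 4),
    ("Qax", "2015-03-01", 3, 10),
    ("Bombadil", "2015-01-15", 5, 5),
]
print(total_distance_by_species(sample))
```

In this toy input the Qax moves 5 units and then 6 units, so its total is 11; a species with a single recording has a total of 0.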

 

Overall, the Orange Pine Plover, Queenscoat and Rose-crested Blue Pipit species travel the most across the Preserve, in line with the migration patterns identified above.

 

An emergency landing take-off in the North East of the Preserve?

 

The exact location of the alleged Kasios dumping site lets us check whether any significant activity was observed close to it. Specifically, we look for alarming patterns or changes in migration away from the North East of the Preserve at a point in time, using the Euclidean distance between each species recording and the dumping site.

 

The Rose-crested Blue Pipit inhabited the region around the dumping site over the years until around December 2014. A significant and sudden drop in the number of recordings near the dumping site indicates a clear movement of this species away from the site. This makes a strong case for flagging it as an anomaly: the home base of the pipits, hitherto stable, has been disturbed.
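
A minimal sketch of this check, assuming recordings are reduced to (year, x, y) tuples and that "near" means within a chosen radius of the site. The site coordinates, radius and sample points below are placeholders, not values from the dataset.

```python
import math
from collections import Counter


def near_site_counts(recordings, site, radius):
    """Count recordings per year that fall within `radius` of `site`.

    `recordings` holds (year, x, y) tuples and `site` is an (x, y) pair.
    Distance is Euclidean, as described in the text.
    """
    site_x, site_y = site
    counts = Counter()
    for year, x, y in recordings:
        if math.hypot(x - site_x, y - site_y) <= radius:
            counts[year] += 1
    return dict(counts)


# Placeholder values, for illustration only.
pipit_recordings = [
    (2013, 10, 10),
    (2013, 11, 9),
    (2014, 10, 12),
    (2015, 50, 50),
]
print(near_site_counts(pipit_recordings, site=(10, 10), radius=5))
```

A year that has recordings overall but is missing from the result (2015 in the toy input) is the kind of sudden drop near the site that the text flags.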

 


Figure 4: Migration of the Rose Crested Blue Pipit from December 2014

 

 

Narrowing down further, the GIF below makes clear that the sudden movement away from the site starts in December 2014.

 

 

 

 

 

Figure 5: An animated visualization of the sudden shift in the home base of the Rose Crested Blue Pipit

 

 

2 – Turn your attention to the set of bird calls supplied by Kasios. Does this set support the claim of Pipits being found across the Preserve?  A machine learning approach using the bird call library may help your investigation. What is the role of visualization in your analysis of the Kasios bird calls?   Please limit your answer to 10 images and 1000 words.

 

 

In this section, we shall cover the methodology underlying our analyses of the audio files provided, followed by how we utilized that approach to refute Kasios’ claim.

 

Role of Visualization in analyzing bird calls

 

An overview of the methodology is as shown below.

 

 


 

 

Figure 6: An overview of the deep learning approach to classify bird sounds

 

 

Data Preprocessing

 

The team decided to analyze the audio files by converting them into spectrograms and then applying image classification to identify the bird species tagged to each audio file. We used Librosa, a Python package, to accomplish this. In doing so, we found that recordings with sound quality ratings A and B are of higher audio quality than those in the other 3 bands. This inference was arrived at by manually inspecting a random sample of audio clips from each category.

Because the audio recordings vary in length, the spectrograms were trimmed into chunks of 5 seconds each. As can be seen below, the images clearly show the differences between the audio waves of different bird species.
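
The preprocessing step can be sketched as below. Several details are assumptions rather than the team's confirmed settings: the audio is split before the spectrogram is computed (the text describes trimming the spectrograms themselves), a mel spectrogram on a dB scale is used, and a trailing remainder shorter than 5 seconds is dropped. `librosa.load`, `librosa.feature.melspectrogram` and `librosa.power_to_db` are real Librosa calls; the function names are hypothetical.

```python
def split_into_chunks(samples, sample_rate, chunk_seconds=5):
    """Split a 1-D sequence of audio samples into fixed-length chunks.

    A trailing chunk shorter than `chunk_seconds` is dropped. That choice is
    an assumption; the text does not say how remainders were handled.
    """
    size = int(sample_rate * chunk_seconds)
    return [samples[i:i + size] for i in range(0, len(samples) - size + 1, size)]


def chunk_spectrograms(path, chunk_seconds=5):
    """Load one recording and return a dB-scaled mel spectrogram per chunk."""
    # Third-party packages, imported lazily so the helper above has no dependencies.
    import librosa
    import numpy as np

    samples, sample_rate = librosa.load(path, sr=None)  # keep the native rate
    spectrograms = []
    for chunk in split_into_chunks(samples, sample_rate, chunk_seconds):
        mel = librosa.feature.melspectrogram(y=chunk, sr=sample_rate)
        spectrograms.append(librosa.power_to_db(mel, ref=np.max))
    return spectrograms


# Toy check of the chunking alone: 12 "samples" at 1 Hz give two 5-second chunks.
print(split_into_chunks(list(range(12)), sample_rate=1))
```

Each returned array could then be saved as an image and passed to the classifier.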

 

Figure 7: [Above] An illustration of how an audio file is made into a spectrogram. [Below] An illustration of a 5 second chunk of a specific bird call audio

 

A dev set of 50 images for each bird was held out for model validation. The rest of the images were used as training data, modeling this as a supervised classification problem.
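
A minimal sketch of such a per-species hold-out, assuming the spectrogram images are listed as (image_id, species) pairs with unique ids. Random selection with a fixed seed is an assumption; the text does not say how the 50 images were picked.

```python
import random
from collections import defaultdict


def split_dev_train(images, dev_per_class=50, seed=0):
    """Hold out `dev_per_class` images per species; the rest form the training set.

    `images` is a list of (image_id, species) pairs with unique ids.
    """
    by_species = defaultdict(list)
    for image_id, species in images:
        by_species[species].append(image_id)

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    dev, train = [], []
    for species, ids in by_species.items():
        held_out = set(rng.sample(ids, min(dev_per_class, len(ids))))
        for image_id in ids:
            target = dev if image_id in held_out else train
            target.append((image_id, species))
    return dev, train


# Made-up image ids, for illustration only.
images = [(f"qax_{i}", "Qax") for i in range(60)]
images += [(f"bombadil_{i}", "Bombadil") for i in range(80)]
dev, train = split_dev_train(images)
print(len(dev), len(train))
```

Holding out the same number of images per species keeps the dev set balanced even when some species have far more recordings than others.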

 

Tuning the weights of the Inception V4 Network

 

The team is inspired by the traction gained in the field of computer vision and believes the active research in the deep learning community can be leveraged to help us accomplish this classification and visualization task. We used Inception V4 [1], a pioneering neural network architecture developed by Szegedy et al. Built on inception blocks, such networks are tailor-made for computer vision workloads and are popular entrants in events such as the ImageNet Challenge. The underlying architecture of the network is shown below: the red circles indicate the inception blocks, the blue layers indicate convolution, the red boxes indicate pooling layers, and the final fully connected layer feeds into a yellow SoftMax layer. Normalization layers are illustrated in green. No changes were made to the authors’ architecture, and we follow the authors’ implementation of the Inception V4 network.

 

Figure 8: The architecture of the Inception V4 neural network (for more details, please refer to https://arxiv.org/pdf/1602.07261.pdf)

 

We tuned the weights of the network over 30 epochs (passes through the training set), using a cross-entropy loss function and backpropagation. Development was done in Python using the Caffe framework.
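
For readers unfamiliar with the loss, the following plain-Python sketch shows what a softmax cross-entropy computes for a single image chunk. It is an illustration of the formula only, not the Caffe training code, and the scores are made up.

```python
import math


def softmax(logits):
    """Turn raw class scores into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]  # subtract the max for stability
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, true_index):
    """Negative log-probability assigned to the true class."""
    return -math.log(softmax(logits)[true_index])


# Made-up scores for three species; index 0 is treated as the true class.
confident = cross_entropy([5.0, 0.0, 0.0], true_index=0)
wrong = cross_entropy([0.0, 5.0, 0.0], true_index=0)
print(confident, wrong)
```

The loss is small when the network puts most of its probability on the correct species and large otherwise; backpropagation adjusts the weights to reduce it.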

Furthermore, t-distributed Stochastic Neighbor Embedding (t-SNE) is used to visualize how well separated the species are based on the features learnt by the network. The more separated each species is, the better the model performance. As shown below, the network makes a clear distinction between the Rose Crested Blue Pipit and other birds.

 

 

 


 

Figure 9: Visualizing the features learnt by the network on a two-dimensional scale by the use of t-SNE

 


 

Figure 10: The t-SNE shows a clear separation of the Rose Crested Blue Pipit’s features learnt by the network
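
A projection like the one in Figures 9 and 10 could be produced along the following lines. This sketch assumes scikit-learn's `TSNE` and a list of per-image feature vectors taken from the network; the text does not say which t-SNE implementation or settings the team used, and `pick_perplexity` is a hypothetical helper.

```python
def pick_perplexity(n_samples, preferred=30.0):
    """Choose a perplexity below the sample count, which t-SNE requires."""
    if n_samples < 2:
        raise ValueError("t-SNE needs at least two samples")
    return min(preferred, max(1.0, (n_samples - 1) / 3.0))


def embed_features(features, seed=0):
    """Project per-image feature vectors to two dimensions with t-SNE."""
    # Third-party packages, imported lazily; scikit-learn is an assumption.
    import numpy as np
    from sklearn.manifold import TSNE

    matrix = np.asarray(features, dtype=float)
    tsne = TSNE(
        n_components=2,
        perplexity=pick_perplexity(len(matrix)),
        init="pca",
        random_state=seed,
    )
    return tsne.fit_transform(matrix)


print(pick_perplexity(10), pick_perplexity(1000))
```

Plotting the two output columns and coloring each point by its species label gives a scatter plot in which well-learnt species appear as distinct clusters.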

 

Model Evaluation

 

We use precision as the metric for assessing our models on the dev set. We attain 78% precision on the image chunks (label 1 in the figure below). After this, we labeled each audio file based on the majority classification of its chunks, e.g. if 7 out of 10 chunks were classified as Rose Crested Blue Pipit and the remaining 3 as Qax, the entire recording would be labeled Rose Crested Blue Pipit. The threshold we use to label an entire recording is more than 50% of its chunks.

The precision attained is 80.4% on the overall label classification (label 2).
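
The chunk-to-recording voting rule can be sketched as follows. Returning `None` when no species exceeds 50% is an assumption, since the text does not describe that case, and the per-species precision helper is a generic definition rather than the team's exact evaluation script.

```python
from collections import Counter


def label_recording(chunk_labels):
    """Return the species predicted for more than 50% of chunks, else None."""
    if not chunk_labels:
        return None
    species, votes = Counter(chunk_labels).most_common(1)[0]
    return species if votes > len(chunk_labels) / 2 else None


def precision(predicted, actual, target):
    """Of the items predicted as `target`, the fraction that truly are `target`."""
    truths = [a for p, a in zip(predicted, actual) if p == target]
    if not truths:
        return 0.0
    return sum(1 for a in truths if a == target) / len(truths)


# The example from the text: 7 of 10 chunks are Rose Crested Blue Pipit.
chunks = ["Rose Crested Blue Pipit"] * 7 + ["Qax"] * 3
print(label_recording(chunks))
```

With the example from the text, the recording is labeled Rose Crested Blue Pipit because 7 of its 10 chunks agree, which is above the 50% threshold.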

 

Figure 11: A simplified map illustrating the model flow

 

Testing Kasios Claims

 

The 15 test files provided by Kasios are then fed into the model to produce the output labels. We find that the bird recordings do not all belong to the Pipit species as claimed by Kasios. In fact, 12 out of the 15 recordings belong to non-Pipit species. A breakdown of the classified recordings can be seen below, indicating that pipits are not found across the Preserve and that just 20% of Kasios’ claims are correct.

 


 

Figure 12: Results of classification of the 15 audio files provided by Kasios

 

Furthermore, when we plot these classified outputs on the map, it is evident that Kasios’ recording locations do not match what we inferred as the pipit home base from Figure 2.

 


 

Figure 13: The files provided by Kasios annotated with the results derived from the model indicate that only 3 of them are Rose Crested Blue Pipits as opposed to the Kasios claim of pipits being found across the preserve.

Furthermore, the home base of such pipits should have been near the X annotated in red in the North East, as identified in Figure 2, and not in the North West as shown above.

 

3 – Formulate a hypotheses concerning the state of the Rose Crested Blue Pipit.  What are your primary pieces of evidence to support your assertion?  What next steps should be taken in the investigation to either support or refute the Kasios claim that the Pipits are actually thriving across the Boonsong Lekagul Wildlife Preserve?  Please limit your answer to 500 words.

 

 

Hypotheses and supporting evidence

 

Our key hypothesis is that the Rose Crested Blue Pipit is a species that will continue to be affected by Kasios’ activities, such as the chemical dumping and smokestack emissions. The claim put forward by Kasios is that pipits are happily thriving across the Preserve.

 

The following factors reinforce why we can refute their claim.

  1. Kasios claims to have recorded ‘pipit’ sounds from across the preserve. Assuming by pipits they mean Rose Crested Blue Pipits, only 3 of the 15 recordings (20%) belong to pipits. Therefore, the recordings provided by Kasios cannot be relied on.
  2. We can further strengthen the evidence against Kasios’ claims with the analysis we carried out in Question 1 and the coordinates of the Kasios recordings provided as part of the dataset. When plotting these classified outputs on the map, it is evident that Kasios’ recording locations do not match what we inferred as the pipit home base from Figure 2. We identified the North East region (near the Kasios dumping site) as the home base for the pipits, whereas the model output places the Kasios pipit recordings in the North West. See Figure 13 above.
  3. Last but not least, the clear distancing of the species from the alleged Kasios dumping site beginning at the end of 2014 implies that there is a cause behind the sudden change in migration patterns.

 

Where to head in the future with this investigation?

 

The next step to refute Kasios’ claim would be to take independent recordings across the entire Preserve over a period of a few months to determine whether the Pipit species is really found across the Preserve. These recordings should be taken at regular intervals, e.g. daily or weekly, in the mornings or evenings, to accurately determine the true migration patterns of the species. The effort to estimate pipit numbers can be extended by using video analytics to capture visual proof of the birds at different locations. The audio classification methodology can be extended to video with a similar transfer learning approach, as surveillance methods are fast gaining traction in the current technology era.

 

“The ball is in your court, Kasios!”

P.S.: Is John Torch still driving those trucks around?


[1] https://arxiv.org/pdf/1602.07261.pdf