Chan En Ying Grace, Singapore Management University, gracechan.2017@mitb.smu.edu.sg PRIMARY
Student Team: YES
R
Approximately how many hours were spent working on this submission in total?
100 hours
May we post your submission in the Visual Analytics Benchmark Repository after VAST Challenge 2018 is complete? YES
Video
Questions
1 – Using the bird call collection and the included map of the
Wildlife Preserve, characterize the patterns of all of
the bird species in the Preserve over the time of the collection. Please assume
we have a reasonable distribution of sensors and human collectors providing the
recordings, so that the patterns are reasonably representative of the bird
locations across the area. Do you detect any trends or anomalies in the
patterns? Please limit your answer to 10 images and 1000 words.
1. Clustering by Species
The facet plot characterises each species' clustering pattern relative to the alleged dumping site. If the dumping took place, one would expect other species in the dumping area to decline as well, not only the Rose Pipits. Other species clustered around the dumping site include the Ordinary Snape and the Lesser Birchbeere. We thus assign the Ordinary Snape (OS) and Lesser Birchbeere (LB) as our Control Groups, and investigate the Rose-Crested Blue Pipit (Rose Pipit) as our Treatment Group.

Figure 1: Clustering of Species Across Locations
2. Clusters of Rose Pipits Across Time
In 2013, the cluster was found right at the alleged dumping site. We define this as the home range, where the Pipits tended to thrive.
In later years the cluster shifted away from this site, which could be a sign of dumping. The dumping may have driven the Pipits further away, and their population dwindled as they struggled to survive.

Figure 2: Rose Pipits Across Time
3. Kernel Density Plot of Rose Pipits Across Time
For clearer visualisation, we use a Kernel Density Plot to estimate the intensity of the point distribution for each year from 2012 to 2017.
a. Two Clusters Become One; New Cluster "Overtook" Old
There were originally two clusters in 2012: one at the dumping site and one away from it. The cluster at the dumping site grew in size until 2014, after which the cluster further away "took over" the growth.
b. Old Cluster Vanished in 2015
The cluster at the dumping site vanished completely from 2015, leaving only the cluster away from the dumping site. This could indicate a movement of birds from the dumping-site cluster towards the other cluster, possibly a sign of dumping.
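The kernel density idea above can be illustrated with a minimal sketch. The original analysis was done in R; the Python below is only a stand-in, and the function names, the 200 x 200 map extent, the bandwidth and the sample coordinates are all assumptions made for illustration (only the two reference locations (148, 159) and (130, 120) come from this submission). It applies no edge correction.

```python
import math

def kernel_intensity(points, x, y, bandwidth):
    """Gaussian-kernel estimate of point intensity (points per unit area) at (x, y)."""
    norm = 1.0 / (2.0 * math.pi * bandwidth ** 2)
    total = 0.0
    for px, py in points:
        d2 = (x - px) ** 2 + (y - py) ** 2
        total += norm * math.exp(-d2 / (2.0 * bandwidth ** 2))
    return total

def intensity_grid(points, width, height, step, bandwidth):
    """Evaluate the intensity on a regular grid; returns {(x, y): intensity}."""
    grid = {}
    for gx in range(0, width + 1, step):
        for gy in range(0, height + 1, step):
            grid[(gx, gy)] = kernel_intensity(points, gx, gy, bandwidth)
    return grid

# Hypothetical recordings for one year: four near (148, 159), two near (130, 120).
sample_points = [(146, 158), (149, 160), (150, 157), (147, 161), (130, 120), (133, 118)]

# Assumed map extent of 200 x 200 units and an assumed bandwidth of 8 units.
grid = intensity_grid(sample_points, width=200, height=200, step=10, bandwidth=8.0)
peak_cell = max(grid, key=grid.get)
print("Grid cell with highest estimated intensity:", peak_cell)
```

Running the same computation per year and comparing where the peak cell sits is one way to track a home range that drifts between clusters.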

Figure 3: Kernel Density Plot of Rose Pipits
We then apply density-based and distance-based measures to test whether the clusters are statistically significant, which could shed light on the dumping allegation.
a. Quadrat Analysis
Clusters were significant across all six years, but became less compact after 2015. The Pearson statistic peaked in 2015 and fell after that, a sign of reduced cluster compactness.
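For readers unfamiliar with the Pearson statistic in a quadrat test, the sketch below shows how it can be computed: the map is divided into equal quadrats, and observed counts are compared with the count expected if points were spread uniformly. This is an illustrative Python stand-in rather than the R code used for Figure 4; the 200 x 200 extent, the 2 x 2 quadrat layout and the sample points are assumptions.

```python
def quadrat_counts(points, width, height, nx, ny):
    """Count points falling in each cell of an nx-by-ny grid of equal quadrats."""
    counts = [[0] * nx for _ in range(ny)]
    for x, y in points:
        col = min(int(x / width * nx), nx - 1)   # clamp points lying on the far edge
        row = min(int(y / height * ny), ny - 1)
        counts[row][col] += 1
    return counts

def pearson_statistic(counts):
    """Pearson chi-square statistic: sum of (observed - expected)^2 / expected."""
    flat = [c for row in counts for c in row]
    expected = sum(flat) / len(flat)
    return sum((c - expected) ** 2 / expected for c in flat)

# Hypothetical patterns: eight points packed into one corner vs. two per quadrat.
clustered = [(10, 10), (12, 11), (11, 13), (13, 12), (14, 10), (10, 14), (12, 12), (11, 11)]
spread = [(50, 50), (60, 40), (150, 50), (140, 60), (50, 150), (40, 140), (150, 150), (160, 140)]

stat_clustered = pearson_statistic(quadrat_counts(clustered, 200, 200, 2, 2))
stat_spread = pearson_statistic(quadrat_counts(spread, 200, 200, 2, 2))
print(stat_clustered, stat_spread)
```

A larger statistic means a stronger departure from a uniform spread, which is why a peak in 2015 followed by a decline reads as the cluster loosening.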

Figure 4: Quadrat Analysis Results
b. Nearest Neighbour, followed by Clark-Evans Test
In 2015, the majority of each Pipit's neighbours were within 10 units. From 2016, however, the neighbouring distance expanded towards 40 units. Although the clusters were statistically significant from 2012 to 2017 (by the Clark-Evans test), they started to disperse after 2015, the year of suspected dumping. This supports the finding above that 2015 had the most compact cluster.
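The nearest-neighbour reasoning above can be sketched as follows. The Clark-Evans index divides the observed mean nearest-neighbour distance by the value expected under complete spatial randomness, 1 / (2 * sqrt(density)); values below 1 suggest clustering. This Python sketch is illustrative only (the submission used R), applies no edge correction, and uses an assumed 200 x 200 map area and made-up points.

```python
import math

def nearest_neighbour_distances(points):
    """Distance from each point to its closest other point."""
    dists = []
    for i, (xi, yi) in enumerate(points):
        best = min(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points)
            if j != i
        )
        dists.append(best)
    return dists

def clark_evans_index(points, area):
    """Observed mean nearest-neighbour distance over its expectation under randomness."""
    n = len(points)
    observed = sum(nearest_neighbour_distances(points)) / n
    expected = 1.0 / (2.0 * math.sqrt(n / area))
    return observed / expected

AREA = 200 * 200  # assumed map extent, not stated in the text

# Hypothetical patterns: a tight cluster and a widely spaced regular layout.
tight = [(100, 100), (101, 100), (100, 101), (101, 101)]
regular = [(50, 50), (150, 50), (50, 150), (150, 150)]

r_tight = clark_evans_index(tight, AREA)
r_regular = clark_evans_index(regular, AREA)
print(r_tight, r_regular)
```

An index that stays below 1 but rises year on year would match the description of clusters that remain significant while dispersing.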

Figure 5: Nearest Neighbour for Rose Pipits
c. K-Function
Similarly, with the K-Function, the size and strength of the clusters reduced after 2015, when the grey confidence band widened. Eventually, in 2017, the cluster lost its significance for radii below 5; that is, the cluster became less compact.
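As a rough illustration of what the K-Function measures, the sketch below computes a naive Ripley's K estimate and compares it with pi * r^2, the value expected under complete spatial randomness. It is a Python stand-in for the R analysis, omits edge correction and the simulation envelope (the grey band in Figure 6), and uses an assumed 200 x 200 area with made-up points.

```python
import math

def ripley_k(points, r, area):
    """Naive Ripley's K: area * (ordered pairs within r) / (n * (n - 1))."""
    n = len(points)
    pairs = 0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            dx = points[i][0] - points[j][0]
            dy = points[i][1] - points[j][1]
            if math.hypot(dx, dy) <= r:
                pairs += 1
    return area * pairs / (n * (n - 1))

AREA = 200 * 200  # assumed map extent

# Hypothetical tight cluster of four recordings.
cluster = [(100, 100), (101, 100), (100, 101), (101, 101)]

k_observed = ripley_k(cluster, r=2, area=AREA)
k_random = math.pi * 2 ** 2  # expectation under complete spatial randomness
print(k_observed, k_random)
```

When the observed K sits above the randomness expectation (and above the envelope) at a given radius, the pattern is clustered at that scale; losing significance below radius 5 means the excess of close neighbours has gone.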

Figure 6: K-Function of Rose Pipits
5. Were Rose Pipits the only ones affected?
If there was dumping, even our Control Groups should be affected, unless there were biological reasons why they were “immune” to the chemical. Interestingly, our Control Groups did not appear to be affected by the dumping site.
a) Ordinary Snape (Control Group 1)
As visualised in the Kernel Density Plot, the OS home range did not move and remained near the dumping site, unlike the Rose Pipits, whose home range moved away from it.
From the K-Function, the significance of the clusters grew over time, which is consistent with our hypothesis that the OS were not affected by the dumping.
When we perform a K-Cross between Rose Pipits and Ordinary Snapes, we see that the spatial dependence between Pipits and OS was strongest in 2015.
Evidence Against Dumping
If dumping took place, then the OS should have been affected too, especially in 2015, since the two species were spatially dependent. However, the OS were not affected in population or home range. One possible explanation is that the chemical only affected the Rose Pipits.
Evidence For Dumping
Since 2015 was the year when the Rose Pipits moved away from the dumping site, it is perhaps to be expected that the spatial dependence is strongest in 2015 if dumping indeed affected the Pipits. The Pipits moved away from the dumping site and ended up closer to the OS home range, hence the stronger K-Cross significance.
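The K-Cross used above extends the K-Function to two species: it counts how many birds of one species lie within a radius of each bird of the other. A minimal Python sketch follows, as an illustration rather than the R code behind Figure 7; it has no edge correction, and the area and all coordinates other than the dumping site (148, 159) are assumptions.

```python
import math

def cross_k(points_a, points_b, r, area):
    """Naive cross-type K: area * (A-B pairs within r) / (n_a * n_b)."""
    pairs = sum(
        1
        for ax, ay in points_a
        for bx, by in points_b
        if math.hypot(ax - bx, ay - by) <= r
    )
    return area * pairs / (len(points_a) * len(points_b))

AREA = 200 * 200  # assumed map extent

# Hypothetical recordings: two Pipits and one Snape near the dumping site, one Snape far away.
pipits = [(148, 159), (150, 160)]
snapes = [(149, 158), (60, 40)]

k_cross = cross_k(pipits, snapes, r=5, area=AREA)
k_independent = math.pi * 5 ** 2  # expectation if the two species were independent
print(k_cross, k_independent)
```

A cross-K well above pi * r^2 indicates that the two species occur near each other more often than independence would predict, which is the sense in which the text speaks of spatial dependence.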

Figure 7: KDE, K-Function & K-Cross for Ordinary Snape
b) Control Group 2 – Lesser Birchbeere
As for the Lesser Birchbeere, its home range moved even closer to the dumping site: a new cluster formed near the dumping site in 2016 and by 2017 had grown almost as large as the former cluster.
However, the clusters were not statistically significant in 2016 (until radius > 15). This could be because of the emergence of the new cluster in 2016, as seen below. That said, the clusters became significant in 2017, when the new cluster grew in size. We thus maintain our hypothesis that the Lesser Birchbeeres were not affected by the dumping.
The K-Cross shows that the LBs had no spatial dependence with the Rose Pipits in any of the six years, and hence could be a poor control. Nevertheless, the OS still serve as good controls, so we place greater emphasis on the results arising from the OS.

Figure 8: KDE, K-Function & K-Cross for Lesser Birchbeere
6. Perhaps Bird Song is indicative of thriving population, while Bird Call is a sign of distress?
Some research suggests that bird songs could be indicative of happier birds, while calls are signs of distress. Consistent with this, Rose Pipits stopped singing and there were more signals of distress (calls) from 2015 to 2017. This could be because of the dumping: songs turned to calls, and 2015 was the year in which we see the most calls.
To check this, we perform a K-Cross between Pipits that called and Pipits that sang. As hypothesised, 2015, the suspected year of dumping, saw the most statistically significant spatial dependence between Pipits that called and Pipits that sang. This supports our hypothesis that the songs turned to calls during the year of dumping.

Figure 9: Facet Plot of Rose Pipit - By Call & Song

Figure 10: K-Cross between Pipits who Sang & Pipits who Called
2 – Turn your attention to the set of bird calls supplied by Kasios.
Does this set support the claim of Pipits being found across the Preserve? A machine learning approach using the bird
call library may help your investigation. What is the role of visualization in
your analysis of the Kasios bird calls?
Please limit your answer to 10 images and 1000 words.
1. Envelope Plot
First, we plot the amplitude envelope for each of the 19 bird species (training data). We then plot the same for the 15 test birds (testing data) and compare them against the training data to label the species.
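To make the notion of an amplitude envelope concrete, here is a minimal sketch: the waveform is rectified (absolute value) and smoothed with a moving average, then scaled so recordings of different loudness can be compared by shape. This is an illustrative Python stand-in, not the R routine used to produce Figures 11 and 12; the synthetic signal, window length and function names are assumptions.

```python
import math

def amplitude_envelope(samples, window):
    """Absolute-amplitude envelope smoothed by a centred moving average."""
    rectified = [abs(s) for s in samples]
    half = window // 2
    envelope = []
    for i in range(len(rectified)):
        lo = max(0, i - half)
        hi = min(len(rectified), i + half + 1)
        envelope.append(sum(rectified[lo:hi]) / (hi - lo))
    return envelope

def normalise(envelope):
    """Scale the envelope so its peak is 1, making shapes comparable across recordings."""
    peak = max(envelope)
    return [e / peak for e in envelope] if peak > 0 else list(envelope)

# Synthetic stand-in for a bird call: a quiet tone with a loud burst in the middle third.
signal = [
    (1.0 if 333 <= i < 667 else 0.1) * math.sin(2 * math.pi * i / 20)
    for i in range(1000)
]

env = normalise(amplitude_envelope(signal, window=51))
print(round(env[50], 3), round(env[500], 3))
```

The normalised envelope keeps the temporal shape of the call (where the bursts are and how long they last), which is what the visual comparison between training and testing birds relies on.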

Figure 11: Amplitude Envelope Plot of 19 Training Bird Species

Figure 12: Amplitude Envelope Plot for 15 Testing Birds
Results
By visually comparing the amplitude envelope plots of the training and testing data, we assign a predicted species to each of the 15 test birds (last column of the table below).
2 out of 15 birds are predicted to be Rose Pipits: Test Bird 2 and Test Bird 9.
| ID | X | Y | Predicted Species |
| 1 | 140 | 119 | Eastern Corn Skeet |
| 2 | 63 | 153 | Rose-Crested Blue Pipit |
| 3 | 70 | 136 | Queenscoat |
| 4 | 78 | 150 | Bombadil |
| 5 | 60 | 90 | Canadian Cootamum |
| 6 | 126 | 103 | Qax |
| 7 | 71 | 121 | Orange Pine Plover |
| 8 | 78 | 62 | Green-Tipped Scarlet Pipit |
| 9 | 61 | 145 | Rose-Crested Blue Pipit |
| 10 | 45 | 39 | Qax |
| 11 | 132 | 106 | Scrawny Jay |
| 12 | 61 | 20 | Qax |
| 13 | 35 | 160 | Qax |
| 14 | 40 | 125 | Bombadil |
| 15 | 110 | 121 | Pinkfinch |
2. Oscillogram Plot
For confirmation, we also look at the oscillogram plot. The predicted species, indicated in the last column, is assigned after visually comparing the similarity of the oscillogram plots. Due to the image limit, we display only the birds predicted to be Rose Pipits.
The species predicted from the oscillograms match those predicted from the envelope plots. This is not a surprise, because the envelope is derived from the oscillogram.
Table 2: Oscillogram Plot for 2 Testing Birds Predicted as Rose Pipits
| Bird ID | Oscillogram | Predicted Species |
| 2 | (oscillogram image) | Rose-Crested Blue Pipit |
| 9 | (oscillogram image) | Rose-Crested Blue Pipit |
3. Trellis Plot of Acoustic Parameters
A caveat to the previous analysis is that we did not make use of all the training birds in the visualisation. Rather, we randomly selected 5 birds per species to visualise, and then chose 1 to represent the entire species. Thus, we now make use of all the training birds by plotting the distributions across the acoustic parameters.
There are 15 parameters in total, of which 7 are chosen because they show greater distinction between the species. The 7 parameters are: dom_median, HNR_median, meanFreq_median, peakFreq_median, pitch_median, pitchAutocor_median, pitchSpec_median.
The trellis plot of the 7 parameters of the training birds is plotted, with the mean indicated by the black solid line. Next, we plot each of the 15 testing birds from Kasios onto this plot as a blue dotted line. We then select the closest species for each parameter. The species selected for the most parameters is assigned as the predicted species.
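The per-parameter voting rule described above can be sketched as follows. This is an illustrative Python version of a procedure that was carried out visually on the trellis plots; the species means and test-bird values are made-up numbers, only three of the seven parameters are shown, and the function names are hypothetical.

```python
from collections import Counter

def closest_species(species_means, parameter, value):
    """Species whose mean for this parameter is nearest to the test bird's value."""
    return min(species_means, key=lambda sp: abs(species_means[sp][parameter] - value))

def predict_by_votes(species_means, test_values):
    """One vote per parameter for the closest species; the most-voted species wins."""
    votes = Counter(
        closest_species(species_means, parameter, value)
        for parameter, value in test_values.items()
    )
    # Ties are broken arbitrarily here; a real analysis should handle them explicitly.
    return votes.most_common(1)[0][0], votes

# Hypothetical per-species means (illustrative numbers, not taken from the data set).
species_means = {
    "Rose-Crested Blue Pipit": {"dom_median": 3000, "pitch_median": 2800, "peakFreq_median": 3100},
    "Qax": {"dom_median": 1500, "pitch_median": 1400, "peakFreq_median": 1600},
    "Bombadil": {"dom_median": 4500, "pitch_median": 2700, "peakFreq_median": 4400},
}

# Hypothetical test bird.
test_bird = {"dom_median": 3100, "pitch_median": 2720, "peakFreq_median": 3300}

predicted, votes = predict_by_votes(species_means, test_bird)
print(predicted, dict(votes))
```

Because each parameter votes independently of how close the match is, a species can win on several loose matches, which is one reason this rule may disagree with a comparison of waveform shape.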
Given that Test Bird 2 and Test Bird 9 were predicted to be Rose-Crested Blue Pipits, we focus on these two birds for visualisation.
The species with the most ticks (i.e. closest to the testing bird on the most parameters) is selected as the predicted species. On this basis, Test Bird 2 is predicted to be a Qax, and Test Bird 9 a Vermillion Trillian.
Unfortunately, this does not match our earlier predictions from the amplitude plots. We conclude that this method may not be ideal, as it relies on numerical summaries, whereas the amplitude plots are likely to be more reflective of the call itself (though less representative of the entire training population).
As such, we rely on Method 1 (Envelope Plot) and Method 2 (Oscillogram Plot), and leave Method 3 (Trellis Plot) out of our concluding hypothesis.

Figure 13: Test Bird 2 - Trellis Plot of Acoustic Parameters

Figure 14: Test Bird 9 - Trellis Plot of Acoustic Parameters
We also attempted classification to predict the bird species, first with a Decision Tree and then with a Random Forest.
The Decision Tree produced a high misclassification error rate of 0.574.
Based on the Decision Tree model, Test Bird 2 was predicted as a Lesser Birchbeere, while Test Bird 9 was predicted as a Green-Tipped Scarlet Pipit. This is contrary to our earlier predictions. Out of the 15 predictions, only 1 matches, namely Test Bird 7 (in green below). Given that the misclassification rate is rather high (57%), we should not rely on the classification results from the Decision Tree model.
Instead, we use Random Forest to improve on the performance of the Decision Tree. We attempt 3 different Random Forest models, tuning the parameters to reduce the misclassification rate.
Unfortunately, the lowest misclassification rate is 0.5565, which is still high and only slightly better than the Decision Tree model. Not only did the predicted results not match our visualisation plots, the table below shows that they did not match those of the Decision Tree either. We will thus not rely on the predicted results from classification.
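For reference, the misclassification rates quoted above (0.574 for the Decision Tree and 0.5565 for the best Random Forest) are the share of held-out birds whose predicted species differs from the true one. A minimal Python sketch of that calculation, using made-up labels rather than the actual model output:

```python
def misclassification_rate(actual, predicted):
    """Fraction of cases where the predicted label differs from the actual label."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    wrong = sum(1 for a, p in zip(actual, predicted) if a != p)
    return wrong / len(actual)

# Hypothetical labels for four validation birds (not the real model output).
actual = ["Qax", "Qax", "Bombadil", "Pinkfinch"]
predicted = ["Qax", "Bombadil", "Bombadil", "Qax"]

rate = misclassification_rate(actual, predicted)
print(rate)
```

A rate above 0.5 means the model labels more birds wrongly than correctly, although with 19 species this is still better than guessing at random.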
Visualisation.
In my opinion, classification is not a good method for predicting bird species here, because the data used is the same as that used in the Trellis Plots. Bird calls across species may have similar mean amplitude, pitch frequency, etc., yet differ in nature. We should look at the shape (amplitude pattern) rather than at the statistical parameters.
Note: We also attempted a spectrogram plot but found little variation across species.
The two birds predicted to be Pipits (in green) are not found in the two clusters near the dumping site. However, they did appear together, which makes sense since birds of the same species tend to fly together, and this lends some credibility to our prediction by visualisation.

Figure 15: Predicted Rose Pipits - Not Found Near Dumping Site
6. Key Observations
· Only 2 out of the 15 birds resemble the Rose Pipits.
· These 2 birds were not found near the dumping site, nor were they found in the previous 2 clusters identified.
7. Hypothesis: Pipits not found across preserve
Given that only 2 of the 15 birds provided by Kasios were likely to be Pipits, Kasios' claim that the Pipits were thriving across the Preserve is doubtful. The set of bird calls supplied by Kasios does not support the claim of Pipits being found across the Preserve.
3 – Formulate a hypotheses concerning the
state of the Rose Crested Blue Pipit.
What are your primary pieces of evidence to support your assertion? What next steps should be taken in the
investigation to either support or refute the Kasios claim that the Pipits are actually thriving across the Boonsong
Lekagul Wildlife Preserve? Please limit your answer to 500 words.
1. Pipit clusters were significant from 2012 to 2017.
2. Pipit population peaked in 2015.
3. Pipit home range moved away from the dumping site from 2015.
4. Pipit clusters became less compact from 2015 and lost their significance in 2017 for radius < 5.
5. Pipits stopped singing after 2015. Songs turned into calls, a sign of distress.
6. Pipits were the only species affected (i.e. in home range and population).
7. Control Groups thrived and even had their home range move closer to the dumping site.
8. Pipits were spatially dependent on the Ordinary Snapes in 2015, so both should have been exposed to the same "treatment" if dumping did occur; however, the Ordinary Snapes were not affected.
The Rose Pipits were surviving, as their clusters still existed and remained significant until 2017 for large radii. However, they were not thriving at the dumping site and had to move away from it. Moreover, their population had fallen. 2015 was also the year in which songs turned to calls. By contrast, the control groups (OS and LB) did not experience a fall in population; from 2015 they increased in population and moved closer to the dumping site, respectively.
So, I conclude that there were signs of dumping, which likely took place in 2015, but that the dumping most likely consisted of chemicals that affected mainly the Rose Pipits and not the other species.
2. Next Steps to be Taken: Need for RCT to determine if Dumping was the cause
However, we have not confirmed whether dumping was the cause. If it was, then it only affected the Rose Pipits. If it was not, then something else must be causing the slow death of the Pipits.
To test whether the dumping caused the decline, we can conduct a Randomised Controlled Trial (RCT). It is a more rigorous way of determining whether a cause-effect relation exists between treatment (dumping substance) and outcome (death of Pipits).
Introduce the dumping substance to both a Rose Pipit and an Ordinary Snape at the same location (e.g. the dumping site at coordinates = (148,159)). If only the Rose Pipit dies after being exposed to the substance, while the Ordinary Snape survives, then our hypothesis that "the dumping took place and only affected the Rose Pipits due to their biological make-up" is supported.
Introduce the dumping chemical again to a Rose Pipit and an Ordinary Snape, but this time at a different location, say the new cluster at coordinates = (130, 120). If only the Rose Pipit dies after being exposed, then our hypothesis holds. Otherwise, if it does not die, then there must be "something else causing the deaths of the Pipits at the dumping site area, but not due to dumping".