Entry Name: “UKON-Cakmak-MC2”

VAST Challenge 2018
Mini-Challenge 2

 

 

Team Members:

Eren Cakmak, University of Konstanz, eren.cakmak@uni.kn – PRIMARY

Daniel Seebacher, University of Konstanz, daniel.seebacher@uni.kn

Juri Buchmüller, University of Konstanz, juri.buchmueller@uni.kn

Student Team:  Yes

Tools Used:

 

Tsfresh - Time Series Feature extraction based on scalable hypothesis tests – Python package

ViCCEx - Visual Chemical Contamination Explorer developed by Eren Cakmak – Online at https://viccex.dbvis.de/

 

Approximately how many hours were spent working on this submission in total?

80

May we post your submission in the Visual Analytics Benchmark Repository after VAST Challenge 2018 is complete?                 Yes

 

Video

https://youtu.be/rFHDDD0WLBk

 

Questions

1.   Characterize the past and most recent situation with respect to chemical contamination in the Boonsong Lekagul waterways. Do you see any trends of possible interest in this investigation?  Your submission for this questions should contain no more than 10 images and 1000 words.

We developed ViCCEx, the Visual Chemical Contamination Explorer (https://viccex.dbvis.de/), to investigate the situation in the Boonsong Lekagul waterways. The tool consists of three main views. A t-SNE projection gives an overview of trends, outliers and the current situation at each location; it also reveals correlations between locations, since changes in one river network influence several locations. A sampling strategy visualization depicts the sampling approach of the Hydrology Department. A time series chart depicts the changes in chemical values at each location.


The following images examine some trends and outliers at each location.


1.1. Achara

1.1

 

 

 

 

Achara is fairly well clustered in the t-SNE projection. There is one outlier (1) and a small subgroup (2).

 

 

 

 

 

 

 

 

Sampling started in 2009

 

 

 

 

 

 

Outlier (1) in the t-SNE projection is produced mainly by total coliforms

 

 

 

 

 

Total hardness steadily decreased

 

 

 

 

 

 

 

Zinc decreased until 2012 and stayed nearly constant afterward

 

 

 

 

 

 

Arsenic overall decreased at this location

 

 

 

 

 

 

Cadmium behaves oddly: it increases or decreases every January

 

 

 

Probably a periodic pattern

 

 

 

 

1.2. Boonsri

 

1.2

 

 

 

Boonsri has multiple clusters (3, 5, 6) and outliers (1, 2, 4) in the t-SNE projection. We attribute the outliers to total dissolved salts (1), zinc (2) and total hardness (4). Additionally, we identified some high values for AGOC-3A (7), lead (8), copper (9) and potassium (10). However, these do not cause much deviation in the t-SNE plot. Furthermore, the potassium readings (10) show strong regularities.

 

 

 

 

Extensive sampling strategy

 

 

 

 

Outlier

 

 

 

Zinc has several outliers. Several measurements were taken on the same day with different results

 

 

 

Total hardness has a yearly periodic pattern (3), except for a few outliers (4). The value dropped drastically (5) between 2011 and 2013 and increased again afterward (6).

 

 

 

High values for AGOC-3A

 

 

 

Lead overall decreased

 

 

 

 

Copper also decreased

 

 

 

Potassium behaves oddly: the measurements often appear to be rounded after 2003.

 

1.3. Busarakhan

 

1.3

 

 

 

We identified one clear outlier in the t-SNE plot, which we attribute to the measured iron values (1), and several that we attribute to higher total coliform measurements (4). The values for total dissolved salts show an interesting repeating pattern. There are two major gaps in the measurements for lead and sodium. Sodium, lead and AOX values decrease.

 

 

 

 

 

Sampling started in 1998

 

 

 

Outlier

 

 

 

Periodic behavior

 

 

 

 

Outliers

 

 

 

Increased between 2008-2009

 

 

 

Increased between 2008-2009

 

 

 

 

Sodium also decreased

 

 

 

Lead decreased overall and slowly increased again from 2016

 

 

 

Sudden drop to nearly zero after 2009

 

 

 

 

Nickel increased after 2016 again

 

 

 

Barium has some high values

 

 

 

1.4. Chai

 

1.4

 

 

Chai is interesting because it shows considerable change over time. There are several outliers, e.g. (1), and a strange curve produced by daily measurements of water temperature (6).

 

 

 

 

 

 

Sampling strategy changed drastically after 2016

 

 

 

 

Outlier

 

 

 

Outliers and afterward decreasing values

 

 

 

 

 

Burst in total coliforms and fecal coliforms

 

 

 

 

 

Lead decreased

 

 

 

Large differences between measurements taken on the same day

 

 

 

Extremely high density of water temperature measurements. However, the values appear to follow the previous periodic pattern, except for an unusual peak in January 2016.

 

 

Outliers in 2008

 

 

 

Strong increase of the chemical Methylosmolene, followed by an extremely steep drop.

 

 

 

Strong increase of the chemical tetrachloromethane, followed by an extremely steep drop

 

 

 

An overall increase

 

 

 

1.5. Decha

1.5

 

 

 

Outliers are caused by total coliforms (1, 2) and cadmium (7). Overall decrease of zinc, chromium, total nitrogen and dissolved silicates.

 

 

 

 

 

 

Sampling started in 2009

 

 

 

 

Outliers (2) and rising values (1) at the end of 2011.

 

 

 

High variance

 

 

 

Decreased after 2013

 

 

 

 

Values change at the beginning of new years, e.g. 2013, 2014, 2015

 

 

 

Total dissolved silicates decreased

 

 

 

Some outliers before 2012

 

 

 

 

 

1.6. Kannika

1.6

 

 

Different outliers are visible (1, 2, 5), caused by iron, manganese and fecal coliforms. Clusters are visible in (7, 8), produced by Methylosmolene and tetrachloromethane.

 

 

 

 

 

 

 

Extensive sampling

 

 

 

 

Outliers - two measurements on the same day with different values

 

 

 

Increased in 2009 and decreased afterward

 

 

 

Outliers

 

 

 

Strong increase

 

 

 

 

An overall decrease of lead

 

 

 

Three measurements for fecal coliforms on the same day with different values

 

 

 

Short occurrence of AOX

 

 

 

 

Measurements were taken on the same day with different results

 

 

 

High values, followed by a drop in January 2016

 

 

 

High values, followed by a strong drop in 2010

 

 

 

 

High values, followed by a drop in January 2016

 

 

 

 

1.7. Kohsoom


1.7

 

 

Clusters are visible (3, 5) produced by total hardness. Multiple outliers are visible (1,2,4)

 

 

 

 

 

 

 

Has some bigger gaps

 

 

 

 

Outlier

 

 

 

 

High variance from 2010 until 2014

 

 

 

High variance from 2010 until 2014

 

 

 

Values drop starting in 2001 and 2006, followed by an increase in 2014.

 

 

 

 

Outliers

 

 

 

A strong increase, followed by a slow decrease in 2016

 

 

 

High values at the beginning of 2010, 2013 and 2016

 

 

 

 

Periodic pattern

 

 

 

Steep drop

 

 

 

A decrease of atrazine. There are some outliers in 2014. Again, the values change after January 2008.

 

 

 

 

 

1.8. Sakda

 

1.8

 

 

 

Some outliers in total dissolved salts (1), chlorides (3) and total coliforms (4). Multiple clusters are visible, e.g. (7), produced by Methylosmolene.

 

 

 

 

 

 

Extensive with some gaps

 

 

 

Outliers

 

 

 

 

A decrease in aluminum, multiple measurements on the same day with different results

 

 

 

Outliers

 

 

 

 

Outliers

 

 

 

Increased at the start of 2009, followed by a sudden drop in 2010

 

 

 

Outliers; however, there are multiple measurements on the same day with different values.

 

 

 

 

 

 

High values, followed by an extremely strong drop at the beginning of 2016

 

 

 

1.9. Somchair

 

 

 

 

This projection is interesting since there are different outliers, e.g. produced by total coliforms (1), and two clusters (2, 3), which are caused by increasing values for Methylosmolene.

 

 

 

 

 

Extensive with some gaps

 

 

 

 

An outlier

 

 

 

Strong increase after 2016, which causes the two clusters in the t-SNE plot

 

 

 

 

Drop after 2014

 

 

 

Outliers in 2008 - 2010

 

 

 

A sudden drop in AOX

 

 

 

 

High values

 

 

 

 

1.10. Tansanee

 

 

 

 

 

Only one cluster is visible in the t-SNE plot.

 

 

 

 

 

Sampling starts in 2009, with several gaps

 

 

 

 

Outliers

 

 

 

 

Outliers

 

 

 

Decrease after 2010

 

 

 

High variance in 2015

 

 



 

2.   What anomalies do you find in the waterway samples dataset?  How do these affect your analysis of potential problems to the environment? Is the Hydrology Department collecting sufficient data to understand the comprehensive situation across the Preserve? What changes would you propose to make in the sampling approach to best understand the situation? Your submission for this question should contain no more than 6 images and 500 words.


We detected an anomaly in the number of samples taken at the Chai station. In the sampling strategy view, we can see that the number of samples taken at Chai (indicated by the red line) is drastically higher than the number of samples taken at the other stations.

It is also surprising that the daily measurements at Chai covered only water temperature and no chemicals. However, no anomalies were found in the measured water temperature values. We assume that a fixed sensor was probably installed at this location.
chai_sampling_2017
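A minimal sketch of how such a sampling-volume anomaly could be flagged automatically. It assumes the readings are available as (location, sample date, measure, value) records; the helper names, the factor of 5 and the toy data are illustrative assumptions and not part of ViCCEx.

```python
from collections import Counter
from datetime import date
from statistics import median


def samples_per_station(records, year):
    """Count readings per station for one calendar year."""
    return Counter(loc for loc, day, _measure, _value in records if day.year == year)


def flag_oversampled(counts, factor=5.0):
    """Return stations whose count exceeds `factor` times the median count."""
    if not counts:
        return []
    typical = median(counts.values())
    return sorted(loc for loc, n in counts.items() if n > factor * typical)


# Toy data (hypothetical): daily water temperature at Chai, sparse sampling elsewhere.
records = [("Chai", date(2016, 1, d), "Water temperature", 5.0 + 0.1 * d)
           for d in range(1, 31)]
records += [
    ("Boonsri", date(2016, 1, 10), "Zinc", 0.02),
    ("Boonsri", date(2016, 1, 20), "Lead", 0.01),
    ("Kohsoom", date(2016, 1, 12), "Zinc", 0.03),
    ("Sakda", date(2016, 1, 15), "Zinc", 0.02),
]

counts = samples_per_station(records, 2016)
oversampled = flag_oversampled(counts)
```

The median is used as the reference because a single heavily sampled station would distort a mean-based threshold.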

 

 

Increasing values for Kohsoom and Somchair

 

We encountered a second anomaly when investigating the dangerous chemical Methylosmolene. As highlighted in our time series view for this chemical, the measured amount of Methylosmolene before the assumed dumping by Kasios is nonexistent at all stations. However, starting in 2016 there is a stark increase in this chemical at Kohsoom, near the assumed dumping ground, as well as at Somchair, a previously unaffected location that is independent of Kohsoom.
methylosmolene

 

 

 

Additionally, when investigating the sampling strategy for Methylosmolene, we encountered a repeating pattern. The measurements for Busarakhan and Somchair, as well as those for Kannika and Sakda, were always taken on the same day. We have highlighted this by connecting the sampling points of these stations in the following figure. This pattern stands out because the stations that were always sampled on the same date lie on two separate river networks, with Busarakhan and Kannika belonging to the first river network and Sakda and Somchair belonging to the second. Additionally, we found that there were no Methylosmolene measurements for the stations Tansanee and Decha.

sampling_strategy_river_networks
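The same-day pattern can also be checked numerically. The following is a sketch under the assumption that readings are (location, sample date, measure, value) records; the Jaccard overlap of sampling days is our choice of score for illustration, and the dates and values below are made up.

```python
from datetime import date
from itertools import combinations


def sampling_days(records, measure):
    """Map each station to the set of days on which `measure` was sampled."""
    days = {}
    for loc, day, name, _value in records:
        if name == measure:
            days.setdefault(loc, set()).add(day)
    return days


def same_day_overlap(days):
    """Jaccard overlap of sampling days for every pair of stations."""
    overlap = {}
    for a, b in combinations(sorted(days), 2):
        union = days[a] | days[b]
        overlap[(a, b)] = len(days[a] & days[b]) / len(union)
    return overlap


# Toy data (hypothetical dates and values).
m = "Methylosmolene"
records = [
    ("Busarakhan", date(2016, 3, 1), m, 1.0), ("Somchair", date(2016, 3, 1), m, 2.0),
    ("Busarakhan", date(2016, 4, 5), m, 1.1), ("Somchair", date(2016, 4, 5), m, 2.4),
    ("Kannika", date(2016, 3, 9), m, 0.9), ("Sakda", date(2016, 3, 9), m, 0.8),
    ("Kannika", date(2016, 4, 12), m, 1.0), ("Sakda", date(2016, 4, 12), m, 0.7),
]

overlap = same_day_overlap(sampling_days(records, m))
# Pairs with an overlap of 1.0 were sampled on exactly the same days.
paired = sorted(pair for pair, score in overlap.items() if score == 1.0)
```

Stations without any reading for the chosen measure simply do not appear in the result, which is also how missing coverage (as for Tansanee and Decha) would show up.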

 

 

We have no measurements around the turn of each year: the measurements stop in mid-December and resume in January. During this time the staff of the Hydrology Department are probably on Christmas holidays. After these measurement gaps, some chemicals increase (see Question 1 for examples). We assume that dumping took place in the preserve at such times.
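A small sketch of how the year-end gaps could be listed from the sampling dates alone. The function name, the 14-day minimum and the example dates are assumptions for illustration.

```python
from datetime import date


def year_end_gaps(days, min_gap_days=14):
    """Return (last day before, first day after, gap in days) for gaps spanning New Year."""
    ordered = sorted(set(days))
    gaps = []
    for prev, nxt in zip(ordered, ordered[1:]):
        gap = (nxt - prev).days
        # Keep only gaps that cross from one year into the next one.
        if nxt.year == prev.year + 1 and gap >= min_gap_days:
            gaps.append((prev, nxt, gap))
    return gaps


# Hypothetical sampling days for one station.
days = [
    date(2014, 11, 20), date(2014, 12, 12), date(2015, 1, 8),
    date(2015, 2, 3), date(2015, 12, 15), date(2016, 1, 11),
]
gaps = year_end_gaps(days)
```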

 

A general analysis of some chemicals, e.g. 1,2,3-Trichlorobenzene, 1,2,4-Trichlorobenzene, Acenaphthene and Acenaphthylene, is difficult because there are no regular measurements. These chemicals were only measured between 2008 and 2010.

 

Furthermore, there are often large gaps between the measurements. As a result, it is not possible to interpret the development of the respective chemicals at the respective locations. This is the case, for example, for Indeno(1,2,3-c,d)pyrene.
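One way to make "large gaps" and "no regular measurements" comparable across chemicals is to summarize the intervals between consecutive sampling days. This is a sketch with assumed names and made-up dates; the coefficient of variation is our choice of regularity score.

```python
from datetime import date
from statistics import mean, pstdev


def interval_stats(days):
    """Summarize gaps (in days) between consecutive distinct sampling days."""
    ordered = sorted(set(days))
    gaps = [(b - a).days for a, b in zip(ordered, ordered[1:])]
    if not gaps:
        return None  # fewer than two sampling days, nothing to summarize
    avg = mean(gaps)
    # cv = 0 means perfectly regular sampling; larger values mean irregular sampling.
    return {"max_gap": max(gaps), "mean_gap": avg, "cv": pstdev(gaps) / avg}


# Hypothetical examples: a regular series and one with a two-year hole.
regular = interval_stats([date(2015, 1, 1), date(2015, 1, 31), date(2015, 3, 2)])
irregular = interval_stats([date(2008, 5, 1), date(2008, 5, 11), date(2010, 5, 11)])
```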

Improvements:

Considering the data collected by the Hydrology Department, we recommend some changes to the sampling strategy. A big problem is that the Hydrology Department often measures only individual chemicals or a small fraction of all chemicals. When a water sample is taken anyway, all of the chemicals could be measured from it. The chemicals should also be measured at regular intervals, e.g. monthly or quarterly. Furthermore, measurements should be taken over the Christmas holidays. This would draw direct attention to possible dumping in the preserve.

 

 

3.   After reviewing the data, do any of your findings cause particular concern for the Pipit or other wildlife? Would you suggest any changes in the sampling strategy to better understand the waterways situation in the Preserve? Your submission for this question should contain no more than 6 images and 500 words.

At the end of each year there is a long pause in sampling, which is followed by an increase in the measured levels. We therefore suggest also taking measurements at the end of the year. Additionally, we would improve the sampling strategy by taking more regular measurements, measuring all of the chemicals, and improving the measurement methods, since there are often measurements of a chemical on the same day with extremely different values. All in all, consistent samples must be taken from the rivers to make more trends visible and interpretable. This could be done by defining a scheme in which, for instance, each chemical must be measured at least monthly or quarterly.
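Same-day measurements with very different values can be listed with a simple grouping. The sketch below assumes (location, sample date, measure, value) records; the relative-spread threshold of 0.5 and the example values are arbitrary illustrations.

```python
from collections import defaultdict
from datetime import date


def same_day_conflicts(records, rel_spread=0.5):
    """Find (location, measure, day) groups whose values disagree strongly."""
    groups = defaultdict(list)
    for loc, day, measure, value in records:
        groups[(loc, measure, day)].append(value)

    conflicts = {}
    for key, values in groups.items():
        if len(values) < 2:
            continue
        lo, hi = min(values), max(values)
        scale = max(abs(lo), abs(hi))
        # Spread relative to the largest magnitude in the group.
        if scale > 0 and (hi - lo) / scale > rel_spread:
            conflicts[key] = sorted(values)
    return conflicts


# Hypothetical readings: zinc disagrees strongly, lead only slightly.
day = date(2005, 6, 14)
records = [
    ("Boonsri", day, "Zinc", 0.02), ("Boonsri", day, "Zinc", 0.31),
    ("Boonsri", day, "Lead", 0.010), ("Boonsri", day, "Lead", 0.011),
]
conflicts = same_day_conflicts(records)
```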


Also, there should be some improvement for the Pipit and other wildlife, since chemicals like lead and AOX are decreasing at all locations. However, we see an extremely large risk caused by the chemical Methylosmolene. A high quantity of this chemical is still measured at Chai, and there was also an extremely strong increase of this chemical at Somchair. This might indicate that the topsoil that was trucked off from the contaminated site was dumped at another location, which could explain the increase of Methylosmolene at Somchair. Furthermore, the chemical Chlorodinine seems to decrease in correlation with the overall increase of Methylosmolene.
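The suspected inverse relationship between the two chemicals could be quantified with a Pearson correlation over values paired by time period (for example monthly means). This is a sketch only: the pairing scheme and the numbers below are assumptions, and a strong correlation would not by itself establish a common cause.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long value lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(vx * vy)


# Hypothetical per-period means: one chemical rises while the other falls.
rising = [1.0, 2.0, 3.0, 4.0]
falling = [8.0, 6.0, 4.0, 2.0]
r = pearson(rising, falling)
```

A value of r close to -1 would support the observation, while values near 0 would suggest that the two trends are unrelated.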

 

Another weakness of the sampling strategy can be seen in total coliforms, where we have multiple measurement gaps. Additionally, measurements are irregular at multiple locations, e.g. Achara, or are no longer taken at all, e.g. at Somchair after 2010.

 

 

Furthermore, the sampling of some chemicals, for instance AOX, is not regular enough to depict overall trends, even though this chemical seems interesting since it shows multiple sudden steep increases and drops.

 


Another interesting pattern is the sudden increase in the number of measurements taken at Chai. We assume that a water temperature sensor was installed at this location. Furthermore, multiple water temperature measurements are sometimes taken per day with different results. Another pattern at this location is the increase in water temperature after 2016 in the middle of winter, which is quite unusual.

 

Arsenic increases overall at several locations. This chemical could be measured more often at every location, which could give us more insight into the causes of these sudden changes.

 

Another interesting chemical is Atrazine. Its values seem to change considerably every January. This phenomenon should be studied in more detail. The general phenomenon that some chemicals seem to change drastically every January suggests that dumping may take place in the preserve during the Christmas holidays.
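To study the January changes in more detail, the level shift across each New Year could be computed per chemical and location. The sketch below compares the mean of the last k readings of one year with the mean of the first k readings of the next; the function name, k=2 and the example series are assumptions for illustration.

```python
from datetime import date
from statistics import mean


def new_year_shifts(series, k=2):
    """Relative change across each January 1, keyed by the year that begins.

    `series` is a list of (day, value) pairs for one chemical at one location.
    """
    ordered = sorted(series)
    years = sorted({day.year for day, _ in ordered})
    shifts = {}
    for year in years[1:]:
        before = [v for d, v in ordered if d.year == year - 1][-k:]
        after = [v for d, v in ordered if d.year == year][:k]
        # Skip years without readings on both sides or with a zero baseline.
        if before and after and mean(before) != 0:
            shifts[year] = (mean(after) - mean(before)) / abs(mean(before))
    return shifts


# Hypothetical series: the level doubles in January 2014 and halves in January 2015.
series = [
    (date(2013, 10, 1), 1.0), (date(2013, 12, 1), 1.0),
    (date(2014, 1, 15), 2.0), (date(2014, 2, 15), 2.0),
    (date(2014, 11, 1), 2.0), (date(2014, 12, 1), 2.0),
    (date(2015, 1, 20), 1.0), (date(2015, 2, 20), 1.0),
]
shifts = new_year_shifts(series)
```

Large shifts that recur at the turn of the year across several chemicals or stations would be candidates for closer inspection.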