Final

Your final project is to create a heatmap of UFO sightings. You will use data from the National UFO Reporting Center and the Google Maps API to do this.

UFO Sightings

The National UFO Reporting Center's database has an index of UFO sightings. For this project, you can use either all sightings or only the North American sightings (i.e., excluding the "UNSPECIFIED/INTERNATIONAL" entry on the state list). Your first step is to write a Perl script that extracts the location of each sighting in the database.

IMPORTANT NOTE: be a responsible data user! DO NOT UNDER ANY CIRCUMSTANCES WRITE CODE THAT REPEATEDLY DOWNLOADS DATA FROM THEIR SITE. You should NOT have the "get URL" code in your test code. It is an abuse of their servers and is totally inappropriate. I will likely hear about it if you do this, so don't. There will be consequences. Instead, save a version of one of the web pages you need to parse and use that local version for all your testing. Only convert your code to access the online data once it is working perfectly on the local page.
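As a minimal sketch of testing against a saved copy, the code below reads a local file into a string and never touches the network. The file name saved_index.html and the helper read_local_page are hypothetical names chosen for illustration; use whatever name you gave the page you saved from your browser.

```perl
use strict;
use warnings;

# Hypothetical name for the page you saved by hand from your browser.
my $local_page = 'saved_index.html';

# Read the whole saved page into one string; no network access is involved.
sub read_local_page {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Cannot open $path: $!";
    local $/;                # slurp mode: read the entire file at once
    my $html = <$fh>;
    close($fh);
    return $html;
}

# Only run the demo if the saved page is actually present.
if (-e $local_page) {
    my $html = read_local_page($local_page);
    print "Read ", length($html), " characters from $local_page\n";
}
```

Because the parsing code only ever sees a string, switching to the online data later should only require changing where that string comes from.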

Many cities will appear multiple times. You do not want to list the cities over and over. Instead, store them in a hash and keep track of how many times you see them (increment a count for each time you encounter the city/state).

At the end of this step, you should have a list of city,state (or city,province or city,country if you use international locations) where there were sightings and a count of each time that city appeared. Use a tab to separate the city and the count. Your file should look like this:

Chicago,IL	5
Washington,DC	7
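The counting and output steps described above could be sketched as follows. The sample locations are made up and stand in for whatever your own parsing code extracts from the saved page; count_cities and write_counts are hypothetical helper names, not part of the class sample code.

```perl
use strict;
use warnings;

# Count how often each "City,ST" key occurs.
sub count_cities {
    my (@locations) = @_;
    my %count;
    $count{$_}++ for @locations;
    return %count;
}

# Write one "City,ST<TAB>count" line per city, sorted by name.
sub write_counts {
    my ($path, %count) = @_;
    open(my $out, '>', $path) or die "Cannot write $path: $!";
    for my $city (sort keys %count) {
        print $out "$city\t$count{$city}\n";
    }
    close($out);
}

# Made-up sample data standing in for locations extracted from the saved page.
my @sample = ('Chicago,IL', 'Washington,DC', 'Chicago,IL');
my %count  = count_cities(@sample);
write_counts('lastname_firstname_cities.txt', %count);
```

Using the full "City,ST" string as the hash key keeps same-named cities in different states separate.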

Deliverable: A perl file called lastname_firstname_final_1.pl that I can run by typing "perl FILENAME". It should output a file called lastname_firstname_cities.txt that contains one city/state (or city,province or city,country if you use international locations) with the corresponding count on each line.

Here's the sample code from class

Converting To Lat/Lon

Next, convert your list of cities to latitude/longitude coordinates. Do this using the Google Maps API. Documentation is available here. Use the latitude and longitude from the geometry->location element of the data.
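Here is a rough sketch of pulling the coordinates out of a response, assuming the API returns JSON with a top-level status field and a results array whose first entry holds geometry->location. The sample string is hand-written with illustrative values, not real API output, and extract_lat_lng is a hypothetical helper name. Check the documentation for the exact response your request returns.

```perl
use strict;
use warnings;
use JSON::PP;   # ships with Perl 5.14 and later

# Return (lat, lng) from a JSON response string, or an empty list on failure.
sub extract_lat_lng {
    my ($json_text) = @_;
    my $data = decode_json($json_text);
    return unless ($data->{status} // '') eq 'OK';   # assumed status field
    my $results = $data->{results} || [];
    return unless @$results;
    my $loc = $results->[0]{geometry}{location};
    return ($loc->{lat}, $loc->{lng});
}

# Hand-written sample shaped like a geocoding response; values are illustrative.
my $sample = '{"status":"OK","results":[{"geometry":{"location":{"lat":41.88,"lng":-87.63}}}]}';
my ($lat, $lng) = extract_lat_lng($sample);
print "$lat,$lng\n";
```

Returning an empty list for a failed lookup lets the calling code skip that city instead of writing a blank coordinate.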

To do this (and the next step), you will need an API key. To get one, go to https://developers.google.com/maps/documentation/javascript/heatmaplayer. At the top of the page, click "Get A Key". Choose "Create New Project" in the pull-down menu and give your project a name. Then click "Create and Enable API". This will generate a key that you can use wherever Google indicates you need to include YOUR_API_KEY.

At the end of this step, you should produce a list of latitudes and longitudes and a corresponding count of UFO sightings.

Deliverable: A perl file called lastname_firstname_final_2.pl that I can run by typing "perl FILENAME". It should open your file lastname_firstname_cities.txt from the current directory (DO NOT put a full path to the file in your code. Just use the file name so it will work on my system). It should output a file lastname_firstname_latlon.txt that has one latitude/longitude pair and its count on each line, where each line corresponds to the city/state on the same line of your lastname_firstname_cities.txt file.
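A possible skeleton for this step is sketched below. It opens the cities file by bare file name and splits each line on the tab. The "lat,lng<TAB>count" output layout is an assumption, since the exact format is not specified above, and parse_city_line and format_latlon_line are hypothetical helper names. The coordinate lookup itself is left as a comment.

```perl
use strict;
use warnings;

# Parse one "City,ST<TAB>count" line into (city, count).
sub parse_city_line {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;                      # strip the line ending
    my ($city, $count) = split /\t/, $line, 2;
    return ($city, $count);
}

# Format one output line; the "lat,lng<TAB>count" layout is an assumption.
sub format_latlon_line {
    my ($lat, $lng, $count) = @_;
    return "$lat,$lng\t$count\n";
}

my $in_name = 'lastname_firstname_cities.txt';   # bare file name, no directory
if (open(my $in, '<', $in_name)) {
    while (my $line = <$in>) {
        my ($city, $count) = parse_city_line($line);
        next unless defined $count;              # skip blank or malformed lines
        # Look up the coordinates for $city here, then write
        # format_latlon_line($lat, $lng, $count) to the output file.
        print "$city has $count sightings\n";
    }
    close($in);
}
```

Processing the input in file order and writing one output line per input line keeps the two files aligned line by line.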

Creating a Heatmap

You can use your list of lat/lon coordinates and counts to make a heatmap. This is also done with the Google Maps API. Details on using the API to create a heatmap are available here. I suggest you copy their sample HTML and create a perl file that prints out that HTML, replacing the points they have listed with the points from your data. You should also review the documentation to do weighted data points. Your points should be weighted with the counts that you have been tracking.
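One way to generate the weighted points is sketched here, assuming the weighted format shown in the heatmap documentation, where each entry is an object with a location and a weight. The variable name heatmapData, the helper weighted_points_js, and the sample rows are illustrative assumptions; match the variable name to the sample HTML you copy.

```perl
use strict;
use warnings;

# Turn [lat, lng, count] rows into JavaScript entries for a weighted heatmap.
sub weighted_points_js {
    my (@rows) = @_;
    my @entries = map {
        my ($lat, $lng, $count) = @$_;
        "{location: new google.maps.LatLng($lat, $lng), weight: $count}"
    } @rows;
    return join(",\n", @entries);
}

# Illustrative rows; in your script these come from lastname_firstname_latlon.txt.
my @rows = ([41.88, -87.63, 5], [38.9, -77.04, 7]);
print "var heatmapData = [\n", weighted_points_js(@rows), "\n];\n";
```

The printed array can replace the hard-coded points in the sample HTML, with the rest of the page printed around it unchanged.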

At the end of this step, you should have an HTML document that I can open to see a heatmap of UFO sightings.

Deliverable: A perl file called lastname_firstname_final_3.pl that I can run by typing "perl FILENAME". It should open your file lastname_firstname_latlon.txt from the current directory (DO NOT put a full path to the file in your code. Just use the file name so it will work on my system). It should output a file called lastname_firstname.html. When I open the html file, it should show me a heatmap of the UFO sightings.

Presentation

Since we will all be immersed in the world of UFO reports, our final class will be dedicated to exploring this data more deeply. You will give a 5-7 minute presentation on one of the following:

  • A location where there is high UFO activity and what might explain that
  • A specific UFO sighting incident from the database. You can pick any incident or a set of them. Note that the database is indexed by date, so high occurrences for a month might indicate many reports about the same UFO sighting.
  • A class/type of sighting. You'll see that the database is indexed by shape. You could investigate one of these types and explain theories about it.

If you have another UFO related topic you'd like to explore, let me know.

Note: you do not have to believe aliens are visiting us for this project. This is about exploring a semi-structured dataset, and the presentations augment the numeric data that we are processing in Perl with a deeper content-based exploration. While the topic here is admittedly a bit silly, this process is representative of how you should do most analysis. We could, for example, be extracting information from the Enron email dataset and plotting that in a visualization system, followed by reading emails and processing the content (this is, in fact, the subject of many academic papers and a big assignment I give in my network analysis class). In other words: don't be fooled by this whimsical topic. This final gives you practice with all stages of web data processing with code, plus the deeper contextual analysis you need to be a thoughtful analyst.

You will be graded on how interesting your presentation is, so practice and make it fun. Feel free to incorporate existing video clips, photos, and reports (but don't just show us a 5-minute YouTube video someone else made!).

Deadlines

  1. UFO Sightings Deliverable - 11/22
  2. Converting to Lat/Lon Deliverable - 11/29
  3. Heatmap Deliverable - 12/6
  4. Final Presentation - 12/6

Grading

25% for each of the items listed under Deadlines.