Network Visualization: personal network visualization

This week, you will analyze your own personal social media network. If you don't have a social media presence or if it is very small, keep reading - there are options for you. If you have fewer than around 100 people in your network, you might want to try an alternative simply so you have something to say.

To download your network, there are a few options, and they change constantly. I checked these at the beginning of the term, but if they don't work for you, email me and we will make arrangements for an alternative. I don't maintain these apps, so I can't fix them if they break, but I can provide you with an alternative dataset (this is less than ideal, but it will work if you don't have a social media presence).

  • Facebook (this is by far the easiest and best option) - You can use the Lost Circles plugin for Chrome. Run it, then click the download option to get your dataset, which you can then open in Gephi. You must use Gephi for this exercise.
  • Twitter - There are two ways to get your own Twitter data. If you have a PC, you can use NodeXL. Twitter network access used to be free in NodeXL, but it may have moved to the paid-only version. I can't help you with that, sorry. If you can get it, here is a tutorial on how to get your Twitter network from NodeXL and view it in Gephi. Note: You can view the network in NodeXL, and you're welcome to use NodeXL all term, but I won't support it. It's a PC tool, and I have a Mac. I only support platform-independent software in class. There are other options here if you can't get NodeXL to work.

    You can also use Twecoll. This is a good tutorial.

    For Mac users, here's another tutorial video on how to use a Python-based tool to download your network. The commands I used are after the video:

    Download the Twecoll tool here (link on the lower right - it says "Download ZIP"). Save the zip on your desktop. Double-click the downloaded zip to unzip it. The result should be a folder called twecoll-master.

    Get your access info as shown in the video. Do that here. Open the Terminal application. Type:

    pico .twecoll

    Enter the info as shown in the video. Then do Control-x, and hit enter to save. Then type these commands, replacing YOURNAME with your Twitter user name:

    cd Desktop/twecoll-master
    ./twecoll init YOURNAME
    

    (follow the instructions it gives you). When it's done, type:

    ./twecoll fetch YOURNAME
    ./twecoll edgelist YOURNAME
    

    These commands can take a very long time (several hours) due to Twitter's API rate limit restrictions. I started it running, verified that it was working, then left it to run overnight (it took 5-6 hours for my graph).

    The result will be a file with the gml extension in that directory.

    For everyone doing Twitter: Next, launch Gephi. Click "Open Graph File" from the Welcome window and select your network file (for Twecoll, that is the gml file). Once you open it, set the graph type to undirected and leave all the other options at their default values.

  • Other - If you can find a way to get your adjacency list from another network, that is fine. However, you MUST analyze it in Gephi. The major purpose of this assignment is to get experience with Gephi, so I will not give credit for work analyzed with a different tool.
  • None - If you don't have a social media account, here's a sample network from the account @hopper_dog that you can analyze for the assignment. Don't choose this just because it's easier! A huge part of this class is learning to collect data, and you'll be doing yourself a disservice by taking the easy way out. To get the file, right-click the link and choose "Save As". To analyze this network, you will need to find out who these nodes are. The node names are Twitter names, and you can look them up by going to http://twitter.com/NAME. This will be critical if you want to analyze this network.
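
Whichever option you use, it can be worth checking how big your network is before you start, given the note above about networks under roughly 100 people. Here is a minimal Python sketch, assuming your file is in GML format with the usual `node [ ... ]` and `edge [ ... ]` blocks (your file may be laid out differently, and other tools may produce other formats). The function name `gml_summary` and the tiny sample text are made up for illustration.

```python
import re

# Made-up sample in the usual GML layout; a real file would be much larger.
SAMPLE = """graph [
  directed 0
  node [ id 0 label "alice" ]
  node [ id 1 label "bob" ]
  node [ id 2 label "carol" ]
  edge [ source 0 target 1 ]
  edge [ source 1 target 2 ]
]"""


def gml_summary(text):
    """Count 'node [' and 'edge [' blocks in GML text.

    This is a rough count based on pattern matching, not a full GML parser.
    """
    nodes = len(re.findall(r"\bnode\s*\[", text))
    edges = len(re.findall(r"\bedge\s*\[", text))
    return nodes, edges


node_count, edge_count = gml_summary(SAMPLE)
print(f"{node_count} nodes, {edge_count} edges")

# For a real file (the name YOURNAME.gml is a placeholder), something like:
#   from pathlib import Path
#   print(gml_summary(Path("YOURNAME.gml").read_text()))
```

This is only a sanity check. The analysis itself still has to be done in Gephi.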

Next, analyze your network! Identify the groups of people, how they are connected, etc. Who are the most important nodes? Why? Describe the network in detail. Use everything you learned from the Gephi tutorial to help you (e.g. node labels). Here's an example from my network. I opened my gdf file in Gephi.

[Figure: netvizz]

I loaded the network, applied the Yifan Hu Proportional layout algorithm, and then ran the modularity and network diameter statistics. The nodes are sized by betweenness centrality and the color coding is by modularity class (basically, by cluster).
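
If it helps to see what those two statistics measure, here is a minimal Python sketch on a made-up toy graph. It is not how Gephi implements them, and it does not replace doing the analysis in Gephi. The node names, the edge list, and the hand-picked grouping are all invented for illustration. The betweenness function follows Brandes' algorithm for unweighted graphs; the modularity function only scores a grouping you give it, whereas Gephi's modularity statistic also searches for the grouping.

```python
from collections import deque

# Made-up toy network: two small groups joined only through "hub".
EDGES = [("a", "b"), ("a", "hub"), ("b", "hub"),
         ("c", "d"), ("c", "hub"), ("d", "hub")]


def build_adjacency(edges):
    """Turn an undirected edge list into {node: set of neighbors}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def betweenness(adj):
    """Unnormalized betweenness centrality for an unweighted, undirected graph."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        order = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating dependencies.
        delta = {v: 0.0 for v in adj}
        while order:
            w = order.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was counted from both endpoints, so halve.
    return {v: b / 2 for v, b in bc.items()}


def modularity(adj, groups):
    """Modularity Q of a given partition (a list of node sets)."""
    m = sum(len(nbrs) for nbrs in adj.values()) / 2  # number of edges
    q = 0.0
    for group in groups:
        internal = sum(1 for u in group for v in adj[u] if v in group) / 2
        degree_sum = sum(len(adj[u]) for u in group)
        q += internal / m - (degree_sum / (2 * m)) ** 2
    return q


adj = build_adjacency(EDGES)
scores = betweenness(adj)
q = modularity(adj, [{"a", "b", "hub"}, {"c", "d"}])
print(scores)
print(round(q, 3))
```

In this toy graph, "hub" gets a betweenness of 4 because all four shortest paths between the two groups run through it, while every other node scores 0. That is the same reason a node that bridges two clusters shows up large when you size nodes by betweenness centrality.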

Immediately, the separation of groups is clear. The groups in my network are a bit more distinct than in most student networks I've seen, but the idea is the same. The teal group at the bottom is friends from my hometown. The one large node is my brother, who connects my high school friends and my family members (cousins, aunts and uncles, etc.).

The large red group is made up of colleagues, students, and other people from my time as a graduate student and professor, with a few undergraduate friends who have gone on to be academics included as well.

The purplish group to the left is friends from my hockey team. The tight green cluster at the top is a group of internet friends I met in an online forum almost 10 years ago.

Write up a 600-word paper describing your insights. Turn in that paper along with an image of your visualization.

Grading

2 points: quality of writing
5 points: quality of analysis
3 points: quality of visualization