Summer Perl Seminar

Week 2 Exercises

Exercise 1

During the collapse of Enron, the Federal Energy Regulatory Commission confiscated the emails of about 150 senior managers. After a Freedom of Information Act request, the emails were released and posted to the web (more details here). This is a very interesting dataset from many perspectives.*

I have gone through those emails and created a list of people who emailed each other. Each line takes the form a,b, which indicates that there was an email between a and b. Here is a file with 1,000 lines from that adjacency list.
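To make the a,b format concrete, here is a minimal sketch of pulling one line apart. The line and both addresses are invented for illustration (they are not from the data file), and the variable names are just placeholders.

```perl
use strict;
use warnings;

# Hypothetical line in the a,b format; these addresses are made up.
my $line = "alice\@example.com,bob\@example.com\n";

chomp $line;                              # drop the trailing newline
my ($first, $second) = split /,/, $line;  # the two people on this line

print "a = $first\n";
print "b = $second\n";
```

Note the chomp: without it, the second address would still carry the newline and would not compare equal to a plain address string.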

Write a Perl script to count how many messages are to or from the address taylor@enron.com.

Important Hint: There are a few ways to write this code. If your code seems to run but reports 0 emails for that address, you may be doing your comparisons in a way that requires a \ before the @ sign. Try using "taylor\@enron.com" in place of "taylor@enron.com" and that will probably help. This is a good general hint. Because @ marks arrays (just as $ marks scalar variables), Perl treats @enron inside a double-quoted string as an array and tries to fill in its contents. If your code has problems and there are @s in your strings, try escaping them with a backslash.
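Here is a small sketch of what the hint is describing. The variable names are made up for the demonstration, and strict and warnings are switched off in one spot only so that the unescaped version compiles at all; with `use strict` in effect Perl would refuse to compile that string.

```perl
use strict;
use warnings;

# Escaped @ in a double-quoted string: the literal address.
my $escaped = "taylor\@enron.com";

# Single-quoted strings are never interpolated, so no backslash is needed.
my $single = 'taylor@enron.com';

# Unescaped @ in double quotes: Perl looks for an array named @enron.
# No such array has any contents, so it is replaced by nothing.
my $unescaped = do { no strict 'vars'; no warnings; "taylor@enron.com" };

print "escaped:   $escaped\n";
print "single:    $single\n";
print "unescaped: $unescaped\n";   # the "@enron" part has vanished
```

The last line is why a comparison can quietly match nothing: the string being compared is no longer the address you typed.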

Extra tricky extension: How many email addresses do NOT end in enron.com? There are things we haven't learned that will make this easier, but it can be done with only the stuff we've learned.

  • Solution
  • Correct Output

Exercise 2

Download this file. It is tab-delimited, meaning there is a tab between each pair of adjacent columns in the table. The file has the statistics from every game played by Geovany Soto of the Chicago Cubs this season. The columns are, in order:
1. Date
2. Opponent
3. Score
4. AB
5. R
6. H
7. 2B
8. 3B
9. HR
10. RBI
11. BB
12. K
13. SB
14. CS
15. AVG
16. OBP
17. SLG
18. OPS
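As a rough illustration of the column layout above, this sketch splits a single tab-delimited row into its fields. The row is invented (it is not taken from the data file), and it assumes the columns appear exactly in the order listed.

```perl
use strict;
use warnings;

# Hypothetical row with the 18 columns in the listed order; values are invented.
my $row = join "\t", '4/1', 'MIL', 'L 3-4', 4, 1, 2, 1, 0, 0, 1, 0, 1, 0, 0,
                     '.500', '.500', '.750', '1.250';

my @field = split /\t/, $row;   # one array element per column

# Perl arrays start at 0, so column 4 (AB) is $field[3] and column 6 (H) is $field[5].
my ($ab, $h) = @field[3, 5];

print "columns: ", scalar(@field), "\n";
print "AB=$ab H=$h\n";
```

The off-by-one between the column numbers in the list and the array indexes is the easiest thing to get wrong here.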
Write out a file that shows the season total for each of fields 4 through 14 (AB through CS). Then compute the season batting average, H / AB, and the season slugging percentage, (S + 2*2B + 3*3B + 4*HR) / AB. Since S (singles) isn't in the given data, you can compute it as H - 2B - 3B - HR. Make sure each statistic is labeled in the file you output.
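To check the two formulas by hand, here is a sketch that applies them to a set of made-up totals. The numbers and the %total hash are assumptions for illustration only; they are not Soto's actual statistics.

```perl
use strict;
use warnings;

# Hypothetical season totals, invented for illustration.
my %total = (AB => 100, H => 30, '2B' => 6, '3B' => 1, HR => 5);

# Singles are the hits that were not doubles, triples, or home runs.
my $singles = $total{H} - $total{'2B'} - $total{'3B'} - $total{HR};

my $avg = $total{H} / $total{AB};
my $slg = ($singles + 2 * $total{'2B'} + 3 * $total{'3B'} + 4 * $total{HR}) / $total{AB};

printf "AVG\t%.3f\n", $avg;   # prints AVG, a tab, then 0.300
printf "SLG\t%.3f\n", $slg;   # prints SLG, a tab, then 0.530
```

With these totals there are 18 singles, so the slugging numerator is 18 + 12 + 3 + 20 = 53.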

  • Solution
  • Solution Output File

*Talk to me and I'll point you to the really juicy ones.