CMSC 838S Application Report
By Vlad Morariu (firstname.lastname@example.org)
This report presents medical school admission data analyzed using two tools: TreeMap 4.1.1 and Spotfire DecisionSite 8.0. The data consisted of two datasets available on the Association of American Medical Colleges website, http://www.aamc.org/data/facts/start.htm. Both datasets provided GPA and MCAT scores for the year 2005, but one split the data by states and the other split the data by undergraduate majors. Though the analysis provided many insights, the four most important were the following: 1) Biological Sciences majors have low acceptance rates compared to most others; 2) most states have roughly the same average acceptance rate and MCAT scores, but Puerto Rico, Canada, and U.S. Territories are glaring exceptions; 3) some states and regions have much higher acceptance rates than others; and 4) by major, acceptance rates depend more directly on MCAT scores than on GPA.
Biological Sciences majors have low acceptance rates compared to most others
Surprisingly, Biological Sciences majors do not have higher average acceptance rates than most other majors. This is notable because most students who plan to attend medical school major in the Biological Sciences, so one might expect them to have higher acceptance rates as well. In fact, Figure 1 shows that Math and Statistics, Humanities, and Social Sciences all have higher acceptance rates than Biological Sciences. Specialized Health Sciences has the lowest acceptance rate of all. Judging by the major descriptions alone, one would expect Biological, Physical, and Specialized Health Sciences to top the list when ranked by acceptance rate, since they seem most closely related to the medical field. Figures 2 and 3 show that Biological Sciences students score well on the Biology section of the MCAT (tied for third place with Humanities) but poorly on the other two sections. This hints at the possibility that the Biology section of the MCAT is not the most important in determining admission. Figure 4 reinforces these points and also shows that Math and Statistics majors are the smallest group of medical school applicants yet have the highest acceptance rate. Biological Sciences majors are indeed the largest group of applicants, but their acceptance rate is only average.
Figure 1. Bar graph of total GPA and MCAT scores of applicants. Specialized Health Sciences majors perform the worst, and Math and Statistics perform the best.
Figure 2. This heat map again shows that applicants who are Specialized Health Sciences majors have the lowest scores in the three sections of the MCAT shown above. Not surprisingly, Humanities majors performed best in the verbal section, and Math and Statistics and Physical Sciences majors performed best in the physical science section.
Figure 3. Bar graph version of data shown in Figure 2. The bar graph makes relationships between scores more evident, but does not provide an overview as quickly as the heat map.
Figure 4. TreeMap view of the acceptance rate for applicants. The size of each square represents the number of applicants for that major, and the color represents the acceptance rate. Specialized Health Sciences has the worst acceptance rate, and although Math and Statistics applicants are few in number, they have the highest acceptance rate, followed by Humanities majors.
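The two quantities encoded in Figure 4 (group size and acceptance rate) can be derived from per-major counts. The following is a minimal sketch in Python, not the procedure used in this report: the `MajorStats` type, the `rank_by_acceptance_rate` function, and the counts are all hypothetical placeholders, not the AAMC figures.

```python
from typing import NamedTuple


class MajorStats(NamedTuple):
    """Applicant and acceptee counts for one major (hypothetical layout)."""
    major: str
    applicants: int
    acceptees: int


def rank_by_acceptance_rate(stats):
    """Return (major, applicants, rate) tuples sorted by rate, highest first."""
    ranked = [
        (s.major, s.applicants, s.acceptees / s.applicants)
        for s in stats
        if s.applicants > 0  # skip empty groups to avoid division by zero
    ]
    return sorted(ranked, key=lambda row: row[2], reverse=True)


# Placeholder counts for illustration only; these are NOT the AAMC figures.
example = [
    MajorStats("Major A", 1000, 450),
    MajorStats("Major B", 200, 110),
    MajorStats("Major C", 500, 200),
]

for major, applicants, rate in rank_by_acceptance_rate(example):
    print(f"{major}: {applicants} applicants, {rate:.1%} accepted")
```

Keeping the applicant count next to the rate matters here because a small group, like Math and Statistics in Figure 4, can top the ranking while contributing few applicants overall.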
Most states have roughly the same average acceptance rate and MCAT scores, but Puerto Rico, Canada, and U.S. Territories are glaring exceptions
Figure 5 shows that all states are somewhat tightly clustered around an acceptance rate of 50% and an MCAT score of 30. However,
Figure 5. Scatter plot that shows MCAT scores against acceptance rate by state. Colors indicate region. Most states are clustered around an average point, but
Some states and regions have much higher acceptance rates than others
Figure 6 shows the acceptance rate of each state in
Figure 6. This bar graph shows the acceptance rate by state, sorted from highest to lowest. The colors indicate the region that each state belongs to. Puerto Rico and Canada have very high and very low acceptance rates, respectively.
Figure 7. TreeMap showing total number of applicants as size and acceptance rate as color. Some of the smaller states have the highest acceptance rates, and most of them are located in either the Southern or Northeastern regions.
Figure 8. This TreeMap shows the total number of applicants as size and the MCAT scores as
By major, acceptance rates depend more directly on MCAT scores than on GPA
The last three figures show how average acceptance rates relate to GPA and MCAT scores for each major. Figures 9 and 10 show that Social Sciences is an outlier in both cases, and Humanities is an outlier when compared only with the science GPA. Thus, GPA does not fully predict acceptance rates, at least when the averages are computed by major. When MCAT scores are compared with acceptance rates (Figure 11), the linear relationship becomes very evident. It therefore seems that the acceptance rate per major is more closely related to the average MCAT score than to the GPA of students in that major (even though MCAT scores are probably related to GPA, for the most part). When comparing only with the science GPA, it is not surprising that Social Sciences and Humanities do not fall in line with the other majors: though those majors might have weaker GPAs in science classes, they have stronger verbal skills that bolster their overall MCAT scores. However, the true relationship between acceptance rate and GPA or MCAT score might be more evident if each applicant’s values were used instead of the average values per major.
Figures 9-11. Scatter plots of GPA, science GPA, and MCAT scores versus acceptance rate. Although a linear relationship is generally present, MCAT scores best explain acceptance rates.
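One way to put a number on the visual comparison in Figures 9-11 is to compute a Pearson correlation coefficient between each per-major average and the acceptance rate. This was not part of the analysis above. It is a minimal sketch in Python, and the `pearson_r` helper and the sample values are hypothetical placeholders, not the AAMC data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Placeholder per-major averages for illustration only; NOT the AAMC data.
acceptance = [0.40, 0.45, 0.50, 0.55]
mcat = [26.0, 28.0, 30.0, 32.0]  # deliberately linear in acceptance
gpa = [3.5, 3.4, 3.6, 3.5]       # deliberately only loosely related

print("MCAT vs. acceptance:", round(pearson_r(mcat, acceptance), 3))
print("GPA vs. acceptance: ", round(pearson_r(gpa, acceptance), 3))
```

With only a handful of majors, a coefficient computed this way is a rough summary at best. As noted above, per-applicant values would give a more reliable picture than per-major averages.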
Critique of Software Used
Both Spotfire and TreeMap performed very well during our analysis. Only minimal format changes were needed to input the data into either tool, and both loaded the datasets without much input from the user. One shortcoming of TreeMap is that it does not accept XLS files, the Excel file format. However, XLS files can easily be converted to tab-separated TXT files, which TreeMap can read. In Spotfire, changing data colors or label font sizes was somewhat more complicated than desired. Another shortcoming of Spotfire is that there is no option (or at least no easy-to-find option) to export a legend along with a view, and a legend is necessary when a graph contains color-coded information. TreeMap suffers from a similar problem: it cannot export views at all unless the user manually takes a screenshot. Exporting views with the proper information included is important in software such as TreeMap and Spotfire, since the result of data analysis is often a graph or other visual representation of the findings that needs to be included in a report. Although the two tools were meant to examine different types of data, both provided good insight into different aspects of our data.
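The conversion to tab-separated text mentioned above can also be scripted. The sketch below is one possible approach in Python, not the method used for this report. It assumes the spreadsheet has first been saved as CSV, since reading XLS directly needs a third-party library. The function name `csv_to_tab_separated` is hypothetical, and TreeMap's input format may impose further requirements that are not covered here.

```python
import csv
import io


def csv_to_tab_separated(csv_text):
    """Convert comma-separated text to tab-separated text.

    The csv module handles quoted fields, so a value such as
    "Math, Statistics" stays in a single column after conversion.
    """
    reader = csv.reader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    for row in reader:
        writer.writerow(row)
    return out.getvalue()


# Placeholder rows for illustration only; NOT the AAMC data.
sample = 'Major,Applicants\n"Math, Statistics",10\n'
print(csv_to_tab_separated(sample), end="")
```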
Spotfire and TreeMap provided meaningful visualizations of the medical school data and led to interesting conclusions. Both tools performed well during the analysis, each providing valuable insight into the data. Though the tools could always offer more features, such as easier color and font setup in Spotfire, better view export functions in TreeMap, and improved import functions in both, they were very user-friendly. Both facilitated the discovery of relationships in the data without requiring a long learning period.