Beth Weinstein
CMSC 838B: Information Visualization
Application Project
February 28, 2001
The oil data examined for this analysis is production and financial information showing how specific Chevron oil fields performed during the year 2000. There are 81 oil fields in total. The individual oil fields are grouped by assets (teams), and the assets are in turn grouped by profit centers, which gives the data a hierarchical structure. There are 37 variables for each oil field, including production levels, production costs, revenues, and reserves. Analyzing financial data of this sort makes it possible to assess how individual oil fields, or more broadly how profit centers, are performing, and so to make informed decisions about the future of each field and of the company.
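The grouping described above (oil fields within assets, assets within profit centers) can be sketched as a nested mapping. This is only an illustration of the structure, not the actual data set: the profit center, asset, and field names, the production figures, and the function names `build_hierarchy` and `rollup` are all hypothetical placeholders.

```python
from collections import defaultdict

# Hypothetical records: (profit_center, asset, field, crude_bbls).
# All names and numbers are placeholders, not values from the Chevron data.
RECORDS = [
    ("PC-A", "Asset 1", "Field 01", 120_000),
    ("PC-A", "Asset 1", "Field 02", 80_000),
    ("PC-A", "Asset 2", "Field 03", 50_000),
    ("PC-B", "Asset 3", "Field 04", 200_000),
]


def build_hierarchy(records):
    """Group fields under assets, and assets under profit centers."""
    tree = defaultdict(lambda: defaultdict(dict))
    for profit_center, asset, field, crude_bbls in records:
        tree[profit_center][asset][field] = crude_bbls
    return tree


def rollup(tree):
    """Sum a field-level variable up to the profit-center level."""
    return {
        profit_center: sum(sum(fields.values()) for fields in assets.values())
        for profit_center, assets in tree.items()
    }


hierarchy = build_hierarchy(RECORDS)
totals = rollup(hierarchy)
```

Rolling a variable up the hierarchy in this way is what allows a comparison at the profit center level rather than field by field.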
Two visualization tools were used to analyze the oil field data: the Table Lens and Spotfire. The Table Lens, developed by Xerox PARC, allows the user to visualize tabular and hierarchical data using animated "focus + context" or "fisheye" techniques. The user views the data in a "cases by variable" table. Spotfire Pro 4 for Windows, which had its beginnings as Christopher Ahlberg's graduate research project, is a general visualization tool that can be used to display a variety of data including multidimensional data using dynamic queries. Today it is used commercially by pharmaceutical, biotechnology, and manufacturing industries.
Relationships were found amongst the variables over the range of oil fields. While some trends appeared in both the Table Lens and Spotfire, some correlations were observed better with the Table Lens and others more clearly with Spotfire.
Before the visualization tools
were used, a linear correlation among income taxes, A/T earnings (reported
and operational), and B/T earnings (reported and operational) was already
expected. Both the Table Lens and Spotfire showed this relationship
clearly.
The Table Lens:
Figure 1:
The image below shows the entire data set over three variables: (1) number
of barrels of crude oil produced over the year 2000, (2) number of barrels
of crude oil produced per day, and (3) crude oil revenue for the year,
respectively. There is the obvious, known trend between the number
of barrels of oil produced for the year 2000 and the number of barrels
produced per day. However, there is also a linear correlation between
the number of barrels produced per year/day and the crude oil revenue for
the year. The more crude oil an oil field produced, the more money
it made from crude oil that year. This holds for the data set
except for two distinct outliers.
Crude Oil (Bbls) vs. Crude Oil Revenue
Figure 2:
The image below shows the entire data set over two variables: (1) total
production expense and (2) average net capital employed. There seems
to be a linear relationship between the variables. This suggests that
the more it costs to produce from an oil field, the more company money is
used to acquire the physical assets the project needs.
Production Expense vs. Net Capital Employed
Figure 3:
The image below shows the entire data set over three variables: (1) number
of barrels of crude oil produced over the year 2000, (2) number of barrels
of crude oil produced per day, and (3) end of year P1 reserves in barrels
of oil equivalent gas. The image shows a roughly linear correlation
among the variables: the oil fields that produced the most oil over
the year/day also had the largest estimated amount of oil left in the field
at the end of the year. One possible explanation is that the oil fields that
produce more crude oil were larger oil fields from the start.
Crude Oil (Bbls) vs. End of Year Reserves
Spotfire:
Figure 4:
This image shows the same relationship as in Figure 3 but in Spotfire.
The two very long columns seen in Crude Oil (Bbls) in Figure 3 correspond
to the two outliers: the uppermost yellow and red squares in Figure 4.
The image below shows the entire
data set over two variables: (1) number of barrels of crude oil produced
over the year 2000 and (2) end of year P1 reserves in barrels of oil equivalent
gas. The image shows a linear correlation between the variables, implying
that the oil fields that produced the most oil over the year also had the
largest estimated amount of oil left in the field at the end of the year.
Crude Oil (Bbls) vs. End of Year Reserves
Figure 5:
The image below shows the entire data set over two variables: (1) crude
oil revenue for the year 2000 and (2) total revenue for the same year.
The image shows a linear relationship between the two variables, implying
that crude oil revenue makes up a large portion of the total revenue.
This relationship may be surprising, since each oil field does not produce
only crude oil, but produces natural gas as well.
Crude Oil Revenue vs. Total Revenue
Figure 6:
The image below shows the entire data set over two variables: (1) depreciation,
depletion, & amortization and (2) total production expense. The
linear relationship between the variables implies that as the value of assets
declines over time, a larger total production expense has to be paid.
DD&A & Abdn. vs. Production Expense
Figure 7:
The image below shows the entire data set over two variables: (1) actual
abandonment in dollars and (2) profit center. Using the hierarchy
in the data, the profit center HAT&T, shown in blue, seems to have
more oil fields with a high actual abandonment than either of the other
two profit centers. The HAT&T profit center should be examined:
as seen in Figure 5, HAT&T does not have higher total revenue
values than the other two profit centers, so there is no obvious reason
for it to pay more for abandonment than they do.
Actual Abandonment vs. Profit Center
The Table Lens makes a very good first impression on the user and possesses many strong features. Any action is easily reversible. The Table Lens allows for the last 10 actions to be undone and includes unfocus and unspan buttons. It has great functionality such as its ability to filter, spotlight, sort and move columns, and span (similar to Spotfire and Fisheye). In addition, the color of the data for a column can be redefined to reflect the actual information it represents. Overall, clear trends and relationships are easy to observe.
However, the Table Lens does have its limitations. While tasks are reversible, part of an action cannot be undone. For example, if several span objects exist, the user cannot take away just one of them. The user must delete them all and replace the span objects still wanted. The same is true for items in focus. Also, column name labels are not clear at first glance. The user has to focus in order to read the column name or use the tooltips. Finally, filtering the data is not a quick task.
Spotfire is an extremely robust tool. First, outliers are easy to observe. Second, a line of best fit can be applied to the data to see to what extent a relationship is linear. Third, Spotfire is better than the Table Lens for viewing ambiguous correlations. Finally, the dynamic query is simpler and faster than the filtering techniques of the Table Lens.
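The idea behind a line of best fit, and how it exposes outliers, can be sketched with an ordinary least-squares fit. This is a minimal sketch and makes no claim about how Spotfire computes its fit. The function names `fit_line` and `largest_residuals` and the sample values are hypothetical and are not taken from the oil field data.

```python
from math import sqrt


def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept, r)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)  # Pearson correlation coefficient
    return slope, intercept, r


def largest_residuals(xs, ys, slope, intercept, k=2):
    """Indices of the k points that lie farthest from the fitted line."""
    residuals = [abs(y - (slope * x + intercept)) for x, y in zip(xs, ys)]
    order = sorted(range(len(xs)), key=lambda i: residuals[i], reverse=True)
    return order[:k]


# Hypothetical values standing in for two variables, e.g. production
# and revenue. The third point is deliberately placed off the trend.
production = [1.0, 2.0, 3.0, 4.0, 5.0]
revenue = [2.0, 4.0, 20.0, 8.0, 10.0]

slope, intercept, r = fit_line(production, revenue)
suspects = largest_residuals(production, revenue, slope, intercept, k=1)
```

A value of r close to 1 or -1 indicates a nearly linear relationship, while the points with the largest residuals are the candidates for outliers.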
Nonetheless, Spotfire has limitations as well. When many data points have similar (x, y) positions, even Spotfire's stretching feature cannot separate them enough to show how many points are involved, so it is hard to select the correct data point. Also, the user cannot reveal an entire column name label on the right side of the screen without increasing the width of the rightmost frame, since the column name label is on the same line as the slider range label.
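The overlap problem described above can be quantified outside the tool by counting how many points fall on the same rounded position. The sketch below is a hypothetical helper, not a Spotfire feature. The function name `overplotted`, the `precision` parameter, and the sample points are assumptions made for illustration.

```python
from collections import Counter


def overplotted(points, precision=0):
    """Return each rounded (x, y) position shared by more than one point,
    mapped to the number of points that land on it."""
    counts = Counter(
        (round(x, precision), round(y, precision)) for x, y in points
    )
    return {position: n for position, n in counts.items() if n > 1}


# Hypothetical points; the first two are close enough to overlap on screen.
sample = [(1.0, 2.0), (1.2, 2.1), (5.0, 5.0)]
crowded = overplotted(sample)
```

A count greater than one at a position signals that selecting a single point there by mouse is likely to be unreliable.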
Both visualization programs used in this analysis are very effective tools. They are user oriented because they accept a wide range of data formats and, for the most part, handle the data in the way the user expects and wants. However, both tools have trouble handling missing data. With respect to my data specifically, the Table Lens was better for viewing close trends, whereas Spotfire was superior for detecting ambiguous correlations. This difference is displayed in Figures 3 and 4: with its line of best fit, Spotfire shows the linear relationship in Figure 4 better than the Table Lens shows the trend in Figure 3.
Rao, R., and Card, S.K. "The Table Lens: Merging Graphical and Symbolic Representations in an Interactive Focus + Context Visualization for Tabular Information." Proceedings of CHI '94, ACM Conference on Human Factors in Computing Systems, New York, 1994: 318-322 and 481-482.