Motivation and Background
The geographic patterns of cancer around the world and within countries have provided important clues to the causes of cancer. In the mid-1970s the National Cancer Institute prepared county-based and state economic area (SEA)-based maps of cancer mortality during 1950-69 in the United States that identified distinctive variations for specific tumors, thus prompting a series of analytic studies of cancer in high-risk areas of the country. The updated geographic patterns should help in formulating etiologic and other hypotheses, and in targeting high-risk populations for further epidemiological research and cancer control interventions.
The goals of the project are to find common trends across sex, race, and geographic location, and to use an existing application to encode and visualize information whose trends are difficult to see in text form.
Tool
Spotfire for Windows was used to examine this data. Spotfire is a versatile information visualization tool that can be used to create charts and graphs for a wide variety of data. Spotfire is available for download, but at a cost.
Dataset used
I chose to visualize United States cancer mortality among whites and blacks for the years 1970-1994. Rates per 100,000 person-years, directly standardized using the 1970 U.S. population, are calculated by race (whites, blacks) and sex for 40 forms of cancer. To save space, not all 40 cancers are visualized; the aim was to get a general idea of the trends in the most common forms of cancer. I gathered my data from www.dceg.cancer.gov. Most of the data is published as PDF files, which made it extremely difficult to extract into .txt or .xls files for visualization in Spotfire. Data for all 50 states and the District of Columbia are presented; mortality data for Alaska and Hawaii were available only at the state level.
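To make "directly standardized" concrete, here is a minimal sketch of the usual calculation: each age-specific death rate is weighted by that age group's share of a standard population, and the weighted rates are summed. The function name and all numbers below are hypothetical and for illustration only; they are not taken from the NCI data, and the weights are not the real 1970 U.S. population.

```python
def directly_standardized_rate(deaths, person_years, standard_pop, per=100_000):
    """Age-adjusted rate: weight each age-specific rate by the
    standard population's share for that age group, then sum."""
    total_standard = sum(standard_pop)
    rate = 0.0
    for d, py, std in zip(deaths, person_years, standard_pop):
        age_specific_rate = d / py               # deaths per person-year
        rate += age_specific_rate * (std / total_standard)
    return rate * per                            # per 100,000 person-years


# Hypothetical inputs for four age groups (0-19, 20-49, 50-74, 75+).
deaths = [5, 40, 300, 250]
person_years = [400_000, 500_000, 300_000, 50_000]
standard_pop = [38, 35, 22, 5]   # made-up weights, NOT the 1970 census

print(round(directly_standardized_rate(deaths, person_years, standard_pop), 1))
```

Because every population is weighted by the same standard age distribution, rates for groups with different age structures can be compared fairly.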
Variables
V = variable: R, C, LB, or UB
RG = race / gender: BM, BF, WM, WF (B = black, W = white, F = female, M = male)
T = calendar time: 5094, 5069, 7094, and the nine 5-year periods 5054 through 9094
A = age group: blank (all ages), 019, 2049, 5074, 75+
Example:
RBM70942049 = rate for black males ages 20-49 for the time period 1970-1994
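The naming scheme above can be decoded mechanically. The following is a small sketch, assuming every name is the concatenation V + RG + T + A exactly as listed; the function name `decode` and the output field names are my own and are not part of the dataset. The V code is returned as-is rather than expanded, since only R (rate) is spelled out in the example.

```python
import re

# V, then RG, then a two-digit start and end year, then an optional age group.
PATTERN = re.compile(
    r"^(R|C|LB|UB)(BM|BF|WM|WF)(\d{2})(\d{2})(019|2049|5074|75\+)?$"
)
RACE = {"B": "black", "W": "white"}
SEX = {"M": "male", "F": "female"}
AGES = {None: "all ages", "019": "0-19", "2049": "20-49",
        "5074": "50-74", "75+": "75+"}


def decode(name):
    """Split a variable name such as 'RBM70942049' into its parts."""
    match = PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized variable name: {name!r}")
    variable, rg, start, end, age = match.groups()
    return {
        "variable": variable,
        "race": RACE[rg[0]],
        "sex": SEX[rg[1]],
        "period": f"19{start}-19{end}",   # all periods fall within 1950-1994
        "ages": AGES[age],
    }


print(decode("RBM70942049"))
```

For the example above this yields variable R, race black, sex male, period 1970-1994, ages 20-49.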
Visualizations:
Imagine trying to pull useful information out of a large data set such as this one in Excel. Finding trends in the raw numbers would take an extremely long time.

For lung cancer, there have been remarkable changes in the geographic patterns corresponding to regional/temporal variations in smoking trends by sex and race, with the recent emergence of high mortality rates among white men across the South, among white women in the far western states, and among blacks in northern urban areas. See Visualization1, which shows lung cancer mortality among males, with leukemia as a contrast.
*******************************************************************************************
A report from the National Cancer Institute (NCI) estimates that about 1 in 8 women in the United States (approximately 12.8 percent) will develop breast cancer during their lifetime.
The 1 in 8 figure means that, if current rates stay constant, a female born today has a 1 in 8 chance of developing breast cancer sometime during her life. On the other hand, she has a 7 in 8 chance of never developing breast cancer. Because the calculations from NCI's Surveillance, Epidemiology, and End Results (SEER) program are weighted, they take into account that not all women live to older ages, when breast cancer risk becomes the greatest. A woman's chance of being diagnosed with breast cancer is:
| from age 30 to age 40 | 1 out of 257 |
| from age 40 to age 50 | 1 out of 67 |
| from age 50 to age 60 | 1 out of 36 |
| from age 60 to age 70 | 1 out of 28 |
| from age 70 to age 80 | 1 out of 24 |
| Ever | 1 out of 8 |
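The "1 out of N" figures in the table can be read as percentages with simple arithmetic. This short sketch converts in both directions; the helper names are my own. It also shows why the 12.8 percent figure is quoted as "1 in 8": it is slightly more than 1/8 and is rounded to the nearest whole number.

```python
def odds_to_percent(n):
    """Convert '1 out of n' into a percentage."""
    return 100 / n


def percent_to_odds(percent):
    """Convert a percentage into the n of '1 out of n'."""
    return 100 / percent


# The "1 out of N" values from the table above.
table = {
    "age 30 to 40": 257,
    "age 40 to 50": 67,
    "age 50 to 60": 36,
    "age 60 to 70": 28,
    "age 70 to 80": 24,
    "ever": 8,
}

for label, n in table.items():
    print(f"{label}: 1 out of {n} = {odds_to_percent(n):.2f}%")

# The 12.8 percent lifetime estimate expressed as odds.
print(f"12.8% = 1 out of {percent_to_odds(12.8):.1f}")
```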
In the following visualization you can see that breast cancer is the leading killer among women, whether white or black, and across age groups. See Visualization2, which shows women of different races and ages. The relationship in the data is mostly linear.
******************************************************************************************************
Spotfire can also present the data as pie charts, which gives the user a different perspective on the data. Visualization3 shows the 36 different cancers plotted against the number of deaths for males and females.
********************************************************************************************************
Spotfire also offers yet another way to visualize data: the ability to profile it. This is a profile of death rates of men versus women for the different cancer types.

This shows the relationship between males and females across the different types of cancer. For females, clicking on the red line would show breast cancer; for males, it shows that lung cancer is the largest killer. These are followed by colon and lung cancer for females and prostate cancer for males.
Critique and suggestions for improvement
Spotfire is a good tool if your data is continuous on the attribute that you wish to examine. In my dataset, the attributes of interest have only a small number of discrete values, so many of the data points lie on top of each other. The tool is still useful for finding correlations, but I would suggest using a tool such as Table Lens to see the relationships in the data better. Another problem that I had with Spotfire was that many of the attributes I was interested in exploring were unordered. Because the attribute values had no inherent ordering, I would have liked to be able to rearrange the columns in real time to examine the relationships among the values.
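One way to see how much overplotting discrete attributes cause is to count how many records share each coordinate. The sketch below is only an illustration of the problem, not something Spotfire did for me; the records are hypothetical (race/gender code, age group) pairs and the function name is my own.

```python
from collections import Counter


def overlap_counts(points):
    """Count how many records fall on each identical (x, y) position."""
    return Counter(points)


# Hypothetical records: (race/gender code, age group).
points = [
    ("WM", "5074"), ("WM", "5074"), ("BF", "2049"),
    ("WM", "5074"), ("BF", "75+"),
]

for (group, age), n in overlap_counts(points).most_common():
    print(f"{group} {age}: {n} overlapping point(s)")
```

Counts like these could be encoded as marker size or color so that stacked points remain visible.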