Yoo Ah Kim
Min-ho Shin
ykim@cs.umd.edu mhshin@cs.umd.edu
Department of Computer Science
Electronic mail is one of the most popular computer applications. As the number
of emails we exchange increases, managing a huge volume of electronic messages
becomes more and more important. In addition, email data patterns may give us
useful information, including personal history. We propose two visualizations
of an email dataset: a time-based view and a thread-based view. The time-based
view displays messages in a two-dimensional table in which the rows are people
and the columns are received/sent times. To scale to large volumes of data, we
use dynamic queries and zooming. The thread-based view shows emails that belong
to the same "conversational" thread. It shows all senders who participated in a
thread and the messages in time order, with the relations among those messages.
Keywords:
Electronic mail, time-based view, thread-based view, scalability
Introduction
Nowadays electronic mail is one of the most popular computer applications. As
the number of emails we exchange increases, managing a huge volume of
electronic messages becomes more and more important. Although email was
invented for asynchronous communication, it is also used for other purposes
such as task management and personal archiving. In addition, email data
patterns may give us useful information, including personal history. However,
there are no proper visualization techniques that serve these purposes.
In this paper, we propose two visualizations of an email dataset to help users
perform these tasks: a time-based view and a thread-based view. The time-based
view displays messages in a two-dimensional table in which the rows are people
and the columns are received/sent times. To scale to large volumes of data, we
use dynamic queries and zooming. The view also has sort, filter, and aggregate
functions to help users find the information they need. The thread-based view
shows emails that belong to the same "conversational" thread. Threads are
created when users send mail using the "reply" menu. The thread-based view
shows all senders who participated in a thread and the messages in time order,
with the relationships among those messages.
Design Goals
§ View sent/received email patterns
With an email dataset, users may want to see mail patterns over time.
Interesting questions include who sent the most emails in a certain period, or
when a person sent emails most frequently. To show patterns across a large
volume of messages, the scalability problem must be solved. We use dynamic
queries, zooming, aggregation, filtering, and gradation to cope with this
problem.
§ Find people and emails related to each other
Emails can be threaded using "reply", and several users may participate in a
mail thread. It would be useful to see all participating users, and who sent or
received each email in the thread, together with the relations among the
messages.
§ Search information in the mailbox
Emails are used as personal archives for finding information in the future.
Several studies [8] showed that semantic hierarchies using folders, currently
the predominant scheme, are not suitable for this task because it is difficult
for users to organize mail folders properly and to figure out which folder
holds the mail they need. Because people can usually recall the sender and the
approximate sent/received time of a message, the time-based view can help users
find the mail they need. The thread-based view also makes it easy to extract
related information by showing all messages in the same thread together.
Related Work
Timestore [1][9] organizes messages by time and sender in a two-dimensional
grid, as shown in Figure 1. Messages are displayed as dots whose size encodes
the number of messages. It allows the search space to be narrowed using
full-text search. Its authors also merged it with a task and calendar
management system. Timestore focuses on time-based archiving and retrieval of
emails.

Figure 1. Timestore
Outlook 2000 also has a time-based view (Figure 2). It displays every message
with its subject at its received time, without aggregating by date or
considering senders. Because it uses a fixed width for each day and shows every
message with its subject, the view can become cluttered and hard to understand
when there are many messages. When many emails arrive within a short period, it
expands the y-axis to list them.
Threading is necessary to help manage conversation history and track the status
of conversations in email [8]. Many systems have been developed to visualize
conversations in chat programs and instant messaging services
[2][3][4][5][7]. Netscan thread trees display conversation threads for
newsgroups (Figure 3). However, visualizing email threads is more difficult
because both senders and receivers are important and, unlike in newsgroups,
there are two kinds of messages: incoming and outgoing.

Figure 2. Outlook 2000

Figure 3. Netscan Thread Tree
Time-based Visualization
§ Features
In this view, we display messages in a two-dimensional grid in which each row
is the email address of a person and each column is a date, as shown in
Figure 4. Each cell holds the messages that the corresponding person sent or
received at the given time. We encode the number of messages as the height of a
bar (when there are fewer messages) or as the gradation of a spot (when there
are more messages).
The first section shows the email addresses of people who sent or received
mail. The second section shows, as a bar chart, the total number of mails each
person sent or received. Users can choose to display incoming mails, outgoing
mails, or both.
Users can choose the date level (date, month, or year), and messages are
aggregated at that level. When messages are aggregated by date, a vertical line
appears for each week to help users see weekly patterns.
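The aggregation described above can be illustrated with a minimal sketch. It assumes each message has already been reduced to an (address, date) pair; the names `bucket_key` and `build_grid` and the sample addresses are hypothetical and are not taken from the system described in this paper.

```python
from collections import Counter
from datetime import date

def bucket_key(day: date, level: str) -> str:
    """Map a date to its column label at the chosen aggregation level."""
    if level == "date":
        return day.isoformat()
    if level == "month":
        return f"{day.year:04d}-{day.month:02d}"
    if level == "year":
        return f"{day.year:04d}"
    raise ValueError(f"unknown level: {level}")

def build_grid(messages, level="date"):
    """Count messages per (person, time bucket) cell.

    `messages` is an iterable of (address, date) pairs; this record
    shape is an assumption made for the example.
    """
    grid = Counter()
    for address, day in messages:
        grid[(address, bucket_key(day, level))] += 1
    return grid

# Hypothetical sample data.
messages = [
    ("alice@example.com", date(2001, 3, 5)),
    ("alice@example.com", date(2001, 3, 5)),
    ("alice@example.com", date(2001, 3, 20)),
    ("bob@example.org", date(2001, 4, 1)),
]
by_day = build_grid(messages, "date")
by_month = build_grid(messages, "month")
```

Switching the level only changes the bucket label, so the same message list can be re-aggregated whenever the user picks a different date level.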
People can be sorted by email address, domain name, or message count. The view
can also filter people whose email address contains a given substring;
filtering by domain name, for example, is an interesting query. It is also
possible to search messages by email address or subject.
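As a rough sketch of these sort and filter operations, assume the per-person totals are held in a dictionary from address to message count. The function names and sample data below are hypothetical.

```python
def domain_of(address: str) -> str:
    """Return the lower-cased part of the address after the last '@'."""
    return address.rsplit("@", 1)[-1].lower()

def filter_people(totals, substring):
    """Keep only people whose address contains the given substring."""
    needle = substring.lower()
    return {a: n for a, n in totals.items() if needle in a.lower()}

def sort_people(totals, key="address"):
    """Order the rows by address, by domain name, or by message count."""
    if key == "address":
        return sorted(totals)
    if key == "domain":
        return sorted(totals, key=lambda a: (domain_of(a), a))
    if key == "count":
        # Largest count first; ties are broken alphabetically.
        return sorted(totals, key=lambda a: (-totals[a], a))
    raise ValueError(f"unknown sort key: {key}")

# Hypothetical per-person totals.
totals = {
    "carol@example.edu": 12,
    "alice@example.org": 3,
    "bob@example.edu": 7,
}
by_domain = sort_people(totals, "domain")
by_count = sort_people(totals, "count")
edu_only = filter_people(totals, "example.edu")
```

Filtering by domain is then just a substring filter on the part after the "@".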

Figure 4. Time-based visualization. The darkness of each cell represents the
number of messages in it.
§ Scalability
- Bar chart vs. Gradation
To see the number of messages in each grid cell accurately and compare it with
others, a bar chart is more helpful. But if many people are on screen or the
time range is very long, it is difficult to show the patterns using bar charts.
For the case of many people and a long period, we provide another view that
uses gradation. Each cell contains a spot, and the gradation of the spot
represents the number of messages. This view gives a good overview of messages
in terms of people and date. While the bar chart can show incoming and outgoing
messages simultaneously in different colors, spots show only the total number
of messages for the chosen option (incoming, outgoing, or both). Figure 5
compares the bar chart and gradation views.


Figure 5. Bar chart vs. gradation. The bar chart is better for seeing and
comparing patterns in detail for both sent and received mails. Gradation is
better for viewing a large volume of data.
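The two encodings can be sketched as simple mappings from a cell's message count to a visual value. This is an illustration under assumptions: the linear scaling, the 20-pixel cell height, and the 8-bit gray range are hypothetical choices, not parameters reported in this paper.

```python
def bar_height(count, max_count, cell_height=20):
    """Bar height in pixels, proportional to the message count."""
    if max_count <= 0:
        return 0
    return round(cell_height * count / max_count)

def gray_level(count, max_count):
    """8-bit gray value: 255 (white) for an empty cell, 0 (black) for the busiest."""
    if max_count <= 0 or count <= 0:
        return 255
    fraction = min(count / max_count, 1.0)
    return round(255 * (1.0 - fraction))
```

A bar needs several pixels of height to be readable, whereas a gray level still carries information in a cell only a few pixels wide, which is why gradation suits views with many people or long periods.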
- Dynamic Query
To manage a large dataset, we also apply dynamic queries to people and dates.
A dynamic query filters and zooms the range of data so that users can easily
find the data they want to see (Figure 6). When users change a range, the data
inside the range is fitted to the screen and the data outside the range is
hidden. Moving the slider bar brings the hidden data into view. Labels such as
addresses and dates adapt dynamically to the chosen range, so more detailed
information is displayed as users zoom in.


Figure 6. Dynamic query. Users can zoom by narrowing the range of the slider
bar.
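The range-based filtering and zooming can be sketched as follows, assuming the slider yields an inclusive index range over the date columns. The function names and the pixel thresholds for label detail are hypothetical.

```python
from datetime import date

def visible_columns(columns, lo, hi):
    """Return the columns whose index lies in the slider range [lo, hi]."""
    return columns[lo:hi + 1]

def column_width(view_width, lo, hi):
    """Width of each column when the selected range is stretched to fit the view."""
    return view_width / (hi - lo + 1)

def label_for(day, width_px):
    """Show more label detail as columns get wider (thresholds are assumptions)."""
    if width_px >= 80:
        return day.isoformat()
    if width_px >= 30:
        return f"{day.month}/{day.day}"
    return ""

# Hypothetical data: ten consecutive days in a 600-pixel-wide view.
columns = [date(2001, 3, d) for d in range(1, 11)]
zoomed = visible_columns(columns, 2, 4)
zoomed_width = column_width(600, 2, 4)
full_width = column_width(600, 0, 9)
```

Narrowing the range both hides the columns outside it and widens the remaining ones, which is what lets the labels show more detail.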
§ Message Selection
When users move the mouse over a cell, the cell's information (person and date)
is displayed. Users can see more detail by clicking the right mouse button on
the cell, which opens a pop-up window listing the messages in the cell. Each
entry shows the subject of the message and the number of messages in the thread
to which it belongs. To see the thread view for a message, users choose that
message in the list. Figure 7 shows the pop-up window for message selection.

Figure 7. Message selection
Thread-based Visualization
The thread view shows the relations among messages, as shown in Figure 8. For
a chosen message, we find all messages related to it and display them together
with all the people who participated in the thread. The rows are people and the
columns are time. Messages are listed in order of received/sent time. Note that,
unlike with newsgroup data, showing both senders and receivers is very
important.
We represent senders as big red rectangles and receivers as small blue circles.
Arrows connect the sender and the receivers of the same mail. If a mail is a
reply to another mail, a link in a different color connects the two mails; this
is the thick red line in Figure 8. We divide the time axis by date to make the
time information of messages easy to identify.


Figure 8. Thread-based visualization. Red rectangles are senders and blue
circles are receivers. Reply information is represented as red links. An arrow
shows that a mail was sent from the sender to the receiver.
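Finding the messages and participants of a thread can be sketched with Python's standard `email` package. The sketch assumes that replies carry an `In-Reply-To` header naming the `Message-ID` of the mail they answer; the function names and sample messages are hypothetical, and ordering by time is omitted for brevity.

```python
from collections import defaultdict
from email import message_from_string
from email.utils import getaddresses

def header(msg, name):
    """Return a stripped header value, or None if the header is absent."""
    value = msg[name]
    return value.strip() if value else None

def thread_of(messages, chosen_id):
    """Return the Message-IDs of every message in the chosen message's thread."""
    parent = {}
    for m in messages:
        msg_id = header(m, "Message-ID")
        if msg_id:
            parent[msg_id] = header(m, "In-Reply-To")
    # Walk up the reply chain to the root of the thread.
    root, seen = chosen_id, {chosen_id}
    while True:
        up = parent.get(root)
        if up is None or up not in parent or up in seen:
            break
        root = up
        seen.add(up)
    # Walk down from the root, collecting every reply.
    children = defaultdict(list)
    for msg_id, parent_id in parent.items():
        if parent_id is not None:
            children[parent_id].append(msg_id)
    members, stack = set(), [root]
    while stack:
        msg_id = stack.pop()
        if msg_id not in members:
            members.add(msg_id)
            stack.extend(children[msg_id])
    return members

def participants(messages, members):
    """Collect every sender and receiver address that appears in the thread."""
    people = set()
    for m in messages:
        if header(m, "Message-ID") in members:
            fields = m.get_all("From", []) + m.get_all("To", []) + m.get_all("Cc", [])
            people.update(addr.lower() for _, addr in getaddresses(fields) if addr)
    return people

# Hypothetical mailbox: three messages in one thread and one unrelated message.
RAW = [
    "Message-ID: <1@example.com>\nFrom: alice@example.com\n"
    "To: bob@example.com\nSubject: Lunch?\n\nNoon?",
    "Message-ID: <2@example.com>\nIn-Reply-To: <1@example.com>\n"
    "From: bob@example.com\nTo: alice@example.com, carol@example.com\n"
    "Subject: Re: Lunch?\n\nSure.",
    "Message-ID: <3@example.com>\nIn-Reply-To: <2@example.com>\n"
    "From: carol@example.com\nTo: bob@example.com\nSubject: Re: Lunch?\n\nMe too.",
    "Message-ID: <4@example.com>\nFrom: dave@example.com\n"
    "To: alice@example.com\nSubject: Report\n\nAttached.",
]
messages = [message_from_string(raw) for raw in RAW]
members = thread_of(messages, "<3@example.com>")
people = participants(messages, members)
```

Starting from any message in the thread, the walk first climbs to the root and then descends, so the same set is found whichever message the user selects.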
Problems in Visualization
For outgoing mails, displaying receivers is important because the sender is
always the writer of the message. A message may have more than one receiver, so
the same message may appear several times in the time-based view. This may
display more messages on screen than really exist. In some sense, we can regard
this as several messages with the same content being sent to different
receivers. However, it is then very hard to tell that all those messages are
actually the same message.
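The over-counting can be made concrete with a small sketch. It assumes each outgoing message carries a unique identifier such as its `Message-ID` header, which is one possible way to recognize the copies as a single message; it is not something the view described here does. All names and data are hypothetical.

```python
from collections import defaultdict

def recipient_cells(outgoing):
    """Place each outgoing message in one cell per receiver.

    `outgoing` is an iterable of (message_id, day, receivers) triples;
    this record shape is an assumption made for the example.
    """
    cells = defaultdict(set)
    for msg_id, day, receivers in outgoing:
        for receiver in receivers:
            cells[(receiver, day)].add(msg_id)
    return cells

def drawn_count(cells):
    """Number of message marks drawn on screen, copies included."""
    return sum(len(ids) for ids in cells.values())

def distinct_count(cells):
    """Number of messages that really exist, counted by identifier."""
    return len(set().union(*cells.values()))

# Hypothetical data: one mail to three receivers and one mail to one receiver.
outgoing = [
    ("<10@example.com>", "2001-03-05",
     ["bob@example.com", "carol@example.com", "dave@example.com"]),
    ("<11@example.com>", "2001-03-05", ["bob@example.com"]),
]
cells = recipient_cells(outgoing)
```

Here four marks would be drawn for two actual messages; comparing the two counts shows how much the view inflates the total.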
Our thread view can be constructed only if users write messages using "reply",
which usually adds reply information to the email headers. Sometimes, however,
users send emails without using "reply" even though they are in fact replies to
other mails. In this case, we would have to consider subjects, contents, and
sender/receiver groups, but it is much more difficult to obtain correct
information this way. "Forward" information could also be useful for
constructing threads, but it is not available in our implementation because it
is not part of the standard email header.
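One simple form of the subject-based fallback is to strip reply and forward prefixes and group messages whose remaining subjects match. The sketch below is only a heuristic under that assumption; the prefix list ("Re:", "Fw:", "Fwd:") and all names are hypothetical, and unrelated mails that share a subject would be grouped incorrectly.

```python
import re
from collections import defaultdict

# Leading "Re:", "Fw:" or "Fwd:" markers, possibly repeated, in any letter case.
_PREFIX = re.compile(r"^\s*(?:(?:re|fwd?)\s*:\s*)+", re.IGNORECASE)

def normalize_subject(subject: str) -> str:
    """Strip reply/forward prefixes and fold case so related subjects compare equal."""
    return _PREFIX.sub("", subject).strip().lower()

def group_by_subject(messages):
    """Group (message_id, subject) pairs by normalized subject."""
    groups = defaultdict(list)
    for msg_id, subject in messages:
        groups[normalize_subject(subject)].append(msg_id)
    return dict(groups)

# Hypothetical data.
messages = [
    ("<1>", "Lunch?"),
    ("<2>", "Re: Lunch?"),
    ("<3>", "RE: re: lunch?"),
    ("<4>", "Reminder: report"),
]
groups = group_by_subject(messages)
```

Combining this with the sender/receiver groups and the message dates would reduce false matches, at the cost of a more complex rule.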
If the same person uses several email addresses, we cannot detect that the
messages came from the same person. In particular, if users are on a mailing
list, we cannot figure this out from mailboxes alone. In this case, users
should be able to specify the list of addresses used by each person and merge
the emails received from or sent to the same person.
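Such a user-specified merge could be sketched as a lookup table from address to person that is applied before counting. The table format, the function name, and the sample data are hypothetical.

```python
from collections import Counter

def merge_aliases(totals, aliases):
    """Sum message counts over all addresses that map to the same person.

    `aliases` maps a lower-cased address to a person label supplied by
    the user; addresses without an entry are kept as they are.
    """
    merged = Counter()
    for address, count in totals.items():
        key = address.lower()
        merged[aliases.get(key, key)] += count
    return merged

# Hypothetical alias table and per-address totals.
aliases = {
    "alice@example.edu": "Alice",
    "alice.w@example.com": "Alice",
}
totals = {
    "alice@example.edu": 5,
    "Alice.W@example.com": 2,
    "bob@example.org": 4,
}
merged = merge_aliases(totals, aliases)
```

After merging, the two addresses occupy a single row in the view instead of two.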
Future Work
In our visualization, users can explore data in many ways using filter, sort,
search, and so on. But they may also want to edit or annotate messages for
future use. This function would be especially useful for an email dataset. For
example, users may want to mark a message as needing a reply or as a reminder
for a future task.
Search currently works only on the subject and the sender/receivers. It would
also be useful to search message contents. Specifically, we might want to find
messages that contain a URL, an email address, or attached files.
In the time-based visualization, we can aggregate or filter people based on the
domain name of their email addresses. Other kinds of aggregation and filtering
become possible if we define groups of people in various ways. For example, we
can form a group based on a thread or on real relationships such as family,
friends, and colleagues. More generally, it would be good to connect this
visualization with databases that hold information about people, and to filter
and aggregate people based on those databases.
We can think of another useful view of emails: group-based visualization.
Email exchange patterns give useful information about how frequently people
communicate with each other. We may group people based on how frequently they
appear in the same thread and visualize those groups as graphs.
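As a speculative sketch of the grouping idea, a weighted graph can be built by counting how many threads each pair of people shares. The threshold of two shared threads and all names below are hypothetical.

```python
from collections import Counter
from itertools import combinations

def co_participation(threads):
    """Count, for every pair of people, the number of threads they share.

    `threads` is an iterable of collections of addresses, one per thread.
    """
    weights = Counter()
    for people in threads:
        for a, b in combinations(sorted(set(people)), 2):
            weights[(a, b)] += 1
    return weights

# Hypothetical thread membership.
threads = [
    {"alice", "bob"},
    {"alice", "bob", "carol"},
    {"carol", "dave"},
]
weights = co_participation(threads)
# Keep only edges between people who share at least two threads.
strong = [pair for pair, w in weights.items() if w >= 2]
```

The resulting edge weights could feed a graph layout in which frequently communicating people are drawn close together.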
Conclusion
We proposed two visualizations of an email dataset: a time-based view and a
thread-based view. The time-based view displays messages in a two-dimensional
table in which the rows are people, the columns are received/sent times, and
each cell holds the list of messages for that person (row) and time (column).
To manage large volumes of data, we used dynamic queries, zooming, and
gradation in this view. This view shows users the temporal email exchange
patterns of their correspondents. The thread-based view shows the emails
exchanged using "reply". It displays all senders and messages in each thread in
time order and represents the relations among the messages.
Acknowledgements
We would like to thank Jihwang Yeo and Hyunmo Kang for their valuable comments.
References
[1] Baecker, R., Booth, K., Jovicic, S., McGrenere, J., and Moore, G.
"Reducing the Gap Between What Users Know and What They Need to Know."
[2] Donath, J., Karahalios, K., and Viegas, F. "Visualizing Conversations."
In Proceedings of HICSS 32, January 5-8, 1999.
[3] Rodenstein, R. and Donath, J. S. "Talking in Circles: Designing a
Spatially-Grounded Audioconferencing Environment." In Proceedings of CHI 2000,
pp. 81-88.
[4] Smith, M. A., Cadiz, J. J., and Burkhalter, B. "Conversation Trees and
Threaded Chats." In Proceedings of the 2000 ACM Conference on Computer
Supported Cooperative Work.
[5] Smith, M. A. and Fiore, A. "Visualization Components for Persistent
Conversations." ACM SIGCHI 2001.
[6] Shneiderman, B. "Dynamic Queries for Visual Information Seeking." IEEE
Software, 11(6), 70-77.
[7] Viegas, F. B. and Donath, J. S. "Chat Circles." In Proceedings of
CHI '99, 1999.
[8] Whittaker, S. and Sidner, C. "Email Overload: Exploring Personal
Information Management of Email." In Proceedings of the Conference on Human
Factors in Computing Systems '96.
[9] Yiu, K., Baecker, R. M., Silver, N., and Long, B. "A Time-based Interface
for Electronic Mail and Task Management." In Design of Computing Systems:
Proceedings of HCI International '97, Volume 2, Elsevier, 1997, 19-22.