KEYWORDS: monitoring, alarm, enterprise management, business network, timeline, visualization, tight coupling, dynamic query
Bizview is a tool for monitoring business and network alarms. We worked with General Electric Information Systems, Inc. (GEIS), whose large network serves industry by connecting remote business applications such as inventory, order entry, billing, and other financial applications. Currently GEIS monitors the network on behalf of its clients and provides regular performance reports. A need is emerging, however, to give clients direct access to monitoring capabilities at a local level. Bizview demonstrates how clients with little knowledge of network monitoring can browse, filter, and customize the alarms generated by their business.
The network monitoring station currently used by GEIS operators consists of a map and a list of coded messages that are hard to browse. Customization and filtering of alarms are possible but rarely used because they take too long and are too difficult to specify. Visualization techniques have been studied to browse message log files [1] and alarms from control systems [2]. Bizview illustrates visual information seeking principles applied to a monitoring workstation [3].
FILTERING AND TIGHT COUPLING
The Bizview prototype was implemented in Visual Basic and runs on a Pentium PC. The interface consists mainly of three screens: the main overview, the textual alarm history, and the monitoring profile. The main overview screen (Figure 1) includes a status map, node filters to the left of the map, and a timeline with a flag for each alarm. A brief description of the latest incoming alarm appears at the bottom of the screen. On the status map, bright red dots indicate nodes in critical condition and light red dots represent warnings. The geographical map representing the business network can be replaced by a business diagram. Using the filter buttons, nodes can be grayed out or hidden according to category or status.
On the timeline, each vertical flag represents an alarm. New alarms first appear on the right side and slide to the left as time passes. The time period covered by the timeline can be adjusted. The number of alarms shown during the 5-minute period of Figure 1 is much greater than what customers would typically see (we sped up the alarm rate to make the video demo more lively). The color of the flag identifies the category of the alarm and its height indicates the severity. A click on an alarm displays the corresponding text message, and users can rapidly browse through the messages, backward or forward, using the two arrow buttons. The complete list of all alarm text messages can also be seen at once on the "alarm history" screen, following the traditional way of looking at alarms.
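The timeline encoding described above can be sketched as a mapping from an alarm to a flag position, height, and color. This is only an illustrative sketch, not the prototype's Visual Basic code: the `Alarm` fields, the 1 to 5 severity scale, the pixel dimensions, and the category colors are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Alarm:
    node: str
    category: str
    severity: int     # assumed scale: 1 (lowest) to 5 (critical)
    timestamp: float  # seconds


# Hypothetical category-to-color table; the paper does not list the colors.
CATEGORY_COLORS = {"network": "red", "inventory": "blue", "billing": "green"}


def flag_for(alarm, now, window_seconds=300.0, width_px=600,
             max_height_px=50, max_severity=5):
    """Return (x, height, color) for an alarm, or None if it falls
    outside the time window currently covered by the timeline."""
    age = now - alarm.timestamp
    if age < 0 or age > window_seconds:
        return None
    # New alarms appear at the right edge and slide left as they age.
    x = round(width_px * (1.0 - age / window_seconds))
    # Flag height encodes severity; color encodes category.
    height = round(max_height_px * alarm.severity / max_severity)
    color = CATEGORY_COLORS.get(alarm.category, "gray")
    return x, height, color
```

Changing `window_seconds` corresponds to adjusting the time period covered by the timeline.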
All views are tightly coupled. Filtering the nodes affects the map and the timeline. Selecting an alarm highlights the corresponding node. Selecting a node shows the alarm that caused its last change of status. Double-clicking on a node isolates all the alarms for that node by graying out all other alarms. The alarms of that single node can then be reviewed quickly. The legend of the timeline acts as the input device for reviewing the network history on the status map. A click on the timeline legend at the position of time T displays the status map as it was at time T. This "out-of-date" status map appears with a yellow background to indicate that it does not reflect the current status (using the same yellow as the timeline legend background).
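One way to obtain the status map at a past time T is to replay the alarm history up to T. The sketch below assumes each alarm is recorded as a hypothetical `(timestamp, node, new_status)` tuple; the paper does not describe how the prototype stores its history. The second function illustrates the double-click isolation of a single node.

```python
def status_at(history, t):
    """Replay alarms in time order up to and including time t and
    return a {node: status} dictionary for the status map."""
    status = {}
    for timestamp, node, new_status in sorted(history):
        if timestamp > t:
            break
        status[node] = new_status
    return status


def isolate_node(history, node):
    """Pair each alarm with a 'grayed' flag: alarms that do not
    belong to the selected node are grayed out on the timeline."""
    return [(alarm, alarm[1] != node) for alarm in history]
```

A node that has not raised any alarm by time T is simply absent from the result and would be drawn in its normal state.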
MONITORING PROFILE
Bizview relies on the categorization of alarms. The network administrator (e.g. GEIS) publishes a catalog of the categories and types of alarms broadcast by the applications running on the network. Using a two-level outliner, the monitoring profile presents this catalog and allows users to control how and when alarms appear. For each category and type of alarm (or groups of them) users can specify the minimum level of severity of the alarms they want to see (i.e. that will generate a flag), the effect on the node status, and the length and color of the flag. In general the number of alarms is small and users have no problem following the status of the network. However, we propose a few techniques to handle larger numbers of alarms.
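The two-level monitoring profile (category, then type) can be pictured as a nested lookup table. The following is a sketch under assumed names: the `Rule` fields mirror the settings listed in the text, while the `"*"` category default, the sample categories, and the numeric values are illustrative and not taken from the GEIS catalog.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    min_severity: int  # alarms below this severity generate no flag
    node_effect: str   # assumed values: "none", "warning", "critical"
    flag_length: int
    flag_color: str


# Level 1: category. Level 2: alarm type, with "*" as the category default.
PROFILE = {
    "inventory": {
        "*": Rule(3, "warning", 20, "blue"),
        "low inventory": Rule(1, "critical", 40, "blue"),
    },
    "financial": {
        "*": Rule(4, "none", 10, "green"),
    },
}


def rule_for(profile, category, alarm_type):
    """Return the rule for a type, falling back to its category default."""
    types = profile.get(category)
    if types is None:
        return None
    return types.get(alarm_type, types.get("*"))


def visible_rule(profile, category, alarm_type, severity):
    """Return the rule to draw the flag with, or None if the alarm
    is unknown or below the user's minimum severity."""
    rule = rule_for(profile, category, alarm_type)
    if rule is None or severity < rule.min_severity:
        return None
    return rule
```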
We observed that the first problem with having too many alarms is that users spend most of their time acknowledging alarms, often without finding the time to read the messages (high-severity alarms pop up a window which needs to be closed). We therefore designed features to alleviate this problem.
Mute level. Alarms below the mute level neither emit a sound nor request acknowledgment. The mute level (i.e. the dashed line on the timeline) can be adjusted until the user can study the situation without constant interruptions.
Resolved alarms. Using an option in the preference menu, resolved alarms can be hidden: when alarms are resolved (e.g. a node that was down is now up) the warning, down, and up flags for that node disappear and are replaced by a horizontal line in the lower part of the timeline. This lower part of the timeline makes it possible to tell what types of problems have occurred in the past but have been resolved. Of course, not all alarms can be associated and eliminated in this way, but research in network management is making progress in the proactive automatic analysis of alarms.
Profile adjustment. The monitoring profile can be used to readjust the importance of each category of alarm and even to temporarily eliminate some of them. For example, during peak sales periods, a department store chain may want to increase the importance of "low inventory" alarms while ignoring the financial alarms, which are handled by another employee.
Single type adjustment. To avoid switching to the monitoring profile and updating it, temporary adjustments can be made on the timeline of the main screen by directly shrinking or stretching a flag. This will affect a single type of alarm for a limited amount of time.
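Two of the techniques above, the mute level and the hiding of resolved alarms, can be sketched as follows. The event representation, a time-ordered list of hypothetical `(timestamp, node, kind)` tuples with kinds "warning", "down", and "up", is an assumption, as is the simple rule that an "up" event resolves every pending flag for its node.

```python
def is_muted(severity, mute_level):
    """Alarms below the mute level neither sound nor ask for acknowledgment."""
    return severity < mute_level


def collapse_resolved(events):
    """Split time-ordered events into flags that stay visible and
    resolved spans (node, start, end) drawn as horizontal lines."""
    pending_by_node = {}
    visible = []
    resolved = []
    for timestamp, node, kind in events:
        if kind == "up":
            pending = pending_by_node.pop(node, [])
            if pending:
                # The pending warning/down flags and this "up" flag are
                # replaced by one span from the first flag to recovery.
                resolved.append((node, pending[0][0], timestamp))
            else:
                # Nothing to resolve: keep the flag as it is.
                visible.append((timestamp, node, kind))
        else:
            pending_by_node.setdefault(node, []).append((timestamp, node, kind))
    # Flags that were never resolved remain on the timeline.
    for pending in pending_by_node.values():
        visible.extend(pending)
    visible.sort()
    return visible, resolved
```

Raising `mute_level` quiets more alarms without removing their flags from the timeline, which matches the intent of letting the user study the situation without constant interruptions.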
We want to thank Ben Shneiderman from HCIL and Ren Stimart and Cathy O'Donnell from GEIS for their participation in this project. Partial support for this research was provided by General Electric Information Systems Inc. and Maryland Industrial Partnerships.
1. Eick, S., Nelson, M., Schmidt, J., Graphical analysis of Computer Log Files, Communications of the ACM (Dec. 94) 37, 12.
2. Singers, B., Endres, L. S., Metaphoric abstraction: further explorations of the starfield display, to appear in the Proc. of 1995 Symposium on Human Interaction in Complex Systems, Greensboro N.C. (Sept. 95) Kluwer Academic Publishers, MA.
3. Ahlberg, C., Shneiderman, B., Visual information seeking: tight coupling of dynamic query filters with starfield displays. In Proc. of CHI 94, 313-316, ACM, NY.
Figure 1: The status map shows two nodes in alarm. The timeline shows the type and severity of the alarms received. Nodes and alarms can be filtered and their relationships can be highlighted. In addition, all historical data can be reviewed from this single screen.