TimeSearcher 2   User's Manual

Author: Aleks Aris
If you have any comment or question, send an email to the author.

TimeSearcher Web page: http://www.cs.umd.edu/hcil/timesearcher
Human-Computer Interaction Lab
University of Maryland, College Park

1. Introduction

TimeSearcher 2 is a research tool for visualizing time series with as many as ten variables for each time point. Users typically begin with an overview that allows them to zoom into time periods of interest. Then they can select individual time series, or use the TimeBox widget to select a group of time series that fall within a range of times and values. A powerful addition is the SearchBox that enables users to select a short period in an existing time series and search for similar patterns throughout the data. TimeSearcher 2 also allows users to select time series by their attribute values.

2. System Requirements

TimeSearcher 2 is written using Visual Studio .NET C#. As a result, TimeSearcher 2 works on Windows platforms with .NET Framework 1.1 or higher installed.

3. Installation and Running

To install TimeSearcher 2, unzip the .zip bundle into the directory from which you want to run it. (WinZip 9.0 is recommended for unzipping the bundle.) To uninstall TimeSearcher, remove the directory that was created when the bundle was unzipped. To run TimeSearcher, find ts.exe in the installation directory and execute it.

4. Input Data Formats

The following are definitions regarding TimeSearcher 2 data.
time series
real-valued measurements over time, where the x axis represents time and the y axis the values.
time point
every point on the x axis, i.e. on the time axis. For example, a dataset having measurements for every day over a non-leap year will have 365 time points.
item
an entity for which there are measurements. For example, in a meteorological dataset, each item could be a different city where measurements are taken. Items have distinct names. In our example, the item names would be the city names.
variable
a measurement type. An item can have more than one measurement over time (i.e. for each time point), each of a different type. For example, a meteorological dataset may have rainfall, pressure, and temperature measurements. In that case we say that the dataset has three variables.
attribute
a piece of information about an item that does not change over time. For example, in a meteorological dataset where items are cities, we may have the following data for every city: area in square miles, minimum height, average height, maximum height. In that case, every item in the dataset would have four attributes. In our example, the area of a city is the same regardless of time.
    TimeSearcher 2 categorizes data into two major parts:
  1. time series data
  2. attribute data
Feel free to download the sample dataset GroceryStores, which illustrates all formats in this section.

4.1 Time series data

Time series data are time series structured in terms of items and variables. For each item, there must be the same number of variables, for which there are time series of the same length, that is, they have the same number of time points. For example, a meteorological dataset may have 15 items with 4 variables, over a year, that is, with 365 time points. The total number of values in this dataset is going to be 15*4*365 values.
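
The structural rule above can be checked with a few lines of code. The following is only a sketch in Python: the nested-dictionary layout, the function name count_values, and the fourth variable name "humidity" are assumptions made for illustration and are not part of TimeSearcher or its file formats.

```python
def count_values(dataset):
    """Return (items, variables, time points, total values).

    `dataset` is a hypothetical in-memory layout:
    {item_name: {variable_name: [value, value, ...]}}.
    Raises ValueError if items disagree on variables or series length.
    """
    if not dataset:
        raise ValueError("dataset is empty")
    shapes = set()
    for item, series_by_var in dataset.items():
        lengths = {len(values) for values in series_by_var.values()}
        if len(lengths) != 1:
            raise ValueError(f"item {item!r} has series of different lengths")
        shapes.add((tuple(sorted(series_by_var)), lengths.pop()))
    if len(shapes) != 1:
        raise ValueError("items differ in variables or number of time points")
    variables, n_points = shapes.pop()
    n_items, n_vars = len(dataset), len(variables)
    return n_items, n_vars, n_points, n_items * n_vars * n_points


# The meteorological example from the text: 15 items, 4 variables, 365 days.
cities = [f"city{i}" for i in range(15)]
variables = ["rainfall", "pressure", "temperature", "humidity"]
dataset = {c: {v: [0.0] * 365 for v in variables} for c in cities}
print(count_values(dataset))  # (15, 4, 365, 21900)
```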

The main format for reading time series data in TimeSearcher is the tqd format. TimeSearcher recognizes only tqd as its data input format. To open a tqd file, select File->Open from the menu. However, two other formats are defined that can be converted to tqd: the civ format and the ci1 format. These formats are defined in the following subsections.

4.1.1 The tqd format

To see the example file of the GroceryStores dataset in tqd format, click here.
The tqd data format specification is as follows:

Helper definitions:

The structure of the tqd format is as follows: (The line count ignores the comment lines.)

4.1.2 The civ format

civ stands for columnwise items and variables. This data format was introduced because data is commonly available in tabular form, mostly in Excel files. A file in civ format is essentially a CSV file (CSV = Comma Separated Values, a format recognized by Microsoft Excel) with some additional specifications.

Once there is a file in civ format, to convert it to tqd format, use the "civ to tqd converter" in the "Tools" menu of TimeSearcher.

To see the example file of the GroceryStores dataset in civ format, click here.

The civ data format specification is as follows: (Microsoft Excel column and row conventions are used to refer to cells.)

4.1.3 The ci1 format

The ci1 format is a special case of the civ format: It is civ with 1 variable, that is, columnwise items with 1 variable. To combine many ci1 files in order to generate a civ file, use the "ci1 Combiner" under the "Tools" menu of TimeSearcher.

To see the example files of the GroceryStores dataset in ci1 format, click either apples_ci1 or oranges_ci1.

4.2 Attribute data

Attribute data is organized in terms of item names and is loaded into TimeSearcher separately. Time series data and attribute data must be compatible, which is defined as follows: for every item in a time series dataset, there must be corresponding attribute data in a compatible attribute dataset. This also means that the attribute dataset may contain additional items, which are ignored when it is loaded. An attribute dataset can be loaded only after a time series dataset has been loaded. To load attributes into TimeSearcher, choose "File"->"Load Attributes" from the menu. The main data format for an attribute dataset file is the atr format, which is defined in the following section.
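
The compatibility rule can be pictured as a simple lookup by item name. The sketch below is written in Python purely for illustration; the function name match_attributes, the dictionary layout, and the store names and attribute values are assumptions, not TimeSearcher's actual implementation or data.

```python
def match_attributes(series_items, attribute_rows):
    """Pair each time series item with its attribute row.

    series_items:   item names from the loaded time series dataset
    attribute_rows: hypothetical mapping {item_name: {attribute: value}}

    Extra items in attribute_rows are ignored; a time series item with
    no attribute row makes the two datasets incompatible.
    """
    missing = [name for name in series_items if name not in attribute_rows]
    if missing:
        raise ValueError(f"no attribute data for items: {missing}")
    return {name: attribute_rows[name] for name in series_items}


series_items = ["Store1", "Store2"]
attribute_rows = {
    "Store1": {"city": "College Park"},
    "Store2": {"city": "Baltimore"},
    "Store3": {"city": "Annapolis"},  # not in the time series data: ignored
}
matched = match_attributes(series_items, attribute_rows)
print(sorted(matched))  # ['Store1', 'Store2']
```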

4.2.1 The atr format

A file in atr format is essentially a CSV file (CSV = Comma Separated Values, a format recognized by Microsoft Excel) with some additional specifications.

The item names are the common part between a time series dataset and an attribute dataset. They are used to associate the attributes in a compatible attribute dataset with the items in a time series dataset. The first attribute in an attribute dataset is assumed to hold the item names of the time series dataset with which the attribute dataset is compatible.

To see the example attribute file of the GroceryStores dataset in atr format, click here.

The atr data format specification is as follows: (Microsoft Excel column and row conventions are used to refer to cells.)

5. The User Interface

The user interface of TimeSearcher 2 has 4 major parts (see Figure 1):

Figure 1 The TimeSearcher 2 User Interface, Main Components

5.1 The Overview and the Variables View

The graphical part of the interface follows an overview + detail design: the overview is at the bottom (surrounded by a red border) and the detail part is the tab labeled "Variables". This tab is called the Variables view and is the main part of the interface, where visual interactions occur. It contains one QueryPanel for each variable in the loaded time series dataset. In Figure 1, the dataset has 3 variables: "Price", "Velocity", and "Acceleration". The number of variables is also visible on the toolbar, and users can limit the number of variables displayed by using the combo box on the toolbar. In Figure 1, it is set to show all 3 of 3 variables. Users can change the order of the variables in the Variables view by using the combo box just above the QueryPanel they wish to change. Similarly, the overview has a combo box that lets the user set the overview to the desired variable. In Figure 1, since there are 3 variables in the loaded dataset, there are 3 overviews to select from, and "Price" is selected.

The overview (at the bottom in Figure 1) has an orange box, which is called the field of view box. It is resizable in length and moveable. Users can drag either the left or right vertical side to resize the box. To move the box in a horizontal direction without resizing, move the mouse inside the box, and then click and drag to the desired direction. The labels just above the vertical sides show the delimiting time points. The field of view box determines the range of time points to be shown on the Variables view. As soon as a resize or move operation of the overview box is completed, the Variables view is updated to reflect the new bounds.

5.2 The Items List, the Selected Item(s), and the Attributes

The items list is a tabular view of items and the attributes of each item when present. Attributes are optionally loaded. When there are no attributes, the items list has only 1 column listing the item names. When attributes are loaded, the first column is still the item names, which is also regarded as the first attribute in the attribute dataset, and the rest of the columns are the rest of the attributes.

There is at least one item selected at all times, which is highlighted in blue in the items list. The corresponding graphical representations of the selected item in the Variables view are highlighted in blue at the same time. Users can change the selected item by clicking on another item in the items list. When a different item is selected, the Variables view is updated to remain consistent with the items list. Users can also change the selection from the Variables view by clicking on the time series they want to select. To let users see which item is going to be selected, the time series under the mouse is highlighted in yellow. This provides valuable feedback, especially when the mouse is not exactly over a time series but slightly off. There is a tolerance threshold, so even when the mouse is a little off, it is still possible to select the nearest time series, and hence the corresponding item in the dataset.

Multiple time series may be selected, in which case all selected items are highlighted in blue (see Figure 2). While the CTRL key is held down, clicking an item selects it without unselecting the previous one(s). This works both in the items list and in the Variables view. In the items list, use the SHIFT key to select ranges quickly: select the first (last) item, press and hold SHIFT, then select the last (first) item. To select multiple ranges at once, use the CTRL key the same way.

Figure 2   Multiple selections

If a selected item is clicked while CTRL is pressed, it becomes unselected. Use CTRL to select or unselect items one by one. Use SHIFT to select ranges. To unselect all selected items, simply select a single item.

Clicking on a column header sorts the list according to the attribute values in that column. Clicking the same column header repeatedly alternates between ascending and descending sort order. To sort first by column A and then by column B, first click column B and then column A. This generalizes to more than two columns as well. Column headers can be rearranged: simply drag and drop a column header to the desired place. (If the order of the columns changes in the items list, the new order is not reflected in the attribute names in the Attribute Statistics tab.)
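
The "click B, then A" rule is the behavior one gets from a stable sort, in which rows that tie on the most recently clicked column keep their previous relative order. The Python sketch below illustrates the idea only; that TimeSearcher sorts stably is inferred from the rule above, and the function name click_header, the store names, and the column values are made up.

```python
rows = [
    {"name": "North",   "region": "East", "size": 3},
    {"name": "South",   "region": "West", "size": 1},
    {"name": "Central", "region": "East", "size": 1},
    {"name": "Harbor",  "region": "West", "size": 2},
]


def click_header(rows, column):
    """Simulate one ascending click on a column header.

    Python's sorted() is stable, so rows with equal values in `column`
    keep the order they had before the click.
    """
    return sorted(rows, key=lambda row: row[column])


# Sort by region (A) first and size (B) second: click B, then A.
after_b = click_header(rows, "size")
after_a = click_header(after_b, "region")
print([row["name"] for row in after_a])
# ['Central', 'North', 'South', 'Harbor']
```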

5.3 The Details List and the Time Point Line

The details list is the tabular view just above the items list (see Figure 1). The time point line appears both on the Variables view and on the overview, and corresponds to the highlighted row in the details list (see Figure 3). All three are synchronized, that is, a change in one updates the other two.

Figure 3   The time point line

The details list shows the values of the selected item for each variable. If more than one item is selected, the details list shows the values of the topmost selected item in the items list. Using the time point line, users can navigate over the values of an item.

The first column of the details list contains the time points, which correspond to the x-axis values on the Variables view. The header shows how many time points the dataset contains (see Figure 4).

Figure 4   Time points & values

5.4 TimeBoxes and Filtering

The TimeBox is a widget that is capable of filtering items. The time series that remain within the vertical bounds of the TimeBox for the complete horizontal duration are kept, and the rest are filtered out. The time series for filtered items appear light gray on the screen and they are removed from the items list (see Figure 5).

Figure 5   TimeBox, filtered and kept items

To create a TimeBox, click on the TimeBox icon on the toolbar (see Figure 5). This will switch the mode of user interaction to "TimeBox creation" mode. The first mouse click will determine the upper left corner of the TimeBox. Without releasing the mouse, drag the mouse toward the lower right corner of the TimeBox you want to create. The point where the mouse is released becomes the lower right corner of the TimeBox. As soon as the TimeBox is created the mode of user interaction switches back to "Selection" mode automatically. (The "Selection" mode is simply the default mode where the user can select curves by clicking on them.)

To select a TimeBox, click in the middle of it. A TimeBox is selected if and only if its handles appear. A selected TimeBox can be resized and moved. Click and drag the appropriate handle to resize. To move, click and drag in the middle of a TimeBox.

Users may have more than one TimeBox at a time. Simply create another one. The logic of kept (not filtered out) items will be conjunctive. In other words, only those items whose time series are kept by all TimeBoxes will be kept and all the rest will be filtered out.

To resize or move multiple TimeBoxes at the same time, select the TimeBoxes you would like to move or resize together. Then choose one of them and move or resize it. The other selected TimeBoxes will be moved or resized simultaneously.

When a TimeBox moves, the filtering is dynamic. In other words, the filtering of items will be (partially) updated while the TimeBox is moving. The update is partial because the dynamic update applies only to the QueryPanel that the moving TimeBox is in. The other QueryPanels, the overview and the items list are updated only after the move operation is complete, that is, when the user terminates the drag operation via a mouse release.

TimeBoxes don't have to be in the same QueryPanel. Users may create TimeBoxes in more than one QueryPanel. The logic of filtering is still conjunctive.
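
The conjunctive filtering logic can be summarized in a short sketch. The Python code below is an illustration under stated assumptions, not TimeSearcher's code: the TimeBox class, its field names, the use of time point indexes, and the treatment of all bounds as inclusive are hypothetical choices.

```python
from dataclasses import dataclass


@dataclass
class TimeBox:
    variable: str   # the QueryPanel (variable) the box is drawn in
    t_start: int    # first time point index covered (assumed inclusive)
    t_end: int      # last time point index covered (assumed inclusive)
    v_min: float    # lower vertical bound
    v_max: float    # upper vertical bound


def passes(box, series):
    """True if the series stays inside the box for its whole duration."""
    window = series[box.t_start : box.t_end + 1]
    return all(box.v_min <= value <= box.v_max for value in window)


def kept_items(dataset, boxes):
    """Items kept by every TimeBox; with no boxes, every item is kept."""
    return [
        item
        for item, series_by_var in dataset.items()
        if all(passes(box, series_by_var[box.variable]) for box in boxes)
    ]


dataset = {
    "A": {"price": [1, 2, 3, 4]},
    "B": {"price": [1, 5, 3, 4]},
    "C": {"price": [2, 2, 2, 9]},
}
box1 = TimeBox("price", 0, 2, 0, 3)
box2 = TimeBox("price", 3, 3, 3, 5)
print(kept_items(dataset, [box1]))        # ['A', 'C']
print(kept_items(dataset, [box1, box2]))  # ['A']
```

Because each box names its own variable, the same function covers TimeBoxes placed in different QueryPanels.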

To delete one or more TimeBoxes, select them and press the Delete key on the keyboard.

The items list shows only non-filtered items. The header of the first column shows how many items are kept of all items (see Figure 6). For instance, 20/34 would indicate that the dataset contains 34 items and currently only 20 of them are kept and the remaining 14 are filtered out.

Figure 6   Item count on items list

5.5 SearchBoxes and Pattern Search

The SearchBox is a widget for finding similar patterns in time series. The SearchBox specifies the input pattern on an existing time series by limiting the range of time points: the sub-series of the selected time series that the SearchBox covers becomes the pattern. To perform a search, select the SearchBox, adjust the tolerance, and click the SearchButton, which appears below the SearchBox. Note that the tolerance and the SearchButton appear only when the SearchBox is selected (see Figure 7).

Figure 7   SearchBox

SearchBoxes are created, deleted, moved, and resized in the same way as TimeBoxes. Please refer to the previous section for the details.

The tolerance can be adjusted either with the mouse or by pressing the + and - keys on the keyboard. The search is performed when the SearchButton is clicked. To search dynamically, that is, to rerun the search on every change of the tolerance, use the Q and W keys on the keyboard. Q and W increase and decrease the tolerance, respectively, by 1%.

The pattern search performed by the SearchBox has parameters. They can be accessed by selecting Edit->Search Options from the menu, which opens another window (see Figure 8). Four transformations are available, and each is applied when its box is checked: linear trend removal, offset translation, amplitude scaling, and noise reduction. There are two search algorithms: Envelope and Euclidean. By default, offset translation and amplitude scaling are selected as transformations, and Envelope is selected as the search algorithm.

Figure 8   Search Options

The following describes the effect of each transformation on how the searching is done.

linear trend removal
When this transformation is applied, the difference due to slopes between the pattern and each subsequence will be ignored.
offset translation
When applied, the constant difference on y values between the pattern and each subsequence will be ignored. The way this is accomplished is via transforming both the pattern and each subsequence to new series by subtracting the mean from the values.
amplitude scaling
When applied, the amplitude difference between the pattern and each subsequence will be ignored. The way this is accomplished is via transforming both the pattern and each subsequence to new series by dividing the values by the sample standard deviation of the pattern/subsequence.
noise reduction
When applied, the noise difference between the pattern and each subsequence will be ignored. The way this is accomplished is via transforming both the pattern and each subsequence to new series by replacing each value with the average of the preceding and succeeding value.
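
The four transformations can be sketched in a few lines of Python. This is a hedged illustration of the descriptions above, not TimeSearcher's code. The function names are hypothetical, and two details are assumptions the manual does not specify: linear trend removal is implemented here by subtracting a least-squares line, and noise reduction leaves the first and last values unchanged because they lack a neighbor on one side.

```python
import statistics


def offset_translation(series):
    """Subtract the mean so constant vertical offsets are ignored."""
    mean = statistics.fmean(series)
    return [value - mean for value in series]


def amplitude_scaling(series):
    """Divide by the sample standard deviation (needs a non-constant series)."""
    sd = statistics.stdev(series)
    return [value / sd for value in series]


def noise_reduction(series):
    """Replace each interior value with the mean of its two neighbors.

    Assumption: the endpoints are kept as they are.
    """
    smoothed = list(series)
    for i in range(1, len(series) - 1):
        smoothed[i] = (series[i - 1] + series[i + 1]) / 2
    return smoothed


def linear_trend_removal(series):
    """Assumption: subtract the least-squares line fitted over 0..n-1."""
    n = len(series)
    x_mean = (n - 1) / 2
    y_mean = statistics.fmean(series)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(series))
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return [y - (intercept + slope * x) for x, y in enumerate(series)]


print(offset_translation([1, 2, 3]))    # [-1.0, 0.0, 1.0]
print(noise_reduction([0, 10, 0, 10]))  # [0, 0.0, 10.0, 10]
```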

The following describes how the tolerance, and hence the search results, are affected by the choice of algorithm.

Envelope algorithm
When chosen, the tolerance value at 100% tolerance is the range (max-min) of the y values of the transformed pattern. (The tolerance value is proportional to the tolerance percentage chosen by the user.) A subsequence matches the pattern only when the difference at each pair of corresponding time points (of the pattern and the subsequence under consideration) is at most the tolerance value.
Euclidean algorithm
When chosen, the tolerance value at 100% tolerance is the range (max-min) of the y values of the transformed pattern. A subsequence matches the pattern when the Euclidean distance (the square root of the sum of the squared differences at corresponding time points) between the pattern and the subsequence is at most the tolerance value.
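
The two matching rules can be compared side by side in a small sketch. The Python code below only illustrates the definitions above. All function names are hypothetical, the transformations are left out for brevity, and the sliding-window scan over a single series is an assumption about how subsequences are enumerated.

```python
import math


def tolerance_value(pattern, percent):
    """Tolerance in y units: `percent` of the pattern's range (max - min)."""
    return (max(pattern) - min(pattern)) * percent / 100.0


def envelope_match(pattern, candidate, tol):
    """Every pointwise difference must be at most the tolerance."""
    return all(abs(p - c) <= tol for p, c in zip(pattern, candidate))


def euclidean_match(pattern, candidate, tol):
    """The Euclidean distance must be at most the tolerance."""
    distance = math.sqrt(sum((p - c) ** 2 for p, c in zip(pattern, candidate)))
    return distance <= tol


def search(pattern, series, percent, match):
    """Return the start indexes of all matching subsequences of `series`."""
    tol = tolerance_value(pattern, percent)
    m = len(pattern)
    return [
        start
        for start in range(len(series) - m + 1)
        if match(pattern, series[start : start + m], tol)
    ]


pattern = [0, 2, 0]                # range is 2, so 50% tolerance is 1.0
series = [0, 2, 0, 1, 2, 1, 5]
print(search(pattern, series, 50, envelope_match))   # [0, 3]
print(search(pattern, series, 50, euclidean_match))  # [0]
```

In this example the subsequence starting at index 3 differs from the pattern by at most 1 at every point, so it passes the Envelope test, but its Euclidean distance (the square root of 2) exceeds the same tolerance.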

When a search is performed, the starting time points of matches are indicated by red triangles on the x axis. The pattern itself is always a match, which is indicated by a green triangle (see Figure 9). The matched subsequences are drawn in red both on the QueryPanel where the search is performed and on the overview of the same variable. Pattern search is performed among all non-filtered items within the same variable. The search results disappear when the corresponding SearchBox is deleted. It is possible to have SearchBoxes on different QueryPanels. However, it is not possible to see the results of more than one SearchBox on the same variable. (As soon as the second SearchBox is selected, the results from the first SearchBox disappear.)

Figure 9   Pattern search

6. Miscellaneous

6.1 Preferences

To set preferences, select Edit->Preferences from the menu. There are 3 choices regarding which dataset to load when TimeSearcher starts: none, a dataset whose location is specified, or the last opened dataset. Note that this setting covers only the time series dataset and does not load attributes.

Another option is what the number of variables is going to be when TimeSearcher starts. This can be set in the preferences window as well.

6.2 Alternative filtering options

Besides using TimeBoxes, there is one more way to filter items. One can select the items and filter the rest by selecting "Actions"->"Filter unselected" from the menu. To clear all the effects of such actions, select "Edit"->"Undo filter unselected actions" from the menu.

6.3 Help inside TimeSearcher

Besides this manual, quick reference help is available in TimeSearcher under the Help menu. It is also accessible via the F1 key. In addition, it has a section, "Known problems and how to avoid them", that you may find useful.
