CategoryMapper Tutorial

CategoryMapper is a way to reduce the number of distinct categories in your data.  Users can select a set of categories and map them into a new category (that is, aggregate them).  Here we use a sample traffic data set to show CategoryMapper's features.  The categories are listed on the left (the map source side), and CategoryMapper tells us that there are 1919 different categories.  We see that there are many Agency-related categories (that is, agencies are notified when an accident is reported).  We may be interested only in when agencies are notified, and not so much in which agency, so we want to aggregate all "Agency....NOTIFIED" categories into a single one.

How to Aggregate Categories

First, we use the input field on the map source side to select every category that starts with "Agency" and ends with "NOTIFIED" by entering the regular expression "Agency.*NOTIFIED".  This regular expression means "starts with 'Agency', followed by any characters of any length, and ends with 'NOTIFIED'".  On its own, ".*" matches any sequence of characters, of any length, including none.  For details of the Java regular expression constructs and syntax, see the documentation of the java.util.regex.Pattern class.  CategoryMapper tells us that there are now 525/1919 categories visible.  We select them all by using the [Select All] button, and we type the map target category "Agency NOTIFIED" on the right.
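The filtering step can be reproduced outside the tool to check a pattern before using it.  The following is a minimal sketch, not CategoryMapper's own code: it assumes the filter must match the whole category name (as the description above suggests), it uses Python's re module as a stand-in for Java's regular expressions, and the category names and the visible_categories helper are invented for illustration.

```python
import re

# Hypothetical category names, invented for illustration only.
categories = [
    "Agency-1115 - NOTIFIED",
    "Agency-1115 - DEPARTED",
    "Agency-2207 - NOTIFIED",
    "Incident - CLEARED",
]

def visible_categories(pattern, names):
    """Return the names that match the pattern from start to end.

    Assumption: the filter is a whole-string match, which is what
    re.fullmatch does here.
    """
    compiled = re.compile(pattern)
    return [name for name in names if compiled.fullmatch(name)]

visible = visible_categories("Agency.*NOTIFIED", categories)
print(f"{len(visible)}/{len(categories)} categories visible")
for name in visible:
    print(name)
```

With the pattern ".*" the same helper returns every name, which is how the tutorial later lists all categories.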

After clicking [Map to Target ->] at the bottom, the mapping is created.
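Conceptually, the result of this step is a lookup from each selected source category to the single target category.  The sketch below illustrates that idea under stated assumptions: the dictionary representation, the map_to_target helper, and the sample category names are hypothetical and are not taken from CategoryMapper itself.

```python
import re

# Hypothetical category names, invented for illustration only.
categories = [
    "Agency-1115 - NOTIFIED",
    "Agency-1115 - DEPARTED",
    "Agency-2207 - NOTIFIED",
    "Incident - CLEARED",
]

def map_to_target(mapping, names, pattern, target):
    """Record source -> target for every name that fully matches the pattern."""
    compiled = re.compile(pattern)
    for name in names:
        if compiled.fullmatch(name):
            mapping[name] = target
    return mapping

# Aggregate every "Agency...NOTIFIED" category into one target category.
mapping = map_to_target({}, categories, "Agency.*NOTIFIED", "Agency NOTIFIED")
for source, target in mapping.items():
    print(f"{source} -> {target}")
```

Categories that do not match the pattern are simply absent from the dictionary, which corresponds to their remaining unmapped.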

Filtering by Mapping

At this point, we have mapped all categories that begin with "Agency" and end with "NOTIFIED" to "Agency NOTIFIED".  But other categories remain unmapped.  We can clear the map source text field, enter ".*" (show every category), and use the drop-down box to show only unmapped categories.  Here we see that there are 1394 unmapped categories.  A user can now select all of these categories and click [Map as is] at the bottom to map them to themselves (that is, mapping "Agency-1115 -  DEPARTED" to "Agency-1115 -  DEPARTED").  This option is meant for categories that do not require aggregation, although in this case another round of mapping similar to "Agency.*NOTIFIED" would probably be more useful.
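The count of 1394 is the 1919 total categories minus the 525 already mapped.  The sketch below shows the same bookkeeping on a tiny example, again assuming a plain dictionary for the mapping; the unmapped and map_as_is helpers and the category names are hypothetical.

```python
# Hypothetical category names, invented for illustration only.
categories = [
    "Agency-1115 - NOTIFIED",
    "Agency-1115 - DEPARTED",
    "Agency-2207 - NOTIFIED",
    "Incident - CLEARED",
]

# Mapping assumed to exist already from the aggregation step.
mapping = {
    "Agency-1115 - NOTIFIED": "Agency NOTIFIED",
    "Agency-2207 - NOTIFIED": "Agency NOTIFIED",
}

def unmapped(names, mapping):
    """Return the names that have no entry in the mapping yet."""
    return [name for name in names if name not in mapping]

def map_as_is(mapping, names):
    """Map each name to itself, leaving existing mappings untouched."""
    for name in names:
        mapping.setdefault(name, name)
    return mapping

remaining = unmapped(categories, mapping)
print(f"{len(remaining)} unmapped categories")

map_as_is(mapping, remaining)
print(f"{len(unmapped(categories, mapping))} unmapped categories after mapping as is")
```

Using setdefault means that mapping a category as is never overwrites an aggregation made earlier.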

One can also filter for categories that have already been mapped by choosing the Mapped option in the drop-down box.  One can then show only categories that have been mapped at least n times by adjusting the spinner.

Exporting the Mapped Input File

Finally, when every category is satisfactorily mapped, you can save a copy of the input file with the new mappings by choosing File->Export Data as Lifelines2 Format from the menu.  Optionally, the export can preserve each original category as an annotation.  To use this option, check the "Use the original category as annotation" checkbox.
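The effect of the export option can be illustrated with a small sketch.  It does not reproduce the Lifelines2 file format: the event records, their field names ("record", "category", "time", "annotation"), and the apply_mapping helper are all hypothetical, and leaving an unmapped category unchanged is an assumption made only for this example.

```python
def apply_mapping(events, mapping, keep_original=False):
    """Return copies of the events with their categories replaced.

    Assumption: a category without a mapping is left unchanged.
    When keep_original is True, the original category is stored in a
    hypothetical "annotation" field.
    """
    exported = []
    for event in events:
        original = event["category"]
        new_event = dict(event)  # copy, so the input is not modified
        new_event["category"] = mapping.get(original, original)
        if keep_original:
            new_event["annotation"] = original
        exported.append(new_event)
    return exported

# Hypothetical events and mapping, invented for illustration only.
events = [
    {"record": "A1", "category": "Agency-1115 - NOTIFIED", "time": "2008-01-05 10:15"},
    {"record": "A1", "category": "Incident - CLEARED", "time": "2008-01-05 11:40"},
]
mapping = {"Agency-1115 - NOTIFIED": "Agency NOTIFIED"}

for event in apply_mapping(events, mapping, keep_original=True):
    print(event)
```

Keeping the original category alongside the new one means the aggregation can be inspected, or undone, from the exported data alone.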

 
