Parallel Text Search
This application is a trivial parallelization of agrep program from University of Arizona. agrep is capable of partial match and approximate searches . Our parallel version, called pgrep , can operate in one of two modes. In the first mode, the list of files to be searched is partitioned, round-robin, among the processors. In the second mode, each input file is block-partitioned among the processors, so that each processor searches a distinct portion of the file. This application performs I/O using synchronous read() operations.
To drive this application, we used the "/users" document hierarchy on the University of Maryland web server. The hierarchy contained 475~MB in 18,655 files.
We used a complex pattern including wildcards, disjunction and substitution:
You can download the trace files in the following formats:
 Sun Wu and Udi Manber. agrep - a fast approximate pattern-matching tool
. In USENIX Conference Proceedings, pages 153-162, San Fransisco, CA, Winter 1992.
Last updated on Tue May 27 12:37:44 EDT 1997 by Mustafa Uysal (firstname.lastname@example.org ).
Last Updated: 03/01/99