Data Mining


Description

This application extracts association rules from retail data, in particular the buying patterns that characterize the shopping behavior of retail customers. It performs I/O using synchronous read() operations. A detailed description of the application can be found in:

Andreas Mueller. Fast Sequential and Parallel Algorithms for Association Rule Mining: A Comparison. Technical Report, CS-TR-3515, University of Maryland, College Park, August 1995.
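The synchronous read() access pattern mentioned in the description can be illustrated with a minimal sketch. It is not taken from the traced application: the function names, the 64 KB request size, and the temporary demo file are all assumptions made for illustration. Each os.read() call blocks until the data is returned, so the program issues one request at a time.

```python
import os
import tempfile


def read_sequentially(path, chunk_size=64 * 1024):
    """Read a file front to back with blocking os.read() calls.

    Returns (total_bytes, number_of_read_calls). chunk_size is an assumed
    request size, not a value taken from the traced application.
    """
    total = 0
    calls = 0
    # O_BINARY only exists on Windows; fall back to 0 elsewhere.
    flags = os.O_RDONLY | getattr(os, "O_BINARY", 0)
    fd = os.open(path, flags)
    try:
        while True:
            buf = os.read(fd, chunk_size)  # blocks until data arrives or EOF
            if not buf:
                break  # an empty result signals end of file
            total += len(buf)
            calls += 1
    finally:
        os.close(fd)
    return total, calls


def demo():
    """Write a small hypothetical data file and read it back synchronously."""
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"x" * 200_000)
        path = f.name
    try:
        return read_sequentially(path)
    finally:
        os.remove(path)
```

Calling demo() returns the number of bytes read and the number of read requests issued, which is the kind of per-request information an I/O trace records.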

Input Dataset

We used a database of 50 million transactions, with an average transaction size of 10 items and a maximal potentially frequent set size of 3. The synthetic data was generated according to the following retail data model:

R. Agrawal and R. Srikant. Fast Algorithms for Mining Association Rules in Large Databases. Proc. of 20th Int'l Conf. on Very Large Databases (VLDB), Santiago, Chile, September 1994.

The dataset for this program totaled 4 GB and was partitioned into 8 files, one per processor.
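The figures above imply some rough per-file and per-transaction sizes. The arithmetic below is a back-of-the-envelope sketch under two assumptions that the text does not state: that "4 GB" means 4 x 2^30 bytes, and that the data is spread evenly across the 8 files.

```python
TRANSACTIONS = 50_000_000      # from the dataset description
AVG_ITEMS = 10                 # average transaction size, in items
DATASET_BYTES = 4 * 2**30      # assumption: "4 GB" means 4 GiB
FILES = 8                      # one file per processor

# Assumption: the partitioning is even across files.
bytes_per_file = DATASET_BYTES // FILES
transactions_per_file = TRANSACTIONS // FILES

# Average on-disk footprint, including any per-transaction overhead.
bytes_per_transaction = DATASET_BYTES / TRANSACTIONS
bytes_per_item = bytes_per_transaction / AVG_ITEMS

print(f"{bytes_per_file / 2**20:.0f} MiB per file")
print(f"{transactions_per_file:,} transactions per file")
print(f"{bytes_per_transaction:.1f} bytes per transaction")
print(f"{bytes_per_item:.1f} bytes per item")
```

Under these assumptions each processor reads about 512 MB holding 6.25 million transactions, at roughly 86 bytes per transaction.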

Workload

We used a "Find all rules" query, which extracts all possible association rules from the transaction database.
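To make the idea of extracting all rules concrete, here is a small brute-force sketch on toy data. It is not the algorithm used by the application or described in the cited report. The toy transactions, the function names, and the minimum support and confidence thresholds are all assumptions chosen for illustration. A rule X -> Y is reported when the itemset X ∪ Y appears in enough transactions (support) and Y appears often enough among the transactions that contain X (confidence).

```python
from itertools import combinations

# Hypothetical toy transactions; the real workload has 50 million of them.
TOY = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support."""
    n = len(transactions)
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        found = False
        for candidate in combinations(items, size):
            cset = frozenset(candidate)
            support = sum(cset <= t for t in transactions) / n
            if support >= min_support:
                result[cset] = support
                found = True
        if not found:
            break  # no frequent set of this size, so none can be larger
    return result


def all_rules(transactions, min_support, min_confidence):
    """Return (lhs, rhs, support, confidence) for every qualifying rule."""
    freq = frequent_itemsets(transactions, min_support)
    rules = []
    for itemset, support in freq.items():
        for k in range(1, len(itemset)):
            for lhs_items in combinations(sorted(itemset), k):
                lhs = frozenset(lhs_items)
                # Every subset of a frequent itemset is itself frequent,
                # so freq[lhs] is always present.
                confidence = support / freq[lhs]
                if confidence >= min_confidence:
                    rules.append((lhs, itemset - lhs, support, confidence))
    return rules


for lhs, rhs, support, confidence in all_rules(TOY, 0.5, 0.6):
    print(sorted(lhs), "->", sorted(rhs), f"support={support:.2f}", f"confidence={confidence:.2f}")
```

The brute-force enumeration scans the transactions once per candidate itemset, which is only workable on toy inputs. Algorithms such as those compared in the cited report exist to reduce the number of candidates and passes over the data.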

Traces

You can download the trace files in the following formats:

----------------------------------------------------------------------
Last updated on Tue May 27 12:37:44 EDT 1997 by Mustafa Uysal (uysal@cs.umd.edu).