Knowledge of intrinsic workload characteristics is essential for the dynamic, goal-oriented workload control algorithms used to optimize the performance of distributed online transaction processing (OLTP) systems.
CLUE is an environment for clustering transactions according to their intrinsic workload characteristics. It uses execution traces from distributed OLTP systems to group transactions with high data affinity into utilization classes. HALC is a simple, fast heuristic algorithm developed to cope with the large volume of trace data.
CLUE's correctness was validated using synthetic trace files. HALC's speed and clustering quality were evaluated against the ISODATA and Bond Energy algorithms on real traces. The results show that HALC is exceptionally fast and that it consistently produces good-quality clusterings.
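
To make the notion of grouping transactions by data affinity concrete, the following is a minimal illustrative sketch. It is not HALC, and it is not taken from CLUE. It assumes that a trace has already been reduced to a mapping from transaction name to the set of data items that transaction accessed, that affinity can be approximated by Jaccard overlap, and that a single greedy pass with a fixed threshold is acceptable. The names `affinity`, `cluster_by_affinity`, and `threshold` are hypothetical.

```python
from typing import Dict, List, Optional, Set


def affinity(a: Set[str], b: Set[str]) -> float:
    """Jaccard overlap of two sets of data items (an assumed affinity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_by_affinity(
    trace: Dict[str, Set[str]], threshold: float = 0.5
) -> List[List[str]]:
    """Greedy single-pass grouping; an illustration only, not the HALC algorithm.

    `trace` maps a transaction name to the set of data items it accessed.
    Each transaction joins the existing cluster with the highest affinity
    if that affinity reaches `threshold`; otherwise it starts a new cluster.
    """
    members: List[List[str]] = []   # transaction names per cluster
    item_sets: List[Set[str]] = []  # union of data items per cluster

    for txn, items in trace.items():
        best_idx: Optional[int] = None
        best_score = 0.0
        for idx, cluster_items in enumerate(item_sets):
            score = affinity(items, cluster_items)
            if score >= threshold and score > best_score:
                best_idx, best_score = idx, score

        if best_idx is None:
            # No sufficiently similar cluster: open a new one.
            members.append([txn])
            item_sets.append(set(items))
        else:
            # Join the best cluster and widen its item set.
            members[best_idx].append(txn)
            item_sets[best_idx] |= items

    return members


if __name__ == "__main__":
    # Hypothetical trace data, for illustration only.
    example = {
        "T1": {"a", "b"},
        "T2": {"a", "b", "c"},
        "T3": {"x", "y"},
        "T4": {"y", "x"},
    }
    print(cluster_by_affinity(example))
```

A single pass keeps the cost proportional to the number of transactions times the number of clusters, which is one plausible way to handle large traces. The result depends on the order in which transactions are read and on the chosen threshold.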