Structure-Aware Data Consolidation

Shihao Wu¹, Peter Bertholet¹, Hui Huang², Daniel Cohen-Or⁴, Minglun Gong³, Matthias Zwicker¹
¹University of Bern, ²Shenzhen VisuCA Key Lab/SIAT, ³Memorial University of Newfoundland, ⁴Tel Aviv University

In IEEE Transactions on Pattern Analysis and Machine Intelligence 2017

A challenging clustering problem with two intertwined clusters corrupted by noise. After the input data (left) are projected onto the underlying structure by our structure-aware filtering (SAF) approach (red points, middle), spectral clustering or DBSCAN (right) detects the proper clusters.


We present a structure-aware technique to consolidate noisy data, which we use as a preprocessing step for standard clustering and dimensionality reduction. Our technique is related to mean shift, but instead of seeking density modes, it reveals and consolidates continuous high-density structures, such as curves and surface sheets, in the underlying data while ignoring noise and outliers. We provide a theoretical analysis under some assumptions, and show that our approach significantly improves the performance of many non-linear dimensionality reduction and clustering algorithms in challenging scenarios.
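
To make the "filter first, then cluster" idea concrete, here is a minimal sketch of a plain Gaussian mean-shift filter applied to noisy samples of a straight line. It is not the SAF method described above: it is ordinary mean shift, run for only a few iterations, and the function names, bandwidth, noise level, and iteration count are all illustrative assumptions. If iterated to convergence, plain mean shift would contract points toward density modes, which is the behavior the abstract says SAF is designed to avoid.

```python
import math
import random


def mean_shift_step(queries, data, bandwidth):
    """Move each query point to the Gaussian-weighted mean of the fixed data.

    Plain mean shift, used here only for illustration (not the SAF method).
    """
    shifted = []
    for qx, qy in queries:
        wsum = sx = sy = 0.0
        for dx, dy in data:
            dist2 = (qx - dx) ** 2 + (qy - dy) ** 2
            w = math.exp(-dist2 / (2.0 * bandwidth ** 2))
            wsum += w
            sx += w * dx
            sy += w * dy
        shifted.append((sx / wsum, sy / wsum))
    return shifted


def mean_abs_y(points):
    """Average distance of the points from the line y = 0."""
    return sum(abs(y) for _, y in points) / len(points)


# Hypothetical toy data: a horizontal line segment corrupted by Gaussian noise.
rng = random.Random(0)
noisy = [(rng.uniform(0.0, 10.0), rng.gauss(0.0, 0.3)) for _ in range(200)]

# A few filtering iterations; the original noisy samples stay fixed as the
# density estimate while the filtered copies move.
filtered = noisy
for _ in range(5):
    filtered = mean_shift_step(filtered, noisy, bandwidth=0.5)

print("mean |y| before:", mean_abs_y(noisy))
print("mean |y| after: ", mean_abs_y(filtered))
```

In a full pipeline, the filtered points would then be passed to a standard method such as spectral clustering or DBSCAN in place of the raw input.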

Additional Information