ACE Algorithm: High-Speed and Low-Memory Anomaly Detection via Cache Lookups.

(Less than 4MB memory and 60x faster than state of art Unsupervised Anomaly Detection Algorithms)

Anomaly detection is one of the frequent and important subroutines deployed in large-scale data processing systems. We propose ACE (Arrays of (locality-sensitive) Count Estimators) algorithm that can be 60x faster than the ELKI package, which has the fastest implementation of the unsupervised anomaly detection algorithms. ACE algorithm requires less than 4MB memory, to dynamically compress the full data information into a set of count arrays. These tiny 4MB arrays of counts are sufficient for unsupervised anomaly detection. At the core of the ACE algorithm, there is  a novel statistical estimator which is derived from the sampling view of Locality Sensitive Hashing(LSH). We show the superiority of ACE algorithm over 11 popular baselines on 3 benchmark datasets, including the KDD-Cup99 data which is the largest available benchmark comprising of more than half a million entries with ground truth anomaly labels.

Related Papers

 

Arrays of (locality-sensitive) Count Estimators (ACE): High-Speed Anomaly Detection via Cache Lookups [pdf]

Chen Luo and Anshumali Shrivastava. (Arxiv  2017)

Code Coming Soon

 

Time Adaptive Sketches (Ada-Sketches) for High-Speed Temporal-Data Mining

A generalization of standard sketching algorithms for massive data stream which are aware of time. The basic idea is borrowed from a neat trick of the 1960s used in Dolby Noise reductions.

 

Nicely explained on Adrian Colyer's blog.

 

Thanks Adrian!

Related Papers

 

Time Adaptive Sketches (Ada-Sketches) for Summarizing Data Streams [pdf]

Anshumali Shrivastava, Christian Konig, Misha Bilenko. (SIGMOD 2016)

Download Code

Website by: Pallavi

Contact

Anshumali at rice dot edu

3118 Duncan Hall

Rice University