An incremental data-stream sketch using sparse random projections

Aditya Krishna Menon, Gia Vinh Anh Pham, Sanjay Chawla, Anastasios Viglas

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

6 Scopus citations

Abstract

We propose the use of random projections with a sparse matrix to maintain a sketch of a collection of high-dimensional data-streams that are updated asynchronously. This sketch allows us to estimate L2 (Euclidean) distances and dot products with high accuracy. We verify the validity of this sketch by applying it to an online clustering problem, where we compare our results to the offline algorithm and an existing L2 sketch, and observe comparable results in terms of accuracy, and a reduced runtime cost.
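
As a rough illustration of the idea in the abstract, the following is a minimal Python sketch, not the authors' implementation. It assumes a projection matrix whose entries are +sqrt(3/k), 0, and -sqrt(3/k) with probabilities 1/6, 2/3, and 1/6 (a common sparse choice; the abstract does not state which distribution the paper uses). The class name `SparseProjectionSketch`, its methods, and the parameters `d`, `k`, and `seed` are hypothetical names chosen for this example. Each update "component `index` of a stream changes by `delta`" touches only the nonzero entries of one matrix column, so the sketch can be maintained as updates arrive in any order.

```python
import math
import random
from collections import defaultdict


class SparseProjectionSketch:
    """Illustrative incremental sketch of several d-dimensional streams."""

    def __init__(self, d, k, seed=0):
        rng = random.Random(seed)
        self.k = k
        scale = math.sqrt(3.0 / k)
        # columns[i] holds the nonzero (row, value) pairs of column i of the
        # k x d projection matrix. Assumed distribution: each entry is
        # +scale with prob. 1/6, -scale with prob. 1/6, and 0 with prob. 2/3.
        self.columns = []
        for _ in range(d):
            col = []
            for row in range(k):
                u = rng.random()
                if u < 1.0 / 6.0:
                    col.append((row, scale))
                elif u < 2.0 / 6.0:
                    col.append((row, -scale))
            self.columns.append(col)
        # One k-dimensional sketch per stream, created on first use.
        self.sketches = defaultdict(lambda: [0.0] * k)

    def update(self, stream_id, index, delta):
        """Component `index` of stream `stream_id` changes by `delta`."""
        sketch = self.sketches[stream_id]
        for row, value in self.columns[index]:
            sketch[row] += delta * value

    def distance(self, a, b):
        """Estimate the L2 distance between streams a and b."""
        sa, sb = self.sketches[a], self.sketches[b]
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(sa, sb)))

    def dot(self, a, b):
        """Estimate the dot product of streams a and b."""
        sa, sb = self.sketches[a], self.sketches[b]
        return sum(x * y for x, y in zip(sa, sb))
```

Because the projection is linear, applying updates one at a time yields the same sketch as projecting the final vector in a single pass, which is what makes asynchronous updates possible. With the assumed distribution, about two thirds of the matrix entries are zero, so each update costs roughly k/3 additions rather than k.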

Original language: English
Title of host publication: Proceedings of the 7th SIAM International Conference on Data Mining
Publisher: Society for Industrial and Applied Mathematics Publications
Pages: 563-568
Number of pages: 6
ISBN (Print): 9780898716306
State: Published - 2007
Externally published: Yes
Event: 7th SIAM International Conference on Data Mining - Minneapolis, MN, United States
Duration: Apr 26 2007 → Apr 28 2007

Publication series

Name: Proceedings of the 7th SIAM International Conference on Data Mining

Conference

Conference: 7th SIAM International Conference on Data Mining
Country/Territory: United States
City: Minneapolis, MN
Period: 04/26/07 → 04/28/07
