Posted by J Singh
| Filed under
Clustering,
Locality Sensitive Hashing,
OpenLSH,
Big Data,
K-Means
Join Us for a discussion on Clustering Big Data
When: Thursday June 4, noon – 1:00 pm EST.
Where: (Virtual Meeting)
Contact Us for coordinates.
Description: Approximate Nearest Neighbor methods for clustering and indexing have been actively researched ever since the K-Means algorithm was published in 1975 (and coded in FORTRAN). A recent book lists about 300 variants and related topics.
The 50th Anniversary issue of Communications of the ACM in 2008 cited two pieces of "Breakthrough Research". One was MapReduce; the other was clustering based on Locality Sensitive Hashing (LSH). LSH is designed for large data sets and alleviates many of the issues seen with k-means. Want to check whether a body of code is remarkably similar to a public GitHub repo? Want to find "similar" fragments of DNA that are common to several species? LSH will get you there faster than most other techniques.
The talk will demonstrate OpenLSH, an open source implementation of LSH we have been working on.
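For readers who want a feel for the idea before the talk, here is a minimal sketch of one common LSH scheme: MinHash signatures with banding, applied to text. It is an illustration only and is not taken from OpenLSH. The function names (`shingles`, `minhash_signature`, `candidate_pairs`) and the parameters (`k`, `num_hashes`, `bands`) are assumptions made up for this example.

```python
import hashlib
from collections import defaultdict
from itertools import combinations


def shingles(text, k=5):
    """Return the set of overlapping k-character substrings of text."""
    text = " ".join(text.split())  # collapse runs of whitespace
    return {text[i:i + k] for i in range(max(1, len(text) - k + 1))}


def _hash(seed, shingle):
    """Deterministic 64-bit hash of a shingle, varied by an integer seed."""
    data = f"{seed}:{shingle}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(items, num_hashes=32):
    """For each seeded hash function, keep the minimum hash over the set."""
    return [min(_hash(seed, s) for s in items) for seed in range(num_hashes)]


def candidate_pairs(docs, num_hashes=32, bands=16):
    """Return pairs of document names that share at least one LSH bucket.

    The signature is cut into `bands` slices; two documents become a
    candidate pair if any slice matches exactly. Similar documents are
    likely to collide, dissimilar ones are not, so only candidates need
    a full comparison.
    """
    rows = num_hashes // bands
    buckets = defaultdict(set)
    for name, text in docs.items():
        sig = minhash_signature(shingles(text), num_hashes)
        for b in range(bands):
            key = (b, tuple(sig[b * rows:(b + 1) * rows]))
            buckets[key].add(name)
    pairs = set()
    for names in buckets.values():
        for a, b in combinations(sorted(names), 2):
            pairs.add((a, b))
    return pairs
```

Calling `candidate_pairs({"a": text_a, "b": text_b, ...})` returns a set of name pairs worth comparing in detail. Texts that are identical after whitespace normalization always land in the same buckets; texts with partial overlap collide with a probability that rises with their Jaccard similarity and can be tuned through `num_hashes` and `bands`.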
Speaker Bios: Dr. J Singh is a Principal at DataThinks.org, a Cloud and Big Data consulting company. He is a frequent speaker on NoSQL, Hadoop/MapReduce, and social media analytics. He is the originator ...
Thu 28 May 2015
Posted by J Singh
| Filed under
clustering,
k-means,
lsh,
data mining,
locality sensitive hashing,
pattern matching,
data analytics
Presentation at Pivotal IO Meetup in New York
Tue 17 March 2015