Subject: Re: [ublas] sparse vector usage example
From: Jose (jmalv04_at_[hidden])
Date: 2008-12-13 11:26:17
Cosine similarity measures how similar two vectors are by taking the
cosine of the angle between them.
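
To make the definition concrete, here is a minimal sketch in plain standard C++ (not uBLAS code). It assumes a sparse vector is stored as (index, value) pairs sorted by ascending index; the names SparseVec, sparse_dot, sparse_norm and cosine_similarity are made up for this illustration. With uBLAS, inner_prod and norm_2 applied to a compressed_vector should play the same roles.

```cpp
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical sparse vector: (index, value) pairs, sorted by ascending index,
// holding only the non-zero components.
using SparseVec = std::vector<std::pair<std::size_t, double>>;

// Dot product: walk both sorted index lists and multiply where indices match.
double sparse_dot(const SparseVec& a, const SparseVec& b) {
    double sum = 0.0;
    std::size_t i = 0, j = 0;
    while (i < a.size() && j < b.size()) {
        if (a[i].first < b[j].first) {
            ++i;
        } else if (b[j].first < a[i].first) {
            ++j;
        } else {
            sum += a[i].second * b[j].second;
            ++i;
            ++j;
        }
    }
    return sum;
}

// Euclidean (2-)norm of the stored components.
double sparse_norm(const SparseVec& v) {
    double sum = 0.0;
    for (const auto& e : v) sum += e.second * e.second;
    return std::sqrt(sum);
}

// cos(angle) = dot(a, b) / (|a| * |b|); defined here as 0 if either vector is all zero.
double cosine_similarity(const SparseVec& a, const SparseVec& b) {
    const double na = sparse_norm(a);
    const double nb = sparse_norm(b);
    if (na == 0.0 || nb == 0.0) return 0.0;
    return sparse_dot(a, b) / (na * nb);
}
```

The cost is proportional to the number of stored components in the two vectors, not to the nominal dimension.
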
I am interested in using ublas for information retrieval and I assume
some people must already have tackled this problem (with ublas).
This requires sparse vectors, e.g. with a dimension of 10 million, where
a typical doc may have 1000 words (= a sparse vector with 1000 non-zero
components).
So the real need is to compute the normalized dot product (cosine
similarity) of the query vector with a large number of sparse vectors and
rank the results (using tf-idf weights - see wikipedia link above).
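
The ranking step could look roughly like the sketch below, again in plain standard C++ rather than uBLAS. It assumes the tf-idf weights have already been computed and that every vector is scaled to unit length once up front, so that the cosine reduces to a plain dot product. The names Weights, normalize, dot and rank_by_cosine are hypothetical.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical tf-idf vector: (term index, weight) pairs sorted by term index.
using Weights = std::vector<std::pair<std::size_t, double>>;
using Score = std::pair<std::size_t, double>;  // (document number, score)

// Scale a vector to unit length in place; an all-zero vector is left unchanged.
void normalize(Weights& v) {
    double sum = 0.0;
    for (const auto& e : v) sum += e.second * e.second;
    if (sum == 0.0) return;
    const double inv = 1.0 / std::sqrt(sum);
    for (auto& e : v) e.second *= inv;
}

// Dot product of two index-sorted sparse vectors.
double dot(const Weights& a, const Weights& b) {
    double sum = 0.0;
    std::size_t i = 0, j = 0;
    while (i < a.size() && j < b.size()) {
        if (a[i].first < b[j].first) {
            ++i;
        } else if (b[j].first < a[i].first) {
            ++j;
        } else {
            sum += a[i].second * b[j].second;
            ++i;
            ++j;
        }
    }
    return sum;
}

// Score every (unit-length) document against the (unit-length) query
// and return the documents ordered from best to worst match.
std::vector<Score> rank_by_cosine(const Weights& unit_query,
                                  const std::vector<Weights>& unit_docs) {
    std::vector<Score> ranked;
    ranked.reserve(unit_docs.size());
    for (std::size_t d = 0; d < unit_docs.size(); ++d) {
        ranked.emplace_back(d, dot(unit_query, unit_docs[d]));
    }
    std::stable_sort(ranked.begin(), ranked.end(),
                     [](const Score& x, const Score& y) { return x.second > y.second; });
    return ranked;
}
```

If only the best few matches are needed, std::partial_sort could replace the full sort.
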
I've quickly looked at your page and I am planning to study everything
in it but I wanted to ask the list before reinventing the wheel.
On Sat, Dec 13, 2008 at 3:32 PM, Gunter Winkler <guwi17_at_[hidden]> wrote:
> On Friday, 12 December 2008 20:03, Jose wrote:
>> Is there an example of computing the cosine similarity of two sparse
>> vectors ?
> What is cosine similarity? I have never seen such a thing in uBLAS.
> PS: There are some other examples on my page: http://www.guwi17.de/ublas