stored vector revised #86

Open
xyyimian opened this issue May 3, 2019 · 2 comments
xyyimian commented May 3, 2019

I wanted to compute the recall rate using NearPy, but I found that the vectors I stored had been modified.

Here is what I did.
I stored a batch of vectors in the engine and then ran:
recall_list = engine.neighbours(vectors[0])
I printed recall_list[0]. Its third element, the distance, is 0.0, which shows that recall_list[0] corresponds to vectors[0].
But when I compared the individual element values, I found that the returned vector differs from the one I stored.

That is why I cannot look up the recalled vector in my original vectors.
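One possible workaround for the lookup problem is to attach each vector's original index as a payload when storing it, and read the index back from the result instead of comparing element values. The sketch below does not call NearPy. It assumes results are tuples of the form (stored vector, payload, distance), which matches the distance being the third element as described above. The payload slot, the `simulated_results` list, and the helper `original_of` are illustrative assumptions.

```python
import math


def unit(v):
    """Scale v to unit length (assumes v is non-zero)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


# The caller's original, unnormalized vectors.
vectors = [[3.0, 4.0], [1.0, 0.0], [0.0, 2.0]]

# Stand-in for a query result on vectors[0]. Assumed tuple layout:
# (stored vector, payload, distance). The stored vector is the normalized
# copy, and the payload is the original index supplied at store time.
simulated_results = [(unit(vectors[0]), 0, 0.0)]


def original_of(result, originals):
    """Recover the caller's original vector from the index payload."""
    _stored_vector, index, _distance = result
    return originals[index]


recovered = original_of(simulated_results[0], vectors)
```

With this approach the returned vector's element values no longer matter for the lookup, so recall can be computed from indices alone.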

I don't know what's wrong. Thanks for your reply.

xyyimian changed the title from "stored vector being revised" to "stored vector revised" on May 3, 2019
xyyimian commented May 5, 2019

I found out why the vectors are modified: the library uses unitvec to normalize vectors before storing them. I think this will cause bugs for users who want the Euclidean metric.

I ran a test. I stored [1,1,1.1] and [0.1,0.1,0.1] in the engine and queried with [1,1,1]. The nearest neighbor returned was the normalized [0.1,0.1,0.1].
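The effect in this test can be reproduced without the library. The sketch below is plain Python and does not use NearPy. The helper names `unit`, `euclidean`, and `nearest` are made up for illustration, and it assumes the query is normalized in the same way as the stored vectors.

```python
import math


def unit(v):
    """Scale v to unit length (assumes v is non-zero)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def euclidean(a, b):
    """Plain Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(query, stored):
    """Index of the stored vector closest to query (brute force)."""
    return min(range(len(stored)), key=lambda i: euclidean(query, stored[i]))


stored = [[1.0, 1.0, 1.1], [0.1, 0.1, 0.1]]
query = [1.0, 1.0, 1.0]

# Raw vectors: [1, 1, 1.1] is only about 0.1 away from the query.
raw_winner = nearest(query, stored)

# Unit-normalized vectors: [0.1, 0.1, 0.1] points in exactly the same
# direction as the query, so after normalization it sits on top of it.
normalized_winner = nearest(unit(query), [unit(v) for v in stored])
```

On the raw vectors the winner is index 0, while after normalization it is index 1, because normalization discards magnitude and the Euclidean ranking then depends only on direction.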

pixelogik (Owner) commented

@xyyimian You are right. For Euclidean distance, the normalization is simply a bug. It probably went undetected because the default is cosine and most people use cosine anyway.

We should add a new function to the Distance class, called normalize_vector(), for the Engine to use. Do you want to do this? If not, I will do it at some point in the future.
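A minimal sketch of how such a hook might look, so that each distance decides whether vectors get normalized. This is not NearPy's actual code. The classes below are simplified, hypothetical stand-ins written in plain Python, and `TinyEngine` is a brute-force placeholder for the real Engine with no hashing. Only the method name normalize_vector() comes from the proposal above.

```python
import math


class Distance:
    """Hypothetical base class standing in for the library's Distance."""

    def distance(self, x, y):
        raise NotImplementedError

    def normalize_vector(self, v):
        # Default: leave the vector untouched.
        return list(v)


class EuclideanDistance(Distance):
    def distance(self, x, y):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


class CosineDistance(Distance):
    def normalize_vector(self, v):
        # Cosine ignores magnitude, so unit-normalizing is harmless here.
        norm = math.sqrt(sum(a * a for a in v))
        return [a / norm for a in v] if norm > 0 else list(v)

    def distance(self, x, y):
        # Assumes both vectors are non-zero.
        dot = sum(a * b for a, b in zip(x, y))
        nx = math.sqrt(sum(a * a for a in x))
        ny = math.sqrt(sum(b * b for b in y))
        return 1.0 - dot / (nx * ny)


class TinyEngine:
    """Hypothetical brute-force stand-in for the Engine (no hashing)."""

    def __init__(self, distance):
        self.distance = distance
        self.items = []

    def store_vector(self, v, data=None):
        # The distance, not the engine, decides how to normalize.
        self.items.append((self.distance.normalize_vector(v), data))

    def neighbours(self, v):
        q = self.distance.normalize_vector(v)
        scored = [(sv, data, self.distance.distance(sv, q))
                  for sv, data in self.items]
        return sorted(scored, key=lambda r: r[2])


def demo(distance):
    """Repeat the test from this thread and return the nearest result."""
    engine = TinyEngine(distance)
    engine.store_vector([1.0, 1.0, 1.1], "a")
    engine.store_vector([0.1, 0.1, 0.1], "b")
    return engine.neighbours([1.0, 1.0, 1.0])[0]


euclid_best = demo(EuclideanDistance())
cosine_best = demo(CosineDistance())
```

In this sketch the Euclidean engine returns [1, 1, 1.1] unmodified as the nearest neighbor, while the cosine engine still normalizes and returns the vector stored with payload "b".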
