Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implemented Various Distance Metric #107

Open
wants to merge 30 commits into
base: main
Choose a base branch
from

Conversation

Modernbeast02
Copy link
Contributor

Resolved Issue #95

Screenshot 2023-02-14 at 7 45 24 PM

Copy link
Collaborator

@uttammittal02 uttammittal02 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Add docs for distance metrics
  • Resolve the conflicts
  • Shift all the files from neighbours to metrics

Copy link
Collaborator

@uttammittal02 uttammittal02 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Few more changes

Copy link
Collaborator

@uttammittal02 uttammittal02 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add the docs for distance metrics bruh

Copy link
Collaborator

@uttammittal02 uttammittal02 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Last change ig

int n = x.size();
for (int i = 0; i < n; i++)
{
distance += std::pow(x[i] - y[i], power);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use std::abs(x[i] - y[i]) instead.
Otherwise there might be a problem in case of odd values of p.

Copy link
Collaborator

@uttammittal02 uttammittal02 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lgtm

## Manhattan Distance
Manhattan Distance is the sum of absolute differences between points across all the dimensions.

## Manhattan Distance
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change distance name to minkowski

Minkowski Distance is the generalized form of Euclidean and Manhattan Distance.

## Cosine Similarity
Cosine similarity is a metric, helpful in determining, how similar the data objects are irrespective of their size.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add sentence like

lesser is the cosine similarity, more similar are two points.

for all four metric.

#include "distance_metric.hpp"

template<class T>
DistanceMetric<T>::DistanceMetric(std::vector<T> &x, std::vector<T> &y)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
DistanceMetric<T>::DistanceMetric(std::vector<T> &x, std::vector<T> &y)
DistanceMetric<T>::DistanceMetric(const std::vector<T> &x, const std::vector<T> &y)

We don't want user to feel insecure about their data. Use const. (No need to update in .hpp file)

const in cpp but not in hpp?

.hpp file

void f(int);

.cpp file

void f(const int param1) {
   return;
}

is valid and recommended.

Comment on lines 10 to 13
DistanceMetric<T>::DistanceMetric(std::vector<T> &x, std::vector<T> &y)
{
this->x = x;
this->y = y;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
DistanceMetric<T>::DistanceMetric(std::vector<T> &x, std::vector<T> &y)
{
this->x = x;
this->y = y;
DistanceMetric<T>::DistanceMetric(std::vector<T> &x, std::vector<T> &y) : x(x), y(y)
{

Use initialiser list, they are faster.

Comment on lines +13 to +14
* @param x
* @param y
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add short statement describing x and y


/**
* @param power The order of the norm
* @returns minkowski distance between the two vectors
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* @returns minkowski distance between the two vectors
* @returns minkowski distance between the two vectors
* @throws ...

power < 1 is not allowed..


template<class T> double DistanceMetric<T>::magnitude(std::vector<T> &x)
{
double result = 0;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
double result = 0;
long double result = 0;

double might overflow


template<class T> double DistanceMetric<T>::cosineSimilarity()
{
double dotProduct = 0;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
double dotProduct = 0;
long double dotProduct = 0;


template<class T> double DistanceMetric<T>::euclidean()
{
double distance = 0;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
double distance = 0;
long double distance = 0;

You will have to downcast long double to double finally. use std::dynamic_cast

UPDATE: Do this wherever I have added double -> long double comments

@Ishwarendra
Copy link
Collaborator

Fix conflicts as well. @Modernbeast02

Copy link
Member

@harshjohar harshjohar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commit names should be relevant from next time

Copy link
Collaborator

@ken1000minus7 ken1000minus7 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say it would be better to make all functions static instead of passing the vectors as property and then calling respective functions

Copy link
Member

@harshjohar harshjohar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Revert CMakeLists.txt and don't change it in future models

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants