ranking-engine

A ranking engine for text search. Given a query and a set of articles, it returns the N most relevant articles.

Input

The required input is a file ("file.txt") containing the articles in JSON format:

{"abstract":"The article text here!!!", "keywords":["keyword1", "keyword2.. etc"], "title":"The title here"}

An article may contain additional fields, which are ignored. If one of the fields above is missing, the algorithm leaves it out of its computations.
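
The field handling described above can be sketched as follows. This is only an illustration, not the code in preprocess.py: the helper name `parse_article` is hypothetical, and it assumes each article is a single JSON object passed in as a string.

```python
import json

# The three fields the README says are used; anything else is ignored.
FIELDS = ("abstract", "keywords", "title")


def parse_article(text):
    """Parse one JSON article and keep only the known fields.

    Hypothetical helper: extra fields are dropped, and a missing field is
    simply absent from the result so later steps can skip it.
    """
    record = json.loads(text)
    return {name: record[name] for name in FIELDS if name in record}


if __name__ == "__main__":
    sample = '{"abstract": "The article text here!!!", "title": "The title here", "year": 2020}'
    print(parse_article(sample))
```

Here the unknown "year" field is discarded and the missing "keywords" field is left out rather than filled with a placeholder.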

First run preprocess.py, which preprocesses the articles and writes an output file.
Then run rank.py, which takes that generated file and a query from standard input and uses tf-idf to return the N most relevant articles. The ranking also gives different weights to the article's abstract, keywords and title.
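
To make the weighted tf-idf idea concrete, here is a minimal sketch. It is not the implementation in rank.py: the weight values, the whitespace tokenizer, the smoothed idf formula log(1 + N/df), and the function names are all assumptions chosen for illustration.

```python
import math
from collections import Counter

# Hypothetical field weights; the README does not state the real values.
FIELD_WEIGHTS = {"title": 3.0, "keywords": 2.0, "abstract": 1.0}


def tokenize(text):
    # Assumed tokenizer: lowercase and split on whitespace.
    return text.lower().split()


def field_tokens(article, field):
    # A missing field contributes no tokens, so it is ignored in scoring.
    value = article.get(field)
    if value is None:
        return []
    if isinstance(value, list):
        value = " ".join(value)
    return tokenize(value)


def rank(articles, query, n):
    """Return up to n articles with a positive score, best first."""
    # Document frequency: in how many articles each term appears (any field).
    doc_freq = Counter()
    for article in articles:
        terms = set()
        for field in FIELD_WEIGHTS:
            terms.update(field_tokens(article, field))
        doc_freq.update(terms)

    total = len(articles)
    query_terms = tokenize(query)
    scored = []
    for index, article in enumerate(articles):
        score = 0.0
        for field, weight in FIELD_WEIGHTS.items():
            counts = Counter(field_tokens(article, field))
            for term in query_terms:
                if counts[term]:
                    # Assumed smoothed idf; always positive.
                    idf = math.log(1.0 + total / doc_freq[term])
                    score += weight * counts[term] * idf
        scored.append((score, index))

    # Highest score first; ties keep the original article order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [articles[i] for score, i in scored[:n] if score > 0]
```

With these assumed weights, a query term found in a title counts three times as much as the same term found in an abstract, so an article whose title matches the query is ranked above one that only mentions the term in its abstract.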

This project is licensed under the GPLv3 License. See the LICENSE file for details.
