Document summarization & query expansion - Information Retrieval homework 2, Innopolis University

How to run the code?

Download two folders (data.nosync and engine_data) from my Google Drive
Put both folders in the root directory of the project
Make sure your Python version is 3.7 or later
Install all required packages by running pip3 install -r requirements.txt
P.S. It is better to use a virtual environment
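The setup steps above can be sanity-checked before running anything. The following is only a minimal sketch and is not part of the repository: the helper name `check_setup` is hypothetical, and it assumes the two downloaded folders sit directly in the project root, as the steps describe.

```python
import sys
from pathlib import Path

# Folder names and minimum version are taken from the steps above.
REQUIRED_DIRS = ("data.nosync", "engine_data")
MIN_PYTHON = (3, 7)


def check_setup(root=".", version=None):
    """Return a list of setup problems; an empty list means everything looks fine.

    `check_setup` is a hypothetical helper, not a function from this project.
    `version` can be passed explicitly (useful for testing); by default the
    running interpreter's version is used.
    """
    version = tuple(sys.version_info[:2]) if version is None else tuple(version)
    problems = []

    if version < MIN_PYTHON:
        problems.append(
            "Python %d.%d found, %d.%d+ required" % (version + MIN_PYTHON)
        )

    root_path = Path(root)
    for name in REQUIRED_DIRS:
        if not (root_path / name).is_dir():
            problems.append("missing folder: " + name)

    return problems


if __name__ == "__main__":
    issues = check_setup()
    for issue in issues:
        print(issue)
    if not issues:
        print("Setup looks fine.")
```

Run it from the project root; it only reads the file system and does not install or download anything.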

Now you can run the code: type python3 doc_sum.py for the document summarization task and python3 query_exp.py for the query expansion task. To use a different query for document summarization, edit the line query = "your query here" in the launch() function of doc_sum.py.
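If editing the source for every new query becomes tedious, one possible alternative is to read the query from the command line. This is a sketch under assumptions, not the code in doc_sum.py: `resolve_query` and `DEFAULT_QUERY` are hypothetical names, and how the result would be wired into launch() depends on that function's actual body, which is not shown here.

```python
import sys

# Placeholder default, mirroring the example string in the text above.
DEFAULT_QUERY = "your query here"


def resolve_query(argv, default=DEFAULT_QUERY):
    """Join all command-line words after the script name into one query.

    Falls back to `default` when no words are given. Hypothetical helper,
    not part of the original project.
    """
    words = [word for word in argv[1:] if word.strip()]
    return " ".join(words) if words else default


if __name__ == "__main__":
    query = resolve_query(sys.argv)
    print("Using query: %r" % query)
```

With something like this in place, the assignment inside launch() could become query = resolve_query(sys.argv), assuming launch() takes the query from a local variable as the text suggests.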

If you run into a problem with nltk (most likely because its datasets have not been downloaded), run:

	import nltk
	nltk.download('wordnet')     # required for query expansion
	nltk.download('stopwords')   # required for both parts
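The same downloads can be wrapped in a small helper that reports which datasets failed to load. This is a minimal sketch: `ensure_nltk_data` is a hypothetical name, and it relies only on `nltk.download(name, quiet=True)`, which returns a boolean indicating success. The `download` parameter exists so the helper can be exercised without nltk or network access.

```python
# Dataset names are the two listed above.
REQUIRED_DATASETS = ("wordnet", "stopwords")


def ensure_nltk_data(names=REQUIRED_DATASETS, download=None):
    """Try to download each dataset and return the names that failed.

    Hypothetical helper, not part of the original project. When `download`
    is None, nltk.download is used; nltk is imported lazily so the function
    can be tested with a stand-in downloader.
    """
    if download is None:
        import nltk  # imported here so a missing nltk only matters at call time
        download = nltk.download

    failed = []
    for name in names:
        # quiet=True suppresses the progress log; the return value is True
        # when the dataset is available after the call.
        if not download(name, quiet=True):
            failed.append(name)
    return failed


if __name__ == "__main__":
    missing = ensure_nltk_data()
    if missing:
        print("Could not download: " + ", ".join(missing))
    else:
        print("All nltk datasets are available.")
```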
