Inquiry is a search engine for electronic preprints.
For now, you can access papers in the fields of mathematics, physics, computer science and statistics. These papers are retrieved from the arXiv repository.
Before using Inquiry, make sure you have met the following requirements:
- python>=3.x
- elasticsearch==7.4.0
- gcc
- Poppler C++ library
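The requirements above can be partly checked before installing. The following is a minimal sketch, not part of Inquiry: the function name `check_prerequisites` is hypothetical, and it assumes that `gcc` and `elasticsearch` are expected on your `PATH`. It does not verify the Elasticsearch version or the presence of the Poppler library.

```python
import shutil
import sys


def check_prerequisites():
    """Return a dict mapping each checked prerequisite to True or False."""
    return {
        # The requirements list asks for Python 3.
        "python3": sys.version_info.major >= 3,
        # Assumption: gcc is reachable through PATH.
        "gcc": shutil.which("gcc") is not None,
        # Assumption: the elasticsearch launcher is reachable through PATH.
        "elasticsearch": shutil.which("elasticsearch") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'found' if ok else 'MISSING'}")
```

A `MISSING` entry only means the executable was not found on `PATH`; it may still be installed elsewhere.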
To install Inquiry, follow these steps:
- Clone this repository:
$ git clone https://github.com/javiergarea/inquiry.git
- Run the following command to install the project dependencies:
$ pip3 install -r requirements.txt
If something goes wrong during this step, make sure that pip, gcc and the Poppler library are installed.
- Run the arXiv spider to crawl the documents:
$ scrapy crawl arxiv
This should generate an items.jsonl file in the root directory.
- Start the Elasticsearch service:
$ elasticsearch
Check that it is running properly with the command curl localhost:9200.
- Index the crawled data in Elasticsearch:
$ python3 elastic_manage.py -i items.jsonl
- Run the Inquiry service:
$ python3 manage.py runserver
- Open localhost:8000 in your browser and perform your queries.
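If a step in the list above seems to have failed, two quick checks can help narrow it down: whether the crawl produced a readable items.jsonl, and whether Elasticsearch answers on localhost:9200. The sketch below is illustrative and not part of Inquiry. The helper names `count_items` and `elasticsearch_is_up` are hypothetical, and it assumes items.jsonl holds one JSON record per line (the usual JSON Lines layout) without assuming any particular field names.

```python
import json
import os
import urllib.error
import urllib.request


def count_items(path):
    """Count the JSON records in a JSON Lines file, skipping blank lines."""
    count = 0
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            if line.strip():
                json.loads(line)  # raises ValueError on a malformed record
                count += 1
    return count


def elasticsearch_is_up(url="http://localhost:9200", timeout=3):
    """Return True if an HTTP GET on the URL answers with status 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False


if __name__ == "__main__":
    print("Elasticsearch reachable:", elasticsearch_is_up())
    if os.path.exists("items.jsonl"):
        print("Crawled items:", count_items("items.jsonl"))
    else:
        print("items.jsonl not found; run the crawl step first.")
```

Run it from the repository root after the crawl step. A count of zero, or a `ValueError`, suggests the crawl did not complete cleanly.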
Inquiry is an information retrieval project developed as part of the MSc in Computer Science at Universidade da Coruña. The software is accompanied by a technical document that details its development. This document is available in a web version.
Javier Garea - javier.garea@udc.es
Martín Sande - martin.sande@udc.es
This project uses the following license: MIT.