A Telegram bot that retrieves the title, DOI, authors and publication date of papers on PubMed, starting from general search terms or from specific publication names.
You can pull it from the GitHub Container Registry:
docker pull ghcr.io/astrabert/biomedicalpapersbot:main
docker run -p 7860:7860 ghcr.io/astrabert/biomedicalpapersbot:main
Or you can clone the repository:
git clone https://github.com/AstraBert/BioMedicalPapersBot
cd BioMedicalPapersBot
Create a virtual environment and activate it:
python3 -m venv virtualenv
source virtualenv/bin/activate
Install the required dependencies:
python3 -m pip install -r requirements.txt
Run the application:
python3 scripts/app.py
In both cases, you will find the application on http://localhost:7860
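If you want to confirm from a script that the application has started, a minimal sketch using only the Python standard library could look like the following. The helper name `app_is_up` is hypothetical and not part of the repository; it only assumes that the application answers plain HTTP requests on the address given above.

```python
from urllib.request import urlopen


def app_is_up(url="http://localhost:7860", timeout=3.0):
    """Return True if an HTTP GET on `url` answers with status 200."""
    try:
        with urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # URLError and HTTPError are OSError subclasses, so refused
        # connections, timeouts and HTTP error codes all end up here.
        return False


if __name__ == "__main__":
    print("running" if app_is_up() else "not reachable")
```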
Find a demo here.
It is a (bio)python-based Gradio bot that searches PubMed and returns the features of the papers that correspond to the search.
You can find a code snippet with the functions used to retrieve and parse data from PubMed in pubmedScraper.py. The workflow is pretty simple:
search_pubmed does the actual web scraping, thanks to the Entrez NCBI module, which connects to the remote NCBI servers and communicates with them; the function returns a list of PubMed IDs.
fetch_pubmed_details uses the IDs from the previous function to get faster access to paper data and metadata; it retrieves the relevant information about the papers and outputs it in standard XML format.
fetch_xml takes care of parsing the XML output and extracting titles, authors, dates of publication and DOIs.
respond_to_query outputs the information of interest in a human-readable format that can be sent as a message.
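The workflow above can be illustrated with a rough, standard-library-only sketch. It is not the code in pubmedScraper.py: the helper names (`esearch_url`, `efetch_url`, `parse_pubmed_xml`, `format_reply`) are hypothetical, the sample record is made up, and the sketch builds NCBI E-utilities URLs directly instead of going through the Entrez module. It makes no network calls, so it runs offline.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(term, retmax=5):
    """URL asking PubMed for the IDs of papers that match `term`."""
    query = urlencode({"db": "pubmed", "term": term, "retmax": retmax})
    return f"{EUTILS}/esearch.fcgi?{query}"


def efetch_url(pmids):
    """URL asking PubMed for the XML records of the given IDs."""
    query = urlencode({"db": "pubmed", "id": ",".join(pmids), "retmode": "xml"})
    return f"{EUTILS}/efetch.fcgi?{query}"


def parse_pubmed_xml(xml_text):
    """Extract title, authors, publication date and DOI from PubMed XML."""
    papers = []
    for article in ET.fromstring(xml_text).iter("PubmedArticle"):
        title_el = article.find(".//ArticleTitle")
        # itertext() keeps text nested inside inline markup such as <i>.
        title = "".join(title_el.itertext()).strip() if title_el is not None else ""
        authors = []
        for author in article.iter("Author"):
            name = " ".join(
                part for part in (author.findtext("ForeName"), author.findtext("LastName")) if part
            )
            if name:
                authors.append(name)
        date_el = article.find(".//PubDate")
        date = " ".join(" ".join(date_el.itertext()).split()) if date_el is not None else ""
        doi = next(
            (i.text for i in article.iter("ArticleId") if i.get("IdType") == "doi"), None
        )
        papers.append({"title": title, "authors": authors, "date": date, "doi": doi})
    return papers


def format_reply(papers):
    """Turn parsed papers into a plain-text message."""
    blocks = []
    for p in papers:
        blocks.append(
            f"Title: {p['title']}\n"
            f"Authors: {', '.join(p['authors']) or 'n/a'}\n"
            f"Date: {p['date'] or 'n/a'}\n"
            f"DOI: {p['doi'] or 'n/a'}"
        )
    return "\n\n".join(blocks)


# Made-up record that mimics the layout of a PubMed efetch response.
SAMPLE = """<PubmedArticleSet><PubmedArticle>
<MedlineCitation><Article>
<Journal><JournalIssue><PubDate><Year>2020</Year><Month>Jan</Month></PubDate></JournalIssue></Journal>
<ArticleTitle>An example title</ArticleTitle>
<AuthorList><Author><LastName>Doe</LastName><ForeName>Jane</ForeName></Author></AuthorList>
</Article></MedlineCitation>
<PubmedData><ArticleIdList><ArticleId IdType="doi">10.0000/example</ArticleId></ArticleIdList></PubmedData>
</PubmedArticle></PubmedArticleSet>"""

if __name__ == "__main__":
    print(esearch_url("crispr cas9"))
    print(format_reply(parse_pubmed_xml(SAMPLE)))
```

In a real run, the two URLs would be fetched in sequence: the first response yields the PubMed IDs, and the second yields the XML that the parser consumes.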
You can also find the basic architecture of the Python code that is used for the Gradio bot itself.
Keep in mind that there are several ways to define a Python bot: if you find a faster or better implementation, feel free to suggest it in the ISSUE section.
If you found this project useful, please consider funding it to help it grow: let's support open-source together!😊
This project is provided under the MIT license: it will always be open-source and free to use.
If you use this project, please cite the author: Astra Clelia Bertelli