BioMedicalPapersBot

A Gradio bot to retrieve the title, DOI, authors and publication date of papers on PubMed, starting from general search terms or from specific publication names

How to activate it

You can pull the image from the GitHub Container Registry and run it:

docker pull ghcr.io/astrabert/biomedicalpapersbot:main
docker run -p 7860:7860 ghcr.io/astrabert/biomedicalpapersbot:main

Or you can clone the repository:

git clone https://github.com/AstraBert/BioMedicalPapersBot
cd BioMedicalPapersBot

Create a virtual environment and activate it:

python3 -m venv virtualenv
source virtualenv/bin/activate

Install the required dependencies:

python3 -m pip install -r requirements.txt

Run the application:

python3 scripts/app.py

In both cases, the application will be available at http://localhost:7860

Find a demo here.

Description

It is a (Bio)python-based Gradio bot that searches PubMed and returns the metadata of the papers that match the search.

The functions used to retrieve and parse data from PubMed are in pubmedScraper.py. The workflow is pretty simple:

  • search_pubmed performs the actual search through Biopython's Entrez module, which connects remotely to the NCBI servers and queries them: the function returns a list of PubMed IDs
  • fetch_pubmed_details uses the IDs from the previous function to retrieve the paper metadata directly and outputs it in standard XML format
  • fetch_xml takes care of parsing the XML output and extracting titles, authors, dates of publication and DOIs
  • respond_to_query outputs the information of interest in a format that is human-readable and can be sent as a message
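
The four steps above can be sketched as follows. This is a minimal illustration and not the actual content of pubmedScraper.py: the function signatures, the dictionary keys, the placeholder email address and the made-up SAMPLE_XML record are all assumptions. The Entrez calls need Biopython and network access, while the parsing and formatting steps only use the standard library and can be tried offline on the sample record.

```python
import xml.etree.ElementTree as ET

# Made-up record that mimics the layout of PubMed XML (not a real paper).
SAMPLE_XML = """
<PubmedArticleSet><PubmedArticle>
  <MedlineCitation><Article>
    <Journal><JournalIssue><PubDate><Year>2024</Year><Month>Jan</Month></PubDate></JournalIssue></Journal>
    <ArticleTitle>An example title</ArticleTitle>
    <AuthorList>
      <Author><LastName>Doe</LastName><ForeName>Jane</ForeName></Author>
      <Author><LastName>Roe</LastName><ForeName>John</ForeName></Author>
    </AuthorList>
  </Article></MedlineCitation>
  <PubmedData><ArticleIdList>
    <ArticleId IdType="pubmed">00000000</ArticleId>
    <ArticleId IdType="doi">10.1000/example</ArticleId>
  </ArticleIdList></PubmedData>
</PubmedArticle></PubmedArticleSet>
"""


def search_pubmed(query, max_results=5):
    """Return the PubMed IDs matching the query (needs network access)."""
    from Bio import Entrez  # lazy import: the parsing helpers work without Biopython

    Entrez.email = "you@example.com"  # placeholder contact address for NCBI
    handle = Entrez.esearch(db="pubmed", term=query, retmax=max_results)
    record = Entrez.read(handle)
    handle.close()
    return list(record["IdList"])


def fetch_pubmed_details(id_list):
    """Download the records for the given IDs as raw XML (needs network access)."""
    from Bio import Entrez

    handle = Entrez.efetch(db="pubmed", id=",".join(id_list), retmode="xml")
    xml_data = handle.read()
    handle.close()
    return xml_data


def fetch_xml(xml_data):
    """Extract title, authors, publication date and DOI from PubMed XML."""
    root = ET.fromstring(xml_data)
    papers = []
    for article in root.iter("PubmedArticle"):
        title_el = article.find(".//ArticleTitle")
        # Titles may contain inline markup, so join all the text fragments.
        title = "".join(title_el.itertext()).strip() if title_el is not None else "N/A"
        authors = []
        for author in article.iterfind(".//AuthorList/Author"):
            last, fore = author.findtext("LastName"), author.findtext("ForeName")
            if last:
                authors.append(f"{fore} {last}" if fore else last)
        pub_date = article.find(".//JournalIssue/PubDate")
        parts = [] if pub_date is None else [pub_date.findtext(t) for t in ("Year", "Month", "Day")]
        date = " ".join(p for p in parts if p) or "N/A"
        doi = "N/A"
        for article_id in article.iterfind("PubmedData/ArticleIdList/ArticleId"):
            if article_id.get("IdType") == "doi" and article_id.text:
                doi = article_id.text.strip()
                break
        papers.append({"title": title, "authors": authors, "date": date, "doi": doi})
    return papers


def respond_to_query(papers):
    """Turn the parsed records into a plain-text message."""
    if not papers:
        return "No papers found."
    blocks = []
    for paper in papers:
        authors = ", ".join(paper["authors"]) or "N/A"
        blocks.append(
            f"Title: {paper['title']}\nAuthors: {authors}\n"
            f"Date: {paper['date']}\nDOI: {paper['doi']}"
        )
    return "\n\n".join(blocks)
```

Offline, respond_to_query(fetch_xml(SAMPLE_XML)) formats the sample record; with Biopython installed and network access, the same two calls can be chained after fetch_pubmed_details(search_pubmed("your search terms")).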

You can also find the basic architecture of the Python code used for the Gradio bot itself.
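
As a rough idea of what such an architecture can look like, here is a minimal sketch that assumes a chat-style interface built with gr.ChatInterface; the actual scripts/app.py may be organised differently. The names make_reply, build_app and lookup are hypothetical: lookup stands for any function that takes the search text and returns the formatted answer.

```python
from typing import Callable


def make_reply(lookup: Callable[[str], str]):
    """Build a chat handler around a lookup function (hypothetical helper)."""

    def reply(message, history):
        # Gradio passes the new message and the chat history; only the message is used.
        query = message.strip()
        if not query:
            return "Please enter a search term or a publication name."
        return lookup(query)

    return reply


def build_app(lookup: Callable[[str], str]):
    """Wrap the handler in a Gradio chat interface (needs gradio installed)."""
    import gradio as gr  # lazy import: make_reply can be used without Gradio

    return gr.ChatInterface(fn=make_reply(lookup), title="BioMedicalPapersBot")


# To serve the app on the port used above, one could run:
# build_app(my_lookup).launch(server_name="0.0.0.0", server_port=7860)
```

Keeping the lookup function separate from the interface means the PubMed logic can be exercised without starting a web server.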

Keep in mind that there are several ways to write a Python bot: if you find a faster or better implementation, feel free to suggest it in the Issues section.

Funding

If you found this project useful, please consider funding it to help it grow: let's support open-source together! 😊

License and rights of usage

This project is provided under the MIT license: it will always be open-source and free to use.

If you use this project, please cite the author: Astra Clelia Bertelli