Scraping Dominio Publico (Gov Brasil)

Tools for scraping data and downloading files from the Dominio Publico portal (gov BR).

Dependencies:

  • Python 3.11+
  • Node.js 19+
  • A Linux Bash environment (only needed if you use the "run.sh" script).

How to use

Run "run.sh" in a Bash terminal and choose an option from the menu. To skip the menu and run an option directly, pass the option number as an argument.

Run the script with the menu to choose an option:

./run.sh

Run the script with the option already included:

# In this case, the option "1" from the menu
./run.sh 1

The scraped data is stored in the "json" directory as "raw_data.json", and the downloaded book files are saved in the "booklibrary" directory. Run the scraping option first: the download option needs the scraped data.
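
Since the download step depends on the scraped data, a small pre-flight check can be useful. The sketch below is not part of this repository: the function name `load_scraped_data` and its `base_dir` parameter are hypothetical, and it assumes only that the scraper writes valid JSON to "json/raw_data.json" relative to the project root. The structure of that JSON is not assumed.

```python
import json
from pathlib import Path


def load_scraped_data(base_dir="."):
    """Return the parsed contents of json/raw_data.json under base_dir.

    Hypothetical helper: raises FileNotFoundError when the scraping
    step has not produced the file yet.
    """
    raw_path = Path(base_dir) / "json" / "raw_data.json"
    if not raw_path.is_file():
        raise FileNotFoundError(
            f"{raw_path} not found: run the scraping option before downloading."
        )
    # Read as UTF-8 (an assumption about how the scraper writes the file).
    with raw_path.open(encoding="utf-8") as fh:
        return json.load(fh)


if __name__ == "__main__":
    try:
        load_scraped_data()
    except FileNotFoundError as err:
        print(err)
    else:
        print("Scraped data found; the download option can be run.")
```

Running this from the project root before choosing the download option reports whether the scraping step still needs to be done.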