Scraping Dominio Publico (Gov Brasil)

Tools for scraping data and book files from the Domínio Público portal (Brazilian government).

Dependencies:

Python 3.11+
NodeJS 19+
A Linux bash environment (only needed if you want to use the "run.sh" script).

How to use

Run "run.sh" in a bash terminal and choose an option from the menu. Alternatively, pass the option number as an argument to skip the menu and run that option directly.

Run the script with the menu to choose an option:

./run.sh

Run the script with the option already included:

# In this case, the option "1" from the menu
./run.sh 1

The scraped data is stored in the "json" directory as "raw_data.json", and the downloaded book files are saved in the "booklibrary" directory. The data must be scraped first, because the download option depends on it.
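Because the download step depends on the scraping output, a quick check can confirm the file is in place before starting. The sketch below is only an illustration: the path "json/raw_data.json" comes from this README, while the helper name load_scraped_data is hypothetical and nothing is assumed about the structure of the JSON beyond it being valid.

```python
import json
from pathlib import Path

# Path taken from this README; adjust it if you run from another directory.
RAW_DATA = Path("json") / "raw_data.json"


def load_scraped_data(path: Path = RAW_DATA):
    """Hypothetical helper: return the parsed scraping output.

    Raises FileNotFoundError if the scraping step has not been run yet.
    """
    if not path.is_file():
        raise FileNotFoundError(
            f"{path} not found: run the scraping option before downloading."
        )
    with path.open(encoding="utf-8") as fh:
        return json.load(fh)
```

Calling load_scraped_data() from the repository root either returns the parsed data or fails early with a message pointing back to the scraping step.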
