
Araneae

This is a Python 3 program that works as a Wikipedia crawler, built with the requests and BeautifulSoup (bs4) modules. It fetches the Wikipedia page at a user-entered URL and stores the article's text locally as a .txt file.
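The core approach looks roughly like the following minimal sketch. It illustrates the technique described above and is not the repository's actual Crawler.py; in particular, the prompt text and the file-naming scheme are assumptions.

    import requests
    from bs4 import BeautifulSoup

    # Prompt for the article URL, e.g. https://en.wikipedia.org/wiki/Spider
    url = input("Enter a Wikipedia URL: ")

    response = requests.get(url)
    response.raise_for_status()  # stop on HTTP errors (404, etc.)

    soup = BeautifulSoup(response.text, "html.parser")

    # Wikipedia article text lives in <p> tags; collect and join it.
    text = "\n".join(p.get_text() for p in soup.find_all("p"))

    # Name the output file after the page's <h1> title (an assumption;
    # the real script may name the file differently).
    title = soup.find("h1").get_text().strip()
    with open(title + ".txt", "w", encoding="utf-8") as out:
        out.write(text)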

Setup

Installation

  1. Install Python 3.7 or above.
  2. Install these modules (see the pip command below):
    • requests
    • bs4
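Both can be installed with pip; note that the bs4 module is provided by the beautifulsoup4 package on PyPI:

    pip install requests beautifulsoup4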

Running the program

  1. Download the Crawler.py file.
  2. Open the folder containing Crawler.py in your terminal (macOS/Linux) or Command Prompt (Windows).
  3. Type python Crawler.py and press Enter to run the program.

OR

  1. Install any Python IDE.
  2. Create a new project and paste the contents of Crawler.py into a .py file.
  3. Run the program.
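Either way, a session might look like the following (assuming the script prompts for the URL interactively; the exact prompt and output text are assumptions):

    $ python Crawler.py
    Enter a Wikipedia URL: https://en.wikipedia.org/wiki/Spider
    Saved article text to Spider.txt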

Contributing

Pull requests are welcome for adding more features or fixing existing issues.

License

MIT
