SpiderBolt

SpiderBolt is a Python web scraping script that extracts links from websites using multi-threading and randomly selected user agents. It categorizes links into HTML and other types, groups them by path, and saves them to an organized output file. Settings such as the thread count can be adjusted to suit different scraping needs.

Features

🌟 Multi-threading: Handles up to 500 threads for fast and efficient web scraping.
🌐 Custom User Agents: Sends a random user-agent header with each request to mimic real browsers and reduce the chance of being blocked.
📊 Link Categorization: Automatically categorizes links into HTML and other types, grouping them by paths for easy analysis.
🛠️ Customizable Settings: Adjust the number of threads and tweak other settings to suit your scraping needs.
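
To illustrate the link categorization feature, here is a minimal sketch of how links might be split into HTML and other types and grouped by path. It is not taken from `spiderbolt.py`: the function name `categorize_links`, the rule that extensionless paths count as HTML, and grouping by the first path segment are all assumptions made for this example.

```python
from collections import defaultdict
from posixpath import splitext
from urllib.parse import urlparse

# Assumption: paths with no extension, or ending in .html/.htm, are HTML pages.
HTML_EXTENSIONS = {"", ".html", ".htm"}


def categorize_links(links):
    """Split links into 'html' and 'other', grouped by first path segment."""
    grouped = {"html": defaultdict(list), "other": defaultdict(list)}
    for link in links:
        path = urlparse(link).path
        extension = splitext(path)[1].lower()
        kind = "html" if extension in HTML_EXTENSIONS else "other"
        segments = [segment for segment in path.split("/") if segment]
        # Nested links are grouped under their first directory; top-level
        # links (and the site root) are grouped under "/".
        group = "/" + segments[0] if len(segments) > 1 else "/"
        grouped[kind][group].append(link)
    return grouped
```

With this sketch, `https://example.com/blog/post.html` would land under `html` in the `/blog` group, while `https://example.com/blog/img/logo.png` would land under `other` in the same group.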

Installation

Clone the repository:

git clone https://github.com/ogtirth/SpiderBolt.git
cd SpiderBolt

Install the required dependencies:
```
pip install -r requirements.txt
```
Make sure to add a `user-agents.txt` file with a list of user agents (one per line) in the project directory.
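
As a rough sketch of how a `user-agents.txt` file in this format could be consumed, the snippet below reads one user agent per line and picks one at random for a request header. The helper names `load_user_agents` and `random_headers` are hypothetical and are not confirmed to exist in `spiderbolt.py`.

```python
import random
from pathlib import Path


def load_user_agents(path="user-agents.txt"):
    """Read user agents from a text file, one per line, skipping blank lines."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    agents = [line.strip() for line in lines if line.strip()]
    if not agents:
        raise ValueError(f"{path} contains no user agents")
    return agents


def random_headers(agents):
    """Build request headers with a randomly chosen User-Agent."""
    return {"User-Agent": random.choice(agents)}
```

Blank lines are skipped, and an empty file raises an error early rather than failing on the first request.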

Usage

Run the script:

python spiderbolt.py

Follow the on-screen prompts to:

1. Enter the domain to scrape for links.
2. Specify the number of threads to use.

The script will handle the rest, providing you with real-time status updates for each request.
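
The general pattern of running requests on a configurable number of threads and printing a status line as each one finishes can be sketched with the standard library's `concurrent.futures`. This is an illustration only, not SpiderBolt's actual code: `run_requests`, the `fetch` callable, and the `[OK]`/`[FAIL]` output format are assumptions, and the 500-thread cap simply mirrors the limit stated in the feature list.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

MAX_THREADS = 500  # upper bound taken from the feature list


def run_requests(urls, fetch, threads=10):
    """Call fetch(url) for each URL on a thread pool, printing status as each finishes."""
    workers = max(1, min(threads, MAX_THREADS))
    results = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(fetch, url): url for url in urls}
        # as_completed yields futures in completion order, so status lines
        # appear as soon as each request finishes.
        for future in as_completed(futures):
            url = futures[future]
            try:
                results[url] = future.result()
                print(f"[OK] {url}")
            except Exception as exc:
                # A failed request is recorded as None and does not stop the others.
                results[url] = None
                print(f"[FAIL] {url}: {exc}")
    return results
```

Here `fetch` is any function that takes a URL and returns a result, so the HTTP library is left open. Passing the fetch function in also makes the pattern easy to try without network access.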