# 🕷️ WebCrawlerX
A flexible and efficient web crawler written in Rust.
## Features

- Multiple spider implementations (CVE Details, GitHub, Quotes)
- Configurable crawling parameters (delay, concurrent requests, page limit)
- Easy to extend with new spiders
## Installation

```bash
cargo install webcrawlerx
```
## Usage

List available spiders:

```bash
webcrawlerx spiders
```

Run a specific spider:

```bash
webcrawlerx run --spider <spider_name> [--delay <ms>] [--concurrent <num>] [--limit <num>]
```

Here `--delay` is the pause between requests in milliseconds, `--concurrent` is the number of concurrent requests, and `--limit` is the maximum number of pages to crawl.

Example:

```bash
webcrawlerx run --spider cvedetails --delay 200 --concurrent 2 --limit 10
```
## Adding a New Spider

To add a new spider, create a new module in the `spiders` directory and implement the `Spider` trait. Then update the `run_spider` function in `main.rs` to include your new spider.
## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.
## License

This project is licensed under the MIT License - see the LICENSE file for details.