abdelrahman-mh/get-proxy

Unlimited collecting, validating, and caching free proxies, working very fast!

Welcome 👋

What you will find here: unlimited collecting, validating, and caching of free proxies. Collect from any endpoint, including Text APIs, JSON APIs, or web pages, by simply adding URLs to the proxy_sources.txt file; collecting (scraping), validation, and caching are handled automatically. All of this is done very fast ✨

Features

We support validating HTTP and HTTPS proxies (SOCKS4 & SOCKS5 coming soon).

  • Unique IPs: Ensure only proxies with unique IP addresses are returned.

  • Asynchronous Power: Scrape URLs and validate proxies simultaneously, resulting in very fast processing 🚀.

  • 🧹 Scraping & Collecting: Extract proxies from URLs listed in proxy_sources.txt using regular expressions for webpage, JSON, and text content.

  • Validating: Validate proxies concurrently. We don't wait for all URLs to finish; validation happens as soon as each proxy is ready 💪 (see the sketch after this list).

  • 💾 Caching: Optionally cache valid proxies and set a duration for automatic revalidation.

  • 🐞 Monitoring: Track runtime details, including valid/invalid proxies, scraping status, source-specific proxy counts, and errors.
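
To make the validation idea concrete, here's a minimal sketch (not the project's actual code) of checking a single proxy against an IP-check API with aiohttp; check_proxy and the proxy address are hypothetical:

import asyncio
import aiohttp


async def check_proxy(session: aiohttp.ClientSession, proxy: str) -> bool:
    # Request the IP-check API through the proxy; a clean 200 means it works.
    try:
        async with session.get(
            "http://httpbin.org/ip",
            proxy=proxy,
            timeout=aiohttp.ClientTimeout(total=15),
        ) as resp:
            return resp.status == 200
    except Exception:
        return False


async def main():
    async with aiohttp.ClientSession() as session:
        print(await check_proxy(session, "http://1.2.3.4:8080"))


asyncio.run(main())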

Examples 💡

Here's a basic example without any options or configuration:

import asyncio
from get_proxy import ProxyFetcher  # import the module


async def main():
    async with ProxyFetcher() as proxy_fetcher:
        valid_proxies = await proxy_fetcher.get_valid_proxies()
        # process the proxies as you want
        print(valid_proxies)


asyncio.run(main())

Use ProxyConfig():

Let's enable proxy caching and set the cache duration to 5 minutes.

Proxies will be reused as long as the cache is valid; otherwise they will be revalidated.

import asyncio
from get_proxy import ProxyFetcher, ProxyConfig


async def main():
    config = ProxyConfig(
        cache_enabled=True,
        enforce_unique_ip=False,
        cache_duration_minutes=5,
    )
    proxy_fetcher = ProxyFetcher(config)
    proxies = await proxy_fetcher.get_valid_proxies()
    print(proxies)

    # clean up when finished
    await proxy_fetcher.close()


if __name__ == "__main__":
    asyncio.run(main())

Supported Content Types

We handle various types of content: Webpages, JSON APIs, and Text APIs.

Webpages and Text APIs

For webpages and plain-text API responses, just add the URL to proxy_sources.txt; proxies are extracted from the raw content with regular expressions, as sketched below.
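
A hedged sketch of that extraction step (the project's actual pattern may differ): scan raw content for ip:port pairs with a regular expression.

import re

# Matches dotted-quad IPs followed by a port, e.g. "103.152.112.162:80".
PROXY_RE = re.compile(r"\b(\d{1,3}(?:\.\d{1,3}){3}):(\d{2,5})\b")


def extract_proxies(content: str) -> list[str]:
    return [f"{ip}:{port}" for ip, port in PROXY_RE.findall(content)]


print(extract_proxies("Free: 103.152.112.162:80 and 45.77.56.114:3128"))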

JSON APIs

JSON sources might provide IP and port numbers in different fields. Here’s how to configure them:

  1. Add a URL to your proxy sources file (proxy_sources.txt).
  2. Add the following after the URL: json=true&ip=<ip_field>&port=<port_field>
    • Replace <ip_field> with the key for the IP address.
    • Replace <port_field> with the key for the port number.
    • Make sure there is a space between the URL and the parameters.

Example:

If your JSON response looks like this:

[
  {
    "IP": "214.235.43.2",
    "PORT": "80",
    "foo": "bar"
  },
  { "..." }
]

And your URL is http://example.com/api/free-proxy?format=json, you should write:

http://example.com/api/free-proxy?format=json json=true&ip=IP&port=PORT

INFO: Ensure there is a space between the URL and the parameters.
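
For illustration, here's a hypothetical sketch of how such a line could be split into its URL and JSON field parameters (the project's real parser may differ):

from urllib.parse import parse_qs


def parse_source_line(line: str):
    # The URL and the parameters are separated by a single space.
    url, _, params = line.strip().partition(" ")
    opts = {k: v[0] for k, v in parse_qs(params).items()} if params else {}
    return url, opts


url, opts = parse_source_line(
    "http://example.com/api/free-proxy?format=json json=true&ip=IP&port=PORT"
)
print(url)   # http://example.com/api/free-proxy?format=json
print(opts)  # {'json': 'true', 'ip': 'IP', 'port': 'PORT'}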

How To Use It:

  • Requirements 📋:

    • aiohttp

  • Clone the repo and navigate to the working directory:

git clone https://github.com/abdelrahman-mh/get-proxy
cd get-proxy

  • Set up the working directory:

# create a python venv (optional!) and activate it
python3 -m venv .venv && source .venv/bin/activate

# install requirements
pip install -r requirements.txt

  • Try it!:

python3 get_proxy.py
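
Before running, make sure proxy_sources.txt contains at least one source URL per line; both entries below are placeholders, not real sources:

https://example.com/proxies/http.txt
http://example.com/api/free-proxy?format=json json=true&ip=IP&port=PORT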

Reference 📚

ProxyFetcher():

ProxyFetcher(config: ProxyConfig = ProxyConfig())

Options

  • config: a ProxyConfig instance (default: ProxyConfig())

Methods

  • ProxyFetcher.get_valid_proxies() -> list[str]: returns a list of valid proxies, ready to use
    • Asynchronous; must be called with the await keyword

ProxyConfig():

ProxyConfig(
    prefix: str = "http://",
    user_agent: str = "Mozil...",
    ip_check_api: str = "http://httpbin.org/ip",
    request_timeout: int = 15,
    retry: int = 0,
    concurrency_limit: int = 500,
    proxy_sources_file: str = "proxy_sources.txt",
    proxy_cache_file: str = "proxy_cache.txt",
    cache_enabled: bool = False,
    cache_duration_minutes: int = 20,
    enforce_unique_ip: bool = True,
    strict_x_forwarded_for: bool = False
)

Options

  • prefix: Proxy URL prefix (default: "http://").
  • user_agent: User-agent string (default: "Mozil...").
  • ip_check_api: API for public IP check and proxy validation (default: "http://httpbin.org/ip").
  • request_timeout: Timeout for proxy validity checks (default: 15 seconds).
  • retry: Number of retries for failed proxy requests (default: 0).
  • concurrency_limit: Maximum concurrent proxy validation requests (default: 500).
  • proxy_sources_file: File containing proxy source URLs (default: "proxy_sources.txt").
  • proxy_cache_file: File for storing cached proxies (default: "proxy_cache.txt").
  • cache_enabled: Whether to enable caching (default: False).
  • cache_duration_minutes: Duration for caching proxies (default: 20 minutes).
  • enforce_unique_ip: Ensure each proxy has a unique IP (default: True).
  • strict_x_forwarded_for: Enforce strict handling of X-Forwarded-For headers; some proxies don't really hide your IP! (default: False).
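
For example, a stricter setup might look like this (all values are illustrative, built only from the options documented above):

from get_proxy import ProxyFetcher, ProxyConfig

# Illustrative values: retry failed checks once, lower the concurrency,
# and reject proxies that leak your IP via X-Forwarded-For.
config = ProxyConfig(
    request_timeout=10,
    retry=1,
    concurrency_limit=200,
    strict_x_forwarded_for=True,
)
fetcher = ProxyFetcher(config)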

For Developers 🛠️

PRs are welcome!

To-Do List 📝:

  • Add an option to limit the number of working proxies returned.
  • Design Patterns:
    • Use caching to store configurations during initialization, avoiding repeated checks at runtime.
    • Consider patterns like Strategy or Factory to manage varying behaviors based on configuration.
    • Implement a method for handling proxy limits and use asyncio.as_completed() for processing results as they finish, instead of asyncio.gather() (see the sketch after this list).
    • Apply these patterns to improve configuration handling for options like enforce_unique_ip and cache_enabled.
  • Socks 4 & 5: Add support for Socks4 and Socks5 proxies.
  • Separate proxy scraping from validating.
  • Add type annotations and hints to the code.
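
For the asyncio.as_completed() item above, a minimal sketch of the pattern (check_proxy is a hypothetical stand-in for any validation coroutine):

import asyncio


async def check_proxy(proxy: str) -> str | None:
    await asyncio.sleep(0)  # stand-in for a real validation request
    return proxy


async def main():
    tasks = [check_proxy(p) for p in ["1.2.3.4:80", "5.6.7.8:3128"]]
    # Handle each result as soon as it finishes instead of waiting for all.
    for fut in asyncio.as_completed(tasks):
        result = await fut
        if result:
            print("valid:", result)


asyncio.run(main())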