Update to use FakeUserAgent (#8)
* Remove Python2 and Add FakeUserAgent

* Update config.yml

Allow building a test build

* Update noisy.py

Fix fake useragent requiring float

* Update noisy.py

Fix urllib parse import
madereddy authored Sep 7, 2023
1 parent f5ed5a2 commit 37d3b6b
Showing 3 changed files with 7 additions and 19 deletions.
8 changes: 1 addition & 7 deletions .circleci/config.yml
@@ -70,13 +70,7 @@ workflows:
   version: 2
   build-master:
     jobs:
-      - build:
-          filters:
-            branches:
-              only: master
+      - build
       - publish-latest:
           requires:
             - build
-          filters:
-            branches:
-              only: master
17 changes: 5 additions & 12 deletions noisy.py
@@ -4,23 +4,16 @@
 import logging
 import random
 import re
-import sys
 import time
 
 import requests
 from urllib3.exceptions import LocationParseError
 
-try: # Python 2
-    from urllib.parse import urljoin, urlparse
-except ImportError: # Python 3
-    from urlparse import urljoin, urlparse
-
-try: # Python 2
-    reload(sys)
-    sys.setdefaultencoding('latin-1')
-except NameError: # Python 3
-    pass
+import fake_useragent
+from fake_useragent import UserAgent
+ua = UserAgent(min_percentage=15.1)
 
+from urllib.parse import urljoin, urlparse
 
 class Crawler(object):
     def __init__(self):
@@ -43,7 +36,7 @@ def _request(self, url):
         :param url: the url to visit
         :return: the response Requests object
         """
-        random_user_agent = random.choice(self._config["user_agents"])
+        random_user_agent = ua.random
         headers = {'user-agent': random_user_agent}
 
         response = requests.get(url, headers=headers, timeout=5)
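
The noisy.py change above drops the Python 2 fallback and imports `urljoin` and `urlparse` directly from `urllib.parse`. As a rough illustration of how a crawler might use the two together, here is a minimal sketch; `normalize_link` is a hypothetical helper written for this note and is not part of noisy.py.

```python
from urllib.parse import urljoin, urlparse


def normalize_link(base_url, link):
    """Resolve a possibly relative link against the page it was found on.

    Hypothetical helper for illustration only; not taken from noisy.py.
    Returns None for links that are not plain http(s) URLs with a host.
    """
    # urljoin handles relative paths such as "../c" or "/about".
    absolute = urljoin(base_url, link)
    parts = urlparse(absolute)
    # Skip schemes a crawler cannot fetch over HTTP (mailto:, javascript:, ...).
    if parts.scheme not in ("http", "https") or not parts.netloc:
        return None
    return absolute
```

With Python 3 as the only target, a single import line is enough and no `try`/`except ImportError` is needed.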
1 change: 1 addition & 0 deletions requirements.txt
@@ -1 +1,2 @@
 requests
+fake-useragent
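
With `fake-useragent` added as a dependency, the request header is built from `ua.random` instead of a configured list. The sketch below shows one way that pattern could be written defensively. It assumes `UserAgent(min_percentage=...)` and the `.random` property behave as they are used in this commit; `FALLBACK_USER_AGENTS` and `build_headers` are hypothetical names introduced here, and the fallback is an addition that the commit itself does not contain.

```python
import random

# Hypothetical static fallback, used only when fake-useragent is missing
# or fails to produce a value. Not part of the commit.
FALLBACK_USER_AGENTS = [
    "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0",
]

try:
    from fake_useragent import UserAgent

    # The commit passes a float here ("Fix fake useragent requiring float").
    _ua = UserAgent(min_percentage=15.1)
except Exception:
    # Library not installed, or its constructor rejected the arguments.
    _ua = None


def build_headers():
    """Return request headers with a randomly chosen user agent string."""
    user_agent = None
    if _ua is not None:
        try:
            user_agent = _ua.random
        except Exception:
            user_agent = None
    if not user_agent:
        user_agent = random.choice(FALLBACK_USER_AGENTS)
    return {"user-agent": user_agent}
```

The resulting dictionary can be passed as `headers=` to `requests.get`, as `_request` does in the diff. Creating the `UserAgent` object once at module level, as the commit does, avoids rebuilding it for every request.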
