Unable to Run AadhaarSearchEngine #3

Open
mzfr opened this issue Mar 20, 2018 · 0 comments

mzfr commented Mar 20, 2018

I installed Scrapy with sudo apt install python-scrapy, and running this command gives an error:
scrapy crawl AadhaarSpider -a keyword="aadhaar meri pehachan filetype:pdf" -a se=google -a pages=10

2018-03-20 22:10:50 [boto] ERROR: Caught exception reading instance data
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/boto/utils.py", line 210, in retry_url
    r = opener.open(req, timeout=timeout)
  File "/usr/lib/python2.7/urllib2.py", line 429, in open
    response = self._open(req, data)
  File "/usr/lib/python2.7/urllib2.py", line 447, in _open
    '_open', req)
  File "/usr/lib/python2.7/urllib2.py", line 407, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.7/urllib2.py", line 1228, in http_open
    return self.do_open(httplib.HTTPConnection, req)
  File "/usr/lib/python2.7/urllib2.py", line 1198, in do_open
    raise URLError(err)
URLError: <urlopen error timed out>
2018-03-20 22:10:50 [boto] ERROR: Unable to read instance data, giving up
2018-03-20 22:10:50 [scrapy] INFO: Enabled downloader middlewares: HttpAuthMiddleware, DownloadTimeoutMiddleware, UserAgentMiddleware, RetryMiddleware, DefaultHeadersMiddleware, MetaRefreshMiddleware, HttpCompressionMiddleware, RedirectMiddleware, CookiesMiddleware, ChunkedTransferMiddleware, DownloaderStats
2018-03-20 22:10:50 [scrapy] INFO: Enabled spider middlewares: HttpErrorMiddleware, OffsiteMiddleware, RefererMiddleware, UrlLengthMiddleware, DepthMiddleware
2018-03-20 22:10:50 [scrapy] INFO: Enabled item pipelines: SespiderPipeline
2018-03-20 22:10:50 [scrapy] INFO: Spider opened
2018-03-20 22:10:50 [scrapy] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2018-03-20 22:10:50 [scrapy] DEBUG: Telnet console listening on 127.0.0.1:6023
2018-03-20 22:10:50 [scrapy] ERROR: Error downloading <GET https://www.google.com/search?q=aadhaar%20meri%20pehachan%20filetype:pdf&start=0>
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/scrapy/utils/defer.py", line 45, in mustbe_deferred
    result = f(*args, **kw)
  File "/usr/lib/python2.7/dist-packages/scrapy/core/downloader/handlers/__init__.py", line 41, in download_request
    return handler(request, spider)
  File "/usr/lib/python2.7/dist-packages/scrapy/core/downloader/handlers/http11.py", line 44, in download_request
    return agent.download_request(request)
  File "/usr/lib/python2.7/dist-packages/scrapy/core/downloader/handlers/http11.py", line 211, in download_request
    d = agent.request(method, url, headers, bodyproducer)
  File "/usr/lib/python2.7/dist-packages/twisted/web/client.py", line 1655, in request
    parsedURI.originForm)
  File "/usr/lib/python2.7/dist-packages/twisted/web/client.py", line 1432, in _requestWithEndpoint
    d = self._pool.getConnection(key, endpoint)
  File "/usr/lib/python2.7/dist-packages/twisted/web/client.py", line 1318, in getConnection
    return self._newConnection(key, endpoint)
  File "/usr/lib/python2.7/dist-packages/twisted/web/client.py", line 1330, in _newConnection
    return endpoint.connect(factory)
  File "/usr/lib/python2.7/dist-packages/twisted/internet/endpoints.py", line 2092, in connect
    self._wrapperFactory(protocolFactory)
  File "/usr/lib/python2.7/dist-packages/twisted/internet/endpoints.py", line 903, in connect
    EndpointReceiver, self._hostText, portNumber=self._port
  File "/usr/lib/python2.7/dist-packages/twisted/internet/_resolver.py", line 189, in resolveHostName
    onAddress = self._simpleResolver.getHostByName(hostName)
  File "/usr/lib/python2.7/dist-packages/scrapy/resolver.py", line 21, in getHostByName
    d = super(CachingThreadedResolver, self).getHostByName(name, timeout)
  File "/usr/lib/python2.7/dist-packages/twisted/internet/base.py", line 276, in getHostByName
    timeoutDelay = sum(timeout)
TypeError: 'float' object is not iterable
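
The last frame of the traceback fails at `sum(timeout)`, which only works when `timeout` is an iterable of numbers. The sketch below reproduces just that failure in isolation. It is not Scrapy's or Twisted's code: `total_timeout` is a hypothetical stand-in for the failing line, and the tuple and float values are illustrative assumptions.

```python
def total_timeout(timeout):
    # Hypothetical stand-in for the `timeoutDelay = sum(timeout)` line
    # shown in the final traceback frame.
    return sum(timeout)


def demo():
    results = {}
    # An iterable of per-attempt timeouts sums without error (values are illustrative).
    results["tuple"] = total_timeout((1, 3, 11, 45))
    # A single float is not iterable, so sum() raises TypeError.
    try:
        total_timeout(60.0)
    except TypeError as exc:
        results["float"] = str(exc)
    return results


if __name__ == "__main__":
    print(demo())
```

If this is what is happening here, the caller in `scrapy/resolver.py` is passing a single float where the installed Twisted version expects a sequence, which would point to a version mismatch between the two packages rather than a problem with the spider itself.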

mzfr changed the title from "Unable to Run this" to "Unable to Run AadhaarSearchEngine" on Mar 20, 2018