Merge branch 'release-1.1.1' into 1.1
Digenis committed Nov 3, 2016
2 parents 65ad369 + 4ef975b commit d537b98
Showing 13 changed files with 192 additions and 40 deletions.
2 changes: 2 additions & 0 deletions .travis.yml
@@ -5,6 +5,8 @@ matrix:
  exclude:
    - env: TRAVISBUG="#1027"
  include:
    - python: "2.6"
      env: BUILDENV=lucid
    - python: "2.7"
      env: BUILDENV=precise
    - python: "2.7"
3 changes: 2 additions & 1 deletion .travis/requirements-lucid.txt
@@ -1,2 +1,3 @@
Scrapy
Scrapy<0.19 --install-option=--single-version-externally-managed
w3lib<1.9
twisted==10.0.0
5 changes: 4 additions & 1 deletion docs/api.rst
@@ -38,7 +38,10 @@ Schedule a spider run (also known as a job), returning the job id.
* ``project`` (string, required) - the project name
* ``spider`` (string, required) - the spider name
* ``setting`` (string, optional) - a scrapy setting to use when running the spider
* any other parameter is passed as spider argument
* The spider queue also uses the optional ``priority`` argument (default 0.0)
  which adjusts the priority of the scheduled spider run
  in its project's queue. A greater number means higher priority.
* Any other parameter is passed as an argument to the spider.

Example request::

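As a rough illustration (a sketch, not the documentation's own example; it assumes a scrapyd instance on ``localhost:6800``, a project named ``myproject``, and the third-party ``requests`` package), scheduling a run with the new ``priority`` argument from Python might look like::

    import requests  # third-party HTTP client, assumed available

    # 'project' and 'spider' are required; 'priority' defaults to 0.0 and a
    # greater number means the run is popped from the project's queue sooner.
    # Any other field ('arg1' here) is passed to the spider as an argument.
    response = requests.post('http://localhost:6800/schedule.json', data={
        'project': 'myproject',
        'spider': 'somespider',
        'priority': 1,
        'setting': 'DOWNLOAD_DELAY=2',
        'arg1': 'val1',
    })
    print(response.json())  # e.g. {"status": "ok", "jobid": "..."}
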
4 changes: 2 additions & 2 deletions docs/conf.py
@@ -48,9 +48,9 @@
# built documents.
#
# The short X.Y version.
version = '0.18'
version = '1.1'
# The full version, including alpha/beta/rc tags.
release = '0.18'
release = '1.1'

# The language for content autogenerated by Sphinx. Refer to documentation
# for a list of supported languages.
26 changes: 19 additions & 7 deletions docs/config.rst
@@ -75,11 +75,11 @@ items_dir

.. versionadded:: 0.15

The directory where the Scrapy items will be stored. If you want to disable
storing feeds of scraped items (perhaps, because you use a database or other
storage) set this option empty, like this::

    items_dir =
The directory where the Scrapy items will be stored.
This option is disabled by default
because you are expected to use a database or a feed exporter.
Setting it to a non-empty value stores feeds of scraped items
in the specified directory by overriding the Scrapy setting ``FEED_URI``.
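For illustration (the exact layout is inferred from this commit's tests, so treat it as a sketch): with a non-empty value such as ``items_dir = items``, scrapyd points ``FEED_URI`` at a per-project, per-spider file of the form ``items/<project>/<spider>/<job id>.jl`` under its working directory.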

.. _jobs_to_keep:

@@ -88,11 +88,23 @@ jobs_to_keep

.. versionadded:: 0.15

The number of finished jobs to keep per spider. Defaults to ``5``. This
includes logs and items.
The number of finished jobs to keep per spider.
Defaults to ``5``.
This refers to logs and items.

This setting was named ``logs_to_keep`` in previous versions.

.. _finished_to_keep:

finished_to_keep
----------------

.. versionadded:: 0.14

The number of finished processes to keep in the launcher.
Defaults to ``100``.
This only affects what is shown on the website's ``/jobs`` endpoint and in the relevant JSON webservices.
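For example (illustrative values rather than the defaults, following the ``[scrapyd]`` section format of ``default_scrapyd.conf`` shown further down)::

    [scrapyd]
    jobs_to_keep = 10
    finished_to_keep = 200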

poll_interval
-------------

100 changes: 98 additions & 2 deletions docs/news.rst
@@ -3,7 +3,103 @@
Release notes
=============

1.0
---
1.1.1 - 2016-11-03
------------------

Removed
~~~~~~~

- Disabled the bdist_wheel command in setup.py so that dynamic requirements
  can still be defined in spite of the pip 7 wheel caching bug.

Fixed
~~~~~

- FEED_URI was always overridden by scrapyd
- Specified maximum versions for requirements that became incompatible.
- Marked package as zip-unsafe because twistd requires a plain ``txapp.py``
- Don't install zipped scrapy in py26 CI env
because its setup doesn't include the ``scrapy/VERSION`` file.

Added
~~~~~

- Enabled some missing tests for the sqlite queues.
- Enabled CI tests for python2.6 because it was supported by the 1.1 release.
- Documented missing config options and included them in ``default_scrapyd.conf``.
- Noted the spider queue's ``priority`` argument in the scheduler documentation.


1.1.0
-----
*Release date: 2015-06-29*

Features & Enhancements
~~~~~~~~~~~~~~~~~~~~~~~

- Outsource scrapyd-deploy command to scrapyd-client (c1358dc, c9d66ca..191353e)
  **If you rely on this command, install the scrapyd-client package from pypi.**
- Look for a ``~/.scrapyd.conf`` file in the user's home (1fce99b)
- Add the node name to identify the process that is working on the job (fac3a5c..4aebe1c)
- Allow remote items store (e261591..35a21db)
- Debian sysvinit script (a54193a, ff457a9)
- Add 'start_time' field in webservice for running jobs (6712af9, acd460b)
- Check if a spider exists before scheduling it (with sqlite cache) (#8, 288afef..a185ff2)

Bugfixes
~~~~~~~~

- Fix scrapyd-deploy --list-projects (942a1b2) → moved to scrapyd-client
- Sanitize version names when creating egg paths (8023720)
- Copy txweb/JsonResource from scrapy which no longer provides it (99ea920)
- Use w3lib to generate correct feed uris (9a88ea5)
- Fix GIT versioning for projects without annotated tags (e91dcf4 #34)
- Correct HTML tags in the scrapyd website monitor (da5664f, 26089cd)
- Fix FEED_URI path on windows (4f0060a)

Setup script and Tests/CI
~~~~~~~~~~~~~~~~~~~~~~~~~

- Restore integration test script (66de25d)
- Changed scripts to be installed using entry_points (b670f5e)
- Renovate scrapy upstart job (d130770)
- Travis.yml: remove deprecated ``--use-mirrors`` pip option (b3cdc61)
- Mark package as zip unsafe because twistd requires a plain ``txapp.py`` (f27c054)
- Removed python 2.6/lucid env from travis (5277755)
- Made Scrapyd package name lowercase (1adfc31)

Documentation
~~~~~~~~~~~~~

- Spiders should allow for arbitrary keyword arguments (696154)
- Various typos (51f1d69, 0a4a77a)
- Fix release notes: 1.0 is already released (6c8dcfb)
- Point website module's links to readthedocs (215c700)
- Remove reference to 'scrapy server' command (f599b60)

1.0.2
-----
*Release date: 2016-03-28*

Setup script
~~~~~~~~~~~~

- Specified maximum versions for requirements that became incompatible.
- Marked package as zip-unsafe because twistd requires a plain ``txapp.py``

Documentation
~~~~~~~~~~~~~

- Updated broken links, references to wrong versions, and references to scrapy
- Warn that scrapyd 1.0 is falling out of support

1.0.1
-----
*Release date: 2013-09-02*
*Trivial update*

1.0.0
-----
*Release date: 2013-09-02*

First standalone release (it was previously shipped with Scrapy until Scrapy 0.16).
2 changes: 1 addition & 1 deletion scrapyd/VERSION
@@ -1 +1 @@
1.1.0
1.1.1
2 changes: 1 addition & 1 deletion scrapyd/default_scrapyd.conf
@@ -1,7 +1,7 @@
[scrapyd]
eggs_dir = eggs
logs_dir = logs
items_dir = items
items_dir =
jobs_to_keep = 5
dbs_dir = dbs
max_proc = 0
2 changes: 1 addition & 1 deletion scrapyd/environ.py
@@ -13,7 +13,7 @@ class Environment(object):
    def __init__(self, config, initenv=os.environ):
        self.dbs_dir = config.get('dbs_dir', 'dbs')
        self.logs_dir = config.get('logs_dir', 'logs')
        self.items_dir = config.get('items_dir', 'items')
        self.items_dir = config.get('items_dir', '')
        self.jobs_to_keep = config.getint('jobs_to_keep', 5)
        if config.cp.has_section('settings'):
            self.settings = dict(config.cp.items('settings'))
37 changes: 19 additions & 18 deletions scrapyd/sqlite.py
@@ -1,5 +1,8 @@
import sqlite3
import cPickle
try:
    import cPickle as pickle
except ImportError:
    import pickle
import json
from UserDict import DictMixin

@@ -12,7 +15,7 @@ def __init__(self, database=None, table="dict"):
        self.table = table
        # about check_same_thread: http://twistedmatrix.com/trac/ticket/4040
        self.conn = sqlite3.connect(self.database, check_same_thread=False)
        q = "create table if not exists %s (key text primary key, value blob)" \
        q = "create table if not exists %s (key blob primary key, value blob)" \
            % table
        self.conn.execute(q)

@@ -60,26 +63,26 @@ def items(self):
    def encode(self, obj):
        return obj

    def decode(self, text):
        return text
    def decode(self, obj):
        return obj


class PickleSqliteDict(SqliteDict):

    def encode(self, obj):
        return buffer(cPickle.dumps(obj, protocol=2))
        return sqlite3.Binary(pickle.dumps(obj, protocol=2))

    def decode(self, text):
        return cPickle.loads(str(text))
    def decode(self, obj):
        return pickle.loads(bytes(obj))


class JsonSqliteDict(SqliteDict):

    def encode(self, obj):
        return json.dumps(obj)
        return sqlite3.Binary(json.dumps(obj))

    def decode(self, text):
        return json.loads(text)
    def decode(self, obj):
        return json.loads(bytes(obj))



@@ -155,18 +158,16 @@ def decode(self, text):
class PickleSqlitePriorityQueue(SqlitePriorityQueue):

    def encode(self, obj):
        return buffer(cPickle.dumps(obj, protocol=2))
        return sqlite3.Binary(pickle.dumps(obj, protocol=2))

    def decode(self, text):
        return cPickle.loads(str(text))
    def decode(self, obj):
        return pickle.loads(bytes(obj))


class JsonSqlitePriorityQueue(SqlitePriorityQueue):

    def encode(self, obj):
        return json.dumps(obj)

    def decode(self, text):
        return json.loads(text)

        return sqlite3.Binary(json.dumps(obj))

    def decode(self, obj):
        return json.loads(bytes(obj))
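
The pattern above replaces ``buffer``/``cPickle`` with ``sqlite3.Binary``/``pickle`` so values round-trip as BLOBs. A minimal standalone sketch of the idea (not part of the commit):

    import sqlite3
    try:
        import cPickle as pickle   # Python 2
    except ImportError:
        import pickle              # Python 3

    def encode(obj):
        # wrap the pickled bytes so sqlite stores them as a BLOB
        return sqlite3.Binary(pickle.dumps(obj, protocol=2))

    def decode(blob):
        # sqlite hands back a buffer/bytes object; unpickle from the raw bytes
        return pickle.loads(bytes(blob))

    conn = sqlite3.connect(':memory:')
    conn.execute("create table if not exists dict (key blob primary key, value blob)")
    conn.execute("insert into dict values (?, ?)",
                 (encode('spider1'), encode({'arg1': u'\N{SNOWMAN}'})))
    key, value = conn.execute("select key, value from dict").fetchone()
    assert decode(value) == {'arg1': u'\N{SNOWMAN}'}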
5 changes: 3 additions & 2 deletions scrapyd/tests/test_environ.py
@@ -30,8 +30,9 @@ def test_get_environment_with_eggfile(self):
        self.assertEqual(env['SCRAPY_SPIDER'], 'myspider')
        self.assertEqual(env['SCRAPY_JOB'], 'ID')
        self.assert_(env['SCRAPY_LOG_FILE'].endswith(os.path.join('mybot', 'myspider', 'ID.log')))
        self.assert_(env['SCRAPY_FEED_URI'].startswith('file://{}'.format(os.getcwd())))
        self.assert_(env['SCRAPY_FEED_URI'].endswith(os.path.join('mybot', 'myspider', 'ID.jl')))
        if env.get('SCRAPY_FEED_URI'):  # not compulsory
            self.assert_(env['SCRAPY_FEED_URI'].startswith('file://{}'.format(os.getcwd())))
            self.assert_(env['SCRAPY_FEED_URI'].endswith(os.path.join('mybot', 'myspider', 'ID.jl')))
        self.failIf('SCRAPY_SETTINGS_MODULE' in env)

    def test_get_environment_with_no_items_dir(self):
20 changes: 17 additions & 3 deletions scrapyd/tests/test_spiderqueue.py
@@ -4,7 +4,7 @@
from zope.interface.verify import verifyObject

from scrapyd.interfaces import ISpiderQueue
from scrapyd.spiderqueue import SqliteSpiderQueue
from scrapyd import spiderqueue

class SpiderQueueTest(unittest.TestCase):
"""This test case can be used easily for testing other SpiderQueue's by
@@ -15,12 +15,16 @@ class SpiderQueueTest(unittest.TestCase):
    def setUp(self):
        self.q = self._get_queue()
        self.name = 'spider1'
        self.args = {'arg1': 'val1', 'arg2': 2}
        self.args = {
            'arg1': 'val1',
            'arg2': 2,
            'arg3': u'\N{SNOWMAN}',
        }
        self.msg = self.args.copy()
        self.msg['name'] = self.name

    def _get_queue(self):
        return SqliteSpiderQueue(':memory:')
        return spiderqueue.SqliteSpiderQueue(':memory:')

    def test_interface(self):
        verifyObject(ISpiderQueue, self.q)
@@ -64,3 +68,13 @@ def test_clear(self):

        c = yield maybeDeferred(self.q.count)
        self.assertEqual(c, 0)


class JsonSpiderQueueTest(unittest.TestCase):
    def _get_queue(self):
        return spiderqueue.JsonSqliteSpiderQueue(':memory:')


class PickleSpiderQueueTest(unittest.TestCase):
    def _get_queue(self):
        return spiderqueue.PickleSqliteSpiderQueue(':memory:')
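
For orientation, a rough sketch of how these queues behave (assuming the ``ISpiderQueue`` API exercised by the test case above, i.e. ``add(name, **spider_args)`` plus a synchronous ``pop()``):

    from scrapyd import spiderqueue

    q = spiderqueue.JsonSqliteSpiderQueue(':memory:')
    # 'priority' is consumed by the queue itself; every other keyword is kept
    # as a spider argument and returned alongside 'name' when the job is popped.
    q.add('spider1', arg1='val1')
    q.add('spider2', priority=1.0, arg2=u'\N{SNOWMAN}')
    msg = q.pop()
    assert msg['name'] == 'spider2'  # higher priority pops first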
24 changes: 23 additions & 1 deletion setup.py
@@ -1,4 +1,5 @@
from os.path import join, dirname
import sys

with open(join(dirname(__file__), 'scrapyd/VERSION')) as f:
    version = f.read().strip()
@@ -38,6 +39,27 @@
except ImportError:
    from distutils.core import setup
else:
    setup_args['install_requires'] = ['Twisted>=8.0', 'Scrapy>=0.17']
    if sys.version_info < (2, 7):
        setup_args['install_requires'] = ['Twisted>=8.0,<=15.1', 'Scrapy>=0.17,<0.19', 'w3lib<1.9']
    else:
        setup_args['install_requires'] = ['Twisted>=8.0', 'Scrapy>=0.17']


try:
    import wheel
except ImportError:
    pass
else:
    from wheel.bdist_wheel import bdist_wheel as _bdist_wheel
    class bdist_wheel(_bdist_wheel):
        description = (
            'Building wheels is disabled for this unsupported version of scrapyd'
            ' because of dynamic dependencies.'
            ' If you need to build a wheel, try a newer version of scrapyd.'
        )
        def run(self):
            raise SystemExit(self.description)
    setup_args.setdefault('cmdclass', {}).update(bdist_wheel=bdist_wheel)


setup(**setup_args)
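
With the ``cmdclass`` override above, running ``python setup.py bdist_wheel`` exits immediately with that description instead of building a wheel: a wheel would freeze one of the two requirement sets into static metadata, and the pip 7 wheel cache could then install it on an interpreter the pins were not computed for.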
