Skip to content

Commit

Permalink
Kill leftovers of pre-MVCC read conflicts
Browse files Browse the repository at this point in the history
In the early days, before MVCC was introduced, ZODB used to raise
ReadConflictError on access to object that was simultaneously changed by
another client in concurrent transaction. However, as
doc/articles/ZODB-overview.rst says

	Since Zope 2.8 ZODB has implemented **Multi Version Concurrency Control**.
	This means no more ReadConflictErrors, each transaction is guaranteed to be
	able to load any object as it was when the transaction begun.

So today the only way to get a ReadConflictError should be

  1) at commit time for an object that was requested to stay unchanged
     via checkCurrentSerialInTransaction, and

  2) at plain access time, if a pack running simultaneously to current
     transaction, removes object revision that we try to load.

The second point is a bit unfortunate, since when load discovers that
object was deleted or not yet created, it is logically more clean to
raise POSKeyError. However due to backward compatibility we still want
to raise ReadConflictError in this case - please see comments added to
MVCCAdapter for details.

Anyway, let's remove leftovers of handling regular read-conflicts from
pre-MVCC era:

Adjust docstring of ReadConflictError to explicitly describe that this
error can only happen at commit time for objects requested to be
current, or at plain access if pack is running simultaneously under
connection foot.

There were also leftover code, comment and test bits in Connection,
interfaces, testmvcc and testZODB, that are corrected/removed
correspondingly. testZODB actually had ReadConflictTests that was
completely deactivated: commit b0f992f ("Removed the mvcc option..."; 2007)
moved read-conflict-on-access related tests out of ZODBTests, but did not
activated moved parts at all, because as that commit says when MVCC is
always on unconditionally, there is no on-access conflicts:

    Removed the mvcc option.  Everybody wants mvcc and removing us lets us
    simplify the code a little. (We'll be able to simplify more when we
    stop supporting versions.)

Today, if I try to manually activate that ReadConflictTests via

    @@ -637,6 +637,7 @@ def __init__(self, poisonedjar):
     def test_suite():
         return unittest.TestSuite((
             unittest.makeSuite(ZODBTests, 'check'),
    +        unittest.makeSuite(ReadConflictTests, 'check'),
             ))

     if __name__ == "__main__":

it fails in dumb way showing that this tests were unmaintained for ages:

    Error in test checkReadConflict (ZODB.tests.testZODB.ReadConflictTests)
    Traceback (most recent call last):
      File "/usr/lib/python2.7/unittest/case.py", line 320, in run
        self.setUp()
      File "/home/kirr/src/wendelin/z/ZODB/src/ZODB/tests/testZODB.py", line 451, in setUp
        ZODB.tests.utils.TestCase.setUp(self)
    AttributeError: 'module' object has no attribute 'utils'

Since today ZODB always uses MVCC and there is no way to get
ReadConflictError on concurrent plain read/write access, those tests
should be also gone together with old pre-MVCC way of handling
concurrency.

/cc @jimfulton
  • Loading branch information
navytux committed Jul 23, 2020
1 parent 22df1fd commit 0d499a3
Show file tree
Hide file tree
Showing 7 changed files with 44 additions and 181 deletions.
18 changes: 2 additions & 16 deletions src/ZODB/Connection.py
Original file line number Diff line number Diff line change
Expand Up @@ -148,10 +148,8 @@ def before_instance(before):
self._pre_cache = {}

# List of all objects (not oids) registered as modified by the
# persistence machinery, or by add(), or whose access caused a
# ReadConflictError (just to be able to clean them up from the
# cache on abort with the other modified objects). All objects
# of this list are either in _cache or in _added.
# persistence machinery.
# All objects of this list are either in _cache or in _added.
self._registered_objects = [] # [object]

# ids and serials of objects for which readCurrent was called
Expand Down Expand Up @@ -182,15 +180,6 @@ def before_instance(before):
# in the cache on abort and in other connections on finish.
self._modified = [] # [oid]

# We intend to prevent committing a transaction in which
# ReadConflictError occurs. _conflicts is the set of oids that
# experienced ReadConflictError. Any time we raise ReadConflictError,
# the oid should be added to this set, and we should be sure that the
# object is registered. Because it's registered, Connection.commit()
# will raise ReadConflictError again (because the oid is in
# _conflicts).
self._conflicts = {}

# To support importFile(), implemented in the ExportImport base
# class, we need to run _importDuringCommit() from our commit()
# method. If _import is not None, it is a two-tuple of arguments
Expand Down Expand Up @@ -460,7 +449,6 @@ def _abort(self):

def _tpc_cleanup(self):
"""Performs cleanup operations to support tpc_finish and tpc_abort."""
self._conflicts.clear()
self._needs_to_join = True
self._registered_objects = []
self._creating.clear()
Expand Down Expand Up @@ -528,8 +516,6 @@ def _commit(self, transaction):
for obj in self._registered_objects:
oid = obj._p_oid
assert oid
if oid in self._conflicts:
raise ReadConflictError(object=obj)

if obj._p_jar is not self:
raise InvalidObjectReference(obj, obj._p_jar)
Expand Down
13 changes: 10 additions & 3 deletions src/ZODB/POSException.py
Original file line number Diff line number Diff line change
Expand Up @@ -144,10 +144,17 @@ def get_serials(self):
return self.serials

class ReadConflictError(ConflictError):
"""Conflict detected when object was loaded.
"""Conflict detected when object was requested to stay unchanged.
An attempt was made to read an object that has changed in another
transaction (eg. another thread or process).
An object was requested to stay not modified via
checkCurrentSerialInTransaction, and at commit time was found to be
changed by another transaction (eg. another thread or process).
Note: for backward compatibility ReadConflictError is also raised on
plain object access if
- object is found to be removed, and
- there is possibility that database pack was running simultaneously.
"""
def __init__(self, message=None, object=None, serials=None, **kw):
if message is None:
Expand Down
6 changes: 0 additions & 6 deletions src/ZODB/interfaces.py
Original file line number Diff line number Diff line change
Expand Up @@ -78,12 +78,6 @@ class IConnection(Interface):
Two options affect consistency. By default, the mvcc and synch
options are enabled by default.
If you pass mvcc=False to db.open(), the Connection will never read
non-current revisions of an object. Instead it will raise a
ReadConflictError to indicate that the current revision is
unavailable because it was written after the current transaction
began.
The logic for handling modifications assumes that the thread that
opened a Connection (called db.open()) is the thread that will use
the Connection. If this is not true, you should pass synch=False
Expand Down
29 changes: 27 additions & 2 deletions src/ZODB/mvccadapter.py
Original file line number Diff line number Diff line change
@@ -1,3 +1,4 @@
# -*- coding: utf-8 -*-
"""Adapt IStorage objects to IMVCCStorage
This is a largely internal implementation of ZODB, especially DB and
Expand All @@ -9,7 +10,7 @@
import zope.interface

from . import interfaces, serialize, POSException
from .utils import p64, u64, Lock
from .utils import p64, u64, Lock, oid_repr, tid_repr

class Base(object):

Expand Down Expand Up @@ -152,7 +153,31 @@ def load(self, oid):
assert self._start is not None
r = self._storage.loadBefore(oid, self._start)
if r is None:
raise POSException.ReadConflictError(repr(oid))
# object was deleted or not-yet-created.
# raise ReadConflictError - not - POSKeyError due to backward
# compatibility: a pack(t+δ) could be running simultaneously to our
# transaction that observes database as of t state. Such pack,
# because it packs the storage from a "future-to-us" point of view,
# can remove object revisions that we can try to load, for example:
#
# txn1 <-- t
# obj.revA
#
# txn2 <-- t+δ
# obj.revB
#
# for such case we want user transaction to be restarted - not
# failed - by raising ConflictError subclass.
#
# XXX we don't detect for pack to be actually running - just assume
# the worst. It would be good if storage could provide information
# whether pack is/was actually running and its details, take that
# into account, and raise ReadConflictError only in the presence of
# database being simultaneously updated from back of its log.
raise POSException.ReadConflictError(
"load %s @%s: object deleted, likely by simultaneous pack" %
(oid_repr(oid), tid_repr(p64(u64(self._start)-1))))

return r[:2]

def prefetch(self, oids):
Expand Down
151 changes: 0 additions & 151 deletions src/ZODB/tests/testZODB.py
Original file line number Diff line number Diff line change
Expand Up @@ -13,7 +13,6 @@
##############################################################################
from persistent import Persistent
from persistent.mapping import PersistentMapping
from ZODB.POSException import ReadConflictError
from ZODB.POSException import TransactionFailedError

import doctest
Expand Down Expand Up @@ -445,156 +444,6 @@ def checkMultipleUndoInOneTransaction(self):
transaction.abort()
conn.close()

class ReadConflictTests(ZODB.tests.util.TestCase):

def setUp(self):
ZODB.tests.utils.TestCase.setUp(self)
self._storage = ZODB.MappingStorage.MappingStorage()

def readConflict(self, shouldFail=True):
# Two transactions run concurrently. Each reads some object,
# then one commits and the other tries to read an object
# modified by the first. This read should fail with a conflict
# error because the object state read is not necessarily
# consistent with the objects read earlier in the transaction.

tm1 = transaction.TransactionManager()
conn = self._db.open(transaction_manager=tm1)
r1 = conn.root()
r1["p"] = self.obj
self.obj.child1 = P()
tm1.get().commit()

# start a new transaction with a new connection
tm2 = transaction.TransactionManager()
cn2 = self._db.open(transaction_manager=tm2)
# start a new transaction with the other connection
r2 = cn2.root()

self.assertEqual(r1._p_serial, r2._p_serial)

self.obj.child2 = P()
tm1.get().commit()

# resume the transaction using cn2
obj = r2["p"]
# An attempt to access obj should fail, because r2 was read
# earlier in the transaction and obj was modified by the othe
# transaction.
if shouldFail:
self.assertRaises(ReadConflictError, lambda: obj.child1)
# And since ReadConflictError was raised, attempting to commit
# the transaction should re-raise it. checkNotIndependent()
# failed this part of the test for a long time.
self.assertRaises(ReadConflictError, tm2.get().commit)

# And since that commit failed, trying to commit again should
# fail again.
self.assertRaises(TransactionFailedError, tm2.get().commit)
# And again.
self.assertRaises(TransactionFailedError, tm2.get().commit)
# Etc.
self.assertRaises(TransactionFailedError, tm2.get().commit)

else:
# make sure that accessing the object succeeds
obj.child1
tm2.get().abort()


def checkReadConflict(self):
self.obj = P()
self.readConflict()

def checkReadConflictIgnored(self):
# Test that an application that catches a read conflict and
# continues can not commit the transaction later.
root = self._db.open().root()
root["real_data"] = real_data = PersistentMapping()
root["index"] = index = PersistentMapping()

real_data["a"] = PersistentMapping({"indexed_value": 0})
real_data["b"] = PersistentMapping({"indexed_value": 1})
index[1] = PersistentMapping({"b": 1})
index[0] = PersistentMapping({"a": 1})
transaction.commit()

# load some objects from one connection
tm = transaction.TransactionManager()
cn2 = self._db.open(transaction_manager=tm)
r2 = cn2.root()
real_data2 = r2["real_data"]
index2 = r2["index"]

real_data["b"]["indexed_value"] = 0
del index[1]["b"]
index[0]["b"] = 1
transaction.commit()

del real_data2["a"]
try:
del index2[0]["a"]
except ReadConflictError:
# This is the crux of the text. Ignore the error.
pass
else:
self.fail("No conflict occurred")

# real_data2 still ready to commit
self.assertTrue(real_data2._p_changed)

# index2 values not ready to commit
self.assertTrue(not index2._p_changed)
self.assertTrue(not index2[0]._p_changed)
self.assertTrue(not index2[1]._p_changed)

self.assertRaises(ReadConflictError, tm.get().commit)
self.assertRaises(TransactionFailedError, tm.get().commit)
tm.get().abort()

def checkReadConflictErrorClearedDuringAbort(self):
# When a transaction is aborted, the "memory" of which
# objects were the cause of a ReadConflictError during
# that transaction should be cleared.
root = self._db.open().root()
data = PersistentMapping({'d': 1})
root["data"] = data
transaction.commit()

# Provoke a ReadConflictError.
tm2 = transaction.TransactionManager()
cn2 = self._db.open(transaction_manager=tm2)
r2 = cn2.root()
data2 = r2["data"]

data['d'] = 2
transaction.commit()

try:
data2['d'] = 3
except ReadConflictError:
pass
else:
self.fail("No conflict occurred")

# Explicitly abort cn2's transaction.
tm2.get().abort()

# cn2 should retain no memory of the read conflict after an abort(),
# but 3.2.3 had a bug wherein it did.
data_conflicts = data._p_jar._conflicts
data2_conflicts = data2._p_jar._conflicts
self.assertFalse(data_conflicts)
self.assertFalse(data2_conflicts) # this used to fail

# And because of that, we still couldn't commit a change to data2['d']
# in the new transaction.
cn2.sync() # process the invalidation for data2['d']
data2['d'] = 3
tm2.get().commit() # 3.2.3 used to raise ReadConflictError

cn2.close()

class PoisonedError(Exception):
pass

Expand Down
5 changes: 2 additions & 3 deletions src/ZODB/tests/testmvcc.py
Original file line number Diff line number Diff line change
Expand Up @@ -25,9 +25,8 @@
always see a consistent view of the database while it is executing.
If transaction A is running, has already read an object O1, and a
different transaction B modifies object O2, then transaction A can no
longer read the current revision of O2. It must either read the
version of O2 that is consistent with O1 or raise a ReadConflictError.
When MVCC is in use, A will do the former.
longer read the current revision of O2. It must read the
version of O2 that is consistent with O1.
This note includes doctests that explain how MVCC is implemented (and
test that the implementation is correct). The tests use a
Expand Down
3 changes: 3 additions & 0 deletions src/ZODB/transact.py
Original file line number Diff line number Diff line change
Expand Up @@ -43,6 +43,9 @@ def g(*args, **kwargs):
try:
r = f(*args, **kwargs)
except ReadConflictError as msg:
# the only way ReadConflictError can happen here is due to
# simultaneous pack removing objects revision that f could try
# to load.
transaction.abort()
if not n:
raise
Expand Down

0 comments on commit 0d499a3

Please sign in to comment.