Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FileStorage: fix rare data corruption when using restore after multiple undos #395

Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions CHANGES.rst
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,9 @@

- Fix sorting issue in ``scripts/space.py``.

- FileStorage: fix a rare data corruption when using restore after multiple undos.
For details see `#395 <https://github.com/zopefoundation/ZODB/pull/395>`_.


5.8.1 (2023-07-18)
==================
Expand Down
31 changes: 18 additions & 13 deletions src/ZODB/FileStorage/FileStorage.py
Original file line number Diff line number Diff line change
Expand Up @@ -672,28 +672,33 @@ def _data_find(self, tpos, oid, data):
tid, tl, status, ul, dl, el = unpack(TRANS_HDR, h)
status = as_text(status)
self._file.read(ul + dl + el)
tend = tpos + tl + 8
tend = tpos + tl
d-maurer marked this conversation as resolved.
Show resolved Hide resolved
pos = self._file.tell()
backpointer = None
while pos < tend:
h = self._read_data_header(pos)
if h.oid == oid:
# Make sure this looks like the right data record
if h.plen == 0:
# This is also a backpointer. Gotta trust it.
# This is also a backpointer, remember it and keep
# looking at other records from this transaction,
# if there is multiple undo records, we use the
# oldest.
backpointer = pos
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we sure that the "backpointer" found at the highest position is the right one? I.e. why have older undo records a higher position?

With your change, the code no longer matches the introductory source comment. The source comment should get updated.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello everyone. I had the same concern and actually this issue is not only about undo, but also how FileStorage handles and/or should handle a situation when one transaction contains two data records with the same oid. What loadBefore should return for such case - data from the first data record or from the second? Current behaviour is to return the data from the second data record - the one that was last to be .store()' ed. Proof:

create database with multiple data entries in the same transaction for oid=1

---- 8< ---- (1.txn)

user "author"
description "T1"
extension ""
obj 0000000000000001 3 null:00
AAA
obj 0000000000000001 3 null:00
BBB
$ python -m zodbtools.zodb commit 1.fs 0000000000000000 <1.txn 
03f63ee48fa55c00
$ fs1 dump 1.fs
Trans #00000 tid=03f63ee48fa55c00 time=2024-02-01 10:44:33.667016 offset=35
    status=' ' user='author' description='T1'
  data #00000 oid=0000000000000001 size=3 class=?.?
  data #00001 oid=0000000000000001 size=3 class=?.?

try to read that oid=1

---- 8< ---- (catobj.py)

from __future__ import print_function

from ZODB.FileStorage import FileStorage
from ZODB.utils import p64, u64

stor = FileStorage('1.fs', read_only=True)
head = stor.lastTransaction()
head_ = p64(u64(head)+1)
oid = p64(1)
x = stor.loadBefore(oid, head_)
print(x)
$ python catobj.py 
('BBB', '\x03\xf6>\xe4\x8f\xa5\\\x00', None)

I'm not sure 100% but probably this is kind of expected behaviour.


So taking this into account for consistency it should be also the latest undo record that is taking the effect.

And for generality we should also consider cases like:

  • data record 1: undo to T2
  • data record 2: undo to T1
  • data record 3: store data3

because here it should be data3 to be returned by loadBefore, not data from T1.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(undo record can be, and probably should be, perceived as regular store with instruction to copy data from particular previous data record)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the above point of view I suggest to amend the patch with the following:

--- a/src/ZODB/FileStorage/FileStorage.py
+++ b/src/ZODB/FileStorage/FileStorage.py
@@ -663,6 +663,10 @@ def _data_find(self, tpos, oid, data):
         # Else oid's data record contains the data, and the file offset of
         # oid's data record is returned.  This data record should contain
         # a pickle identical to the 'data' argument.
+        # When looking for oid's data record we scan all data records in
+        # the transaction till the end looking for the latest record with oid,
+        # even if there are multiple such records. This matches load behaviour
+        # which uses the data record created by last store.
 
         # Unclear:  If the length of the stored data doesn't match len(data),
         # an exception is raised.  If the lengths match but the data isn't
@@ -674,31 +678,36 @@ def _data_find(self, tpos, oid, data):
         self._file.read(ul + dl + el)
         tend = tpos + tl
         pos = self._file.tell()
-        backpointer = None
+        data_hdr = None
+        data_pos = 0
+        # scan all data records in this transaction looking for the latest
+        # record with our oid
         while pos < tend:
             h = self._read_data_header(pos)
             if h.oid == oid:
-                # Make sure this looks like the right data record
-                if h.plen == 0:
-                    # This is also a backpointer, remember it and keep
-                    # looking at other records from this transaction,
-                    # if there is multiple undo records, we use the
-                    # oldest.
-                    backpointer = pos
-                else:
-                    if h.plen != len(data):
-                        # The expected data doesn't match what's in the
-                        # backpointer.  Something is wrong.
-                        logger.error("Mismatch between data and"
-                                     " backpointer at %d", pos)
-                        return 0
-                    _data = self._file.read(h.plen)
-                    if data != _data:
-                        return 0
-                    return pos
+                data_hdr = h
+                data_pos = pos
             pos += h.recordlen()
             self._file.seek(pos)
-        return backpointer or 0
+        if data_hdr is None:
+            return 0
+
+        # return position of found data record, but make sure this looks like
+        # the right data record to return.
+        if data_hdr.plen == 0:
+            # This is also a backpointer,  Gotta trust it.
+            return data_pos
+        else:
+            if data_hdr.plen != len(data):
+                # The expected data doesn't match what's in the
+                # backpointer.  Something is wrong.
+                logger.error("Mismatch between data and"
+                             " backpointer at %d", pos)
+                return 0
+            _data = self._file.read(data_hdr.plen)
+            if data != _data:
+                return 0
+            return data_pos
 
     def restore(self, oid, serial, data, version, prev_txn, transaction):
         # A lot like store() but without all the consistency checks.  This

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, I like this patch better, it's more logical. I have integrated this in the pull request.

When I wrote that patch, the reason to take the latest data record was to match the behavior of _txn_undo_write, it iterates while pos < tend: and updates the index.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, Jérome.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And I see about your original observations.

else:
if h.plen != len(data):
# The expected data doesn't match what's in the
# backpointer. Something is wrong.
logger.error("Mismatch between data and"
" backpointer at %d", pos)
return 0
_data = self._file.read(h.plen)
if data != _data:
return 0
return pos
if h.plen != len(data):
# The expected data doesn't match what's in the
# backpointer. Something is wrong.
logger.error("Mismatch between data and"
" backpointer at %d", pos)
return 0
_data = self._file.read(h.plen)
if data != _data:
return 0
return pos
pos += h.recordlen()
self._file.seek(pos)
return 0
return backpointer or 0

def restore(self, oid, serial, data, version, prev_txn, transaction):
# A lot like store() but without all the consistency checks. This
Expand Down
40 changes: 40 additions & 0 deletions src/ZODB/tests/RecoveryStorage.py
Original file line number Diff line number Diff line change
Expand Up @@ -199,3 +199,43 @@ def testRestoreWithMultipleObjectsInUndoRedo(self):
# part of the test.
self._dst.copyTransactionsFrom(self._storage)
self.compare(self._storage, self._dst)

def testRestoreWithMultipleUndoRedo(self):
db = DB(self._storage)
c = db.open()
r = c.root()

# TO
r["obj"] = MinPO(0)
transaction.commit()
c.close()

# T1: do 1
r = db.open().root()
r["obj"].value = 1
transaction.commit()

# T2: do 2
r = db.open().root()
r["obj"].value = 2
transaction.commit()

# T3: undo T1 and T2
db.undoMultiple([u["id"] for u in db.undoLog(0, 2)])
transaction.commit()
# obj will be a backpointer to T0

# T4: do 3
r = db.open().root()
r["obj"].value = 3
transaction.commit()

# T4: undo
self._undo(self._storage.undoInfo()[0]['id'])
# obj will be a backpointer to T3, which is a backpointer to T0

r = db.open().root()
self.assertEqual(r["obj"].value, 0)

self._dst.copyTransactionsFrom(self._storage)
self.compare(self._storage, self._dst)
4 changes: 4 additions & 0 deletions src/ZODB/tests/testFileStorage.py
Original file line number Diff line number Diff line change
Expand Up @@ -456,6 +456,10 @@ def testRestoreAcrossPack(self):
# Skip this check as it calls restore directly.
pass

def testRestoreWithMultipleUndoRedo(self):
# This case is only supported with restore, not with store "emulation"
pass
perrinjerome marked this conversation as resolved.
Show resolved Hide resolved


class AnalyzeDotPyTest(StorageTestBase.StorageTestBase):

Expand Down
Loading