-
Notifications
You must be signed in to change notification settings - Fork 92
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
FileStorage: fix rare data corruption when using restore after multiple undos #395
FileStorage: fix rare data corruption when using restore after multiple undos #395
Conversation
The case of a history like this: - T0 initialize an object in state 0 - T1 modifies object to state 1 - T2 modifies object to state 2 - T3 undo T2 and T1, bringing back to state 0 - T4 modified object to state 3 - T5 undo T4, bringing back object to state 0 was not correct after `restore`: the state is 1 instead of the expected 0. This is because T3 contains two data records: - an undo of T2, with a backpointer to the data of state 1 - an undo of T1, with a backpointer to the data of state 0 When restoring T5 (the undo of T4), the transaction is made of one data record, with a backpointer that is copied from the backpointer from T3, but this uses backpointer from the first record for this object, which is incorrect, in such a case where there is more than one backpointer for the same oid, we need to iterate in all data record to find the oldest version.
e68c5f3
to
44d36b0
Compare
For reference, the file format is described in ZODB/src/ZODB/FileStorage/format.py Lines 15 to 84 in 6f88781
|
src/ZODB/FileStorage/FileStorage.py
Outdated
# looking at other records from this transaction, | ||
# if there is multiple undo records, we use the | ||
# oldest. | ||
backpointer = pos |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we sure that the "backpointer" found at the highest position is the right one? I.e. why have older undo records a higher position?
With your change, the code no longer matches the introductory source comment. The source comment should get updated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello everyone. I had the same concern and actually this issue is not only about undo, but also how FileStorage handles and/or should handle a situation when one transaction contains two data records with the same oid. What loadBefore should return for such case - data from the first data record or from the second? Current behaviour is to return the data from the second data record - the one that was last to be .store()
' ed. Proof:
create database with multiple data entries in the same transaction for oid=1
---- 8< ---- (1.txn
)
user "author"
description "T1"
extension ""
obj 0000000000000001 3 null:00
AAA
obj 0000000000000001 3 null:00
BBB
$ python -m zodbtools.zodb commit 1.fs 0000000000000000 <1.txn
03f63ee48fa55c00
$ fs1 dump 1.fs
Trans #00000 tid=03f63ee48fa55c00 time=2024-02-01 10:44:33.667016 offset=35
status=' ' user='author' description='T1'
data #00000 oid=0000000000000001 size=3 class=?.?
data #00001 oid=0000000000000001 size=3 class=?.?
try to read that oid=1
---- 8< ---- (catobj.py
)
from __future__ import print_function
from ZODB.FileStorage import FileStorage
from ZODB.utils import p64, u64
stor = FileStorage('1.fs', read_only=True)
head = stor.lastTransaction()
head_ = p64(u64(head)+1)
oid = p64(1)
x = stor.loadBefore(oid, head_)
print(x)
$ python catobj.py
('BBB', '\x03\xf6>\xe4\x8f\xa5\\\x00', None)
I'm not sure 100% but probably this is kind of expected behaviour.
So taking this into account for consistency it should be also the latest undo record that is taking the effect.
And for generality we should also consider cases like:
- data record 1: undo to T2
- data record 2: undo to T1
- data record 3: store data3
because here it should be data3 to be returned by loadBefore, not data from T1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(undo record can be, and probably should be, perceived as regular store with instruction to copy data from particular previous data record)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the above point of view I suggest to amend the patch with the following:
--- a/src/ZODB/FileStorage/FileStorage.py
+++ b/src/ZODB/FileStorage/FileStorage.py
@@ -663,6 +663,10 @@ def _data_find(self, tpos, oid, data):
# Else oid's data record contains the data, and the file offset of
# oid's data record is returned. This data record should contain
# a pickle identical to the 'data' argument.
+ # When looking for oid's data record we scan all data records in
+ # the transaction till the end looking for the latest record with oid,
+ # even if there are multiple such records. This matches load behaviour
+ # which uses the data record created by last store.
# Unclear: If the length of the stored data doesn't match len(data),
# an exception is raised. If the lengths match but the data isn't
@@ -674,31 +678,36 @@ def _data_find(self, tpos, oid, data):
self._file.read(ul + dl + el)
tend = tpos + tl
pos = self._file.tell()
- backpointer = None
+ data_hdr = None
+ data_pos = 0
+ # scan all data records in this transaction looking for the latest
+ # record with our oid
while pos < tend:
h = self._read_data_header(pos)
if h.oid == oid:
- # Make sure this looks like the right data record
- if h.plen == 0:
- # This is also a backpointer, remember it and keep
- # looking at other records from this transaction,
- # if there is multiple undo records, we use the
- # oldest.
- backpointer = pos
- else:
- if h.plen != len(data):
- # The expected data doesn't match what's in the
- # backpointer. Something is wrong.
- logger.error("Mismatch between data and"
- " backpointer at %d", pos)
- return 0
- _data = self._file.read(h.plen)
- if data != _data:
- return 0
- return pos
+ data_hdr = h
+ data_pos = pos
pos += h.recordlen()
self._file.seek(pos)
- return backpointer or 0
+ if data_hdr is None:
+ return 0
+
+ # return position of found data record, but make sure this looks like
+ # the right data record to return.
+ if data_hdr.plen == 0:
+ # This is also a backpointer, Gotta trust it.
+ return data_pos
+ else:
+ if data_hdr.plen != len(data):
+ # The expected data doesn't match what's in the
+ # backpointer. Something is wrong.
+ logger.error("Mismatch between data and"
+ " backpointer at %d", pos)
+ return 0
+ _data = self._file.read(data_hdr.plen)
+ if data != _data:
+ return 0
+ return data_pos
def restore(self, oid, serial, data, version, prev_txn, transaction):
# A lot like store() but without all the consistency checks. This
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, I like this patch better, it's more logical. I have integrated this in the pull request.
When I wrote that patch, the reason to take the latest data record was to match the behavior of _txn_undo_write
, it iterates while pos < tend:
and updates the index.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, Jérome.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And I see about your original observations.
I very much prefer PRs from the main repository (rather than from forks): for difficult code, it is important to see the modification in a larger context and not only the diff. To support this, I have cloned the main repository and do not like to add additional clones (or "remote"s) for reviews. |
Hello Dieter. I understand your rationale, and on my side try to follow your advice after you expressed the same reasoning to me. But just for the reference, if you add the following into
you will be able to check out code of all pull requests without adding any remote or something. For example with hereby PR it would be |
Co-authored-by: Kirill Smelkov <kirr@nexedi.com>
Jérome Perrin wrote at 2024-2-1 05:40 -0800:
...
I can make another pull request to change all these to use skip, unless someone here thinks it's bad ?
`unittest` used to have problems when `dubug` was called for
a skipped method. With `zope.testrunner` `debug` is called with
the "-D" option.
When `zope.testrunner`'s `-D` option is used, it is expected
that it runs until the first failure and there enters the debugger
(allowing the tester to analyse the failure context).
Due to the above problem, the first encountered `skip` was considered
a failure and entered the debugger.
I filed a corresponding Python bug report but I am not sure
whether a fix is in all supported Python versions.
Should the problem still exist, I would not massively use `skip`.
|
Ah yes, nowadays zope.testrunner's |
…txn_prev if restore is called with an existing txn_prev that does not contain the oid, the "8-byte redundant transaction length -8" and the beginning of the next transaction was read as a data record. The condition to stop was incorrect as discussed in [1], in practice this error would only happens when restore is passed a bad txn_prev so this should not be a problem in normal conditions because we are supposed to find the data record for that oid. [1] zopefoundation#395 (comment)
…txn_prev if restore is called with an existing txn_prev that does not contain the oid, the "8-byte redundant transaction length -8" and the beginning of the next transaction was read as a data record. The condition to stop was incorrect as discussed in [1], in practice this error would only happens when restore is passed a bad txn_prev so this should not be a problem in normal conditions because we are supposed to find the data record for that oid. [1] zopefoundation#395 (comment)
Jérome Perrin wrote at 2024-2-14 07:23 -0800:
...
@perrinjerome commented on this pull request.
> @@ -672,28 +672,33 @@ def _data_find(self, tpos, oid, data):
tid, tl, status, ul, dl, el = unpack(TRANS_HDR, h)
status = as_text(status)
self._file.read(ul + dl + el)
- tend = tpos + tl + 8
+ tend = tpos + tl
I tried to write a test calling restore with a txn that did not contain the oid, it triggers the problem, but it does not really represent a normal scenario so I'm not 100% sure what to do.
I think, we do not need a test to fix the `+ 8` bug.
I know that I asked about the "why remove `+ 8`" but our discussion
convinced me that it is the right thing:
According to "format.py", `tl` contains "transaction record length - 8"
and a transaction record ends with a sequence of "data record"s
followed by "8-byte redundant transaction length -8".
This means that the last data record ends at `tpos + tl`,
not `tpos + tl + 8`.
|
@perrinjerome, thanks for adding the test and robustifying handling of records reader. @d-maurer, thanks also for confirming you are ok with the fix and we all now share common understanding. I think that tests are added not because the code is wrong, but to preserve it to stay working correctly. So in general whenever we fix something, we try to add a test for the fixed problem so that we know for sure it won't return after follow-up changes. Sometimes it might be hard to do a test and the difficulties outweight usefulness of having tests. But in this case we already have tests done as part of #397 so unless there is a strong reason not to take them, let's just add both the fix and the test to make sure that quality is improved and stays ok. Anyway, besides the tests, am I right that we all three agree that this patch is ok now? Or did I missed something and there is still something else to be improved? Kirill |
Kirill Smelkov wrote at 2024-2-16 08:25 -0800:
...
@d-maurer, thanks also for confirming you are ok with the fix and we all now share common understanding.
I think that tests are added not because the code is wrong, but to preserve it to stay working correctly. So in general whenever we fix something, we try to add a test for the fixed problem so that we know for sure it won't return after follow-up changes.
In this particular case, it is very unlikely that someone will readd
the `+ 8` again and change the loop logic otherwise.
But, I am not against a test if someone has already spent time on it.
|
Dieter, thanks for feedback. What you say is reasonable. |
I feel that we agreed on merging this after merging #397 , I "requested review" from github interface, if you agree please approve the review and I'll apply the changes. |
The case of a history like this:
was not correct after
restore
: the state is 1 instead of the expected 0.This is because T3 contains two data records:
When restoring T5 (the undo of T4), the transaction is made of one data record, with a backpointer that is copied from the backpointer from T3, but this uses backpointer from the first record for this object, which is incorrect, in such a case where there is more than one backpointer for the same oid, we need to iterate in all data record to find the oldest version.