Skip to content

Commit

Permalink
fix[tool]: update VarAccess pickle implementation (#4270)
Browse files Browse the repository at this point in the history
fix a bug where unpickling `annotated_vyper_module` would lead to
a crash:

```
AttributeError: 'VarAccess' object has no attribute 'variable'
```

this is a blocker for tooling, for instance, titanoboa
relies on pickle/unpickle to cache `CompilerData` objects:
https://github.com/vyperlang/titanoboa/blob/86df8936654db2068641/boa/util/disk_cache.py#L65-L66

the underlying issue is that `pickle.loads()` calls `obj.__hash__()`
for objects that are keys in a hashed data structure - namely, dicts,
sets and frozensets. this causes a crash when there is a cycle in
the object graph, because the object is not fully instantiated at the
time that `__hash__()` is called. this is a cpython issue, reported at
python/cpython#124937.

@serhiy-storchaka suggested the approach taken in this PR, which breaks
the loop before pickling:
python/cpython#124937 (comment)

note that the implementation of `__reduce__()` in this PR is safe, since
there is no cycle in the hash function itself, since the recursion
breaks in `VarInfo.__hash__()`. in other words, there is no possibility
of `VarAccess.__hash__()` changing mid-way through reconstructing the
object.
  • Loading branch information
charles-cooper authored Oct 7, 2024
1 parent c02d2d8 commit d079562
Show file tree
Hide file tree
Showing 2 changed files with 31 additions and 2 deletions.
19 changes: 19 additions & 0 deletions tests/integration/test_pickle_ast.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,19 @@
import copy
import pickle

from vyper.compiler.phases import CompilerData


def test_pickle_ast():
code = """
@external
def foo():
self.bar()
y: uint256 = 5
x: uint256 = 5
def bar():
pass
"""
f = CompilerData(code)
copy.deepcopy(f.annotated_vyper_module)
pickle.loads(pickle.dumps(f.annotated_vyper_module))
14 changes: 12 additions & 2 deletions vyper/semantics/analysis/base.py
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
import enum
from dataclasses import dataclass
from dataclasses import dataclass, fields
from functools import cached_property
from typing import TYPE_CHECKING, Any, ClassVar, Dict, Optional, Union

Expand Down Expand Up @@ -234,6 +234,17 @@ class VarAccess:
# A sentinel indicating a subscript access
SUBSCRIPT_ACCESS: ClassVar[Any] = object()

# custom __reduce__ and _produce implementations to work around
# a pickle bug.
# see https://github.com/python/cpython/issues/124937#issuecomment-2392227290
def __reduce__(self):
dict_obj = {f.name: getattr(self, f.name) for f in fields(self)}
return self.__class__._produce, (dict_obj,)

@classmethod
def _produce(cls, data):
return cls(**data)

@cached_property
def attrs(self):
ret = []
Expand Down Expand Up @@ -286,7 +297,6 @@ def __post_init__(self):
for attr in should_match:
if getattr(self.var_info, attr) != getattr(self, attr):
raise CompilerPanic(f"Bad analysis: non-matching {attr}: {self}")

self._writes: OrderedSet[VarAccess] = OrderedSet()
self._reads: OrderedSet[VarAccess] = OrderedSet()

Expand Down

0 comments on commit d079562

Please sign in to comment.