
Objects being overwritten from memory during tessellation #94

Open
tdeyster opened this issue Feb 21, 2023 · 4 comments

Comments

@tdeyster

Problem:

When using the jupyter_cadquery function jcq.base._tessellate_group(group_to_tessellate), objects are sometimes pulled from memory (from a previous call) rather than tessellated from group_to_tessellate.

I've tried numerous workarounds, including:

- making sure I delete objects after use
- importing and deleting the jcq functions inside each loop
- passing cq objects through jcq.cad_objects.to_assembly() first (rather than using a jcq group)
- running in various environments (cmd, Spyder, Jupyter)

My best guess is that somehow variables aren't being correctly disposed of by jcq or Python. Though, if this is a system-specific issue, I'd appreciate any suggestions.

Code to reproduce the issue in a Jupyter notebook

This code generates rows of boxes. If it finds a box pulled from memory rather than the current object, it breaks and displays the result. Note that all boxes should be in a line; the box out of line was pulled from a previous loop iteration.

(screenshot: a row of boxes, with one out-of-line box from a previous iteration)

# imports
import sys

import cadquery as cq
import jupyter_cadquery as jcq
import numpy as np
from cad_viewer_widget import show as viewer_show

# versions
print("python: %s" % sys.version)
print("cadquery: %s (funny version number but was installed recently, should be 2.?)" % cq.__version__)
print("jupyter_cadquery: %s" % jcq.jcq_version)

# generate boxes (may need to run this block a few times to see the error)
ni, nj = 300, 30  # loop through ni rows, each with nj boxes
for i in range(ni):
    boxes = []  # part list
    for j in range(nj):
        x = np.random.rand()  # box will be a random size in (0, 1)
        box = cq.Workplane("XY").transformed(offset=cq.Vector(2 * i, j, 0)).box(x, x, x)
        boxes.append(jcq.Part(box, name="box_%s" % j))
        del box

    result = jcq.PartGroup(boxes, name="boxes")
    del boxes
    # tessellate boxes
    shapes, states = jcq.base._tessellate_group(result)
    del result
    # get the combined bounding box
    bb = jcq.base._combined_bb(shapes)
    shapes["bb"] = bb.to_dict()
    # check that the bounding box stays within the current row
    if bb.xmin < (2 * i - .5) or bb.xmax > (2 * i + .5):
        print(i - 1)
        viewer_show(shapes, states)  # show shapes and states
        break

    del shapes, states

System info:

PC: Windows 10 64bit
python: 3.9.15 | packaged by conda-forge | (main, Nov 22 2022, 08:41:22) [MSC v.1929 64 bit (AMD64)]
cadquery: 0.0.0 (funny version number, but it was installed last month, should be 2.?)
jupyter_cadquery: 3.4.0

@bernhard-42
Owner

Thanks for reporting. I was able to reproduce it and will look into it.

@bernhard-42
Owner

Thank you for the great analysis and reproducing code! Really appreciate it.

You are right, it is the cache function. The cache key uses shape.HashCode(MAX_HASH_KEY) to create a hash for each object. Unfortunately, this hash function has collisions (which I should have expected, given that MAX_HASH_KEY == 2147483647).

So now I know the reason; however, I still need to find a solution that is fast, otherwise caching doesn't make sense.
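To illustrate the failure mode, here is a hypothetical, pure-Python sketch (not jupyter-cadquery's actual code): when the cache is keyed only on a coarse hash, two distinct shapes whose hashes collide map to the same entry, and the second lookup serves the first shape's mesh.

```python
# Hypothetical sketch of a cache keyed only on a coarse hash value.
# tessellate_cached and the integer "hash" are stand-ins, not the real API.
cache = {}

def tessellate_cached(shape_hash, mesh):
    # store the mesh only on the first sighting of this hash
    if shape_hash not in cache:
        cache[shape_hash] = mesh
    return cache[shape_hash]

first = tessellate_cached(7, "mesh of box A")
second = tessellate_cached(7, "mesh of box B")  # hash collision!
print(second)  # prints "mesh of box A": the stale mesh is served from the cache
```

With a hash range of about 2^31, a rough birthday-bound estimate puts a 50% collision chance at only a few tens of thousands of objects, so collisions are entirely plausible in a loop like the one above.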

@tdeyster
Author

Thanks for the quick response! Hopefully you find a suitable solution. It's a great package you've built.

@bernhard-42
Owner

To document an idea:

Reading the pybind11 docs (the technology that brings C++ OCCT to Python), a "pybind11-based object" is a Python object that holds a reference to the C++ object. Hence, obj.HashCode(max_int) hashes the C++ memory pointer, while id(obj) is the memory address of the Python wrapper object: two different values in two different memory spaces.

id is only guaranteed to be unique for objects with overlapping life spans. We could still get the same id if an old object was deleted (or set to None), garbage collection kicked in, and a new Python wrapper was created at the same memory address for a different C++ OCCT object.
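Both halves of that guarantee are easy to see in plain CPython; this is stdlib-only and independent of OCCT:

```python
# Two objects that are alive at the same time always have distinct ids.
x, y = object(), object()
assert id(x) != id(y)

# But once an object is gone, its address can be reused. In CPython the
# following often (not always; it is an implementation detail) prints True:
class Wrapper:
    pass

a = Wrapper()
old_id = id(a)
del a           # the only reference is dropped; CPython frees the object
b = Wrapper()   # the new wrapper may land at the freed address
print(id(b) == old_id)
```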

But if I combine both the Python wrapper object address id(obj) and the C++ memory pointer hash obj.HashCode(MAX_HASH_KEY) in my cache key, then I don't think a collision can be created too easily: the newly created C++ object would need to sit at the same C++ memory address as the one before, and by chance the new Python wrapper would also need to sit at the same Python memory address as the wrapper that referenced the older C++ OCCT object.

The change would mean changing

key = (
    tuple(s.HashCode(MAX_HASH_KEY) for s in shape),
    deviation,
    angular_tolerance,
    compute_edges,
    compute_faces,
)

to

key = (
    tuple((s.HashCode(MAX_HASH_KEY), id(s)) for s in shape),
    ...

Of course, a collision could still happen, but it should be very unlikely. In some initial tests with tens of thousands of objects I wasn't able to produce a collision. And I can still add a switch to turn off caching if an issue arises.
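A minimal, pure-Python sketch of the combined key; the Shape class and coarse_hash below are stand-ins for OCCT's TopoDS_Shape and HashCode(MAX_HASH_KEY), not the real API:

```python
# Sketch of a tessellation cache keyed on (coarse_hash, id(obj)) pairs.
# coarse_hash deliberately collides for every shape, to show that id()
# still keeps the cache entries apart while both objects are alive.
cache = {}

class Shape:
    def __init__(self, size):
        self.size = size

    def coarse_hash(self):
        return 42  # stand-in for HashCode(MAX_HASH_KEY); always collides here

def tessellate(shape, deviation=0.1):
    key = (
        (shape.coarse_hash(), id(shape)),
        deviation,
    )
    if key not in cache:
        cache[key] = "mesh(%s)" % shape.size  # pretend this is expensive
    return cache[key]

a, b = Shape(1.0), Shape(2.0)
print(tessellate(a))  # mesh(1.0)
print(tessellate(b))  # mesh(2.0), despite identical coarse hashes
```

Because both shapes are alive at the same time, their ids are guaranteed to differ, so the combined key cannot collide here; the residual risk is only the delete-then-reallocate scenario described above.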
