feat: Cythonize get_cell_list_contents #1995
Requirements so far for migrating to Cythonized space.py:
|
Maybe we should just skip cythonizing the existing space.py, and go straight to cythonizing #1994. |
This is the error message due to custom decorator:
|
Thanks for this effort. It might indeed be interesting how #1994 develops. |
I think that's really not too bad at all, and I think the changes in this file are only minimal. Hope you get this to work. |
@quaquel was enthusiastic about this effort and wanted to port some parts to Cython once the whole build pipeline was up and running. He also suggested that it might be an interesting NumFocus / GSoC idea. |
For me, the most important first step is moving to Meson and ensuring we can build the code for the various platforms, etc. Once that machinery is in place, we can start experimenting. In my view, the long-term goal is to move all performance-critical parts to Cython. However, we are simultaneously also rebuilding various parts (schedulers, spaces). And we are still debating other parts like data collection. So it is a bit of a shame to port stuff that will be replaced anyway. But it is also potentially annoying to have new experimental features only in Cython because these parts might change, and it is less accessible for anyone running into an issue and willing to investigate. In my experience, it is not that much work to turn good Python code into Cython code. So writing experimental parts in Cython is fine for me as a developer. I am less sure about the user implications. We might also just test it with, for example, the new grid stuff and see what we learn. Another option that I have seen is that libraries offer two versions of essentially the same code: a slow Python version and a fast C/Cython/C++ version. The drawback is that it induces code duplication. So, no clear conclusion for me. Just some choices that have to be made. |
I stick to using Hatch here because:
I suppose it is safer, and a good exercise, to Cythonize the stable space.py code, given that it is unlikely to change except for bugfixes. Even with the old space.py, we can still experiment with using a C++ STL map to avoid the GIL, and hence be faster than a Cython dict.
I wouldn't want code duplication unless necessary. It's going to be hard to maintain. |
just a quick clarifying question: do you plan to move to meson in the longer run? |
Yes, personally, when I have the time to properly port the packaging system. Though we probably need comprehensive reasoning for why it has to be Meson, to reach consensus among the @projectmesa/maintainers. My reason for picking Hatch at the time: #1882 (comment). |
I don't know enough about build systems to make an informed decision about this. |
The story behind SciPy's move to Meson: https://labs.quansight.org/blog/2021/07/moving-scipy-to-meson. The RFC itself. |
From the previous link:
|
That was a nice and helpful read. Like @EwoutH, build systems are not my area of expertise. But the fact that some of the big libraries in the scientific computing Python ecosystem use Meson (e.g., SciPy, NumPy, pandas) speaks in favor of using it, even though our needs are nowhere near as complicated as those of these libraries. |
Will read up later, but agreed! Best practices are best practices for a reason. |
New commits are about improving the typing annotation because Cython is stricter, and removing the
This PR can only be merged if support for 3.9 is dropped, and so it depends on #2003. |
Forgot to say that tests have passed for Python >= 3.10. |
I want to dig into this in the near future. In the meantime, do you have any idea about the performance improvements? |
There shouldn't be much of a performance improvement of

cpdef long[:, :] convert_tuples_to_mview(self, object cell_list):
    # Copy a Python list of (x, y) tuples into a typed 2D memoryview.
    cdef long x, y
    cdef long[:, :] tuples_mview

    length = len(cell_list)
    tuples_mview = np.ndarray((length, 2), long)
    for i in range(length):
        pos = cell_list[i]
        x, y = pos[0], pos[1]
        tuples_mview[i, 0], tuples_mview[i, 1] = x, y
    return tuples_mview

cpdef object[:] get_cell_mview_contents(self, long[:, :] tuples_mview):
    # Gather the agents at the given positions, skipping empty cells.
    cdef long default_val
    cdef int count
    cdef object[:] agent_mview
    cdef long x, y

    length = len(tuples_mview)
    agent_mview = np.ndarray(length, object)
    count = 0
    default_val = self._default_val_ids()
    for i in range(length):
        x, y = tuples_mview[i, 0], tuples_mview[i, 1]
        id_agent = self._ids_grid[x, y]
        if id_agent == default_val:
            continue
        agent_mview[count] = self._agents_grid[x, y]
        count += 1
    # Only the first `count` slots were filled.
    return agent_mview[:count]

cpdef get_cell_list_contents(self, object cell_list):
    tuples_mview = self.convert_tuples_to_mview(cell_list)
    agent_mview = self.get_cell_mview_contents(tuples_mview)
    return self.convert_agent_mview_to_list(agent_mview)

which is not in this PR. Here I limit the scope to preparing a Cython setup. |
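For readers who want to check the logic of the snippet above without a Cython build, here is a rough pure-Python sketch of the same lookup: skip cells whose id equals the default value, and collect the agents from the rest. The `TinyGrid` class, its `place` method, and the `DEFAULT_ID` sentinel are hypothetical stand-ins invented for this sketch; only `_ids_grid`, `_agents_grid`, and `get_cell_list_contents` are names taken from the snippet, and nested lists replace the NumPy arrays.

```python
DEFAULT_ID = -1  # assumed sentinel for "no agent here"; stands in for self._default_val_ids()


class TinyGrid:
    """Hypothetical stand-in for a grid with one agent per cell."""

    def __init__(self, width, height):
        # Parallel grids, as in the snippet: one for agent ids, one for agent objects.
        self._ids_grid = [[DEFAULT_ID] * height for _ in range(width)]
        self._agents_grid = [[None] * height for _ in range(width)]

    def place(self, agent_id, agent, pos):
        x, y = pos
        self._ids_grid[x][y] = agent_id
        self._agents_grid[x][y] = agent

    def get_cell_list_contents(self, cell_list):
        """Return the agents at the given (x, y) positions, skipping empty cells."""
        agents = []
        for x, y in cell_list:
            if self._ids_grid[x][y] == DEFAULT_ID:
                continue  # empty cell, nothing to collect
            agents.append(self._agents_grid[x][y])
        return agents


grid = TinyGrid(3, 3)
grid.place(7, "agent_a", (0, 1))
grid.place(9, "agent_b", (2, 2))
contents = grid.get_cell_list_contents([(0, 0), (0, 1), (2, 2)])
```

A reference like this could also serve as the expected result when testing a compiled version, since both should return the same agents in the same order.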
The benchmark failed because it didn't run a |
Local benchmark result
|
@EwoutH you could probably help debug this faster. I tried at the commit "benchmark: Install main & PR branch of Mesa explicitly", but not sure if it is the right approach. |
The commit probably works, but it only takes effect after it is merged due to the way pull_request_target works. So could you do this in a separate PR? Once the other PR is merged you can rerun the benchmark on this one |
Performance benchmarks:
|
Closing in favor of using Rust instead. |