Refactor RasterLayer to be based on PropertyLayer in Mesa #201

Open
wang-boyu opened this issue Apr 24, 2024 · 16 comments
Labels
enhancement Release notes label help wanted

Comments

@wang-boyu
Member

wang-boyu commented Apr 24, 2024

What's the problem this feature will solve?

The current implementation of RasterLayer includes copies of some methods from mesa.space.Grid (see #120), making it difficult to maintain.

Describe the solution you'd like

A more related class in Mesa is its recent PropertyLayer (projectmesa/mesa#1898). RasterLayer could inherit from PropertyLayer with added geospatial capabilities.
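
A minimal sketch of what "a property layer with added geospatial capabilities" could look like. `SimplePropertyLayer` and `SketchRasterLayer` are hypothetical stand-ins written for illustration, not Mesa's or Mesa-Geo's actual classes, and the `(height, width)` array layout is an assumption:

```python
import numpy as np


class SimplePropertyLayer:
    """Stand-in for a property layer: a named 2D array of values."""

    def __init__(self, name, width, height, default_value=0.0):
        self.name = name
        self.width = width
        self.height = height
        # Assumption: rows first, i.e. shape (height, width).
        self.data = np.full((height, width), default_value, dtype=float)


class SketchRasterLayer(SimplePropertyLayer):
    """Adds a CRS and a bounding box on top of the plain array storage."""

    def __init__(self, name, width, height, crs, total_bounds, default_value=0.0):
        super().__init__(name, width, height, default_value)
        self.crs = crs
        self.total_bounds = total_bounds  # (min_x, min_y, max_x, max_y)

    @property
    def resolution(self):
        """Cell size in CRS units as (x_size, y_size)."""
        min_x, min_y, max_x, max_y = self.total_bounds
        return ((max_x - min_x) / self.width, (max_y - min_y) / self.height)


layer = SketchRasterLayer("elevation", 3, 4, "EPSG:4326", (0.0, 0.0, 3.0, 2.0))
print(layer.data.shape, layer.resolution)
```

The grid logic lives only in the base class, so the subclass carries nothing but the geospatial metadata.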

Additional context

Might also be beneficial to have part of GeoSpace inherit from _PropertyGrid to manage multiple raster layers.

Some related past issues:

@EwoutH
Member

EwoutH commented Apr 24, 2024

Agree on this! I don't have time to work on it, but I can provide guidance and review.

@wang-boyu
Member Author

Also need something similar to projectmesa/mesa#2336 for visualization.

@SongshGeo

Hi, I'm happy to help with this issue.

I've browsed the PropertyLayer class in Mesa's latest codebase. I also found the Cell class under the experimental.cell_space module, which seems to be an experimental feature.

I don't know whether it will be integrated with the Cell class in Mesa-Geo in the future. I believe Cell should be regarded as a stable basic unit of discrete space with more functionality. In other words, the experimental Cell in the current Mesa codebase could be the direction to take. What do you think?

@wang-boyu
Member Author

Thanks for your interest in this!

I have to say that I'm not familiar with the experimental Cell and PropertyLayer features in Mesa. Perhaps it's better to have some help and advice from @EwoutH instead.

One thing to note, though, is that Mesa-Geo's current Cell in RasterLayer may not be exactly the same as the experimental Cell in Mesa, even though they (unfortunately) share the same name. Here I used Cell as an agent-like entity (i.e., with a step() function) in a RasterLayer, whereas the experimental Cell in Mesa looks like some kind of agent container. One thing they have in common is that both seem to be able to have properties.
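
To make the distinction concrete, here is a toy sketch of the two meanings of "Cell". Both classes are hypothetical simplifications written for this comparison; neither is the real Mesa-Geo or Mesa implementation:

```python
class AgentLikeCell:
    """Agent-like cell: it owns its state and updates itself in step()."""

    def __init__(self, pos, grass=0):
        self.pos = pos
        self.grass = grass

    def step(self):
        # One call per cell per model step.
        self.grass += 1


class ContainerCell:
    """Container-like cell: it holds agents and a dictionary of properties."""

    def __init__(self, coordinate):
        self.coordinate = coordinate
        self.agents = []
        self.properties = {}

    def add_agent(self, agent):
        self.agents.append(agent)


agent_like = AgentLikeCell((0, 0))
agent_like.step()

container = ContainerCell((0, 0))
container.properties["grass"] = 1
container.add_agent("sheep-1")
```

Both end up carrying a `grass` value, which is the overlap mentioned above, but only the first one is scheduled and stepped like an agent.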

Hope I'm not confusing you : )

@EwoutH
Member

EwoutH commented Nov 5, 2024

@SongshGeo very cool!

We will stabilize the whole cell space, likely in Mesa 3.1. In the long term, we will remove the current spaces. For now, you can regard Cell as being stable, we likely won’t be changing it significantly.

In my (personal) vision, in the long term Mesa-geo will be integrated more tightly in Mesa. See projectmesa/mesa#2330. So I will support any changes to move in that direction.

You should probably also read this issue:

Let me know if you have any questions! If you want to chat directly, you can join our Matrix chat room.

@SongshGeo

Thanks for your explanation, @EwoutH. Cool. I support the long-term integration of Mesa and Mesa-geo. But for now, I will mainly focus on changes to Mesa-geo.
I understand @wang-boyu's point; currently, the Cell in mesa-geo is just a simple instance with attributes. The experimental Cell in Mesa is similar to mine (see what I did in ABSESpy), as it serves as a container for Agents and can also store attributes.

I haven't had much time to examine the visualization part, but I'm happy to open a PR that modifies Mesa-geo using Cell and PropertyLayer from Mesa.

By the way, I joined the Matrix community a long time ago. Instant messaging sometimes causes me some stress, so I prefer to communicate here.

Regarding projects/mesa#2431, I agree with what you said about using "GIS models" to imagine attributes as a "big cake". We can set multiple attributes for the same layer (i.e., a cell can have various attributes); however, agents do not need to move between different layers. We should use ideas similar to the Xarray package to manage selections between different layers based on coordinates.

What do you think?

@EwoutH
Member

EwoutH commented Nov 5, 2024

If I understand you correctly we're thinking in the same direction. I imagine agents still having a single position according to a single CRS, but being able to fetch data from multiple layers (simultaneously), which might or might not have similar CRS.
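
A rough sketch of that idea: one position, several layers queried at once. `GeoLayer` and `value_at` are hypothetical names, and the sketch assumes all layers already share a CRS (no reprojection is attempted) and differ only in resolution:

```python
import numpy as np


class GeoLayer:
    """Hypothetical raster layer: a 2D array plus its bounding box."""

    def __init__(self, data, bounds):
        self.data = np.asarray(data)
        self.bounds = bounds  # (min_x, min_y, max_x, max_y)

    def value_at(self, x, y):
        """Look up the value of the raster cell that contains point (x, y)."""
        min_x, min_y, max_x, max_y = self.bounds
        n_rows, n_cols = self.data.shape
        col = int((x - min_x) / (max_x - min_x) * n_cols)
        row = int((max_y - y) / (max_y - min_y) * n_rows)  # row 0 is the top edge
        # Clamp so points on the right or bottom edge fall in the last cell.
        col = min(max(col, 0), n_cols - 1)
        row = min(max(row, 0), n_rows - 1)
        return self.data[row, col]


layers = {
    "elevation": GeoLayer([[10, 20], [30, 40]], (0.0, 0.0, 2.0, 2.0)),
    "rainfall": GeoLayer(np.arange(16).reshape(4, 4), (0.0, 0.0, 2.0, 2.0)),
}

position = (1.5, 0.5)  # the agent's single position
values = {name: layer.value_at(*position) for name, layer in layers.items()}
print(values)
```

Layers with a different CRS would need a coordinate transformation before the lookup, which this sketch leaves out.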

Go ahead and start playing with potential solutions! Feel free to open a draft PR here early.

@SongshGeo

Cool, that's exactly what I meant. It is good to know that we're in the same direction. I will do it this week.

@SongshGeo

Hi, @wang-boyu and @EwoutH

I have made some preliminary refactoring of RasterLayer to showcase my ideas.

All previous methods can call the experimental APIs of Cell and PropertyLayer in this way. In other words, RasterLayer will combine both functionalities, encapsulated with geographic information. Given that the data attribute of PropertyLayer is implemented based on Numpy 2d arrays, array-style indexing in RasterLayer aligns better with GIS practices. For instance, we can have a property like this:

>>> layer.array_cells
array([[Cell((0, 3), []), Cell((1, 3), []), Cell((2, 3), [])],
       [Cell((0, 2), []), Cell((1, 2), []), Cell((2, 2), [])],
       [Cell((0, 1), []), Cell((1, 1), []), Cell((2, 1), [])],
       [Cell((0, 0), []), Cell((1, 0), []), Cell((2, 0), [])]],
      dtype=object)
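
The layout above can be reproduced with a small sketch. It assumes cells are identified by `(x, y)` with the origin at the bottom-left, while the array is indexed `[row, col]` from the top-left; `xy_to_rowcol` is a hypothetical helper, and plain tuples stand in for `Cell` objects:

```python
import numpy as np

width, height = 3, 4

# Object array laid out like a north-up raster: row 0 holds the largest y.
cells = np.empty((height, width), dtype=object)
for row in range(height):
    for col in range(width):
        cells[row, col] = (col, height - 1 - row)  # stored as (x, y)


def xy_to_rowcol(x, y, height):
    """Convert a bottom-left-origin (x, y) into a top-left-origin [row, col]."""
    return height - 1 - y, x


top_left = cells[0, 0]
row, col = xy_to_rowcol(2, 0, height)
bottom_right = cells[row, col]
```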

For another example, add_property (i.e., previous apply_raster) could be:

    def add_property(
        self,
        data: np.ndarray | float | int | bool,
        attr_name: str,
        add_to_cells: bool = True,
    ) -> None:
        """Add a property layer to the grid."""
        if isinstance(data, np.ndarray):
            if data.shape != (self.height, self.width):
                raise ValueError(
                    "Data shape does not match raster shape. "
                    f"Expected {(self.height, self.width)}, received {data.shape}."
                )
        else:
            data = np.full((self.height, self.width), data)
        property_layer = PropertyLayer(
            attr_name,
            self.width,
            self.height,
            default_value=np.nan,
        )
        # Would be better if `PropertyLayer` had a class method to create directly from array.
        property_layer.data = data
        self.grid.add_property_layer(property_layer, add_to_cells)
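
As a sketch of the class method wished for in the comment above: `ArrayBackedLayer` and `from_array` are hypothetical names on a stand-in class, not an existing `PropertyLayer` API:

```python
import numpy as np


class ArrayBackedLayer:
    """Hypothetical stand-in for a property layer built directly from an array."""

    def __init__(self, name, data):
        self.name = name
        self.data = data

    @classmethod
    def from_array(cls, name, data, shape=None):
        """Accept a 2D array or a scalar; a scalar is broadcast to `shape`."""
        if isinstance(data, np.ndarray):
            if shape is not None and data.shape != shape:
                raise ValueError(f"Expected shape {shape}, received {data.shape}.")
            return cls(name, data.copy())  # copy so the caller's array is untouched
        if shape is None:
            raise ValueError("A scalar value needs an explicit shape.")
        return cls(name, np.full(shape, data))


elevation = ArrayBackedLayer.from_array(
    "elevation", np.arange(12.0).reshape(4, 3), shape=(4, 3)
)
water = ArrayBackedLayer.from_array("water", False, shape=(4, 3))
```

With a constructor like this, the shape check and the scalar fill in `add_property` would collapse into a single call.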

I haven't completed all function and test modifications yet; if you agree with this direction, I will continue to work on them.

@EwoutH
Member

EwoutH commented Nov 7, 2024

Thanks!

One of the advantages of the current PropertyLayer is that it’s extremely fast, because almost all operations can be performed vectorized. If I’m correct, you now propose an array of cells. Will that retain the PropertyLayer’s performance?

From an API/implementation standpoint, I’m curious what Wang thinks.

@quaquel you might also be interested in this.

@SongshGeo

Sorry, perhaps my description caused a misunderstanding. I don't think I have stored any additional Numpy attributes. array_cells is a cached property generated by Grid. I mean, indexing should be done in this array format, with the initial coordinates in the top-left corner. The modified RasterLayer now builds combinations between Grid, PropertyLayer, Cell, and CellCollection functionalities.

@wang-boyu
Member Author

Thanks @SongshGeo!

Previously (and currently), RasterLayer is rather slow because its Cell is an agent-like type with a step() function, so each model step calls every Cell's step() function individually. But sometimes it might be easier to write these step() functions, I guess?

Conversely, Mesa's PropertyLayer is fast because updates to it are vectorized operations applied directly to the entire layer (i.e., a NumPy array), e.g., https://github.com/projectmesa/mesa-examples/blob/4c25596df38618dccaa3ab0ac4e560735714c00b/examples/conways_game_of_life_fast/model.py#L30-L47. Vectorization may be a challenge for some users.
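
To illustrate the trade-off, here is a small sketch of one Game of Life step written both ways. It is not the code from the linked example; it uses plain NumPy, a wrap-around (toroidal) grid, and hypothetical function names:

```python
import numpy as np


def step_per_cell(alive):
    """One step with a Python loop over every cell (like per-cell step() calls)."""
    n_rows, n_cols = alive.shape
    new = np.zeros_like(alive)
    for r in range(n_rows):
        for c in range(n_cols):
            neighbors = sum(
                alive[(r + dr) % n_rows, (c + dc) % n_cols]
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
                if (dr, dc) != (0, 0)
            )
            new[r, c] = neighbors == 3 or (alive[r, c] == 1 and neighbors == 2)
    return new


def step_vectorized(alive):
    """The same rule applied to the whole array at once."""
    counts = sum(
        np.roll(np.roll(alive, dr, axis=0), dc, axis=1)
        for dr in (-1, 0, 1)
        for dc in (-1, 0, 1)
        if (dr, dc) != (0, 0)
    )
    survive = (alive == 1) & (counts == 2)
    return ((counts == 3) | survive).astype(alive.dtype)


rng = np.random.default_rng(0)
grid = rng.integers(0, 2, size=(6, 6))
same = np.array_equal(step_per_cell(grid), step_vectorized(grid))
```

Both functions give the same next state; the loop version is arguably easier to read, while the vectorized one avoids a Python-level call per cell.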

I think we need to make a decision on which way to go (or both?).

Btw, rasterio returns NumPy arrays when reading raster data: https://github.com/projectmesa/mesa-geo/blob/071724056c7670a80b0454f9ec44e4a5cffa380b/mesa_geo/raster_layers.py#L554C13-L554C36; Mesa-Geo then sets the values as an attribute on each cell.

Also the name apply_raster is borrowed from NetLogo's gis extension: https://ccl.northwestern.edu/netlogo/docs/gis.html#gis:apply-raster

@SongshGeo

@wang-boyu I support having both. I am a fan of vectorization, but you are right: it may not be user-friendly enough for some users. That is why the initial proposal I put forward supports both approaches.

RasterLayer will generate a two-dimensional discrete Grid; it establishes and manages multiple PropertyLayers with the same CRS.

I am quite familiar with reading data from rasterio, and since it is also an array like PropertyLayer, I strongly recommend using Numpy's [row, col] indexing mode.

Interesting insight! I have yet to use NetLogo's GIS; I've been looking for a Python solution from the start. I got it! Then I may continue to use the name apply_raster.

@quaquel
Member

quaquel commented Nov 7, 2024

Thanks for tagging me on this conversation. First of all, I am not familiar with MESA-Geo at all, so I won't be able to comment much on that side of things. However, I am rather familiar with the experimental Cell Space stuff and have been looking at how to make property layers work with those. I am also opinionated when it comes to anything that even remotely smells of NetLogo...; let's, for now, keep it at "I am not a fan" 😉.

With @EwoutH, I share a desire to maximize the similarity between mesa and mesa-geo with the long-term vision of merging them. I have no idea whether this is desirable or possible, but in the meantime, let's see how far we can get.

My most recent thinking on cells and property layers is captured in #2431 and #2059. That is, ideally, discrete spaces / new-style mesa.experimental.cell_space grids have a CRS, and property layers inherit this CRS. At some later stage, we might think about translations, but let's start simple.

Second, new-style mesa grids entail a fundamental shift in the API. In old-style grids, everything is handled at the level of the grid, and all interactions run through the grid. In new-style grids, the cell is central. Agents interact with and via the cell with their neighborhood. So, we likely need to make a similar distinction for property layers.

That is, PropertyLayers are added to grids and defined at the grid level. Cells know and can easily access/view the properties in the layers for their coordinates, and have numpy masks based on their connections. I even believe it might be relatively easy to add attribute-style access to property layer values at the cell. That is, imagine you have a property layer called grass. I think it is easy to make it possible for a sheep to access this property for its current cell via self.cell.grass. (As one might expect from me, there will be some descriptor magic involved in making this work)
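
A minimal sketch of what such descriptor-based access might look like. `LayerValue`, `SketchCell`, and `add_property_layer` are hypothetical names for illustration only and do not reflect how Mesa implements or plans to implement this:

```python
import numpy as np


class LayerValue:
    """Descriptor that reads and writes one cell's value in a shared 2D array."""

    def __init__(self, data):
        self.data = data

    def __get__(self, cell, owner=None):
        if cell is None:  # accessed on the class rather than an instance
            return self
        return self.data[cell.coordinate]

    def __set__(self, cell, value):
        self.data[cell.coordinate] = value


class SketchCell:
    def __init__(self, coordinate):
        self.coordinate = coordinate


def add_property_layer(cell_class, name, data):
    """Expose the layer array on every cell as an attribute called `name`."""
    setattr(cell_class, name, LayerValue(data))


grass = np.zeros((4, 3))
add_property_layer(SketchCell, "grass", grass)

cell = SketchCell((1, 2))
cell.grass = 5.0  # writes through to grass[1, 2]
grass += 1        # a vectorized update on the whole layer
print(cell.grass)
```

The values live only in the array, so grid-level vectorized updates and cell-level attribute access see the same data.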

I'll try to take a closer look at the code discussed here. If you have questions or suggestions on the mesa side of things, also please say so.

@SongshGeo

@quaquel I appreciate your interest in this topic. I said, "I am a fan of vectorization," because I have used similar designs in ABSESpy. In other words, if I were to implement "Game of Life," I would also not hesitate to operate on 2-d arrays.

I agree with placing the Cell at the centre, and I like the design of discrete grids in the experimental stage. Let's design APIs around discrete grids in the future while mesa-geo mainly supports CRS aspects. Since GIS and PropertyLayer are fundamentally inclined towards two-dimensional array characteristics, we should maintain certain APIs for users to perform vectorized operations.

Thank you all for taking the time to look at my code briefly. Currently, please pay attention to how I am calling the experimental API; I have not yet made changes to the CRS part. Next week, I'd love to contribute more.

@EwoutH
Member

EwoutH commented Nov 8, 2024

Thanks for your work on it, it's appreciated!


4 participants