adamamer20 / mesa-frames

Extension of mesa for performance and scalability

Home Page: https://adamamer20.github.io/mesa-frames/api

License: MIT License

Python 100.00%

agent-based-modeling complex-systems complexity-analysis gis mesa modeling-agents pandas simulation simulation-environment simulation-framework

mesa-frames's Introduction

mesa-frames

mesa-frames is an extension of the mesa framework, designed for complex simulations with thousands of agents. By storing agents in a DataFrame, mesa-frames significantly enhances the performance and scalability of mesa, while maintaining a similar syntax. mesa-frames allows for the use of vectorized functions whenever simultaneous activation of agents is possible.

Why DataFrames?

DataFrames are optimized for simultaneous operations through SIMD processing. At the moment, mesa-frames supports the use of two main libraries: Pandas and Polars.

Pandas is a popular data-manipulation Python library, developed using C and Cython. Pandas is known for its ease of use, allowing for declarative programming and high performance.
Polars is a new DataFrame library with a syntax similar to Pandas but with several innovations, including a backend implemented in Rust, the Apache Arrow memory format, query optimization, and support for larger-than-memory DataFrames.

The following is a performance graph showing execution time using mesa and mesa-frames for the Boltzmann Wealth model.

(The script used to generate the graph can be found here, but if you want to additionally compare vs Mesa, you have to uncomment mesa_implementation and its label)

Installation

Cloning the Repository

To get started with mesa-frames, first clone the repository from GitHub:

git clone https://github.com/adamamer20/mesa-frames.git
cd mesa-frames

Installing in a Conda Environment

If you want to install it into a new environment:

conda create -n myenv python
conda activate myenv

If you want to install it into an existing environment:

conda activate myenv

Then, to install mesa-frames itself:

# For pandas backend (quotes keep shells such as zsh from expanding the brackets)
pip install -e ".[pandas]"
# Alternatively, for Polars backend
pip install -e ".[polars]"

Installing in a Python Virtual Environment

If you want to install it into a new environment:

python3 -m venv myenv
source myenv/bin/activate  # On Windows, use `myenv\Scripts\activate`

If you want to install it into an existing environment:

source myenv/bin/activate  # On Windows, use `myenv\Scripts\activate`

Then, to install mesa-frames itself:

# For pandas backend (quotes keep shells such as zsh from expanding the brackets)
pip install -e ".[pandas]"
# Alternatively, for Polars backend
pip install -e ".[polars]"

Usage

Note: mesa-frames is in the early stages of development. The usage patterns and API are subject to change, and breaking changes may be introduced. Feedback and issue reports are encouraged.

You can find the API documentation here.

Creation of an Agent

The agent implementation differs from base mesa: agents are defined only at the AgentSet level. You can import either AgentSetPandas or AgentSetPolars. As in mesa, you subclass one of them and make sure to call super().__init__(model). To add agents to the AgentSet, use the add method or the += operator. Most methods mirror the functionality of mesa.AgentSet. Additionally, mesa-frames.AgentSet implements many dunder methods, such as AgentSet[mask, attr], to get and set items intuitively. All operations are in place by default, but if you prefer a functional style, mesa-frames implements a fast copy method that aims to reduce memory usage by relying on reference-only and native copy methods.

import polars as pl

from mesa_frames import AgentSetPolars, ModelDF

class MoneyAgentPolars(AgentSetPolars):
    def __init__(self, n: int, model: ModelDF):
        super().__init__(model)
        # Adding the agents to the agent set
        self += pl.DataFrame(
            {"unique_id": pl.arange(n, eager=True), "wealth": pl.ones(n, eager=True)}
        )

    def step(self) -> None:
        # The give_money method is called
        self.do("give_money")

    def give_money(self):
        # Active agents are changed to wealthy agents
        self.select(self.wealth > 0)

        # Receiving agents are sampled (only native expressions currently supported)
        other_agents = self.agents.sample(
            n=len(self.active_agents), with_replacement=True
        )

        # Wealth of wealthy is decreased by 1
        self["active", "wealth"] -= 1

        # Compute the income of the other agents (only native expressions currently supported)
        new_wealth = other_agents.group_by("unique_id").len()

        # Add the income to the other agents
        self[new_wealth, "wealth"] += new_wealth["len"]

Creation of the Model

Creation of the model is fairly similar to the process in mesa. You subclass ModelDF and call super().__init__(). The model.agents attribute has the same interface as mesa-frames.AgentSet. You can use += or self.agents.add with a mesa-frames.AgentSet (or a list of AgentSet) to add agents to the model.

from mesa_frames import ModelDF

class MoneyModelDF(ModelDF):
    def __init__(self, N: int, agents_cls):
        super().__init__()
        self.n_agents = N
        # Use the agent set class passed in (e.g. MoneyAgentPolars)
        self.agents += agents_cls(N, self)

    def step(self):
        # Executes the step method for every agentset in self.agents
        self.agents.do("step")

    def run_model(self, n):
        for _ in range(n):
            self.step()

What's Next?

Refining the API to make it more understandable for users who are already familiar with the mesa package. The goal is to provide a seamless experience for users transitioning to or incorporating mesa-frames.
Adding support for default mesa functions to ensure that the standard mesa functionality is preserved.
Adding GPU functionality (cuDF and Rapids).
Creating a decorator that will automatically vectorize an existing mesa model. This feature will allow users to easily tap into the performance enhancements that mesa-frames offers without significant code alterations.
Creating a unique class for AgentSet, independent of the backend implementation.

License

mesa-frames is made available under the MIT License. This license allows you to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the software, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
The software is provided "as is", without warranty of any kind.

For the full license text, see the LICENSE file in the GitHub repository.

mesa-frames's Issues

Migrate Tests from Private APIs to Public APIs

The public API will become increasingly stable compared with the private APIs.
Most tests are currently implemented against the private API, so they need to be refactored to use the public one.

Refactoring Discrete Event scheduler

This issue should be closed once all of the discrete event scheduler functionality in mesa.experimental.devs is implemented.

Automatic check/update of docstring parameters/returns based on function/class signature via pre-commit hook

Because type hinting is so extensive and the APIs are currently subject to substantial changes, it would be useful to have a pre-commit hook that catches mismatches between the type hints and parameter/attribute names defined in a function/class signature and those in its docstring. This would ensure that the documentation stays up to date. I tried searching online but couldn't find any existing tool. However, I found a small implementation using ast in this post.
@rht do you know if something similar exists?
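As a rough illustration of the kind of check described above, the sketch below uses only the standard library `ast` module to compare the parameter names in each function signature with those listed in a NumPy-style `Parameters` section. The docstring convention, the helper names `documented_params` and `find_mismatches`, and the `EXAMPLE` source are all assumptions made for illustration. This is neither the implementation from the post mentioned above nor an existing tool, and it ignores `*args`/`**kwargs` and type hints.

```python
import ast
import re


def documented_params(docstring: str) -> list[str]:
    """Return the names listed in a NumPy-style 'Parameters' section."""
    names = []
    in_section = False
    lines = docstring.splitlines()
    for i, line in enumerate(lines):
        # A section header is a title followed by a line made only of dashes.
        is_header = i + 1 < len(lines) and set(lines[i + 1].strip()) == {"-"}
        if is_header:
            in_section = line.strip() == "Parameters"
            continue
        if in_section:
            # Entries start at column 0 ("name : type"); descriptions are indented.
            match = re.match(r"^(\w+)\s*:", line)
            if match:
                names.append(match.group(1))
    return names


def find_mismatches(source: str) -> dict[str, tuple[set[str], set[str]]]:
    """Map function name -> (undocumented params, documented but absent params)."""
    mismatches = {}
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        docstring = ast.get_docstring(node)  # already dedented
        if docstring is None:
            continue
        args = node.args
        signature = {a.arg for a in args.posonlyargs + args.args + args.kwonlyargs}
        signature -= {"self", "cls"}
        documented = set(documented_params(docstring))
        if signature != documented:
            mismatches[node.name] = (signature - documented, documented - signature)
    return mismatches


# Hypothetical source with a deliberate mismatch: `inplace` vs `in_place`.
EXAMPLE = '''
def add(self, agents, inplace=True):
    """Add agents to the set.

    Parameters
    ----------
    agents : DataFrame
        The agents to add.
    in_place : bool
        Whether to modify the set in place.
    """
'''

print(find_mismatches(EXAMPLE))
```

A pre-commit hook could run a function like `find_mismatches` over the staged files and fail when the result is non-empty; wiring that up is left out here.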

Run-time type checking

Given the extensive type annotations in the library, runtime type checking would be a natural addition. It ensures that functions receive and return values of the expected types, catching potential bugs and mismatches early in development. Three main libraries exist for this purpose: Typeguard, Pydantic, and Beartype. Their performance overhead needs to be tested.
Beartype seems to be the most promising as it has minimal performance overhead and supports user-defined types for type annotations.

Refactoring mesa.datacollector

GPU integration: Dask, cuda (cudf) and RAPIDS (Polars)

https://pola.rs/posts/polars-on-gpu/

https://www.reddit.com/r/Python/comments/xjx4uo/benchmarking_pandas_cudf_modin_apache_arrow_and/

The benchmark in the README seems to run a pure-Python version that is too slow

I stumbled across this repository, and while the approach seems interesting, the benchmark actually compares apples to oranges because the pure Python version is much slower than expected. The running time also seems quadratic, which doesn't make a lot of sense to me because the money model should be linear in the number of agents. After digging into the source code, I think the culprit is at https://github.com/adamamer20/mesa-frames/blob/main/docs/scripts/readme_plot.py#L31: self.model.schedule.agents creates a copy of the structure containing the agents, which makes the model much slower than it should be and also explains the quadratic behaviour.

Indeed an Agents.jl version of the model is still like 40x faster than Polars:

julia> using Agents, Random

julia> @agent struct WealthAgent(NoSpaceAgent)
           wealth::Int
       end

julia> function wealth_model(; numagents = 100, initwealth = 1)
           model = ABM(WealthAgent; agent_step!, scheduler = Schedulers.Randomly(), 
                       rng = Xoshiro(42), container = Vector)
           for _ in 1:numagents
               add_agent!(model, initwealth)
           end
           return model
       end
wealth_model (generic function with 1 method)

julia> function agent_step!(agent, model)
           agent.wealth == 0 && return
           agent.wealth -= 1
           random_agent(model).wealth += 1
       end
agent_step! (generic function with 1 method)

julia> m = wealth_model(; numagents=9000);

julia> @time step!(m, 1);
  0.067163 seconds (94.95 k allocations: 4.839 MiB, 99.34% compilation time)

julia> @time step!(m, 100);
  0.011685 seconds

Hope this helps you to find a better benchmark :-)
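The suspected cause described above, an O(n) copy of the agent container inside every agent's step, can be illustrated without Mesa. The sketch below is a simplified stand-in and not the code in readme_plot.py: agents are reduced to a list of wealth values, and `step_with_copy`, `step_without_copy`, and `time_step` are hypothetical names. Both steps apply the same transfer rule; only the cost of choosing the recipient differs.

```python
import random
import time


def step_with_copy(wealth: list[int], rng: random.Random) -> None:
    """One step in which every agent builds a fresh list before sampling."""
    for i in range(len(wealth)):
        if wealth[i] == 0:
            continue
        # Stand-in for an accessor that returns a new copy on each call:
        # O(n) per agent, hence O(n^2) per model step.
        others = list(range(len(wealth)))
        j = rng.choice(others)
        wealth[i] -= 1
        wealth[j] += 1


def step_without_copy(wealth: list[int], rng: random.Random) -> None:
    """The same step, sampling an index directly: O(1) per agent, O(n) per step."""
    n = len(wealth)
    for i in range(n):
        if wealth[i] == 0:
            continue
        j = rng.randrange(n)
        wealth[i] -= 1
        wealth[j] += 1


def time_step(step, n_agents: int) -> float:
    """Return the wall-clock seconds taken by one step with n_agents agents."""
    wealth = [1] * n_agents
    start = time.perf_counter()
    step(wealth, random.Random(42))
    return time.perf_counter() - start


if __name__ == "__main__":
    # Doubling n should roughly quadruple the first timing and double the second.
    for n in (1_000, 2_000, 4_000):
        print(n, time_step(step_with_copy, n), time_step(step_without_copy, n))
```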

Towards parity with Mesa's core repo practices

This issue should be informative: projectmesa/mesa-examples#11.
And a continually updated list (so that future Mesa projects can get to parity quickly):

Use declarative pyproject.toml instead of setup.py
Setup Ruff, codespell, Black checks: https://github.com/projectmesa/mesa/blob/main/.pre-commit-config.yaml (must be via pre-commit instead of GitHub Actions, because the former is much faster)
Setup CI for testing

Refactoring mesa.visualization

Benchmark result of docs/scripts/readme_plot.py, native expression vs concise/simple API

For both pandas and Polars, there are at least 3 ways to implement each step. They can be roughly split into 2 groups: those using native expressions and those using a simpler, more concise API. We should split the benchmark further into 4 agents:

MoneyAgentPolarsNative
MoneyAgentPolarsConcise
MoneyAgentPandasNative
MoneyAgentPandasConcise

and benchmark them. Because:

researchers who want to speed up their code further could see if the effort is worth it
it could become a how-to guide on speeding up the code further via the native expression

Refactoring mesa.Agent and mesa.AgentSet

Refactoring mesa-examples

Refactoring mesa.space

Decorator for existing mesa.Model / Agent method

Test-Driven Development Using `hypothesis` and `deal`

hypothesis is a property-based testing library that generates test cases automatically, covering a wide range of input scenarios. This approach helps uncover edge cases and unexpected behaviors that might be missed with manually written tests.

deal is a contract-based programming library that enforces a Design by Contract (DbC) approach for functions, specifying preconditions, postconditions, and invariants. This ensures that functions behave as expected, making the codebase more robust and predictable. Additionally, deal has an experimental feature for formal verification, which aims to provide mathematical proof of code correctness, further enhancing reliability.

Use the -O flag for improved performance in error handling

When a Python script is run with the -O flag, CPython does not generate bytecode for:

assert statements
if __debug__ code blocks

We might optimize "production-ready" models by converting errors to assertions or raising errors in an if __debug__ code block, while maintaining error handling capabilities during development and debugging.
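A minimal sketch of the second option, assuming a hypothetical `set_wealth` helper that is not part of mesa-frames: the validation lives inside an `if __debug__` block, so it runs during development and is dropped when the script is started with `python -O`.

```python
def set_wealth(wealth: list[int], index: int, value: int) -> None:
    """Set one agent's wealth, validating the inputs only in debug mode."""
    # Under `python -O`, __debug__ is False and CPython omits this whole block
    # (like every `assert` statement) at compile time, so the checks cost nothing.
    if __debug__:
        if not 0 <= index < len(wealth):
            raise IndexError(f"agent index {index} out of range")
        if value < 0:
            raise ValueError("wealth must be non-negative")
    wealth[index] = value


wealth = [1, 1, 1]
set_wealth(wealth, 0, 5)  # valid in both modes
```

The trade-off is that invalid input passes silently under -O (here a negative wealth would simply be stored), so this pattern only suits checks that guard against programming errors rather than against untrusted data.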