cymem: A Cython Memory Helper

cymem provides two small memory-management helpers for Cython. They make it easy to tie memory to a Python object's life-cycle, so that the memory is freed when the object is garbage collected.

Overview

The most useful is cymem.Pool, which acts as a thin wrapper around the calloc function:

from cymem.cymem cimport Pool
cdef Pool mem = Pool()
data1 = <int*>mem.alloc(10, sizeof(int))
data2 = <float*>mem.alloc(12, sizeof(float))

The Pool object saves the memory addresses internally, and frees them when the object is garbage collected. Typically you'll attach the Pool to some cdef'd class. This is particularly handy for deeply nested structs, which have complicated initialization functions. Just pass the Pool object into the initializer, and you don't have to worry about freeing your struct at all: all of the memory allocated through Pool.alloc is freed automatically when the Pool expires.

Installation

Installation is via pip, and requires Cython. Before installing, make sure that your pip, setuptools and wheel are up to date.

pip install -U pip setuptools wheel
pip install cymem

Example Use Case: An array of structs

Let's say we want a sequence of sparse matrices. We need fast access, and a Python list isn't performing well enough. So, we want a C-array or C++ vector, which means we need the sparse matrix to be a C-level struct; it can't be a Python class. We can write this easily enough in Cython:

"""Example without Cymem

To use an array of structs, we must carefully walk the data structure when
we deallocate it.
"""

from libc.stdlib cimport calloc, free

cdef struct SparseRow:
    size_t length
    size_t* indices
    double* values

cdef struct SparseMatrix:
    size_t length
    SparseRow* rows

cdef class MatrixArray:
    cdef size_t length
    cdef SparseMatrix** matrices

    def __cinit__(self, list py_matrices):
        self.length = 0
        self.matrices = NULL

    def __init__(self, list py_matrices):
        self.length = len(py_matrices)
        self.matrices = <SparseMatrix**>calloc(len(py_matrices), sizeof(SparseMatrix*))

        for i, py_matrix in enumerate(py_matrices):
            self.matrices[i] = sparse_matrix_init(py_matrix)

    def __dealloc__(self):
        for i in range(self.length):
            sparse_matrix_free(self.matrices[i])
        free(self.matrices)


cdef SparseMatrix* sparse_matrix_init(list py_matrix) except NULL:
    sm = <SparseMatrix*>calloc(1, sizeof(SparseMatrix))
    sm.length = len(py_matrix)
    sm.rows = <SparseRow*>calloc(sm.length, sizeof(SparseRow))
    cdef size_t i, j
    cdef dict py_row
    cdef size_t idx
    cdef double value
    for i, py_row in enumerate(py_matrix):
        sm.rows[i].length = len(py_row)
        sm.rows[i].indices = <size_t*>calloc(sm.rows[i].length, sizeof(size_t))
        sm.rows[i].values = <double*>calloc(sm.rows[i].length, sizeof(double))
        for j, (idx, value) in enumerate(py_row.items()):
            sm.rows[i].indices[j] = idx
            sm.rows[i].values[j] = value
    return sm


cdef void sparse_matrix_free(SparseMatrix* sm) except *:
    cdef size_t i
    for i in range(sm.length):
        free(sm.rows[i].indices)
        free(sm.rows[i].values)
    free(sm.rows)
    free(sm)
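
The Cython code above never shows what `py_matrices` looks like. Reading `sparse_matrix_init`, each matrix appears to be a list of rows, and each row a dict mapping an index to a value. The following plain-Python sketch shows that input shape; the sample values and the helper `describe` are made up for illustration.

```python
# Hypothetical sample data: two sparse matrices, each a list of rows,
# each row a dict mapping a column index to a non-zero value.
py_matrices = [
    [{0: 1.0, 3: 2.5}, {}, {2: -4.0}],  # matrix with 3 rows
    [{1: 0.5}],                          # matrix with 1 row
]


def describe(py_matrices):
    """Return (row_count, nonzero_count) for each matrix.

    These are the quantities the struct-building code uses to size
    its allocations (SparseMatrix.length and each SparseRow.length).
    """
    summary = []
    for py_matrix in py_matrices:
        nonzeros = sum(len(py_row) for py_row in py_matrix)
        summary.append((len(py_matrix), nonzeros))
    return summary


print(describe(py_matrices))  # [(3, 3), (1, 1)]
```

Each row needs two allocations (indices and values), so every matrix adds several addresses that must later be freed.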

We wrap the data structure in a Python ref-counted class at as low a level as we can, given our performance constraints. This allows us to set up the memory in the __cinit__ and __init__ methods and free it in the __dealloc__ Cython special method.

However, it's very easy to make mistakes when writing the __dealloc__ and sparse_matrix_free functions, leading to memory leaks. cymem saves you from writing these deallocators at all. Instead, you write as follows:

"""Example with Cymem.

Memory allocation is hidden behind the Pool class, which remembers the
addresses it gives out.  When the Pool object is garbage collected, all of
its addresses are freed.

We don't need to write MatrixArray.__dealloc__ or sparse_matrix_free,
eliminating a common class of bugs.
"""
from cymem.cymem cimport Pool

cdef struct SparseRow:
    size_t length
    size_t* indices
    double* values

cdef struct SparseMatrix:
    size_t length
    SparseRow* rows


cdef class MatrixArray:
    cdef size_t length
    cdef SparseMatrix** matrices
    cdef Pool mem

    def __cinit__(self, list py_matrices):
        self.mem = None
        self.length = 0
        self.matrices = NULL

    def __init__(self, list py_matrices):
        self.mem = Pool()
        self.length = len(py_matrices)
        self.matrices = <SparseMatrix**>self.mem.alloc(self.length, sizeof(SparseMatrix*))
        for i, py_matrix in enumerate(py_matrices):
            self.matrices[i] = sparse_matrix_init_cymem(self.mem, py_matrix)

cdef SparseMatrix* sparse_matrix_init_cymem(Pool mem, list py_matrix) except NULL:
    sm = <SparseMatrix*>mem.alloc(1, sizeof(SparseMatrix))
    sm.length = len(py_matrix)
    sm.rows = <SparseRow*>mem.alloc(sm.length, sizeof(SparseRow))
    cdef size_t i, j
    cdef dict py_row
    cdef size_t idx
    cdef double value
    for i, py_row in enumerate(py_matrix):
        sm.rows[i].length = len(py_row)
        sm.rows[i].indices = <size_t*>mem.alloc(sm.rows[i].length, sizeof(size_t))
        sm.rows[i].values = <double*>mem.alloc(sm.rows[i].length, sizeof(double))
        for j, (idx, value) in enumerate(py_row.items()):
            sm.rows[i].indices[j] = idx
            sm.rows[i].values[j] = value
    return sm

All that the Pool class does is remember the addresses it gives out. When the MatrixArray object is garbage-collected, the Pool object will also be garbage collected, which triggers a call to Pool.__dealloc__. The Pool then frees all of its addresses. This saves you from walking back over your nested data structures to free them, eliminating a common class of errors.
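
The bookkeeping described here can be modelled in a few lines of plain Python. This is a toy model, not cymem's implementation: the class `ToyPool`, its integer handles and the explicit `free_all` method are all assumptions made for illustration, and bytearrays stand in for raw C memory.

```python
class ToyPool:
    """Pure-Python stand-in for a pool that remembers what it hands out."""

    def __init__(self):
        self.blocks = {}   # handle -> zero-filled buffer
        self.freed = []    # handles released so far, kept for inspection
        self._next = 1

    def alloc(self, number, elem_size):
        """Hand out a zeroed block (like calloc) and remember its handle."""
        handle = self._next
        self._next += 1
        self.blocks[handle] = bytearray(number * elem_size)
        return handle

    def free_all(self):
        """What a deallocator would do: release every remembered block."""
        for handle in list(self.blocks):
            del self.blocks[handle]
            self.freed.append(handle)


pool = ToyPool()
indices = pool.alloc(10, 4)
values = pool.alloc(12, 8)
print(len(pool.blocks))  # 2
pool.free_all()
print(pool.freed)        # [1, 2]
```

The caller never frees `indices` or `values` individually; releasing the pool releases everything it handed out, which is the property the nested-struct example relies on.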

Custom Allocators

Sometimes external C libraries use private functions to allocate and free objects, but we'd still like the laziness of the Pool.

from cymem.cymem cimport Pool, WrapMalloc, WrapFree
cdef Pool mem = Pool(WrapMalloc(priv_malloc), WrapFree(priv_free))

cymem's Issues

Provide wheel for python 3.9

Hello, I can't get cymem to install with python 3.9 - there's no wheel, and building from source is failing. Would it be possible to provide a wheel for python 3.9?

ModuleNotFoundError on cython

The recently introduced change here causes a 'ModuleNotFoundError' when cython is not explicitly pre-installed

Screenshot from 2021-01-18 10-52-33

A workaround is to explicitly install cython on the host system, but I feel that such minor releases shouldn't cause breaking changes, and there should perhaps be a fail-safe to run alternative code if cython isn't installed.

Building fails on non-UTF8 terminals

I was building cymem from source through an SSH connection, and it fails due to a Unicode character in README.rst:

Traceback (most recent call last):
  File "setup.py", line 142, in <module>
    setup_package()
  File "setup.py", line 93, in setup_package
    readme = f.read()
  File "/usr/lib/python3.6/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 1181: ordinal not in range(128)

I solved it on my side by adding Unicode support to my remote install.
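
The traceback shows the README being decoded with the ASCII codec, which is what Python falls back to when the locale does not declare an encoding. A possible fix on the packaging side, assuming setup.py opens the file without naming an encoding, is to pass the encoding explicitly. The sketch below uses a temporary stand-in file rather than the real README.rst.

```python
import io
import os
import tempfile

# Write a small stand-in README containing a non-ASCII character.
# U+2014 encodes to UTF-8 bytes that start with 0xe2, the byte named
# in the traceback.
path = os.path.join(tempfile.mkdtemp(), "README.rst")
with io.open(path, "w", encoding="utf8") as f:
    f.write(u"cymem \u2014 a Cython memory helper\n")

# Decoding as ASCII reproduces the failure from the report...
try:
    with io.open(path, encoding="ascii") as f:
        f.read()
    ascii_failed = False
except UnicodeDecodeError:
    ascii_failed = True

# ...while naming UTF-8 explicitly works regardless of the locale.
with io.open(path, encoding="utf8") as f:
    readme = f.read()

print(ascii_failed)          # True
print(u"\u2014" in readme)   # True
```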

Cannot Install Cymem from wheel?

When trying to install the ".whl" file, I always end up with errors like:
"ERROR: cymem-2.0.3-cp38-cp38m-win32.whl is not a supported wheel on this platform."

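pip raises this error when the tags encoded in the wheel's filename match none of the tags the running interpreter supports. The sketch below splits the filename from the report into those tags. `parse_wheel_filename` is a hypothetical helper, not part of pip, and it ignores the optional build tag that wheel filenames may carry.

```python
def parse_wheel_filename(filename):
    """Split a wheel filename into its dash-separated fields.

    Assumes the five-field form name-version-python-abi-platform.whl,
    with no optional build tag.
    """
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel filename: %r" % filename)
    stem = filename[: -len(".whl")]
    name, version, python_tag, abi_tag, platform_tag = stem.split("-")
    return {
        "name": name,
        "version": version,
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


tags = parse_wheel_filename("cymem-2.0.3-cp38-cp38m-win32.whl")
print(tags["python"], tags["abi"], tags["platform"])  # cp38 cp38m win32
```

Two things are worth checking against the local interpreter: `win32` targets 32-bit Windows builds of Python, and CPython 3.8 no longer uses the `m` suffix in its ABI tag, so `cp38m` may itself be the mismatch. The second point is an inference from the filename, not something the report confirms.
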
C code- file location

Hi!
I've looked at your code and noticed you are creating an interface for C files (.pxd).
However, I can't seem to find those C files. I would like to contribute to this project.

Thanks in advance,
Guy Arieli

Realloc appears out of date in data structures.

It looks like realloc is not up to date with the rest of the code. addresses is treated like a set() and it starts with an uninitialized cdef size_t addr.

Additionally, the docstring is slightly confusing because it seems to state that there are two possible error conditions if the block was not allocated by the Pool. It appears that only the MemoryError is raised.

If p is not in the Pool or new_size is 0, a MemoryError is raised. If p is not found in the Pool, a KeyError is raised.

Wheel support for linux aarch64 [arm64]

Summary
Installing cymem on aarch64 with "pip3 install cymem" builds the wheel from source.

Problem description
cymem doesn't have a wheel for aarch64 on PyPI, so pip builds the wheel from source, which makes installation slower. Making a wheel available for aarch64 would benefit aarch64 users by reducing cymem's installation time.

Expected Output
Pip should be able to download a cymem wheel from PyPI rather than building it from source.

@cymem-team, please let me know if I can help with building the wheel or uploading it to PyPI. I am keen to make a cymem wheel available for aarch64. It would be a great opportunity for me to work with you.

Installing via pip fails

  Running setup.py install for cymem ... error   
    Complete output from command /home/me/stuff/venv/bin/python3 -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-ls6b2vma/cymem/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-5l6ru3r9-record/install-record.txt --single-version-externally-managed --compile --install-headers /home/me/stuff/venv/include/site/python3.6/cymem:
    running install     
    running build       
    running build_py    
    creating build      
    creating build/lib.linux-x86_64-3.6          
    creating build/lib.linux-x86_64-3.6/cymem    
    copying cymem/about.py -> build/lib.linux-x86_64-3.6/cymem                                     
    copying cymem/__init__.py -> build/lib.linux-x86_64-3.6/cymem                                  
    package init file 'cymem/tests/__init__.py' not found (or not a regular file)                  
    creating build/lib.linux-x86_64-3.6/cymem/tests                                                
    copying cymem/tests/test_import.py -> build/lib.linux-x86_64-3.6/cymem/tests                   
    copying cymem/cymem.pyx -> build/lib.linux-x86_64-3.6/cymem                                    
    copying cymem/cymem.pxd -> build/lib.linux-x86_64-3.6/cymem                                    
    copying cymem/__init__.pxd -> build/lib.linux-x86_64-3.6/cymem                                 
    running build_ext   
    building 'cymem.cymem' extension             
    creating build/temp.linux-x86_64-3.6         
    creating build/temp.linux-x86_64-3.6/cymem   
    x86_64-linux-gnu-gcc -pthread -DNDEBUG -g -fwrapv -O2 -Wall -Wstrict-prototypes -g -fdebug-prefix-map=/build/python3.6-sXpGnM/python3.6-3.6.3=. -specs=/usr/share/dpkg/no-pie-compile.specs -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -fPIC -I/usr/include/python3.6m -I/home/me/stuff/venv/include -I/usr/include/python3.6m -c cymem/cymem.cpp -o build/temp.linux-x86_64-3.6/cymem/cymem.o -O3 -Wno-strict-prototypes -Wno-unused-function
    x86_64-linux-gnu-gcc: error: cymem/cymem.cpp: No such file or directory                        
    x86_64-linux-gnu-gcc: fatal error: no input files                                              
    compilation terminated.                      
    error: command 'x86_64-linux-gnu-gcc' failed with exit status 1        

Build fails on Python 3.9

Caused by the removal of the PyTypeObject.tp_print field, which is referenced by the C++ source files generated by the outdated version of Cython specified in requirements.txt (<0.28.0).

If I lift the Cython version restriction and install the most recent version at the time of writing (0.29.21), cymem builds fine, so the removal of the tp_print field seems to have already been addressed by Cython.

Are there any reasons why the Cython dependency can't be easily upgraded? Maybe the fact that cymem builds fine with a newer Cython doesn't necessarily mean it will also work correctly? See also #14.

Stop republishing new wheels under old versions.

It breaks hash-pinning. Same goes for murmurhashes.

THESE PACKAGES DO NOT MATCH THE HASHES FROM Pipfile.lock!. If you have updated the package versions, please update the hashes. Otherwise, examine the package contents carefully; someone may have tampered with them.
   cymem==1.31.2 from https://files.pythonhosted.org/packages/a5/0f/d29aa68c55db37844c77e7e96143bd96651fd0f4453c9f6ee043ac846b77/cymem-1.31.2-cp36-cp36m-manylinux1_x86_64.whl#sha256=d7ce7b63a74566490d4661d3934870741cd0e37c3543cdf73c09c79501f1cf8a (from -r /tmp/pipenv-ky6qhpot-requirements/pipenv-jvo6obwd-requirement.txt (line 1)):
       Expected sha256 00bb3645dfb9a020d735ba3d6f822b04656388180588d8b2cebde967ee678bcc
       Expected     or 0dd61d05977839a922c0d797c355b98949210575918b1743b41e38ae9fb2c3a7
       Expected     or 4bc1056b52d959fcbb1e0f32ec84fa131754d6be1e36b65782c6ac86419f4bf3
       Expected     or 4c5d9ca6ec706792b8d9b1faf6db77b95545c388c768b21d940f197aa7efbb7e
       Expected     or 50292f4dd0d950a8698bae27d71efe59da7ff08e591b735e08b658aae42c4745
       Expected     or 616d06333f46dd03c128d97912d361183fc02249e6420a7b7907b41214c51562
       Expected     or 944af97d4d34a2470b5199f1c31d2dfc79cdec7bd7a41354d839a8ab87fdfaa6
       Expected     or b38056efb99078b06c504adb5f03a8d9e822a5543451737b746028a71c4b1ac3
       Expected     or b6513b2926c60d641f159e79e6fb16460dfb50ebcce31a5af0370c51837c7efc
       Expected     or daa6003fcc199752ab703142021cff74774872a932303b240dc0ea177adf295d
       Expected     or f06d9b50da0474d7405674d8101c319d89a17d33792d6d429fe3d5c64f0d9df1
            Got        d7ce7b63a74566490d4661d3934870741cd0e37c3543cdf73c09c79501f1cf8a

You are using pip version 18.0, however version 18.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.


MSG:

non-zero return code
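
The failure in the log follows from how hash-pinning works: the installer computes the SHA-256 digest of the downloaded file and compares it with the digests recorded in the lock file. The sketch below shows that check with two stand-in files; the file names and contents are made up, and `sha256_of_file` is a hypothetical helper rather than pip's own code.

```python
import hashlib
import os
import tempfile


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Two stand-in "wheels" for the same version but with different bytes.
tmp = tempfile.mkdtemp()
first_upload = os.path.join(tmp, "first_upload.whl")
second_upload = os.path.join(tmp, "second_upload.whl")
with open(first_upload, "wb") as f:
    f.write(b"original build")
with open(second_upload, "wb") as f:
    f.write(b"rebuilt later")

pinned = sha256_of_file(first_upload)             # what a lock file records
print(sha256_of_file(second_upload) == pinned)    # False
```

Any file uploaded after the lock file was written has a digest the lock file cannot contain, so the install is rejected even though the version number is unchanged.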

calloc separate from malloc ?

Wondering if it makes sense to have a separate calloc call that does the memset instead of malloc.

I'm happy to do the work to make this happen, but maybe there's a design decision behind not having a non-zeroing allocator.

Aligned allocation

This seems like a very nice and useful little library. Definitely fills a need.

Would you be willing to include support for aligned allocation? This is handled a little differently on each platform, and providing a nice, platform independent interface to it that "just works" would be quite useful.

For instance, if you see lines 12-43 in this file, you can see my own attempt at this. Frankly, I have no confidence that I've done this the right way. :-)

(I'm considering using cymem in the library linked to above. Intel's Embree library requires memory to be allocated with alignment.)
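
One common, platform-independent way to get aligned memory is to over-allocate and round the returned address up to the next multiple of the alignment. The sketch below shows only that arithmetic, with a made-up address. It is not cymem's API and not the code from the linked file, and `align_up` is a hypothetical helper.

```python
def align_up(address, alignment):
    """Round address up to the next multiple of alignment (a power of two)."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    return (address + alignment - 1) & ~(alignment - 1)


# Over-allocating by (alignment - 1) bytes guarantees that an aligned
# address with `size` usable bytes fits inside the raw block.
raw_address = 1003            # made-up address returned by an allocator
size, alignment = 64, 16
aligned = align_up(raw_address, alignment)

print(aligned)                                                # 1008
print(aligned + size <= raw_address + size + alignment - 1)   # True
```

With this approach the original, unaligned address must be kept so that it can be passed to the matching free function later, which is the kind of bookkeeping a pool could take on.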
