pybind11_weaver's Introduction

Pybind11 Weaver: Python Binding Code Generator

Pybind11 Weaver is a powerful code generator designed to automate the generation of pybind11 code from C++ header files. It streamlines the process of creating Python bindings, enabling users to focus on writing critical pybind11 code and offloading the tedious work to Pybind11 Weaver.

This tool takes a c_lib.h file and, guided by cfg.yaml, transforms it into a _binding.cc.inc file. Once the generated file is included in binding.cc and followed by the single line auto update_guard = DeclFn(m);, all elements from the header file become accessible in Python, as demonstrated in this example.

A more pragmatic example is available in pylibclang, a comprehensive Python wrapper for libclang that uses Pybind11 Weaver to generate the binding code.

  1. It is a practical example because Pybind11 Weaver itself depends on it: Pybind11 Weaver is self-hosted and generates the binding code for its own use.
  2. Approximately 30k lines of C++ code are generated from a mere 10 lines of cfg.yaml.
  3. Some binding code is manually crafted to handle special cases and integrates seamlessly with the generated code.

pylibtooling is a much more advanced example that uses Pybind11 Weaver to generate the binding code for libtooling, and will be used to demonstrate the capabilities of Pybind11 Weaver when working with large C++ only libraries.

Docs

Check https://github.com/edimetia3d/pybind11_weaver/wiki

Key Features

  1. Highly Customizable: While the default configuration is super simple and suitable for most cases, it allows for high customization.
  2. Ease of Use: As a pure Python package, a simple pip install gets it ready to work.
  3. Versatility: All generated code is under your control. You can easily modify, enhance, or disable any part of it, and it works seamlessly with your hand-written code.
  4. Structure Preservation: It retains the module structure of the original C++ code.

Features & Roadmap

  • Binding for Enum
  • Binding for Namespace (as submodule)
  • Binding for Function, with support of function overloading
  • Binding for C style function pointer (usually used as callback functions)
  • Binding for opaque pointer and pointer to incomplete type
  • Binding for Operator overloading
  • Binding for Class method, method overloading, static method, static method overloading, constructor, constructor overloading, class field
  • Trampoline class for virtual function
  • Binding for concrete template instances, including: implicit (or explicit) class (or struct) template instantiation, full class (or struct) template specialization, and extern function template instance declaration.
  • Support class inheritance hierarchy
  • Auto ignore symbols by: linkage (e.g. static), visibility (e.g. visibility=hidden), member access control (e.g. private, protected)
  • Docstring generation from C++ Doxygen-style comments
  • Namespace hierarchy to Python module hierarchy
  • Dynamic update/disable binding by API call.
  • Static update/disable binding by defining macros (mainly used to disable incorrect binding code and so avoid compilation errors)
  • Auto snake case
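
To illustrate what the "Auto snake case" item means in practice, here is a minimal sketch of mapping a C++ style identifier to a Python style one. The function name to_snake_case and the two regex rules are assumptions made for illustration; they are not taken from the generator's actual implementation.

```python
import re


def to_snake_case(name: str) -> str:
    """Convert a CamelCase or mixedCase identifier to snake_case (illustrative only)."""
    # Insert "_" between a lowercase letter or digit and an uppercase letter:
    # "getValue" -> "get_Value".
    step1 = re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", name)
    # Insert "_" between an acronym and a following capitalized word:
    # "HTTPServer" -> "HTTP_Server".
    step2 = re.sub(r"([A-Z]+)([A-Z][a-z])", r"\1_\2", step1)
    return step2.lower()
```

With these rules, a name that is already in snake_case passes through unchanged.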

Background & Recommendations

This project originated from an internal project aimed at creating a Python binding for a LARGE C++ library that was still under active development. This posed significant challenges:

  1. The C++ library interface contained a vast number of classes, functions, and enums. Creating bindings for all these elements was not only tedious but also error-prone.
  2. Because the C++ library was under active development, staying updated with daily additions and frequent code modifications was a maintenance challenge.
  3. Some aspects of the C++ library, for historical reasons, were incompatible with Python conventions, necessitating hand-written binding code.
  4. The sheer size of the library added to the complexity, making it difficult to develop a generator smart enough to handle everything, hence the need for manual binding code writing.

In light of these challenges, I designed Pybind11 Weaver as a tool to generate the majority of the binding code, leaving users to handcraft the remaining parts as needed. If this approach suits your needs, this tool will be a valuable asset.

Though most features should work out of the box, the more your API looks like "C with Classes", the higher the chance that Pybind11 Weaver will do all the work for you. If you use many advanced C++ features, you may need to write some binding code yourself.

Typical workflow:

  1. Create a cfg.yaml file, mainly to tell the generator which files to parse.
  2. Use Pybind11 Weaver to generate files, like pybind11-weaver --config cfg.yaml.
  3. Create a binding.cc, include the generated files, and call the binding code.
  4. If there are compilation errors, disable the offending generated binding code by defining the corresponding macros.
  5. Add custom code to replace parts of the generated code, or to add new bindings that the generator did not export.
  6. Compile all code into a pybind11 module.
  7. Optionally, use pybind11-stubgen to generate .pyi stub files, which make the module easier to read for humans and statically checkable by mypy.
  8. Test the module in Python, find bugs, and go to step 5 to fix them.

If you run into problems, you are welcome to open an issue on GitHub or create a PR with a fix.

How it works

Under the hood, Pybind11 Weaver uses libclang to parse C++ header files. This gives it access to every API declared in the headers, which it then uses to generate the binding code on your behalf.

Notably, only header files are required, as we need declarations, not definitions. However, to ensure accurate parsing of the code, some compiler flags, especially for macros, are necessary.

The code generated is structured into a struct:

  1. During the construction of the struct, it creates some Pybind11 objects, such as pybind11::class_ or pybind11::enum_.
  2. When the Update() API is invoked, those Pybind11 objects are updated.

The use of a struct permits us to:

  • Separate object creation from updates, ensuring that Pybind11 already knows about all exported classes before any update runs, which aids in the generation of accurate documentation.
  • Increase the readability of the generated code, making it simpler to debug.
  • Simplify customization, as you can easily inherit the struct and override or reimplement necessary elements.
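
The create-then-update split can be sketched as follows. This is a plain Python model of the idea, not the generated code (which is a C++ struct); FakeModule, BindFoo, CustomBindFoo, and declare_all are hypothetical names introduced only for this illustration.

```python
class FakeModule:
    """Stand-in for a pybind11 module: maps class names to registered members."""

    def __init__(self):
        self.classes = {}


class BindFoo:
    """Models a generated binding struct for a hypothetical C++ class Foo."""

    def __init__(self, module):
        # Phase 1 (construction): register the class itself, with no members yet.
        self.members = module.classes.setdefault("Foo", [])

    def update(self):
        # Phase 2 (update): attach the members to the already registered class.
        self.members.append("get_value")


class CustomBindFoo(BindFoo):
    """Customization by inheritance: keep the generated members and add one more."""

    def update(self):
        super().update()
        self.members.append("extra_helper")


def declare_all(module, binder_types):
    # Construct every binder first so all classes exist before any update runs.
    binders = [binder_type(module) for binder_type in binder_types]
    for binder in binders:
        binder.update()
    return binders
```

Because every binder is constructed before any update is called, a member added during the update phase can refer to any other exported class, whatever the declaration order.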

pybind11_weaver's Issues

Support generate docstrings

Doxygen-style C++ documentation comments could be injected as docstrings for exported entities.

  • docstring for enum item
  • docstring for function
  • docstring for class
  • docstring for method
  • docstring for public data member
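
As a rough sketch of the idea, the following turns a raw Doxygen-style comment into text suitable for a docstring by stripping the comment delimiters. The function name doxygen_to_docstring is hypothetical, and the handling shown here (only "/** ... */" and "///" comments, with Doxygen commands such as @param left untouched) is an assumption for illustration, not the project's implementation.

```python
import re


def doxygen_to_docstring(raw_comment: str) -> str:
    """Strip C++ comment delimiters from a Doxygen-style comment (illustrative only)."""
    cleaned = []
    for line in raw_comment.strip().splitlines():
        # Remove a leading "/**", "*/", "*", or "///" plus one following space.
        line = re.sub(r"^\s*(/\*\*+|\*/|\*|///)\s?", "", line)
        # Remove a trailing "*/" left over from single-line block comments.
        line = re.sub(r"\s*\*/\s*$", "", line)
        cleaned.append(line.rstrip())
    # Drop the empty lines produced by the opening and closing delimiters.
    return "\n".join(cleaned).strip()
```

The resulting string could then be passed as the docstring argument when the corresponding entity is bound.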

Handling `T*` and `T&` of function parameter

When a function takes T* (or T&) as a parameter, it usually intends to modify the object that the pointer or reference refers to, e.g.

ErrorCode GetValue(int * v);
ErrorCode GetValue(std::vector<int> * vec);

However:

  1. Python itself has no mutable int.
  2. pybind11 creates a copy when T is a std container type, so any modification to the copied object is not propagated back to the Python runtime.

These cases should be handled automatically, or at least a warning should be printed.
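
Both pitfalls can be reproduced in plain Python. The sketch below is a model only: the functions are hypothetical stand-ins for bound C++ functions, and the list copy imitates the conversion of a Python list into a temporary std::vector<int>. The last function shows one common workaround, returning the output alongside the error code; this is an assumption about how such a case could be handled, not a feature of the tool.

```python
def get_value_int(v: int) -> None:
    # Rebinding the local name never changes the caller's integer object.
    v = 42


def get_value_copied(vec: list) -> None:
    # Models the container conversion: the callee only ever sees a copy.
    native_copy = list(vec)
    native_copy.append(42)


def get_value_returning(vec: list) -> tuple:
    # Workaround sketch: return (error_code, output) instead of mutating in place.
    result = list(vec)
    result.append(42)
    return 0, result
```

In the first two functions the caller observes no change at all, which is exactly the silent failure that a generator warning would help to catch.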

Add options to help control lifetime and return policy

This issue pertains to another complication involving pointers: The management of heap object lifetimes within pybind11 remains somewhat enigmatic.

By default, pybind11 presupposes that the lifespan of all heap objects will be regulated by the Python Garbage Collection system.

Consequently, pybind11 inherently supports only T* and std::unique_ptr<T>. In the case of T*, it is assumed that each pointer references either a newly created or an existing object. Meanwhile, std::unique_ptr<T> is permissible solely as a return type, invariably pointing to a newly instantiated object.

Should your API diverge from this paradigm, extensive coding involving Holder and Call Policy becomes imperative.

Currently, Pybind11-Weaver offers no solutions for managing these lifetimes; thus, if your API deviates from the default configuration of pybind11, the resultant code will also fail to serve your needs, necessitating the crafting of bespoke code to integrate these APIs.

I propose two potential enhancements:

  1. Introduce additional options to enable users to dictate the holder type for each class.
  2. Implement support for specific C++ inline comment markers, such as /*return_policy=take_ownership*/, to facilitate accurate call policy determination by pybind11-weaver.

Code quality improvement

There should be some quality assurance:

  • Static code checks should be run with tools such as mypy and Pylint, and the reported errors fixed.
  • CI should be enabled to automate testing, building, and releasing.
  • More tests should be added.

This is still a developing project, not even alpha

Hi,

This project is still under development; the PyPI package and GitHub repo are both just placeholders.

I do have a full plan to re-implement it in an open-source way, but it may take some time.

In any case, the idea described in the README should be enough to help you create your own generator that fits your needs.

Provide warning when overloading with builtin types

Pybind11 intrinsically supports the binding of overloaded functions.

Nonetheless, several predicaments may arise:

  1. Pybind11 internally tries overloaded functions in the order they were registered and uses the first viable one. See the pybind11 documentation on overload resolution order for details.
  2. Certain discrepancies exist between Python and C++ types, such as Python's bool being a subclass of int.

Consequently, when generating an overload solely reliant on the builtin type, potential complications could emerge. For instance:

Should you bind void Foo(int, float) prior to void Foo(bool, float), the Python call mod.foo(False, 8) would align with void Foo(int, float), a situation incongruous with the C++ overloading mechanism.

In the face of such circumstances, it would be prudent to issue appropriate warnings when generating binding code.
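
The effect of registration order can be modeled in a few lines of Python. This is a deliberately simplified first-match dispatcher, not pybind11's real resolution logic; the ACCEPTED table and the dispatch function are assumptions introduced for illustration.

```python
# Which Python types are accepted for each C++ parameter type in this toy model.
# bool is a subclass of int in Python, so a bool also satisfies the "int" entry.
ACCEPTED = {
    "int": (int,),
    "bool": (bool,),
    "float": (float, int),
}


def dispatch(overloads, args):
    """Return the first registered signature whose parameters accept all arguments."""
    for signature in overloads:
        if len(signature) != len(args):
            continue
        if all(isinstance(arg, ACCEPTED[param]) for arg, param in zip(args, signature)):
            return signature
    raise TypeError("no matching overload")
```

In this model, registering ("int", "float") before ("bool", "float") makes the call with (False, 8) select the int overload, while reversing the registration order selects the bool overload. A generator could use a similar check to detect overload sets in which an earlier signature shadows a later one and emit a warning.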
