mozilla / cbindgen
A project for generating C bindings from Rust code
License: Mozilla Public License 2.0
When derive_eq is flagged on a type, cbindgen will attempt to create an operator== for the C++ struct. We currently check that the primitive types that compose it can be compared with operator==.
This check isn't thorough enough though, and structs that have struct members without derive_eq will incorrectly get an operator==.
I've filed rust-lang/rust#43697 to find out what we should do about this.
The current code for converting a Rust type to a C declaration is not correct and doesn't handle all the intricacies needed. It works for simpler things, like nested pointers, simple arrays, and simple function pointers, but I don't have confidence it works for arbitrary composition of those things.
Some broken things:
Currently, code like this:
cbindgen::generate(&crate_dir)
.unwrap()
.write_to_file("include/ffi.h");
will fail with
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Error { repr: Os { code: 2, message: "No such file or directory" } }', /checkout/src/libcore/result.rs:859
On GCC 5.4.0, the tests don't compile without -std=c++11.
It seems that when macro expansion is enabled in the parse section, comments do not survive. I am currently wrapping a bunch of functions so that I can inject panic handling automatically, and none of the doc comments make it into the final output.
There is no need for clap when cbindgen is used in a build script.
With recent changes, writing a build script is more difficult than it should be.
We should provide a helper function that does everything. It will probably be close to main.rs
There are still some systems which cannot deal with C++ comments. If the language is set to C it would not be a bad idea to always generate /* C style */ comments.
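As a rough illustration of the suggestion above, the sketch below picks the comment style from the output language. The `Language` enum and `render_comment` function are hypothetical names made up for this example; they are not cbindgen's actual types or API.

```rust
/// Output language, mirroring the C / C++ choice discussed above.
/// (Hypothetical type for this sketch only.)
#[derive(Clone, Copy, PartialEq, Debug)]
pub enum Language {
    C,
    Cxx,
}

/// Render documentation lines as a comment that is valid in `lang`.
pub fn render_comment(lines: &[&str], lang: Language) -> String {
    match lang {
        // C++ output can keep one `//` comment per line.
        Language::Cxx => lines
            .iter()
            .map(|l| format!("// {}", l))
            .collect::<Vec<_>>()
            .join("\n"),
        // C output uses a single block comment so that compilers
        // without `//` support still accept the header.
        Language::C => {
            let mut out = String::from("/*\n");
            for l in lines {
                // Break up any "*/" so the text cannot end the comment early.
                out.push_str(&format!(" * {}\n", l.replace("*/", "* /")));
            }
            out.push_str(" */");
            out
        }
    }
}
```

Escaping `*/` inside the text is a design choice of this sketch; without it, a doc comment that mentions a C comment terminator would produce an invalid header.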
Generic structs that accept a 'unit' type parameter (like in Euclid) use PhantomData to tell the compiler that the type is being used. It's an empty type, cbindgen should just ignore it.
This lets us do things like putting FontKey's in a std::map
Each Rust Item has a path that includes the crate and mods needed to get to it. Currently cbindgen just ignores all segments of a path except the last one, which is the name of the item.
This works as long as there are no duplicates in different modules, but isn't correct and will probably cause someone an issue someday.
We currently accept some limited options, but it'd be nice to accept more. This would be useful for when a cbindgen.toml is too heavyweight.
// crate/src/lib.rs
mod Foo {
#[repr(C)]
struct Bar { ... }
#[no_mangle]
extern "C" fn root(a: Bar) { }
}
cbindgen crate/src/lib.rs -o sample.h
will output nothing.
cbindgen crate/ -o sample.h
will output the expected definitions.
The bug is in rust_lib.rs, which will need some reworking to share code between single source parsing and whole crate parsing.
Currently we will recursively export all of the referenced types from each of the exported extern "C" functions. This works well and keeps the bindings minimal.
In the future, it could be useful to support a whitelist or a blacklist, along with a way to specify that a type should get a binding even if it's not referenced from an extern "C" function.
Currently rust_lib::parse_lib works by parsing the whole crate tree and dumping a Vec<syn::Item> through a callback into Library for each mod.
With #5 enabled this is getting problematic because we end up parsing crate dependencies that are never used in bindings and that are private. There are a lot of them, and parsing is usually repeated multiple times for multiple dependencies. This is mitigated through a cache right now.
An optimal solution would be to turn rust_lib::parse_lib into a struct which is powered by Library. Library would parse and convert the bindings crate, then determine which dependencies are needed based on what is used. Then recurse and do another parse and convert step.
I believe this depends on #7 for determining which crate a PathRef refers to.
It would be nice to have a document explaining how to write Rust bindings correctly and use cbindgen.
I've got some Rust code looking like this:
#[no_mangle]
pub static LIBLO_OK: c_uint = 0;
#[no_mangle]
pub static LIBLO_ERROR_INVALID_ARGS: c_uint = 12;
These are used as return codes for some of my API functions, but when cbindgen runs it doesn't include their declarations in the header. I'd expect it to include them like this:
extern unsigned int LIBLO_OK;
extern unsigned int LIBLO_ERROR_INVALID_ARGS;
If I add the above block to cbindgen's trailer config string, I can then use the constants in the C code that calls my library, but isn't this the sort of thing that cbindgen should be doing automatically?
Steps to reproduce:
$ git clone https://github.com/dbrgn/svg2polylines
$ cd svg2polylines
$ cat >cbindgen.toml <<EOL
language = "C"
[parse]
parse_deps = true
include = ["svg2polylines"]
EOL
$ cbindgen -v svg2polylines-ffi/src/lib.rs -c cbindgen.toml -o svg2polylines.h
INFO: take ::Polyline
INFO: take ::svg_str_to_polylines
INFO: take ::free_polylines
WARN: can't find CoordinatePair
WARN: can't find size_t
WARN: can't find size_t
WARN: can't find size_t
The CoordinatePair struct is defined in the svg2polylines crate and is pretty simple:
/// A CoordinatePair consists of an x and y coordinate.
#[derive(Debug, PartialEq, Copy, Clone)]
#[cfg_attr(feature = "use_serde", derive(Serialize, Deserialize))]
#[repr(C)]
pub struct CoordinatePair {
pub x: f64,
pub y: f64,
}
What's the reason why it can't be found?
cbindgen panics. This is the stack trace. I have no idea what could be the cause.
This is the current git master. But it was doing it on a previous version.
thread 'main' panicked at 'assertion failed: path.generics.len() == 0', /home/hub/.cargo/registry/src/github.com-1ecc6299db9ec823/cbindgen-0.1.25/src/bindgen/cdecl.rs:67:16
stack backtrace:
0: std::sys::imp::backtrace::tracing::imp::unwind_backtrace
at /checkout/src/libstd/sys/unix/backtrace/tracing/gcc_s.rs:49
1: std::sys_common::backtrace::_print
at /checkout/src/libstd/sys_common/backtrace.rs:71
2: std::panicking::default_hook::{{closure}}
at /checkout/src/libstd/sys_common/backtrace.rs:60
at /checkout/src/libstd/panicking.rs:380
3: std::panicking::default_hook
at /checkout/src/libstd/panicking.rs:396
4: std::panicking::rust_panic_with_hook
at /checkout/src/libstd/panicking.rs:611
5: std::panicking::begin_panic_new
6: cbindgen::bindgen::cdecl::CDecl::build_type
7: cbindgen::bindgen::bindings::Bindings::write
8: cbindgen::main
9: __rust_maybe_catch_panic
at /checkout/src/libpanic_unwind/lib.rs:98
10: std::rt::lang_start
at /checkout/src/libstd/panicking.rs:458
at /checkout/src/libstd/panic.rs:361
at /checkout/src/libstd/rt.rs:59
11: __libc_start_main
12: _start
typedef %
Example
struct Foo {
data: *const Bar,
}
struct Bar {
data: *const Foo,
}
Possible output
struct Bar;
struct Foo {
const Bar* data;
};
struct Bar {
const Foo* data;
};
This would require the dependency ordering algorithm to understand when a reference needs a declaration and when it needs a definition. If in the previous example, the data members were not pointers then it wouldn't work.
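One way to picture the distinction between needing a declaration and needing a definition is the sketch below. It is only an illustration under simplifying assumptions: structs are described by name, each field is tagged as by-value or by-pointer, and the output is a list of "declare X" / "define X" steps. `RefKind` and `emit_order` are hypothetical names and do not reflect cbindgen's internals.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// How a field refers to another struct (hypothetical type for this sketch).
#[derive(Clone, Copy, PartialEq, Debug)]
pub enum RefKind {
    /// The full definition must already have been emitted.
    ByValue,
    /// A forward declaration is enough.
    ByPointer,
}

/// Returns steps such as "declare Bar" / "define Foo", or `None` when
/// by-value references form a cycle, which C and C++ cannot express.
pub fn emit_order<'a>(
    structs: &BTreeMap<&'a str, Vec<(&'a str, RefKind)>>,
) -> Option<Vec<String>> {
    let mut out = Vec::new();
    let mut declared: BTreeSet<&'a str> = BTreeSet::new();
    let mut defined: BTreeSet<&'a str> = BTreeSet::new();

    while defined.len() < structs.len() {
        // Pick the first struct whose by-value dependencies are all defined.
        // Types that are not in the map are treated as external and ignored.
        let next = structs.iter().find(|(name, fields)| {
            !defined.contains(*name)
                && fields.iter().all(|(dep, kind)| {
                    *kind == RefKind::ByPointer
                        || defined.contains(dep)
                        || !structs.contains_key(dep)
                })
        });
        // No candidate left means a by-value cycle.
        let (name, fields) = next?;

        for (dep, kind) in fields {
            // Pointers to not-yet-defined types need one forward declaration.
            let needs_decl = *kind == RefKind::ByPointer
                && dep != name
                && !defined.contains(dep)
                && declared.insert(*dep);
            if needs_decl {
                out.push(format!("declare {}", dep));
            }
        }
        out.push(format!("define {}", name));
        defined.insert(*name);
    }
    Some(out)
}
```

For the Foo/Bar example, this sketch visits names alphabetically, so it would declare Foo and then define Bar and Foo, which is the mirror image of the possible output shown above. If the data members were by-value instead, the function returns `None`.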
Currently f32 and f64 are assumed to be float and double respectively, and this isn't guaranteed to be true to my knowledge.
cbindgen translates the following rust code:
#[repr(C)]
struct Foo {
dummy: usize
}
into the following c++ code:
struct Foo {
size_t dummy;
};
gcc will reject the generated code with the following error message:
error: ‘size_t’ does not name a type
size_t dummy;
^~~~~~
This could be fixed by emitting an include that defines size_t, for example #include <cstdlib>.
Anecdotally, it seems that changing where a type gets used can change its order in the generated header, which causes needless code churn. It would be nice to avoid this somehow.
Currently, when you define an enum, its variants end up in the header under their bare names. I think it would not be unreasonable to add an option to prefix them with the type name.
Eg:
#[repr(u32)]
enum FooBar {
A = 1,
B,
C
}
Would end up something like this:
enum FooBar {
FOO_BAR_A = 1,
FOO_BAR_B = 2,
FOO_BAR_C = 3
};
typedef uint32_t FooBar;
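The naming rule in this example could be sketched as follows. The helper names `screaming_snake` and `prefixed_variant` are invented for illustration, and the case conversion is deliberately naive: it inserts an underscore before every uppercase letter after the first, so acronyms such as "HTTPServer" would not come out nicely.

```rust
/// Convert a CamelCase type name such as "FooBar" into "FOO_BAR".
/// Naive: every uppercase letter after the first starts a new word.
pub fn screaming_snake(name: &str) -> String {
    let mut out = String::new();
    for (i, c) in name.chars().enumerate() {
        if c.is_uppercase() && i > 0 {
            out.push('_');
        }
        out.extend(c.to_uppercase());
    }
    out
}

/// Build the prefixed C name for one enum variant.
pub fn prefixed_variant(enum_name: &str, variant: &str) -> String {
    format!("{}_{}", screaming_snake(enum_name), variant.to_uppercase())
}
```

With these helpers, `prefixed_variant("FooBar", "A")` yields `FOO_BAR_A`, matching the header shown above.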
409755b changed function formatting to use cdecl.rs so that functions that return function pointers and such would be formatted correctly. Doing so broke the ability to format function arguments vertically.
I have a plan to fix this in cdecl.rs, but it's ugly, so I'm trying to think of a better way.
Most Rust APIs which work on the file system can accept anything which is AsRef<Path>, however cbindgen::generate() and write_to_file() only accept &str. Is there any chance you could make these user-facing functions more generic/lenient in what they accept?
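For readers unfamiliar with the pattern being requested, here is a minimal sketch of a path-generic function. `write_header` is a hypothetical stand-in written for this example and is not cbindgen's API; it only shows how a single `AsRef<Path>` bound lets callers pass `&str`, `String`, `&Path`, or `PathBuf`.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical stand-in for a header writer; not cbindgen's actual API.
/// Accepts anything that can be viewed as a `Path`.
pub fn write_header<P: AsRef<Path>>(path: P, contents: &str) -> io::Result<()> {
    // Convert once at the boundary; the rest of the code works with `&Path`.
    let path: &Path = path.as_ref();
    fs::write(path, contents)
}
```

Existing callers that pass `&str` would keep compiling against such a signature, since `str` implements `AsRef<Path>`.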
I've got a function which returns an iterator over borrowed data, with a signature something like this:
pub struct ShapeIterator<'a>(Iter<'a, Object>);
pub unsafe extern "C" fn shape_iterator_new<'a>(objects: *mut Vec<Object>) -> *mut ShapeIterator<'a> {
...
}
Even though the 'a lifetime is essentially useless, I'm keeping it around because it reminds you the ShapeIterator is actually borrowed and the Vec<Object> needs to outlive it.
It looks like including any lifetimes in a function's signature makes the crate think it contains generics, meaning the function is ignored. Would it be possible to relax this restriction?
We talked about this a bit at the SF all-hands. If we use '-Z print-type-sizes' we can figure out the size of a non-repr(C) type and make an opaque C type of the same size and alignment.
@gankro is going to explore what a stable version of '-Z print-type-sizes' would be.
Example:
#[cfg(target_os = "windows")]
#[repr(C)]
pub struct Font {
data: i32,
}
#[cfg(target_os = "macos")]
#[repr(C)]
pub struct Font {
data: char,
}
In this case, it is okay to have two items with the same name, because they will have #ifdefs to prevent them from both being compiled.
The current code searches for dependencies in the same level directory as the root crate. This works for simple cases where all the crates needed for generating bindings are local, but not for crates downloaded through Cargo.
This is one thing needed for WebRender to generate structs for Euclid types.
If I have a function with a type from libc like libc::c_uchar:
extern crate libc;
#[no_mangle]
pub extern "C" fn allsorts_lookup_glyph_index(subtable: *mut libc::c_uchar, char_code: u32) -> u32 {
abort_on_panic(move || unimplemented!())
}
cbindgen will produce an empty header:
/* Generated with cbindgen:0.1.23 */
#include <cstdint>
#include <cstdlib>
extern "C" {
} // extern "C"
whereas using u32 causes the function prototype to be output:
extern crate libc;
#[no_mangle]
pub extern "C" fn allsorts_lookup_glyph_index(subtable: *mut u32, char_code: u32) -> u32 {
abort_on_panic(move || unimplemented!())
}
/* Generated with cbindgen:0.1.23 */
#include <cstdint>
#include <cstdlib>
extern "C" {
uint32_t allsorts_lookup_glyph_index(uint32_t *subtable, uint32_t char_code);
} // extern "C"
I'd like to have some way to run tests to ensure I don't break anything. My current approach is to just run cbindgen on the Gecko/Webrender bindings and check for regressions, but that's not optimal.
A good first step would be a tool to run cbindgen for a set of sample files and compile the generated bindings with gcc and a dummy main.c. That will check for syntax errors at the least.
A next step could be functional tests that make sure the bindings are written correctly and work. But that's not as urgent and is more complicated.
I'm trying to export some Rust functions as a DLL to be used in a larger application, but instead of the usual extern "C" calling convention, I'm using stdcall. With stdcall, cbindgen doesn't emit declarations.
Would it be possible to tweak the rules so cbindgen will make bindings for anything marked extern and #[no_mangle], no matter the calling convention? I don't think there's any time you wouldn't want bindings for a function marked extern.
This looks like it's related to #49, but affects all forms of extern, not just where the "C" is omitted.
When trying to generate header files for an FFI crate I get these warnings:
WARN: skip ::svg_str_to_polylines - (not both no_mangle and extern "C")
WARN: skip ::free_polylines - (not both no_mangle and extern "C")
The warnings are correct: I use #[no_mangle] for the functions but not "C":
#[no_mangle]
pub extern fn svg_str_to_polylines(...
According to https://stackoverflow.com/a/44664851/284318 there is not a real need for the "C" part, and the version without seems preferred: rust-lang/style-team#52
Is there a reason why cbindgen requires that part of the declaration?
C enums don't have an explicit size and so aren't FFI safe.
cbindgen currently incorrectly generates them when in C language mode. In C++ mode cbindgen uses enum class, which can be typed correctly.
Rust has untagged unions.
They're not very common in Rust code because they require unsafe to use, but they're definitely useful for FFI.
This is a function that we implement in Gecko and use in Rust.
If I have
pub extern fn my_function(this: *mut MyStruct) -> bool
cbindgen will generate
bool my_function(MyStruct* this);
But this is a reserved keyword in C++. It should be escaped; renaming it to this_ would work.
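The escaping rule suggested here could look roughly like the sketch below. The keyword list is intentionally partial and `escape_identifier` is a hypothetical helper name; a real implementation would need the complete set of C and C++ reserved words.

```rust
/// A partial list of C++ reserved words, for illustration only.
const RESERVED: &[&str] = &[
    "class", "delete", "namespace", "new", "operator", "template", "this", "typename",
];

/// Append an underscore to identifiers that collide with a reserved word.
pub fn escape_identifier(name: &str) -> String {
    if RESERVED.contains(&name) {
        format!("{}_", name)
    } else {
        name.to_owned()
    }
}
```

Applied to the example above, the argument would be emitted as `MyStruct* this_`, while ordinary names such as `subtable` pass through unchanged.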
Example:
const LENGTH: usize = 64;
#[repr(C)]
pub struct Data {
pub d: [u8; LENGTH],
}
If I replace LENGTH with a number - everything works just fine.
Right now we make a guess that it's the name of the folder, and this is bad.
Looks like only "used" enums are exported.
I've got a crate using cbindgen in its build script, and everything works (aside from builds using cargo-travis, but I'll file a different bug for that) when I build the crate directly, but if I have it as a dependency and build the dependent crate, I get an odd error. Here's the relevant output of RUST_BACKTRACE=1 cargo build -v from the dependent crate:
Compiling libespm v0.1.0 (https://github.com/WrinklyNinja/libespm?branch=rust-rewrite#4afc26da)
Running `rustc --crate-name build_script_build /home/oliver/.cargo/git/checkouts/libespm-add271feea385607/4afc26d/build.rs --crate-type bin --emit=dep-info,link -C debuginfo=2 --cfg 'feature="default"' -C metadata=ffa693acaa10afaa -C extra-filename=-ffa693acaa10afaa --out-dir /home/oliver/Documents/Code/libloadorder/target/debug/build/libespm-ffa693acaa10afaa -L dependency=/home/oliver/Documents/Code/libloadorder/target/debug/deps --extern cbindgen=/home/oliver/Documents/Code/libloadorder/target/debug/deps/libcbindgen-9c2f32f2141e6cd7.rlib --cap-lints allow`
Running `/home/oliver/Documents/Code/libloadorder/target/debug/build/libespm-ffa693acaa10afaa/build-script-build`
error: failed to run custom build command for `libespm v0.1.0 (https://github.com/WrinklyNinja/libespm?branch=rust-rewrite#4afc26da)`
process didn't exit successfully: `/home/oliver/Documents/Code/libloadorder/target/debug/build/libespm-ffa693acaa10afaa/build-script-build` (exit code: 101)
--- stderr
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: "couldn\'t load \"/home/oliver/.cargo/git/checkouts/libespm-add271feea385607/4afc26d/Cargo.lock\": Io(Error { repr: Os { code: 2, message: \"No such file or directory\" } })"', /checkout/src/libcore/result.rs:859
stack backtrace:
0: std::sys::imp::backtrace::tracing::imp::unwind_backtrace
at /checkout/src/libstd/sys/unix/backtrace/tracing/gcc_s.rs:49
1: std::sys_common::backtrace::_print
at /checkout/src/libstd/sys_common/backtrace.rs:71
2: std::panicking::default_hook::{{closure}}
at /checkout/src/libstd/sys_common/backtrace.rs:60
at /checkout/src/libstd/panicking.rs:355
3: std::panicking::default_hook
at /checkout/src/libstd/panicking.rs:371
4: std::panicking::rust_panic_with_hook
at /checkout/src/libstd/panicking.rs:549
5: std::panicking::begin_panic
at /checkout/src/libstd/panicking.rs:511
6: std::panicking::begin_panic_fmt
at /checkout/src/libstd/panicking.rs:495
7: rust_begin_unwind
at /checkout/src/libstd/panicking.rs:471
8: core::panicking::panic_fmt
at /checkout/src/libcore/panicking.rs:69
9: core::result::unwrap_failed
at /checkout/src/libcore/macros.rs:29
10: <core::result::Result<T, E>>::unwrap
at /checkout/src/libcore/result.rs:737
11: build_script_build::main
at ./build.rs:18
12: __rust_maybe_catch_panic
at /checkout/src/libpanic_unwind/lib.rs:98
13: std::rt::lang_start
at /checkout/src/libstd/panicking.rs:433
at /checkout/src/libstd/panic.rs:361
at /checkout/src/libstd/rt.rs:57
14: main
15: __libc_start_main
16: _start
Sure enough, Cargo.lock doesn't get generated at the given path, though one is generated if I cd there and run cargo build there (and after that building the dependent crate works).
The dependency's build.rs is:
extern crate cbindgen;

use std::env;
use std::fs;
use std::path::PathBuf;

fn main() {
// Don't run cbindgen if it's not cargo being run
// (i.e. not for cargo coveralls)
if !env::var("CARGO").unwrap().ends_with("cargo") {
return;
}
let crate_dir = env::var("CARGO_MANIFEST_DIR").unwrap();
fs::create_dir_all("include").expect("could not create include directory");
cbindgen::generate(&crate_dir).unwrap().write_to_file(
"include/libespm.h",
);
let mut config = cbindgen::Config::from_root_or_default(PathBuf::from(&crate_dir).as_path());
config.language = cbindgen::Language::Cxx;
cbindgen::generate_with_config(&crate_dir, &config)
.unwrap()
.write_to_file("include/libespm.hpp");
}
Builds also work if I comment out the cbindgen calls.
Currently if you need to define an item in C/C++ and use it from Rust, you can use rust-bindgen and it will output the Rust definition for you.
This is problematic if that type needs to be used in a Rust FFI boundary, as cbindgen will try to output a second C/C++ definition for that type, which will cause errors.
I believe that if cbindgen is run before bindgen, we'll do the right thing and not output a definition for those items, but you will need to ensure you include the proper C/C++ files before the generated header.
I just added an annotation no-export which will prevent a type from being given a definition in the output header, which can help with this, but isn't optimal.
If you use cargo install cbindgen to install cbindgen, you will get the version currently published to crates.io. If we subsequently push new versions to crates.io, you will need to run cargo install --force cbindgen to get the latest version. People might forget to do this, and we will eventually run into the situation where people run an older version of cbindgen and inadvertently roll back changes to the generated file. For example, say a new version of cbindgen ends up reordering a couple of things in the generated file, and we check in that new version. Later, somebody using an older version of cbindgen regenerates the file, and those things go back to their old ordering. This might still compile and run, but it's a spurious change.
I think a stopgap solution (until we have cbindgen plugged into the build workflow and automatically updating) is to emit the cbindgen version number into the generated file. That way if somebody uses an older version of cbindgen the version number change will show up in the diff and we're more likely to catch it.
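To make the stopgap concrete, the sketch below writes and reads back a version banner in the style of the "Generated with cbindgen:0.1.23" comment that appears in generated headers quoted elsewhere on this page. The function names `banner` and `banner_version` are hypothetical, and the exact banner format is an assumption based on that quoted output.

```rust
/// Build the first line of a generated header for the given tool version.
pub fn banner(version: &str) -> String {
    format!("/* Generated with cbindgen:{} */", version)
}

/// Extract the version from the first line of a generated header, if present.
pub fn banner_version(header: &str) -> Option<&str> {
    header
        .lines()
        .next()?
        .strip_prefix("/* Generated with cbindgen:")?
        .strip_suffix(" */")
}
```

A regeneration step could compare `banner_version` of the checked-in header with the running tool's version and warn when they differ, in addition to the version change showing up in the diff.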
Some of them are very vague and could be improved.
This could be a useful feature for C++ output. Depends on #7
Right now there are some minimal tests under compile-tests/. I'd really like to improve the testing coverage.
Some ideas:
cargo test
compile-tests/ generated headers so we can track syntactic changes over time
crate-tests/ which would test cbindgen on crates instead of single source files
The current version on crates.io is 0.1.10, whereas the repository already contains version 0.1.12.