calebwin / emu
The write-once-run-anywhere GPGPU library for Rust
Home Page: https://calebwin.github.io/emu
License: MIT License
Hello!
First, thanks for this crate and your contribution to the Rust community. It is amazingly simple to use, even for me, who has never touched OpenCL.
I tried running a simple benchmark program and I found the result unsatisfying.
I suppose this example is so trivial that the cost of initializing the OpenCL environment on each call is heavy and is slowing down the entire function call.
The program (based on your example) is:
#![feature(test)]
extern crate em;
extern crate ocl;
extern crate test;
use em::emu;
emu! {
function logistic(x [f32]) {
x[..] = 1 / (1 + pow(E, -x[..]));
}
pub fn logistic(x: &mut Vec<f32>);
}
pub fn logistic_cpu(x: &mut Vec<f32>) {
    // Apply the logistic function in place so the result is observable.
    for value in x.iter_mut() {
        *value = 1.0 / (1.0 + 2.71828182846_f32.powf(-*value));
    }
}
#[cfg(test)]
mod tests {
use super::*;
use test::Bencher;
#[bench]
fn logistic_opencl(b: &mut Bencher) {
let mut test_data = vec![0.9, 4.9, 4.8, 3.9, 1.3, 4.8, 9.13, -0.16, 81.20, -16.0, 0.9, 4.9, 4.8, 3.9, 1.3, 4.8, 9.13, -0.16, 81.20, -16.0, 0.9, 4.9, 4.8, 3.9, 1.3, 4.8, 9.13, -0.16, 81.20, -16.0, 0.9, 4.9, 4.8, 3.9, 1.3, 4.8, 9.13, -0.16, 81.20, -16.0, 0.9, 4.9, 4.8, 3.9, 1.3, 4.8, 9.13, -0.16];
b.iter(|| logistic(&mut test_data));
println!("OpenCL : {:?}", test_data);
}
#[bench]
fn logistic_non_opencl(c: &mut Bencher) {
let mut test_data = vec![0.9, 4.9, 4.8, 3.9, 1.3, 4.8, 9.13, -0.16, 81.20, -16.0, 0.9, 4.9, 4.8, 3.9, 1.3, 4.8, 9.13, -0.16, 81.20, -16.0, 0.9, 4.9, 4.8, 3.9, 1.3, 4.8, 9.13, -0.16, 81.20, -16.0, 0.9, 4.9, 4.8, 3.9, 1.3, 4.8, 9.13, -0.16, 81.20, -16.0, 0.9, 4.9, 4.8, 3.9, 1.3, 4.8, 9.13, -0.16];
c.iter(|| logistic_cpu(&mut test_data));
println!("non OpenCL : {:?}", test_data);
}
}
And the result is:
test tests::logistic_non_opencl ... bench: 561 ns/iter (+/- 66)
test tests::logistic_opencl ... bench: 72,081,552 ns/iter (+/- 4,863,815)
My initial intention was to write a recurrent network as efficiently as possible. Do you think using Emu is a good choice?
Hey! I noticed that when I tried to run the basic example, I was unable to. The reason seems to be that it runs the `python` command. Unfortunately, on my system (Ubuntu 20.04), I didn't have this command set, as I only had `python3`. I got around this by using `update-alternatives` as per this documentation.
Here is the error I got:
cargo run --features glsl-compile --release --example basic
Compiling shaderc-sys v0.6.2
The following warnings were emitted during compilation:
warning: Checking for system installed libraries. Use --features = build-from-source to disable this behavior
warning: System installed library not found. Falling back to build from source
error: failed to run custom build command for `shaderc-sys v0.6.2`
Caused by:
process didn't exit successfully: `/home/vax/code/emu/target/release/build/shaderc-sys-53b173bc91346eab/build-script-build` (exit code: 101)
--- stdout
cargo:warning=Checking for system installed libraries. Use --features = build-from-source to disable this behavior
cargo:warning=System installed library not found. Falling back to build from source
--- stderr
thread 'main' panicked at '
couldn't find required command: "python"
', /home/vax/.cargo/registry/src/github.com-1ecc6299db9ec823/shaderc-sys-0.6.2/build/cmd_finder.rs:50:13
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
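For reference, the `update-alternatives` workaround mentioned above can be done roughly like this (the paths are the Ubuntu 20.04 defaults; adjust them if your `python3` lives elsewhere):

```shell
# Register python3 as a provider of the `python` command:
# /usr/bin/python is the link to create, /usr/bin/python3 the target, 1 the priority.
sudo update-alternatives --install /usr/bin/python python /usr/bin/python3 1
```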
Link: https://habr.com/en/post/454678/
I'll translate it to English in the near future, but someone else can do it instead (just write a comment before you start).
I am running on Ubuntu 22.04 with emu_core 0.1.1; `info()?.name` is "NVIDIA GeForce RTX 3050 Ti Laptop GPU", and the driver is version 515 of the official NVIDIA Linux driver, installed through APT.
The problem seems to be related to the presence of a storage buffer, the one called `prec_mat`: if I remove it both from the shader and from the SpirvBuilder, the issue does not manifest. I am using rust-gpu to write my shader. Note that if my integrated AMD GPU is selected, the code runs fine.
Below is a comprehensive stack trace:
___lldb_unnamed_symbol462 (@___lldb_unnamed_symbol462:301)
___lldb_unnamed_symbol11106 (@___lldb_unnamed_symbol11106:2200)
___lldb_unnamed_symbol11107 (@___lldb_unnamed_symbol11107:19)
___lldb_unnamed_symbol16036 (@___lldb_unnamed_symbol16036:120)
___lldb_unnamed_symbol11528 (@___lldb_unnamed_symbol11528:60)
___lldb_unnamed_symbol11308 (@___lldb_unnamed_symbol11308:258)
_nv002nvvm (@_nv002nvvm:11)
___lldb_unnamed_symbol58166 (@___lldb_unnamed_symbol58166:66)
___lldb_unnamed_symbol58168 (@___lldb_unnamed_symbol58168:583)
___lldb_unnamed_symbol58169 (@___lldb_unnamed_symbol58169:146)
___lldb_unnamed_symbol58181 (@___lldb_unnamed_symbol58181:164)
___lldb_unnamed_symbol58182 (@___lldb_unnamed_symbol58182:8)
___lldb_unnamed_symbol58172 (@___lldb_unnamed_symbol58172:148)
___lldb_unnamed_symbol58204 (@___lldb_unnamed_symbol58204:91)
___lldb_unnamed_symbol57964 (@___lldb_unnamed_symbol57964:70)
___lldb_unnamed_symbol57965 (@___lldb_unnamed_symbol57965:28)
ash::vk::features::DeviceFnV1_0::create_compute_pipelines (/home/mikidep/.cargo/registry/src/github.com-1ecc6299db9ec823/ash-0.31.0/src/vk/features.rs:5094)
gfx_backend_vulkan::device::<impl gfx_hal::device::Device<gfx_backend_vulkan::Backend> for gfx_backend_vulkan::Device>::create_compute_pipeline (/home/mikidep/.cargo/registry/src/github.com-1ecc6299db9ec823/gfx-backend-vulkan-0.5.11/src/device.rs:1044)
wgpu_core::device::<impl wgpu_core::hub::Global<G>>::device_create_compute_pipeline (/home/mikidep/.cargo/registry/src/github.com-1ecc6299db9ec823/wgpu-core-0.5.6/src/device/mod.rs:1932)
wgpu_device_create_compute_pipeline (/home/mikidep/.cargo/registry/src/github.com-1ecc6299db9ec823/wgpu-native-0.5.1/src/device.rs:347)
wgpu::Device::create_compute_pipeline (/home/mikidep/.cargo/registry/src/github.com-1ecc6299db9ec823/wgpu-0.5.2/src/lib.rs:906)
emu_core::device::Device::compile (/home/mikidep/.cargo/registry/src/github.com-1ecc6299db9ec823/emu_core-0.1.1/src/device.rs:611)
emu_core::compile::SpirvOrFinished<P,C>::finish (/home/mikidep/.cargo/registry/src/github.com-1ecc6299db9ec823/emu_core-0.1.1/src/compile.rs:305)
scene_emu::main (/home/mikidep/Documenti/Codice/scene-emu/src/main.rs:104)
core::ops::function::FnOnce::call_once (@core::ops::function::FnOnce::call_once:6)
std::sys_common::backtrace::__rust_begin_short_backtrace (@std::sys_common::backtrace::__rust_begin_short_backtrace:6)
std::rt::lang_start::{{closure}} (@std::rt::lang_start::{{closure}}:7)
core::ops::function::impls::<impl core::ops::function::FnOnce<A> for &F>::call_once (@std::rt::lang_start_internal:184)
std::panicking::try::do_call (@std::rt::lang_start_internal:183)
std::panicking::try (@std::rt::lang_start_internal:183)
std::panic::catch_unwind (@std::rt::lang_start_internal:183)
std::rt::lang_start_internal::{{closure}} (@std::rt::lang_start_internal:183)
std::panicking::try::do_call (@std::rt::lang_start_internal:183)
std::panicking::try (@std::rt::lang_start_internal:183)
std::panic::catch_unwind (@std::rt::lang_start_internal:183)
std::rt::lang_start_internal (@std::rt::lang_start_internal:183)
std::rt::lang_start (@std::rt::lang_start:13)
main (@main:10)
__libc_start_call_main (@__libc_start_call_main:29)
__libc_start_main_impl (@__libc_start_main@@GLIBC_2.34:43)
_start (@_start:15)
I am also attaching the relevant Rust code and disassembled shader SPIR-V code: the `DeviceBox` definitions. Below are extracts from the above source files, in which the offending parameter is declared:
(in main.rs)
let spirv = SpirvBuilder::new()
.set_entry_point_name("main")
.add_param_mut::<[u32]>() // alpha
.add_param_mut::<[StackSym]>() // stack
.add_param_mut::<[usize]>() // gives_stack
.add_param_mut::<[u32]>() // prec_mat
.add_param::<usize>() // length
.add_param::<usize>() // chunk_size
.add_param::<u32>() // term_thresh
.set_code_with_u8(std::io::Cursor::new(code))?
.build();
let c = compile::<Spirv<_>, SpirvCompile, _, GlobalCache>(spirv)?.finish()?;
Segfault happens on the last line.
(in lib.rs)
#[spirv(compute(threads(4)))]
pub fn main(
#[spirv(global_invocation_id)] id: UVec3,
#[spirv(storage_buffer, descriptor_set = 0, binding = 0)] alpha: &mut [u32],
#[spirv(storage_buffer, descriptor_set = 0, binding = 1)] stack: &mut [StackSym],
#[spirv(storage_buffer, descriptor_set = 0, binding = 2)] gives_stack: &mut [usize],
#[spirv(storage_buffer, descriptor_set = 0, binding = 3)] prec_mat: &mut [u32],
#[spirv(storage_buffer, descriptor_set = 0, binding = 4)] length: &mut usize,
#[spirv(storage_buffer, descriptor_set = 0, binding = 5)] chunk_size: &mut usize,
#[spirv(storage_buffer, descriptor_set = 0, binding = 6)] term_thresh: &mut u32,
) { // ...
I understand that the issue is probably related to NVIDIA's Vulkan implementation, but maybe you know something about this kind of issue. Thank you in advance.
Hello,
Your project looks very interesting and I wanted to give it a go. I copied your example code from the README and I couldn't compile it. After solving the issue in ticket #27 by using:
em = { git = "https://github.com/calebwin/emu", branch = "dev" }
I get the output below when running `cargo check`:
error[E0433]: failed to resolve: could not find `__rt` in `quote`
--> C:\Users\Pedro\.cargo\git\checkouts\emu-7973979264d9dc07\095942b\emu_macro\src\accelerating.rs:123:66
|
123 | ... .is_ident(&Ident::new("load", quote::__rt::Span::call_site()))
| ^^^^ could not find `__rt` in `quote`
error[E0433]: failed to resolve: could not find `__rt` in `quote`
--> C:\Users\Pedro\.cargo\git\checkouts\emu-7973979264d9dc07\095942b\emu_macro\src\accelerating.rs:169:66
|
169 | ... .is_ident(&Ident::new("read", quote::__rt::Span::call_site()))
| ^^^^ could not find `__rt` in `quote`
error[E0433]: failed to resolve: could not find `__rt` in `quote`
--> C:\Users\Pedro\.cargo\git\checkouts\emu-7973979264d9dc07\095942b\emu_macro\src\accelerating.rs:193:68
|
193 | ... .is_ident(&Ident::new("launch", quote::__rt::Span::call_site()))
| ^^^^ could not find `__rt` in `quote`
error[E0433]: failed to resolve: could not find `__rt` in `quote`
--> C:\Users\Pedro\.cargo\git\checkouts\emu-7973979264d9dc07\095942b\emu_macro\src\accelerating.rs:259:64
|
259 | let ident = Ident::new(¶m.name, quote::__rt::Span::call_site());
| ^^^^ could not find `__rt` in `quote`
I tried pinning the `quote` crate to a specific version (1.0.1) and got the following message:
error: failed to select a version for `quote`.
... required by package `emu_macro v0.1.0 (https://github.com/calebwin/emu?branch=dev#095942ba)`
... which is depended on by `em v0.3.0 (https://github.com/calebwin/emu?branch=dev#095942ba)`
... which is depended on by `emu-test v0.1.0 (D:\Code\Rust\emu-test)`
versions that meet the requirements `^1.0.2` are: 1.0.3
all possible versions conflict with previously selected packages.
previously selected package `quote v1.0.1`
... which is depended on by `emu-test v0.1.0 (D:\Code\Rust\emu-test)`
failed to select a version for `quote` which could resolve this conflict
Setting `quote` to 1.0.3 doesn't solve the issue, but it's interesting that 1.0.2 seems to have been yanked from crates.io. Is it possible `emu_macro` depends on code that is no longer present in 1.0.3?
Hello, I am trying the example:
#[macro_use]
extern crate em;
use em::*;
#[gpu_use]
fn main() {
let mut x = vec![0.0; 1000];
gpu_do!(load(x)); // move data to the GPU
gpu_do!(launch()); // off-load to run on the GPU
for i in 0..1000 {
x[i] = x[i] * 10.0;
}
gpu_do!(read(x)); // move data back from the GPU
println!("{:?}", x);
}
Here's the error.
...
Compiling emu_macro v0.1.0
error[E0277]: the trait bound `syn::Expr: std::convert::From<quote::__rt::TokenStream>` is not satisfied
--> /Users/mrrobb/.cargo/registry/src/github.com-1ecc6299db9ec823/emu_macro-0.1.0/src/passing.rs:337:50
|
337 | let gpu_ident = quote! {gpu}.into();
| ^^^^ the trait `std::convert::From<quote::__rt::TokenStream>` is not implemented for `syn::Expr`
|
= help: the following implementations were found:
<syn::Expr as std::convert::From<syn::ExprArray>>
<syn::Expr as std::convert::From<syn::ExprAssign>>
<syn::Expr as std::convert::From<syn::ExprAssignOp>>
<syn::Expr as std::convert::From<syn::ExprAsync>>
and 35 others
= note: required because of the requirements on the impl of `std::convert::Into<syn::Expr>` for `quote::__rt::TokenStream`
error: aborting due to previous error
For more information about this error, try `rustc --explain E0277`.
error: Could not compile `emu_macro`.
warning: build failed, waiting for other jobs to finish...
error: build failed
Am I doing something wrong? Thank you.
Test project: emu_test.zip
Errors (and warning) produced:
error[E0433]: failed to resolve: use of undeclared type or module `futures`
--> src\main.rs:15:5
|
15 | futures::executor::block_on(assert_device_pool_initialized());
| ^^^^^^^ use of undeclared type or module `futures`
error[E0433]: failed to resolve: use of undeclared type or module `GlslBuilder`
--> src\main.rs:23:9
|
23 | GlslBuilder::new()
| ^^^^^^^^^^^ use of undeclared type or module `GlslBuilder`
error[E0433]: failed to resolve: use of undeclared type or module `futures`
--> src\main.rs:70:22
|
70 | println!("{:?}", futures::executor::block_on(x.get())?);
| ^^^^^^^ use of undeclared type or module `futures`
error[E0412]: cannot find type `GlslCompile` in this scope
--> src\main.rs:22:31
|
22 | let c = compile::<String, GlslCompile, _, GlobalCache>(
| ^^^^^^^^^^^ not found in this scope
warning: unused import: `emu_glsl::*`
--> src\main.rs:1:5
|
1 | use emu_glsl::*;
| ^^^^^^^^^^^
|
= note: `#[warn(unused_imports)]` on by default
error[E0277]: the trait bound `Rectangle: zerocopy::AsBytes` is not satisfied
--> src\main.rs:18:71
|
18 | let mut x: DeviceBox<[Rectangle]> = vec![Default::default(); 128].as_device_boxed()?;
| ^^^^^^^^^^^^^^^ the trait `zerocopy::AsBytes` is not implemented for `Rectangle`
|
help: trait impl with same name found
--> src\main.rs:6:10
|
6 | #[derive(AsBytes, FromBytes, Copy, Clone, Default, Debug)]
| ^^^^^^^
= note: Perhaps two different versions of crate `zerocopy` are being used?
= note: required because of the requirements on the impl of `zerocopy::AsBytes` for `[Rectangle]`
= note: required because of the requirements on the impl of `emu_core::boxed::AsDeviceBoxed<[Rectangle]>` for `std::vec::Vec<Rectangle>`
error[E0599]: no method named `get` found for type `emu_core::device::DeviceBox<[Rectangle]>` in the current scope
--> src\main.rs:70:52
|
70 | println!("{:?}", futures::executor::block_on(x.get())?);
| --^^^
| | |
| | this is an associated function, not a method
| help: use associated function syntax instead: `emu_core::device::DeviceBox::<[Rectangle]>::get`
|
= note: found the following associated functions; to be used as methods, functions must have a `self` parameter
note: the candidate is defined in the trait `emu_core::cache::Cache`
--> C:\Users\jonat\.cargo\registry\src\github.com-1ecc6299db9ec823\emu_core-0.1.1\src\cache.rs:20:5
|
20 | fn get(key: u64) -> Arc<DeviceFnMut>;
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
= help: to disambiguate the method call, write `emu_core::cache::Cache::get(x)` instead
= note: the method `get` exists but the following trait bounds were not satisfied:
`Rectangle : zerocopy::FromBytes`
error: aborting due to 6 previous errors
Some errors have detailed explanations: E0277, E0412, E0433, E0599.
For more information about an error, try `rustc --explain E0277`.
error: could not compile `emu_test`.
To learn more, run the command again with --verbose.
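The `futures`-related E0433 errors and the `zerocopy::AsBytes` mismatch both suggest the test project's manifest is missing or mis-versioned dependencies. A sketch of what the additions might look like (the version numbers are assumptions; `zerocopy` in particular must match the version `emu_core` itself compiles against, or the derive implements a different crate version's trait):

```toml
[dependencies]
emu_core = "0.1.1"
# Needed for futures::executor::block_on in the example:
futures = "0.3"
# Must be the same version line that emu_core depends on, otherwise
# #[derive(AsBytes, FromBytes)] implements traits from a different
# zerocopy than the one emu_core's bounds refer to:
zerocopy = "0.3"
```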
I am interested in using emu but I have a few questions regarding this project:
Emu is a modern library for low-level, cross-platform GPGPU enabling on-device, reproducible, privacy-focused compute
What is meant by "privacy-focused compute"?
Could there be standard benchmarks that also compare against raw WebGPU performance? This is important for adoption.
Are there any plans to provide string or byte array handling capabilities from within emu kernels?
I believe it would be feasible if there were more support for integer types within emu.
I understand both CUDA and OpenCL provide integer support within kernels.
Thank you for listening.
You don't need to store examples/multiply_by_scalar/target in the repository.
Hi! I tried the example in the README, and it failed to compile with stable rustc.
I tried both em version 0.3 and the git version.
|
338 | i.args.insert(0, gpu_ident);
| ^^^^^^^^^ the trait `std::convert::From<quote::__rt::TokenStream>` is not implemented for `syn::Expr`
|
= help: the following implementations were found:
<syn::Expr as std::convert::From<syn::ExprArray>>
<syn::Expr as std::convert::From<syn::ExprAssign>>
<syn::Expr as std::convert::From<syn::ExprAssignOp>>
<syn::Expr as std::convert::From<syn::ExprAsync>>
and 35 others
= note: required because of the requirements on the impl of `std::convert::Into<syn::Expr>` for `quote::__rt::TokenStream`
Hey,
is it planned to support a binding to CUDA in the future?
It would be a great enhancement, I think :)
Best regards
Chips
I'm trying to run the tests for the `arithmetic` example (`env RUST_BACKTRACE=1 cargo test`), but am getting undefined behaviour. In most cases a double free is detected and the program crashes:
running 2 tests
X server found. dri2 connection failed!
X server found. dri2 connection failed!
X server found. dri2 connection failed!
X server found. dri2 connection failed!
X server found. dri2 connection failed!
X server found. dri2 connection failed!
X server found. dri2 connection failed!
X server found. dri2 connection failed!
X server found. dri2 connection failed!
X server found. dri2 connection failed!
X server found. dri2 connection failed!
X server found. dri2 connection failed!
X server found. dri2 connection failed!
double free or corruption (top)
error: process didn't exit successfully: `/home/cedwards/git/emu/examples/arithmetic/target/debug/deps/arithmetic-8e93e645dd4fd0c6` (signal: 6, SIGABRT: process abort signal)
Are you aware of Futhark?
It lets the user write GPU code with high-level functions (in a dedicated language) that are then optimized and compiled down to OpenCL or CUDA.
As a long-term goal, using a similar approach (or integrating with their optimizer), you could let the user use an iterator (`gpu_iter()`?) and optimize the code at compile time.
Hello, running the compute example:
use emu_core::prelude::*;
use emu_glsl::*;
use zerocopy::*;
#[repr(C)]
#[derive(AsBytes, FromBytes, Copy, Clone, Default, Debug, GlslStruct)]
struct Shape {
x: u32,
y: u32,
w: i32,
h: i32,
r: [i32; 2],
}
fn main() -> Result<(), Box<dyn std::error::Error>> {
// ensure that a device pool has been initialized
// this should be called before every time when you assume you have devices to use
// that goes for both library users and application users
futures::executor::block_on(assert_device_pool_initialized());
println!("{:?}", take()?.lock().unwrap().info.as_ref().unwrap());
// create some data on GPU
// even mutate it once loaded to GPU
let mut shapes: DeviceBox<[Shape]> = vec![Default::default(); 1024].as_device_boxed_mut()?;
let mut x: DeviceBox<[i32]> = vec![0; 1024].as_device_boxed_mut()?;
shapes.set(vec![
Shape {
x: 0,
y: 0,
w: 100,
h: 100,
r: [2, 9]
};
1024
])?;
// compile GslKernel to SPIR-V
// then, we can either inspect the SPIR-V or finish the compilation by generating a DeviceFnMut
// then, run the DeviceFnMut
let c = compile::<GlslKernel, GlslKernelCompile, Vec<u32>, GlobalCache>(
GlslKernel::new()
.spawn(64)
.share("float stuff[64]")
.param_mut::<[Shape], _>("Shape[] shapes")
.param_mut::<[i32], _>("int[] x")
.param::<i32, _>("int scalar")
.with_struct::<Shape>()
.with_const("int c", "7")
.with_helper_code(
r#"
Shape flip(Shape s) {
s.x = s.x + s.w;
s.y = s.y + s.h;
s.w *= -1;
s.h *= -1;
s.r = ivec2(5, 3);
return s;
}
"#,
)
.with_kernel_code(
"shapes[gl_GlobalInvocationID.x] = flip(shapes[gl_GlobalInvocationID.x]); x[gl_GlobalInvocationID.x] = scalar + c + int(gl_WorkGroupID.x);",
),
)?.finish()?;
unsafe {
spawn(16).launch(call!(c, &mut shapes, &mut x, &DeviceBox::new(10)?))?;
}
// download from GPU and print out
println!("{:?}", futures::executor::block_on(shapes.get())?);
println!("{:?}", futures::executor::block_on(x.get())?);
Ok(())
}
$ cargo run
yields
Finished dev [unoptimized + debuginfo] target(s) in 0.44s
Running `target/debug/emu_test`
Limits {
max_bind_groups: 4,
max_dynamic_uniform_buffers_per_pipeline_layout: 8,
max_dynamic_storage_buffers_per_pipeline_layout: 4,
max_sampled_textures_per_shader_stage: 16,
max_samplers_per_shader_stage: 16,
max_storage_buffers_per_shader_stage: 4,
max_storage_textures_per_shader_stage: 4,
max_uniform_buffers_per_shader_stage: 12,
max_uniform_buffer_binding_size: 16384,
max_push_constant_size: 0,
}
{ name: "Intel(R) Iris(TM) Plus Graphics 655", vendor_id: 0, device_id: 0, device_type: IntegratedGpu }
wgpu error: Validation Error
Caused by:
In CommandEncoder::copy_buffer_to_buffer
Copy error
destination buffer/texture is missing the `COPY_DST` usage flag
note: destination = `<Buffer-(4, 1, Metal)>`
thread 'main' panicked at 'Handling wgpu errors as fatal by default', /Users/wbrickner/.cargo/registry/src/github.com-1ecc6299db9ec823/wgpu-0.7.0/src/backend/direct.rs:1896:5
stack backtrace:
0: std::panicking::begin_panic
at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/panicking.rs:616:12
1: wgpu::backend::direct::default_error_handler
at /Users/wbrickner/.cargo/registry/src/github.com-1ecc6299db9ec823/wgpu-0.7.0/src/backend/direct.rs:1896:5
2: core::ops::function::Fn::call
at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/ops/function.rs:70:5
3: <alloc::boxed::Box<F,A> as core::ops::function::Fn<Args>>::call
at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/alloc/src/boxed.rs:1875:9
4: wgpu::backend::direct::ErrorSinkRaw::handle_error
at /Users/wbrickner/.cargo/registry/src/github.com-1ecc6299db9ec823/wgpu-0.7.0/src/backend/direct.rs:1883:9
5: wgpu::backend::direct::Context::handle_error
at /Users/wbrickner/.cargo/registry/src/github.com-1ecc6299db9ec823/wgpu-0.7.0/src/backend/direct.rs:109:9
6: wgpu::backend::direct::Context::handle_error_nolabel
at /Users/wbrickner/.cargo/registry/src/github.com-1ecc6299db9ec823/wgpu-0.7.0/src/backend/direct.rs:121:9
7: <wgpu::backend::direct::Context as wgpu::Context>::command_encoder_copy_buffer_to_buffer
at /Users/wbrickner/.cargo/registry/src/github.com-1ecc6299db9ec823/wgpu-0.7.0/src/backend/direct.rs:1542:13
8: wgpu::CommandEncoder::copy_buffer_to_buffer
at /Users/wbrickner/.cargo/registry/src/github.com-1ecc6299db9ec823/wgpu-0.7.0/src/lib.rs:1954:9
9: emu_core::device::Device::get::{{closure}}
at /Users/wbrickner/.cargo/git/checkouts/emu-7973979264d9dc07/9fe3db3/emu_core/src/device.rs:391:9
10: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/future/mod.rs:91:19
11: emu_core::boxed::<impl emu_core::device::DeviceBox<[T]>>::get::{{closure}}
at /Users/wbrickner/.cargo/git/checkouts/emu-7973979264d9dc07/9fe3db3/emu_core/src/boxed.rs:298:23
12: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/future/mod.rs:91:19
13: futures_executor::local_pool::block_on::{{closure}}
at /Users/wbrickner/.cargo/registry/src/github.com-1ecc6299db9ec823/futures-executor-0.3.21/src/local_pool.rs:315:23
14: futures_executor::local_pool::run_executor::{{closure}}
at /Users/wbrickner/.cargo/registry/src/github.com-1ecc6299db9ec823/futures-executor-0.3.21/src/local_pool.rs:90:37
15: std::thread::local::LocalKey<T>::try_with
at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/thread/local.rs:442:16
16: std::thread::local::LocalKey<T>::with
at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/std/src/thread/local.rs:418:9
17: futures_executor::local_pool::run_executor
at /Users/wbrickner/.cargo/registry/src/github.com-1ecc6299db9ec823/futures-executor-0.3.21/src/local_pool.rs:86:5
18: futures_executor::local_pool::block_on
at /Users/wbrickner/.cargo/registry/src/github.com-1ecc6299db9ec823/futures-executor-0.3.21/src/local_pool.rs:315:5
19: emu_test::main
at ./src/main.rs:71:22
20: core::ops::function::FnOnce::call_once
at /rustc/fe5b13d681f25ee6474be29d748c65adcd91f69e/library/core/src/ops/function.rs:227:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
My understanding is that buffers must have their usage declared correctly (with some amount of detail) at construction time through `wgpu`.
This code waits 10 seconds and prints the error:
thread 'main' panicked at 'Platform::default(): Unable to get platform id list after 10 seconds of waiting.', src/libcore/result.rs:999:5
note: Run with `RUST_BACKTRACE=1` environment variable to display a backtrace.
Code:
use em::{build, emu};
extern crate ocl;
use ocl::{flags, Buffer, Context, Device, Kernel, Platform, Program, Queue};
emu! {
multiply(global_vector [f32], scalar f32) {
global_vector[get_global_id(0)] *= scalar;
}
}
build! { multiply [f32] f32 }
fn main() {
let args = std::env::args().collect::<Vec<String>>();
if args.len() < 3 {
panic!("cargo run -- <SCALAR> <NUMBERS>...");
}
let scalar = args[1].parse::<f32>().unwrap();
let vector = args[2..]
.into_iter()
.map(|string| string.parse::<f32>().unwrap())
.collect();
let result = multiply(vector, scalar).unwrap();
dbg!(result);
}
My operating system is Ubuntu and I installed OpenCL by the following commands:
$ sudo apt update
$ sudo apt install ocl-icd-opencl-dev
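If the platform list still times out after installing the ICD loader, it can help to confirm that an OpenCL platform is actually visible; the `clinfo` utility (a separate Ubuntu package) does exactly that:

```shell
# Install and run clinfo; it should list at least one platform and device.
# If "Number of platforms" is 0, the problem is the OpenCL installation,
# not Emu or the ocl crate.
sudo apt install clinfo
clinfo
```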
The search bar in the rustdoc documentation does not work.
Because several bugs and UBs were fixed:
cogciprocate/ocl#150
cogciprocate/ocl#142
Hello!
Is there currently any way to manually list and select the OpenCL device?
Thanks! :)
Line 130 in d26bfe7
rustc 1.53.0 (53cb7b09b 2021-06-17)
cargo 1.53.0 (4369396ce 2021-04-27)
The same happens under Windows and Linux (given the errors, I doubted it was an OS-level issue).
Steps to recreate:
cargo new emdemo
cd emdemo
echo 'em = "0.3.0"' >> Cargo.toml
cargo build
There are a few errors being reported.
The error messages follow. The first four are similar, all in `accelerating.rs`:
error[E0433]: failed to resolve: could not find `__rt` in `quote`
--> /home/paul/.cargo/registry/src/github.com-1ecc6299db9ec823/emu_macro-0.1.0/src/accelerating.rs:123:72
|
123 | ... .is_ident(&Ident::new("load", quote::__rt::Span::call_site()))
| ^^^^ not found in `quote::__rt`
|
help: consider importing one of these items
|
3 | use crate::Span;
|
3 | use crate::__private::Span;
|
3 | use proc_macro2::Span;
|
3 | use proc_macro::Span;
|
and 3 other candidates
error[E0433]: failed to resolve: could not find `__rt` in `quote`
--> /home/paul/.cargo/registry/src/github.com-1ecc6299db9ec823/emu_macro-0.1.0/src/accelerating.rs:169:72
|
169 | ... .is_ident(&Ident::new("read", quote::__rt::Span::call_site()))
| ^^^^ not found in `quote::__rt`
|
help: consider importing one of these items
|
3 | use crate::Span;
|
3 | use crate::__private::Span;
|
3 | use proc_macro2::Span;
|
3 | use proc_macro::Span;
|
and 3 other candidates
error[E0433]: failed to resolve: could not find `__rt` in `quote`
--> /home/paul/.cargo/registry/src/github.com-1ecc6299db9ec823/emu_macro-0.1.0/src/accelerating.rs:193:74
|
193 | ... .is_ident(&Ident::new("launch", quote::__rt::Span::call_site()))
| ^^^^ not found in `quote::__rt`
|
help: consider importing one of these items
|
3 | use crate::Span;
|
3 | use crate::__private::Span;
|
3 | use proc_macro2::Span;
|
3 | use proc_macro::Span;
|
and 3 other candidates
error[E0433]: failed to resolve: could not find `__rt` in `quote`
--> /home/paul/.cargo/registry/src/github.com-1ecc6299db9ec823/emu_macro-0.1.0/src/accelerating.rs:259:70
|
259 | let ident = Ident::new(¶m.name, quote::__rt::Span::call_site());
| ^^^^ not found in `quote::__rt`
|
help: consider importing one of these items
|
3 | use crate::Span;
|
3 | use crate::__private::Span;
|
3 | use proc_macro2::Span;
|
3 | use proc_macro::Span;
|
and 3 other candidates
Compiling num-bigint v0.1.44
Then the 5th is quite different:
error[E0277]: the trait bound `syn::Expr: From<TokenStream2>` is not satisfied
--> /home/paul/.cargo/registry/src/github.com-1ecc6299db9ec823/emu_macro-0.1.0/src/passing.rs:337:50
|
337 | let gpu_ident = quote! {gpu}.into();
| ^^^^ the trait `From<TokenStream2>` is not implemented for `syn::Expr`
|
= help: the following implementations were found:
<syn::Expr as From<ExprArray>>
<syn::Expr as From<ExprAssign>>
<syn::Expr as From<ExprAssignOp>>
<syn::Expr as From<ExprAsync>>
and 35 others
= note: required because of the requirements on the impl of `Into<syn::Expr>` for `TokenStream2`
Hi all,
I am trying to execute the tests in the examples. I tried running the tests in the `integration` and `arithmetic` example folders, and in both cases the tests fail on my system.
Does this have anything to do with the crate, or am I missing some library?
Hi everyone! This project is awesome. I was trying to run the example code, but I got an error:
let gpu_ident = quote! {gpu}.into(); ->
^^^^ the trait `std::convert::From<quote::__private::TokenStream>` is not implemented for `syn::Expr`
This error was in the passing.rs file.
Could you please help me understand how it works?
P.S. My OS is Windows 10 64-bit, Rust is version 1.42.0, and I have an OpenCL driver installed (my video card is an Nvidia GeForce 1070).
As mentioned in #31, I'm migrating my project to use the new `emu_core` GLSL abstraction layer and I am encountering a few documentation issues getting started.
In the dependencies section, it is said that this is how you add `emu` to a project:
emu_core = {
git = "https://github.com/calebwin/emu/tree/master/emu_core.git",
rev = "265d2a5fb9292e2644ae4431f2982523a8d27a0f"
}
Newlines inside a dependency key in Cargo are invalid, and this also isn't a valid Git URL. Currently, the only way I see to use `emu_core` is to clone the whole `emu` repository and then use a path dependency. In workspace setups, the individual crates are typically published to crates.io individually, avoiding this issue.
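For what it's worth, a syntactically valid form of that dependency would point `git` at the repository root and keep the table on one line (whether this actually builds still depends on `emu_core` being resolvable from the repo root):

```toml
[dependencies]
emu_core = { git = "https://github.com/calebwin/emu", rev = "265d2a5fb9292e2644ae4431f2982523a8d27a0f" }
```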
Additionally, there's a discord link at the bottom of the README that I was going to use to address the above question, but the Discord invite is invalid.
The only person who can fix these issues is @calebwin, so that's why I'm filing an issue instead of a PR. Thanks again for the crate -- I'll be using it with the cloned dependency for now.
Unfortunately the logo is clearly an ostrich not an emu.
We should implement several additional traits for `DeviceBox<T>` to make it more useful.
`Clone` - for duplicating data on the GPU
`Hash` - for parallel hashing
`Eq` and `PartialEq` - for parallel equality checking
`AcceleratedIterator` - we should define a new trait for doing high-level iterator-based computation
I don't know if other people can replicate this. But at least on my system, I can't use a local work group size of anything greater than 32.
A function in Emu operates on a "work-item" (work-item is a term OpenCL uses; I loosely use it here but we can refer to it differently if we come up with a better name).
multiply(global_buffer [f32], scalar f32) {
global_buffer[get_global_id(0)] *= scalar;
}
With the above function, a work-item corresponds to a particular index in `global_buffer`. So the work can be thought of as a 1d grid with dimensions equal to the length of `global_buffer`. Let's consider another function.
multiply_matrices(m i32, n i32, k i32, global_a [f32], global_b [f32], global_c [f32]) {
let row: i32 = get_global_id(0);
let col: i32 = get_global_id(1);
let acc: f32 = 0.0;
for i in 0..k {
acc += global_a[i*m + row] * global_b[col*k + i];
}
global_c[col * m + row] = acc;
}
When this function is run, a work-item corresponds to a pair of indices - one in global_a and one in global_b. So the work in this case is a 2d grid with dimensions equal to the product of the lengths of global_a and global_b.
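For concreteness, here is a sequential CPU version of the same computation (my own illustrative code, not part of Emu): each (row, col) pair corresponds to one work-item, and all buffers are column-major, matching the indexing in the function above.

```rust
// Sequential CPU version of multiply_matrices: each (row, col) pair
// corresponds to one work-item of the 2d grid of work.
// a is m x k, b is k x n, c is m x n, all stored column-major.
fn multiply_matrices_cpu(m: usize, n: usize, k: usize,
                         a: &[f32], b: &[f32], c: &mut [f32]) {
    for col in 0..n {
        for row in 0..m {
            let mut acc = 0.0_f32;
            for i in 0..k {
                acc += a[i * m + row] * b[col * k + i];
            }
            c[col * m + row] = acc;
        }
    }
}
```

On the GPU, the two outer loops disappear: every (row, col) pair runs as its own work-item, which is exactly what get_global_id(0) and get_global_id(1) select.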
Now here's the thing - both of these functions can ultimately be run with a binding to OpenCL. But only the first function can be run with the build! macro. This is because functions you intend to run with the build! macro operate on 1d grids of work where the dimension is, by default, the length of the first parameter to the function.
This is an important thing to note, and I think it can help us answer some key questions about get_global_id().
Hi!
Your library is getting more and more popular.
Would you please use gitflow? At least, would you please protect the master branch so that no one, not even administrators, can push to it without a pull request? It will help people track your progress, and it will help you avoid accidentally pushing unwanted code into the master branch.
Best regards
Hey there! I was curious what the current project status is, as it hasn't been updated in a few years.
I totally understand if you just ran out of time for the project or something like that, but I was curious what your thoughts on the current state of the code are. Do you feel like the direction was right and it might just need to be updated to later versions of WGPU, or do you feel like there are better options now, or maybe the original plan didn't go as well as imagined?
That kind of stuff. :)
I'm thinking of adding a WGPU-powered backend to burn for machine learning, and Emu seems to have a lot of the right goals, but I'm not sure if it'd be better to use WGPU directly, since there hasn't been a lot of activity here for a while.
How can I pass a 2D array of floats?
Preparing any sort of DeviceBox from a Vec<Vec<f32>> seems to be a no-go.
The dimensions of the vector are compile-time constants from the perspective of GLSL (they get formatted in).
The dimensions are determined at runtime on the rust side.
Can I just flatten into a single buffer and the GLSL code won't notice?
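On the last question: flattening into a single row-major buffer should work, as long as the GLSL code indexes it as row * width + col (with the width available on the shader side, e.g. as one of the compile-time constants mentioned above). A minimal sketch; flatten here is my own hypothetical helper, not an Emu API:

```rust
// Hypothetical helper (not an Emu API): flatten a Vec<Vec<f32>> into a
// single row-major buffer, so a shader can index it as row * width + col.
fn flatten(matrix: &[Vec<f32>]) -> (Vec<f32>, usize, usize) {
    let height = matrix.len();
    let width = matrix.first().map_or(0, |row| row.len());
    let mut flat = Vec::with_capacity(width * height);
    for row in matrix {
        assert_eq!(row.len(), width, "all rows must have the same length");
        flat.extend_from_slice(row);
    }
    (flat, width, height)
}
```

The element at (row, col) then lives at flat[row * width + col] on both sides.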
I wanted to ping you and make you aware that this project has raised quite a bit of interest on reddit. There are two interesting questions, one about potential struct support, and the other about type checking/inference before translation.
Maybe they are of interest to you? See the reddit thread here: https://www.reddit.com/r/rust/comments/bvwvpd/emu_gpu_programming_language_for_rust/
The only things Emu functions can really accept right now are vectors (technically arrays/pointers) and primitive data (f32 or i32). Simple structures could be accepted with two changes:
- A change to the language, so you can declare what kinds of structs you accept and how to unpackage primitive data from them in the declaration of the Emu function.
- A change to the build! macro, to generate a function that can accept structs of a certain type and unpackage them into primitive data to send to the Emu function.
Before these changes are implemented, we should think about how the general interface should change for an Emu user. How should they pass structs to functions in a way that is most seamless?
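To make the discussion concrete, here is one strawman for the unpackage step: a generated wrapper could flatten a slice of structs into per-field primitive buffers before handing them to the Emu function. Particle and unpackage are purely illustrative names, not existing Emu APIs:

```rust
// Illustrative sketch only: Particle and unpackage are made-up names,
// standing in for what a build!-generated wrapper might produce.
struct Particle {
    x: f32,
    y: f32,
    z: f32,
}

// Unpackage a slice of structs into one primitive buffer per field,
// which is the shape of data Emu functions can already accept.
fn unpackage(particles: &[Particle]) -> (Vec<f32>, Vec<f32>, Vec<f32>) {
    let xs = particles.iter().map(|p| p.x).collect();
    let ys = particles.iter().map(|p| p.y).collect();
    let zs = particles.iter().map(|p| p.z).collect();
    (xs, ys, zs)
}
```

The open interface question is then whether the user writes this unpackaging by hand, declares it in the emu! block, or has it derived automatically.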
There are several internal things that make Emu's performance potentially suboptimal. This issue is a place to discuss them.
- wgpu::Device::poll is used here, and right now it blocks in an async context. I'm not sure what the solution is, but there is some discussion here.
- A staging buffer is currently used for each DeviceBox. This should be replaced by some sort of global pool of staging buffers shared by all DeviceBoxes.
Edit: the Device::poll thing is probably not an optimization but a bug in the library.
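For the staging-buffer point, a global pool could look roughly like the following sketch. This is purely illustrative, not Emu's actual design: plain Vec<u8>s stand in for wgpu staging buffers, and StagingPool is a made-up type.

```rust
// Illustrative sketch of a shared staging-buffer pool (StagingPool is a
// made-up type; Vec<u8> stands in for a real wgpu staging buffer).
use std::sync::Mutex;

struct StagingPool {
    free: Mutex<Vec<Vec<u8>>>,
}

impl StagingPool {
    fn new() -> Self {
        StagingPool { free: Mutex::new(Vec::new()) }
    }

    // Reuse any free buffer that is large enough; allocate only on a miss.
    fn checkout(&self, size: usize) -> Vec<u8> {
        let mut free = self.free.lock().unwrap();
        match free.iter().position(|b| b.len() >= size) {
            Some(i) => free.swap_remove(i),
            None => vec![0u8; size],
        }
    }

    // Return a buffer to the pool instead of dropping it.
    fn give_back(&self, buf: Vec<u8>) {
        self.free.lock().unwrap().push(buf);
    }
}
```

Each DeviceBox would check a buffer out for the duration of a transfer and return it afterward, instead of owning a staging buffer for its whole lifetime.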
Hello,
I am the chair of a working group at Khronos developing standards for graphics and compute hardware accelerators (as well as a representative for Nvidia in those groups).
We are organizing a summit to gather feedback from the ecosystem and influence the design of the next round of improvements, and I thought that you might be interested in being represented:
https://www.khronos.org/events/2021-invitation-to-the-khronos-machine-learning-summit
This summit is IP-free, and will let you present your project and your needs for improvements in the ML ecosystem, as well as hear from other companies.
Regards,
Pierre Boudier
Software architect at Nvidia
chair of the Machine Learning TSG at Khronos
Line 16 in d26bfe7
For global_particles_y and global_particles_z.
I get this in WSL2 Debian on Windows
error[E0455]: native frameworks are only available on macOS targets
--> /home/walther/.cargo/registry/src/github.com-1ecc6299db9ec823/coreaudio-sys-0.1.2/src/audio_unit.rs:6380:1
|
6380 | #[link(name = "AudioUnit", kind = "framework")]
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
error[E0455]: native frameworks are only available on macOS targets
--> /home/walther/.cargo/registry/src/github.com-1ecc6299db9ec823/coreaudio-sys-0.1.2/src/audio_unit.rs:6739:1
|
6739 | #[link(name = "AudioUnit", kind = "framework")]
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
error: aborting due to 2 previous errors
For more information about this error, try `rustc --explain E0455`.
error: could not compile `coreaudio-sys`.
I read about GPUfs. It has the capability to access files from within GPU kernel functions. Are there any plans to provide file I/O APIs from within Emu's GPU kernel functions? I.e. open file, create file, write/read, close.
Thank you for listening.