Comments (8)
The main thing stopping string handling is supporting bytes/ints/chars, right? This is blocked right now until we figure out better type inference.
I'm not sure about regex'ing, but slicing will probably never be supported in the Rust subset because it is practically impossible on GPUs. GPUs have different sections of memory: "global", "local", and "private". "Global" is the most expensive to allocate and to move data into and out of. "Private" is the cheapest but consists only of registers. Registers are like slots that can only hold primitive types like int or float. So a slice would have to be placed in "global" memory, which would be really, really inefficient.
And I don't think slices are really necessary. Once you have a slice you want to do one of 2 things.
- Index into the slice
- Iterate over the slice
The first is already possible (if you index directly into the data you are trying to take a slice of) and once we support for loops inside of the "kernel"/for loop body, the second will be possible too.
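To make the point above concrete, here is a minimal sketch in plain Rust (not Emu's macro syntax) of working on a window of a buffer by indexing directly into the original data. The function name `sum_window` and its parameters are made up for illustration.

```rust
/// Sums `len` elements of `data` starting at `start`.
/// `data` stands in for the whole buffer; the sub-slice
/// `data[start..start + len]` is never created. Every access is a
/// direct index into the original data.
fn sum_window(data: &[f32], start: usize, len: usize) -> f32 {
    let mut total = 0.0;
    for i in 0..len {
        // Offset arithmetic replaces slicing.
        total += data[start + i];
    }
    total
}
```

The loop body uses only a primitive accumulator and integer index math, which is the kind of code that can stay in registers.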
Sorting could be implemented by hand as parallel bubble sort once we have support for if statements, the modulo operator, and variables. (To add support, we need to work on modifying this traversing code and ensure that type safety is not messed up.) Sorting like this could also maybe be implemented at some point.
let mut x = vec![0.0; 1000];
// ...
// ...store random numbers in x...
// ...
gpu_do!(load(x));
gpu_do!(launch());
x.sort();
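For reference, the "parallel bubble sort" mentioned above is commonly known as odd-even transposition sort. Below is a rough CPU-side sketch in plain Rust, not Emu code. The names `phase` and `odd_even_sort` are made up for illustration. On a GPU, each compare-and-swap within one phase could run as its own work-item, because the pairs in a phase never overlap.

```rust
/// One phase of odd-even transposition sort.
/// `parity` is 0 for pairs (0,1), (2,3), ... and 1 for pairs (1,2), (3,4), ...
/// The pairs within a phase are disjoint, so they could run in parallel.
fn phase(data: &mut [f32], parity: usize) {
    let n = data.len();
    for left in (parity..n.saturating_sub(1)).step_by(2) {
        if data[left] > data[left + 1] {
            data.swap(left, left + 1);
        }
    }
}

/// Runs `n` alternating phases, which is enough to sort `n` elements.
fn odd_even_sort(data: &mut [f32]) {
    for step in 0..data.len() {
        // The modulo picks the even or odd phase.
        phase(data, step % 2);
    }
}
```

Note that this needs exactly the features listed above: an if statement, the modulo operator, and variables.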
from emu.
You can read the linked comment above for more on figuring out type inference. But basically, the challenge is this: OpenACC, which does what Emu does but for C/C++, gets to work with code like this.
int z = x + y;
And they know the type is int, so they can produce the OpenCL, int z = x + y, and maintain type safety.
But we have Rust code like this.
let z = x + y;
And somehow, we need to figure out that this z is an int.
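As a rough illustration of the problem, here is a toy sketch of bottom-up type inference for an expression like x + y, written in plain Rust. This is not Emu's implementation. The types `Ty` and `Expr` and the functions `infer`, `opencl_name`, and `declare` are all hypothetical, and it assumes the types of x and y are already known from an environment.

```rust
use std::collections::HashMap;

/// The only two types this toy example knows about.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Ty {
    Int,
    Float,
}

/// A tiny expression language: variables, literals, and addition.
#[allow(dead_code)]
enum Expr {
    Var(String),
    IntLit(i32),
    FloatLit(f32),
    Add(Box<Expr>, Box<Expr>),
}

/// Infers the type of `expr`, given the types of the variables in scope.
fn infer(expr: &Expr, env: &HashMap<String, Ty>) -> Result<Ty, String> {
    match expr {
        Expr::Var(name) => env
            .get(name)
            .copied()
            .ok_or_else(|| format!("unknown variable `{}`", name)),
        Expr::IntLit(_) => Ok(Ty::Int),
        Expr::FloatLit(_) => Ok(Ty::Float),
        Expr::Add(lhs, rhs) => {
            let (l, r) = (infer(lhs, env)?, infer(rhs, env)?);
            // Refuse to mix types rather than convert silently.
            if l == r {
                Ok(l)
            } else {
                Err(format!("cannot add {:?} and {:?}", l, r))
            }
        }
    }
}

/// Maps an inferred type to its OpenCL spelling.
fn opencl_name(ty: Ty) -> &'static str {
    match ty {
        Ty::Int => "int",
        Ty::Float => "float",
    }
}

/// Turns `let <name> = <rhs>;` into a typed OpenCL declaration.
/// `rhs_src` is the source text of the right-hand side.
fn declare(
    name: &str,
    rhs_src: &str,
    rhs: &Expr,
    env: &HashMap<String, Ty>,
) -> Result<String, String> {
    let ty = infer(rhs, env)?;
    Ok(format!("{} {} = {};", opencl_name(ty), name, rhs_src))
}
```

With x and y both recorded as Int in the environment, `declare("z", "x + y", ...)` yields the string `int z = x + y;`. Mixing an Int with a Float is rejected here, which is one (strict) way to keep type safety.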
from emu.
Wait, actually, sorting shouldn't be built in. It should be defined in some separate crate for GPU-accelerated sorting.
let mut x = vec![0.0; 1000];
// ...
// ...store random numbers in x...
// ...
gpu_do!(load(x));
x = sorting::sort(x);
Regex'ing and slicing also won't be built in. All of these should be implemented manually. However, for these to be implement-able, the above things do still need to be supported. (variables, if/else, type inference, etc.)
from emu.
I did read a bit more into the "CUDA C PROGRAMMING GUIDE PG-02829-001_v10.1 | August 2019".
In theory, the emu vectors could contain any of these types:
Type                      Alignment (bytes)
char1, uchar1             1
char2, uchar2             2
char3, uchar3             1
char4, uchar4             4
short1, ushort1           2
short2, ushort2           4
short3, ushort3           2
short4, ushort4           8
int1, uint1               4
int2, uint2               8
int3, uint3               4
int4, uint4               16
long1, ulong1             4 if sizeof(long) is equal to sizeof(int), 8 otherwise
long2, ulong2             8 if sizeof(long) is equal to sizeof(int), 16 otherwise
long3, ulong3             4 if sizeof(long) is equal to sizeof(int), 8 otherwise
long4, ulong4             16
longlong1, ulonglong1     8
longlong2, ulonglong2     16
longlong3, ulonglong3     8
longlong4, ulonglong4     16
float1                    4
float2                    8
float3                    4
float4                    16
double1                   8
double2                   16
double3                   8
double4                   16
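For anyone mirroring these types on the Rust side, here is a small sketch of how the alignment of a few of them could be expressed with `#[repr(C, align(N))]`. The struct names `Float2`, `Float3`, and `Float4` are made up for illustration and are not part of emu.

```rust
use std::mem::{align_of, size_of};

/// Mirrors float2: two f32 values, 8-byte alignment.
#[allow(dead_code)]
#[repr(C, align(8))]
#[derive(Clone, Copy)]
struct Float2 {
    x: f32,
    y: f32,
}

/// Mirrors float3: three f32 values, keeps the natural 4-byte alignment.
#[allow(dead_code)]
#[repr(C)]
#[derive(Clone, Copy)]
struct Float3 {
    x: f32,
    y: f32,
    z: f32,
}

/// Mirrors float4: four f32 values, 16-byte alignment.
#[allow(dead_code)]
#[repr(C, align(16))]
#[derive(Clone, Copy)]
struct Float4 {
    x: f32,
    y: f32,
    z: f32,
    w: f32,
}

/// Returns (size, alignment) in bytes for each mirror type.
fn layouts() -> [(usize, usize); 3] {
    [
        (size_of::<Float2>(), align_of::<Float2>()),
        (size_of::<Float3>(), align_of::<Float3>()),
        (size_of::<Float4>(), align_of::<Float4>()),
    ]
}
```

The three-component type is the odd one out: it is 12 bytes wide but only 4-byte aligned, matching the float3 entry in the list above.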
The "if" conditional is supported within CUDA kernels.
It's also supported within OpenACC.
https://www.openacc.org/sites/default/files/inline-files/API%20Guide%202.7.pdf
Although it is outside the scope of emu, it would be interesting to see support for GPUDirect RDMA within emu as well:
https://www.sc-asia.org/2018/wp-content/uploads/2018/03/1_1500_Ido_Shamay.pdf
https://www.mellanox.com/related-docs/prod_software/RDMA_Aware_Programming_user_manual.pdf
from emu.
> Wait, actually, sorting shouldn't be built in. It should be defined in some separate crate GPU-accelerated sorting.
> let mut x = vec![0.0; 1000];
> // ...
> // ...store random numbers in x...
> // ...
> gpu_do!(load(x));
> x = sorting::sort(x);
> Regex'ing and slicing also won't be built in. All of these should be implemented manually. However, for these to be implement-able, the above things do still need to be supported. (variables, if/else, type inference, etc.)
Actually, my intent was not to mutate the input request vector itself. I would pass along a second response vector with a different structure but a similar element type, something like the 8-bit unsigned integer "u8", also known as a byte, which is what you would find in a typical memory location or file. If all goes well, the response reference passed in maps directly to an intended response file, which could be local or remote.
from emu.
> In theory, the emu vectors could contain any of these types: char1, uchar1 ... double4 [list quoted from the comment above]
Yes. While f32 is what GPUs are optimized for, support for other primitive types can be added easily. The reason I haven't just gone ahead and added them is that I'm trying to think carefully about types and type safety.
> The "if" conditional is supported within cuda kernels.
> It's also supported within OpenACC.
> https://www.openacc.org/sites/default/files/inline-files/API%20Guide%202.7.pdf
I also haven't added if statements because that would require adding bool to the type system. And I'm not entirely convinced that just adding these types can be done without breaking type safety guarantees. I'm certain there is a way to do it; I just don't know if the "easy way" is the right way or if there is a harder way that will guarantee type safety with even more certainty.
> Actually my intent was not to mutate the input request vector itself. I would be passing along a second response vector itself which would contain a different structure of vector, but with similar type something like 8-bit unsigned integer "u8" also known as a byte which is what you would find within your typical memory location or file. If all goes well the actual response reference passed in is a direct mapping to an intended response file which could be local or remote.
You can create a separate vector and mutate that instead. Emu lets you do that. The only big complication is adding the u8 type. Again, it's probably safe to add, but I'm not yet convinced you can do it easily.
from emu.
Is there any way to do value clamping without supporting if or bool?
from emu.
Not at the moment. I had plans for a rewrite system that would replace expressions with appropriate builtin functions in OpenCL (so an if statement would be replaced with a clamp). Nothing has materialized yet, unfortunately.
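To illustrate the kind of rewrite described above, here is a sketch in plain Rust showing that an if-based clamp and a branch-free min/max clamp agree for lo <= hi. The function names are made up for illustration. OpenCL's clamp(x, minval, maxval) builtin is specified as fmin(fmax(x, minval), maxval), which is the same shape as the second function. Whether Emu's macro currently accepts such an expression is not something this thread confirms.

```rust
/// The form a user would naturally write, using branches.
fn clamp_with_if(x: f32, lo: f32, hi: f32) -> f32 {
    if x < lo {
        lo
    } else if x > hi {
        hi
    } else {
        x
    }
}

/// The branch-free form a rewrite system could emit instead.
/// Assumes lo <= hi.
fn clamp_branch_free(x: f32, lo: f32, hi: f32) -> f32 {
    x.max(lo).min(hi)
}
```

For example, with lo = 0.0 and hi = 1.0, both functions map -1.0 to 0.0, 0.5 to 0.5, and 2.0 to 1.0.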
from emu.