juliaparallel / ucx.jl
License: MIT License
See #6
IIUC, UCX supports active messages (AM), and that is probably the best layer on which to implement RPC à la Distributed.jl.
However, it seems that AMs are not part of the UCP API (despite being advertised as such) and are instead part of the UCT API.
blocked by JuliaGPU/AMDGPU.jl#6
vchuravy@odin ~/s/UCX (master)> julia --project=. examples/client_server.jl test 2
ERROR: LoadError: TaskFailedException:
AssertionError: String(buffer) == data
Stacktrace:
[1] start_client(::Int64) at /home/vchuravy/src/UCX/examples/client_server.jl:58
[2] (::var"#5#7"{Int64})() at ./task.jl:356
...and 1 more exception(s).
IIUC, UCX has three different kinds of shmem support: knem, xpmem, and cma.
knem and xpmem seem to be a tough call for Yggdrasil, since they are partially implemented as kernel modules.
cma is currently available for the ppc64le builds, but not for the x86_64 builds, since those build against glibc 2.12.
If one calls progress during an AM handler, hilarity ensues and everything becomes recursive.
This means that if we happen to switch to a different task on the same thread and that task calls progress, we are in deep trouble.
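One way to at least detect this situation is a re-entrancy flag around the progress call. The following is only a minimal sketch under stated assumptions: `GuardedWorker` and `guarded_progress` are hypothetical names that do not exist in UCX.jl, and the `handler` argument merely stands in for the callbacks a real progress call would run. The flag is not thread-safe; it only covers tasks sharing one thread, which is the case described above.

```julia
# Hypothetical stand-in for a worker; this is NOT the UCX.jl worker type.
mutable struct GuardedWorker
    in_progress::Bool   # true while a progress call is running
    handled::Int        # completed progress calls, for illustration only
end
GuardedWorker() = GuardedWorker(false, 0)

# Hypothetical wrapper that refuses to re-enter progress on the same worker.
function guarded_progress(handler, w::GuardedWorker)
    w.in_progress && error("progress called recursively (e.g. from inside an AM handler)")
    w.in_progress = true
    try
        handler(w)      # stands in for the callbacks a real progress call would invoke
        w.handled += 1
    finally
        w.in_progress = false   # always reset, even if the handler throws
    end
    return nothing
end

w = GuardedWorker()
guarded_progress(_ -> nothing, w)   # an ordinary call succeeds

# A handler that calls progress again is rejected instead of recursing.
recursive_call_rejected = try
    guarded_progress(inner -> guarded_progress(_ -> nothing, inner), w)
    false
catch err
    err isa ErrorException
end
```

A guard like this turns silent recursion into an immediate error. It does not make calling progress from a handler safe; the handler still has to defer any work that needs progress until the outer call has returned.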
Currently UCX.jl uses FileWatching.poll_fd to wait on incoming messages; this relies on Julia's use of libuv to allow the watcher task to suspend. This is the preferred approach, since it doesn't introduce a Julia task that looks artificially busy to the scheduler.
This does depend on UCP_FEATURE_WAKEUP, which currently leads to shared memory not being used by UCX (openucx/ucx#5322).
In symmetric use cases like tagged messages and streams, not using polling is fine, but in asymmetric use cases like active messages polling is essential; otherwise we will not call progress on the UCXWorker often enough. Even with polling, I have seen timing improvements from switching the watcher task to busy waiting (which is bad for the rest of the system).
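The suspend-until-readable pattern described above can be sketched roughly as follows. This is an illustration only, not the UCX.jl implementation: `fake_progress` and `wait_and_progress` are hypothetical names, and a plain Unix pipe (created via `ccall`) stands in for the worker's event file descriptor. With real UCX the descriptor would come from the worker, and the extra steps the wakeup feature needs are omitted here.

```julia
using FileWatching

# Hypothetical stand-in for the worker's progress call; it only counts invocations.
progress_calls = Ref(0)
fake_progress() = (progress_calls[] += 1; nothing)

# Suspend the calling task until `fd` is readable (or the timeout expires),
# then run progress once. Returns true if the descriptor became readable.
function wait_and_progress(fd::RawFD; timeout_s::Real = 1.0)
    ev = poll_fd(fd, timeout_s; readable = true)
    ev.readable || return false
    fake_progress()
    return true
end

# Demo (Unix only): a pipe stands in for the worker's event descriptor.
fds = zeros(Cint, 2)
@assert ccall(:pipe, Cint, (Ptr{Cint},), fds) == 0

# Write one byte so the read end becomes readable.
byte = UInt8[0x01]
@assert ccall(:write, Cssize_t, (Cint, Ptr{UInt8}, Csize_t), fds[2], byte, 1) == 1

woke_up = wait_and_progress(RawFD(fds[1]))

# Clean up both ends of the pipe.
ccall(:close, Cint, (Cint,), fds[1])
ccall(:close, Cint, (Cint,), fds[2])
```

While `poll_fd` is waiting, the task is parked in libuv and other Julia tasks can run. A busy-waiting watcher would instead loop over the progress call with `yield()`, which may lower latency but keeps the thread occupied.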
Alternatives that I thought about:
idle: 276874d, need to benchmark again.
@threadcall: progress will invoke callbacks into Julia code, so it can't be executed on a libuv thread.
Lines 276 to 281 in 6a369d8
The Julia survey asked which tools we use for parallelism, and it mentioned UCX.jl. As I had never heard of it before, I was curious and tried to find out what it is. But this repository doesn't really help with that :-(.
I am guessing that it is related to https://github.com/openucx/ucx?