ucx.jl's People

Contributors

juliatagbot, vchuravy

ucx.jl's Issues

Add bindings for Active Messages

IIUC, UCX supports active messages (AM), which are probably the best layer on which to implement RPC à la Distributed.jl.
However, it seems that AMs are not part of the UCP API (despite being advertised as such) and are instead part of the UCT API.

TagBot trigger issue

This issue is used to trigger TagBot; feel free to unsubscribe.

If you haven't already, you should update your TagBot.yml to include issue comment triggers.
Please see this post on Discourse for instructions and more details.

If you'd like for me to do this for you, comment TagBot fix on this issue.
I'll open a PR within a few hours, please be patient!

`examples/client_server.jl` returns corrupted data with more than one client

vchuravy@odin ~/s/UCX (master)> julia --project=. examples/client_server.jl test 2

ERROR: LoadError: TaskFailedException:
AssertionError: String(buffer) == data
Stacktrace:
 [1] start_client(::Int64) at /home/vchuravy/src/UCX/examples/client_server.jl:58
 [2] (::var"#5#7"{Int64})() at ./task.jl:356

...and 1 more exception(s).

Active message handler should not call `progress`

If one calls progress from within an AM handler, hilarity ensues and everything becomes recursive.

This means that if we happen to switch to a different task on the same thread and that task calls progress, we are in deep trouble.

  1. Add a check for recursion and turn it into an error
  2. Add a lock to prevent it
  3. Can we disable task switching from within an AM handler, or make a task switch an error?
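
Option 1 could look roughly like the sketch below. It is illustrative only and does not use the UCX.jl API: `GuardedWorker` and `guarded_progress` are hypothetical names, and the flag is a plain field, so it only catches re-entry from tasks on the same thread (it is not thread-safe).

```julia
# Hypothetical stand-in for a worker; only tracks whether progress is running.
mutable struct GuardedWorker
    in_progress::Bool
end
GuardedWorker() = GuardedWorker(false)

# Run `f` (a stand-in for the real progress call) unless progress is
# already running on this worker, in which case raise an error.
function guarded_progress(f, w::GuardedWorker)
    w.in_progress && error("recursive call to progress")
    w.in_progress = true
    try
        return f()
    finally
        # Always clear the flag, even if `f` (or a handler it runs) throws.
        w.in_progress = false
    end
end
```

Because the flag lives on the worker rather than on the task, a different task that gets scheduled in the middle of a handler and then calls progress hits the same error.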

Polling vs non polling

Currently UCX.jl uses FileWatching.poll_fd to wait on incoming messages. This relies on Julia's use of libuv to let the watcher task suspend, and it is the preferred approach since it doesn't introduce a Julia task that looks artificially busy to the scheduler.

This does depend on UCP_FEATURE_WAKEUP, which currently leads to shared memory not being used by UCX (openucx/ucx#5322).

In symmetric use-cases like tagged messages and streams, not using polling is fine. In asymmetric use-cases like active messages, polling is essential; otherwise we will not call progress on the UCXWorker often enough. Even with polling, I have seen timing improvements from switching the watcher task to busy waiting (which is bad for the rest of the system).

Alternatives that I thought about:

  • LibUV timer, but wake-ups do not occur often enough
  • LibUV idle (276874d): need to benchmark again.
  • @threadcall: progress invokes callbacks into Julia code, which can't be executed on a libuv thread
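
To make the trade-off concrete, here is a minimal sketch of a watcher loop in which the waiting strategy is a parameter. It is not the UCX.jl implementation: `drain`, `watcher`, `progress`, `wait_for_wakeup` and `should_stop` are all hypothetical names, and `progress` is assumed to return `true` whenever it handled something.

```julia
# Call `progress` until it reports that no more work is pending.
function drain(progress)
    handled = 0
    while progress()
        handled += 1
    end
    return handled
end

# Generic watcher loop: handle pending work, then wait for the next wake-up.
# `wait_for_wakeup` decides the strategy:
#   * suspending: something like () -> FileWatching.poll_fd(fd; readable=true)
#   * busy waiting: yield (keeps the task runnable all the time)
function watcher(progress, wait_for_wakeup, should_stop)
    handled = 0
    while !should_stop()
        handled += drain(progress)
        wait_for_wakeup()
    end
    # Pick up anything that arrived just before stopping.
    return handled + drain(progress)
end
```

With the suspending variant the task only runs when the file descriptor becomes readable; with `yield` it runs on every scheduler pass, which calls progress more often at the cost of looking permanently busy.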

Port selection without using `listenany`

UCX.jl/src/UCX.jl

Lines 276 to 281 in 6a369d8

# Choose free port
if port === nothing || port == 0
    port_hint = 9000 + (getpid() % 1000)
    port, sock = listenany(UInt16(port_hint))
    close(sock) # FIXME: https://github.com/rapidsai/ucx-py/blob/72552d1dd1d193d1c8ce749171cdd34d64523d53/ucp/core.py#L288-L304
end
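
The snippet above closes the probe socket before the real listener is created, so another process could grab the port in between. One possible way to avoid that, sketched below under assumptions, is to skip the probe and retry the real listener creation on successive ports. `listen_on_free_port` and `create_listener` are hypothetical names; `create_listener(port)` stands for whatever actually opens the listener and is assumed to throw when the port is unavailable.

```julia
# Try successive ports, starting from a PID-derived hint like the one used
# in the original snippet, until `create_listener` succeeds.
function listen_on_free_port(create_listener;
                             first_port::Integer = 9000 + (getpid() % 1000),
                             attempts::Integer = 100)
    for port in first_port:(first_port + attempts - 1)
        try
            # On success the listener already holds the port, so there is
            # no window in which another process can take it.
            return UInt16(port), create_listener(UInt16(port))
        catch
            # Assumption: any failure means "port unavailable". A real
            # implementation should only retry on address-in-use errors.
            continue
        end
    end
    error("no free port found in $(first_port):$(first_port + attempts - 1)")
end
```

For a plain TCP socket, `create_listener` could be `port -> Sockets.listen(port)`; for UCX.jl it would wrap the listener constructor, whose exact error behaviour is not covered here.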
