Comments (11)

Keno avatar Keno commented on August 25, 2024

The code in the Linux backend assumes Linux/glibc semantics, not POSIX. It's unlikely we'll move away from this model. Can you elaborate on what breaks? We may be able to suggest workarounds.

from libuv.

green-nsk avatar green-nsk commented on August 25, 2024

There are a lot of things that break, so I think the simplest workaround for us would be to patch uv_spawn() and replace vfork() with fork(). I haven't tried it yet, but I expect it to work, perhaps with some minor modifications.

In broad strokes, the implementation of accelerated sockets relies on keeping extra information about file descriptors and signals in userspace (and intercepting some system calls). When the child process shares userspace memory with the parent, that bookkeeping doesn't work as expected.

green-nsk avatar green-nsk commented on August 25, 2024

Take, for example, signal()/sigaction() calls. The supplied signal handler is stored in userspace. When the parent has a signal handler stored and the child process attempts to reset the signal handler back to SIG_DFL here, it ends up "resetting" the parent's signal handler as well. I haven't investigated other calls in much detail, but I expect the general scenario to be similar.

We're hitting this problem with the exasock network stack from Cisco/ex-Exablaze. Here's the relevant signal()/sigaction() code. There are others with a similar architecture that may hit similar problems, e.g. openonload from Xilinx/ex-SolarFlare.
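
The scenario described above can be sketched in a few lines of C. This is only an illustration under stated assumptions, not exasock's code: `shadow_handlers`, `shadow_signal()` and `parent_handler_survives()` are hypothetical names standing in for an interposing library's userspace bookkeeping. It also relies on Linux vfork() semantics, where the child runs in the parent's address space; POSIX leaves writes to memory in a vfork() child undefined.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* ensures vfork() and NSIG are declared */
#endif
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

typedef void (*handler_fn)(int);

/* Hypothetical userspace shadow table, standing in for the bookkeeping an
 * interposing library keeps about installed signal handlers. */
static volatile handler_fn shadow_handlers[NSIG];

static void on_usr1(int signo) { (void)signo; }

/* Hypothetical interposed signal(): it only records the handler in userspace. */
static void shadow_signal(int signo, handler_fn fn) {
    shadow_handlers[signo] = fn;
}

/* Installs a handler in the parent, lets the child reset it to SIG_DFL, and
 * reports whether the parent's record survived.
 * Returns 1 if it survived, 0 if the child clobbered it, -1 on error. */
static int parent_handler_survives(int use_vfork) {
    shadow_signal(SIGUSR1, on_usr1);

    pid_t pid = use_vfork ? vfork() : fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: reset to the default disposition, as a spawn path would. */
        shadow_signal(SIGUSR1, SIG_DFL);
        _exit(0);
    }
    if (waitpid(pid, NULL, 0) < 0)
        return -1;
    return shadow_handlers[SIGUSR1] == on_usr1;
}
```

Calling `parent_handler_survives(0)` uses fork(), so the child writes to its own copy of the table and the parent's entry is untouched. Calling `parent_handler_survives(1)` uses vfork(), and on Linux the child's write lands in the parent's table.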

Keno avatar Keno commented on August 25, 2024

Yes, I'm familiar with those techniques - it's usually a huge pile of hacks, that make assumptions that don't hold in any even reasonably complicated code base. Our recommendation to vendors we've worked with on these kinds of issues is to make sure to have an API that does not rely on any interception tricks that runtime systems that manage more complicated state can opt into.

ancapdev avatar ancapdev commented on August 25, 2024

> Yes, I'm familiar with those techniques - it's usually a huge pile of hacks, that make assumptions that don't hold in any even reasonably complicated code base. Our recommendation to vendors we've worked with on these kinds of issues is to make sure to have an API that does not rely on any interception tricks that runtime systems that manage more complicated state can opt into.

Hi @Keno. Agreed on the ideal state and what ought to be. However, vendors cater to customers, and a large fraction of customers want a drop-in preload library that transparently runs their network stack faster. This has been the de facto solution to userspace networking for quite some time, and while it often comes with some gotchas and limitations, I've also seen plenty of complicated code bases run perfectly fine over it.

Out of interest, for this specific issue, what was the rationale for using vfork on linux?

Keno avatar Keno commented on August 25, 2024

> Out of interest, for this specific issue, what was the rationale for using vfork on linux?

vfork is significantly faster, particularly on applications that have a lot of memory mappings (as julia applications often do), because the page tables need not be copied.
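
For anyone who wants to measure this on their own workload, here is a minimal timing sketch. `time_spawns()` is a hypothetical helper, not part of libuv, and no timing results are claimed here. The difference should only become visible when the calling process has a large number of memory mappings set up before the measurement.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* ensures vfork() and clock_gettime() are declared */
#endif
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Creates and reaps `iterations` children that exit immediately, using
 * vfork() when use_vfork is non-zero and fork() otherwise.
 * Returns the elapsed wall-clock time in seconds, or -1.0 on error. */
static double time_spawns(int use_vfork, int iterations) {
    struct timespec start, end;

    if (clock_gettime(CLOCK_MONOTONIC, &start) != 0)
        return -1.0;
    for (int i = 0; i < iterations; i++) {
        pid_t pid = use_vfork ? vfork() : fork();
        if (pid < 0)
            return -1.0;
        if (pid == 0)
            _exit(0); /* the child does nothing but exit */
        if (waitpid(pid, NULL, 0) < 0)
            return -1.0;
    }
    if (clock_gettime(CLOCK_MONOTONIC, &end) != 0)
        return -1.0;

    return (double)(end.tv_sec - start.tv_sec) +
           (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}
```

Comparing `time_spawns(0, n)` against `time_spawns(1, n)` in the real application, after it has built up its mappings, gives a rough idea of what fork() costs relative to vfork() there.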

ancapdev avatar ancapdev commented on August 25, 2024

> vfork is significantly faster, particularly on applications that have a lot of memory mappings (as julia applications often do), because the page tables need not be copied.

Thanks, that makes sense.

vtjnash avatar vtjnash commented on August 25, 2024

I have some code that switches to using posix_spawn from glibc, but note that it also requires vfork, and its implementation looks pretty similar to ours here. But perhaps it would avoid the symbol interposition happening with the signal code?
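
For reference, a minimal sketch of what a posix_spawn-based path could look like. This is not the code mentioned above; `spawn_with_default_signals()` is a hypothetical helper that uses only the standard `posix_spawn` attribute calls. It asks the C library to reset signal dispositions in the child, so the caller makes no signal()/sigaction() calls of its own. Whether a given preload library interposes on the calls made inside posix_spawn is not something this sketch can show.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* ensures the POSIX spawn and signal APIs are declared */
#endif
#include <signal.h>
#include <spawn.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/* Spawns argv[0] (looked up on PATH) with signal dispositions reset to their
 * defaults in the child, then waits for it.
 * Returns the child's exit status, or -1 on failure. */
static int spawn_with_default_signals(char *const argv[]) {
    posix_spawnattr_t attr;
    sigset_t to_default;
    pid_t pid;
    int status;
    int rc = -1;

    if (posix_spawnattr_init(&attr) != 0)
        return -1;

    /* Every signal except the two whose disposition can never be changed. */
    sigfillset(&to_default);
    sigdelset(&to_default, SIGKILL);
    sigdelset(&to_default, SIGSTOP);

    if (posix_spawnattr_setsigdefault(&attr, &to_default) != 0 ||
        posix_spawnattr_setflags(&attr, POSIX_SPAWN_SETSIGDEF) != 0)
        goto out;

    if (posix_spawnp(&pid, argv[0], NULL, &attr, argv, environ) != 0)
        goto out;

    if (waitpid(pid, &status, 0) == pid && WIFEXITED(status))
        rc = WEXITSTATUS(status);
out:
    posix_spawnattr_destroy(&attr);
    return rc;
}
```

A call such as `spawn_with_default_signals((char *[]){"sh", "-c", "exit 7", NULL})` runs the shell and returns its exit status.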

green-nsk avatar green-nsk commented on August 25, 2024

I would expect system calls happening within posix_spawn() won't be intercepted by the exasock. That way, the parent process memory will not be mutated. In that sense, using posix_spawn() would be better for our case.

That said, the state of signals/file descriptors tracked by exasock inside the child process may diverge from their "true" state. This may backfire in more subtle ways, but realistically I don't expect that to be a problem.

In the meantime, switching from vfork() to fork() has indeed fixed all the issues we were seeing.

rshpount avatar rshpount commented on August 25, 2024

I think some of the process initialization code looks unstable in relation to vfork. Specifically, https://github.com/JuliaLang/libuv/blob/julia-uv2-1.44.2/src/unix/process.c#L450 and https://github.com/JuliaLang/libuv/blob/julia-uv2-1.44.2/src/unix/process.c#L452. These two lines modify the pipes array in the parent process memory and replace the file handles with new handles created by F_DUPFD in the child process. These handles do not exist in the parent space, so uv__process_open_stream will fail when attempting to close the non-existing handle in https://github.com/JuliaLang/libuv/blob/julia-uv2-1.44.2/src/unix/process.c#L221.
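
The effect can be reproduced outside libuv with a small sketch. The names `shared_fd_slot` and `child_fd_valid_in_parent()` are hypothetical and this is not the libuv code. It assumes Linux vfork() semantics, where the child shares the parent's memory but has its own copy of the file descriptor table.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* ensures vfork() is declared */
#endif
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Stands in for an entry of the pipes array that parent and child both see. */
static volatile int shared_fd_slot;

/* Lets a vfork() child overwrite the slot with a descriptor it created via
 * F_DUPFD, then checks that descriptor number in the parent.
 * Returns 1 if it is valid in the parent, 0 if it is not, -1 on error. */
static int child_fd_valid_in_parent(void) {
    int fds[2];
    int valid = -1;

    if (pipe(fds) < 0)
        return -1;
    shared_fd_slot = fds[0];

    pid_t pid = vfork();
    if (pid == 0) {
        /* Child: the new descriptor exists only in the child's fd table,
         * but its number is written into memory shared with the parent. */
        shared_fd_slot = fcntl(fds[0], F_DUPFD, 100);
        _exit(0);
    }
    if (pid > 0 && waitpid(pid, NULL, 0) == pid) {
        /* Parent: F_GETFD fails with EBADF if the descriptor is not open here. */
        valid = fcntl(shared_fd_slot, F_GETFD) != -1;
    }

    close(fds[0]);
    close(fds[1]);
    return valid;
}
```

On Linux the parent ends up holding a descriptor number that was only ever opened in the child, which is the situation described above. One way to avoid it, offered only as a suggestion, is for the child to keep the duplicated descriptor in a variable of its own rather than writing it back into the shared array.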

Keno avatar Keno commented on August 25, 2024

I agree, that looks like a bug
