Giter Club home page Giter Club logo

Comments (4)

3XX0 avatar 3XX0 commented on August 10, 2024

In a lot of cases, the error is not very meaningful.

What do you mean by meaningful, each xfunc usually adds more context on top of the error (e.g. a path), and it's up to the upper layer to decide whether the error is meaningful or not.

In some cases, we lose some info (ex: xopen, xclose will return 1 error type only)

They all embed errno or something else so no you don't lose info.

In some cases, we might even fail while failing, ex: xstrdup, that can fail due to a bad alloc. If we fail, we will go to error_set, which will call asprintf, which will alloc memory. Yes!

How is it failing exactly? In the case of strdup it's just trying to allocate an error string. You could be lucky and be able to do that, if you can't, you simply don't have an error msg populated.

in other cases, the error check we have to do after is the same (xdlopen ?)

What do you mean?

My opinion is that these functions SHOULD exit when failing. This lib has no fallback when something goes wrong. Errors are set, but not handled, so we can either crash, or have UBs.
My suggestion is rewrite the whole file, and each function, if failing should:
write a meaningful message to stderr
abort

What do you mean by errors are not handled? When the library fails you usually have a pretty good idea why. What crash or UB are you experiencing?

No and no!
First, you don't get to abort a program from a shared library, especially not a container runtime. You have no idea what has been done and what needs to be undone and you don't even know what your abort is going to do, maybe you're PID 1, maybe you're a subreaper, maybe you're inside a cgo constructor...

Same applies to stderr, you have no idea where it's pointing at and whether it's safe or not to use it.

from libnvidia-container.

Keenuts avatar Keenuts commented on August 10, 2024

Hi Jonathan,

What do you mean by meaningful
They all embed errno or something else so no you don't lose info.

Well, from what I understood, this file is supposed to give helpers that will simplify error handling. errno and non-errno function unified with a common API that set the error message. errno function will give you more details, example, xopen will give the reason behind the failure. But here, we loose it. We only know opening Y file failed. And the user of the functions has to use errno to get the value. But if he does that, we loose the "uniformization" effort brought the file.

How is it failing exactly? In the case of strdup it's just trying to allocate an error [...]
This rely on the underlying library to handler properly allocations. But sometime, it's not the case, so I think it would be safer to have a code which does not use dynamic allocation when handling dynamic allocation errors. Just a matter of logic.

in other cases, the error check we have to do after is the same (xdlopen ?)

I think this function has the same issue than the open function. We have an uniform API with struct error, and we don't use it since we just say "load library error". We would like to call dlerror here, to get more context.

unhandled errors

The close case by example, which is not checked.

First, you don't get to abort a program from a shared library

yes, that's true...

Same applies to stderr

Right, I agree on that. But couldn't we have a better way to output error log ?

should we rewrite the whole thing ?

This is clearly a clumsy over-statement. Do you think there is some part we could rework here ?

from libnvidia-container.

3XX0 avatar 3XX0 commented on August 10, 2024

Again you do not lose anything, if you look at how the error functions work, they append the system error string to the input message (similar to err) and they store the error code. dlerror is called in the case of dso errors.

close is not checked because it's pointless if you look at where it's used.

There are already logging facilities implemented, you can easily get comprehensive logs with the environment variable NVC_DEBUG_FILE.

It's clearly not perfect but the points you're making are not really valid as is. Error handling in a reentrant library can be tricky, if you're curious you can check how others do it in the industry, for example libtls has a very similar approach.

from libnvidia-container.

Keenuts avatar Keenuts commented on August 10, 2024

if you look at how the error functions work
oops, missed the error_vset wrapper, see that.

close is not checked
logging facilities
ok

Ok right, see your point.

from libnvidia-container.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.