creationix commented on May 24, 2024

Interesting comments. I am concerned about the performance of my current scheme, which creates coroutines everywhere.

If I assumed that a single Lua process only used the libuv default event loop, it would simplify my code a bit.

Also I think I'm fine with changing luv to use Lua callbacks instead of coroutines. I can have all callbacks into Lua go through lua_pcall and make sure to never mutate the stack of the lua_State that called uv.run.

I'm still undecided about what to do with the uv_req_t objects. Previously I didn't expose them to Lua, to simplify things, but then you can never cancel a request. I guess if I switch to callback style I can create the req in C and return it from the non-blocking call.

from luv.

imzyxwvu commented on May 24, 2024

Cancelling a req is not often needed, but in some situations it is required (for example, to implement a timeout). Lua's support for multiple return values makes it easy to pass the request object back. So in my opinion req should be added.

uv.run in Lua is like a function that calls other functions according to IO events. So we don't have to care whether the Lua stack is mutated during uv_run: while Lua callbacks are being called, uv_run won't return.

moteus commented on May 24, 2024

Hello.
I have just started implementing my own libuv binding and I have the same questions.

> And one multi-threading thread can only hold one uv_loop because uv_run blocks the thread.
> So exposing uv_loop to Lua is useless

I use the llthreads2 library to run several Lua states in different OS threads, so I need an independent uv loop in each Lua state. I think it is sufficient to be able to create a loop and set it as the default for a Lua state.
We could also manipulate some loop options (e.g. error policy, default error handler), and that is more convenient to do with an object.

> The event callback style should be redesigned in the following style too.
> uv.read_start(stream, function(data) end)

Agreed.
In my version, all callbacks receive at least two arguments:
1 - the object (loop, file, handle). We need this to be able to write functions without upvalues, and it also gives access to the current loop.
2 - an error object or nil. I prefer errors to be objects rather than just strings.
The rest is function-specific data.

local function on_stream_read(cli, err, data)
  if err then return cli:close() end
  cli:write(data, on_write)
end

> And in the read_cb implementation, luv used lua_pushlstring, which always does a copy operation that costs CPU resource. Please have a look at lua_Buffer APIs that allocs memory by Lua.

luaL_Buffer does not do anything magic. It just allocates a userdata, lets you copy data into it, and at the end calls lua_pushlstring.
It would make more sense to implement a buffer class that wraps a raw pointer and provides functionality similar to the string library (see lbuffer), but that is a hard task.

> I'm still undecided what to do with the uv_req_t objects. Before I didn't expose them to lua to simplify things, but then you can't ever cancel a request.

> So in my opinion req should be added.

Right now we can allocate uv_req_t objects internally and free them in the callback, so we have full control over their lifetime (we can use our own allocator). But if we start exposing them to Lua, we have to use userdata.
But I agree it may be very useful.

creationix commented on May 24, 2024

@imzyxwvu I liked your idea about callbacks. I prototyped it last night and it's much cleaner!

Also since you said table indexing was slow, I instead implemented a simple linked list to attach various event handlers to the userdata. I would imagine this is much faster since it's all in pure C and doesn't interact with the lua VM much. Also most userdata shouldn't have more than about 4 handlers in the extreme cases (most have just one).

https://github.com/luvit/luv/blob/40c5af9c03da102a913eaa9fea8f412b770be1ac/src/lhandle.h

I also talked with the libuv people for a while and we couldn't come up with a good reason to expose multiple uv_loop_t instances to a single lua thread, so I'm back to using the default loop in luv.

I'll probably have the rest of the existing code ported to this new style in an hour or so.

imzyxwvu commented on May 24, 2024

A Lua table internally has two parts: an array and a hash table. The array part is a C array that is resized with realloc, so a linked list is probably also slower than the array part of a Lua table when indexing (LUA_REGISTRYINDEX). I prefer to register the callbacks on the req instead of the handle. uv's req structs have a field called data, and we can store there a luv_req struct which contains the ref integer.

It's also great to go back to using the default loop. This will make development easier.

To @moteus: multi-threading is becoming less common. Nginx's performance shows that one thread can outperform multiple threads because no locking is required. If I wanted to handle different things with multiple threads, I would prefer forking a new process instead. libuv reports errors through callbacks, so I don't know which kind of errors would be handled through the loop object.

LuaSocket also uses lua_pushlstring, so I think lua_pushlstring is good enough for now.

imzyxwvu commented on May 24, 2024

About saving callbacks, my idea is:

  1. define a struct luv_req { int callback_ref; int destroyed; }
  2. use lua_newuserdata to alloc the memory for luv_req, set the metatable of it to provide __gc to cancel the request
  3. luaL_ref the callback function and store the ref to the luv_req
  4. create the uv_req_s and store the luv_req to the data field, then call the libuv API
  5. when the request is done, mark the luv_req destroyed

But this requires the caller of each API to keep the userdata referenced, or the Lua GC will cancel the request.

I came up with another design, but it has some difficulties:

  1. luv provides an API to create requests with a callback, and these requests have a cancel method. (We would need to implement a set of APIs to create the different types of callbacks.)
  2. When a request API is called, check the type of the callback argument. If it is a userdata, treat it as a request object; otherwise, ref it, malloc a request struct, and mark that struct as to-free.

imzyxwvu commented on May 24, 2024

Reference counting for luv_handle is still required. Increase the count when the handle is created and when a request on it is created. Decrease it when its __gc is called and when a request on it completes. This makes sure the handle won't be closed while a request is in progress.
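
A minimal sketch of this counting scheme, in plain C with no Lua or libuv dependency. All names (handle_rc_t, request_started, and so on) are hypothetical, and the closed flag only marks the point where a real binding would presumably call uv_close.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bookkeeping for one handle; not luv's actual struct. */
typedef struct {
  int refs;     /* 1 for the Lua userdata, plus 1 per in-flight request */
  bool closed;  /* set where a real binding would call uv_close */
} handle_rc_t;

void handle_created(handle_rc_t *h) {
  h->refs = 1;
  h->closed = false;
}

/* Drop one reference; close the handle once nothing refers to it. */
static void handle_release(handle_rc_t *h) {
  assert(h->refs > 0);
  if (--h->refs == 0) {
    h->closed = true;
  }
}

/* A request on the handle was created: keep the handle alive. */
void request_started(handle_rc_t *h) {
  assert(!h->closed);
  h->refs++;
}

/* The request's callback has fired (or the request was cancelled). */
void request_done(handle_rc_t *h) { handle_release(h); }

/* Would be called from the userdata's __gc metamethod. */
void handle_gc(handle_rc_t *h) { handle_release(h); }
```

With this layout, a handle that is collected while a request is still pending stays open until request_done runs.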

moteus commented on May 24, 2024

> If I want to deal different things with multi threads, I would prefer forking a new process.

I use Windows, where fork is not easy. Also, if you use a shared-nothing design and communicate between threads with messages, no locks are required at all.

> libuv reports errors by callbacks so I don't know which kind of errors is to be handled with the loop object

In Lua there are two ways to report an error: lua_error, or returning nil plus an error.
Both have their use cases, so I would like the library to let the user choose which one to use.
I am not talking about callbacks here. Many functions can return errors by themselves (e.g. uv_XXX_bind or uv_accept have no callback at all).
Callbacks can also raise Lua errors:

s:start_read(function(...) error("some error") end)

If we just let these errors pass through, that may lead to resource leaks in libuv.
So we need to use lua_pcall, whose last argument selects an error handler function. We can set it to debug.traceback.
In my binding I store the error in an upvalue and call uv_stop. The run method then checks this upvalue and raises the error if it contains a value.
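
The stash-and-stop pattern from the last paragraph can be sketched without Lua or libuv. Everything here is a stand-in: toy_loop_t plays the role of the uv loop, toy_stop the role of uv_stop, and a callback's nonzero return value the role of a failed lua_pcall. It is an illustration of the idea only, not the code of any actual binding.

```c
#include <assert.h>
#include <stddef.h>

/* A callback returns 0 on success or a nonzero error code,
   standing in for the status returned by lua_pcall. */
typedef int (*toy_cb)(void *arg);

typedef struct {
  int stop_flag;      /* set by toy_stop, like uv_stop */
  int pending_error;  /* stashed error, 0 means none */
} toy_loop_t;

static void toy_stop(toy_loop_t *loop) { loop->stop_flag = 1; }

/* Run callbacks in order. On the first failure, stash the error and
   stop the loop instead of unwinding through it. The stashed error is
   returned so the caller can raise it outside the loop. */
int toy_run(toy_loop_t *loop, toy_cb *cbs, void **args, size_t n) {
  loop->stop_flag = 0;
  loop->pending_error = 0;
  for (size_t i = 0; i < n && !loop->stop_flag; i++) {
    int status = cbs[i](args[i]);
    if (status != 0) {
      loop->pending_error = status;
      toy_stop(loop);
    }
  }
  return loop->pending_error;
}

/* Example callbacks: one counts its invocations, one always fails. */
int cb_ok(void *arg) { ++*(int *)arg; return 0; }
int cb_fail(void *arg) { (void)arg; return 42; }
```

Because the error never propagates through the loop itself, the loop gets a chance to finish its current iteration and release its resources before the error is raised.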

creationix commented on May 24, 2024

@imzyxwvu I need the linked list for handles because not all callbacks are related to reqs. libuv only uses reqs for operations that are potentially very slow and the user would want the option to cancel them. Examples of callbacks not using reqs are uv_close, uv_timer_start, uv_read_start, etc. But like I said before, a single handle will have very few of these, so the O(n) cost of the linked list is fine. I'm storing the actual lua callbacks in LUA_REGISTRYINDEX and storing the ref integer in the C list. Also I have a C enum for the names instead of using strings to key the items in the list. A single list node is just two integers and a pointer.
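
A rough sketch of such a list, assuming a node of two integers and a pointer as described. The names (cb_node_t, cb_set, cb_find) and the enum values are hypothetical and this is not the code in lhandle.h; the ref field stands in for the integer returned by luaL_ref, and CB_NOREF uses the same value as Lua's LUA_NOREF.

```c
#include <assert.h>
#include <stdlib.h>

#define CB_NOREF (-2)  /* same value as Lua's LUA_NOREF */

/* Hypothetical callback names, used as keys instead of strings. */
typedef enum { CB_CLOSE, CB_TIMER, CB_READ } cb_name_t;

typedef struct cb_node {
  int name;              /* a cb_name_t value */
  int ref;               /* stand-in for a registry ref from luaL_ref */
  struct cb_node *next;
} cb_node_t;

/* O(n) lookup; n is expected to be very small per handle. */
int cb_find(const cb_node_t *head, cb_name_t name) {
  for (; head != NULL; head = head->next) {
    if (head->name == (int)name) return head->ref;
  }
  return CB_NOREF;
}

/* Store ref under name. Returns the ref it replaced (so the caller can
   release it), or CB_NOREF if there was none. */
int cb_set(cb_node_t **head, cb_name_t name, int ref) {
  for (cb_node_t *n = *head; n != NULL; n = n->next) {
    if (n->name == (int)name) {
      int old = n->ref;
      n->ref = ref;
      return old;
    }
  }
  cb_node_t *n = malloc(sizeof *n);
  assert(n != NULL);  /* sketch only: real code should report the failure */
  n->name = (int)name;
  n->ref = ref;
  n->next = *head;
  *head = n;
  return CB_NOREF;
}

/* Free every node and reset the list to empty. */
void cb_free(cb_node_t **head) {
  while (*head != NULL) {
    cb_node_t *n = *head;
    *head = n->next;
    free(n);
  }
}
```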

Regarding the refs, I think you're confusing Lua refs with libuv refs. Refs in libuv are things that keep uv_run blocking. By default this is all automatic inside libuv and you don't have to worry about them when binding to Lua. I expose uv_ref and uv_unref to Lua in case the user has special needs and wants to tweak the uv refs manually (like not blocking on stdin).

The lua refs are required since the GC knows nothing about the internal C callbacks in libuv. I create a lua ref (luaL_ref) for the userdata upon creation and don't release it till the callback to uv.close. This means there will never be a __gc event so long as the user hasn't called uv.close. But this does keep things very simple and bug free. The user needs to call uv.close anyway to prevent leaks inside libuv so I think it's fine to require it for the userdata lifetime as well.

The reqs have a much simpler model. I'll ref them upon creation (along with their one callback) and release both upon cancel or when the callback fires.

In the new version I don't need to worry about reffing the Lua states because I use the main thread for all callbacks. I only allow uv_run to be called in the main thread, and I record it in a global C variable when that happens.

moteus commented on May 24, 2024

> luv_handle's reference counting is still required. When it was created or a request about it is created, increase it. When its __gc called or a request about it is done, decrease it. To make sure when request is in progress, it won't be closed

Do we really need handles to be GC'ed? Every open handle can be accessed via the loop object.
If we use only one loop per library, then all handles are accessible.
I use this scheme. The Lua-side handle is:

typedef struct lluv_handle_tag{
  uv_handle_t *handle;
  lua_State   *L;
  lluv_flags_t flags;
  int    callbacks[1];
} lluv_handle_t;

The number of callbacks depends on the handle type.
In the create function I store a pair lightuserdata(handle) => lluv_handle_t in the registry and assign handle->data = lluv_handle_t in C.
So all handles stay alive until they are closed, and there is an easy way to access them.

lluv_handle_t *lhandle = handle->data;
lua_rawgetp(L, LUA_REGISTRYINDEX, handle); // get the Lua object

About uv_req_t:
The Lua library only needs to keep the uv_req_t until the callback; after that it can forget about it.
For the userdata I am thinking of this structure: struct luv_req_t { flags; uv_req_t req; }; so we can allocate any size in one call and just do a C type cast (e.g. (uv_connect_t*)&lreq->req), using offsetof if we need to. Lua would also know about all of the memory.
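
A sketch of this single-allocation layout, using stand-in types so it compiles without libuv. base_req_t and connect_req_t are hypothetical substitutes for uv_req_t and uv_connect_t, and calloc stands in for lua_newuserdata. The sketch assumes the concrete request type starts with the same fields as the base type and needs no stricter alignment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-ins for libuv types (assumption: the concrete request begins
   with the same fields as the base request). */
typedef struct { void *data; int type; } base_req_t;
typedef struct { void *data; int type; int fd; } connect_req_t;

typedef struct {
  unsigned flags;
  base_req_t req;  /* the concrete request is laid out starting here */
} lreq_t;

/* One allocation holds the flags plus a request of any concrete size. */
lreq_t *lreq_new(size_t concrete_size) {
  assert(concrete_size >= sizeof(base_req_t));
  return calloc(1, offsetof(lreq_t, req) + concrete_size);
}

/* Address of the embedded request, to be cast to the concrete type. */
void *lreq_payload(lreq_t *r) {
  return (char *)r + offsetof(lreq_t, req);
}

/* Recover the wrapper from the request pointer a callback receives. */
lreq_t *lreq_from_req(void *req) {
  return (lreq_t *)((char *)req - offsetof(lreq_t, req));
}
```

The offsetof arithmetic is what lets a callback that only receives the request pointer get back to the flags stored in front of it.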

imzyxwvu commented on May 24, 2024

> I need the linked list for handles because not all callbacks are related to reqs.

In this situation I would prefer a C array, because we already know how many callbacks there will be.

> In Lua there two way to report about error. lua_error or return nil and error.
> Both ways has its use case so I like if library provide user way to choose which one to use.

This was discussed in an older issue. luv won't call lua_error itself. libuv only reports errors through callbacks, and luv should be a bare binding.

creationix commented on May 24, 2024

I thought about an array, but we don't know how many callbacks there will be. We can probably know the maximum number of callbacks per uv handle type, but that would complicate the code a bit, with different handling for each type.

If the linked list ever becomes a bottleneck I can certainly consider it. It's already implemented and working great.
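
For comparison, the array alternative could look roughly like the sketch below, with a per-type table giving the number of callback slots. The handle types, the slot counts, and all names are hypothetical, and a C99 flexible array member is used where the struct shown earlier in the thread uses callbacks[1]. The per-type table is the extra handling the comment above refers to.

```c
#include <assert.h>
#include <stdlib.h>

#define NOREF (-2)  /* same value as Lua's LUA_NOREF */

/* Hypothetical handle types and their callback slot counts. */
typedef enum { H_TIMER, H_STREAM, H_TYPE_COUNT } htype_t;
static const size_t cb_count[H_TYPE_COUNT] = { 2, 3 };

typedef struct {
  htype_t type;
  size_t ncallbacks;
  int callbacks[];  /* registry refs, one slot per callback kind */
} lhandle_t;

/* Allocate the header and the right number of slots in one block.
   Returns NULL on allocation failure. */
lhandle_t *lhandle_new(htype_t type) {
  assert((int)type >= 0 && type < H_TYPE_COUNT);
  size_t n = cb_count[type];
  lhandle_t *h = malloc(sizeof *h + n * sizeof h->callbacks[0]);
  if (h == NULL) return NULL;
  h->type = type;
  h->ncallbacks = n;
  for (size_t i = 0; i < n; i++) {
    h->callbacks[i] = NOREF;  /* no callback registered yet */
  }
  return h;
}
```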

grrrwaaa commented on May 24, 2024

Multi-threaded lua (without separate processes) can easily happen when Lua is embedded in a host application. Typically in that case each OS-level thread has an independent lua_State. I have had a project with exactly this case, and luv's use of uv_default_loop() makes it unsuitable. Using multiple processes would not be a good fit for the application, as there was significant shared memory/resources.

Pushing a uv_loop into the lua registry would work fine.

creationix commented on May 24, 2024

@grrrwaaa I wonder if it would work if I store the loop in thread-local storage? I don't know enough about native threads to know how this works. http://docs.libuv.org/en/latest/threading.html#thread-local-storage

creationix commented on May 24, 2024

But yes, storing the loop in the registry would work great. I just worry about the performance implications of this. uv.fs_*, uv.new_*, and uv.spawn are the main functions that require the uv_loop. Anything taking in a handle or req has a reference to it from the libuv structs.

grrrwaaa commented on May 24, 2024

I would imagine the performance hit of accessing the registry is a lot less than the overhead of uv.fs_*, uv.new_*, etc. The Lua registry might also be faster than libuv's thread-local storage.

Another way around it would be to make the uv_loop an upvalue of all uv.fs_*, uv.new_* functions.

uv_loop_t * loop = lua_touserdata(L, lua_upvalueindex(1)); // very fast.

The trick is to install this upvalue when the module is loaded, which makes the module registration code a bit more complicated. But modules only load once per Lua state universe, so the mapping is good.

creationix commented on May 24, 2024

I like it. And I can store a reference to the Lua state in the uv_loop's data member (for all the libuv callbacks to get back to Lua).

On Fri, Oct 3, 2014 at 1:46 PM, Graham Wakefield wrote:

> I would imagine the performance hit of accessing the registry is a lot less than the overhead of uv.fs_*, uv.new_*, etc. The Lua registry might also be faster than libuv's thread-local storage.
>
> Another way around it would be to make the uv_loop an upvalue of all uv.fs_*, uv.new_* functions.
>
> uv_loop_t * loop = lua_touserdata(L, lua_upvalueindex(1)); // very fast.



grrrwaaa commented on May 24, 2024

Yes.

A luajit/FFI binding will need a totally different approach of course.

creationix commented on May 24, 2024

@grrrwaaa that worked out pretty well 1521fa2

creationix commented on May 24, 2024

The storage and retrieval code is at luv/src/luv.c, lines 238 to 268 in 1521fa2:

static lua_State* luv_state(uv_loop_t* loop) {
  return loop->data;
}

// TODO: find out if storing this somehow in an upvalue is faster
static uv_loop_t* luv_loop(lua_State* L) {
  uv_loop_t* loop;
  lua_pushstring(L, "uv_loop");
  lua_rawget(L, LUA_REGISTRYINDEX);
  loop = lua_touserdata(L, -1);
  lua_pop(L, 1);
  return loop;
}

LUALIB_API int luaopen_luv (lua_State *L) {
  uv_loop_t* loop;
  int ret;
  loop = lua_newuserdata(L, sizeof(*loop));
  ret = uv_loop_init(loop);
  if (ret < 0) {
    return luaL_error(L, "%s: %s\n", uv_err_name(ret), uv_strerror(ret));
  }
  // Tell the state how to find the loop.
  lua_pushstring(L, "uv_loop");
  lua_insert(L, -2);
  lua_rawset(L, LUA_REGISTRYINDEX);
  // Tell the loop how to find the state.
  loop->data = L;

creationix commented on May 24, 2024

I'm closing this for now. Open new issues for specific ideas. Luv should now be thread-safe and eventually I'll even wrap uv_thread_t and expose lua workers, each with their own CPU thread, lua_State and implicit libuv loop.
