Giter Club home page Giter Club logo

Comments (10)

lilizoey avatar lilizoey commented on July 21, 2024 2

Just gonna summarize general thread safety issues for the various types. When we focus more on thread safety, we'll likely need to split all the thread unsafe types in two to create a thread-safe variant for each. But there are several types which likely dont need such a split, and some types may only require slight modifications to be made safe.

Known thread safe

  • All the mathy types are implemented in rust and thus are entirely thread-safe
  • integers, floats, and bool

Likely thread safe

  • StringName is immutable + atomically refcounted and should therefore be thread safe
  • Packed*Array except String, these should be largely COW types so can't be mutated concurrently from different threads
  • Gd<T> where T is a godot-singleton, it may be that our assumptions in making Gd makes this thread unsafe but usage of the godot singletons is thread safe

Likely thread unsafe

String types can be easily cloned into a shared reference and then mutated afterwards, this is likely gonna cause concurrency issues.

  • GString is mutable and thus likely not thread safe
  • NodePath is mutable and thus likely not thread safe
  • PackedStringArray, GString is likely not thread safe so this likely isn't either

Known thread unsafe

  • Array, allows arbitrary sending and sharing of other godot types, even Array<i64> cannot in a thread-safe manner always maintain its safety invariant
  • Dictionary, allows arbitrary sending and sharing of other godot types
  • Variant, it can be any of the other types
  • Gd<T>, intentionally thread unsafe, will likely need a different type for thread safe object access - see above
  • Callable, arbitrary code may be run
  • Signal, arbitrary code may be run

from gdext.

Bromeon avatar Bromeon commented on July 21, 2024 1

This is a great overview! 👍

If we want to provide ergonomic usage, we need to think if/how we can support common use cases such as:

  • load resources on a separate thread
  • construct a node on a separate thread, send it to main, attach to tree there

Some observations:

  1. Certain methods like Object::call, Object::get, Node::call_group etc. must be excluded from thread-safety guarantees, since those can invoke arbitrary code (Rust and non-Rust).
    • This is already a problem in regular Gd<T>.
  2. Some node interactions must be run on main, e.g. scene-tree related ones like Node::add_child, Node::add_sibling, Node::reparent.
    • I don't know how to verify this unless we annotate each method manually, which I'd like to avoid if somehow possible.
  3. Other options like Node3D::set_position (most properties in general) are fine to execute on any thread, as long as there is no concurrent access to the same object from another thread.
    • Unique objects (e.g. newly created ones) are always safe to use.
    • Shared objects obey the borrow principle: 1 writer XOR many readers.
    • Node is not reference-counted, so it's impossible to know how many references exist.
    • However, at least from Rust side we could provide best-effort diagnostics in Debug mode: method calls check a global lock for that instance ID, and panic if the lock is being held. This can't be done in GDScript code, but we need to live with that being outside our guarantees.

As for unsafe usage, we need to see where it makes sense -- if every single cross-thread method becomes unsafe, we are back to gdnative's approach and don't win much. I'm particularly considering how to map Send (allow moving but not sharing) and Sync (allow shared access) to the above-mentioned semantics.

from gdext.

Bromeon avatar Bromeon commented on July 21, 2024

Re-labeling this as feature for general discussion about multithreading support.

from gdext.

Bromeon avatar Bromeon commented on July 21, 2024

To elaborate on my original plans a bit and keep the discussion public:

As mentioned in #212 (comment), some operations are currently unsound, for example:

  • when storing a Gd<T> equivalent reference on GDScript side
  • through the scene tree
  • Gd::upcast()
  • Gd::from_instance_id()
  • Gd::from_variant()
  • ...

So individual unsound access points like from_instance_id are mostly symptoms, the core problem is that Gd is not generally safe to construct on other threads.

The semantics here are actually more limiting than Rust's own:

in general, Gd<T> can only be safely constructed and used on the main thread.

This is very important, as Gd<T> is not an isolated instance (as it were in Rust). Through the engine, there can be lots of hidden links to other objects (scene tree, connected signals, base classes...). In other words, Gd<T> potentially has side effects transcending thread boundaries. This concept is not expressible using Rust standard traits; Rust generally assumes that if you have ownership of something, it's unique; and if it's shared across threads, it's Sync.


So, what can we do? My idea is still more or less the same from the original post.

  1. When threads feature is off: we strictly forbid construction of Gd<T> in other threads.

    • Validated through thread ID and panics.
  2. When threads feature is on: we need to differentiate which types are "thread-safe" according to Godot.

    • This not only concerns classes, but also builtins, especially the composite ones (Variant, Array, PackedArray, Dictionary, Callable).
    • Servers and Godot singletons should be constructible from any thread, and shareable across thread boundaries.
    • Resources can be loaded on one thread and sent to another (Send) and read concurrently from different threads (Sync), but not modified concurrently.
  3. Once we have a model, we can turn it into the type system:

    • Traits for properties that can be statically verified (like "this type is safe to construct on non-main threads").
    • Runtime checks and panics for properties that cannot be verified at compile time.
    • Escape hatches via unsafe for cases that need to be supported but don't fit into the type system. Ideally, this surface should be small.

from gdext.

lilizoey avatar lilizoey commented on July 21, 2024

i still kinda feel like it'd be better to just have two separate Gds. One that is only meant to be used on the main thread and one that is able to be used concurrently. maybe more idk we can make it as granular as we like there.

mainly because i really think that making Gd thread safe will be a very invasive procedure and likely hurt the ergonomics a fair bit. several methods may need to be made unsafe, etc. and having it be a feature that changes the behavior of Gd would mean it's either all or nothing.

im addition, how do we document that? a separate type is easy to document, but documenting all the changes from this feature seems like it'd be very confusing and weird. i imagine there is a way to do it in the docs but i cant imagine it'd look nice.

in addition, for the other core classes we can probably just make them Send/Sync if they are. all the value-types are already, since they're just implemented in rust. and for Dictionary and Array we can probably make a thread safe version of each.

anyway let's say we split Gd into Gd and GdSync (temp names). we could just make Gd check that it's always on the main thread when created, and have it not implement Send or Sync ever. (or maybe we can allow it in very specific cases, like perhaps the singletons? that might also just be confusing though).

We could also potentially use object metadata to store what thread affinity an object should have, if we want to allow Gd to be used on more than just the main thread. But that would also mean we'd need to ensure the user doesn't override the metadata somehow. unless, can we set instance bindings on arbitrary objects? if so then we might be able to use that instead.

GdSync would then have a more restricted api. likely more unsafe methods, and probably some way to track whether it should be Send or Sync. i dont know if we can do that by just implementing Send/Sync directly on the classes. at least for Sync that would allow:

let foo = Gd::<Node>::from_instance_id(id);
let node: &Node = *foo;
thread::spawn(|| node.get_child())

maybe it is fine to allow this for classes that actually are Sync (Node probably isn't).

Also there's a distinction i just thought of:

  • some classes can always be safely sent/shared between threads
  • some classes may be safely sent/shared between threads as long as the object is actually that class and not a subclass

generally the first one of these will be true if the entire subtree of the class hierarchy is also send/sync. this would technically include user-classes but we can perform a check in user classes to make sure they're not created/shared where they shouldn't be.

the second one may be true in some cases, for instance it's true for Object i believe.

we may also have some cases where some individual methods are safe to call from other threads. i suspect this may be a bit too hard to model properly though, may be feasible through an unsafe escape hatch. If we do add a RawGd we can use that as the escape hatch, RawGd would be Send and Sync, but most methods on it are unsafe.

There are also a couple other possibilities for dedicated smart pointers we could make if we wanted to. One I've thought of would be analogous to the Unique typestate in gdnative. Essentially a GdBox, which would be probably the most permissive of all the types. However it can only be safely constructed fresh, and unsafely converted from another Gd type. But also can be safely consumed and turned into other gd types. It's also freed on drop. We might actually be able to safely convert RefCounted objects into this type, since we can check the refcount to see that we're the only one owning it. This would likely be very useful in dealing with Resource in that case.

from gdext.

Bromeon avatar Bromeon commented on July 21, 2024

Thanks a lot for the great feedback!

Yep, it might be better to have its dedicated type GdSync (or Gds, SyncGd, Agd or whatever -- I'll use Gds for brevity here) if the APIs differ too much.

Proliferation of public types with closely related meaning, such as Gd, GdSync, RawGd, GdBox, ... goes a bit against our declared Simplicity principle. As a user wanting to develop a game, I'm usually not interested in selecting out of 4+ smart pointer types and memorize all the subtle differences and use cases, if GDScript or C# just allows me to get stuff done. But probably we will see if there's a real need for those once things get more concrete 🙂

Additionally, there are a lot of things we cannot check at compile time either way, by the nature how Godot operates.

You mentioned a good one: if Gds<Base> can point to Gds<Derived>, we need to ensure that all derivates follow the same constraints. Which creates non-locality: adding a new class deriving from Base may break existing usages of Gd<Base> (at runtime). Or it must be guaranteed that polymorphism (up/downcasting) is only possible when derived classes implement the same traits.

Alternatively, we could outlaw polymorphism on Gds and enforce that Gds<T> always hold exactly T. This might be an acceptable trade-off, since synced pointers are usually short-lived (e.g. load resource in background thread, do a parallel computation with a result, etc).

from gdext.

lilizoey avatar lilizoey commented on July 21, 2024

One thing we could do is have a Gd type with Box-like semantics, similar to gdnative's Unique typestate. Though a different type entirely instead of typestates may be more appropriate.

Its safety invariant is that it has unique ownership of the object. This would let us much more easily use it safely from different threads, it could be Send and maybe even Sync. Since we know there is only one reference to this object, then it should be entirely safe to use the vast majority of methods on it. It may be a bit difficult to figure out what methods could break the safety invariant though. We could then have a way to safely convert it to a normal Gd (consuming the original object). This would let us support patterns like "do things on other threads, send to main thread, then use like normal".

from gdext.

Bromeon avatar Bromeon commented on July 21, 2024

We could also start very conservatively, and limit the way how this UniqueGd (or however it's called) can be used:

  • ✔️ any Deref/DerefMut API methods on it (otherwise it's useless).
    • (We could potentially mark some unsafe if really appropriate, but this won't scale well)
  • ✔️ constructors new_alloc, new_gd, from_init_fn, from_object
  • cast, try_cast (consumes object), upcast, upcast_ref, upcast_mut (basically Deref)
  • ✔️ free (consumes object)
  • ✔️ instance_id
  • ✔️ bind, bind_mut
  • ✔️ into_gd (consumes object)
  • ✔️ pass to Godot APIs by consuming the object

These operations are absent:

  • clone(), obviously
  • from_godot and from_variant
  • to_godot and to_variant

These operations could be marked unsafe (or also absent):

  • ✖️ from_gd() or from_gd_unchecked() for manually managed types
    • could involve best-effort diagnostics in debug mode, but quite involved
  • ⚠️ from_gd() for ref-counted types, which is dynamically checked and safe
  • ✖️ from_instance_id()
    • can initially also be absent, as this may not even be needed; can also be trivially achieved with a from_gd constructor
  • ✖️ pass to Godot APIs but via reference

from gdext.

lilizoey avatar lilizoey commented on July 21, 2024

We could also start very conservatively, and limit the way how this UniqueGd (or however it's called) can be used:

* ✔️ any `Deref`/`DerefMut` API methods on it (otherwise it's useless).

  * (We could potentially mark some unsafe if really appropriate, but this won't scale well)

We do need to double check some of this, Deref is almost certainly fine. But DerefMut might have some issues where we could for instance do:

let mut foo = UniqueGd::<Node>::new_alloc();
let bar = Gd::<Node>::new_alloc();
foo.add_child(bar.clone());

send_to_main_thread(foo); // Also sends `bar` to the main thread?

bar.some_function(); // And we have a reference to `bar` on both threads?

Maybe if there was some way to limit the internal APIs such that functions called through DerefMut for UniqueGd only accepts UniqueGds...

* ✔️ `into_gd` (consumes object)

* ✔️ pass to Godot APIs by consuming the object

Both of these could maybe just be a From<UniqueGd> for Gd impl.

from gdext.

Bromeon avatar Bromeon commented on July 21, 2024

We do need to double check some of this, Deref is almost certainly fine. But DerefMut might have some issues where we could for instance do:

Good observation -- I checked Node and it seems like most "dangerous" methods actually take &mut self. We'll possibly find one or another outlier, but this might greatly help at keeping the &self methods ergonomic 🙂


Maybe if there was some way to limit the internal APIs such that functions called through DerefMut for UniqueGd only accepts UniqueGds...

We could maybe reuse the trick from std::thread::spawn() and require a closure with a Send bound?

So mutable methods could not be implicitly called through DerefMut, but would need something like

unique.with_mut(move |this| {
    this.add_child(other_unique);
});

from gdext.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.