wg's Issues

Skill tree

I've seen skill trees used to coordinate some medium/large projects, e.g. making Firefox faster or const-eval in the Rust compiler, and started wondering about their utility for filling in the gaps in something like a language ecosystem (i.e. ML 😉). It might also help new contributors find a niche to work in.

Obviously this would be scoped at the components needed to build things at an application level, not at the depth of a specific project. It might also be nicer in terms of specificity than arewelearningyet/task-board. Just wondering about people's thoughts; I can try to have a play around with this over the week and see.

Examples:

Consider migrating Zulip chat over to Discord

@bytesnake @YuhanLiin @ZuseZ4 Over the past few months, I've been keeping an eye on the Rust-ML Zulip chat at https://rust-ml.zulipchat.com, and have noticed that there's very little activity. This has generally been true since we started that server up back in (2019? 2020?). There's periodically been some activity when a few people were working on the same thing at the same time, but generally speaking, we occasionally get an introduction, a few people "wave", and then nothing really happens.

I believe that part of this is that, at least at the moment, Zulip's web client doesn't seem to allow checking multiple servers in the same tab; if you're logged in, you're talking specifically in that server. In addition, while Zulip's threaded discussions allow for very targeted topics, the Rust-ML group doesn't generally have enough going on that we need a new stream for each topic to keep everyone updated; standard channels like the ones in Discord could work just as well, with one linfa channel, potentially one dfdx channel, one transformers channel, etc. Part of our original decision to use Zulip was that the Rust Project has a Zulip and that Zulip is open source (unlike Discord), which seemed complementary. That said, I think there's a practical argument for choosing a more prevalent, likely easier platform (Discord) over a more difficult but open source one (Zulip).

In addition, since much of my work is fairly interdisciplinary, I often find myself with a Discord tab (or the app, if you have it) open while browsing the Discord channels of other Rust-focused open source communities such as the Bevy game engine, the dfdx crate, the AeroRust community, etc., even outside of more personal ones that people might have. There's a bit of a natural flow where, if I log in to check or ask something in one community, I'll often browse through the other servers I've joined as well. This means that, even in passing, I suspect a Rust-ML Discord server would get much more traffic and engagement than the Zulip chat does, which in turn could drive more interest/contributors into projects.

I guess my proposal is this: create a Rust-ML Discord server, make an announcement about it and change the community docs to mention both it and the Zulip chat, then do an evaluation period of a few months to see if there's a difference in traction between the two. If there's a significant change, we then consider deprecating the Zulip chat. I haven't had much experience administering a Discord server, but I'm happy to take on that role, and I have a couple of people I can reach out to for advice on how to make that easier. Thoughts?

Prior approaches

Existing Rust ML Solutions

Leaf & Collenchyma

This framework focuses narrowly on providing just the most basic layers and operations. It is very old and no longer maintained.

Tensors

Leaf is an ML framework that uses its own custom backend-agnostic tensor library called Collenchyma. The tensor type is SharedTensor. This tensor type is not parameterized by any backend, but a backend must be passed when creating the tensor. This means that the backend associated with the tensor can be ignored, which may simplify type bounds on functions that operate on a tensor. A tensor can even live on multiple backends at the same time, hence the "Shared". The only things you can do with the tensor itself are to reshape it or extract its memory.
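
For a sense of the API, here is a hypothetical usage sketch paraphrasing Collenchyma's README; the exact method names and shape arguments may differ:

// Hypothetical usage sketch; method names may not match Collenchyma exactly.
let backend = Backend::<Native>::default().unwrap();
// The device is passed at creation time, but the tensor type itself is
// just SharedTensor<f32>, not parameterized by the backend.
let mut x = SharedTensor::<f32>::new(backend.device(), &(2, 2)).unwrap();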

Backends

Collenchyma supports multiple backends: Native, OpenCL, and CUDA. It does this in most places using enums:
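
A minimal sketch of that enum-based dispatch, with illustrative names rather than Collenchyma's exact definitions:

// Illustrative sketch of enum-based backend dispatch (hypothetical names,
// not Collenchyma's actual types).
struct NativeDevice;
struct OpenClDevice;
struct CudaDevice;

enum DeviceType {
    Native(NativeDevice),
    OpenCl(OpenClDevice),
    Cuda(CudaDevice),
}

fn sigmoid(device: &DeviceType, x: &[f32], out: &mut [f32]) {
    // Every operation matches on the device enum at runtime, so all
    // backends get compiled in even if only one is ever used.
    match device {
        DeviceType::Native(_) => {
            for (o, &v) in out.iter_mut().zip(x) {
                *o = 1.0 / (1.0 + (-v).exp());
            }
        }
        DeviceType::OpenCl(_) => unimplemented!("dispatch to an OpenCL kernel"),
        DeviceType::Cuda(_) => unimplemented!("dispatch to a CUDA kernel"),
    }
}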

All types of contexts can be used simultaneously in the same binary. However, since Collenchyma builds all the backends into itself, depending on Collenchyma pulls Native, OpenCL, and CUDA backend code into your dependency tree. For this reason, Collenchyma's approach is probably not ideal. It does have the advantage that tensors can be freely shared among backends, with the synchronization handled by Collenchyma.

The actual operations exist on the backend. For instance, the CUDA backend can execute the sigmoid operation:

backend.sigmoid(&mut x, &mut result).unwrap();

Rusty Machine

This framework is old and not maintained.

Tensors

This framework uses rulinalg for its tensors. That crate has fallen out of favor relative to nalgebra. It doesn't even support 3d tensors, so it is not worth considering in a modern application. It also has no support for GPUs.

Learning

You can find the docs for learning here: https://athemathmo.github.io/rusty-machine/doc/rusty_machine/learning/index.html

It is clear that there is not an emphasis on traditional neural networks.

mli

This framework was written more recently (by me!). It has enough built-in tools to create basic convolutional neural networks, with some examples. Only native backends are currently supported, and it is not actively maintained.

Tensors

This framework has no tensor type. Instead, it only supplies abstractions to chain ops together. These ops then typically depend on a tensor type. For instance, the sigmoid op's forward looks like this:

impl Forward for Logistic {
    type Input = f32;
    // No intermediate state needs to be saved for backprop.
    type Internal = ();
    type Output = f32;

    fn forward(&self, &input: &f32) -> ((), f32) {
        ((), logistic(input))
    }
}

To run this on a whole tensor, you use mli-ndarray:

pub struct Map3One<G>(pub G);

impl<G> Forward for Map3One<G>
where
    G: Forward,
{
    type Input = Array3<G::Input>;
    type Internal = Array3<G::Internal>;
    type Output = Array3<G::Output>;

    fn forward(&self, input: &Self::Input) -> (Self::Internal, Self::Output) {
        let both_vec: Vec<(G::Internal, G::Output)> =
            input.iter().map(|input| self.0.forward(input)).collect();
        let (internal_vec, output_vec) = both_vec.into_iter().fold(
            (vec![], vec![]),
            |(mut internal_vec, mut output_vec), (internal, output)| {
                internal_vec.push(internal);
                output_vec.push(output);
                (internal_vec, output_vec)
            },
        );
        let internal_array = Array::from_shape_vec(input.raw_dim(), internal_vec).unwrap();
        let output_array = Array::from_shape_vec(input.raw_dim(), output_vec).unwrap();
        (internal_array, output_array)
    }
}

This struct wraps an op that operates on one item and lets it span a whole array. There is a similar impl for the backpropagation:

impl<G> Backward for Map3One<G>
where
    G: Backward,
    G::TrainDelta: Clone + Add + Zero,
{
    type OutputDelta = Array3<G::OutputDelta>;
    type InputDelta = Array3<G::InputDelta>;
    type TrainDelta = G::TrainDelta;

    fn backward(
        &self,
        input: &Self::Input,
        internal: &Self::Internal,
        output_delta: &Self::OutputDelta,
    ) -> (Self::InputDelta, Self::TrainDelta) {
        let both_vec: Vec<(G::InputDelta, G::TrainDelta)> =
            izip!(input.iter(), internal.iter(), output_delta.iter(),)
                .map(|(input, internal, output_delta)| {
                    self.0.backward(input, internal, output_delta)
                })
                .collect();
        let (input_delta_vec, train_delta_vec) = both_vec.into_iter().fold(
            (vec![], vec![]),
            |(mut input_delta_vec, mut train_delta_vec), (input_delta, train_delta)| {
                input_delta_vec.push(input_delta);
                train_delta_vec.push(train_delta);
                (input_delta_vec, train_delta_vec)
            },
        );
        let input_delta_array = Array::from_shape_vec(input.raw_dim(), input_delta_vec).unwrap();
        let train_delta_array = Array::from_shape_vec(input.raw_dim(), train_delta_vec).unwrap();
        (input_delta_array, train_delta_array.sum())
    }
}

As you can see, the input and output types are specific to ndarray. This means that you can write ops that are as specific as they need to be.

Backends

In mli, the backend is chosen by the code itself. For instance, once you have used Map3One from mli-ndarray, that piece of the graph only runs with ndarray. This also means that the graph is "static": it cannot be stored to disk and loaded. This is not a problem in and of itself. Some pros and cons (a minimal sketch follows the list):

Pros

  • It is easy to write code that is backend agnostic and integrate it with backend-specific code
  • The compiler can inline and optimize native code so CPU implementations can run faster

Cons

  • Since no backend handle is passed around, any backend context (such as for a GPU) must be stored globally or in the tensor type
  • You cannot have a black box model that can be swapped by changing a file
    • You can swap trained weights, but not the actual graph itself
  • We cannot perform optimizations on the graph (like fusing or rearranging ops) since it is compiled as code and not stored in memory
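
To make the "static graph" point concrete, here is a minimal usage sketch with the Map3One and Logistic types shown above (assuming Logistic is a unit struct): the graph is the concrete Rust type Map3One<Logistic>, fixed at compile time.

use ndarray::Array3;

fn run() {
    // The "graph" is the type Map3One<Logistic>; it cannot be serialized
    // and reloaded, only rebuilt by recompiling the code.
    let net = Map3One(Logistic);
    let input: Array3<f32> = Array3::zeros((2, 2, 2));
    // internal is an Array3<()>, output an Array3<f32> of sigmoid values.
    let (_internal, output) = net.forward(&input);
    assert_eq!(output.dim(), (2, 2, 2));
}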

deep

deep didn't even get past its first PR adding the graph (until I merged it just now). Here is what it does:

Tensors

Tensors in deep are specified by the backend. This means that if you are using em (Emu), you can use DeviceBox<[f32]> as your Tensor type. This is non-ideal, since we would like to have tensors containing types other than f32. Unfortunately, this cannot be done without Generic Associated Types (GATs). A small example from the RFC:

trait PointerFamily {
    type Pointer<T>;
    fn new<T>(value: T) -> Self::Pointer<T>;
}

impl PointerFamily for RcFamily {
    type Pointer<T> = Rc<T>;
    fn new<T>(value: T) -> Self::Pointer<T> {
        Rc::new(value)
    }
}

As you can see, the line type Pointer<T> = Rc<T>; creates an associated type with a type parameter. This type parameter then parameterizes Rc. We need this functionality to allow deep to achieve the same thing with tensors:

type Tensor<T> = DeviceBox<[T]>;

As you can see, now we can have backend tensors with arbitrary types. This gets even better:

type Tensor<T, const S> = Array<T, S>;

This is what it would look like once GATs and const generics are merged. This would allow us to pass a shape to the underlying tensor. On native systems, this could mean huge performance gains since algorithms can be tuned at compile-time to work with particular shapes and filter sizes. Unfortunately, this is a far-off thing. A better solution for now might be to use something like em to get a specific tensor type we can parameterize.
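
For illustration, here is a hypothetical sketch of what such a GAT-based backend trait could look like; this is not deep's actual API, and all names here are made up:

// Hypothetical GAT-based backend trait; not deep's actual API.
trait Backend {
    // Each backend picks its own tensor representation for any element type.
    type Tensor<T>;
    fn zeros<T: Default + Clone>(&self, len: usize) -> Self::Tensor<T>;
}

struct NativeBackend;

impl Backend for NativeBackend {
    type Tensor<T> = Vec<T>;
    fn zeros<T: Default + Clone>(&self, len: usize) -> Self::Tensor<T> {
        vec![T::default(); len]
    }
}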
