wg's Issues

Skill tree

I've seen skill trees used to coordinate some medium/large projects, e.g. making Firefox faster or const-eval in the Rust compiler, and started wondering about their utility for filling in the gaps in something like a language ecosystem (i.e. ML 😉). It might also help new contributors find a niche to work in.

Obviously this would be scoped at the components needed to build things at an application level, not at the depth of a specific project. It might also be nicer in terms of specificity than arewelearningyet/task-board. Just wondering about people's thoughts; I can try to have a play around with this over the week and see.

Examples:

Consider migrating Zulip chat over to Discord

@bytesnake @YuhanLiin @ZuseZ4 Over the past few months, I've been keeping an eye on the Rust-ML Zulip chat at https://rust-ml.zulipchat.com, and have noticed that there's very little activity. This has generally been true since we started that server up back in (2019? 2020?). There's periodically been some activity when a few people were working on the same thing at the same time, but generally speaking, we occasionally get an introduction, a few people "wave", and then nothing really happens.

I believe that part of this is that, at least at the moment, Zulip's web client doesn't seem to allow checking multiple servers in the same tab; if you're logged in, you're talking specifically in that server. In addition, while Zulip's threaded discussions allow for very targeted topics, the Rust-ML group doesn't generally have enough going on that we need a new stream for each topic to keep everyone updated; standard channels like the ones in Discord could work just as well, with one linfa channel, potentially one dfdx channel, one transformers channel, etc. Part of our original decision to use Zulip was that the Rust Project has a Zulip and that Zulip is open source (unlike Discord), which seemed complementary. That said, I think there's a practical argument for choosing a more prevalent, likely easier platform (Discord) over a more difficult but open source one (Zulip).

In addition, since much of my work is fairly interdisciplinary, I often find myself with a Discord tab (or the app, if you have it) open while browsing the Discord channels of other Rust-focused open source communities such as the Bevy game engine, the dfdx crate, the AeroRust community, etc., even outside of more personal ones that people might have. There's a bit of a natural flow where, if I log in to check or ask something in one community, I'll often browse through the other servers I've joined as well. This means that, even in passing, I suspect a Rust-ML Discord server would get much more traffic and engagement than the Zulip chat does, which in turn could drive more interest/contributors into projects.

I guess my proposal is this: create a Rust-ML Discord server, make an announcement about it and change the community docs to mention both it and the Zulip chat, then do an evaluation period of a few months to see if there's a difference in traction between the two. If there's a significant change, we then consider deprecating the Zulip chat. I haven't had much experience administering a Discord server, but I'm happy to take on that role, and I have a couple of people I can reach out to for advice on how to make that easier. Thoughts?

Prior approaches

Existing Rust ML Solutions

Leaf & Collenchyma

This framework focuses narrowly on providing just the most basic layers and operations. It is very old and no longer maintained.

Tensors

Leaf is an ML framework that uses its own custom backend-agnostic tensor library called Collenchyma. The tensor type is SharedTensor. This tensor type is not parameterized by any backend, but a backend must be passed when creating the tensor. This means that the backend associated with the tensor can be ignored, which may simplify type bounds on functions that operate on a tensor. A tensor can even live on multiple backends at the same time, hence the "Shared". The only things you can do with the tensor itself are to reshape it or extract its memory.
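
For a sense of the API, here is a hypothetical usage sketch paraphrasing Collenchyma's README; the exact method names and shape arguments may differ:

// Hypothetical usage sketch; method names may not match Collenchyma exactly.
let backend = Backend::<Native>::default().unwrap();
// The device is passed at creation time, but the tensor type itself is
// just SharedTensor<f32>, not parameterized by the backend.
let mut x = SharedTensor::<f32>::new(backend.device(), &(2, 2)).unwrap();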

Backends

Collenchyma supports multiple backends: Native, OpenCL, and CUDA. It does this in most places using enums:
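
A minimal sketch of that enum-based dispatch, with illustrative names rather than Collenchyma's exact definitions:

// Illustrative sketch of enum-based backend dispatch (hypothetical names,
// not Collenchyma's actual types).
struct NativeDevice;
struct OpenClDevice;
struct CudaDevice;

enum DeviceType {
    Native(NativeDevice),
    OpenCl(OpenClDevice),
    Cuda(CudaDevice),
}

fn sigmoid(device: &DeviceType, x: &[f32], out: &mut [f32]) {
    // Every operation matches on the device enum at runtime, so all
    // backends get compiled in even if only one is ever used.
    match device {
        DeviceType::Native(_) => {
            for (o, &v) in out.iter_mut().zip(x) {
                *o = 1.0 / (1.0 + (-v).exp());
            }
        }
        DeviceType::OpenCl(_) => unimplemented!("dispatch to an OpenCL kernel"),
        DeviceType::Cuda(_) => unimplemented!("dispatch to a CUDA kernel"),
    }
}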

All types of contexts can be used simultaneously in the same binary. However, since Collenchyma builds all the backends into itself, depending on Collenchyma pulls Native, OpenCL, and CUDA backend code into your dependency tree. For this reason, Collenchyma's approach is probably not ideal. It does have the advantage that tensors can be freely shared among backends, with the synchronization handled by Collenchyma.

The actual operations exist on the backend. For instance, the CUDA backend can execute the sigmoid operation:

backend.sigmoid(&mut x, &mut result).unwrap();

Rusty Machine

This framework is old and not maintained.

Tensors

This framework uses rulinalg for its tensors. That crate has fallen out of favor relative to nalgebra. It doesn't even support 3d tensors, so it is not worth considering in a modern application. It also has no support for GPUs.

Learning

You can find the docs for learning here: https://athemathmo.github.io/rusty-machine/doc/rusty_machine/learning/index.html

It is clear that there is not an emphasis on traditional neural networks.

mli

This framework was written more recently (by me!). It has enough built-in tools to create basic convolutional neural networks, with some examples. Only native backends are currently supported, and it is not actively maintained.

Tensors

This framework has no tensor type. Instead, it only supplies abstractions to chain ops together. These ops then typically depend on a tensor type. For instance, the sigmoid op's forward looks like this:

impl Forward for Logistic {
    type Input = f32;
    // No intermediate state needs to be saved for backprop.
    type Internal = ();
    type Output = f32;

    fn forward(&self, &input: &f32) -> ((), f32) {
        ((), logistic(input))
    }
}

To run this on a whole tensor, you use mli-ndarray:

pub struct Map3One<G>(pub G);

impl<G> Forward for Map3One<G>
where
    G: Forward,
{
    type Input = Array3<G::Input>;
    type Internal = Array3<G::Internal>;
    type Output = Array3<G::Output>;

    fn forward(&self, input: &Self::Input) -> (Self::Internal, Self::Output) {
        let both_vec: Vec<(G::Internal, G::Output)> =
            input.iter().map(|input| self.0.forward(input)).collect();
        let (internal_vec, output_vec) = both_vec.into_iter().fold(
            (vec![], vec![]),
            |(mut internal_vec, mut output_vec), (internal, output)| {
                internal_vec.push(internal);
                output_vec.push(output);
                (internal_vec, output_vec)
            },
        );
        let internal_array = Array::from_shape_vec(input.raw_dim(), internal_vec).unwrap();
        let output_array = Array::from_shape_vec(input.raw_dim(), output_vec).unwrap();
        (internal_array, output_array)
    }
}

This struct wraps an op that operates on one item and lets it span a whole array. There is a similar impl for the backpropagation:

impl<G> Backward for Map3One<G>
where
    G: Backward,
    G::TrainDelta: Clone + Add + Zero,
{
    type OutputDelta = Array3<G::OutputDelta>;
    type InputDelta = Array3<G::InputDelta>;
    type TrainDelta = G::TrainDelta;

    fn backward(
        &self,
        input: &Self::Input,
        internal: &Self::Internal,
        output_delta: &Self::OutputDelta,
    ) -> (Self::InputDelta, Self::TrainDelta) {
        let both_vec: Vec<(G::InputDelta, G::TrainDelta)> =
            izip!(input.iter(), internal.iter(), output_delta.iter(),)
                .map(|(input, internal, output_delta)| {
                    self.0.backward(input, internal, output_delta)
                })
                .collect();
        let (input_delta_vec, train_delta_vec) = both_vec.into_iter().fold(
            (vec![], vec![]),
            |(mut input_delta_vec, mut train_delta_vec), (input_delta, train_delta)| {
                input_delta_vec.push(input_delta);
                train_delta_vec.push(train_delta);
                (input_delta_vec, train_delta_vec)
            },
        );
        let input_delta_array = Array::from_shape_vec(input.raw_dim(), input_delta_vec).unwrap();
        let train_delta_array = Array::from_shape_vec(input.raw_dim(), train_delta_vec).unwrap();
        (input_delta_array, train_delta_array.sum())
    }
}

As you can see, the input and output types are specific to ndarray. This means that you can write ops that are as specific as they need to be.

Backends

In mli, the backend is chosen by the code itself. For instance, once you have used Map3One from mli-ndarray, that piece of the graph only runs with ndarray. This also means that the graph is "static": it cannot be stored to disk and loaded. This is not a problem in and of itself. Some pros and cons (a minimal sketch follows the list):

Pros

  • It is easy to write code that is backend agnostic and integrate it with backend-specific code
  • The compiler can inline and optimize native code so CPU implementations can run faster

Cons

  • Since no backend handle is passed around, any backend context (such as for a GPU) must be stored globally or in the tensor type
  • You cannot have a black box model that can be swapped by changing a file
    • You can swap trained weights, but not the actual graph itself
  • We cannot perform optimizations on the graph (like fusing or rearranging ops) since it is compiled as code and not stored in memory
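
To make the "static graph" point concrete, here is a minimal usage sketch with the Map3One and Logistic types shown above (assuming Logistic is a unit struct): the graph is the concrete Rust type Map3One<Logistic>, fixed at compile time.

use ndarray::Array3;

fn run() {
    // The "graph" is the type Map3One<Logistic>; it cannot be serialized
    // and reloaded, only rebuilt by recompiling the code.
    let net = Map3One(Logistic);
    let input: Array3<f32> = Array3::zeros((2, 2, 2));
    // internal is an Array3<()>, output an Array3<f32> of sigmoid values.
    let (_internal, output) = net.forward(&input);
    assert_eq!(output.dim(), (2, 2, 2));
}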

deep

deep didn't even get past its first PR adding the graph (until I merged it just now). Here is what it does:

Tensors

Tensors in deep are specified by the backend. This means that if you are using em (Emu), you can use DeviceBox<[f32]> as your Tensor type. This is non-ideal, since we would like to have tensors containing types other than f32. Unfortunately, this cannot be done without Generic Associated Types (GATs). A small example from the RFC:

trait PointerFamily {
    type Pointer<T>;
    fn new<T>(value: T) -> Self::Pointer<T>;
}

impl PointerFamily for RcFamily {
    type Pointer<T> = Rc<T>;
    fn new<T>(value: T) -> Self::Pointer<T> {
        Rc::new(value)
    }
}

As you can see, the line type Pointer<T> = Rc<T>; creates an associated type with a type parameter. This type parameter then parameterizes Rc. We need this functionality to allow deep to achieve the same thing with tensors:

type Tensor<T> = DeviceBox<[T]>;

As you can see, now we can have backend tensors with arbitrary types. This gets even better:

type Tensor<T, const S> = Array<T, S>;

This is what it would look like once GATs and const generics are merged. This would allow us to pass a shape to the underlying tensor. On native systems, this could mean huge performance gains since algorithms can be tuned at compile-time to work with particular shapes and filter sizes. Unfortunately, this is a far-off thing. A better solution for now might be to use something like em to get a specific tensor type we can parameterize.
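
For illustration, here is a hypothetical sketch of what such a GAT-based backend trait could look like; this is not deep's actual API, and all names here are made up:

// Hypothetical GAT-based backend trait; not deep's actual API.
trait Backend {
    // Each backend picks its own tensor representation for any element type.
    type Tensor<T>;
    fn zeros<T: Default + Clone>(&self, len: usize) -> Self::Tensor<T>;
}

struct NativeBackend;

impl Backend for NativeBackend {
    type Tensor<T> = Vec<T>;
    fn zeros<T: Default + Clone>(&self, len: usize) -> Self::Tensor<T> {
        vec![T::default(); len]
    }
}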
