
carton's People

Contributors

lei-rs, vivekpanyam


carton's Issues

C# bindings

Looks like a very interesting project especially if it has bindings for most of the popular programming languages.

I saw on the site that you were planning to have bindings for C#. I have never created such a project to bind Rust to C#, but I would love to give it a shot.

Cannot run the quickstart or any of the examples on https://carton.pub/

I cannot seem to run any of the examples shown on https://carton.pub/.

For example I tried bert-base-uncased:

import cartonml as carton

# A permalink to the model
MODEL_URL = "https://carton.pub/google-research/bert-base-uncased/5f26d87c5d82b7c37ebf92fcb38788a063d49a64cfcf1f9d118b3b710bb88005"

async def main():
    # Load the model
    model = await carton.load(MODEL_URL)

    # Set up inputs
    inputs = {
        "input": await model.info.examples[0].inputs["input"].get(),
    }

    # Run the model
    results = await model.infer(inputs)

    # Print the results
    print(results)

    # Print the expected results
    print({
        "tokens": await model.info.examples[0].sample_out["tokens"].get(),
        "scores": await model.info.examples[0].sample_out["scores"].get(),
    })

import asyncio
asyncio.run(main())

And the result I get is:

(carton-tests) ➜  carton-tests python main.py
Request: 0 67584
thread 'tokio-runtime-worker' panicked at 'called `Result::unwrap()` on an `Err` value: Custom("invalid value: integer `1281`, expected variant index 0 <= i < 6")', /app/source/carton-runner-interface/src/do_not_modify/framed.rs:41:59
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

I also tried distilbert-base-cased-distilled-squad and that gave the same response.

Maybe I am doing something wrong or I might be missing some dependencies, but I haven't been able to figure it out yet.

Implement SHM serialization for Tensors

Maybe also make Handle serialize differently based on the size of the tensor: large tensors go through shared memory, while small ones are serialized inline. Make Handle an enum so we can choose dynamically.
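A rough sketch of that idea, with hypothetical names and a made-up 1 MiB threshold (the real implementation would serialize through the runner interface, not these stand-ins):

```rust
/// Hypothetical size-dependent Handle: small tensors carry their bytes
/// inline; large ones reference a shared-memory region by id.
#[derive(Debug)]
enum Handle {
    /// Tensor data serialized directly into the message
    Inline(Vec<u8>),
    /// Reference to a shared-memory region holding the tensor data
    Shm { region_id: u64, len: usize },
}

/// Assumed cutoff: anything at or above 1 MiB goes to shared memory.
const SHM_THRESHOLD: usize = 1024 * 1024;

impl Handle {
    /// `alloc_shm` stands in for copying the bytes into a shared-memory
    /// region and returning its id.
    fn from_tensor_bytes(data: Vec<u8>, alloc_shm: impl Fn(&[u8]) -> u64) -> Handle {
        if data.len() >= SHM_THRESHOLD {
            let len = data.len();
            Handle::Shm { region_id: alloc_shm(&data), len }
        } else {
            Handle::Inline(data)
        }
    }
}

fn main() {
    // Stand-in allocator that just returns a fake region id
    let alloc = |_bytes: &[u8]| 42u64;

    let small = Handle::from_tensor_bytes(vec![0u8; 16], alloc);
    let large = Handle::from_tensor_bytes(vec![0u8; 2 * 1024 * 1024], alloc);
    println!("small is inline: {}", matches!(small, Handle::Inline(_)));
    println!("large is shm: {}", matches!(large, Handle::Shm { .. }));
}
```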

C++ Bindings

I'm actively working on C++ bindings and will update this issue once PRs are ready.

Feel free to subscribe to this issue to be notified of progress.

WASM Interface

While I am working on finishing up #173, I drafted something here that allows us to wrap the user's infer implementation with custom logic. I don't know if this would actually work, though. For the user, it would look like:

use carton_wasm_interface::infer;

#[infer]
fn infer(in_: Vec<(String, Tensor)>) -> Vec<(String, Tensor)> {
    // ...
}

This workaround is needed because the .wit interface must be implemented in the same place the bindings are generated. With it, we can easily implement things like conversions for candle and returning a pointer (and managing its lifetime).

Let me know if you have any thoughts! I'm hoping it makes developing wasm models more ergonomic.

Ludwig support

Adding support for Ludwig seems like it should be fairly straightforward. Ludwig supports export to TorchScript and we already have a TorchScript runner.

Specifically, it would probably make sense to create something similar to the export_neuropod utility in Ludwig.

This would involve:

This can be done entirely in Python, so if you want to contribute but aren't familiar with Rust, this is a good option!

Investigate logging during runner shutdown

In some cases, worker threads panic when we try to send log messages to the main process during runner shutdown.

This doesn't really have a practical impact (as it's contained to the runner process and only happens after communication with the main process stops), but it can make stdout/stderr confusing because it looks like something important broke.

WASM support

How easy would it be to add support for Rust-based libraries like Candle and Burn? I'd like to implement this if you aren't already working on it. I'd also appreciate your thoughts on whether this integration is even necessary or useful, since both packages allow you to compile everything down. Maybe it would make more sense to instead create runners for the formats those libraries can produce, like binaries, WASM modules, and executables.

Crate `carton_window` not found on `crates.io`

[dependencies]
carton = "0.1.0"
$ cargo run
    Updating crates.io index
error: no matching package named `carton_window` found
location searched: registry `crates-io`
required by package `carton v0.1.0`
    ... which satisfies dependency `carton = "^0.1.0"` of package ...

Python bindings logging

It seems like some logs below INFO don't make it to the Python logging system (or aren't appearing on the Python end for some reason) even though others do. Dig into this more.

XLA runner (to support JAX)

Background

XLA is an ML "compiler for GPUs, CPUs, and ML accelerators."

Carton support for XLA would primarily be used to provide JAX support, but in theory it could also support some PyTorch and TensorFlow models.

Here's a guide on how to export a JAX model from Python and run it from C++ using XLA: google/jax#5337 (comment).

This is an example of the above in the JAX codebase.

I've explored doing this in the past (outside of Carton), but there weren't XLA prebuilt binaries available and it required building from source in the TensorFlow repo. Now, with OpenXLA and prebuilt binaries, this is a lot easier.

@LaurentMazare created Rust bindings to XLA that include a straightforward example of loading the HLO IR generated by the JAX export code. That should make it fairly easy to prototype an integration with Carton if anyone is interested in doing so.

Implementation

Concretely, this could be implemented as follows:

Recommended reading:

ONNX support

There are many different ways of running an ONNX model from Rust:

tract

"Tiny, no-nonsense, self-contained, Tensorflow and ONNX inference".

Notes:

wonnx

"A WebGPU-accelerated ONNX inference run-time written 100% in Rust, ready for native and the web"

Notes:

  • This uses wgpu under the hood so it supports a lot of platforms
  • Importantly, this supports WASM and WebGPU.
  • I'm unclear on how strong its CPU inference support is. wgpu supports Vulkan and there are software implementations of it (e.g. SwiftShader), but not sure how plug-and-play it is.
  • Can it run in WASM without WebGPU?

ort

"A Rust wrapper for ONNX Runtime"

Notes:

  • Rust bindings for the official ONNX Runtime
  • Seems to be used in prod
  • Doesn't appear to support WASM yet. The underlying runtime does support it so maybe that's coming soon. There's an issue about it with recent activity.

If we're going to have one "official" ONNX runner, it should probably use ort. Unfortunately, since ort doesn't have WASM support, we need another solution for running from WASM environments.

This could be:

  • One "official" ONNX runner for Carton that uses ort on desktop, tract on WASM without GPU, and wonnx on WASM with GPUs. This seems like a complex solution especially because they don't all support the same set of ONNX operators.
  • Use tract everywhere, but don't have GPU support
  • Use wonnx everywhere, but require GPU/WebGPU
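The first option (one runner that selects a backend per platform) could be structured around compile-time `cfg` selection. This is only a shape sketch: the backend names are stand-ins and none of the actual crates (ort, tract, wonnx) are wired up here.

```rust
/// Hypothetical backend selection for a single "official" ONNX runner.
/// Each arm would dispatch to a real integration; here we only return
/// the name of the backend that would be chosen for the build target.
fn backend_name() -> &'static str {
    // Native desktop/server builds would use ort (ONNX Runtime)
    #[cfg(not(target_arch = "wasm32"))]
    return "ort";

    // WASM builds would fall back to tract (CPU) or wonnx (WebGPU)
    #[cfg(target_arch = "wasm32")]
    return "tract-or-wonnx";
}

fn main() {
    println!("selected ONNX backend: {}", backend_name());
}
```

The operator-coverage concern still applies: even with clean dispatch, a model that packs fine against ort may fail on tract or wonnx if it uses operators they don't implement.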

@kali @pixelspark @decahedron1 If you get a chance, I'd really appreciate any thoughts you have on the above. Thank you!

Investigate using serde in the bindings

Most of the code in language bindings just does type conversion. If you squint a bit, this fits into serde's definition of serialization/deserialization.

You could implement a serde::Serializer that "serializes" things into PyO3/Neon objects and a serde::Deserializer that does the opposite.

People have done this and implemented things like neon-serde and pythonize. This would significantly simplify boilerplate code in the language bindings, make things easier to maintain, and also make it easier to add support for new languages.

[Rust] Quick start example throws a Broken pipe error via WSL

Source

https://carton.run/quickstart

Error

{"tokens": String([["day"]], shape=[1, 1], strides=[1, 1], layout=CFcf (0xf), dynamic ndim=2), "scores": Float([[14.551311]], shape=[1, 1], strides=[1, 1], layout=CFcf (0xf), dynamic ndim=2)}
thread 'tokio-runtime-worker' panicked at 'called `Result::unwrap()` on an `Err` value: Os { code: 32, kind: BrokenPipe, message: "Broken pipe" }', /app/source/carton-runner-interface/src/do_not_modify/framed.rs:71:38
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
thread 'tokio-runtime-worker' panicked at 'called `Result::unwrap()` on an `Err` value: SendError(RPCResponse { id: 0, complete: true, data: LogMessage { record: LogRecord { metadata: LogMetadata { level: Trace, target: "mio::poll" }, args: "deregistering event source from poller", module_path: Some("mio::poll"), file: Some("/root/.cargo/registry/src/index.crates.io-6f17d22bba15001f/mio-0.8.5/src/poll.rs"), line: Some(663) } } })', /app/source/carton-runner-interface/src/server.rs:299:36

Any hint?

C Bindings

The initial implementation of the C bindings is in #169

Feel free to subscribe to this issue/any of the PRs above to be notified of progress.

Improve memory usage when packing a model

The previous zip file library we used during packing required complete files to be available before they could be stored. This required us to load large (possibly multi-GB) files into memory.

This is no longer required. The following two places within the packing code can be refactored to read, compute sha256, and store files in a streaming/incremental fashion:

// Load the data and compute the sha256
let mut hasher = Sha256::new();
let data = tokio::fs::read(entry.path()).await.unwrap();
hasher.update(&data);
let sha256 = format!("{:x}", hasher.finalize());
manifest_contents.insert(relative_path.clone(), Some(sha256));

// Add the entry to the zip file
writer = tokio::task::spawn_blocking(move || {
    writer
        .start_file(
            relative_path,
            zip::write::FileOptions::default()
                .compression_method(zip::CompressionMethod::Zstd),
        )
        .unwrap();
    writer.write_all(&data).unwrap();
    writer
})
.await
.unwrap();

// Load the data and compute the sha256
let mut hasher = Sha256::new();
let data = tokio::fs::read(entry.path()).await.unwrap();
log::trace!("Done reading file {}", &relative_path);

let (data, sha256) = tokio::task::spawn_blocking(move || {
    hasher.update(&data);
    (data, format!("{:x}", hasher.finalize()))
})
.await
.unwrap();
log::trace!("Computed sha256 of {}", &relative_path);

// Only store the file in the zip if (1) we don't have any linked files or (2) the linked files don't include this sha256
if linked_files
    .as_ref()
    .map_or(true, |v| !v.urls.contains_key(&sha256))
{
    // Add the entry to the zip file
    let relative_path = relative_path.clone();
    writer = tokio::task::spawn_blocking(move || {
        writer
            .start_file(
                relative_path,
                zip::write::FileOptions::default()
                    .compression_method(zip::CompressionMethod::Zstd)
                    .large_file(data.len() >= 4 * 1024 * 1024 * 1024),
            )
            .unwrap();
        writer.write_all(&data).unwrap();
        writer
    })
    .await
    .unwrap();
}
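The streaming version would read each file in fixed-size chunks, feeding every chunk to both the hasher and the zip entry, so the whole file never sits in memory. A minimal sketch of that pattern, using only the standard library: `DefaultHasher` and a `Write` sink stand in for `Sha256` (from the sha2 crate) and the zip writer, which are not available here.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;
use std::io::{Read, Write};

/// Read `reader` in 8 KiB chunks, updating the hash and writing each chunk
/// to `writer` as we go. Memory use stays bounded by the chunk size
/// regardless of how large the input file is.
fn hash_and_store(mut reader: impl Read, mut writer: impl Write) -> std::io::Result<u64> {
    let mut hasher = DefaultHasher::new();
    let mut buf = [0u8; 8192];
    loop {
        let n = reader.read(&mut buf)?;
        if n == 0 {
            break;
        }
        hasher.write(&buf[..n]); // incremental hash update
        writer.write_all(&buf[..n])?; // stream into the archive entry
    }
    Ok(hasher.finish())
}

fn main() -> std::io::Result<()> {
    let input: &[u8] = b"example file contents";
    let mut stored = Vec::new();
    let digest = hash_and_store(input, &mut stored)?;
    assert_eq!(&stored[..], input);
    println!("digest: {:x}", digest);
    Ok(())
}
```

In the real packing code, the chunked loop would replace `tokio::fs::read` plus the one-shot `hasher.update(&data)`, with `Sha256::update` called per chunk and `writer.write_all` targeting the open zip entry.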

Remove special casing for `carton.pub` now that Carton is open source

Now that Carton is open source (and the websites are public), we can remove this special case for carton.pub:

// Temporary workaround while the site is not public
let mut parsed = Url::parse(&url).unwrap();
if parsed.host_str() == Some("carton.pub") {
    parsed.set_host(Some("dl.carton.pub")).unwrap();
    parsed.to_string()
} else {

That change lets us remove this test as well:

/// This tests a subdomain of carton.pub to exercise a different code path
/// We can remove this once the special case for carton.pub in `http.rs` is removed
#[tokio::test]
async fn test_other_domain() {
    let _ = env_logger::builder()
        .filter_level(log::LevelFilter::Info)
        .filter_module("carton", log::LevelFilter::Trace)
        .is_test(true)
        .try_init();

    let start = Instant::now();
    let _info =
        super::Carton::get_model_info("https://assets.carton.pub/manifest_sha256/0851b8cbda75c2f587c4c2a832c245575330a65932b9206f6e70391b78032c51")
            .await
            .unwrap();
    println!("Loaded info in {:#?}", start.elapsed());
}

If you're looking for a quick task to get started with the codebase, this is a good option!

Rust quickstart appears incomplete

When trying the quickstart with Rust, I got several compilation errors:

error[E0433]: failed to resolve: use of undeclared crate or module `ndarray`
  --> src/main.rs:14:15
   |
14 |     let arr = ndarray::ArrayD::from_shape_vec(
   |               ^^^^^^^ use of undeclared crate or module `ndarray`

error[E0433]: failed to resolve: use of undeclared type `Tensor`
  --> src/main.rs:22:37
   |
22 |         .infer([("input_sequences", Tensor::<GenericStorage>::String(arr))])
   |                                     ^^^^^^ use of undeclared type `Tensor`
   |
help: consider importing this enum
   |
1  + use carton::types::Tensor;
   |

error[E0412]: cannot find type `GenericStorage` in this scope
  --> src/main.rs:22:46
   |
22 |         .infer([("input_sequences", Tensor::<GenericStorage>::String(arr))])
   |                                              ^^^^^^^^^^^^^^ not found in this scope
   |
help: consider importing this struct
   |
1  + use carton::types::GenericStorage;
   |

error[E0433]: failed to resolve: use of undeclared crate or module `ndarray`
  --> src/main.rs:15:9
   |
15 |         ndarray::IxDyn(&[1]),
   |         ^^^^^^^ use of undeclared crate or module `ndarray`

error[E0752]: `main` function is not allowed to be `async`
 --> src/main.rs:4:1
  |
4 | async fn main() {
  | ^^^^^^^^^^^^^^^ `main` function is not allowed to be `async`

Some errors have detailed explanations: E0412, E0433, E0752.
For more information about an error, try `rustc --explain E0412`.
error: could not compile `carton-test2` (bin "carton-test2") due to 5 previous errors

To resolve, I had to add a couple crates:

cargo add -F macros,rt-multi-thread tokio
cargo add ndarray

I also needed to make a few small code changes:

@@ -1,6 +1,9 @@
 use carton::Carton;
+use carton::types::GenericStorage;
 use carton::types::LoadOpts;
+use carton::types::Tensor;

+#[tokio::main]
 async fn main() {
     // Load the model
     let model = Carton::load(

Add "standard" problem definitions/interfaces and a way to check them

e.g. a standard definition for a model that does translation, one for summarization, one for image infill, etc.

This way, models can be drop in replacements of each other (at least on the inference path; loading may require different options).

Versioning for definitions? Do the standard definitions have to be "special" (i.e. specially handled in the library or the registry website) or can we handle any definitions?

Maybe each task in the public registry has a "standard interface" that models can adopt
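One way to make the definitions checkable without special-casing them in the library: treat a standard interface as plain data (required tensor names mapped to dtypes) and verify a model's declared signature against it. Everything below is hypothetical — the names, dtype strings, and the idea of a flat name-to-dtype signature are illustration, not Carton's actual metadata model.

```rust
use std::collections::HashMap;

/// Check that a model's declared signature satisfies a standard task
/// definition. The model may expose extra tensors beyond the standard,
/// but every required tensor must be present with the expected dtype.
fn conforms(spec: &HashMap<String, String>, model_sig: &HashMap<String, String>) -> bool {
    spec.iter().all(|(name, dtype)| model_sig.get(name) == Some(dtype))
}

fn main() {
    // A made-up "translation" input spec
    let spec: HashMap<String, String> = [("input_sequences", "string")]
        .into_iter()
        .map(|(k, v)| (k.to_string(), v.to_string()))
        .collect();

    // A model that adds an optional extra input on top of the standard
    let model: HashMap<String, String> =
        [("input_sequences", "string"), ("beam_width", "uint32")]
            .into_iter()
            .map(|(k, v)| (k.to_string(), v.to_string()))
            .collect();

    println!("conforms: {}", conforms(&spec, &model));
}
```

Because the spec is just data, any definition (official or user-defined) could be checked the same way, and versioning becomes attaching a version field to the spec rather than changing library code.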

Investigate CI build timings

When building on a 2 vCPU instance (both arm and x86) using buildkite and agents on AWS, build and test takes ~55 min even with sccache.

This is much slower than GH Actions builds were. Explicitly caching the target dir and some of `.cargo` might make it a lot faster, but it would be simpler if sccache just worked.

As a workaround, linux CI now runs on 32 vCPU instances. This isn't ideal, but it's good enough for now.

A good first step to improve this might be to build with `--timings` in CI and explore from there.
