Comments (12)

L-M-Sherlock commented on September 23, 2024

I implemented a CSV data converter in this PR: open-spaced-repetition/fsrs-rs#182

The training loop works well, but the progress bar is a little different from the one in fsrs-browser.

$ cargo test --release -- training::tests::training --nocapture
   Compiling fsrs v0.6.1 (/Users/jarrettye/Codes/open-spaced-repetition/fsrs-rs)
    Finished release [optimized] target(s) in 5.05s
     Running unittests src/lib.rs (target/release/deps/fsrs-b2e90131299db624)

running 1 test
[src/convertor_tests.rs:221:5] revlogs.len() = 303051
[src/convertor_tests.rs:223:5] fsrs_items.len() = 265921
progress: 138240/1143615
progress: 233331/1143615
progress: 337267/1143615
progress: 452979/1143615
progress: 525030/1143615
progress: 610022/1143615
progress: 686169/1143615
progress: 784985/1143615
progress: 871001/1143615
progress: 928204/1143615
progress: 1022412/1143615
progress: 1106380/1143615
[src/training.rs:485:9] &parameters = [
    1.6358638,
    9.219646,
    100.0,
    100.0,
    5.1443,
    1.2006,
    0.8627,
    0.0362,
    1.629,
    0.1342,
    1.0166,
    2.1174,
    0.0839,
    0.3204,
    1.4676,
    0.219,
    2.8237,
]
test training::tests::training ... ok

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 40 filtered out; finished in 13.02s

   Doc-tests fsrs

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

from fsrs-browser.

AlexErrant commented on September 23, 2024

Setting Cargo.toml to

[profile.release]
codegen-units = 1
lto = "fat"
debug = true

[package.metadata.wasm-pack.profile.release]
wasm-opt = false

and running the file yields

image

And indeed, find_owned_graph is recursive. I'm guessing that fsrs-rs can train on that file because the stack on "real" machines is far larger than WASM's. (Your test also passes on my machine.)

I'll open an issue with Burn.
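
To make the stack-size argument concrete, here is a minimal sketch of why a recursive graph walk depends on the available stack while an iterative one does not. It is not Burn's find_owned_graph; the Node type, the helper names, and the 64 MiB figure are all hypothetical and chosen only for illustration.

```rust
use std::thread;

// Hypothetical stand-in for a computation graph: each node has at most one parent.
struct Node {
    parent: Option<usize>,
}

// Builds a chain 0 <- 1 <- 2 <- ... <- len-1.
fn build_chain(len: usize) -> Vec<Node> {
    (0..len)
        .map(|i| Node { parent: if i == 0 { None } else { Some(i - 1) } })
        .collect()
}

// Recursive walk: may use one stack frame per ancestor, so the maximum
// depth it survives depends on how much stack the platform provides.
fn depth_recursive(nodes: &[Node], id: usize) -> usize {
    match nodes[id].parent {
        Some(p) => 1 + depth_recursive(nodes, p),
        None => 0,
    }
}

// Iterative walk: constant stack usage regardless of chain length.
fn depth_iterative(nodes: &[Node], mut id: usize) -> usize {
    let mut depth = 0;
    while let Some(p) = nodes[id].parent {
        depth += 1;
        id = p;
    }
    depth
}

// Runs the recursive walk on a thread with an explicitly chosen stack size.
// `len` must be greater than zero.
fn depth_with_stack(len: usize, stack_bytes: usize) -> usize {
    thread::Builder::new()
        .stack_size(stack_bytes)
        .spawn(move || {
            let nodes = build_chain(len);
            depth_recursive(&nodes, len - 1)
        })
        .expect("failed to spawn thread")
        .join()
        .expect("worker thread panicked")
}

fn main() {
    let nodes = build_chain(1_000);
    let last = nodes.len() - 1;
    assert_eq!(depth_recursive(&nodes, last), depth_iterative(&nodes, last));

    // A much deeper chain, walked recursively on a generous 64 MiB stack.
    println!("deep chain depth = {}", depth_with_stack(100_000, 64 * 1024 * 1024));
}
```

The recursive version only stays safe as long as the graph depth fits in the stack, which would be consistent with longer review sequences (deeper autodiff graphs) failing in the browser while the same data trains natively.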

AlexErrant commented on September 23, 2024

FYI the above PR can train with martin's file

image

L-M-Sherlock commented on September 23, 2024

Quoting AlexErrant: "Training on that file works if you're on the commit before the burn 0.13.1 upgrade, indicating that something went wrong with the burn upgrade. Do you know if training with that file/data works on the latest fsrs-rs?"

It works in 0.6.1. So I guess it's an issue related to the progress bar.

image

L-M-Sherlock commented on September 23, 2024

I think I've located the problem. It's related to loss.backward() rather than DynBatcher:

    for epoch in 1..=config.num_epochs {
        let mut iterator = dataloader_train.iter();
        let mut iteration = 0;
        while let Some(item) = iterator.next() {
            iteration += 1;
            info!("epoch: {:?} iteration: {:?}", epoch, iteration);
            let lr = LrScheduler::<B>::step(&mut lr_scheduler);
            info!("lr: {:?}", lr);
            let progress = iterator.progress();
            info!("progress: {:?}", progress.items_processed);
            let loss = model.forward_classification(
                item.t_historys,
                item.r_historys,
                item.delta_ts,
                item.labels,
                Reduction::Mean,
            );
            info!("loss");
            let mut gradients = loss.backward();
            info!("backward");
            if model.config.freeze_stability {
                gradients = model.freeze_initial_stability(gradients);
            }
            let grads = GradientsParams::from_grads(gradients, &model);
            model = optim.step(lr, model, grads);
            info!("step");
            model.w = Param::from_tensor(weight_clipper(model.w.val()));
            renderer.render_train(TrainingProgress {
                progress,
                epoch,
                epoch_total: config.num_epochs,
                iteration,
            });
            info!("render");
            // Excerpt ends here; the rest of the loop body is omitted.
        }
    }

image

AlexErrant commented on September 23, 2024

Hm, two thoughts:

  1. Training on that file works if you're on the commit before the burn 0.13.1 upgrade, indicating that something went wrong with the burn upgrade. Do you know if training with that file/data works on the latest fsrs-rs?

  2. Console logging indicates that we aren't escaping this loop:

image

fsrs-browser does make some changes to training.rs, but they're all related to progress. Notably, training with that file works with progress, which again leads me to consider a problematic burn 0.13.1 upgrade.

ishiko732 commented on September 23, 2024

ts-fsrs-demo uses npm [email protected], which is able to train properly, but unfortunately, it does not support progress.

The website supports training using CSV format: https://fsrs.parallelveil.com/train

image

image

AlexErrant commented on September 23, 2024

Hmm, let me emphasize:

Training on that file works if you're on the commit EXACTLY BEFORE the burn 0.13.1 upgrade

image

I don't think the progress bar is the problem. I think there's either a problem with fsrs-rs's burn upgrade, or my changes to burn.

@L-M-Sherlock can you confirm if training on that file/data works in vanilla fsrs-rs? (You might need to add a CSV importer to fsrs-rs, not sure if that functionality already exists.) Confirming that the latest fsrs-rs works would mean that there's definitely a problem with my changes to burn.

L-M-Sherlock commented on September 23, 2024

image

You are right. I checked out 1d41fbf and tried to reproduce the error, but training works fine on that commit.

So the problem is related to burn v0.13.1. I will test vanilla fsrs-rs with a large collection to try to reproduce it.

AlexErrant commented on September 23, 2024

Maybe it has something to do with the switch to DynBatcher...? The guilty loop touches BatchShuffledDataLoaderBuilder

https://github.com/open-spaced-repetition/fsrs-rs/blob/a8629a7a6883b0c62f593f8b88ceaa3f3be5b28d/src/training.rs#L345-L349

L-M-Sherlock commented on September 23, 2024

I tested fsrs-browser with several revlog files and found that the error only occurs with large ones.

Here is my file (can reproduce the error): revlog.csv

In my case, the epoch has 76649 items, and the error occurs after 11264 items are processed.

In martin's case, the epoch has 228615 items, and the error occurs after 133632 items are processed.

L-M-Sherlock commented on September 23, 2024

I found that this error only occurs when the sequence length is greater than 40.

Processed 133632 items.
fsrs_browser.js:877 batching 512 items
fsrs_browser.js:877 Shape { dims: [69, 512] }
fsrs_browser.js:877 epoch: 1 iteration: 261
fsrs_browser.js:877 lr: 0.038669088705732234
fsrs_browser.js:877 progress: 133632
fsrs_browser.js:877 loss
Processed 11264 items.
fsrs_browser.js:877 batching 512 items
fsrs_browser.js:877 Shape { dims: [65, 512] }
fsrs_browser.js:877 epoch: 1 iteration: 22
fsrs_browser.js:877 lr: 0.03991513761877139
fsrs_browser.js:877 progress: 11264
fsrs_browser.js:877 loss
Processed 12800 items.
fsrs_browser.js:877 batching 512 items
fsrs_browser.js:877 Shape { dims: [52, 512] }
fsrs_browser.js:877 epoch: 1 iteration: 25
fsrs_browser.js:877 lr: 0.03998594726543939
fsrs_browser.js:877 progress: 12800
fsrs_browser.js:877 loss
Processed 10240 items.
fsrs_browser.js:877 batching 512 items
fsrs_browser.js:877 Shape { dims: [51, 512] }
fsrs_browser.js:877 epoch: 1 iteration: 20
fsrs_browser.js:877 lr: 0.0399993460939846
fsrs_browser.js:877 progress: 10240
fsrs_browser.js:877 loss
Processed 8192 items.
fsrs_browser.js:877 batching 512 items
fsrs_browser.js:877 Shape { dims: [49, 512] }
fsrs_browser.js:877 epoch: 1 iteration: 16
fsrs_browser.js:877 lr: 0.03962740252278827
fsrs_browser.js:877 progress: 8192
fsrs_browser.js:877 loss
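
As a quick cross-check of the logs above, the sketch below records the five failing batches and verifies two things: that the progress count equals iteration × batch size, and that the shortest failing sequence is still above 40. The FailedBatch struct and helper names are hypothetical, and reading the first entry of dims as the sequence length is an assumption based on the "batching 512 items" lines.

```rust
/// One failing batch, transcribed from the console logs (hypothetical type).
struct FailedBatch {
    seq_len: usize,         // first entry of `dims` (assumed to be the sequence length)
    batch_size: usize,      // second entry of `dims`
    iteration: usize,       // "iteration" from the log line
    items_processed: usize, // "progress" from the log line
}

const LOGGED: [FailedBatch; 5] = [
    FailedBatch { seq_len: 69, batch_size: 512, iteration: 261, items_processed: 133_632 },
    FailedBatch { seq_len: 65, batch_size: 512, iteration: 22, items_processed: 11_264 },
    FailedBatch { seq_len: 52, batch_size: 512, iteration: 25, items_processed: 12_800 },
    FailedBatch { seq_len: 51, batch_size: 512, iteration: 20, items_processed: 10_240 },
    FailedBatch { seq_len: 49, batch_size: 512, iteration: 16, items_processed: 8_192 },
];

/// True when the logged progress matches iteration * batch size.
fn is_consistent(b: &FailedBatch) -> bool {
    b.iteration * b.batch_size == b.items_processed
}

/// Shortest sequence length among the failing batches, if any.
fn min_seq_len(batches: &[FailedBatch]) -> Option<usize> {
    batches.iter().map(|b| b.seq_len).min()
}

fn main() {
    for b in &LOGGED {
        assert!(is_consistent(b));
    }
    // prints "shortest failing sequence: Some(49)"
    println!("shortest failing sequence: {:?}", min_seq_len(&LOGGED));
}
```

Every logged failure happens on the first batch whose sequence is this long, right after the "loss" line and before "backward", which matches the observation that loss.backward() is where training stops.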
