Comments (15)
Most parameters need "hard" bounds, because, for example, negative values or values outside of [0, 1] wouldn't make sense in the formulas. In fact, I don't think there are any parameters in FSRS that can span from -∞ to +∞.
Our current optimizer supports L2 regularization (it's called weight_decay in the documentation). I can benchmark it to see if it helps to decrease RMSE.
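For reference, a minimal sketch of what enabling this could look like, assuming the optimizer is a standard torch.optim optimizer; the tensor values and variable names below are illustrative, not the real FSRS ones:

```python
import torch

# Illustrative stand-in for the trainable FSRS parameter tensor.
w = torch.nn.Parameter(torch.tensor([0.4, 0.6, 2.4, 5.8]))

# weight_decay applies an L2 penalty on w at every update step,
# shrinking all parameters towards 0.
optimizer = torch.optim.Adam([w], lr=4e-2, weight_decay=1e-4)
```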
Aside from using default parameters as a starting point for optimization and choosing reasonable ranges for parameters, there is no way (that I can think of) to utilize the parameters of user A (or multiple users) to train FSRS on user B's data.
The default parameters are chosen by running FSRS on all collections, recording the optimal values, and taking the median. Btw, if you are curious about the distributions of parameters, check this out: https://github.com/open-spaced-repetition/fsrs-benchmark/tree/main/plots
Simply grouping all reviews into a single dataset wouldn't be meaningful.
Also, I don't know what you mean by "apply regularization to the parameters relative to the individual mean"; please explain it in detail.
For each parameter x_i, where i is the user, apply the regularization loss l_2 * (x_i - x̄)**2.
x̄ can either be the median, like it is now, or it can be a free parameter.
Using the median only as a default or to determine the range is less effective than regularization, and you can also use a validation set to optimize the l_2 coefficient, etc.
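A minimal sketch of this idea in PyTorch (values and names are illustrative; x̄ is the per-parameter median across users, or a trainable tensor if it is made a free parameter):

```python
import torch

# Per-user parameters being optimized (illustrative values).
x_i = torch.nn.Parameter(torch.tensor([0.4, 0.6, 2.4, 5.8]))
# Anchor x̄: per-parameter median across users; make it a
# torch.nn.Parameter instead to treat it as a free parameter.
x_bar = torch.tensor([0.4, 0.9, 2.3, 10.9])

l2 = 1e-3  # regularization coefficient, tuned on a validation set

def loss_with_regularization(prediction_loss: torch.Tensor) -> torch.Tensor:
    # Penalize deviation from the anchor instead of deviation from 0.
    return prediction_loss + l2 * torch.sum((x_i - x_bar) ** 2)
```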
Sure, having a hard bound could still be useful with regularization.
You'll have to offset each parameter by its default value; otherwise weight_decay will regularize all parameters towards 0, which isn't good. Thanks for having a look.
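One way to get that offset while still using the optimizer's built-in weight_decay is to train a delta from the defaults instead of the raw parameters; a sketch under that assumption (the default values below are placeholders):

```python
import torch

defaults = torch.tensor([0.4, 0.9, 2.3, 10.9])       # placeholder default values
delta = torch.nn.Parameter(torch.zeros_like(defaults))

# weight_decay now shrinks delta towards 0, i.e. it pulls the effective
# parameters towards the defaults rather than towards 0.
optimizer = torch.optim.Adam([delta], lr=4e-2, weight_decay=1e-4)

def effective_parameters() -> torch.Tensor:
    return defaults + delta
```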
Grouping all users into one single dataset for training would exceed my device's RAM.
It would also be biased towards users who have more reviews.
Sometimes I even want to use the mode as the default values instead of the median.
Sometimes I even want to use the mode as the default values instead of the median.
Well, it's time to go deep down the rabbit hole of estimating the mode of a continuous variable.
I know 3 ways of doing that: half-range mode, half-sample mode, and kernel density estimation. The first one is based on a simple principle: take (x_max - x_min)/2, use it as a sliding window and slide across the sample until you find the densest range. Repeat this process within that range. The second one is similar: divide the sample into two groups with an equal number of elements, and find the group with the smallest value of x_max - x_min. The last one is based on creating an empirical probability density function. So which one is better? No idea.
Here's the code, have fun:
Modes.zip
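For readers who don't want to open the archive, here is a minimal, stand-alone sketch of the half-sample mode idea described above (this is not the code from Modes.zip, just an illustration):

```python
import numpy as np

def half_sample_mode(sample):
    """Estimate the mode of a continuous sample by repeatedly keeping the
    densest contiguous half of the sorted data, then averaging what's left."""
    x = np.sort(np.asarray(sample, dtype=float))
    while len(x) > 2:
        half = (len(x) + 1) // 2                         # size of each candidate half
        widths = x[half - 1:] - x[: len(x) - half + 1]   # range of every contiguous half
        start = int(np.argmin(widths))                   # densest half wins
        x = x[start:start + half]
    return float(x.mean())
```

For example, half_sample_mode([0.1, 0.93, 0.95, 0.97, 3.0]) returns 0.94, landing inside the dense cluster rather than near the sample mean of 1.19.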
At least for the first interval after a Good rating, the mode is smaller than the median.
@L-M-Sherlock I think we should use the median in cases where the mode is the min (or max) allowed value, like for w_0:
Here, using the mode would just make w_0 = 0.1, the lower bound.
But when the mode is not the min/max value, I think it makes sense to use the mode. For example, here:
Do you have all of the parameters of all users saved? If so, can you give them to me via a Google Drive link (.json files from the result folder are fine too)? I'll calculate the new default parameters using the median in some cases and the mode in other cases.
So here's my idea: we will do a dry run of 3 sets of default parameters:
- Median parameters (already done)
- Mode parameters (I'll calculate them myself)
- Hybrid set where some values are modes and other values are medians
And then we'll see which set results in the lowest RMSE during the dry run.
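A sketch of how the three sets could be computed from the per-collection result files; the file layout and the "weights" key are assumptions about the .json format, and half_sample_mode refers to the estimator sketched earlier in this thread:

```python
import json
from pathlib import Path

import numpy as np

from modes import half_sample_mode  # the HSM sketch above, saved as modes.py (hypothetical)

# Assumed layout: one JSON file per collection, with the optimal parameters
# stored under a "weights" key; the real files under result/FSRSv4 may differ.
rows = [json.loads(p.read_text())["weights"] for p in Path("result/FSRSv4").glob("*.json")]
params = np.asarray(rows)  # shape: (n_collections, n_parameters)

median_set = np.median(params, axis=0)
mode_set = np.array([half_sample_mode(col) for col in params.T])

# Hybrid set: keep the mode only where it is not stuck on a clamping bound,
# otherwise fall back to the median. The observed min/max are used here as a
# stand-in for the real bounds.
lower, upper = params.min(axis=0), params.max(axis=0)
on_bound = np.isclose(mode_set, lower) | np.isclose(mode_set, upper)
hybrid_set = np.where(on_bound, median_set, mode_set)
```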
I saved all parameters in the .json files. You can refer to this code: https://github.com/open-spaced-repetition/fsrs-benchmark/blob/main/analysis.py
And then we'll see which set results in the lowest RMSE during the dry run.
I think it's not a problem of RMSE; it's a problem for new users. Using the median for w[0], w[1], w[2] and w[3] means the default first intervals are too long for half of the learners (and too short for the other half). But users complain that the first intervals are too long far more often than that they are too short.
I saved all parameters in the .json files. You can refer to this code
Yes, but I need the files themselves, and I don't want to run the benchmark myself if you have all the parameters saved.
EDIT: nevermind, I can just download them from here: https://github.com/open-spaced-repetition/fsrs-benchmark/tree/main/result/FSRSv4
Seems like mode estimation will have to wait: open-spaced-repetition/fsrs4anki#461 (comment)
So I did some preliminary testing on a smaller dataset, and it seems like I was right: the mode isn't useful in cases where it doesn't arise naturally but instead arises as an artifact of clamping. Here's an example:
Interestingly, in this case the mode is in the middle.
Btw, in order to calculate the mode, I use all three estimators (HRM, HSM, KDE) and then take the average of the two closest estimates.
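That combination rule is small enough to show directly; a sketch, with the three estimates passed in as plain numbers:

```python
from itertools import combinations

def combined_mode(hrm: float, hsm: float, kde: float) -> float:
    """Average the two estimates that agree most closely, ignoring the outlier."""
    a, b = min(combinations((hrm, hsm, kde), 2), key=lambda pair: abs(pair[0] - pair[1]))
    return (a + b) / 2
```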
I'll make a new issue about modes and all of this because it's technically unrelated to the current issue.