Comments (12)
Because we use sqrt(count) in 8ac5d5b#diff-66ddf16d3b863ea428c3d0d49c515150cb2c4cd81dc1abe88123b4c08824d550R847, the impact of reviews with interval = 24 decreases.
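As a rough illustration of why square-root weighting shrinks the influence of a heavily populated bin, here is a minimal sketch. The counts below are hypothetical (not taken from any real collection), and it assumes the interval = 24 bin holds most of the reviews; `weight_share` is an illustrative helper, not a function from the optimizer.

```python
import math

# Hypothetical review counts per first-review interval (in days); not real data.
counts = {1: 40, 3: 25, 24: 400}


def weight_share(counts, transform):
    """Return each interval's share of the total weight under `transform`."""
    weights = {interval: transform(c) for interval, c in counts.items()}
    total = sum(weights.values())
    return {interval: w / total for interval, w in weights.items()}


# Weighting by raw count versus weighting by sqrt(count).
linear_share = weight_share(counts, lambda c: c)
sqrt_share = weight_share(counts, math.sqrt)

print(f"share of interval 24 with count weights: {linear_share[24]:.3f}")
print(f"share of interval 24 with sqrt weights:  {sqrt_share[24]:.3f}")
```

Under these assumed counts, the largest bin still carries the most weight, but its share of the total drops once the square root is applied.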
from fsrs-optimizer.
But the orange line doesn't match any of the data points.
I guess that 8ac5d5b#diff-66ddf16d3b863ea428c3d0d49c515150cb2c4cd81dc1abe88123b4c08824d550L851 is the cause. You should probably add sum(weight) in place of the removed total_count.
This is not the cause because total_count has been removed from the denominator of both logloss and L1.
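One possible reading of this correction is that dropping the same constant from the denominator of both terms only rescales the whole objective, which leaves its minimizer where it was. The sketch below checks that on a hypothetical one-parameter objective; the functions, the target 5.0, the default 2.0, and the constant 300 are all made up for illustration and are not the optimizer's actual loss.

```python
def objective(w, scale):
    """Hypothetical 1-D objective: a data term plus an L1-style penalty."""
    data_term = (w - 5.0) ** 2   # stand-in for the fitting loss
    penalty = abs(w - 2.0)       # stand-in for the pull toward a default value
    return scale * (data_term + penalty)


def argmin(scale):
    """Brute-force minimizer over a grid from 0.00 to 10.00."""
    grid = [i / 100 for i in range(0, 1001)]
    return min(grid, key=lambda w: objective(w, scale))


# Dividing both terms by the same constant (here 300) or not dividing at all
# gives the same minimizer.
best_unscaled = argmin(1.0)
best_scaled = argmin(1 / 300)
print(best_unscaled, best_scaled)
```

The balance between the two terms would change only if the constant were removed from one term and not the other.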
Unrelated, but in 8ac5d5b#diff-66ddf16d3b863ea428c3d0d49c515150cb2c4cd81dc1abe88123b4c08824d550R831, shouldn't the range be (1, 4) rather than (1, 5)?
> shouldn't the range be (1, 4) rather than (1, 5)?

In Python, range(1, 4) doesn't include 4.
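A quick check of the half-open behavior, as a minimal sketch (the variable names are illustrative only):

```python
# range(start, stop) includes start and excludes stop.
up_to_three = list(range(1, 4))
up_to_four = list(range(1, 5))

print(up_to_three)
print(up_to_four)

# Covering the four ratings 1 through 4 therefore needs stop = 5.
assert 4 not in range(1, 4)
assert 4 in range(1, 5)
```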
> But the orange line doesn't match any of the data points.

I think 129 data points are not enough to allow the parameter to deviate from the default value too much.
> I think 129 data points are not enough to allow the parameter to deviate from the default value too much.

But I think that w[3] for my collection should still be somewhat higher. Even if it doesn't reach 60, it should still be at least 40.
I prefer to be conservative here. Also, a short interval allows FSRS to collect data more quickly.
In that case, I will have to manually increase the value of w[3] because such a low value is not acceptable to me (especially when it is equal to w[2]).
By the way, after manually increasing the value of w[3] to 60 and clicking Evaluate, log loss decreased from 0.2139 to 0.2137.
@Expertium, what do you think?
Well, according to my testing, the new implementation is more accurate. That's all I can say.
The default w[3] also decreased in the recent update, which would contribute to this issue. I will update the default weights this week.
I think that the real issue is 8ac5d5b#diff-66ddf16d3b863ea428c3d0d49c515150cb2c4cd81dc1abe88123b4c08824d550R656.
I used the Python optimizer, version 4.12.0, to obtain a more complete S0 dataset from my deck file. A screenshot of the relevant portion is here:
![image](https://private-user-images.githubusercontent.com/92206575/292752623-7d4f7b58-5775-4ab7-9a46-cc16ec9adbf5.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MDkwMTgyMTksIm5iZiI6MTcwOTAxNzkxOSwicGF0aCI6Ii85MjIwNjU3NS8yOTI3NTI2MjMtN2Q0ZjdiNTgtNTc3NS00YWI3LTlhNDYtY2MxNmVjOWFkYmY1LnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDAyMjclMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwMjI3VDA3MTE1OVomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTZkNmIzYWU0ODdhYWE1ZjFmZjUzYjRlM2UyM2U4ZGI2NTA5NzRlMjBlYjg5YTU1NjJiNjNmOTgxNTZhMmQ0MTAmWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.4AdJwrPyrjrexcLzzSEewt9sZrmlNPl9eoY2Ma0WdNM)
As you can see, I have 294 datapoints with initial_rating = 4, not just 129. The above change wiped out more than half of the available data.
Well, at one point, I myself suggested filtering out such datapoints (open-spaced-repetition/fsrs4anki#282 (comment)). But now I am very sure that filtering this data is not a good idea.
I decided to remove that line myself and see what happens. (user1823@803d4b2)
Before | After |
---|---|
w = 1.1968, 3.7785, 21.1966, 21.2803, 4.4494, 1.733, 2.1947, 0.0, 1.8094, 0.1566, 1.1751, 1.4114, 0.1838, 0.7023, 0.0132, 0.0, 4.0 | w = 1.1968, 3.2318, 21.1971, 35.1279, 4.4911, 1.7286, 2.1952, 0.0, 1.8014, 0.1571, 1.1847, 1.4137, 0.1793, 0.7051, 0.0154, 0.0, 4.0 |
Loss after training: 0.2127 | Loss after training: 0.2120 |
RMSE: 0.0111 | RMSE: 0.0118 |