Comments (12)
Thanks for this report. I will fix it soon.
from srs-benchmark.
I get the correct result from your code. It seems to be an environment problem.
I found that only review_th differs from my result. That doesn't matter, because the order is correct.
Unfortunately, it does matter for my analysis. I need review_th to be right in order to sort the entire file. The delta_t column was also different. I really need to figure out what is going on in my environment.
The review_th is calculated here:
I recommend checking the pandas documentation for this function.
Documentation: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.rank.html
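For readers unfamiliar with rank, here is a minimal sketch of how the tie-breaking method changes the result. It assumes review_th is some rank of review_time. The sample values and the choice of methods are illustrative only and are not taken from the srs-benchmark script.

```python
import pandas as pd

# Hypothetical review times; two of them tie on purpose.
times = pd.Series([30, 10, 10, 20], name="review_time")

# "dense": ties share a rank and no rank numbers are skipped.
dense = times.rank(method="dense").astype("int64").tolist()

# "first": ties are broken by order of appearance, so every row gets a unique rank.
first = times.rank(method="first").astype("int64").tolist()

# Default ("average"): ties receive the mean of the ranks they span.
average = times.rank().tolist()

print(dense)    # [3, 1, 1, 2]
print(first)    # [4, 1, 2, 3]
print(average)  # [4.0, 1.5, 1.5, 3.0]
```

Whichever method is used, the ranks are only as reliable as the review_time values going in, so corrupted timestamps produce a different review_th.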
I can't be of much help here because I can't reproduce the bug.
For debugging, it helps to store the intermediate results during the conversion. You can save the df to CSV after each step. Then you may be able to locate the bug.
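One way to do this is sketched below. The helper name checkpoint, the step labels, and the sample data are all hypothetical and are not part of the conversion script. The helper writes the DataFrame to a CSV file after each step and prints the column dtypes, so two environments can be compared step by step.

```python
import tempfile
from pathlib import Path

import pandas as pd


def checkpoint(df, step, out_dir):
    """Write df to <out_dir>/<step>.csv, print its dtypes, and return the path."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"{step}.csv"
    df.to_csv(path, index=False)
    print(step, df.dtypes.astype(str).to_dict())
    return path


# Hypothetical stand-in for the revlog; timestamps are in milliseconds.
out_dir = Path(tempfile.mkdtemp())
df = pd.DataFrame(
    {
        "card_id": [0, 0, 1],
        "review_time": [1700000060000, 1700000000000, 1700086400000],
    }
)
raw_path = checkpoint(df, "01_raw", out_dir)

# Example step: sort the reviews of each card chronologically.
df = df.sort_values(["card_id", "review_time"]).reset_index(drop=True)
sorted_path = checkpoint(df, "02_sorted", out_dir)
```

Comparing the files from two machines with a diff tool shows the first step at which the values or dtypes diverge.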
OK, that makes sense. It appears the densification is working, but the original times coming from the revlog are different. Did you test with the revlog I provided, not your original?
Is there any chance the stats_pb2 file is the wrong version? I got it from @dae's package, and it was needed to run your file. @L-M-Sherlock
I also got that from dae. Could you show some cases where the review time differs?
Here is an output before dropping the rows:
      review_time  card_id  rating  review_state  is_learn_start  sequence_group  last_learn_start  mask  relative_day  delta_t   i  review_th
0        97218963        0       3             0            True               1                 1  True        -19683       -1   1       4863
1        97224667        0       3             0           False               1                 1  True        -19683        0   2       4864
2       440742459        0       3             1           False               1                 1  True        -19679        4   3       4997
3       933416194        0       4             1           False               1                 1  True        -19674        5   4       5846
4      1046892324        0       2             3           False               1                 1  True        -19672        2   5       6105
...           ...      ...     ...           ...             ...             ...               ...   ...           ...      ...  ..        ...
7070  -1726999624      645       3             0           False             620               620  True        -19705        0   2       1367
7071  -1726339624      645       3             3           False             620               620  True        -19705        0   3       1380
7072  -1697912624      645       3             1           False             620               620  True        -19704        1   4       1639
7073  -1659497624      645       3             3           False             620               620  True        -19704        0   5       1959
7074  -1637230624      645       3             3           False             620               620  True        -19704        0   6       2077
[6966 rows x 12 columns]
      card_id  review_th  delta_t  rating
0           0       4863       -1       3
1           0       4864        0       3
2           0       4997        4       3
3           0       5846        5       4
4           0       6105        2       2
...       ...        ...      ...     ...
7070      645       1367        0       3
7071      645       1380        0       3
7072      645       1639        1       3
7073      645       1959        0       3
7074      645       2077        0       3
It's weird that the review_time is negative.
Could you check whether the values are still correct after this line?
Ha! My environment demoted the int64 to int32 here, which corrupted the values. Problem solved. @L-M-Sherlock
The line
df["review_time"] = df["review_time"].astype(int)
is fixed by replacing it with
df["review_time"] = df["review_time"].astype("int64")
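The effect can be reproduced on any platform with the sketch below. It forces the 32-bit cast explicitly through NumPy rather than relying on astype(int), whose width depends on the environment (for example, NumPy releases before 2.0 on Windows use a 32-bit default integer). The sample timestamps are hypothetical millisecond values, not taken from the revlog in this thread.

```python
import numpy as np
import pandas as pd

# Hypothetical millisecond timestamps, far larger than the int32 maximum (2**31 - 1).
raw = pd.Series([1700000000000, 1700000060000], dtype="int64", name="review_time")

# An explicit 64-bit cast keeps the values intact in every environment.
as_int64 = raw.astype("int64")

# A 32-bit cast silently wraps around, which is how negative review times can appear.
as_int32 = pd.Series(raw.to_numpy().astype(np.int32), name="review_time")

# True only if the narrowed values still equal the originals.
lossless = bool((as_int32.astype("int64") == raw).all())

print(as_int64.tolist())
print(as_int32.tolist())
print("int32 cast lossless:", lossless)
```

Spelling out the width, as in astype("int64"), removes the dependence on the platform default.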