That's the section of the optimizer that I am least familiar with. I would like to und


[Question] Explain how the optimizer calculates retention that minimizes review times (open-spaced-repetition/fsrs-optimizer)

Comments (12)

L-M-Sherlock commented on September 25, 2024

By the way, I plan to use a brand-new way to find the optimal retention in FSRS Optimizer. It has already been implemented in https://github.com/open-spaced-repetition/fsrs-rs

The main idea of the new method is to simulate the user's review process at different retention levels and select the retention that maximizes the estimated total knowledge.

from fsrs-optimizer.

L-M-Sherlock commented on September 25, 2024

fsrs-optimizer/src/fsrs_optimizer/fsrs_simulator.py

Lines 101 to 103 in 3583fd2

    review_ratings = np.random.choice(
        [1, 2, 3], np.sum(true_review & ~forget), p=review_rating_prob
    )

A forgotten card's rating must be Again, so we don't need to sample ratings for those cards. The code only samples a rating from Hard, Good and Easy for the cards that were recalled.
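To make the point above concrete, here is a minimal sketch of that sampling step. It is not the simulator's actual code: the function name `sample_review_ratings` is hypothetical, and the 1 = Again through 4 = Easy numbering is an assumption (the quoted snippet draws from `[1, 2, 3]`, and how those values map onto Hard, Good and Easy elsewhere in the simulator is not shown in this thread).

```python
import numpy as np


def sample_review_ratings(forget, review_rating_prob, rng):
    """Assign a rating to every reviewed card (hypothetical helper).

    forget: boolean array, True where the card was forgotten.
    review_rating_prob: P(Hard), P(Good), P(Easy) given that the card was recalled.
    """
    # Forgotten cards are always rated Again (1), so nothing is sampled for them.
    ratings = np.ones(forget.shape, dtype=int)
    # Recalled cards get Hard (2), Good (3) or Easy (4), an assumed numbering.
    n_recalled = int(np.sum(~forget))
    ratings[~forget] = rng.choice([2, 3, 4], size=n_recalled, p=review_rating_prob)
    return ratings


rng = np.random.default_rng(0)
forget = np.array([True, False, False, True, False])
ratings = sample_review_ratings(forget, [0.2, 0.7, 0.1], rng)
```

Because the three probabilities are conditional on recall, they sum to 1 without any entry for Again.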

fsrs-optimizer/src/fsrs_optimizer/fsrs_optimizer.py

Lines 515 to 532 in 3583fd2

    new_card_revlog = df[(df["review_state"] == New)]
    self.first_rating_prob = np.zeros(4)
    self.first_rating_prob[
        new_card_revlog["review_rating"].value_counts().index - 1
    ] = (
        new_card_revlog["review_rating"].value_counts()
        / new_card_revlog["review_rating"].count()
    )
    recall_card_revlog = df[
        (df["review_state"] == Review) & (df["review_rating"] != 1)
    ]
    self.review_rating_prob = np.zeros(3)
    self.review_rating_prob[
        recall_card_revlog["review_rating"].value_counts().index - 2
    ] = (
        recall_card_revlog["review_rating"].value_counts()
        / recall_card_revlog["review_rating"].count()
    )

Here I have estimated these probabilities from the user's actual review history.
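For readers who want to try this estimate on their own data, the following is a small standalone sketch of the same idea using `value_counts(normalize=True)`. The helper `rating_probabilities`, the toy `revlog` table, and the string values `"new"` and `"review"` are assumptions made for illustration; the optimizer itself uses the `New` and `Review` constants shown in the snippet above.

```python
import numpy as np
import pandas as pd


def rating_probabilities(ratings, labels):
    """Empirical probability of each rating, returned in the order of `labels`."""
    freq = ratings.value_counts(normalize=True)
    # Ratings that never occur in the log get probability 0.
    return freq.reindex(labels, fill_value=0.0).to_numpy()


# Toy review log; the column names follow the snippet above.
revlog = pd.DataFrame(
    {
        "review_state": ["new", "new", "review", "review", "review", "review"],
        "review_rating": [1, 3, 3, 1, 2, 3],
    }
)

new_cards = revlog[revlog["review_state"] == "new"]
recalled = revlog[(revlog["review_state"] == "review") & (revlog["review_rating"] != 1)]

first_rating_prob = rating_probabilities(new_cards["review_rating"], [1, 2, 3, 4])
review_rating_prob = rating_probabilities(recalled["review_rating"], [2, 3, 4])
```

The second estimate excludes Again on purpose: it is the distribution of Hard, Good and Easy among successful reviews only.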


L-M-Sherlock commented on September 25, 2024

Please see my paper. The mechanism of this part has been described in Section 4 OPTIMAL SCHEDULING:

source: www.maimemo.com/paper/


Expertium commented on September 25, 2024

> The main idea of the new method is to simulate the user's review process at different retention levels and select the retention that maximizes the estimated total knowledge.

Can you explain how that one works, then? The way I see it, it will just output the max. retention every time, since that's what maximizes total knowledge. Unless you mean total knowledge acquired per unit of time.


L-M-Sherlock commented on September 25, 2024

In the simulation, the daily study time is fixed. If the retention is too high, reviews use up the whole budget and the user has no time left to learn new cards.

For the details of simulation, please see this pseudo code:
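The pseudo code referred to above is not reproduced in this copy of the thread. As a stand-in, here is a rough toy sketch of the general idea (fixed daily time, reviews first, new cards with whatever time is left, then pick the retention with the highest total knowledge). It is not the fsrs-rs or fsrs-optimizer implementation: the forgetting curve `0.9 ** (t / S)`, the fixed stability multiplier, every cost value, and all function names are assumptions chosen only to keep the example short.

```python
import math

import numpy as np

DECAY_BASE = 0.9  # assumed toy forgetting curve: R(t) = 0.9 ** (t / S)


def next_interval(stability, retention):
    """Days until predicted recall drops to `retention` (at least 1 day)."""
    return max(1, round(stability * math.log(retention) / math.log(DECAY_BASE)))


def simulate_knowledge(retention, days=120, daily_seconds=600.0, learn_cost=20.0,
                       recall_cost=8.0, forget_cost=25.0, deck_size=5000, seed=0):
    """Return the summed recall probability of all learned cards after `days`."""
    rng = np.random.default_rng(seed)
    stability = np.zeros(deck_size)   # 0 means the card has not been learned yet
    last = np.zeros(deck_size)        # day of the most recent review
    due = np.full(deck_size, np.inf)  # day of the next scheduled review
    learned = 0
    for day in range(days):
        budget = daily_seconds
        # Reviews come first; anything left unreviewed stays due tomorrow.
        for i in np.flatnonzero(due <= day):
            if budget <= 0:
                break
            recall_prob = DECAY_BASE ** ((day - last[i]) / stability[i])
            if rng.random() < recall_prob:
                budget -= recall_cost
                stability[i] *= 2.5   # assumed fixed growth on success
            else:
                budget -= forget_cost
                stability[i] = 1.0    # assumed reset on a lapse
            last[i] = day
            due[i] = day + next_interval(stability[i], retention)
        # Remaining time goes to new cards.
        while budget >= learn_cost and learned < deck_size:
            budget -= learn_cost
            stability[learned] = 1.0
            last[learned] = day
            due[learned] = day + next_interval(1.0, retention)
            learned += 1
    seen = stability > 0
    return float(np.sum(DECAY_BASE ** ((days - last[seen]) / stability[seen])))


def find_optimal_retention(candidates=(0.7, 0.75, 0.8, 0.85, 0.9, 0.95), **kwargs):
    """Simulate every candidate retention and return the best one with all scores."""
    scores = {r: simulate_knowledge(r, **kwargs) for r in candidates}
    return max(scores, key=scores.get), scores


best, scores = find_optimal_retention()
```

Printing `scores` for a few parameter choices shows the trade-off from the comment above: a higher retention shortens every interval, so reviews consume more of the fixed daily budget and fewer new cards are introduced.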


Expertium commented on September 25, 2024

That's not very easy for me to understand, but thank you.
Unrelated, but have you compared the speeds of the Rust-based optimizer and the Python-based optimizer? I would assume that the Rust version is faster.


L-M-Sherlock commented on September 25, 2024

I haven't compared the speeds, because the Rust-based optimizer hasn't implemented dataset splitting yet.


Expertium commented on September 25, 2024

Also, how do you know whether the new method is better?
With the algorithm itself, finding out whether a change is good or bad is very straightforward - just run both on the same dataset and check the RMSE. We have "ground truth" - the actual repetition history. But how do you assess which method of finding optimal retention is better?
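As an illustration of the comparison described above, here is a minimal sketch that scores two sets of predicted recall probabilities against the same review outcomes. It uses a plain per-review RMSE; the metric the project actually reports may bin or weight reviews differently, and the numbers below are made up for the example.

```python
import numpy as np


def rmse(predicted_recall, recalled):
    """Root mean squared error between predicted recall and actual outcomes (0 or 1)."""
    predicted_recall = np.asarray(predicted_recall, dtype=float)
    recalled = np.asarray(recalled, dtype=float)
    return float(np.sqrt(np.mean((predicted_recall - recalled) ** 2)))


# Ground truth from the repetition history: 1 = recalled, 0 = forgotten.
outcomes = [1, 1, 0, 1, 0]
# Predictions from two hypothetical versions of the algorithm on the same reviews.
model_a = [0.9, 0.8, 0.3, 0.7, 0.4]
model_b = [0.6, 0.6, 0.6, 0.6, 0.6]

rmse_a = rmse(model_a, outcomes)
rmse_b = rmse(model_b, outcomes)
```

The version with the lower value fits the recorded history better. No comparable ground truth exists for the optimal retention itself, which is the difficulty raised in the question.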


L-M-Sherlock commented on September 25, 2024

The old method didn't support decimal difficulty values, so it was imprecise. It also assumed that the user only ever presses Good and Again.


Expertium commented on September 25, 2024

So I was looking at the code, and this raises 2 questions:

1. If I understand it correctly, does it mean that in the simulator "Again" can only happen during the first review, and cannot happen during later reviews? review_rating_prob only has 3 values.
2. Wouldn't it be better to estimate these probabilities from the user's actual review history?


Expertium commented on September 25, 2024

Thank you. One more question: does the simulator use the same value of answer time (recall_cost) for Hard, Good and Easy? Wouldn't it be more precise to use 3 different values for three passing grades?


L-M-Sherlock commented on September 25, 2024

The current simulator uses the same value. You can open an issue for this feature request.
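To show what the requested feature would change, here is a small sketch of the expected time for a successful review when Hard, Good and Easy each have their own cost. This is only an illustration of the request, not existing simulator behavior; the function name and all probability and cost values are hypothetical.

```python
import numpy as np


def expected_recall_cost(review_rating_prob, per_rating_cost):
    """Average seconds spent on a recalled card, weighted by rating probability."""
    prob = np.asarray(review_rating_prob, dtype=float)
    cost = np.asarray(per_rating_cost, dtype=float)
    return float(prob @ cost)


# Hypothetical values in the order Hard, Good, Easy.
review_rating_prob = [0.2, 0.7, 0.1]
per_rating_cost = [12.0, 8.0, 5.0]

expected = expected_recall_cost(review_rating_prob, per_rating_cost)
```

With a single shared `recall_cost`, every recalled card is charged the same time regardless of rating; with per-rating costs, a user who presses Hard often would be simulated with a higher average review time.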


