Comments (21)
The model described in the paper is not identical to the ACT-R model, but it is based on ACT-R as it stood in 2005. The latest version of ACT-R at http://act-r.psy.cmu.edu/software/ appears to be from 2023.
The equations are recursive, and they also add an interference scalar.
The following equations should be more complete.
Example:
(EDIT: note that all previous presentations need to be calculated with their respective decay, therefore simply taking the previous output of
where
The code for the model is available in Excel format on their website http://act-r.psy.cmu.edu/?post_type=publications&p=14206, under downloads in the Model and Sequence Files: http://act-r.psy.cmu.edu/wordpress/wp-content/uploads/2013/09/model-and-seq.zip
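For reference, the equations as they are implemented by the Python code posted in this thread can be written as below. This is a reconstruction from that code, not a copy of the paper's notation; the index conventions and the symbols $t_i$ and $d_i$ are labels assumed here, and the interference scalar $h$ is left out (equivalently, $h = 1$).

```latex
% Activation after n presentations, where t_i is the time elapsed
% since the i-th presentation:
m_n = \ln\left( \sum_{i=1}^{n} t_i^{-d_i} \right)

% Decay of the i-th presentation, set by the activation at the moment
% it occurred (m_0 = -\infty, so d_1 = a):
d_i = c \, e^{m_{i-1}} + a

% Probability of recall for activation m:
p(m) = \frac{1}{1 + e^{(\tau - m)/s}}
```

Here $a$ is the decay intercept, $c$ the decay scale, $\tau$ the threshold and $s$ the noise, matching the parameter comments in the code.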
from srs-benchmark.
My code for example:
import math

sp = [0, 126, 252, 4844, 5877]  # spacing
a = 0.176786766570677  # decay intercept
c = 0.216967308403809  # decay scale
s = 0.254893976981164  # noise
tau = -0.704205679427144  # threshold
m = [-999999]  # activations; the first entry stands in for -infinity
t = [0]
d = [a]

def act():
    for u in range(0, len(sp)-1):
        sumact = 0
        prev = len(m)
        # traces of presentations 1..prev-1, each with its own decay
        for i in range(1,prev):
            mi = math.exp(m[i])
            ti = sp[prev] - sp[i]
            sumact = sumact + ti**(-(c*mi + a))
        # trace of the first presentation, which decays with rate a
        t1 = sp[prev]
        act = math.log(sumact+(t1**(-a)))
        m.append(act)
        t.append(t1)

def activation(m):
    # probability of recall for activation m
    return 1/(1+math.exp((tau-m)/s))

act()
print("m: ", m)
print("t: ", t)
p = map(activation, m[1:])
print("p: ", list(p))
Results:
m: [-999999, -0.8549906405542196, -0.4332005949791639, -0.9299686617111677, -0.6174756722171177]
t: [0, 126, 252, 4844, 5877]
p: [0.3562771058516888, 0.7433029478836352, 0.2919952459321521, 0.584253471491992]
from srs-benchmark.
Could you rewrite the ACT-R model in state-transition equations of D, S and R?
By the way, I'm benchmarking FSRS with a short-term schedule. It reduces RMSE(bins) by 3.2% compared with FSRS-4.5. I'm wondering whether it's worth releasing.
from srs-benchmark.
Could you rewrite the ACT-R model in state-transition equations of D, S and R?
Well, it's not exactly the same as DSR, but if you're asking me to code it - I'll do my best.
from srs-benchmark.
It seems that their approach is quite different from ours. The first review (delta_t=0) is called "study" and consecutive reviews are called "tests". So their first "test" is our second review, and we'll have to discard the first review. The way they calculate R is also different, and this is a bit difficult to explain. We use r = power_forgetting_curve(X[:, 0], state[:, 0]) and calculate new_s later, but what they do is more like r = power_forgetting_curve(X[:, 0], new_s). Basically, we use S[n] to calculate R[n+1], they use m[n] (activation) to calculate R[n].
Because of these differences I cannot implement this model myself.
from srs-benchmark.
I just realized that there is another problem: the way they calculate delta_t. We calculate delta_t as the difference between the most recent review and the previous review, but they calculate it as the total time since the first review.
Suppose that our delta_t's look like this:
1 day
2 days
5 days
15 days
Their delta_t's would look like this
1 day
3 days
8 days
23 days
And that's not everything. Their notation for the activation sum is not what it looks like: calculating that sum is more complicated than it appears, and the notation is misleading. Read the appendix in the linked paper.
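The conversion between the two delta_t conventions listed above is a cumulative sum. A minimal sketch, using the example numbers from this comment; the function names are hypothetical and not part of the benchmark code:

```python
from itertools import accumulate


def elapsed_since_first(intervals):
    """Turn per-review intervals into total time since the first review."""
    return list(accumulate(intervals))


def intervals_between(elapsed):
    """Inverse: recover per-review intervals from cumulative times."""
    return [t - prev for prev, t in zip([0] + elapsed[:-1], elapsed)]


ours = [1, 2, 5, 15]                 # days between consecutive reviews
theirs = elapsed_since_first(ours)   # days since the first review
print(theirs)                        # [1, 3, 8, 23]
print(intervals_between(theirs))     # [1, 2, 5, 15]
```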
from srs-benchmark.
Btw, there is source code of software that uses ACT-R, but it's in freaking Lisp: http://act-r.psy.cmu.edu/actr7.x/actr7.x.zip. This code is probably more ancient than both of us.
from srs-benchmark.
http://act-r.psy.cmu.edu/wordpress/wp-content/uploads/2013/09/model-and-seq.zip
I can't download it.
EDIT: I downloaded it from the website.
from srs-benchmark.
The following equations should be more complete.
Could you calculate the pr(m) from the following review history?
r_history = [0, 0, 1, 1, 0, 1]
t_history = [0, 4, 4, 15, 10, 1]
delta_t = 1
from srs-benchmark.
The appendix (first link at the very top of this issue) has an example.
from srs-benchmark.
The following equations should be more complete.
Could you calculate the pr(m) from following review history?
r_history = [0, 0, 1, 1, 0, 1] t_history = [0, 4, 4, 15, 10, 1] delta_t = 1
I have Python code that replicates the example in the appendix. I'll create a gist of it and see if I can calculate p from the given history. I just need to understand what r_history, t_history and delta_t represent.
from srs-benchmark.
I don't know either, the actual dataset uses a different format.
Here's an example:
3000.csv
card_id is self-explanatory, review_th is some sort of order thingy that tells you which card was reviewed before which card (I think?), delta_t is the time elapsed between the last review and the new review, and rating is like this: Again=1, Hard=2, Good=3, Easy=4.
And delta_t can be -1 for some reason, I don't know why. Only Sherlock knows how to use this stuff.
from srs-benchmark.
delta_t is the time elapsed between the last review and the new review and rating is like this: Again=1, Hard=2, Good=3, Easy=4.
Is delta_t measured in days? I ask because this model requires time in seconds since the initial review.
from srs-benchmark.
Yes, in days. Btw, none of the models in the benchmark use same-day reviews, so we won't need h (interference scalar). It would be unfair if this was the only model that uses same-day reviews.
from srs-benchmark.
My code:
import numpy as np

h = 1
a = 0.177
c = 0.217
tau = -0.704
s = 0.255

def next_m(t, m):
    if t == 0:
        return m
    return np.log(np.exp(m) + np.power(h * t, - c * np.exp(m) - a))

def p_recall(m):
    return 1 / (1 + np.exp((tau - m) / s))

m = -np.inf
for t in (0, 126, 252, 4844, 5877):
    m = next_m(t, m)
    print(m, p_recall(m))
Results:
-inf 0.0
-0.8560218975304116 0.35522173003731444
-0.42991484236460425 0.7455169723229881
-0.33159217939281044 0.8115973363882435
-0.2568675632858423 0.8523887453193757
The first three lines are consistent with the appendix. I don't know why the result changes so much at the 3rd activation.
Edit: I get it. The t is not the time between adjacent activations. It's the time elapsed from each activation to now.
from srs-benchmark.
It's because they transform time. Read about h. Nevermind, you used transformed time.
It's the time elapsed from the activation to now.
And that too. I've mentioned that before.
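To make the point about t concrete, here is a small sketch that puts the one-step recursion next to a version in which every past presentation is aged to the current time with its own decay. It uses the full-precision parameters and the example times posted earlier in this thread, assumes h = 1, and the function names are hypothetical.

```python
import math

A = 0.176786766570677  # decay intercept
C = 0.216967308403809  # decay scale


def recursive_activations(times):
    """One-step recursion: only the previous activation is carried forward."""
    m, out = -math.inf, []
    for t in times[1:]:
        prev = math.exp(m)  # exp(-inf) is 0.0
        m = math.log(prev + t ** (-(C * prev + A)))
        out.append(m)
    return out


def full_sum_activations(times):
    """Each presentation keeps its own decay and is aged to the current time."""
    m = [-math.inf]
    for n in range(1, len(times)):
        total = 0.0
        for i in range(n):
            decay = C * math.exp(m[i]) + A
            total += (times[n] - times[i]) ** (-decay)
        m.append(math.log(total))
    return m[1:]


times = [0, 126, 252, 4844, 5877]
for rec, full in zip(recursive_activations(times), full_sum_activations(times)):
    print(f"recursive: {rec:.4f}   full sum: {full:.4f}")
```

The two versions agree on the first value and drift apart afterwards: the recursion can only add to the previous activation, so it never shows the drop after the long gap before the third test.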
from srs-benchmark.
The problem is that the code has two nested loops and uses lists to store the intermediate outputs. It's hard to implement in torch.
from srs-benchmark.
The problem is that the code has two nested loops and uses lists to store the intermediate outputs. It's hard to implement in torch.
I'll see if I can port it to torch. I'm not sure gradient descent would handle this properly. I could also try SciPy, which has more general optimization methods, and use the same method as in the paper.
from srs-benchmark.
Never mind. I have implemented a basic version:
import torch

sp = torch.tensor([0, 126, 252, 4844, 5877])  # spacing
a = 0.176786766570677  # decay intercept
c = 0.216967308403809  # decay scale
s = 0.254893976981164  # noise
tau = -0.704205679427144  # threshold
m = torch.zeros_like(sp, dtype=torch.float)
m[0] = -torch.inf
t = torch.zeros_like(sp, dtype=torch.float)
d = torch.zeros_like(sp, dtype=torch.float)
d[0] = a

def act():
    for i in range(1, len(sp)):
        sumact = 0
        for j in range(1, i):
            mi = torch.exp(m[j])
            ti = (sp[i] - sp[j])
            sumact = sumact + ti**(-(c*mi + a))
        t1 = sp[i]
        act = torch.log(sumact+(t1**(-a)))
        m[i] = act
        t[i] = t1

def activation(m):
    return 1/(1+torch.exp((tau-m)/s))

act()
print("m: ", m)
print("t: ", t)
p = activation(m[1:])
print("p: ", p)
The next step is to move it into https://github.com/open-spaced-repetition/fsrs-benchmark/blob/main/other.py and make the necessary modifications.
from srs-benchmark.
sumact = 0
for j in range(1, i):
    mi = torch.exp(m[j])
    ti = (sp[i] - sp[j])
    sumact = sumact + ti**(-(c*mi + a))
Avoid the loop:
sumact = torch.sum((sp[i] - sp[1:i])**(-(c*torch.exp(m[1:i]) + a)))
from srs-benchmark.
Model:
import torch
import torch.nn as nn
from torch import Tensor

class ACT_R(nn.Module):
    a = 0.176786766570677  # decay intercept
    c = 0.216967308403809  # decay scale
    s = 0.254893976981164  # noise
    tau = -0.704205679427144  # threshold
    init_w = [a, c, s, tau]

    def __init__(self):
        super().__init__()
        self.w = nn.Parameter(torch.tensor(self.init_w))

    def forward(self, sp: Tensor):
        """
        :param sp: shape[seq_len, batch_size, 1]
        """
        m = torch.zeros_like(sp, dtype=torch.float)
        m[0] = -torch.inf
        for i in range(1, len(sp)):
            # sum over all earlier presentations, each with its own decay
            act = torch.log(
                torch.sum(
                    (sp[i] - sp[0:i]) ** (-(self.w[1] * torch.exp(m[0:i]) + self.w[0])),
                    dim=0,
                )
            )
            m[i] = act
        return self.activation(m)

    def activation(self, m):
        return 1 / (1 + torch.exp((self.w[3] - m) / self.w[2]))

model = ACT_R()
sp = torch.tensor(
    [[[0], [0]], [[126], [252]], [[252], [512]], [[4844], [9581]], [[5877], [18853]]]
)  # spacing
p = model(sp)
print("p: ", p)
Results:
p: tensor([[[0.0000],
[0.0000]],
[[0.3563],
[0.2550]],
[[0.7433],
[0.6352]],
[[0.2920],
[0.2174]],
[[0.5843],
[0.3131]]], grad_fn=<MulBackward0>)
The next step is to extract the spacing feature from the dataset.
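A rough sketch of what extracting the spacing feature could look like, assuming rows with the card_id, review_th and delta_t columns described earlier in the thread. Several things here are assumptions rather than the benchmark's actual code: the function name, the dict-based row format, ordering by review_th, converting days to seconds, and treating the first review of each card as time 0 while ignoring its delta_t (whatever value it carries). Negative delta_t values on later reviews are not handled.

```python
from collections import defaultdict

SECONDS_PER_DAY = 86400


def spacing_by_card(rows):
    """Map each card_id to its review times in seconds since the first review.

    rows: iterable of dicts with keys card_id, review_th and delta_t (days).
    """
    by_card = defaultdict(list)
    for row in sorted(rows, key=lambda r: r["review_th"]):
        by_card[row["card_id"]].append(row["delta_t"])

    spacing = {}
    for card_id, deltas in by_card.items():
        elapsed, times = 0, []
        for k, dt in enumerate(deltas):
            if k > 0:  # assumption: the first review is time 0, its delta_t is ignored
                elapsed += dt * SECONDS_PER_DAY
            times.append(elapsed)
        spacing[card_id] = times
    return spacing


rows = [
    {"card_id": 1, "review_th": 1, "delta_t": -1},
    {"card_id": 2, "review_th": 2, "delta_t": -1},
    {"card_id": 1, "review_th": 3, "delta_t": 1},
    {"card_id": 1, "review_th": 4, "delta_t": 2},
    {"card_id": 2, "review_th": 5, "delta_t": 3},
]
print(spacing_by_card(rows))
```

Each resulting list plays the role of sp in the snippets above, one list per card.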
from srs-benchmark.