Comments (7)
My guess is that 50 pairs is nowhere near enough to train NCC. I would suggest using a polynomial generator to generate ~2000 pairs:
from cdt.data import CausalPairGenerator
c = CausalPairGenerator('polynomial')
data, labels = c.generate(2000, 500)
Best,
Diviyan
from causaldiscoverytoolbox.
Both the sample size and the number of training epochs have an influence:
With 50 test pairs and 50 training pairs, the average accuracy is ~50% at epochs=500 (as mentioned), and goes up to ~55% at epochs=200.
With 1 test pair and the other 99 pairs for training, the accuracy is ~65% at epochs=500, but ~49% at epochs=1000.
I will also try to train on artificial pairs.
Thank you!
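Since the numbers above move by several points between settings, averaging over several random splits helps separate a real effect from split-to-split noise. The following is a minimal sketch in plain Python, not part of cdt: `repeated_split_accuracy`, `fit_and_score`, and `majority_baseline` are hypothetical names introduced here for illustration, and the baseline is only a stand-in for a real model such as NCC.

```python
import random
from statistics import mean, stdev


def repeated_split_accuracy(items, labels, fit_and_score, n_train, n_repeats=10, seed=0):
    """Average test accuracy over several random train/test splits.

    fit_and_score is a hypothetical callback: it receives (train, test), each a
    list of (item, label) tuples, and returns the test accuracy as a float.
    """
    if not 0 < n_train < len(items):
        raise ValueError("n_train must leave at least one test item")
    rng = random.Random(seed)
    indices = list(range(len(items)))
    accuracies = []
    for _ in range(n_repeats):
        rng.shuffle(indices)
        train = [(items[i], labels[i]) for i in indices[:n_train]]
        test = [(items[i], labels[i]) for i in indices[n_train:]]
        accuracies.append(fit_and_score(train, test))
    # The standard deviation needs at least two repeats.
    spread = stdev(accuracies) if len(accuracies) > 1 else 0.0
    return mean(accuracies), spread


def majority_baseline(train, test):
    """Stand-in model: always predict the most common training label."""
    train_labels = [label for _, label in train]
    guess = max(set(train_labels), key=train_labels.count)
    return sum(label == guess for _, label in test) / len(test)


if __name__ == "__main__":
    toy_labels = [1, -1] * 50      # 100 toy pairs with balanced directions
    toy_items = list(range(100))   # placeholders for the actual pairs
    avg, spread = repeated_split_accuracy(
        toy_items, toy_labels, majority_baseline, n_train=50
    )
    print(f"mean acc={avg:.3f}, std={spread:.3f}")
```

Reporting the spread next to the mean makes it easier to judge whether a difference such as ~50% versus ~55% exceeds the variation between splits.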
> When 50 testing vs 50 training, if epochs=500, average acc is ~50% as mentioned, while if epochs=200, average acc goes up to ~55%.
There might be some overfitting hidden here. I'll be waiting for the extensive results on artificial pairs :)
Hi
I tried training on 3000 artificial pairs. The test performance on TCEP is still only slightly better than random guessing.
Strangely, NCC seems to be overfitting even with only 5 training epochs.
Code (I checked that CausalPairGenerator returns pairs with random directions, so I didn't shuffle them):
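import numpy as np
import pandas as pd
from cdt.causality.pairwise import NCC
from cdt.data import CausalPairGenerator

# `tueb` (the test pairs) and `labels` (their directions) are not defined in
# this snippet; they are assumed to be loaded beforehand.

def test_NCC():
    method = NCC
    print(method)
    m = method()
    # 1000 pairs of 500 points each from three causal mechanisms.
    data0, dirs0 = CausalPairGenerator('polynomial').generate(1000, 500)
    data1, dirs1 = CausalPairGenerator('gp_add').generate(1000, 500)
    data2, dirs2 = CausalPairGenerator('nn').generate(1000, 500)
    data = pd.concat([data0, data1, data2])
    dirs = pd.concat([dirs0, dirs1, dirs2])
    m.fit(data, dirs, epochs=5)
    r = m.predict_dataset(tueb)
    # A prediction counts as correct when its sign matches the label.
    acc = np.mean(r.values * labels.values > 0)
    print(acc)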
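The accuracy line in the code above counts a pair as correct when the predicted score and the label have the same sign. A minimal plain-Python sketch of that rule follows; `sign_accuracy` is a hypothetical helper name, and the sketch assumes labels are encoded as 1 and -1 and that a positive score means the first direction.

```python
def sign_accuracy(scores, labels):
    """Fraction of pairs whose score has the same sign as its label.

    Assumes labels are +1 / -1. A score of exactly 0 counts as wrong,
    matching the strict `> 0` comparison in the snippet above.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not scores:
        raise ValueError("at least one pair is required")
    correct = sum(1 for score, label in zip(scores, labels) if score * label > 0)
    return correct / len(scores)


if __name__ == "__main__":
    # Two of the four toy predictions agree in sign with their labels.
    print(sign_accuracy([0.8, -0.3, 0.0, 0.5], [1, 1, -1, 1]))  # 0.5
```

Note that this is an unweighted accuracy: every pair contributes equally, whatever weight a benchmark may assign to it.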
Output with 1000 epochs:
Epochs: 100%|██████████| 1000/1000 [2:02:04<00:00, 7.72s/it, Acc=0.983]
65%|██████▌   | 66/102 [00:00<00:00, 299.84it/s]
0.5294117647058824
10 epochs:
Epochs: 100%|██████████| 10/10 [00:56<00:00, 5.60s/it, Acc=0.845]
87%|████████▋ | 89/102 [00:00<00:00, 871.85it/s]
0.5490196078431373
5 epochs:
Epochs: 100%|██████████| 5/5 [00:43<00:00, 8.96s/it, Acc=0.849]
74%|███████▍  | 75/102 [00:00<00:00, 747.90it/s]
0.5588235294117647
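To put the three test accuracies in context, one can compare them with the sampling noise of a coin-flip classifier. The sketch below assumes the 102 shown in the prediction progress bar is the number of test pairs, and that each pair is an independent, unweighted trial; `chance_z_score` is a hypothetical helper name.

```python
import math


def chance_z_score(accuracy, n_pairs, chance=0.5):
    """Number of standard errors by which an accuracy exceeds chance level.

    Uses the binomial standard error sqrt(p * (1 - p) / n) under the
    assumption that every pair is an independent trial with success rate p.
    """
    standard_error = math.sqrt(chance * (1 - chance) / n_pairs)
    return (accuracy - chance) / standard_error


if __name__ == "__main__":
    reported = [
        (1000, 0.5294117647058824),
        (10, 0.5490196078431373),
        (5, 0.5588235294117647),
    ]
    for epochs, acc in reported:
        print(f"epochs={epochs}: z={chance_z_score(acc, 102):.2f}")
```

Under these assumptions the standard error is about 0.05, so all three results lie within roughly 1.2 standard errors of 50%. That is consistent with the observation that test performance is barely above guessing while training accuracy reaches 0.85 to 0.98.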
Hi,
Right, I'll look into it.
Hi
Has this ever been solved?
Thanks in advance
Hello,
I didn't get an answer from the author, so I will get back to the implementation myself.