I test NCC with half TCEP pairs for training and half for testing, and randomly split

NCC got random guess performance on TCEP (causaldiscoverytoolbox issue, open, 7 comments)

fentechsolutions commented on May 14, 2024

NCC got random guess performance on TCEP

from causaldiscoverytoolbox.

Comments (7)

Diviyan-Kalainathan commented on May 14, 2024

My guess is that 50 pairs is nowhere near enough to train NCC. I would suggest using a polynomial generator to generate ~2000 pairs.

from cdt.data import CausalPairGenerator
c = CausalPairGenerator('polynomial')
data, labels = c.generate(2000, 500)

Best,
Diviyan

wpzdm commented on May 14, 2024

Both the sample size and the number of training epochs have an influence:
When 50 testing vs 50 training, if epochs=500, average acc is ~50% as mentioned, while if epochs=200, average acc goes up to ~55%.
When 1 testing vs all 99 training, if epochs=500, acc is ~65%, but if epochs=1000, acc is ~49%.

I will also try to train on artificial pairs.

Thank you!
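For scale, the accuracies above can be compared with the spread expected from pure guessing. The sketch below uses the normal approximation to the binomial for a single test split. The function name `chance_interval` is illustrative, and the calculation assumes independent test pairs, which repeated random splits of the same 100 pairs only approximate.

```python
import math


def chance_interval(n_pairs, p=0.5, z=1.96):
    """Approximate interval around chance accuracy p for n_pairs test pairs.

    Uses the normal approximation to the binomial; z=1.96 gives roughly 95%.
    """
    se = math.sqrt(p * (1 - p) / n_pairs)  # standard error of the accuracy
    return p - z * se, p + z * se


low, high = chance_interval(50)
print(f"{low:.3f} to {high:.3f}")  # 0.361 to 0.639
```

For 50 test pairs the interval runs from roughly 36% to 64%, so a single-split accuracy of 50% or 55% is hard to tell apart from guessing.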

Diviyan-Kalainathan commented on May 14, 2024

When 50 testing vs 50 training, if epochs=500, average acc is ~50% as mentioned, while if epochs=200, average acc goes up to ~55%.

There might be some overfitting hidden here, I'll be waiting for the extensive results on artificial pairs :)

wpzdm commented on May 14, 2024

I trained on 3000 artificial pairs. The test performance on TCEP is still only slightly better than random guessing.
Strangely, NCC seems to be overfitting even with only 5 training epochs.

Code (I checked that CausalPairGenerator returns pairs with random directions, so I didn't shuffle them):

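import numpy as np
import pandas as pd
from cdt.causality.pairwise import NCC
from cdt.data import CausalPairGenerator, load_dataset

# Assumption: tueb and labels, undefined in the original snippet,
# are the TCEP pairs and their directions as shipped with cdt.
tueb, labels = load_dataset('tuebingen')

def test_NCC():
    method = NCC
    print(method)
    m = method()

    # Train on 3 x 1000 synthetic pairs with 500 points each.
    data0, dirs0 = CausalPairGenerator('polynomial').generate(1000, 500)
    data1, dirs1 = CausalPairGenerator('gp_add').generate(1000, 500)
    data2, dirs2 = CausalPairGenerator('nn').generate(1000, 500)
    data = pd.concat([data0, data1, data2])
    dirs = pd.concat([dirs0, dirs1, dirs2])

    m.fit(data, dirs, epochs=5)
    r = m.predict_dataset(tueb)
    # A pair counts as correct when the score and the label share a sign.
    acc = np.mean(r.values * labels.values > 0)

    print(acc)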
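The accuracy line in the snippet counts a pair as correct when the predicted score and the label have the same sign. A minimal pure-Python sketch of that metric follows. It assumes the scores are signed (positive for one direction, negative for the other) and the labels are +1 or -1, which the thread does not confirm for NCC's output. The helper names are illustrative and are not part of cdt.

```python
def sign_accuracy(scores, labels):
    """Fraction of pairs whose score has the same sign as its label (+1 or -1).

    A score of exactly 0 counts as wrong, as in `r.values * labels.values > 0`.
    """
    scores, labels = list(scores), list(labels)
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    hits = sum(1 for s, l in zip(scores, labels) if s * l > 0)
    return hits / len(scores)


def sign_counts(scores):
    """Count positive, negative and zero scores as a quick sanity check."""
    scores = list(scores)
    pos = sum(1 for s in scores if s > 0)
    neg = sum(1 for s in scores if s < 0)
    return {"positive": pos, "negative": neg, "zero": len(scores) - pos - neg}


# Made-up scores and labels, only to show the behaviour.
example_scores = [0.3, -0.2, 0.1, 0.0]
example_labels = [1, -1, -1, 1]
print(sign_accuracy(example_scores, example_labels))  # 0.5
print(sign_counts(example_scores))
```

If every score came back with the same sign, sign agreement would only measure the label balance, so checking the counts first is cheap. Flat inputs such as `r.values.ravel()` and `labels.values.ravel()` are expected here: multiplying NumPy arrays of shape (n,) and (n, 1) broadcasts to (n, n), which would silently change what the mean measures.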

Output with 1000 Epochs:

Epochs: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1000/1000 [2:02:04<00:00,  7.72s/it, Acc=0.983]
 65%|█████████████████████████████████████████████████████████████████████████████████████████▎                                                | 66/102 [00:00<00:00, 299.84it/s]
0.5294117647058824

10 Epochs:

Epochs: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10/10 [00:56<00:00,  5.60s/it, Acc=0.845]
 87%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▍                 | 89/102 [00:00<00:00, 871.85it/s]
0.5490196078431373

5 Epochs:

Epochs: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 5/5 [00:43<00:00,  8.96s/it, Acc=0.849]
 74%|█████████████████████████████████████████████████████████████████████████████████████████████████████▍                                    | 75/102 [00:00<00:00, 747.90it/s]
0.5588235294117647
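The logs above pair a high training accuracy with near-chance TCEP accuracy, which could reflect overfitting or a mismatch between synthetic and real pairs. One way to tell the two apart is to hold out some synthetic pairs and score them as well. The sketch below only builds the index split; `split_indices` is a hypothetical helper, not a cdt function, and the 20% validation fraction is an arbitrary choice.

```python
import random


def split_indices(n_pairs, val_fraction=0.2, seed=0):
    """Shuffle pair positions and return (train_idx, val_idx) lists."""
    rng = random.Random(seed)  # local generator, so the split is reproducible
    indices = list(range(n_pairs))
    rng.shuffle(indices)
    n_val = int(round(n_pairs * val_fraction))
    return indices[n_val:], indices[:n_val]


# 3000 matches the number of synthetic pairs used in the thread.
train_idx, val_idx = split_indices(3000, val_fraction=0.2, seed=0)
print(len(train_idx), len(val_idx))  # 2400 600
```

With the DataFrames from the snippet, `data.iloc[train_idx]` and `dirs.iloc[train_idx]` would go to `m.fit`, and the output of `m.predict_dataset(data.iloc[val_idx])` would be compared with `dirs.iloc[val_idx]`. High held-out synthetic accuracy alongside near-chance TCEP accuracy would point to a distribution gap rather than plain overfitting.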

Diviyan-Kalainathan commented on May 14, 2024

Hi,
Right, I'll look into it.

sAviOr287 commented on May 14, 2024

Has this ever been solved?

Thanks in advance

Diviyan-Kalainathan commented on May 14, 2024

Hello,
I didn't get an answer from the author, so I will go back to the implementation myself.
