
ganf's People

Contributors

jiechenjiechen


ganf's Issues

How to use the baseline training code

Hello,
Thanks for sharing the code! I found the Python code for the baselines in the models folder, but I am not sure how to use it. Could you please give some instructions on how to use it to reproduce the baseline results when you have a moment?

Thank you in advance! My email is [email protected].

Confusion about the code

Hello, thank you for making the code available; it's very nice work. When I run train_water.py, I am confused by line 66 of GANF.py. In the paper, log p(x) in Equation (10) is a sum, but in the code it becomes a mean. I tried changing mean() to sum(), and the best roc_test was 0.7875, consistent with the 79.6±0.9 reported in Table 1. Without the change, the result is better: roc_test reaches 0.79 or 0.80. I can't understand why this change was introduced. Thank you!


I think log_prob = log_prob.sum(dim=1) is more reasonable.

import torch.nn as nn
# GNN, MAF, and RealNVP are defined in the GANF repository's model modules.

class GANF(nn.Module):

    def __init__(self, n_blocks, input_size, hidden_size, n_hidden, dropout=0.1, model="MAF", batch_norm=True):
        super(GANF, self).__init__()

        self.rnn = nn.LSTM(input_size=input_size, hidden_size=hidden_size, batch_first=True, dropout=dropout)
        self.gcn = GNN(input_size=hidden_size, hidden_size=hidden_size)
        if model == "MAF":
            self.nf = MAF(n_blocks, input_size, hidden_size, n_hidden, cond_label_size=hidden_size, batch_norm=batch_norm, activation='tanh')
        else:
            self.nf = RealNVP(n_blocks, input_size, hidden_size, n_hidden, cond_label_size=hidden_size, batch_norm=batch_norm)

    def forward(self, x, A):

        return self.test(x, A).mean()

    def test(self, x, A):
        # x: N x K x L x D
        full_shape = x.shape

        # reshape: N*K, L, D
        x = x.reshape((x.shape[0]*x.shape[1], x.shape[2], x.shape[3]))
        h, _ = self.rnn(x)

        # reshape: N, K, L, H
        h = h.reshape((full_shape[0], full_shape[1], h.shape[1], h.shape[2]))

        h = self.gcn(h, A)

        # reshape: N*K*L, H
        h = h.reshape((-1, h.shape[3]))
        x = x.reshape((-1, full_shape[3]))

        log_prob = self.nf.log_prob(x, h).reshape([full_shape[0], -1])  # *full_shape[1]*full_shape[2]
        log_prob = log_prob.mean(dim=1)

        return log_prob
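One observation that may explain part of the discrepancy: for a fixed window shape, mean() is just sum() divided by the constant K*L, so the two reductions rank test windows identically; they differ only in the scale of the training loss (and hence the gradients). A small sketch with synthetic numbers:

```python
import numpy as np

rng = np.random.default_rng(0)
log_prob = rng.normal(size=(5, 12))          # 5 windows, K*L = 12 terms each
per_window_sum = log_prob.sum(axis=1)        # Equation (10) as written
per_window_mean = log_prob.mean(axis=1)      # what the code computes

# mean = sum / (K*L): a fixed positive rescaling, so the two scores
# rank windows identically when every window has the same K*L
assert np.allclose(per_window_mean * 12, per_window_sum)
assert (np.argsort(per_window_sum) == np.argsort(per_window_mean)).all()
```

The different gradient scale during training can still lead to a different learned model, which would account for the differing roc_test values.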

The dataset in train_water.py

Hi, in train_water.py GANF is trained on SWaT_Dataset_Attack_v0.csv: when the script runs, SWaT_Dataset_Attack_v0.csv is split into train/val/test dataloaders. I can't understand why the model is trained on SWaT_Dataset_Attack_v0.csv. It seems more reasonable to train on SWaT_Dataset_Normal_v1.csv, which contains no attacks, and to test on SWaT_Dataset_Attack_v0.csv. That training scheme would make the attacked points more likely to fall in regions of low probability density. Thank you very much!
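The split being proposed can be sketched as follows; the toy DataFrames are hypothetical stand-ins for the two SWaT CSVs, and the 80/20 split ratio is an assumption for illustration:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
# Toy frames standing in for the real CSVs
normal = pd.DataFrame({"FIT101": rng.normal(size=100)})         # SWaT_Dataset_Normal_v1.csv (no attacks)
attack = pd.DataFrame({"FIT101": rng.normal(size=50),
                       "label": [0] * 40 + [1] * 10})           # SWaT_Dataset_Attack_v0.csv

# Fit the density model on normal data only; hold out the attacked trace
n = len(normal)
train = normal.iloc[: int(0.8 * n)]   # 80 rows for training
val = normal.iloc[int(0.8 * n):]      # 20 rows for validation
test = attack                         # all attacked data reserved for evaluation
```

Under this scheme the flow never sees attack points during training, so attacks should land in low-density regions at test time.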

Univariate time series

Hello, I would like to ask if the code has an effect on univariate time series, and I would appreciate it if you could answer me.

DeepSVDD get 84% AUROC on SWaT, better than GANF

I'm following your great work. However, when I run the DeepSVDD implementation provided in the code, I get 84% AUROC on SWaT, better than GANF. It seems that DeepSVDD may be overfitting the dataset. How can I resolve this? My settings are as follows. Could you provide the DeepSVDD settings so that I can continue following your work? Thank you very much.
epochs = 40
input_feature = 51
hidden_size = 64
If possible, could you send your training code for the baseline models to my email [email protected]?
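For reference, the one-class DeepSVDD score being tuned here can be sketched generically as the squared distance of each embedding to a fixed center (this is a textbook formulation, not the repo's exact code; the embeddings and center are synthetic):

```python
import numpy as np

def deepsvdd_score(features, center):
    """Anomaly score: squared Euclidean distance of each embedding to the center."""
    return ((features - center) ** 2).sum(axis=1)

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(8, 64))   # hidden_size = 64, as in the settings above
center = embeddings.mean(axis=0)        # center typically fixed from an initial pass
scores = deepsvdd_score(embeddings, center)
```

Training minimizes the mean of these distances over normal data, so a too-flexible encoder can collapse embeddings toward the center and distort the reported AUROC.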

Baselines Training Code

Hi,

Great work from the authors, and thank you for making the code available. I was wondering whether the training code for the baselines could also be shared. I am working on a variant problem for which I would like to try one of the baselines, DeepSVDD or DeepSAD. Since the codebase provides a nice framework to build on, I would highly appreciate it if the baseline training code could be made available.
My email id is [email protected] for further communication.

Thanks a lot.

How to run DeepSAD and DeepSVDD

I am trying to run the DeepSAD and DeepSVDD baselines that you provided; however, I don't know what the parameters delta_t and sigma of the test function refer to, or what values to pass in. Could you clarify this? I hope you can reply soon!

Bug in load_water

There is a bug in the load_water(..) function:

import pandas as pd

root = 'data/SWaT_Dataset_Attack_v0.csv'
data = pd.read_csv(root)
data = data.rename(columns={"Normal/Attack": "label"})
data.label[data.label != "Normal"] = 1
data.label[data.label == "Normal"] = 0
ts_format = pd.to_datetime(data["Timestamp"], format="%d/%m/%Y %I:%M:%S %p")
ts_no_format = pd.to_datetime(data["Timestamp"])

In the code block above, ts_format and ts_no_format should be identical. However, since ts_no_format is parsed without the format string, pandas treats the string 2/1/2016 7:00:00 AM as Feb 1st 2016 instead of the TRUE date, Jan 2nd 2016.

The format specified in the format argument matches the format of the string timestamps; this can easily be verified by checking any timestamp with a day greater than 12.

I'm not sure how much this bug affects performance, but it would be nice if the authors could fix it.
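The discrepancy is easy to reproduce on a single timestamp (pandas defaults to month-first parsing when no format is given):

```python
import pandas as pd

ts = pd.Series(["2/1/2016 7:00:00 AM"])
with_fmt = pd.to_datetime(ts, format="%d/%m/%Y %I:%M:%S %p")  # day-first, as in the CSV
no_fmt = pd.to_datetime(ts)                                   # pandas guesses month-first

# with_fmt[0] is 2016-01-02 (Jan 2nd); no_fmt[0] is 2016-02-01 (Feb 1st)
```

Passing dayfirst=True, or always supplying the explicit format string, avoids the ambiguity.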

How does MAF work for this one-dimensional univariate time series?

Thank you very much for sharing the code. In your paper, you use a conditional normalizing flow based on MAF to evaluate the conditional probability distribution of each time-series variable. But MAF masks some dimensions of the hidden variables and then performs autoregression. How does MAF work for a one-dimensional univariate time series?
