NetManAIOps / OmniAnomaly
KDD 2019: Robust Anomaly Detection for Multivariate Time Series through Stochastic Recurrent Neural Network
License: MIT License
As I read the paper, OmniAnomaly is an unsupervised learning method; however, your Server Machine Dataset has labels?
Thanks for the great work. Have you considered adding a license to your code so that people can reuse, modify, and distribute it?
GitHub's default is that no one has the right to reuse, modify, or redistribute your code (see https://choosealicense.com/no-permission/) unless you choose an open-source license (such as MIT).
What is anomaly interpretation used for? #13
pip install tensorflow-gpu==1.12.0
ERROR: Could not find a version that satisfies the requirement tensorflow-gpu==1.12.0 (from versions: 1.13.0rc1, 1.13.0rc2, 1.13.1, 1.13.2, 1.14.0rc0, 1.14.0rc1, 1.14.0, 1.15.0rc0, 1.15.0rc1, 1.15.0rc2, 1.15.0rc3, 1.15.0, 2.0.0a0, 2.0.0b0, 2.0.0b1, 2.0.0rc0, 2.0.0rc1, 2.0.0rc2, 2.0.0, 2.1.0rc0, 2.1.0rc1, 2.1.0rc2, 2.1.0)
ERROR: No matching distribution found for tensorflow-gpu==1.12.0
It seems that version 1.12.0 is no longer provided by pip.
Hi authors,
I am trying to replicate the results reported in your paper. Could you confirm whether the default hyperparameter settings provided in the code are exactly the same as those used in the experiments you reported?
I think another issue I'm having in replicating similar POT-F1 scores is the one mentioned in #17: how do you combine the POT-F1 from 28 different entities into a single score? I have attached the results I got by running the code: omnianom_results.txt.
Thank you!
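For reference, one plausible way to combine per-entity results (an assumption on my part, not confirmed by the authors) is to pool the point-wise counts across all 28 entities and compute a micro-averaged F1, as in this sketch:

# Hypothetical aggregation sketch: pool TP/FP/FN over all entities, then
# compute one micro-averaged precision/recall/F1. Whether the paper did this
# or averaged per-entity F1 scores is exactly the open question here.
def micro_f1(per_entity_counts):
    """per_entity_counts: list of (tp, fp, fn) tuples, one per entity."""
    tp = sum(c[0] for c in per_entity_counts)
    fp = sum(c[1] for c in per_entity_counts)
    fn = sum(c[2] for c in per_entity_counts)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

print(micro_f1([(90, 10, 5), (40, 2, 20)]))  # toy counts for two entities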
I really appreciate you sharing the ServerMachineDataset. But could you please specify the exact physical meaning of each column? That will be really helpful for understanding the dataset.
I ran this command:
python main.py --dataset='MSL' --max_epoch=20
and I got this error:
Traceback (most recent call last):
File "main.py", line 220, in <module>
main()
File "main.py", line 97, in main
test_start=config.test_start)
File "/media/harry/harry/ML/LSTM/OmniAnomaly/omni_anomaly/utils.py", line 59, in get_data
f = open(os.path.join(prefix, dataset + '_train.pkl'), "rb")
IOError: [Errno 2] No such file or directory: 'processed/MSL_train.pkl'
Please help me: how can I run the Server Machine Dataset?
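For anyone hitting the same IOError: the missing processed/MSL_train.pkl just means the preprocessing step has not been run yet; the README's preprocessing script (data_preprocess.py, see the naming issue further down) generates these pickles. A rough sketch of what that step produces, assuming the raw comma-separated layout quoted elsewhere in these issues:

# Minimal sketch (my own, not the repo's data_preprocess.py): turn a raw
# comma-separated dataset file into the processed/<name>_train.pkl file
# that utils.get_data() tries to open.
import os
import pickle
import numpy as np

raw = np.genfromtxt("ServerMachineDataset/train/machine-1-1.txt",
                    dtype=np.float32, delimiter=",")
os.makedirs("processed", exist_ok=True)
with open(os.path.join("processed", "machine-1-1_train.pkl"), "wb") as f:
    pickle.dump(raw, f)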
Hello,
I want to get the estimated distributions of the x_t's, so I'm dumping the estimated mu and sigma using the following code lines from model.py (line 191):
mu = pnet['x'].distribution.mean[:, -1, :]
sigma = pnet['x'].distribution.sigma[:, -1, :]
To verify, I'm calculating the score in a Python notebook using mu and sigma:
score = np.sum(norm.logpdf(x, mu, sigma), axis=1)
But my scores are actually different from the scores calculated by your code.
I couldn't really debug the problem, so I'm asking: is my approach correct?
Many thanks.
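As an aside for anyone debugging the same discrepancy, it is worth first ruling out the log-density formula itself. A self-contained sanity check (my own, not from the repo):

# Sanity check: scipy's norm.logpdf against a hand-written Gaussian
# log-density, to rule out a formula mix-up when reproducing the score.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
x, mu, sigma = rng.normal(size=38), rng.normal(size=38), rng.uniform(0.5, 2.0, size=38)

manual = -0.5 * np.log(2 * np.pi) - np.log(sigma) - 0.5 * ((x - mu) / sigma) ** 2
assert np.allclose(norm.logpdf(x, mu, sigma), manual)
score = np.sum(manual)  # matches np.sum(norm.logpdf(x, mu, sigma))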
It cannot run because of "ImportError: cannot import name 'VariableSaver'", and I cannot find VariableSaver in tfsnippet.scaffold.
I have not seen any replies from the author. Please support us by answering the questions of those having issues.
Thank you!
First of all, thank you for your great work.
I am a bit confused that all channels of SMAP are merged into one dataset and then trained with a single model.
However, the authors of the MSL and SMAP datasets fit an individual model for each channel.
This also makes much sense to me, since the pattern of a temperature time series looks different from that of a radiation time series.
Hello. I am a fellow researcher working on interpretable time series anomaly detection.
I have some questions about your work.
As I am working with the SMD dataset and the given interpretation label to measure anomaly interpretation performance, I have some doubts about the integrity of the dataset. First, the start and end timestamps in the interpretation label do not correctly match the test label. Also, there are some missing or extra interpretations in the dataset. How did you deal with these inconsistencies when conducting the experiment?
I would appreciate it if you could clarify these points for me. Again thank you for the nice work.
Hello,
After training completes, TF model files are generated. I want to load the model files to make predictions; how can I implement that? How do I load the model instead of going through the training process?
Thanks!
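A generic TF1 save/restore sketch for this question (the repo itself saves through tfsnippet's VariableSaver, mentioned in another issue above, so this illustrates the pattern rather than the repo's exact flow):

# Generic TF1 pattern (not the repo's exact code): rebuild the same graph,
# restore the saved variables, then run only the prediction ops.
import tensorflow as tf

# Stand-in variable for the trained model graph; in practice, build the
# same graph that main.py builds before creating the Saver.
w = tf.get_variable("w", shape=[3], initializer=tf.zeros_initializer())
saver = tf.train.Saver()
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    saver.save(sess, "./omni_model.ckpt")     # training side: persist variables
    saver.restore(sess, "./omni_model.ckpt")  # prediction side: load instead of retraining
    # ...then run the predictor ops, skipping trainer.fit()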
The M1 Mac processor ultimately relies on TensorFlow v2.x. The packages this project is built on, such as tfsnippet, require TensorFlow v1.x. I have been trying for many hours to get it to run on my M1 processor, including attempts to make tfsnippet's TF v1.x API calls consistent with TF v2.x, but to no avail. Could you consider making your project compatible with the new Mac processors? Thanks.
distribution=$(. /etc/os-release;echo $ID$VERSION_ID) && curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add - && curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt update
sudo apt install -y nvidia-docker2
sudo systemctl restart docker
docker pull nvidia/cuda:9.0-cudnn7-devel
docker run -it --gpus all -t nvidia/cuda:9.0-cudnn7-devel bash
apt-get install -y make build-essential libssl-dev zlib1g-dev \
libbz2-dev libreadline-dev libsqlite3-dev wget curl llvm libncurses5-dev \
libncursesw5-dev xz-utils tk-dev libffi-dev liblzma-dev \
libgdbm-dev libnss3-dev libedit-dev libc6-dev
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh
source .bashrc
conda create -n tf12 python=3.6
conda activate tf12
apt install -y unzip git
export PYTHONIOENCODING=utf8 && export CUDA_HOME=/usr/local/cuda && export PATH=${CUDA_HOME}/bin:${PATH}
export LD_LIBRARY_PATH=${CUDA_HOME}/lib64:$LD_LIBRARY_PATH
ln -f -s /usr/local/cuda/lib64/stubs/libcuda.so.1 /usr/local/cuda/lib64/libcuda.so.1
Done. Follow the steps in the OmniAnomaly README.
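A quick sanity check (plain TF1 API, not repo-specific) that the container's GPU is actually visible before following the README:

# Should print True inside this image if the nvidia-docker setup worked.
import tensorflow as tf
print(tf.test.is_gpu_available())  # TF1.x API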
Is there any repo with updated packages and Python version?
Hi,
Thank you for collecting and making ServerMachineDataset public. While reading this data, I have a small question about interpretation_label:
How does interpretation_label relate to test_label, if they are related at all? Why don't they seem to match exactly? For example, the interpretation_label for machine-1-1 indicates that 15849-16368 is an anomaly. However, the test_label for machine-1-1 indicates that the values at rows 15850-16395 are an anomaly.
Do the indices in interpretation_label also indicate anomaly locations in the train set?
Dear authors,
Thank you for sharing this work.
I tried random numbers from the normal distribution with mean 0 and std 0.1 as model scores and evaluated them using the pot_eval function. The best-F1 score seems fine; however, the POT-F1 score seems buggy.
I attached main_random.py to reproduce the results.
main_random.txt
GitHub does not allow uploading .py files, so rename main_random.txt to main_random.py and place it in the main directory. The installation is the same as for this project.
For the MSL dataset,
best-f1: 0.3184
pot-f1: 0.8987 (The score in the paper is 0.8989 in Table 3)
To reproduce, run python main_random.py --dataset='MSL'.
For the SMAP dataset,
best-f1: 0.3691
pot-f1: 0.9610 (The score in the paper is 0.8434 in Table 3)
To reproduce, run python main_random.py --dataset='SMAP'
These POT-F1 scores are higher than the ones in the paper. If I am not doing anything wrong, the results indicate that there is a leak in the POT evaluation and its corresponding scores in the paper. Issue #15 also seems related.
Could you help explain why this happens? Thank you.
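For readers puzzled by how random scores can rate so highly: the point-adjust step (described in the paper) marks a whole ground-truth anomaly segment as detected if any single point inside it is flagged. A sketch of that adjustment (my rendering, not the repo's exact function):

import numpy as np

def point_adjust(pred, label):
    """Mark an entire labeled anomaly segment as detected if any point
    in it is predicted (my rendering of the point-adjust idea)."""
    pred = pred.astype(bool).copy()
    label = label.astype(bool)
    i = 0
    while i < len(label):
        if label[i]:
            j = i
            while j < len(label) and label[j]:
                j += 1
            if pred[i:j].any():      # one hit anywhere in the segment...
                pred[i:j] = True     # ...credits the whole segment
            i = j
        else:
            i += 1
    return pred

# With long segments, a random predictor very likely hits each segment at
# least once, so its adjusted recall (and hence F1) is inflated.
label = np.array([0, 1, 1, 1, 1, 0, 0, 1, 1, 0])
pred = np.array([0, 0, 1, 0, 0, 0, 0, 0, 1, 0])
print(point_adjust(pred, label).astype(int))  # -> [0 1 1 1 1 0 0 1 1 0]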
Dear authors,
It is stated that the SMD (Server Machine Dataset) contains 3 groups of entities, each named machine-<group_index>-<index>.
May I know what each of the 3 groups means, and why they are grouped in this way?
Thank you.
During data preprocessing, data sets are converted through data standardization. How does data standardization work?
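In general (an assumption about what the preprocessing intends, not a quote of the repo's code), standardization rescales each feature using statistics computed from the training split and reuses them on the test split:

# Generic per-feature standardization sketch (check data_preprocess.py for
# the repo's exact transform): train statistics are reused on the test set.
import numpy as np

def standardize(train, test, eps=1e-8):
    mean = train.mean(axis=0)
    std = train.std(axis=0) + eps   # eps guards constant features
    return (train - mean) / std, (test - mean) / std

train = np.random.rand(100, 38)  # toy data with SMD's 38 features
test = np.random.rand(50, 38)
train_n, test_n = standardize(train, test)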
I'm getting a lot of errors when I try running the code for all three datasets, particularly:
Could you please give me the original version of the "server machine" dataset? Or please describe what the raw data looks like?
I have no idea what the data looks like. I opened one file in the train folder.
Here is the first line of the file:
0.032258,0.039195,0.027871,0.024390,0.000000,0.915385,0.343691,0.000000,0.020011,0.000122,0.106312,0.081081,0.027397,0.060266,0.085018,0.122516,0.000000,0.000000,0.062195,0.041221,0.043242,0.031607,0.533195,0.010224,0.011195,0.009274,0.000000,0.036625,0.000000,0.004298,0.029993,0.022131,0.000000,0.000045,0.034677,0.034747,0.000000,0.000000
According to README.md: "Install dependencies (with python 3.5, 3.6)".
After installing dependencies with pip install -r requirements.txt, I get the error: protobuf requires Python '>=3.7' but the running Python is 3.6.10.
I have now got your Server Machine Dataset. As written in your paper, there are CPU load, network usage, and memory usage records for each machine. But there is no information about what each column is, and I am trying to use this dataset for analysis. Also, is each row sampled at equal time intervals? Thanks!
The comparison between the two is also shown in the paper, and the threshold given by the best-F1 search method performs better than the POT method; so why do many other time-series anomaly detection papers still use POT for thresholding?
This function currently adjusts predictions based on ground-truth labels, which increases the F1 score, for example. However, during the inference stage there won't be any labels available. Should predictions be adjusted by some other method, or should they remain as they are?
In the README you specify the command python data_process.py, but there is no file called data_process.py; the file data_preprocess.py exists, hence a tiny correction is needed.
Thanks for disclosing the SMD dataset.
I want to know why the first half of the dataset, used for training, doesn't have labels. Does this mean that all data in the train set is normal?
Hello, while running your code we encountered many problems that have not been resolved yet. The details are as follows:
Environmental information:
Error message:
$ python main.py
2021-04-17 13:55:52.732294: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudart64_110.dll
Traceback (most recent call last):
File "main.py", line 13, in <module>
from tfsnippet.examples.utils import MLResults, print_with_title
File "C:\Users\LIU\anaconda3\envs\OmniAnomaly2\lib\site-packages\tfsnippet\__init__.py", line 4, in <module>
from . import (dataflows, datasets, distributions, layers, ops,
File "C:\Users\LIU\anaconda3\envs\OmniAnomaly2\lib\site-packages\tfsnippet\dataflows\__init__.py", line 1, in <module>
from .array_flow import *
File "C:\Users\LIU\anaconda3\envs\OmniAnomaly2\lib\site-packages\tfsnippet\dataflows\array_flow.py", line 4, in <module>
from tfsnippet.utils import minibatch_slices_iterator
File "C:\Users\LIU\anaconda3\envs\OmniAnomaly2\lib\site-packages\tfsnippet\utils\__init__.py", line 17, in <module>
from .session import *
File "C:\Users\LIU\anaconda3\envs\OmniAnomaly2\lib\site-packages\tfsnippet\utils\session.py", line 71, in <module>
def get_variables_as_dict(scope=None, collection=tf.GraphKeys.GLOBAL_VARIABLES):
AttributeError: module 'tensorflow' has no attribute 'GraphKeys'
Of course, there were some smaller problems that we solved along the way.
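For context on the GraphKeys error above (my reading, not an official fix): tf.GraphKeys was removed from TensorFlow 2.x's top-level namespace, so this traceback usually indicates a TF 2.x install, while tfsnippet 0.2.x targets TF 1.x. Under TF 2.x the attribute still exists behind the v1 compat shim, though that alone does not patch tfsnippet's own import of tensorflow:

# Demonstration only: the v1 compat namespace still carries GraphKeys.
# The reliable fix remains a TF 1.x environment (e.g. tensorflow-gpu 1.15).
import tensorflow.compat.v1 as tf
print(tf.GraphKeys.GLOBAL_VARIABLES)  # prints 'variables'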
Environmental information:
Error message:
$ python main.py
Traceback (most recent call last):
File "/home/jingliu/.local/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow.py", line 58, in <module>
from tensorflow.python.pywrap_tensorflow_internal import *
File "/home/jingliu/.local/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
_pywrap_tensorflow_internal = swig_import_helper()
File "/home/jingliu/.local/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
_mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
File "/home/nsai/anaconda3/envs/lj_dl/lib/python3.6/imp.py", line 242, in load_module
return load_dynamic(name, filename, file)
File "/home/nsai/anaconda3/envs/lj_dl/lib/python3.6/imp.py", line 342, in load_dynamic
return _load(spec)
ImportError: libcublas.so.9.0: cannot open shared object file: No such file or directory
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "main.py", line 12, in <module>
import tensorflow as tf
File "/home/jingliu/.local/lib/python3.6/site-packages/tensorflow/__init__.py", line 24, in <module>
from tensorflow.python import pywrap_tensorflow # pylint: disable=unused-import
File "/home/jingliu/.local/lib/python3.6/site-packages/tensorflow/python/__init__.py", line 49, in <module>
from tensorflow.python import pywrap_tensorflow
File "/home/jingliu/.local/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow.py", line 74, in <module>
raise ImportError(msg)
ImportError: Traceback (most recent call last):
File "/home/jingliu/.local/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow.py", line 58, in <module>
from tensorflow.python.pywrap_tensorflow_internal import *
File "/home/jingliu/.local/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
_pywrap_tensorflow_internal = swig_import_helper()
File "/home/jingliu/.local/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
_mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
File "/home/nsai/anaconda3/envs/lj_dl/lib/python3.6/imp.py", line 242, in load_module
return load_dynamic(name, filename, file)
File "/home/nsai/anaconda3/envs/lj_dl/lib/python3.6/imp.py", line 342, in load_dynamic
return _load(spec)
ImportError: libcublas.so.9.0: cannot open shared object file: No such file or directory
Failed to load the native TensorFlow runtime.
See https://www.tensorflow.org/install/errors
for some common reasons and solutions. Include the entire stack trace
above this error message when asking for help.
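A note on this one (my inference from the TensorFlow release notes, not from the repo): the TF 1.12 wheels are built against CUDA 9.0, so the import fails whenever only a newer CUDA runtime is installed. A quick presence check:

# Quick diagnostic (generic ctypes, not repo code): can the CUDA 9.0 cuBLAS
# library that TF 1.12 links against actually be found on this machine?
import ctypes
try:
    ctypes.CDLL("libcublas.so.9.0")
    print("libcublas 9.0 found")
except OSError as err:
    print("missing:", err)  # install CUDA 9.0 or use the docker recipe above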
In the dependabot/pip/tensorflow-gpu-2.4.0 branch, after executing pip install -r requirements.txt:
INFO: pip is looking at multiple versions of zhusuan to determine which version is compatible with other requirements. This could take a while.
INFO: pip is looking at multiple versions of tfsnippet to determine which version is compatible with other requirements. This could take a while.
ERROR: Cannot install -r requirements.txt (line 11), -r requirements.txt (line 13), -r requirements.txt (line 14), -r requirements.txt (line 7) and six==1.11.0 because these package versions have conflicting dependencies.
The conflict is caused by:
The user requested six==1.11.0
tfsnippet 0.2.0a1 depends on six>=1.11.0
zhusuan 0.4.0 depends on six
fs 2.3.0 depends on six~=1.10
tensorflow-gpu 2.4.0 depends on six~=1.15.0
To fix this you could try to:
1. loosen the range of package versions you've specified
2. remove package versions to allow pip attempt to solve the dependency conflict
ERROR: ResolutionImpossible: for help visit https://pip.pypa.io/en/latest/user_guide/#fixing-conflicting-dependencies
The code has a major flaw: if I decide to normalize the data, a MinMaxScaler is used on the train set and test set, but it is not the same scaler for the two. You have to fit on one of the datasets and transform both of them with that same scaler.
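The fix being suggested is the standard scikit-learn pattern; a minimal sketch with placeholder data:

# Sketch of the suggested fix (standard scikit-learn usage): fit the scaler
# on the training split only, then apply that same scaler to both splits.
import numpy as np
from sklearn.preprocessing import MinMaxScaler

train = np.random.rand(100, 38)  # placeholder arrays; 38 features as in SMD
test = np.random.rand(50, 38)

scaler = MinMaxScaler().fit(train)
train_n = scaler.transform(train)
test_n = scaler.transform(test)   # same statistics reused, no test-set leakage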
I got this error while training on SMAP:
('cur thr: ', -199.0, [0.6784833256741009, 0.9701897913384155, 0.5216469874941413, 29291, 370467, 900, 26860, 207.18948957385916], [0.6794118610225376, 0.973996607929373, 0.5216469874941413, 29291, 370585, 782, 26860, 207.18948957385916], -371.0)
('cur thr: ', -149.0, [0.6784518943728867, 0.9700612681006586, 0.5216469874941413, 29291, 370463, 904, 26860, 192.475273927267], [0.6794118610225376, 0.973996607929373, 0.5216469874941413, 29291, 370585, 782, 26860, 207.18948957385916], -371.0)
('cur thr: ', -99.0, [0.6900573719198619, 0.966784338585756, 0.536499795099553, 30125, 370332, 1035, 26026, 159.45766892637948], [0.6904290381906943, 0.9682447848268425, 0.536499795099553, 30125, 370379, 988, 26026, 165.87430885704643], -106.0)
('cur thr: ', -49.0, [0.683723519711516, 0.9395354626877341, 0.5374080603128336, 30176, 369425, 1942, 25975, 111.03955584177663], [0.6904290381906943, 0.9682447848268425, 0.536499795099553, 30125, 370379, 988, 26026, 165.87430885704643], -106.0)
('cur thr: ', 1.0, [0.6625702902960295, 0.7697275382366061, 0.581610300692488, 32658, 361597, 9770, 23493, 59.91650023194379], [0.6904290381906943, 0.9682447848268425, 0.536499795099553, 30125, 370379, 988, 26026, 165.87430885704643], -106.0)
('cur thr: ', 51.0, [0.6276050282163165, 0.6815108512768132, 0.581610300692488, 32658, 356105, 15262, 23493, 56.472065355374006], [0.6904290381906943, 0.9682447848268425, 0.536499795099553, 30125, 370379, 988, 26026, 165.87430885704643], -106.0)
('cur thr: ', 101.0, [0.23218566283936742, 0.13134183823531775, 0.9999999998219087, 56151, 0, 371367, 0, 0.0], [0.6904290381906943, 0.9682447848268425, 0.536499795099553, 30125, 370379, 988, 26026, 165.87430885704643], -106.0)
('cur thr: ', 151.0, [0.23218566283936742, 0.13134183823531775, 0.9999999998219087, 56151, 0, 371367, 0, 0.0], [0.6904290381906943, 0.9682447848268425, 0.536499795099553, 30125, 370379, 988, 26026, 165.87430885704643], -106.0)
('cur thr: ', 201.0, [0.23218566283936742, 0.13134183823531775, 0.9999999998219087, 56151, 0, 371367, 0, 0.0], [0.6904290381906943, 0.9682447848268425, 0.536499795099553, 30125, 370379, 988, 26026, 165.87430885704643], -106.0)
('cur thr: ', 251.0, [0.23218566283936742, 0.13134183823531775, 0.9999999998219087, 56151, 0, 371367, 0, 0.0], [0.6904290381906943, 0.9682447848268425, 0.536499795099553, 30125, 370379, 988, 26026, 165.87430885704643], -106.0)
('cur thr: ', 301.0, [0.23218566283936742, 0.13134183823531775, 0.9999999998219087, 56151, 0, 371367, 0, 0.0], [0.6904290381906943, 0.9682447848268425, 0.536499795099553, 30125, 370379, 988, 26026, 165.87430885704643], -106.0)
('cur thr: ', 351.0, [0.23218566283936742, 0.13134183823531775, 0.9999999998219087, 56151, 0, 371367, 0, 0.0], [0.6904290381906943, 0.9682447848268425, 0.536499795099553, 30125, 370379, 988, 26026, 165.87430885704643], -106.0)
([0.6904290381906943, 0.9682447848268425, 0.536499795099553, 30125, 370379, 988, 26026, 165.87430885704643], -106.0)
Initial threshold : 26.704884
Number of peaks : 1350
Traceback (most recent call last):
File "main.py", line 220, in <module>
main()
File "main.py", line 174, in main
pot_result = pot_eval(train_score, test_score, y_test[-len(test_score):], level=config.level)
File "/media/harry/harry/ML/LSTM/OmniAnomaly/omni_anomaly/eval_methods.py", line 137, in pot_eval
s.initialize(level=level, min_extrema=True) # initialization step
File "/media/harry/harry/ML/LSTM/OmniAnomaly/omni_anomaly/spot.py", line 207, in initialize
g, s, l = self._grimshaw()
File "/media/harry/harry/ML/LSTM/OmniAnomaly/omni_anomaly/spot.py", line 345, in _grimshaw
n_points, 'regular')
TypeError: unbound method _rootsFinder() must be called with SPOT instance as first argument (got function instance instead)
What is happening?
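A hedged reading of this TypeError: under Python 2, calling a plain method through the class, as spot.py appears to do with SPOT._rootsFinder, raises exactly this "unbound method" error; running under Python 3 or declaring the helper a @staticmethod avoids it. Roughly:

# Illustrative only (not the repo's spot.py verbatim): a plain method called
# through the class is "unbound" on Python 2. Declaring it a staticmethod
# makes SPOT._rootsFinder(fun, jac, bounds, npoints, method) legal there too.
class SPOT(object):
    @staticmethod
    def _rootsFinder(fun, jac, bounds, npoints, method):
        # ...numerical root search as in spot.py...
        return []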
The code provides the reconstruction probability, but how do I get the reconstructed sequence?
I am facing this error: AttributeError: module 'tensorflow' has no attribute 'GraphKeys', even though I installed the TensorFlow version from the requirements.
In your paper (Robust Anomaly Detection for Multivariate Time Series through Stochastic Recurrent Neural Network), Figure 3 (b1) describes the network architecture of the qnet, which begins as follows:
x_t -> GRU -> e_t (GRU hidden output) -> concatenate with z_{t-1} -> Dense layer -> ...
However, the implemented code does not follow this architecture:
After the RNN cell, you immediately put the RNN's hidden output through (two) linear layers:
wrapper.py (lines 113-120)
try:
    outputs, _ = rnn.static_rnn(fw_cell, x, dtype=tf.float32)
except Exception:  # Old TensorFlow version only returns outputs, not states
    outputs = rnn.static_rnn(fw_cell, x, dtype=tf.float32)
outputs = tf.stack(outputs, axis=time_axis)
for i in range(hidden_dense):
    outputs = tf.layers.dense(outputs, dense_dim)
return outputs
So this code shows the first issue:
Since you do not specify an activation in the tf.layers.dense() calls, these layers have linear activation, not ReLU as you claim in the paper. Also, did you use hidden_dense=2 in the experiments described in the paper, or was it 1?
Second:
In the paper you say that you concatenate with z_{t-1} and after that apply a dense layer with ReLU activation, which is not done in your implementation:
recurrent_distribution.py (lines 43-46):
input_q = tf.concat([input_q_n, z_previous], axis=-1)
mu_q = self.mean_q_mlp(input_q, reuse=tf.AUTO_REUSE) # n_sample * batch_size * z_dim
std_q = self.std_q_mlp(input_q) # n_sample * batch_size * z_dim
The code above shows that the concatenated output is given directly to the mu and std layers, without a ReLU-activated dense layer before them.
So what your current implementation does in the qnet is more like the following:
x_t -> GRU -> e_t (GRU hidden output) -> (two?) linear layers -> concatenate with z_{t-1} -> get mu and std using the concatenation as the input to the mu layer and std layer -> ...
What is the reason for this difference between the actual implementation and the architecture described in the original paper? Which one was used in the experiments reported in the paper?
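For concreteness, here is how the qnet step reads to me from Figure 3 (b1), as a hedged TF1 sketch rather than the repo's code (dense_dim=500 and the softplus std head are my assumptions):

import tensorflow as tf

def qnet_step(e_t, z_prev, z_dim, dense_dim=500):
    """Figure 3 (b1) as I read it (my sketch, not the repo's code):
    concat(e_t, z_{t-1}) -> ReLU dense -> mu / std heads."""
    h = tf.concat([e_t, z_prev], axis=-1)
    h = tf.layers.dense(h, dense_dim, activation=tf.nn.relu)  # the disputed ReLU layer
    mu = tf.layers.dense(h, z_dim)
    std = tf.layers.dense(h, z_dim, activation=tf.nn.softplus)
    return mu, std

e_t = tf.zeros([8, 500])   # toy shapes; batch of 8 windows
z_prev = tf.zeros([8, 3])
mu, std = qnet_step(e_t, z_prev, z_dim=3)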
Dear authors,
In your paper you report results for SMAP, MSL, and SMD; however, anomaly detection is done at the entity level. Could you explain how you obtained precision, recall, and F1 score for the whole datasets from the anomaly predictions you had for each of the entities that compose them?
Best
Hi,
I saw your update about TF 2.9.3 compatibility, but I tried
pip install -r requirements.txt
and it fails. So I removed the version pins of some packages, and then tfsnippet seems to have an import problem:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python3.8/dist-packages/tfsnippet/__init__.py", line 4, in <module>
from . import (dataflows, datasets, distributions, evaluation, layers,
File "/usr/local/lib/python3.8/dist-packages/tfsnippet/dataflows/__init__.py", line 1, in <module>
from .array_flow import *
File "/usr/local/lib/python3.8/dist-packages/tfsnippet/dataflows/array_flow.py", line 4, in <module>
from tfsnippet.utils import minibatch_slices_iterator, generate_random_seed
File "/usr/local/lib/python3.8/dist-packages/tfsnippet/utils/__init__.py", line 20, in <module>
from .session import *
File "/usr/local/lib/python3.8/dist-packages/tfsnippet/utils/session.py", line 71, in <module>
def get_variables_as_dict(scope=None, collection=tf.GraphKeys.GLOBAL_VARIABLES):
AttributeError: module 'tensorflow' has no attribute 'GraphKeys'
Any solutions?
I got this error while using the SMAP dataset:
Instructions for updating:
Use keras.layers.dense instead.
Traceback (most recent call last):
File "main.py", line 213, in <module>
main()
File "main.py", line 110, in main
valid_step_freq=config.valid_step_freq)
File "/media/harry/harry/ML/LSTM/OmniAnomaly/omni_anomaly/training.py", line 121, in __init__
x=self._input_x, n_z=n_z)
File "/media/harry/harry/ML/LSTM/OmniAnomaly/omni_anomaly/model.py", line 138, in get_training_loss
chain = self.vae.chain(x, n_z=n_z, posterior_flow=self._posterior_flow)
File "/media/harry/harry/ML/LSTM/OmniAnomaly/omni_anomaly/vae.py", line 362, in chain
observed={'x': x}
File "/usr/local/lib/python3.6/site-packages/tfsnippet/bayes.py", line 403, in variational_chain
latent_axis=latent_axis,
File "/usr/local/lib/python3.6/site-packages/tfsnippet/variational/chain.py", line 55, in __init__
model.local_log_probs(iter(model)))
File "/usr/local/lib/python3.6/site-packages/tfsnippet/bayes.py", line 297, in local_log_probs
ret.append(self._stochastic_tensors[name].log_prob(name=ns))
File "/usr/local/lib/python3.6/site-packages/tfsnippet/stochastic.py", line 156, in log_prob
self.tensor, self.group_ndims, name=name)
File "/media/harry/harry/ML/LSTM/OmniAnomaly/omni_anomaly/wrapper.py", line 75, in log_prob
log_prob, _, _, _, _, _, _ = self._distribution.forward_filter(given)
File "/usr/local/lib/python3.6/site-packages/tensorflow_probability/python/distributions/linear_gaussian_ssm.py", line 716, in forward_filter
initializer=initial_state)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/ops/functional_ops.py", line 724, in scan
maximum_iterations=n)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/ops/control_flow_ops.py", line 3556, in while_loop
return_same_structure)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/ops/control_flow_ops.py", line 3087, in BuildLoop
pred, body, original_loop_vars, loop_vars, shape_invariants)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/ops/control_flow_ops.py", line 3059, in _BuildLoop
next_vars.append(_AddNextAndBackEdge(m, v))
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/ops/control_flow_ops.py", line 677, in _AddNextAndBackEdge
_EnforceShapeInvariant(m, v)
File "/usr/local/lib/python3.6/site-packages/tensorflow/python/ops/control_flow_ops.py", line 620, in _EnforceShapeInvariant
"less-specific shape." % (input_t.name, input_t.shape, n_shape))
ValueError: Input tensor 'model/trainer/loss/training_loss/VAE.chain/VariationalChain/model_log_joint/z.log_prob/forward_filter/add:0' enters the loop with shape (?, 3, 1), but has shape (?, 3, ?) after one iteration. To allow the shape to vary across iterations, use the `shape_invariants` argument of tf.while_loop to specify a less-specific shape.
Hi! Dear authors,
I have a question about your reported results.
I have tested your OmniAnomaly model on the MSL dataset, and I get 89% POT-F1, which is very close to the result reported in your paper.
But when I set the model's anomaly scores to random numbers from [0, 1], the POT-F1 can reach above 89.8833%. This is confusing, since these random "anomaly scores" are not obtained from the model.
I think this issue is caused by the point-adjust approach mentioned in your paper.
In fact, I also evaluated a simple RNN with your code and settings (same data and same evaluation); its best F1 can also be above 90%.
Can you help explain this? Many thanks.
Hello, the command fails when run on Colab.
This is what I get:
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting git+https://github.com/thu-ml/zhusuan.git (from -r requirements.txt (line 13))
Cloning https://github.com/thu-ml/zhusuan.git to /tmp/pip-req-build-csj2aoui
Running command git clone --filter=blob:none --quiet https://github.com/thu-ml/zhusuan.git /tmp/pip-req-build-csj2aoui
Resolved https://github.com/thu-ml/zhusuan.git to commit 4386b2a12ae4f4ed8e694e504e51d7dcdfd6f22a
Preparing metadata (setup.py) ... done
Collecting git+https://github.com/haowen-xu/[email protected] (from -r requirements.txt (line 14))
Cloning https://github.com/haowen-xu/tfsnippet.git (to revision v0.2.0-alpha1) to /tmp/pip-req-build-iwf0sus7
Running command git clone --filter=blob:none --quiet https://github.com/haowen-xu/tfsnippet.git /tmp/pip-req-build-iwf0sus7
Running command git checkout -q 7b43abdbdd29f1914dbc11b961b5d45b9de76653
Resolved https://github.com/haowen-xu/tfsnippet.git to commit 7b43abdbdd29f1914dbc11b961b5d45b9de76653
Preparing metadata (setup.py) ... done
Collecting six==1.11.0
Downloading six-1.11.0-py2.py3-none-any.whl (10 kB)
Collecting matplotlib==3.0.2
Downloading matplotlib-3.0.2.tar.gz (36.5 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 36.5/36.5 MB 9.9 MB/s eta 0:00:00
Preparing metadata (setup.py) ... done
Collecting numpy==1.15.4
Downloading numpy-1.15.4.zip (4.5 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 4.5/4.5 MB 16.0 MB/s eta 0:00:00
Preparing metadata (setup.py) ... done
Collecting pandas==0.23.4
Downloading pandas-0.23.4.tar.gz (10.5 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 10.5/10.5 MB 17.4 MB/s eta 0:00:00
error: subprocess-exited-with-error
× python setup.py egg_info did not run successfully.
│ exit code: 1
╰─> See above for output.
note: This error originates from a subprocess, and is likely not a problem with pip.
Preparing metadata (setup.py) ... error
error: metadata-generation-failed
× Encountered error while generating package metadata.
╰─> See above for output.
note: This is an issue with the package mentioned above, not pip.
hint: See above for details.
Hello sir.
I would like to train OmniAnomaly on other datasets such as SWaT and WADI.
Is it possible to train on SWaT?
If yes, which part of the code should I change?
Thank you in advance :)
libcublas.so.9.0: cannot open shared object file: No such file or directory.
Failed to load the native TensorFlow runtime.
Is there any way to train OmniAnomaly for multiple servers? #12
Dear Author,
In your paper and the project, I can see the different parameter "level", with SMAP = 0.07 and MSL = 0.01.
Can you tell me which other parameter settings differ between training on the two datasets?
Thanks.
Hello,
Thank you for this repo.
It seems the version of tfsnippet required by the OmniAnomaly repo is no longer available. Could you kindly host v0.2.0-alpha1 of the tfsnippet repo or send it via email? opooladz at ucla dot edu
Thank you in advance
When I run the following:
python data_preprocess.py SMD
I get the following error
ImportError: DLL load failed: Le module spécifié est introuvable. ("The specified module could not be found.")
Failed to load the native TensorFlow runtime.
I see that other people have trouble running the code.
Does anyone know how to fix it, or know a version of the code that works?
Hi, I was very lucky to read this wonderful paper; thanks for your efforts in collecting the SMD dataset. A simple question: are the timesteps in SMD continuous? For example, if the time of the first datapoint is 2020.1.1 1:00, the second is 2020.1.1 1:01, and the last is 700k minutes later, does the test set then start from minute 700k+1? Is this correct?
Thanks, and looking forward to your reply!