hcrf-light's People

Contributors

yalesong

hcrf-light's Issues

windowsize

Dear Yale,

It looks like this light version of HCRF does not support a window size for the dependency range. It currently considers only consecutive nodes, right? Or is there a way to specify a window size in the current code? Thanks a lot.

OCCRF doesn't work on toy anomaly dataset

Hi Song, I know it's been some time since this paper was published. I've been trying to run the code for the OCCRF model for a few days, but I couldn't figure out the correct configuration. FYI, I could run the hcrf model as described in README.md, and I'm hoping for some guidance on running the occrf model.

Here's what I'm trying:

./hCRF-light/distribute/bin/hcrf-light -m occrf \
-Fd ./hCRF-light/data/toy/dataAnomalyTrain.csv -Fl ./hCRF-light/data/toy/labelsAnomalyTrain.csv \
-FD ./hCRF-light/data/toy/dataAnomalyTest.csv -FL ./hCRF-light/data/toy/labelsAnomalyTest.csv

And here's what I'm getting:

L-BFGS optimization terminated with status code = -1000
fx = 0.0151543

Reading testing set... 
Starting testing ...
Press a key to continue...

Here's the output of cat stats.txt. It looks like it's predicting every data point in the test set as normal (label 0).
image

I noticed that the status code is -1000. I checked the lbfgs library, and this error code corresponds to LBFGSERR_ROUNDING_ERROR. I'm not sure why this is happening; my guess is that the gradient became too small during optimization.

I then tried to work around it by switching to the nrbm optimizer (adding -o nrbm). Here's the resulting output of cat stats.txt. It's predicting every row in the test set as an anomaly:
image

(1) I'd be very grateful if you could guide me on the configuration of occrf for the toy anomaly dataset or any other dataset. 😄

(2) I also have another question: shouldn't occrf predict just one label per sequence instead of multiple? According to the published paper, there is only 1 anomaly score per employee. But in this code, it's treated as a continuousModel and uses fl & fL (1 label / timestep) instead of fq & fQ (1 label / sequence)...

Thank you! 😃

Issues with higher density data

I ran this library on a dataset of 4200 files. Each sequence has 93 columns and a variable number of rows. The accuracy came out at around 0.0015, which is very low. I set the validation split parameters to the same values as the test split. Can anyone explain the reason for this low accuracy?
Do I need to change the seeds or any other parameter?

Cannot find shared library

I followed the instructions in compile.sh and got no errors.
After make and make distribute, I tried to run the command
./distribute/bin/hcrf-light and got the following error:
./distribute/bin/hcrf-light: error while loading shared libraries: liblbfgs-1.8.so: cannot open shared object file: No such file or directory
The library liblbfgs-1.8.so does exist in ./lib/liblbfgs/lib/.libs/. How can I solve this problem?

I am using Debian 8.7.
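
This error usually means the dynamic loader is not searching the directory that holds liblbfgs-1.8.so. A minimal sketch of one possible workaround follows, assuming the commands are run from the repository root and the library is where the post says it is. LBFGS_LIB_DIR is just a name chosen here for illustration, and the setting lasts only for the current shell session.

```shell
# Assumption: current directory is the repository root, and the library
# lives in ./lib/liblbfgs/lib/.libs/ as reported in the post.
LBFGS_LIB_DIR="$(pwd)/lib/liblbfgs/lib/.libs"

# Prepend that directory to the loader's search path, keeping any
# existing LD_LIBRARY_PATH entries after it.
export LD_LIBRARY_PATH="${LBFGS_LIB_DIR}${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"

# If the binary is present, show how the loader now resolves liblbfgs.
if [ -x ./distribute/bin/hcrf-light ]; then
    ldd ./distribute/bin/hcrf-light | grep -i lbfgs || true
fi
```

If the ldd line still reports "not found" for liblbfgs, the library is probably in a different directory than assumed here.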
