yalesong / hcrf-light

hCRF-light Library 3.0

License: Other

C 16.77% MATLAB 3.28% C++ 53.16% CMake 0.05% Makefile 5.47% Shell 21.27%

hcrf-light's Issues

windowsize

Dear Yale,

It looks like this light version of HCRF does not support a window size for the dependency range. It currently considers only consecutive nodes, right? Or is there a way to specify such a window size in the current code? Thanks a lot.

OCCRF doesn't work on toy anomaly dataset

Hi Song, I know it's been some time since this paper was published. I've been trying to run the code for the OCCRF model for a few days, but I couldn't figure out the correct configuration. FYI, I can run the hcrf model from README.md, and I'm hoping for some guidance on running the occrf model.

Here's what I'm trying:

./hCRF-light/distribute/bin/hcrf-light -m occrf \
-Fd ./hCRF-light/data/toy/dataAnomalyTrain.csv -Fl ./hCRF-light/data/toy/labelsAnomalyTrain.csv \
-FD ./hCRF-light/data/toy/dataAnomalyTest.csv -FL ./hCRF-light/data/toy/labelsAnomalyTest.csv

And here's what I'm getting:

L-BFGS optimization terminated with status code = -1000
fx = 0.0151543

Reading testing set... 
Starting testing ...
Press a key to continue...

According to cat stats.txt, the model predicts every data point in the test set as normal (label 0).

I noticed that the status code is -1000. I checked the lbfgs library, and this error code corresponds to LBFGSERR_ROUNDING_ERROR. I'm not sure why this happens; my guess is that the gradient became too small during optimization.

I then tried switching to the nrbm optimizer by adding -o nrbm. According to cat stats.txt, the model now predicts all rows in the test set as anomalies:

(1) I would appreciate guidance on configuring occrf for the toy anomaly dataset or for other datasets. 😄

(2) I also have another question: shouldn't occrf predict just one label per sequence instead of multiple? According to the published paper, there is only 1 anomaly score per employee. But in this code, it is treated as a continuousModel and uses fl & fL (1 label / timestep) instead of fq & fQ (1 label / sequence)...

Thank you! 😃

Issues with higher density data

I ran this library on a dataset of 4200 files. Each sequence has 93 columns and a variable number of rows. The resulting accuracy was around 0.0015, which is very low. I set the validation split parameters to the same values as the test split. Can anyone explain the reason for this low accuracy?
Do I need to change the seeds or any other parameter?

Cannot find shared library

I followed the instructions in compile.sh and got no errors.
After make and make distribute, I tried to run the command
./distribute/bin/hcrf-light and got the following error:
./distribute/bin/hcrf-light: error while loading shared libraries: liblbfgs-1.8.so: cannot open shared object file: No such file or directory
The library liblbfgs-1.8.so does exist in ./lib/liblbfgs/lib/.libs/. How can I solve this problem?

I am using Debian 8.7.
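
One common way to handle this kind of loader error on Linux is to add the directory that holds the shared library to LD_LIBRARY_PATH before starting the binary. The sketch below is not a confirmed fix from the maintainers. It assumes the shell is in the repository root and that liblbfgs-1.8.so sits in the ./lib/liblbfgs/lib/.libs/ directory quoted above; the LBFGS_DIR variable name is only illustrative.

```shell
#!/bin/sh
# Assumption: run from the repository root; this is the directory the report
# says contains liblbfgs-1.8.so.
LBFGS_DIR="$PWD/lib/liblbfgs/lib/.libs"

# Prepend the directory to the dynamic loader's search path for this session,
# keeping any value that was already set.
LD_LIBRARY_PATH="$LBFGS_DIR${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
export LD_LIBRARY_PATH

# If the binary has been built, ask the loader how it resolves liblbfgs.
# A line containing "not found" means the path above is still wrong.
if [ -x ./distribute/bin/hcrf-light ]; then
    ldd ./distribute/bin/hcrf-light | grep lbfgs
fi
```

The setting only lasts for the current shell session. A more permanent alternative, again only a suggestion, is to install the library into a directory the loader already searches and refresh the loader cache with ldconfig.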

Undefined function or variable 'matHCRF'.

When I run the file 'test_oc.m', I get the error "Undefined function or variable 'matHCRF'." How could I solve this problem?
