Giter Club home page Giter Club logo

athenacep's Introduction

AthenaCEP


AthenaCEP aims at solving the problem of load shedding in CEP. It provides a fundamentally new perspective to shed the exponentially increasing partial matches as well as input event tuples. It has the wisdom, as Athena, the Greek goddess of wisdom, to decide when, what and how much partial matches or event tuples to drop in order to satisfy\ the strict latency bound and keep the maximum accuracy.


Prerequisites

  • Compilers: The compiler needs to support C++11 or higher. In Makefiles, the default compiler is set as g++.
  • Advanced Vector Extensions: We set ccompiler flag -mavx2 for X86 machines. Howevever, if you may remove it from Makefile if it is not supported.
  • Libraries: POSIX Threads(Pthreads), STL, Python, scikit-learn for offiline machine learning.
  • AZll running/configuration scripts are written for linux OS.
  • We build parsers to parse query workloads from files. We define query workloads in files ending with .eql.

Code and Data

The main funciton of the CEP engine is in ./AthenaCEP/src_synthetic/src_NS/cep_match/cep_match.cpp

To compile the code in Linux:

  1. cd $where you cloned the project$/AthenaCEP/src_synthetic/src_NS/
  2. make -f MakefileCM
  3. the executable file is /AthenaCEP/src_synthetic/src_NS/bin/cep_match

A parser for CEP query has also been implemented, based on Lex and Yacc.

CEP queries for synthetic data sets have been defined in AthenaCEP/queries/Synthetic.eql

CEP queries for real-world data sets, NYC bike sharing data, have been defined in AthenaCEP/queries/Bike.eql

parsing the event schema & queries, and constructing the CEP engine are automatically done.

=====================================================================================

Synthetic data sets: To run the experiments: A. Generate synthetic datas

  1. cd ./AthenaCEP/src_synthetic/NormalEventStreamGen/
  2. sh make.sh
  3. ./UniDistGen.out The configuration of attribute values distributions and the number of the data could be found in: ./AthenaCEP/src_synthetic/NormalEventStreamGen/UniDistGen.cpp

B. Set the path in /AthenaCEP/src_synthetic/src_NS/cep_match/cep_match.cpp. It could be done by setting the value of "string streamFile"

C. Compile the code.

D. The offile estmation Python code is in ./AthenaCEP/src_synthetic/python-clustering-classification/ The trained classifiers could be found in .pkl files and could be visualised by .dot and .pdf files. To automatically genearated classifiers in C++ code, please find the source-to-source transformation tool "DecisionTreeToCpp" at https://github.com/papkov/DecisionTreeToCpp

F. In order to run the CEP engine, you need a simple script like:

./AthenaCEP/src_synthetic/src_NS//bin/cep_match -c ./Synthetic.eql -q P1 -s -n P1NS -p monitoring_P1NS.csv > runInfor_P1NS.txt

the arguments of the command line is reflected in the main function in cep_match.cpp, which are very straightforward and self explained.

For the above specific query,

-c ./Synthetic.eql means the event schema and query configuration file ./Synthetic.eql.

-q P1 means to evaluate query P1.

-n P1NS means the prefix of generated results and static files

-s means to appendi the timestamps for events appeared in partial matches and complete matches.

-p monitoring_P1NS.csv means to dump the throughput information to the file monitoring_P1NS.csv

more scripts are attached in the repository.

Three files will be generated after evaluation: take the above example monitoring_P1NS.csv is the monitoring file with the record of the throughput. runInfor_P1NS.txt contains the information of the accuracy, the number of dropped events, the number of droppped partial matches. latency_P1NS.csv contains the record of latency for every complete matches.

=====================================================================================

real-world data sets: NYC bike-sharing. The link of the data sets: https://www.citibikenyc.com/system-data

A. Transform the time stamp/ data cleaning run CleanData.py for 201810-citibike-tripdata.csv data set and get the cleaned data.

B. Set the path of the data source in main function. change the streamFile string in main function in ./AthenaCEP/src_testing_bike_sharing/cep_match/

C. Complie the code:

  1. cd ./AthenaCEP/src_testing_bike_sharing/
  2. make -f MakefileCM

D. Run the experiments. An example is illustrated in ./AthenaCEP/bike_sharing_data/query_p1.sh

=====================================================================================

Please feel free to drop me an e-mail at bo.zhao "at" hu-berlin.de, if you have any questions.

Or visit https://hu.berlin/bo_zhao

athenacep's People

Contributors

zbjob avatar

Stargazers

 avatar  avatar  avatar  avatar

Watchers

 avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.