periodic-patterns-mdl's Introduction

Mining Periodic Patterns with a MDL Criterion

Conference article:

E. Galbrun, P. Cellier, N. Tatti, A. Termier, and B. Crémilleux. Mining periodic patterns with a MDL criterion. In the proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD'18), September 2018

This branch is for the materials associated with the original publication. See branch 'dev' for later code development.

Abstract

The quantity of event logs available is increasing rapidly, be they produced by industrial processes, computing systems, or life tracking, for instance. It is thus important to design effective ways to uncover the information they contain. Because event logs often record repetitive phenomena, mining periodic patterns is especially relevant when considering such data. Indeed, capturing such regularities is instrumental in providing condensed representations of the event sequences.

We present an approach for mining periodic patterns from event logs while relying on a Minimum Description Length (MDL) criterion to evaluate candidate patterns. Our goal is to extract a set of patterns that suitably characterises the periodic structure present in the data. We evaluate the interest of our approach on several real-world event log datasets.

Periodic patterns

An event log recording daily activities might look like this:

Date Time Activity
16-04-2018 7:30 wake up
16-04-2018 7:40 prepare coffee
  ...  
16-04-2018 8:10 take metro
  ...  
16-04-2018 11:00 attend meeting
  ...  
16-04-2018 11:00 eat dinner
  ...  
17-04-2018 7:32 wake up
17-04-2018 7:38 prepare coffee
  ...  
20-04-2018 7:28 wake up
20-04-2018 7:41 prepare coffee
  ...  
15-06-2018 7:28 wake up
  ...  

A simple periodic pattern of daily activities extracted from this data might look like this...

Example of a simple pattern

... and a slightly more complex pattern might look like this:

Example of a more complex pattern
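
To make the notion of periodicity concrete, the following minimal Python sketch summarises the repeated occurrences of one activity as a period plus one correction per repetition. It is an illustration only, not the mining algorithm from the article: the helper names `parse` and `cycle_summary` are hypothetical, taking the median gap as the period is an arbitrary choice, and the wake-up times for 18-04 and 19-04 are invented to fill in rows that are elided in the log above.

```python
from datetime import datetime
from statistics import median


def parse(date, time):
    """Parse a 'DD-MM-YYYY' date and 'H:MM' time as used in the example log."""
    return datetime.strptime(f"{date} {time}", "%d-%m-%Y %H:%M")


def cycle_summary(timestamps):
    """Return (period, corrections) in minutes for sorted occurrence times.

    The period is the median gap between consecutive occurrences; each
    correction is how far the actual gap deviates from that period.
    """
    gaps = [
        int((later - earlier).total_seconds() // 60)
        for earlier, later in zip(timestamps, timestamps[1:])
    ]
    period = int(median(gaps))
    corrections = [gap - period for gap in gaps]
    return period, corrections


# "wake up" occurrences: the first two rows come from the log above,
# the last two are hypothetical values standing in for elided rows.
wake_ups = [
    parse("16-04-2018", "7:30"),
    parse("17-04-2018", "7:32"),
    parse("18-04-2018", "7:32"),  # hypothetical
    parse("19-04-2018", "7:28"),  # hypothetical
]

period, corrections = cycle_summary(wake_ups)
print(period, corrections)  # 1440 [2, 0, -4]
```

A period of 1440 minutes (one day) with small corrections describes the four occurrences more compactly than listing each timestamp, which is the intuition behind scoring patterns by description length.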

List of contents

  • The data folder contains the datasets used in the experiments, or instructions on how to obtain them.
  • The xps folder contains the summary of results and a LaTeX template to produce a report of results, and is meant to receive the files produced when running the scripts.
  • The scripts folder contains the scripts for mining, as well as for parsing results and preparing figures, tables and examples.

Running the experiments

  1. Go to the scripts folder.

  2. Mine the real-world datasets:

    mkdir ../xps/runs
    python run_mine.py vX ##ID##
    

    where ##ID## is replaced by the id of the series to run, chosen among {bugzilla_0_rel_all, bugzilla_1_rel_all, sacha_18_absI_G1440, sacha_18_absI_G60, 3zap_1_rel, sacha_18_absI_G30, 3zap_0_rel, sacha_18_absI_G15, sacha_18_rel, sacha_18_absI_G1, samba_auth_abs, sacha_18_absI_G720} to run one configuration, or among {UBIQ_ABS, UBIQ_REL, SACHA, ALL} to run several at once.

  3. Parse the result files to produce the summary file:

    python xps_parse.py vX
    
    # for example:
    
    python xps_parse.py bugzilla_0_rel_all  # to parse bugzilla_0_rel_all_log.txt
    

    This creates a file called run_results_vX.csv in the xps folder.

  4. Produce the tables and plots to visualize the results of the real-world sequence experiments:

    python xps_plot.py filename*
    # for example:
    python xps_plot.py bugzilla_0_rel_all.csv  # to plot bugzilla_0_rel_all.csv
    
    python xps_tables.py vX
    # for example:
    python xps_tables.py bugzilla_0_rel_all.csv  # to make tables from runs_results_bugzilla_0_rel_all.csv
    

  5. Parse the results from sacha to convert ids to text and timestamps to dates and times:

    mkdir ../xps/sacha_text
    cp ../xps/runs/sacha*_patts.txt ../xps/sacha_text/
    python xps_examples.py
    

    For each file XXX_patts.txt in the sacha_text folder, this produces a corresponding file XXX_text-patts.txt containing the patterns in a more readable format.

  6. Run synthetic experiments:

    mkdir ../xps/synthe
    python run_synthe.py
    
  7. Produce boxplots and scatter plots to visualize the results of the synthetic experiments:

    python xps_synthe_boxplots.py
    python xps_synthe_scatter.py
    
  8. Compile the LaTeX file to generate a report of the experiments.
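
The real-world part of the workflow (steps 2 to 4) can be chained from a small driver. The sketch below is a convenience wrapper written for illustration and is not part of the repository: the function names `pipeline_commands` and `run_pipeline` are hypothetical, the arguments passed to each script simply mirror the generic forms shown in the steps above, and the default location of the scripts folder is an assumption. By default it only prints the commands, so it can be inspected before anything is run.

```python
import subprocess
import sys
from pathlib import Path


def pipeline_commands(version, series_id, results_csv):
    """Return the commands of steps 2-4, in order, as argument lists."""
    py = sys.executable
    return [
        [py, "run_mine.py", version, series_id],  # step 2: mine
        [py, "xps_parse.py", version],            # step 3: parse results
        [py, "xps_plot.py", results_csv],         # step 4: plots
        [py, "xps_tables.py", version],           # step 4: tables
    ]


def run_pipeline(commands, scripts_dir="scripts", dry_run=True):
    """Print each command; execute it from scripts_dir unless dry_run."""
    if not dry_run:
        # Equivalent of `mkdir ../xps/runs`, run from the scripts folder.
        runs_dir = Path(scripts_dir).parent / "xps" / "runs"
        runs_dir.mkdir(parents=True, exist_ok=True)
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the chain at the first failing script.
            subprocess.run(cmd, cwd=scripts_dir, check=True)


commands = pipeline_commands("vX", "sacha_18_rel", "run_results_vX.csv")
run_pipeline(commands)  # dry run: prints the four commands only
```

Passing `dry_run=False` executes the scripts in sequence; stopping at the first failure avoids parsing or plotting stale result files.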
