km-db's Introduction

The km-db gem is aimed at KissMetrics (KM) users. Its goal is to efficiently process data obtained with KM's "Data Export" feature.

It is meant to:

  • import KM event dumps into a SQL database (preferably MySQL / PostgreSQL)
  • quickly process KM event dumps

Once the data is imported, you can run complex queries against your visit history, for instance to perform multivariate analysis.

Beware though: KM data can be huge, and processing it is taxing!

Installing

Add this to your Gemfile if you're using Bundler:

gem 'km-db', :git => 'https://github.com/HouseTrip/km-db.git'

Importing data

Running reports on raw logs can be less efficient than running them against a relational database. To load your dumps into one, km-db provides a km_db_import executable. Run it with:

$ bundle exec km_db_import <data-dump-directory>…

By default, your events will be imported into test.db, a SQLite database.

You can create km_db.yml or config/km_db.yml to have it import using another adapter, for instance:

# km_db.yml
adapter:  mysql2
database: km_events
user:     root

Remember to add sqlite3-ruby or mysql2 to your Gemfile.
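
The lookup described above could be sketched as follows. This is a minimal illustration, not km-db's actual code: the helper name load_km_db_config, the two constants, and the order in which the paths are checked are all assumptions.

```ruby
require 'yaml'

# Assumed default, mirroring the SQLite fallback described above.
DEFAULT_CONFIG = { 'adapter' => 'sqlite3', 'database' => 'test.db' }.freeze

# Candidate locations, checked in this (assumed) order, relative to a base directory.
CONFIG_PATHS = ['km_db.yml', File.join('config', 'km_db.yml')].freeze

# Hypothetical helper: returns the settings from the first config file found,
# or a copy of the default settings when no file exists.
def load_km_db_config(base_dir = Dir.pwd)
  path = CONFIG_PATHS.map { |p| File.join(base_dir, p) }.find { |p| File.file?(p) }
  return DEFAULT_CONFIG.dup unless path

  # An empty file parses to nil, so fall back to the default in that case too.
  YAML.safe_load(File.read(path)) || DEFAULT_CONFIG.dup
end
```

Calling load_km_db_config in a directory with neither file returns the SQLite settings; dropping in the km_db.yml shown above would return the mysql2 settings instead.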

Using imported data

The KMDB module exposes four ActiveRecord classes. Event, Property, and User are the main domain objects. Key is used to intern strings (event and property names) for performance.
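
To illustrate what interning means here, the sketch below maps each distinct string to a small integer id, so that repeated names can be stored and compared as integers. It is an in-memory illustration only: the KeyTable class and its methods are hypothetical and say nothing about how KMDB::Key is actually implemented.

```ruby
# Hypothetical in-memory interner: each distinct string gets a small integer id.
class KeyTable
  def initialize
    @ids = {}      # string => id
    @strings = []  # id => string
  end

  # Returns the id for the string, assigning a new one the first time it is seen.
  def intern(string)
    @ids[string] ||= begin
      @strings << string.dup.freeze
      @strings.size - 1
    end
  end

  # Returns the string registered under the given id.
  def lookup(id)
    @strings.fetch(id)
  end
end

keys = KeyTable.new
visited = keys.intern('visited site')
referer = keys.intern('referer')
keys.intern('visited site') == visited  # true: the same name always maps to the same id
```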

Finding events and properties

All visits during Jan. 2012:

KMDB::Event.before('2012-02-01').after('2012-01-01').named('visited site').by_date

All of a user's visits:

KMDB::User.last.events.named('visited site')

A user's referers:

KMDB::User.last.properties.named('referer').map(&:value)

Load some properties with events (uses a left join by default):

KMDB::User.last.events.with_properties('a prop', 'another prop').map(&:another_prop)
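
The practical consequence of a left join is that events lacking a requested property are still returned, with nil in its place. The sketch below shows that behaviour on in-memory data; the hash layout, the event_id link, and this standalone with_properties helper are hypothetical stand-ins, not km-db's schema or implementation.

```ruby
# Hypothetical in-memory data standing in for the events and properties tables.
events = [
  { id: 1, name: 'visited site' },
  { id: 2, name: 'visited site' }
]
properties = [
  { event_id: 1, name: 'a prop', value: 'x' },
  { event_id: 1, name: 'another prop', value: 'y' }
]

# Left join: every event is kept; a requested property that is missing is nil.
def with_properties(events, properties, *names)
  by_event = properties.group_by { |p| p[:event_id] }
  events.map do |event|
    props = by_event.fetch(event[:id], [])
    loaded = names.to_h do |name|
      match = props.find { |p| p[:name] == name }
      [name, match && match[:value]]
    end
    event.merge(properties: loaded)
  end
end

rows = with_properties(events, properties, 'a prop', 'another prop')
values = rows.map { |row| row[:properties]['another prop'] }
# values is ['y', nil]: event 2 has no such property but is still returned
```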

Note that many of the more complex queries will require writing SQL directly.

Processing data

You don't have to import your data to filter it.

The two classes you're looking for are KMDB::Parser and KMDB::ParallelParser. The latter runs your filter task on all available CPUs, using the parallel gem.

The following example counts the number of aliasing events in all JSON files under dumps/:

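require 'rubygems'
require 'kmdb'

counter = 0
parser = KMDB::Parser.new
# The filter block receives the raw text and the parsed event.
parser.add_filter do |text, event|
  # Aliasing events carry a '_p2' property.
  counter += 1 if event['_p2']
end
parser.run('dumps/')
puts counter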

Note that this example will not work with ParallelParser, as each process gets its own copy of the counter variable.
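
The sketch below shows why the counter is lost and one way around it. It uses Ruby's plain fork and pipes rather than ParallelParser or the parallel gem, and runs only on platforms that support fork. The sample events are made up, and count_aliases is a hypothetical helper. Each worker counts its own chunk and reports the result to the parent, which sums them.

```ruby
require 'json'

# Made-up sample data: each inner array stands in for one dump file,
# with one JSON event per line.
chunks = [
  ['{"_p":"a","_p2":"b"}', '{"_n":"visited site","_p":"a"}'],
  ['{"_p":"c","_p2":"d"}', '{"_p":"e","_p2":"f"}']
]

# Hypothetical helper: counts the events that carry a '_p2' property.
def count_aliases(lines)
  lines.count { |line| JSON.parse(line).key?('_p2') }
end

shared_counter = 0
workers = chunks.map do |lines|
  reader, writer = IO.pipe
  pid = fork do
    reader.close
    shared_counter += count_aliases(lines) # changes only this child's copy
    writer.puts(shared_counter)
    writer.close
    exit!(0) # leave the child without running the parent's at_exit handlers
  end
  writer.close
  [pid, reader]
end

# Collect each worker's count from its pipe and add them up in the parent.
total = workers.sum do |pid, reader|
  count = reader.read.to_i
  reader.close
  Process.wait(pid)
  count
end

puts shared_counter # 0: the parent's copy was never modified
puts total          # 3: the sum of the per-process counts
```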

km-db's People

Contributors

mezis, jberlinsky
