
micro-analytics-cli's Introduction

micro-analytics πŸ“ˆ

Public analytics as a Node.js microservice, no sysadmin experience required.


A tiny analytics server, easy to run and hack around on. It does one thing, and it does it well: it counts the views of something and makes those views publicly accessible via an API. It supports custom database adapters so you can use your storage engine of choice.

(there is currently no frontend to display pretty graphs, feel free to build one yourself!)

This is a lerna repo with several packages in the same repository. There is more info on each package in its subfolder:

The main package used to run micro-analytics.

npm install -g micro-analytics-cli
micro-analytics --help

A package that contains several useful utilities and tests that will make it easier to create storage adapters.

The default storage adapter. It stores the data in a single file. This adapter is automatically installed by micro-analytics-cli.

A storage adapter that keeps everything in memory; when using this, all data is lost when the app restarts.

Community adapters

Demo

We have a demo instance at demo.micro-analytics.io that automatically deploys the master branch of this repository. Feel free to use it to test your clients.

License

Copyright ©️ 2017 Maximilian Stoiber & Rolf Erik Lekang, licensed under the MIT License. See license.md for more information.


micro-analytics-cli's Issues

Move the /_* urls to /_/*

This way, you can name your pages however you’d like. Also, there could be a feature where requests to /_/([a-z\d-]+) get require()d from a specific directory in the source code, making it simpler to create separate files for each request.

Add debugging information output

Lots of useful information could be logged for the user, but it should only be emitted in a debugging context. So, allow a --debug option that enables this output. Have the service modules point to a reusable logging component; on init, we tell that logging component whether it should echo subsequent logs to the console.

wildcard search in ?all

So, trying to mimic some really expressive behavior that grafana uses. They allow you to track metrics with keys like: a.b.c.d.e and you can do analytics/querying based on searches of a.b.*.d.e. Note, using dots instead of slashes wouldn't change how this is implemented.

A real world scenario around this:

myapp.*.logins

where the convention used is myapp.version.logins.

That query (services.com/myapp.*.logins?all=true) allows you to get all login event metrics across all versions/deploys and can compare. A/B testing and a lot of things would be enabled by this.

I'd imagine we would just pass the raw string to the db adapter and let it figure out how to parse/support it. For the file-db adapter this should be a simple change of the filter to use a regex, something along the lines of key.match(new RegExp('^' + options.pathname.replace('*', '.*')))
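As a standalone sketch, the filter could look like the following; the escaping step matters so that literal dots in keys are not treated as regex wildcards (wildcardToRegExp and matchKeys are illustrative names, not the adapter's actual API):

```javascript
// Convert a wildcard pattern like "myapp.*.logins" into a RegExp,
// escaping everything except the '*' wildcards themselves.
function wildcardToRegExp(pattern) {
  const escaped = pattern
    .split('*')
    .map(part => part.replace(/[.+?^${}()|[\]\\]/g, '\\$&'))
    .join('.*');
  return new RegExp('^' + escaped + '$');
}

// Filter stored keys against the raw pathname passed to the adapter.
function matchKeys(keys, pathname) {
  const re = wildcardToRegExp(pathname);
  return keys.filter(key => re.test(key));
}
```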

analytics scope

So, what this really is, is counting of events on a specific key. I wanted to open up the conversation around opening up the scope of this project. Views was a good start, but it seems a bit odd to me that the db adaptors can support arbitrary key/values but we limit the scope to views explicitly.

But keeping with the counting strategy, we could simply offer ways for users to specify the type of information segments being tracked. What this allows for is the ability to track counts in semantic groupings.

For example, I might want to track user interactions, page views, page load times, etc. I could try to track all of that with MA as it is today, but it could really overcomplicate querying.

As a major version change, what if we supported something like:

analyticsservice.com/count/:trackingSegment/:key
analyticsservice.com/count/:trackingSegment?key=:key

breaking that down:

  • keep the simple structure of the endpoints that we have today
  • add /count/ prefix path to indicate the action/type of data.
    • supports backward compatibility with what is there today. (unless someone's path is /count/)
    • indicates the action and data type when getting and putting. This allows us to support different types later on potentially.
  • trackingSegment is what we should group the metric
    • examples: /count/views/, /count/purchased/
  • :key is, like we have today, the string that would be captured as the key
    • ?key=:key seems like a good idea in general so that you can wrap it in quotes and pass non-URI-safe values as the key. ?key='/path?otherquery&params'
  • remove the views counting object field. so
    "/hello":{ views: [] } -> "/hello":{ count: [] }

Diving more into the tracking segment. The goal would be that you have something like

{
  // could be any string tracking segment name 
  // like interactions, checkouts, etc.
  "views": {
    "/hello": {
      "count": [{
        "time": 123
      }, {
        "time": 124
      }]
    }
  }
}

Since we have this segmenting of the data being tracked, our querying will also be segmented by that type. This lets us track multiple types of data, and querying isn't "looking for a banana but you got the gorilla holding the banana and the entire jungle."

And supporting what we have today can have this:
analyticsservice.com/my/path be applied to the db as if it were:
analyticsservice.com/count/views/my/path
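That backward-compatibility rule could be a one-line rewrite applied before the key reaches the db adapter; this is a sketch, and normalizePath is a made-up name:

```javascript
// Rewrite legacy paths onto the proposed /count/:trackingSegment/:key scheme.
// Paths that already start with /count/ are assumed to use the new scheme.
function normalizePath(pathname) {
  if (pathname.startsWith('/count/')) return pathname;
  return '/count/views' + pathname;
}
```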

Move flat-file-db adapter to its own package

It should be its own package, and a dependency of micro-analytics-cli to make sure people install it when they install the CLI.

That way core stays lean and clean, and we have an example of an adapter package to point people to if they get stuck implementing their own.

HTTPS support

Hi friends! Thanks again for the wonderful project.

I realized that when trying to fetch() the analytics for a page on HTTPS, it fails, since the micro-analytics server is HTTP, so Chrome rejects the fetch because it's an "insecure resource".

Looks like micro can be set up with HTTPS pretty simply. There's an example here.

I'm considering forking this project to implement it, but I first wanted to check if there wasn't a simpler way to do this that I was overlooking? Has anyone else implemented this for an HTTPS page?

Thanks!

Connection leaks in adapter tests

The adapter tests do not close connections correctly, resulting in the tests for the redis adapter never quitting.

The problem at the moment is that the "call init should not throw" test overwrites the connection with a new one; when the tests close the connection in afterEach, the connection created in beforeEach is never closed.

I think it would be a good thing to require adapter.close() and handle the closing of database connections in the adapter utils.

fair-analytics

Hi,

Inspired by some of your work on micro-analytics, I released fair-analytics.

The approach is quite different in its implementation.

  • It uses an append-only algorithm to store raw visit logs. This makes the collected data accessible to users.
  • Instead of GET, it uses POST to write data to the log file.
  • It is possible to listen to append events from other processes (for aggregating data in a real database and displaying fancy charts, for instance).

I mentioned micro-analytics in the readme of my project because you, with your work, played a big part 💯 Thanks for being such an inspiration.

I'd love to have your feedback, and hope you will find the project useful.

cheers.

Tests throw "ReferenceError: async is not defined"

I am pulling my hair out trying to get the async-to-gen stuff going. I am running node 6.2 (and I have jest-cli installed globally).

ReferenceError: async is not defined
      
      at Object.<anonymous> (node_modules/micro-analytics-adapter-flat-file-db/index.js:11:8)
      at Object.<anonymous> (src/db.js:9:13)
      at Object.<anonymous> (src/index.js:4:12)
      at Object.<anonymous> (tests/items.test.js:5:17)

for all three tests. Basically, jest isn't applying the transforms but I have no idea why. Do you experience this as well?

Time based segmenting

As it is right now, the service returns two types of data, aggregated counts and ?all to get the full records back.

The problem is rooted in the aggregated counts. As of today they are the sum total since the beginning of time, which becomes less useful as time goes on: if in 3 months I have 13,000 views on a page, that doesn't tell me how my app is trending (getting better or worse view counts). This is where ?all comes in; with that param you can get all of the time data about your metrics. However, that too covers the beginning of time, so to aggregate your trends over time you need to fetch all records in order to calculate the time segments for the last day, week, month, etc.

So, the proposal is really for both endpoint styles, count and ?all. You could have them support a
?since={time} and a ?before={time} to do really basic time range filtering.

The important benefit here is that the filtering is done in the server-side query, so it doesn't affect the performance of the endpoint over time.
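On the server, the filtering itself is tiny; a sketch of the proposed behavior, assuming views are stored as objects with a time field (filterByTimeRange is an illustrative name):

```javascript
// Keep only views whose timestamp falls inside the requested range.
// `since` and `before` are optional epoch timestamps from the query string.
function filterByTimeRange(views, { since, before } = {}) {
  return views.filter(view => {
    if (since !== undefined && view.time < since) return false;
    if (before !== undefined && view.time > before) return false;
    return true;
  });
}
```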

favicon.ico

I've been using Chrome to open up views for testing, and it requests favicon.ico. Is there something we could/should do here? I think this could be a common usage pattern for people trying out or testing the project.

So some sort of filtering or blacklist?

?meta query param

We could add support for ?meta on GET requests that increment, which would allow users to save arbitrary information in the database. We let them send key-value pairs in the key:value format, joining multiple pairs with commas (key:value,key2:value2) so they don't clash with other query params.

For example, this request:

service.com/car?meta=user:asdf123,sessionlength:1m2s

Would store this in the database:

{
  time: 12345,
  meta: {
    user: 'asdf123',
    sessionlength: '1m2s',
  },
}

By allowing this we enable users to build more sophisticated analytics tracking, where they can save things like the length of the session, how many pages were visited, etc.
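Parsing that format is straightforward; a sketch (parseMeta is a hypothetical name, not an existing function):

```javascript
// Parse "user:asdf123,sessionlength:1m2s" into { user: ..., sessionlength: ... }.
// Values stay strings; pairs without a colon are ignored.
function parseMeta(raw) {
  const meta = {};
  for (const pair of raw.split(',')) {
    const idx = pair.indexOf(':');
    if (idx === -1) continue;
    meta[pair.slice(0, idx)] = pair.slice(idx + 1);
  }
  return meta;
}
```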

Do you reckon this is useful?

Database adapter test suite?

Should we have a test suite that new database adapters can run against to check compatibility with the spec? Currently writing the docs for it, I feel like that'd make things much easier.

This might require having a repo setup with the tests and a bunch of other niceties (e.g. linting) which developers can simply clone to write a new adapter.

Add support for cli options for adapters

Add two items to adapters.

  • options - an array of CLI options matching .options(list) in leo/args
  • init(options) - a function that receives all parsed CLI options and can set up database connections and so on.

Both should be optional, but if options is defined then init must be defined.
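A hypothetical adapter exposing those two hooks might look like the following; the option fields shown (name, description, defaultValue) are assumed to match the shape leo/args accepts for .options(list), and everything else here is illustrative:

```javascript
// Sketch of an adapter module exposing the proposed hooks.
const adapter = {
  // CLI options in the shape accepted by leo/args' .options(list).
  options: [
    { name: 'host', description: 'Database host', defaultValue: 'localhost' },
    { name: 'port', description: 'Database port', defaultValue: 6379 },
  ],

  // Receives the parsed CLI options; a natural place to open connections.
  init(options) {
    adapter.connection = { host: options.host, port: options.port };
  },
};
```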

Live updates

In #8 we decided to add adapter.subscribe to allow subscribing to changes in the database. We need to somehow implement a live API endpoint so UIs can show realtime data! (websockets?)

IP Blacklist

Obviously this could be implemented in the client-side on a per-project basis, but I think it would be a decent idea to implement some sort of IP blacklisting system so that stats don't get bloated during development/QA.

Maybe something along the lines of providing a .ipignore file that follows the same general rules of .gitignore files?

127.0.0.1
192.102.*
*.512.*

Again, not sure if you're trying to keep this as micro as possible leaving things like this up to the individual developer. But seems like a feature that many could benefit from.
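A matcher for such patterns could stay very small; a sketch (isIgnored is a made-up name) that treats each * as a wildcard within the address:

```javascript
// Return true if the IP matches any .ipignore-style pattern,
// where '*' matches any run of characters within the address.
function isIgnored(ip, patterns) {
  return patterns.some(pattern => {
    const escaped = pattern
      .split('*')
      .map(part => part.replace(/[.+?^${}()|[\]\\]/g, '\\$&'))
      .join('.*');
    return new RegExp('^' + escaped + '$').test(ip);
  });
}
```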

Should a GET really change the value by default?

First of all I want to say THANK YOU for this service and for your work. I'm playing around with the service at the moment and it works totally fine 👍

I understand that this service provides a small (REST) API (GET and POST). All services I've seen before (Twitter, Facebook, Netflix, and so on) never change a value in response to a GET, because a GET returns information and nothing more; if I want to change values I have to send a request via POST or PUT.
In the documentation you mention the query parameter "inc" to change this behavior. To me it was always clear that a GET should never change values/data by default, so let's implement it so that a GET only returns information, and POSTs and PUTs change values. ;-)

What is your opinion on this? I would take care of this task :-)

Include additional examples

Hi!

I was excited to try micro-analytics with my personal site, built using next.js and hosted on zeit now. However, after trying to get something working with my site, I am confused and not sure how this app is supposed to be used. Perhaps including more context in the example and adding more examples would help.

I'd be happy to create a PR with examples if a bit of guidance was provided... :)

Thanks!
Peter

Token Authentication

As a first pass, I don't think we need to go full blown user management here. But allowing the service to only be reachable from an authenticated user might make this approachable for people who don't want their data visible to the world.

Perhaps we allow an environment variable that sets a (user-supplied) token and, if set, we require the token to be passed before returning anything from the endpoint. The endpoints would still do incrementing without the token, but return no data.

Thoughts?

CORS support

Hi there! Thanks for the great tool, was delighted when Max told me this existed :)

I've run it on my digitalocean droplet, but I'd like to use it for sites on Github Pages. When I try to make a fetch request for it, I get a CORS error:

Cross-origin redirection to https://server-ip-address/my-path denied by Cross-Origin Resource Sharing policy: Origin http://localhost:3002 is not allowed by Access-Control-Allow-Origin.

It looks like a PR was opened last year, #17, which fixed this by adding the right header. Looking through the current source, though, I don't see that header being sent. Was it intentionally removed? If so, is there a way to specify allowed origins that I'm missing?

Happy to open a quick PR if necessary!

Healthcheck endpoint

GET - /_health

{
  health: "ok" | "critical" | "unknown", 
  version: "2.0.0",
  adapter: {
    name: "flat-file-db",
    version: "1.0.0"
  }
}

The endpoint should call db.health, which should return "ok" | "critical" | "unknown". If the adapter has implemented health, the endpoint should offload to that function; if not, it should return "unknown".
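A sketch of the handler side, assuming the db wrapper exposes the adapter's name, version, and optional health() (healthPayload and the db shape are assumptions, not the actual implementation):

```javascript
// Build the /_health payload, falling back to "unknown" when the
// adapter has not implemented health().
async function healthPayload(db, serverVersion) {
  const health = typeof db.health === 'function' ? await db.health() : 'unknown';
  return {
    health,
    version: serverVersion,
    adapter: { name: db.name, version: db.version },
  };
}
```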

Analytics on homepage?

Attempting to GET the / endpoint yields Please include a path to a page. on the client and:

Error: Please include a path to a page.
    at createError (/app/node_modules/micro/lib/server.js:152:15)
    at analyticsHandler (/app/node_modules/micro-analytics-cli/src/handler.js:58:11)
    at /app/node_modules/micro-analytics-cli/src/handler.js:109:16
    at resolve (/app/node_modules/micro/lib/server.js:24:34)
    at Promise._execute (/app/node_modules/bluebird/js/release/debuggability.js:300:9)
    at Promise._resolveFromExecutor (/app/node_modules/bluebird/js/release/promise.js:483:18)
    at new Promise (/app/node_modules/bluebird/js/release/promise.js:79:10)
    at Function.exports.run (/app/node_modules/micro/lib/server.js:24:3)
    at Server.server (/app/node_modules/micro/lib/server.js:13:50)
    at emitTwo (events.js:126:13)
    at Server.emit (events.js:214:7)
    at parserOnIncoming (_http_server.js:602:12)
    at HTTPParser.parserOnHeadersComplete (_http_common.js:116:23)

in the server logs. Is there any way to track events at the root page?

For example, running fetch(`https://my-micro.analytics/${location.pathname}`) doesn’t record visits to the homepage.

removing leading slash from saved key

so when I GET:

service.com/test it will store { "/test": {...} }

I was experimenting with using grafana-like namespaces for metrics, like name.space.id.a.metric, and the only thing that gets in the way is the leading slash.

so
service.com/extension.chrome.installed it will store { "/extension.chrome.installed": {...} }

My feeling is that, even if you want to use the paths for tracking, it's a bit redundant to have that leading slash.

atomicity and db operation concerns

So, as a few people have brought up, we are mimicking locking in our pushView util function, which seems to be doing the db adapter's job. For example, if a db solution supports an "add or increment" function or handles atomicity across processes, then we are introducing a performance bottleneck by having all adapters use our lock logic and 2-3 transaction inserts. For the record, I think it was a really good starting point, but we should take it to the next level.

So I think the next phase of our adapters (while there are only two) needs to keep supporting the API we provide, but put needs to change. If an adapter needs to call its own this.has() and this.get() to reconcile what to do, it can go for it, but we shouldn't force that. So we should change put (or possibly rename it): we give the adapter the key, and it resolves a promise with the count value.

What do you think?

Should the atomicity be part of the adapter?

I don't know how many databases support atomic operations, but it might make sense to move the atomicity requirement to the adapters. That way if a database supports atomic operations the core doesn't do any unnecessary work.

This probably depends quite highly on #19, as this is hard to write and prove without a good test suite if the database does not have atomic operations. We could also provide a wrapper to make generic, non-atomic put operations atomic to make that step easier.
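The generic wrapper could serialize puts per key with a promise chain; a sketch under the assumption that the adapter's put returns a promise (makeAtomic is a made-up name, not the actual adapter-utils API):

```javascript
// Serialize operations per key so non-atomic adapter puts never interleave.
function makeAtomic(put) {
  const chains = new Map();
  return function atomicPut(key, value) {
    const prev = chains.get(key) || Promise.resolve();
    const next = prev.then(() => put(key, value));
    // Keep the chain alive even if a put rejects.
    chains.set(key, next.catch(() => {}));
    return next;
  };
}
```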

Adapter utils

The following code is in almost all of the adapters; some just have it inline. I was wondering if we should provide these functions from micro-analytics-cli/adapter-utils or something similar?

const escapeRegexp = require('escape-string-regexp');

function keyRegex(str) {
  // Escape each literal segment, then join with '.*' so every '*' in the
  // pattern becomes a wildcard (a single replace('*', '.*') would only
  // rewrite the first one).
  return new RegExp('^' + str.split('*').map(s => escapeRegexp(s)).join('.*'));
}

function filterPaths(options) {
  return key =>
    options.ignoreWildcard
      ? key.startsWith(options.pathname)
      : key.match(keyRegex(options.pathname));
}

function filterViews(views, options) {
  return (views || []).filter(view => {
    if (options && options.before && view.time > options.before)
      return false;
    if (options && options.after && view.time < options.after)
      return false;
    return true;
  });
}

Add support for db plugins

As I suggested on Twitter, it would be nice to have support for db plugins. I am more than happy to implement this. There are a few things that need to be considered. What should the API be? I am thinking the same as db.js is today, just with promises for everything, like you mentioned on Twitter. Should plugins be in this repo? I think it would be nice if they were external packages, e.g. micro-analytics-db-redis.
