m3ninx's Introduction

M3

Distributed TSDB and Query Engine, Prometheus Sidecar, Metrics Aggregator, and more, such as a Graphite storage and query engine.

More Information

Community Meetings

You can find recordings of past meetups here: https://vimeo.com/user/120001164/folder/2290331.

Install

Dependencies

The simplest and quickest way to try M3 is to use Docker; read the M3 quickstart section for other options.

This example uses jq to format the output of API calls. It is not essential for using M3DB.

Usage

The steps below are a simplified version of the M3 quickstart guide; we suggest you read that guide for more details.

  1. Start a Container
docker run -p 7201:7201 -p 7203:7203 --name m3db -v $(pwd)/m3db_data:/var/lib/m3db quay.io/m3db/m3dbnode:v1.0.0
  2. Create a Placement and Namespace
#!/bin/bash
curl -X POST http://localhost:7201/api/v1/database/create -d '{
  "type": "local",
  "namespaceName": "default",
  "retentionTime": "12h"
}' | jq .
  3. Ready a Namespace
curl -X POST http://localhost:7201/api/v1/services/m3db/namespace/ready -d '{
  "name": "default"
}' | jq .
  4. Write Metrics
#!/bin/bash
curl -X POST http://localhost:7201/api/v1/json/write -d '{
  "tags": 
    {
      "__name__": "third_avenue",
      "city": "new_york",
      "checkout": "1"
    },
    "timestamp": '\"$(date "+%s")\"',
    "value": 3347.26
}'
  5. Query Results

Linux

curl -X "POST" -G "http://localhost:7201/api/v1/query_range" \
  -d "query=third_avenue" \
  -d "start=$(date "+%s" -d "45 seconds ago")" \
  -d "end=$( date +%s )" \
  -d "step=5s" | jq .

macOS/BSD

curl -X "POST" -G "http://localhost:7201/api/v1/query_range" \
  -d "query=third_avenue" \
  -d "start=$(date -v -45S "+%s")" \
  -d "end=$( date +%s )" \
  -d "step=5s" | jq .

Contributing

You can ask questions and give feedback in the following ways:

M3 welcomes pull requests; read the contributing guide to get set up for building and contributing to M3.


This project is released under the Apache License, Version 2.0.

m3ninx's People

Contributors

jeromefroe, prateek, robskillington

m3ninx's Issues

Revisit segment.DocID as a uint64

We currently use a uint32 for segment.DocID, which limits the maximum number of documents per segment to roughly 4.29 billion (2^32). The primary motivation for this is the lack of uint64 support in RoaringBitmap. We should revisit this if it ever becomes an issue.
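The size of that limit can be checked with a few lines of Go. This is a minimal sketch: `DocID` here is a local stand-in type declared for illustration, not an import of the real `segment.DocID`.

```go
package main

import (
	"fmt"
	"math"
)

// DocID is a local stand-in for the uint32 document identifier
// (declared here for illustration only).
type DocID uint32

// MaxDocsPerSegment is the number of distinct values a uint32 DocID can hold.
const MaxDocsPerSegment = uint64(math.MaxUint32) + 1

func main() {
	last := DocID(math.MaxUint32)
	fmt.Println("largest DocID:", last)
	fmt.Println("distinct DocIDs:", MaxDocsPerSegment)
	// Unsigned arithmetic wraps around, so the ID after the largest one is 0.
	fmt.Println("next DocID wraps to:", last+1)
}
```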

Add a high-level example of how to use m3ninx

We should have a high level example of how to use m3ninx. Specifically, it should include:

  • how to create an index (segment currently)
  • how to insert documents
  • how to search for documents
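Until such an example exists, the three steps can be illustrated with a toy in-memory inverted index. This sketch deliberately does not use m3ninx's API: `Index`, `Document`, `Field`, `Insert`, and `Search` are hypothetical names chosen for illustration, and a real example would use the library's own segment and query types.

```go
package main

import "fmt"

// Field is a name/value pair attached to a document (hypothetical type).
type Field struct{ Name, Value string }

// Document is a toy stand-in for an indexed document.
type Document struct {
	ID     string
	Fields []Field
}

// Index maps each field to the positions of the documents that contain it.
type Index struct {
	docs     []Document
	postings map[Field][]int
}

// NewIndex creates an empty index.
func NewIndex() *Index {
	return &Index{postings: make(map[Field][]int)}
}

// Insert stores the document and records its position under each field.
func (ix *Index) Insert(d Document) {
	pos := len(ix.docs)
	ix.docs = append(ix.docs, d)
	for _, f := range d.Fields {
		ix.postings[f] = append(ix.postings[f], pos)
	}
}

// Search returns the IDs of documents whose field `name` equals `value`.
func (ix *Index) Search(name, value string) []string {
	var ids []string
	for _, pos := range ix.postings[Field{name, value}] {
		ids = append(ids, ix.docs[pos].ID)
	}
	return ids
}

func main() {
	ix := NewIndex()
	ix.Insert(Document{ID: "a", Fields: []Field{{"city", "new_york"}}})
	ix.Insert(Document{ID: "b", Fields: []Field{{"city", "seattle"}}})
	fmt.Println(ix.Search("city", "new_york"))
}
```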

Cheap method to retrieve num documents in a Segment

Need a method on Segment which returns an estimate for the number of documents it contains. It can be a rough approximation.

The intent is to use this during a background process in m3db (ticking) to emit metrics that approximate cardinality.
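One cheap way to provide such a number is an atomic counter bumped on every insert, so reading it never takes the segment's locks. The sketch below assumes a hypothetical `countingSegment` type with `Insert` and `Size` methods; it is not the actual Segment implementation.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// countingSegment is a hypothetical segment that tracks only its document count.
type countingSegment struct {
	numDocs int64 // accessed only through sync/atomic
}

// Insert records one more document; the indexing work itself is omitted.
func (s *countingSegment) Insert() { atomic.AddInt64(&s.numDocs, 1) }

// Size returns the number of inserts observed so far without taking a lock.
func (s *countingSegment) Size() int64 { return atomic.LoadInt64(&s.numDocs) }

func main() {
	var seg countingSegment
	var wg sync.WaitGroup
	// Four writers insert concurrently while Size stays safe to call.
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := 0; j < 1000; j++ {
				seg.Insert()
			}
		}()
	}
	wg.Wait()
	fmt.Println("documents:", seg.Size())
}
```

A background tick could call `Size()` and emit the value as a gauge; if documents can be deleted or deduplicated, the counter becomes an approximation, which the issue allows.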

PostingsList Refactoring

We've got quite a few data points indicating we should rework our postings list implementation/API. They are:

  • Support for postings.ID to be uint64 (#12)
  • Benchmark numbers showing upstream Roaring is worse than Pilosa
  • We're using Pilosa's serialisation mechanism for the FST segment, and as a result we have to serialise/deserialise twice when accessing data: on write, data goes from our roaring implementation -> Pilosa's -> []byte, and in the reverse order for reads. We can avoid the extra hop by using Pilosa directly.

The points above clearly indicate we should use Pilosa instead of upstream roaring. Further, when re-working the APIs we should consider:

  • Our postings list implementation currently wraps all access in a RWMutex. This made sense when all we had was a mutable segment, but it's not required now that we have immutable mmap'd data.
  • Impact of immutability semantics on the postings list itself: the postings list in mutable segments only needs Add() as a mutator.
  • Pooling of postings lists: upstream roaring's implementation does not lend itself to pooling. We should revisit this with Pilosa, and immutability (i.e. if we can pool cheaply, an immutable API would be even more tempting).
  • Implications of Seek() ahead, which is required for efficient posting list merging at query time.
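To make the Seek() point concrete, here is a minimal sketch of intersecting two sorted postings lists by seeking ahead rather than stepping one ID at a time. The `Iterator` type and its methods are hypothetical and operate on plain sorted slices; they are not the m3ninx, Roaring, or Pilosa API.

```go
package main

import (
	"fmt"
	"sort"
)

// ID is a hypothetical postings ID type for this sketch.
type ID uint32

// Iterator walks a sorted list of IDs.
type Iterator struct {
	ids []ID
	pos int
}

// NewIterator wraps ids, which must already be sorted in ascending order.
func NewIterator(ids []ID) *Iterator { return &Iterator{ids: ids} }

// Done reports whether the iterator is exhausted.
func (it *Iterator) Done() bool { return it.pos >= len(it.ids) }

// Current returns the ID at the cursor; call it only when Done() is false.
func (it *Iterator) Current() ID { return it.ids[it.pos] }

// Next moves the cursor forward by one element.
func (it *Iterator) Next() { it.pos++ }

// Seek moves the cursor to the first remaining ID >= target via binary search.
func (it *Iterator) Seek(target ID) {
	rest := it.ids[it.pos:]
	it.pos += sort.Search(len(rest), func(i int) bool { return rest[i] >= target })
}

// Intersect returns the IDs present in both iterators, in ascending order.
func Intersect(a, b *Iterator) []ID {
	var out []ID
	for !a.Done() && !b.Done() {
		x, y := a.Current(), b.Current()
		switch {
		case x == y:
			out = append(out, x)
			a.Next()
			b.Next()
		case x < y:
			a.Seek(y) // skip everything in a that is smaller than y
		default:
			b.Seek(x) // skip everything in b that is smaller than x
		}
	}
	return out
}

func main() {
	a := NewIterator([]ID{1, 3, 5, 7, 9})
	b := NewIterator([]ID{3, 4, 7, 10})
	fmt.Println(Intersect(a, b))
}
```

When one list is much shorter than the other, seeking lets the merge skip long runs of the larger list, which is why the operation matters at query time.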

Regexp mismatch between mem v fs segment

Currently, mem segments support everything Go's regexp package will compile; fs segments support only a subset of that (e.g. Vellum has DFA state limits, no support for zero-width assertions, etc).

We need to make this uniform across the two.
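One possible direction is to validate patterns up front against the stricter subset, so both segment types accept or reject the same expressions. The sketch below uses the standard `regexp/syntax` package to detect zero-width assertions; `CheckPortable` is a hypothetical helper, and treating anchors and word boundaries as the full set of unsupported constructs is an assumption (DFA state limits are not checked here).

```go
package main

import (
	"fmt"
	"regexp/syntax"
)

// hasZeroWidthAssertion reports whether the parsed expression contains an
// anchor or word-boundary operator anywhere in its tree.
func hasZeroWidthAssertion(re *syntax.Regexp) bool {
	switch re.Op {
	case syntax.OpBeginLine, syntax.OpEndLine, syntax.OpBeginText,
		syntax.OpEndText, syntax.OpWordBoundary, syntax.OpNoWordBoundary:
		return true
	}
	for _, sub := range re.Sub {
		if hasZeroWidthAssertion(sub) {
			return true
		}
	}
	return false
}

// CheckPortable returns an error if expr does not parse, or if it uses a
// zero-width assertion (assumed here to be unsupported by fs segments).
func CheckPortable(expr string) error {
	re, err := syntax.Parse(expr, syntax.Perl)
	if err != nil {
		return err
	}
	if hasZeroWidthAssertion(re) {
		return fmt.Errorf("regexp %q uses a zero-width assertion", expr)
	}
	return nil
}

func main() {
	for _, expr := range []string{"foo.*bar", `\bfoo`, "^foo$"} {
		fmt.Printf("%q: %v\n", expr, CheckPortable(expr))
	}
}
```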

Make `idx/Query` Serialisable

Add the following methods to idx/query.go:

func Marshal(Query) ([]byte, error) { ... }
func Unmarshal([]byte) (Query, error) { ... }

Can eliminate Query.SearchQuery() after that.
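A rough sketch of what such a pair of functions could look like, using JSON with a kind tag so the decoder knows which concrete type to build. Everything here is an assumption for illustration: the `Query` interface, the `TermQuery` type, the envelope format, and the choice of JSON are not taken from idx/query.go, and the function names follow the spelling used by `encoding/json`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Query is a hypothetical minimal query interface for this sketch.
type Query interface {
	Kind() string
}

// TermQuery matches documents whose Field equals Term exactly.
type TermQuery struct {
	Field string `json:"field"`
	Term  string `json:"term"`
}

// Kind identifies the concrete query type on the wire.
func (TermQuery) Kind() string { return "term" }

// envelope tags the payload with its kind so Unmarshal can pick a type.
type envelope struct {
	Kind    string          `json:"kind"`
	Payload json.RawMessage `json:"payload"`
}

// Marshal encodes a query together with its kind tag.
func Marshal(q Query) ([]byte, error) {
	payload, err := json.Marshal(q)
	if err != nil {
		return nil, err
	}
	return json.Marshal(envelope{Kind: q.Kind(), Payload: payload})
}

// Unmarshal decodes data produced by Marshal back into a concrete query.
func Unmarshal(data []byte) (Query, error) {
	var env envelope
	if err := json.Unmarshal(data, &env); err != nil {
		return nil, err
	}
	switch env.Kind {
	case "term":
		var q TermQuery
		if err := json.Unmarshal(env.Payload, &q); err != nil {
			return nil, err
		}
		return q, nil
	default:
		return nil, fmt.Errorf("unknown query kind %q", env.Kind)
	}
}

func main() {
	data, err := Marshal(TermQuery{Field: "city", Term: "new_york"})
	if err != nil {
		panic(err)
	}
	q, err := Unmarshal(data)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s -> %+v\n", data, q)
}
```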

Avoid using a map with string keys for the terms dictionary

String keys require extra allocs that we could avoid if we rolled with our own hashmap implementation. We should evaluate the perf impact of doing so and make the corresponding changes.

Doing so also opens up the possibility of using a map hidden from the GC ("native"/mmap'd), which we should look into too.
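As a starting point for that evaluation, the following sketch shows a small map keyed by `[]byte` that hashes keys with FNV-1a and resolves collisions by comparing bytes, so no key is ever converted to a string. `BytesMap` and its methods are hypothetical names, and whether this beats a built-in string-keyed map would still need to be benchmarked.

```go
package main

import (
	"bytes"
	"fmt"
)

// entry holds one key/value pair inside a hash bucket.
type entry struct {
	key   []byte
	value int
}

// BytesMap maps []byte keys to ints without converting keys to strings.
type BytesMap struct {
	buckets map[uint64][]entry
}

// NewBytesMap creates an empty map.
func NewBytesMap() *BytesMap {
	return &BytesMap{buckets: make(map[uint64][]entry)}
}

// hashKey computes the 64-bit FNV-1a hash of k inline, without allocating.
func hashKey(k []byte) uint64 {
	h := uint64(14695981039346656037) // FNV-1a 64-bit offset basis
	for _, b := range k {
		h ^= uint64(b)
		h *= 1099511628211 // FNV 64-bit prime
	}
	return h
}

// Set stores v under k, copying k once when the key is new.
func (m *BytesMap) Set(k []byte, v int) {
	h := hashKey(k)
	bucket := m.buckets[h]
	for i := range bucket {
		if bytes.Equal(bucket[i].key, k) {
			bucket[i].value = v
			return
		}
	}
	m.buckets[h] = append(bucket, entry{key: append([]byte(nil), k...), value: v})
}

// Get returns the value stored under k and whether it was present.
func (m *BytesMap) Get(k []byte) (int, bool) {
	for _, e := range m.buckets[hashKey(k)] {
		if bytes.Equal(e.key, k) {
			return e.value, true
		}
	}
	return 0, false
}

func main() {
	m := NewBytesMap()
	m.Set([]byte("city"), 1)
	m.Set([]byte("city"), 2)
	fmt.Println(m.Get([]byte("city")))
	fmt.Println(m.Get([]byte("missing")))
}
```

The key copy in `Set` is the natural place to plug in pooled or mmap'd storage for the bytes.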

Segment Fields/Terms Iterators

Here's the current API spec for two methods on the Segment type:

type Segment interface {
  Fields() ([][]byte, error)
  Terms([]byte) ([][]byte, error)
}

Returning [][]byte has a number of drawbacks:

  • Can't pool the underlying []byte (no easy way for user to finalise and return)
  • Have to alloc the entire [][]byte upfront
  • Can't wrap underlying iterators, e.g. we could wrap vellum's FST iterator if we returned an iterator ourselves

Other factors to consider:

  • We could enforce lexicographical ordering on the returned iterators, which would be useful when constructing FSTs, and the iterators returned by FST would natively satisfy this.
  • Could tie the lifecycle of the iterator being returned to the segment's lifecycle - e.g. FST iterator wrapper would only be valid for as long as the iterator itself.
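One shape such an iterator could take is sketched below. The `Iterator` interface and the slice-backed implementation are hypothetical, written only to illustrate lexicographical ordering and an explicit `Close` that lets the caller hand resources back; they are not the API that was eventually adopted.

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

// Iterator yields []byte values in lexicographical order. The slice returned
// by Current is only guaranteed valid until the next call to Next or Close.
type Iterator interface {
	Next() bool
	Current() []byte
	Err() error
	Close() error
}

// sliceIterator is a simple in-memory implementation backed by a sorted slice.
type sliceIterator struct {
	values [][]byte
	idx    int
}

// NewSliceIterator copies the outer slice and sorts it so callers always see
// lexicographical order, regardless of the input order.
func NewSliceIterator(values [][]byte) Iterator {
	sorted := append([][]byte(nil), values...)
	sort.Slice(sorted, func(i, j int) bool {
		return bytes.Compare(sorted[i], sorted[j]) < 0
	})
	return &sliceIterator{values: sorted, idx: -1}
}

// Next advances to the next value and reports whether one exists.
func (it *sliceIterator) Next() bool {
	if it.idx+1 >= len(it.values) {
		return false
	}
	it.idx++
	return true
}

// Current returns the value at the cursor; call it only after Next returns true.
func (it *sliceIterator) Current() []byte { return it.values[it.idx] }

// Err always returns nil because in-memory iteration cannot fail.
func (it *sliceIterator) Err() error { return nil }

// Close releases the backing slice; a pooled variant would return it here.
func (it *sliceIterator) Close() error {
	it.values = nil
	return nil
}

func main() {
	it := NewSliceIterator([][]byte{[]byte("city"), []byte("__name__"), []byte("checkout")})
	defer it.Close()
	for it.Next() {
		fmt.Printf("%s\n", it.Current())
	}
	if err := it.Err(); err != nil {
		fmt.Println("iteration failed:", err)
	}
}
```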
