Giter Club home page Giter Club logo

flume-elasticsearch-sink's Introduction

Elasticsearch Sink

The sink reads events from a channel, serializes them into json documents and batches them into a bulk processor. Bulk processor batches the writes to elasticsearch as per configuration.

The elasticsearch index and type for each event can be defined statically in the configuration file or can be derived dynamically using a custom IndexBuilder.

By default, events are assumed to be in json format. This assumption can be overridden by implementing the Serializer interface.

Follow these steps to use this sink in Apache flume:

  • Build the plugin. This command will create the zip file inside the target directory.

mvn clean assembly:assembly

  • Extract the file into the flume installation directories plugin.d folder.

  • Configure the sink in the flume configuration file with properties as below

Required properties are in bold.

Property Name Default Description
channel -
type - The component type name, has to be com.cognitree.flume.sink.elasticsearch.ElasticSearchSink
es.cluster.name elasticsearch Name of the elasticsearch cluster to connect to
es.client.hosts - Comma separated hostname:port pairs ex: host1:9300,host2:9300. The default port is 9300
es.bulkActions 1000 The number of actions to batch into a request
es.bulkProcessor.name flume Name of the bulk processor
es.bulkSize 5 Flush the bulk request every mentioned size
es.bulkSize.unit MB Bulk request unit, supported values are KB and MB
es.concurrent.request 1 The maximum number of concurrent requests to allow while accumulating new bulk requests
es.flush.interval.time 10s Flush a batch as a bulk request every mentioned seconds irrespective of the number of requests
es.backoff.policy.time.interval 50M Backoff policy time interval, wait initially for the 50 miliseconds
es.backoff.policy.retries 8 Number of backoff policy retries
es.index default Index name to be used to store the documents
es.type default Type to be used to store the documents
es.index.builder com.cognitree.
flume.sink.
elasticsearch.
StaticIndexBuilder
Implementation of com.cognitree.flume.sink.elasticsearch.IndexBuilder interface
es.serializer com.cognitree.
flume.sink.
elasticsearch.
SimpleSerializer
Implementation of com.cognitree.flume.sink.elasticsearch.Serializer interface
es.serializer.csv.fields - Comma separated csv field name with data type i.e. column1:type1,column2:type2, Supported data types are string, boolean, int and float
es.serializer.csv.delimiter ,(comma) Delimiter for the data in flume event body
es.serializer.avro.schema.file - Absolute path for the schema configuration file

Example of agent named agent

  agent.channels = es_channel
  agent.sinks = es_sink
  agent.sinks.es_sink.type=com.cognitree.flume.sink.elasticsearch.ElasticSearchSink
  agent.sinks.es_sink.es.bulkActions=5
  agent.sinks.es_sink.es.bulkProcessor.name=bulkprocessor
  agent.sinks.es_sink.es.bulkSize=5
  agent.sinks.es_sink.es.bulkSize.unit=MB
  agent.sinks.es_sink.es.concurrent.request=1
  agent.sinks.es_sink.es.flush.interval.time=5m
  agent.sinks.es_sink.es.backoff.policy.time.interval=50M
  agent.sinks.es_sink.es.backoff.policy.retries=8
  agent.sinks.es_sink.es.cluster.name=es-cluster
  agent.sinks.es_sink.es.client.hosts=127.0.0.1:9300
  agent.sinks.es_sink.es.index=defaultindex
  agent.sinks.es_sink.es.index.builder=com.cognitree.flume.sink.elasticsearch.HeaderBasedIndexBuilder
  agent.sinks.es_sink.es.serializer=com.cognitree.flume.sink.elasticsearch.SimpleSerializer
  agent.sinks.es_sink.es.serializer.csv.fields=id:int,name:string,isemployee:boolean,leaves:float
  agent.sinks.es_sink.es.serializer.csv.delimiter=,
  agent.sinks.es_sink.es.serializer.avro.schema.file=/usr/local/schema.avsc

flume-elasticsearch-sink's People

Contributors

biswassouvik avatar srinathc avatar sumit-ct avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

flume-elasticsearch-sink's Issues

ElasticSearchSink.process() can't catch exception when Es down

dear Cognitree:
I try this es-sink for testing, find that ElasticSearchSink.process() can't catch exception when Es down.
It is seem that the exception message was send to BulkProcessor.Listener.afterBulk() and can,t throw.
so the logs was consumed with no stored, can you help to this issue?

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.