Aliyun ODPS Plugin for Fluentd

Getting Started


Introduction

  • ODPS (Open Data Processing Service) is a massive data processing platform designed by Alibaba.
  • DHS (ODPS DataHub Service) is a service in ODPS that provides real-time data upload and download.

Requirements

To get started using this plugin, you will need these things:

  1. Ruby 2.1.0 or later
  2. Gem 2.4.5 or later
  3. Fluentd-0.10.49 or later (Home Page)
  4. Protobuf-3.5.1 or later (Ruby protobuf)
  5. Ruby-devel
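
The version requirements above can be checked from Ruby itself. The following is a minimal sketch, not part of the plugin: `meets_requirement?` is a hypothetical helper, and the gem names `fluentd` and `protobuf` are taken from the install commands in this README.

```ruby
require 'rubygems'

# Hypothetical helper: true when the installed version is at least the minimum.
def meets_requirement?(installed, minimum)
  Gem::Version.new(installed) >= Gem::Version.new(minimum)
end

# Minimum versions as listed in the Requirements section.
puts "Ruby #{RUBY_VERSION} ok? #{meets_requirement?(RUBY_VERSION, '2.1.0')}"
puts "RubyGems #{Gem::VERSION} ok? #{meets_requirement?(Gem::VERSION, '2.4.5')}"

{ 'fluentd' => '0.10.49', 'protobuf' => '3.5.1' }.each do |name, minimum|
  # find_all_by_name returns an empty array when the gem is not installed.
  specs = Gem::Specification.find_all_by_name(name)
  if specs.empty?
    puts "#{name}: not installed"
  else
    newest = specs.map(&:version).max
    puts "#{name} #{newest} ok? #{newest >= Gem::Version.new(minimum)}"
  end
end
```

Comparing through `Gem::Version` rather than plain strings avoids the case where "0.10.49" sorts before "0.9.0" lexically.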

Install the Plugin

Install the plugin from RubyGems, or clone the source from GitHub:

$ gem install fluent-plugin-aliyun-odps
$ git clone https://github.com/aliyun/aliyun-odps-fluentd-plugin.git

Use gem to install the dependencies:

$ gem install protobuf
$ gem install fluentd --no-ri --no-rdoc

If you cloned the repository, the plugin is in aliyun-odps-fluentd-plugin/lib/fluent/plugin; the entry file is out_odps.rb.

Use the Plugin

  • If you installed this plugin from gem, skip this step.
  • Otherwise, copy the contents of aliyun-odps-fluentd-plugin/lib/fluent/plugin into the plugin directory of Fluentd, {YOUR_FLUENTD_DIRECTORY}/lib/fluent/plugin:
$ cp -r aliyun-odps-fluentd-plugin/lib/fluent/plugin/* {YOUR_FLUENTD_DIRECTORY}/lib/fluent/plugin/

The ODPS Fluentd plugin is now available. The following is a simple example of an ODPS output configuration.

<source>
   type tail
   path /opt/log/in/in.log
   pos_file /opt/log/in/in.log.pos
   refresh_interval 5s
   tag in.log
   format /^(?<remote>[^ ]*) - - \[(?<datetime>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*) "-" "(?<agent>[^\"]*)"$/
   time_format %Y%b%d %H:%M:%S %z
</source>
<match in.**>
  type aliyun_odps
  aliyun_access_id ************
  aliyun_access_key *********
  aliyun_odps_endpoint http://service.odps.aliyun.com/api
  aliyun_odps_hub_endpoint http://dh.odps.aliyun.com
  buffer_chunk_limit 2m
  buffer_queue_limit 128
  flush_interval 5s
  project your_projectName
  <table in.log>
    table your_tableName
    fields remote,method,path,code,size,agent
    partition ctime=${datetime.strftime('%Y%m%d')}
    time_format %d/%b/%Y:%H:%M:%S %z
    shard_number 1
  </table>
</match>
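
To see which keys the `format` pattern in the configuration above produces, it can be tried in plain Ruby. This is only a sketch: the sample log line is made up, and the mapping of captures onto `fields` illustrates the idea rather than reproducing the plugin's own code.

```ruby
# Same pattern as the `format` line in the <source> block.
LOG_FORMAT = /^(?<remote>[^ ]*) - - \[(?<datetime>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*) "-" "(?<agent>[^\"]*)"$/

# Column list from the `fields` setting in the <table> block.
FIELDS = %w[remote method path code size agent]

# Made-up sample line, for illustration only.
line = '127.0.0.1 - - [29/Aug/2015:11:10:16 +0800] "GET /index.html HTTP/1.1" 200 1234 "-" "curl/7.29.0"'

match = LOG_FORMAT.match(line)
raise 'line does not match the format pattern' unless match

# Named captures become the record keys that `fields` must refer to.
record = Hash[match.names.zip(match.captures)]

# Pick the values in the order given by `fields`.
row = FIELDS.map { |field| record[field] }

puts record.inspect
puts row.inspect
```

The `datetime` capture is not listed in `fields`; in the example configuration it is only used to build the partition value.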

Parameters

  • type (Fixed): always aliyun_odps.
  • aliyun_access_id (Required): your Aliyun access ID.
  • aliyun_access_key (Required): your Aliyun access key.
  • aliyun_odps_hub_endpoint (Required): if you are using ECS, set it to http://dh-ext.odps.aliyun-inc.com; otherwise use http://dh.odps.aliyun.com.
  • aliyun_odps_endpoint (Required): if you are using ECS, set it to http://odps-ext.aiyun-inc.com/api; otherwise use http://service.odps.aliyun.com/api.
  • buffer_chunk_limit (Optional): size of each buffer chunk, with the suffix "k" (KB), "m" (MB), or "g" (GB). The default is 8 MB, the recommended value is 2 MB, and the maximum is 20 MB.
  • buffer_queue_limit (Optional): maximum number of chunks in the buffer queue. For example, with buffer_chunk_limit 2m and buffer_queue_limit 128, the total buffer size is 2 MB × 128 = 256 MB.
  • flush_interval (Optional): interval at which the data buffer is flushed; default 60s.
  • abandon_mode (Optional): drop a data pack after it has been retried 3 times.
  • project (Required): your project name.
  • table (Required): your table name.
  • fields (Required): must match the keys in source.
  • partition (Optional): set this if your table is partitioned.
    • partition format:
      • fixed string: partition ctime=20150804
      • keyword: partition ctime=${remote}
      • keyword in time format: partition ctime=${datetime.strftime('%Y%m%d')}
  • time_format (Optional):
    • if a keyword is used to set the partition and its value is a time string, set time_format to the format of that value. Example: if source[datetime] = "29/Aug/2015:11:10:16 +0800", then time_format is "%d/%b/%Y:%H:%M:%S %z".
  • shard_number (Optional): data is written to shards in the range [0, shard_number-1]. The value must be greater than 0 and less than the max shard number of your table.
  • enable_fast_crc (Optional): use fast crc.so to calculate the CRC. This improves speed considerably, but it is not supported on some operating systems.
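
The three partition formats can be illustrated with a short Ruby sketch. `resolve_partition` is a hypothetical helper written for this README and is not the plugin's actual implementation; it assumes the record keys are strings and that `time_format` describes the value stored under the keyword.

```ruby
require 'time'

# Hypothetical helper: expands ${key} and ${key.strftime('fmt')} in a
# partition template using values from the record.
def resolve_partition(template, record, time_format = nil)
  template.gsub(/\$\{(\w+)(?:\.strftime\('([^']*)'\))?\}/) do
    key = Regexp.last_match(1)
    out_format = Regexp.last_match(2)
    value = record.fetch(key)
    if out_format
      # Parse the source value with time_format, then re-format it.
      Time.strptime(value, time_format).strftime(out_format)
    else
      value
    end
  end
end

# Made-up record, using the datetime example from the time_format parameter.
record = {
  'remote'   => '10.0.0.1',
  'datetime' => '29/Aug/2015:11:10:16 +0800'
}
time_format = '%d/%b/%Y:%H:%M:%S %z'

puts resolve_partition('ctime=20150804', record)
puts resolve_partition('ctime=${remote}', record)
puts resolve_partition("ctime=${datetime.strftime('%Y%m%d')}", record, time_format)
```

With the sample record, the third call yields `ctime=20150829`, which shows why `time_format` has to match the incoming string: it is what lets the value be parsed before `strftime` is applied.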

License


Licensed under the Apache License 2.0.
