
flume-taildirectory-source's Introduction

flume-taildirectory-source [DEPRECATED]

Project is no longer maintained. If you need a Flume source for ingesting local data try:

Notes

Flume NG source for tailing files in a directory. This plugin is based on the work of jinoos ([email protected]): https://github.com/jinoos/flume-ng-extends. It has been refactored to support log rotation on Windows and Linux, the code has been simplified, and the Apache Commons VFS2 dependency has been replaced with the native Java 7 java.nio library. Thanks for the inspiration.

Compilation
mvn package
Use

Create the directory $FLUME_HOME/plugins.d/tail-directory-source/lib in the Flume installation path and copy the file flume-taildirectory-source-1.1.1.jar into it. Then edit the Flume configuration file using the parameters described in the Configuration section below.

Configuration
| Property Name | Default | Description |
|---|---|---|
| channels | - | |
| type | - | Must be org.apache.flume.source.taildirectory.DirectoryTailSource |
| dirs | - | Space-separated list of nicknames (NICK), one for each directory to monitor |
| dirs.NICK.path | - | Path of the directory identified by NICK |
| unlockFileTime | 1 | Time, in minutes, after which a file that has not been modified is released (unlocked) |
| fileHeader | false | Include the absolute path of the file in the event headers |
| fileHeaderKey | file | Header key used for the absolute file path |
| basenameHeader | false | Include the base name of the file in the event headers |
| basenameHeaderKey | basename | Header key used for the file base name |
| followLinks | false | Follow symbolic links to directories found inside the monitored directories |
  • Example
agent.sources = tailDir
agent.sources.tailDir.type = org.apache.flume.source.taildirectory.DirectoryTailSource
agent.sources.tailDir.dirs = monitDir1 monitDir2
agent.sources.tailDir.dirs.monitDir1.path = /var/lib/flume/tailDir-1
agent.sources.tailDir.dirs.monitDir2.path = /var/lib/flume/tailDir-2
agent.sources.tailDir.unlockFileTime = 1
agent.sources.tailDir.basenameHeader = true
agent.sources.tailDir.basenameHeaderKey = basenameFilename
agent.sources.tailDir.fileHeader = true
agent.sources.tailDir.fileHeaderKey = file
agent.sources.tailDir.followLinks = false

agent.sources.tailDir.channels = memoryChannel
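The relationship between dirs and dirs.NICK.path in the example above can be illustrated with a small Java sketch. It is not the plugin's own parsing code: the class DirsConfigSketch and the method resolveDirs are hypothetical names, and the sketch assumes that nicknames are separated by whitespace, as they are in the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical helper, not part of the plugin: shows how each nickname listed
// in "<source>.dirs" pairs with a "<source>.dirs.<NICK>.path" entry.
final class DirsConfigSketch {

    /** Returns nickname -> directory path, in the order the nicknames are listed. */
    static Map<String, String> resolveDirs(Properties props, String sourcePrefix) {
        Map<String, String> result = new LinkedHashMap<>();
        String nicks = props.getProperty(sourcePrefix + ".dirs", "").trim();
        if (nicks.isEmpty()) {
            return result;
        }
        // Assumption: nicknames are whitespace-separated, as in the example.
        for (String nick : nicks.split("\\s+")) {
            String path = props.getProperty(sourcePrefix + ".dirs." + nick + ".path");
            if (path == null) {
                throw new IllegalArgumentException(
                        "Missing path for directory nickname: " + nick);
            }
            result.put(nick, path.trim());
        }
        return result;
    }
}
```

Loading the example configuration into a java.util.Properties object and calling resolveDirs(props, "agent.sources.tailDir") would map monitDir1 to /var/lib/flume/tailDir-1 and monitDir2 to /var/lib/flume/tailDir-2. A nickname without a matching path entry is reported as an error in this sketch.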

flume-taildirectory-source's People

Contributors

benhollomon, mvalleavila

flume-taildirectory-source's Issues

Only one file is released due to a bad loop implementation

for (FileSet fileSet : fileSetMap.values()) {

    [.......]
    if (currentTime - lastAppendTime > TimeUnit.MINUTES.toMillis(timeToUnlockFile)) {
        logger.info("File: " + fileSet.getFilePath() +
                " not modified after " + timeToUnlockFile + " minutes" +
                " removing from monitoring list");
        fileSetMap.get(fileKey).clear();
        fileSetMap.get(fileKey).close();
        fileSetMap.remove(fileKey);
        filePathsAndKeys.remove(fileSet.getFilePath().toString());
    }
}

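Check this line. An entry cannot be safely removed from a HashMap through the map itself while a for-each loop is iterating over it; the next iteration step typically fails with a ConcurrentModificationException, so only one file is released: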
fileSetMap.remove(fileKey);
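One possible fix, sketched here under simplifying assumptions, is to iterate with an explicit Iterator and delete through Iterator.remove(). The sketch replaces the plugin's FileSet objects with plain last-append timestamps; the class IdleEntryRemoval and the method removeIdle are hypothetical names used only for illustration.

```java
import java.util.Iterator;
import java.util.Map;

// Hypothetical stand-in for the unlock loop: values are last-append
// timestamps in milliseconds instead of the plugin's FileSet objects.
final class IdleEntryRemoval {

    /**
     * Removes every entry that has been idle longer than maxIdleMillis
     * and returns how many entries were removed.
     */
    static int removeIdle(Map<String, Long> lastAppendByKey, long now, long maxIdleMillis) {
        int removed = 0;
        Iterator<Map.Entry<String, Long>> it = lastAppendByKey.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (now - entry.getValue() > maxIdleMillis) {
                // In the real loop, clear() and close() on the FileSet would go here.
                // Iterator.remove() deletes the current entry without invalidating
                // the iteration, unlike calling remove(key) on the map itself.
                it.remove();
                removed++;
            }
        }
        return removed;
    }
}
```

With three entries of which two are idle for longer than the limit, removeIdle removes both in a single pass and leaves the third in the map. The same pattern would apply to fileSetMap; the removal from filePathsAndKeys is unaffected because that map is not the one being iterated.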

Agent shutdown on failure causes log lines to be lost

If the Flume agent fails and the service stops, the tail directory source cannot resume from the last processed line when it is restarted.

One possibility to avoid this problem is to save the current file pointer and serialize an object containing this pointer together with the inode identifier of the file.
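A minimal sketch of this idea follows. It is only an illustration, not code from the plugin: the class TailCheckpoint and its methods are hypothetical, and instead of Java object serialization it stores the positions in a java.util.Properties file. The file is identified through BasicFileAttributes.fileKey(), which is typically the device and inode pair on UNIX systems and may be null on other platforms, in which case the sketch falls back to the absolute path.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.Properties;

// Hypothetical checkpoint helper: remembers the read position per file.
final class TailCheckpoint {

    /** Identifies a file by its file key (device and inode on UNIX) when available. */
    static String fileId(Path file) throws IOException {
        Object key = Files.readAttributes(file, BasicFileAttributes.class).fileKey();
        // fileKey() may be null on some platforms; fall back to the path.
        return key != null ? key.toString() : file.toAbsolutePath().toString();
    }

    /** Stores the current read position of the tailed file in the checkpoint file. */
    static void save(Path checkpoint, Path tailed, long position) throws IOException {
        Properties positions = load(checkpoint);
        positions.setProperty(fileId(tailed), Long.toString(position));
        try (OutputStream out = Files.newOutputStream(checkpoint)) {
            positions.store(out, "tail positions");
        }
    }

    /** Returns the saved position, or 0 if the file has no checkpoint yet. */
    static long restore(Path checkpoint, Path tailed) throws IOException {
        String saved = load(checkpoint).getProperty(fileId(tailed));
        return saved == null ? 0L : Long.parseLong(saved);
    }

    private static Properties load(Path checkpoint) throws IOException {
        Properties positions = new Properties();
        if (Files.exists(checkpoint)) {
            try (InputStream in = Files.newInputStream(checkpoint)) {
                positions.load(in);
            }
        }
        return positions;
    }
}
```

On restart, the source could call restore for each file it finds and seek to the returned offset before reading. How often save should be called, and whether it should be tied to the channel transaction, is a design decision this sketch leaves open.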
