ETL

The OrientDB-ETL module is a tool for moving data into and out of OrientDB by executing an ETL (Extract, Transform, Load) process. OrientDB ETL is based on the following principles:

  • one configuration file in JSON format
  • exactly one Extractor, which extracts data from a source
  • exactly one Loader, which loads data into a destination
  • multiple Transformers that transform data in a pipeline. Each one receives an input, processes it, and returns an output that the next component takes as its input

How ETL works

EXTRACTOR => TRANSFORMERS[] => LOADER

Example of a process that extracts data from a CSV file, applies some changes, looks up whether the record has already been created, and then stores the record as a document in an OrientDB database:

+-----------+-----------------------+-----------+
|           |              PIPELINE             |
+ EXTRACTOR +-----------------------+-----------+
|           |     TRANSFORMERS      |  LOADER   |
+-----------+-----------------------+-----------+
|   FILE   ==>  CSV->FIELD->MERGE  ==> OrientDB |
+-----------+-----------------------+-----------+
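
The flow in the diagram can be sketched in plain Python. This only illustrates the extractor => transformers => loader idea and is not the OrientDB ETL API: the function names, the `id` join field, and the in-memory `store` dict standing in for the database are all assumptions made for the example.

```python
import csv
import io


def extract(text):
    """Extractor: yield raw lines from a source (here, an in-memory string)."""
    for line in io.StringIO(text):
        yield line.rstrip("\n")


def csv_transformer(lines):
    """Transformer 1 (CSV): turn raw lines into dicts keyed by the header row."""
    yield from csv.DictReader(lines)


def field_transformer(records):
    """Transformer 2 (FIELD): change one field, here upper-casing the name."""
    for record in records:
        record["name"] = record["name"].upper()
        yield record


def merge_transformer(records, store):
    """Transformer 3 (MERGE): look up an existing record by id and merge into it."""
    for record in records:
        existing = store.get(record["id"], {})
        yield {**existing, **record}


def load(records, store):
    """Loader: write each record to the destination (a dict, hypothetically a database)."""
    for record in records:
        store[record["id"]] = record


def run_pipeline(text, store):
    # Each stage consumes the output of the previous one.
    records = extract(text)
    records = csv_transformer(records)
    records = field_transformer(records)
    records = merge_transformer(records, store)
    load(records, store)
    return store


# Record "1" already exists, so it is merged; record "2" is created.
store = {"1": {"id": "1", "city": "Rome"}}
result = run_pipeline("id,name\n1,luca\n2,anna\n", store)
```

Because every stage is a generator, records flow through one at a time, which mirrors how each transformer hands its output to the next component.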

The pipeline, which consists of the transformation and loading phases, can run in parallel by setting {"parallel":true} in the configuration.
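
How OrientDB ETL parallelizes the pipeline internally is not described here. As a rough mental model only, the Python sketch below has a single extractor feed records to a pool of workers, each of which runs the transform and load steps. Every name in it is hypothetical, and the thread pool is an assumption chosen for the illustration.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def extract():
    """Single extractor: produces records one by one."""
    for i in range(10):
        yield {"id": i, "name": f"record-{i}"}


def transform(record):
    """Transformation phase: return a changed copy of the record."""
    return {**record, "name": record["name"].upper()}


class Loader:
    """Loading phase: a lock-protected dict standing in for a database."""

    def __init__(self):
        self._lock = threading.Lock()
        self.stored = {}

    def load(self, record):
        with self._lock:
            self.stored[record["id"]] = record


def pipeline(record, loader):
    """The part that may run in parallel: transform, then load."""
    loader.load(transform(record))


def run(parallel):
    loader = Loader()
    if parallel:
        with ThreadPoolExecutor(max_workers=4) as pool:
            # list() waits for all workers and re-raises any worker exception.
            list(pool.map(lambda r: pipeline(r, loader), extract()))
    else:
        for record in extract():
            pipeline(record, loader)
    return loader.stored


sequential = run(parallel=False)
concurrent_result = run(parallel=True)
```

In this sketch the parallel run stores the same records as the sequential one, but the order in which they are loaded is not guaranteed, which is why the result is keyed by `id`.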

Installation

Starting from OrientDB v2.0, the ETL module will be distributed bundled with the official release. If you want to use it, follow these steps:

  • Clone the repository on your computer by executing:
      git clone https://github.com/orientechnologies/orientdb-etl.git
  • Compile the module by executing:
      mvn clean install
  • Copy script/oetl.sh (or oetl.bat under Windows) to $ORIENTDB_HOME/bin
  • Copy target/orientdb-etl-2.0-SNAPSHOT.jar to $ORIENTDB_HOME/lib

Usage

$ cd $ORIENTDB_HOME/bin
$ ./oetl.sh config-dbpedia.json

Available Components

Examples:

See the Documentation for more information.
