
paraflow's People

Contributors

lemongrasssmell, rainmaple, ray6080, taoyouxian

paraflow's Issues

[paraflow] readme

Write detailed documentation covering project aims, installation, usage guides, and configuration explanations.

add support for Pixels

Pixels is a new 'smart' columnar format developed by us, and it will be open-sourced soon.

We need to add support for Pixels and integrate our optimizations over columnar storage into this project.

[list] [paraflow-presto]

The paraflow-presto module executes SQL queries using Presto.
This is a list of all issues related to this module; it can be seen as a mini task board.

[list] [paraflow-loader-producer]

The paraflow-loader-producer module loads data into a Kafka cluster.
This is a list of all issues related to this module; it can be seen as a mini task board.

  1. Issue #45

[feature] [bug] [paraflow-tools] Deployment shell

  1. Check whether the lib dir exists under ParaFlow/dist/ParaFlow-xxx/.
  2. If it does not exist, create the lib dir.
  3. Copy the jar files from RealTimeAnalysis/dist/bin/ to lib.
  4. Copy the jar files from Presto/presto-server/target/lib to lib.
  5. Package the ParaFlow-xxx/ dir into a tar file named ParaFlow-xxx.tar.
  6. scp the tar file to each server listed in the servers file.
  7. Untar the tar file into the specified dir on each server.
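
The issue asks for a shell script; the sketch below only illustrates the logic of steps 1 to 4 (ensure the lib dir exists, then copy jar files into it) in Java, the project's own language. The class name LibAssembler, its method names, and the command-line layout are assumptions made for illustration. Packaging and distribution (steps 5 to 7) are not covered.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

// Hypothetical helper covering steps 1-4 only; tar and scp are left out.
class LibAssembler {

    // Steps 1-2: create the lib directory under distDir if it is missing.
    static Path ensureLibDir(Path distDir) throws IOException {
        Path lib = distDir.resolve("lib");
        Files.createDirectories(lib); // does nothing when the directory already exists
        return lib;
    }

    // Steps 3-4: copy every *.jar file from sourceDir into libDir and return the count.
    static int copyJars(Path sourceDir, Path libDir) throws IOException {
        int copied = 0;
        try (DirectoryStream<Path> jars = Files.newDirectoryStream(sourceDir, "*.jar")) {
            for (Path jar : jars) {
                Files.copy(jar, libDir.resolve(jar.getFileName()),
                        StandardCopyOption.REPLACE_EXISTING);
                copied++;
            }
        }
        return copied;
    }

    // Assumed usage: java LibAssembler <distDir> <jarSourceDir> [<jarSourceDir> ...]
    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: LibAssembler <distDir> <jarSourceDir>...");
            return;
        }
        Path lib = ensureLibDir(Paths.get(args[0]));
        for (int i = 1; i < args.length; i++) {
            int n = copyJars(Paths.get(args[i]), lib);
            System.out.println(n + " jar file(s) copied from " + args[i]);
        }
    }
}
```

Existing jar files with the same name are overwritten, so rerunning the sketch after a rebuild refreshes lib.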

[list] [paraflow-tools]

paraflow-tools provides convenience tools to compile, test, and deploy the paraflow system.
This is a list of all issues related to this module; it can be seen as a mini task board.

  1. compile and test tool
  2. deployment tool

[list] [paraflow-commons]

The paraflow-commons module holds common classes shared by several modules.
This is a list of all issues related to this module; it can be seen as a mini task board.

  1. #19 Logger system
  2. ConfigurationFactory
  3. Exceptions

[common] [paraflow-loader-producer] producer for Kafka

A producer client for loading data into Kafka.
The client provides the following APIs:

  1. send()
  2. createTopic()
  3. createUser()
  4. createDatabase()
  5. createFiberTable()
  6. createRegularTable()
  7. createFiberFunc()
  8. registerFilter()
  9. registerTransformer()

[paraflow-master] configuration server

We shall design a configuration server: a centralized service for all collectors and loaders in the cluster. Each collector or loader listens on this service to get the latest configuration parameters.

In this way, we can avoid copying the same configuration files all over the cluster, and tune the cluster on the fly without halting.
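
As a rough illustration of the idea, the sketch below models the configuration server as an in-memory registry that pushes updates to subscribers. It is not a design from this issue: the class name ConfigServerSketch, its methods, and the parameter key used in the example are all assumptions, and a real server would sit behind a network protocol.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.BiConsumer;

// In-memory stand-in for the proposed configuration server (all names hypothetical).
class ConfigServerSketch {
    private final Map<String, String> params = new ConcurrentHashMap<>();
    private final List<BiConsumer<String, String>> listeners = new CopyOnWriteArrayList<>();

    // A collector or loader registers a callback and first receives the current values.
    void subscribe(BiConsumer<String, String> listener) {
        listeners.add(listener);
        params.forEach(listener);
    }

    // Changing a parameter pushes the new value to every subscriber.
    void set(String key, String value) {
        params.put(key, value);
        for (BiConsumer<String, String> listener : listeners) {
            listener.accept(key, value);
        }
    }

    String get(String key) {
        return params.get(key);
    }

    public static void main(String[] args) {
        ConfigServerSketch server = new ConfigServerSketch();
        server.set("loader.batch.size", "100"); // hypothetical parameter name

        // A loader keeps a local view that follows the server.
        Map<String, String> loaderView = new HashMap<>();
        server.subscribe(loaderView::put);

        server.set("loader.batch.size", "500"); // tuned on the fly, no restart
        System.out.println(loaderView.get("loader.batch.size"));
    }
}
```

Because each subscriber is notified on every change, no configuration file has to be copied to the nodes, which is the benefit the issue describes.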

[enhancement] [paraflow-metaserver] refactor

Refactor MetaServer:

  1. Service. Defines a service provided to clients in terms of:
    1. params. The parameters necessary to call this service.
    2. execution flow. Each service consists of a list of actions executed in order from first to last.
    3. transaction. Handles transactions as needed.
  2. Action. Defines the basic unit of execution. It can be reused in different services.
  3. ConnectionPool. A connection factory for obtaining connection instances.
  4. Connection. Handles low-level execution. Each connection is bound to a TransactionController.
  5. TransactionController. Responsible for committing and rolling back connection transactions.
  6. ParaFlowException. Describes all kinds of exceptions in MetaServer. It has five levels:
    1. DEBUG. Used only for debugging by developers.
    2. INFO. Useful information returned to clients.
    3. WARNING. Exceptions triggered by inappropriate user behaviour.
    4. ERROR. System errors that should be fixed, although the system can still run.
    5. FATAL. Fatal errors. The system can no longer continue running.
  7. ExceptionHandler. Defines handlers for exceptions and converts exceptions to statuses.
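
One possible reading of how these pieces fit together is sketched below. The component names come from the list above, but every signature, the parameter map, and the status string format are assumptions; ConnectionPool and Connection are omitted, and the TransactionController only records calls instead of touching a database.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical shapes for the refactored pieces; not the project's actual classes.
class MetaServiceSketch {

    // Exception carrying one of the five levels from the list.
    static class ParaFlowException extends Exception {
        enum Level { DEBUG, INFO, WARNING, ERROR, FATAL }

        final Level level;

        ParaFlowException(Level level, String message) {
            super(message);
            this.level = level;
        }
    }

    // Commits or rolls back; this stand-in only records which call was made.
    static class TransactionController {
        final List<String> log = new ArrayList<>();

        void commit() { log.add("commit"); }

        void rollback() { log.add("rollback"); }
    }

    // Basic unit of execution, reusable across services.
    interface Action {
        void execute(Map<String, String> params) throws ParaFlowException;
    }

    // Converts an exception into a status string (the format is an assumption).
    static class ExceptionHandler {
        static String toStatus(ParaFlowException e) {
            return e.level + ": " + e.getMessage();
        }
    }

    // A service runs its actions in order inside one transaction.
    static class Service {
        private final List<Action> actions;

        Service(Action... actions) {
            this.actions = Arrays.asList(actions);
        }

        String run(Map<String, String> params, TransactionController tx) {
            try {
                for (Action action : actions) {
                    action.execute(params);
                }
                tx.commit();
                return "OK";
            } catch (ParaFlowException e) {
                tx.rollback(); // any failed action undoes the whole service call
                return ExceptionHandler.toStatus(e);
            }
        }
    }

    public static void main(String[] args) {
        Map<String, String> params = new HashMap<>();
        params.put("database", "test");
        Action check = p -> {
            if (!p.containsKey("database")) {
                throw new ParaFlowException(ParaFlowException.Level.WARNING, "database is required");
            }
        };
        Service service = new Service(check);
        System.out.println(service.run(params, new TransactionController()));
    }
}
```

Keeping commit and rollback in Service.run means an Action never has to know whether it runs alone or as part of a longer flow, which is what makes it reusable.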

[common] [paraflow-metaserver] MetaServer api

List all public interfaces of MetaServer.

Sub task of #22

API:

  • List<String> listDatabases()
  • List<String> listTables(String database)
  • Database getDatabase(String database)
  • Table getTable(String database, String table)
  • Column getColumn(String database, String table, String column)
  • Status createDatabase(Database database)
  • Status createTable(Table table)
  • Status deleteDatabase(String database)
  • Status deleteTable(String database, String table)
  • Status renameDatabase(String oldName, String newName)
  • Status renameTable(String database, String oldName, String newName)
  • Status renameColumn(String database, String table, String oldName, String newName)
  • Status createFiber(String database, String table, long value)
  • List<Long> listFiberValues(String database, String table, long value)
  • Status addBlockIndex(String database, String table, long fiberV, String beginTime, String endTime, String path)
  • List<String> filterBlockPaths(String database, String table, String timeLow, String timeHigh)
  • List<String> filterBlockPaths(String database, String table, long fiberV, String timeLow, String timeHigh)

Models:

  1. Database
    String name;
    String locationUri;
    User user;
  2. Table
    Database database;
    long creationTime;
    long lastAccessTime;
    User owner;
    String tableName;
    String tableLocationUri;
    List<Column> columns;
  3. User
    String userName;
    String userPass;
    String roleName;
    long creationTime;
    long lastVisitTime;
  4. Column
    Table table;
    String colName;
    String dataType;
    int colIndex;
  5. Status (Enum type)
    • OK
    • DATABASE_ALREADY_EXISTS
    • DATABASE_NOT_FOUND
    • TABLE_ALREADY_EXISTS
    • TABLE_NOT_FOUND
    • COLUMN_ALREADY_EXISTS
    • COLUMN_NOT_FOUND
    • FIBER_ALREADY_EXISTS
    • BLOCK_INDEX_ERROR
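
To show how the Status values might pair with the calls above, here is a minimal in-memory sketch of a few of them (createDatabase, createTable, deleteTable, listDatabases, listTables). It is an illustration under assumptions, not the MetaServer implementation: the class name InMemoryMetaSketch is made up, the models are cut down to two fields each, and returning an empty list for an unknown database is a choice made here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// In-memory illustration of part of the listed API; storage and models are simplified.
class InMemoryMetaSketch {

    enum Status {
        OK, DATABASE_ALREADY_EXISTS, DATABASE_NOT_FOUND, TABLE_ALREADY_EXISTS, TABLE_NOT_FOUND,
        COLUMN_ALREADY_EXISTS, COLUMN_NOT_FOUND, FIBER_ALREADY_EXISTS, BLOCK_INDEX_ERROR
    }

    // Reduced model: the User field is omitted.
    static class Database {
        final String name;
        final String locationUri;

        Database(String name, String locationUri) {
            this.name = name;
            this.locationUri = locationUri;
        }
    }

    // Reduced model: only the owning database and the table name are kept.
    static class Table {
        final Database database;
        final String tableName;

        Table(Database database, String tableName) {
            this.database = database;
            this.tableName = tableName;
        }
    }

    private final Map<String, Database> databases = new TreeMap<String, Database>();
    private final Map<String, Map<String, Table>> tables = new TreeMap<String, Map<String, Table>>();

    Status createDatabase(Database database) {
        if (databases.containsKey(database.name)) {
            return Status.DATABASE_ALREADY_EXISTS;
        }
        databases.put(database.name, database);
        tables.put(database.name, new TreeMap<String, Table>());
        return Status.OK;
    }

    Status createTable(Table table) {
        Map<String, Table> inDb = tables.get(table.database.name);
        if (inDb == null) {
            return Status.DATABASE_NOT_FOUND;
        }
        if (inDb.containsKey(table.tableName)) {
            return Status.TABLE_ALREADY_EXISTS;
        }
        inDb.put(table.tableName, table);
        return Status.OK;
    }

    Status deleteTable(String database, String table) {
        Map<String, Table> inDb = tables.get(database);
        if (inDb == null) {
            return Status.DATABASE_NOT_FOUND;
        }
        return inDb.remove(table) == null ? Status.TABLE_NOT_FOUND : Status.OK;
    }

    List<String> listDatabases() {
        return new ArrayList<String>(databases.keySet());
    }

    // Assumption: an unknown database yields an empty list rather than an error.
    List<String> listTables(String database) {
        Map<String, Table> inDb = tables.get(database);
        return inDb == null ? new ArrayList<String>() : new ArrayList<String>(inDb.keySet());
    }
}
```

Returning a Status instead of throwing keeps every mutating call uniform for RPC clients, which matches the signatures listed above.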

add support for ORC

Currently, we only support Parquet.
ORC is as popular as Parquet, and it is well integrated with Presto.

We need to add support for ORC.

[list] [paraflow-metaserver]

MetaServer is a standalone server for metadata management.
It keeps all metadata internally and provides rich RPC interfaces.
This is a list of all issues related to this module; it can be seen as a mini task board.

  1. Issue #5
  2. Issue #20
  3. Issue #21
  4. Issue #27
  5. Issue #42
  6. Issue #43

[list] [paraflow-loader-consumer]

The paraflow-loader-consumer module loads data from Kafka into the file system.
This is a list of all issues related to this module; it can be seen as a mini task board.
