starrocks-connector-spark

A toolkit for reading and writing StarRocks from Apache Spark. It handles both reads and writes in a single package by combining doris-spark-connector and starrocks-connector-for-apache-spark.

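Background

This project merges the two connectors and, for now, exists mainly for the author's own convenience. Once the starrocks-connector-for-apache-spark ecosystem matures, use starrocks-connector-for-apache-spark directly.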

Build

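To use this toolkit as a dependency, you need to build it yourself.

mvn install -DskipTests
# If you use it frequently, publish it to your private Maven repository
mvn deploy # requires <distributionManagement> to be configured in pom.xml first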

Usage

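Option 1: add the dependency to your pom.xml (this requires the artifact to be available from a Maven repository, for example after the mvn deploy in the Build step above):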

<dependency>
    <groupId>cn.howard</groupId>
    <artifactId>starrocks-connector-spark</artifactId>
    <version>0.1.0-SNAPSHOT</version>
    <scope>compile</scope>
</dependency>

Option 2: if the artifact has not been uploaded to a private repository, reference the JAR produced by the Build step through the system scope:

<dependency>
    <groupId>cn.howard</groupId>
    <artifactId>starrocks-connector-spark</artifactId>
    <version>0.1.0-SNAPSHOT</version>
    <systemPath>${project.basedir}/your_path/starrocks-connector-spark-0.1.0-SNAPSHOT.jar</systemPath>
    <scope>system</scope>
</dependency>

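your_path is the directory, relative to the project root, where the JAR is stored. Maven requires systemPath to be an absolute path, so anchor it with ${project.basedir}.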

Reading

val fenodes = args(0) // eg: 127.0.0.1:8030
val user = args(1) // eg: admin
val password = args(2) // eg: 123456
val dbtable = "your_db.test_table" // same database.table form as the Writing example

val data = spark.read
  .format("starrocks")
  .option("starrocks.fenodes", fenodes)
  .option("user", user)
  .option("password", password)
  .option("starrocks.table.identifier", dbtable)
  .load()
data.show()
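
The read above and the write example later in this README pass the same four options. As a minimal sketch, they could be collected once in a small helper. `StarRocksOptions` and `build` are hypothetical names introduced here for illustration and are not part of the connector; only the option keys come from this README.

```scala
// Hypothetical helper, not part of starrocks-connector-spark.
object StarRocksOptions {

  /** Collects the four options used in this README into one immutable Map. */
  def build(fenodes: String, user: String, password: String, table: String): Map[String, String] = {
    // Fail early on obviously empty values instead of at load/save time.
    require(fenodes.trim.nonEmpty, "fenodes must not be empty, e.g. 127.0.0.1:8030")
    require(table.trim.nonEmpty, "table identifier must not be empty")
    Map(
      "starrocks.fenodes"          -> fenodes.trim,
      "user"                       -> user,
      "password"                   -> password,
      "starrocks.table.identifier" -> table.trim
    )
  }
}
```

With a SparkSession in scope, the resulting map could be passed as `spark.read.format("starrocks").options(opts).load()`, since Spark's `DataFrameReader.options` and `DataFrameWriter.options` both accept a `Map[String, String]`.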

Writing

  • The target table must be created beforehand
  • Only append mode is supported for writes

Example:

1) Create the table

CREATE TABLE `your_db`.`test_table`
(
    `code`  varchar(255) NOT NULL COMMENT 'student ID',
    `name`  varchar(255) NOT NULL COMMENT 'name',
    `value` double       NOT NULL COMMENT 'score'
) ENGINE = OLAP PRIMARY KEY(`code`)
COMMENT 'Test table for starrocks-connector-spark reads and writes'
DISTRIBUTED BY HASH(`code`) BUCKETS 4
PROPERTIES (
  "replication_num" = "3",
  "in_memory" = "false",
  "storage_format" = "DEFAULT"
);

2) Write the data

val fenodes = args(0) // eg: 127.0.0.1:8030
val user = args(1) // eg: admin
val password = args(2) // eg: 123456
val dbtable = "your_db.test_table"

val data = spark.createDataFrame(
  Seq(
    // `code` is the table's primary key, so each row uses a distinct value
    ("1001", "张三", 103.5),
    ("1002", "李四", 93.12),
    ("1003", "王五", 119.8),
    ("1004", "赵六", 112.7)
  )
).toDF("code", "name", "value")
data.write.format("starrocks")
  .option("starrocks.fenodes", fenodes)
  .option("user", user)
  .option("password", password)
  .option("starrocks.table.identifier", dbtable)
  .save()
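
Because test_table declares `code` as its primary key, rows in one batch that share a `code` would be expected to overwrite one another rather than land as separate rows. This is an assumption about StarRocks primary key tables, not something this README states. A minimal sketch of a check to run before writing follows. `DuplicateKeys` is a hypothetical helper and the rows are illustrative.

```scala
// Hypothetical helper, not part of starrocks-connector-spark.
object DuplicateKeys {

  /** Returns every key that appears in more than one row of the batch. */
  def find[R, K](rows: Seq[R])(key: R => K): Set[K] =
    rows.groupBy(key).filter(_._2.size > 1).keySet

  def main(args: Array[String]): Unit = {
    // Illustrative rows in the (code, name, value) shape used above.
    val rows = Seq(
      ("1001", "a", 103.5),
      ("1002", "b", 93.12),
      ("1001", "c", 119.8)
    )
    val duplicates = find(rows)(_._1)
    if (duplicates.nonEmpty)
      println(s"Duplicate primary keys: ${duplicates.mkString(", ")}")
  }
}
```

On a DataFrame, a comparable check would be `data.groupBy("code").count().filter("count > 1")`.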

Contributors

xuhaowork
