phial3 / flink-clickhouse-etl

This project is forked from linkingio/flink-clickhouse-etl

Real-time data processing with Flink, written into ClickHouse

Java 100.00%

flink-clickhouse-etl

Processes data in real time with Flink and writes it into ClickHouse.

Features

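Parsing data from Kafka with Flink

Data parsing has gone through several iterations. The parsing logic is encapsulated so that each data format can be parsed efficiently.
Generics are used in the Kafka connector, which keeps the parsing easy to extend.
During parsing, the topic, partition, and offset of each record are used to build its primary key.
The consumed offsets can be stored in the state backend, so that an offset is committed only after the data has been consumed successfully.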
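
As a minimal sketch of the primary-key idea above, the three Kafka coordinates can be joined into one deterministic string. The class name `RecordKey` and the `:` separator are assumptions made for illustration and are not taken from the project. The separator is chosen because `:` is not a legal character in a Kafka topic name, so the key can always be split back unambiguously.

```java
import java.util.Objects;

// Hypothetical helper, not part of the project: derives a row key from Kafka coordinates.
final class RecordKey {
    private RecordKey() {}

    // The same (topic, partition, offset) always yields the same key, so a
    // record that is replayed after a failure maps to the same row.
    static String of(String topic, int partition, long offset) {
        Objects.requireNonNull(topic, "topic");
        if (partition < 0 || offset < 0) {
            throw new IllegalArgumentException("partition and offset must be non-negative");
        }
        return topic + ":" + partition + ":" + offset;
    }
}
```

For example, `RecordKey.of("events", 3, 42L)` returns `events:3:42`.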

Flink ETL

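User statistics

Count new and returning users per operating system, which draws on concepts such as partitioning.
Run statistical analysis split by new and returning users.
Decide whether a user is new or returning based on the device. This is implemented with a Bloom filter; counting directly with state also works.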
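
The Bloom filter check on the device can be sketched in plain Java as follows. This is an illustration under stated assumptions, not the project's code: the class `DeviceBloomFilter`, the FNV-1a style hashing, and the sizing parameters are all hypothetical. A Bloom filter can occasionally report a new device as already seen (a false positive), but it never reports a seen device as new.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

// Hypothetical sketch: tracks which device IDs have been seen, in a fixed-size bit set.
final class DeviceBloomFilter {
    private final BitSet bits;
    private final int size;
    private final int hashCount;

    DeviceBloomFilter(int size, int hashCount) {
        if (size <= 0 || hashCount <= 0) {
            throw new IllegalArgumentException("size and hashCount must be positive");
        }
        this.bits = new BitSet(size);
        this.size = size;
        this.hashCount = hashCount;
    }

    // Returns true if the device was definitely not seen before, and records it.
    boolean markIfNew(String deviceId) {
        byte[] data = deviceId.getBytes(StandardCharsets.UTF_8);
        int h1 = fnv1a(data, 0x811C9DC5);
        int h2 = fnv1a(data, 0x01000193) | 1; // keep the step odd so it is never zero
        boolean isNew = false;
        for (int i = 0; i < hashCount; i++) {
            // Double hashing: derive the i-th bit position from two base hashes.
            int index = Math.floorMod(h1 + i * h2, size);
            if (!bits.get(index)) {
                isNew = true; // at least one bit was unset, so the ID is new
                bits.set(index);
            }
        }
        return isNew;
    }

    // FNV-1a style 32-bit hash with a configurable starting value.
    private static int fnv1a(byte[] data, int seed) {
        int hash = seed;
        for (byte b : data) {
            hash ^= (b & 0xFF);
            hash *= 0x01000193;
        }
        return hash;
    }
}
```

The first call to `markIfNew` for a given device ID returns `true`, and every later call with the same ID returns `false`. In a Flink job the bit set would have to live in managed state to survive restarts; the sketch leaves that out.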

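TopN statistics

Computes the TopN access counts per event type, category, and product over time windows. A sliding window splits the data into 5-minute windows with a slide of 1 minute. ListState is used for the aggregation, and a timer registered during aggregation fires at the window end time + 1 to run the full TopN sort. Value state and map state can also be used for the counting.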
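
The window arithmetic and the final sort can be sketched without any Flink dependency. The class `SlidingTopN` and its method names are hypothetical, and the window alignment assumes a zero offset. `windowStarts` lists the 5-minute windows that a timestamp falls into when the slide is 1 minute, and `topN` is the kind of full sort that would run once the timer after the window end has fired and all counts for that window are present.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of sliding-window assignment and a per-window TopN sort.
final class SlidingTopN {
    static final long SIZE_MS = 5 * 60_000L; // window length: 5 minutes
    static final long SLIDE_MS = 60_000L;    // slide: 1 minute

    private SlidingTopN() {}

    // Start timestamps of every sliding window that contains the given event time.
    static List<Long> windowStarts(long timestampMs) {
        List<Long> starts = new ArrayList<>();
        long lastStart = timestampMs - Math.floorMod(timestampMs, SLIDE_MS);
        for (long start = lastStart; start > timestampMs - SIZE_MS; start -= SLIDE_MS) {
            starts.add(start);
        }
        return starts;
    }

    // Keys of the n largest counts; ties are broken by key so the order is stable.
    static List<String> topN(Map<String, Long> counts, int n) {
        List<Map.Entry<String, Long>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Long.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        List<String> result = new ArrayList<>();
        for (int i = 0; i < Math.min(n, entries.size()); i++) {
            result.add(entries.get(i).getKey());
        }
        return result;
    }
}
```

With these settings every event belongs to five overlapping windows, so each count is updated five times. That is the cost of getting a fresh TopN every minute.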

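Sinking data to ClickHouse

ClickHouse (CH) is accessed over JDBC, which requires adding the flink-connector-jdbc dependency and implementing a JDBCSink. The ClickHouse tables use the ReplacingMergeTree engine. The unique primary key of each record, combined with Flink's exactly-once semantics, gives the whole data pipeline exactly-once semantics.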
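
Why a unique key makes replays harmless can be shown with a small model. This is not ClickHouse or Flink code: `ReplacingMergeModel` is a hypothetical stand-in that collapses rows sharing a key and keeps the last one written, which is roughly what ReplacingMergeTree does for rows with the same sorting key when parts are merged. In ClickHouse the merge happens in the background, so duplicates can remain visible until it has run.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of key-based deduplication; each row is {key, payload}.
final class ReplacingMergeModel {
    private ReplacingMergeModel() {}

    // Collapses rows that share a key, keeping the last one written.
    static Map<String, String> merge(List<String[]> rows) {
        Map<String, String> merged = new LinkedHashMap<>();
        for (String[] row : rows) {
            merged.put(row[0], row[1]); // a replayed row overwrites its earlier copy
        }
        return merged;
    }
}
```

If a failure makes the job write the record with key `events:0:2` twice, the merged result still holds one row for that key, so the end result matches a single delivery.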
