
clickhouse_sinker's People

Contributors

ahzamali, artemchekunov, cheng1483x, dependabot[bot], draev, exfly, genzgd, imogthe, jgulick48, momingkotoba, orginux, sashayakovtseva, sundy-li, taiyang-li, testwill, toannhu96, twmb, ww1516123, yenchangchan, yuzhichang


clickhouse_sinker's Issues

Some suggestions for production use

We would like to run clickhouse_sinker as a system-level service that stays resident and adapts automatically to changes in the state of the ClickHouse and Kafka connections, without having to stop it manually.

The current retryTimes setting, if used, will cause large amounts of data loss during ClickHouse cluster maintenance, so it is not suitable for production.

We suggest a LoopWrite that drops the retryTimes setting:

// LoopWrite retries a failed batch indefinitely with exponential backoff,
// doubling the sleep between attempts and capping it at 3600 seconds (1 hour).
func (c *ClickHouse) LoopWrite(metrics []model.Metric) {
	err := c.Write(metrics)
	times := 1
	for err != nil {
		log.Error("saving msg error", err.Error(), "will loop to write the data")
		// back off: 2^times seconds, capped at one hour
		waitTime := time.Duration(math.Min(math.Pow(2, float64(times)), 3600))
		time.Sleep(waitTime * time.Second)
		err = c.Write(metrics)
		times++
	}
}

When a write fails, the sleep time doubles automatically until the write succeeds. There is then no need to shut down clickhouse_sinker before taking ClickHouse offline: clickhouse_sinker can stay resident and will resume sinking on its own within at most one hour.

"no such host" error

With the following configuration I always get the error below. No matter what I change the host to, the same error keeps being reported:
{
  "clickhouse": {
    "ch1": {
      "db": "test",
      "host": "127.0.0.1",
      "maxLifeTime": 300,
      "dnsLoop": false,
      "password": "",
      "retryTimes": 5,
      "port": 9000,
      "username": "default"
    }
  },
  "kafka": {
    "kfk1": {
      "brokers": "localhost:9092",
      "sasl": {
        "password": "",
        "username": ""
      }
    }
  },
  "common": {
    "bufferSize": 90000,
    "minBufferSize": 2000,
    "flushInterval": 5,
    "logLevel": "info"
  }
}

2019/11/20 15:58:39 Initial [clickhouse_sinker] failure: [lookup : no such host]
panic: lookup : no such host

goroutine 1 [running]:
github.com/housepower/clickhouse_sinker/vendor/github.com/wswz/go_commons/app.Run(0x97ae75, 0x11, 0xc0000c7f20, 0xc00008b1d0, 0xc0000c7f00)
/home/go/src/github.com/housepower/clickhouse_sinker/vendor/github.com/wswz/go_commons/app/app.go:28 +0x50a
main.main()
/home/go/src/github.com/housepower/clickhouse_sinker/bin/main.go:41 +0x12d

print task name on buf size log line

It would be great to print the task name of the kafka topic when we see the buf size line.

Why? Because then we could quickly see which task is fully using the max buf size, and is therefore processing at full power at the moment.

current:

2020/02/14 18:59:46 [I] csv_dsp_dsprtbbids_raw tick
2020/02/14 18:59:46 [I] buf size: 3575

expected:

2020/02/14 18:59:46 [I] csv_dsp_dsprtbbids_raw tick
2020/02/14 18:59:46 [I] buf size: 3575 csv_dsp_dsprtbbids_raw
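
A minimal sketch of the requested output format, assuming the flush path knows the task name; the variable names below are placeholders, not the project's actual identifiers:

package main

import "log"

func main() {
	// placeholders standing in for the task's name and current buffer length
	taskName, bufSize := "csv_dsp_dsprtbbids_raw", 3575
	// print the task name on the same line as the buffer size
	log.Printf("[I] buf size: %d %s", bufSize, taskName)
}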

I got this message: Initial [clickhouse_sinker] failure: [driver: bad connection]

here is what I used:
kafka: 2.4.1
clickhouse_sinker: 1.5 (binary file)

here are my config files:

{
  "clickhouse": {
    "ch": {
      "db": "default",
      "host": "192.168.0.111",
      "maxLifeTime": 300,
      "password": "",
      "port": 8123,
      "username": "ck"
    }
  },
  "kafka": {
    "kfk": {
      "brokers": "192.168.0.111:9092",
      "sasl": {
        "password": "",
        "username": ""
      },
      "Version": "2.4.1"
    }
  },
  "common": {
    "bufferSize": 90000,
    "flushInterval": 5,
    "logLevel": "info"
  }
}
{
    "name" : "daily_request",
    "kafka": "kfk",
    "topic": "daily",
    "earliest" : true,
    "consumerGroup" : "group2",
    "parser" : "json",
    "clickhouse" : "ch",
    "tableName" : "daily",
    "@desc_of_autoSchema" : "auto schema will auto fetch the schema from clickhouse",
    "autoSchema" : true,
    "@desc_of_exclude_columns" : "this columns will be excluded by insert SQL ",
    "excludeColumns" : [],
    "bufferSize" : 90000
}

here is my clickhouse: (screenshot)

I am not sure if I configured it correctly.

Consume kafka topic with smallest offset

Hi,

I ran a simple test and saw that when the sinker restarts it records the last committed offset and can pick up where it left off.

Can we restart the sinker and have it consume from the largest/smallest offset instead?

Thanks,
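
One option outside of the sinker itself (not a sinker feature): Kafka's own consumer-group tooling can rewind or fast-forward the committed offset for the sinker's consumer group while the sinker is stopped. For example (group and topic names here are just illustrative):

./bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 \
  --group group2 --topic daily \
  --reset-offsets --to-earliest --execute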

How to write to a ClickHouse Array column

ClickHouse table:
CREATE TABLE test.test(gname String, glist Array(String)) ENGINE = Memory

task/json_reques.json:
{
"name" : "json_test",

    "kafka": "kfk",
    "topic": "test",
    "earliest" : false,
    "consumerGroup" : "testgroup1",
    "parser" : "json",

    "clickhouse" : "ch",
    "tableName" : "test",

    "@desc_of_autoSchema" : "auto schema will auto fetch the schema from clickhouse",
    "autoSchema" : true,
    "@desc_of_exclude_columns" : "this columns will be excluded by insert SQL ",
    "excludeColumns" : [],
    "bufferSize" : 90000

}

Data written to Kafka:
{"gname": "name", "glist":["1","2"]}
{"gname": "name", "glist":['1','2']}

Afterwards, querying ClickHouse shows that the gname column was written, but glist was not.
Two questions:
1. How should Array columns be written?
2. Is writing to ClickHouse Map-type columns supported? If so, please give a demo as well.

Thanks!
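
As a side note on the sample data above (independent of the sinker itself): the second Kafka message uses single quotes, which is not valid JSON, so a standards-compliant JSON parser can only decode the first form. A quick standard-library check:

package main

import (
	"encoding/json"
	"fmt"
)

func main() {
	valid := `{"gname": "name", "glist": ["1", "2"]}`
	invalid := `{"gname": "name", "glist": ['1', '2']}` // single quotes are not legal JSON

	var row struct {
		Gname string   `json:"gname"`
		Glist []string `json:"glist"`
	}
	fmt.Println(json.Unmarshal([]byte(valid), &row), row.Glist) // <nil> [1 2]
	fmt.Println(json.Unmarshal([]byte(invalid), &row))          // returns a syntax error
}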

Apache Pulsar support

Now that we have abstracted the input behind an interface, it would be great to have Apache Pulsar support.

INSERT Date from Unix Timestamp

https://groups.google.com/forum/#!topic/clickhouse/mFbaYKPTIGs
As in the topic above, I think Date and Nullable(Date) should return "string" instead of "int", because a Date can only be inserted as a string like 2020-06-01, not as a Unix timestamp.

func switchType(typ string) string {
	switch typ {
	case "DateTime", "UInt8", "UInt16", "UInt32", "UInt64", "Int8",
		"Int16", "Int32", "Int64", "Nullable(DateTime)",
		"Nullable(UInt8)", "Nullable(UInt16)", "Nullable(UInt32)", "Nullable(UInt64)",
		"Nullable(Int8)", "Nullable(Int16)", "Nullable(Int32)", "Nullable(Int64)":
		return "int"
	case "Array(Date)", "Array(DateTime)", "Array(UInt8)", "Array(UInt16)", "Array(UInt32)",
		"Array(UInt64)", "Array(Int8)", "Array(Int16)", "Array(Int32)", "Array(Int64)":
		return "intArray"
	case "Date", "Nullable(Date)", "String", "FixString", "Nullable(String)":
		return "string"
	case "Array(String)", "Array(FixString)":
		return "stringArray"
	case "Float32", "Float64", "Nullable(Float32)", "Nullable(Float64)":
		return "float"
	case "Array(Float32)", "Array(Float64)":
		return "floatArray"
	case "ElasticDateTime":
		return "ElasticDateTime"
	default:
		panic("unsupport type " + typ)
	}
}

Exactly-once semantics

Hi,

Does ClickHouse sinker guarantee exactly-once semantics? Specifically, can we be sure that each Kafka message is applied to ClickHouse exactly once, i.e. without data loss or duplication?

By the way, is there any design document available for ClickHouse sinker?

Thanks

Exception when initializing conf, with no code changes

panic: json: cannot unmarshal string into Go struct field Config.Kafka of type creator.KafkaConfig

goroutine 1 [running]:
github.com/housepower/clickhouse_sinker/creator.InitConfig(0x9c7203, 0x37, 0xbf8b4003d99b3534)
E:/coding/szhq-project/go-coding/clickhouse_sinker/creator/config.go:47 +0x368
main.main.func1(0x9b2c66, 0xc)
E:/coding/szhq-project/go-coding/clickhouse_sinker/bin/main.go:41 +0x64
github.com/wswz/go_commons/app.Run(0x9b5c5f, 0x11, 0xc00009df20, 0xc000061170, 0xc00009df00)
E:/global_path/pkg/mod/github.com/wswz/[email protected]/app/app.go:26 +0xb4
main.main()
E:/coding/szhq-project/go-coding/clickhouse_sinker/bin/main.go:40 +0x134

{
  "clickhouse": {
    "ch1": {
      "db": "default",
      "host": "cdh01",
      "maxLifeTime": 300,
      "password": "fangte2019",
      "retryTimes": 5,
      "port": 9020,
      "username": "default"
    }
  },
  "kafka": {
    "kfk1": {
      "brokers": "cdh01:9092,cdh02:9092,cdh03:9092",
      "sasl": {
        "password": "",
        "username": ""
      }
    },
    "version": "2.1.0"
  },
  "common": {
    "bufferSize": 90000,
    "minBufferSize": 2000,
    "flushInterval": 5,
    "logLevel": "info"
  }
}

Big memory consumption

We stopped clickhouse-sinker for some time (1 h) to build up a big lag in Kafka processing (to simulate many queued records, i.e. a backlog in Kafka).

We set the config.json bufferSize and minBufferSize to very low values to minimize memory consumption:
  },
  "common": {
    "bufferSize": 1000,
    "minBufferSize": 500,
    "flushInterval": 5,
    "logLevel": "info"
  }
}

and ran 20 different tasks in CSV mode with a minimal task config (no buffer size defined or similar):
{
  "name": "csv_dsp_dyncreuserevents_raw",
  "kafka": "kfk1",
  "topic": "dyncreuserevents_raw",
  "earliest": true,
  "consumerGroup": "chrawctb6",
  "parser": "csv",
  "csvFormat": [
    "shopid", "userid", "itemid", "eventtype", "eventtime", "uemh", "sessionid",
    "jsonaddon", "campaignid", "creativeid", "reason", "reasonsource", "tenantid"
  ],
  "delimiter": ",",
  "clickhouse": "ch1dsp",
  "tableName": "dyncreuserevents_raw_raw",
  "@desc_of_autoSchema": "auto schema will auto fetch the schema from clickhouse",
  "autoSchema": true,
  "@desc_of_exclude_columns": "this columns will be excluded by insert SQL",
  "excludeColumns": [ "eventDate", "eventTimeStamp" ]
}

Kafka message values are in the range of 100-250 characters in length.

However, when we run the app we see memory consumption climb to 2 GB, at which point it is killed by Docker for being out of memory.

20 tasks * 1000 buffered messages * 250 bytes per message = 5,000,000 bytes.

We checked the logs and the buffer is 1000 messages:
2020/02/14 21:34:29 [I] buf size: 1000
2020/02/14 21:34:29 [I] buf size: 1000
2020/02/14 21:34:29 [I] buf size: 1000
2020/02/14 21:34:29 [I] buf size: 1000
2020/02/14 21:34:29 [I] buf size: 1000
...

Could there be something like a memory leak going on? The input is 5,000,000 bytes while memory consumption is more than 2 GB.

When we start clickhouse_sinker without any lag in Kafka processing (no artificial stop to build up a big queue) we can run it for days... However, with big backpressure we have a problem.

Version used:
https://hub.docker.com/layers/artemchekunov/clickhouse_sinker/latest/images/sha256-b9c3f30eb936d31794fa103ac8e2c3901abd055612764980a7fb3b398ff7232c?context=explore

Error when consuming from a remote Kafka

The company has a remote Kafka environment. Locally, consuming with ./bin/kafka-console-consumer.sh --bootstrap-server 10.2.3.4:6551 --from-beginning --topic kafka_topic prints data fine, but when I put the same settings into clickhouse_sinker's config.json and run it, I get this error:
2019/07/23 20:17:28 Initial [clickhouse_sinker] complete
panic: kafka: client has run out of available brokers to talk to (Is your cluster reachable?)

goroutine 129 [running]:
github.com/housepower/clickhouse_sinker/task.(*TaskService).Run(0xc0001f4180)
/Users/sundy/pan/gopath/src/github.com/housepower/clickhouse_sinker/task/task.go:49 +0x5fd
created by main.(*Sinker).Run
/Users/sundy/pan/gopath/src/github.com/housepower/clickhouse_sinker/bin/main.go:77 +0x59

I would like to ask what the cause is.

Prometheus Metrics

I see that Prometheus Metrics is supported but there is no documentation on how to use it. Could you explain how we would go about doing this?

Thank you for this great library! It has made our lives so much easier.

2020 Q1 TODOs

  • CI/CD with integration tests (Docker, Kafka, ClickHouse)

  • More detailed docs

  • Kafka consumer committer to enable atomic insert. (Now it commits offsets after flushing to clickhouse)

  • Prometheus statistics. @taiyang-li @ArtemChekunov

  • Enable parallel parsing

  • auto task assignment.

  • RowBinary format support.

  • Dynamic configs

How to import nested JSON

I want to import MySQL binlogs into ClickHouse. The data format in Kafka is:
{"database":"tmp","table":"test","type":"delete","ts":1563867465,"xid":373627,"xoffset":0,"data":{"user_id":232,"role_id":11}}
How can this kind of nested JSON be imported into ClickHouse? If nesting is not supported, can the JSON content inside data be extracted directly?
By the way, what is the purpose of the example clickhouse_sinker/conf/tasks/falcon_sample.json?

Kafka consumer error at runtime

After initialization completes and consumption starts, this error appears. I am using version 1.5. How can this be resolved?
2020/04/26 15:10:19 Initial [clickhouse_sinker]
&creator.Config{
Kafka: {
"kfk": &creator.KafkaConfig{
Brokers: "127.0.0.1:19091,127.0.0.1:19092,127.0.0.1:19093",
Sasl: struct { Password string; Username string }{
Password: "",
Username: "",
},
Version: "0.10.2.0",
},
},
Clickhouse: {
"ch": &creator.ClickHouseConfig{
Db: "ailpha",
Host: "127.0.0.1",
Port: 9101,
Username: "default",
Password: "",
MaxLifeTime: 300,
DnsLoop: false,
},
},
Tasks: []*creator.Task{
&creator.Task{
Name: "request",
Kafka: "kfk",
Topic: "daily",
ConsumerGroup: "testgroup",
Earliest: true,
Parser: "json",
CsvFormat: []string{},
Delimiter: "",
Clickhouse: "ch",
TableName: "daily",
AutoSchema: true,
ExcludeColumns: []string{},
Dims: []struct { Name string; Type string }{},
Metrics: []struct { Name string; Type string }{},
FlushInterval: 0,
BufferSize: 90000,
},
},
Common: struct { FlushInterval int; BufferSize int; LogLevel string }{
FlushInterval: 5,
BufferSize: 90000,
LogLevel: "debug",
},
}
2020/04/26 15:10:19 [I] Prepare sql=> INSERT INTO ailpha.daily () VALUES ()
2020/04/26 15:10:19 Initial [clickhouse_sinker] complete
2020/04/26 15:10:19 [I] start to dial kafka 127.0.0.1:19091,127.0.0.1:19092,127.0.0.1:19093
2020/04/26 15:10:19 [E] Error from consumer: EOF %!v(MISSING)
2020/04/26 15:10:19 [E] Error from consumer: EOF %!v(MISSING)
2020/04/26 15:10:19 [E] Error from consumer: EOF %!v(MISSING)
2020/04/26 15:10:19 [E] Error from consumer: EOF %!v(MISSING)
...

Running on Windows

Could a Windows build be included in the releases? Many people on the team use Windows but are not familiar with Go and related tooling.

As it is, it is not very convenient to use.

MergeTree requires a date column

Hello,
Our project has to use MergeTree, and MergeTree requires a date column. Will it be possible to support a date column later?

Manual schema config

Hi!
Is there an example of how to manually configure a schema for a ClickHouse table?
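
For what it's worth, a guessed sketch of what a manually configured schema might look like in a task file, based only on the Dims/Metrics fields visible in the config dump of the "Kafka consumer error at runtime" issue above; the key names and type strings here are assumptions, not confirmed by any docs in this page:

{
    "name": "manual_schema_example",
    "kafka": "kfk",
    "topic": "daily",
    "consumerGroup": "group1",
    "parser": "json",
    "clickhouse": "ch",
    "tableName": "daily",
    "autoSchema": false,
    "dims": [
        { "name": "gname", "type": "String" },
        { "name": "ts", "type": "DateTime" }
    ],
    "metrics": [
        { "name": "value", "type": "Float64" }
    ],
    "bufferSize": 90000
}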

No data is inserted into ClickHouse at runtime

Common: struct { FlushInterval int; BufferSize int; MinBufferSize int; LogLevel string }{
FlushInterval: 5,
BufferSize: 90000,
MinBufferSize: 0,
LogLevel: "debug",
},
}
2020/06/18 11:08:49 [I] Prepare sql=> INSERT INTO default.daily (log,stream,master_url) VALUES (?,?,?)
2020/06/18 11:08:49 Initial [clickhouse_sinker] complete
2020/06/18 11:08:49 [I] start to dial kafka test0:9092
2020/06/18 11:08:49 [I] Run http server http://0.0.0.0:2112
2020/06/18 11:08:49 [I] TaskService daily_request TaskService has started
2020/06/18 11:08:54 [I] daily_request: tick
2020/06/18 11:08:59 [I] daily_request: tick
2020/06/18 11:09:04 [I] daily_request: tick
2020/06/18 11:09:09 [I] daily_request: tick
2020/06/18 11:09:14 [I] daily_request: tick
2020/06/18 11:09:19 [I] daily_request: tick
2020/06/18 11:09:24 [I] daily_request: tick
2020/06/18 11:09:29 [I] daily_request: tick
2020/06/18 11:09:34 [I] daily_request: tick
2020/06/18 11:09:39 [I] daily_request: tick
2020/06/18 11:09:44 [I] daily_request: tick
2020/06/18 11:09:49 [I] daily_request: tick

A question about how to use it

When I downloaded https://github.com/housepower/clickhouse_sinker/releases/download/v1.3/clickhouse_sinker_1.3_linux_amd64.zip and unzipped it, it contained two files, clickhouse_sinker and README.md. I manually wrote a conf.json file in the same directory, as follows:
{
  "clickhouse": {
    "ch1": {
      "db": "default",
      "host": "127.0.0.1",
      "dnsLoop": false,
      "maxLifeTime": 300,
      "password": "123456",
      "port": 9000,
      "username": "default"
    }
  },
  "kafka": {
    "kfk1": {
      "brokers": "127.0.0.1:9092",
      "sasl": {
        "password": "",
        "username": ""
      }
    }
  },
  "common": {
    "bufferSize": 90000,
    "flushInterval": 5,
    "logLevel": "info"
  }
}
Then I ran: ./clickhouse_sinker -conf binlog.json
It reports the error:
2019/07/12 11:31:52 Initial [clickhouse_sinker]
panic: stat binlog.json/config.json: not a directory

goroutine 1 [running]:
github.com/housepower/clickhouse_sinker/creator.InitConfig(0x7ffe95ae746b, 0xb, 0x1fdd80)
/Users/sundy/pan/gopath/src/github.com/housepower/clickhouse_sinker/creator/config.go:41 +0x371
main.main.func1(0x9078dc, 0xc)
/Users/sundy/pan/gopath/src/github.com/housepower/clickhouse_sinker/bin/main.go:42 +0x5d
github.com/housepower/clickhouse_sinker/vendor/github.com/wswz/go_commons/app.Run(0x90a6dd, 0x11, 0xc0000f9f28, 0xc0000bb1b0, 0xc0000f9f08)
/Users/sundy/pan/gopath/src/github.com/housepower/clickhouse_sinker/vendor/github.com/wswz/go_commons/app/app.go:26 +0xad
main.main()
/Users/sundy/pan/gopath/src/github.com/housepower/clickhouse_sinker/bin/main.go:41 +0x13a
How is this supposed to be used?
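
Judging from the panic above (it tries to stat binlog.json/config.json), the -conf flag appears to expect a directory containing a config.json plus a tasks/ subdirectory, not a single file. Something like the following layout is presumably what the binary is looking for; this is inferred from the error message and the conf/tasks path mentioned elsewhere in these issues, not from official docs:

conf/
    config.json
    tasks/
        my_task.json

./clickhouse_sinker -conf conf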

Tuning for high load.

Hello!

I've run into a performance issue.
There is a topic in the Kafka cluster with a stream of about 125K messages per second; it looks like either clickhouse_sinker cannot consume that stream or something is wrong in the configuration. I tried playing with bufferSize, num, and flushInterval, but without success.

   "common": {
    "bufferSize": 9000000,
    "minBufferSize": 2000,
    "flushInterval": 0,
    "logLevel": "info"
  }


Do you have any idea what I'm doing wrong?

Thank You!

Signal handling in go_commons often misbehaves

The WaitForExitSign signal listener in the utils module of go_commons often receives a SIGHUP signal, which causes the service to exit. I am on CentOS 7.5. The exact cause is unclear. I had to remove the handling of that signal to keep the service running normally. Do you know what the cause might be?

[E] saving msg error code: 252

Using clickhouse_sinker, I get a large number of
[E] saving msg error code: 252, message: Too many parts (300). Merges are processing significantly slower than inserts. will loop to write the data
errors.

How can this be resolved?

csv parser in csv.go should support quoted strings and not split them into "sub-fields"

csv.go has
msgs := strings.Split(msgData, c.delimiter)

However, for real CSV parsing a proper decoder should be used.

Why?

The example below is also valid CSV:

client_id,client_name,client_age

1,"DO,NOT,SPLIT",42
2,Daniel,26
3,Vincent,32

Since client_name in line 1 contains commas, the field is quoted and the line is therefore still valid CSV.

Therefore an existing CSV parsing library would probably be a much better fit.

https://golang.org/pkg/encoding/csv/ specifies the basic rules for quoted strings and how double quotes are interpreted.
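
A minimal sketch (not taken from csv.go) of how the standard library's encoding/csv handles the quoted field that a plain strings.Split breaks apart:

package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

func main() {
	line := `1,"DO,NOT,SPLIT",42`

	// naive split breaks the quoted field into sub-fields (5 fields)
	fmt.Println(strings.Split(line, ","))

	// encoding/csv honours quoting (3 fields: 1, DO,NOT,SPLIT, 42)
	r := csv.NewReader(strings.NewReader(line))
	record, err := r.Read()
	fmt.Println(record, err)
}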

Installation error

go get -u github.com/housepower/clickhouse_sinker
package github.com/housepower/clickhouse_sinker: no Go files in /root/go/src/github.com/housepower/clickhouse_sinker

centos7.6
golang-1.12

The statistics metrics push hangs the program

"statistics": {
  "enable": true,
  "pushInterval": 5,
  "pushGateWayAddrs": ["192.168.1.1:9091"]
}

I configured this Prometheus pushgateway address; the address is reachable via telnet, but when I run the program it just hangs here:
2020/03/24 09:51:42 [I] Run http server 0.0.0.0:2112

With enable: false, everything works again.

The FixedString data type is not supported

Hello, my ClickHouse tables contain the FixedString data type, and when I run the service clickhouse_sinker panics as follows:
panic: unsupport type FixedString(16)

goroutine 58 [running]:
github.com/housepower/clickhouse_sinker/util.switchType(...)
/home/go/src/ch_sinker/util/value.go:74
github.com/housepower/clickhouse_sinker/util.GetValueByType(0xbb78c0, 0xc000129990, 0xc0002f40c0, 0xa, 0xc0002de8b0)
/home/go/src/ch_sinker/util/value.go:26 +0xdd0
github.com/housepower/clickhouse_sinker/output.(*ClickHouse).Write(0xc00016b8c0, 0xc000404000, 0x1, 0x3e8, 0x0, 0x0)
/home/go/src/ch_sinker/output/clickhouse.go:103 +0x22c
github.com/housepower/clickhouse_sinker/output.(*ClickHouse).LoopWrite(0xc00016b8c0, 0xc000404000, 0x1, 0x3e8)
/home/go/src/ch_sinker/output/clickhouse.go:131 +0x50
github.com/housepower/clickhouse_sinker/task.(*Service).flush(0xc000276e40, 0xc000404000, 0x1, 0x3e8)
/home/go/src/ch_sinker/task/task.go:119 +0x11f
github.com/housepower/clickhouse_sinker/task.(*Service).Run(0xc000276e40)
/home/go/src/ch_sinker/task/task.go:99 +0x510
created by main.(*Sinker).Run
/home/go/src/ch_sinker/bin/main.go:192 +0x61

Basically, value.go has no support for the FixedString data type, and my ClickHouse column is FixedString(16). How should this type be handled in this case, or have I got something wrong somewhere?
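
For reference, a minimal sketch of one possible direction, based on the switchType function quoted in the "INSERT Date from Unix Timestamp" issue above: match parameterized types such as FixedString(16) by prefix instead of by exact name. This is an illustration only, not the project's actual fix:

package main

import (
	"fmt"
	"strings"
)

// switchTypeSketch is a hypothetical variant of switchType that maps
// FixedString(N) to "string" by prefix rather than panicking.
func switchTypeSketch(typ string) string {
	switch {
	case strings.HasPrefix(typ, "FixedString"),
		strings.HasPrefix(typ, "Nullable(FixedString"):
		return "string"
	case typ == "String", typ == "Nullable(String)", typ == "Date", typ == "Nullable(Date)":
		return "string"
	default:
		panic("unsupport type " + typ)
	}
}

func main() {
	fmt.Println(switchTypeSketch("FixedString(16)")) // string
}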

[Question] Custom Parser support on nested json object or json array

I see that a Custom Parser is supported, but I wonder whether it supports parsing nested JSON objects and how to configure this. For example, I have a JSON object like the one below and want to parse the nested field event.transaction.transaction_id or the JSON array field event.product_id and insert them into columns of the same name in the ClickHouse table.

{
   "event": {
         "transaction": {
              "transaction_id": "123", // insert to column transaction_id in clickhouse table
              "event_time": 151222222222
          },
         "product_id": [1,2,3,4,5] // insert to array column product_id in clickhouse table
    }
}

Thank you for your great work!
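
Independent of how the sinker's custom parser is wired up, here is a plain encoding/json sketch of pulling those two nested values out of such a message (field names taken from the example above; this is not sinker code):

package main

import (
	"encoding/json"
	"fmt"
)

type event struct {
	Event struct {
		Transaction struct {
			TransactionID string `json:"transaction_id"`
			EventTime     int64  `json:"event_time"`
		} `json:"transaction"`
		ProductID []int64 `json:"product_id"`
	} `json:"event"`
}

func main() {
	msg := `{"event":{"transaction":{"transaction_id":"123","event_time":151222222222},"product_id":[1,2,3,4,5]}}`
	var e event
	if err := json.Unmarshal([]byte(msg), &e); err != nil {
		panic(err)
	}
	// values that would map to the transaction_id and product_id columns
	fmt.Println(e.Event.Transaction.TransactionID, e.Event.ProductID)
}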

Nested JSON question

Kafka produces nested JSON like the example below. If I want to write the contents of data into ClickHouse depending on the value of type, how should I do that?

{
    "database": "test",
    "table": "dim_table1",
    "type": "insert",
    "ts": 1543912762,
    "xid": 2209,
    "commit": true,
    "data": {
        "id": 5815,
        "name": "jay"
    }
}
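
Independent of whether the sinker supports this natively, a plain encoding/json sketch of the kind of preprocessing being asked about: decode the envelope, switch on type, and keep data as raw JSON to forward to the table (field names taken from the sample above; not sinker code):

package main

import (
	"encoding/json"
	"fmt"
)

type binlogEvent struct {
	Database string          `json:"database"`
	Table    string          `json:"table"`
	Type     string          `json:"type"`
	Ts       int64           `json:"ts"`
	Data     json.RawMessage `json:"data"`
}

func main() {
	msg := `{"database":"test","table":"dim_table1","type":"insert","ts":1543912762,"xid":2209,"commit":true,"data":{"id":5815,"name":"jay"}}`
	var ev binlogEvent
	if err := json.Unmarshal([]byte(msg), &ev); err != nil {
		panic(err)
	}
	switch ev.Type {
	case "insert":
		// ev.Data holds the inner JSON object that would be written to ClickHouse
		fmt.Println(ev.Table, string(ev.Data))
	default:
		// deletes/updates would need their own handling
		fmt.Println("skipping", ev.Type)
	}
}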

How to check for errors

It starts up normally, but there is no data in ClickHouse. How should I troubleshoot this? Is there a log I can check?

Null values are converted to default values when inserted into ClickHouse; they should be preserved as nulls

When null is converted to a default value on insert into ClickHouse, it causes errors in statistics computed over the data.

Our use case:
There are multiple event types whose fields together form one wide table.
For example:
Event A fields: "kind", "onlyid", "app_version", "app_time"
Event B fields: "kind", "onlyid", "app_version", "open_number"

Wide table fields:
"kind", "onlyid", "app_version", "app_time", "open_number"

If event A's app_time field and event B's open_number field are set to default values, the statistics for events A and B become inaccurate.

Error during insertions

What happens if there is an error while inserting into the database? Would clickhouse-sinker try to reinsert the data?

panic: unsupport type Decimal(18, 2)

tast.json :

{
    "name": "abc",

    "kafka": "kfk",
    "topic": "abc",
    "consumerGroup": "abc_group",

    "parser": "json",
    "clickhouse": "clickhouse",

    "tableName": "abc",
    "autoSchema": true,
    "excludeColumns": [],
    "bufferSize": 90000
}

Issue while running ./clickhouse_sinker -conf conf

I am facing an issue while running the command ./clickhouse_sinker -conf conf; it says command not found. I have configured the conf file according to my requirements.
Kindly help.

config.json

{
  "clickhouse": {
    "ch1": {
      "db": "default",
      "host": "127.0.0.1",
      "dnsLoop": false,
      "maxLifeTime": 300,
      "password": "1234",
      "port": 9900,
      "username": ""
    }
  },
  "kafka": {
    "kfk1": {
      "brokers": "127.0.0.1:9092",
      "sasl": {
        "password": "",
        "username": ""
      }
    }
  },
  "common": {
    "bufferSize": 90000,
    "flushInterval": 5,
    "logLevel": "info"
  }
}
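
"command not found" for a path given explicitly as ./clickhouse_sinker usually means the file is missing from the current directory, is not executable, or was built for a different platform, rather than anything sinker-specific. A quick check with standard shell commands (nothing specific to this project):

ls -l clickhouse_sinker        # is the binary actually here?
chmod +x clickhouse_sinker     # make sure it is executable
file clickhouse_sinker         # confirm it matches your OS/architecture
./clickhouse_sinker -conf conf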
