
clickhouse_sinker's People

Contributors

ahzamali, artemchekunov, cheng1483x, dependabot[bot], draev, exfly, genzgd, imogthe, jgulick48, momingkotoba, orginux, sashayakovtseva, sundy-li, taiyang-li, testwill, toannhu96, twmb, ww1516123, yenchangchan, yuzhichang


clickhouse_sinker's Issues

Some suggestions for production use

We would like to run clickhouse_sinker as a system-level service that stays resident and adapts automatically to changes in the state of the ClickHouse and Kafka connections, without having to stop it manually.

The current retryTimes setting, if used, will cause large amounts of data loss during ClickHouse cluster maintenance, so it is not suitable for production.

We suggest a LoopWrite that drops the retryTimes setting:

// LoopWrite retries a failed batch indefinitely with exponential backoff,
// doubling the sleep between attempts and capping it at 3600 seconds (1 hour).
func (c *ClickHouse) LoopWrite(metrics []model.Metric) {
	err := c.Write(metrics)
	times := 1
	for err != nil {
		log.Error("saving msg error", err.Error(), "will loop to write the data")
		// back off: 2^times seconds, capped at one hour
		waitTime := time.Duration(math.Min(math.Pow(2, float64(times)), 3600))
		time.Sleep(waitTime * time.Second)
		err = c.Write(metrics)
		times++
	}
}

When a write fails, the sleep time doubles automatically until the write succeeds. There is then no need to shut down clickhouse_sinker before taking ClickHouse offline: clickhouse_sinker can stay resident and will resume sinking on its own within at most one hour.

"no such host" error

With the following configuration I always get the error below. No matter what I change the host to, the same error keeps being reported:
{
  "clickhouse": {
    "ch1": {
      "db": "test",
      "host": "127.0.0.1",
      "maxLifeTime": 300,
      "dnsLoop": false,
      "password": "",
      "retryTimes": 5,
      "port": 9000,
      "username": "default"
    }
  },
  "kafka": {
    "kfk1": {
      "brokers": "localhost:9092",
      "sasl": {
        "password": "",
        "username": ""
      }
    }
  },
  "common": {
    "bufferSize": 90000,
    "minBufferSize": 2000,
    "flushInterval": 5,
    "logLevel": "info"
  }
}

2019/11/20 15:58:39 Initial [clickhouse_sinker] failure: [lookup : no such host]
panic: lookup : no such host

goroutine 1 [running]:
github.com/housepower/clickhouse_sinker/vendor/github.com/wswz/go_commons/app.Run(0x97ae75, 0x11, 0xc0000c7f20, 0xc00008b1d0, 0xc0000c7f00)
/home/go/src/github.com/housepower/clickhouse_sinker/vendor/github.com/wswz/go_commons/app/app.go:28 +0x50a
main.main()
/home/go/src/github.com/housepower/clickhouse_sinker/bin/main.go:41 +0x12d

print task name on buf size log line

It would be great to print the task name of the kafka topic when we see the buf size line.

Why? Because then we could quickly see which task is fully using the max buf size, and is therefore processing at full power at the moment.

current:

2020/02/14 18:59:46 [I] csv_dsp_dsprtbbids_raw tick
2020/02/14 18:59:46 [I] buf size: 3575

expected:

2020/02/14 18:59:46 [I] csv_dsp_dsprtbbids_raw tick
2020/02/14 18:59:46 [I] buf size: 3575 csv_dsp_dsprtbbids_raw
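
A minimal sketch of the requested output format, assuming the flush path knows the task name; the variable names below are placeholders, not the project's actual identifiers:

package main

import "log"

func main() {
	// placeholders standing in for the task's name and current buffer length
	taskName, bufSize := "csv_dsp_dsprtbbids_raw", 3575
	// print the task name on the same line as the buffer size
	log.Printf("[I] buf size: %d %s", bufSize, taskName)
}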

I got this message: Initial [clickhouse_sinker] failure: [driver: bad connection]

here is what I used:
kafka: 2.4.1
clickhouse_sinker: 1.5 (binary file)

here are my config files:

{
  "clickhouse": {
    "ch": {
      "db": "default",
      "host": "192.168.0.111",
      "maxLifeTime": 300,
      "password": "",
      "port": 8123,
      "username": "ck"
    }
  },
  "kafka": {
    "kfk": {
      "brokers": "192.168.0.111:9092",
      "sasl": {
        "password": "",
        "username": ""
      },
      "Version": "2.4.1"
    }
  },
  "common": {
    "bufferSize": 90000,
    "flushInterval": 5,
    "logLevel": "info"
  }
}
{
    "name" : "daily_request",
    "kafka": "kfk",
    "topic": "daily",
    "earliest" : true,
    "consumerGroup" : "group2",
    "parser" : "json",
    "clickhouse" : "ch",
    "tableName" : "daily",
    "@desc_of_autoSchema" : "auto schema will auto fetch the schema from clickhouse",
    "autoSchema" : true,
    "@desc_of_exclude_columns" : "this columns will be excluded by insert SQL ",
    "excludeColumns" : [],
    "bufferSize" : 90000
}

here is my clickhouse: (screenshot)

I am not sure if I configured it correctly.

Consume kafka topic with smallest offset

Hi,

I ran a simple test and saw that when the sinker restarts it records the last committed offset and can pick up where it left off.

Can we restart the sinker and have it consume from the largest/smallest offset instead?

Thanks,
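
One option outside of the sinker itself (not a sinker feature): Kafka's own consumer-group tooling can rewind or fast-forward the committed offset for the sinker's consumer group while the sinker is stopped. For example (group and topic names here are just illustrative):

./bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 \
  --group group2 --topic daily \
  --reset-offsets --to-earliest --execute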

How to write to a ClickHouse Array column

ClickHouse table:
CREATE TABLE test.test(gname String, glist Array(String)) ENGINE = Memory

task/json_reques.json:
{
"name" : "json_test",

    "kafka": "kfk",
    "topic": "test",
    "earliest" : false,
    "consumerGroup" : "testgroup1",
    "parser" : "json",

    "clickhouse" : "ch",
    "tableName" : "test",

    "@desc_of_autoSchema" : "auto schema will auto fetch the schema from clickhouse",
    "autoSchema" : true,
    "@desc_of_exclude_columns" : "this columns will be excluded by insert SQL ",
    "excludeColumns" : [],
    "bufferSize" : 90000

}

Data written to Kafka:
{"gname": "name", "glist":["1","2"]}
{"gname": "name", "glist":['1','2']}

Afterwards, querying ClickHouse shows that the gname column was written, but glist was not.
Two questions:
1. How should Array columns be written?
2. Is writing to ClickHouse Map-type columns supported? If so, please give a demo as well.

Thanks!
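
As a side note on the sample data above (independent of the sinker itself): the second Kafka message uses single quotes, which is not valid JSON, so a standards-compliant JSON parser can only decode the first form. A quick standard-library check:

package main

import (
	"encoding/json"
	"fmt"
)

func main() {
	valid := `{"gname": "name", "glist": ["1", "2"]}`
	invalid := `{"gname": "name", "glist": ['1', '2']}` // single quotes are not legal JSON

	var row struct {
		Gname string   `json:"gname"`
		Glist []string `json:"glist"`
	}
	fmt.Println(json.Unmarshal([]byte(valid), &row), row.Glist) // <nil> [1 2]
	fmt.Println(json.Unmarshal([]byte(invalid), &row))          // returns a syntax error
}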

Apache Pulsar support

Now that we have abstracted the input behind an interface, it would be great to have Apache Pulsar support.

INSERT Date from Unix Timestamp

https://groups.google.com/forum/#!topic/clickhouse/mFbaYKPTIGs
As in the topic above, I think Date and Nullable(Date) should return "string" instead of "int", because a Date can only be inserted as a string like 2020-06-01, not as a Unix timestamp.

func switchType(typ string) string {
	switch typ {
	case "DateTime", "UInt8", "UInt16", "UInt32", "UInt64", "Int8",
		"Int16", "Int32", "Int64", "Nullable(DateTime)",
		"Nullable(UInt8)", "Nullable(UInt16)", "Nullable(UInt32)", "Nullable(UInt64)",
		"Nullable(Int8)", "Nullable(Int16)", "Nullable(Int32)", "Nullable(Int64)":
		return "int"
	case "Array(Date)", "Array(DateTime)", "Array(UInt8)", "Array(UInt16)", "Array(UInt32)",
		"Array(UInt64)", "Array(Int8)", "Array(Int16)", "Array(Int32)", "Array(Int64)":
		return "intArray"
	case "Date", "Nullable(Date)", "String", "FixString", "Nullable(String)":
		return "string"
	case "Array(String)", "Array(FixString)":
		return "stringArray"
	case "Float32", "Float64", "Nullable(Float32)", "Nullable(Float64)":
		return "float"
	case "Array(Float32)", "Array(Float64)":
		return "floatArray"
	case "ElasticDateTime":
		return "ElasticDateTime"
	default:
		panic("unsupport type " + typ)
	}
}

Exactly-once semantics

Hi,

Does ClickHouse sinker guarantee exactly-once semantics? Specifically, can we be sure that each Kafka message is applied to ClickHouse exactly once, i.e. without data loss or duplication?

By the way, is there any design document available for ClickHouse sinker?

Thanks

Exception when initializing conf, with no code changes

panic: json: cannot unmarshal string into Go struct field Config.Kafka of type creator.KafkaConfig

goroutine 1 [running]:
github.com/housepower/clickhouse_sinker/creator.InitConfig(0x9c7203, 0x37, 0xbf8b4003d99b3534)
E:/coding/szhq-project/go-coding/clickhouse_sinker/creator/config.go:47 +0x368
main.main.func1(0x9b2c66, 0xc)
E:/coding/szhq-project/go-coding/clickhouse_sinker/bin/main.go:41 +0x64
github.com/wswz/go_commons/app.Run(0x9b5c5f, 0x11, 0xc00009df20, 0xc000061170, 0xc00009df00)
E:/global_path/pkg/mod/github.com/wswz/[email protected]/app/app.go:26 +0xb4
main.main()
E:/coding/szhq-project/go-coding/clickhouse_sinker/bin/main.go:40 +0x134

{
  "clickhouse": {
    "ch1": {
      "db": "default",
      "host": "cdh01",
      "maxLifeTime": 300,
      "password": "fangte2019",
      "retryTimes": 5,
      "port": 9020,
      "username": "default"
    }
  },
  "kafka": {
    "kfk1": {
      "brokers": "cdh01:9092,cdh02:9092,cdh03:9092",
      "sasl": {
        "password": "",
        "username": ""
      }
    },
    "version": "2.1.0"
  },
  "common": {
    "bufferSize": 90000,
    "minBufferSize": 2000,
    "flushInterval": 5,
    "logLevel": "info"
  }
}

Big memory consumption

We stopped clickhouse-sinker for some time (1 h) to build up a big lag in Kafka processing (to simulate many queued records, i.e. a backlog in Kafka).

We set the config.json bufferSize and minBufferSize to very low values to minimize memory consumption:
  },
  "common": {
    "bufferSize": 1000,
    "minBufferSize": 500,
    "flushInterval": 5,
    "logLevel": "info"
  }
}

and ran 20 different tasks in CSV mode with a minimal task config (no buffer size defined or similar):
{
  "name": "csv_dsp_dyncreuserevents_raw",
  "kafka": "kfk1",
  "topic": "dyncreuserevents_raw",
  "earliest": true,
  "consumerGroup": "chrawctb6",
  "parser": "csv",
  "csvFormat": [
    "shopid", "userid", "itemid", "eventtype", "eventtime", "uemh", "sessionid",
    "jsonaddon", "campaignid", "creativeid", "reason", "reasonsource", "tenantid"
  ],
  "delimiter": ",",
  "clickhouse": "ch1dsp",
  "tableName": "dyncreuserevents_raw_raw",
  "@desc_of_autoSchema": "auto schema will auto fetch the schema from clickhouse",
  "autoSchema": true,
  "@desc_of_exclude_columns": "this columns will be excluded by insert SQL",
  "excludeColumns": [ "eventDate", "eventTimeStamp" ]
}

Kafka message values are in the range of 100-250 characters in length.

However, when we run the app we see memory consumption climb to 2 GB, at which point it is killed by Docker for being out of memory.

20 tasks * 1000 buffered messages * 250 bytes per message = 5,000,000 bytes.

We checked the logs and the buffer is 1000 messages:
2020/02/14 21:34:29 [I] buf size: 1000
2020/02/14 21:34:29 [I] buf size: 1000
2020/02/14 21:34:29 [I] buf size: 1000
2020/02/14 21:34:29 [I] buf size: 1000
2020/02/14 21:34:29 [I] buf size: 1000
...

Could there be something like a memory leak going on? The input is 5,000,000 bytes while memory consumption is more than 2 GB.

When we start clickhouse_sinker without any lag in Kafka processing (no artificial stop to build up a big queue) we can run it for days... However, with big backpressure we have a problem.

Version used:
https://hub.docker.com/layers/artemchekunov/clickhouse_sinker/latest/images/sha256-b9c3f30eb936d31794fa103ac8e2c3901abd055612764980a7fb3b398ff7232c?context=explore

Error when consuming from a remote Kafka

The company has a remote Kafka environment. Locally, consuming with ./bin/kafka-console-consumer.sh --bootstrap-server 10.2.3.4:6551 --from-beginning --topic kafka_topic prints data fine, but when I put the same settings into clickhouse_sinker's config.json and run it, I get this error:
2019/07/23 20:17:28 Initial [clickhouse_sinker] complete
panic: kafka: client has run out of available brokers to talk to (Is your cluster reachable?)

goroutine 129 [running]:
github.com/housepower/clickhouse_sinker/task.(*TaskService).Run(0xc0001f4180)
/Users/sundy/pan/gopath/src/github.com/housepower/clickhouse_sinker/task/task.go:49 +0x5fd
created by main.(*Sinker).Run
/Users/sundy/pan/gopath/src/github.com/housepower/clickhouse_sinker/bin/main.go:77 +0x59

I would like to ask what the cause is.

Prometheus Metrics

I see that Prometheus Metrics is supported but there is no documentation on how to use it. Could you explain how we would go about doing this?

Thank you for this great library! It has made our lives so much easier.

2020 Q1 TODOs

  • CI/CD with integration tests (Docker, Kafka, ClickHouse)

  • More detailed docs

  • Kafka consumer committer to enable atomic insert. (Now it commits offsets after flushing to clickhouse)

  • Prometheus statistics. @taiyang-li @ArtemChekunov

  • Enable parallel parsing

  • auto task assignment.

  • RowBinary format support.

  • Dynamic configs

How to import nested JSON

I want to import MySQL binlogs into ClickHouse. The data format in Kafka is:
{"database":"tmp","table":"test","type":"delete","ts":1563867465,"xid":373627,"xoffset":0,"data":{"user_id":232,"role_id":11}}
How can this kind of nested JSON be imported into ClickHouse? If nesting is not supported, can the JSON content inside data be extracted directly?
By the way, what is the purpose of the example clickhouse_sinker/conf/tasks/falcon_sample.json?

Kafka consumer error at runtime

After initialization completes and consumption starts, this error appears. I am using version 1.5. How can this be resolved?
2020/04/26 15:10:19 Initial [clickhouse_sinker]
&creator.Config{
Kafka: {
"kfk": &creator.KafkaConfig{
Brokers: "127.0.0.1:19091,127.0.0.1:19092,127.0.0.1:19093",
Sasl: struct { Password string; Username string }{
Password: "",
Username: "",
},
Version: "0.10.2.0",
},
},
Clickhouse: {
"ch": &creator.ClickHouseConfig{
Db: "ailpha",
Host: "127.0.0.1",
Port: 9101,
Username: "default",
Password: "",
MaxLifeTime: 300,
DnsLoop: false,
},
},
Tasks: []*creator.Task{
&creator.Task{
Name: "request",
Kafka: "kfk",
Topic: "daily",
ConsumerGroup: "testgroup",
Earliest: true,
Parser: "json",
CsvFormat: []string{},
Delimiter: "",
Clickhouse: "ch",
TableName: "daily",
AutoSchema: true,
ExcludeColumns: []string{},
Dims: []struct { Name string; Type string }{},
Metrics: []struct { Name string; Type string }{},
FlushInterval: 0,
BufferSize: 90000,
},
},
Common: struct { FlushInterval int; BufferSize int; LogLevel string }{
FlushInterval: 5,
BufferSize: 90000,
LogLevel: "debug",
},
}
2020/04/26 15:10:19 [I] Prepare sql=> INSERT INTO ailpha.daily () VALUES ()
2020/04/26 15:10:19 Initial [clickhouse_sinker] complete
2020/04/26 15:10:19 [I] start to dial kafka 127.0.0.1:19091,127.0.0.1:19092,127.0.0.1:19093
2020/04/26 15:10:19 [E] Error from consumer: EOF %!v(MISSING)
2020/04/26 15:10:19 [E] Error from consumer: EOF %!v(MISSING)
2020/04/26 15:10:19 [E] Error from consumer: EOF %!v(MISSING)
2020/04/26 15:10:19 [E] Error from consumer: EOF %!v(MISSING)
...

Running on Windows

Could a Windows build be included in the releases? Many people on the team use Windows but are not familiar with Go and related tooling.

As it is, it is not very convenient to use.

MergeTree requires a date column

Hello,
Our project has to use MergeTree, and MergeTree requires a date column. Will it be possible to support a date column later?

Manual schema config

Hi!
Is there an example of how to manually configure a schema for a ClickHouse table?
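
For what it's worth, a guessed sketch of what a manually configured schema might look like in a task file, based only on the Dims/Metrics fields visible in the config dump of the "Kafka consumer error at runtime" issue above; the key names and type strings here are assumptions, not confirmed by any docs in this page:

{
    "name": "manual_schema_example",
    "kafka": "kfk",
    "topic": "daily",
    "consumerGroup": "group1",
    "parser": "json",
    "clickhouse": "ch",
    "tableName": "daily",
    "autoSchema": false,
    "dims": [
        { "name": "gname", "type": "String" },
        { "name": "ts", "type": "DateTime" }
    ],
    "metrics": [
        { "name": "value", "type": "Float64" }
    ],
    "bufferSize": 90000
}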

No data is inserted into ClickHouse at runtime

Common: struct { FlushInterval int; BufferSize int; MinBufferSize int; LogLevel string }{
FlushInterval: 5,
BufferSize: 90000,
MinBufferSize: 0,
LogLevel: "debug",
},
}
2020/06/18 11:08:49 [I] Prepare sql=> INSERT INTO default.daily (log,stream,master_url) VALUES (?,?,?)
2020/06/18 11:08:49 Initial [clickhouse_sinker] complete
2020/06/18 11:08:49 [I] start to dial kafka test0:9092
2020/06/18 11:08:49 [I] Run http server http://0.0.0.0:2112
2020/06/18 11:08:49 [I] TaskService daily_request TaskService has started
2020/06/18 11:08:54 [I] daily_request: tick
2020/06/18 11:08:59 [I] daily_request: tick
2020/06/18 11:09:04 [I] daily_request: tick
2020/06/18 11:09:09 [I] daily_request: tick
2020/06/18 11:09:14 [I] daily_request: tick
2020/06/18 11:09:19 [I] daily_request: tick
2020/06/18 11:09:24 [I] daily_request: tick
2020/06/18 11:09:29 [I] daily_request: tick
2020/06/18 11:09:34 [I] daily_request: tick
2020/06/18 11:09:39 [I] daily_request: tick
2020/06/18 11:09:44 [I] daily_request: tick
2020/06/18 11:09:49 [I] daily_request: tick

A question about how to use it

When I downloaded https://github.com/housepower/clickhouse_sinker/releases/download/v1.3/clickhouse_sinker_1.3_linux_amd64.zip and unzipped it, it contained two files, clickhouse_sinker and README.md. I manually wrote a conf.json file in the same directory, as follows:
{
  "clickhouse": {
    "ch1": {
      "db": "default",
      "host": "127.0.0.1",
      "dnsLoop": false,
      "maxLifeTime": 300,
      "password": "123456",
      "port": 9000,
      "username": "default"
    }
  },
  "kafka": {
    "kfk1": {
      "brokers": "127.0.0.1:9092",
      "sasl": {
        "password": "",
        "username": ""
      }
    }
  },
  "common": {
    "bufferSize": 90000,
    "flushInterval": 5,
    "logLevel": "info"
  }
}
Then I ran: ./clickhouse_sinker -conf binlog.json
It reports the error:
2019/07/12 11:31:52 Initial [clickhouse_sinker]
panic: stat binlog.json/config.json: not a directory

goroutine 1 [running]:
github.com/housepower/clickhouse_sinker/creator.InitConfig(0x7ffe95ae746b, 0xb, 0x1fdd80)
/Users/sundy/pan/gopath/src/github.com/housepower/clickhouse_sinker/creator/config.go:41 +0x371
main.main.func1(0x9078dc, 0xc)
/Users/sundy/pan/gopath/src/github.com/housepower/clickhouse_sinker/bin/main.go:42 +0x5d
github.com/housepower/clickhouse_sinker/vendor/github.com/wswz/go_commons/app.Run(0x90a6dd, 0x11, 0xc0000f9f28, 0xc0000bb1b0, 0xc0000f9f08)
/Users/sundy/pan/gopath/src/github.com/housepower/clickhouse_sinker/vendor/github.com/wswz/go_commons/app/app.go:26 +0xad
main.main()
/Users/sundy/pan/gopath/src/github.com/housepower/clickhouse_sinker/bin/main.go:41 +0x13a
How is this supposed to be used?
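
Judging from the panic above (it tries to stat binlog.json/config.json), the -conf flag appears to expect a directory containing a config.json plus a tasks/ subdirectory, not a single file. Something like the following layout is presumably what the binary is looking for; this is inferred from the error message and the conf/tasks path mentioned elsewhere in these issues, not from official docs:

conf/
    config.json
    tasks/
        my_task.json

./clickhouse_sinker -conf conf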

Tuning for high load.

Hello!

I've run into a performance issue.
There is a topic in the Kafka cluster with a stream of about 125K messages per second; it looks like either clickhouse_sinker cannot consume that stream or something is wrong in the configuration. I tried playing with bufferSize, num, and flushInterval, but without success.

   "common": {
    "bufferSize": 9000000,
    "minBufferSize": 2000,
    "flushInterval": 0,
    "logLevel": "info"
  }


Do you have any idea what I'm doing wrong?

Thank You!

Signal handling in go_commons often misbehaves

The WaitForExitSign signal listener in the utils module of go_commons often receives a SIGHUP signal, which causes the service to exit. I am on CentOS 7.5. The exact cause is unclear. I had to remove the handling of that signal to keep the service running normally. Do you know what the cause might be?

[E] saving msg error code: 252

Using clickhouse_sinker, I get a large number of
[E] saving msg error code: 252, message: Too many parts (300). Merges are processing significantly slower than inserts. will loop to write the data
errors.

How can this be resolved?

csv parser in csv.go should support quoted strings and not split them into "sub-fields"

csv.go has
msgs := strings.Split(msgData, c.delimiter)

However, for real CSV parsing a proper decoder should be used.

Why?

The example below is also valid CSV:

client_id,client_name,client_age

1,"DO,NOT,SPLIT",42
2,Daniel,26
3,Vincent,32

Since client_name in line 1 contains commas, the field is quoted and the line is therefore still valid CSV.

Therefore an existing CSV parsing library would probably be a much better fit.

https://golang.org/pkg/encoding/csv/ specifies the basic rules for quoted strings and how double quotes are interpreted.
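
A minimal sketch (not taken from csv.go) of how the standard library's encoding/csv handles the quoted field that a plain strings.Split breaks apart:

package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

func main() {
	line := `1,"DO,NOT,SPLIT",42`

	// naive split breaks the quoted field into sub-fields (5 fields)
	fmt.Println(strings.Split(line, ","))

	// encoding/csv honours quoting (3 fields: 1, DO,NOT,SPLIT, 42)
	r := csv.NewReader(strings.NewReader(line))
	record, err := r.Read()
	fmt.Println(record, err)
}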

Installation error

go get -u github.com/housepower/clickhouse_sinker
package github.com/housepower/clickhouse_sinker: no Go files in /root/go/src/github.com/housepower/clickhouse_sinker

centos7.6
golang-1.12

The statistics metrics push hangs the program

"statistics": {
  "enable": true,
  "pushInterval": 5,
  "pushGateWayAddrs": ["192.168.1.1:9091"]
}

I configured this Prometheus pushgateway address; the address is reachable via telnet, but when I run the program it just hangs here:
2020/03/24 09:51:42 [I] Run http server 0.0.0.0:2112

With enable: false, everything works again.

The FixedString data type is not supported

Hello, my ClickHouse tables contain the FixedString data type, and when I run the service clickhouse_sinker panics as follows:
panic: unsupport type FixedString(16)

goroutine 58 [running]:
github.com/housepower/clickhouse_sinker/util.switchType(...)
/home/go/src/ch_sinker/util/value.go:74
github.com/housepower/clickhouse_sinker/util.GetValueByType(0xbb78c0, 0xc000129990, 0xc0002f40c0, 0xa, 0xc0002de8b0)
/home/go/src/ch_sinker/util/value.go:26 +0xdd0
github.com/housepower/clickhouse_sinker/output.(*ClickHouse).Write(0xc00016b8c0, 0xc000404000, 0x1, 0x3e8, 0x0, 0x0)
/home/go/src/ch_sinker/output/clickhouse.go:103 +0x22c
github.com/housepower/clickhouse_sinker/output.(*ClickHouse).LoopWrite(0xc00016b8c0, 0xc000404000, 0x1, 0x3e8)
/home/go/src/ch_sinker/output/clickhouse.go:131 +0x50
github.com/housepower/clickhouse_sinker/task.(*Service).flush(0xc000276e40, 0xc000404000, 0x1, 0x3e8)
/home/go/src/ch_sinker/task/task.go:119 +0x11f
github.com/housepower/clickhouse_sinker/task.(*Service).Run(0xc000276e40)
/home/go/src/ch_sinker/task/task.go:99 +0x510
created by main.(*Sinker).Run
/home/go/src/ch_sinker/bin/main.go:192 +0x61

Basically, value.go has no support for the FixedString data type, and my ClickHouse column is FixedString(16). How should this type be handled in this case, or have I got something wrong somewhere?
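
For reference, a minimal sketch of one possible direction, based on the switchType function quoted in the "INSERT Date from Unix Timestamp" issue above: match parameterized types such as FixedString(16) by prefix instead of by exact name. This is an illustration only, not the project's actual fix:

package main

import (
	"fmt"
	"strings"
)

// switchTypeSketch is a hypothetical variant of switchType that maps
// FixedString(N) to "string" by prefix rather than panicking.
func switchTypeSketch(typ string) string {
	switch {
	case strings.HasPrefix(typ, "FixedString"),
		strings.HasPrefix(typ, "Nullable(FixedString"):
		return "string"
	case typ == "String", typ == "Nullable(String)", typ == "Date", typ == "Nullable(Date)":
		return "string"
	default:
		panic("unsupport type " + typ)
	}
}

func main() {
	fmt.Println(switchTypeSketch("FixedString(16)")) // string
}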

[Question] Custom Parser support on nested json object or json array

I see that a Custom Parser is supported, but I wonder whether it supports parsing nested JSON objects and how to configure this. For example, I have a JSON object like the one below and want to parse the nested field event.transaction.transaction_id or the JSON array field event.product_id and insert them into columns of the same name in the ClickHouse table.

{
   "event": {
         "transaction": {
              "transaction_id": "123", // insert to column transaction_id in clickhouse table
              "event_time": 151222222222
          },
         "product_id": [1,2,3,4,5] // insert to array column product_id in clickhouse table
    }
}

Thank you for your great work!
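
Independent of how the sinker's custom parser is wired up, here is a plain encoding/json sketch of pulling those two nested values out of such a message (field names taken from the example above; this is not sinker code):

package main

import (
	"encoding/json"
	"fmt"
)

type event struct {
	Event struct {
		Transaction struct {
			TransactionID string `json:"transaction_id"`
			EventTime     int64  `json:"event_time"`
		} `json:"transaction"`
		ProductID []int64 `json:"product_id"`
	} `json:"event"`
}

func main() {
	msg := `{"event":{"transaction":{"transaction_id":"123","event_time":151222222222},"product_id":[1,2,3,4,5]}}`
	var e event
	if err := json.Unmarshal([]byte(msg), &e); err != nil {
		panic(err)
	}
	// values that would map to the transaction_id and product_id columns
	fmt.Println(e.Event.Transaction.TransactionID, e.Event.ProductID)
}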

Nested JSON question

Kafka produces nested JSON like the example below. If I want to write the contents of data into ClickHouse depending on the value of type, how should I do that?

{
    "database": "test",
    "table": "dim_table1",
    "type": "insert",
    "ts": 1543912762,
    "xid": 2209,
    "commit": true,
    "data": {
        "id": 5815,
        "name": "jay"
    }
}
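
Independent of whether the sinker supports this natively, a plain encoding/json sketch of the kind of preprocessing being asked about: decode the envelope, switch on type, and keep data as raw JSON to forward to the table (field names taken from the sample above; not sinker code):

package main

import (
	"encoding/json"
	"fmt"
)

type binlogEvent struct {
	Database string          `json:"database"`
	Table    string          `json:"table"`
	Type     string          `json:"type"`
	Ts       int64           `json:"ts"`
	Data     json.RawMessage `json:"data"`
}

func main() {
	msg := `{"database":"test","table":"dim_table1","type":"insert","ts":1543912762,"xid":2209,"commit":true,"data":{"id":5815,"name":"jay"}}`
	var ev binlogEvent
	if err := json.Unmarshal([]byte(msg), &ev); err != nil {
		panic(err)
	}
	switch ev.Type {
	case "insert":
		// ev.Data holds the inner JSON object that would be written to ClickHouse
		fmt.Println(ev.Table, string(ev.Data))
	default:
		// deletes/updates would need their own handling
		fmt.Println("skipping", ev.Type)
	}
}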

How to check for errors

It starts up normally, but there is no data in ClickHouse. How should I troubleshoot this? Is there a log I can check?

Null values are converted to default values when inserted into ClickHouse; they should be preserved as nulls

When null is converted to a default value on insert into ClickHouse, it causes errors in statistics computed over the data.

Our use case:
There are multiple event types whose fields together form one wide table.
For example:
Event A fields: "kind", "onlyid", "app_version", "app_time"
Event B fields: "kind", "onlyid", "app_version", "open_number"

Wide table fields:
"kind", "onlyid", "app_version", "app_time", "open_number"

If event A's app_time field and event B's open_number field are set to default values, the statistics for events A and B become inaccurate.

Error during insertions

What happens if there is an error while inserting into the database? Would clickhouse-sinker try to reinsert the data?

panic: unsupport type Decimal(18, 2)

tast.json :

{
    "name": "abc",

    "kafka": "kfk",
    "topic": "abc",
    "consumerGroup": "abc_group",

    "parser": "json",
    "clickhouse": "clickhouse",

    "tableName": "abc",
    "autoSchema": true,
    "excludeColumns": [],
    "bufferSize": 90000
}

Issue while running ./clickhouse_sinker -conf conf

I am facing an issue while running the command ./clickhouse_sinker -conf conf; it says command not found. I have configured the conf file according to my requirements.
Kindly help.

config.json

{
  "clickhouse": {
    "ch1": {
      "db": "default",
      "host": "127.0.0.1",
      "dnsLoop": false,
      "maxLifeTime": 300,
      "password": "1234",
      "port": 9900,
      "username": ""
    }
  },
  "kafka": {
    "kfk1": {
      "brokers": "127.0.0.1:9092",
      "sasl": {
        "password": "",
        "username": ""
      }
    }
  },
  "common": {
    "bufferSize": 90000,
    "flushInterval": 5,
    "logLevel": "info"
  }
}
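
"command not found" for a path given explicitly as ./clickhouse_sinker usually means the file is missing from the current directory, is not executable, or was built for a different platform, rather than anything sinker-specific. A quick check with standard shell commands (nothing specific to this project):

ls -l clickhouse_sinker        # is the binary actually here?
chmod +x clickhouse_sinker     # make sure it is executable
file clickhouse_sinker         # confirm it matches your OS/architecture
./clickhouse_sinker -conf conf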
