
Comments (19)

stone1100 avatar stone1100 commented on September 22, 2024
  1. Please provide the database description shown on the console overview page.
  2. Please provide the SQL for problems 1 and 2.

from lindb.

dragon-dan avatar dragon-dan commented on September 22, 2024

+-----------+----------------+-----------------------------------------------------------------------------------+
| Name | Storage | Desc |
+-----------+----------------+-----------------------------------------------------------------------------------+
| _internal | /lindb-cluster | create database _internal with shard 1, replica 1, intervals [10s->1M] |
| testdb | /lindb-cluster | create database testdb with shard 1, replica 1, intervals [10s->5y,5m->5y,1h->5y] |
+-----------+----------------+-----------------------------------------------------------------------------------+

select 'tbnum' from 'tb.dk' on 'default-ns' where time>now()-6h group by time(290s)
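This query returned data yesterday, but today it returns nothing, even when I widen the time range to include yesterday.
The insert program is an infinite loop that writes 100+ points every few tens of seconds. It has been running for a day, so a direct query returns too many rows.
I wanted to aggregate with group by time(1h), but that returns nothing. After repeated attempts, group by time(4m) and group by time(299s) return data; anything larger returns nothing.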

from lindb.

stone1100 avatar stone1100 commented on September 22, 2024

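Are you always querying with time>now()-6h? Can you query historical data using group by time(1m) with an explicit time window of about 1-2 hours? For example:
select f1 from host_disk_701 where time<'2023-03-16 20:00:00' and time>'2023-03-16 19:00:00' group by time(1m)

Also, run du on the following directory under data and paste the result:
data_dir/storage/data/test2/shard/0/segment

My initial suspicion is the rollup strategy. When the group-by interval is >= 5m, the query reads from the 5m->5y data. Because the rollup condition has not been met, that data has not been generated, so the query finds nothing. We'll look at how to improve the conditions that trigger rollup.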
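
The selection rule described above can be sketched as follows. This is only an illustration, assuming a query reads from the coarsest configured interval that does not exceed its group-by interval. The class and method names are hypothetical, and this is not LinDB's actual code.

```java
import java.time.Duration;
import java.util.List;

public class IntervalPicker {
    // Storage intervals of testdb as listed in its description (10s, 5m, 1h), sorted ascending.
    static final List<Duration> INTERVALS = List.of(
            Duration.ofSeconds(10), Duration.ofMinutes(5), Duration.ofHours(1));

    // Hypothetical rule: pick the coarsest configured interval <= the group-by interval.
    // Falls back to the finest interval when the group-by interval is smaller than all of them.
    static Duration pick(Duration groupBy) {
        Duration chosen = INTERVALS.get(0);
        for (Duration d : INTERVALS) {
            if (d.compareTo(groupBy) <= 0) {
                chosen = d;
            }
        }
        return chosen;
    }

    public static void main(String[] args) {
        System.out.println(pick(Duration.ofSeconds(299))); // PT10S
        System.out.println(pick(Duration.ofMinutes(5)));   // PT5M
        System.out.println(pick(Duration.ofHours(1)));     // PT1H
    }
}
```

Under this assumption, group by time(299s) still reads the raw 10s data, while group by time(5m) and larger read rollup data, which would match the symptoms reported in this thread.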

from lindb.

dragon-dan avatar dragon-dan commented on September 22, 2024

data_dir/storage/data/testdb uses 128 MB
data_dir/storage/data/testdb/shard/0/segment uses 36 KB
data_dir/storage/data/testdb/shard/1/segment uses 0 KB

So far I have only tried this on Windows Server; I have not tested on Linux yet.
The create statement is:
create database {"option":{"intervals":[{"interval":"10s","retention":"5y"},{"interval":"5m","retention":"5y"},{"interval":"1h","retention":"5y"}],"autoCreateNS":true,"behead":"5y","ahead":"5y"},"name":"testdb","storage":"/lindb-cluster","numOfShard":1,"replicaFactor":1}

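The official documentation has no detailed description of the creation parameters, so I searched the code repository and adapted the options based on my reading of the code comments.
The statement above works (inserted data can be queried). During my experiments, though, many times the database was created successfully after I changed the options,
but then the inserted data could not be found no matter how I wrote the query, and I don't know why.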

select 'tbnum' from 'tb.dk' on 'default-ns' where time>'2023-03-15 14:00:00' and time<'2023-03-15 16:00:00' group by time(1m)
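Even with an explicit time window like this, I still cannot find data from a few days ago.
I just re-ran the insert program today, and today's inserts can be queried.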

select 'tbnum' from 'tb.dk' on 'default-ns' where time>now()-23h
The limit is 23h: data within that window is returned, but now()-24h returns nothing.

from lindb.

stone1100 avatar stone1100 commented on September 22, 2024

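Can you check inside segment whether the month/year directories are missing data files, i.e. .sst files?

The create statement looks fine, although the behead/ahead span is rather large. I have reproduced the case where queries with group by time > 5m return nothing because the rollup job has not run yet.

If convenient, please share your write code so I can reproduce the problem here.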

from lindb.

dragon-dan avatar dragon-dan commented on September 22, 2024

data_dir/storage/data/testdb/shard/0/segment
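The date folders contain only four files: CURRENT, LOCK, MANIFEST-000001, and OPTIONS. There are no .sst files.

data_dir/storage/data/testdb/shard/1/segment: the date folders are empty.

I don't really understand what behead/ahead mean. I assumed that a smaller range would make data unqueryable, so I set them very large.

The write code is: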

public static void fun2() throws Exception {
        Options options = Options.builder()
                .useGZip(true).batchSize(4000).batchQueue(4096).flushInterval(1000)
                .build();
        // create LinDB Client with broker endpoint
        Client client = ClientFactory.create("http://192.168.52.143:9000", options);
        // get write for database
        Write write = client.write("testdb");

        ThreadLocalRandom random = ThreadLocalRandom.current();
        LocalDateTime time = LocalDateTime.of(2023, 3, 17, 23, 0, 0);
        while (LocalDateTime.now().isBefore(time)) {
            System.out.println(LocalDateTime.now().format(DatePattern.NORM_DATETIME_FORMATTER));
            int j = random.nextInt(50, 250);
            for (int i = 1; i <= j; ++i) {
                int nextInt = random.nextInt(32);
                Point.Builder builder = Point.builder("tb.dk")
                        .addTag("reg", "reg-" + nextInt)
                        .addSum("tbnum", 1.0);
                if (nextInt % 2 == 0) {
                    builder.addSum("tbhfnum", 1.0);
                }
                Point point = builder.build();
                boolean ok = write.put(point);
                //System.out.println("write status: " + ok);
            }
            System.out.println("done1");
            TimeUnit.SECONDS.sleep(random.nextInt(10, 60));
        }
        // need close write after write done
        write.close();
        // pause briefly
        TimeUnit.SECONDS.sleep(15);
        System.out.println("done2");
    }

from lindb.

stone1100 avatar stone1100 commented on September 22, 2024

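That's odd. numOfShard is 1, so there should only be shard 0. Did you do anything unusual along the way?

behead/ahead define the allowed write range, i.e. how far before and after the current time a point's timestamp may fall; writes outside this range are not rejected.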

from lindb.

dragon-dan avatar dragon-dan commented on September 22, 2024

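The first time I created the database I set numOfShard to 2, then found it unusable, so I dropped the database and recreated it.
It seems LinDB does not delete the folder when a database is dropped. Several times, after dropping a database and recreating one with the same name, I could still see the data from before the drop.

Writes outside the behead/ahead range are not rejected? Does that mean writes inside the range are rejected? I also see an API for setting the timestamp in the Java client.
I just checked, and today I again cannot see last Friday's data. After running the insert program again, I can see the data inserted today.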

from lindb.

dragon-dan avatar dragon-dan commented on September 22, 2024

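Aggregating over a time range always seems to return a list. Is there a way to return the total for the range directly, so I don't have to loop and sum it myself?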

from lindb.

stone1100 avatar stone1100 commented on September 22, 2024

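Currently you need to derive the group by time interval from the query's time range yourself. For example, if the query range is 1 hour, add group by time(1h) and a single aggregated data point is returned.
I have filed a feature request so that the system can automatically compute an interval from the query time range.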
#943
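
As a rough illustration of deriving the interval from the time range on the client side, the sketch below builds a query string whose group by time interval equals the length of the range in seconds. The class and method names are hypothetical. Whether exactly one data point comes back presumably depends on how LinDB aligns buckets to the range, which is not confirmed in this thread.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

public class SingleBucketQuery {
    static final DateTimeFormatter FMT = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // Builds a query whose group-by interval spans the whole [from, to] range.
    static String build(String field, String metric, String namespace,
                        LocalDateTime from, LocalDateTime to) {
        long seconds = Duration.between(from, to).getSeconds();
        return "select '" + field + "' from '" + metric + "' on '" + namespace + "'"
                + " where time>'" + FMT.format(from) + "' and time<'" + FMT.format(to) + "'"
                + " group by time(" + seconds + "s)";
    }

    public static void main(String[] args) {
        LocalDateTime from = LocalDateTime.of(2023, 3, 15, 14, 0, 0);
        LocalDateTime to = LocalDateTime.of(2023, 3, 15, 16, 0, 0);
        // The two-hour range yields group by time(7200s).
        System.out.println(build("tbnum", "tb.dk", "default-ns", from, to));
    }
}
```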

from lindb.

dragon-dan avatar dragon-dan commented on September 22, 2024

select 'tbnum' from 'tb.dk' on 'default-ns' where time<='2023-03-23 12:00:00' group by time(4m)
1679541120000: 452
1679541360000: 897
1679541600000: 1028
1679541840000: 924
1679542080000: 790
1679542320000: 1056
1679542560000: 100
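I tried it and the result is an array like this, so for now I have to sum it myself to get the total.
As for data becoming unqueryable the next day, and large groupings such as group by time(1d) or group by time(1M) returning nothing: do I have to wait for a new version before these work?
Right now I cannot even use it for testing, because nothing can be queried the next day.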
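
For reference, the client-side summation mentioned above might look like the sketch below. It assumes the query result has already been parsed into a map from bucket timestamp (epoch milliseconds) to value; that representation and the class name are hypothetical, not part of the LinDB client API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class BucketSum {
    // Adds up the values of all time buckets.
    static long total(Map<Long, Long> buckets) {
        long sum = 0;
        for (long value : buckets.values()) {
            sum += value;
        }
        return sum;
    }

    public static void main(String[] args) {
        // Buckets copied from the group by time(4m) result above.
        Map<Long, Long> buckets = new LinkedHashMap<>();
        buckets.put(1679541120000L, 452L);
        buckets.put(1679541360000L, 897L);
        buckets.put(1679541600000L, 1028L);
        buckets.put(1679541840000L, 924L);
        buckets.put(1679542080000L, 790L);
        buckets.put(1679542320000L, 1056L);
        buckets.put(1679542560000L, 100L);
        System.out.println(total(buckets)); // 5247
    }
}
```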

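One more small issue: in standalone mode, the embedded etcd needs to listen on two ports.
One of them is port 2380, used for etcd peer communication, and I could not find where to configure it. On my computer the company's antivirus software occupies this port, so LinDB never starts.
I have to switch to another computer, which is a bit of a hassle.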

from lindb.

stone1100 avatar stone1100 commented on September 22, 2024
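  1. Support for this group by time behavior will be added soon.
  2. I could not reproduce the next-day problem here. Try clearing the data, initializing a fresh database, and testing again.
  3. When starting standalone, you can disable the embedded etcd and run your own etcd externally. This is controlled by the following flag: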
    --embed-etcd enable embed etcd server (default true)

from lindb.

dragon-dan avatar dragon-dan commented on September 22, 2024

OK. I'll create a new database with the same configuration, insert some data, and check tomorrow whether it can be queried.

from lindb.

stone1100 avatar stone1100 commented on September 22, 2024

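You can write test data with explicit timestamps, for example advancing by 1 minute on each write. That way you don't have to wait a long time, and you can also write data that spans multiple days.
See the following for reference: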
https://github.com/lindb/lindb/blob/main/e2e/benchmark/write_metric_test.go#L135

https://github.com/lindb/client_java/blob/main/src/main/java/io/lindb/client/api/Point.java#L243
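
A minimal sketch of the timestamp generation this suggests: it produces epoch-millisecond timestamps one minute apart, covering two days. It deliberately stops short of calling the LinDB client; attaching each timestamp to a point is left to the client API linked above. The class and method names are hypothetical, and the timestamps presumably need to fall inside the database's behead/ahead window.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

public class BackfillTimestamps {
    // Returns `count` timestamps in epoch milliseconds, starting at `start`, one minute apart.
    static List<Long> minuteSteps(Instant start, int count) {
        List<Long> timestamps = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            timestamps.add(start.plus(Duration.ofMinutes(i)).toEpochMilli());
        }
        return timestamps;
    }

    public static void main(String[] args) {
        // Two days of one-minute steps, so the generated data crosses a day boundary.
        List<Long> timestamps = minuteSteps(Instant.parse("2023-03-22T00:00:00Z"), 2 * 24 * 60);
        System.out.println(timestamps.size()); // 2880
        System.out.println(timestamps.get(0));
        System.out.println(timestamps.get(timestamps.size() - 1));
    }
}
```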

from lindb.

dragon-dan avatar dragon-dan commented on September 22, 2024

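With explicit timestamps, data from the current day can be queried, but anything older than 24 hours is not visible.
With timestamps set to this morning:
select 'tbnum' from 'tb.dk' on 'default-ns' where time>now()-12h
I can see the data inserted for this morning.

With timestamps set to yesterday:
select 'tbnum' from 'tb.dk' on 'default-ns' where time>now()-1d
This returns no data.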

from lindb.

stone1100 avatar stone1100 commented on September 22, 2024

If convenient, please send me your contact details: [email protected]

from lindb.

dragon-dan avatar dragon-dan commented on September 22, 2024

[email protected] Email sent.

from lindb.

stone1100 avatar stone1100 commented on September 22, 2024

Please try again with v0.2.2. It fixes the issue where rollup did not generate data on Windows.

from lindb.

stone1100 avatar stone1100 commented on September 22, 2024

> select 'tbnum' from 'tb.dk' on 'default-ns' where time<='2023-03-23 12:00:00' group by time(4m)
> 1679541120000: 452
> 1679541360000: 897
> 1679541600000: 1028
> 1679541840000: 924
> 1679542080000: 790
> 1679542320000: 1056
> 1679542560000: 100
> I tried it and the result is an array like this, so for now I have to sum it myself to get the total.

A single result can be returned directly as follows:
select 'tbnum' from 'tb.dk' on 'default-ns' where time<='2023-03-23 12:00:00' group by time()

from lindb.
