
paimon-trino's People

Contributors

groobyming, jingsongli, jjrrzxc, leaves12138, miomiocat, s7monk, shidayang, songpcmusic, tsreaper, zyl891229


paimon-trino's Issues

listDatabases recursively lists all files under the warehouse directory, which is time-consuming

I upgraded the trino-paimon connector to 0.8-SNAPSHOT and found that show schemas in paimon took 8 minutes. The behavior changed because of the following code in org.apache.paimon.trino.fileio.TrinoFileIO#listStatus:

 FileIterator fileIterator = trinoFileSystem.listFiles(location);

API declaration: io.trino.filesystem.TrinoFileSystem#listFiles

Lists all files within the specified directory recursively.

HDFS implementation: io.trino.filesystem.hdfs.HdfsFileSystem#listFiles

// recursive = true
return new HdfsFileIterator(location, directory, fileSystem.listFiles(directory, true));
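For database discovery specifically, a recursive file walk is unnecessary. A minimal sketch of a shallower approach, assuming the warehouse keeps one first-level directory per database (DatabaseListingSketch and the warehouse parameter are illustrative names, not connector API):

import io.trino.filesystem.Location;
import io.trino.filesystem.TrinoFileSystem;

import java.io.IOException;
import java.util.Set;

final class DatabaseListingSketch {
    // listDirectories returns only the immediate child directories,
    // avoiding the recursive listFiles traversal shown above
    static Set<Location> listDatabaseDirs(TrinoFileSystem fs, String warehouse) throws IOException {
        return fs.listDirectories(Location.of(warehouse));
    }
}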

Cannot order by timestamp column in Trino version 447/448 #72

Paimon Version
paimon-trino-432-0.8-SNAPSHOT-plugin.tar.gz

Trino Version

Trino 447/448
java version "22.0.1" 2024-04-16
Java(TM) SE Runtime Environment Oracle GraalVM 22.0.1+8.1 (build 22.0.1+8-jvmci-b01)
Java HotSpot(TM) 64-Bit Server VM Oracle GraalVM 22.0.1+8.1 (build 22.0.1+8-jvmci-b01, mixed mode, sharing)

Schema

CREATE TABLE apisix_log (
  id               string,
  upstream         string,
  start_time       timestamp(3),
  client_ip        string,
  service_id       string,
  route_id         string,
  request          row<
    url            string,
    headers        string,
    body           string,
    size           bigint,
    querystring    string,
    uri            string,
    `method`       string
  >,
  response         row<
    status         bigint,
    headers        string,
    body           string,
    size           bigint
  >,
  server           row<
    hostname       string,
    version        string
  >,
  latency          double,
  apisix_latency   double,
  upstream_latency double,
  dt               string,
  WATERMARK FOR start_time AS start_time
)  comment 'APISIX kafka-logger raw log data' PARTITIONED BY (dt) with (
  'bucket' = '6'
);

Query with ORDER BY returns an exception

SELECT t.*
FROM raw_log.apisix_log t
WHERE dt='2024-05-21'
ORDER BY start_time

Client exception

[2024-05-17 20:35:22] [65536] Query failed (#20240521_045409_01038_3pyv6): readNanos is negative (-9275621)
[2024-05-17 20:35:22] java.lang.IllegalArgumentException: readNanos is negative (-9275621)

Server log
server-448.log

Cannot read row type in Trino version 440

Paimon version

paimon-trino-427-0.8-20240512.000527-20-plugin.tar.gz
Trino Version

Trino 440

java version "22.0.1" 2024-04-16
Java(TM) SE Runtime Environment Oracle GraalVM 22.0.1+8.1 (build 22.0.1+8-jvmci-b01)
Java HotSpot(TM) 64-Bit Server VM Oracle GraalVM 22.0.1+8.1 (build 22.0.1+8-jvmci-b01, mixed mode, sharing)

Querying data from the following table returns an exception

CREATE TABLE apisix_log (
  id               string,
  upstream         string,
  start_time       timestamp(3),
  client_ip        string,
  service_id       string,
  route_id         string,
  request          row<
    url            string,
    headers        string,
    body           string,
    size           bigint,
    querystring    string,
    uri            string,
    `method`       string
  >,
  response         row<
    status         bigint,
    headers        string,
    body           string,
    size           bigint
  >,
  server           row<
    hostname       string,
    version        string
  >,
  latency          double,
  apisix_latency   double,
  upstream_latency double,
  dt               string,
  WATERMARK FOR start_time AS start_time
)  comment 'APISIX kafka-logger raw log data' PARTITIONED BY (dt) with (
  'bucket' = '6'
);

Client exception

[2024-05-12 15:35:34] [65536] Query failed (#20240512_073511_00085_uwx8s): 'io.trino.spi.block.Block io.trino.spi.block.RowBlock.fromFieldBlocks(int, java.util.Optional, io.trino.spi.block.Block[])'
[2024-05-12 15:35:34] java.lang.NoSuchMethodError: 'io.trino.spi.block.Block io.trino.spi.block.RowBlock.fromFieldBlocks(int, java.util.Optional, io.trino.spi.block.Block[])'

Server exception
server.log.zip

server-440.log.zip
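The NoSuchMethodError suggests the plugin was compiled against an older SPI: newer Trino releases split the old three-argument RowBlock.fromFieldBlocks into two factories. A hedged sketch of the newer call shape (RowBlockCompat is a hypothetical helper; verify the exact signatures against the target SPI version):

import io.trino.spi.block.Block;
import io.trino.spi.block.RowBlock;

import java.util.Optional;

final class RowBlockCompat {
    static Block build(int positionCount, Optional<boolean[]> rowIsNull, Block[] fields) {
        // rows with a null mask go through the dedicated factory in newer SPIs;
        // rows without one use the reduced two-argument factory
        return rowIsNull
                .map(nulls -> RowBlock.fromNotNullSuppressedFieldBlocks(positionCount, Optional.of(nulls), fields))
                .orElseGet(() -> RowBlock.fromFieldBlocks(positionCount, fields));
    }
}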

Documentation updated to add instructions for specifying the Paimon temporary directory.

Support predicate pushdown for partitioned table

Description

When executing a query with a partition-key filter, predicate pushdown has no effect: the ScanFilter operator still performs a full table scan. An illustration of the expected pruning follows the version details below.

paimon version: 0.4
trino version: 412
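For illustration only (this is not connector code): with pushdown working, a dt filter should select splits from matching partitions instead of enumerating every partition, roughly like this:

import java.util.List;
import java.util.Map;

final class PartitionPruning {
    // keep only partitions whose values satisfy every equality predicate
    static List<Map<String, String>> prune(List<Map<String, String>> partitions, Map<String, String> filter) {
        return partitions.stream()
                .filter(p -> filter.entrySet().stream()
                        .allMatch(e -> e.getValue().equals(p.get(e.getKey()))))
                .toList();
    }

    public static void main(String[] args) {
        var parts = List.of(Map.of("dt", "2024-05-20"), Map.of("dt", "2024-05-21"));
        // only dt=2024-05-21 should produce splits
        System.out.println(prune(parts, Map.of("dt", "2024-05-21")));
    }
}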

Timestamp type resolved incorrectly when creating (or showing the structure of) a Paimon catalog table in Trino

Example create DDL:
CREATE TABLE sales_order_line (
seq_id BIGINT,
id VARCHAR(2147483647),
tenant_id VARCHAR(2147483647),
biz_time TIMESTAMP(0),
organization_id VARCHAR(2147483647),
extend_create_info VARCHAR(2147483647),
extend_modify_info VARCHAR(2147483647),
order_id VARCHAR(2147483647),
line_type VARCHAR(2147483647),
root_line_id VARCHAR(2147483647),
parent_line_id VARCHAR(2147483647),
customer_id VARCHAR(2147483647),
line_index VARCHAR(2147483647),
quantity INT,
total_retail_amount DECIMAL(25, 2),
total_transaction_amount DECIMAL(25, 2),
discount DECIMAL(25, 2),
arrears_amount DECIMAL(25, 2),
arrears_clear_amount DECIMAL(25, 2),
product_id VARCHAR(2147483647),
product_name VARCHAR(2147483647),
product_type VARCHAR(2147483647),
note VARCHAR(2147483647),
origin_item_id VARCHAR(2147483647),
expire_date TIMESTAMP(0),
sales_item_snapshot_id VARCHAR(2147483647),
redeems VARCHAR(2147483647),
deleted TINYINT,
version INT,
create_by VARCHAR(2147483647),
create_time TIMESTAMP(0),
modify_by VARCHAR(2147483647),
modify_time TIMESTAMP(0),
leaf TINYINT
) WITH (
file_format = 'ORC',
primary_key = ARRAY['seq_id','biz_time'],
partitioned_by = ARRAY['biz_time'],
bucket = '3',
bucket_key = 'seq_id',
changelog_producer = 'input',
path='oss://lc-bigdata/store/testdb.db/sales_order_line'
)

The actual Paimon table schema is:
{
"id" : 0,
"fields" : [ {
"id" : 0,
"name" : "seq_id",
"type" : "BIGINT NOT NULL"
}, {
"id" : 1,
"name" : "id",
"type" : "VARCHAR(2147483646)"
}, {
"id" : 2,
"name" : "tenant_id",
"type" : "VARCHAR(2147483646)"
}, {
"id" : 3,
"name" : "biz_time",
"type" : "TIMESTAMP(6) NOT NULL"
}, {
"id" : 4,
"name" : "organization_id",
"type" : "VARCHAR(2147483646)"
}, {
"id" : 5,
"name" : "extend_create_info",
"type" : "VARCHAR(2147483646)"
}, {
"id" : 6,
"name" : "extend_modify_info",
"type" : "VARCHAR(2147483646)"
}, {
"id" : 7,
"name" : "order_id",
"type" : "VARCHAR(2147483646)"
}, {
"id" : 8,
"name" : "line_type",
"type" : "VARCHAR(2147483646)"
}, {
"id" : 9,
"name" : "root_line_id",
"type" : "VARCHAR(2147483646)"
}, {
"id" : 10,
"name" : "parent_line_id",
"type" : "VARCHAR(2147483646)"
}, {
"id" : 11,
"name" : "customer_id",
"type" : "VARCHAR(2147483646)"
}, {
"id" : 12,
"name" : "line_index",
"type" : "VARCHAR(2147483646)"
}, {
"id" : 13,
"name" : "quantity",
"type" : "INT"
}, {
"id" : 14,
"name" : "total_retail_amount",
"type" : "DECIMAL(25, 2)"
}, {
"id" : 15,
"name" : "total_transaction_amount",
"type" : "DECIMAL(25, 2)"
}, {
"id" : 16,
"name" : "discount",
"type" : "DECIMAL(25, 2)"
}, {
"id" : 17,
"name" : "arrears_amount",
"type" : "DECIMAL(25, 2)"
}, {
"id" : 18,
"name" : "arrears_clear_amount",
"type" : "DECIMAL(25, 2)"
}, {
"id" : 19,
"name" : "product_id",
"type" : "VARCHAR(2147483646)"
}, {
"id" : 20,
"name" : "product_name",
"type" : "VARCHAR(2147483646)"
}, {
"id" : 21,
"name" : "product_type",
"type" : "VARCHAR(2147483646)"
}, {
"id" : 22,
"name" : "note",
"type" : "VARCHAR(2147483646)"
}, {
"id" : 23,
"name" : "origin_item_id",
"type" : "VARCHAR(2147483646)"
}, {
"id" : 24,
"name" : "expire_date",
"type" : "TIMESTAMP(6)"
}, {
"id" : 25,
"name" : "sales_item_snapshot_id",
"type" : "VARCHAR(2147483646)"
}, {
"id" : 26,
"name" : "redeems",
"type" : "VARCHAR(2147483646)"
}, {
"id" : 27,
"name" : "deleted",
"type" : "TINYINT"
}, {
"id" : 28,
"name" : "version",
"type" : "INT"
}, {
"id" : 29,
"name" : "create_by",
"type" : "VARCHAR(2147483646)"
}, {
"id" : 30,
"name" : "create_time",
"type" : "TIMESTAMP(6)"
}, {
"id" : 31,
"name" : "modify_by",
"type" : "VARCHAR(2147483646)"
}, {
"id" : 32,
"name" : "modify_time",
"type" : "TIMESTAMP(6)"
}, {
"id" : 33,
"name" : "leaf",
"type" : "TINYINT"
} ],
"highestFieldId" : 33,
"partitionKeys" : [ "biz_time" ],
"primaryKeys" : [ "seq_id", "biz_time" ],
"options" : {
"bucket" : "3",
"path" : "oss://lc-bigdata/store/testdb.db/sales_order_line",
"bucket-key" : "seq_id",
"changelog-producer" : "input",
"file.format" : "orc"
},
"timeMillis" : 1694140846488
}

We can see that the TIMESTAMP(0) data type was converted to TIMESTAMP(6).
And when showing this table in Trino: [screenshot]

The timestamp(6) saved in the Paimon schema is converted to timestamp(3) in the Trino schema.
A schema mismatch between the engine and storage may cause unpredictable problems; it should be resolved. A precision-preserving mapping is sketched below.
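A hedged sketch of the precision-preserving direction on the Trino side, assuming the type converter can see the Paimon precision (toTrinoTimestamp is an illustrative helper, not the connector's actual method):

import io.trino.spi.type.TimestampType;
import io.trino.spi.type.Type;

final class TimestampMappingSketch {
    // carry the declared precision (0-9 in Paimon) through to Trino, which
    // supports timestamp(0) .. timestamp(12), instead of pinning timestamp(3)
    static Type toTrinoTimestamp(int paimonPrecision) {
        return TimestampType.createTimestampType(paimonPrecision);
    }
}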

Should support data type of MAP

Trino 422, Paimon 0.5.0
// create paimon table using flink sql

create table test (
  id BIGINT,
  name String,
  tags MAP<String, String>
);

// then insert some data

// query the paimon table without the tags field via Trino SQL; the result is OK
select id, name from test;

// then query the paimon table including the tags field; it reports an error
select * from test;

Caused by: java.lang.NoSuchMethodError: 'io.trino.spi.block.BlockBuilder io.trino.spi.block.BlockBuilder.beginBlockEntry()'
	at org.apache.paimon.trino.TrinoPageSourceBase.writeBlock(TrinoPageSourceBase.java:271)
	at org.apache.paimon.trino.TrinoPageSourceBase.appendTo(TrinoPageSourceBase.java:201)
	at org.apache.paimon.trino.TrinoPageSourceBase.nextPage(TrinoPageSourceBase.java:138)
	at org.apache.paimon.trino.TrinoPageSourceBase.lambda$getNextPage$0(TrinoPageSourceBase.java:119)
	at org.apache.paimon.trino.ClassLoaderUtils.runWithContextClassLoader(ClassLoaderUtils.java:30)
	at org.apache.paimon.trino.TrinoPageSourceBase.getNextPage(TrinoPageSourceBase.java:116)
	at io.trino.operator.TableScanOperator.getOutput(TableScanOperator.java:299)
	at io.trino.operator.Driver.processInternal(Driver.java:395)
	at io.trino.operator.Driver.lambda$process$8(Driver.java:298)
	at io.trino.operator.Driver.tryWithLock(Driver.java:694)
	at io.trino.operator.Driver.process(Driver.java:290)
	at io.trino.operator.Driver.processForDuration(Driver.java:261)
	at io.trino.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:887)
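The NoSuchMethodError indicates the plugin still calls BlockBuilder.beginBlockEntry(), which newer Trino SPIs removed in favor of typed buildEntry callbacks. A minimal sketch of the replacement shape for a MAP<string, string> value (writeMap is illustrative; confirm against the SPI version the plugin targets):

import io.airlift.slice.Slices;
import io.trino.spi.block.BlockBuilder;
import io.trino.spi.block.MapBlockBuilder;

import java.util.Map;

import static io.trino.spi.type.VarcharType.VARCHAR;

final class MapWriteSketch {
    static void writeMap(BlockBuilder output, Map<String, String> value) {
        // buildEntry hands out one key builder and one value builder for this map entry
        ((MapBlockBuilder) output).buildEntry((keyBuilder, valueBuilder) -> {
            for (Map.Entry<String, String> e : value.entrySet()) {
                VARCHAR.writeSlice(keyBuilder, Slices.utf8Slice(e.getKey()));
                VARCHAR.writeSlice(valueBuilder, Slices.utf8Slice(e.getValue()));
            }
        });
    }
}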

[Bug] org.apache.paimon.trino.fileio.TrinoFileIO#listStatus recursively lists paths when using HDFS

As in issue #54, when using HDFS the semantics of org.apache.paimon.trino.fileio.TrinoFileIO#listStatus are broken.

I understand the semantics of the org.apache.paimon.trino.fileio.TrinoFileIO#listStatus method to be listing the first-level directories and files under the given path.

public FileStatus[] listStatus(Path path) throws IOException {
    List<FileStatus> fileStatusList = new ArrayList<>();
    Location location = Location.of(path.toString());
    if (trinoFileSystem.directoryExists(location).orElse(false)) {
        FileIterator fileIterator = trinoFileSystem.listFiles(location);
        while (fileIterator.hasNext()) {
            FileEntry fileEntry = fileIterator.next();
            fileStatusList.add(
                    new TrinoFileStatus(
                            fileEntry.length(),
                            new Path(fileEntry.location().toString()),
                            fileEntry.lastModified().getEpochSecond()));
        }
        trinoFileSystem
                .listDirectories(Location.of(path.toString()))
                .forEach(
                        l ->
                                fileStatusList.add(
                                        new TrinoDirectoryFileStatus(new Path(l.toString()))));
    }
    return fileStatusList.toArray(new FileStatus[0]);
}

However, when using HDFS, the implementation of trinoFileSystem.listFiles recursively traverses subdirectories, which can lead to unnecessary data scanning.
https://github.com/trinodb/trino/blob/a9c5719705614b3849f2e1a22b2a545da125bd32/lib/trino-hdfs/src/main/java/io/trino/filesystem/hdfs/HdfsFileSystem.java#L228-L246
A possible solution is to add a parameter to the trinoFileSystem.listFiles method to control the scanning depth, or an option to switch between recursive and non-recursive modes; a client-side workaround is sketched below.
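Until the filesystem API offers a non-recursive mode, a client-side workaround can at least restore the first-level semantics by filtering on the parent directory. Note this fixes correctness but not cost, since HDFS still walks the whole tree underneath (a sketch, assuming Location equality is textual):

import io.trino.filesystem.FileEntry;
import io.trino.filesystem.FileIterator;
import io.trino.filesystem.Location;
import io.trino.filesystem.TrinoFileSystem;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

final class ShallowListSketch {
    static List<FileEntry> listDirectChildren(TrinoFileSystem fs, Location dir) throws IOException {
        List<FileEntry> direct = new ArrayList<>();
        FileIterator it = fs.listFiles(dir); // still recursive on HDFS
        while (it.hasNext()) {
            FileEntry entry = it.next();
            // keep only entries whose parent is the listed directory itself
            if (dir.equals(entry.location().parentDirectory())) {
                direct.add(entry);
            }
        }
        return direct;
    }
}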

[Feature] support prestosql and jdk8

Motivation

To be compatible with older PrestoSQL systems

Solution

No response

Anything else?

No response

Are you willing to submit a PR?

I'm willing to submit a PR!

TrinoSplitManagerBase's single read batch is too large

The nextPage method of TrinoPageSourceBase reads a batch of data on each call. Debugging shows the batch can be very large: when processing strings, DynamicSliceOutput.writeBytes on a VariableWidthBlockBuilder keeps growing the builder's size field until it exceeds the maximum int value and overflows.
Debug code: [screenshot]

Batch size print: [screenshot]

Exception occurrence code: [screenshot]
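One mitigation, sketched under the assumption that the page source can size its batches with the SPI's PageBuilder: isFull() trips near a roughly 1 MB target, far below the point where a block's internal size can overflow int. BoundedBatchSketch and RowWriter are hypothetical names.

import io.trino.spi.Page;
import io.trino.spi.PageBuilder;

import java.util.Iterator;

final class BoundedBatchSketch {
    // hypothetical hook for whatever writes one row into the builder
    interface RowWriter<T> {
        void write(PageBuilder builder, T row);
    }

    static <T> Page nextPage(PageBuilder builder, Iterator<T> rows, RowWriter<T> writer) {
        builder.reset();
        // bound the batch by accumulated bytes, not by row count
        while (rows.hasNext() && !builder.isFull()) {
            writer.write(builder, rows.next());
            builder.declarePosition();
        }
        return builder.build();
    }
}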

java.lang.IndexOutOfBoundsException: end index (-2147483499) must not be negative

paimon version : 0.5
trino version : 416

This exception occurred while I was executing a SELECT statement. I added some exception traps and logged extra information, and found that the size field in the VariableWidthBlockBuilder class exceeds the maximum int value and becomes negative.

io.trino.spi.TrinoException: org.apache.paimon.table.source.KeyValueTableRead$RowDataRecordReader
at org.apache.paimon.trino.TrinoPageSourceBase.nextPage(TrinoPageSourceBase.java:141)
at org.apache.paimon.trino.TrinoPageSourceBase.lambda$getNextPage$0(TrinoPageSourceBase.java:114)
at org.apache.paimon.trino.ClassLoaderUtils.runWithContextClassLoader(ClassLoaderUtils.java:30)
at org.apache.paimon.trino.TrinoPageSourceBase.getNextPage(TrinoPageSourceBase.java:112)
at io.trino.operator.TableScanOperator.getOutput(TableScanOperator.java:301)
at io.trino.operator.Driver.processInternal(Driver.java:402)
at io.trino.operator.Driver.lambda$process$8(Driver.java:305)
at io.trino.operator.Driver.tryWithLock(Driver.java:701)
at io.trino.operator.Driver.process(Driver.java:297)
at io.trino.operator.Driver.processForDuration(Driver.java:268)
at io.trino.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:888)
at io.trino.execution.executor.PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:187)
at io.trino.execution.executor.TaskExecutor$TaskRunner.run(TaskExecutor.java:561)
at io.trino.$gen.Trino_416____20230609_030235_2.run(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: io.trino.spi.TrinoException: VariableWidthBlockBuilder{positionCount=54228272, size=2147483624}
Slice length :173,Slice value :资产计算状态为:1000,资产对effDate字段进行补充数据:Thu Oct 22 19:51:18 CST 2020,资产对expDate字段进行补充数据:Wed Jan 01 00:00:00 CST 3000,
at org.apache.paimon.trino.TrinoPageSourceBase.writeSlice(TrinoPageSourceBase.java:215)
at org.apache.paimon.trino.TrinoPageSourceBase.appendTo(TrinoPageSourceBase.java:193)
at org.apache.paimon.trino.TrinoPageSourceBase.nextPage(TrinoPageSourceBase.java:134)
... 16 more
Caused by: java.lang.IndexOutOfBoundsException: end index (-2147483499) must not be negative
at io.airlift.slice.Preconditions.checkPositionIndexes(Preconditions.java:81)
at io.airlift.slice.Slice.checkIndexLength(Slice.java:1301)
at io.airlift.slice.Slice.setBytes(Slice.java:788)
at io.airlift.slice.DynamicSliceOutput.writeBytes(DynamicSliceOutput.java:152)
at io.trino.spi.block.VariableWidthBlockBuilder.writeBytes(VariableWidthBlockBuilder.java:244)
at io.trino.spi.type.VarcharType.writeSlice(VarcharType.java:220)
at io.trino.spi.type.VarcharType.writeSlice(VarcharType.java:214)
at org.apache.paimon.trino.TrinoPageSourceBase.writeSlice(TrinoPageSourceBase.java:212)
... 18 more

Improvement for paimon connector setup

Presto 333 does not need an extra <hudi_catalog> to access Hudi tables; at deploy time you only need to place hudi-trino-bundle-0.13.0.jar in the $PRESTO_HOME/plugin/hive-hadoop2 directory of the coordinator node.

The requirements are:

  1. Access Paimon tables in Hive.
  2. Only deploy jars on the coordinator node.

Unable to select map type field when using trino-server-422

To Reproduce

  1. If the field is of bigint/int/string type, the SELECT query executes normally. [screenshot]
  2. But if the field is of map type, the SELECT query fails. [screenshot]

Stacktrace

java.lang.NoSuchMethodError: 'io.trino.spi.block.BlockBuilder io.trino.spi.block.BlockBuilder.beginBlockEntry()'
	at org.apache.paimon.trino.TrinoPageSourceBase.writeBlock(TrinoPageSourceBase.java:271)
	at org.apache.paimon.trino.TrinoPageSourceBase.appendTo(TrinoPageSourceBase.java:201)
	at org.apache.paimon.trino.TrinoPageSourceBase.nextPage(TrinoPageSourceBase.java:138)
	at org.apache.paimon.trino.TrinoPageSourceBase.lambda$getNextPage$0(TrinoPageSourceBase.java:119)
	at org.apache.paimon.trino.ClassLoaderUtils.runWithContextClassLoader(ClassLoaderUtils.java:30)
	at org.apache.paimon.trino.TrinoPageSourceBase.getNextPage(TrinoPageSourceBase.java:116)
	at io.trino.operator.ScanFilterAndProjectOperator$ConnectorPageSourceToPages.process(ScanFilterAndProjectOperator.java:386)
	at io.trino.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:412)
	at io.trino.operator.WorkProcessorUtils.getNextState(WorkProcessorUtils.java:261)
	at io.trino.operator.WorkProcessorUtils$YieldingProcess.process(WorkProcessorUtils.java:181)
	at io.trino.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:412)
	at io.trino.operator.WorkProcessorUtils$3.process(WorkProcessorUtils.java:346)
	at io.trino.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:412)
	at io.trino.operator.WorkProcessorUtils$3.process(WorkProcessorUtils.java:346)
	at io.trino.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:412)
	at io.trino.operator.WorkProcessorUtils$3.process(WorkProcessorUtils.java:346)
	at io.trino.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:412)
	at io.trino.operator.WorkProcessorUtils.getNextState(WorkProcessorUtils.java:261)
	at io.trino.operator.WorkProcessorUtils$BlockingProcess.process(WorkProcessorUtils.java:207)
	at io.trino.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:412)
	at io.trino.operator.WorkProcessorUtils.lambda$flatten$6(WorkProcessorUtils.java:317)
	at io.trino.operator.WorkProcessorUtils$3.process(WorkProcessorUtils.java:359)
	at io.trino.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:412)
	at io.trino.operator.WorkProcessorUtils$3.process(WorkProcessorUtils.java:346)
	at io.trino.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:412)
	at io.trino.operator.WorkProcessorUtils.getNextState(WorkProcessorUtils.java:261)
	at io.trino.operator.WorkProcessorUtils.lambda$processStateMonitor$2(WorkProcessorUtils.java:240)
	at io.trino.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:412)
	at io.trino.operator.WorkProcessorUtils.getNextState(WorkProcessorUtils.java:261)
	at io.trino.operator.WorkProcessorUtils.lambda$finishWhen$3(WorkProcessorUtils.java:255)
	at io.trino.operator.WorkProcessorUtils$ProcessWorkProcessor.process(WorkProcessorUtils.java:412)
	at io.trino.operator.WorkProcessorSourceOperatorAdapter.getOutput(WorkProcessorSourceOperatorAdapter.java:145)
	at io.trino.operator.Driver.processInternal(Driver.java:395)
	at io.trino.operator.Driver.lambda$process$8(Driver.java:298)
	at io.trino.operator.Driver.tryWithLock(Driver.java:694)
	at io.trino.operator.Driver.process(Driver.java:290)
	at io.trino.operator.Driver.processForDuration(Driver.java:261)
	at io.trino.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:887)
	at io.trino.execution.executor.PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:187)
	at io.trino.execution.executor.TaskExecutor$TaskRunner.run(TaskExecutor.java:555)
	at io.trino.$gen.Trino_tag_trino_422_1_0_1____20230831_065806_38.run(Unknown Source)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
	at java.base/java.lang.Thread.run(Thread.java:833)

Component Version

  • Trino version: 422
  • paimon-trino version: 0.5-SNAPSHOT

Error querying data of TIME / TIMESTAMP_TZ_MILLIS types

Exception in thread "main" java.sql.SQLException: Query failed (#20230809_075412_00001_3b2ik): Unhandled type for long: time(3)
at io.trino.jdbc.AbstractTrinoResultSet.resultsException(AbstractTrinoResultSet.java:1937)
at io.trino.jdbc.TrinoResultSet$ResultsPageIterator.computeNext(TrinoResultSet.java:294)
at io.trino.jdbc.TrinoResultSet$ResultsPageIterator.computeNext(TrinoResultSet.java:254)
at io.trino.jdbc.$internal.guava.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:145)
at io.trino.jdbc.$internal.guava.collect.AbstractIterator.hasNext(AbstractIterator.java:140)
at java.base/java.util.Spliterators$IteratorSpliterator.tryAdvance(Spliterators.java:1811)
at java.base/java.util.stream.StreamSpliterators$WrappingSpliterator.lambda$initPartialTraversalState$0(StreamSpliterators.java:294)
at java.base/java.util.stream.StreamSpliterators$AbstractWrappingSpliterator.fillBuffer(StreamSpliterators.java:206)
at java.base/java.util.stream.StreamSpliterators$AbstractWrappingSpliterator.doAdvance(StreamSpliterators.java:161)
at java.base/java.util.stream.StreamSpliterators$WrappingSpliterator.tryAdvance(StreamSpliterators.java:300)
at java.base/java.util.Spliterators$1Adapter.hasNext(Spliterators.java:681)
at io.trino.jdbc.TrinoResultSet$AsyncIterator.lambda$new$1(TrinoResultSet.java:179)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: io.trino.spi.TrinoException: Unhandled type for long: time(3)
at org.apache.paimon.trino.TrinoPageSourceBase.appendTo(TrinoPageSourceBase.java:187)
at org.apache.paimon.trino.TrinoPageSourceBase.nextPage(TrinoPageSourceBase.java:138)
at org.apache.paimon.trino.TrinoPageSourceBase.lambda$getNextPage$0(TrinoPageSourceBase.java:119)
at org.apache.paimon.trino.ClassLoaderUtils.runWithContextClassLoader(ClassLoaderUtils.java:30)
at org.apache.paimon.trino.TrinoPageSourceBase.getNextPage(TrinoPageSourceBase.java:116)
at io.trino.operator.TableScanOperator.getOutput(TableScanOperator.java:311)
at io.trino.operator.Driver.processInternal(Driver.java:410)
at io.trino.operator.Driver.lambda$process$10(Driver.java:313)
at io.trino.operator.Driver.tryWithLock(Driver.java:698)
at io.trino.operator.Driver.process(Driver.java:305)
at io.trino.operator.Driver.processForDuration(Driver.java:276)
at io.trino.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:740)
at io.trino.execution.executor.PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:163)
at io.trino.execution.executor.TaskExecutor$TaskRunner.run(TaskExecutor.java:488)
at io.trino.$gen.Trino_testversion____20230809_075404_1.run(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:834)
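The likely gap is a missing TIME branch in appendTo. A hedged sketch, assuming Paimon hands TIME values back as milliseconds of the day (an int): Trino's TIME stores picoseconds of the day in a long, so the value needs a 10^9 scale-up. writeTime is an illustrative helper.

import io.trino.spi.block.BlockBuilder;
import io.trino.spi.type.TimeType;

final class TimeWriteSketch {
    private static final long PICOS_PER_MILLI = 1_000_000_000L;

    static void writeTime(TimeType type, BlockBuilder output, int millisOfDay) {
        // time(p) values are longs counting picoseconds since midnight
        type.writeLong(output, millisOfDay * PICOS_PER_MILLI);
    }
}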

Query error when defining column as timestamp_ltz

Trino version: 382
paimon jar: paimon-trino-370-0.5-20230907.000855-113.jar
https://repository.apache.org/snapshots/org/apache/paimon/paimon-trino-370/0.5-SNAPSHOT/

Table schema: [screenshot]

Query SQL: [screenshot]

Error:

2023-09-12T17:28:42.610+0800    DEBUG   stage-scheduler io.trino.execution.QueryStateMachine    Query 20230912_092836_00011_qb5pf failed
io.trino.spi.TrinoException: Unhandled type for long: timestamp(3) with time zone
        at org.apache.paimon.trino.TrinoPageSourceBase.appendTo(TrinoPageSourceBase.java:187)
        at org.apache.paimon.trino.TrinoPageSourceBase.nextPage(TrinoPageSourceBase.java:138)
        at org.apache.paimon.trino.TrinoPageSourceBase.lambda$getNextPage$0(TrinoPageSourceBase.java:119)
        at org.apache.paimon.trino.ClassLoaderUtils.runWithContextClassLoader(ClassLoaderUtils.java:30)
        at org.apache.paimon.trino.TrinoPageSourceBase.getNextPage(TrinoPageSourceBase.java:116)
        at io.trino.operator.TableScanOperator.getOutput(TableScanOperator.java:311)
        at io.trino.operator.Driver.processInternal(Driver.java:410)
        at io.trino.operator.Driver.lambda$process$10(Driver.java:313)
        at io.trino.operator.Driver.tryWithLock(Driver.java:698)
        at io.trino.operator.Driver.process(Driver.java:305)
        at io.trino.operator.Driver.processForDuration(Driver.java:276)
        at io.trino.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:1092)
        at io.trino.execution.executor.PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:163)
        at io.trino.execution.executor.TaskExecutor$TaskRunner.run(TaskExecutor.java:488)
        at io.trino.$gen.Trino_382_bk_1_0_1____20230912_092719_2.run(Unknown Source)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:829)
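For short-precision timestamp with time zone (precision <= 3), Trino packs the epoch millis together with a time-zone key into a single long. A hedged sketch of the missing branch; normalizing to UTC_KEY is an assumption about how the connector represents instants, and writeTimestampTz is an illustrative helper.

import io.trino.spi.block.BlockBuilder;
import io.trino.spi.type.TimestampWithTimeZoneType;

import static io.trino.spi.type.DateTimeEncoding.packDateTimeWithZone;
import static io.trino.spi.type.TimeZoneKey.UTC_KEY;

final class TimestampTzWriteSketch {
    static void writeTimestampTz(TimestampWithTimeZoneType type, BlockBuilder output, long epochMillis) {
        // the long value carries both the UTC millis and the zone key
        type.writeLong(output, packDateTimeWithZone(epochMillis, UTC_KEY));
    }
}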

java.lang.IndexOutOfBoundsException: end index (-2147481323) must not be negative

I encountered the same problem today, as mentioned in previous issues. Is there a solution? Here is my exception information:
2024-04-09T18:50:12.548+0800 ERROR stage-scheduler io.trino.execution.StageStateMachine Stage 20240409_104939_00005_8j8ji.1 failed
java.lang.IndexOutOfBoundsException: end index (-2147481323) must not be negative
at io.airlift.slice.Preconditions.checkPositionIndexes(Preconditions.java:81)
at io.airlift.slice.Slice.checkIndexLength(Slice.java:1302)
at io.airlift.slice.Slice.setBytes(Slice.java:788)
at io.airlift.slice.DynamicSliceOutput.writeBytes(DynamicSliceOutput.java:153)
at io.trino.spi.block.VariableWidthBlockBuilder.writeBytes(VariableWidthBlockBuilder.java:245)
at io.trino.spi.type.VarcharType.writeSlice(VarcharType.java:205)
at io.trino.spi.type.VarcharType.writeSlice(VarcharType.java:199)
at org.apache.paimon.trino.TrinoPageSourceBase.writeSlice(TrinoPageSourceBase.java:211)
at org.apache.paimon.trino.TrinoPageSourceBase.appendTo(TrinoPageSourceBase.java:194)
at org.apache.paimon.trino.TrinoPageSourceBase.nextPage(TrinoPageSourceBase.java:138)
at org.apache.paimon.trino.TrinoPageSourceBase.lambda$getNextPage$0(TrinoPageSourceBase.java:119)
at org.apache.paimon.trino.ClassLoaderUtils.runWithContextClassLoader(ClassLoaderUtils.java:30)
at org.apache.paimon.trino.TrinoPageSourceBase.getNextPage(TrinoPageSourceBase.java:116)
at io.trino.operator.TableScanOperator.getOutput(TableScanOperator.java:311)
at io.trino.operator.Driver.processInternal(Driver.java:410)
at io.trino.operator.Driver.lambda$process$10(Driver.java:313)
at io.trino.operator.Driver.tryWithLock(Driver.java:698)
at io.trino.operator.Driver.process(Driver.java:305)
at io.trino.operator.Driver.processForDuration(Driver.java:276)
at io.trino.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:1092)
at io.trino.execution.executor.PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:163)
at io.trino.execution.executor.TaskExecutor$TaskRunner.run(TaskExecutor.java:488)
at io.trino.$gen.Trino_5bf98be____20240409_104157_2.run(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)

Type error when querying a Paimon float field mapped to Trino's real type

As mentioned in the title, after MySQL CDC ingestion I encountered the following error when querying a field of float type:

io.trino.spi.TrinoException: Unhandled type for long: real
at org.apache.paimon.trino.TrinoPageSourceBase.appendTo(TrinoPageSourceBase.java:182)
at org.apache.paimon.trino.TrinoPageSourceBase.nextPage(TrinoPageSourceBase.java:135)
at org.apache.paimon.trino.TrinoPageSourceBase.lambda$getNextPage$0(TrinoPageSourceBase.java:116)
at org.apache.paimon.trino.ClassLoaderUtils.runWithContextClassLoader(ClassLoaderUtils.java:30)
at org.apache.paimon.trino.TrinoPageSourceBase.getNextPage(TrinoPageSourceBase.java:113)
at io.trino.operator.TableScanOperator.getOutput(TableScanOperator.java:300)
at io.trino.operator.Driver.processInternal(Driver.java:396)
at io.trino.operator.Driver.lambda$process$8(Driver.java:299)
at io.trino.operator.Driver.tryWithLock(Driver.java:695)
at io.trino.operator.Driver.process(Driver.java:291)
at io.trino.operator.Driver.processForDuration(Driver.java:262)
at io.trino.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:888)
at io.trino.execution.executor.PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:187)
at io.trino.execution.executor.TaskExecutor$TaskRunner.run(TaskExecutor.java:556)
at io.trino.$gen.Trino_420____20230626_051456_2.run(Unknown Source)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.base/java.lang.Thread.run(Thread.java:833)

After investigating the error, I found that RealType#getJavaType in Trino actually returns long.class, so TrinoPageSourceBase#appendTo was missing a check for RealType, which caused this error. Referring to how other types are handled, I added type.writeLong(output, floatToIntBits(((Number) value).floatValue())); to resolve the issue, as sketched below.
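The reporter's fix, sketched in context (writeReal is an illustrative wrapper): REAL's Java type is long, carrying the IEEE-754 bits of the float in its low 32 bits.

import io.trino.spi.block.BlockBuilder;
import io.trino.spi.type.RealType;

import static java.lang.Float.floatToIntBits;

final class RealWriteSketch {
    static void writeReal(RealType type, BlockBuilder output, Object value) {
        // REAL is stored as the float's bit pattern widened into a long
        type.writeLong(output, floatToIntBits(((Number) value).floatValue()));
    }
}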
