Comments (8)
It looks like a known issue reported in: #9305
from hudi.
Hi @danny0405, yes, it does look similar. However, the table was already running with Spark 3.3.2 and Hudi 0.13.1 without errors. The only changes here were upgrading Hudi to 0.14.1 and turning on metadata.
The casting also seems to be in the opposite direction and the first run with Hudi 0.14.1 did not have metadata and succeeded. Do you think the issue is related to how the metadata table is saved? In other words, is metadata not supported with Spark 3.3.2?
Thanks for the help!
Do you think the issue is related to how the metadata table is saved? In other words, is metadata not supported with Spark 3.3.2?
It is supported. Can you share your config options related to the metadata table?
Hi @danny0405, we are using defaults only. All Hudi configs specified are listed above. Is there something we should configure specifically?
I'm pretty sure it is a jar conflict. Can you check which jar contains the reported class?
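One way to run that check is sketched below, using only the standard Java API. It prints where a class was actually loaded from and lists every classpath entry that ships a copy of it; more than one entry would suggest a conflict. The class name `JarLocator` and its two helper methods are hypothetical names chosen for illustration, and the sketch assumes it is run with the same classpath as the failing Spark job.

```java
import java.io.IOException;
import java.net.URL;
import java.security.CodeSource;
import java.util.Collections;
import java.util.List;

public class JarLocator {
    /** Returns the jar or directory a class was loaded from, or null for JDK bootstrap classes. */
    static URL loadedFrom(Class<?> cls) {
        CodeSource source = cls.getProtectionDomain().getCodeSource();
        return source == null ? null : source.getLocation();
    }

    /** Lists every classpath location that contains a .class file for the given class name. */
    static List<URL> allProviders(String className) throws IOException {
        String resource = className.replace('.', '/') + ".class";
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClassLoader.getSystemClassLoader();
        }
        return Collections.list(loader.getResources(resource));
    }

    public static void main(String[] args) throws Exception {
        // Default to the class discussed in this thread; pass another name as the first argument.
        String name = args.length > 0
                ? args[0]
                : "org.apache.spark.sql.catalyst.expressions.UnsafeRow";
        try {
            System.out.println("Loaded from: " + loadedFrom(Class.forName(name)));
        } catch (ClassNotFoundException e) {
            System.out.println("Not on classpath: " + name);
        }
        // More than one line here means several jars ship the same class.
        for (URL url : allProviders(name)) {
            System.out.println("Provided by: " + url);
        }
    }
}
```

If the "Provided by" list shows the class in more than one jar (for example, a bundle jar and a separate Spark jar), that would be consistent with the jar-conflict theory.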
@vicuna96 How many columns are there in your dataset? If it's more than 100, did you try setting spark.sql.codegen.maxFields?
Hi @danny0405, this seems to be in the spark-catalyst_2.12-3.3.2.jar package, but org.apache.spark.sql.catalyst.expressions.UnsafeRow does not extend org.apache.spark.sql.vectorized.ColumnarBatch. Is this expected in different versions?
Hi @ad1happy2go, I can give it a try, but the table should have fewer than 100 columns. This also seems like a Spark property rather than a Hudi property, and the Spark version has not changed. I will update once I get a chance to test it.
@vicuna96 Did you get a chance to test it out?