Comments (6)
I'm reasonably confident the error is originating from here, based on my reading of the various error messages:
delta-rs/crates/core/src/operations/optimize.rs
Lines 500 to 511 in f041692
Since it's run in a blocking context on the Python side, I'm wondering if that's causing any weirdness (it shouldn't).
from delta-rs.
@abhiaagarwal I wish I could assist but my Rust knowledge is very limited. But let me know if I need to test something.
The same issue also happens occasionally when reading from a Delta table on Azure Gen 2:
File "pyarrow\\_dataset.pyx", line 562, in pyarrow._dataset.Dataset.to_table
File "pyarrow\\_dataset.pyx", line 3804, in pyarrow._dataset.Scanner.to_table
File "pyarrow\\error.pxi", line 154, in pyarrow.lib.pyarrow_internal_check_status
File "pyarrow\\error.pxi", line 88, in pyarrow.lib.check_status
OSError: Generic MicrosoftAzure error: error decoding response body
Here, the `to_table` call is what triggers the error.
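Since the failure above is intermittent, one way to gather data on how often it occurs is to wrap the read in a small retry helper that logs each failed attempt. This is only a sketch and not a fix for the underlying problem: `read_with_retry` and its parameters are hypothetical names, and it assumes the failure surfaces as an `OSError`, as in the traceback above.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def read_with_retry(read: Callable[[], T], attempts: int = 3, base_delay: float = 1.0) -> T:
    """Call `read`, retrying on OSError with exponential backoff.

    Hypothetical helper: it assumes the intermittent failure is raised as
    OSError, which is how pyarrow reported it in the traceback above.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return read()
        except OSError as exc:
            if attempt == attempts:
                # Out of retries: re-raise the original error unchanged.
                raise
            print(f"attempt {attempt} failed: {exc}")
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            time.sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

With a `deltalake.DeltaTable` named `dt`, a call could look like `read_with_retry(lambda: dt.to_pyarrow_dataset().to_table())`. The printed attempt log gives a rough failure rate to report back in the issue.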
@Josh-Hiz what happens if you try benchmarking with azcopy, like I have done here: apache/arrow-rs#5882 (comment)? Maybe you can add a data point as a comment?
> @Josh-Hiz what happens if try benchmarking with azcopy like I have done here: apache/arrow-rs#5882 (comment) maybe you can add a data point as a comment?

Very gentle ping @Josh-Hiz :-)
@thomasfrederikhoeck try to create a reproducible example that mimics the size and characteristics of your table on Azure. Otherwise no one can properly replicate the issue.
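A reproducible example along those lines could look roughly like the sketch below: generate synthetic data, append it in many small commits so the table has plenty of small files, then run a compaction. The helper names, column layout, batch counts, and row counts are all assumptions for illustration and should be adjusted to match the real table. `table_uri` and `storage_options` are placeholders to fill in, and the second function needs `pyarrow` and `deltalake` installed.

```python
import random


def synthetic_columns(n_rows: int, n_string_cols: int, seed: int = 0) -> dict:
    """Build a dict of columns: an integer id plus random hex-string columns.

    Hypothetical helper; the schema is made up and should be changed to
    resemble the table that actually fails.
    """
    rng = random.Random(seed)  # seeded so every run produces the same data
    columns = {"id": list(range(n_rows))}
    for i in range(n_string_cols):
        columns[f"col_{i}"] = ["%016x" % rng.getrandbits(64) for _ in range(n_rows)]
    return columns


def write_and_compact(table_uri: str, storage_options: dict,
                      n_batches: int = 50, rows_per_batch: int = 100_000) -> None:
    """Append many small batches, then compact (assumed reproduction steps)."""
    # Imported here so the generator above works without these packages.
    import pyarrow as pa
    from deltalake import DeltaTable, write_deltalake

    for batch in range(n_batches):
        data = pa.table(synthetic_columns(rows_per_batch, 10, seed=batch))
        write_deltalake(table_uri, data, mode="append", storage_options=storage_options)

    # Compaction is the operation under suspicion in this thread.
    DeltaTable(table_uri, storage_options=storage_options).optimize.compact()
```

Seeding the generator keeps the data identical between runs, so anyone trying the script against their own storage account is working with the same table contents.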