Comments (2)
FYI, I'm working towards standardizing the interface for PyArrow datasets to make them easier for engines, including Dask, to consume. That research is how I found this. If you're interested, feel free to read and/or comment on this document: https://docs.google.com/document/d/1r56nt5Un2E7yPrZO9YPknBN4EDtptpx-tqOZReHvq1U/edit?usp=sharing
from dask-deltatable.
BTW, this would allow column projection pushdown, based on this protocol: https://github.com/dask/dask/blob/12c7c10d0c15391a6522fe2dc7df191f8088967e/dask/dataframe/io/utils.py#L224
Maybe that same protocol will support filter pushdown too?
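To make the idea concrete, here is a minimal sketch of what column projection pushdown through such a protocol could look like. It assumes the linked protocol asks an IO function for a `columns` property, a `project_columns` method that returns a new function, and `__call__` to produce one partition. The names `ProjectableIOFunction` and `ReadFragment` are hypothetical, and a plain dict of lists stands in for a real PyArrow fragment so the example has no dependencies.

```python
from typing import Dict, List, Optional, Protocol, Sequence, runtime_checkable


@runtime_checkable
class ProjectableIOFunction(Protocol):
    """Hypothetical stand-in for the projection protocol linked above."""

    @property
    def columns(self) -> List[str]:
        ...

    def project_columns(self, columns: Sequence[str]) -> "ProjectableIOFunction":
        ...

    def __call__(self, fragment: Dict[str, list]) -> Dict[str, list]:
        ...


class ReadFragment:
    """Reads one fragment, materializing only the projected columns."""

    def __init__(self, schema: Sequence[str], columns: Optional[Sequence[str]] = None):
        self._schema = list(schema)
        # With no projection, every column in the schema is read.
        self._columns = list(columns) if columns is not None else list(schema)

    @property
    def columns(self) -> List[str]:
        return list(self._columns)

    def project_columns(self, columns: Sequence[str]) -> "ReadFragment":
        missing = [name for name in columns if name not in self._schema]
        if missing:
            raise KeyError(f"columns not in schema: {missing}")
        if list(columns) == self._columns:
            return self
        # Return a new object so the original IO function is left untouched.
        return ReadFragment(self._schema, columns)

    def __call__(self, fragment: Dict[str, list]) -> Dict[str, list]:
        # A real reader would pass self._columns down to the file reader here,
        # so unselected columns are never read at all.
        return {name: fragment[name] for name in self._columns}


reader = ReadFragment(["id", "value", "day"])
fragment = {"id": [1, 2], "value": [0.5, 1.5], "day": ["a", "b"]}

projected = reader.project_columns(["id", "day"])
result = projected(fragment)
```

Filter pushdown could presumably follow the same shape, with an extra method that returns a new function carrying a filter expression, but that part is speculation rather than something the linked protocol is known to provide.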
Related Issues (20)
- Handle timestamps other than `datetime64[us]`
- Release soon?
- Finalize API for writing Delta Tables
- Support pyarrow types_mapper kwarg
- Pickle error with `ParquetFileWriteOptions` and `distributed.Client`
- Support reading and writing to remote filesystems (s3, gcsfs, azure)
- Credentials for remote filesystems?
- `storage_options` inconsistency between `read_deltalake` and `to_deltalake`
- `TypeError`: cannot pickle `builtins.RawDeltaTable` object
- `read_deltalake` vs `read_parquet` performance
- Can we get rid of `filters_to_expression`?
- What are the limitations of to_deltalake?
- Problem with `pyarrow` dependency when installing dask-deltatable
- Failed import when running `deltalake==0.14.0`
- Order data by partitions if available
- Specify AWS Permissions if reading from S3
- Overwriting tables
- `ImportError` with `deltalake=0.16.0`
- Example in Readme not reproducible
- `read_deltalake` breaks with dask>=2024.3.1