Comments (4)
For reference, this is an implementation I added in https://github.com/data-engineering-collective/plateau/blob/de3363b6e0799eaf9be8b3feee658a51eac56428/plateau/io/dask/dataframe.py#L354
It's essentially a custom tree reduction to perform a dataset commit.
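To illustrate what "tree reduction" means in this context, here is a minimal pure-Python sketch. It is not Plateau's or Dask's actual code: `tree_reduce`, `combine`, and the file names are hypothetical, and `split_every` is borrowed only as a name for the fan-in per level. Per-partition results are merged in small groups, level by level, until a single result remains that a final step (the commit, in Plateau's case) could consume.

```python
from functools import reduce


def tree_reduce(chunks, combine, split_every=2):
    """Combine `chunks` level by level, merging `split_every` items at a time."""
    if split_every < 2:
        raise ValueError("split_every must be at least 2")
    level = list(chunks)
    if not level:
        raise ValueError("need at least one chunk")
    # Each pass shrinks the list by a factor of `split_every`
    # until only the root of the tree is left.
    while len(level) > 1:
        level = [
            reduce(combine, level[i:i + split_every])
            for i in range(0, len(level), split_every)
        ]
    return level[0]


# Hypothetical per-partition metadata: each partition reports the files it wrote.
partition_metadata = [["part-0.parquet"], ["part-1.parquet"], ["part-2.parquet"]]

# Concatenate the lists pairwise; the single result would feed one final commit.
committed_files = tree_reduce(partition_metadata, lambda a, b: a + b, split_every=2)
```

In a distributed setting each `combine` call within a level can run as its own task, which is the main reason to prefer a tree over a single flat reduction.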
The use of `.reduction` in Plateau is indeed exactly how I stumbled over this issue. Thanks for the context, @fjetter.
Hi, thanks for your report. We removed this intentionally since it was never really meant to be public. Could you give a bit more context about what you are using the reduction method for?
Thanks!