Comments (2)
Thanks for the question. Generally, I would say an operation on a specific dtype backend (pyarrow, nullable numpy, normal numpy) should return the same dtype backend unless:
- An API states it should return a specific type, e.g. `ExtensionArray` method APIs
- Two different backends are mixed in an operation, e.g. `int64[pyarrow]` + `Int64` (probably not well defined in all cases)
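To make the "same backend in, same backend out" expectation concrete, here is a minimal sketch that adds a scalar to a Series of each backend and reports the resulting dtype. The helper name `backend_after_add` is made up for illustration, and the pyarrow case only runs if pyarrow is installed.

```python
import pandas as pd


def backend_after_add(dtype):
    """Return the dtype name after adding a scalar to a Series of `dtype`."""
    s = pd.Series([1, 2, 3], dtype=dtype)
    return str((s + 1).dtype)


# Normal numpy and nullable numpy backends are always available.
dtypes = ["int64", "Int64"]

# The pyarrow backend is optional, so only include it when importable.
try:
    import pyarrow  # noqa: F401

    dtypes.append("int64[pyarrow]")
except ImportError:
    pass

results = {d: backend_after_add(d) for d in dtypes}
for before, after in results.items():
    print(f"{before:>16} -> {after}")
```

Each line should show the same dtype on both sides of the arrow. This covers only the simple single-backend case, not the two exceptions listed above.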
from pandas.
This is a great question, though, and something it would serve us well to define better in the future. For simple types like ints and floats, where NumPy and pyarrow share the same storage layout (at least for the data buffers), I think NumPy + Arrow should return Arrow; otherwise it would be a lossy operation.
For more complex types that's an open question, but I would still prefer for Arrow types to come out on top.
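Since the mixed-backend result is not firmly specified, one way to see what an installed pandas version actually does is to try the combinations and print the outcome. This is only a probing sketch: `mixed_add_dtype` is a hypothetical helper, no particular result is assumed for the mixed pairs, and exceptions are reported rather than raised.

```python
import pandas as pd


def mixed_add_dtype(left_dtype, right_dtype):
    """Add two Series with the given dtypes and describe the outcome."""
    left = pd.Series([1, 2, 3], dtype=left_dtype)
    right = pd.Series([10, 20, 30], dtype=right_dtype)
    try:
        return str((left + right).dtype)
    except Exception as exc:  # mixed backends may not be supported
        return f"raised {type(exc).__name__}"


# numpy + nullable numpy needs no optional dependency.
pairs = [("int64", "Int64")]

# Only probe the Arrow combinations when pyarrow is importable.
try:
    import pyarrow  # noqa: F401

    pairs += [
        ("int64", "int64[pyarrow]"),
        ("int64[pyarrow]", "int64"),
        ("int64[pyarrow]", "Int64"),
    ]
except ImportError:
    pass

outcomes = {pair: mixed_add_dtype(*pair) for pair in pairs}
for (left, right), outcome in outcomes.items():
    print(f"{left} + {right} -> {outcome}")
```

The output may differ between pandas versions, which is exactly the ambiguity being discussed here.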