Comments (5)
Hmm if I could I would add a flatten parameter to to_pandas(), but we don't control that method (it's in PyArrow).
Other DataFrames do have decent support for nested columns, such as Polars. So I don't think flattening in general is what we want.
Perhaps we can provide a helpful snippet to teach users how to flatten a struct column themselves? IIRC it's just something like:
df.assign(nested = lambda df: [x['key'] for x in df['struct']])
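A minimal sketch of that one-liner in context, assuming the struct column arrives in pandas as a column of Python dicts (which is how PyArrow's to_pandas() represents struct values by default). The column names and data here are hypothetical, chosen only for illustration:

```python
import pandas as pd

# Hypothetical data: a "struct" column holding dicts, standing in for
# what a nested column looks like after to_pandas().
df = pd.DataFrame(
    {
        "id": [1, 2],
        "struct": [{"key": "a", "other": 10}, {"key": "b", "other": 20}],
    }
)

# Pull one nested field out into its own top-level column.
# assign() returns a new DataFrame and leaves the original untouched.
flat = df.assign(nested=lambda d: [x["key"] for x in d["struct"]])

print(flat[["id", "nested"]])
```

The original "struct" column is still present in the result; drop it afterwards if only the extracted field is needed.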
from lance.
@pchalasani I think we can do this in the LanceDB repo instead of at the format level (please see the referencing PR)
from lance.
Nice, thanks, I seem to be conflating the two repos in my mind 😀
from lance.
For future reference for Lance users, you can write:
dataset.to_table(...).flatten().to_pandas()
If you have multiple levels of nested fields, you may need to call flatten() multiple times.
Maybe I can make this a tip in the user guide?
from lance.