Comments (4)
Could you clarify the question? I am not sure I understand it. At the moment this only supports writing PLAIN and, in PLAIN with nullable values, only valid slots should be written in parquet.
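To illustrate the point about valid slots, here is a minimal sketch in plain Rust. It is not parquet2's or arrow2's code: `EncodedColumn` and `encode_plain_optional_i32` are hypothetical names, the column is assumed to be a flat optional INT32, and the definition levels are kept as raw bytes rather than being RLE/bit-packed as they would be in a real page.

```rust
/// Sketch output for one flat, optional INT32 column (hypothetical type).
pub struct EncodedColumn {
    /// One definition level per slot: 1 = value present, 0 = null.
    /// Left unencoded here; a real page would RLE/bit-pack these.
    pub def_levels: Vec<u8>,
    /// PLAIN-encoded values: 4 little-endian bytes per *valid* slot only.
    pub values: Vec<u8>,
}

/// Encodes nullable i32 slots so that nulls take no space in `values`.
pub fn encode_plain_optional_i32(slots: &[Option<i32>]) -> EncodedColumn {
    let mut def_levels = Vec::with_capacity(slots.len());
    let mut values = Vec::new();
    for slot in slots {
        match slot {
            Some(v) => {
                def_levels.push(1);
                // PLAIN INT32 is the value's 4 bytes in little-endian order.
                values.extend_from_slice(&v.to_le_bytes());
            }
            // A null only contributes a definition level, never a value.
            None => def_levels.push(0),
        }
    }
    EncodedColumn { def_levels, values }
}
```

Calling `encode_plain_optional_i32(&[Some(1), None, Some(3)])` yields three definition levels but only eight value bytes, since the null slot is skipped in the values buffer.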
from arrow2.
Thanks for the reply. I looked at:
https://github.com/jorgecarleitao/parquet2/blob/f86ba6f65fede75dd7b9f7dd6b0fe3e271d2aa58/src/serialization/read/mod.rs#L50
arrow2/src/io/parquet/read/mod.rs
Line 146 in 176569c
parquet2: page_iter_to_array vs arrow2: page_to_array
I don't understand:
Which one should actually be used?
from arrow2.
ahh, got it.
"Which one should actually be used?"
It depends on what you are looking for: parquet2 makes no assumption about which in-memory format the consumer wants to use. It offers a minimal deserializer for a simple in-memory format that is mostly used to run integration tests against other producers and consumers.
This format does not have the complexities of arrow, and it is thus easier to understand for normal Rust users (e.g. it uses Vec<Option<bool>>, not two bitmaps for validity and values of a boolean array).
Arrow2, on the other hand, offers a deserializer specifically for arrow's in-memory format.
So, if you wish to use parquet without arrow, use parquet2's version; otherwise, use arrow2's version. :)
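The difference between the two in-memory representations mentioned above can be sketched in plain Rust. This is an illustration only, not the actual types of either crate: `BitmapBooleans`, `to_bitmaps` and `from_bitmaps` are hypothetical names, and the bit order (least-significant bit first) is assumed to follow the Arrow columnar format.

```rust
/// Arrow-style boolean array sketch: two bit-packed buffers, LSB first.
/// (Hypothetical type for illustration; not arrow2's BooleanArray.)
pub struct BitmapBooleans {
    pub len: usize,
    /// Bit i is 1 when slot i holds a value, 0 when it is null.
    pub validity: Vec<u8>,
    /// Bit i is the boolean value of slot i (meaningless if slot i is null).
    pub values: Vec<u8>,
}

/// Converts the simple representation into the two-bitmap representation.
pub fn to_bitmaps(slots: &[Option<bool>]) -> BitmapBooleans {
    let n_bytes = (slots.len() + 7) / 8;
    let mut validity = vec![0u8; n_bytes];
    let mut values = vec![0u8; n_bytes];
    for (i, slot) in slots.iter().enumerate() {
        if let Some(v) = slot {
            validity[i / 8] |= 1u8 << (i % 8);
            if *v {
                values[i / 8] |= 1u8 << (i % 8);
            }
        }
    }
    BitmapBooleans { len: slots.len(), validity, values }
}

/// Converts the two bitmaps back into a Vec<Option<bool>>.
pub fn from_bitmaps(b: &BitmapBooleans) -> Vec<Option<bool>> {
    (0..b.len)
        .map(|i| {
            let mask = 1u8 << (i % 8);
            if b.validity[i / 8] & mask != 0 {
                Some(b.values[i / 8] & mask != 0)
            } else {
                None
            }
        })
        .collect()
}
```

The `Vec<Option<bool>>` form is easier to read and pattern-match on, while the bitmap form packs eight slots per byte, which is the kind of trade-off described in the comment above.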
For arrow, there is a guide on how to read and write via parquet:
- https://jorgecarleitao.github.io/arrow2/io/parquet_read.html
- https://jorgecarleitao.github.io/arrow2/io/parquet_write.html
and also examples in examples/.
from arrow2.
thanks
from arrow2.