It appears the rust parquet library supports compression but isn't used during query e


Enable parquet compression options (odbc2parquet issue, closed)

pacman82 commented on August 15, 2024

Enable parquet compression options

from odbc2parquet.

Comments (4)

pacman82 commented on August 15, 2024 1

Seems like a good idea to me. I'll probably look into this during the weekend (no promises though). Could be an easy win if it is just about passing a command line argument to the parquet writer.


plaisted commented on August 15, 2024 1

Awesome, I'll give it a try! Appreciate the update.

If anyone runs into a similar issue or needs custom options and has Python available, here is a short script to compress an existing parquet file.

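import sys

import pyarrow.parquet as pq

# Usage: python <this script> <input.parquet> <output.parquet>
pq_file = pq.ParquetFile(sys.argv[1])
# Reuse the input schema and write every column with ZSTD compression.
with pq.ParquetWriter(sys.argv[2], pq_file.schema_arrow, compression='ZSTD') as writer:
    # Copy one row group at a time so the whole file is never held in memory.
    for ri in range(pq_file.num_row_groups):
        table = pq_file.read_row_group(ri)
        writer.write_table(table)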


pacman82 commented on August 15, 2024

New version 0.6.1 published with support for the --column-compression-default command line option. The new default is gzip. I only did the minimal thing here. There are a lot more options which could be forwarded, like encoding or the ability to control encoding/compression for an individual column.

Let me know if this is already good enough for your use case, or if more is needed (or at least nice to have).

Cheers, Markus


pacman82 commented on August 15, 2024

Thank you for your feedback and script!

