Comments (3)
Sure! Would you like to propose a PR for that? Ideally, with a small unit test showing that it works.
It seems more complicated than I thought given the library's current code base, and I am not using the library right now. I still think the idea is worthwhile, but feel free to close the issue if you wish. Otherwise, leave it open to keep the idea around.
Is there a reason the string type is not supported? A column in a Parquet file could consist entirely of string labels. It is currently not possible to read such a file through a PyTorch dataloader because of this line in the code.
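Until strings are supported, one possible workaround is to map the string labels to integer ids before the data reaches the dataloader. The snippet below is only a minimal sketch of that idea in plain Python. The helper names `build_label_index` and `encode_labels` and the sample column are hypothetical, not part of petastorm, and the snippet does not show where such a step would hook into the library.

```python
from typing import Dict, List, Sequence


def build_label_index(labels: Sequence[str]) -> Dict[str, int]:
    """Map each distinct string label to an integer id, assigned in sorted order."""
    return {label: idx for idx, label in enumerate(sorted(set(labels)))}


def encode_labels(labels: Sequence[str], index: Dict[str, int]) -> List[int]:
    """Replace every string label with its integer id; unknown labels raise KeyError."""
    return [index[label] for label in labels]


# Hypothetical values standing in for a string column read from a Parquet file.
column = ["cat", "dog", "cat", "bird"]
label_index = build_label_index(column)
encoded = encode_labels(column, label_index)
```

The resulting list of integers is numeric, so it can be converted to a tensor like any other integer column. The mapping should be built once over the full set of labels and reused, so that ids stay consistent across batches.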
Related Issues (20)
- using SHAP with petastorm dataset
- Random seed doesn't seem to work well
- Customized dataset
- How to pass pin_memory argument when using make_torch_dataloader
- when hdfs-site.xml file has xi:include tag, the function can't get hadoop_configuration info
- Prediction issue using Keras and TransformSpec with PySpark
- Petastorm sharding and setting batch sizes
- make_batch_reader Documentation out of date? seed?
- How to transform the string data to numerical when using make_batch_reader?
- AttributeError: 'bool' object has no attribute 'map' while using Predicate
- TypeError: __init__() missing 2 required positional arguments: 'instance' and 'token'
- Seeing worse model performance from using petastorm vs normal pytorch dataloader
- Issue with loading nested array type from spark DF to torch
- Bug in ConcurrentVentilator._ventilate() when randomize_item_order=True and random seed is fixed
- make_torch_dataloader using TransformSpec applies transformation on entire dataframe (not lazy loading)
- FutureWarning: 'ParquetDataset.partitions' attribute is deprecated as of pyarrow 5.0.0 and will be removed in a future version.
- make_reader fails for example
- ParquetDataset has an invalid parameter validate_schema
- Petastorm hangs forever in DataBricks