Comments (5)
The following interfaces need to be implemented:
xorbits.datasets.to_huggingface
xorbits.datasets.Dataset.from_dataframe
xorbits.datasets.export_json
I need to add a new column to dataset.Dataset to record intermediate values. How should this be handled? I only see __getitem__; __setitem__ is not implemented.
How can dataset.Dataset be filtered,
similar to huggingface.dataset: https://github.com/huggingface/datasets/blob/ef0f986518bd252c5314a7e3a419dedcbb166630/src/datasets/arrow_dataset.py#L5061
@codingl2k1 please take a look at this issue.
@simplew2011 would you be interested in contributing?
How can dataset.Dataset be filtered,
similar to huggingface.dataset: https://github.com/huggingface/datasets/blob/ef0f986518bd252c5314a7e3a419dedcbb166630/src/datasets/arrow_dataset.py#L5061
Currently, xorbits dataframe can export to csv, parquet, and sql, and dataframe apply
may be able to meet your needs. xorbits dataset can map data and convert the dataset to a dataframe, but filter
is not implemented.
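A minimal sketch of the dataframe route described above, under stated assumptions: plain pandas is used as a stand-in, on the assumption that xorbits.pandas mirrors these calls; the sample data and the column names `text` and `length` are hypothetical. It adds a column for an intermediate value with apply, filters rows with a boolean mask (similar in spirit to the huggingface filter), and exports to csv.

```python
import pandas as pd  # assumption: xorbits.pandas accepts the same calls

# Hypothetical sample data standing in for a dataset converted to a dataframe.
df = pd.DataFrame({"text": ["short", "a much longer sentence", "mid length"]})

# Record an intermediate value in a new column (here, the text length).
df["length"] = df["text"].apply(len)

# Keep only the rows that satisfy a predicate, using a boolean mask.
filtered = df[df["length"] > 5]

# Export the filtered rows as csv text (no path given, so a string is returned).
csv_text = filtered.to_csv(index=False)
```

Whether every call behaves identically on an xorbits dataframe has not been verified here, so treat this as a starting point rather than a confirmed recipe.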
Could you provide some example code?