Comments (5)
Hi @JiaLeXian, thanks for reporting! I will take a look at vaex when I get a moment. For now, you can manually convert your data into a numpy array in order to use deep forest.
from deep-forest.
It looks like vaex does not support slicing (vaexio/vaex#911), which is an essential operation in deep forest, e.g., bootstrap sampling when building random forests. At least for now, this problem cannot be solved :-(
Thanks for reporting anyway.
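To illustrate why row indexing matters here, below is a minimal sketch of bootstrap sampling on a numpy array. It is a generic illustration, not deep forest's actual implementation; the toy arrays `X` and `y` and the fixed seed are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 10 samples, 2 features (row i holds [2*i, 2*i + 1]).
X = np.arange(20, dtype=np.float64).reshape(10, 2)
y = np.arange(10)

# Bootstrap sampling: draw n_samples row indices with replacement,
# then select those rows. This needs integer-array indexing on rows.
n_samples = X.shape[0]
idx = rng.integers(0, n_samples, size=n_samples)
X_boot, y_boot = X[idx], y[idx]
```

A container that cannot be indexed by an array of row positions like `idx` cannot be resampled this way without first being converted.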
Hi @xuyxu, thanks for investigating the problem. Appreciated! So, for DF, is it best to use a numpy array or the original pandas dataframe?
In our case, we have more than 100 million rows of data. That's why we use vaex to load the data and reduce memory usage. We still want to try DF on our dataset, so we will explore other ways to do that. Thank you!
Could you take a look at numpy.memmap? It looks like there is also no need to load the entire dataset into memory with memmap.
Besides, feel free to tell me if you have any problem when trying out this solution ;-). We are willing to further improve the functionality of DF when faced with such large datasets.
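For reference, here is a minimal sketch of the numpy.memmap idea, assuming the features fit a fixed-shape float32 matrix. The file name `features.dat`, the tiny shape, and the chunk size are placeholders for illustration. Whether DF keeps memory usage low when given such an array is not confirmed in this thread, so treat this as something to try rather than a guaranteed fix.

```python
import os
import tempfile

import numpy as np

# Tiny stand-in for the real dataset shape (hypothetical values).
n_rows, n_cols = 1000, 8
path = os.path.join(tempfile.mkdtemp(), "features.dat")

# Write the data to disk once, chunk by chunk, so the full matrix
# never has to sit in memory at the same time.
writer = np.memmap(path, dtype=np.float32, mode="w+", shape=(n_rows, n_cols))
rng = np.random.default_rng(0)
chunk = 250
for start in range(0, n_rows, chunk):
    stop = min(start + chunk, n_rows)
    writer[start:stop] = rng.random((stop - start, n_cols), dtype=np.float32)
writer.flush()
del writer

# Reopen read-only: the array is backed by the file, and only the
# rows that are actually accessed get read from disk.
X = np.memmap(path, dtype=np.float32, mode="r", shape=(n_rows, n_cols))
head = np.asarray(X[:100])  # materialize just the first 100 rows
```

The memmap `X` supports the usual numpy slicing and integer-array indexing, which is the operation vaex was missing.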
@xuyxu thanks for the quick reply. Thanks for suggesting numpy.memmap. We will try this option in the following days. Will keep you posted. Thank you!