4paradigm / autox
AutoX is an efficient AutoML tool aimed mainly at data mining tasks with tabular data.
Home Page: https://autox.readthedocs.io
License: Apache License 2.0
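Feature extraction with fasttext is currently slow: on the same amount of data it takes about as long as BERT-tiny. This could be optimized specifically.
Code link: https://github.com/4paradigm/AutoX/blob/master/autox/autox_nlp/feature_engineer/nlp_feature.py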
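Running the "Zhidemai" (值得买) dataset in a Kaggle environment exhausts the 16 GB of memory. A preliminary analysis suggests the cause is that feature engineering brute-force loops generate a large number of derived features. The memory-reduce code commonly shared on Kaggle could serve as a reference for memory optimization.
@poteman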
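The memory-reduction idea mentioned above usually amounts to downcasting numeric columns to the smallest dtype that still holds their values. The following is a minimal sketch of that idea using pandas, not AutoX code; the function name `reduce_mem_usage` is a hypothetical helper chosen for illustration.

```python
import pandas as pd


def reduce_mem_usage(df):
    """Return a copy of df with numeric columns downcast to smaller dtypes.

    Hypothetical helper for illustration; it is not part of AutoX.
    """
    out = df.copy()
    for col in out.columns:
        if pd.api.types.is_integer_dtype(out[col]):
            # e.g. int64 -> int8/int16/int32 when the value range allows it
            out[col] = pd.to_numeric(out[col], downcast="integer")
        elif pd.api.types.is_float_dtype(out[col]):
            # float64 -> float32 when the values survive the conversion
            out[col] = pd.to_numeric(out[col], downcast="float")
    return out


if __name__ == "__main__":
    frame = pd.DataFrame({"count": [1, 2, 3], "price": [0.5, 1.5, 2.5]})
    small = reduce_mem_usage(frame)
    print(frame.memory_usage(deep=True).sum(), small.memory_usage(deep=True).sum())
```

Calling such a helper after each batch of derived features, rather than once at the end, would keep peak memory lower, at the cost of reduced float precision.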
On Mac with lightgbm==3.3.2.99, lightgbm.train no longer accepts the verbose_eval and early_stopping_rounds arguments and uses the callbacks argument instead, so calling the lgb model raises an error:
File ~/miniforge3/envs/lx/lib/python3.9/site-packages/autox/autox_competition/models/regressor_ts.py:231, in LgbRegressionTs.fit(self, train, test, used_features, target, time_col, ts_unit, Early_Stopping_Rounds, N_round, Verbose, log1p, custom_metric, weight_for_mae)
226 model = lgb.train(self.params_, trn_data, num_boost_round=self.N_round, valid_sets=[trn_data, val_data],
227 verbose_eval=self.Verbose,
228 early_stopping_rounds=self.Early_Stopping_Rounds,
229 feval=weighted_mae_lgb(weight=weight_for_mae))
230 else:
--> 231 model = lgb.train(self.params_, trn_data, num_boost_round=self.N_round, valid_sets=[trn_data, val_data],
...
233 early_stopping_rounds=self.Early_Stopping_Rounds)
234 val = model.predict(train.iloc[valid_idx][used_features])
235 if log1p:
TypeError: train() got an unexpected keyword argument 'verbose_eval'
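One possible workaround for the error above, as a minimal sketch: pass early stopping and logging through `callbacks` when the installed LightGBM provides `lgb.early_stopping` and `lgb.log_evaluation`, and fall back to the old keyword arguments otherwise. The helper name `build_train_kwargs` is hypothetical and not part of AutoX. The helper takes the `lightgbm` module as an argument and assumes `verbose` is an integer logging period.

```python
def build_train_kwargs(lgb, early_stopping_rounds, verbose):
    """Build version-appropriate keyword arguments for lgb.train.

    Hypothetical helper for illustration; `lgb` is the imported lightgbm module.
    """
    if hasattr(lgb, "early_stopping") and hasattr(lgb, "log_evaluation"):
        # Newer LightGBM: configure both behaviours through callbacks.
        return {
            "callbacks": [
                lgb.early_stopping(stopping_rounds=early_stopping_rounds),
                lgb.log_evaluation(period=verbose),
            ]
        }
    # Older LightGBM: the keyword arguments used in regressor_ts.py still work.
    return {
        "early_stopping_rounds": early_stopping_rounds,
        "verbose_eval": verbose,
    }


# Usage sketch (assumes lightgbm is installed and the datasets already exist):
#   import lightgbm as lgb
#   model = lgb.train(params, trn_data, num_boost_round=n_round,
#                     valid_sets=[trn_data, val_data],
#                     **build_train_kwargs(lgb, 100, 50))
```

Checking for the callback functions instead of parsing the version string avoids having to handle development versions such as 3.3.2.99.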
Does AutoX have any plans to support sample selection?
Many datasets are now so large that individuals and small companies cannot afford the computing power to train on them.
Could a subset of the data be selected for training so that the result approximates training on the full data?
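Until such a feature exists, one manual approach is to subsample the training table before handing it to AutoX. The sketch below assumes a classification task with a label column and draws the same fraction from every class so that the class balance is preserved; `stratified_subsample` is a hypothetical helper, not an AutoX function, and whether the subsample approximates full-data training would still need to be checked empirically.

```python
import pandas as pd


def stratified_subsample(df, target, frac, seed=0):
    """Sample the same fraction of rows from each class of `target`.

    Hypothetical helper for illustration; it is not part of AutoX.
    """
    return df.groupby(target, group_keys=False).sample(frac=frac, random_state=seed)


if __name__ == "__main__":
    data = pd.DataFrame({"x": range(100), "label": [0] * 80 + [1] * 20})
    subset = stratified_subsample(data, "label", frac=0.1)
    print(subset["label"].value_counts())
```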
git clone https://github.com/4paradigm/AutoX.git
pip install pytorch_tabnet
pip install ./AutoX
python
from autox import AutoX
ModuleNotFoundError: No module named 'autox.autox_server'
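The current Word2Vec and GloVe models cannot handle words in the test data that were not seen during training, so the vocabulary has to be rebuilt on the test data, which has a large impact on overall performance.
Code link: https://github.com/4paradigm/AutoX/blob/master/autox/autox_nlp/feature_engineer/nlp_feature.py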
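One common way to tolerate unseen words without rebuilding the vocabulary is to skip them when averaging word vectors and to fall back to a zero vector when no token is known. The sketch below illustrates that idea with a plain dictionary of vectors; it is an assumption about a possible fix, not the approach used in nlp_feature.py, and `sentence_vector` is a hypothetical helper.

```python
import numpy as np


def sentence_vector(tokens, vectors, dim):
    """Average the vectors of known tokens; ignore out-of-vocabulary tokens.

    `vectors` is any mapping from token to a 1-D array of length `dim`.
    Hypothetical helper for illustration; it is not part of AutoX.
    """
    known = [vectors[tok] for tok in tokens if tok in vectors]
    if not known:
        # Every token is unseen: return a neutral zero vector.
        return np.zeros(dim)
    return np.mean(known, axis=0)


if __name__ == "__main__":
    vocab = {"good": np.array([1.0, 0.0]), "phone": np.array([0.0, 1.0])}
    print(sentence_vector(["good", "phone", "unseen"], vocab, dim=2))
```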
Hello AutoX team,
As the title says: is there a way to save a model, load it, and deploy it in an application? Thanks!
Raw data: https://www.kaggle.com/datasets/prokaggler/amazon-electronic-product-recommendation
For the data processing method, refer to:
https://github.com/4paradigm/AutoX/blob/master/autox/autox_recommend/data_process/MovieLens_data_process.ipynb
and
https://github.com/4paradigm/AutoX/blob/master/autox/autox_recommend/data_process/Netflix-data-process.ipynb
The GloVe model currently relies on the glove-python-binary package, which is difficult to install on Windows and macOS. GloVe could be implemented another way.
Code link: https://github.com/4paradigm/AutoX/blob/master/autox/autox_nlp/feature_engineer/nlp_feature.py
Raw data: https://www.kaggle.com/datasets/teesoong/ml-challenge?select=checkins.csv
For the data processing method, refer to:
https://github.com/4paradigm/AutoX/blob/master/autox/autox_recommend/data_process/MovieLens_data_process.ipynb
and
https://github.com/4paradigm/AutoX/blob/master/autox/autox_recommend/data_process/Netflix-data-process.ipynb
Could you post the QR code again?
How do I set AutoX's model ensembling method to stacking?
Can AutoX forecast time series with lumpy and intermittent demand?
Raw data: https://tianchi.aliyun.com/competition/entrance/231785/introduction
For the data processing method, refer to:
https://github.com/4paradigm/AutoX/blob/master/autox/autox_recommend/data_process/MovieLens_data_process.ipynb
and
https://github.com/4paradigm/AutoX/blob/master/autox/autox_recommend/data_process/Netflix-data-process.ipynb
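Models generated by AutoXServer are currently in pkl format. Is there a way to convert them to PMML format?
Did you split up the entities within a single record?
Does one record correspond to one entity, and to one sentiment?
In scenarios such as recommendation and risk control, user behavior sequence features are very important. Is there any plan to support them?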
I am Vansin, the technical operator of OpenMMLab. In September of last year, we announced the release of OpenMMLab 2.0 at the World Artificial Intelligence Conference in Shanghai. We invite you to upgrade your algorithm library to OpenMMLab 2.0 using MMEngine, which can be used for both research and commercial purposes. If you have any questions, please feel free to join us on the OpenMMLab Discord at https://discord.gg/amFNsyUBvm or add me on WeChat (van-sin) and I will invite you to the OpenMMLab WeChat group.
Here are the OpenMMLab 2.0 repo branches:
| | OpenMMLab 1.0 branch | OpenMMLab 2.0 branch |
|---|---|---|
| MMEngine | | 0.x |
| MMCV | 1.x | 2.x |
| MMDetection | 0.x, 1.x, 2.x | 3.x |
| MMAction2 | 0.x | 1.x |
| MMClassification | 0.x | 1.x |
| MMSegmentation | 0.x | 1.x |
| MMDetection3D | 0.x | 1.x |
| MMEditing | 0.x | 1.x |
| MMPose | 0.x | 1.x |
| MMDeploy | 0.x | 1.x |
| MMTracking | 0.x | 1.x |
| MMOCR | 0.x | 1.x |
| MMRazor | 0.x | 1.x |
| MMSelfSup | 0.x | 1.x |
| MMRotate | 1.x | 1.x |
| MMYOLO | | 0.x |
Attention: please create a new virtual environment for OpenMMLab 2.0.
Does installing AutoX require installing the deep learning framework Keras beforehand? Does it support PyTorch or other frameworks?