Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/4paradigm/autox
AutoX is an efficient automl tool, which is mainly aimed at data mining tasks with tabular data.
https://github.com/4paradigm/autox
kaggle machine-learning python
Last synced: 2 days ago
JSON representation
AutoX is an efficient automl tool, which is mainly aimed at data mining tasks with tabular data.
- Host: GitHub
- URL: https://github.com/4paradigm/autox
- Owner: 4paradigm
- License: apache-2.0
- Created: 2021-07-21T09:45:41.000Z (over 3 years ago)
- Default Branch: master
- Last Pushed: 2023-02-14T14:21:58.000Z (almost 2 years ago)
- Last Synced: 2025-01-12T18:06:34.444Z (9 days ago)
- Topics: kaggle, machine-learning, python
- Language: Jupyter Notebook
- Homepage: https://autox.readthedocs.io
- Size: 72 MB
- Stars: 503
- Watchers: 13
- Forks: 134
- Open Issues: 20
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- StarryDivineSky - 4paradigm/AutoX
README
[English](./README_EN.md) | 简体中文
# AutoX是什么?
AutoX一个高效的自动化机器学习工具。
它的特点包括:
- 效果出色: AutoX在多个kaggle数据集上,效果显著优于其他解决方案(见[效果对比](#效果对比))。
- 简单易用: AutoX的接口和sklearn类似,方便上手使用。
- 通用: 适用于分类和回归问题。
- 自动化: 无需人工干预,全自动的数据清洗、特征工程、模型调参等步骤。
- 灵活性: 各组件解耦合,能单独使用,对于自动机器学习效果不满意的地方,可以结合专家知识,AutoX提供灵活的接口。
- 比赛上分点总结:整理并公开历史比赛的上分点。# AutoX包含什么内容
- [autox_competition](autox/autox_competition/README.md): 主要针对于表格类型的数据挖掘竞赛
- [autox_server](autox/autox_server/README.md): 用于上线部署的automl服务
- [autox_interpreter](autox/autox_interpreter/README.md): 机器学习可解释功能
- [autox_nlp](autox/autox_nlp/README.md): 对文本列进行处理的自动化工具
- [autox_recommend](autox/autox_recommend/README.md): 推荐系统的自动机器学习
- [autox_video](autox/autox_video/README.md): 应用于视频分类任务的自动机器学习框架# 加入社区
# 框架
## autox_competition## autox_recommend
## autox_video
# 如何为AutoX贡献
[如何为AutoX贡献](./how_to_contribute.md)# 目录
- [AutoX是什么?](#AutoX是什么?)
- [AutoX包含什么内容](#AutoX包含什么内容)
- [加入社区](#加入社区)
- [目录](#目录)
- [安装](#安装)
- [如何为AutoX贡献](#如何为AutoX贡献)
- [快速上手](#快速上手)
- [效果对比](#效果对比)
- [TODO](#TODO)
- [错误排查](#错误排查)# 安装
### github仓库安装
```
git clone https://github.com/4paradigm/autox.git
pip install ./autox
```### pip安装
```
## pip安装包可能更新不及时,建议用github安装方式安装最新版本
!pip install automl-x -i https://www.pypi.org/simple/
```# 快速上手
- [autox打比赛](autox/autox_competition/README.md)
- [autox上线部署](autox/autox_server/README.md)
- [autox可解释](autox/autox_interpreter/README.md)
- [特征工程](autox/autox_competition/feature_engineer/README.md)# 社区案例
[汽车销量预测](./demo/汽车销量预测/README.md)# 比赛案例
见demo文件夹数据集下载链接:https://pan.baidu.com/s/1p38OuP8_FJp2P_wJwhdFiw?pwd=8mxf
# 效果对比
## 不同任务下的效果提升百分比
|data_type | 对比AutoGluon | 对比H2o |
|----- | ------------- | ----------- |
|binary classification | 20.44% | 2.98% |
|regression | 37.54% | 39.66% |
|time-series | 28.40% | 32.46% |## 详细数据集对比
|data_type | single-or-multi | data_name | metric | AutoX | AutoGluon | H2o |
|----- | ------------- | ----------- |---------------- |---------------- | ----------------|----------------|
|binary classification | single-table | [Springleaf](https://www.kaggle.com/c/springleaf-marketing-response/) | auc | 0.78865 | 0.61141 | 0.78186 |
|binary classification-nlp | single-table |[stumbleupon](https://www.kaggle.com/c/stumbleupon/) | auc | 0.87177 | 0.81025 | 0.79039 |
|binary classification | single-table |[santander](https://www.kaggle.com/c/santander-customer-transaction-prediction/) | auc | 0.89196 | 0.64643 | 0.88775 |
|binary classification | multi-table |[IEEE](https://www.kaggle.com/c/ieee-fraud-detection/) | accuracy | 0.920809 | 0.724925 | 0.907818 |
|regression | single-table |[ventilator](https://www.kaggle.com/c/ventilator-pressure-prediction/) | mae | 0.755 | 8.434 | 4.221 |
|regression | single-table |[Allstate Claims Severity](https://www.kaggle.com/c/allstate-claims-severity)| mae | 1137.07885 | 1173.35917 | 1163.12014 |
|regression | single-table |[zhidemai](https://www.automl.ai/competitions/19) | mse | 1.0034 | 1.9466 | 1.1927|
|regression | single-table |[Tabular Playground Series - Aug 2021](https://www.kaggle.com/c/tabular-playground-series-aug-2021) | rmse | 7.87731 | 10.3944 | 7.8895|
|regression | single-table |[House Prices](https://www.kaggle.com/c/house-prices-advanced-regression-techniques/) | rmse | 0.13043 | 0.13104 | 0.13161 |
|regression | single-table |[Restaurant Revenue](https://www.kaggle.com/c/restaurant-revenue-prediction/)| rmse | 2133204.32146 | 31913829.59876 | 28958013.69639 |
|regression | multi-table |[Elo Merchant Category Recommendation](https://www.kaggle.com/c/elo-merchant-category-recommendation/)| rmse | 3.72228 | 3.80801 | 22.88899 |
|regression-ts | single-table |[Demand Forecasting](https://www.kaggle.com/c/demand-forecasting-kernels-only/)| smape | 13.79241 | 25.39182 | 18.89678 |
|regression-ts | multi-table |[Walmart Recruiting](https://www.kaggle.com/c/walmart-recruiting-store-sales-forecasting/)| wmae | 4660.99174 | 5024.16179 | 5128.31622 |
|regression-ts | multi-table |[Rossmann Store Sales](https://www.kaggle.com/c/rossmann-store-sales/)| RMSPE | 0.13850 | 0.20453 | 0.35757 |
|regression-cv | single-table |[PetFinder](https://www.kaggle.com/competitions/petfinder-pawpularity-score/overview/) | rmse | 20.1327 | 23.1732 | 21.0586 |# AutoX成就
### 企业支持### 比赛获奖
- [2021阿里云基础设施供应链大赛-冠军方案](https://tianchi.aliyun.com/forum/postDetail?postId=344505)
- [kaggle-H&M个性化推荐-金牌方案](https://www.kaggle.com/competitions/h-and-m-personalized-fashion-recommendations/discussion/324158)
- [AutoX获得CCF A类会议ACM Multimedia举办的视频分类任务冠军](http://auto-video-captions.top/2022/leaderboard)# TODO
功能开发完成后,发布相应的使用demo
- [ ] 多分类任务若有其他希望AutoX支持的功能,欢迎提issue!
欢迎填写[用户调研问卷](https://www.wjx.cn/vj/YOwSFHN.aspx),让AutoX变得更好!## 错误排查
|错误信息|解决办法|
|------|------|