Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/nlpjoe/daguan-classify-2018
2018达观杯长文本分类智能处理挑战赛 18解决方案
https://github.com/nlpjoe/daguan-classify-2018
competition-code keras python
Last synced: 3 months ago
JSON representation
2018达观杯长文本分类智能处理挑战赛 18解决方案
- Host: GitHub
- URL: https://github.com/nlpjoe/daguan-classify-2018
- Owner: nlpjoe
- Created: 2018-09-11T11:03:15.000Z (about 6 years ago)
- Default Branch: master
- Last Pushed: 2019-06-10T01:38:11.000Z (over 5 years ago)
- Last Synced: 2024-05-13T15:23:28.155Z (6 months ago)
- Topics: competition-code, keras, python
- Language: Jupyter Notebook
- Homepage:
- Size: 117 KB
- Stars: 153
- Watchers: 6
- Forks: 62
- Open Issues: 2
-
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
Awesome Lists containing this project
README
# 达观杯2018
[![Backers on Open Collective](https://opencollective.com/daguan-2018/backers/badge.svg)](#backers)
[![Sponsors on Open Collective](https://opencollective.com/daguan-2018/sponsors/badge.svg)](#sponsors)参数没调好,仓促比赛,单模型线上没测过,线下0.784,最终得分0.791,排名18/3462,排名不高就不多写了,等着前排分享。思路如同代码所写,很简单。
数据请在[达观数据](http://www.dcjingsai.com/common/cmpt/%E2%80%9C%E8%BE%BE%E8%A7%82%E6%9D%AF%E2%80%9D%E6%96%87%E6%9C%AC%E6%99%BA%E8%83%BD%E5%A4%84%E7%90%86%E6%8C%91%E6%88%98%E8%B5%9B_%E8%B5%9B%E4%BD%93%E4%B8%8E%E6%95%B0%E6%8D%AE.html)处下载,放在data目录下。
### 一、环境
|环境/库|版本|
|:---------:|----------|
|Ubuntu|14.04.5 LTS|
|python|3.6|
|jupyter notebook|4.2.3|
|tensorflow-gpu|1.10.1|
|numpy|1.14.1|
|pandas|0.23.0|
|matplotlib|2.2.2|
|gensim|3.5.0|
|tqdm|4.24.0|### 二、数据预处理
都写在`jupyter`里了。运行`src/preprocess/EDA.ipynb`生成各种文件。
### 三、baseline模型训练
在`src/preprocess/`中运行:
```
python baseline-x-cv.py
```### 四、深度模型训练
然后直接train模型,单GPU运行,模型自选:
```
python train_predict.py --gpu 4 --option 5 --model convlstm --feature char
```多GPU训练示例:
```
python train_predict.py --gpu 4,5,6,7 --option 5 --model convlstm --feature char
```### 五、模型融合输出
```
python stacking.py --gpu 1 --tfidf True --option 5
```这里是stacking和伪标签一起做了,请修改代码自选是否用伪标签。
## Contributors
This project exists thanks to all the people who contribute. [[Contribute](CONTRIBUTING.md)].
## Backers
Thank you to all our backers! 🙏 [[Become a backer](https://opencollective.com/daguan-2018#backer)]
## Sponsors
Support this project by becoming a sponsor. Your logo will show up here with a link to your website. [[Become a sponsor](https://opencollective.com/daguan-2018#sponsor)]