https://github.com/white127/QA-deep-learning

tensorflow and theano cnn code for insurance QA(question Answer matching)

Host: GitHub
URL: https://github.com/white127/QA-deep-learning
Owner: white127
Created: 2016-06-02T01:00:34.000Z (about 9 years ago)
Default Branch: master
Last Pushed: 2018-09-07T03:40:34.000Z (almost 7 years ago)
Last Synced: 2024-08-01T16:41:46.619Z (11 months ago)
Language: Python
Size: 13.8 MB
Stars: 532
Watchers: 38
Forks: 285
Open Issues: 25
Metadata Files:
- Readme: README.md

Insurance-QA deeplearning model
======
This repo is for Q&A matching and includes several deep learning models, such as CNN and RNN.

1. CNN. Basic CNN model from "Applying Deep Learning To Answer Selection: A Study And An Open Task".

2. RNN. RNN appears to be the best-performing model on the Insurance-QA dataset.

3. SWEM. SWEM is the fastest model and performs well on other datasets such as WikiQA, but it does not seem to do as well on the Insurance-QA dataset. I think SWEM is better suited to Q&Q (question-to-question) matching than to Q&A matching.
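
To illustrate why SWEM is fast, here is a minimal sketch, assuming SWEM refers to simple word-embedding-based models that pool word vectors into a sentence vector with no convolutional or recurrent layers. The function names, the toy `embeddings` table, and its values are hypothetical and are not taken from this repo.

```python
import math

def swem_aver(vectors):
    """Average-pool a list of word vectors into one sentence vector."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def swem_max(vectors):
    """Max-pool a list of word vectors, dimension by dimension."""
    dim = len(vectors[0])
    return [max(v[i] for v in vectors) for i in range(dim)]

def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Toy 3-dimensional embeddings (hypothetical values, for illustration only).
embeddings = {
    "car": [1.0, 0.0, 0.5],
    "insurance": [0.0, 1.0, 0.5],
    "cost": [0.5, 0.5, 0.0],
}

def encode(tokens, pool=swem_aver):
    """Look up each token and pool the word vectors into a sentence vector."""
    return pool([embeddings[t] for t in tokens])

question = encode(["car", "insurance", "cost"])
answer = encode(["insurance", "cost"])
score = cosine(question, answer)  # matching score between the two texts
```

Because the encoder is only a lookup and a pooling step, word order is ignored, which may be one reason such a model works better when both sides are short questions than when a question is matched against a longer answer.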

It is hard to say which model is best on other datasets; choose the one that suits your data.

More models are on the way, so watch for updates.

## Requirements
1. TensorFlow 1.4.0

2. Python 3.5

## Performance
Margin loss version

Model/Score | Insurance-QA top-1 precision | Quora best precision
------------ | ------------- | -------------
CNN | 62% | None
LSTM+CNN | 68% | None
SWEM | <55% | None

Log loss version

Model/Score | Insurance-QA top-1 precision | Quora best precision
------------ | ------------- | -------------
CNN | None | 79.60%
LSTM+CNN | None | None
SWEM | <40% | 82.69%
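
The top-1 precision columns can be read with the following minimal sketch, assuming the metric is the fraction of questions whose highest-scored candidate answer is a correct one. The function name and the toy evaluation data are hypothetical.

```python
def top1_precision(questions):
    """questions: list of (scores, correct) pairs, where `scores` holds one
    model score per candidate answer and `correct` is the set of indices of
    the correct candidates."""
    hits = 0
    for scores, correct in questions:
        # Index of the candidate with the highest score.
        best = max(range(len(scores)), key=scores.__getitem__)
        if best in correct:
            hits += 1
    return hits / len(questions)

# Toy evaluation set (hypothetical scores).
eval_set = [
    ([0.9, 0.2, 0.1], {0}),     # hit
    ([0.3, 0.8, 0.5], {2}),     # miss: candidate 1 ranks first
    ([0.1, 0.4, 0.7], {1, 2}),  # hit: any correct candidate counts
    ([0.6, 0.5, 0.2], {0}),     # hit
]
precision = top1_precision(eval_set)
```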

## Running
Edit the configuration to match your environment, for example the data paths:

    vim config.py

Process the data:

    python3 gen.py

Run the CNN model:

    cd ./cnn/tensorflow && python3 insqa_train.py

Training this model takes a few hours (thousands of epochs) on a single GPU.

## Downloads
1. You can get Insurance-QA data from here https://github.com/shuzi/insuranceQA

2. You can get Quora data from here http://qim.ec.quoracdn.net/quora_duplicate_questions.tsv
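
For a quick look at the Quora data, here is a minimal sketch of reading question pairs from the TSV file. It assumes the file has a header row with `question1`, `question2`, and `is_duplicate` columns; check this against the downloaded file. The function name is hypothetical, and this is not the loader used by `gen.py`.

```python
import csv
import io

def read_question_pairs(handle):
    """Return (question1, question2, is_duplicate) tuples from a TSV handle."""
    reader = csv.DictReader(handle, delimiter="\t")
    pairs = []
    for row in reader:
        pairs.append(
            (row["question1"], row["question2"], int(row["is_duplicate"]))
        )
    return pairs

# In-memory sample with the assumed header; the rows are made up.
sample = io.StringIO(
    "id\tqid1\tqid2\tquestion1\tquestion2\tis_duplicate\n"
    "0\t1\t2\tHow do I learn Python?\tWhat is the best way to learn Python?\t1\n"
    "1\t3\t4\tWhat is term life insurance?\tHow do I bake bread?\t0\n"
)
pairs = read_question_pairs(sample)
```

For the real file, pass a handle opened with `open(path, newline="", encoding="utf-8")` in place of the in-memory sample.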

## Links
1. CNN and RNN text classification repo: https://github.com/white127/TextClassification_CNN_RNN

2. "Applying Deep Learning To Answer Selection: A Study And An Open Task"
