https://github.com/activestate/apy


Host: GitHub
URL: https://github.com/activestate/apy
Owner: ActiveState
License: mit
Created: 2018-02-16T18:01:27.000Z (over 8 years ago)
Default Branch: master
Last Pushed: 2018-06-22T17:48:19.000Z (about 8 years ago)
Last Synced: 2025-01-09T10:30:05.376Z (over 1 year ago)
Language: Python
Size: 1.37 MB
Stars: 1
Watchers: 3
Forks: 3
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

README

# APy

Pipeline: Reddit JSON => SQLite database => training files => train chatbot => interact with chatbot
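The first two stages of this pipeline can be illustrated with a minimal sketch. It is not the code used in this repository: the table layout, the `load_comments` and `reply_pairs` helpers, and the `min_score` threshold are hypothetical. It assumes the Reddit dump has one JSON object per line with `id`, `parent_id`, `body`, and `score` fields, and that a comment's `parent_id` carries a `t1_` prefix when the parent is another comment.

```python
import json
import sqlite3


def load_comments(lines, conn):
    """Insert Reddit comment records (one JSON object per line) into SQLite."""
    # Hypothetical schema, chosen only for this illustration.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS comments ("
        "comment_id TEXT PRIMARY KEY, parent_id TEXT, body TEXT, score INTEGER)"
    )
    rows = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        rows.append((
            "t1_" + record["id"],      # assumed prefix so replies can join on parent_id
            record["parent_id"],
            record["body"],
            record.get("score", 0),
        ))
    conn.executemany("INSERT OR REPLACE INTO comments VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


def reply_pairs(conn, min_score=2):
    """Return (parent body, reply body) pairs for replies scoring at least min_score."""
    cursor = conn.execute(
        "SELECT parent.body, reply.body FROM comments AS reply "
        "JOIN comments AS parent ON reply.parent_id = parent.comment_id "
        "WHERE reply.score >= ? ORDER BY reply.comment_id",
        (min_score,),
    )
    return cursor.fetchall()
```

Passing an open dump file as `lines` and a `sqlite3.connect(...)` connection as `conn` fills the table. `reply_pairs` then yields the prompt/response pairs that a later step could write out as training files.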

## Setup for docker

1. Git clone [https://github.com/davetlewis-van/APy.git](https://github.com/davetlewis-van/APy.git) into a new directory.

1. `wget http://downloads.activestate.com/ActivePython/releases/3.6.0.3600/ActivePython-3.6.0.3600-linux-x86_64-glibc-2.3.6-401834.tar.gz`

1. `docker build -t "python36:dockerfile" .`

1. `docker run -it python36:dockerfile`

1. Sample training files are included. To use your own, copy them to the `new_data` folder, then update the file names listed at the top of `prepare_data.py` and the `dev_prefix` and `test_prefix` values in `settings.py`.

1. Navigate to the `/code/nmt-chatbot/setup` folder and run `prepare_data.py`

1. Navigate to the `/code/nmt-chatbot/` folder and run `train.py`

1. To interact with the chatbot, run `inference.py`.
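If you generate your own training files for the `new_data` folder, the sketch below shows one way to write prompt/reply pairs as two line-aligned text files. The `.from`/`.to` file naming, the `train` prefix, and the `write_parallel_files` helper are assumptions made for illustration. Check the file list at the top of `prepare_data.py` for the names this project actually expects.

```python
from pathlib import Path


def write_parallel_files(pairs, out_dir, prefix="train"):
    """Write (prompt, reply) pairs to <prefix>.from and <prefix>.to, one example per line."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    from_path = out_dir / f"{prefix}.from"
    to_path = out_dir / f"{prefix}.to"
    written = 0
    with from_path.open("w", encoding="utf-8") as src, \
            to_path.open("w", encoding="utf-8") as dst:
        for prompt, reply in pairs:
            # Collapse whitespace so each example stays on a single line;
            # line N of the .from file must correspond to line N of the .to file.
            prompt = " ".join(prompt.split())
            reply = " ".join(reply.split())
            if not prompt or not reply:
                continue  # skip pairs that would leave an empty line
            src.write(prompt + "\n")
            dst.write(reply + "\n")
            written += 1
    return written
```

For example, `write_parallel_files(pairs, "new_data")` would create `new_data/train.from` and `new_data/train.to` under these assumptions.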

## Setup on cloud.google.com

* Public IP: https://35.230.32.103

* Jupyter notebook access: `sudo jupyter notebook --ip=0.0.0.0 --port=8888 --allow-root`

* TensorBoard config:

  1. Create a firewall rule for port 6006.

  2. `cd nmt-chatbot/model`

  3. `tensorboard --logdir=train_log/ --host 0.0.0.0 --port 6006`

  4. Browse to: http://35.230.32.103:6006/#projector&run=.
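If the TensorBoard page does not load, it can help to confirm that the port is reachable through the firewall. The following is a small sketch using only the standard library. The `port_is_open` helper is hypothetical and not part of this repository.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Calling `port_is_open("35.230.32.103", 6006)` from your own machine should return `True` once the firewall rule is in place and TensorBoard is running. The same check works for the Jupyter port, 8888.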

## Resources

* [PythonProgramming.net chatbot tutorials](https://pythonprogramming.net/chatbot-deep-learning-python-tensorflow/)

* [AI and Chatbots in Technical Communication](https://www.cherryleaf.com/blog/2017/08/ai-chatbots-technical-communication-primer/)

* [Google Dialogflow](https://dialogflow.com)

## Datasets

### Reddit

* [JSON](https://files.pushshift.io/reddit/comments/)

* [Google BigQuery](https://bigquery.cloud.google.com/table/fh-bigquery:reddit_comments.2017_12?tab=schema)

### Stack Overflow

* Kaggle

  * [Questions.csv](https://www.kaggle.com/stackoverflow/pythonquestions/downloads/Questions.csv)

  * [Answers.csv](https://www.kaggle.com/stackoverflow/pythonquestions/downloads/Answers.csv)

  * [Tags.csv](https://www.kaggle.com/stackoverflow/pythonquestions/downloads/Tags.csv)

* [Google BigQuery](https://bigquery.cloud.google.com/dataset/fh-bigquery:stackoverflow)
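One possible way to turn the Kaggle CSV files into question/answer pairs is sketched below. It is not implemented in this repository. It assumes `Questions.csv` has `Id` and `Title` columns and `Answers.csv` has `ParentId`, `Score`, and `Body` columns. The `best_answers` helper is hypothetical.

```python
import csv


def best_answers(questions_file, answers_file):
    """Pair each question title with the body of its highest-scoring answer."""
    best = {}  # question Id -> (score, answer body)
    for row in csv.DictReader(answers_file):
        score = int(row["Score"])
        parent = row["ParentId"]
        # Keep only the top-scoring answer seen so far for each question.
        if parent not in best or score > best[parent][0]:
            best[parent] = (score, row["Body"])

    pairs = []
    for row in csv.DictReader(questions_file):
        if row["Id"] in best:  # questions without answers are skipped
            pairs.append((row["Title"], best[row["Id"]][1]))
    return pairs
```

Both arguments are open text files, for example from `open("Questions.csv", newline="")`, and you may need to pass an explicit `encoding`. The answer bodies are likely to contain HTML markup, which would need to be stripped before training.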

## Process

![Reddit JSON structure](img/json.png)

![sqlite table](img/sqlite.png)

![Training 1](img/training.png)

![Training 2](img/training2.png)

![TensorBoard Scalars](img/tensorboard.png)

![TensorBoard Projector](img/tensorboard2.png)

## TODO

* Get access to a GPU

  ![CPU usage](img/cpu.png)

* Figure out how to access my programming-specific BigQuery tables from the Google Cloud VM instance.

  ![BigQuery](img/bigquery.png)

* Figure out how to read in Stack Overflow Q&A

* Run the Docker container on the Google Cloud VM

## License

Copyright (c) 2018 ActiveState Software Inc.

Released under the BSD-3 license. See LICENSE file for details.
