https://github.com/activestate/apy
- Host: GitHub
- URL: https://github.com/activestate/apy
- Owner: ActiveState
- License: mit
- Created: 2018-02-16T18:01:27.000Z (over 8 years ago)
- Default Branch: master
- Last Pushed: 2018-06-22T17:48:19.000Z (about 8 years ago)
- Last Synced: 2025-01-09T10:30:05.376Z (over 1 year ago)
- Language: Python
- Size: 1.37 MB
- Stars: 1
- Watchers: 3
- Forks: 3
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# APy
Reddit JSON => SQLite db => training files => train chatbot => interact with chatbot
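
The first three stages of this pipeline can be sketched in a few lines of Python. This is only an illustrative sketch, not the code shipped in this repository: the function names are hypothetical, and the field names (`id`, `parent_id`, `body`, `score`) and the `t1_` prefix on comment parents are assumptions about the newline-delimited Reddit comment dumps.

```python
import json
import sqlite3


def load_comments(lines, conn):
    """Insert newline-delimited JSON comments into a SQLite table.

    Assumes each record has "id", "parent_id", and "body" fields.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS comments ("
        "id TEXT PRIMARY KEY, parent_id TEXT, body TEXT, score INTEGER)"
    )
    for line in lines:
        line = line.strip()
        if not line:
            continue
        comment = json.loads(line)
        conn.execute(
            "INSERT OR REPLACE INTO comments VALUES (?, ?, ?, ?)",
            (
                comment["id"],
                comment["parent_id"],
                comment["body"],
                comment.get("score", 0),
            ),
        )
    conn.commit()


def build_pairs(conn):
    """Return (parent, reply) text pairs, keeping the top-scored reply per parent."""
    rows = conn.execute(
        "SELECT p.id, p.body, r.body FROM comments r "
        "JOIN comments p ON r.parent_id = 't1_' || p.id "
        "ORDER BY p.id, r.score DESC, r.id"
    )
    pairs = []
    seen = set()
    for parent_id, parent_body, reply_body in rows:
        if parent_id in seen:
            continue  # a higher-scored reply was already kept
        seen.add(parent_id)
        # Collapse whitespace so each example fits on one line.
        pairs.append((" ".join(parent_body.split()), " ".join(reply_body.split())))
    return pairs


def write_training_files(pairs, from_path, to_path):
    """Write line-aligned source/target files (file names are up to the caller)."""
    with open(from_path, "w", encoding="utf-8") as src, \
            open(to_path, "w", encoding="utf-8") as tgt:
        for parent, reply in pairs:
            src.write(parent + "\n")
            tgt.write(reply + "\n")
```

Keeping only the highest-scored reply per parent is one possible filtering choice; other heuristics, such as length limits or a minimum score, would fit in `build_pairs` as well.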
## Setup for docker
1. Clone [https://github.com/davetlewis-van/APy.git](https://github.com/davetlewis-van/APy.git) into a new directory.
1. `wget http://downloads.activestate.com/ActivePython/releases/3.6.0.3600/ActivePython-3.6.0.3600-linux-x86_64-glibc-2.3.6-401834.tar.gz`
1. `docker build -t "python36:dockerfile" .`
1. `docker run -it python36:dockerfile`
1. Sample training files are included. To use your own, copy them to the `new_data` folder, then update the files listed at the top of `prepare_data.py` and the `dev_prefix` and `test_prefix` values in `settings.py`.
1. Navigate to the `/code/nmt-chatbot/setup` folder and run `prepare_data.py`.
1. Navigate to the `/code/nmt-chatbot/` folder and run `train.py`.
1. To interact with the chatbot, run `inference.py`.
## Setup on cloud.google.com
* Public IP: https://35.230.32.103
* Jupyter notebook access: `sudo jupyter notebook --ip 0.0.0.0 --port 8888 --allow-root`
* TensorBoard config:
    1. Create a firewall rule for port 6006.
    2. `cd nmt-chatbot/model`
    3. `tensorboard --logdir=train_log/ --host 0.0.0.0 --port 6006`
    4. Browse to: http://35.230.32.103:6006/#projector&run=.
## Resources
* [PythonProgramming.net chatbot tutorials](https://pythonprogramming.net/chatbot-deep-learning-python-tensorflow/)
* [AI and Chatbots in Technical Communication](https://www.cherryleaf.com/blog/2017/08/ai-chatbots-technical-communication-primer/)
* [Google Dialogflow](https://dialogflow.com)
## Datasets
### Reddit
* [JSON](https://files.pushshift.io/reddit/comments/)
* [Google BigQuery](https://bigquery.cloud.google.com/table/fh-bigquery:reddit_comments.2017_12?tab=schema)
### Stack Overflow
* Kaggle
    * [Questions.csv](https://www.kaggle.com/stackoverflow/pythonquestions/downloads/Questions.csv)
    * [Answers.csv](https://www.kaggle.com/stackoverflow/pythonquestions/downloads/Answers.csv)
    * [Tags.csv](https://www.kaggle.com/stackoverflow/pythonquestions/downloads/Tags.csv)
* [Google BigQuery](https://bigquery.cloud.google.com/dataset/fh-bigquery:stackoverflow)
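
One possible way to turn the Stack Overflow CSV files into question/answer pairs is sketched below, using only the standard library. It is a hedged starting point, not something this project already implements: the function name is hypothetical, and the column names (`Id`, `Title`, `Score`, `ParentId`, `Body`) are assumptions about the layout of `Questions.csv` and `Answers.csv`.

```python
import csv


def pair_questions_with_answers(questions_file, answers_file):
    """Return (question_title, answer_body) pairs using the top-scored answer.

    Both arguments are open text files (or any iterable of CSV lines).
    Column names are assumptions about the Kaggle export.
    """
    best = {}  # question Id -> (score, answer body)
    for row in csv.DictReader(answers_file):
        score = int(row["Score"])
        question_id = row["ParentId"]
        if question_id not in best or score > best[question_id][0]:
            best[question_id] = (score, row["Body"])

    pairs = []
    for row in csv.DictReader(questions_file):
        if row["Id"] in best:
            pairs.append((row["Title"], best[row["Id"]][1]))
    return pairs


# Example usage (paths and encoding are assumptions; adjust to your download):
# with open("Questions.csv", encoding="latin-1", newline="") as q, \
#         open("Answers.csv", encoding="latin-1", newline="") as a:
#     pairs = pair_questions_with_answers(q, a)
```

Answer bodies in this kind of export typically contain HTML markup, so a cleanup step would likely be needed before writing training files.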
## Process






## TODO
* Get access to a GPU.
* Figure out how to access my programming-specific BigQuery tables from the Google Cloud VM instance.
* Figure out how to read in Stack Overflow Q&A.
* Run the Docker container on the Google Cloud VM.
## License
Copyright (c) 2018 ActiveState Software Inc.
Released under the BSD-3 license. See LICENSE file for details.