https://github.com/huggingface/personas

Datasets for Deep learning Personas

Host: GitHub
URL: https://github.com/huggingface/personas
Owner: huggingface
Created: 2017-02-12T11:22:42.000Z (over 9 years ago)
Default Branch: master
Last Pushed: 2017-12-27T01:35:42.000Z (over 8 years ago)
Last Synced: 2025-09-30T18:02:30.480Z (9 months ago)
Topics: datasets, deep-learning, neural-conversation-models
Size: 1.95 KB
Stars: 62
Watchers: 7
Forks: 24
Open Issues: 3
Metadata Files:
- Readme: README.md


# personas
Datasets for Deep learning Personas

***TL;DR:*** These are the datasets we used in our fun AI side-project experiment at https://personas.huggingface.co/

We've trained seq2seq models using [DeepQA](https://github.com/Conchylicultor/DeepQA), a TensorFlow implementation of the deep-learning-based chatbot described in "A Neural Conversational Model" (a.k.a. the Google paper).
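As a rough illustration of what seq2seq training data for a chatbot can look like, the sketch below pairs each utterance in a conversation with the reply that follows it. This is a generic approach and not necessarily how DeepQA preprocesses its corpora; the `make_pairs` helper and the sample conversation are hypothetical.

```python
from typing import Iterable, List, Tuple


def make_pairs(utterances: Iterable[str]) -> List[Tuple[str, str]]:
    """Pair each utterance with the one that follows it.

    Hypothetical helper: blank lines are dropped, then every
    consecutive (input, reply) pair becomes one training example.
    """
    turns = [u.strip() for u in utterances if u.strip()]
    return list(zip(turns, turns[1:]))


# Made-up conversation, for illustration only.
conversation = ["Hi there.", "Hello, how are you?", "Fine, thanks."]
pairs = make_pairs(conversation)

for source, target in pairs:
    print(f"{source!r} -> {target!r}")
```

A conversation with N non-empty turns yields N - 1 pairs, so a single-turn conversation contributes no examples.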

## Datasets used

* [Cornell Movie Dialogs](http://www.cs.cornell.edu/~cristian/Cornell_Movie-Dialogs_Corpus.html) corpus
* Supreme Court Conversation Data
* [Ubuntu Dialogue Corpus](https://arxiv.org/abs/1506.08909) for tech-support type discussion.
* [Stack Exchange Data Dump](https://archive.org/details/stackexchange)

The Stack Exchange Data Dump is an anonymized dump of all user-contributed content on the Stack Exchange network. Each site is formatted as a separate archive consisting of XML files zipped via 7-zip using bzip2 compression. Each site archive includes Posts, Users, Votes, Comments, PostHistory and PostLinks. For complete schema information, see the included readme.txt.

Attribution: CC BY-SA 3.0
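Since each Stack Exchange archive is a set of XML files, one way to read a file such as Posts is to stream it element by element rather than load it whole. The sketch below assumes each record is a `<row>` element whose fields are XML attributes, and that attributes such as `Id`, `PostTypeId`, `ParentId` and `Body` exist; check these against the readme.txt included in the dump. The `iter_rows` helper and the sample data are hypothetical, and unpacking the 7-zip archive is not shown.

```python
import io
import xml.etree.ElementTree as ET
from typing import BinaryIO, Dict, Iterator


def iter_rows(source: BinaryIO) -> Iterator[Dict[str, str]]:
    """Yield the attributes of each <row> element as a dict.

    Streams the file with iterparse so a large dump does not
    have to be parsed into memory all at once.
    """
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "row":
            yield dict(elem.attrib)
            elem.clear()  # drop the attributes we have already copied


# Tiny made-up sample; element and attribute names are assumptions.
SAMPLE = b"""<?xml version="1.0" encoding="utf-8"?>
<posts>
  <row Id="1" PostTypeId="1" Body="How do I mount a drive?" />
  <row Id="2" PostTypeId="2" ParentId="1" Body="Use the mount command." />
</posts>
"""

rows = list(iter_rows(io.BytesIO(SAMPLE)))
for row in rows:
    print(row["Id"], row["Body"])
```

For a real dump, an opened file (for example `open("Posts.xml", "rb")`) could be passed in place of the in-memory sample.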
