https://github.com/leopeng1995/neuralsql
Make DataStore More Intelligent
https://github.com/leopeng1995/neuralsql
data-analysis mongodb sql
Last synced: about 1 month ago
JSON representation
Make DataStore More Intelligent
- Host: GitHub
- URL: https://github.com/leopeng1995/neuralsql
- Owner: leopeng1995
- Created: 2019-05-20T06:27:22.000Z (about 7 years ago)
- Default Branch: master
- Last Pushed: 2022-12-08T03:02:28.000Z (over 3 years ago)
- Last Synced: 2024-12-31T09:14:24.922Z (over 1 year ago)
- Topics: data-analysis, mongodb, sql
- Language: Python
- Homepage:
- Size: 15.1 MB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 10
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
## NeuralSQL - Make DataStore More Intelligent
### What is NeuralSQL?
It is a simple query engine using SQL-like syntax. Only supported Python 3.x.
### Motivation
SQL is an intuitive way to query data. Nowadays, more and more data use MongoDB as their storage databases which are benefited by MongoDB NoSQL Features. We want to build a SQL-like query engine to train machine learning or deep learning models using data in the MongoDB. This experimental project also demonstrates my idea on serverless function to predict using pretrained model stored in MongoDB.
### Features
* Use SQL-like syntax to query MongoDB and get data insight.
* Use multiple backends, such as TensorFlow, Keras, or scikit-learn.
* Computation Engine for generates MongoDB pipeline/map-reduce operator to reduce data transmission.
* Auto-completion and syntax highlighting CLI.
* Pretrained Model Auto Serving.
### Examples
#### Tutorial 0: WordCount
Simplest NeuralSQL Example. In fact, the SQL executor will generates execution operator using MongoDB map-reduce. Prepare data uses `neuralsql -L wordcount` or `neuralsql --load-sample wordcount`.
```
SELECT content FROM text8.train TRAINER WordCount;
```
Word occurrences top 500:
```
SELECT content FROM text8.train TRAINER WordCount LIMIT 500;
```
#### Tutorial 1: Word2Vec
```
SELECT content FROM text8.train TRAINER Word2Vec;
```
We can use this example to build keyword recommendation engine. In next example, we will show you how to use MongoDB Stitch to serve word similarity request.
#### Tutorial 2: Twenty News Classifier
Prepare data uses `neuralsql -L classifier` or `neuralsql -L classifier[twenty_news]`.
```
SELECT content FROM twenty_news.train TRAINER classifier.MultinomialNB;
```
Or you can use a classifier attached by parameters:
```
SELECT content FROM twenty_news.train TRAINER classifier.SGDClassifier WITH loss='hinge', penalty='l2', random_state=42, max_iter=5, tol=None;
```
#### Tutorial 3: Chatbot
We can use MongoDB data to build a simple chatbot!
#### Configuration
If you want to use MongoDB Altas and MongoDB Stitch, you can set your username and password in `config.ini`.
#### Serverless
We can use serverless to query pretrained models. In `chatbot (MongoDB Stitch)` example, we will use MongoDB Stitch function to query pretrained model stored in MongoDB. You can connect MongoDB Stitch using `mongo` command.
```bash
mongo "mongodb://:@stitch.mongodb.com:27020/?authMechanism=PLAIN&authSource=%24external&ssl=true&appName=todo-tutorial1-uhdox:mongodb-atlas:local-userpass"
```
Then you can create a function in your Stitch app console.
```
exports = async function(text) {
let res = await context.http.post({
url: "http://www.der.ai/chatbot",
form: {
user_id: context.user.id,
text: text
}
});
return EJSON.parse(res.body.text());
};
```
You should deploy the simple chatbot service in your server. We deployed one in our server (maybe slow or down).
```bash
cd samples/chatbot_service
./run.sh
```
Finally, you can call this function in mongo shell.
```
db.runCommand({
callFunction: "chatfunc",
arguments: ["Hello World!"]
})
```
#### TODO
Lots of things to do. This is just a demo project. Do not used in production environment! However, welcome all of you to propose suggestions.
* TF-IDF Using MongoDB Map-Reduce
#### RoadMap
* Improve SQL Parser
* Add more common machine learning / deep learning models support