{"id":18899192,"url":"https://github.com/leopeng1995/neuralsql","last_synced_at":"2026-05-12T12:41:56.715Z","repository":{"id":37605389,"uuid":"187580175","full_name":"leopeng1995/neuralsql","owner":"leopeng1995","description":"Make DataStore More Intelligent","archived":false,"fork":false,"pushed_at":"2022-12-08T03:02:28.000Z","size":15870,"stargazers_count":0,"open_issues_count":10,"forks_count":0,"subscribers_count":2,"default_branch":"master","last_synced_at":"2024-12-31T09:14:24.922Z","etag":null,"topics":["data-analysis","mongodb","sql"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/leopeng1995.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-05-20T06:27:22.000Z","updated_at":"2019-06-29T03:27:23.000Z","dependencies_parsed_at":"2023-01-25T03:45:14.321Z","dependency_job_id":null,"html_url":"https://github.com/leopeng1995/neuralsql","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/leopeng1995%2Fneuralsql","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/leopeng1995%2Fneuralsql/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/leopeng1995%2Fneuralsql/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/leopeng1995%2Fneuralsql/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/leopeng1995","download_url":"https://codeload.github.com/leopeng1995/neuralsql/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com"
,"kind":"github","repositories_count":239879307,"owners_count":19712176,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data-analysis","mongodb","sql"],"created_at":"2024-11-08T08:45:51.139Z","updated_at":"2025-10-30T18:55:21.560Z","avatar_url":"https://github.com/leopeng1995.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"## NeuralSQL - Make DataStore More Intelligent\n\n### What is NeuralSQL?\n\nNeuralSQL is a simple query engine with SQL-like syntax. It supports Python 3.x only.\n\n### Motivation\n\nSQL is an intuitive way to query data. More and more data is now stored in MongoDB to take advantage of its NoSQL features. We want to build a SQL-like query engine that trains machine learning or deep learning models on data stored in MongoDB. This experimental project also demonstrates my idea of using a serverless function to make predictions with a pretrained model stored in MongoDB.\n\n### Features\n\n* Use SQL-like syntax to query MongoDB and gain insight into the data.\n* Use multiple backends, such as TensorFlow, Keras, or scikit-learn.\n* A computation engine that generates MongoDB pipeline/map-reduce operators to reduce data transmission.\n* A CLI with auto-completion and syntax highlighting.\n* Automatic serving of pretrained models.\n\n### Examples\n\n#### Tutorial 0: WordCount\n\nThe simplest NeuralSQL example. Under the hood, the SQL executor generates an execution operator using MongoDB map-reduce. 
Prepare the data with `neuralsql -L wordcount` or `neuralsql --load-sample wordcount`.\n\n```\nSELECT content FROM text8.train TRAINER WordCount;\n```\n\nTop 500 word occurrences:\n\n```\nSELECT content FROM text8.train TRAINER WordCount LIMIT 500;\n```\n\n#### Tutorial 1: Word2Vec\n\n```\nSELECT content FROM text8.train TRAINER Word2Vec;\n```\n\nThis example can be used to build a keyword recommendation engine. A later example shows how to use MongoDB Stitch to serve word similarity requests.\n\n#### Tutorial 2: Twenty News Classifier\n\nPrepare the data with `neuralsql -L classifier` or `neuralsql -L classifier[twenty_news]`.\n\n```\nSELECT content FROM twenty_news.train TRAINER classifier.MultinomialNB;\n```\n\nOr you can pass parameters to a classifier:\n\n```\nSELECT content FROM twenty_news.train TRAINER classifier.SGDClassifier WITH loss='hinge', penalty='l2', random_state=42, max_iter=5, tol=None;\n```\n\n#### Tutorial 3: Chatbot\n\nWe can use MongoDB data to build a simple chatbot!\n\n\n#### Configuration\n\nIf you want to use MongoDB Atlas and MongoDB Stitch, set your username and password in `config.ini`.\n\n#### Serverless\n\nWe can use serverless functions to query pretrained models. In the `chatbot (MongoDB Stitch)` example, we use a MongoDB Stitch function to query a pretrained model stored in MongoDB. 
You can connect to MongoDB Stitch using the `mongo` command.\n\n\n```bash\nmongo \"mongodb://\u003cusername\u003e:\u003cpassword\u003e@stitch.mongodb.com:27020/?authMechanism=PLAIN\u0026authSource=%24external\u0026ssl=true\u0026appName=todo-tutorial1-uhdox:mongodb-atlas:local-userpass\"\n```\n\nThen create a function in your Stitch app console.\n\n```\nexports = async function(text) {\n  let res = await context.http.post({\n    url: \"http://www.der.ai/chatbot\",\n    form: {\n      user_id: context.user.id,\n      text: text\n    }\n  });\n  \n  return EJSON.parse(res.body.text());\n};\n```\n\nYou need to deploy the simple chatbot service on your own server. We have deployed one on our server (it may be slow or down).\n\n```bash\ncd samples/chatbot_service\n./run.sh\n```\n\nFinally, you can call this function from the mongo shell.\n\n```\ndb.runCommand({\n    callFunction: \"chatfunc\",\n    arguments: [\"Hello World!\"]\n})\n```\n\n#### TODO\n\nThere is a lot left to do. This is just a demo project. Do not use it in a production environment! Suggestions are welcome.\n\n* TF-IDF Using MongoDB Map-Reduce\n\n#### RoadMap\n\n* Improve SQL Parser\n* Add support for more common machine learning / deep learning models\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fleopeng1995%2Fneuralsql","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fleopeng1995%2Fneuralsql","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fleopeng1995%2Fneuralsql/lists"}