{"id":19031484,"url":"https://github.com/plandes/clj-example-nlp-ml","last_synced_at":"2026-03-16T13:03:07.688Z","repository":{"id":80107467,"uuid":"64150438","full_name":"plandes/clj-example-nlp-ml","owner":"plandes","description":"Example Project for Natural Language Processing and Machine Learning Libraries","archived":false,"fork":false,"pushed_at":"2018-06-29T17:29:09.000Z","size":948,"stargazers_count":13,"open_issues_count":0,"forks_count":0,"subscribers_count":4,"default_branch":"master","last_synced_at":"2025-04-18T01:47:59.481Z","etag":null,"topics":["clojure","dataset","elasticsearch","feature-engineering","machine-learning","natural-language-processing"],"latest_commit_sha":null,"homepage":null,"language":"Clojure","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/plandes.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2016-07-25T16:26:29.000Z","updated_at":"2021-10-11T05:39:27.000Z","dependencies_parsed_at":null,"dependency_job_id":"cc9fa71e-7a4a-4521-b59c-18c4aaf6e2dc","html_url":"https://github.com/plandes/clj-example-nlp-ml","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/plandes%2Fclj-example-nlp-ml","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/plandes%2Fclj-example-nlp-ml/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/plandes%2Fclj-example-nlp-ml/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/p
landes%2Fclj-example-nlp-ml/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/plandes","download_url":"https://codeload.github.com/plandes/clj-example-nlp-ml/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":250471803,"owners_count":21436025,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["clojure","dataset","elasticsearch","feature-engineering","machine-learning","natural-language-processing"],"created_at":"2024-11-08T21:23:47.153Z","updated_at":"2026-03-16T13:03:07.593Z","avatar_url":"https://github.com/plandes.png","language":"Clojure","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Example Project for Natural Language Processing and Machine Learning Libraries\n\n**Note**: [This repository](https://github.com/plandes/todo-task) has a more up\nto date and better example of a real (academic) project of how to use these\nlibraries.\n\nThis is a simple and small example of how to use the following libraries:\n\n* [Natural Language Processing](https://github.com/plandes/clj-nlp-parse)\n* [Machine Learning](https://github.com/plandes/clj-ml-model)\n* [Machine Learning Dataset](https://github.com/plandes/clj-ml-dataset)\n\nThis project extends Carin Meier's\n[speech act classifier](http://gigasquidsoftware.com/blog/2015/10/20/speech-act-classification-for-text-with-clojure/).\nNone of her [code](https://github.com/gigasquid/speech-acts-classifier) was\nused, only the data to test and train (found in the `resources` directory).\n\nNote that the library also illustrates 
how to use the\n[action command line interface library](https://github.com/plandes/clj-actioncli),\nwhich you can use to [build out a CLI version](#command-line).\n\n\u003c!-- markdown-toc start - Don't edit this section. Run M-x markdown-toc-refresh-toc --\u003e\n## Table of Contents\n\n- [Documentation](#documentation)\n- [Usage](#usage)\n    - [REPL](#repl)\n    - [Command line](#command-line)\n- [Building](#building)\n- [License](#license)\n\n\u003c!-- markdown-toc end --\u003e\n\n\n## Documentation\n\nAPI (incomplete) [documentation](https://plandes.github.io/clj-example-nlp-ml/codox/index.html).\n\n\n## Usage\n\nThis project provides a real working example of a statistical natural language\nprocessing program.  The code itself is given as examples in the libraries it\nuses (see top of this README).  To use it, clone the repository and build with\nlein (see the [command line docs](#command-line)).\n\n\n### REPL\n\n```clojure\nuser\u003e (System/setProperty \"zensols.model\" \"path-to-model\")\nuser\u003e (require '[zensols.example.sa-model :as sa])\nuser\u003e (sa/classify-utterance \"when will we get there\")\nINFO  2016-07-15 18:19:00.957: stanford: parsing: \u003cwhen will we get there\u003e\nINFO  2016-07-15 18:19:00.979: stanford: creating tagger model at .../stanford/pos/english-left3words-distsim.tagger\nINFO  2016-07-15 18:19:01.565: stanford: creating ner annotators: [\"edu/stanford/nlp/models/ner/english.conll.4class.distsim.crf.ser.gz\"]\n=\u003e \"question\"\n```\n\n\n### Command line\n\n1. Install [Leiningen](http://leiningen.org) (this is just a script)\n2. Install [GNU make](https://www.gnu.org/software/make/)\n3. Install [Git](https://git-scm.com)\n4. Follow the directions in the [build section](#building)\n5. Create the distribution on the desktop: `make dist`\n6. Start the Elasticsearch server using the\n   [ML Dataset project](https://github.com/plandes/clj-ml-dataset)\n7. 
Load the corpus into Elasticsearch: `cd ~/Desktop/nlp-ml-example/bin ; ./saclassify load-corpus`\n8. Run: `./saclassify classify -u 'when will we get there'`\n\n\n## Building\n\nAll [Leiningen](http://leiningen.org) tasks will work in this project.  For\nadditional build functionality (such as git tag convenience utilities),\nclone the [Clojure build repo](https://github.com/plandes/clj-zenbuild) into the\nparent directory of this project:\n```bash\n   cd ..\n   git clone https://github.com/plandes/clj-zenbuild\n```\n\n## License\n\nCopyright (c) 2016, 2017, 2018 Paul Landes\n\nPermission is hereby granted, free of charge, to any person obtaining a copy of\nthis software and associated documentation files (the \"Software\"), to deal in\nthe Software without restriction, including without limitation the rights to\nuse, copy, modify, merge, publish, distribute, sublicense, and/or sell copies\nof the Software, and to permit persons to whom the Software is furnished to do\nso, subject to the following conditions:\n\nThe above copyright notice and this permission notice shall be included in all\ncopies or substantial portions of the Software.\n\nTHE SOFTWARE IS PROVIDED \"AS IS\", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR\nIMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,\nFITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE\nAUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER\nLIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,\nOUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE\nSOFTWARE.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fplandes%2Fclj-example-nlp-ml","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fplandes%2Fclj-example-nlp-ml","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fplandes%2Fclj-example-nlp-ml/lists"}