{"id":19287860,"url":"https://github.com/deeppavlov/intent_classifier","last_synced_at":"2025-04-22T04:32:30.458Z","repository":{"id":41478877,"uuid":"115599537","full_name":"deeppavlov/intent_classifier","owner":"deeppavlov","description":null,"archived":false,"fork":false,"pushed_at":"2018-08-24T07:45:47.000Z","size":2737,"stargazers_count":83,"open_issues_count":2,"forks_count":31,"subscribers_count":8,"default_branch":"master","last_synced_at":"2024-03-20T03:01:04.920Z","etag":null,"topics":["intent-classification","natural-language-processing","natural-language-understanding","neural-networks","nlp-machine-learning"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/deeppavlov.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2017-12-28T07:53:27.000Z","updated_at":"2024-03-20T03:01:04.921Z","dependencies_parsed_at":"2022-09-09T07:00:24.560Z","dependency_job_id":null,"html_url":"https://github.com/deeppavlov/intent_classifier","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/deeppavlov%2Fintent_classifier","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/deeppavlov%2Fintent_classifier/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/deeppavlov%2Fintent_classifier/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/deeppavlov%2Fintent_classifier/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/deeppavlov","download_url"
:"https://codeload.github.com/deeppavlov/intent_classifier/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":223888419,"owners_count":17220083,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["intent-classification","natural-language-processing","natural-language-understanding","neural-networks","nlp-machine-learning"],"created_at":"2024-11-09T22:07:26.430Z","updated_at":"2024-11-09T22:07:27.039Z","avatar_url":"https://github.com/deeppavlov.png","language":"Python","readme":"# This repo is no longer supported, as the intent classifier has become part of the DeepPavlov open-source library.\n\nTry it [here](http://docs.deeppavlov.ai/en/latest/components/classifiers.html).\n\n# Neural Networks for Intent Classifier\n\nIn this repo one can find code for training an intent classification model and running inference with it.\nThe model is implemented as a _shallow-and-wide Convolutional Neural Network_ [1].\n\n#### The fastText version currently used in this repo does not work correctly on Windows.\n\nThis repo also contains a pre-trained model for intent classification on the [SNIPS dataset](https://github.com/snipsco/nlu-benchmark/tree/master/2017-06-custom-intent-engines).\n\nThe SNIPS dataset covers the following intents: `AddToPlaylist`, `BookRestaurant`, `GetWeather`, `PlayMusic`, `RateBook`, `SearchCreativeWork`, `SearchScreeningEvent`.\n\n![Test results on SNIPS dataset](dp_ir_snips.png)\n\nTest results for other intent recognition services are taken from https://www.slideshare.net/KonstantinSavenkov/nlu-intent-detection-benchmark-by-intento-august-2017\n\n\n### 
How to install\n\nFirst of all, one has to download this repo:\n\n```\ngit clone https://github.com/deepmipt/intent_classifier.git\n\ncd intent_classifier\n```\n\nThe next step is to install the requirements:\n\n```\npip install -r requirements.txt\n```\n\n\n### How to use pre-trained model (SNIPS)\n\nNow one can run inference with the pre-trained model:\n\n```\n./intent_classifier.py ./snips_pretrained/snips_config.json\n```\n\nThe script loads the pre-trained model, downloads the pre-trained fastText embedding model [2] if necessary,\nand is then ready to predict the class of a given phrase and the probability that the phrase belongs to this class.\n\nExample:\n```\n./intent_classifier.py ./snips_pretrained/snips_config.json\n\u003eI want you to add 'I love you, baby' to my playlist\n\u003e(0.99986315, 'AddToPlaylist')\n```\n\n### How to train on your own data\n\nThe repo contains the script `train.py` for training a multilabel classifier.  \nThe training data file should have the following `data.csv` form:\n\n| request      |class_0|class_1|class_2|class_3| ...|\n|------------- |:-----:|:-----:|:-----:|:-----:|:--:|\n| text_0       | 1     | 0     | 0     |0      |... |\n| text_1       | 0     | 0     | 1     |0      |... |\n| text_2       | 0     | 1     | 0     |0      |... |\n| text_3       | 1     | 0     | 0     |0      |... |\n| ...          | ...   | ...   | ...   |...    |... |\n\nThen one can run `train.py`, which reads the data, tokenizes it, constructs the data, \nbuilds the dataset, and initializes and trains the model with the given parameters on the dataset from `data.csv`:\n\n```\n./train.py config.json data.csv\n```\n\nThe model will be trained using the parameters from the `config.json` file. \nSeveral of the parameters are described here:\n \n- The directory named by `model_path` should exist. 
\nFor example, if `config.json` contains `\"model_path\": \"./cnn_model\"`, \nthen the configuration parameters of the trained model will be saved to `./cnn_model/cnn_model_opt.json` \nand the weights of the model will be saved to `./cnn_model/cnn_model.h5`.\n\n- Parameter `model_from_saved` specifies whether to load a pre-trained model.\n\n- Parameter `lear_metrics` is a string that can include either metrics from `keras.metrics` \nor custom metrics from the file `metrics.py` (for example, `fmeasure`).\n\n- Parameter `confident_threshold` lies in the range `[0,1]` \nand sets the boundary that decides whether a sample belongs to a class.\n\n- Parameter `fasttext_model` contains the path to a pre-trained binary skipgram fastText [2] model for the English language. \nIf one prefers to use the default model, it will be downloaded when training starts.\n\n- Parameter `text_size` is the number of words to which each tokenized text request is padded.\n\n- Parameter `model_name` contains the name of the class method from `multiclass.py` that returns an uncompiled Keras model.\nOne can use `cnn_model`, a shallow-and-wide CNN (`config.json` contains parameters for this model),\nor `dcnn_model`, a deep CNN model (take care to provide the parameters this model requires).\nIt is also possible to write one's own model.\n\n- All other parameters refer to learning and network configuration.\n\n### How to infer\n\nInference can be run in two ways:\n```\n./infer.py config.json\n```\nor\n```\n./intent_classifier.py config.json\n```\n\nThe first runs the `infer.py` file, which reads parameters from the `config.json` file, initializes the tokenizer,\nand initializes the model and runs inference. The second does the same but reads samples from the command line.\n\n\n\n\n### References\n\n[1] Kim Y. Convolutional neural networks for sentence classification //arXiv preprint arXiv:1408.5882. – 2014.\n\n[2] P. Bojanowski*, E. Grave*, A. Joulin, T. 
Mikolov, Enriching Word Vectors with Subword Information.\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdeeppavlov%2Fintent_classifier","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdeeppavlov%2Fintent_classifier","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdeeppavlov%2Fintent_classifier/lists"}