{"id":14959003,"url":"https://github.com/dongjunlee/text-cnn-tensorflow","last_synced_at":"2025-03-17T14:18:22.131Z","repository":{"id":236588804,"uuid":"112863764","full_name":"DongjunLee/text-cnn-tensorflow","owner":"DongjunLee","description":"Convolutional Neural Networks for Sentence Classification(TextCNN) implements by TensorFlow","archived":false,"fork":false,"pushed_at":"2019-05-30T05:17:43.000Z","size":2507,"stargazers_count":251,"open_issues_count":3,"forks_count":67,"subscribers_count":9,"default_branch":"master","last_synced_at":"2025-03-17T14:18:04.020Z","etag":null,"topics":["classification","deep-learning","hb-experiment","nlp","sentiment-analysis","tensorflow","tensorflow-models","text-cnn"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/DongjunLee.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2017-12-02T17:51:31.000Z","updated_at":"2025-01-21T03:56:03.000Z","dependencies_parsed_at":"2024-05-08T19:15:49.316Z","dependency_job_id":null,"html_url":"https://github.com/DongjunLee/text-cnn-tensorflow","commit_stats":null,"previous_names":["dongjunlee/text-cnn-tensorflow"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DongjunLee%2Ftext-cnn-tensorflow","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DongjunLee%2Ftext-cnn-tensorflow/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DongjunLee%2Ftext-cnn-tensorflow/releases","mani
fests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/DongjunLee%2Ftext-cnn-tensorflow/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/DongjunLee","download_url":"https://codeload.github.com/DongjunLee/text-cnn-tensorflow/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":244047645,"owners_count":20389206,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["classification","deep-learning","hb-experiment","nlp","sentiment-analysis","tensorflow","tensorflow-models","text-cnn"],"created_at":"2024-09-24T13:18:40.743Z","updated_at":"2025-03-17T14:18:22.102Z","avatar_url":"https://github.com/DongjunLee.png","language":"Python","readme":"# text-cnn [![hb-research](https://img.shields.io/badge/hb--research-experiment-green.svg?style=flat\u0026colorA=448C57\u0026colorB=555555)](https://github.com/hb-research)\n\nThis code implements the models from [Convolutional Neural Networks for Sentence Classification](http://arxiv.org/abs/1408.5882).\n\n- Figure 1: Illustration of a CNN architecture for sentence classification\n\n![figure-1](images/figure-1.png)\n\n\n## Requirements\n\n- Python 3.6\n- TensorFlow 1.4\n- [hb-config](https://github.com/hb-research/hb-config) (Singleton Config)\n- tqdm\n- requests\n- [Slack Incoming Webhook URL](https://my.slack.com/services/new/incoming-webhook/)\n\n## Project Structure\n\nProject initialized with [hb-base](https://github.com/hb-research/hb-base)\n\n    .\n    ├── config                  # Config files (.yml, .json) used with hb-config\n    ├── data            
        # dataset path\n    ├── notebooks               # Prototyping with numpy or tf.InteractiveSession\n    ├── scripts                 # download or prepare dataset using shell scripts\n    ├── text-cnn                # text-cnn architecture graphs (from input to logits)\n        ├── __init__.py             # Graph logic\n    ├── data_loader.py          # raw_data -\u003e processed_data -\u003e generate_batch (using Dataset)\n    ├── hook.py                 # training or test hook feature (e.g. print_variables)\n    ├── main.py                 # define experiment_fn\n    ├── model.py                # define EstimatorSpec\n    └── predict.py              # test trained model\n\nReference : [hb-config](https://github.com/hb-research/hb-config), [Dataset](https://www.tensorflow.org/api_docs/python/tf/data/Dataset#from_generator), [experiment_fn](https://www.tensorflow.org/api_docs/python/tf/contrib/learn/Experiment), [EstimatorSpec](https://www.tensorflow.org/api_docs/python/tf/estimator/EstimatorSpec)\n\n- Dataset : [rt-polarity](https://github.com/yoonkim/CNN_sentence), [Sentiment Analysis on Movie Reviews](https://www.kaggle.com/c/sentiment-analysis-on-movie-reviews/data)\n\n## Todo\n\n- apply embed_type\n\t- CNN-rand\n\t- CNN-static\n\t- CNN-nonstatic\n\t- CNN-multichannel\n\n## Config\n\nexample: kaggle\\_movie\\_review.yml\n\n```yml\ndata:\n  type: 'kaggle_movie_review'\n  base_path: 'data/'\n  raw_data_path: 'kaggle_movie_reviews/'\n  processed_path: 'kaggle_processed_data'\n  testset_size: 25000\n  num_classes: 5\n  PAD_ID: 0\n\nmodel:\n  batch_size: 64\n  embed_type: 'rand'     #(rand, static, non-static, multichannel)\n  pretrained_embed: \"\" \n  embed_dim: 300\n  num_filters: 256\n  filter_sizes:\n    - 2\n    - 3\n    - 4\n    - 5\n  dropout: 0.5\n\ntrain:\n  learning_rate: 0.00005\n  \n  train_steps: 100000\n  model_dir: 'logs/kaggle_movie_review'\n  \n  save_checkpoints_steps: 1000\n  loss_hook_n_iter: 1000\n  check_hook_n_iter: 1000\n  
min_eval_frequency: 1000\n  \nslack:\n  webhook_url: \"\"   # notifies you via Slack webhook after training\n```\n\n\n## Usage\n\nInstall requirements.\n\n```pip install -r requirements.txt```\n\nThen, prepare the dataset and train the model.\n\n```\nsh prepare_kaggle_movie_reviews.sh\npython main.py --config kaggle_movie_review --mode train_and_evaluate\n```\n\nAfter training, you can type in any sentence you want to classify using `predict.py`.\n\n```python predict.py --config rt-polarity```\n\nPredict example\n\n```\npython predict.py --config rt-polarity\nSetting max_seq_length to Config : 62\nload vocab ...\nTyping anything :)\n\n\u003e good\n1\n\u003e bad\n0\n```\n\n### Experiment modes\n\n:white_check_mark: : Working  \n:white_medium_small_square: : Not tested yet.\n\n- :white_check_mark: `evaluate` : Evaluate on the evaluation data.\n- :white_medium_small_square: `extend_train_hooks` : Extends the hooks for training.\n- :white_medium_small_square: `reset_export_strategies` : Resets the export strategies with the new_export_strategies.\n- :white_medium_small_square: `run_std_server` : Starts a TensorFlow server and joins the serving thread.\n- :white_medium_small_square: `test` : Tests training, evaluating and exporting the estimator for a single step.\n- :white_check_mark: `train` : Fit the estimator using the training data.\n- :white_check_mark: `train_and_evaluate` : Interleaves training and evaluation.\n\n\n### TensorBoard\n\n```tensorboard --logdir logs```\n\n- Category Color\n\n![category_image](images/category.png)\n\n- rt-polarity (binary classification)\n\n![images](images/rt-polarity_loss_and_accuracy.jpeg)\n\n- kaggle_movie_review (multiclass classification)\n\n![images](images/kaggle-loss_and_accuracy.jpg)\n\n\n## Reference\n\n- [Implementing a CNN for Text Classification in TensorFlow](http://www.wildml.com/2015/12/implementing-a-cnn-for-text-classification-in-tensorflow/) by Denny Britz\n- [Paper - Convolutional Neural Networks for Sentence 
Classification](http://arxiv.org/abs/1408.5882) (2014) by Y Kim\n- [Paper - A Sensitivity Analysis of (and Practitioners' Guide to) Convolutional Neural Networks for Sentence Classification](https://arxiv.org/pdf/1510.03820.pdf) (2015) by Y Zhang\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdongjunlee%2Ftext-cnn-tensorflow","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdongjunlee%2Ftext-cnn-tensorflow","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdongjunlee%2Ftext-cnn-tensorflow/lists"}