{"id":13719395,"url":"https://github.com/utahnlp/therapist-observer","last_synced_at":"2025-05-07T11:31:36.730Z","repository":{"id":74763108,"uuid":"189759150","full_name":"utahnlp/therapist-observer","owner":"utahnlp","description":"Code for the ACL 2019 paper \"Observing Dialogue in Therapy: Categorizing and Forecasting Behavioral Codes\"","archived":false,"fork":false,"pushed_at":"2022-06-11T21:50:58.000Z","size":290,"stargazers_count":12,"open_issues_count":0,"forks_count":5,"subscribers_count":3,"default_branch":"master","last_synced_at":"2024-11-14T08:36:22.511Z","etag":null,"topics":["acl2019","attention","behavior-coding","dialog","elmo","focal-loss","hierarchical-attention-networks","psychotherapy","transformer-encoder"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/utahnlp.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2019-06-01T17:17:33.000Z","updated_at":"2024-04-14T23:22:39.000Z","dependencies_parsed_at":"2024-01-06T00:13:18.918Z","dependency_job_id":"3ebbdfae-fceb-4384-8d2a-6fa17206a11a","html_url":"https://github.com/utahnlp/therapist-observer","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/utahnlp%2Ftherapist-observer","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/utahnlp%2Ftherapist-observer/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/utahnlp%2Ftherapist-observer/releases","manifests_url":"https://repos.
ecosyste.ms/api/v1/hosts/GitHub/repositories/utahnlp%2Ftherapist-observer/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/utahnlp","download_url":"https://codeload.github.com/utahnlp/therapist-observer/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":252868883,"owners_count":21816931,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["acl2019","attention","behavior-coding","dialog","elmo","focal-loss","hierarchical-attention-networks","psychotherapy","transformer-encoder"],"created_at":"2024-08-03T01:00:47.882Z","updated_at":"2025-05-07T11:31:35.619Z","avatar_url":"https://github.com/utahnlp.png","language":"Python","funding_links":[],"categories":["Psychotherapy"],"sub_categories":["Tools for tests and experiments"],"readme":"\u003ca href=\"#\"\u003e\n    \u003cimg src=\"https://www.mlciv.com/assets/img/therapist-observer2.png\" alt=\"therapist logo\" title=\"therapist observer\" align=\"right\" height=\"200\" /\u003e\n\u003c/a\u003e\n\nTherapist-Observer\n==================\n\nThis repo implements a family of neural components for various hierarchical\ndialogue models described in [\"Observing Dialogue in Therapy:\nCategorizing and Forecasting Behavioral Codes\"](https://arxiv.org/pdf/1907.00326.pdf) by Cao et al. 
in\nACL 2019.\n```\n @inproceedings{cao2019observing,\n      author    = {Cao, Jie and Tanana, Michael and Imel, Zac E.\n      and Poitras, Eric and Atkins, David C and Srikumar, Vivek},\n      title     = {Observing Dialogue in Therapy: Categorizing and Forecasting Behavioral Codes},\n      booktitle = {Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics},\n      year      = {2019}\n  }\n```\n\nBesides replicating the results on the psychotherapy dataset used in our\npaper, we also offer a guideline for building models with SOTA\nneural components for conversational analysis in other domains.\n\n# Table of Contents\n*******************\n\u003c!--ts--\u003e\n   * [Therapist-Observer](#therapist-observer)\n   * [Table of Contents](#table-of-contents)\n   * [Part I. Usage](#part-i-usage)\n      * [Required Software](#required-software)\n      * [Data Preprocessing](#data-preprocessing)\n      * [Preparing Embedding](#preparing-embedding)\n      * [Training](#training)\n         * [Training from scratch](#training-from-scratch)\n         * [Analysis for training](#analysis-for-training)\n         * [Resume Training from a checkpoint model](#resume-training-from-a-checkpoint-model)\n      * [Evalution](#evalution)\n   * [Part II. Experiment Desgining](#part-ii-experiment-desgining)\n      * [Categorizing](#categorizing)\n      * [Forecasting](#forecasting)\n   * [Part VI. Usage for Other Dataset or Tasks](#part-vi-usage-for-other-dataset-or-tasks)\n      * [Building Data Input](#building-data-input)\n      * [Model Designing](#model-designing)\n         * [Hierarchical Encoder](#hierarchical-encoder)\n         * [Various Attention Mechansims](#various-attention-mechansims)\n         * [Various Embeddings](#various-embeddings)\n   * [Known Issues (To be moved to issues)](#known-issues-to-be-moved-to-issues)\n\n\u003c!-- Added by: jcao, at: Tue Jun  4 22:59:37 MDT 2019 --\u003e\n\n\u003c!--te--\u003e\n# Part I. 
Usage\n*******************\n\n## Required Software\n\n   - Install pyenv or another Python environment manager\n\n   In our case, we use pyenv and its plugin pyenv-virtualenv to set up\n   the Python environment. Please follow the steps in\n   https://github.com/pyenv/pyenv-virtualenv for details. Alternative\n   environment managers such as conda are also fine.\n\n   - Install required packages\n\n   ```bash\n   pyenv install 2.7.12\n   # in our default setting, we use `pyenv activate py2.7_tf1.4` to\n   # activate the environment; please change this according to your preference.\n\n   pyenv virtualenv 2.7.12 py2.7_tf1.4\n   pyenv activate py2.7_tf1.4\n   pip install tensorflow-gpu==1.4.0 spacy pandas ujson h5py sklearn matplotlib\n   ```\n\n   - Check out this project.\n\n   ```bash\n       git clone git@github.com:utahnlp/therapist-observer.git therapist-observer\n   ```\n   The `tensorflow` folder is the source code directory for the neural models.\n\n   The `Expt` folder manages experiments: it includes all the commands (Expt/psyc-scripts/commands) and config files (Expt/psyc-scripts/configs) needed to launch the experiments, and it stores all experiment outputs. In this repo, `Expt/psyc-scripts/commands/env.sh` contains the global variables; all model hyperparameters and related configurations are assigned in the config files in Expt/psyc-scripts/configs, each of which corresponds to a model. 
For a detailed description of the folders in `Expt`, please refer to the [Expt README file](Expt/README.md).\n\n## Data Preprocessing\n\nThe preprocessing pipeline consists of the following steps:\n0) Put original data into `Expt/data/psyc_ro/download/data_filename` \n1) Data Transformation (**trans.sh**), check the path in `trans.sh` \n2) Dataset split and Placement (**place_data.sh**) \n3) Tokenization (**tok.sh**) \n4) Extra Preprocessing (**preprocess_dataset.sh**) \nThe following command runs them in sequence to complete the preprocessing pipeline.\n\n```bash\n# this takes about 30 minutes.\ncd Expt/psyc-scripts/commands/\n./pre_pipe.sh\n```\n\nWhen re-executing this, finished sub tasks will be skipped because the\ncorresponding output folder already exists. To rerun a sub task, manually delete the\ncorresponding folder.\n\nFor more details on preprocessing, please refer to the [README of commands](Expt/psyc-scripts/commands/README.md).\n\n## Preparing Embedding\n\n```bash\n# download glove.840B.300d into $RO_DATA_DIR,\n# WORD_EMB_FILE in each config file will point to the path of this downloaded file\n./download_glove.sh\n\n# download elmo weights and options file into $DATA_DIR/psyc_elmo\n# ELMO_OPTION_FILE and ELMO_WEIGHT_FILE will point to the downloaded elmo weights and options file\n./download_elmo.sh\n\n# prepare vocabulary and elmo for training\n# generating vocabulary embedding in $VOCAB_DIR in the corresponding config file\n# which can be used by any task with $CONTEXT_WINDOW = 8, here, we take our selected model on categorizing client codes as an example.\n./prepare.sh ../configs/categorizing/selected/C_C.sh\n\n# Commands ending with \"gpuid\" mean that CUDA_VISIBLE_DEVICES will be specified by a second GPUID argument.\n# ./prepare_gpuid.sh ../configs/categorizing/selected/C_C.sh 1\n```\n\nThe above commands mainly prepare the vocabulary and build\nELMo embeddings for every sentence and every token. 
When ELMo is enabled,\nthis command may take 25 minutes and use around 12G of GPU memory.\n\nYou only need to do the preparation again when you need to update\nthe embedding, or you have retokenized the data (tok.sh), or you want\nto build a vocabulary for a larger context window. Once $VOCAB_DIR is\ngenerated, this vocabulary can be used for other recipes by pointing\n$VOCAB_DIR to this vocab folder.\n\nAll the following embedding related configurations in the config file\nwill impact the vocabulary preparation.\n\n  - **WORD_EMB_FILE**\n\nBy default, we use glove.840B.300d, which is the default value of WORD_EMB_FILE in our config files.\nTo use another word embedding, please change this configuration and do the preparation again.\n\n  - **ELMO_OPTION_FILE**, **ELMO_WEIGHT_FILE**\n\nBy default, these two variables point to the default location of the downloaded ELMo files.\nIf using a domain specific ELMo or another pretrained ELMo, make sure to change the above two variables in the config file, and prepare again.\n\n - **CONTEXT_WINDOW**\n\nYou can change the window size by simply setting, e.g., $CONTEXT_WINDOW=16, but it is recommended to re-prepare\nthe vocab when changing the window size.  
This is because, when generating\nsliding window dialogue segments, the words in the last $CONTEXT_WINDOW\nutterances of a dialogue may have a slight impact on word frequency.\n\nFor more details about the configuration, please refer to the [README on configs](Expt/psyc-scripts/configs/README.md).\n\n## Training\n\n### Training from scratch\n```bash\n# every training command takes a single argument\n./train.sh \u003cconfig_file\u003e\n\n# training from scratch, see `tensorflow/classes/config_reader.py` for details of each argument in config_file\n# Again, we use the selected model on categorizing client codes as an example, ../configs/categorizing/selected/C_C.sh\n# $CONFIG_DIR will be made, train.log shows the training progress\n# $CONFIG_DIR/models/ will save the models and checkpoints every $STEPS_PER_CHECKPINTS batches\n./train.sh ../configs/categorizing/selected/C_C.sh\n\n# Commands ending with \"gpuid\" mean that CUDA_VISIBLE_DEVICES will be specified by a second GPUID argument.\n./train_gpuid.sh ../configs/categorizing/selected/C_C.sh 1\n```\n\nNote that during training, the best model with respect to each metric is saved in $CONFIG_DIR/models/.\n$CONFIG_DIR is required to be set in the model config file.\n\n```\nmodel prefix = $ALGO + sub_model_prefix.\n```\n\n$ALGO is just a name to identify your model. 
See `tensorflow/classes/config_reader.py` for more details.\n$sub_model_prefix is related to the metrics we use for evaluation, and follows the pattern \"_A_B\".\n\n```\n# A can be in {P, R, F1, R@K}\n# B can be in {macro, weighted_macro, micro} and all MISC labels.\n```\n\nHence, sub_model_prefix can be _F1_macro, which is what we used for our performance evaluation.\n\n### Analysis for training\n  ```bash\n  # for analyzing training log for Patient(client) models\n  python $ROOT_DIR/Expt/stats_scripts/stats_P.py train.log\n\n  # for analyzing training log for Therapist models\n  python $ROOT_DIR/Expt/stats_scripts/stats_T.py train.log\n  ```\n\nThe whole training takes around 20 hours on a V100 GPU. The commands above analyze train.log and print the current best performance.\n\n### Resume Training from a checkpoint model\n\n```bash\n# training from saved checkpoint, matched by model file name with prefix as $MODEL_PREFIX_TO_RESTORE\n./train_restore.sh \u003cconfig_file\u003e sub_model_prefix\n\n# The sub_model_prefix argument is optional; when it is omitted, the saved model with the best loss is loaded.\n# However, the model with the smallest loss may not give the best performance. You can resume from the model that is best with respect to a chosen metric.\n./train_restore.sh ../configs/categorizing/hlstm_8_p_semb_ru_elmo_pre1024_focal_rur_add_hs512_f1.sh _F1_macro\n```\n\n## Evalution\n\n```bash\n# For evaluating a trained model, sub_model_prefix follows the same guide as train_restore.sh\n./dev.sh \u003cconfig_file\u003e sub_model_prefix\n\n# evaluate on the dev set with the saved model that is best with respect to macro F1.\n./dev.sh ../configs/categorizing/selected/C_C.sh _F1_macro\n\n# dev on test means doing the same evaluation on the test set.\n./dev_on_test.sh ../configs/categorizing/selected/C_C.sh _F1_macro\n```\n\nThese scripts can be invoked manually once the model to be restored is saved\n in the \"folder\". 
After evaluation, a dev_{model_name}.log will be\n generated in the $CONFIG_DIR/training folder, results on the dev set will\n appear in $CONFIG_DIR/results, and results on the test set will appear in\n $CONFIG_DIR/results_on_test.\n\n# Part II. Experiment Desgining\n*******************************\n\nThe two tasks in our paper are distinguished by the\nconfigurations in the config file shown below.\n\nAll selected recipes are in `Expt/psyc-scripts/configs/categorizing/selected/`\nand `Expt/psyc-scripts/configs/forecasting/selected/`.\n\nYou can follow the steps above to cook each of them. Note that\nif $VOCAB_DIR is already built, you can skip the\npreprocessing and preparing steps; only training and evaluation are\nrequired. If you would like to try a different tokenization or embedding,\nthen redo from the corresponding steps.\n\n```bash\n# categorization task will use the last utterance(response) to be labeled\n# forecasting task will not use the last utterance(response) to be labeled\n# `x` just means switch on, leave it empty to switch off\nUSE_RESPONSE_U=x\n\n# We always use the speaker information for both context and response\nUSE_RESPONSE_S=x\n\n# decode_goal in ['SPEAKER','ALL_LABEL','P_LABEL','T_LABEL','SEQ_TAG']\n# use T_LABEL for therapist code only\nDECODE_GOAL=T_LABEL\n# use P_LABEL for patient code only\nDECODE_GOAL=P_LABEL\n```\n\nWe offer the performance table on the selected models in our paper as\nfollows. 
For a description of each configuration, please refer to the\n[README for config file](Expt/psyc-scripts/configs/README.md).\n\n\nFor the names of selected models, the last character 'C' or 'T' means client or therapist.\nThe second to last character 'C' or 'F' means categorizing task or forecasting task.\nThe remaining part of the name is an id to distinguish different neural architectures.\nSee more details in the paper.\n\n## Categorizing\nFor client, the best model does not need any word or utterance attention.\n\n| Method                                                                                | macro    | FN           | CHANGE   | SUSTAIN  |\n|---------------------------------------------------------------------------------------|:--------:|:------------:|:--------:|:--------:|\n| Majority                                                                              | 30.6     | **__91.7__** | 0.0      | 0.0      |\n| [Xiao et al. (2016)](http://scuba.usc.edu/pdf/xiao2016_behavioral-codi.pdf)           | 50.0     | 87.9         | 32.8     | __29.3__ |\n| [BiGRU_generic_C](../../tree/master/Expt/psyc-scripts/configs/categorizing/selected/BiGRU_generic_C.sh) | __50.2__ | 87.0         | __35.2__ | 28.4     |\n| [BiGRU_ELMo_C](../../tree/master/Expt/psyc-scripts/configs/categorizing/selected/BiGRU_ELMo_C.sh)       | 52.9     | 87.6         | **39.2** | 32.0     |\n| [Can et al. (2015)](https://sail.usc.edu/publications/files/dogan-is150788.pdf)       | 44.0     | 91.0         | 20.0     | 21.0     |\n| [Tanana et al. 
(2016)](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4842096/)         | 48.3     | 89.0         | 29.0     | 27.0     |\n| [CONCAT_C_C](../../tree/master/Expt/psyc-scripts/configs/categorizing/selected/CONCAT_C_C.sh)           | 51.8     | 86.5         | 38.8     | 30.2     |\n| [GMGRU_H_C_C](../../tree/master/Expt/psyc-scripts/configs/categorizing/selected/GMGRU_H_C_C.sh)         | 52.6     | 89.5         | 37.1     | 31.1     |\n| [BiDAF_H_C_C](../../tree/master/Expt/psyc-scripts/configs/categorizing/selected/BiDAF_H_C_C.sh)         | 50.4     | 87.6         | 36.5     | 27.1     |\n| [Our Best](../../tree/master/Expt/psyc-scripts/configs/categorizing/selected/C_C.sh)                    | **53.9** | 89.6         | 39.1     | **33.1** |\n| Change                                                                               | **+3.5** | **-2.1**     | **+3.9** | **+3.8** |\n\n\nFor the therapist, it uses GMGRUH for word attention and ANCHOR42 for utterance attention.\n\n| Method                                                                                | macro    | FA       | RES      | REC      | GI       | QUC      | QUO      | MIA      | MIN       |\n|---------------------------------------------------------------------------------------|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|:---------:|\n| Majority                                                                              | 5.87     | 47.0     | 0.0      | 0.0      | 0.0      | 0.0      | 0.0      | 0.0      | 0.0       |\n| [Xiao et al. 
(2016)](http://scuba.usc.edu/pdf/xiao2016_behavioral-codi.pdf)           | 59.3     | __94.7__ | 50.2     | 48.3     | 71.9     | 68.7     | 80.1     | 54.0     | 6.5       |\n| [BiGRU_generic_T](../../tree/master/Expt/psyc-scripts/configs/categorizing/selected/BiGRU_generic_T.sh) | __60.2__ | 94.5     | __50.5__ | __49.3__ | 72.0     | 70.7     | 80.1     | __54.0__ | __10.8__  |\n| [BiGRU_ELMo_T](../../tree/master/Expt/psyc-scripts/configs/categorizing/selected/BiGRU_ELMo_T.sh)       | 62.6     | 94.5     | 51.6     | 49.4     | 70.7     | 72.1     | 80.8     | 57.2     | 24.2      |\n| [Can et al. (2015)](https://sail.usc.edu/publications/files/dogan-is150788.pdf)       | -        | 94.0     | 49.0     | 45.0     | __74.0__ | __72.0__ | __81.0__ | -        | -         |\n| [Tanana et al. (2016)](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4842096/)         | -        | 94.0     | 48.0     | 39.0     | 69.0     | 68.0     | 77.0     | -        | -         |\n| [CONCAT_C_T](../../tree/master/Expt/psyc-scripts/configs/categorizing/selected/CONCAT_C_T.sh)           | 61.0     | 94.5     | 54.6     | 34.3     | 73.3     | 73.6     | 81.4     | 54.6     | 22.0      |\n| [GMGRU_H_C_T](../../tree/master/Expt/psyc-scripts/configs/categorizing/selected/GMGRU_H_C_T.sh)         | 64.9     | 94.9     | **56.0** | 54.4     | **75.5** | **75.7** | **83.0** | **58.2** | 21.8      |\n| [BiDAF_H_C_T](../../tree/master/Expt/psyc-scripts/configs/categorizing/selected/BiDAF_H_C_T.sh)         | 63.8     | 94.7     | 55.9     | 49.7     | 75.4     | 73.8     | 80.0     | 56.2     | 24.0      |\n| [Our Best](../../tree/master/Expt/psyc-scripts/configs/categorizing/selected/C_T.sh)                    | **65.4** | **95.0** | 55.7     | **54.9** | 74.2     | 74.8     | 82.6     | 56.6     | **29.7**  |\n| Change                                                                                | **+5.2** | **+0.3** | **+3.9** | **+3.8** | **+0.2** | **+2.8** | **+1.6** | **+2.6** | **+18.9** 
|\n\n\n## Forecasting\n\nFor both client and therapist, the best model uses no word attention, and uses SELF42 utterance attention.\n\n| Method                                                                       | Dev      | Dev      | Test     | Test | Test     | Test     |\n|------------------------------------------------------------------------------|:--------:|:--------:|:--------:|:----:|:--------:|:--------:|\n|                                                                              | CHANGE   | SUSTAIN  | macro    | FN   | CHANGE   | SUSTAIN  |\n| [CONCAT_F_C](../../tree/master/Expt/psyc-scripts/configs/forecasting/selected/CONCAT_F_C.sh)   | 20.4     | 30.2     | 43.6     | 84.4 | 23.0     | **23.5** |\n| [HGRU_F_C](../../tree/master/Expt/psyc-scripts/configs/forecasting/selected/HGRU_F_C.sh)       | 19.9     | 31.2     | **44.4** | 85.7 | **24.9** | 22.5     |\n| [GMGRU_H_F_C](../../tree/master/Expt/psyc-scripts/configs/forecasting/selected/GMGRU_H_F_C.sh) | 19.4     | 30.5     | 44.3     | 87.1 | 23.3     | 22.4     |\n| [Forecast_C](../../tree/master/Expt/psyc-scripts/configs/forecasting/selected/F_C.sh)          | **21.1** | **31.3** | 44.3     | 85.2 | 24.7     | 22.7     |\n\n\nExcept for R@3, all others are F1 score.\n\n| Method                                                                                 | R@3      | macro    | FA       | RES      | REC      | GI       | QUC      | QUO      | MIA      | MIN      |\n|:--------------------------------------------------------------------------------------:|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|\n| [CONCAT_F_T](../../tree/master/Expt/psyc-scripts/configs/forecasting/selected/CONCAT_F_T.sh)             | 72.5     | 23.5     | 63.5     | 0.6      | 0.0      | 53.7     | 27.0     | 15.0     | 18.2     | 9.0      |\n| [HGRU_generic_F_T](../../tree/master/Expt/psyc-scripts/configs/forecasting/selected/HGRU_generic_F_T.sh) | 
76.8     | 24.0     | 71.0     | 2.7      | 20.5     | 58.8     | 27.5     | 12.9     | 15.2     | 1.6      |\n| [HGRU_F_T](../../tree/master/Expt/psyc-scripts/configs/forecasting/selected/HGRU_F_T.sh)                 | 76.0     | 28.6     | 71.4     | 12.7     | **24.9** | 58.3     | 28.8     | 5.9      | **17.4** | 9.7      |\n| [GMGRU_H_F_T](../../tree/master/Expt/psyc-scripts/configs/forecasting/selected/GMGRU_H_F_T.sh)           | 76.6     | 26.6     | **72.6** | 10.2     | 20.6     | 58.8     | 27.4     | 6.0      | 8.9      | 7.9      |\n| [Forecase_T](../../tree/master/Expt/psyc-scripts/configs/forecasting/selected/F_T.sh)                    | **77.0** | **31.1** | 71.9     | **19.5** | 24.7     | **59.2** | **29.1** | **16.4** | 15.2     | **12.8** |\n\n\n# Part VI. Usage for Other Dataset or Tasks\n\n## Building Data Input\n\n   Preprocessing your own dataset into DSTC-like conversational json\n   format is the main job to do before modeling.\n\n   ```json\n   [\n    {\n        \"correct_seq_labels\": [],\n        \"options-for-correct-answers\": [\n            {\n                \"tokenized_utterance\": \"it 's just\",\n                \"codes\": [\n                    {\n                        \"origin_code\": \"GI\",\n                        \"translated_code\": \"giving_info\",\n                        \"coder_order\": [\n                            {\n                                \"order_id\": 1,\n                                \"coder_id\": \"ms\",\n                                \"cid\": 72427\n                            }\n                        ]\n                    }\n                ],\n                \"uid\": \"(BAER_936)_31_5_T_49_51\",\n                \"agg_label\": \"giving_info\",\n                \"speaker\": \"T\",\n                \"snt_id\": 9878\n            }\n        ],\n        \"example-id\": \"(BAER_936)_(T, 27, 3)-(T, 31, 51)\",\n        \"messages-so-far\": [\n            {\n                \"tokenized_utterance\": 
\"mm - hmm\",\n                \"codes\": [\n                    {\n                        \"origin_code\": \"FA\",\n                        \"translated_code\": \"facilitate\",\n                        \"coder_order\": [\n                            {\n                                \"order_id\": 1,\n                                \"coder_id\": \"ms\",\n                                \"cid\": 72411\n                            }\n                        ]\n                    }\n                ],\n                \"uid\": \"(BAER_936)_27_9_T_3_4\",\n                \"agg_label\": \"facilitate\",\n                \"speaker\": \"T\",\n                \"snt_id\": 5\n            },\n            ...\n         ],\n        \"correct_labels\": [\n            3\n        ],\n        \"pred_probs\": [\n            {\n                \"label_index\": 2,\n                \"label_name\": \"reflection_complex\",\n                \"prob\": 0.2700542211532593\n            },\n            {\n                \"label_index\": 3,\n                \"label_name\": \"reflection_simple\",\n                \"prob\": 0.100542211532593\n            },\n            ...\n         ]\n       },\n       ...\n   ]\n   ```\n\n   Our current code base is based on feeddict-based tensorflow inputs.\n   In future, we will upgrade it with newer tensforflow feattures,\n   such as estimator and tensorflow serving.\n\n## Model Designing\n\n   Our code base allows user to build converstational baseline models\n   without writing much tensorflow code. For all supported model\n   components, creating customized config file is the only thing to do\n   for building a model for your dataset.\n\n### Hierarchical Encoder\n\n### Various Attention Mechansims\n\n### Various Embeddings\n\n   - Domain Specific Glove\n\n   - Domain Specific ELMo\n\n# Known Issues (To be moved to issues)\n\n 1. 
Known issues about spaCy with Python 2.7.5\n\n  See https://github.com/explosion/spaCy/issues/3734. Please use Python 2.7.12. Because Python 2 will be dropped in Jan 2020, we will try to test our code on Python 3 and publish a new repo for Python 3.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Futahnlp%2Ftherapist-observer","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Futahnlp%2Ftherapist-observer","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Futahnlp%2Ftherapist-observer/lists"}