{"id":22225082,"url":"https://github.com/jbeno/sentiment","last_synced_at":"2025-05-13T15:39:45.364Z","repository":{"id":262306434,"uuid":"832380034","full_name":"jbeno/sentiment","owner":"jbeno","description":"Sentiment analysis of DynaSent and SST reviews using ELECTRA and GPT collaboration","archived":false,"fork":false,"pushed_at":"2025-05-04T00:00:03.000Z","size":51174,"stargazers_count":6,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-05-04T01:17:33.663Z","etag":null,"topics":["dspy","dynasent","electra-model","gpt-4o","gpt-4o-mini","sentiment-analysis","sentiment-classification","sst"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/jbeno.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2024-07-22T23:02:56.000Z","updated_at":"2025-05-04T00:00:06.000Z","dependencies_parsed_at":"2025-04-11T12:10:34.695Z","dependency_job_id":null,"html_url":"https://github.com/jbeno/sentiment","commit_stats":null,"previous_names":["jbeno/sentiment"],"tags_count":2,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jbeno%2Fsentiment","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jbeno%2Fsentiment/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jbeno%2Fsentiment/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jbeno%2Fsentiment/manifests","ow
ner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/jbeno","download_url":"https://codeload.github.com/jbeno/sentiment/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253971404,"owners_count":21992697,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["dspy","dynasent","electra-model","gpt-4o","gpt-4o-mini","sentiment-analysis","sentiment-classification","sst"],"created_at":"2024-12-03T00:15:01.312Z","updated_at":"2025-05-13T15:39:45.345Z","avatar_url":"https://github.com/jbeno.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# ELECTRA and GPT-4o: Cost-Effective Partners for Sentiment Analysis\n\nThis repository contains the code and datasets for the research paper \"ELECTRA and GPT-4o: Cost-Effective Partners for Sentiment Analysis\", which explores collaborative approaches between ELECTRA and GPT-4o models for sentiment classification. This research was conducted as the final project for the [XCS224U](https://online.stanford.edu/courses/xcs224u-natural-language-understanding) \"Natural Language Understanding\" course by [Stanford Engineering CGOE](https://cgoe.stanford.edu).\n\n## Research Overview\n\nThe research investigated collaborative approaches between bidirectional transformers (ELECTRA Base/Large) and Large Language Models (GPT-4o/4o-mini) for three-way sentiment classification of reviews (negative, neutral, positive). 
We found that:\n\n- Augmenting GPT-4o-mini prompts with ELECTRA predictions significantly improved performance over either model alone\n- However, when GPT models were fine-tuned, including predictions decreased performance\n- Fine-tuned GPT-4o-mini achieved nearly equivalent performance to GPT-4o at 76% lower cost\n- The best approach depends on project constraints (budget, privacy concerns, available compute resources)\n\n## Updates\n\n### May 3, 2025:\n- Paper published in Proceedings of NAACL 2025 Workshop on Knowledge-Augmented Methods for NLP (pages 18-36)\n- Updated [arXiv listing](https://arxiv.org/abs/2501.00062) to v2 paper that addresses reviewer feedback and includes 2 rounds of experiments; added conference publication information\n- Added formal citation and ACL Anthology link (https://aclanthology.org/2025.knowledgenlp-1.2/)\n\n### March 30, 2025:\n- Added round 2 experiment results in [results_round2](results_round2), [electra_finetune](electra_finetune), and [gpt_finetune_experiments_round2.ipynb](gpt_finetune_experiments_round2.ipynb)\n- Corrected round one E7-G4O-ELFT data in [results](results) (was E14 now E19 in latest numbering), [gpt_finetune_experiments.ipynb](gpt_finetune_experiments.ipynb) and [statistics.ipynb](statistics.ipynb)\n- Updated [research_paper.pdf](research_paper.pdf), which addresses some reviewer feedback and includes 2 rounds of experiments\n- Updated [requirements.txt](requirements.txt)\n\n## Resources\n\n### Research\n- [ELECTRA and GPT-4o: Cost-Effective Partners for Sentiment Analysis](research_paper.pdf) - Research paper (PDF)\n- [arXiv preprint](http://arxiv.org/abs/2501.00062) - published Dec 29, 2024 in cs.CL, cross-posted to cs.AI (arXiv:2501.00062)\n- Published in [Proceedings of the 4th International Workshop on Knowledge-Augmented Methods for Natural Language Processing](https://aclanthology.org/2025.knowledgenlp-1.2/), pages 18-36, NAACL 2025\n\n### Models \n- [ELECTRA Base Classifier for Sentiment 
Analysis](https://huggingface.co/jbeno/electra-base-classifier-sentiment) - Fine-tuned ELECTRA base discriminator (Hugging Face)\n- [ELECTRA Large Classifier for Sentiment Analysis](https://huggingface.co/jbeno/electra-large-classifier-sentiment) - Fine-tuned ELECTRA large discriminator (Hugging Face)\n\n### Datasets\n- [Sentiment Merged Dataset](https://huggingface.co/datasets/jbeno/sentiment_merged) - Combination of DynaSent R1/R2 and SST-3 (Hugging Face)\n\n### Code\n- [jbeno/sentiment](https://github.com/jbeno/sentiment) - Primary research repository (GitHub)\n- [electra-classifier](https://pypi.org/project/electra-classifier/) - Package for loading fine-tuned ELECTRA classifier models (PyPI)\n\n## Repository Structure\n\n```text\n├── data/                            # Merged dataset files\n├── electra_finetune/                # ELECTRA classifier fine-tuning logs\n├── results/                         # Experiment predictions and metrics\n├── statistics/                      # Statistical analysis\n├── classifier.py                    # Neural classifier model with DDP\n├── colors.py                        # Color display utilities\n├── data_processing.ipynb            # Creation of Merged dataset\n├── datawaza_funcs.py                # Subset of Datawaza library with edits \n├── finetune.py                      # Interactive classifier fine-tuning program with DDP\n├── gpt_finetune_experiments.ipynb   # GPT fine-tune, baselines, and experiments with DSPy\n├── requirements.txt                 # Python dependencies\n├── research_paper.pdf               # Research paper in PDF format\n├── sst.py                           # SST dataset loader from CS224U repo\n├── statistics.ipynb                 # Statistical analysis\n├── torch_model_base.py              # Base neural classifier model from CS224U repo\n└── utils.py                         # General utilities modified from CS224U repo\n```\n\n## Setup and Installation \n\n1. 
Clone the GitHub repo:\n```bash\ngit clone https://github.com/jbeno/sentiment.git\ncd sentiment\n```\n\n2. Create a new virtual environment:\n```bash\npython -m venv venv\nsource venv/bin/activate\n```\n\n3. Install dependencies:\n```bash\npip install -r requirements.txt\n```\n\n4. Create .env file and add environment variables:\n```bash\nOPENAI_API_KEY=your-api-key   # Required for GPT experiments\nWANDB_API_KEY=your-wandb-key  # Optional for experiment tracking\nARIZE_API_KEY=your-arize-key  # Optional for LLM trace tracking\n```\n\n## Using the Models\n\nThe fine-tuned ELECTRA models have been published on Hugging Face, but the fine-tuned GPT-4o/4o-mini models are stored privately by OpenAI. However, you can re-create the fine-tuned models using the code in this repo.\n\n### ELECTRA Models\n\nYou can use the fine-tuned ELECTRA models that have been published on Hugging Face. An `electra-classifier` package was created to streamline loading of the models.\n\n```python\n# Install the package in a notebook\nimport sys\n!{sys.executable} -m pip install electra-classifier\n\n# Import libraries\nimport torch\nfrom transformers import AutoTokenizer\nfrom electra_classifier import ElectraClassifier\n\n# Load tokenizer and model\nmodel_name = \"jbeno/electra-base-classifier-sentiment\"\ntokenizer = AutoTokenizer.from_pretrained(model_name)\nmodel = ElectraClassifier.from_pretrained(model_name)\n\n# Set model to evaluation mode\nmodel.eval()\n\n# Run inference\ntext = \"I love this restaurant!\"\ninputs = tokenizer(text, return_tensors=\"pt\")\n\nwith torch.no_grad():\n    logits = model(**inputs)\n    predicted_class_id = torch.argmax(logits, dim=1).item()\n    predicted_label = model.config.id2label[predicted_class_id]\n    print(f\"Predicted label: {predicted_label}\")\n```\n\n### Fine-tuning ELECTRA\n\nThe ELECTRA models can be fine-tuned using `finetune.py`, which has an interactive mode and leverages multiple GPUs through Distributed Data Parallel (DDP). 
This can also be used on BERT or RoBERTa, and a variety of datasets.\n\n```bash\npython ./finetune.py --dataset 'merged_local' --weights_name 'google/electra-base-discriminator' --save_data \\\n--save_model --save_pickle --save_preds --lr 0.00001 --epochs 100 --pooling 'mean' --dropout_rate 0.3 \\\n--num_layers 2 --hidden_dim 1024 --hidden_activation 'swishglu' --batch_size 32 --l2_strength 0.01 \\\n--checkpoint_interval 5 --use_zero --optimizer 'adamw' --scheduler 'cosine_warmup' \\\n--scheduler_kwargs '{\"T_0\":5, \"T_mult\":1, \"eta_min\":1e-7}' --decimal 6 --use_val_split \\\n--eval_split 'test' --early_stop 'score' --n_iter_no_change 10 --interactive\n```\n\nHere is the help for the command-line arguments:\n\n```\n$ python ./finetune.py --help\nusage: finetune.py [-h] [--dataset DATASET] [--eval_dataset EVAL_DATASET] [--eval_split {validation,test}]\n    [--sample_percent SAMPLE_PERCENT] [--chunk_size CHUNK_SIZE] [--label_dict LABEL_DICT]\n    [--numeric_dict NUMERIC_DICT] [--label_template LABEL_TEMPLATE] [--pos_label POS_LABEL]\n    [--weights_name WEIGHTS_NAME] [--pooling POOLING] [--finetune_bert] [--finetune_layers FINETUNE_LAYERS]\n    [--freeze_bert] [--num_layers NUM_LAYERS] [--hidden_dim HIDDEN_DIM]\n    [--hidden_activation HIDDEN_ACTIVATION] [--dropout_rate DROPOUT_RATE] [--batch_size BATCH_SIZE]\n    [--accumulation_steps ACCUMULATION_STEPS] [--epochs EPOCHS] [--lr LR] [--lr_decay LR_DECAY]\n    [--optimizer OPTIMIZER] [--use_zero] [--l2_strength L2_STRENGTH] [--optimizer_kwargs OPTIMIZER_KWARGS]\n    [--scheduler SCHEDULER] [--scheduler_kwargs SCHEDULER_KWARGS] [--max_grad_norm MAX_GRAD_NORM]\n    [--random_seed RANDOM_SEED] [--interactive] [--show_progress] [--checkpoint_dir CHECKPOINT_DIR]\n    [--checkpoint_interval CHECKPOINT_INTERVAL] [--resume_from_checkpoint] [--early_stop EARLY_STOP]\n    [--n_iter_no_change N_ITER_NO_CHANGE] [--tol TOL] [--target_score TARGET_SCORE]\n    [--val_percent VAL_PERCENT] [--use_val_split] [--wandb] 
[--wandb_project WANDB_PROJECT]\n    [--wandb_run WANDB_RUN] [--wandb_alerts] [--threshold THRESHOLD] [--model_name MODEL_NAME]\n    [--save_data] [--save_model] [--save_pickle] [--save_hf] [--save_preds] [--save_plots]\n    [--save_dir SAVE_DIR] [--data_file DATA_FILE] [--model_file MODEL_FILE] [--use_saved_params]\n    [--predict] [--predict_file PREDICT_FILE] [--device DEVICE] [--gpus GPUS]\n    [--num_threads NUM_THREADS] [--num_workers NUM_WORKERS] [--prefetch PREFETCH] [--empty_cache]\n    [--port PORT] [--debug] [--mem_interval MEM_INTERVAL] [--decimal DECIMAL] [--color_theme COLOR_THEME]\n\nDDP Distributed PyTorch Training for Sentiment Analysis using BERT\n\noptional arguments:\n  -h, --help            show this help message and exit\n\nDataset configuration:\n  --dataset DATASET     Training dataset to use: 'sst', 'sst_local', 'dynasent_r1', 'dynasent_r2',\n                        'mteb_tweet', 'merged_local' (default: sst_local)\n  --eval_dataset EVAL_DATASET\n                        (Optional) Different test dataset to use: 'sst', 'sst_local', 'dynasent_r1',\n                        'dynasent_r2', 'mteb_tweet', 'merged_local' (default: None)\n  --eval_split {validation,test}\n                        Specify whether to evaluate with 'validation' or 'test' split (default:\n                        validation)\n  --sample_percent SAMPLE_PERCENT\n                        Percentage of data to use for training, validation and test (default: None)\n  --chunk_size CHUNK_SIZE\n                        Number of dataset samples to encode in each chunk (default: None, process\n                        all data at once)\n  --label_dict LABEL_DICT\n                        Text label dictionary, string to numeric (default: {'negative': 0,\n                        'neutral': 1, 'positive': 2})\n  --numeric_dict NUMERIC_DICT\n                        Numeric label dictionary, numeric to string (default: {0: 'negative', 1:\n                        'neutral', 2: 
'positive'})\n  --label_template LABEL_TEMPLATE\n                        Predefined class label template with dictionary mappings: 'neg_neu_pos',\n                        'bin_neu', 'bin_pos', 'bin_neg' (default: None)\n  --pos_label POS_LABEL\n                        Positive class label for binary classification, must be integer (default: 1)\n\nBERT tokenizer/model configuration:\n  --weights_name WEIGHTS_NAME\n                        Pre-trained model/tokenizer name from a Hugging Face repo. Can be root-level\n                        or namespaced (default: 'bert-base-uncased')\n  --pooling POOLING     Pooling method for BERT embeddings: 'cls', 'mean', 'max' (default: 'cls')\n  --finetune_bert       Whether to fine-tune BERT weights. If True, specify number of finetune_layers\n                        (default: False)\n  --finetune_layers FINETUNE_LAYERS\n                        Number of BERT layers to fine-tune. For example: 0 to freeze all, 12 or 24 to\n                        tune all, 1 to tune the last layer, etc. 
(default: 1)\n  --freeze_bert         Whether to freeze BERT weights during training (default: False)\n\nClassifier configuration:\n  --num_layers NUM_LAYERS\n                        Number of hidden layers for neural classifier (default: 1)\n  --hidden_dim HIDDEN_DIM\n                        Hidden dimension for neural classifier layers (default: 300)\n  --hidden_activation HIDDEN_ACTIVATION\n                        Hidden activation function: 'tanh', 'relu', 'sigmoid', 'leaky_relu', 'gelu',\n                        'swish', 'swishglu' (default: 'tanh')\n  --dropout_rate DROPOUT_RATE\n                        Dropout rate for neural classifier (default: 0.0)\n\nTraining configuration:\n  --batch_size BATCH_SIZE\n                        Batch size for both encoding text and training classifier (default: 32)\n  --accumulation_steps ACCUMULATION_STEPS\n                        Number of steps to accumulate gradients before updating weights (default: 1)\n  --epochs EPOCHS       Number of epochs to train (default: 100)\n  --lr LR               Learning rate (default: 0.001)\n  --lr_decay LR_DECAY   Learning rate decay factor, defaults to none, 0.95 is 5% per layer\n                        (default: 1.0)\n  --optimizer OPTIMIZER\n                        Optimizer to use: 'adam', 'sgd', 'adagrad', 'rmsprop', 'zero', 'adamw'\n                        (default: 'adam')\n  --use_zero            Use Zero Redundancy Optimizer for efficient DDP training, with the optimizer\n                        specified in --optimizer (default: False)\n  --l2_strength L2_STRENGTH\n                        L2 regularization strength for optimizer (default: 0.0)\n  --optimizer_kwargs OPTIMIZER_KWARGS\n                        Additional optimizer keyword arguments as a dictionary (default: None)\n  --scheduler SCHEDULER\n                        Learning rate scheduler to use: 'none', 'step', 'multi_step', 'exponential',\n                        'cosine', 'reduce_on_plateau', 'cyclic' (default: 
None)\n  --scheduler_kwargs SCHEDULER_KWARGS\n                        Additional scheduler keyword arguments as a dictionary (default: None)\n  --max_grad_norm MAX_GRAD_NORM\n                        Maximum gradient norm for clipping (default: None)\n  --random_seed RANDOM_SEED\n                        Random seed (default: 42)\n  --interactive         Interactive mode for training (default: False)\n  --show_progress       Show progress bars for training and evaluation (default: False)\n\nCheckpoint configuration:\n  --checkpoint_dir CHECKPOINT_DIR\n                        Directory to save and load checkpoints (default: checkpoints)\n  --checkpoint_interval CHECKPOINT_INTERVAL\n                        Checkpoint interval in epochs (default: 50)\n  --resume_from_checkpoint\n                        Resume training from latest checkpoint (default: False)\n\nEarly stopping:\n  --early_stop EARLY_STOP\n                        Early stopping method, 'score' or 'loss' (default: None)\n  --n_iter_no_change N_ITER_NO_CHANGE\n                        Number of iterations with no improvement to stop training (default: 5)\n  --tol TOL             Tolerance for early stopping (default: 1e-5)\n  --target_score TARGET_SCORE\n                        Target score for early stopping (default: None)\n  --val_percent VAL_PERCENT\n                        Fraction of training data to use for validation (default: 0.1)\n  --use_val_split       Use a validation split instead of a proportion of the train data (default:\n                        False)\n\nWeights and bias integration:\n  --wandb               Use Weights and Biases for logging (default: False)\n  --wandb_project WANDB_PROJECT\n                        Weights and Biases project name (default: None)\n  --wandb_run WANDB_RUN\n                        Weights and Biases run name (default: None)\n  --wandb_alerts        Enable Weights and Biases alerts (default: False)\n\nEvaluation options:\n  --threshold THRESHOLD\n               
         Threshold for binary classification evaluation (default: 0.5)\n  --model_name MODEL_NAME\n                        Model name for display in evaluation plots (default: None)\n\nSaving options:\n  --save_data           Save processed data to disk as an .npz archive (X_train, X_dev, y_train,\n                        y_dev, y_dev_sent)\n  --save_model          Save the final model state after training in PyTorch .pth format (default:\n                        False)\n  --save_pickle         Save the final model after training in pickle .pkl format (default: False)\n  --save_hf             Save the final model after training in Hugging Face format (default: False)\n  --save_preds          Save predictions to CSV (default: False)\n  --save_plots          Save evaluation plots (default: False)\n  --save_dir SAVE_DIR   Directory to save archived data, predictions, plots (default: saves)\n\nLoading options:\n  --data_file DATA_FILE\n                        Filename of the processed data to load as an .npz archive (default: None)\n  --model_file MODEL_FILE\n                        Filename of the classifier model or checkpoint to load (default: None)\n  --use_saved_params    Use saved parameters for training, if loading a model\n\nPrediction options:\n  --predict             Make predictions on a provided unlabled dataset (default: False)\n  --predict_file PREDICT_FILE\n                        Filename of the unlabeled dataset to make predictions on (default: None)\n\nGPU and CPU processing:\n  --device DEVICE       Device will be auto-detected, or specify 'cuda' or 'cpu' (default: None)\n  --gpus GPUS           Number of GPUs to use if device is 'cuda', will be auto-detected (default:\n                        None)\n  --num_threads NUM_THREADS\n                        Number of threads for CPU training (default: None)\n  --num_workers NUM_WORKERS\n                        Number of workers for DataLoader (default: 0)\n  --prefetch PREFETCH   Number of batches to 
prefetch (default: None)\n  --empty_cache         Empty CUDA cache after each batch (default: False)\n  --port PORT           Port number for DDP distributed training (default: 12355)\n\nDebugging and logging:\n  --debug               Debug or verbose mode to print more details (default: False)\n  --mem_interval MEM_INTERVAL\n                        Memory check interval in epochs (default: 10)\n  --decimal DECIMAL     Decimal places for floating point numbers (default: 4)\n  --color_theme COLOR_THEME\n                        Color theme for console output: 'light', 'dark' (default: 'dark')\n```\n\n### GPT Fine-Tuning and Experiments\n\nThe GPT-4o/4o-mini models were fine-tuned via the OpenAI API. The data processing to produce the required JSONL files, and the `curl` commands to interact with the API, can be found in [gpt_finetune_experiments.ipynb](gpt_finetune_experiments.ipynb). This also contains all the baselines and experimental runs using various DSPy templates. This is the main notebook for the research.\n\n### Results and Analysis\n\nThe predictions and evaluation metrics of all experimental runs can be found under [results](results). Statistical tests can be found in [statistics.ipynb](statistics.ipynb) and [statistics](statistics).\n\n## Dataset\n\nThe dataset is a merge of [Stanford Sentiment Treebank](https://nlp.stanford.edu/sentiment/) (SST-3) and [DynaSent](https://github.com/cgpotts/dynasent) Rounds 1 and 2, licensed under Apache 2.0 and Creative Commons Attribution 4.0 respectively. The SST-3, DynaSent R1, and DynaSent R2 datasets were randomly mixed to form a new dataset with 102,097 Train examples, 5,421 Validation examples, and 6,530 Test examples.\n\nThe dataset is available in this repo under [data/merged](data/merged), or through the [Sentiment Merged Dataset](https://huggingface.co/datasets/jbeno/sentiment_merged) on Hugging Face. 
You can review the data processing to create the merged dataset in [data_processing.ipynb](data_processing.ipynb).\n\n## Tools and Technologies\n\nCode was created in VS Code with auto-complete assistance from GitHub Copilot with GPT-4o. Research paper copy editing, LaTeX formatting, and some code suggestions were provided by Claude 3.5 Sonnet.\n\n## Citation\n\nIf you use this material in your research, please cite:\n\n```bibtex\n@inproceedings{beno-2025-electra,\n    title = \"{ELECTRA} and {GPT}-4o: Cost-Effective Partners for Sentiment Analysis\",\n    author = \"Beno, James P.\",\n    editor = \"Shi, Weijia  and\n      Yu, Wenhao  and\n      Asai, Akari  and\n      Jiang, Meng  and\n      Durrett, Greg  and\n      Hajishirzi, Hannaneh  and\n      Zettlemoyer, Luke\",\n    booktitle = \"Proceedings of the 4th International Workshop on Knowledge-Augmented Methods for Natural Language Processing\",\n    month = may,\n    year = \"2025\",\n    address = \"Albuquerque, New Mexico, USA\",\n    publisher = \"Association for Computational Linguistics\",\n    url = \"https://aclanthology.org/2025.knowledgenlp-1.2/\",\n    pages = \"18--36\",\n    ISBN = \"979-8-89176-229-9\"\n}\n```\n\n## License\n\nThis project is licensed under the GNU GPL v3 License - see the LICENSE file for details.\n\n## Contact\n\nJim Beno - jim@jimbeno.net\n\n## Acknowledgments\n\n- The creators of the [ELECTRA model](https://arxiv.org/abs/2003.10555) for their foundational work\n- The authors of the datasets used: [Stanford Sentiment Treebank](https://huggingface.co/datasets/stanfordnlp/sst), [DynaSent](https://huggingface.co/datasets/dynabench/dynasent)\n- The Stanford CS224U course repo, which provided a starting point for this code: [cgpotts/cs224u](https://github.com/cgpotts/cs224u)\n- [Stanford Engineering CGOE](https://cgoe.stanford.edu), [Chris Potts](https://stanford.edu/~cgpotts/), [Insop Song](https://profiles.stanford.edu/insop), [Petra 
Parikova](https://profiles.stanford.edu/petra-parikova), and the Course Facilitators of [XCS224U](https://online.stanford.edu/courses/xcs224u-natural-language-understanding)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjbeno%2Fsentiment","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjbeno%2Fsentiment","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjbeno%2Fsentiment/lists"}