{"id":29264351,"url":"https://github.com/bashmocha/extended-features-on-bert-performance","last_synced_at":"2025-07-04T12:31:50.911Z","repository":{"id":301356032,"uuid":"863366319","full_name":"BashMocha/Extended-Features-on-BERT-Performance","owner":"BashMocha","description":"Analysis of  BERT's Depression Detection Performance with Extended Features","archived":false,"fork":false,"pushed_at":"2025-07-03T16:58:53.000Z","size":4879,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-07-03T17:31:43.033Z","etag":null,"topics":["bert","depression-detection","extended-fea","nlp","sentiment-analysis","twitter-data"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/BashMocha.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2024-09-26T07:10:43.000Z","updated_at":"2025-07-03T16:58:56.000Z","dependencies_parsed_at":"2025-06-26T12:59:41.763Z","dependency_job_id":"e8900f77-4afe-4436-9afd-fd01297c89e4","html_url":"https://github.com/BashMocha/Extended-Features-on-BERT-Performance","commit_stats":null,"previous_names":["bashmocha/extended-features-on-bert-performance"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/BashMocha/Extended-Features-on-BERT-Performance","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BashMocha%2FExtended-Features-on-BERT-Performance","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Bash
Mocha%2FExtended-Features-on-BERT-Performance/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BashMocha%2FExtended-Features-on-BERT-Performance/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BashMocha%2FExtended-Features-on-BERT-Performance/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/BashMocha","download_url":"https://codeload.github.com/BashMocha/Extended-Features-on-BERT-Performance/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/BashMocha%2FExtended-Features-on-BERT-Performance/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":263540407,"owners_count":23477454,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bert","depression-detection","extended-fea","nlp","sentiment-analysis","twitter-data"],"created_at":"2025-07-04T12:31:03.758Z","updated_at":"2025-07-04T12:31:50.887Z","avatar_url":"https://github.com/BashMocha.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"## Effects of Extended Features on BERT Performance: Depression Detection\n\n\u003cb\u003eOfficial implementation of the [SIU 2025](https://www.ieee.org.tr/33-ieee-sinyal-isleme-ve-iletisim-uygulamalari-kurultayi-siu/) paper.\u003c/b\u003e\n\nEmirhan Balcı*, Esra Saraç\n\n## Abstract\nIn this study, the effects of categorical and numerical additional features obtained from Twitter posts on depression detection were investigated. 
Depression detection performances of the BERT large language model and SVM classifier were compared on the dataset balanced with the oversampling method. The effects of two different feature addition methods, Unimodal and Concat, were evaluated on the BERT model. The results show that oversampling improves the performance of the BERT classifier, but feature addition methods do not provide a significant improvement in the model performance. The findings of the experiments reveal the success of the BERT model in the field of classification and that it does not require additional features for the detection of depression. It is believed that this study will guide research in the field of depression detection and help researchers identify more effective areas of study.\n\n[Code](https://github.com/BashMocha/Extended-Features-on-BERT-Performance/tree/master/notebooks) | [Paper]() | [Data](https://github.com/BashMocha/Extended-Features-on-BERT-Performance/tree/master/data)\n\n## Updates\n\n25/06/2025: We release the utilized dataset and the source code.\n\n13/05/2025: The study is accepted by SIU 2025! 🎉\n\n09/02/2025: The paper is submitted to the symposium.\n\n## Addition of Extended Features into BERT\nTo enhance the feature representations obtained from the BERT model, two techniques from the open-source Python library [Multimodal-Toolkit](https://github.com/georgian-io/Multimodal-Toolkit/tree/master), developed for integrating numerical and categorical features into Transformer-based models, were employed. In the method referred to as Unimodal, categorical and numerical attributes are appended to the corresponding posts in textual form, and the resulting text is tokenized prior to being input into the BERT model for training. 
In the Concat approach, the categorical values, encoded as numbers, and the numerical features are concatenated with the word embedding vector of the respective post and passed to the final classification layers.\n\nTo detect depression, five distinct feature representations were defined to characterize the structural and contextual properties of each post. These representations comprise two numerical features (the length of the post and the number of profane words it contains) and three categorical features indicating the presence of positive emojis, negative emojis, or URL links within the post.\n\nIn the Unimodal approach, the extended features were incorporated into the corresponding posts prior to tokenization, whereas in the Concat approach, they were concatenated with the word embedding vectors of the respective posts. Within the Unimodal method, categorical features were encoded using binary values to provide numerical representations, while numerical features were normalized using a quantile-based transformation to approximate a Gaussian distribution. 
This normalization process ensured compatibility between the numerical features and the word embeddings derived from the BERT model, and contributed to a more stable learning process by reducing the influence of outliers.\n\u003cbr\u003e\u003cbr\u003e\n\n![1](https://github.com/user-attachments/assets/48345570-cd9a-4020-9c44-dc910f89a346)\n\u003cdiv align=\"center\"\u003e\n  \u003cp\u003eVisualization of the applied Unimodal method.\u003c/p\u003e\n\u003c/div\u003e\u003cbr\u003e\n\n\n![1(1)](https://github.com/user-attachments/assets/61a5fe91-ea44-4289-b7cb-6e783fc66245)\n\u003cdiv align=\"center\"\u003e\n  \u003cp\u003eVisualization of the applied Concat method.\u003c/p\u003e\n\u003c/div\u003e\n\u003cbr\u003e\n\nThe extended features were appended to the word embedding vectors of the corresponding posts produced by the encoder layers of the BERT model, thereby increasing the embedding dimension from 768 to 773. The resulting word embedding vectors were subsequently fed into the BERT classifier.\n\n![2](https://github.com/user-attachments/assets/d75881c6-fd7f-4e59-b0ac-8178d3391f84)\n\n## Results\n\n\u003cdiv align=\"center\"\u003e\n\u003ctable\u003e\n  \u003cthead\u003e\n    \u003ctr\u003e\n      \u003cth\u003eFeature Type\u003c/th\u003e\n      \u003cth\u003eTraining Method\u003c/th\u003e\n      \u003cth\u003eF1\u003c/th\u003e\n      \u003cth\u003eF1-micro\u003c/th\u003e\n      \u003cth\u003eF1-macro\u003c/th\u003e\n      \u003cth\u003eF1-weighted\u003c/th\u003e\n    \u003c/tr\u003e\n  \u003c/thead\u003e\n  \u003ctbody\u003e\n    \u003ctr\u003e\n      \u003ctd rowspan=\"2\"\u003eOriginal\u003c/td\u003e\n      \u003ctd\u003eholdout\u003c/td\u003e\n      \u003ctd\u003e0.95\u003c/td\u003e\n      \u003ctd\u003e0.95\u003c/td\u003e\n      \u003ctd\u003e0.95\u003c/td\u003e\n      \u003ctd\u003e0.95\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n      \u003ctd\u003e5-fold\u003c/td\u003e\n      \u003ctd\u003e1.00\u003c/td\u003e\n      
\u003ctd\u003e1.00\u003c/td\u003e\n      \u003ctd\u003e1.00\u003c/td\u003e\n      \u003ctd\u003e1.00\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n      \u003ctd rowspan=\"2\"\u003eUnimodal\u003c/td\u003e\n      \u003ctd\u003eholdout\u003c/td\u003e\n      \u003ctd\u003e0.95\u003c/td\u003e\n      \u003ctd\u003e0.95\u003c/td\u003e\n      \u003ctd\u003e0.95\u003c/td\u003e\n      \u003ctd\u003e0.95\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n      \u003ctd\u003e5-fold\u003c/td\u003e\n      \u003ctd\u003e0.99\u003c/td\u003e\n      \u003ctd\u003e0.99\u003c/td\u003e\n      \u003ctd\u003e0.99\u003c/td\u003e\n      \u003ctd\u003e0.99\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n      \u003ctd rowspan=\"2\"\u003eConcat\u003c/td\u003e\n      \u003ctd\u003eholdout\u003c/td\u003e\n      \u003ctd\u003e0.94\u003c/td\u003e\n      \u003ctd\u003e0.94\u003c/td\u003e\n      \u003ctd\u003e0.94\u003c/td\u003e\n      \u003ctd\u003e0.94\u003c/td\u003e\n    \u003c/tr\u003e\n    \u003ctr\u003e\n      \u003ctd\u003e5-fold\u003c/td\u003e\n      \u003ctd\u003e1.00\u003c/td\u003e\n      \u003ctd\u003e1.00\u003c/td\u003e\n      \u003ctd\u003e1.00\u003c/td\u003e\n      \u003ctd\u003e1.00\u003c/td\u003e\n    \u003c/tr\u003e\n  \u003c/tbody\u003e\n\u003c/table\u003e\n\u003c/div\u003e\n\u003cbr\u003e\n\nWhen training was conducted on the BERT+BERT model using the Concat and Unimodal methods, it was observed that both approaches yielded similar results. The holdout and 5-fold cross-validation results obtained using the Concat method were recorded as 0.94 and 1.00, respectively, while those obtained using the Unimodal method were 0.95 and 0.99. The outcomes from both the Unimodal and Concat methods are quite close to the results achieved by training the BERT+BERT model solely with oversampling. 
Specifically, the 5-fold cross-validation result obtained with the Unimodal method and the holdout result obtained with the Concat method were each 0.01 lower than the corresponding results from training the model solely with oversampling. These findings suggest that the additional features integrated into the BERT model contribute little to classification performance for depression detection.\n\n## Citation\n\nIf you find the dataset or code useful, please cite:\n\n```bibtex\n@inproceedings{balci_extended_2025,\n\ttitle = {Effects of Extended Features on BERT Performance: Depression Detection},\n\tbooktitle = {2025 33rd IEEE Conference on Signal Processing and Communications Applications (SIU 2025)},\n\tauthor = {Balcı, Emirhan and Saraç, Esra},\n\tyear = {2025},\n}\n```\n\n## License\n\nMIT License\n\n\u003chr\u003e\n\nFeel free to [contact us](mailto:emirbalci360@gmail.com) with any questions.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbashmocha%2Fextended-features-on-bert-performance","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbashmocha%2Fextended-features-on-bert-performance","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbashmocha%2Fextended-features-on-bert-performance/lists"}