{"id":11563598,"url":"https://github.com/thuiar/Self-MM","last_synced_at":"2025-10-03T14:30:51.699Z","repository":{"id":37469824,"uuid":"321547044","full_name":"thuiar/Self-MM","owner":"thuiar","description":"Codes for paper \"Learning Modality-Specific Representations with Self-Supervised Multi-Task Learning for Multimodal Sentiment Analysis\"","archived":false,"fork":false,"pushed_at":"2022-06-25T08:52:04.000Z","size":500,"stargazers_count":184,"open_issues_count":18,"forks_count":35,"subscribers_count":4,"default_branch":"main","last_synced_at":"2024-09-28T14:31:28.531Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/thuiar.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2020-12-15T03:56:55.000Z","updated_at":"2024-09-27T09:02:57.000Z","dependencies_parsed_at":"2022-09-05T18:02:10.798Z","dependency_job_id":null,"html_url":"https://github.com/thuiar/Self-MM","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/thuiar%2FSelf-MM","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/thuiar%2FSelf-MM/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/thuiar%2FSelf-MM/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/thuiar%2FSelf-MM/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/thuiar","download_url":"https://codeload.github.com/thuiar/Self-MM/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://gi
thub.com","kind":"github","repositories_count":235139067,"owners_count":18942104,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-06-23T05:56:56.833Z","updated_at":"2025-10-03T14:30:46.377Z","avatar_url":"https://github.com/thuiar.png","language":"Python","funding_links":[],"categories":["其他_机器视觉"],"sub_categories":["网络服务_其他"],"readme":"![Python 3.6](https://img.shields.io/badge/python-3.6-green.svg)\n\n## SELF-MM\n\u003e PyTorch implementation of [Learning Modality-Specific Representations with Self-Supervised Multi-Task Learning for Multimodal Sentiment Analysis (AAAI 2021)](https://arxiv.org/abs/2102.04830). For more details, see our other repo [MMSA](https://github.com/thuiar/MMSA), a scalable framework for MSA.\n\n### Model\n\n![model](assets/MainModel.png)\n\n### Usage\n\n1. 
Datasets and pre-trained BERT models\n\nDownload the dataset features and pre-trained BERT models from the following links.\n\n- [Baidu Cloud Drive](https://pan.baidu.com/s/1oksuDEkkd3vGg2oBMBxiVw) with code: `ctgs`\n- [Google Cloud Drive](https://drive.google.com/drive/folders/1E5kojBirtd5VbfHsFp6FYWkQunk73Nsv?usp=sharing)\n\nFor all features, you can use the SHA-1 hash values below to check file integrity.\n\u003e `MOSI/unaligned_50.pkl`: `5da0b8440fc5a7c3a457859af27458beb993e088`  \n\u003e `MOSI/aligned_50.pkl`: `5c62b896619a334a7104c8bef05d82b05272c71c`  \n\u003e `MOSEI/unaligned_50.pkl`: `db3e2cff4d706a88ee156981c2100975513d4610`  \n\u003e `MOSEI/aligned_50.pkl`: `ef49589349bc1c2bc252ccc0d4657a755c92a056`  \n\u003e `SIMS/unaligned_39.pkl`: `a00c73e92f66896403c09dbad63e242d5af756f8`  \n\nDue to size limitations, the MOSEI features and SIMS raw videos are available on `Baidu Cloud Drive` only. All dataset features are organized as:\n\n```python\n{\n    \"train\": {\n        \"raw_text\": [],\n        \"audio\": [],\n        \"vision\": [],\n        \"id\": [], # [video_id$_$clip_id, ..., ...]\n        \"text\": [],\n        \"text_bert\": [],\n        \"audio_lengths\": [],\n        \"vision_lengths\": [],\n        \"annotations\": [],\n        \"classification_labels\": [], # Negative(\u003c 0), Neutral(0), Positive(\u003e 0)\n        \"regression_labels\": []\n    },\n    \"valid\": {***}, # same as the \"train\" \n    \"test\": {***}, # same as the \"train\"\n}\n```\n\nFor MOSI and MOSEI, the pre-extracted text features come from BERT, unlike the original GloVe features in the [CMU-Multimodal-SDK](http://immortal.multicomp.cs.cmu.edu/raw_datasets/processed_data/).\n\nFor SIMS, if you want to extract features from raw videos, you need to install the [OpenFace toolkit](https://github.com/TadasBaltrusaitis/OpenFace/wiki) first, and then refer to our code in `data/DataPre.py`.\n\n```\npython data/DataPre.py --data_dir [path_to_Dataset] --language ** --openface2Path  
[path_to_FeatureExtraction]\n```\n\nFor BERT models, you can also download [Bert-Base, Chinese](https://storage.googleapis.com/bert_models/2018_11_03/chinese_L-12_H-768_A-12.zip) from [Google-Bert](https://github.com/google-research/bert), and then convert the TensorFlow checkpoint to PyTorch using [transformers-cli](https://huggingface.co/transformers/converting_tensorflow_models.html).\n\n2. Clone this repo and install requirements.\n```\ngit clone https://github.com/thuiar/Self-MM\ncd Self-MM\nconda create --name self_mm python=3.7\nsource activate self_mm\npip install -r requirements.txt\n```\n\n3. Make some changes\nModify `config/config_tune.py` and `config/config_regression.py` to update the dataset paths.\n\n4. Run the code\n```\npython run.py --modelName self_mm --datasetName mosi\n```\n\n### Results\n\n\u003e Detailed results are shown in [MMSA](https://github.com/thuiar/MMSA) \u003e [results/result-stat.md](https://github.com/thuiar/MMSA/blob/master/results/result-stat.md). \n\n### Paper\n---\nPlease cite our paper if you find our work useful for your research:\n```\n@inproceedings{yu2021le,\n  title={Learning Modality-Specific Representations with Self-Supervised Multi-Task Learning for Multimodal Sentiment Analysis},\n  author={Yu, Wenmeng and Xu, Hua and Ziqi, Yuan and Jiele, Wu},\n  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},\n  year={2021}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fthuiar%2FSelf-MM","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fthuiar%2FSelf-MM","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fthuiar%2FSelf-MM/lists"}