{"id":26397433,"url":"https://github.com/thalesgroup/conceptbert","last_synced_at":"2025-03-17T12:17:50.303Z","repository":{"id":78565980,"uuid":"388494361","full_name":"ThalesGroup/ConceptBERT","owner":"ThalesGroup","description":"Implementation of ConceptBert: Concept-Aware Representation for Visual Question Answering","archived":false,"fork":false,"pushed_at":"2024-04-30T12:28:06.000Z","size":162642,"stargazers_count":23,"open_issues_count":7,"forks_count":11,"subscribers_count":4,"default_branch":"master","last_synced_at":"2024-04-30T13:58:19.876Z","etag":null,"topics":["ai","machine-learning"],"latest_commit_sha":null,"homepage":"https://github.com/ThalesGroup/ConceptBERT","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ThalesGroup.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-07-22T14:37:43.000Z","updated_at":"2024-04-30T13:58:21.522Z","dependencies_parsed_at":"2023-04-16T10:16:20.219Z","dependency_job_id":null,"html_url":"https://github.com/ThalesGroup/ConceptBERT","commit_stats":null,"previous_names":[],"tags_count":2,"template":false,"template_full_name":"ThalesGroup/template-project","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ThalesGroup%2FConceptBERT","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ThalesGroup%2FConceptBERT/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ThalesGroup%2FConceptBERT/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/
ThalesGroup%2FConceptBERT/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ThalesGroup","download_url":"https://codeload.github.com/ThalesGroup/ConceptBERT/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":244031129,"owners_count":20386534,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","machine-learning"],"created_at":"2025-03-17T12:17:49.588Z","updated_at":"2025-03-17T12:17:50.284Z","avatar_url":"https://github.com/ThalesGroup.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# ConceptBert\n\nThis repository is the implementation of ConceptBert: Concept-Aware Representation for Visual Question Answering.\n\nOriginal paper:\n*François Gardères, Maryam Ziaeefard, Baptiste Abeloos, Freddy Lécué: ConceptBert: Concept-Aware Representation for\nVisual Question Answering. EMNLP (Findings) 2020: 489-498\nhttps://aclanthology.org/2020.findings-emnlp.44.pdf*\n\nFor an overview of the pipeline, please refer to the following picture:\n\n![Pipeline](/conceptBert/misc/pipeline.png)\n\n## License\n\nThis work is dual-licensed under the `Thales Digital Solutions Canada` license and `MIT License`.\n\n* **The main license is the `Thales Digital Solutions Canada` one**. You can find the [license](LICENSE) file here.\n* This repository is based on and inspired\n  by [Facebook research (vilbert-multi-task)](https://github.com/facebookresearch/vilbert-multi-task). 
We sincerely\n  thank them for sharing their code.\n  **The code related to `vilbert-multi-task` is licensed under the MIT License; for more information,\n  refer [to the file](LICENSE-VILBERT-MULTI-TASK).**\n\n### Pre-requisite\n\n* python 3.6.12\n* docker environment\n\n### Recommended\n\nIf you want to develop in Docker, we recommend using VSCode with the container plugin.\n\n* [VSCode](https://code.visualstudio.com/) works\n  with [containers](https://code.visualstudio.com/docs/containers/overview)\n\n### Disclaimer\n\nCurrently, the project requires a lot of resources to run correctly.\n\nExpect at least 6 days for the first training on a `GTX 1080 Ti` (11 GB RAM), and 17 hours\nin a Kubernetes environment with 7 GPUs (7 `Titan-v` (32 GB)). All the pipelines were tested on a GPU server with\nfour `GeForce RTX 2080 Ti` (12 GB).\n\n# :electric_plug: Data\n\n\u003e **ℹ️ Notes:**\n\u003e \n\u003e - **All information regarding the datasets or models used is specified in the [original paper](https://aclanthology.org/2020.findings-emnlp.44.pdf).**\n\u003e - The `original validation file` and the `pre-trained model` are available on the Kaggle page of the project: [https://www.kaggle.com/thalesgroup/conceptbert/](https://www.kaggle.com/thalesgroup/conceptbert/)\n\nOur implementation uses the pretrained features from bottom-up-attention, 100 fixed features per image and the GloVe\nvectors. 
The data can be saved in a folder along with pretrained_models and organized as shown below:\n\n```text\nvilbert\n├── data2\n│   ├── coco (visual features)\n│   ├── conceptnet (conceptnet facts)\n│   ├── conceptual_captions (captions for each image, extracted from (https://github.com/google-research-datasets/conceptual-captions))\n│   ├── kilbert_base_model (pre-trained weights for initial conceptBert model)\n│   ├── OK-VQA (OK-VQA dataset)\n│   ├── save_final (final saved models and outputs)\n│   ├── tensorboards (location to save tensorboard files)\n│   ├── VQA (VQA dataset)\n│   ├── VQA_bert_base_6layer_6conect-pretrained (pre-trained weights for initial vilbert model trained on vqa)\n```\n\nThe model checkpoints will be saved in the output folder: ./outputs/\n\n# :whale2: Docker installation (recommended)\n\nYou can choose to run ConceptBert with Docker or from your own environment.\n\n## Build\n\n```bash\n  docker build -t conceptbert .\n```\n\n## Start the container\n\n```bash\n  docker run -it -v /path/to/your/nas/:/nas-data/ conceptbert:latest bash\n```\n\n### Additional parameters\n\n```bash\n  docker run -it --shm-size=10g -e CUDA_VISIBLE_DEVICES=0,1,2,3 -v /path/to/your/nas/:/nas-data/ conceptbert:latest bash\n```\n\n* `--shm-size` sets the shared memory size to prevent shared-memory errors. Here the value is\n  10 GB ([refer docker documentation](https://docs.docker.com/engine/reference/run/))\n* `-e CUDA_VISIBLE_DEVICES` selects which of the available GPUs to use. Here we use 4 GPUs.\n\nWhen the container is up, go to the section [1. 
Train with VQA](#1-train-with-vqa)\n\n# Other installation\n\nYou can use the `requirements.txt` file to install the dependencies of the project.\n\nPre-requisite:\n\n* Compile the tools `cd conceptBert/tools/refer \u0026\u0026 make`\n* python 3.6.x\n\n**If you have difficulty creating your environment, look at the contents of the Dockerfile for the necessary\ndependencies that you might be missing.**\n\n# :rocket: Training and Validation\n\nNote: the models and json files used in the following examples are the current best results\n\n## 1. Train with VQA\n\nFirst, we use the VQA dataset to train a baseline model. Use the following command:\n\n```bash\n  python3 -u train_tasks.py --model_version 3 --bert_model=bert-base-uncased --from_pretrained_conceptBert None \\\n      --from_pretrained=/nas-data/vilbert/data2/kilbert_base_model/pytorch_model_9.bin \\\n      --config_file config/bert_base_6layer_6conect.json \\\n      --output_dir=/nas-data/outputs/train1_vqa_trained_model/ \\\n      --summary_writer /nas-data/tensorboards/ \\\n      --num_workers 16 \\\n      --tasks 0\n```\n\n### Command description\n\n| Parameter | Description |\n|-----------|-------------|\n| u | -u forces the stdout and stderr streams to be unbuffered, so that log output appears immediately |\n| model_version |  Which version of the model you want to use |\n| bert_model | Bert pre-trained model selected in the list: bert-base-uncased, bert-large-uncased, bert-base-cased, bert-base-multilingual, bert-base-chinese. |\n| from_pretrained_conceptBert | path of the previously trained model. This is the first training run, so the value is `None`  |\n| from_pretrained  | pre-trained Bert model (VQA) |\n| config_file  | 3 config files are available in `conceptBert/config/` |\n| output_dir  | folder where the results are saved  |\n| summary_writer  |  folder used to save tensorboard items. 
A sub-folder named with the current date will be created |\n| num_workers | Tells the data loader instance how many sub-processes to use for data loading. **Choose a value suited to your environment** |\n| tasks  |  tasks = 0, we use the VQA dataset |\n\n## 2. Train with OK-VQA (fine-tuning)\n\nThen we use the OK-VQA dataset and the model trained in step 1 to train a model. Use the following command:\n\n```bash\n  python3 -u train_tasks.py --model_version 3 --bert_model=bert-base-uncased \\\n      --from_pretrained=/nas-data/vilbert/data2/save_final/VQA_bert_base_6layer_6conect-beta_vilbert_vqa/pytorch_model_11.bin \\\n      --from_pretrained_conceptBert /nas-data/outputs/train1_vqa_trained_model/VQA_bert_base_6layer_6conect/pytorch_model_19.bin \\\n      --config_file config/bert_base_6layer_6conect.json \\\n      --output_dir=/nas-data/outputs/train2_okvqa_trained_model/ \\\n      --summary_writer /outputs/tensorboards/  \\\n      --num_workers 16 \\\n      --tasks 42\n```\n\n### Command description\n\nThe parameters are the same as above, but these values change:\n\n| Parameter | Description |\n|-----------|-------------|\n| from_pretrained_conceptBert | The path of the model trained previously (step 1, VQA). This corresponds to the last `pytorch_model_**.bin` file generated |\n| from_pretrained  | pre-trained Bert model (OK-VQA) |\n| tasks  |  tasks = 42, the OK-VQA dataset is used |\n\n## 3. 
Validation with OK-VQA\n\nTo validate on the held-out validation split, we use the model trained in step 2 with the following command:\nVQA_bert_base_6layer_6conect\n\n```bash\n  python3 -u eval_tasks.py --model_version 3 --bert_model=bert-base-uncased \\\n      --from_pretrained=/nas-data/vilbert/data2/save_final/VQA_bert_base_6layer_6conect-beta_vilbert_vqa/pytorch_model_11.bin  \\\n      --from_pretrained_conceptBert=/nas-data/outputs/train2_okvqa_trained_model/OK-VQA_bert_base_6layer_6conect/pytorch_model_99.bin \\\n      --config_file config/bert_base_6layer_6conect.json \\\n      --output_dir=/nas-data/outputs/validation_okvqa_trained_model/ \\\n      --num_workers 16 \\\n      --tasks 42 \\\n      --split val\n```\n\nTwo files will be generated:\n\n* `Val_other` gives the top 8 answers for each question\n* `val_result` is used in the evaluation\n\n### Command description\n\nThe parameters are the same as above, but these values change:\n\n| Parameter | Description |\n|-----------|-------------|\n| from_pretrained_conceptBert | The path of the model trained previously (step 2, OK-VQA). This corresponds to the last `pytorch_model_**.bin` file generated |\n| from_pretrained  | same pre-trained Bert model (OK-VQA) as step 2 |\n| tasks  |  tasks = 42, the OK-VQA dataset is used |\n\n# :rocket: Evaluation\n\nRun the evaluation:\n\n## Start the evaluation with:\n\n```bash\n  python3 PythonEvaluationTools/vqaEval_okvqa.py \\\n      --json_dir /nas-data/outputs/validation_okvqa_trained_model/ \\\n      --output_dir /nas-data/outputs/validation_okvqa_trained_model/\n```\n\n## Command description\n\n* `json_dir`: path of the folder containing `val_result.json`\n* `output_dir`: folder where the accuracy will be saved\n* `/nas-data/outputs/validation_okvqa_trained_model/`: is the final json. 
*Replace this with the path of\n  the json you want to evaluate*.\n\n# :bug: Known issues\n\n* If installing `python-prctl` fails with the error `\"python-prctl\" Command \"python setup.py egg_info\" failed with error`, use this\n  command:\n\n```bash\n  sudo apt-get install libcap-dev python3-dev\n```\n\n# :bulb: Compare the results\n\n## Step 1: Training with VQA\n\n* 20 checkpoints must have been created (`last file name must be pytorch_model_19.bin`)\n\n## Step 2: Training with OK-VQA\n\n* 100 checkpoints must have been created (`last file name must be pytorch_model_99.bin`)\n\n## Step 3: Validation with OK-VQA\n\n* The validation generates two json files. `val_result.json` will be used in the evaluation.\n* Open the logs in the output folder (`nas-data-`) to check the result of the `eval_score`:\n\n```bash\n08/12/2020 13:09:46 - INFO - utils -   Validation [OK-VQA]: loss 3.681 score 33.040\n```\n\nIf you want to optimize your model, the `loss` and `score` must be at least as good as above.\n\n## Evaluation\n\nCompare your results in the `accuracy.json` file (results must be at least as good as the following ones).\n```json\n{\n  \"overall\": 33.04,\n  \"perQuestionType\": {\n    \"one\": 30.82,\n    \"eight\": 33.6,\n    \"other\": 32.57,\n    \"seven\": 30.61,\n    \"four\": 36.79,\n    \"five\": 33.66,\n    \"three\": 31.73,\n    \"nine\": 31.43,\n    \"ten\": 45.58,\n    \"two\": 30.23,\n    \"six\": 30.07\n  },\n  \"perAnswerType\": {\n    \"other\": 33.04\n  }\n}\n```\n\n# VQA Training\n\n* [Documentation here](/conceptBert/misc/training_vqa.md)\n\n# OK-VQA Training\n\n* [Documentation here](/conceptBert/misc/training_okvqa.md)\n\n# Troubleshooting\n\n## CUDA out of memory\n\nTry the following recommendations to resolve the problem:\n\n* Change the value of `num_workers` in your training command (ex. 
`--num_workers 1`)\n* Try one of the [improvements](#improvements) proposed below\n* Reduce parameters in `vlbert_tasks.yml`:\n    * max_seq_length\n    * batch_size\n    * eval_batch_size\n\nExample:\n\n```yaml\n  max_seq_length: 4 # DGX value : 16\n  batch_size: 256 # DGX value : 1024\n  eval_batch_size: 256 # DGX value : 1024\n```\n\n# Improvements\n\nThere are several areas for improvement:\n\n* Review where the `to.device()` calls are made in the code so that they are executed at a better position\n* Load only a part of the dataset (create a method to load a batch of the dataset). Dataset management is in `vqa_dataset.py`\n  , method `_load_dataset`, variables `questions = questions_train + questions_val[:-3000]`\n  and `answers = answers_train + answers_val[:-3000]`\n* Train your own BERT (or find a lighter BERT)\n* Initialise BERT once and load it afterwards\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fthalesgroup%2Fconceptbert","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fthalesgroup%2Fconceptbert","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fthalesgroup%2Fconceptbert/lists"}