# BEARS Make Neuro-Symbolic Models Aware of their Reasoning Shortcuts

Codebase for the paper:

BEARS Make Neuro-Symbolic Models Aware of their Reasoning Shortcuts, E. Marconato, S. Bortolotti, E. van Krieken, A. Vergari, A. Passerini, S. Teso

[![arXiv](https://img.shields.io/badge/arXiv-2402.12240-b31b1b.svg)](https://arxiv.org/abs/2402.12240)

```
@misc{marconato2024bears,
      title={BEARS Make Neuro-Symbolic Models Aware of their Reasoning Shortcuts},
      author={Emanuele Marconato and Samuele Bortolotti and Emile van Krieken and Antonio Vergari and Andrea Passerini and Stefano Teso},
      year={2024},
      eprint={2402.12240},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}
```

If you find the code useful, please consider citing it.

[![Citation](https://img.shields.io/badge/Citation-CFF-ff69b4.svg)](https://github.com/samuelebortolotti/bears/blob/master/CITATION.cff)

## Abstract

Neuro-Symbolic (NeSy) predictors that conform to symbolic knowledge - encoding, e.g., safety constraints - can be affected by Reasoning Shortcuts (RSs): they learn concepts consistent with the symbolic knowledge by exploiting unintended semantics. RSs compromise reliability and generalization and, as we show in this paper, they are linked to NeSy models being overconfident about the predicted concepts. Unfortunately, the only trustworthy mitigation strategy requires collecting costly dense supervision over the concepts. Rather than attempting to avoid RSs altogether, we propose to ensure NeSy models are aware of the semantic ambiguity of the concepts they learn, thus enabling their users to identify and distrust low-quality concepts. Starting from three simple desiderata, we derive bears (BE Aware of Reasoning Shortcuts), an ensembling technique that calibrates the model's concept-level confidence without compromising prediction accuracy, thus encouraging NeSy architectures to be uncertain about concepts affected by RSs. We show empirically that bears improves RS-awareness of several state-of-the-art NeSy models, and also facilitates acquiring informative dense annotations for mitigation purposes.
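The core idea — letting ensemble members disagree on shortcut-affected concepts so that the averaged concept-level confidence becomes calibrated — can be illustrated with a minimal, self-contained sketch. The function names and the two-member toy ensemble below are illustrative assumptions, not the paper's actual implementation:

```python
# Minimal sketch (NOT the actual bears implementation): average the
# per-concept probability vectors of an ensemble and use the entropy of
# the mixture as a concept-level uncertainty signal.
import math

def average_concept_probs(member_probs):
    """Average per-concept probability vectors over ensemble members."""
    n = len(member_probs)
    k = len(member_probs[0])
    return [sum(m[i] for m in member_probs) / n for i in range(k)]

def entropy(p):
    """Shannon entropy in nats; higher means less confident."""
    return -sum(q * math.log(q) for q in p if q > 0)

# Members that disagree on a concept yield a high-entropy mixture,
# flagging the concept as potentially affected by a reasoning shortcut.
disagreeing = average_concept_probs([[0.95, 0.05], [0.05, 0.95]])
agreeing = average_concept_probs([[0.95, 0.05], [0.90, 0.10]])
assert entropy(disagreeing) > entropy(agreeing)
```

The prediction-level output can stay accurate while the concept-level distribution spreads out, which is the behavior the desiderata in the paper ask for.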
## Installation and use

To run experiments on XOR, MNIST-Addition, Kandinsky, and BDD-OIA, open a Linux terminal and create a conda environment, then install the requirements with pip:

```
conda create -n rs python=3.8
conda activate rs
pip install -r requirements.txt
```

## BDD-OIA (2048)

BDD-OIA is a dataset of dashcam images for autonomous driving predictions, annotated with input-level objects (e.g., bounding boxes of pedestrians) and concept-level entities (e.g., "road is clear"). The original dataset can be found here: https://twizwei.github.io/bddoia_project/

The dataset is preprocessed with a Faster-RCNN pretrained on BDD-100k and with the first module of CBM-AUC (Sawada and Nakamura, IEEE (2022)), leading to embeddings of dimension 2048. These are provided in the archive ```bdd_2048.zip```. The original repo of CBM-AUC can be found here: https://github.com/AISIN-TRC/CBM-AUC.

![BDD-OIA](.github/boia.png)

If you use it, please consider citing the original dataset creators and Sawada and Nakamura:

```
@InProceedings{xu2020cvpr,
author = {Xu, Yiran and Yang, Xiaoyin and Gong, Lihang and Lin, Hsuan-Chu and Wu, Tz-Ying and Li, Yunsheng and Vasconcelos, Nuno},
title = {Explainable Object-Induced Action Decision for Autonomous Vehicles},
booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2020}}

@ARTICLE{sawada2022cbm-auc,
  author={Sawada, Yoshihide and Nakamura, Keigo},
  journal={IEEE Access},
  title={Concept Bottleneck Model With Additional Unsupervised Concepts},
  year={2022},
  volume={10},
  number={},
  pages={41758-41765},
  doi={10.1109/ACCESS.2022.3167702}}
```

## MNIST

This repository comprises several MNIST variations. The most relevant ones are:

**MNIST-Even-Odd:**

The MNIST-Even-Odd dataset is a variant of MNIST-Addition introduced by Marconato et al. (2023b). It consists of specific combinations of digits, including only even or odd digits, such as 0+6=6, 2+8=10, and 1+5=6. The dataset comprises 6720 fully annotated samples in the training set, 1920 samples in the validation set, and 960 samples in the in-distribution test set. Additionally, there are 5040 samples in the out-of-distribution test set, covering all other sums not observed during training. The dataset is affected by reasoning shortcuts; the number of deterministic RSs was calculated to be 49 by solving a linear system.

**MNIST-Half:**

MNIST-Half is a biased version of MNIST-Addition, focusing on digits ranging from 0 to 4. Selected digit combinations include 0+0=0, 0+1=1, 2+3=5, and 2+4=6. Unlike MNIST-Even-Odd, two digits (0 and 1) are not affected by reasoning shortcuts, while 2, 3, and 4 can be predicted differently. The dataset comprises 2940 fully annotated samples in the training set, 840 samples in the validation set, and 420 samples in the test set. Additionally, there are 1080 samples in the out-of-distribution test set, covering the remaining sums with the included digits.

## Kandinsky

The Kandinsky dataset, introduced by Müller and Holzinger in 2021, features visual patterns inspired by the artistic works of Wassily Kandinsky. Each pattern is constructed with geometric figures and encompasses two main concepts: shape and color. The dataset proposes a variant of Kandinsky where each image contains a fixed number of figures, and each figure can have one of three possible colors (red, blue, yellow) and one of three possible shapes (square, circle, triangle).

In an active learning setup, resembling an IQ test for machines, the task involves predicting the pattern of a third image given two images that share a common pattern. During inference, a model, such as the NeSy model used in the experiments, computes a series of predicates like "same_cs" (same color and shape) and "same_ss" (same shape and same color).
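For intuition, predicates of this kind can be sketched as simple checks over an image represented as a list of (shape, color) pairs. This encoding and the helper names are illustrative assumptions, not the repository's API:

```python
# Illustrative sketch of Kandinsky pattern predicates. An image is
# represented here as a list of (shape, color) tuples; this encoding
# and the function names are assumptions for illustration only.

def same_color(figures):
    """True if every figure in the image has the same color."""
    return len({color for _, color in figures}) == 1

def same_shape(figures):
    """True if every figure in the image has the same shape."""
    return len({shape for shape, _ in figures}) == 1

image = [("square", "red"), ("circle", "red"), ("triangle", "red")]
assert same_color(image) and not same_shape(image)
```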
The model needs to choose the third image that completes the pattern based on these computed predicates. For example, if the first two images have different colors, the model should select the option that aligns with the observed pattern. The dataset provides a challenging task that tests a model's ability to generalize and infer relationships between visual elements.

![Kandinsky pattern illustration](.github/kand-illustration.png)

## Structure of the code

* The code structure is the same as [Marconato Reasoning Shortcuts](https://github.com/ema-marconato/reasoning-shortcuts): XOR, Kandinsky, and MNIST-Addition live in a single project folder, ```XOR_MNIST```. Here, we defined:

    * ``backbones`` contains the architectures of the NNs used.
    * ``datasets`` contains the various versions of MNIST-Addition. If you want to add a dataset, it has to be located here.
    * ``example`` is an independent folder containing all the experiments and setup for running XOR.
    * ``models`` contains all the models used to benchmark the presence of RSs. Here, you can find DPL, SL, and LTN + reconstruction, but also a simple concept extractor (cext.py) and conditional VAEs (cvae.py).
    * ``utils`` contains the training loop, the losses, the metrics, and the (wandb-only) loggers.
    * ``exp_best_args.py`` is where I collected all the best hyperparameters for MNIST-Addition and XOR.
    * You can use ``experiments.py`` to prepare a stack of experiments. If you run on a cluster, you can run ``server.py`` to access submitit and schedule a job array, or use ``run_start.sh`` to run a single experiment.


* ``BDD_OIA`` follows the design of Sawada and can be executed by launching ``run_bdd.sh``.
Hyperparameters are already set.


* Args in ``utils.args.py``:
    * --dataset: choose the dataset
    * --task: addition/product/multiop
    * --model: which model to use; remember to append "rec" at the end if you want to add the reconstruction penalty
    * --c_sup: percentage of concept supervision. If 0, then 0% of the examples are supervised; if 1, then 100% of the examples have concept supervision
    * --which_c: pass a list to specify which concepts to supervise, e.g. [1,2] will activate supervision only for concepts 1 and 2
    * --joint: if included, it will process both MNIST digits together
    * --entropy: if included, it will add the entropy penalty
    * --w_sl: weight for the Semantic Loss
    * --gamma: general weight for the mitigation strategy (it multiplies the other weights; my advice is to set it to 1)
    * --wrec, --beta, --w_h, --w_c: different weights for penalties (see also the args description)
    * --do-test: activate the test method. Refer to the other arguments to try out all the possible testing operations.

    * Others are quite standard; consider also using:
        * --wandb: put here the name of your project, like 'i-dont-like-rss'
        * --checkin, --checkout: specify the paths from which to load and to which to save checkpoints, respectively
        * --validate: activate it to use the validation set (this switches from test to validation)


## Using `bears`

After training the model, you can evaluate it using one of several strategies by running the program with the `--posthoc` flag and specifying the desired method with the `--type` option.
The available evaluation strategies are:

- `frequentist`: Evaluates the model without any Bayesian approximation.
- `mcdropout`: Applies Monte Carlo dropout for uncertainty estimation.
- `ensemble`: Uses a standard deep ensemble of models.
- `laplace`: Applies a Laplace approximation to the model's weights.
- `bears`: Uses `bears`.

Alternatively, to run all strategies at once, you can use the `--evaluate-all` flag.

## Issue reports, bug fixes, and pull requests

For all kinds of problems, do not hesitate to contact me. If you have additional mitigation strategies that you want to include for others to test, please send me a pull request.

## Makefile

To see the Makefile targets, simply call the appropriate help command with [GNU/Make](https://www.gnu.org/software/make/)

```bash
make help
```

The `Makefile` provides a simple and convenient way to manage Python virtual environments (see [venv](https://docs.python.org/3/tutorial/venv.html)).

### Environment creation

In order to create the virtual environment and install the requirements, be sure you have Python 3.9 (it should work even with more recent versions, but I have tested it only with 3.9)

```bash
make env
source ./venv/reasoning-shortcut/bin/activate
make install
```

Remember to deactivate the virtual environment once you have finished working with the project

```bash
deactivate
```

### Generate the code documentation

The automatic code documentation is provided by [Sphinx v4.5.0](https://www.sphinx-doc.org/en/master/).

In order to have the code documentation available, you need to install the development requirements

```bash
pip install --upgrade pip
pip install -r requirements.dev.txt
```

Since the Sphinx commands are quite verbose, I suggest using the following `Makefile` targets.

```bash
make doc-layout
make doc
```

The generated documentation will be accessible by opening `docs/build/html/index.html` in your browser, or equivalently by running

```bash
make open-doc
```

However, for the sake of completeness, one may want to run the full Sphinx commands listed here.

```bash
sphinx-quickstart docs --sep --no-batchfile --project bears --author "The Reasoning Shortcut Gang" -r 0.1 --language en --extensions sphinx.ext.autodoc --extensions sphinx.ext.napoleon --extensions sphinx.ext.viewcode --extensions myst_parser
sphinx-apidoc -P -o docs/source .
cd docs; make html
```

## Libraries and extra tools

This code is adapted from [Marconato Reasoning Shortcuts](https://github.com/ema-marconato/reasoning-shortcuts). To implement [PCBMs](https://arxiv.org/abs/2306.01574), we employed some functions from [Kim ProbCBM](https://github.com/ejkim47/prob-cbm).

## Laplace

Since [Laplace](https://github.com/AlexImmer/Laplace) is not meant to deal with a Neuro-Symbolic architecture, nor with this kind of multiclass classification problem, we define our [own fork](https://github.com/samuelebortolotti/Laplace-Reasoning-Shortcut), which is expected to work **only** with our networks.

Here we list the steps we performed:

1. Go to `laplace/utils/feature_extractor.py`
2. Add the following lines to `find_last_layer`:
  ```python
  if key != 'original_model.encoder.dense_c' and key != 'original_model.conceptizer.enc1':
      continue
  ```
  so that the library takes the concept bottleneck as the Laplace model.

3. Go to `laplace/lllaplace.py`
4. 
Add the following lines to `_nn_predictive_samples`:
  ```python
  self.model.model.model_possibilities = [None] * n_samples
  fs = list()
  for i, sample in enumerate(self.sample(n_samples)):
      vector_to_parameters(sample, self.model.last_layer.parameters())
      self.model.model.model_possibilities[i] = sample
      fs.append(self.model(X.to(self._device)).detach())
  ```
  so that the wrapper model knows to start tracking the output predictions. The added lines are `self.model.model.model_possibilities = [None] * n_samples` and `self.model.model.model_possibilities[i] = sample`.

Moreover, in the file `laplace/curvature/curvature.py`, change this:

```python
def BCE_forloop(tar, pred):
    loss = F.binary_cross_entropy(tar[0, :4], pred[0, :4])

    for i in range(1, len(tar)):
        loss = loss + F.binary_cross_entropy(tar[i, :4], pred[i, :4])
    return loss

def CE_forloop(y_pred, y_true):
    y_trues = torch.split(y_true, 1, dim=-1)
    y_preds = torch.split(y_pred, 2, dim=-1)

    loss = 0
    for i in range(4):
        true = y_trues[i].view(-1)
        pred = y_preds[i]

        loss_i = F.nll_loss(pred.log(), true.to(torch.long))
        loss += loss_i / 4

        assert loss_i > 0, pred.log()

    return loss

class CurvatureInterface:
    """Interface to access curvature for a model and corresponding likelihood.
    A `CurvatureInterface` must inherit from this baseclass and implement the
    necessary functions `jacobians`, `full`, `kron`, and `diag`.
    The interface might be extended in the future to account for other curvature
    structures, for example, a block-diagonal one.

    Parameters
    ----------
    model : torch.nn.Module or `laplace.utils.feature_extractor.FeatureExtractor`
        torch model (neural network)
    likelihood : {'classification', 'regression'}
    last_layer : bool, default=False
        only consider curvature of last layer
    subnetwork_indices : torch.Tensor, default=None
        indices of the vectorized model parameters that define the subnetwork
        to apply the Laplace approximation over

    Attributes
    ----------
    lossfunc : torch.nn.MSELoss or torch.nn.CrossEntropyLoss
    factor : float
        conversion factor between torch losses and base likelihoods
        For example, \\(\\frac{1}{2}\\) to get to \\(\\mathcal{N}(f, 1)\\) from MSELoss.
    """
    def __init__(self, model, likelihood, last_layer=False, subnetwork_indices=None):
        assert likelihood in ['regression', 'classification']
        self.likelihood = likelihood
        self.model = model
        self.last_layer = last_layer
        self.subnetwork_indices = subnetwork_indices
        if likelihood == 'regression':
            self.lossfunc = MSELoss(reduction='sum')
            self.factor = 0.5
        else:
            self.lossfunc = CrossEntropyLoss(reduction='sum')
            # self.lossfunc = CE_forloop
            self.factor = 1.
```

where for MNIST you should use `self.lossfunc = CrossEntropyLoss(reduction='sum')` and for BDD `self.lossfunc = CE_forloop`.

## Acknowledgements

The authors are grateful to Zhe Zeng for useful discussion. Funded by the European Union. Views and opinions expressed are however those of the author(s) only and do not necessarily reflect those of the European Union or the European Health and Digital Executive Agency (HaDEA). Neither the European Union nor the granting authority can be held responsible for them. Grant Agreement no. 101120763 - TANGO.
AV is supported by the "UNREAL: Unified Reasoning Layer for Trustworthy ML" project (EP/Y023838/1) selected by the ERC and funded by UKRI EPSRC.