{"id":18597576,"url":"https://github.com/bartpleiter/tabular-backdoors","last_synced_at":"2025-04-10T17:31:05.140Z","repository":{"id":172039458,"uuid":"648637610","full_name":"bartpleiter/tabular-backdoors","owner":"bartpleiter","description":"Code repository for Master thesis on backdoor attacks on transformer-based DNNs for tabular data","archived":false,"fork":false,"pushed_at":"2023-06-19T14:58:03.000Z","size":181,"stargazers_count":5,"open_issues_count":0,"forks_count":1,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-03-25T01:30:00.168Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/bartpleiter.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-06-02T12:45:19.000Z","updated_at":"2024-12-16T07:01:25.000Z","dependencies_parsed_at":null,"dependency_job_id":"d7916c91-0380-41df-8872-bed9a95db2f3","html_url":"https://github.com/bartpleiter/tabular-backdoors","commit_stats":{"total_commits":5,"total_committers":2,"mean_commits":2.5,"dds":"0.19999999999999996","last_synced_commit":"6a494fdc912ae06276efdac95f5144bab836b903"},"previous_names":["bartpleiter/tabular-backdoors"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bartpleiter%2Ftabular-backdoors","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bartpleiter%2Ftabular-backdoors/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/b
artpleiter%2Ftabular-backdoors/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bartpleiter%2Ftabular-backdoors/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/bartpleiter","download_url":"https://codeload.github.com/bartpleiter/tabular-backdoors/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248261993,"owners_count":21074229,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-07T01:28:40.556Z","updated_at":"2025-04-10T17:31:05.112Z","avatar_url":"https://github.com/bartpleiter.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# tabular-backdoors\nCode repository for Master thesis on backdoor attacks on transformer-based DNNs for tabular data.\n\n## Models used\n\n- TabNet, https://arxiv.org/pdf/1908.07442.pdf, (used implementation from https://github.com/dreamquark-ai/tabnet)\n- FT-Transformer, https://arxiv.org/pdf/2106.11959.pdf, (used implementation from https://github.com/Yura52/tabular-dl-revisiting-models)\n- SAINT, https://arxiv.org/pdf/2106.01342.pdf, (used implementation from https://github.com/somepago/saint)\n\n## Data used\n\n- Forest Cover Type (CovType), http://archive.ics.uci.edu/ml/datasets/covertype\n- Lending Club Loan (LOAN), https://www.kaggle.com/datasets/wordsforthewise/lending-club and https://www.kaggle.com/datasets/adarshsng/lending-club-loan-data-csv?select=LCDataDictionary.xlsx\n- Higgs Boson (HIGGS), 
https://archive.ics.uci.edu/ml/datasets/HIGGS\n\n## Overview\n```text\ntabular-backdoors           # Project directory\n├── data                    # Contains datasets and preprocessing notebooks\n├── ExpCleanLabel           # Experiment code for Clean Label Attack\n├── ExpInBounds             # Experiment code for In Bounds Trigger\n├── ExpTriggerPosition      # Experiment code for Trigger Position based on feature importance\n├── ExpTriggerSize          # Experiment code for Trigger Size\n├── SAINT                   # SAINT model code\n├── FTtransformer           # FT-Transformer model code\n└── Notebooks               # Other (smaller or parts of) experiments in the form of notebooks\n    ├── FeatureImportances  # Notebooks to calculate feature importance scores and rankings\n    └── Defences            # Notebooks on defences against our attacks\n```\n\n## Usage\n\n### Install and enable environment\n\n```bash\nvirtualenv tabularbackdoor\nsource tabularbackdoor/bin/activate\npip install -r requirements.txt\n\n# To run the notebooks you also need:\npip install notebook\n```\n\n### Download and preprocess data\n\n1. Download `accepted_2007_to_2018Q4.csv` from https://www.kaggle.com/datasets/wordsforthewise/lending-club and place in `data/LOAN/`\n2. Download `LCDataDictionary.xlsx` from https://www.kaggle.com/datasets/adarshsng/lending-club-loan-data-csv?select=LCDataDictionary.xlsx and place in `data/LOAN/`\n3. Download `HIGGS.csv.gz` from https://archive.ics.uci.edu/ml/datasets/HIGGS and extract `HIGGS.csv` to `data/HIGGS`\n4. Run all four notebooks under `data/preprocess` to generate the `.pkl` files containing the datasets for the experiments\n\n### Run main experiments\n\nRun the shell script in any of the `Exp*` folders from the project root with the Python filename (without extension) as argument. 
Output will be logged to the `output/` folder.\n\n- NOTE: starting an experiment will overwrite the previous log file of the same experiment.\n- NOTE: depending on the machine, you might want to change the GPU used to train each model. To do so, edit the `cuda:x` string (near the top of the file) in each `.py` file.\n\nExample:\n```bash\nbash ExpTriggerSize/run_experiment.sh TabNet_CovType_1F_OOB\n```\n\nTo follow the log of a running experiment live, use `tail -f` with the log file as argument in a new terminal:\n\n```bash\ntail -f output/triggersize/TabNet_CovType_1F_OOB.log\n```\n\n### View results of main experiments\n\nOutput logs are found in the `output/` folder. All logs end with a section `EASY COPY PASTE RESULTS:` where you can copy the resulting lists containing the `ASR` and `BA` for each run.\n\n### Run notebooks (Defences and FeatureImportance calculations)\n\nSee the `Notebooks/` folder for other (smaller or parts of) experiments in the form of notebooks. To run the defences, you must first run the appropriate `CreateModel` notebook to create a backdoored model and dataset, which can then be analyzed with the other notebooks. For the Fine-Pruning defence, there is a dedicated subfolder in the `Notebooks/Defences` folder with notebooks to train, prune and finetune FTT.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbartpleiter%2Ftabular-backdoors","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbartpleiter%2Ftabular-backdoors","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbartpleiter%2Ftabular-backdoors/lists"}