Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Projects in Awesome Lists tagged with trustworthy-ai
A curated list of projects in awesome lists tagged with trustworthy-ai.
https://github.com/trusted-ai/adversarial-robustness-toolbox
Adversarial Robustness Toolbox (ART) - Python Library for Machine Learning Security - Evasion, Poisoning, Extraction, Inference - Red and Blue Teams
adversarial-attacks adversarial-examples adversarial-machine-learning ai artificial-intelligence attack blue-team evasion extraction inference machine-learning poisoning privacy python red-team trusted-ai trustworthy-ai
Last synced: 16 Dec 2024
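ART's evasion category covers gradient-based attacks such as the Fast Gradient Sign Method. A minimal NumPy sketch of the underlying idea (not ART's API; the linear model, weights, and epsilon here are hypothetical toy values):

```python
import numpy as np

def fgsm_perturb(x, w, b, y_true, eps):
    """FGSM for a logistic-regression classifier: step x by
    eps * sign(gradient of the logistic loss w.r.t. the input)."""
    z = x @ w + b                      # logit
    p = 1.0 / (1.0 + np.exp(-z))      # predicted probability
    grad_x = (p - y_true) * w         # d(loss)/dx for logistic loss
    return x + eps * np.sign(grad_x)

# Hypothetical fixed classifier and input
w = np.array([2.0, -1.0])
b = 0.0
x = np.array([0.5, 0.2])              # logit 0.8 -> classified positive
x_adv = fgsm_perturb(x, w, b, y_true=1.0, eps=0.5)
print(x @ w + b > 0, x_adv @ w + b > 0)  # the perturbed input flips the decision
```

In ART itself the same pattern is expressed by wrapping a model in an estimator and calling an attack's `generate` method; the sketch above only shows the gradient-sign step that such evasion attacks share.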
https://github.com/Giskard-AI/giskard
🐢 Open-Source Evaluation & Testing for ML & LLM systems
ai-red-team ai-safety ai-security ai-testing ethical-artificial-intelligence evaluation-framework fairness-ai llm llm-eval llm-evaluation llm-security llmops ml-safety ml-testing ml-validation mlops rag-evaluation red-team-tools responsible-ai trustworthy-ai
Last synced: 08 Nov 2024
https://github.com/zjunlp/easyedit
[ACL 2024] An Easy-to-use Knowledge Editing Framework for LLMs.
artificial-intelligence baichuan chatgpt easyedit efficient gpt knowledge-editing knowlm large-language-models llama llama2 mistral mmedit model-editing natural-language-processing open-source-project safeedit tool trustworthy-ai unlearning
Last synced: 22 Dec 2024
https://github.com/johnsnowlabs/langtest
Deliver safe & effective language models
ai-safety ai-testing artificial-intelligence benchmark-framework benchmarks ethics-in-ai large-language-models llm llm-as-evaluator llm-evaluation-toolkit llm-test llm-testing ml-safety ml-testing mlops model-assessment nlp responsible-ai trustworthy-ai
Last synced: 16 Dec 2024
https://github.com/howiehwong/trustllm
[ICML 2024] TrustLLM: Trustworthiness in Large Language Models
ai benchmark dataset evaluation large-language-models llm natural-language-processing nlp pypi-package toolkit trustworthy-ai trustworthy-machine-learning
Last synced: 20 Dec 2024
https://github.com/THUYimingLi/BackdoorBox
An open-source Python toolbox for backdoor attacks and defenses.
backdoor-attacks backdoor-defenses backdoor-learning trustworthy-ai trustworthy-machine-learning
Last synced: 30 Oct 2024
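Backdoor attacks of the kind BackdoorBox implements typically poison a fraction of the training set by stamping a trigger pattern on inputs and relabeling them to an attacker-chosen target class. A toy BadNets-style sketch in NumPy (an illustration of the general recipe, not BackdoorBox's API; the trigger shape, poisoning rate, and target class are made up):

```python
import numpy as np

def poison(images, labels, rate=0.1, target=0, seed=0):
    """Stamp a 3x3 white trigger on a random fraction of images
    and flip their labels to the target class."""
    rng = np.random.default_rng(seed)
    images, labels = images.copy(), labels.copy()
    idx = rng.choice(len(images), size=int(rate * len(images)), replace=False)
    images[idx, -3:, -3:] = 1.0       # trigger patch in the bottom-right corner
    labels[idx] = target              # attacker-chosen target label
    return images, labels, idx

imgs = np.zeros((100, 28, 28))        # toy "clean" dataset, all label 1
lbls = np.ones(100, dtype=int)
p_imgs, p_lbls, idx = poison(imgs, lbls)
```

A model trained on `(p_imgs, p_lbls)` learns to associate the trigger with class 0, which is exactly the behavior backdoor defenses try to detect or remove.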
https://github.com/aiverify-foundation/moonshot
Moonshot - A simple and modular tool to evaluate and red-team any LLM application.
benchmarking evaluation-framework llm red-teaming trustworthy-ai
Last synced: 13 Nov 2024
https://github.com/liuzuxin/fsrl
🚀 A fast safe reinforcement learning library in PyTorch
cpo cvpo decision-making library ppo pytorch reinforcement-learning robotics sac safe-rl safety-critical trpo trustworthy-ai
Last synced: 17 Dec 2024
https://github.com/yunqing-me/AttackVLM
[NeurIPS-2023] On Evaluating Adversarial Robustness of Large Vision-Language Models
adversarial-attack deep-generative-model foundation-models generative-ai image-to-text-generation large-language-models text-to-image-generation trustworthy-ai vision-language-model
Last synced: 02 Dec 2024
https://github.com/thu-ml/mmtrusteval
A toolbox for benchmarking trustworthiness of multimodal large language models (MultiTrust, NeurIPS 2024 Track Datasets and Benchmarks)
benchmark claude fairness gpt-4 mllm multi-modal privacy robustness safety toolbox trustworthy-ai truthfulness
Last synced: 16 Dec 2024
https://github.com/ffhibnese/Model-Inversion-Attack-ToolBox
A comprehensive toolbox for model inversion attacks and defenses that is easy to get started with.
benchmarks machine-learning model-inversion model-inversion-attacks privacy toolbox trustworthy-ai
Last synced: 09 Nov 2024
https://github.com/sleeepeer/PoisonedRAG
[USENIX Security 2025] PoisonedRAG: Knowledge Corruption Attacks to Retrieval-Augmented Generation of Large Language Models
ai machine-learning rag retrieval-augmented-generation security trustworthy-ai
Last synced: 02 Dec 2024
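PoisonedRAG-style knowledge corruption injects adversarial passages crafted so that a retriever ranks them highly for targeted queries, steering the generator toward attacker-chosen answers. A toy bag-of-words illustration of that retrieval step (a sketch of the general idea only, not the paper's method; the vocabulary and passages are invented):

```python
import numpy as np

def bow(text, vocab):
    """Bag-of-words count vector over a fixed vocabulary."""
    v = np.zeros(len(vocab))
    for word in text.lower().split():
        if word in vocab:
            v[vocab[word]] += 1
    return v

def top1(query, corpus, vocab):
    """Return the passage with highest cosine similarity to the query."""
    q = bow(query, vocab)
    sims = [q @ bow(d, vocab) /
            (np.linalg.norm(q) * np.linalg.norm(bow(d, vocab)) + 1e-9)
            for d in corpus]
    return corpus[int(np.argmax(sims))]

vocab = {w: i for i, w in enumerate("capital of france is paris berlin".split())}
corpus = ["paris is capital of france"]
query = "capital of france"

# Injected passage repeats the query terms so it outranks the clean passage
injected = "capital of france capital of france is berlin"
poisoned = corpus + [injected]
print(top1(query, corpus, vocab))
print(top1(query, poisoned, vocab))
```

The defense-relevant point is that the corrupted passage wins retrieval purely by similarity, so downstream generation inherits the false claim unless the corpus or ranking is hardened.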
https://github.com/dlmacedo/distinction-maximization-loss
A project to improve out-of-distribution detection (open-set recognition) and uncertainty estimation by changing a few lines of code in your project. Perform efficient inference (no added inference time) without repetitive model training, hyperparameter tuning, or collecting additional data.
ai-safety anomaly-detection classification deep-learning machine-learning novelty-detection ood ood-detection open-set open-set-recognition osr out-of-distribution out-of-distribution-detection pytorch robust-machine-learning trustworthy-ai trustworthy-machine-learning uncertainty-estimation
Last synced: 05 Nov 2024
https://github.com/richard-peng-xia/CARES
[arXiv'24 & ICMLW'24] CARES: A Comprehensive Benchmark of Trustworthiness in Medical Vision Language Models
large-vision-language-model medical-multimodal-learning trustworthy-ai vision-language-model
Last synced: 02 Dec 2024
https://github.com/aimagelab/safe-clip
Safe-CLIP: Removing NSFW Concepts from Vision-and-Language Models. ECCV 2024
eccv2024 image-to-text nsfw retrieval safety text-to-image trustworthy-ai vision-and-language
Last synced: 07 Nov 2024
https://github.com/LucasFidon/trustworthy-ai-fetal-brain-segmentation
Trustworthy AI method based on Dempster-Shafer theory - application to fetal brain 3D T2w MRI segmentation
deep-learning fetal-mri segmentation trustworthy-ai trustworthy-machine-learning
Last synced: 13 Nov 2024
https://github.com/dlmacedo/robust-deep-learning
A project to train your model from scratch, or fine-tune a pretrained model, using the losses provided in this library to improve out-of-distribution detection and uncertainty estimation. Calibrate your model to produce better uncertainty estimates, and detect out-of-distribution data using the chosen score type and threshold.
anomaly-detection classification deep-learning deep-neural-networks machine-learning novelty-detection ood-detection open-set open-set-recognition out-of-distribution out-of-distribution-detection pytorch robust-deep-learning robust-machine-learning trustworthy-ai trustworthy-machine-learning uncertainty-calibration uncertainty-estimation uncertainty-neural-networks
Last synced: 05 Nov 2024
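A common baseline behind score-and-threshold OOD detectors like the ones these libraries improve upon is the maximum softmax probability: flag an input as out-of-distribution when the model's top-class confidence falls below a threshold. A generic NumPy sketch (not this library's API; the threshold and logits are hypothetical):

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over the last axis."""
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def is_ood(logits, threshold=0.7):
    """Flag inputs whose maximum softmax probability is below threshold."""
    return softmax(logits).max(axis=-1) < threshold

logits = np.array([[5.0, 0.1, 0.2],    # peaked logits -> confident, in-distribution
                   [1.0, 0.9, 1.1]])   # flat logits -> low confidence, flagged OOD
print(is_ood(logits))                   # [False  True]
```

Libraries in this space typically swap in better scores (energy, Mahalanobis distance, or trained losses) while keeping this same score-plus-threshold decision rule.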
https://github.com/aigc-apps/PertEval
The accompanying repo for the NeurIPS '24 D&B Spotlight paper PertEval, including code, data, and main results.
evaluation-framework evaluation-metrics large-language-models llm-evaluation machine-learning trustworthy-ai
Last synced: 20 Nov 2024
https://github.com/ornl/flowcept
Runtime data integration system that empowers any data processing system to capture and query workflow provenance using data observability.
big-data dask data-integration lineage machine-learning mlflow model-management parallel-processing provenance reproducibility responsible-ai scientific-workflows tensorboard trustworthy-ai workflows
Last synced: 15 Dec 2024
https://github.com/howiehwong/obscureprompt
ObscurePrompt: Jailbreaking Large Language Models via Obscure Input
jailbreak large-language-models trustworthy-ai
Last synced: 13 Nov 2024
https://github.com/ClementSicard/Reliable-and-Trustworthy-AI-Notebooks
Notebooks for the Reliable and Trustworthy Artificial Intelligence course at ETH Zurich, taught by Prof. Dr. Martin Vechev
interpretable-ai neural-networks reliable-ai trustworthy-ai
Last synced: 17 Nov 2024