awesome-activation-engineering
A curated list of resources for activation engineering
https://github.com/ZFancy/awesome-activation-engineering
Categories
Collected Related Work (Uncategorized)
- arXiv 2025
- ICLR 2025
- NeurIPS 2024 - algebra-code
- NeurIPS 2023
- ICLR 2020 - design/gan_steerability
Concept Activation Detection
- arXiv 2025
- arXiv 2025
- PAKDD
- NeurIPS 2024 - CAV
- NeurIPS 2024 - Safety_SCAV
- MICCAI 2024
- arXiv 2024
- arXiv 2024
- arXiv 2024
- ICLR 2023 - gradients
- NeurIPS 2022
- NeurIPS 2022
- NeurIPS 2020
- ICML 2018
Concept Activation Steering
- arXiv 2025
- arXiv 2025
- arXiv 2025
- arXiv 2025
- arXiv 2025
- arXiv 2025
- AAAI 2025
- ICLR 2025
- ICLR 2025 - steering
- ICLR 2025 - wang123/SADI
- ICLR 2025
- ICLR 2025 workshop
- NeurIPS 2024
- NeurIPS 2024
- NeurIPS 2024
- NeurIPS 2024
- NeurIPS 2024
- NeurIPS 2024 workshop
- NeurIPS 2024 workshop
- NeurIPS 2024 workshop
- NeurIPS 2024 workshop
- NeurIPS 2024 workshop
- ICML 2024
- ICML 2024 workshop
- AAAI 2024
- ICLR 2024
- ICLR 2024
- EMNLP 2024
- EMNLP 2024
- ACL 2024
- ACL 2024 - wpy/InferAligner
- ACL 2024
- CIKM 2024 - Activation-Attack
- arXiv 2024 - engineering
- arXiv 2024 - TS
- arXiv 2024
- arXiv 2024 - then-steer
- arXiv 2024
- arXiv 2024
- arXiv 2024 - activation-addition
- arXiv 2024
- arXiv 2024
- arXiv 2023
- EMNLP 2023
- arXiv 2023
- ACL 2022
Concept Representation and Extraction
Relevant Repo and Blog