awesome-llm-computational-argumentation
The Hub of Computational Argumentation in the Era of LLM
https://github.com/kashiwabyte/awesome-llm-computational-argumentation
Papers
Debate for LLMs
- Can LLMs Speak For Diverse People? Tuning LLMs via Debate to Generate Controllable Controversial Statements
- Combating Adversarial Attacks with Multi-Agent Debate
- From Values to Opinions: Predicting Human Behaviors and Stances Using Value-Injected Large Language Models
- The Moral Debater: A Study on the Computational Generation of Morally Framed Arguments
- Can LLMs Beat Humans in Debating? A Dynamic Multi-agent Framework for Competitive Debate
- The Debate Over Understanding in AI's Large Language Models
- Project Debater APIs: Decomposing the AI Grand Challenge
- An Empirical Analysis on Large Language Models in Debate Evaluation
- DEBATE: Devil's Advocate-Based Assessment and Text Evaluation
- Debatrix: Multi-dimensional Debate Judge with Iterative Chronological Analysis Based on LLM
- A Picture Is Worth a Graph: A Blueprint Debate Paradigm for Multimodal Reasoning
- Debating with More Persuasive LLMs Leads to More Truthful Answers
- Can Large Language Models be Trusted for Evaluation? Scalable Meta-Evaluation of LLMs as Evaluators via Agent Debate
- Towards Explainable Harmful Meme Detection through Multimodal Debate between Large Language Models
- Recourse under Model Multiplicity via Argumentative Ensembling (Technical Report)
- Should we be going MAD? A Look at Multi-Agent Debate Strategies for LLMs
- Debate Helps Supervise Unreliable Experts
- Scalable AI Safety via Doubly-Efficient Debate
- Let Models Speak Ciphers: Multiagent Debate through Embeddings
- ChatEval: Towards Better LLM-based Evaluators through Multi-Agent Debate
- Can ChatGPT Defend its Belief in Truth? Evaluating LLM Reasoning via Debate
- Improving Factuality and Reasoning in Language Models through Multiagent Debate
- Encouraging Divergent Thinking in Large Language Models through Multi-Agent Debate
- Examining Inter-Consistency of Large Language Models Collaboration: An In-depth Analysis via Debate
- Learning to Break: Knowledge-Enhanced Reasoning in Multi-Agent Debate System
Quality Assessment
- Conclusion-based Counter-Argument Generation
- Rhetoric, Logic, and Dialectic: Advancing Theory-based Argument Quality Assessment in Natural Language Processing
- Argument Quality Assessment in the Age of Instruction-Following Large Language Models
- Automatic Argument Quality Assessment -- New Datasets and Methods
- Can Language Models Recognize Convincing Arguments?
- Contextualizing Argument Quality Assessment with Relevant Knowledge
- Exploring the Role of Argument Structure in Online Debate Persuasion
- A Large-scale Dataset for Argument Quality Ranking: Construction and Analysis
- Automatic Debate Evaluation with Argumentation Semantics and Natural Language Argument Graph Networks
- Assessing the Sufficiency of Arguments through Conclusion Generation
- Automatic Analysis of Substantiation in Scientific Peer Reviews
- Assessing Good, Bad and Ugly Arguments Generated by ChatGPT: a New Dataset, its Methodology and Associated Tasks
- Claim Optimization in Computational Argumentation
Argument Generation
- Persuasiveness of Generated Free-Text Rationales in Subjective Decisions: A Case Study on Pairwise Argument Ranking
- MOCHA: A Multi-Task Training Approach for Coherent Text Generation from Cognitive Perspective
- Argue with Me Tersely: Towards Sentence-Level Counter-Argument Generation
- DebateKG: Automatic Policy Debate Case Creation with Semantic Knowledge Graphs
- RSTGen: Imbuing Fine-Grained Interpretable Control into Long-Form Text Generators
- From Values to Opinions: Predicting Human Behaviors and Stances Using Value-Injected Large Language Models
Argument Mining
- Hi-ArG: Exploring the Integration of Hierarchical Argumentation Graphs in Language Pretraining
- TACO -- Twitter Arguments from COnversations
- Can Large Language Models perform Relation-based Argument Mining?
- Argument Mining in Data Scarce Settings: Cross-lingual Transfer and Few-shot Techniques
- In-Context Learning and Fine-Tuning GPT for Argument Mining
- WIBA: What Is Being Argued? A Comprehensive Approach to Argument Mining
- DMON: A Simple yet Effective Approach for Argument Structure Learning
- Exploring Key Point Analysis with Pairwise Generation and Graph Partitioning
- A School Student Essay Corpus for Analyzing Interactions of Argumentative Structure and Quality
- End-to-End Argument Mining over Varying Rhetorical Structures
- Overview of ImageArg-2023: The First Shared Task in Multimodal Argument Mining
- TILFA: A Unified Framework for Text, Image, and Layout Fusion in Argument Mining
- Detecting Check-Worthy Claims in Political Debates, Speeches, and Interviews Using Audio Data
- AQE: Argument Quadruplet Extraction via a Quad-Tagging Augmented Generative Approach
- VivesDebate-Speech: A Corpus of Spoken Argumentation to Leverage Audio Features for Argument Mining
- Perturbations and Subpopulations for Testing Robustness in Token-Based Argument Unit Recognition
- ImageArg: A Multi-modal Tweet Dataset for Image Persuasiveness Mining
- A Holistic Framework for Analyzing the COVID-19 Vaccine Debate
- Echoes through Time: Evolution of the Italian COVID-19 Vaccination Debate
- Can Unsupervised Knowledge Transfer from Social Discussions Help Argument Mining?
Evaluation
Benchmarks & Datasets
- DebateQA: Evaluating Question Answering on Debatable Knowledge
- Which Side Are You On? A Multi-task Dataset for End-to-End Argument Summarisation and Evaluation
- OpenDebateEvidence: A Massive-Scale Argument Mining and Summarization Dataset
- FREDSum: A Dialogue Summarization Corpus for French Political Debates
- QT30: A Corpus of Argument and Conflict in Broadcast Debate
- Single-Turn Debate Does Not Help Humans Answer Hard Reading-Comprehension Questions
- USR: An Unsupervised and Reference Free Evaluation Metric for Dialog Generation
- SummEval: Re-evaluating Summarization Evaluation
- Learning From Revisions: Quality Assessment of Claims in Argumentation at Scale
- IAM: A Comprehensive and Large-Scale Dataset for Integrated Argument Mining Tasks
- Transformer-Based Argument Mining for Healthcare Applications
- Detecting Attackable Sentences in Arguments
- Unsupervised Expressive Rules Provide Explainability and Assist Human Experts Grasping New Domains
- A Dataset of General-Purpose Rebuttal
- Exploring the Role of Prior Beliefs for Argument Persuasion
- A Corpus for Modeling User and Language Effects in Argumentation on Online Debating
- Cross-topic Argument Mining from Heterogeneous Sources Using Attention-based Neural Networks
- Recognizing Insufficiently Supported Arguments in Argumentative Essays
- Parsing Argumentation Structures in Persuasive Essays
Survey