https://github.com/pelikhan/action-genai-issue-dedup
Searches for duplicate issues.
- Host: GitHub
- URL: https://github.com/pelikhan/action-genai-issue-dedup
- Owner: pelikhan
- License: mit
- Created: 2025-05-31T04:28:13.000Z (6 months ago)
- Default Branch: main
- Last Pushed: 2025-06-12T04:42:42.000Z (6 months ago)
- Last Synced: 2025-06-12T05:35:09.176Z (6 months ago)
- Language: TypeScript
- Size: 243 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 2
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-continuous-ai - Detect Duplicate Issues, reusable action - an example of a reusable action using GitHub Actions, GitHub Models, and GenAIScript for duplicate issue detection (Categories / Continuous Triage)
README
This action is designed to find duplicate issues in a GitHub repository using a GenAI model. It retrieves the current issue and compares it against other issues in the repository, leveraging LLM reasoning to determine if they are duplicates.
> [!NOTE]
> This action uses [GitHub Models](https://github.com/models) for LLM inference.
## Algorithm
The deduplication algorithm implemented in `genaisrc/action.genai.mts` operates as follows:
- **Issue Retrieval**: The script retrieves the current issue and a configurable set of other issues from the repository, filtered by state, labels, creation date, and count. The current issue is excluded from the comparison set.
- **Batch detection using small LLM**: For each group of issues, the script constructs a prompt that defines the current issue and the group of other issues (grouped to fit in the context window). The prompt instructs the **small** LLM to compare the current issue against each candidate, providing a CSV output with the issue number, reasoning, and a verdict (`DUP` for duplicate, `UNI` for unique).
- **Single duplicate validation using large LLM**: If the small LLM flags a candidate as a duplicate, the script runs a second validation prompt with a **large** model to confirm the hit.
- **Result Output**: If duplicates are found, their issue numbers and titles are output. If no duplicates are found, the action is cancelled with an appropriate message.
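The batching and verdict-parsing steps above can be sketched in TypeScript roughly as follows. This is a minimal illustration, not the code in `genaisrc/action.genai.mts`: the `Issue` and `Verdict` types, the `batchIssues` and `parseVerdicts` helpers, the estimate of about 4 characters per token, and the exact CSV column order (`number,reasoning,verdict`) are all assumptions made for the example.

```typescript
// Hypothetical sketch: names and CSV layout are assumptions, not the action's real code.
interface Issue {
  number: number;
  title: string;
  body: string;
}

interface Verdict {
  number: number;
  reasoning: string;
  duplicate: boolean;
}

// Rough token estimate (assumption: about 4 characters per token).
const estimateTokens = (text: string): number => Math.ceil(text.length / 4);

// Group candidate issues so that each batch stays within a token budget.
// Each issue's cost is capped at `tokensPerIssue`, mirroring the `tokens_per_issue` input.
function batchIssues(issues: Issue[], tokensPerIssue: number, budget: number): Issue[][] {
  const batches: Issue[][] = [];
  let current: Issue[] = [];
  let used = 0;
  for (const issue of issues) {
    const cost = Math.min(estimateTokens(issue.title + "\n" + issue.body), tokensPerIssue);
    if (current.length > 0 && used + cost > budget) {
      batches.push(current);
      current = [];
      used = 0;
    }
    current.push(issue);
    used += cost;
  }
  if (current.length > 0) batches.push(current);
  return batches;
}

// Parse lines of the assumed form `number,reasoning,verdict` (verdict is DUP or UNI).
// The reasoning may contain commas, so split on the first and last comma only.
function parseVerdicts(csv: string): Verdict[] {
  const verdicts: Verdict[] = [];
  for (const line of csv.split(/\r?\n/)) {
    const first = line.indexOf(",");
    const last = line.lastIndexOf(",");
    if (first < 0 || last <= first) continue; // skip blank or malformed lines
    const number = Number.parseInt(line.slice(0, first).trim(), 10);
    const verdict = line.slice(last + 1).trim().toUpperCase();
    if (Number.isNaN(number) || (verdict !== "DUP" && verdict !== "UNI")) continue;
    verdicts.push({
      number,
      reasoning: line.slice(first + 1, last).trim(),
      duplicate: verdict === "DUP",
    });
  }
  return verdicts;
}
```

In this sketch, only the entries with `duplicate: true` would be passed on to the large-model validation step, which keeps the more expensive model off the bulk of the comparisons.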
## Inputs
- `count`: Number of issues to check for duplicates (default: `30`)
- `since`: Only check issues created after this date (ISO 8601 format)
- `labels`: List of labels to filter issues by
- `state`: State of the issues to check (open, closed, all) (default: `open`)
- `max_duplicates`: Maximum number of duplicates to check for (default: `3`)
- `tokens_per_issue`: Number of tokens to use for each issue when checking for duplicates (default: `1000`)
- `label_as_duplicate`: Add `duplicate` label to issues that are found to be duplicates (default: `false`)
- `github_token`: GitHub token with at least the `models: read` permission (https://microsoft.github.io/genaiscript/reference/github-actions/#github-models-permissions). (required)
- `debug`: Enable debug logging (https://microsoft.github.io/genaiscript/reference/scripts/logging/).
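To make the defaults above concrete, here is a small TypeScript sketch that resolves raw string inputs into typed values. It is illustrative only: `resolveInputs` and `DedupInputs` are hypothetical names, the comma-separated format for `labels` is an assumption, and the action itself infers its inputs from the script parameters rather than from a helper like this.

```typescript
// Hypothetical helper: applies the defaults documented above to raw string inputs.
type IssueState = "open" | "closed" | "all";

interface DedupInputs {
  count: number;
  since?: Date;
  labels: string[];
  state: IssueState;
  maxDuplicates: number;
  tokensPerIssue: number;
  labelAsDuplicate: boolean;
}

const STATES: readonly IssueState[] = ["open", "closed", "all"];

function resolveInputs(raw: Record<string, string | undefined>): DedupInputs {
  // Parse a non-negative integer input, falling back to the documented default.
  const int = (name: string, fallback: number): number => {
    const value = raw[name];
    if (value === undefined || value.trim() === "") return fallback;
    const parsed = Number.parseInt(value, 10);
    if (Number.isNaN(parsed) || parsed < 0) throw new Error(`invalid ${name}: ${value}`);
    return parsed;
  };

  const stateRaw = (raw["state"] ?? "").trim() || "open";
  const state = STATES.find((s) => s === stateRaw);
  if (!state) throw new Error(`invalid state: ${stateRaw}`);

  const sinceRaw = (raw["since"] ?? "").trim();
  const since = sinceRaw ? new Date(sinceRaw) : undefined;
  if (since && Number.isNaN(since.getTime())) throw new Error(`invalid since: ${sinceRaw}`);

  return {
    count: int("count", 30),
    since,
    // Assumption: labels arrive as a comma-separated string.
    labels: (raw["labels"] ?? "")
      .split(",")
      .map((label) => label.trim())
      .filter((label) => label.length > 0),
    state,
    maxDuplicates: int("max_duplicates", 3),
    tokensPerIssue: int("tokens_per_issue", 1000),
    labelAsDuplicate: (raw["label_as_duplicate"] ?? "false").trim().toLowerCase() === "true",
  };
}
```

Calling `resolveInputs({})` yields the documented defaults, while an unsupported `state` value throws, which surfaces a misconfigured workflow early.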
## Usage
Add the following permissions and step to your workflow file:
```yaml
---
permissions:
  models: read
  issues: write
---
steps:
  - uses: pelikhan/action-genai-issue-dedup@v0
    with:
      github_token: ${{ secrets.GITHUB_TOKEN }}
```
## Example
Save this file under `.github/workflows/genai-issue-dedup.yml` in your repository:
```yaml
name: GenAI Find Duplicate Issues
on:
  issues:
    types: [opened, reopened]
permissions:
  models: read
  issues: write
concurrency:
  group: ${{ github.workflow }}-${{ github.event.issue.number }}
  cancel-in-progress: true
jobs:
  genai-issue-dedup:
    runs-on: ubuntu-latest
    steps:
      - name: Run action-issue-dedup Action
        uses: pelikhan/action-genai-issue-dedup@v0
        with:
          github_token: ${{ secrets.GITHUB_TOKEN }}
```
## Development
This action was automatically generated by GenAIScript from the script metadata.
We recommend updating the script metadata instead of editing the action files directly.
- the action inputs are inferred from the script parameters
- the action outputs are inferred from the script output schema
- the action description is the script title
- the readme description is the script description
- the action branding is the script branding
To **regenerate** the action files (`action.yml`), run:
```bash
npm run configure
```
To lint script files, run:
```bash
npm run lint
```
To typecheck the scripts, run:
```bash
npm run typecheck
```
To build the Docker image locally, run:
```bash
npm run docker:build
```
To run the action locally in Docker (build it first), use:
```bash
npm run docker:start
```
To run the action using [act](https://nektosact.com/), first install the act CLI:
```bash
npm run act:install
```
Then, you can run the action with:
```bash
npm run act
```
## Upgrade
The GenAIScript version is pinned in the `package.json` file. To upgrade it, run:
```bash
npm run upgrade
```
## Release
To release a new version of this action, run the release script in a clean working directory:
```bash
npm run release
```