https://github.com/rdiachenko/checkstyle-xpath-llm-poc

AI-Powered XPath Generator for Checkstyle Suppressions PoC

Host: GitHub
URL: https://github.com/rdiachenko/checkstyle-xpath-llm-poc
Owner: rdiachenko
License: mit
Created: 2025-01-18T07:05:33.000Z (over 1 year ago)
Default Branch: main
Last Pushed: 2025-01-18T14:20:50.000Z (over 1 year ago)
Last Synced: 2025-01-18T15:29:22.598Z (over 1 year ago)
Language: Python
Homepage:
Size: 9.77 KB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

README

# XPath Generator PoC

A proof of concept for generating Checkstyle XPath suppressions using LLMs. This project demonstrates the feasibility of using Code Llama to generate XPath expressions for suppressing specific Checkstyle violations.
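
For context, a Checkstyle XPath suppression is an entry in a suppression file that pairs a check with an XPath query over the Java AST. The sketch below is illustrative and not taken from this repository: `render_suppression`, the `MagicNumber` check, and the sample query are assumptions chosen for the example. It renders a `suppress-xpath` entry (the element name used by Checkstyle's `SuppressionXpathFilter`) with the standard library only; a real suppression file also needs the XML prolog and DOCTYPE described in the Checkstyle documentation.

```python
import xml.etree.ElementTree as ET


def render_suppression(check: str, xpath: str) -> str:
    """Render a minimal <suppressions> document with one suppress-xpath entry.

    Hypothetical helper for illustration; it is not part of this repository.
    """
    root = ET.Element("suppressions")
    # "checks" names the Checkstyle check, "query" holds the XPath expression.
    ET.SubElement(root, "suppress-xpath", {"checks": check, "query": xpath})
    return ET.tostring(root, encoding="unicode")


# Example values are assumptions, not output of the model in this project.
example = render_suppression(
    "MagicNumber",
    "//CLASS_DEF[./IDENT[@text='Sample']]//METHOD_DEF[./IDENT[@text='compute']]",
)
print(example)
```

The XPath in the `query` attribute is the part this project asks the LLM to produce.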

## Prerequisites

- Docker
- Python 3.x (for local testing)
- Java 17 or higher (for testing only)
- Hugging Face API token (set as `HF_TOKEN` environment variable)

## Project Structure

```
checkstyle-xpath-llm-poc/
│
├── config.py # Model and generation parameters
│
├── docker/ # Docker-related files
│ ├── Dockerfile # Optimized for LLM inference
│ └── inference.py # Core XPath generation script
│
├── testing/ # Testing infrastructure
│ └── test_xpath.py # Testing and validation script
│
├── models/ # Directory for downloaded models (gitignored)
└── README.md
```

## Setup

1. Set your Hugging Face token:
```bash
export HF_TOKEN="your_token_here"
```

2. Create a directory for model storage:
```bash
mkdir -p models
```

3. Build the Docker image:
```bash
docker build -t xpath-generator -f docker/Dockerfile .
```

4. Download Checkstyle (required for testing):
```bash
cd testing
curl -L -o checkstyle-10.21.1-all.jar \
https://github.com/checkstyle/checkstyle/releases/download/checkstyle-10.21.1/checkstyle-10.21.1-all.jar
```
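
Before running anything, it can help to confirm that the setup steps above actually took effect. The following is a minimal sketch, not part of the repository: `missing_prerequisites` is a hypothetical helper, and it assumes the layout described in this README (a `models/` directory at the repository root and the Checkstyle jar inside `testing/`).

```python
import os
from pathlib import Path
from typing import List, Mapping


def missing_prerequisites(
    repo_root: Path, env: Mapping[str, str] = os.environ
) -> List[str]:
    """Return a human-readable list of setup steps that appear incomplete.

    Hypothetical helper; paths follow the Setup section of this README.
    """
    problems = []
    if not env.get("HF_TOKEN"):
        problems.append("HF_TOKEN is not set")
    if not (repo_root / "models").is_dir():
        problems.append("models/ directory is missing")
    jar = repo_root / "testing" / "checkstyle-10.21.1-all.jar"
    if not jar.is_file():
        problems.append("testing/checkstyle-10.21.1-all.jar is missing")
    return problems


# Usage, assuming the current directory is the repository root:
#   for problem in missing_prerequisites(Path(".")):
#       print("setup incomplete:", problem)
```

The check does not verify that the Docker image was built; `docker image inspect xpath-generator` can be used for that.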

## Usage

From the repository root, run the test script to see XPath generation in action:
```bash
cd testing
python3 test_xpath.py
```

The script will:
1. Generate an AST from sample Java code
2. Pass the code, violation, and AST to the LLM
3. Generate an XPath expression
4. Validate the generated XPath using Checkstyle
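
The four steps above can be sketched roughly as follows. This is an assumption-laden outline, not the code in `test_xpath.py`: the function names and prompt wording are hypothetical, and the sketch assumes the Checkstyle command-line options `-t` (print the AST of a file) and `-b` (print the AST branches matching an XPath query) are used for steps 1 and 4. The actual script may call Checkstyle differently.

```python
import subprocess
from typing import List

CHECKSTYLE_JAR = "checkstyle-10.21.1-all.jar"  # jar name from the Setup section


def ast_command(java_file: str, jar: str = CHECKSTYLE_JAR) -> List[str]:
    """Step 1: command that prints the AST of java_file (Checkstyle -t)."""
    return ["java", "-jar", jar, "-t", java_file]


def build_prompt(code: str, violation: str, ast: str) -> str:
    """Step 2: combine code, violation, and AST into one prompt.

    The wording is a placeholder; the real prompt lives in the repository.
    """
    return (
        "Generate a Checkstyle XPath expression that matches the violation.\n"
        f"Violation: {violation}\n"
        f"Java code:\n{code}\n"
        f"AST:\n{ast}\n"
        "XPath:"
    )


def xpath_match_command(
    java_file: str, xpath: str, jar: str = CHECKSTYLE_JAR
) -> List[str]:
    """Step 4: command that prints AST branches matching xpath (Checkstyle -b)."""
    return ["java", "-jar", jar, "-b", xpath, java_file]


def run(cmd: List[str]) -> str:
    """Run a command and return its stdout; raises if the command fails."""
    return subprocess.run(
        cmd, capture_output=True, text=True, check=True
    ).stdout
```

Step 3, the call to the model, is omitted because it depends on how the container exposes `inference.py`. Under the assumption above, an XPath would count as validated when `run(xpath_match_command(...))` prints at least one matching branch.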

To follow the Docker container logs, run:
```bash
docker logs -f xpath-generator-instance
```

## Configuration

The model and generation parameters can be configured in `config.py`. Key settings include:
- Model repository and local folder
- Model parameters (dtype, device mapping, etc.)
- Tokenizer parameters
- Generation parameters (beam size, temperature, etc.)
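
As an illustration of how such settings might be grouped, here is a hypothetical layout. None of the values are taken from the repository's `config.py`: the model repository name, the numbers, and the `Config` class itself are assumptions. The keyword names (`torch_dtype`, `device_map`, `padding_side`, `num_beams`, `temperature`, `max_new_tokens`, `do_sample`) follow the Hugging Face Transformers conventions.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass(frozen=True)
class Config:
    """Hypothetical grouping of the settings listed above."""

    # Model repository and local folder (repository name is an assumption).
    model_repo: str = "codellama/CodeLlama-7b-Instruct-hf"
    model_dir: str = "models"
    # Model parameters; dtype is given as a string here and may need to be
    # converted to a torch dtype depending on the Transformers version.
    model_params: Dict[str, Any] = field(
        default_factory=lambda: {"torch_dtype": "float16", "device_map": "auto"}
    )
    # Tokenizer parameters.
    tokenizer_params: Dict[str, Any] = field(
        default_factory=lambda: {"padding_side": "left"}
    )
    # Generation parameters (beam size, temperature, output length).
    generation_params: Dict[str, Any] = field(
        default_factory=lambda: {
            "num_beams": 4,
            "do_sample": True,
            "temperature": 0.2,
            "max_new_tokens": 128,
        }
    )


config = Config()
print(config.model_repo, config.generation_params)
```

Keeping each group as a plain dictionary lets it be passed through with `**config.generation_params` style unpacking, which is one possible design and not necessarily the one used here.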

## Limitations

This is an experimental proof of concept and has several limitations:
- Uses a simplified prompt structure
- May generate incorrect XPaths for complex cases
- Requires downloading a large language model
- Limited to basic Checkstyle violation cases
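
Because the model may return malformed output, a cheap structural check before invoking Checkstyle can filter out obvious failures. The sketch below is not part of the repository; `looks_like_xpath` is a hypothetical helper that only checks that the candidate starts with `/` and that brackets, parentheses, and quotes are balanced. It does not prove that the expression is valid XPath or that it matches the violation.

```python
def looks_like_xpath(candidate: str) -> bool:
    """Return True if candidate passes a shallow structural check.

    Hypothetical pre-filter: checks the leading '/', balanced [] and (),
    and closed quotes. Passing this check does not guarantee validity.
    """
    text = candidate.strip()
    if not text.startswith("/"):
        return False
    closers = {"]": "[", ")": "("}
    stack = []
    quote = None  # the quote character currently open, if any
    for ch in text:
        if quote is not None:
            if ch == quote:
                quote = None
            continue  # ignore everything inside a quoted literal
        if ch in "'\"":
            quote = ch
        elif ch in "[(":
            stack.append(ch)
        elif ch in closers:
            if not stack or stack.pop() != closers[ch]:
                return False
    return quote is None and not stack


print(looks_like_xpath("//METHOD_DEF[./IDENT[@text='bar']]"))
print(looks_like_xpath("//METHOD_DEF[./IDENT[@text='bar']"))
```

Candidates that pass this filter still need the Checkstyle-based validation described under Usage.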

## Note

This proof of concept demonstrates the basic approach of using LLMs for XPath generation. It's not intended for production use and serves as a starting point for further development.
