https://github.com/welf/code-context

Code context generator for LLMs

ai code-generation llm llms rust

Last synced: 3 months ago

Host: GitHub
URL: https://github.com/welf/code-context
Owner: welf
License: mit
Created: 2025-01-14T12:16:05.000Z (5 months ago)
Default Branch: main
Last Pushed: 2025-01-16T19:18:26.000Z (5 months ago)
Last Synced: 2025-01-22T05:31:46.778Z (5 months ago)
Topics: ai, code-generation, llm, llms, rust
Language: Rust
Homepage:
Size: 39.1 KB
Stars: 1
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

README

# Code Context Generator for LLMs

![GitHub Actions Workflow Status](https://img.shields.io/github/actions/workflow/status/welf/code-context/.github%2Fworkflows%2Fci.yml)

![License](https://img.shields.io/github/license/welf/code-context)

A CLI tool that processes Rust source code into a high-level context suitable
for Large Language Models (LLMs). It strips non-essential information so that
you can share large codebases with LLMs.

## Table of Contents

- [Overview](#overview)

- [Features](#features)

- [Installation](#installation)

- [Usage](#usage)

- [Examples](#examples)

- [FAQ](#faq)

- [Contributing](#contributing)

- [License](#license)

## Overview

When working with LLMs on large codebases, it's crucial to balance providing
enough context while staying within context window limits and optimizing for
cost and performance. This tool processes Rust code to remove unnecessary
implementation details while preserving the essential structure and interfaces.

### Considerations

- **Context Window Management**: By stripping down the code to its essential
  structure, the tool helps fit more relevant information within the LLM's
  context window, which is crucial for effective processing and understanding.
- **Focus on Essentials**: The tool preserves the module structure, type
  definitions, function signatures, and important comments, which are often
  sufficient for understanding the overall architecture and design of the
  project.
- **Reduced Noise**: Removing implementation details and test code reduces
  noise, allowing the LLM to focus on the high-level structure and relationships
  within the codebase.
- **Scalability**: This approach scales well with large projects, as it avoids
  overwhelming the LLM with unnecessary details, making it easier to handle and
  process large codebases.
- **Incremental Sharing**: The tool's approach of sharing small parts of the
  codebase as needed ensures that the LLM has access to detailed information
  when required, without overwhelming it with the entire codebase.

## Features

- **Removes**:
  - Test functions (`#[test]`) and test modules (`#[cfg(test)]`)
  - Function bodies when the `--no-function-bodies` option is used (with the
    exceptions listed under **Preserves**)
  - Doc comments and module-level documentation when the `--no-comments` option
    is used
  - Implementation details of derived traits
- **Preserves**:
  - Module structure and imports
  - Type definitions (structs, enums, traits)
  - Function signatures and interfaces
  - Non-test attributes (e.g., `#[derive]`)
  - Doc comments and module-level documentation (unless the `--no-comments`
    option is specified)
  - Function bodies for:
    - String-like return types (`String`, `&str`, `Cow`)
    - `Result<T, E>` where `T` is string-like
    - `Option<T>` where `T` is string-like
    - Custom `Serialize` trait implementations
  - Special trait method annotations:
    - `/// This is a required method` for required trait methods
    - `/// There is a default implementation` for methods with default
      implementations
  - File paths relative to the `src` directory (the one containing the
    `main.rs` and `lib.rs` files) if the `--single-file` flag is used
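
The body-preservation rule for string-like return types can be sketched as a small predicate. This is only an illustration, not the tool's actual implementation: `is_string_like`, `first_generic_arg`, and `keeps_body` are hypothetical names, and the check works on the textual form of a return type rather than on a parsed syntax tree. Custom `Serialize` implementations and lifetime-annotated references such as `&'a str` are not covered here.

```rust
/// Hypothetical helper: true if a return type, given as text, is one of
/// the string-like types listed above (`String`, `&str`, `Cow`).
fn is_string_like(ty: &str) -> bool {
    let ty = ty.trim();
    ty == "String" || ty == "&str" || ty == "Cow" || ty.starts_with("Cow<")
}

/// Hypothetical helper: if `ty` has the form `wrapper<...>`, returns its
/// first top-level generic argument, e.g. `String` for `Result<String, E>`.
fn first_generic_arg<'a>(ty: &'a str, wrapper: &str) -> Option<&'a str> {
    let inner = ty
        .trim()
        .strip_prefix(wrapper)?
        .strip_prefix('<')?
        .strip_suffix('>')?;
    // Track nesting depth so the comma inside `Cow<'a, str>` is not
    // mistaken for the separator between `T` and `E`.
    let mut depth = 0usize;
    for (i, c) in inner.char_indices() {
        match c {
            '<' => depth += 1,
            '>' => depth = depth.saturating_sub(1),
            ',' if depth == 0 => return Some(inner[..i].trim()),
            _ => {}
        }
    }
    Some(inner.trim())
}

/// Sketch of the rule: keep the body if the return type is string-like,
/// or is a `Result`/`Option` whose `T` is string-like.
fn keeps_body(return_type: &str) -> bool {
    is_string_like(return_type)
        || first_generic_arg(return_type, "Result").is_some_and(is_string_like)
        || first_generic_arg(return_type, "Option").is_some_and(is_string_like)
}
```

Under these assumptions, `keeps_body("Result<String, io::Error>")` is true, while `keeps_body("Result<(), String>")` is false because only `T` is inspected.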

## Installation

```bash
# Clone the repository
git clone https://github.com/welf/code-context.git
cd code-context

# Build the project
cargo build --release
```

## Usage

```bash
# Basic usage
code-context 

# With options
code-context  --output-dir  --no-comments --stats --dry-run --single-file
```

### Command Line Options

```
Options:
  -o, --output-dir   Output directory name [default: code-context]
      --no-function-bodies Remove function bodies (except for functions with string-like return types)
      --no-comments        Remove all comments (including doc comments)
      --no-stats           Show processing statistics
      --dry-run            Run without writing output files
      --single-file        Output all files into a single combined file
  -h, --help               Print help
  -V, --version            Print version
```

## Examples

Generated output files can be found in the
[`src-code-context`](./src-code-context/) and
[`src-custom-suffix`](./src-custom-suffix/) directories.

- The file
  [`src-code-context/code_context.rs.txt`](./src-code-context/code_context.rs.txt)
  was generated by passing the path to the `src` directory of this repo with
  the `--single-file` and `--no-function-bodies` options.
- Files in the [`src-custom-suffix`](./src-custom-suffix/) directory were
  generated by passing the path to the `src` directory with the
  `--output-dir custom-suffix` and `--no-function-bodies` options.

In both cases, the size reduction is 92.5% (from 76371 bytes to 5744 bytes).
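
The reduction figure follows directly from the two byte counts. A minimal sketch of the arithmetic, where `size_reduction_percent` is a hypothetical helper and not part of the tool:

```rust
/// Hypothetical helper: percentage by which `after` is smaller than `before`.
fn size_reduction_percent(before: u64, after: u64) -> f64 {
    (1.0 - after as f64 / before as f64) * 100.0
}
```

Calling `size_reduction_percent(76371, 5744)` gives roughly 92.48, which rounds to the 92.5% quoted above.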

### Before and After Example

**Before:**

```rust
/// Adds two numbers.
fn add(a: i32, b: i32) -> i32 {
    a + b
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn test_add() {
        assert_eq!(add(1, 2), 3);
    }
}
```

**After using the tool with `--no-comments` and `--no-function-bodies` options:**

```rust
fn add(a: i32, b: i32) -> i32 {}
```

## FAQ

**Q: What types of files does this tool process?**\
A: The tool processes files with the `.rs` extension only. It does not process
files with `.toml`, `.json`, or other extensions.
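
A filter of this kind could look like the following sketch. It is an assumption about how such a check might be written with the standard library, not the tool's own code, and `is_rust_source` is a hypothetical name:

```rust
use std::path::Path;

/// Hypothetical helper: true only for paths whose extension is exactly `rs`.
fn is_rust_source(path: &Path) -> bool {
    path.extension().is_some_and(|ext| ext == "rs")
}
```

With this check, `src/main.rs` is accepted, while `Cargo.toml` and generated files such as `code_context.rs.txt` (whose extension is `txt`) are skipped.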

**Q: Can I run the tool without writing output files?**\
A: Yes, use the `--dry-run` flag to run the tool without writing output files.

**Q: Why do output files have the `.rs.txt` extension? Why not generate `.rs`
files?**\
A: If the tool generated `.rs` files, `rust-analyzer` would report a lot of
compilation errors for them. To avoid this, the tool generates `.rs.txt` files.
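
The naming convention can be illustrated with a short sketch. The README only states that outputs end in `.rs.txt`, so how the tool actually derives output paths is an assumption here, and `output_path` is a hypothetical helper:

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: appends `.txt` to the full path, so that
/// `src/main.rs` becomes `src/main.rs.txt`.
fn output_path(input: &Path) -> PathBuf {
    let mut name = input.as_os_str().to_owned();
    name.push(".txt");
    PathBuf::from(name)
}
```

Because the resulting extension is `txt`, tools that look for `.rs` files would no longer treat the generated file as Rust source.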

## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

## License

[MIT License](./LICENSE)
