https://github.com/nkalupahana/cs8395-code-coverage

# Test Case Generation Benchmark
*Made for Fall 2023 CS 8395-08*

This benchmark evaluates how well different LLMs generate test cases for code coverage. The `functions/` folder contains 100 Python functions, and the LLM is asked to generate test cases for each function such that it achieves 100% code coverage. Many of the functions were generated using `text-davinci-003` (see `generate_functions.py`). This is novel because the LLM is explicitly asked to maximize code coverage, which requires it to understand the branches in the code. This can be a difficult problem when there are multiple stages of branching -- for example, if one set of branches creates an intermediate value, and a later set of branches manipulates that value and returns it, it may be hard for the LLM to work out which inputs would cover the second set of branches.
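
The kind of multi-stage branching described above might look like the hypothetical sketch below (not one of the benchmark functions): the first set of branches produces an intermediate value, and the second set branches on that value, so choosing inputs that cover the second stage requires reasoning backwards through the first.

```python
# Hypothetical illustration of multi-stage branching (not taken from functions/).
def classify(x: int) -> str:
    # First stage of branches: compute an intermediate value.
    if x < 0:
        intermediate = -x
    else:
        intermediate = x * 2

    # Second stage of branches: the outcome depends on the intermediate value,
    # so covering both paths requires choosing x with the first stage in mind
    # (e.g. x = -20 reaches "large", x = 3 reaches "small").
    if intermediate > 10:
        return "large"
    return "small"
```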

## Running the Benchmark
- Delete `tests/`
- Install packages: `pip3 install -r requirements.txt`
- Run `python3 generate_tests.py` to generate tests using the model. Feel free to swap out the LLM used on line 9.
- Evaluate coverage by running `bash check_coverage.bash`. This prints the coverage for each function, as well as the average coverage across all functions. As an added bonus, it also shows what percentage of the generated tests were valid. A rough sketch of how per-function coverage could be checked by hand is shown after this list.
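
As a minimal sketch of what the coverage check does, coverage for a single generated test file could be measured as below. The file names and the use of `pytest`/`pytest-cov` are assumptions for illustration; the repository's `check_coverage.bash` may compute coverage differently.

```python
# Sketch: measure coverage of one hypothetical generated test file by hand.
import subprocess

result = subprocess.run(
    [
        "python3", "-m", "pytest",
        "tests/test_function_1.py",   # hypothetical generated test file
        "--cov=functions",            # measure coverage over the functions/ folder
        "--cov-report=term-missing",  # report which lines the tests miss
    ],
    capture_output=True,
    text=True,
)
print(result.stdout)
```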