https://github.com/aboutcode-org/matchcode-tests
- Host: GitHub
- URL: https://github.com/aboutcode-org/matchcode-tests
- Owner: aboutcode-org
- Created: 2024-11-08T00:09:51.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2025-05-19T04:44:46.000Z (8 months ago)
- Last Synced: 2025-09-17T12:00:36.959Z (4 months ago)
- Language: JavaScript
- Size: 27.8 MB
- Stars: 0
- Watchers: 5
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.rst
- Changelog: CHANGELOG.rst
- Code of conduct: CODE_OF_CONDUCT.rst
- Authors: AUTHORS.rst
=========================================
MatchCode approximate code search tests
=========================================
This repository is a test suite for approximate code search, including search for AI-generated code.
- Homepage: https://github.com/aboutcode-org/matchcode-tests/
- Related repos:

  - https://github.com/aboutcode-org/purldb
  - https://github.com/aboutcode-org/matchcode-toolkit
  - https://github.com/aboutcode-org/ai-gen-code-search

Usage
=====
- Clone this repository.
- In the clone, run ``make dev``.
- Run ``. venv/bin/activate``.
- Run the full test suite with::

      pytest -vvs tests

This is designed to run only on Linux.
Tests
=====
``test_matchcode.py`` uses the dataset "Analyzing the Dependability of Large
Language Models for Code Clone Generation"
(https://zenodo.org/records/11398703). This dataset contains solutions to
LeetCode problems that were generated by AI from an original solution.
The tests in ``test_matchcode.py`` compare each original solution with its
AI-generated variants by comparing Hamming distances and the ngrams detected
in the different solutions.
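
The two comparisons named above can be illustrated with a minimal sketch. It is not the code in ``test_matchcode.py`` and does not use the matchcode-toolkit API. It assumes each solution has already been reduced to an integer fingerprint and to a list of tokens; the names ``hamming_distance``, ``token_ngrams`` and ``shared_ngram_ratio`` are hypothetical and exist only for this example.

```python
from typing import List, Set, Tuple


def hamming_distance(fingerprint_a: int, fingerprint_b: int) -> int:
    """Count the bit positions at which two integer fingerprints differ."""
    # XOR leaves a 1 bit wherever the two fingerprints disagree.
    return bin(fingerprint_a ^ fingerprint_b).count("1")


def token_ngrams(tokens: List[str], n: int = 3) -> Set[Tuple[str, ...]]:
    """Return the set of contiguous n-token windows of a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def shared_ngram_ratio(original: List[str], variant: List[str], n: int = 3) -> float:
    """Return the fraction of the original's ngrams also found in the variant."""
    original_ngrams = token_ngrams(original, n)
    if not original_ngrams:
        # Fewer than n tokens: there is nothing to compare.
        return 0.0
    shared = original_ngrams & token_ngrams(variant, n)
    return len(shared) / len(original_ngrams)
```

Under these assumptions, a small Hamming distance and a high shared ngram ratio would both suggest that a variant is close to its original. Any thresholds would have to come from the tests themselves; none are stated here.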
License
=======
- The data is under the CC-BY-4.0 license.
- The code is under the Apache-2.0 license.