{"id":18609542,"url":"https://github.com/secure-software-engineering/typeevalpy","last_synced_at":"2025-04-10T22:31:27.067Z","repository":{"id":202740391,"uuid":"654148096","full_name":"secure-software-engineering/TypeEvalPy","owner":"secure-software-engineering","description":"A Micro-benchmarking Framework for Python Type Inference Tools","archived":false,"fork":false,"pushed_at":"2025-03-04T12:33:24.000Z","size":30695,"stargazers_count":33,"open_issues_count":2,"forks_count":2,"subscribers_count":5,"default_branch":"main","last_synced_at":"2025-03-25T05:41:37.570Z","etag":null,"topics":["benchmark","python","staticanalysis","typeinference"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/secure-software-engineering.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-06-15T13:39:06.000Z","updated_at":"2025-03-23T06:36:56.000Z","dependencies_parsed_at":null,"dependency_job_id":"1686d23f-4c91-4ff2-8598-7472c09b5093","html_url":"https://github.com/secure-software-engineering/TypeEvalPy","commit_stats":null,"previous_names":["ashwinprasadme/typeevalpy","secure-software-engineering/typeevalpy"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/secure-software-engineering%2FTypeEvalPy","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/secure-software-engineering%2FTypeEvalPy/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/secure-software-engineer
ing%2FTypeEvalPy/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/secure-software-engineering%2FTypeEvalPy/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/secure-software-engineering","download_url":"https://codeload.github.com/secure-software-engineering/TypeEvalPy/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248309609,"owners_count":21082247,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["benchmark","python","staticanalysis","typeinference"],"created_at":"2024-11-07T03:06:23.614Z","updated_at":"2025-04-10T22:31:22.054Z","avatar_url":"https://github.com/secure-software-engineering.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cp align=\"center\"\u003e\n\u003cimg src=\"TypeEvalPy.jpg\" width=\"75%\" align=\"center\"\u003e\n\u003cbr\u003e\n\n\u003ch3 align=\"center\"\u003e A Micro-benchmarking Framework for Python Type Inference Tools \u003c/h3\u003e\n\u003c/p\u003e\n\n## 📌 **Features**:\n\n- 📜 Contains **154 code snippets** to test and benchmark.\n- 🏷 Offers **845 type annotations** across a diverse set of Python functionalities.\n- 📂 Organized into **18 distinct categories** targeting various Python features.\n- 🚢 Seamlessly manages the execution of **containerized tools**.\n- 🔄 Efficiently transforms inferred types into a **standardized format**.\n- 📊 Automatically produces **meaningful metrics** for in-depth assessment and comparison.\n\n### [New] TypeEvalPy Autogen\n\n- 🤖 **Autogenerates 
code snippets** and ground truth to scale the benchmark based on the original `TypeEvalPy` benchmark.\n- 📈 The autogen benchmark now contains:\n  - **Python files**: 7121\n  - **Type annotations**: 78373\n\n## 🛠️ Supported Tools\n\n| Supported :white_check_mark:                                          | In-progress :wrench:                                                 | Planned :bulb:                                        |\n| --------------------------------------------------------------------- | -------------------------------------------------------------------- | ----------------------------------------------------- |\n| [HeaderGen](https://github.com/secure-software-engineering/HeaderGen) | [Intellij PSI](https://plugins.jetbrains.com/docs/intellij/psi.html) | [MonkeyType](https://github.com/Instagram/MonkeyType) |\n| [Jedi](https://github.com/davidhalter/jedi)                           | [Pyre](https://github.com/facebook/pyre-check)                       | [Pyannotate](https://github.com/dropbox/pyannotate)   |\n| [Pyright](https://github.com/microsoft/pyright)                       | [PySonar2](https://github.com/yinwang0/pysonar2)                     |\n| [HiTyper](https://github.com/JohnnyPeng18/HiTyper)                    | [Pytype](https://github.com/google/pytype)                           |\n| [Scalpel](https://github.com/SMAT-Lab/Scalpel/issues)                 | [TypeT5](https://github.com/utopia-group/TypeT5)                     |\n| [Type4Py](https://github.com/saltudelft/type4py)                      |                                                                      |\n| [GPT](https://openai.com)                                             |                                                                      |\n| [Ollama](https://ollama.ai)                                           |                                                                      |\n\n---\n\n## 🏆 TypeEvalPy Leaderboard\n\nBelow is a comparison showcasing exact 
matches across different tools and LLMs on the Autogen benchmark.\n\n| Rank | 🛠️ Tool                                                                                        | Function Return Type | Function Parameter Type | Local Variable Type | Total |\n| ---- | ---------------------------------------------------------------------------------------------- | -------------------- | ----------------------- | ------------------- | ----- |\n| 1    | **[mistral-large-it-2407-123b](https://huggingface.co/mistralai/Mistral-Large-Instruct-2407)** | 16701                | 728                     | 57550               | 74979 |\n| 2    | **[qwen2-it-72b](https://huggingface.co/Qwen/Qwen2-72B-Instruct)**                             | 16488                | 629                     | 55160               | 72277 |\n| 3    | **[llama3.1-it-70b](https://huggingface.co/meta-llama/Meta-Llama-3.1-70B-Instruct)**           | 16648                | 580                     | 54445               | 71673 |\n| 4    | **[gemma2-it-27b](https://huggingface.co/google/gemma-2-27b-it)**                              | 16342                | 599                     | 49772               | 66713 |\n| 5    | **[codestral-v0.1-22b](https://huggingface.co/mistralai/Codestral-22B-v0.1)**                  | 16456                | 706                     | 49379               | 66541 |\n| 6    | **[codellama-it-34b](https://huggingface.co/meta-llama/CodeLlama-34b-Instruct-hf)**            | 15960                | 473                     | 48957               | 65390 |\n| 7    | **[mistral-nemo-it-2407-12.2b](https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407)**  | 16221                | 526                     | 48439               | 65186 |\n| 8    | **[mistral-v0.3-it-7b](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3)**            | 16686                | 472                     | 47935               | 65093 |\n| 9    | 
**[phi3-medium-it-14b](https://huggingface.co/microsoft/Phi-3-medium-128k-instruct)**          | 16802                | 467                     | 45121               | 62390 |\n| 10   | **[llama3.1-it-8b](https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct)**             | 16125                | 492                     | 44313               | 60930 |\n| 11   | **[codellama-it-13b](https://huggingface.co/meta-llama/CodeLlama-13b-Instruct-hf)**            | 16214                | 479                     | 43021               | 59714 |\n| 12   | **[phi3-small-it-7.3b](https://huggingface.co/microsoft/Phi-3-small-128k-instruct)**           | 16155                | 422                     | 38093               | 54670 |\n| 13   | **[qwen2-it-7b](https://huggingface.co/Qwen/Qwen2-7B-Instruct)**                               | 15684                | 313                     | 38109               | 54106 |\n| 14   | **[HeaderGen](https://github.com/ashwinprasadme/headergen)**                                   | 14086                | 346                     | 36370               | 50802 |\n| 15   | **[phi3-mini-it-3.8b](https://huggingface.co/microsoft/Phi-3-mini-128k-instruct)**             | 15908                | 320                     | 30341               | 46569 |\n| 16   | **[phi3.5-mini-it-3.8b](https://huggingface.co/microsoft/Phi-3.5-mini-instruct)**              | 15763                | 362                     | 28694               | 44819 |\n| 17   | **[codellama-it-7b](https://huggingface.co/meta-llama/CodeLlama-7b-Instruct-hf)**              | 13779                | 318                     | 29346               | 43443 |\n| 18   | **[Jedi](https://github.com/davidhalter/jedi)**                                                | 13160                | 0                       | 15403               | 28563 |\n| 19   | **[Scalpel](https://github.com/SMAT-Lab/Scalpel/issues)**                                      | 15383                | 171                 
    | 18                  | 15572 |\n| 20   | **[gemma2-it-9b](https://huggingface.co/google/gemma-2-9b-it)**                                | 1611                 | 66                      | 5464                | 7141  |\n| 21   | **[Type4Py](https://github.com/saltudelft/type4py)**                                           | 3143                 | 38                      | 2243                | 5424  |\n| 22   | **[tinyllama-1.1b](https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0)**                | 1514                 | 28                      | 2699                | 4241  |\n| 23   | **[mixtral-v0.1-it-8x7b](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1)**        | 3235                 | 33                      | 377                 | 3645  |\n| 24   | **[phi3.5-moe-it-41.9b](https://huggingface.co/microsoft/Phi-3.5-MoE-instruct)**               | 3090                 | 25                      | 273                 | 3388  |\n| 25   | **[gemma2-it-2b](https://huggingface.co/google/gemma-2-2b-it)**                                | 1497                 | 41                      | 1848                | 3386  |\n\n_\u003csub\u003e(Auto-generated based on the analysis run on 30 Aug 2024)\u003c/sub\u003e_\n\n---\n\n## :whale: Running with Docker\n\n### 1️⃣ Clone the repo\n\n```bash\ngit clone https://github.com/secure-software-engineering/TypeEvalPy.git\n```\n\n### 2️⃣ Build Docker image\n\n```bash\ndocker build -t typeevalpy .\n```\n\n### 3️⃣ Run TypeEvalPy\n\n🕒 Takes about 30 minutes on the first run to build Docker containers.\n\n📂 Results will be generated in the `results` folder within the root directory of the repository.\nEach results folder will have a timestamp, allowing you to easily track and compare different runs.\n\n\u003cdetails\u003e\n  \u003csummary\u003e\u003cb\u003eCorrelation of CSV Files Generated to Tables in ICSE Paper\u003c/b\u003e\u003c/summary\u003e\nHere is how the auto-generated CSV tables relate to the paper's 
tables:\n\n- **Table 1** in the paper is derived from three auto-generated CSV tables:\n\n  - `paper_table_1.csv` - details Exact matches by type category.\n  - `paper_table_2.csv` - lists Exact matches for 18 micro-benchmark categories.\n  - `paper_table_3.csv` - provides Sound and Complete values for tools.\n\n- **Table 2** in the paper is based on the following CSV table:\n  - `paper_table_5.csv` - shows Exact matches with top_n values for machine learning tools.\n\nAdditionally, there are CSV tables that are _not_ included in the paper:\n\n- `paper_table_4.csv` - contains Sound and Complete values for 18 micro-benchmark categories.\n- `paper_table_6.csv` - contains the Sensitivity analysis.\n\u003c/details\u003e\n\n```bash\ndocker run \\\n      -v /var/run/docker.sock:/var/run/docker.sock \\\n      -v ./results:/app/results \\\n      typeevalpy\n```\n\n🔧 **Optionally**, run analysis on specific tools:\n\n```bash\ndocker run \\\n      -v /var/run/docker.sock:/var/run/docker.sock \\\n      -v ./results:/app/results \\\n      typeevalpy --runners headergen scalpel\n```\n\n📊 Run analysis on custom benchmarks:\n\nFor example, to run HeaderGen on the autogen benchmark:\n\n```bash\ndocker run \\\n      -v /var/run/docker.sock:/var/run/docker.sock \\\n      -v ./results:/app/results \\\n      typeevalpy \\\n      --runners headergen \\\n      --custom_benchmark_dir /app/autogen_typeevalpy_benchmark\n```\n\n🛠️ Available options: `headergen`, `pyright`, `scalpel`, `jedi`, `hityper`, `type4py`, `hityperdl`\n\n### 🤖 Running TypeEvalPy with LLMs\n\nTypeEvalPy integrates with LLMs through Ollama, streamlining their management. Begin by setting up your environment:\n\n- Create Configuration File: Copy `config_template.yaml` from the `src` directory and rename it to `config.yaml`.\n\nIn `config.yaml`, configure the following:\n\n- `openai_key`: your key for accessing OpenAI's models.\n- `ollama_url`: the URL for your Ollama instance. 
For simplicity, we recommend deploying Ollama using their Docker container. [Get started with Ollama here](https://hub.docker.com/r/ollama/ollama).\n- `prompt_id`: set this to `questions_based_2` for optimal performance, based on our tests.\n- `ollama_models`: select a list of model tags from the [Ollama library](https://ollama.com/library). For smoother operation, ensure each model is pre-downloaded with the `ollama pull` command.\n\nWith the `config.yaml` configured, run the following command:\n\n```bash\ndocker run \\\n      -v /var/run/docker.sock:/var/run/docker.sock \\\n      -v ./results:/app/results \\\n      typeevalpy --runners ollama\n```\n\n---\n\n\u003cdetails\u003e\n  \u003csummary\u003e\u003cb\u003eRunning From Source...\u003c/b\u003e\u003c/summary\u003e\n\n## 1. 📥 Installation\n\n1.  **Clone the repo**\n\n    ```bash\n    git clone https://github.com/secure-software-engineering/TypeEvalPy.git\n    ```\n\n2.  **Install Dependencies and Set Up Virtual Environment**\n\n    Run the following commands to create and activate a virtual environment, then install the dependencies.\n\n    ```bash\n    python3 -m venv .env\n    ```\n\n    ```bash\n    source .env/bin/activate\n    ```\n\n    ```bash\n    pip install -r requirements.txt\n    ```\n\n---\n\n## 2. 🚀 Usage: Running the Analysis\n\n1.  **Navigate to the `src` Directory**\n\n    ```bash\n    cd src\n    ```\n\n2.  **Execute the Analyzer**\n\n    Run the following command to start the benchmarking process on all tools:\n\n    ```bash\n    python main_runner.py\n    ```\n\n    Alternatively, run the analysis on specific tools:\n\n    ```bash\n    python main_runner.py --runners headergen scalpel\n    ```\n\n\u003c/details\u003e\n\n---\n\n## Running TypeEvalPy Autogen\n\nTo generate an extended version of the original TypeEvalPy benchmark that includes many more Python types, run the following commands:\n\n1.  **Navigate to the `autogen` Directory**\n\n    ```bash\n    cd autogen\n    ```\n\n2.  
**Execute the Generation Script**\n\n    Run the following command to start the generation process:\n\n    ```bash\n    python generate_typeevalpy_dataset.py\n    ```\n\nThis will generate a folder in the repo root that contains the autogen benchmark and is labeled with the current date.\n\n---\n\n### 🤝 Contributing\n\nThank you for your interest in contributing! To add support for a new tool, please use the Docker templates provided in our repository. After implementing and testing your tool, please submit a pull request (PR) with a descriptive message. Our maintainers will review your submission and merge it.\n\nTo get started with integrating your tool, please follow the guide here: [docs/Tool_Integration_Guide.md](docs/Tool_Integration_Guide.md)\n\n---\n\n### ⭐️ Show Your Support\n\nGive a ⭐️ if this project helped you!\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsecure-software-engineering%2Ftypeevalpy","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsecure-software-engineering%2Ftypeevalpy","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsecure-software-engineering%2Ftypeevalpy/lists"}