{"id":20701151,"url":"https://github.com/casperboone/dltpy","last_synced_at":"2025-04-22T22:44:31.078Z","repository":{"id":48661763,"uuid":"209561837","full_name":"casperboone/dltpy","owner":"casperboone","description":" 🐍 Deep Learning Type Inference of Python Function Signatures using their  Natural Language Context","archived":false,"fork":false,"pushed_at":"2024-05-03T19:44:42.000Z","size":28489,"stargazers_count":14,"open_issues_count":5,"forks_count":6,"subscribers_count":5,"default_branch":"master","last_synced_at":"2024-05-03T21:00:17.490Z","etag":null,"topics":["deep-learning","python","typeinference"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/casperboone.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-09-19T13:30:46.000Z","updated_at":"2024-05-03T21:00:17.491Z","dependencies_parsed_at":"2023-01-18T20:45:27.623Z","dependency_job_id":null,"html_url":"https://github.com/casperboone/dltpy","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/casperboone%2Fdltpy","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/casperboone%2Fdltpy/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/casperboone%2Fdltpy/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/casperboone%2Fdltpy/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/casperboone","download_url":"https://codeload.github.com/casperboone/dltpy/tar.gz/refs/hea
ds/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":224986414,"owners_count":17402937,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["deep-learning","python","typeinference"],"created_at":"2024-11-17T00:39:01.080Z","updated_at":"2024-11-17T00:39:01.691Z","avatar_url":"https://github.com/casperboone.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# DLTPy\nDeep Learning Type Inference of Python Function Signatures using their Natural Language Context\n\nDLTPy makes type predictions based on comments, on the semantic elements of the function name and argument names,\nand on the semantic elements of identifiers in the return expressions.  Using the natural language of these \ndifferent elements, we have trained a classifier that predicts types. 
We use a recurrent neural network (RNN)\nwith a Long Short-Term Memory (LSTM) architecture.\n\n_Read our [paper](https://arxiv.org/abs/1912.00680) for the full details._\n\n## Components\n\n![DLTPy flow](https://user-images.githubusercontent.com/15815208/67791371-98049480-fa77-11e9-95ed-bb94e7b06eeb.png)\n\n### `preprocessing/` Preprocessing Pipeline (a-d)\nDownloads projects, extracts comments and types, and produces one csv file per project containing all of its functions.\n\nStart using:\n``` bash\n$ python preprocessing/pipeline.py\n```\nOptional arguments:\n```\n  -h, --help            show this help message and exit\n  --projects_file PROJECTS_FILE\n                        json file containing GitHub projects\n  --limit LIMIT         limit the number of projects for which the pipeline\n                        should run\n  --jobs JOBS           number of jobs to use for pipeline.\n  --output_dir OUTPUT_DIR\n                        output dir for the pipeline\n  --start START         start position within projects list\n```\n\n### `input-preparation/` Input Preparation (e-f)\n`input-preparation/generate_df.py` can be used to combine all the separate csv files per project into one big file\nwhile applying filtering.\n\n`input-preparation/df_to_vec.py` can be used to convert this generated csv to vectors.\n\n`input-preparation/embedder.py` can be used to train word embeddings for `input-preparation/df_to_vec.py`.\n\n### `learning/` Learning (g)\nThe different RNN models we evaluated can be found in `learning/learn.py`.\n\n## Testing\n``` bash\n$ pytest\n```\n\n## Credits\n- [Casper Boone](https://github.com/casperboone)\n- [Niels de Bruin](https://github.com/nielsdebruin)\n- [Arjan Langerak](https://github.com/alangerak)\n- [Fabian Stelmach](https://github.com/fabianstelmach)\n- [All contributors](../../contributors)\n\n## License\nThe MIT License (MIT). 
Please see the [license file](LICENSE) for more information.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcasperboone%2Fdltpy","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcasperboone%2Fdltpy","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcasperboone%2Fdltpy/lists"}