{"id":13412251,"url":"https://github.com/vpj/python_autocomplete","last_synced_at":"2025-04-04T14:06:26.819Z","repository":{"id":66813169,"uuid":"195625906","full_name":"vpj/python_autocomplete","owner":"vpj","description":"A simple neural network for python autocompletion","archived":false,"fork":false,"pushed_at":"2020-08-09T08:36:44.000Z","size":80764,"stargazers_count":823,"open_issues_count":1,"forks_count":128,"subscribers_count":44,"default_branch":"master","last_synced_at":"2025-03-28T13:07:07.203Z","etag":null,"topics":["autocomplete","lstm","machine-learning","python"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/vpj.png","metadata":{"files":{"readme":"readme.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2019-07-07T08:06:34.000Z","updated_at":"2025-03-21T16:08:36.000Z","dependencies_parsed_at":"2023-02-27T19:46:10.637Z","dependency_job_id":null,"html_url":"https://github.com/vpj/python_autocomplete","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vpj%2Fpython_autocomplete","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vpj%2Fpython_autocomplete/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vpj%2Fpython_autocomplete/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vpj%2Fpython_autocomplete/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/vpj","download_url":"https://codeload.github.com/vpj/python_aut
ocomplete/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247190249,"owners_count":20898702,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["autocomplete","lstm","machine-learning","python"],"created_at":"2024-07-30T20:01:22.633Z","updated_at":"2025-04-04T14:06:26.793Z","avatar_url":"https://github.com/vpj.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"### ⭐️ We rewrote a simpler version of this at [lab-ml/source_code_modelling](https://github.com/lab-ml/source_code_modelling) and we intend to maintain it for a while\n\n[This](https://github.com/vpj/python_autocomplete) is a toy project we started\nto see how well a simple LSTM model can autocomplete Python code.\n\nIt gives quite decent results, saving more than 30% of keystrokes in most files\nand close to 50% in some.\nWe calculated keystrokes saved by making a single (best)\nprediction and selecting it with a single key.\n\nWe do a beam search to find predictions, up to ~10 characters ahead.\nIf you are wondering about editor integration, it is too inefficient for that so far.\n\nWe train and predict after removing comments, strings\nand blank lines from Python code.\nThe model is trained after tokenizing Python code.\nIt seems more efficient than character-level prediction with byte-pair encoding.\n\nA saved model is included in this repo.\nIt is trained on [tensorflow/models](https://github.com/tensorflow/models).\n\nHere's a sample evaluation on a source file from the validation set.\nRed characters mark where an 
auto-completion started;\ni.e. the user presses TAB to select the completion.\nThe green character and the following characters highlighted in gray\nare auto-completed. As you can see, it starts and ends completions arbitrarily.\nThat is, a suggestion could be 'tensorfl' and not the complete identifier\n'tensorflow', which can be a little annoying in real usage.\nWe can fix that by limiting completions to end at token boundaries.\nYou can also see that it completes across operators.\nIncreasing the length of the beam search will let it complete longer pieces of code.\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"/python-autocomplete.png?raw=true\" width=\"100%\" title=\"Screenshot\"\u003e\n\u003c/p\u003e\n\n## Try it yourself\n\n1. Clone this repo\n\n2. Install requirements from `requirements.txt`\n\n3. Copy data to `./data/source`\n\n4. Run `extract_code.py` to collect all Python files, encode and merge them into `all.py`\n\n5. Run `evaluate.py` to evaluate the model. I have included a checkpoint in the repo.\n\n6. Run `train.py` to train the model\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvpj%2Fpython_autocomplete","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fvpj%2Fpython_autocomplete","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvpj%2Fpython_autocomplete/lists"}