{"id":20450873,"url":"https://github.com/danhper/bigcode-tools","last_synced_at":"2025-04-13T02:33:21.701Z","repository":{"id":41797462,"uuid":"81314646","full_name":"danhper/bigcode-tools","owner":"danhper","description":"Set of tools to help working with \"Big Code\"","archived":false,"fork":false,"pushed_at":"2022-04-28T19:22:11.000Z","size":658,"stargazers_count":43,"open_issues_count":6,"forks_count":16,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-04-07T00:18:02.299Z","etag":null,"topics":["bigcode","machine-learning","parser"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/danhper.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2017-02-08T09:56:12.000Z","updated_at":"2024-03-17T08:39:29.000Z","dependencies_parsed_at":"2022-08-11T17:50:58.164Z","dependency_job_id":null,"html_url":"https://github.com/danhper/bigcode-tools","commit_stats":null,"previous_names":["tuvistavie/bigcode-tools"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/danhper%2Fbigcode-tools","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/danhper%2Fbigcode-tools/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/danhper%2Fbigcode-tools/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/danhper%2Fbigcode-tools/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/danhper","download_url":"https://codeload.github.com/danhper/bigcode-tools/tar.gz/refs/heads/master","host":{"na
me":"GitHub","url":"https://github.com","kind":"github","repositories_count":247569129,"owners_count":20959761,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bigcode","machine-learning","parser"],"created_at":"2024-11-15T10:55:28.822Z","updated_at":"2025-04-13T02:33:21.679Z","avatar_url":"https://github.com/danhper.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# bigcode-tools\n\n[![CircleCI](https://circleci.com/gh/danhper/bigcode-tools.svg?style=svg\u0026circle-token=2508e8ffaf677893dda1ba0bc670bbd06ce137c5)](https://circleci.com/gh/danhper/bigcode-tools)\n\nA set of tools to help work with [\"Big Code\"][1].\n\nThis repository contains multiple tools to fetch source code,\ntransform source code into ASTs, visualize the generated ASTs, or\nlearn embeddings for AST nodes.\n\nThe repository is currently composed of the following subprojects:\n\n* [bigcode-fetcher](./bigcode-fetcher): Search and fetch source code\n* [bigcode-astgen](./bigcode-astgen): Transform source code into JSON ASTs\n* [bigcode-ast-tools](./bigcode-ast-tools): Toolset to work with JSON ASTs\n* [bigcode-embeddings](./bigcode-embeddings): Generate [token embeddings][2] from ASTs\n\nTake a look at [the tutorial][3] to get started.\n\nIf you use this for academic work, we would be thankful if you could cite the following paper.\n\n```\n@inproceedings{Perez:2019:CCD:3341883.3341965,\n author = {Perez, Daniel and Chiba, Shigeru},\n title = {Cross-language Clone Detection by Learning over Abstract Syntax Trees},\n booktitle = {Proceedings of the 16th International 
Conference on Mining Software Repositories},\n series = {MSR '19},\n year = {2019},\n location = {Montreal, Quebec, Canada},\n pages = {518--528},\n numpages = {11},\n url = {https://doi.org/10.1109/MSR.2019.00078},\n doi = {10.1109/MSR.2019.00078},\n acmid = {3341965},\n publisher = {IEEE Press},\n address = {Piscataway, NJ, USA},\n keywords = {clone detection, machine learning, source code representation},\n}\n```\n\n[1]: http://learnbigcode.github.io/\n[2]: https://en.wikipedia.org/wiki/Word_embedding\n[3]: ./doc/tutorial.md\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdanhper%2Fbigcode-tools","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdanhper%2Fbigcode-tools","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdanhper%2Fbigcode-tools/lists"}