{"id":22957802,"url":"https://github.com/parvezmrobin/ml-br-deduplication","last_synced_at":"2025-06-14T09:06:35.438Z","repository":{"id":79507735,"uuid":"538870391","full_name":"parvezmrobin/ml-br-deduplication","owner":"parvezmrobin","description":"Replicated a state-of-the-art duplicate bug report detection technique using MongoDB, Docker, TensorFlow, and Scikit learn","archived":false,"fork":false,"pushed_at":"2022-09-21T00:48:41.000Z","size":801,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-06-14T09:05:45.718Z","etag":null,"topics":["deep-learning","siamese-network"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/parvezmrobin.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-09-20T07:51:28.000Z","updated_at":"2022-11-03T11:37:55.000Z","dependencies_parsed_at":"2023-02-27T19:32:13.465Z","dependency_job_id":null,"html_url":"https://github.com/parvezmrobin/ml-br-deduplication","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/parvezmrobin/ml-br-deduplication","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/parvezmrobin%2Fml-br-deduplication","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/parvezmrobin%2Fml-br-deduplication/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/parvezmrobin%2Fml-br-deduplication/releases","manifests_url":"h
ttps://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/parvezmrobin%2Fml-br-deduplication/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/parvezmrobin","download_url":"https://codeload.github.com/parvezmrobin/ml-br-deduplication/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/parvezmrobin%2Fml-br-deduplication/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":259790456,"owners_count":22911547,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["deep-learning","siamese-network"],"created_at":"2024-12-14T17:20:42.087Z","updated_at":"2025-06-14T09:06:35.404Z","avatar_url":"https://github.com/parvezmrobin.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# An Empirical Study On Duplicate Bug Report Identification Using Siamese Cross-Encoder Network\n\nThis project replicates \"Towards Accurate Duplicate Bug Retrieval Using Deep Learning Techniques\" \nby Jayati Deshmukh, K. M. 
Annervaz, Sanjay Podder, Shubhashis Sengupta, and Neville Dubash\nin the International Conference on Software Maintenance and Evolution (ICSME) 2017.\nWe show that even without handling structured information separately, we can achieve\nperformance comparable to the original work.\n\n## Dataset\nDownload the dataset from [here](http://alazar.people.ysu.edu/msr14data/) and store it in MongoDB.\nIf you are using Docker for MongoDB, you can find the `docker-compose.yaml` file in the root directory.\n\n## Installing Packages\nWe strongly encourage using a virtual environment to run the project.\nYou can find the list of necessary packages in the `requirements.txt` file in the root directory.\nInstall them by running\n```shell\npip install -r requirements.txt\n```\n\n## Run\nStart a Jupyter server by running\n```shell\njupyter notebook\n```\nThen open `notebooks/siamese-trials/title-descr-eclipse.ipynb` in the Jupyter app.\nTo see results for different datasets, change the third line in the second cell accordingly.\n\n## Result\nCheck `project-report.pdf` for a detailed analysis of the evaluation and findings.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fparvezmrobin%2Fml-br-deduplication","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fparvezmrobin%2Fml-br-deduplication","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fparvezmrobin%2Fml-br-deduplication/lists"}