{"id":27298641,"url":"https://github.com/mir-am/phd-thesis","last_synced_at":"2026-01-20T08:33:38.890Z","repository":{"id":276477849,"uuid":"929398012","full_name":"mir-am/PhD-thesis","owner":"mir-am","description":"My PhD Thesis: Machine Learning-assisted Software Analysis","archived":false,"fork":false,"pushed_at":"2025-02-09T15:01:17.000Z","size":28908,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-04-19T07:41:11.721Z","etag":null,"topics":["ai","machine-learning","ml","phd","software-analysis","software-engineering","thesis","tudelft"],"latest_commit_sha":null,"homepage":"","language":"TeX","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mir-am.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-02-08T13:07:40.000Z","updated_at":"2025-02-09T15:03:44.000Z","dependencies_parsed_at":"2025-02-08T14:36:55.416Z","dependency_job_id":null,"html_url":"https://github.com/mir-am/PhD-thesis","commit_stats":null,"previous_names":["mir-am/phd-thesis"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/mir-am/PhD-thesis","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mir-am%2FPhD-thesis","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mir-am%2FPhD-thesis/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mir-am%2FPhD-thesis/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mir-am%2FPhD-thesis/manifests","owner_url":"https:
//repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mir-am","download_url":"https://codeload.github.com/mir-am/PhD-thesis/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mir-am%2FPhD-thesis/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28599042,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-20T02:08:49.799Z","status":"ssl_error","status_checked_at":"2026-01-20T02:08:44.148Z","response_time":117,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","machine-learning","ml","phd","software-analysis","software-engineering","thesis","tudelft"],"created_at":"2025-04-12T00:37:22.757Z","updated_at":"2026-01-20T08:33:38.875Z","avatar_url":"https://github.com/mir-am.png","language":"TeX","readme":"# Machine Learning-assisted Software Analysis\nThis repository contains my PhD thesis for receiving the degree of Doctor of Philosophy at Delft University of Technology. 
I defended my PhD thesis on Thursday, February 6th, 2025, at 3 PM in the Aula, TU Delft.\n\n\u003cp\u003e\n  \u003ca href=\"./PhD_Thesis_Mir.pdf\" style=\"display: inline-flex; align-items: center; text-decoration: none; margin-right: 1.5em;\"\u003e\n    \u003cimg src=\"https://img.icons8.com/color/48/000000/pdf.png\" alt=\"PDF icon\" width=\"28\" style=\"vertical-align: middle;\"\u003e\n    \u003cspan style=\"margin-left: 8px; vertical-align: middle; color:rgb(255, 255, 255);\"\u003e\u003cstrong\u003ePDF File\u003c/strong\u003e\u003c/span\u003e\n  \u003c/a\u003e\n  \u003cbr\u003e\n  \u003ca href=\"./src/\" style=\"display: inline-flex; align-items: center; text-decoration: none;\"\u003e\n    \u003cimg src=\"https://img.icons8.com/color/48/000000/source-code.png\" alt=\"Source icon\" width=\"28\" style=\"vertical-align: middle;\"\u003e\n    \u003cspan style=\"margin-left: 8px; vertical-align: middle; color:rgb(255, 255, 255);\"\u003e\u003cstrong\u003eSource Files\u003c/strong\u003e\u003c/span\u003e\n  \u003c/a\u003e\n\u003c/p\u003e\n\nThe thesis is also archived in the [TU Delft repository](https://repository.tudelft.nl/record/uuid:2d59214f-2d2f-48f0-ae10-003fd3b83e61).\n\n- [Summary](#summary)\n- [Cover](#cover)\n- [Doctoral Committee](#doctoral-committee)\n- [Funding](#funding)\n- [License](#license)\n- [ISBN](#isbn)\n\n# Summary\n\nSoftware engineering, fundamental to modern technological advancement, profoundly\ninfluences various aspects of society by enhancing efficiency, accessibility, and security.\nThis discipline involves systematically applying engineering principles to software systems’\ndesign, development, testing, and maintenance. Innovations in software engineering\nhave revolutionized industries such as communication, finance, healthcare, and education,\ndemocratizing access to information and connecting global communities. 
As software\nsystems become increasingly complex, the need for efficient, secure, and reliable software\nanalysis tools becomes paramount.\n\nThe thesis focuses on improving the actionability and scalability of software analysis\nby integrating machine learning (ML) techniques. Traditional static analysis tools often\nstruggle with large codebases, leading to high false positive rates and high computational\ncosts. Machine learning, particularly deep learning architectures like Transformers, offers a\npromising solution by capturing long-range dependencies in code and learning hierarchical\nrepresentations. This capability enables ML models to automate tasks such as bug detection,\nsource code summarization, and program repair, providing developers with actionable\ninsights and improving overall productivity and code quality.\n\nA significant contribution of this thesis is the development of ML-based techniques for\ntype inference in Python and call graph pruning. An ML-based type inference approach,\nnamely Type4Py, was proposed, which accurately predicts type annotations for Python\ncode, enhancing code quality and reducing runtime errors. ML models with conservative\npruning strategies were proposed for call graph pruning, which learn from dynamic traces\nobtained by executing programs to identify and eliminate false edges, thereby minimizing\nfalse positives and improving precision. Additionally, the thesis explores the application of\ncall graphs in vulnerability analysis, demonstrating that granular assessments provide more\naccurate and actionable insights than simpler, dependency-level analyses.\n\nIn summary, this thesis advances the field of software analysis by harnessing machine\nlearning to address two important issues related to the actionability and scalability of software\nanalysis tools. 
The proposed ML-driven tools and techniques enhance the precision\nand reliability of software analysis and support developers in maintaining robust, secure,\nand maintainable software systems. These contributions pave the way for future research\nin applying ML techniques to various aspects of software engineering, promising further\nimprovements in software development practices.\n\n**Keywords:** Machine Learning, Software Analysis, Software Engineering\n\n# Cover\n\u003cimg src=\"./src/cover/cover-front.png\" alt=\"Thesis Cover\" width=\"700\" /\u003e\n\nThe cover was generated with OpenAI's DALL·E 3. The prompt used for generating the cover is private.\n\n# Doctoral Committee\nSupervisors:\n- Prof. Arie van Deursen\n- Dr. Sebastian Proksch\n- Dr. Georgios Gousios\n\nIndependent members:\n- Prof. Fernando Kuipers\n- Prof. Michael Pradel\n- Prof. Prem Devanbu\n- Dr. Baishakhi Ray\n- Prof. Andy Zaidman\n\n# Funding\nThis thesis was funded by [the FASTEN project](https://swforum.eu/project-hub/fine-grained-analysis-software-ecosystems-networks), part of the European Union’s Horizon 2020 research and innovation program under grant agreement number 825328.\n\n# License\nThis PhD thesis is licensed under the terms of Attribution-NonCommercial-ShareAlike 4.0 International ([CC BY-NC-SA 4.0](https://creativecommons.org/licenses/by-nc-sa/4.0/)).\n\n# ISBN\n978-94-6518-015-1\n\n---\n\n\u003cimg src=\"./src/title/logos/TUD_logo.png\" alt=\"TUD Logo\" width=\"480\" /\u003e\n\u003cimg src=\"./src/title/logos/eu_h2020_logo_w.png\" alt=\"EU Horizon 2020\" width=\"480\" /\u003e\n\u003cimg src=\"./src/title/logos/fasten_logo.png\" alt=\"FASTEN\" width=\"480\" /\u003e\n\u003cimg src=\"./src/title/logos/IPA.gif\" alt=\"IPA\" width=\"320\" 
/\u003e\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmir-am%2Fphd-thesis","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmir-am%2Fphd-thesis","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmir-am%2Fphd-thesis/lists"}