{"id":20785870,"url":"https://github.com/boostibot/bachelors","last_synced_at":"2025-10-26T15:13:19.034Z","repository":{"id":209663197,"uuid":"713314548","full_name":"Boostibot/bachelors","owner":"Boostibot","description":"My bachelors thesis at CTU in Prague, Faculty of Nuclear Sciences and Physical Engineering supervised by Ing. Pavel Strachota, Ph.D","archived":false,"fork":false,"pushed_at":"2025-01-05T23:52:04.000Z","size":41758,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-03-11T23:34:08.558Z","etag":null,"topics":["crystal-growth","cuda","finite-volume-method","parallel-programming","phase-field-method"],"latest_commit_sha":null,"homepage":"","language":"TeX","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Boostibot.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-11-02T09:18:35.000Z","updated_at":"2025-01-16T11:18:31.000Z","dependencies_parsed_at":"2023-12-17T23:30:47.847Z","dependency_job_id":"8b02189d-4671-4a03-ab69-90d13917b67c","html_url":"https://github.com/Boostibot/bachelors","commit_stats":null,"previous_names":["boostibot/bachelors"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/Boostibot/bachelors","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Boostibot%2Fbachelors","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Boostibot%2Fbachelors/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Boostibot%2Fbachelors/releases",
"manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Boostibot%2Fbachelors/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Boostibot","download_url":"https://codeload.github.com/Boostibot/bachelors/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Boostibot%2Fbachelors/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":279012192,"owners_count":26085079,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-12T02:00:06.719Z","response_time":53,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["crystal-growth","cuda","finite-volume-method","parallel-programming","phase-field-method"],"created_at":"2024-11-17T14:48:29.710Z","updated_at":"2025-10-12T17:44:11.054Z","avatar_url":"https://github.com/Boostibot.png","language":"TeX","funding_links":[],"categories":[],"sub_categories":[],"readme":"The source code to my bachelors thesis at CTU in Prague, Faculty of Nuclear Sciences and Physical Engineering supervised by Ing. Pavel Strachota, Ph.D. For the details/documentation etc. refer to the full text in [Bachelors.pdf](./Bachelors.pdf) or the latex source.\n\n# Abstract\nThis work is concerned with GPU parallel implementation of numerical schemes of the\ntwo dimensional phase field model, describing crystal growth in undercooled media. 
Firstly, the\nphase field model is introduced and the finite volume method is utilized to derive a semi-discrete\nscheme for admissible meshes. This scheme is numerically integrated using higher order explicit\nmethods. Then, a semi-implicit time integration scheme is derived using the Crank-Nicolson\nmethod and solved using the conjugate gradient method. Two approaches to reduce the error\nintroduced by the operator splitting method are presented and later compared. Programming\nwith CUDA is thoroughly introduced and several optimized algorithms required by the simulation\nimplementation are explained. The efficiency of one of the described algorithms is shown in a\nbenchmark. Finally, simulation results of the proposed time integration schemes are compared\nand good agreement with previous results is shown.\n\n# Gallery \n\n| Crystal with 6-fold anisotropy | Crystal with 8-fold anisotropy |\n|--------------------------------|--------------------------------|\n| ![Crystal with 6-fold anisotropy](text/Bachelors/results/show_low_xi_U_20.png \"6-fold anisotropy\") | ![Crystal with 8-fold anisotropy](text/Bachelors/results/show_low_xi_anisofold_8_U_30.png \"8-fold anisotropy\") |\n\nThe simulated crystal structures. Images show the crystal outline in white and temperature field in red-blue gradient. The solid crystal is the hottest and the surrounding undercooled (under freezing temperature) liquid the coldest.  
\n\n| Comparison of various different time integration schemes | Benchmark comparison vs reference CPU implementation showing up to 20x time speedup on laptop GPU |\n|--------------------------------|--------------------------------|\n| ![Comparison of various different time integration schemes](text/Bachelors/results/model_comp.png \"time integration schemes\") | ![Benchmark comparison vs reference CPU implementation showing up to 20x time speedup on laptop GPU](text/Bachelors/results/comp_time_consumer.png \"time integration schemes\") |\n\n# Conclusion\n\nThis work presents the derivation of several numerical schemes for solving the two dimensional\nphase field problem with a simple anisotropy, together with its GPU parallel implementation\nshowing good performance on consumer hardware. The finite volume method notation is\nintroduced and utilized to derive approximations for the Laplacian and gradient differential\noperators on admissible meshes. Different boundary conditions are described within the finite volume\nframework using ghost cells. A semi-discrete scheme of the phase field model is derived. Then\nexplicit time integration schemes such as the explicit Euler and Runge-Kutta-Merson methods are\ndiscussed and a time integration algorithm is presented. A semi-implicit time integration scheme\nutilizing the Crank-Nicolson method is derived. The solution of the resulting system of equations is\ndiscussed, and the operator splitting method is utilized to aid numerical matrix solvers, enabling\nthe conjugate gradient method to be used. The internal error introduced by the operator splitting\nmethod is quantified and two techniques are provided for its reduction. The first is the repeated\niteration technique, which has been shown to reliably reduce the operator splitting error. 
The\nsecond is the correction term technique, which has failed to reduce the operator splitting\nerror, but produces similar crystal structures to the repeated iteration technique at no additional\nruntime cost.\n\nNext, a detailed introduction to the CUDA hardware and programming model is given. Even\nthough the text starts with simple examples, it quickly reaches non-trivial optimized implementations\nof the parallel for, parallel tiled for and parallel reduction algorithms. Special focus is put\non shared memory in relation to the CUDA programming model and optimization of memory\nintensive kernels. A state-of-the-art parallel reduction kernel is presented, utilizing warp-level\nparallelism. Benchmarks of the presented algorithms are performed, showcasing superior performance\non small and large datasets compared to the CUDA Thrust library.\n\nFinally, simulation results of the proposed time integration schemes are shown. A discussion\nof boundary conditions and their impact on the simulation is given. Integration schemes are\ncompared, showcasing consistency between the different techniques. The runtime performance\nof the developed simulation code is compared against a reference implementation. Speedups\nupwards of 20 times can be observed on both consumer hardware and specialized HPC hardware.\nThe developed simulation code is freely available at https://github.com/Boostibot/bachelors.\n\nFurther work is needed to extend the proposed algorithms to three dimensions and efficiently\ndistribute the simulation workload in multi-GPU setups, enabling simulation on high resolution\nthree dimensional meshes. 
The parallel algorithms developed in this work can be used with\nadvantage to solve more complex models including, for example, phase transitions in alloys,\nsolidification subject to fluid flow, or freezing and thawing in porous media.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fboostibot%2Fbachelors","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fboostibot%2Fbachelors","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fboostibot%2Fbachelors/lists"}