{"id":13620163,"url":"https://github.com/ghaiklor/llvm-kaleidoscope","last_synced_at":"2025-04-12T17:11:38.163Z","repository":{"id":44383764,"uuid":"82667529","full_name":"ghaiklor/llvm-kaleidoscope","owner":"ghaiklor","description":"LLVM Tutorial: Kaleidoscope (Implementing a Language with LLVM)","archived":false,"fork":false,"pushed_at":"2022-12-29T09:07:31.000Z","size":27,"stargazers_count":235,"open_issues_count":1,"forks_count":50,"subscribers_count":7,"default_branch":"master","last_synced_at":"2024-10-25T04:00:01.235Z","etag":null,"topics":["kaleidoscope","language","lexer","lexical-analysis","llvm","llvm-ir","llvm-tutorial"],"latest_commit_sha":null,"homepage":"","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ghaiklor.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2017-02-21T10:33:35.000Z","updated_at":"2024-09-25T09:24:35.000Z","dependencies_parsed_at":"2023-01-31T08:31:02.061Z","dependency_job_id":null,"html_url":"https://github.com/ghaiklor/llvm-kaleidoscope","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ghaiklor%2Fllvm-kaleidoscope","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ghaiklor%2Fllvm-kaleidoscope/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ghaiklor%2Fllvm-kaleidoscope/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ghaiklor%2Fllvm-kaleidoscope/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ghaiklor","download_url":"https://codeload.gi
thub.com/ghaiklor/llvm-kaleidoscope/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":222422258,"owners_count":16981940,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["kaleidoscope","language","lexer","lexical-analysis","llvm","llvm-ir","llvm-tutorial"],"created_at":"2024-08-01T21:00:52.969Z","updated_at":"2024-10-31T14:03:43.316Z","avatar_url":"https://github.com/ghaiklor.png","language":"C++","funding_links":[],"categories":["C++"],"sub_categories":[],"readme":"# Kaleidoscope: Implementing a Language with LLVM\n\n## How to build it\nOn macOS (tested on Ventura 13.0).\n~~~\n# Install llvm (version 15.0)\nbrew install llvm@15\nmake\n./main\n# This should bring up a simple REPL.\n~~~\n\n## Why?\n\nSelf-education...\n\nI'm interested in LLVM and want to try simple things with it.\nThat's why I've started the official LLVM tutorial, [Kaleidoscope](http://llvm.org/docs/tutorial).\n\n## What's it all about?\n\nThis tutorial runs through the implementation of a simple language, showing how fun and easy it can be.\nThis tutorial will get you up and running as well as help you build a framework you can extend to other languages.\nThe code in this tutorial can also be used as a playground to hack on other LLVM-specific things.\n\nThe goal of this tutorial is to progressively unveil our language, describing how it is built up over time.\nThis will let us cover a fairly broad range of language design and LLVM-specific usage issues, showing and explaining the code for it all along the way, without overwhelming you with tons 
of details up front.\n\nIt is useful to point out ahead of time that this tutorial is really about teaching compiler techniques and LLVM specifically, not about teaching modern and sane software engineering principles.\nIn practice, this means that we’ll take a number of shortcuts to simplify the exposition.\nFor example, the code uses global variables all over the place, doesn’t use nice design patterns like visitors, etc... but it is very simple.\nIf you dig in and use the code as a basis for future projects, fixing these deficiencies shouldn’t be hard.\n\n## How does it all work together?\n\n### Lexer\n\nThe first component is the lexer.\nThe lexer is responsible for reading a stream of characters and translating it into a sequence of tokens.\n\n\u003e A lexer is a software program that performs lexical analysis. Lexical analysis is the process of separating a stream of characters into different words, which in computer science we call 'tokens'.\n\nToken identifiers are defined in `lexer/token.h`, and the lexer is implemented in `lexer/lexer.cpp`.\n\nTokens are just an `enum`, in which each token identifier has a number assigned to it.\nThis way, we can identify tokens during lexical analysis.\n\nThe actual reading of the stream is implemented in `lexer/lexer.cpp`.\nThe `gettok` function reads characters one by one from `stdin` and groups them into tokens.\nSo, basically, `gettok` reads characters and returns numbers (tokens).\n\nLater, we can use these tokens in the parser (syntax analysis).\n\n### AST (Abstract Syntax Tree)\n\nBefore diving into the parser, though, we need to implement the AST nodes that we will use during parsing.\n\nThe base of every AST node is the `ExprAST` node, which is defined in `ast/ExprAST.h`.\nAll other nodes extend `ExprAST`.\n\nEach AST node must implement one method: `codegen()`.\nThe `codegen()` method is responsible for generating LLVM IR using the LLVM IRBuilder API, and that's all.\n\nAs you 
can see in the `ast` folder, we have implemented the following AST nodes, each with the appropriate code generation into LLVM IR:\n\n- Binary Expressions;\n- Call Expressions;\n- Function Expressions;\n- Number Expressions;\n- Prototype Expressions;\n- Variable Expressions.\n\nEach of these nodes has a constructor where all mandatory values are initialized.\nBased on that information, `codegen()` can build LLVM IR using these values.\n\nThe simplest one is the Number Expression.\nIts `codegen()` just calls the appropriate LLVM API to create a floating-point constant:\n\n```c++\nllvm::Value *NumberExprAST::codegen() {\n  return llvm::ConstantFP::get(TheContext, llvm::APFloat(Val));\n}\n```\n\nNow we have two parts of a compiler that we can combine.\n\n### Parser\n\nThe parser is where the lexer and the AST come together.\nThe parser is implemented in `parser/parser.cpp`.\n\nThe parser uses the lexer to get a stream of tokens, which it uses to build an AST from our AST node implementations.\n\nSo, in general, when the parser sees a known token, e.g. a number token, it tries to create the corresponding node, here a `NumberExprAST`.\n\nWhen parsing is done, i.e. the last token has been read from the stream, we have an AST representation of our code.\nWe can then generate LLVM IR from the AST by calling the `codegen()` method of each AST node.\n\nThis is done in `main.cpp`, the place where all the parts are combined.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fghaiklor%2Fllvm-kaleidoscope","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fghaiklor%2Fllvm-kaleidoscope","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fghaiklor%2Fllvm-kaleidoscope/lists"}