https://github.com/laugharne/rust_compiler_deep_dive
In this video, Daniel Cumming, a formal verification engineer at Runtime Verification and Rust instructor at RareSkills, explains how the Rust compiler works under the hood. This talk will explain the Rust compiler pipeline.
- Host: GitHub
- URL: https://github.com/laugharne/rust_compiler_deep_dive
- Owner: Laugharne
- Created: 2025-04-03T14:28:52.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2025-04-08T06:39:51.000Z (about 1 year ago)
- Last Synced: 2025-04-09T19:12:12.914Z (about 1 year ago)
- Topics: ast, cargo, cfg, compiler, compilers, ir, llvm, llvm-ir, mir, rust, rust-lang, rustc, solidity, vyper
- Homepage: https://youtu.be/Ju7v6vgfEt8
- Size: 10.5 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
**How the Rust Compiler Works, a Deep Dive - Video Highlights**
```
This is a transcription of a YouTube video. Among the relevant resources for learning more about the Rust compiler, I found this video particularly interesting.
This content presents the opinions and perspectives of industry experts or other individuals. The opinions expressed in this content do not necessarily reflect my opinion.
Readers are encouraged to verify the information on their own and seek professional advice before making any decisions based on this content.
```
----
In this video, **Daniel Cumming**, a formal verification engineer at **Runtime Verification** and Rust instructor at **RareSkills**, explains how the Rust compiler works under the hood.
This talk covers the **Rust compiler pipeline** and related topics:
- Source code
- Abstract Syntax Tree
- High-level Intermediate Representation (HIR)
- Typed High-level Intermediate Representation (THIR)
- Mid-level Intermediate Representation (MIR)
- LLVM Intermediate Representation
- Codegen Backend
- Adding callbacks to compilation
- Building the Rust Compiler
----
----

# [00:00](https://youtu.be/Ju7v6vgfEt8?t=0) Introduction to Rust and the Rust Compiler
## Overview of the Presentation
- [00:00](https://youtu.be/Ju7v6vgfEt8?t=0) Daniel introduces himself as a formal verification engineer at Runtime Verification, focusing on Rust and its compiler.
- [00:28](https://youtu.be/Ju7v6vgfEt8?t=28) He acknowledges the ambitious goal of covering a complex topic within an hour, likening it to fitting a school bus into a living room.
- [00:52](https://youtu.be/Ju7v6vgfEt8?t=52) Daniel checks if everyone can see his screen share and adjusts his presentation setup for better visibility.
## Structure of the Talk
- [01:15](https://youtu.be/Ju7v6vgfEt8?t=75) The presentation will begin with a brief overview of Rust as a language (3-5 minutes).
- [01:46](https://youtu.be/Ju7v6vgfEt8?t=106) Discussion will then shift to the Rust compiler's role in transforming source code into binaries, emphasizing properties that must be maintained during this process.
- [02:09](https://youtu.be/Ju7v6vgfEt8?t=129) A detailed look inside the Rust compiler is planned for 20-30 minutes, followed by interactive callbacks related to using the compiler.
# [03:02](https://youtu.be/Ju7v6vgfEt8?t=182) What is Rust?
## Key Features of Rust
- [03:02](https://youtu.be/Ju7v6vgfEt8?t=182) The Rust project describes Rust as a language that empowers users to build reliable and efficient software, with explicit control over memory.
- [03:34](https://youtu.be/Ju7v6vgfEt8?t=214) Capable of compiling down small enough for embedded systems without needing a garbage collector, enhancing efficiency.
## Safety Guarantees
- [03:55](https://youtu.be/Ju7v6vgfEt8?t=235) Strong guarantees regarding memory safety and thread safety are provided by its type system, preventing common errors like buffer overflows or dangling pointers.
- [04:32](https://youtu.be/Ju7v6vgfEt8?t=272) Data race freedom is ensured during concurrency; however, logical deadlocks may still occur.
# [05:06](https://youtu.be/Ju7v6vgfEt8?t=306) Memory Management in Rust
## Borrowing and Lifetime Semantics

- [05:06](https://youtu.be/Ju7v6vgfEt8?t=306) The borrowing mechanism ensures shared data cannot be mutated while multiple references exist, maintaining integrity.
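
The borrowing rule above can be illustrated with a minimal sketch. The function names `total_len` and `shout` are hypothetical and were chosen only for this example; the commented-out lines show the kind of program the borrow checker is expected to reject.

```rust
/// Reads through shared (immutable) references; any number may coexist.
fn total_len(a: &str, b: &str) -> usize {
    a.len() + b.len()
}

/// Mutates through an exclusive (mutable) reference.
fn shout(s: &mut String) {
    s.push('!');
}

fn main() {
    let mut greeting = String::from("hello");

    // Two shared borrows at once are fine: neither can mutate.
    let r1 = &greeting;
    let r2 = &greeting;
    println!("{}", total_len(r1, r2));

    // The shared borrows are no longer used, so an exclusive borrow is allowed.
    shout(&mut greeting);
    println!("{greeting}");

    // Uncommenting the next lines is rejected by the borrow checker, because
    // `r3` would still be alive while `greeting` is mutated:
    // let r3 = &greeting;
    // shout(&mut greeting);
    // println!("{r3}");
}
```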
## Use of Unsafe Code

- [05:39](https://youtu.be/Ju7v6vgfEt8?t=339) While memory safety relies on proper usage of safe constructs, misuse of the `unsafe` keyword allows direct manipulation but risks introducing bugs.
## Standard Library Utilization
- [06:36](https://youtu.be/Ju7v6vgfEt8?t=396) The standard library provides safe abstractions over unsafe code (e.g., vectors), allowing developers to manage concurrency safely while leveraging powerful features.
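
As a rough sketch of what "a safe abstraction over unsafe code" means, the function below hides an `unsafe` block behind a safe signature. The name `split_halves` is hypothetical; the standard library already offers `split_at_mut`, and this example only mimics the pattern rather than reproducing the library's implementation.

```rust
/// Splits a slice into two non-overlapping mutable halves.
/// Callers only see a safe signature; the `unsafe` block stays internal.
fn split_halves(data: &mut [i32]) -> (&mut [i32], &mut [i32]) {
    let len = data.len();
    let mid = len / 2;
    let ptr = data.as_mut_ptr();
    // SAFETY: `mid <= len`, so both ranges lie inside the original slice and
    // do not overlap, which means the two mutable slices never alias.
    unsafe {
        (
            std::slice::from_raw_parts_mut(ptr, mid),
            std::slice::from_raw_parts_mut(ptr.add(mid), len - mid),
        )
    }
}

fn main() {
    let mut values = [1, 2, 3, 4, 5];
    let (left, right) = split_halves(&mut values);
    left[0] = 10;
    right[0] = 30;
    println!("{values:?}");
}
```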

# [07:14](https://youtu.be/Ju7v6vgfEt8?t=434) Exploring the Rust Compiler

## Introduction to rustc
- [07:14](https://youtu.be/Ju7v6vgfEt8?t=434) Daniel transitions to discussing `rustc`, which compiles Rust source code into target binaries.
# [07:46](https://youtu.be/Ju7v6vgfEt8?t=466) Understanding the Rust Compiler and Its Components
## Overview of the Source Directory and Compiler
- [07:46](https://youtu.be/Ju7v6vgfEt8?t=466) The source directory is not central to today's discussion; focus will be on documentation, compiler, and library.
- [08:10](https://youtu.be/Ju7v6vgfEt8?t=490) Cloning the compiler repository allows building a fourth directory called "build," which contains the `rustc` binary. Building takes significant time and space.
- [08:33](https://youtu.be/Ju7v6vgfEt8?t=513) To build, users must use the `x` scripts, particularly `x.py` for configuration purposes.
## Bootstrapping in Rust
- [08:55](https://youtu.be/Ju7v6vgfEt8?t=535) The Rust compiler is a bootstrap compiler; its code is written in Rust itself, showcasing clever design principles.
- [09:28](https://youtu.be/Ju7v6vgfEt8?t=568) Inside the compiler directory are numerous crates responsible for translating source language into binary while performing various analyses.
## Compiler Frameworks and Languages
- [10:02](https://youtu.be/Ju7v6vgfEt8?t=602) Compilers can be built using external languages; frameworks like Java CUP and Flex/Yacc exist for defining grammars dynamically.
- [10:38](https://youtu.be/Ju7v6vgfEt8?t=638) There are multiple bootstrap compilers available, each with unique benefits in compiling Rust into binaries.
## Cargo Tooling in Rust
- [11:15](https://youtu.be/Ju7v6vgfEt8?t=675) Users familiar with Rust may have used Cargo, which simplifies interactions with the Rust compiler by running `rustc` under the hood.
- [11:39](https://youtu.be/Ju7v6vgfEt8?t=699) When executing `cargo run`, it automatically handles arguments for `rustc`, streamlining user experience.
## Versions of the Compiler
- [12:13](https://youtu.be/Ju7v6vgfEt8?t=733) The GitHub repository features nightly builds of the compiler that include frequent updates from community contributions aimed at improving functionality.
- [12:50](https://youtu.be/Ju7v6vgfEt8?t=770) Stable releases of the compiler are also available, providing more reliability by excluding experimental features.
## Library Directory Insights
- [13:13](https://youtu.be/Ju7v6vgfEt8?t=793) The library directory contains core primitives essential to Rust, including types related to memory allocation and standard libraries like vectors and slices.
- [13:47](https://youtu.be/Ju7v6vgfEt8?t=827) Understanding these components is crucial as they support building robust applications within the language framework.
## Community Engagement and Learning Resources
- [14:10](https://youtu.be/Ju7v6vgfEt8?t=850) For those interested in learning more about the Rust compiler or engaging with others, [**Zulip**](https://forge.rust-lang.org/platforms/zulip.html) serves as an excellent platform for questions and discussions.
## Compilation Process Breakdown

- [14:50](https://youtu.be/Ju7v6vgfEt8?t=890) The process of converting source code to binary can be divided into **three main stages**:
1. Source Code Level
2. Intermediate Representation Level
3. Code Generation Stage
# [15:35](https://youtu.be/Ju7v6vgfEt8?t=935) Understanding Rust Compilation Process
## Overview of Code Transformation

- [15:35](https://youtu.be/Ju7v6vgfEt8?t=935) The process of transforming code into the desired format involves multiple rounds of analysis to ensure it conforms to the language's specifications.
- [15:57](https://youtu.be/Ju7v6vgfEt8?t=957) A simple "Hello World" program in Rust serves as a motivating example for understanding these transformations.
## Lexing and Parsing

- [16:29](https://youtu.be/Ju7v6vgfEt8?t=989) An **Abstract Syntax Tree (AST)** is created from source code, which requires two main processes: **lexing** and **parsing**.
- [17:05](https://youtu.be/Ju7v6vgfEt8?t=1025) Lexing involves reading a stream of characters and recognizing tokens that match the language's syntax, such as function definitions.
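
To make the lexing step more concrete, here is a toy lexer for a tiny Rust-like subset. It is only a sketch: the `Token` enum and `lex` function are hypothetical and far simpler than the lexer in `rustc`, which also tracks source spans and reports errors.

```rust
/// Token kinds for a tiny, hypothetical subset of Rust-like syntax.
#[derive(Debug, PartialEq)]
enum Token {
    Fn,
    Ident(String),
    Number(u64),
    Punct(char),
}

/// Turns a stream of characters into tokens.
/// Anything that is not a word, number, or whitespace becomes `Punct`.
fn lex(src: &str) -> Vec<Token> {
    let chars: Vec<char> = src.chars().collect();
    let mut tokens = Vec::new();
    let mut i = 0;
    while i < chars.len() {
        let c = chars[i];
        if c.is_whitespace() {
            i += 1;
        } else if c.is_ascii_alphabetic() || c == '_' {
            // Consume a whole word, then decide: keyword or identifier.
            let start = i;
            while i < chars.len() && (chars[i].is_ascii_alphanumeric() || chars[i] == '_') {
                i += 1;
            }
            let word: String = chars[start..i].iter().collect();
            tokens.push(if word == "fn" { Token::Fn } else { Token::Ident(word) });
        } else if c.is_ascii_digit() {
            // Consume a run of digits as one number token.
            let start = i;
            while i < chars.len() && chars[i].is_ascii_digit() {
                i += 1;
            }
            let digits: String = chars[start..i].iter().collect();
            tokens.push(Token::Number(digits.parse::<u64>().unwrap_or(0)));
        } else {
            tokens.push(Token::Punct(c));
            i += 1;
        }
    }
    tokens
}

fn main() {
    for token in lex("fn main() { answer(42) }") {
        println!("{token:?}");
    }
}
```

A parser would then consume this token list and build tree nodes from it.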
## Abstract Syntax Tree Representation
- [17:29](https://youtu.be/Ju7v6vgfEt8?t=1049) The [**AST**](https://en.wikipedia.org/wiki/Abstract_syntax_tree) visually represents a program in a tree structure, useful for compilers. It includes various statements like conditions and assignments.

- [17:52](https://youtu.be/Ju7v6vgfEt8?t=1072) The initial step for the Rust compiler is to create this AST through lexing and parsing before further processing.
## Macro Expansion and Name Resolution
- [18:50](https://youtu.be/Ju7v6vgfEt8?t=1130) After creating the AST, the compiler must expand macros and resolve names to fully understand the program structure.
- [19:13](https://youtu.be/Ju7v6vgfEt8?t=1153) On a nightly toolchain, `rustc` accepts unstable `-Z` flags (listed by `rustc -Z help`) that let users view different states within the compiler during this process, for example: `rustc -Z unpretty="ast-tree" src/main.rs`

## Viewing Compiler Output
- [19:47](https://youtu.be/Ju7v6vgfEt8?t=1187) By utilizing commands like `rustc -Z help`, users can access various compiler flags that provide insights into its operations.
- [20:57](https://youtu.be/Ju7v6vgfEt8?t=1257) The output of an AST does not resemble traditional graphical representations but instead appears as nodes with pointers indicating relationships between them.

## Practical Tools for Rust Development
- [21:52](https://youtu.be/Ju7v6vgfEt8?t=1312) The [**Rust Analyzer**](https://marketplace.visualstudio.com/items?itemName=rust-lang.rust-analyzer) is highly recommended for developers; it provides extensive information about code within IDE environments, enhancing productivity.
# [23:35](https://youtu.be/Ju7v6vgfEt8?t=1415) Understanding the Role of Intermediate Representations in Rust Compilers
## The Purpose of Intermediate Representations
- [23:35](https://youtu.be/Ju7v6vgfEt8?t=1415) The discussion begins with a question about the utility of seeing the intermediate representations (IR) of a program, primarily aimed at those interested in compilers.
- [23:59](https://youtu.be/Ju7v6vgfEt8?t=1439) While IR is crucial for compiler developers, it also serves others who need to understand runtime verification and build tooling like theorem proving and interpreters for Rust.
- [24:35](https://youtu.be/Ju7v6vgfEt8?t=1475) Many programs utilize IR from the Rust compiler for various business cases, although most users are typically compiler enthusiasts.
## Relationship Between Macros and Intermediate Representations
- [25:07](https://youtu.be/Ju7v6vgfEt8?t=1507) A question arises regarding the relationship between macros in Rust and their representation; it's noted that there is indeed a connection.
- [25:28](https://youtu.be/Ju7v6vgfEt8?t=1528) The existence of an intermediate representation depends on macros, which can be expanded into larger structures during compilation.
- [26:14](https://youtu.be/Ju7v6vgfEt8?t=1574) When creating procedural macros in Rust, developers must consider lexing and parsing tokens, indicating a deeper level of complexity involved.
## Analyzing Abstract Syntax Trees
- [26:38](https://youtu.be/Ju7v6vgfEt8?t=1598) The abstract syntax tree (AST) is introduced as a tool for debugging complex programs by revealing what gets outputted by the compiler.
- [27:00](https://youtu.be/Ju7v6vgfEt8?t=1620) Transforming AST into **high-level intermediate representation** involves lowering and desugaring features to streamline control flow constructs like loops into singular representations.

## Lowering and Desugaring Process
- [27:24](https://youtu.be/Ju7v6vgfEt8?t=1644) Features such as `for` loops, `while` loops, and infinite loops are consolidated into **one loop keyword** to simplify analysis.
- [28:18](https://youtu.be/Ju7v6vgfEt8?t=1698) This process leads to transforming multiple control flow statements into match statements for easier handling within the compiler's analysis phase.
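
The desugaring described above can be approximated by hand. The sketch below shows a `for` loop next to a `loop` plus `match` version that behaves the same way. The second function is written manually for illustration and is not actual compiler output; both function names are hypothetical.

```rust
/// Surface syntax: a `for` loop.
fn sum_with_for(values: &[i32]) -> i32 {
    let mut total = 0;
    for v in values {
        total += *v;
    }
    total
}

/// Roughly what the loop looks like after desugaring: `loop` plus `match`.
fn sum_desugared(values: &[i32]) -> i32 {
    let mut total = 0;
    let mut iter = IntoIterator::into_iter(values);
    loop {
        match Iterator::next(&mut iter) {
            Some(v) => total += *v,
            None => break,
        }
    }
    total
}

fn main() {
    let data = [1, 2, 3, 4];
    println!("{}", sum_with_for(&data));
    println!("{}", sum_desugared(&data));
}
```

With only one loop construct left, later analyses need to handle a single case instead of three.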
## Type Safety Guarantees in Rust Compilation

- [28:52](https://youtu.be/Ju7v6vgfEt8?t=1732) After **lowering**, type inference, trait solving, and type checking occur within this new representation to ensure type safety guarantees provided by the Rust compiler.
- [29:32](https://youtu.be/Ju7v6vgfEt8?t=1772) The transition from surface language to high-level IR shows added details like prelude inclusions and macro expansions that enhance clarity for further analysis.
`rustc -Z unpretty="hir" src/main.rs`

`rustc -Z unpretty="hir-tree" src/main.rs`

## Allocation IDs in Compiler Analysis
- [30:04](https://youtu.be/Ju7v6vgfEt8?t=1804) In this stage of compilation, every expression receives **unique identifiers** known as `DefId` essential for tracking elements throughout various analysis rounds.
# Understanding Rust Compiler Analysis Techniques
## Overview of Analysis Techniques
- [31:30](https://youtu.be/Ju7v6vgfEt8?t=1890) The analysis techniques at the high-level include **trait solving**, **type inference**, and **type checking**. These processes are essential for understanding how Rust handles types and generics during compilation.
## Type Inference in Rust

- [32:14](https://youtu.be/Ju7v6vgfEt8?t=1934) **Type inference** allows the Rust compiler to deduce types without explicit declarations. For example, it can identify a variable as a vector of strings based on context.
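
A small sketch of the vector-of-strings case mentioned above. The function name `joined` is hypothetical; the point is that `names` has no type annotation, yet its type is deduced from how it is used.

```rust
fn joined() -> String {
    // No annotation: the element type is not known on this line.
    let mut names = Vec::new();
    // Pushing a `String` lets the compiler infer `Vec<String>`.
    names.push(String::from("rustc"));
    names.push(String::from("cargo"));
    names.join(", ")
}

fn main() {
    println!("{}", joined());
}
```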
## Trait Solving Explained
- [32:38](https://youtu.be/Ju7v6vgfEt8?t=1958) Trait solving involves determining if generics used in functions can be instantiated correctly. The compiler checks if the generic types meet required traits, such as implementing `Display` for printing elements.
- [33:13](https://youtu.be/Ju7v6vgfEt8?t=1993) An example is provided where a function logs elements from a vector of type `T`, which must implement the `Display` trait to ensure **proper output formatting**.
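
The example from the talk is not reproduced in these notes, so the following is a hedged reconstruction of the idea: a generic function whose elements must implement `Display`. The names `log_all` and `Opaque` are hypothetical.

```rust
use std::fmt::Display;

/// Formats every element. `T: Display` is the bound the trait solver
/// has to prove for each concrete type the function is called with.
fn log_all<T: Display>(items: &[T]) -> Vec<String> {
    items.iter().map(|item| format!("item: {item}")).collect()
}

/// A type with no `Display` implementation, used only in the comment below.
#[allow(dead_code)]
struct Opaque;

fn main() {
    for line in log_all(&[1, 2, 3]) {
        println!("{line}");
    }
    for line in log_all(&["a", "b"]) {
        println!("{line}");
    }
    // Rejected at compile time, because `Opaque` does not implement `Display`:
    // log_all(&[Opaque]);
}
```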
## Importance of Type Checking
- [34:11](https://youtu.be/Ju7v6vgfEt8?t=2051) Type checking ensures that variables conform to their defined types, preventing errors like assigning negative numbers to unsigned integers (`u32`). This process is more complex than simply verifying basic constraints.
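
A brief sketch contrasting the compile-time check with a run-time conversion. The helper `to_unsigned` is hypothetical; the commented-out line shows an assignment the type checker is expected to reject.

```rust
use std::convert::TryFrom;

/// Converts a signed value to `u32`, returning `None` when it does not fit.
fn to_unsigned(value: i64) -> Option<u32> {
    u32::try_from(value).ok()
}

fn main() {
    // Rejected at compile time: a `u32` cannot hold a negated literal.
    // let bad: u32 = -1;

    println!("{:?}", to_unsigned(7));
    println!("{:?}", to_unsigned(-1));
}
```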
## Transitioning to Intermediate Representation (IR)


- [35:03](https://youtu.be/Ju7v6vgfEt8?t=2103) After initial analysis, the representation evolves into an intermediate representation (IR), which retains more detailed information necessary for further analysis.
- [35:48](https://youtu.be/Ju7v6vgfEt8?t=2148) The IR is fully type checked and elaborated, allowing for deeper safety checks related to unsafe code usage within Rust programs.
- `rustc -Z unpretty="thir-tree" src/main.rs`
- `rustc -Z unpretty="thir-flat" src/main.rs`

## Control Flow Graph (CFG)

# [40:05](https://youtu.be/Ju7v6vgfEt8?t=2405) Understanding Control Flow Graphs in Rust
## Overview of Memory Representation
- [40:05](https://youtu.be/Ju7v6vgfEt8?t=2405) The **MIR (Mid-level IR)** level involves declaring memory locations that will eventually be assigned types, forming the basis for program execution within nodes of a [**control flow graph (CFG)**](https://en.wikipedia.org/wiki/Control-flow_graph).
## Structure of Control Flow Graph (CFG)

- [40:27](https://youtu.be/Ju7v6vgfEt8?t=2427) CFG consists of **basic blocks** where each block contains statements performing assignments to memory locations, culminating in a terminator that decides branching based on an integer value.
- [40:51](https://youtu.be/Ju7v6vgfEt8?t=2451) Basic Block 1 includes two assignment statements and has a direct edge to Basic Block 4, which concludes with return statements.
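
To give a feel for basic blocks and terminators, here is a toy model of a control flow graph with a tiny interpreter. Every type and function in it (`Statement`, `Terminator`, `BasicBlock`, `run`, `sum_down_from`) is a hypothetical simplification and does not match the real MIR data structures in `rustc`. It assumes numbered memory places, with place 0 holding the return value, and it includes one edge that branches back to an earlier block.

```rust
/// A statement writes to a numbered memory place.
enum Statement {
    Assign(usize, i64),       // place = constant
    Add(usize, usize, usize), // place = place + place
}

/// A terminator ends a basic block and selects the next one.
enum Terminator {
    Goto(usize),
    /// Branch on an integer value stored in `place`.
    SwitchInt { place: usize, if_zero: usize, otherwise: usize },
    Return,
}

struct BasicBlock {
    statements: Vec<Statement>,
    terminator: Terminator,
}

/// Executes the graph starting at block 0 and returns place 0.
fn run(blocks: &[BasicBlock], num_places: usize) -> i64 {
    let mut places = vec![0i64; num_places];
    let mut current = 0;
    loop {
        let block = &blocks[current];
        for stmt in &block.statements {
            match *stmt {
                Statement::Assign(dst, value) => places[dst] = value,
                Statement::Add(dst, a, b) => places[dst] = places[a] + places[b],
            }
        }
        match block.terminator {
            Terminator::Goto(next) => current = next,
            Terminator::SwitchInt { place, if_zero, otherwise } => {
                current = if places[place] == 0 { if_zero } else { otherwise };
            }
            Terminator::Return => return places[0],
        }
    }
}

/// Builds a graph that sums n + (n-1) + ... + 1 for a non-negative `n`.
fn sum_down_from(n: i64) -> Vec<BasicBlock> {
    vec![
        // bb0: _0 = 0; _1 = n; _2 = -1; goto bb1
        BasicBlock {
            statements: vec![
                Statement::Assign(0, 0),
                Statement::Assign(1, n),
                Statement::Assign(2, -1),
            ],
            terminator: Terminator::Goto(1),
        },
        // bb1: switch on the counter; zero means the loop is done
        BasicBlock {
            statements: vec![],
            terminator: Terminator::SwitchInt { place: 1, if_zero: 3, otherwise: 2 },
        },
        // bb2: _0 = _0 + _1; _1 = _1 + _2; branch back to bb1
        BasicBlock {
            statements: vec![Statement::Add(0, 0, 1), Statement::Add(1, 1, 2)],
            terminator: Terminator::Goto(1),
        },
        // bb3: return _0
        BasicBlock { statements: vec![], terminator: Terminator::Return },
    ]
}

fn main() {
    println!("{}", run(&sum_down_from(3), 3));
}
```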
## Looping and Analysis Benefits
- [41:14](https://youtu.be/Ju7v6vgfEt8?t=2474) CFG can include loops; for instance, it is valid for Basic Block 2 to branch back to Basic Block 1.
- [41:14](https://youtu.be/Ju7v6vgfEt8?t=2474) This structure aids in subsequent analysis rounds.
- [41:48](https://youtu.be/Ju7v6vgfEt8?t=2508) The analysis at this level focuses on **Drop Elaboration** and **Borrow Checking**, crucial for enforcing Rust's memory safety guarantees regarding lifetimes and allocation.
## Visualizing MIR Representations
- [42:09](https://youtu.be/Ju7v6vgfEt8?t=2529) Using tools like `unpretty=mir` allows visualization of the MIR representation alongside source code, providing insights into memory declarations without taking them too literally.
`rustc -Z unpretty="mir" src/main.rs`

`rustc -Z unpretty="mir" -Zmir-enable-passes=-PromoteTemps src/main.rs`

- [43:29](https://youtu.be/Ju7v6vgfEt8?t=2609) In MIR, **variables are replaced by memory places with specific types**; Place Zero is reserved for function returns while others act as registers or memory locations.
## Execution Flow in MIR Programs
- [44:03](https://youtu.be/Ju7v6vgfEt8?t=2643) The execution starts from Basic Block Zero, assigning "Hello World" to Place Four and creating a non-mutable reference. If successful, it branches to Basic Block One.
- [44:39](https://youtu.be/Ju7v6vgfEt8?t=2679) In Basic Block One, the output from printing is assigned to Place One. If no errors occur during this process, it transitions through the terminator to Basic Block Two.
# [45:11](https://youtu.be/Ju7v6vgfEt8?t=2711) Transitioning from MIR to Code Generation

## Final Steps Before Code Generation
- [45:47](https://youtu.be/Ju7v6vgfEt8?t=2747) After completing analyses in MIR, the next step involves transitioning towards **code generation using LLVM** or other backends like GCC or custom solutions tailored for specific use cases.
## Importance of Constant Evaluation
- [46:22](https://youtu.be/Ju7v6vgfEt8?t=2782) Prior to code generation, all constants within Rust programs undergo evaluation. This ensures that constant values are resolved before further processing occurs.
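
As an illustration of constants being resolved before code generation, the sketch below uses a `const fn` in constant contexts. The names `factorial`, `FACT_10`, and `TABLE` are hypothetical.

```rust
/// A `const fn` can be evaluated by the compiler when used in a constant context.
const fn factorial(n: u64) -> u64 {
    let mut result = 1;
    let mut i = 2;
    while i <= n {
        result *= i;
        i += 1;
    }
    result
}

/// Computed at compile time; only the resulting value is needed afterwards.
const FACT_10: u64 = factorial(10);

/// Array lengths must be constants, so this length is also evaluated at compile time.
static TABLE: [u8; factorial(3) as usize] = [0; factorial(3) as usize];

fn main() {
    println!("{FACT_10}");
    println!("{}", TABLE.len());
}
```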
## Lowering Process Explained
[**Static single-assignment form (SSA)**](https://en.wikipedia.org/wiki/Static_single-assignment_form)
# [47:39](https://youtu.be/Ju7v6vgfEt8?t=2859) Understanding Monomorphization in Rust
## The Role of Monomorphization
- [47:39](https://youtu.be/Ju7v6vgfEt8?t=2859) In Rust, generics allow functions to handle multiple types, but binaries do not understand these abstractions. [**Monomorphization**](https://en.wikipedia.org/wiki/Monomorphization) converts generics into specific functions for each type used.
- [48:00](https://youtu.be/Ju7v6vgfEt8?t=2880) The process involves creating individual functions during code generation, ensuring that the binary can execute all necessary operations for the generic code.
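
Monomorphization can be pictured as the compiler writing one copy of a generic function per concrete type. In the sketch below, `describe` is generic, while `describe_i32` and `describe_str` are hand-written stand-ins for the copies a compiler would generate. All three names are hypothetical, and the real generated symbols look different.

```rust
use std::fmt::Debug;

/// Generic source-level function.
fn describe<T: Debug>(value: T) -> String {
    format!("{value:?}")
}

/// Hand-written equivalent of the instance for `i32`.
fn describe_i32(value: i32) -> String {
    format!("{value:?}")
}

/// Hand-written equivalent of the instance for `&str`.
fn describe_str(value: &str) -> String {
    format!("{value:?}")
}

fn main() {
    // Each call below uses a different concrete instantiation of `describe`.
    println!("{}", describe(7));
    println!("{}", describe("seven"));
    println!("{}", describe_i32(7));
    println!("{}", describe_str("seven"));
}
```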
## Transitioning to LLVM
- [48:32](https://youtu.be/Ju7v6vgfEt8?t=2912) Once in LLVM (Low-Level Virtual Machine), extensive optimization occurs, bringing the representation closer to a binary format.
- [49:05](https://youtu.be/Ju7v6vgfEt8?t=2945) Users can view LLVM's output, such as LLVM IR or LLVM bitcode, which provides insights into the compiled code structure before reaching the final binary.
**`rustc --emit llvm-ir src/main.rs`**

**`rustc --emit llvm-bc src/main.rs`**

## Compilation Process Overview

- [49:42](https://youtu.be/Ju7v6vgfEt8?t=2982) The compilation journey typically starts with writing a Rust source program and using Cargo to build it, resulting in a target binary.
- [50:03](https://youtu.be/Ju7v6vgfEt8?t=3003) An example provided is an **x86 assembly instruction** set generated from a simple "Hello World" program, illustrating how high-level code translates into lower-level instructions.
**`rustc --emit asm src/main.rs`**

# [51:03](https://youtu.be/Ju7v6vgfEt8?t=3063) Exploring Intermediate Representations
## Interaction and Further Learning
- [51:03](https://youtu.be/Ju7v6vgfEt8?t=3063) A suggestion was made to collect emails for further discussions on **intermediate representations (IR)**, indicating interest in deeper engagement with participants.
## Abstract Syntax Tree (AST)
- [51:24](https://youtu.be/Ju7v6vgfEt8?t=3084) The abstract syntax tree represents the structure of source programs in Rust. It captures function definitions and validates syntax rules during parsing.
- [51:56](https://youtu.be/Ju7v6vgfEt8?t=3116) **ABI (Application Binary Interface)** relates more closely to compiled bytecode than AST; it defines how different components interact at runtime.
## Code Generation Insights
- [52:19](https://youtu.be/Ju7v6vgfEt8?t=3139) **The Rust compiler does not generate binary code directly; this task falls under LLVM or other backends like GCC**. This separation emphasizes modularity within compilers.
- [52:56](https://youtu.be/Ju7v6vgfEt8?t=3176) Code generation is crucial as it determines how high-level constructs translate into machine-specific instructions based on chosen targets like **x86** or **ARM**.
# [53:21](https://youtu.be/Ju7v6vgfEt8?t=3201) Intermediate Representations and Their Importance
## Understanding IR Structures
- [53:21](https://youtu.be/Ju7v6vgfEt8?t=3201) Different IR stages exist within the Rust environment, facilitating various analyses and transformations before reaching final executable form.
## Lexical Analysis and Parsing
# [55:35](https://youtu.be/Ju7v6vgfEt8?t=3335) Understanding Rust's Compilation Process
## Overview of Intermediate Representations (IR)
- [55:35](https://youtu.be/Ju7v6vgfEt8?t=3335) The compilation process begins with expanding macros to fully elaborate names, leading down to the crate root. This sets the stage for further analysis.
- [56:08](https://youtu.be/Ju7v6vgfEt8?t=3368) The first pass of guarantees involves type checking, where the compiler infers types not explicitly defined and checks trait bounds for instantiation feasibility.
- [56:30](https://youtu.be/Ju7v6vgfEt8?t=3390) After type checking, a desugaring process occurs, resulting in the Typed High-level Intermediate Representation (THIR), which includes checks for unsafety.
## Transitioning to Middle Intermediate Representation (MIR)
- [56:54](https://youtu.be/Ju7v6vgfEt8?t=3414) The transition from an abstract syntax tree to a control flow graph marks the next phase. This graph consists of basic blocks containing statements that reference memory locations rather than variables.
- [57:18](https://youtu.be/Ju7v6vgfEt8?t=3438) Each basic block concludes with a terminator that can branch out to other blocks, creating a structured layout of program execution paths.
- [57:41](https://youtu.be/Ju7v6vgfEt8?t=3461) MIR is crucial for analyzing borrowing and lifetimes, ensuring shared data remains unmutated while managing multiple references effectively.
## Code Generation and Final Steps
- [58:14](https://youtu.be/Ju7v6vgfEt8?t=3494) Following MIR analysis, code generation requires evaluating constants and processing generics into concrete instantiations before reaching the final representation.
- [58:37](https://youtu.be/Ju7v6vgfEt8?t=3517) The output can be directed towards various code generation backends like LLVM; however, this step ultimately leads to producing the target binary.
## Enhancing Compilation Insights
- [59:12](https://youtu.be/Ju7v6vgfEt8?t=3552) Typically overlooked steps in compilation are examined here; understanding these processes enriches knowledge about how binaries are built from source code.
- [59:35](https://youtu.be/Ju7v6vgfEt8?t=3575) A proposal is made to improve visibility into IR by allowing inspection during runtime instead of just at compile time through pretty printed representations.
## Utilizing Rust Compiler Internally

- [01:00:18](https://youtu.be/Ju7v6vgfEt8?t=3618) Rust enables calling its own compiler via an internal crate called **`rustc_driver`**, facilitating nested compilation within Rust programs.
- [01:00:41](https://youtu.be/Ju7v6vgfEt8?t=3641) A pseudo-code example illustrates how one might create an instance of the compiler within their main function using this driver module.
## Practical Implementation Steps
- [01:01:27](https://youtu.be/Ju7v6vgfEt8?t=3687) To implement nested compilation successfully, developers must include `extern crate rustc_driver` in their program along with specific feature flags for proper functionality.
# [01:02:23](https://youtu.be/Ju7v6vgfEt8?t=3743) Installation and Setup of rustc Driver
## Overview of Required Components
- [01:02:23](https://youtu.be/Ju7v6vgfEt8?t=3743) The command line indicates the need to install additional components for the task at hand, specifically related to the Rust crate. `rustup component add rust-src rustc-dev llvm-tools-preview`
- [01:02:56](https://youtu.be/Ju7v6vgfEt8?t=3776) Two main crates are highlighted: `rustc_driver` and `rustc_driver_impl`, which contain essential code for leveraging nested calls in the compiler.
## Minimum Example for Running the Compiler
- [01:03:27](https://youtu.be/Ju7v6vgfEt8?t=3807) A minimum example is presented for running the compiler, with a suggestion to increase font size for better visibility during demonstration.

- [01:03:50](https://youtu.be/Ju7v6vgfEt8?t=3830) The source directory contains only a `main.rs` file, while an external **"Hello World" program** is referenced but not included in any crates, meaning it won't be part of cargo build processes.

## Command Line Arguments and Callbacks
- [01:04:12](https://youtu.be/Ju7v6vgfEt8?t=3852) The example focuses on using the driver without interrupting callbacks; instead, it runs the compiler directly from a compiled program.
- [01:04:56](https://youtu.be/Ju7v6vgfEt8?t=3896) An empty struct called `callback` is created to implement necessary traits without overriding functions, allowing default behavior that simply passes through to the compiler.
## Building and Running Programs
- [01:05:43](https://youtu.be/Ju7v6vgfEt8?t=3943) The process begins with building the main program using `cargo build`, resulting in a binary that can call the Rust compiler.
- [01:06:15](https://youtu.be/Ju7v6vgfEt8?t=3975) When running this binary with arguments pointing to "hello.rs" (`cargo run -- ~/presentation/hello.rs`), it should compile successfully and output **"Hello Rare Skills"**.
## Understanding Compiler Callbacks
- [01:06:51](https://youtu.be/Ju7v6vgfEt8?t=4011) If no input is provided when calling `rustc`, an error message appears indicating incorrect usage; however, providing valid arguments leads to successful compilation.
- [01:07:15](https://youtu.be/Ju7v6vgfEt8?t=4035) After confirming successful execution of a simple program, attention shifts towards examining callback traits within the Rust compiler's implementation.
# [01:07:40](https://youtu.be/Ju7v6vgfEt8?t=4060) Exploring Callback Traits in Rust Compiler
## Key Functions within Callback Trait
- [01:08:17](https://youtu.be/Ju7v6vgfEt8?t=4097) The discussion introduces key functions within the callback trait such as `after_crate_root_parsing`, which allows custom code insertion during compilation phases.
## Queries System Explanation
- [01:08:40](https://youtu.be/Ju7v6vgfEt8?t=4120) It’s noted that some functions may be deprecated but still functional. Custom code can manipulate queries during different stages of compilation (e.g., after macro expansion).
## Clarification on Queries
- [01:09:14](https://youtu.be/Ju7v6vgfEt8?t=4154) A request for clarification on queries leads to an explanation that they serve as a system within the Rust compiler for retrieving information at various points during compilation.
## Typing Context Manipulation
# [01:10:05](https://youtu.be/Ju7v6vgfEt8?t=4205) Understanding Rust Compiler Queries

## Overview of Compiler Queries
- [01:10:05](https://youtu.be/Ju7v6vgfEt8?t=4205) The speaker introduces **compiler queries**, explaining that they allow for feeding information through the Rust compiler at specific points of interest during compilation.
- [01:10:39](https://youtu.be/Ju7v6vgfEt8?t=4239) **Three key points** in the compilation process are identified where analysis can be interrupted:
1. after crate root parsing: `after_crate_root_parsing`
2. after macro expansion and name resolution: `after_expansion`
3. and after full analysis: `after_analysis`
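
The three hooks listed above can be modelled without the real compiler crates. The sketch below is a self-contained mock: its `Compilation` enum, `Callbacks` trait, and `run_compiler` function are stand-ins written for this example and should not be read as the actual `rustc_driver` API, whose hooks take compiler arguments and whose signatures change between nightly versions. It only shows the pattern of default hooks that pass through, and of a hook that stops the pipeline.

```rust
/// Stand-in for a "continue or stop" signal returned by each hook.
#[derive(Debug, PartialEq, Clone, Copy)]
enum Compilation {
    Continue,
    Stop,
}

/// Mock callbacks trait: every hook defaults to continuing.
trait Callbacks {
    fn after_crate_root_parsing(&mut self) -> Compilation {
        Compilation::Continue
    }
    fn after_expansion(&mut self) -> Compilation {
        Compilation::Continue
    }
    fn after_analysis(&mut self) -> Compilation {
        Compilation::Continue
    }
}

/// An empty struct inherits all defaults, so compilation passes straight through.
struct PassThrough;
impl Callbacks for PassThrough {}

/// Records the phases it sees and stops after macro expansion.
struct StopAfterExpansion {
    log: Vec<&'static str>,
}

impl Callbacks for StopAfterExpansion {
    fn after_crate_root_parsing(&mut self) -> Compilation {
        self.log.push("after_crate_root_parsing");
        Compilation::Continue
    }
    fn after_expansion(&mut self) -> Compilation {
        self.log.push("after_expansion");
        Compilation::Stop
    }
}

/// Toy driver: returns `true` if the pipeline would reach code generation.
fn run_compiler<C: Callbacks>(callbacks: &mut C) -> bool {
    if callbacks.after_crate_root_parsing() == Compilation::Stop {
        return false;
    }
    if callbacks.after_expansion() == Compilation::Stop {
        return false;
    }
    if callbacks.after_analysis() == Compilation::Stop {
        return false;
    }
    true
}

fn main() {
    println!("binary produced: {}", run_compiler(&mut PassThrough));

    let mut stopper = StopAfterExpansion { log: Vec::new() };
    println!("binary produced: {}", run_compiler(&mut stopper));
    println!("phases seen: {:?}", stopper.log);
}
```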
## Compilation Phases
- [01:11:01](https://youtu.be/Ju7v6vgfEt8?t=4261) The first phase occurs post-abstract syntax tree (AST) creation, where macros remain unexpanded. No analysis is performed at this stage.
- [01:11:37](https://youtu.be/Ju7v6vgfEt8?t=4297) The second phase involves analyzing the AST after macro expansion but before type checking and trait solving. Programs with borrow checker violations can still be analyzed here without errors.
## Practical Example of Callbacks

- [01:12:01](https://youtu.be/Ju7v6vgfEt8?t=4321) The speaker demonstrates adding **custom code to callbacks** within the compilation phases, showing how to print messages indicating which phase is currently being executed.

- [01:12:43](https://youtu.be/Ju7v6vgfEt8?t=4363) By running the compiler multiple times, it becomes evident that stopping at different phases affects whether a binary is created or not.
**`cargo run -- ~/presentation/hello.rs`**

## Error Handling in Compilation
- [01:13:45](https://youtu.be/Ju7v6vgfEt8?t=4425) If compilation is stopped after macro expansion but before analysis, illegal Rust programs can still produce an AST without throwing errors.
- [01:14:23](https://youtu.be/Ju7v6vgfEt8?t=4463) An example illustrates that if a variable isn't flagged as mutable and an attempt to reassign it occurs, no error will be thrown until reaching the analysis phase.
## Insights on Syntax Validity
- [01:15:00](https://youtu.be/Ju7v6vgfEt8?t=4500) The discussion highlights that even syntactically incorrect programs may pass initial checks without triggering immediate errors due to halted analysis.
- [01:16:20](https://youtu.be/Ju7v6vgfEt8?t=4580) A further exploration into creating invalid syntax reveals that certain constructs can still generate an AST despite being semantically incorrect.
# [01:18:12](https://youtu.be/Ju7v6vgfEt8?t=4692) Error Handling and Compiler Insights
## Understanding Error Reporting in Compilation
- [01:18:12](https://youtu.be/Ju7v6vgfEt8?t=4692) The program's error handling system is designed to collect comprehensive information about errors rather than stopping at the first encountered issue, allowing for richer debugging insights.
- [01:18:33](https://youtu.be/Ju7v6vgfEt8?t=4713) The compiler continues processing even after encountering an error, aiming to provide as much context as possible for troubleshooting.
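
The "collect errors instead of stopping at the first one" approach can be sketched with a toy checker. The `check` function and its two rules are hypothetical and unrelated to how `rustc` implements diagnostics; the example only shows errors being accumulated across the whole input.

```rust
/// Checks every line of a toy program and returns all problems found,
/// rather than returning as soon as the first one appears.
fn check(lines: &[&str]) -> Vec<String> {
    let mut diagnostics = Vec::new();
    for (index, line) in lines.iter().enumerate() {
        let opens = line.matches('(').count();
        let closes = line.matches(')').count();
        if opens != closes {
            diagnostics.push(format!("line {}: unbalanced parentheses", index + 1));
        }
        if !line.trim_end().ends_with(';') {
            diagnostics.push(format!("line {}: missing `;`", index + 1));
        }
    }
    diagnostics
}

fn main() {
    let program = ["let x = f(1;", "let y = 2", "let z = g(3);"];
    for diagnostic in check(&program) {
        println!("error: {diagnostic}");
    }
}
```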
## Macro Expansion and Error Detection
- [01:18:56](https://youtu.be/Ju7v6vgfEt8?t=4736) Discusses the implications of introducing a non-existent macro during the expansion phase, highlighting how errors can be detected before full expansion occurs.
- [01:19:32](https://youtu.be/Ju7v6vgfEt8?t=4772) Emphasizes that error messages are streamed directly from the compiler without delay, ensuring immediate feedback on issues.
## Analyzing Compiler Behavior
- [01:19:54](https://youtu.be/Ju7v6vgfEt8?t=4794) Transitioning to a more complex example with additional crates allows for deeper exploration of compiler behavior post-analysis phase.
- [01:20:29](https://youtu.be/Ju7v6vgfEt8?t=4829) Introduces querying global context within the Rust compiler, demonstrating how to access mutable references and utilize closures effectively.
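
The idea of stopping the compiler at a chosen phase, and of collecting every error instead of bailing out at the first one, can be modelled with a toy driver. Everything below (`Compilation`, `Callbacks`, `run_pipeline`, the `ILLEGAL` marker) is a hypothetical stand-in written for this README; it is not the real `rustc_driver` API, whose signatures are unstable.

```rust
/// Whether the toy driver should keep going after a callback.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Compilation {
    Continue,
    Stop,
}

/// Hooks invoked between phases; by default compilation continues.
trait Callbacks {
    fn after_expansion(&mut self, _ast: &str) -> Compilation {
        Compilation::Continue
    }
    fn after_analysis(&mut self, _ast: &str) -> Compilation {
        Compilation::Continue
    }
}

/// Returns Ok(true) if a "binary" would be produced, Ok(false) if a callback
/// stopped compilation early, and Err with all collected diagnostics otherwise.
fn run_pipeline(source: &str, callbacks: &mut dyn Callbacks) -> Result<bool, Vec<String>> {
    // "Parsing and expansion": the toy AST is just the trimmed source.
    let ast = source.trim();
    if callbacks.after_expansion(ast) == Compilation::Stop {
        return Ok(false);
    }
    // "Analysis": report every offending line, not just the first one.
    let errors: Vec<String> = ast
        .lines()
        .enumerate()
        .filter(|(_, line)| line.contains("ILLEGAL"))
        .map(|(i, _)| format!("error on line {}", i + 1))
        .collect();
    if !errors.is_empty() {
        return Err(errors);
    }
    if callbacks.after_analysis(ast) == Compilation::Stop {
        return Ok(false);
    }
    Ok(true) // "codegen" would run here
}

struct StopAfterExpansion;
impl Callbacks for StopAfterExpansion {
    fn after_expansion(&mut self, _ast: &str) -> Compilation {
        Compilation::Stop
    }
}

struct RunToCompletion;
impl Callbacks for RunToCompletion {}

fn main() {
    let program = "ok\nILLEGAL\nILLEGAL";
    println!("{:?}", run_pipeline(program, &mut StopAfterExpansion));
    println!("{:?}", run_pipeline(program, &mut RunToCompletion));
}
```

In this model an invalid program passes silently when the driver stops after expansion, and yields two diagnostics at once when analysis is allowed to run.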
# [01:20:52](https://youtu.be/Ju7v6vgfEt8?t=4852) Exploring Type Context in Rust

## Accessing Internal Functions
- [01:21:14](https://youtu.be/Ju7v6vgfEt8?t=4874) Explains how various internal functions can be accessed through the **typing context (tcx)**, enabling extensive manipulation capabilities within the Rust compiler.
## Practical Example: Crate Information Retrieval
- [01:21:48](https://youtu.be/Ju7v6vgfEt8?t=4908) Demonstrates retrieving crate definitions and names using internal functions, showcasing aspects of Rust's internals typically hidden from users.
- [01:22:20](https://youtu.be/Ju7v6vgfEt8?t=4940) Highlights that understanding crate identifiers is crucial for navigating multiple crates within a project.

# [01:23:31](https://youtu.be/Ju7v6vgfEt8?t=5011) Utilizing CLI Commands for Code Analysis
## Pretty Printing with rustc
- [01:23:31](https://youtu.be/Ju7v6vgfEt8?t=5011) Discusses using CLI commands like `rustc -Z unpretty="mir" src/main.rs` to generate human-readable representations of code structures such as MIR (Mid-level Intermediate Representation).
**`rustc -Z unpretty="mir" src/main.rs`**

## Internal Function Calls for Enhanced Output
- [01:24:13](https://youtu.be/Ju7v6vgfEt8?t=5053) Describes accessing internal pretty-printing functions directly via typing context handles, illustrating how developers can leverage existing tools within the Rust ecosystem.



# Understanding Promoted Constants in the Rust Compiler
## Exploring Promoted Functions

- [01:26:28](https://youtu.be/Ju7v6vgfEt8?t=5188) The speaker investigates the concept of **"promoted"** items (constant expressions the compiler lifts out of a function body) by searching for related functions in the Rust compiler, specifically focusing on a function named `promoted_mir`.

- [01:27:02](https://youtu.be/Ju7v6vgfEt8?t=5222) They explain that every body within the context has a **definition ID (DefId)**, and they plan to iterate through these bodies, look up the promoted items for each, and print how many there are.

## Analyzing Internal Representations
- [01:27:39](https://youtu.be/Ju7v6vgfEt8?t=5259) Upon running their analysis, they receive warnings but successfully output the internal representation of the promoted items, contrasting this with a more user-friendly printed version.
- [01:28:12](https://youtu.be/Ju7v6vgfEt8?t=5292) The speaker notes that while there may be multiple promoted items, only one is present in this case, confirming expectations through length checks.
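
A small program that triggers promotion can look like the sketch below; the function name `promoted` is chosen here for illustration. Because `42` is a constant expression, the compiler places it in static storage, which is why the reference may have the `'static` lifetime. Dumping the MIR of such a program with `rustc -Z unpretty=mir` on a nightly toolchain should show a separate promoted body for the function, though the exact output format is not reproduced here.

```rust
// `&42` borrows a constant expression, so the value is promoted to static
// storage instead of living on the stack frame of `promoted`.
fn promoted() -> &'static i32 {
    &42
}

fn main() {
    let r: &'static i32 = promoted();
    println!("{}", r);
}
```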
## Practical Applications of Analysis
- [01:29:19](https://youtu.be/Ju7v6vgfEt8?t=5359) The discussion shifts to practical applications at Runtime Verification; they mention a project requiring JSON serialization of the Mid-level Intermediate Representation (MIR).
- [01:29:55](https://youtu.be/Ju7v6vgfEt8?t=5395) They describe using a driver function called `emit_smir` to facilitate this serialization process, highlighting its role in obtaining global context and executing callbacks.

## Serialization Challenges
- [01:30:58](https://youtu.be/Ju7v6vgfEt8?t=5458) The speaker elaborates on challenges faced during serialization due to the initial format not being ready for direct conversion into JSON.
- [01:31:54](https://youtu.be/Ju7v6vgfEt8?t=5514) They emphasize the importance of normalizing data forms before serialization and how carrying around typing contexts aids in achieving this goal.
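
Why a context has to be carried around during serialization can be shown with a toy model. The types `Ctx` and `Local` and the function `to_json` are hypothetical and unrelated to the real compiler data structures: the IR stores only an interned index for each type, so it must be normalized through the context before it becomes meaningful JSON.

```rust
/// Hypothetical stand-in for a compiler context that interns type names.
struct Ctx {
    types: Vec<String>,
}

impl Ctx {
    /// Returns the index of `name`, inserting it on first use.
    fn intern(&mut self, name: &str) -> usize {
        if let Some(i) = self.types.iter().position(|t| t == name) {
            return i;
        }
        self.types.push(name.to_string());
        self.types.len() - 1
    }

    /// Maps an interned index back to the type name.
    fn resolve(&self, id: usize) -> &str {
        &self.types[id]
    }
}

/// An IR item that is only meaningful together with its `Ctx`.
struct Local {
    name: String,
    ty: usize, // index into `Ctx::types`
}

/// Normalizes each local into plain strings and emits JSON by hand.
/// Names are assumed to contain no characters that need JSON escaping.
fn to_json(ctx: &Ctx, locals: &[Local]) -> String {
    let items: Vec<String> = locals
        .iter()
        .map(|l| format!("{{\"name\":\"{}\",\"ty\":\"{}\"}}", l.name, ctx.resolve(l.ty)))
        .collect();
    format!("[{}]", items.join(","))
}

fn main() {
    let mut ctx = Ctx { types: Vec::new() };
    let locals = vec![
        Local { name: "_1".to_string(), ty: ctx.intern("i32") },
        Local { name: "_2".to_string(), ty: ctx.intern("bool") },
    ];
    println!("{}", to_json(&ctx, &locals));
}
```

Serializing `Local` directly would only write the raw index, which means nothing outside the process that produced it.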
## Custom Code Generation Backends

- [01:32:29](https://youtu.be/Ju7v6vgfEt8?t=5549) Moving on from interacting with the IR levels, they discuss custom code generation backends for the Rust compiler, beyond the default LLVM backend and the alternative GCC backend.
- [01:33:08](https://youtu.be/Ju7v6vgfEt8?t=5588) By implementing the **`CodegenBackend` trait**, developers can create tailored code generation solutions suited for specific use cases like **custom blockchains**.
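
The pluggable-backend idea can be sketched with a much simpler trait than the real one. The `Backend` trait, the `Instr` enum and both backends below are invented for illustration; the actual `CodegenBackend` trait in rustc has a different, unstable interface.

```rust
/// Instructions of a tiny, made-up intermediate representation.
#[derive(Debug, Clone, Copy)]
enum Instr {
    Push(i64),
    Add,
    Mul,
}

/// Hypothetical, heavily simplified backend interface.
trait Backend {
    fn name(&self) -> &'static str;
    fn emit(&self, ir: &[Instr]) -> String;
}

/// Emits textual assembly for an imaginary stack-based VM.
struct StackVmBackend;

impl Backend for StackVmBackend {
    fn name(&self) -> &'static str {
        "stack-vm"
    }
    fn emit(&self, ir: &[Instr]) -> String {
        ir.iter()
            .map(|i| match i {
                Instr::Push(n) => format!("PUSH {n}"),
                Instr::Add => "ADD".to_string(),
                Instr::Mul => "MUL".to_string(),
            })
            .collect::<Vec<_>>()
            .join("\n")
    }
}

/// Evaluates the program up front and emits only the final value.
struct ConstEvalBackend;

impl Backend for ConstEvalBackend {
    fn name(&self) -> &'static str {
        "const-eval"
    }
    fn emit(&self, ir: &[Instr]) -> String {
        let mut stack: Vec<i64> = Vec::new();
        for instr in ir {
            match *instr {
                Instr::Push(n) => stack.push(n),
                Instr::Add => {
                    let rhs = stack.pop().expect("stack underflow");
                    let lhs = stack.pop().expect("stack underflow");
                    stack.push(lhs + rhs);
                }
                Instr::Mul => {
                    let rhs = stack.pop().expect("stack underflow");
                    let lhs = stack.pop().expect("stack underflow");
                    stack.push(lhs * rhs);
                }
            }
        }
        format!("RESULT {}", stack.pop().expect("empty program"))
    }
}

/// The driver only sees the trait, so backends can be swapped freely.
fn compile(ir: &[Instr], backend: &dyn Backend) -> String {
    format!("; backend: {}\n{}", backend.name(), backend.emit(ir))
}

fn main() {
    // (2 + 3) * 4
    let ir = [Instr::Push(2), Instr::Push(3), Instr::Add, Instr::Push(4), Instr::Mul];
    println!("{}\n", compile(&ir, &StackVmBackend));
    println!("{}", compile(&ir, &ConstEvalBackend));
}
```

The front end and the IR stay the same; only the object passed to `compile` changes, which is the property that makes a custom target such as a blockchain VM plausible.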
## Potential Use Cases in Blockchain Development
- [01:34:04](https://youtu.be/Ju7v6vgfEt8?t=5644) The speaker speculates about **compiling Rust for unique blockchain environments** that require specialized virtual machines (VMs), suggesting it’s feasible though not widely reported.
# Understanding Rust Compiler Configuration and Building Process
## Initial Thoughts on MIR Consumption
- [01:35:00](https://youtu.be/Ju7v6vgfEt8?t=5700) The speaker is unsure whether MIR is consumed from a stream anywhere, noting that they have not come across serialized MIR being used. This suggests MIR is not widely used as a portable format.
## Exploring Rust Environment
- [01:35:37](https://youtu.be/Ju7v6vgfEt8?t=5737) Discussion of using a **layer 2 (L2) blockchain** with Rust touches on the potential for direct access to resources, allowing users to write custom Rust programs. However, no concrete answer is given regarding its feasibility.
## Bounded Model Checking with a Codegen Backend
- [01:36:00](https://youtu.be/Ju7v6vgfEt8?t=5760) The speaker mentions a project called [**Kani**](https://model-checking.github.io/kani/) (_open-source verification tool that uses [model checking](https://model-checking.github.io/kani/tool-comparison.html) to analyze Rust programs._) that uses this custom-backend approach to produce the data needed by the C Bounded Model Checker (CBMC). They encourage further exploration into how the backend traits are overridden in real-world applications.
## Setting Up rustc for Compiler Work
- [01:36:34](https://youtu.be/Ju7v6vgfEt8?t=5794) Transitioning to **building rustc**, the speaker highlights challenges faced when modifying the compiler and offers tips to ease this process.
- They emphasize the **importance of selecting appropriate build scripts** during setup.
**`./x.py setup`**

When prompted for a profile, type **b** (the `compiler` profile).
**`time ./x build`**
## Build Scripts and Configuration Options
- [01:37:30](https://youtu.be/Ju7v6vgfEt8?t=5850) As they initiate the build process, they explain that choosing specific profiles (like 'compiler') provides configurations tailored for easier modifications while working on the Rust compiler itself. This includes options like debugging settings and incremental compilation enabled by default.
## Understanding Compiler Profiles and Configurations
- [01:39:20](https://youtu.be/Ju7v6vgfEt8?t=5960) The configuration file (`config.example.toml`) is discussed as containing numerous customizable settings essential for building the compiler effectively.
- The speaker notes that defaults are set up by the Rust team to assist newcomers in navigating these options efficiently.
## Stages of Compiler Building Process
# [01:42:48](https://youtu.be/Ju7v6vgfEt8?t=6168) Rust Compiler Optimization Techniques
## Overview of the rustc Dev Guide
- [01:42:48](https://youtu.be/Ju7v6vgfEt8?t=6168) The speaker encourages viewers to refer to the [**rustc Dev guide**](https://rustc-dev-guide.rust-lang.org/building/how-to-build-and-run.html), which serves as a comprehensive resource for building and running the Rust compiler.
- [01:42:48](https://youtu.be/Ju7v6vgfEt8?t=6168) The guide includes [**suggested workflows**](https://rustc-dev-guide.rust-lang.org/building/suggested.html) that streamline processes like profiling, enhancing overall efficiency.
## Improving Compilation Speed
- [01:43:10](https://youtu.be/Ju7v6vgfEt8?t=6190) Initial compilation times can be lengthy; for example, a first run took roughly 27 minutes of user (CPU) time.
- [01:43:44](https://youtu.be/Ju7v6vgfEt8?t=6224) Even with multi-threading bringing that down to about 3 minutes of real (wall-clock) time, the build is still too slow for iterative development.
## Incremental Compilation Strategy
- [01:44:44](https://youtu.be/Ju7v6vgfEt8?t=6284) After making minor changes in the code (_e.g., modifying a function_), there's no need to rebuild the entire compiler.
- [01:45:29](https://youtu.be/Ju7v6vgfEt8?t=6329) By using **incremental compilation** and **retaining stage one** components, significant reductions in build times can be achieved.
## Performance Metrics
- [01:46:18](https://youtu.be/Ju7v6vgfEt8?t=6378) With incremental compilation, build times improved dramatically to 21 seconds of real time and only 1 minute and 20 seconds of user time.
- **`./x build --keep-stage 1`**
```
# full build
real 3m18.777s
user 27m9.879s
sys 1m5.075s

# incremental rebuild with --keep-stage 1
real 0m21.599s
user 1m19.740s
sys 0m10.388s
```
- [01:46:57](https://youtu.be/Ju7v6vgfEt8?t=6417) The speaker demonstrates how to locate the compiled binary within the build directory after making changes. `./build/x86_64-unknown-linux-gnu/stage1/bin/rustc -Z unpretty=mir hello.rs`
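
To put the two timing runs side by side, the ratios can be computed directly from the `time` output. The helper names `parse_time` and `speedup` are made up for this sketch, and the parser only handles the `XmY.YYYs` form shown in the measurements. With the figures from the talk, wall-clock time drops by roughly a factor of 9 and CPU time by roughly a factor of 20.

```rust
/// Parses a duration as printed by the shell's `time`, e.g. "3m18.777s",
/// and returns it in seconds. Returns None for any other format.
fn parse_time(s: &str) -> Option<f64> {
    let (mins, rest) = s.split_once('m')?;
    let secs = rest.strip_suffix('s')?;
    Some(mins.parse::<f64>().ok()? * 60.0 + secs.parse::<f64>().ok()?)
}

/// Ratio of the time before an optimization to the time after it.
fn speedup(before: &str, after: &str) -> Option<f64> {
    Some(parse_time(before)? / parse_time(after)?)
}

fn main() {
    let real = speedup("3m18.777s", "0m21.599s").expect("valid times");
    let user = speedup("27m9.879s", "1m19.740s").expect("valid times");
    println!("real: {real:.1}x faster, user: {user:.1}x less CPU time");
}
```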
## Practical Application of Changes
- [01:48:14](https://youtu.be/Ju7v6vgfEt8?t=6494) A simple change was made in a Rust program (hello.rs), showcasing how modifications are reflected in the output.
- [01:48:48](https://youtu.be/Ju7v6vgfEt8?t=6528) The speaker emphasizes that even small changes can lead to meaningful improvements in performance metrics.
## Advanced Build Techniques
- [01:49:22](https://youtu.be/Ju7v6vgfEt8?t=6562) Further optimizations involve specifying **library builds** while maintaining stage one settings for faster results.
- `./x build library --keep-stage 1`
- [01:50:07](https://youtu.be/Ju7v6vgfEt8?t=6607) Understanding build dependencies is crucial; certain operations must occur sequentially (_e.g., rustdoc after library_).
## Conclusion and Q&A Session
- [01:51:07](https://youtu.be/Ju7v6vgfEt8?t=6667) The session concludes with an invitation for questions, starting with one about common backend swaps such as **Kani**, **GCC**, and **LLVM**.
# [01:52:05](https://youtu.be/Ju7v6vgfEt8?t=6725) Compiler Projects and Their Challenges
## Overview of Compiler Projects
- [01:52:05](https://youtu.be/Ju7v6vgfEt8?t=6725) Discussion on projects like Aeneas and Charon that focus on exchanging the codegen backend to enhance interactive theorem proving within formal methods.
- [01:52:27](https://youtu.be/Ju7v6vgfEt8?t=6747) Mention of existing queries in the Rust compiler, highlighting limited options due to predetermined namespaces.
## Namespace Clarifications
- [01:53:03](https://youtu.be/Ju7v6vgfEt8?t=6783) Explanation of namespace resolution, emphasizing that function names should not conflict with query names during usage.
- [01:53:28](https://youtu.be/Ju7v6vgfEt8?t=6808) Insight into different Intermediate Representation (IR) layers accessible during compilation, showcasing how they interact throughout the process.
## Internal Compiler Errors
- [01:54:15](https://youtu.be/Ju7v6vgfEt8?t=6855) Acknowledgment of bugs in compilers, including **rustc**, with a comparison to other compilers such as **Solidity** and **Vyper**, which also have historical bugs.
- [01:54:47](https://youtu.be/Ju7v6vgfEt8?t=6887) Definition of an **Internal Compiler Error (ICE)**, characterized by unusual error messages indicating issues within the compiler rather than user code.
## Reporting Bugs
- [01:55:24](https://youtu.be/Ju7v6vgfEt8?t=6924) Guidance on reporting ICE occurrences to the compiler's GitHub issue board for resolution, stressing the importance of providing context for errors.
---
# Transcription
- [00:00:04](https://www.youtube.com/watch?v=n-ym1utpzhk?t=4) ➜ uh hello everyone my name's uh Daniel I work for runtime verification I am a formal verification engineer there and a lot of what I do is uh
- [00:00:16](https://www.youtube.com/watch?v=n-ym1utpzhk?t=16) ➜ related to rust and the rust compiler uh and so I thought that I'd share at least a cursory overview about that with you here today um a cursory overview that's
- [00:00:28](https://www.youtube.com/watch?v=n-ym1utpzhk?t=28) ➜ going to try to go from start to finish as much as possible I think that is a a somewhat ambitious goal to fit within an hour um there's it's a pretty
- [00:00:40](https://www.youtube.com/watch?v=n-ym1utpzhk?t=40) ➜ complicated piece of software and I guess trying to squeeze uh all of the intro details even inside an hour is a bit like trying to fit a school bus into
- [00:00:49](https://www.youtube.com/watch?v=n-ym1utpzhk?t=49) ➜ a your living room but we'll uh try to make it happen uh can everyone see the screen share can I maybe get some yeah we can see the um
- [00:01:01](https://www.youtube.com/watch?v=n-ym1utpzhk?t=61) ➜ you might have to move your uh what is it called yeah just just move it up into the upper right corner I guess or
- [00:01:06](https://www.youtube.com/watch?v=n-ym1utpzhk?t=66) ➜ something oh this you guys can see the overlay yeah oh nice okay how about that that's probably you won't be able to see any reactions if you put it down there I
- [00:01:15](https://www.youtube.com/watch?v=n-ym1utpzhk?t=75) ➜ don't know wait let me try ah yeah so that yeah so just as an FYI that's okay I I'll fly blind okay um okay so uh some stuff that I'm going to
- [00:01:27](https://www.youtube.com/watch?v=n-ym1utpzhk?t=87) ➜ talk about today is uh a b basic overview of what rust as a language is is trying to do this this is only going to be brief like 3 to 5 minutes just
- [00:01:38](https://www.youtube.com/watch?v=n-ym1utpzhk?t=98) ➜ just talking about uh what the language is which is important because uh after that we're going to be talking about the compiler
- [00:01:46](https://www.youtube.com/watch?v=n-ym1utpzhk?t=106) ➜ and the compiler's goal is to take a source programming language that we that we write in rust rust source code it's going to turn it into a binary and so uh
- [00:01:56](https://www.youtube.com/watch?v=n-ym1utpzhk?t=116) ➜ it's important to know like what what is it about rust that this compiler has to maintain and and enforce as it goes through these Transitions and so there's
- [00:02:06](https://www.youtube.com/watch?v=n-ym1utpzhk?t=126) ➜ a bunch of things that it's going to need to make sure uh properties that are held by this Source language and um to do that we'll have to do rounds of
- [00:02:14](https://www.youtube.com/watch?v=n-ym1utpzhk?t=134) ➜ analysis and so it's going to transform the language into different intermediate representations uh in order to perform different analysis um uh the best points
- [00:02:26](https://www.youtube.com/watch?v=n-ym1utpzhk?t=146) ➜ in time that it can so so these two points uh here will take a bit of time maybe between 20 and 30 minutes but this is where we're
- [00:02:35](https://www.youtube.com/watch?v=n-ym1utpzhk?t=155) ➜ actually going to be looking inside the rust compiler uh after that I will talk about um using callbacks which is a more interactive uh way of dealing with the
- [00:02:46](https://www.youtube.com/watch?v=n-ym1utpzhk?t=166) ➜ rust compiler and um these last two points unless uh unless I time travel I don't think I'm going to get to them because I did a bit of a practice run
- [00:02:56](https://www.youtube.com/watch?v=n-ym1utpzhk?t=176) ➜ and I had to strip a bunch of content out of here to even just just make it towards an hour but uh maybe that can be left as a teaser for another
- [00:03:05](https://www.youtube.com/watch?v=n-ym1utpzhk?t=185) ➜ day uh so let's start off what is rust so rust as defined by the rust-lang group themselves is a language empowering everyone to build reliable and efficient
- [00:03:17](https://www.youtube.com/watch?v=n-ym1utpzhk?t=197) ➜ software so I think that what they mean by empowering is uh you get a systems level language that gives you explicit control over your memory allocation uh
- [00:03:28](https://www.youtube.com/watch?v=n-ym1utpzhk?t=208) ➜ and it's able to compile to many targets and it's interoperable with other languages through foreign function interface so it's pretty powerful it
- [00:03:36](https://www.youtube.com/watch?v=n-ym1utpzhk?t=216) ➜ does a lot uh and furthermore it's efficient uh this is able to be compiled down small enough to run on embedded systems and it doesn't have a garbage
- [00:03:46](https://www.youtube.com/watch?v=n-ym1utpzhk?t=226) ➜ collector and not having a garbage collector is a little interesting because it is reliable uh without that garbage collector but its type
- [00:03:55](https://www.youtube.com/watch?v=n-ym1utpzhk?t=235) ➜ system um is able to give some strong guarantees of memory safety and thread safety um and by memory safety I mean that uh the typical foot guns that might
- [00:04:12](https://www.youtube.com/watch?v=n-ym1utpzhk?t=252) ➜ be available to you in languages like C where you're able to uh control your own memory allocation like buffer overflow use after free dangling pointers these
- [00:04:24](https://www.youtube.com/watch?v=n-ym1utpzhk?t=264) ➜ are prohibited by the type system so you know if it compiles that you're not going to run into that error uh you also know that there's thread
- [00:04:32](https://www.youtube.com/watch?v=n-ym1utpzhk?t=272) ➜ safety in the sense of uh data race freedom when you do concurrency uh this doesn't prohibit you from I think logically deadlocking or
- [00:04:45](https://www.youtube.com/watch?v=n-ym1utpzhk?t=285) ➜ livelocking your code um in that sense you're thread unsafe but uh you are at least free of data race data races so all of this is enforced at compile time
- [00:04:59](https://www.youtube.com/watch?v=n-ym1utpzhk?t=299) ➜ and um this is done by the fact that rust has borrowing and lifetime semantics so the way that it deals with
- [00:05:09](https://www.youtube.com/watch?v=n-ym1utpzhk?t=309) ➜ references uh that if you have shared data uh so this is multiple borrows multiple references out in the world that you are unable to mutate them so
- [00:05:22](https://www.youtube.com/watch?v=n-ym1utpzhk?t=322) ➜ you know that if you have shared data you can't change the data you also know that if you have mutable data you're unable to alias it so you can't make two
- [00:05:31](https://www.youtube.com/watch?v=n-ym1utpzhk?t=331) ➜ references uh to it two mutable borrows and through uh making sure that these are all enforced at compile time is how they get those strong
- [00:05:41](https://www.youtube.com/watch?v=n-ym1utpzhk?t=341) ➜ guarantees as with anything though there are some caveats so the memory safety and thread safety comes with the assumption that you haven't
- [00:05:49](https://www.youtube.com/watch?v=n-ym1utpzhk?t=349) ➜ inappropriately used the unsafe keyword using the unsafe keyword gives you a lot more power to directly manipulate things like memory and raw pointers uh but it
- [00:06:01](https://www.youtube.com/watch?v=n-ym1utpzhk?t=361) ➜ then uh also empowers you with the foot guns of getting all of the uh memory bugs back in So on the flip side to that you are
- [00:06:13](https://www.youtube.com/watch?v=n-ym1utpzhk?t=373) ➜ also able to safely uh mutate uh shared data and Alias mutable data with the appropriate use of the unsafe keyword so if you use it
- [00:06:25](https://www.youtube.com/watch?v=n-ym1utpzhk?t=385) ➜ appropriately you can do these things and be empowered and so uh a good way to do that would be to use the standard Library which is
- [00:06:33](https://www.youtube.com/watch?v=n-ym1utpzhk?t=393) ➜ essentially wrapping unsafe code to give you things like vectors and uh Atomic references for concurrency so uh there are canonical ways to use unsafe code in
- [00:06:48](https://www.youtube.com/watch?v=n-ym1utpzhk?t=408) ➜ a reasonable way so that's uh that's what's going on with the language of rust and so now moving on to rustc rustc is as I
- [00:06:59](https://www.youtube.com/watch?v=n-ym1utpzhk?t=419) ➜ mentioned the program that's going to take the rust Source language down to a Target binary and uh we'll have a look at
- [00:07:11](https://www.youtube.com/watch?v=n-ym1utpzhk?t=431) ➜ the GitHub for that here so this is the rust-lang uh the rust compiler GitHub and maybe I'll bump that up in case it's a bit hard for people to see uh also feel
- [00:07:22](https://www.youtube.com/watch?v=n-ym1utpzhk?t=442) ➜ free to turn off the mic and interrupt me at any time I love questions and interjections they fuel me but uh here uh maybe I'll point out
- [00:07:34](https://www.youtube.com/watch?v=n-ym1utpzhk?t=454) ➜ there's four interesting directories I'll do a bit of orientation because when you first get to this GitHub it can be a little
- [00:07:42](https://www.youtube.com/watch?v=n-ym1utpzhk?t=462) ➜ overwhelming uh tests is all of your test code so there's not too much that we'll need to talk about in there today the source directory is not relevant to
- [00:07:50](https://www.youtube.com/watch?v=n-ym1utpzhk?t=470) ➜ the conversations that we're having today there's a bunch of stuff with documentation and other things auxiliary to the point of this talk um and then
- [00:07:59](https://www.youtube.com/watch?v=n-ym1utpzhk?t=479) ➜ there's compiler and library so compiler is where most of the code that we're going to interact with today exists and uh that also does rely um on the library
- [00:08:10](https://www.youtube.com/watch?v=n-ym1utpzhk?t=490) ➜ code existing so I'll take a look in these in a moment uh if you clone this compiler and uh build it there'll be a fourth directory called build and that
- [00:08:22](https://www.youtube.com/watch?v=n-ym1utpzhk?t=502) ➜ will contain the the actual rustc binary uh that you can run it takes a long time to build uh and it does use up quite a bit of space to do it but um if
- [00:08:33](https://www.youtube.com/watch?v=n-ym1utpzhk?t=513) ➜ you want to do it uh you're more than able to in order to do it you will need to use the scripts down here which are the x scripts so there's x.py and x x.py
- [00:08:44](https://www.youtube.com/watch?v=n-ym1utpzhk?t=524) ➜ allows you to have uh some configuration um maybe stuff that we'll be able to touch on later but if you do that it will put some compilers in your
- [00:08:54](https://www.youtube.com/watch?v=n-ym1utpzhk?t=534) ➜ build directory uh and when I say compilers that's because the the rust compiler is a bootstrap compiler meaning that it is
- [00:09:06](https://www.youtube.com/watch?v=n-ym1utpzhk?t=546) ➜ the code to write the compiler for rust is itself written in Rust so if we take a look in one of these many many crates here that make up
- [00:09:17](https://www.youtube.com/watch?v=n-ym1utpzhk?t=557) ➜ the compiler itself we choose here in here we see that this is all rust code so this is code that makes up rustc and it
- [00:09:31](https://www.youtube.com/watch?v=n-ym1utpzhk?t=571) ➜ itself is uh written in rust uh which might be a bit confusing uh to some people if you're unfamiliar with bootstrapping um but it's pretty clever
- [00:09:41](https://www.youtube.com/watch?v=n-ym1utpzhk?t=581) ➜ compiler design and I encourage you to to look into it um if you would be interested so inside the compiler
- [00:09:51](https://www.youtube.com/watch?v=n-ym1utpzhk?t=591) ➜ directory we have many many crates and so these crates are responsible from everything from going from the binary oh sorry the source language down to the
- [00:10:00](https://www.youtube.com/watch?v=n-ym1utpzhk?t=600) ➜ binary and Performing all the analyses on the way all of this is going to be happening at different stages along here uh when I first started writing
- [00:10:09](https://www.youtube.com/watch?v=n-ym1utpzhk?t=609) ➜ this talk I thought that I might actually go around through these directories but uhor in the chat could you
- [00:10:20](https://www.youtube.com/watch?v=n-ym1utpzhk?t=620) ➜ sweet to build a compile using langu to see compilers also build and C uh I don't think that this is a rule like you can build compilers in ex external
- [00:10:29](https://www.youtube.com/watch?v=n-ym1utpzhk?t=629) ➜ languages and in fact there's a lot of frameworks like you can use Java CUP and uh I think it's called Flex Yacc uh these are all frameworks for how to build
- [00:10:40](https://www.youtube.com/watch?v=n-ym1utpzhk?t=640) ➜ um uh compilers and parsers for uh a grammar that you can define uh on the spot so it isn't a particular rule but uh it
- [00:10:52](https://www.youtube.com/watch?v=n-ym1utpzhk?t=652) ➜ is I think there are many different bootstrap compilers and um there's definitely a lot of benefit that people have to doing their own
- [00:11:05](https://www.youtube.com/watch?v=n-ym1utpzhk?t=665) ➜ bootstrapping what compiles rust into a binary so that would be the rust C program so when you
- [00:11:18](https://www.youtube.com/watch?v=n-ym1utpzhk?t=678) ➜ typically if you're experienced with using surface language rust you might have used a tool called cargo I don't know if you can see this uh that's being
- [00:11:27](https://www.youtube.com/watch?v=n-ym1utpzhk?t=687) ➜ shared on the screen but cargo is um your sort of canonical way of interacting with the rust compiler but when you do a cargo run what this is
- [00:11:39](https://www.youtube.com/watch?v=n-ym1utpzhk?t=699) ➜ actually doing under the hood for you is it's running rustc um and it does it with a bunch of arguments to make things easier for you so that you don't have to
- [00:11:50](https://www.youtube.com/watch?v=n-ym1utpzhk?t=710) ➜ worry about that but rustc here is the program that is going to take the binar- uh the source
- [00:12:00](https://www.youtube.com/watch?v=n-ym1utpzhk?t=720) ➜ language down into a binary and so back over here this uh GitHub is the GitHub that
- [00:12:11](https://www.youtube.com/watch?v=n-ym1utpzhk?t=731) ➜ contains all of the source code that will that will give us that rustc binary if we build it now
- [00:12:20](https://www.youtube.com/watch?v=n-ym1utpzhk?t=740) ➜ uh it's also worth mentioning that uh there are a couple of different versions of the compiler so when I was back here if you had seen when I did rustc
- [00:12:31](https://www.youtube.com/watch?v=n-ym1utpzhk?t=751) ➜ version it comes up with the word nightly here and this is because there on this repo there are nightly pushes to the rust
- [00:12:42](https://www.youtube.com/watch?v=n-ym1utpzhk?t=762) ➜ compiler um like is in every single night uh with changes so this is all sorts of different activity coming from the
- [00:12:51](https://www.youtube.com/watch?v=n-ym1utpzhk?t=771) ➜ community to try and improve the compiler add new features clean things up fix bugs but um people generally want something that's a bit more stable and
- [00:13:02](https://www.youtube.com/watch?v=n-ym1utpzhk?t=782) ➜ so there are also stable releases of the compiler uh which are more reliable um and they're uh excluding more of the experimental
- [00:13:13](https://www.youtube.com/watch?v=n-ym1utpzhk?t=793) ➜ features so uh aside from the the stuff that's in this compiler directory there's also the library directory and this is containing
- [00:13:24](https://www.youtube.com/watch?v=n-ym1utpzhk?t=804) ➜ a lot of The Primitives that exist in the language things like the core library and here you have uh sort of the core types and things like uh you know
- [00:13:35](https://www.youtube.com/watch?v=n-ym1utpzhk?t=815) ➜ pointers that sort of stuff uh things to do with all your types and handling panics and all of that kind of thing and in here in allocation there's all stuff
- [00:13:47](https://www.youtube.com/watch?v=n-ym1utpzhk?t=827) ➜ to do with allocation and there's the standard Library down the bottom so this has things like vectors and slices and um all stuff that if you're familiar
- [00:13:57](https://www.youtube.com/watch?v=n-ym1utpzhk?t=837) ➜ with using rust things like print line these types of macro functions are are inside the the standard crate here so all of this is in in some way useful
- [00:14:10](https://www.youtube.com/watch?v=n-ym1utpzhk?t=850) ➜ when building the compiler the the this directory will depend on some some things inside here uh getting back to the
- [00:14:25](https://www.youtube.com/watch?v=n-ym1utpzhk?t=865) ➜ slides so uh I mentioned the bootstrapping and the nightly releases if you um are at all wanting to interact with the
- [00:14:37](https://www.youtube.com/watch?v=n-ym1utpzhk?t=877) ➜ community and get involved in learning more about the rust compiler there is a zulip it's a fantastic place to go ask questions I encourage everyone that
- [00:14:46](https://www.youtube.com/watch?v=n-ym1utpzhk?t=886) ➜ would be interested to go there so that that's the GitHub for the compiler that has all of the source code
- [00:14:57](https://www.youtube.com/watch?v=n-ym1utpzhk?t=897) ➜ but this is trying to implement an idea and the idea is of course going from the source code down to the binary and so this is my illustration of all of the
- [00:15:07](https://www.youtube.com/watch?v=n-ym1utpzhk?t=907) ➜ things that we need to go through in order to achieve that goal so at the top level up here uh well I I break the the process down into three main stages
- [00:15:17](https://www.youtube.com/watch?v=n-ym1utpzhk?t=917) ➜ there's the rust source code level there's the rust intermediate representation level and uh finally we have code generation and so each one of
- [00:15:26](https://www.youtube.com/watch?v=n-ym1utpzhk?t=926) ➜ these inner boxes is a different uh representation of the code on that way and some of these dot points are different things that we have to do to
- [00:15:36](https://www.youtube.com/watch?v=n-ym1utpzhk?t=936) ➜ the code to get it into the the format that we would like and there's different rounds of analysis here in order to make sure that the code is well formed and
- [00:15:46](https://www.youtube.com/watch?v=n-ym1utpzhk?t=946) ➜ conforming to what it is that that language promises to do so I'll start going down through here and I'll use a hello world example as a motivating
- [00:15:57](https://www.youtube.com/watch?v=n-ym1utpzhk?t=957) ➜ example for us to view some of the different forms so if you're familiar with rust this is pretty much as simple a program as you can
- [00:16:06](https://www.youtube.com/watch?v=n-ym1utpzhk?t=966) ➜ get it's you define the main function and you tell it to print out uh hello world so rustc when it sees this uh source code the first thing that it's
- [00:16:19](https://www.youtube.com/watch?v=n-ym1utpzhk?t=979) ➜ going to do is decide that it has to transfer this into its next form which is an abstract syntax tree but on the way to getting there it has to do
- [00:16:29](https://www.youtube.com/watch?v=n-ym1utpzhk?t=989) ➜ a few rounds of analysis that I'll point out so I'm aware that some people might not be familiar with compiler um terms here so I've I've added some definitions
- [00:16:41](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1001) ➜ but an abstract syntax tree is uh a a tree like a tree is in a graph tree representation of a a source program and so to to get the tree we need to go
- [00:16:56](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1016) ➜ through uh two things uh and that's called lexing and parsing so lexing is where we take in a stream of characters and we're going to read
- [00:17:08](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1028) ➜ them in and choose uh or not choose but we we recognize tokens that that they match in the language so up uh here uh we see fn and then some whitespace and
- [00:17:21](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1041) ➜ we know that this is uh the start of the Declaration of a function so this is the function definer um and seeing uh string of characters before either white space
- [00:17:31](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1051) ➜ or the open parentheses means this is the identifier of the function name and so lexing is to take in all of those tokens and once we've done that we parse them
- [00:17:42](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1062) ➜ into the tree this is where we take those tokens and put them into the tree what that looks like when I say tree to give you a graphical idea from the
- [00:17:52](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1072) ➜ Wikipedia is this is what an abstract syntax tree looks like um and this is representing a simple program where there's uh some statements a
- [00:18:03](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1083) ➜ sequence of statements one of them is a while and a while has a condition and a body this condition is the comparison of a variable with a constant and if you go
- [00:18:15](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1095) ➜ into the body there is a branch in that Branch there's a condition that's going to compare two variables A and B and it has an if and
- [00:18:24](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1104) ➜ an else uh both of these being assignments so this tree is a way of representing a program uh that is useful uh for
- [00:18:36](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1116) ➜ compilers and so the first thing that rust is going to do is try to get there so it lexes and paes to get into the as
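
To make the lexing step described above more concrete, here is a minimal sketch of a toy lexer. It is not how `rustc` lexes; the `Token` enum and the `lex` function are hypothetical names invented for this illustration, and the sketch ignores escapes, comments, numbers and multi-character operators.

```rust
// A toy lexer, for illustration only. rustc's real lexer handles far more
// (escapes, comments, lifetimes, numeric literals, multi-character operators).

#[derive(Debug, Clone, PartialEq)]
enum Token {
    Fn,             // the `fn` keyword
    Ident(String),  // names such as `main` or `println`
    Punct(char),    // single-character punctuation such as `(` or `{`
    StrLit(String), // a string literal, stored without its quotes
}

fn lex(source: &str) -> Vec<Token> {
    let mut tokens = Vec::new();
    let mut chars = source.chars().peekable();

    while let Some(&c) = chars.peek() {
        if c.is_whitespace() {
            // Whitespace only separates tokens; it produces none itself.
            chars.next();
        } else if c.is_alphabetic() || c == '_' {
            // Read a whole word, then decide: keyword or identifier?
            let mut word = String::new();
            while let Some(&w) = chars.peek() {
                if w.is_alphanumeric() || w == '_' {
                    word.push(w);
                    chars.next();
                } else {
                    break;
                }
            }
            tokens.push(if word == "fn" { Token::Fn } else { Token::Ident(word) });
        } else if c == '"' {
            chars.next(); // skip the opening quote
            let mut text = String::new();
            for s in chars.by_ref() {
                if s == '"' {
                    break; // closing quote ends the literal
                }
                text.push(s);
            }
            tokens.push(Token::StrLit(text));
        } else {
            // Everything else is treated as one-character punctuation.
            tokens.push(Token::Punct(c));
            chars.next();
        }
    }
    tokens
}
```

Calling `lex` on the hello world source yields `Fn`, `Ident("main")`, `Punct('(')` and so on. A parser would then consume that flat token list and build the tree.
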
- [00:18:47](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1127) ➜ However, to get all the way down it then needs to expand macros and do name resolution, or expand the macros at least. But we can view (that's
- [00:18:56](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1136) ➜ why I put some whitespace here) what that unexpanded program looks like. So over
- [00:19:04](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1144) ➜ here I have our hello world example, and there's a bunch of commands that I've written down that you can run from `rustc` in order to
- [00:19:15](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1155) ➜ view different points of what's going on inside the compiler on the way. So here this command is using the `-Z` flag
- [00:19:25](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1165) ➜ `unpretty`. These flags are not necessarily easy to find, but if you use `rustc --help`, what it will tell you is that, if you want to know some stuff about the
- [00:19:39](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1179) ➜ compiler, there are some unstable options that you can find with `-Z help`. So `rustc -Z help`
- [00:19:51](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1191) ➜ will spit out a whole bunch of `-Z` compiler flags, and using these you're able to tell it to dump out information about the state, or turn off
- [00:20:04](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1204) ➜ certain things for the compiler, depending on what you're interested in, which could be a whole range of different things. So if we
- [00:20:13](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1213) ➜ wanted to know some stuff about the AST, we might go `rustc -Z help` and then we could grep
- [00:20:25](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1225) ➜ for the AST, and it will tell us here a bunch of flags: `ast-tree` and `ast-tree,expanded`. And so these are two things that we can provide to the `unpretty` flag
- [00:20:38](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1238) ➜ in order for it to dump some state for us to look at. So, having a look at the AST: I've already run both of these commands, and we'll have a
- [00:20:51](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1251) ➜ look at what an abstract syntax tree looks like when Rust is dumping it out. So it doesn't look like that graphical representation where you have nodes and
- [00:21:02](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1262) ➜ arrows pointing to the nodes. Instead it's like a math graph, where you have nodes, and then part of the node will have a pointer
- [00:21:14](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1274) ➜ or indicate what it points to next, a label. And so this here corresponds to this program, if I split it to the right, although it looks quite
- [00:21:30](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1290) ➜ different. But this is the first thing that the Rust compiler has done to try and turn this into the binary: it's taken this hello world program, it's
- [00:21:41](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1301) ➜ started at the crate level, and it's created some idea of there being some items. We can see a main function exists here, which makes sense; it is indeed
- [00:21:52](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1312) ➜ of kind function. And inside here (I know this is hard to read, we won't spend a lot of time on it), but inside here we can find
- [00:22:01](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1321) ➜ a call to `println!` and we can also find our string literal, hello world. So all of this information is in here, it's just been expanded into a graph.
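
For reference, the invocations discussed above look roughly like the comments in the sketch below. The file name `hello.rs` is an assumption (the transcript does not name the file), and `-Z` flags are unstable, so they require a nightly toolchain.

```rust
// hello.rs (assumed file name): the hello world program from the talk.
//
// List the unstable compiler options (nightly only):
//     rustc +nightly -Z help
//
// Dump the AST before macro expansion:
//     rustc +nightly -Zunpretty=ast-tree hello.rs
//
// Dump the AST after macro expansion (a much larger tree):
//     rustc +nightly -Zunpretty=ast-tree,expanded hello.rs
fn main() {
    println!("Hello, world!");
}
```

The output format of these dumps is not stable and can change between nightly releases, so treat it as a debugging aid rather than something to parse.
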
- [00:22:16](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1336) ➜ As we saw in there, `println!` was still there as a macro, but this ends up getting expanded straight
- [00:22:26](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1346) ➜ away. We can use rust-analyzer to predict what this will get expanded into by using the command to expand recursively at the
- [00:22:37](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1357) ➜ macro, and we can see that this `println!` macro will get expanded into the standard library `io` module (or crate) `_print` function. "So is that a
- [00:22:51](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1371) ➜ VS Code plugin you use for that?" Yeah, it absolutely is. So if you are wanting to do stuff with Rust, I would almost certainly encourage you
- [00:23:03](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1383) ➜ to use rust-analyzer. I thought it would tell me how many people used it there, but, oh yeah, it's got four million people that are
- [00:23:17](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1397) ➜ using it. This is part of the rust-lang team, and it's using LSP to give you a bunch of information in the IDE. It's very, very helpful for
- [00:23:30](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1410) ➜ using Rust, and so one of the things that it can do, amongst many other things, is expand macros. "Right, we have another question in the chat, sorry." Yep, go ahead.
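
The macro expansion mentioned above can be approximated by hand. The standard library's `_print` function is internal and not a public API, so the sketch below swaps in `Write::write_fmt` on a caller-supplied writer to show the same shape: a plain function call that receives pre-built format arguments. The function names `print_hello` and `print_hello_to_stdout` are hypothetical.

```rust
use std::io::{self, Write};

// What you write at the surface level:
//     println!("Hello, world!");
//
// After expansion this becomes an ordinary function call that is handed
// `format_args!(...)`. The real target is the standard library's internal
// `_print`; this sketch uses `write_fmt` on any writer as a stand-in.
fn print_hello<W: Write>(out: &mut W) -> io::Result<()> {
    out.write_fmt(format_args!("Hello, world!\n"))
}

// Hypothetical helper that sends the same text to standard output.
fn print_hello_to_stdout() -> io::Result<()> {
    let stdout = io::stdout();
    let mut handle = stdout.lock();
    print_hello(&mut handle)
}
```

Writing into a `Vec<u8>` instead of standard output makes the behaviour easy to check, which is why the sketch is generic over the writer.
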
- [00:23:40](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1420) ➜ Could you read it out to me if possible? "Sure. Why or when would someone ever want to see the AST of their program? Is this tool mostly for people who want
- [00:23:49](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1429) ➜ to work on the compiler?" It is mostly for people who are interested in compilers, but it's not
- [00:24:00](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1440) ➜ exclusively for that. So if you are interested in compilers, going through these different stages is important, but for us, we need to
- [00:24:15](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1455) ➜ understand what's going on for these intermediate representations at Runtime Verification, because we would like to
- [00:24:22](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1462) ➜ build tooling on top of it. So we want to be able to do theorem proving and deductive verification, and build our own interpreters for Rust, and we can't
- [00:24:35](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1475) ➜ just do this at the source language level. We end up needing to have some more low-level representation, and so we're going
- [00:24:45](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1485) ➜ further down to something called the mid-level intermediate representation (MIR). But there are many other programs that are working with intermediate
- [00:24:56](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1496) ➜ representations of the Rust compiler for their own business case that they have. Although I will admit that generally, people looking at this are
- [00:25:07](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1507) ➜ people that are looking at compilers, or interested in the Rust compiler itself. "Okay, another question came in, a question related to the AST: Rust has
- [00:25:17](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1517) ➜ macros that allow you to generate code; are these related?" I think the question is asking about the relationship between
- [00:25:25](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1525) ➜ the AST and the macros. There is one, in the sense that the AST that I showed you before has them unexpanded;
- [00:25:38](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1538) ➜ however, there is then the ability to expand them. So this tree ends up being a lot larger, because all of these macros have been expanded. But the
- [00:25:49](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1549) ➜ existence of an AST isn't dependent on a macro; in that sense they are, like, mutually exclusive. Well, I shouldn't say that actually, because the macro must
- [00:26:01](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1561) ➜ exist in the AST. But those ideas are related in that way. But when you do create a
- [00:26:14](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1574) ➜ macro... if the question is related to when you write a procedural macro in Rust, it is true that you have to think about lexing and
- [00:26:26](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1586) ➜ parsing tokens at that point, which actually probably is what the question meant, from someone who has looked at that. So you do have to think about this
- [00:26:36](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1596) ➜ sort of stuff when you're working at the surface language level for those macros. "Okay, I got in the chat: that answers the question, thank you."
- [00:26:46](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1606) ➜ Nice. So, I mean, obviously we could take a look at all this stuff all day, and I know that everyone attending would be thrilled to do so, but we
- [00:26:58](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1618) ➜ move on. So this is the abstract syntax tree, and another reason why we might want to be looking at this stuff is we might want to debug a program that's
- [00:27:05](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1625) ➜ particularly nefarious, and we have some knowledge of compiler stuff and we want to see what actually is getting spat out.
- [00:27:14](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1634) ➜ So the next thing after the AST that we need to do is transform that into something that's very similar to it but is more amenable to us doing
- [00:27:26](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1646) ➜ analysis too, and so this is going to be the high-level intermediate representation (HIR). In order to get there we need to do some lowering and desugaring,
- [00:27:37](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1657) ➜ and what that means is that, for features of the Rust language that do the same thing, we want to streamline all of those into one
- [00:27:49](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1669) ➜ representation. So in Rust you can have `for` loops, `while` loops, and the infinite loop, just the `loop` keyword with `break` or `return` inside it to exit that loop, and
- [00:28:02](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1682) ➜ all of these three things are ways to have a loop, but HIR wants to get rid of the multiple representations, and it just works with the `loop` keyword.
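
The desugaring described above can be imitated by hand. The two functions below (hypothetical names) compute the same sum: the first uses a `for` loop, the second is a hand-written approximation of what the loop lowers to, namely a plain `loop` with a `match` on the iterator's `next()`. The compiler's actual HIR differs in detail.

```rust
// Surface-level version using a `for` loop.
fn sum_with_for(values: &[i32]) -> i32 {
    let mut total = 0;
    for v in values {
        total += *v;
    }
    total
}

// Hand-written approximation of the desugared form: only `loop`, `match`
// and `break` remain. The compiler's real lowering differs in detail.
fn sum_desugared(values: &[i32]) -> i32 {
    let mut total = 0;
    let mut iter = IntoIterator::into_iter(values);
    loop {
        match iter.next() {
            Some(v) => total += *v, // loop body runs for each item
            None => break,          // iterator exhausted: leave the loop
        }
    }
    total
}
```

A `while cond { body }` loop can be rewritten the same way, as `loop { if cond { body } else { break } }`, so later stages only need to understand one looping construct.
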
- [00:28:13](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1693) ➜ And so, for a bunch of different features that do the same thing, it streamlines them into one. Control flow all gets turned into `match` statements, I'm pretty
- [00:28:24](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1704) ➜ sure, and that's lowering and desugaring. Once we have done that, we're now inside the HIR representation and we can
- [00:28:36](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1716) ➜ do our first rounds of analysis, which are type inference, trait solving and type checking. So this is starting to get to that idea of what the Rust
- [00:28:47](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1727) ➜ compiler guarantees: it guarantees you a type-safe program, but on top of that it has a lot of other things. So I'll
- [00:29:01](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1741) ➜ show roughly: using some commands here we can do the same thing, we can dump `unpretty=hir` and `unpretty=hir-tree`. We won't need to spend much time looking at
- [00:29:12](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1752) ➜ this. This first HIR one almost looks identical to the original program that we started with. So on the left here we have our original hello world in the
- [00:29:26](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1766) ➜ surface language, and the HIR at this point has added in the prelude, it's added in that we're using the standard library; things that we are able
- [00:29:36](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1776) ➜ to elide and don't have to mention at the surface language start to be made explicit. And here we can see that the
- [00:29:45](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1785) ➜ macros have been expanded and the actual construction of this string into a constant is a bit more explicit. There's also the HIR
- [00:29:59](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1799) ➜ tree output, and so this is really more what's happening inside the Rust compiler; it gives you a less pretty-printed view. And so this is almost the
- [00:30:13](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1813) ➜ same as the AST, it looks very similar, except there are some other things going on which are really, really important for the
- [00:30:24](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1824) ➜ compiler to be able to do its analysis: it starts allocating things called `DefId`s to everything that has a body or is something that can be alloc'd, and every
- [00:30:36](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1836) ➜ single expression that exists inside the code is given a `HirId`. And these `DefId`s are not just useful at the HIR level, they're going to get carried down
- [00:30:45](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1845) ➜ through many rounds of analysis as identifiers for different points of interest in the code. "Sorry, what do you mean when you say getting
- [00:30:58](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1858) ➜ alloc'd? What's that mean in this context?" What I meant by an alloc is: an alloc is a thing that exists at some point in memory. So that might be a global, that
- [00:31:14](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1874) ➜ might be... sorry, you can also alloc consts. But there is this idea of things that end up getting
- [00:31:25](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1885) ➜ allocated in memory, and yeah, that's what I meant by that. So that's the HIR representation, and
- [00:31:36](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1896) ➜ when we want to do rounds of analysis, there are some things to show, I guess, for the interest of learning about this.
- [00:31:56](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1916) ➜ At the HIR level the things that we do for analysis are trait solving, type inference, and, what's the last thing that we do? We
- [00:32:11](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1931) ➜ do type checking. So an example of type inference here is the fact that for this `v`, I haven't actually explicitly told it that this is a vector of strings. Now
- [00:32:23](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1943) ➜ rust-analyzer has, in gray, told me "I know that this is a vector of strings", and how it knows that is because it's using the information from the Rust
- [00:32:32](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1952) ➜ compiler to do type inference, and it knows from the context of what's happening here that it must be a vector of strings. So that's type
- [00:32:43](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1963) ➜ inference. And trait resolution, or trait solving, is the Rust compiler deciding, for the generics that I've used: have
- [00:32:55](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1975) ➜ I used generics in a way where it's possible to actually construct functions that will be able to have the concrete instantiation for what I've
- [00:33:08](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1988) ➜ asked of them? That was very wordy, I know, but my example for that here is: I want to log some elements. I have a function `log_elements`, and this is a
- [00:33:17](https://www.youtube.com/watch?v=n-ym1utpzhk?t=1997) ➜ generic function of a vector of `T`. What I do inside `log_elements` is I enumerate the elements and, for my test here, I just print out the index
- [00:33:33](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2013) ➜ and the element. But the `println!` macro says you can only print an element in the way that I've done it if it implements
- [00:33:42](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2022) ➜ `Display`, and so here I've had to put a trait bound on `T`, where I've said `T`, which is generic, can be anything except it
- [00:33:54](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2034) ➜ can't be something that doesn't have `Display`. I know that one thing that must be true of `T` is it has to have `Display`, and then this function will resolve. If I
- [00:34:04](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2044) ➜ take this away, the Rust compiler will say "I can't solve for these traits: you're asking me to satisfy a trait bound and I don't have
- [00:34:18](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2058) ➜ the ability to enforce this, it's too weak." So that's what trait solving is, and it happens at the HIR level as well. And then type
- [00:34:29](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2069) ➜ checking is something that we're probably all familiar with. I'm sure everyone here has written a Rust program, or any kind of
- [00:34:39](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2079) ➜ program, and they've got the types wrong. Like here, you know, you can't assign negative numbers to a `u32`, because it's meant to be
- [00:34:47](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2087) ➜ unsigned. There's a lot more that goes on with type checking; type checking is much more complicated than just making sure you don't put the
- [00:34:55](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2095) ➜ negative number in the unsigned, but that's an example of going through and making sure type checking is
- [00:35:06](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2106) ➜ satisfied. I might go back... all right, so, any more questions, Jeffrey?
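
The three HIR-level analyses described above can be illustrated with a small sketch. The speaker's actual code is not shown in the transcript, so everything here is a reconstruction under assumptions: `log_elements` follows the name used in the talk, while `log_lines`, `inferred_names` and `to_unsigned` are hypothetical helpers added for illustration.

```rust
use std::fmt::Display;

// Trait solving: `T` may be any type, as long as it implements `Display`.
// Removing the `: Display` bound makes the compiler reject the `format!`
// call below, because it can no longer prove that `T` can be printed.
fn log_lines<T: Display>(elements: &[T]) -> Vec<String> {
    elements
        .iter()
        .enumerate()
        .map(|(index, element)| format!("{}: {}", index, element))
        .collect()
}

// Prints each index and element, as described in the talk.
fn log_elements<T: Display>(elements: &[T]) {
    for line in log_lines(elements) {
        println!("{}", line);
    }
}

// Type inference: `v` carries no annotation. The compiler works out from
// how `v` is used that it must be a `Vec<String>`.
fn inferred_names() -> Vec<String> {
    let mut v = Vec::new();
    v.push(String::from("alice"));
    v
}

// Type checking: writing `let x: u32 = -1;` is rejected at compile time.
// This hypothetical helper performs the equivalent check at runtime.
fn to_unsigned(n: i64) -> Option<u32> {
    if n >= 0 && n <= i64::from(u32::MAX) {
        Some(n as u32)
    } else {
        None
    }
}
```

Calling `log_elements(&inferred_names())` prints `0: alice`. Calling it with a slice of a type that has no `Display` implementation fails to compile, which is the trait solver doing its job.
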
- [00:35:27](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2127) ➜ "No, that's pretty clear." Cool. So this is all happening at this stage where we're getting about halfway down in our representations. So this HIR,
- [00:35:36](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2136) ➜ it still looks like an abstract syntax tree, like where we came from, but it has a little more information to allow us to do our
- [00:35:43](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2143) ➜ analysis. The next thing that we want to do is transform that HIR into THIR, the typed high-level intermediate representation, and this isn't that
- [00:35:53](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2153) ➜ different to the HIR, it's just that everything has been type checked and all of the types are completely elaborated. I'm less familiar with
- [00:36:04](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2164) ➜ this than everything else. I did a little bit of Googling, admittedly, to see what the THIR is actually used for, and from what I can see, the analysis
- [00:36:16](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2176) ➜ that's done on the THIR is things like unsafety checking. So if you're using the `unsafe` keyword, or you're using different functions that rely on unsafe, it will
- [00:36:26](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2186) ➜ wait until it's got the THIR to check if you're breaking any of the rules around unsafety. Just quickly, and I mean
- [00:36:37](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2197) ➜ very quickly, because it is completely unreadable, in my opinion at least, when I had to look at it for the first time today: you can use `rustc -Z`
- [00:36:50](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2210) ➜ `unpretty=thir-tree`, and you can use `unpretty=thir-flat`, to have a look at some extremely massive graphs that
- [00:37:06](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2226) ➜ are... I'm sure there's lots of information in there if you're someone that's familiar with reading this stuff. And there's a flattened version of it here.
- [00:37:16](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2236) ➜ That's a look into what this looks like, but the important thing is it's used for rounds of analysis, this one for
- [00:37:26](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2246) ➜ unsafety. So this is still an abstract-syntax-tree-looking form, but the next one that we go to from the THIR, to me this is where things change up. So now we change
- [00:37:39](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2259) ➜ from an AST to... "Sorry, before we transition there's another question here." Yeah. "Checking my understanding here, but we can create
- [00:37:48](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2268) ➜ tools like rust-analyzer using the compiler artifacts provided by the Rust compiler, for example HIR, etc.?" Yeah, so I must admit I'm not entirely booked up
- [00:38:01](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2281) ➜ on how rust-analyzer is doing what it does, but the LSP, that is the Language Server Protocol, is in some way aware... I don't know, it must be
- [00:38:14](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2294) ➜ referencing this source code directly, and it's able to, on the fly, ensure that some things about this process, if not maybe everything, at
- [00:38:28](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2308) ➜ some particular levels are holding. But it isn't compiling your code and creating an artifact in the target directory when rust-analyzer is
- [00:38:38](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2318) ➜ running. But the things that it's telling you are things related to making sure you have a valid AST, whether or not you're breaking type inference, whether
- [00:38:48](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2328) ➜ or not you're breaking your trait bounds. It can show you the type inference, it can tell you when you've type
- [00:38:56](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2336) ➜ checked wrong, so all this is coming from rust-analyzer. So it is definitely related to this; the exact relationship I haven't dug into
- [00:39:08](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2348) ➜ myself. Any more questions? "Nope." Cool. So to get to the MIR we're
- [00:39:26](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2366) ➜ going to need to change the format into a CFG format, and CFG stands for control flow graph. I thought I'd look up a graphical representation for that
- [00:39:40](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2380) ➜ as well, and to my delight, when I went to Wikipedia and clicked on the picture, it actually shows you a Rust MIR program as the
- [00:39:51](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2391) ➜ example. So this is a better view, at least if you want to see what a CFG is in terms of actually seeing the blocks and the arrows. But
- [00:40:03](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2403) ➜ this is some sort of Rust MIR program, it's at the MIR level, and what you can think of is there's a bunch of information declaring some places in
- [00:40:13](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2413) ➜ memory which are going to have some type. So these places in memory are going to be assigned something at some point, and then the actual program
- [00:40:24](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2424) ➜ is here inside these nodes of the graph, with these being the edges of the CFG. And here basic block zero has a list of statements; these statements are
- [00:40:37](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2437) ➜ performing assignments to those places in memory, and then it gets down to a terminator, and this terminator makes some decision: it's going to do a switch
- [00:40:46](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2446) ➜ based on some int, and it either takes this branch or it takes this branch to go to basic block one or basic block two. In basic block one there are two
- [00:40:56](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2456) ➜ assignment statements, and then basic block one always has an edge to basic block four. Basic block four has a few statements and it returns. So
- [00:41:05](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2465) ➜ this is the graphical representation of what a CFG looks like, so this is different to what our AST looked like. Also, these can point back around in
- [00:41:16](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2476) ➜ loops, so it would be reasonable for two to branch and wrap back around into one; that would be a valid CFG.
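
The structure described above (basic blocks holding a list of statements and ending in a terminator that names the next blocks) can be modelled in a few lines. This is a simplified, hypothetical model for illustration, not rustc's actual MIR data structures. The block numbering in `example_cfg` is also an assumption and does not reproduce the Wikipedia figure exactly.

```rust
// A simplified, hypothetical model of a MIR-style control flow graph.
type BlockId = usize;

#[derive(Debug)]
enum Terminator {
    Goto(BlockId),                      // unconditional edge to one block
    SwitchInt { targets: Vec<BlockId> }, // branch on an integer value
    Return,                             // leave the function: no outgoing edges
}

#[derive(Debug)]
struct BasicBlock {
    statements: Vec<String>, // assignments to places, kept as text here
    terminator: Terminator,
}

// The outgoing edges of a block are fully determined by its terminator.
fn successors(block: &BasicBlock) -> Vec<BlockId> {
    match &block.terminator {
        Terminator::Goto(target) => vec![*target],
        Terminator::SwitchInt { targets } => targets.clone(),
        Terminator::Return => Vec::new(),
    }
}

// Depth-first walk collecting every block reachable from `entry`,
// returned in ascending order.
fn reachable(cfg: &[BasicBlock], entry: BlockId) -> Vec<BlockId> {
    let mut seen = vec![false; cfg.len()];
    let mut stack = vec![entry];
    let mut order = Vec::new();
    while let Some(id) = stack.pop() {
        if seen[id] {
            continue;
        }
        seen[id] = true;
        order.push(id);
        stack.extend(successors(&cfg[id]));
    }
    order.sort();
    order
}

// bb0 switches to bb1 or bb2, bb2 branches into bb1, bb1 goes to bb3,
// and bb3 returns.
fn example_cfg() -> Vec<BasicBlock> {
    let block = |stmt: &str, terminator| BasicBlock {
        statements: vec![stmt.to_string()],
        terminator,
    };
    vec![
        block("_1 = const 5_i32", Terminator::SwitchInt { targets: vec![1, 2] }),
        block("_2 = _1", Terminator::Goto(3)),
        block("_2 = const 0_i32", Terminator::Goto(1)),
        block("_0 = _2", Terminator::Return),
    ]
}
```

Because every edge lives in a terminator, analyses such as "which blocks can run after this one" become simple graph walks, which is much harder to express directly on a tree-shaped AST.
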
- [00:41:28](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2488) ➜ Oops, where am I going? Oh, back here. And so we've transformed the THIR into this CFG, and why we would want to do that is because
- [00:41:41](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2501) ➜ it's really beneficial for the next rounds of analysis that we want to do. Here the rounds of analysis with the MIR are the drop elaboration and the
- [00:41:50](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2510) ➜ borrow checking, and so what you can think of is this is kind of that Rust memory safety guarantee that I spoke of right at the start of the talk. This is
- [00:41:58](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2518) ➜ where this is getting enforced: all of the borrows, all of the lifetimes, making sure that all of the memory allocation is handled correctly with allocation and
- [00:42:07](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2527) ➜ freeing, all of that stuff. All of that analysis is happening at this level here, in something called the borrow checker. And so MIR has a really nice...
- [00:42:20](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2540) ➜ so if you remember, this is our hello world example. MIR has a really nice pretty-print option that's useful to get a flavor for what's going
- [00:42:30](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2550) ➜ on, that you can use with `unpretty=mir`. But a nicer option is to use `unpretty=mir` together with this flag here, where you do a minus `PromoteTemps`;
- [00:42:43](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2563) ➜ that turns constant promotion off. Constant promotion isn't relevant for what we're trying to look at here, so that's why I turned it off.
- [00:42:53](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2573) ➜ And so if we split that one to the right and we take a look at this: on the left we have our hello
- [00:43:04](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2584) ➜ world source example, and on the right we have our MIR representation of it. We do get a warning straight away that this is a pretty-printed version of this, so
- [00:43:14](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2594) ➜ all bets are off if you're going to take this a little too literally. And what we can see is, just like before, there's a bunch of declarations of places
- [00:43:29](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2609) ➜ in memory. With a CFG we now have this idea that we don't have variables, we have places in memory, and they have some types. Place zero here is always reserved
- [00:43:41](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2621) ➜ for the return of a function, and the rest of these you can think of as like registers, if you will, or just, yeah, literally places
- [00:43:51](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2631) ➜ in memory. So this function here prints hello world. The way that it does that through a control flow graph is it
- [00:44:00](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2640) ➜ always starts with basic block zero, and it assigns to place four the constant string hello world. It then creates another place in memory, which is a
- [00:44:14](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2654) ➜ non-mutable reference to that constant of hello world, and then it tries to create a, whatever an `Arguments::new_const` is, of three, which was the
- [00:44:30](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2670) ➜ reference to hello world. If that succeeds we're going to branch to `bb1`, and if that fails we unwind, which is like aborting the program, but
- [00:44:40](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2680) ➜ literally unwinding up the stack, all the way back up the call stack. `bb1` here is going to assign to place one in memory the output of printing what was in
- [00:44:56](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2696) ➜ two, and what was in two came from basic block zero, which was the constant hello world. So we're going to literally print hello world. If that doesn't error,
- [00:45:05](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2705) ➜ the terminator here is going to take us to basic block two, and otherwise we unwind back up the stack. And basic block two just returns, because
- [00:45:16](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2716) ➜ the return type here is the unit; it returns nothing. So that's like a crash course on how to read a MIR program. As I
- [00:45:27](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2727) ➜ said, MIR is doing a lot of those memory guarantees like lifetimes and borrow checking. So after all this we're
- [00:45:39](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2739) ➜ finished with the Rust intermediate representations, and the last thing that's left is code generation. Code generation is something I'm going
- [00:45:47](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2747) ➜ to breeze over pretty quickly, but what's important is that what generally ships with Rust is LLVM. But this section here is actually
- [00:46:02](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2762) ➜ really... you can change it. You can exchange it with other really common codegen backends like GCC or Cranelift, but you can also write your own custom
- [00:46:14](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2774) ➜ codegen backend. Maybe you have a particular use case for turning Rust, after these rounds of analysis have happened at MIR, and you say: okay, but
- [00:46:26](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2786) ➜ now I've got that, there's a whole bunch of different code generation from what everyone else in the world is doing that is useful for my particular business
- [00:46:34](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2794) ➜ case. An example of this is the Kani model checker, which takes, I think, all of this, and then uses some interesting code generation for, I
- [00:46:47](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2807) ➜ think it's CBMC or something like this, to do model checking of programs. In order to get from MIR to LLVM you have to go through quite a lot
- [00:47:00](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2820) ➜ of different stages. There's constant evaluation, so all constants in a program that you write in Rust are evaluated before you
- [00:47:10](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2830) ➜ even get to code generation. Well, yeah, it's evaluated every constant that it can, at least. And as well as that you do even more
- [00:47:24](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2844) ➜ lowering. If you remember, lowering was sort of normalizing what was going on, and the lowering that is happening here is called static single assignment.
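
As a small illustration of the constant evaluation mentioned above: a `const fn` used in a constant context is computed by the compiler, so only the result ends up in the generated code. The names `factorial`, `FACT_10` and `SLOTS` are hypothetical and chosen for this sketch.

```rust
// A `const fn` can be evaluated by the compiler when it is used in a
// constant context, and called like a normal function otherwise.
const fn factorial(n: u64) -> u64 {
    let mut result = 1;
    let mut i = 2;
    while i <= n {
        result *= i;
        i += 1;
    }
    result
}

// Evaluated during compilation: code generation only sees the final value.
const FACT_10: u64 = factorial(10);

// Array lengths must be compile-time constants, so this line only compiles
// because `factorial(3)` can be evaluated by the compiler.
const SLOTS: [u8; factorial(3) as usize] = [0; factorial(3) as usize];
```

Here `FACT_10` is `3628800` and `SLOTS` has six elements, both fixed before any machine code is produced.
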
- [00:47:33](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2853) ➜ This is a big simplification as well, there's a lot more that goes on. And then another thing of interest is there's monomorphization.
- [00:47:40](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2860) ➜ So when we write in a powerful language like Rust we have a lot of generics: we want functions to be able to take multiple types and we want
- [00:47:51](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2871) ➜ them to be able to return multiple types, we want traits and all this stuff. But when we get down to a binary, a binary doesn't really understand what any of
- [00:48:00](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2880) ➜ that stuff is. Instead there's a list of different functions, and as the program counter is stepping through these instructions it's
- [00:48:10](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2890) ➜ jumping to different points, and the binary itself doesn't understand a generic function. And so the monomorphization is taking all of the possible generics that
- [00:48:21](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2901) ➜ can be satisfied and creating individual functions in the code generation of the actual assembly, or LLVM in this case, so that you can handle all the
- [00:48:32](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2912) ➜ different types you need to for your generic code. Once you're in LLVM there's a ton of rounds of optimization, and LLVM
- [00:48:42](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2922) ➜ is getting pretty close to a binary representation at this point. Using the commands, I was able to emit it using `llvm-ir` and `llvm-bc`; that's this one here.
- [00:48:57](https://www.youtube.com/watch?v=n-ym1utpzhk?t=2937) ➜ And LLVM is... I want to say this is readable, but I h