{"id":13575329,"url":"https://github.com/cpq/bare-metal-programming-guide","last_synced_at":"2025-05-13T22:11:33.286Z","repository":{"id":59737429,"uuid":"537042935","full_name":"cpq/bare-metal-programming-guide","owner":"cpq","description":"A bare metal programming guide (ARM microcontrollers)","archived":false,"fork":false,"pushed_at":"2025-05-08T17:08:31.000Z","size":4519,"stargazers_count":3683,"open_issues_count":3,"forks_count":330,"subscribers_count":81,"default_branch":"main","last_synced_at":"2025-05-08T18:23:49.409Z","etag":null,"topics":["arm","baremetal","cmsis","embedded-web-server","embedded-webserver","ethernet","gcc","gpio","irq","make","stm32","tutorial","uart","webserver"],"latest_commit_sha":null,"homepage":"","language":"C","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/cpq.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-09-15T13:35:24.000Z","updated_at":"2025-05-08T17:08:37.000Z","dependencies_parsed_at":"2024-01-06T23:55:24.820Z","dependency_job_id":"4d673309-4f63-4eb2-8afe-1910f943b537","html_url":"https://github.com/cpq/bare-metal-programming-guide","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cpq%2Fbare-metal-programming-guide","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cpq%2Fbare-metal-programming-guide/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cpq%2Fbare-metal-programming-guide/releases","manifests_url":"h
ttps://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/cpq%2Fbare-metal-programming-guide/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/cpq","download_url":"https://codeload.github.com/cpq/bare-metal-programming-guide/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254036842,"owners_count":22003654,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["arm","baremetal","cmsis","embedded-web-server","embedded-webserver","ethernet","gcc","gpio","irq","make","stm32","tutorial","uart","webserver"],"created_at":"2024-08-01T15:01:00.029Z","updated_at":"2025-05-13T22:11:33.276Z","avatar_url":"https://github.com/cpq.png","language":"C","funding_links":[],"categories":["C","MCU programming","CPU_RISC-V"],"sub_categories":["Bare-metal programming (Don't need MCU)","资源传输下载"],"readme":"# A bare metal programming guide\n\n[![License: MIT](https://img.shields.io/badge/license-MIT-blue)](https://opensource.org/licenses/MIT)\n[![Build Status]( https://github.com/cpq/bare-metal-programming-guide/workflows/build/badge.svg)](https://github.com/cpq/bare-metal-programming-guide/actions)\n\nEnglish | [中文](README_zh-CN.md) | [Türkçe](README_tr-TR.md)\n\nThis guide is written for developers who wish to start programming\nmicrocontrollers using a GCC compiler and a datasheet, without using any\nframework. 
This guide explains the fundamentals, and helps to understand how\nembedded frameworks like Cube, Keil, Arduino, and others, work.\n\nEvery chapter in this guide comes with complete source code which gradually\nprogresses in functionality and completeness. In the end, I provide bare metal\ntemplate projects for different architectures:\n\n- **blinky** - classic, blink an LED and print a debug message periodically\n- **cli** - UART command line interface. Implements commands to set LED status and hexdump RAM\n- **lfs** - implement file functions `mkdir(),readdir(),fopen(),...` using\n  [littlefs](https://github.com/littlefs-project/littlefs) in the upper\n  region of built-in flash memory. Store device boot\n  count in a file, increment on each boot, and print periodically\n- **webui** - embedded web server with a professional device dashboard UI\n  using [mongoose library](https://github.com/cesanta/mongoose)\n\n| Board | Arch | MCU datasheet | Board datasheet | Template project |\n| ----- | ---- | ------------- | --------------- | ---------------- |\n| STM32 Nucleo-F429ZI | Cortex-M4  | [mcu datasheet](https://www.st.com/resource/en/reference_manual/dm00031020-stm32f405-415-stm32f407-417-stm32f427-437-and-stm32f429-439-advanced-arm-based-32-bit-mcus-stmicroelectronics.pdf) | [board datasheet](https://www.st.com/resource/en/user_manual/dm00244518-stm32-nucleo144-boards-mb1137-stmicroelectronics.pdf) | [blinky](templates/blinky/nucleo-f429zi), [cli](templates/cli/nucleo-f429zi), [webui](steps/step-7-webserver/nucleo-f429zi/) |\n| STM32 Nucleo-F303K8 | Cortex-M4  | [mcu datasheet](https://www.st.com/resource/en/reference_manual/DM00043574-.pdf) | [board datasheet](https://www.st.com/resource/en/datasheet/stm32f303k8.pdf) | [lfs](templates/lfs/nucleo-f303k8) |\n| STM32 Nucleo-L432KC | Cortex-M4  | [mcu datasheet](https://www.st.com/resource/en/reference_manual/dm00151940-stm32l41xxx42xxx43xxx44xxx45xxx46xxx-advanced-armbased-32bit-mcus-stmicroelectronics.pdf) | [board 
datasheet](https://www.st.com/resource/en/datasheet/stm32l432kc.pdf) | [blinky](templates/blinky/nucleo-l432kc), [cli](templates/cli/nucleo-l432kc), [lfs](templates/lfs/nucleo-l432kc) |\n| SAME54 Xplained     | Cortex-M4  | [mcu datasheet](https://ww1.microchip.com/downloads/aemDocuments/documents/MCU32/ProductDocuments/DataSheets/SAM-D5x-E5x-Family-Data-Sheet-DS60001507.pdf) | [board datasheet](https://ww1.microchip.com/downloads/en/DeviceDoc/70005321A.pdf) | [blinky](templates/blinky/same54-xplained) |\n| TI EK-TM4C1294XL    | Cortex-M4F | [mcu datasheet](https://www.ti.com/lit/ds/symlink/tm4c1294ncpdt.pdf) | [board datasheet](https://www.ti.com/lit/ug/spmu365c/spmu365c.pdf) | [webui](steps/step-7-webserver/ek-tm4c1294xl) | \n| RP2040 Pico-W5500   | Cortex-M0+ | [mcu datasheet](https://datasheets.raspberrypi.com/rp2040/rp2040-datasheet.pdf) | [board datasheet](https://docs.wiznet.io/Product/iEthernet/W5500/w5500-evb-pico) | [webui](steps/step-7-webserver/pico-w5500/) |\n| ESP32-C3            | RISCV      | [mcu datasheet](https://www.espressif.com/sites/default/files/documentation/esp32-c3_technical_reference_manual_en.pdf) | | [blinky](templates/blinky/esp32-c3) |\n\nIn this tutorial we'll use the **Nucleo-F429ZI** development board, so\ngo ahead and download the mcu datasheet and the board datasheet for it.\n\n## Tools setup\n\nTo proceed, the following tools are required:\n\n- ARM GCC, https://launchpad.net/gcc-arm-embedded - for compiling and linking\n- GNU make, http://www.gnu.org/software/make/ - for build automation\n- ST link, https://github.com/stlink-org/stlink - for flashing\n- Git, https://git-scm.com/ - for downloading source code and version control\n\n### Setup instructions for Mac\n\nStart a terminal, and execute:\n\n```sh\n$ /bin/bash -c \"$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)\"\n$ brew install gcc-arm-embedded make stlink git\n```\n\n### Setup instructions for Linux (Ubuntu)\n\nStart a terminal, and 
execute:\n\n```sh\n$ sudo apt -y update\n$ sudo apt -y install gcc-arm-none-eabi make stlink-tools git\n```\n\n### Setup instructions for Windows\n\n- Download and install [gcc-arm-none-eabi-10.3-2021.10-win32.exe](https://developer.arm.com/-/media/Files/downloads/gnu-rm/10.3-2021.10/gcc-arm-none-eabi-10.3-2021.10-win32.exe?rev=29bb46cfa0434fbda93abb33c1d480e6\u0026hash=3C58D05EA5D32EF127B9E4D13B3244D26188713C). Enable \"Add path to environment variable\" during the installation\n- Create `c:\\tools` folder\n- Download [stlink-1.7.0-x86_64-w64-mingw32.zip](https://github.com/stlink-org/stlink/releases/download/v1.7.0/stlink-1.7.0-x86_64-w64-mingw32.zip) and unpack `bin/st-flash.exe` into `c:\\tools`\n- Download [make-4.4-without-guile-w32-bin.zip](https://sourceforge.net/projects/ezwinports/files/make-4.4-without-guile-w32-bin.zip/download) and unpack `bin/make.exe` into `c:\\tools`\n- Add `c:\\tools` to the `Path` environment variable\n- Enable \"Developer Mode\" in Windows 10/11, for symbolic link support.\n- Install Git from https://git-scm.com/download/win. Check \"Enable symlink\" during installation\n\n### Verify installed tools\n\nNow, when all required tools are installed, start terminal/command prompt, and\nenter the following commands to download this repository and build an example:\n\n```sh\ngit clone https://github.com/cpq/bare-metal-programming-guide\ncd bare-metal-programming-guide/steps/step-0-minimal\nmake\n```\n\n## Introduction\n\nA microcontroller (uC, or MCU) is a small computer. Typically it has CPU, RAM,\nflash to store firmware code, and a bunch of pins that stick out. Some pins are\nused to power the MCU, usually marked as GND (ground) and VCC pins. Other pins\nare used to communicate with the MCU, by means of high/low voltage applied to\nthose pins. 
One of the simplest ways of communication is an LED attached to a\npin: one LED contact is attached to the ground pin (GND), and another contact\nis attached to a signal pin via a current-limiting resistor.  Firmware code\ncan set high or low voltage on a signal pin, making the LED blink:\n\n\u003cimg src=\"images/mcu.svg\" height=\"200\" /\u003e\n\n### Memory and registers\n\nThe 32-bit address space of the MCU is divided into regions. For example, some\nregion of memory is mapped to the internal MCU flash at a specific address.\nFirmware instructions are fetched and executed from that memory region. Another region is\nRAM, which is also mapped to a specific address. We can read and write any\nvalues to the RAM region.\n\nFrom the STM32F429 datasheet, we can take a look at section 2.3.1 and learn\nthat the RAM region starts at address 0x20000000 and has a size of 192KB. From section\n2.4 we can learn that flash is mapped at address 0x08000000. Our MCU has\n2MB flash, so flash and RAM regions are located like this:\n\n\u003cimg src=\"images/mem.svg\" /\u003e\n\nFrom the datasheet we can also learn that there are many more memory regions.\nTheir address ranges are given in the section 2.3 \"Memory Map\". For example,\nthere is a \"GPIOA\" region that starts at 0x40020000 and has a length of 1KB.\n\nThese memory regions correspond to different \"peripherals\" inside the MCU -\npieces of silicon circuitry that make certain pins behave in a special way.\nA peripheral memory region is a collection of 32-bit **registers**. Each\nregister is a 4-byte memory range at a certain address, that maps to a certain\nfunction of the given peripheral. By writing values into a register - in other\nwords, by writing a 32-bit value at a certain memory address, we can control\nhow the given peripheral should behave. By reading registers, we can read back\nthe peripheral's data or configuration.\n\nThere are many different peripherals. 
One of the simpler ones is GPIO\n(General Purpose Input Output), which allows the user to set MCU pins\ninto \"output mode\" and set high or low voltage on them, or set pins into\n\"input mode\" and read voltage values from them. There is a UART peripheral\nwhich can transmit and receive serial data over two pins using serial protocol.\nThere are many other peripherals.\n\nOften, there are multiple \"instances\" of the same peripheral, for example\nGPIOA, GPIOB, ... which control different sets of MCU pins. Likewise, there\ncould be UART1, UART2, ... which allow implementing multiple UART channels.\nOn Nucleo-F429, there are several GPIO and UART peripherals.\n\nFor example, GPIOA\nperipheral starts at 0x40020000, and we can find GPIO register description in\nsection 8.4. The datasheet says that the `GPIOA_MODER` register has offset 0, which\nmeans that its address is `0x40020000 + 0`, and this is the format of the\nregister:\n\n\u003cimg src=\"images/moder.png\" style=\"max-width: 100%\" /\u003e\n\nThe datasheet shows that the 32-bit MODER register is a collection of 2-bit\nvalues, 16 in total. Therefore, one MODER register controls 16 physical pins.\nBits 0-1 control pin 0, bits 2-3 control pin 1, and so on. The 2-bit value\nencodes pin mode: 0 means input, 1 means output, 2 means \"alternate function\" -\nsome specific behavior described elsewhere, and 3 means analog. Since the\nperipheral name is \"GPIOA\", its pins are named \"A0\", \"A1\", etc. For peripheral\n\"GPIOB\", pin naming would be \"B0\", \"B1\", ...\n\nIf we write 32-bit value `0` to the register MODER, we'll set all 16 pins,\nfrom A0 to A15, to input mode:\n\n```c\n  * (volatile uint32_t *) (0x40020000 + 0) = 0;  // Set A0-A15 to input mode\n```\n\nNote the `volatile` specifier. Its meaning will be covered later.  By setting\nindividual bits, we can selectively set specific pins to a desired mode. 
For\nexample, this snippet sets pin A3 to output mode:\n\n```c\n  * (volatile uint32_t *) (0x40020000 + 0) \u0026= ~(3 \u003c\u003c 6);  // Clear bit range 6-7\n  * (volatile uint32_t *) (0x40020000 + 0) |= 1 \u003c\u003c 6;     // Set bit range 6-7 to 1\n```\n\nLet me explain those bit operations. Our goal is to set bits 6-7, which are\nresponsible for the pin 3 of GPIOA peripheral, to a specific value (1, in our\ncase). This is done in two steps. First, we must clear the current value of\nbits 6-7, because it may hold some value already. Then we must set bits 6-7\nto the value we want.\n\nSo, first, we must set bit range 6-7 (two bits at position 6) to zero. How do\nwe set a number of bits to zero? In four steps:\n\n| Action | Expression | Bits (first 12 of 32) |\n| - | - | - |\n| Get a number with N contiguous bits set: `2^N-1`, N=2 | `3`  | `000000000011` |\n| Shift that number X positions left | `(3\u003c\u003c6)` | `000011000000` |\n| Invert the number: turn zeros to ones, and ones to zeroes | `~(3\u003c\u003c6)` | `111100111111` |\n| Bitwise AND with existing value | `VAL \u0026= ~(3\u003c\u003c6)` | `xxxx00xxxxxx` |\n\nNote that the last operation, bitwise AND, turns N bits at position X to zero\n(because they are ANDed with 0), but retains the value of all other bits\n(because they are ANDed with 1). Retaining existing value is important, because\nwe don't want to change settings in other bit ranges. So in general, if we want\nto clear N bits at position X:\n\n```c\nREGISTER \u0026= ~(((1U \u003c\u003c N) - 1) \u003c\u003c X);  // (1U \u003c\u003c N) - 1 equals 2^N - 1; in C, ^ is XOR, not power\n```\n\nAnd, finally, we want to set a given bit range to the value we want. We\nshift that value X positions left, and OR with the current value of the whole\nregister (in order to retain other bits' values):\n\n```c\nREGISTER |= VALUE \u003c\u003c X;\n```\n\n## Human-readable peripherals programming\n\nIn the previous section we have learned that we can read and write peripheral\nregisters by directly accessing certain memory addresses. 
Let's look at the\nsnippet that sets pin A3 to output mode:\n\n```c\n  * (volatile uint32_t *) (0x40020000 + 0) \u0026= ~(3 \u003c\u003c 6);  // Clear bit range 6-7\n  * (volatile uint32_t *) (0x40020000 + 0) |= 1 \u003c\u003c 6;     // Set bit range 6-7 to 1\n```\n\nThat is pretty cryptic. Without extensive comments, such code would be quite\nhard to understand. We can rewrite this code to a much more readable form.  The\nidea is to represent the whole peripheral as a structure that contains 32-bit\nfields. Let's see what registers exist for the GPIO peripheral in the section\n8.4 of the datasheet. They are MODER, OTYPER, OSPEEDR, PUPDR, IDR, ODR, BSRR,\nLCKR, AFR. Their offsets are 0, 4, 8, and so on. That means we can\nrepresent them as a structure with 32-bit fields, and make a define for GPIOA:\n\n```c\nstruct gpio {\n  volatile uint32_t MODER, OTYPER, OSPEEDR, PUPDR, IDR, ODR, BSRR, LCKR, AFR[2];\n};\n\n#define GPIOA ((struct gpio *) 0x40020000)\n```\n\nThen, for setting GPIO pin mode, we can define a function:\n\n```c\n// Enum values are per datasheet: 0, 1, 2, 3\nenum {GPIO_MODE_INPUT, GPIO_MODE_OUTPUT, GPIO_MODE_AF, GPIO_MODE_ANALOG};\n\nstatic inline void gpio_set_mode(struct gpio *gpio, uint8_t pin, uint8_t mode) {\n  gpio-\u003eMODER \u0026= ~(3U \u003c\u003c (pin * 2));        // Clear existing setting\n  gpio-\u003eMODER |= (mode \u0026 3) \u003c\u003c (pin * 2);   // Set new mode\n}\n```\n\nNow, we can rewrite the snippet for A3 like this:\n\n```c\ngpio_set_mode(GPIOA, 3 /* pin */, GPIO_MODE_OUTPUT);  // Set A3 to output\n```\n\nOur MCU has several GPIO peripherals (also called \"banks\"): A, B, C, ... 
K.\nFrom section 2.3 we can see that they are 1KB away from each other:\nGPIOA is at address 0x40020000, GPIOB is at 0x40020400, and so on:\n\n```c\n#define GPIO(bank) ((struct gpio *) (0x40020000 + 0x400 * (bank)))\n```\n\nWe can create pin numbering that includes the bank and the pin number.\nTo do that, we use 2-byte `uint16_t` value, where upper byte indicates\nGPIO bank, and lower byte indicates pin number:\n\n```c\n#define PIN(bank, num) ((((bank) - 'A') \u003c\u003c 8) | (num))\n#define PINNO(pin) ((pin) \u0026 255)\n#define PINBANK(pin) ((pin) \u003e\u003e 8)\n```\n\nThis way, we can specify pins for any GPIO bank:\n\n```c\n  uint16_t pin1 = PIN('A', 3);    // A3   - GPIOA pin 3\n  uint16_t pin2 = PIN('G', 11);   // G11  - GPIOG pin 11\n``` \n\nLet's look first at what happens for `PIN('A', 3)`:\n\n- `(bank) - 'A'` results in `'A' - 'A'` which will evaluate to `0`. As a 16 bit binary value this would be `0b00000000,00000000`.\n- Next we bit shift this value left by 8 bits because we want to store `bank` in the upper byte of this 16 bit, or 2 byte value. In this case the result remains the same: `0b00000000,00000000`.\n- Finally we bitwise OR the value above with `num`, in our case `3` which has a 16 bit binary representation of `0b00000000,00000011`. The result in binary is `0b00000000,00000011`.\n\nLet's take a look at what happens for `PIN('G',11)`:\n\n- `(bank) - 'A'` results in `'G' - 'A'` which will evaluate to `6`. As a 16 bit binary value this would be `0b00000000,00000110`.\n- Next we bit shift this value left by 8 bits because we want to store `bank` in the upper byte of this 16 bit, or 2 byte value. This results in a binary value of: `0b00000110,00000000`.\n- Finally we bitwise OR the value above with `num`, in our case `11` which has a 16 bit binary representation of `0b00000000,00001011`. 
The result of the bitwise OR in binary is `0b00000110,00001011` which is a combination of `bank` in the upper byte and `num` in the lower byte.\n\nLet's rewrite the `gpio_set_mode()` function to take our pin specification:\n\n```c\nstatic inline void gpio_set_mode(uint16_t pin, uint8_t mode) {\n  struct gpio *gpio = GPIO(PINBANK(pin)); // GPIO bank\n  uint8_t n = PINNO(pin);                 // Pin number\n  gpio-\u003eMODER \u0026= ~(3U \u003c\u003c (n * 2));        // Clear existing setting\n  gpio-\u003eMODER |= (mode \u0026 3) \u003c\u003c (n * 2);   // Set new mode\n}\n```\n\nNow the code for A3 is self-explanatory:\n\n```c\n  uint16_t pin = PIN('A', 3);            // Pin A3\n  gpio_set_mode(pin, GPIO_MODE_OUTPUT);  // Set to output\n```\n\nNote that we have created a useful initial API for the GPIO peripheral. Other\nperipherals, like UART (serial communication) and others - can be implemented\nin a similar way. This is a good programming practice that makes code\nself-explanatory and human readable.\n\n## MCU boot and vector table\n\nWhen an ARM MCU boots, it reads a so-called \"vector table\" from the\nbeginning of flash memory. A vector table is a concept common to all ARM MCUs.\nIt is an array of 32-bit addresses of interrupt handlers. The first 16 entries\nare reserved by ARM and are common to all ARM MCUs. The remaining interrupt\nhandlers are specific to the given MCU - these are interrupt handlers for\nperipherals. Simpler MCUs with few peripherals have few interrupt handlers,\nand more complex MCUs have many.\n\nThe vector table for STM32F429 is documented in Table 62. From there we can learn\nthat there are 91 peripheral handlers, in addition to the standard 16.\n\nEvery entry in the vector table is an address of a function that MCU executes\nwhen a hardware interrupt (IRQ) triggers. The exceptions are the first two entries,\nwhich play a key role in the MCU boot process.  
Those first two values are: an\ninitial stack pointer, and the address of the boot function to execute (the\nfirmware entry point).\n\nSo now we know that our firmware must be composed in such\na way that the 2nd 32-bit value in flash contains the address of\nour boot function. When MCU boots, it'll read that address from flash, and\njump to our boot function.\n\n\n## Minimal firmware\n\nLet's create a file `main.c`, and specify our boot function that initially does\nnothing (falls into infinite loop), and specify a vector table that contains 16\nstandard entries and 91 STM32 entries. In your editor of choice, create\nthe `main.c` file and copy/paste the following into it:\n\n```c\n// Startup code\n__attribute__((naked, noreturn)) void _reset(void) {\n  for (;;) (void) 0;  // Infinite loop\n}\n\nextern void _estack(void);  // Defined in link.ld\n\n// 16 standard and 91 STM32-specific handlers\n__attribute__((section(\".vectors\"))) void (*const tab[16 + 91])(void) = {\n  _estack, _reset\n};\n```\n\nHere the `_reset()` function is the reset handler. For `_reset()`, we have used the GCC-specific attributes `naked` and\n`noreturn` - they mean that the standard function prologue and epilogue should not\nbe created by the compiler, and that the function does not return.\n\nThe `void (*const tab[16 + 91])(void)` expression means: define an array of 16\n\\+ 91 pointers to functions which return nothing (void) and take no arguments (void).\nEach such function is an IRQ handler (Interrupt ReQuest handler). An array of\nthose handlers is called a vector table.\n\nWe put the vector table `tab` in a separate section called `.vectors` - we\nneed that later to tell the linker to put that section right at the beginning of the\ngenerated firmware - and consequently, at the beginning of flash memory. The\nfirst two entries are: the initial value of the stack pointer register, and the\nfirmware's entry point.  
We leave the rest of the vector table filled with zeroes.\n\nNOTE: The startup code here is written in C, and included in the main.c file.\nOftentimes, device SDKs have a startup.s file written in assembly.\n\n### Compilation\n\nLet's compile our code. Start a terminal (or a command prompt on Windows) and execute:\n\n```sh\n$ arm-none-eabi-gcc -mcpu=cortex-m4 main.c -c\n```\n\nThat works! The compilation produced a file `main.o` which contains\nour minimal firmware that does nothing.  The `main.o` file is in ELF binary\nformat, which contains several sections. Let's see them:\n\n```sh\n$ arm-none-eabi-objdump -h main.o\n...\nIdx Name          Size      VMA       LMA       File off  Algn\n  0 .text         00000002  00000000  00000000  00000034  2**1\n                  CONTENTS, ALLOC, LOAD, READONLY, CODE\n  1 .data         00000000  00000000  00000000  00000036  2**0\n                  CONTENTS, ALLOC, LOAD, DATA\n  2 .bss          00000000  00000000  00000000  00000036  2**0\n                  ALLOC\n  3 .vectors      000001ac  00000000  00000000  00000038  2**2\n                  CONTENTS, ALLOC, LOAD, RELOC, DATA\n...\n```\n\nNote that VMA/LMA addresses for sections are set to 0 - meaning, `main.o`\nis not yet a complete firmware, because it does not contain the information\nabout where those sections should be loaded in the address space. We need to use\na linker to produce a full firmware `firmware.elf` from `main.o`.\n\nThe section .text contains firmware code, in our case it is just a _reset()\nfunction, 2 bytes long - a jump instruction to its own address. There is\nan empty `.data` section and an empty `.bss` section\n(data that is initialized to zero). Our firmware will be copied\nto the flash region at address 0x8000000, but our data section should reside\nin RAM - therefore our `_reset()` function should copy the contents of the\n`.data` section to RAM. Also it has to write zeroes to the whole `.bss`\nsection. 
Our `.data` and `.bss` sections are empty, but let's modify our\n`_reset()` function anyway to handle them properly.\n\nIn order to do all that, we must know where the stack starts, and where the data and\nbss sections start. This we can specify in the \"linker script\", which is a file\nwith the instructions to the linker, where to put various sections in the\naddress space, and which symbols to create.\n\n### Linker script\n\nCreate a file `link.ld`, and copy-paste contents from\n[steps/step-0-minimal/link.ld](steps/step-0-minimal/link.ld). Below is the explanation:\n\n```\nENTRY(_reset);\n```\n\nThis line tells the linker the value of the \"entry point\" attribute in the\ngenerated ELF header - so this is a duplicate of what the vector table has.  This\nis an aid for a debugger (like Ozone, described below) that helps to set a\nbreakpoint at the beginning of the firmware.  A debugger does not know about a\nvector table, so it relies on the ELF header.\n\n```\nMEMORY {\n  flash(rx)  : ORIGIN = 0x08000000, LENGTH = 2048k\n  sram(rwx) : ORIGIN = 0x20000000, LENGTH = 192k  /* remaining 64k in a separate address space */\n}\n```\n\nThis tells the linker that we have two memory regions in the address space,\ntheir addresses and sizes.\n\n```\n_estack     = ORIGIN(sram) + LENGTH(sram);    /* stack points to end of SRAM */\n```\n\nThis tells the linker to create a symbol `_estack` with value at the very end\nof the RAM memory region. 
That will be our initial stack value!\n\n```\n  .vectors  : { KEEP(*(.vectors)) }   \u003e flash\n  .text     : { *(.text*) }           \u003e flash\n  .rodata   : { *(.rodata*) }         \u003e flash\n```\n\nThese lines tell the linker to put the vector table in flash first,\nfollowed by `.text` section (firmware code), followed by the read only\ndata `.rodata`.\n\nNext goes the `.data` section:\n\n```\n  .data : {\n    _sdata = .;   /* .data section start */\n    *(.first_data)\n    *(.data SORT(.data.*))\n    _edata = .;  /* .data section end */\n  } \u003e sram AT \u003e flash\n  _sidata = LOADADDR(.data);\n```\n\nNote that we tell the linker to create `_sdata`, `_edata` and `_sidata` symbols. We'll\nuse them to copy data section to RAM in the `_reset()` function.\n\nSame for `.bss` section:\n\n```\n  .bss : {\n    _sbss = .;              /* .bss section start */\n    *(.bss SORT(.bss.*) COMMON)\n    _ebss = .;              /* .bss section end */\n  } \u003e sram\n```\n\n### Startup code\n\nNow we can update our `_reset()` function. We copy `.data` section to RAM, and\ninitialise bss section to zeroes. Then, we call main() function - and fall into\ninfinite loop in case main() returns:\n\n```c\nint main(void) {\n  return 0; // Do nothing so far\n}\n\n// Startup code\n__attribute__((naked, noreturn)) void _reset(void) {\n  // memset .bss to zero, and copy .data section to RAM region\n  extern long _sbss, _ebss, _sdata, _edata, _sidata;\n  for (long *dst = \u0026_sbss; dst \u003c \u0026_ebss; dst++) *dst = 0;\n  for (long *dst = \u0026_sdata, *src = \u0026_sidata; dst \u003c \u0026_edata;) *dst++ = *src++;\n\n  main();             // Call main()\n  for (;;) (void) 0;  // Infinite loop in case main() returns\n}\n```\n\nThe following diagram visualises how `_reset()` initialises .data and .bss:\n\n![](images/mem2.svg)\n\nThe `firmware.bin` file is just a concatenation of the three sections:\n`.vectors` (IRQ vector table), `.text` (code) and `.data` (data).  
Those\nsections were built according to the linker script: `.vectors` lies at the very\nbeginning of flash, then `.text` follows immediately after, and `.data` lies\nfar above. Addresses in `.text` are in the flash region, and addresses in\n`.data` are in the RAM region.  If some function has address e.g. `0x8000100`,\nthen it is located exactly at that address in flash. But if the code accesses\nsome variable in the `.data` section by the address e.g. `0x20000200`, then\nthere is nothing at that address, because at boot, `.data` section in the\n`firmware.bin` resides in flash! That's why the startup code must relocate\n`.data` section from flash region to the RAM region.\n\nNow we are ready to produce a full firmware file `firmware.elf`:\n\n```sh\n$ arm-none-eabi-gcc -T link.ld -nostdlib main.o -o firmware.elf\n```\n\nLet's examine sections in firmware.elf:\n\n```sh\n$ arm-none-eabi-objdump -h firmware.elf\n...\nIdx Name          Size      VMA       LMA       File off  Algn\n  0 .vectors      000001ac  08000000  08000000  00010000  2**2\n                  CONTENTS, ALLOC, LOAD, DATA\n  1 .text         00000058  080001ac  080001ac  000101ac  2**2\n                  CONTENTS, ALLOC, LOAD, READONLY, CODE\n...\n```\n\nNow we can see that the .vectors section will reside at the very beginning of\nflash memory at address 0x8000000, then the .text section right after it, at\n0x80001ac. Our code does not create any variables, so there is no data section.\n\n## Flash firmware\n\nWe're ready to flash this firmware! First, extract sections from the\nfirmware.elf into a single contiguous binary blob:\n\n```sh\n$ arm-none-eabi-objcopy -O binary firmware.elf firmware.bin\n```\n\nAnd use the `st-flash` utility to flash the firmware.bin. Plug your board into\nUSB, and execute:\n\n```sh\n$ st-flash --reset write firmware.bin 0x8000000\n```\n\nDone! 
We've flashed a firmware that does nothing.\n\n## Makefile: build automation\n\nInstead of typing those compilation, linking and flashing commands, we can\nuse `make` command line tool to automate the whole process. `make` utility\nuses a configuration file named `Makefile` where it reads instructions\nhow to execute actions. This automation is great because it also documents the\nprocess of building firmware, used compilation flags, etc.\n\nThere is a great Makefile tutorial at https://makefiletutorial.com - for those\nnew to `make`, I suggest taking a look. Below, I list the most essential\nconcepts required to understand our simple bare metal Makefile. Those who\nare already familiar with `make` can skip this section.\n\nThe `Makefile` format is simple:\n\n```make\naction1:\n\tcommand ...     # Comments can go after hash symbol\n\tcommand ....    # IMPORTANT: command must be preceded with the TAB character\n\naction2:\n\tcommand ...     # Don't forget about TAB. Spaces won't work!\n```\n\nNow, we can invoke `make` with the action name (also called *target*) to execute\na corresponding action:\n\n```sh\n$ make action1\n```\n\nIt is possible to define variables and use them in commands. Also, actions\ncan be file names that need to be created:\n\n```make\nfirmware.elf:\n\tCOMPILATION COMMAND .....\n```\n\nAnd, any action can have a list of dependencies. For example, `firmware.elf`\ndepends on our source file `main.c`. Whenever `main.c` file changes, the\n`make build` command rebuilds `firmware.elf`:\n\n```\nbuild: firmware.elf\n\nfirmware.elf: main.c\n\tCOMPILATION COMMAND\n```\n\nNow we are ready to write a Makefile for our firmware. We define a `build`\naction / target:\n\n```make\nCFLAGS  ?=  -W -Wall -Wextra -Werror -Wundef -Wshadow -Wdouble-promotion \\\n            -Wformat-truncation -fno-common -Wconversion \\\n            -g3 -Os -ffunction-sections -fdata-sections -I. 
\\\n            -mcpu=cortex-m4 -mthumb -mfloat-abi=hard -mfpu=fpv4-sp-d16 $(EXTRA_CFLAGS)\nLDFLAGS ?= -Tlink.ld -nostartfiles -nostdlib --specs nano.specs -lc -lgcc -Wl,--gc-sections -Wl,-Map=$@.map\nSOURCES = main.c \n\nbuild: firmware.elf\n\nfirmware.elf: $(SOURCES)\n\tarm-none-eabi-gcc $(SOURCES) $(CFLAGS) $(LDFLAGS) -o $@\n```\n\nThere, we define compilation flags. The `?=` means that's a default value;\nwe could override it from the command line like this:\n\n```sh\n$ make build CFLAGS=\"-O2 ....\"\n```\n\nWe specify `CFLAGS`, `LDFLAGS` and `SOURCES` variables.\nThen we tell `make`: if you're told to `build`, then create a `firmware.elf`\nfile. It depends on the `main.c` file, and to create it, start the\n`arm-none-eabi-gcc` compiler with the given flags. The `$@` special variable\nexpands to a target name - in our case, `firmware.elf`.\n\nLet's call `make`:\n\n```\n$ make build\narm-none-eabi-gcc main.c  -W -Wall -Wextra -Werror -Wundef -Wshadow -Wdouble-promotion -Wformat-truncation -fno-common -Wconversion -g3 -Os -ffunction-sections -fdata-sections -I. -mcpu=cortex-m4 -mthumb -mfloat-abi=hard -mfpu=fpv4-sp-d16  -Tlink.ld -nostartfiles -nostdlib --specs nano.specs -lc -lgcc -Wl,--gc-sections -Wl,-Map=firmware.elf.map -o firmware.elf\n```\n\nIf we run it again:\n\n```sh\n$ make build\nmake: Nothing to be done for `build'.\n```\n\nThe `make` utility examines modification times for the `main.c` dependency and\n`firmware.elf` - and does not do anything if `firmware.elf` is up to date.\nBut if we change `main.c`, then the next `make build` will recompile:\n\n```sh\n$ touch main.c # Simulate changes in main.c\n$ make build\n```\n\nNow, what is left is the `flash` target:\n\n\n```make\nfirmware.bin: firmware.elf\n\tarm-none-eabi-objcopy -O binary $\u003c $@\n\nflash: firmware.bin\n\tst-flash --reset write $\u003c 0x8000000\n```\n\nThat's it! Now, `make flash` terminal command creates a `firmware.bin` file,\nand flashes it to the board. 
It'll recompile the firmware if `main.c` changes,\nbecause `firmware.bin` depends on `firmware.elf`, and it in turn depends on\n`main.c`. So, now the development cycle would be these two actions in a loop:\n\n```sh\n# Develop code in main.c\n$ make flash\n```\n\nIt is a good idea to add a clean target to remove build artifacts:\n\n\n```\nclean:\n\trm -rf firmware.*\n```\n\nYou can find the complete project source code in the [steps/step-0-minimal](steps/step-0-minimal) folder.\n\n## Blinky LED\n\nNow that we have the whole build / flash infrastructure set up, it is time to\nteach our firmware to do something useful. Something useful is of course blinking\nan LED. A Nucleo-F429ZI board has three built-in LEDs. In section 6.5 of the\nNucleo board datasheet we can see which pins the built-in LEDs are attached to:\n\n- PB0: green LED\n- PB7: blue LED\n- PB14: red LED\n\nLet's modify the `main.c` file and add our definitions for `PIN` and `gpio_set_mode()`.\nIn the `main()` function, we set the blue LED to output mode, and start an\ninfinite loop. First, let's copy the definitions for pins and GPIO we have\ndiscussed earlier. 
Note we also add a convenience macro `BIT(position)`:\n\n```c\n#include \u003cinttypes.h\u003e\n#include \u003cstdbool.h\u003e\n\n#define BIT(x) (1UL \u003c\u003c (x))\n#define PIN(bank, num) ((((bank) - 'A') \u003c\u003c 8) | (num))\n#define PINNO(pin) ((pin) \u0026 255)\n#define PINBANK(pin) ((pin) \u003e\u003e 8)\n\nstruct gpio {\n  volatile uint32_t MODER, OTYPER, OSPEEDR, PUPDR, IDR, ODR, BSRR, LCKR, AFR[2];\n};\n#define GPIO(bank) ((struct gpio *) (0x40020000 + 0x400 * (bank)))\n\n// Enum values are per datasheet: 0, 1, 2, 3\nenum { GPIO_MODE_INPUT, GPIO_MODE_OUTPUT, GPIO_MODE_AF, GPIO_MODE_ANALOG };\n\nstatic inline void gpio_set_mode(uint16_t pin, uint8_t mode) {\n  struct gpio *gpio = GPIO(PINBANK(pin));  // GPIO bank\n  int n = PINNO(pin);                      // Pin number\n  gpio-\u003eMODER \u0026= ~(3U \u003c\u003c (n * 2));         // Clear existing setting\n  gpio-\u003eMODER |= (mode \u0026 3U) \u003c\u003c (n * 2);   // Set new mode\n}\n```\n\nSome microcontrollers have all their peripherals powered and enabled\nautomatically as soon as they are powered. STM32 MCUs, however, by default have their\nperipherals disabled in order to save power. Before a GPIO peripheral can be used,\nit must be enabled (clocked) via the RCC (Reset and Clock Control) unit.\nIn the datasheet section 7.3.10 we find that the AHB1ENR (AHB1 peripheral\nclock enable register) is responsible for turning GPIO banks on or off. 
First we\nadd a definition for the whole RCC unit:\n\n```c\nstruct rcc {\n  volatile uint32_t CR, PLLCFGR, CFGR, CIR, AHB1RSTR, AHB2RSTR, AHB3RSTR,\n      RESERVED0, APB1RSTR, APB2RSTR, RESERVED1[2], AHB1ENR, AHB2ENR, AHB3ENR,\n      RESERVED2, APB1ENR, APB2ENR, RESERVED3[2], AHB1LPENR, AHB2LPENR,\n      AHB3LPENR, RESERVED4, APB1LPENR, APB2LPENR, RESERVED5[2], BDCR, CSR,\n      RESERVED6[2], SSCGR, PLLI2SCFGR;\n};\n#define RCC ((struct rcc *) 0x40023800)\n```\n\nIn the AHB1ENR register documentation we see that bits from 0 to 10 inclusive\nset the clock for GPIO banks GPIOA - GPIOK:\n\n```c\nint main(void) {\n  uint16_t led = PIN('B', 7);            // Blue LED\n  RCC-\u003eAHB1ENR |= BIT(PINBANK(led));     // Enable GPIO clock for LED\n  gpio_set_mode(led, GPIO_MODE_OUTPUT);  // Set blue LED to output mode\n  for (;;) (void) 0;                     // Infinite loop\n  return 0;\n}\n```\n\nNow, what is left to do is to find out how to set a GPIO pin on and off, and\nthen modify the main loop to set an LED pin on, delay, off, delay.  Looking at\nthe datasheet section 8.4.7, we see that the register BSRR is responsible for\nsetting voltage high or low. The low 16 bits are used to set bits in the ODR register\n(i.e. set a pin high), and the high 16 bits are used to reset bits in the ODR\nregister (i.e. set a pin low). Let's define an API function for that:\n\n```c\nstatic inline void gpio_write(uint16_t pin, bool val) {\n  struct gpio *gpio = GPIO(PINBANK(pin));\n  gpio-\u003eBSRR = (1U \u003c\u003c PINNO(pin)) \u003c\u003c (val ? 0 : 16);\n}\n```\n\nNext we need to implement a delay function. 
We do not require an accurate\ndelay at this moment, so let's define a function `spin()` that just busy-waits\nby counting down a given number of iterations:\n\n```c\nstatic inline void spin(volatile uint32_t count) {\n  while (count--) (void) 0;\n}\n```\n\nFinally, we're ready to modify our main loop to implement LED blinking:\n\n```c\n  for (;;) {\n    gpio_write(led, true);\n    spin(999999);\n    gpio_write(led, false);\n    spin(999999);\n  }\n```\n\nRun `make flash` and enjoy the blue LED flashing.\nYou can find the complete project source code in [steps/step-1-blinky](steps/step-1-blinky).\n\n## Blinky with SysTick interrupt\n\nIn order to implement accurate timekeeping, we should enable ARM's SysTick\ninterrupt. SysTick is a 24-bit hardware counter, and is part of the ARM core,\ntherefore it is documented by the ARM datasheet. Looking at the datasheet, we\nsee that SysTick has four registers:\n\n- CTRL - used to enable/disable systick\n- LOAD - an initial counter value\n- VAL - a current counter value, decremented on each clock cycle\n- CALIB - calibration register\n\nEvery time VAL drops to zero, a SysTick interrupt is generated.\nThe SysTick interrupt index in the vector table is 15, so we need to set that entry.\nUpon boot, our Nucleo-F429ZI board runs at 16MHz. We can configure the SysTick\ncounter to trigger an interrupt every millisecond.\n\nFirst, let's define a SysTick peripheral. We know 4 registers, and from the\ndatasheet we can learn that the SysTick address is 0xe000e010. So:\n\n```c\nstruct systick {\n  volatile uint32_t CTRL, LOAD, VAL, CALIB;\n};\n#define SYSTICK ((struct systick *) 0xe000e010)\n```\n\nNext, add an API function that configures it. 
We need to enable SysTick\nin the `SYSTICK-\u003eCTRL` register, and also we must clock it via the\n`RCC-\u003eAPB2ENR`, described in section 7.4.14:\n\n```c\n#define BIT(x) (1UL \u003c\u003c (x))\nstatic inline void systick_init(uint32_t ticks) {\n  if ((ticks - 1) \u003e 0xffffff) return;  // Systick timer is 24 bit\n  SYSTICK-\u003eLOAD = ticks - 1;\n  SYSTICK-\u003eVAL = 0;\n  SYSTICK-\u003eCTRL = BIT(0) | BIT(1) | BIT(2);  // Enable systick\n  RCC-\u003eAPB2ENR |= BIT(14);                   // Enable SYSCFG\n}\n```\n\nBy default, the Nucleo-F429ZI board runs at 16MHz. That means, if we call\n`systick_init(16000000 / 1000);`, then a SysTick interrupt will be generated\nevery millisecond. We should have an interrupt handler function defined - here\nit is, we simply increment a 32-bit millisecond counter:\n\n```c\nstatic volatile uint32_t s_ticks; // volatile is important!!\nvoid SysTick_Handler(void) {\n  s_ticks++;\n}\n```\n\nWith a 16MHz clock, we init the SysTick counter to trigger an interrupt every\n16000 cycles: the `SYSTICK-\u003eVAL` initial value is 15999, then it decrements\non each cycle by 1, and when it reaches 0, an interrupt is generated. The\nfirmware code execution gets interrupted: the `SysTick_Handler()` function is\ncalled to increment the `s_ticks` variable. Here is how it looks on a time scale:\n\n![](images/systick.svg)\n\n\nThe `volatile` specifier is required here because `s_ticks` is modified by the\ninterrupt handler. `volatile` prevents the compiler from optimising/caching the `s_ticks`\nvalue in a CPU register: instead, generated code always accesses memory.  That\nis why the `volatile` keyword is present in the peripheral struct definitions,\ntoo. Since this is important to understand, let's demonstrate that on a simple\nfunction: Arduino's `delay()`. 
Let it use our `s_ticks` variable:\n\n```c\nvoid delay(unsigned ms) {            // This function waits \"ms\" milliseconds\n uint32_t until = s_ticks + ms;      // Time in a future when we need to stop\n while (s_ticks \u003c until) (void) 0;   // Loop until then\n}\n```\n\nNow let's compile this code with, and without `volatile` specifier for `s_ticks`\nand compare generated machine code:\n\n```\n// NO VOLATILE: uint32_t s_ticks;       |  // VOLATILE: volatile uint32_t s_ticks;\n                                        |\n ldr     r3, [pc, #8]  // cache s_ticks |  ldr     r2, [pc, #12]\n ldr     r3, [r3, #0]  // in r3         |  ldr     r3, [r2, #0]   // r3 = s_ticks\n adds    r0, r3, r0    // r0 = r3 + ms  |  adds    r3, r3, r0     // r3 = r3 + ms\n                                        |  ldr     r1, [r2, #0]   // RELOAD: r1 = s_ticks\n cmp     r3, r0        // ALWAYS FALSE  |  cmp     r1, r3         // compare\n bcc.n   200000d2 \u003cdelay+0x6\u003e           |  bcc.n   200000d2 \u003cdelay+0x6\u003e\n bx      lr                             |  bx      lr\n```\n\nIf there is no `volatile`, the `delay()` function will loop forever and never\nreturn. That is because it caches (optimises) the value of `s_ticks` in a\nregister and never updates it. A compiler does that because it doesn't know\nthat `s_ticks` can be updated elsewhere - by the interrupt handler!  The\ngenerated code with `volatile`, on the other hand, loads `s_ticks` value on\neach iteration.  So, the rule of thumb: **those values in memory that get\nupdated by interrupt handlers, or by the hardware, declare as `volatile`**.\n\nNow we should add `SysTick_Handler()` interrupt handler to the vector table:\n\n```c\n__attribute__((section(\".vectors\"))) void (*const tab[16 + 91])(void) = {\n    _estack, _reset, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, SysTick_Handler};\n```\n\nNow we have a precise millisecond clock! 
Let's create a helper function\nfor arbitrary periodic timers:\n\n```c\n// t: expiration time, prd: period, now: current time. Return true if expired\nbool timer_expired(uint32_t *t, uint32_t prd, uint32_t now) {\n  if (now + prd \u003c *t) *t = 0;                    // Time wrapped? Reset timer\n  if (*t == 0) *t = now + prd;                   // First poll? Set expiration\n  if (*t \u003e now) return false;                    // Not expired yet, return\n  *t = (now - *t) \u003e prd ? now + prd : *t + prd;  // Next expiration time\n  return true;                                   // Expired, return true\n}\n```\n\nNow we are ready to update our main loop and use a precise timer for LED blink.\nFor example, let's use a 500 millisecond blinking interval:\n\n```c\n  uint32_t timer = 0, period = 500;      // Declare timer and 500ms period\n  for (;;) {\n    if (timer_expired(\u0026timer, period, s_ticks)) {\n      static bool on;       // This block is executed\n      gpio_write(led, on);  // Every `period` milliseconds\n      on = !on;             // Toggle LED state\n    }\n    // Here we could perform other activities!\n  }\n```\n\nNote that using SysTick, and a helper `timer_expired()` function, we made our\nmain loop (also called superloop) non-blocking. That means that inside that\nloop we can perform many actions - for example, have different timers with\ndifferent periods, and they all will be triggered in time.\n\nYou can find the complete project source code in the [steps/step-2-systick](steps/step-2-systick) folder.\n\n## Add UART debug output\n\nNow it's time to add human-readable diagnostics to our firmware. One of the\nMCU peripherals is a serial UART interface. Looking at the datasheet section\n2.3, we see that there are several UART/USART controllers - i.e. pieces of\ncircuitry inside the MCU that, properly configured, can exchange data via\ncertain pins. 
A minimal UART setup uses two pins, RX (receive) and TX (transmit).\n\nIn section 6.9 of the Nucleo board datasheet we see that one of the\ncontrollers, USART3, is using pins PD8 (TX) and PD9 (RX) and is connected to\nthe on-board ST-LINK debugger. That means that if we configure USART3 and\noutput data via the PD8 pin, we can see it on our workstation via the ST-LINK\nUSB connection.\n\nThus, let us create a handy API for the UART, the way we did it for GPIO.\nDatasheet section 30.6 summarises UART registers - so here is our UART struct:\n\n```c\nstruct uart {\n  volatile uint32_t SR, DR, BRR, CR1, CR2, CR3, GTPR;\n};\n#define UART1 ((struct uart *) 0x40011000)\n#define UART2 ((struct uart *) 0x40004400)\n#define UART3 ((struct uart *) 0x40004800)\n```\n\nTo configure UART, we need to:\n- Enable the UART clock by setting the appropriate bit in the `RCC-\u003eAPB1ENR` or\n  `RCC-\u003eAPB2ENR` register, depending on the bus the UART is attached to\n- Set \"alternate function\" pin mode for RX and TX pins. There can be several\n  alternate functions (AF) for any given pin, depending on the peripheral that\n  is used. The AF list can be found in the\n  [STM32F429ZI](https://www.st.com/resource/en/datasheet/stm32f429zi.pdf)\n  table 12\n- Set baud rate (receive/transmit clock frequency) via the BRR register\n- Enable the peripheral, receive and transmit via the CR1 register\n\nWe already know how to set a GPIO pin into a specific mode. If a pin is in the\nAF mode, we also need to specify the \"function number\", i.e. which exact\nperipheral takes control. This can be done via the \"alternate function register\",\n`AFR`, of the GPIO peripheral. 
Reading the AFR register description in the\ndatasheet, we can see that the AF number occupies 4 bits, thus the whole setup\nfor 16 pins occupies 2 registers.\n\n```c\nstatic inline void gpio_set_af(uint16_t pin, uint8_t af_num) {\n  struct gpio *gpio = GPIO(PINBANK(pin));  // GPIO bank\n  int n = PINNO(pin);                      // Pin number\n  gpio-\u003eAFR[n \u003e\u003e 3] \u0026= ~(15UL \u003c\u003c ((n \u0026 7) * 4));\n  gpio-\u003eAFR[n \u003e\u003e 3] |= ((uint32_t) af_num) \u003c\u003c ((n \u0026 7) * 4);\n}\n```\n\nIn order to completely hide register-specific code from the GPIO API, let's\nmove the GPIO clock init to the `gpio_set_mode()` function:\n\n```c\nstatic inline void gpio_set_mode(uint16_t pin, uint8_t mode) {\n  struct gpio *gpio = GPIO(PINBANK(pin));  // GPIO bank\n  int n = PINNO(pin);                      // Pin number\n  RCC-\u003eAHB1ENR |= BIT(PINBANK(pin));       // Enable GPIO clock\n  ...\n```\n\nNow we're ready to create a UART initialization API function:\n\n```c\n#define FREQ 16000000  // CPU frequency, 16 Mhz\nstatic inline void uart_init(struct uart *uart, unsigned long baud) {\n  // https://www.st.com/resource/en/datasheet/stm32f429zi.pdf\n  uint8_t af = 7;           // Alternate function\n  uint16_t rx = 0, tx = 0;  // pins\n\n  if (uart == UART1) RCC-\u003eAPB2ENR |= BIT(4);\n  if (uart == UART2) RCC-\u003eAPB1ENR |= BIT(17);\n  if (uart == UART3) RCC-\u003eAPB1ENR |= BIT(18);\n\n  if (uart == UART1) tx = PIN('A', 9), rx = PIN('A', 10);\n  if (uart == UART2) tx = PIN('A', 2), rx = PIN('A', 3);\n  if (uart == UART3) tx = PIN('D', 8), rx = PIN('D', 9);\n\n  gpio_set_mode(tx, GPIO_MODE_AF);\n  gpio_set_af(tx, af);\n  gpio_set_mode(rx, GPIO_MODE_AF);\n  gpio_set_af(rx, af);\n  uart-\u003eCR1 = 0;                           // Disable this UART\n  uart-\u003eBRR = FREQ / baud;                 // FREQ is a UART bus frequency\n  uart-\u003eCR1 |= BIT(13) | BIT(2) | BIT(3);  // Set UE, RE, TE\n}\n```\n\nAnd, finally, functions for 
reading and writing to the UART.\nThe datasheet section 30.6.1 tells us that the status register SR tells us\nwhether data is ready:\n```c\nstatic inline int uart_read_ready(struct uart *uart) {\n  return uart-\u003eSR \u0026 BIT(5);  // If RXNE bit is set, data is ready\n}\n```\n\nThe data byte itself can be fetched from the data register DR:\n```c\nstatic inline uint8_t uart_read_byte(struct uart *uart) {\n  return (uint8_t) (uart-\u003eDR \u0026 255);\n}\n```\n\nTransmitting a single byte can be done via the data register too. After\nsetting a byte to write, we need to wait for the transmission to end, indicated\nvia bit 7 in the status register:\n```c\nstatic inline void uart_write_byte(struct uart *uart, uint8_t byte) {\n  uart-\u003eDR = byte;\n  while ((uart-\u003eSR \u0026 BIT(7)) == 0) spin(1);\n}\n```\n\nAnd writing a buffer:\n```c\nstatic inline void uart_write_buf(struct uart *uart, char *buf, size_t len) {\n  while (len-- \u003e 0) uart_write_byte(uart, *(uint8_t *) buf++);\n}\n```\n\nNow, initialise UART in our main() function:\n\n```c\n  ...\n  uart_init(UART3, 115200);              // Initialise UART\n```\n\nNow, we're ready to print a message \"hi\\r\\n\" every time LED blinks!\n```c\n    if (timer_expired(\u0026timer, period, s_ticks)) {\n      ...\n      uart_write_buf(UART3, \"hi\\r\\n\", 4);  // Write message\n    }\n```\n\nRebuild, reflash, and attach a terminal program to the ST-LINK port.\nOn my Mac workstation, I use `cu`. It also can be used on Linux. On Windows,\nusing `putty` utility can be a good idea. 
Run a terminal and see the messages:\n\n```sh\n$ cu -l /dev/YOUR_SERIAL_PORT -s 115200\nhi\nhi\n```\n\nYou can find the complete project source code in the [steps/step-3-uart](steps/step-3-uart) folder.\n\n## Redirect printf() to UART\n\nIn this section, we replace the `uart_write_buf()` call with a `printf()` call, which\ngives us the ability to do formatted output - and increases our ability to\nprint diagnostic information, implementing so-called \"printf-style debugging\".\n\nThe GNU ARM toolchain that we're using comes not only with a GCC compiler and\nother tools, but also with a C library called newlib,\nhttps://sourceware.org/newlib. The newlib library was developed by Red Hat for\nembedded systems.\n\nIf our firmware calls a standard C library function, for example `strcmp()`,\nthen the newlib code will be added to our firmware by the GCC linker.\n\nSome of the standard C functions that newlib implements, specifically, file\ninput/output (IO) operations, are implemented by newlib in a special fashion: those\nfunctions eventually call a set of low-level IO functions called \"syscalls\".\n\nFor example:\n- `fopen()` eventually calls `_open()`\n- `fread()` eventually calls a low level `_read()`\n- `fwrite()`, `fprintf()`, `printf()` eventually call a low level `_write()`\n- `malloc()` eventually calls `_sbrk()`, and so on.\n\nThus, by modifying the `_write()` syscall, we can redirect\nprintf() to whatever we want. That mechanism is called \"IO retargeting\".\n\nNote: STM32 Cube also uses ARM GCC with newlib, that's why Cube projects\ntypically include a `syscalls.c` file.  Other toolchains, like TI's CCS, Keil's\nCC, might use a different C library with a slightly different retargeting\nmechanism. 
We use newlib, so let's modify the `_write()` syscall to print to\nUART3.\n\nBefore that, let's organise our source code in the following way:\n- move all API definitions to the file `hal.h` (Hardware Abstraction Layer)\n- move startup code to `startup.c`\n- create an empty file `syscalls.c` for newlib \"syscalls\"\n- modify Makefile to add `syscalls.c` and `startup.c` to the build\n\nAfter moving all API definitions to `hal.h`, our `main.c` file becomes\nquite compact. Note that it does not have any mention of the low-level\nregisters, just high level API functions that are easy to understand:\n\n```c\n#include \"hal.h\"\n\nstatic volatile uint32_t s_ticks;\nvoid SysTick_Handler(void) {\n  s_ticks++;\n}\n\nint main(void) {\n  uint16_t led = PIN('B', 7);            // Blue LED\n  systick_init(16000000 / 1000);         // Tick every 1 ms\n  gpio_set_mode(led, GPIO_MODE_OUTPUT);  // Set blue LED to output mode\n  uart_init(UART3, 115200);              // Initialise UART\n  uint32_t timer = 0, period = 500;      // Declare timer and 500ms period\n  for (;;) {\n    if (timer_expired(\u0026timer, period, s_ticks)) {\n      static bool on;                      // This block is executed\n      gpio_write(led, on);                 // Every `period` milliseconds\n      on = !on;                            // Toggle LED state\n      uart_write_buf(UART3, \"hi\\r\\n\", 4);  // Write message\n    }\n    // Here we could perform other activities!\n  }\n  return 0;\n}\n```\n\nGreat, now let's retarget printf to UART3. In the empty syscalls.c,\ncopy/paste the following code:\n\n```c\n#include \"hal.h\"\n\nint _write(int fd, char *ptr, int len) {\n  if (fd == 1) uart_write_buf(UART3, ptr, (size_t) len);\n  return len;  // Report all bytes as written; -1 would signal an error\n}\n```\n\nHere we say: if the file descriptor we're writing to is 1 (which is the\nstandard output descriptor), then write the buffer to UART3. Otherwise,\nignore. 
This is the essence of retargeting!\n\nRebuilding this firmware results in a bunch of linker errors:\n\n```sh\n../../arm-none-eabi/lib/thumb/v7e-m+fp/hard/libc_nano.a(lib_a-sbrkr.o): in function `_sbrk_r':\nsbrkr.c:(.text._sbrk_r+0xc): undefined reference to `_sbrk'\ncloser.c:(.text._close_r+0xc): undefined reference to `_close'\nlseekr.c:(.text._lseek_r+0x10): undefined reference to `_lseek'\nreadr.c:(.text._read_r+0x10): undefined reference to `_read'\nfstatr.c:(.text._fstat_r+0xe): undefined reference to `_fstat'\nisattyr.c:(.text._isatty_r+0xc): undefined reference to `_isatty'\n```\n\nSince we've used a newlib stdio function, we need to supply newlib with the\nrest of the syscalls. We add simple stubs that do nothing, with the exception of\n`_sbrk()`. It needs to be implemented, since `printf()` calls `malloc()` which\ncalls `_sbrk()`:\n\n```c\nint _fstat(int fd, struct stat *st) {\n  (void) fd, (void) st;\n  return -1;\n}\n\nvoid *_sbrk(int incr) {\n  extern char _end;\n  static unsigned char *heap = NULL;\n  unsigned char *prev_heap;\n  if (heap == NULL) heap = (unsigned char *) \u0026_end;\n  prev_heap = heap;\n  heap += incr;\n  return prev_heap;\n}\n\nint _close(int fd) {\n  (void) fd;\n  return -1;\n}\n\nint _isatty(int fd) {\n  (void) fd;\n  return 1;\n}\n\nint _read(int fd, char *ptr, int len) {\n  (void) fd, (void) ptr, (void) len;\n  return -1;\n}\n\nint _lseek(int fd, int ptr, int dir) {\n  (void) fd, (void) ptr, (void) dir;\n  return 0;\n}\n```\n\nNow, a rebuild gives no errors. Last step: replace the `uart_write_buf()`\ncall in the `main()` function with a `printf()` call that prints something\nuseful, e.g. the LED status and the current value of systick:\n\n```c\nprintf(\"LED: %d, tick: %lu\\r\\n\", on, s_ticks);  // Write message\n```\n\nThe serial output looks like this:\n\n```sh\nLED: 1, tick: 250\nLED: 0, tick: 500\nLED: 1, tick: 750\nLED: 0, tick: 1000\n```\n\nCongratulations! 
We learned how IO retargeting works, and\ncan now printf-debug our firmware.\nA complete project source code you can find in [steps/step-4-printf](steps/step-4-printf) folder.\n\n## Debug with Segger Ozone\n\nWhat if our firmware is stuck somewhere and printf debug does not work?\nWhat if even a startup code does not work? We would need a debugger. There\nare many options, but I'd recommend using an Ozone debugger from Segger.\nWhy? Because it is stand-alone. It does not need any IDE set up. We can\nfeed our `firmware.elf` directly to Ozone, and it'll pick up our source files.\n\nSo, [download Ozone](https://www.segger.com/products/development-tools/ozone-j-link-debugger/)\nfrom the Segger website. Before we can use it with our Nucleo board,\nwe need to convert ST-LINK firmware on the onboard debugger to the jlink firmware\nthat Ozone understands. Follow the [instructions](https://www.segger.com/products/debug-probes/j-link/models/other-j-links/st-link-on-board/)\non the Segger site.\n\nNow, run Ozone. Choose our device in the wizard:\n\n\u003cimg src=\"images/ozone1.png\" width=\"50%\" /\u003e\n\nSelect a debugger we're going to use - that should be a ST-LINK:\n\n\u003cimg src=\"images/ozone2.png\" width=\"50%\" /\u003e\n\nChoose our firmware.elf file:\n\n\u003cimg src=\"images/ozone3.png\" width=\"50%\" /\u003e\n\nLeave the defaults on the next screen, click Finish, and we've got our\ndebugger loaded (note the hal.h source code is picked up):\n\n![](images/ozone4.png)\n\nClick the green button to download, run the firmware, and we're stopped here:\n\n![](images/ozone5.png)\n\nNow we can single-step through code, set breakpoints, and do the usual debugging\nstuff. One thing that could be noted, is a handy Ozone peripheral view:\n\n![](images/ozone6.png)\n\nUsing it, we can directly examine or set the state of the peripherals. For\nexample, let's turn on a green on-board LED (PB0):\n\n1. We need to clock GPIOB first. 
Find Peripherals -\u003e RCC -\u003e AHB1ENR,\n   and enable GPIOBEN bit - set it to 1:\n  \u003cimg src=\"images/ozone7.png\" width=\"75%\" /\u003e\n2. Find Peripherals -\u003e GPIO -\u003e GPIOB -\u003e MODER, set MODER0 to 1 (output): \n  \u003cimg src=\"images/ozone8.png\" width=\"75%\" /\u003e\n3. Find Peripherals -\u003e GPIO -\u003e GPIOB -\u003e ODR, set ODR0 to 1 (on): \n  \u003cimg src=\"images/ozone9.png\" width=\"75%\" /\u003e\n\nNow, the green LED should be on! Happy debugging.\n\n## Vendor CMSIS headers\n\nIn the previous sections, we have developed the firmware using only datasheets,\nan editor, and the GCC compiler. We have created peripheral structure definitions\nmanually, using datasheets.\n\nNow that you know how it all works, it is time to introduce CMSIS headers.\nWhat are they? These are header files with all definitions, created and supplied\nby the MCU vendor. They contain definitions for everything that the MCU contains,\ntherefore they are rather big.\n\nCMSIS stands for Common Microcontroller Software Interface Standard, thus it is\na common ground for the MCU manufacturers to specify peripheral API.  Since\nCMSIS is an ARM standard, and since CMSIS headers are supplied by the MCU\nvendor, they are the source of authority. Therefore, using vendor\nheaders is the preferred way, rather than writing definitions manually.\n\nThere are two sets of CMSIS headers:\n- First, there are ARM Core CMSIS headers. They describe the ARM core,\n  and are published by ARM on GitHub: https://github.com/ARM-software/CMSIS_5\n- Second, there are MCU vendor CMSIS headers. They describe MCU peripherals,\n  and are published by the MCU vendor. In our case, ST publishes them at\n  https://github.com/STMicroelectronics/cmsis_device_f4\n\nWe can pull those headers with a simple Makefile snippet:\nhttps://github.com/cpq/bare-metal-programming-guide/blob/785aa2ead0432fc67327781c82b9c41149fba158/step-5-cmsis/Makefile#L27-L31\n\nThe ST CMSIS package also provides startup files for all their MCUs. 
We\ncan use those instead of hand-writing the startup.c. The ST-provided startup\nfile calls the `SystemInit()` function, so we define it in `main.c`.\n\nNow, let's replace our API functions in `hal.h` using CMSIS definitions,\nand leave the rest of the firmware intact.  From `hal.h`, remove all\nperipheral API and definitions, and leave only the standard C includes, the vendor CMSIS\ninclude, the defines for PIN, BIT, FREQ, and the `timer_expired()` helper function.\n\nIf we try to rebuild the firmware - `make clean build`, then GCC will fail\ncomplaining about missing `systick_init()`, `GPIO_MODE_OUTPUT`, `uart_init()`,\nand `UART3`. Let's add those using STM32 CMSIS files.\n\nLet's start with `systick_init()`. ARM core CMSIS headers provide a\n`SysTick_Config()` function that does the same - so we'll use it.\n\nNext goes the `gpio_set_mode()` function. The `stm32f429xx.h` header has a\n`GPIO_TypeDef` structure, identical to our `struct gpio`. Let's use it:\nhttps://github.com/cpq/bare-metal-programming-guide/blob/52e1a8acd30e60eba4c119e22b609571e39a86e0/step-5-cmsis/hal.h#L24-L28\n\nThe `gpio_set_af()` and `gpio_write()` functions are also trivial -\nsimply replace `struct gpio` with `GPIO_TypeDef`, and that's all.\n\nNext goes UART. There is a `USART_TypeDef`, and defines for USART1, USART2,\nUSART3. Let's use them:\n\n```c\n#define UART1 USART1\n#define UART2 USART2\n#define UART3 USART3\n```\n\nIn `uart_init()` and the rest of the UART functions, change `struct uart` to\n`USART_TypeDef`. The rest stays the same!\n\nAnd we are done. Rebuild, reflash the firmware. The LED blinks, the UART\nshows the output. Congratulations, we have adapted our firmware code to\nuse vendor CMSIS header files. 
Now let's reorganise the repository a bit\nby moving all standard files into the `include` directory and updating the Makefile\nto let GCC know about it:\nhttps://github.com/cpq/bare-metal-programming-guide/blob/785aa2ead0432fc67327781c82b9c41149fba158/step-5-cmsis/Makefile#L4\n\nAlso, let's include CMSIS header pulling as a dependency for the binary:\nhttps://github.com/cpq/bare-metal-programming-guide/blob/785aa2ead0432fc67327781c82b9c41149fba158/step-5-cmsis/Makefile#L18\n\nWe are left with a project template that can be reused for future\nprojects.  You can find the complete project source code in\n[steps/step-5-cmsis](steps/step-5-cmsis)\n\n\n## Setting up clocks\n\nAfter boot, the Nucleo-F429ZI CPU runs at 16MHz. The maximum frequency is 180MHz.\nNote that the system clock frequency is not the only factor we need to care about.\nPeripherals are attached to different buses, APB1 and APB2, which are clocked\ndifferently.  Their clock speeds are configured by the frequency prescaler\nvalues, set in the RCC. The main CPU clock source can also be\ndifferent - we can use either an external crystal oscillator (HSE) or an\ninternal oscillator (HSI). In our case, we'll use HSI.\n\nWhen the CPU executes instructions from flash, the flash read speed (which is around\n25MHz) becomes a bottleneck as the CPU clock gets higher. There are several tricks\nthat can help. Instruction prefetch is one. Also, we can give a clue to the\nflash controller about how much faster the system clock is: that value is called flash\nlatency. For a 180MHz system clock, the `FLASH_LATENCY` value is 5. Bits 8 and 9\nin the flash access control register enable instruction prefetch and the instruction cache:\n\n```c\n  FLASH-\u003eACR |= FLASH_LATENCY | BIT(8) | BIT(9);      // Flash latency, caches\n```\n\nThe clock source (HSI or HSE) goes through a piece of hardware called\nPLL, which multiplies the source frequency by a certain value. 
Then, a set of\nfrequency dividers is used to set the system clock and APB1, APB2 clocks.\nIn order to obtain the maximum system clock of 180MHz, multiple values\nof PLL dividers and APB prescalers are possible. Section 6.3.3 of the\ndatasheet tells us the maximum values for the APB1 clock: \u003c= 45MHz,\nand the APB2 clock: \u003c= 90MHz. That narrows down the list of possible\ncombinations. Here we chose the values manually. Note that tools like\nCubeMX can automate the process and make it easy and visual.\n\nhttps://github.com/cpq/bare-metal-programming-guide/blob/9a3f9bc7b07d6a2a114581979e5b6715754c87c1/step-6-clock/hal.h#L20-L28\n\nNow we're ready to set up the clocks. A simple algorithm for the CPU and peripheral buses\nmay look like this:\n\n- Optionally, enable FPU\n- Set flash latency\n- Decide on a clock source, and PLL, APB1 and APB2 prescalers\n- Configure RCC by setting respective values\n- Move clock initialization into a separate file `sysinit.c`, function\n  `SystemInit()` which is automatically called by the startup code\n\nhttps://github.com/cpq/bare-metal-programming-guide/blob/9a3f9bc7b07d6a2a114581979e5b6715754c87c1/step-6-clock/sysinit.c#L10-L26\n\nWe also need to change `hal.h` - specifically, the UART initialization code.\nDifferent UART controllers run on different buses: UART1 runs on the\nfast APB2, while UART2 and UART3 run on the slower APB1. When running on the\ndefault 16MHz clock, that did not make a difference. But when running at\nhigher speeds, APB1 and APB2 may have different clocks, thus we need to\nadjust the baud rate calculation for the UART:\nhttps://github.com/cpq/bare-metal-programming-guide/blob/9a3f9bc7b07d6a2a114581979e5b6715754c87c1/step-6-clock/hal.h#L90-L107\n\nRebuild and reflash, and our board runs at its maximum speed, 180MHz!\nYou can find the complete project source code in [steps/step-6-clock](steps/step-6-clock)\n\n## Web server with device dashboard\n\nThe Nucleo-F429ZI comes with Ethernet on-board. 
Ethernet hardware needs\ntwo components: a PHY (which transmits/receives electrical signals to the\nmedia like copper, optical cable, etc) and MAC (which drives PHY controller).\nOn our Nucleo, the MAC controller is built-in, and the PHY is external\n(specifically, it is Microchip's LAN8720a).\n\nMAC and PHY can talk several interfaces, we'll use RMII. For that, a bunch\nof pins must be configured to use their Alternative Function (AF).\nTo implement a web server, we need 3 software components:\n- a network driver, which sends/receives Ethernet frames to/from MAC controller\n- a network stack, that parses frames and understands TCP/IP\n- a network library that understands HTTP\n\nWe will use [Mongoose Network Library](https://github.com/cesanta/mongoose)\nwhich implements all of that in a single file. It is a dual-licensed library\n(GPLv2/commercial) that was designed to make network embedded development\nfast and easy.\n\nSo, copy\n[mongoose.c](https://raw.githubusercontent.com/cesanta/mongoose/master/mongoose.c)\nand\n[mongoose.h](https://raw.githubusercontent.com/cesanta/mongoose/master/mongoose.h)\nto our project. Now we have a driver, a network stack, and a library at hand.\nMongoose also provides a large set of examples, and one of them is a\n[device dashboard example](https://github.com/cesanta/mongoose/tree/master/examples/device-dashboard).\nIt implements lots of things - like dashboard login, real-time data exchange\nover WebSocket, embedded file system, MQTT communication, etcetera.  So let's\nuse that example. Copy two extra files:\n- [net.c](https://raw.githubusercontent.com/cesanta/mongoose/master/examples/device-dashboard/net.c) - implements dashboard functionality\n- [packed_fs.c](https://raw.githubusercontent.com/cesanta/mongoose/master/examples/device-dashboard/packed_fs.c) - contains HTML/CSS/JS GUI files\n\nWhat we need is to tell Mongoose which functionality to enable. That can\nbe done via compilation flags, by setting preprocessor constants. 
Alternatively,\nthe same constants can be set in the `mongoose_custom.h` file. Let's go\nthe second way. Create a `mongoose_custom.h` file with the following contents:\n\n```c\n#pragma once\n#define MG_ARCH MG_ARCH_NEWLIB\n#define MG_ENABLE_MIP 1\n#define MG_ENABLE_PACKED_FS 1\n#define MG_IO_SIZE 512\n#define MG_ENABLE_CUSTOM_MILLIS 1\n```\n\nNow it's time to add some networking code to main.c. We `#include \"mongoose.h\"`,\ninitialise Ethernet RMII pins and enable Ethernet in the RCC:\n\n```c\n  uint16_t pins[] = {PIN('A', 1),  PIN('A', 2),  PIN('A', 7),\n                     PIN('B', 13), PIN('C', 1),  PIN('C', 4),\n                     PIN('C', 5),  PIN('G', 11), PIN('G', 13)};\n  for (size_t i = 0; i \u003c sizeof(pins) / sizeof(pins[0]); i++) {\n    gpio_init(pins[i], GPIO_MODE_AF, GPIO_OTYPE_PUSH_PULL, GPIO_SPEED_INSANE,\n              GPIO_PULL_NONE, 11);\n  }\n  nvic_enable_irq(61);                          // Setup Ethernet IRQ handler\n  RCC-\u003eAPB2ENR |= BIT(14);                      // Enable SYSCFG\n  SYSCFG-\u003ePMC |= BIT(23);                       // Use RMII. Goes first!\n  RCC-\u003eAHB1ENR |= BIT(25) | BIT(26) | BIT(27);  // Enable Ethernet clocks\n  RCC-\u003eAHB1RSTR |= BIT(25);                     // ETHMAC force reset\n  RCC-\u003eAHB1RSTR \u0026= ~BIT(25);                    // ETHMAC release reset\n```\n\nMongoose's driver uses the Ethernet interrupt, thus we need to update `startup.c`\nand add `ETH_IRQHandler` to the vector table. Let's reorganise the vector table\ndefinition in `startup.c` in a way that does not require any modification\nto add an interrupt handler function. The idea is to use the \"weak symbol\"\nconcept.\n\nA function can be marked \"weak\", and it works like a normal function. The\ndifference comes when the source code defines a function with the same name\nelsewhere. 
Normally, two functions with the same name make a build fail.\nHowever, if one function is marked weak, the build succeeds and the linker\nselects the non-weak function. This makes it possible to provide a \"default\"\nfunction in boilerplate code and to override it by simply creating a\nfunction with the same name elsewhere in the code.\n\nHere is how it works in our case. We want to fill the vector table with default\nhandlers, but give the user the ability to override any handler. For that, we create\na function `DefaultIRQHandler()` and mark it weak. Then, for every IRQ handler,\nwe declare a handler name and make it an alias to `DefaultIRQHandler()`:\n\n```c\nvoid __attribute__((weak)) DefaultIRQHandler(void) {\n  for (;;) (void) 0;\n}\n#define WEAK_ALIAS __attribute__((weak, alias(\"DefaultIRQHandler\")))\n\nWEAK_ALIAS void NMI_Handler(void);\nWEAK_ALIAS void HardFault_Handler(void);\nWEAK_ALIAS void MemManage_Handler(void);\n...\n__attribute__((section(\".vectors\"))) void (*const tab[16 + 91])(void) = {\n    0, _reset, NMI_Handler, HardFault_Handler, MemManage_Handler,\n    ...\n```\n\nNow, we can define any IRQ handler in our code, and it will replace the default\none. 
This is what happens in our case: there is an `ETH_IRQHandler()` defined\nby Mongoose's STM32 driver, which replaces the default handler.\n\nThe next step is to initialise the Mongoose library: create an event manager,\nset up the network driver, and start a listening HTTP connection:\n\n```c\n  struct mg_mgr mgr;        // Initialise Mongoose event manager\n  mg_mgr_init(\u0026mgr);        // and attach it to the MIP interface\n  mg_log_set(MG_LL_DEBUG);  // Set log level\n\n  struct mip_driver_stm32 driver_data = {.mdc_cr = 4};  // See driver_stm32.h\n  struct mip_if mif = {\n      .mac = {2, 0, 1, 2, 3, 5},\n      .use_dhcp = true,\n      .driver = \u0026mip_driver_stm32,\n      .driver_data = \u0026driver_data,\n  };\n  mip_init(\u0026mgr, \u0026mif);\n  extern void device_dashboard_fn(struct mg_connection *, int, void *, void *);\n  mg_http_listen(\u0026mgr, \"http://0.0.0.0\", device_dashboard_fn, \u0026mgr);\n  MG_INFO((\"Init done, starting main loop\"));\n```\n\nWhat is left is to add a `mg_mgr_poll()` call to the main loop.\n\nNow, add the `mongoose.c`, `net.c` and `packed_fs.c` files to the Makefile.\nRebuild and reflash the board. 
Attach a serial console to the debug output\nand observe that the board obtains an IP address over DHCP:\n\n```\n847 3 mongoose.c:6784:arp_cache_add     ARP cache: added 0xc0a80001 @ 90:5c:44:55:19:8b\n84e 2 mongoose.c:6817:onstatechange     READY, IP: 192.168.0.24\n854 2 mongoose.c:6818:onstatechange            GW: 192.168.0.1\n859 2 mongoose.c:6819:onstatechange            Lease: 86363 sec\nLED: 1, tick: 2262\nLED: 0, tick: 2512\n```\n\nFire up a browser at that IP address, and get a working dashboard, with\na real-time graph over WebSocket, with MQTT, authentication, and other things!\nSee the\n[full description](https://github.com/cesanta/mongoose/tree/master/examples/device-dashboard)\nfor more details.\n\n![Device dashboard](https://raw.githubusercontent.com/cesanta/mongoose/master/examples/device-dashboard/screenshots/dashboard.png)\n\nThe complete project source code is in the\n[steps/step-7-webserver](steps/step-7-webserver) directory.\n\n## Automated firmware builds (software CI)\n\nIt is good practice for a software project to have continuous\nintegration (CI). On every change pushed to the\nrepository, CI automatically rebuilds and tests all components.\n\nGitHub makes it easy to do. We can create a `.github/workflows/test.yml` file,\nwhich is a CI configuration file. In that file, we can install ARM GCC\nand run `make` in every example directory to build the respective firmware.\n\nLong story short! This tells GitHub to run on every repo push:\nhttps://github.com/cpq/bare-metal-programming-guide/blob/b0820b5c62b74a9b4456854feb376cda8cde4ecd/.github/workflows/test.yml#L1-L2\n\nThis installs the ARM GCC compiler:\nhttps://github.com/cpq/bare-metal-programming-guide/blob/b0820b5c62b74a9b4456854feb376cda8cde4ecd/.github/workflows/test.yml#L9\n\nThis builds firmware in every example directory:\nhttps://github.com/cpq/bare-metal-programming-guide/blob/b0820b5c62b74a9b4456854feb376cda8cde4ecd/.github/workflows/test.yml#L10-L18\n\nThat's it!  
Extremely simple and extremely powerful. Now if we push a change to\nthe repo which breaks a build, GitHub will notify us. On success, GitHub will\nkeep quiet. See an [example successful\nrun](https://github.com/cpq/bare-metal-programming-guide/actions/runs/3840030588).\n\n\n## Automated firmware tests (hardware CI)\n\nWouldn't it be great to also test built firmware binaries on real hardware, to\ntest not only the build process, but also that the built firmware is correct and\nfunctional?\n\nIt is not trivial to build such a system ad hoc. For example,\none can set up a dedicated test workstation, attach a tested device\n(e.g. Nucleo-F429ZI board) to it, and write a piece of software for remote\nfirmware upload and test using a built-in debugger. Possible, but fragile:\nit takes a lot of effort and needs a lot of attention.\n\nThe alternative is to use one of the commercial hardware test systems (or EBFs,\nEmbedded Board Farms), though such commercial solutions are quite expensive.\n\nBut there is an easy way.\n\n### Solution: ESP32 + vcon.io\n\nUsing the https://vcon.io service, which implements remote firmware update and\nUART monitor, we can:\n\n1. Take any ESP32 or ESP32C3 device (e.g. any inexpensive development board)\n2. Flash a pre-built firmware on it, turning ESP32 into a remotely-controlled programmer\n3. Wire ESP32 to your target device: SWD pins for flashing, UART pins for capturing output\n4. Configure ESP32 to register on the https://dash.vcon.io management dashboard\n\nWhen done, your target device will have an authenticated, secure RESTful\nAPI for reflashing and capturing device output. It can be called from anywhere,\nfor example from the software CI:\n\n![VCON module operation](images/hero.svg)\n\nNote: the [vcon.io](https://vcon.io) service is run by Cesanta - the company I\nwork for. 
It is a paid service with a freebie quota: if you have just a few\ndevices to manage, it is completely free.\n\n### Configuring and wiring ESP32\n\nTake any ESP32 or ESP32C3 device - a devboard, a module, or your custom device.\nMy recommendation is the ESP32C3 XIAO devboard\n([buy on Digikey](https://www.digikey.ie/en/products/detail/seeed-technology-co-ltd/113991054/16652880))\nbecause of its low price (about 5 EUR) and small form factor.\n\nWe're going to assume that the target device is an RP2040-based\n[W5500-EVB-Pico](https://docs.wiznet.io/Product/iEthernet/W5500/w5500-evb-pico)\nboard with a built-in Ethernet interface. If your device is different,\nadjust the \"Wiring\" step according to your device's pinout.\n\n- Follow [Flashing ESP32](https://vcon.io/docs/#module-flashing) to flash your ESP32\n- Follow [Network Setup](https://vcon.io/docs/#module-registration) to register ESP32 on https://dash.vcon.io\n- Follow [Wiring](https://vcon.io/docs/#module-to-device-wiring) to wire ESP32 to your device\n\nThis is what a configured breadboard setup may look like:\n![](images/breadboard.webp)\n\nThis is what a configured device dashboard looks like:\n![](images/screenshot.webp)\n\nNow, you can reflash your device with a single command:\n\n```sh\ncurl -su :API_KEY https://dash.vcon.io/api/v3/devices/ID/ota --data-binary @firmware.bin\n```\n\nWhere `API_KEY` is the dash.vcon.io authentication key, `ID` is the registered\ndevice number, and `firmware.bin` is the name of the newly built firmware.  You\ncan get the `API_KEY` by clicking on the \"api key\" link on the dashboard.  
The\ndevice ID is listed in the table.\n\nWe can also capture device output with a single command:\n\n```sh\ncurl -su :API_KEY https://dash.vcon.io/api/v3/devices/ID/tx?t=5\n```\n\nThere, `t=5` means wait 5 seconds while capturing UART output.\n\nNow, we can use those two commands in any software CI platform to test new\nfirmware on a real device, and check the device's UART output against some expected\nkeywords.\n\n### Integrating with GitHub Actions\n\nOkay, our software CI builds a firmware image for us. It would be nice to\ntest that firmware image on real hardware. And now we can!\nWe need to add a few extra commands that use the `curl` utility to send the built\nfirmware to the test board, and then capture its debug output.\n\nA `curl` command requires a secret API key, which we do not want to expose to\nthe public. The right way to go is to:\n1. Go to the project settings / Secrets / Actions\n2. Click on the \"New repository secret\" button\n3. Give it a name, `VCON_API_KEY`, paste the value into the \"Secret\" box, click \"Add secret\"\n\nOne of the example projects builds firmware for the RP2040-W5500 board, so\nlet's flash it using a `curl` command and a saved API key. The best way is\nto add a Makefile target for testing, and let GitHub Actions (our software CI)\ncall it:\nhttps://github.com/cpq/bare-metal-programming-guide/blob/8d419f5e7718a8dcacad2ddc2f899eb75f64271e/.github/workflows/test.yml#L18\n\nNote that we pass a `VCON_API_KEY` environment variable to `make`. Also note\nthat we're invoking the `test` Makefile target, which should build and test our\nfirmware. 
Here is the `test` Makefile target:\nhttps://github.com/cpq/bare-metal-programming-guide/blob/d9bced31b1ccde8eca4d6dc38440e104dba053ce/step-7-webserver/pico-w5500/Makefile#L32-L39\n\nExplanation:\n- line 34: The `test` target depends on the `upload` target, so `upload`\n  is executed first (see line 38)\n- line 35: Capture UART log for 5 seconds and save it to `/tmp/output.txt`\n- line 36: Search for the string `Ethernet: up` in the output, and fail if it\n  is not found\n- line 38: The `upload` target depends on `build`, so we always build firmware\n  before testing\n- line 39: We flash firmware remotely. The `--fail` flag to the `curl` utility\n  makes it exit with an error if the server returns an HTTP error response\n  (status 400 or above)\n\nThis is the example output of the `make test` command described above:\n\n```sh\n$ make test\ncurl --fail ...\n{\"success\":true,\"written\":59904}\ncurl --fail ...\n3f3 2 main.c:65:main                    Ethernet: down\n7d7 1 mongoose.c:6760:onstatechange     Link up\n7e5 3 mongoose.c:6843:tx_dhcp_discover  DHCP discover sent\n7e8 2 main.c:65:main                    Ethernet: up\n81d 3 mongoose.c:6726:arp_cache_add     ARP cache: added 192.168.0.1 @ 90:5c:44:55:19:8b\n822 2 mongoose.c:6752:onstatechange     READY, IP: 192.168.0.24\n827 2 mongoose.c:6753:onstatechange            GW: 192.168.0.1\n82d 2 mongoose.c:6755:onstatechange            Lease: 86336 sec\nbc3 2 main.c:65:main                    Ethernet: up\nfab 2 main.c:65:main                    Ethernet: up\n```\n\nDone! Now, our automatic tests ensure that the firmware can be built, that\nit boots, and that it initialises the network stack correctly.  This mechanism\ncan be easily extended: just add more complex actions in your firmware binary,\nprint the result to the UART, and check for the expected output in the test.\n\nHappy testing!\n\n## About me\n\nI am Sergey Lyubka, an engineer and entrepreneur. I hold an MSc in Physics from\nKyiv State University, Ukraine. 
I am a director and co-founder at Cesanta - a\ntechnology company based in Dublin, Ireland. Cesanta develops embedded solutions:\n- https://mongoose.ws - an open source HTTP/MQTT/Websocket network library\n- https://vcon.io - a remote firmware update / serial monitor framework\n\nYou are welcome\nto [register for my free webinar on embedded network programming](https://mongoose.ws/webinars/).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcpq%2Fbare-metal-programming-guide","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcpq%2Fbare-metal-programming-guide","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcpq%2Fbare-metal-programming-guide/lists"}