# Booting a Cortex-M Microcontroller

---

In this deck, we're talking specifically about Arm Cortex-M based microcontrollers. Other Arm processors, and processors from other companies, may vary.

---

## Terms

* Processor - the core that executes instructions
* Peripheral - a hardware block for performing dedicated tasks
* Flash - the non-volatile *flash memory* that the code and the constants live in
* RAM - the volatile *random-access memory* that the global variables, heap and stack live in
* SoC - the *system-on-a-chip* that contains (a) processor(s), some peripherals, and usually some memory
* MCU - a *microcontroller unit* is a more specialized kind of SoC

Note:

- Examples of peripherals: a UART or an RNG hardware block.
- MCUs usually include components similar to those of an SoC and are designed to run an embedded system. They are usually also less complex.

---

## An example

* Arm Cortex-M4 - a 32-bit processor core aimed at microcontrollers
* Use the `thumbv7em-none-eabi` or `thumbv7em-none-eabihf` targets
* nRF52840 - a SoC from Nordic Semi that uses that processor core

---

## An example (2)

* Arm Cortex-M0+ - a smaller, simpler, 32-bit processor core
* Use the `thumbv6m-none-eabi` target
* RP2040 - a SoC from Raspberry Pi that has *two* of those processor cores

---

## Booting a Cortex-M

The [Arm Architecture Reference Manual](https://developer.arm.com/documentation/ddi0403/ee/?lang=en) explains we must provide:

* Stack Pointer
* Reset Pointer
* Exception Pointers
* ...
* Interrupt Pointers
* ...

The chip does everything else.

Note:

There are fourteen defined Exception Handlers (if the chip does not support a particular Exception, you must use the special value `0x0000_0000`). The number of interrupt handlers is defined by the SoC - the Arm NVIC can handle up to 240 interrupts in Armv7-M or 480 interrupts in Armv8-M.

---

## The steps

1. Make an array, or struct, with those two (or more) words in it
2. Convince the linker to put it at the right memory address
3. Profit

---

## C vector table

```c
// Symbol defined by the linker script; only its address is used.
extern unsigned long _stack_top;
void rst_handler(void);
void nmi_handler(void);

__attribute__ ((section(".vector_table")))
unsigned long myvectors[] = {
    (unsigned long) &_stack_top,
    (unsigned long) rst_handler,
    (unsigned long) nmi_handler,
    // ...
};
```

---

## Rust vector table - type definitions

This is possible in Rust as well, but is a bit more involved due to stronger typing rules.

```rust ignore
unsafe extern "C" {
    static mut _stack_top: usize;
}

// `repr(C)` keeps the fields in declaration order.
#[repr(C)]
pub struct VectorTable {
    stack_top: *const usize,
    rst_handler: unsafe extern "C" fn(),
    nmi_handler: unsafe extern "C" fn(),
    // ...
}

// The raw pointer field makes the type `!Sync`, which a `static` does not allow.
unsafe impl Sync for VectorTable {}
```

---

## Rust vector table

```rust ignore
#[unsafe(link_section = ".vector_table")]
#[unsafe(no_mangle)]
static VECTOR_TABLE: VectorTable = VectorTable {
    // Create a raw pointer from the stack top address.
    stack_top: &raw const _stack_top,
    rst_handler,
    nmi_handler,
    // ...
};
```

Note:

The cortex-m-rt crate does not use a dedicated `VectorTable` struct. Instead it places some of the individual vector table components into dedicated segments and then places all components in the correct order inside the linker script.

---

## Memory Layout

Most embedded applications written in C/C++ and Rust have a very similar memory and binary layout.

Many systems use separate flash and RAM memory regions:
Note:

Some of those memory segments need to be set up by code! `.bss` is a RAM segment which needs to be zero-initialized, and `.data` is a segment in RAM where the initial values are stored in the flash memory and need to be copied from there.

Some larger embedded applications might also use a heap. In embedded applications, those heaps are oftentimes also statically allocated and might be a part of the `.bss` or `.uninit` segments.

---

## Memory Layout - Unified

Some systems also have a run-time layout where code and data are located in the same memory region:
---

## C Reset Handler

Can be written in C! But it's hazardous.

```c
extern unsigned long __sidata, __sdata, __edata;
extern unsigned long __sbss, __ebss;
extern int main(void);

void rst_handler(void) {
    // Copy the initial values of .data from flash to RAM.
    unsigned long *src = &__sidata;
    unsigned long *dest = &__sdata;
    while (dest < &__edata) {
        *dest++ = *src++;
    }
    // Zero-initialize .bss.
    dest = &__sbss;
    while (dest < &__ebss) {
        *dest++ = 0;
    }
    main();
    // Never return from the reset handler.
    while(1) { }
}
```

Note:

- This code copies `.data` from flash to RAM, and zero-initializes the `.bss` block.
- `__sidata` is the load address in flash.
- Global variables are not initialised when this function is executed. What if the C code touches an uninitialised global variable? C programmers don't worry so much about this because the C language spec does not care about global variables not being initialized when this code runs. Rust programmers definitely worry about this.
- Why is it hazardous? Working with raw pointers is easy to get wrong.

---

## Rust Reset Handler (1)

```rust ignore
unsafe extern "C" {
    // Start and end of the initialized data block (.data).
    // `__sidata` is the load address in flash.
    static mut __sidata: usize;
    static mut __sdata: usize;
    static mut __edata: usize;
    // Start and end of zero-initialized block (.bss).
    static mut __sbss: usize;
    static mut __ebss: usize;
}
```

---

## Rust Reset Handler (2)

```rust ignore
use core::ptr::{addr_of, addr_of_mut};

#[unsafe(no_mangle)]
pub unsafe extern "C" fn rst_handler() {
    unsafe {
        // Copy the initial values of .data from flash to RAM.
        let src = addr_of!(__sidata);
        let dest = addr_of_mut!(__sdata);
        let size = addr_of_mut!(__edata).offset_from(dest);
        for i in 0..size {
            dest.offset(i).write_volatile(src.offset(i).read());
        }
        // Zero-initialize .bss.
        let dest = addr_of_mut!(__sbss);
        let size = addr_of_mut!(__ebss).offset_from(dest);
        for i in 0..size {
            dest.offset(i).write_volatile(0);
        }
    }
}
```

Sadly, this is [UB](https://github.com/rust-embedded/cortex-m-rt/issues/300).
Note:

This is Undefined Behaviour because globals haven't been initialised yet and it is illegal to execute any Rust code in the presence of global variables with invalid values (e.g. a `bool` with an integer value of `2`).

It's also arguably UB because we are violating the rules of pointer provenance: we are using `write_volatile` to write outside the bounds of the objects we have declared to Rust (we said that `__sdata` was *only* a single `usize`). It is now reasonably settled that this is bad in theory, but it's debatable whether it's currently bad in practice (cortex-m-rt got away with it for years). I believe that in time it will get *worse* in practice, so don't do it.

---

## The cortex-m-rt crate

Does all this work for you, in raw Arm assembly language - so it's actually sound.

See [Reset](https://github.com/rust-embedded/cortex-m/blob/c-m-rt-v0.7.3/cortex-m-rt/src/lib.rs#L501), [Linker script](https://github.com/rust-embedded/cortex-m/blob/c-m-rt-v0.7.3/cortex-m-rt/link.x.in), and [Vector table](https://github.com/rust-embedded/cortex-m/blob/c-m-rt-v0.7.3/cortex-m-rt/src/lib.rs#L1130)

---

## The #[entry] macro

* Attaches your `fn main()` to the reset function in cmrt
* Hides your `fn main()` so no-one else can call it
* Remaps `static mut FOO: T` to `static FOO: &mut T` so they are safe

Note:

Why does it map `static mut FOO: T` to `static FOO: &mut T`? We cannot prove to the compiler that using the static mutable variables is actually safe (our entry point is called only once, and cannot be reached otherwise), so the macro does the unsafe part for us.

Expand the entry macro by using `cargo expand --bin hello`. Allowing the `static_mut_refs` lint is required to permit references to mutable statics, which the compiler otherwise complains about.

There is actually some discussion on whether the macro should do this source code manipulation, which might look a little bit like magic to someone who does not know what is happening.
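
The remapping described above can be sketched on a host machine. This is a simplified illustration under stated assumptions, not the macro's actual expansion: `user_main`, `trampoline` and `COUNTER` are hypothetical names, and `user_main` returns a value (a real entry point never returns) only so the effect can be observed.

```rust
use core::ptr::addr_of_mut;

/// Hypothetical stand-in for the user's `main` after rewriting: what was
/// written as `static mut COUNTER: u32` arrives as an exclusive reference.
fn user_main(counter: &'static mut u32) -> u32 {
    *counter += 1; // no `unsafe` needed in user code
    *counter
}

/// Hypothetical stand-in for the hidden function the reset code calls.
fn trampoline() -> u32 {
    static mut COUNTER: u32 = 0;
    // SAFETY (assumption): this function is called exactly once, so no
    // other reference to `COUNTER` can exist at the same time.
    let counter: &'static mut u32 = unsafe { &mut *addr_of_mut!(COUNTER) };
    user_main(counter)
}
```

The `unsafe` block lives in one place and rests on a single argument (the entry point runs once), while the user's code only ever sees an ordinary `&mut`.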
---

## Using the crate

* [knurling-rs/app-template](https://github.com/knurling-rs/app-template)
* [ferrous/rust-training](https://github.com/ferrous-systems/rust-training/tree/main/example-code/qemu-thumbv7em)

---

## Linker scripts

* In Rust they work exactly like they do in `clang` or `gcc`
* Same `.text`, `.rodata`, `.data`, `.bss` sections
* `cortex-m-rt` provides `link.x`, which pulls in a `memory.x` you supply
* You must tell the linker to use `link.x`, with:
  * A build-script
  * `rustflags` in `.cargo/config.toml`, or
  * The `RUSTFLAGS` environment variable
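
Note:

The build-script option might look roughly like the following `build.rs`. This is a minimal sketch: the helper `linker_directives` is a hypothetical name, and it assumes `memory.x` sits next to `Cargo.toml`.

```rust
use std::env;

/// Hypothetical helper: the Cargo directives that point the linker at
/// the directory holding `memory.x` and ask it to use `link.x`.
fn linker_directives(search_dir: &str) -> Vec<String> {
    vec![
        // Let the linker find `memory.x` when `link.x` includes it.
        format!("cargo:rustc-link-search={search_dir}"),
        // Re-run this script when the memory layout changes.
        "cargo:rerun-if-changed=memory.x".to_string(),
        // Pass `-Tlink.x` to the linker.
        "cargo:rustc-link-arg=-Tlink.x".to_string(),
    ]
}

fn main() {
    // Assumption: `memory.x` lives in the package root.
    let dir = env::var("CARGO_MANIFEST_DIR").unwrap_or_else(|_| ".".to_string());
    for line in linker_directives(&dir) {
        println!("{line}");
    }
}
```

Cargo reads the printed `cargo:` lines and forwards the link argument to the linker, so no `rustflags` entry is needed for `-Tlink.x` in this setup.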