Tuesday, June 11, 2019

Early Boot is a Lie

Everything you know about early boot is a lie. The classic introduction to the IBM PC compatible (x86, x64) startup sequence goes as follows. Read-only memory (Boot ROM) loads the Master Boot Record (MBR) from the first Peripheral Component Interconnect (PCI) Device. The MBR loads the rest of the Operating System (OS). Everyone should write an MBR bootloader (it's fun!), but there is a glaring constraint held over from the old days: you have 512 bytes, of which only about 400 are available for machine code. Here is an example MBR bootloader that closely resembles the "Standard MBR". It is traditionally written in assembly, but for clarity we have rewritten it in pseudo-Rust.
// Oversimplified Legacy Standard MBR
pub unsafe fn mbr_main() {
    const SECTOR: usize = 0x200;

    // Pointer to Master Boot Record (MBR)
    // This is empty before shadowing, MBR after shadowing.
    let mut mbr = Vec::from_raw_parts(
        0x0600 as *mut u8, SECTOR, SECTOR);

    // Pointer to Volume Boot Record (VBR)
    // This is the MBR before shadowing, VBR after shadowing.
    let mut vbr = Vec::from_raw_parts(
        0x7c00 as *mut u8, SECTOR, SECTOR);

    // Chainload/Underlay/Shadow Master Boot Record (MBR)
    // This program started running from 0x7c00, but the
    // instruction pointer has moved since then, so we need to 
    // jump to here + 0x0600 - 0x7c00, represented by asm_shadow!().
    ptr::copy(vbr.as_ptr(), mbr.as_mut_ptr(), SECTOR);
    asm_shadow!(mbr.as_ptr() as usize - vbr.as_ptr() as usize);

    // Find first bootable partition entry (oversimplified!)
    let bootable = MbrTable::from(mbr).partitions[0];

    // Chainload/Overlay/Fetch Volume Boot Record (VBR)
    csm::ChsReader::from(bootable).read(vbr.as_mut_slice());

    // Tail call the Volume Boot Record (VBR)
    let vbr_main = vbr.as_ptr() as *const Fn();
    vbr_main();
    unreachable!();
}
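The "find first bootable partition" step above is marked as oversimplified, so here is a minimal sketch of what a slightly less simplified lookup might look like in ordinary, hosted Rust. It assumes the classic MBR layout (446 bytes of code area, four 16-byte partition entries, then the 0x55 0xAA signature) and an active flag of 0x80. The names `PartitionEntry` and `first_bootable` are made up for this illustration and are not the `MbrTable` type used in the pseudo-Rust.

```rust
// Sketch only: classic MBR layout assumed; names are hypothetical.
pub const SECTOR: usize = 0x200;
pub const TABLE_OFFSET: usize = 446; // code area ends here in the classic layout
pub const ENTRY_SIZE: usize = 16;
pub const ENTRY_COUNT: usize = 4;
pub const ACTIVE: u8 = 0x80; // status byte of a bootable entry

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct PartitionEntry {
    pub index: usize,
    pub partition_type: u8,
    pub lba_start: u32,
    pub sector_count: u32,
}

/// Returns the first partition entry flagged as active, or `None` if the
/// boot signature is missing or no entry is marked bootable.
pub fn first_bootable(sector: &[u8; SECTOR]) -> Option<PartitionEntry> {
    // The last two bytes of a valid MBR are 0x55, 0xAA.
    if sector[510] != 0x55 || sector[511] != 0xAA {
        return None;
    }
    (0..ENTRY_COUNT).find_map(|index| {
        let start = TABLE_OFFSET + index * ENTRY_SIZE;
        let raw = &sector[start..start + ENTRY_SIZE];
        if raw[0] != ACTIVE {
            return None;
        }
        Some(PartitionEntry {
            index,
            partition_type: raw[4],
            // LBA fields are stored little-endian.
            lba_start: u32::from_le_bytes([raw[8], raw[9], raw[10], raw[11]]),
            sector_count: u32::from_le_bytes([raw[12], raw[13], raw[14], raw[15]]),
        })
    })
}
```

Passing a 512-byte buffer read from the start of a disk image to `first_bootable` yields the entry a loader would chainload. A real MBR has to do the same thing in 16-bit assembly, inside the space left over after the partition table.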
The first thing to realize is that, since this is the first 512 bytes on the hard drive, people have written many versions of it. There are MBR bootloaders in Windows, macOS, Linux, SYSLINUX, GRUB, coreboot, SeaBIOS, EDK2, and many others. There are far too many versions to discuss here, so I will move on to more interesting things. Let's go in the other direction. What loads the MBR? What jumps to the MBR? Enter the reset vector. The code that goes in the reset vector is called the Volume Top File (VTF). The reset vector is physical address 0xfffffff0 (-16), which has a jump to physical address 0x00007c00 (31744). Sounds like a dream. Here is some equivalent pseudo-Rust:
// Oversimplified Boot Device Selection (BDS)
pub unsafe fn bds_main() {
    const SECTOR: usize = 0x200;

    // Pointer to Master Boot Record (MBR)
    let mut mbr = Vec::from_raw_parts(
        0x7c00 as *mut u8, SECTOR, SECTOR);

    // Load first sector from first hard drive
    csm::ChsReader::default().read(mbr.as_mut_slice());

    // Tail call the Master Boot Record (MBR)
    let mbr_main = mbr.as_ptr() as *const Fn();
    mbr_main();
    unreachable!();
}
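The addresses quoted so far can be checked with plain integer arithmetic. The sketch below is hosted Rust, not boot code, and the constant and function names are invented for this illustration. It shows why 0xfffffff0 reads as -16 when treated as a signed 32-bit value, and what the "here + 0x0600 - 0x7c00" adjustment in the MBR relocation comment works out to.

```rust
// Sketch only: names are hypothetical, values are the ones quoted in the text.
pub const RESET_VECTOR: u32 = 0xffff_fff0;
pub const MBR_LOAD: u32 = 0x7c00; // where the MBR is first loaded
pub const MBR_SHADOW: u32 = 0x0600; // where the MBR copies itself

/// Reinterprets a 32-bit physical address as a two's-complement integer.
pub fn as_signed(addr: u32) -> i32 {
    addr as i32
}

/// Signed distance the instruction pointer must move when code is
/// copied from `from` to `to`.
pub fn relocation_delta(from: u32, to: u32) -> i64 {
    i64::from(to) - i64::from(from)
}

/// Address of the same instruction after the copy.
pub fn relocated(ip: u32, from: u32, to: u32) -> u32 {
    (i64::from(ip) + relocation_delta(from, to)) as u32
}
```

For example, `as_signed(RESET_VECTOR)` is -16, and an instruction at 0x7c20 in the original copy sits at 0x0620 in the shadow copy.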
I put my compiled binary code in the first sector of the hard drive, which gets loaded at 0x7c00, and I never have to worry about anything else. If you have heard of the Power-On Self-Test (POST), you may say to yourself: even if there are processes before the MBR, I don't have to worry about them, because they're in ROM and therefore read-only.

That's not true.

The 16 bytes at the top of memory are where all of UEFI takes place. How is this even possible? UEFI is beyond the scope of this article, but as an overview, here is some pseudo-Rust:

pub unsafe fn vtf_main() {
    // There is only enough room for a jump, so imagine 
    // that this is what the top of memory jumps to.

    // Security (SEC)
    sec_main();

    // Pre-EFI Initialization (PEI)
    pei_main();

    // Driver eXecution Environment (DXE)
    dxe_main(); 

    // Tail call Boot Device Selection (BDS)
    bds_main(); 
    unreachable!();
}
If we had the source code to the final stages of the Boot ROM, then it might look something like this pseudo-Rust:
pub unsafe fn rom_main() {
    const SECTOR: usize = 0x200;

    // Pointer to Volume Top File (VTF)
    let mut vtf = Vec::from_raw_parts(
        0xfffff000 as *mut u8, 8*SECTOR, 8*SECTOR);

    // Copy 8 sectors from SPI flash to memory
    rom::SpiReader::default().read(vtf.as_mut_slice());

    // Tail call the Volume Top File (VTF)
    let vtf_main = 0xfffffff0 as *const Fn();
    vtf_main();
    unreachable!();
}
That seems a bit unfair. The modern UEFI runtime environment gets 8 sectors (4 KiB), and the legacy MBR environment gets 1 sector (512 bytes)? If I were a bootloader, I would ask for a refund. So let us review the process:
  • rom_main(), presumably still read-only...
  • vtf_main(), writable (HW)
  • sec_main(), writable (HW)
  • pei_main(), writable (HW, or PEI modules)
  • dxe_main(), writable (HW, or DXE modules)
  • bds_main(), writable (SW, or NVRAM)
  • mbr_main(), writable (SW)
  • vbr_main(), writable (SW)
  • boot_loader()
  • oper_system()
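As a rough sketch, the list above can be written down as data, which makes it easy to count how much runs before the code we usually call "early". The stage names and notes are copied from the list. The `Stage` type and the two helper functions are hypothetical and exist only for this illustration.

```rust
// Sketch only: the chain mirrors the list in the text; helpers are hypothetical.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Stage {
    pub name: &'static str,
    pub note: Option<&'static str>,
}

pub static BOOT_CHAIN: [Stage; 10] = [
    Stage { name: "rom_main", note: Some("presumably still read-only") },
    Stage { name: "vtf_main", note: Some("writable (HW)") },
    Stage { name: "sec_main", note: Some("writable (HW)") },
    Stage { name: "pei_main", note: Some("writable (HW, or PEI modules)") },
    Stage { name: "dxe_main", note: Some("writable (HW, or DXE modules)") },
    Stage { name: "bds_main", note: Some("writable (SW, or NVRAM)") },
    Stage { name: "mbr_main", note: Some("writable (SW)") },
    Stage { name: "vbr_main", note: Some("writable (SW)") },
    Stage { name: "boot_loader", note: None },
    Stage { name: "oper_system", note: None },
];

/// Number of stages that have already run when `name` gets control.
pub fn stages_before(name: &str) -> Option<usize> {
    BOOT_CHAIN.iter().position(|stage| stage.name == name)
}

/// Name of the stage that `name` hands off to, if any.
pub fn next_stage(name: &str) -> Option<&'static str> {
    let index = stages_before(name)?;
    BOOT_CHAIN.get(index + 1).map(|stage| stage.name)
}
```

With this table, `stages_before("mbr_main")` is `Some(6)`: six stages have already had their turn before the first byte of the MBR executes.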
We have certainly skipped a few steps, for example, the Compatibility Support Module (CSM), System Management Mode (SMM), and the Trusted Platform Module (TPM). These three are all critical parts of the startup process, but are too complex to cover in detail here. There really is no such thing as "Early Boot" anymore, because what we think of as early is now very late in the process. Suffice it to say that there is an entire ecosystem of companies and hackers trying to squeeze ever more out of those 16 bytes at the top of memory.
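To make "those 16 bytes at the top of memory" concrete, here is a small sketch of the arithmetic behind the window used in rom_main(): 8 sectors starting at 0xfffff000. The constant and function names are assumptions made for this illustration. The numbers themselves come from the pseudo-Rust above.

```rust
// Sketch only: names are hypothetical, values come from rom_main() above.
pub const SECTOR: u64 = 0x200;
pub const VTF_BASE: u64 = 0xffff_f000;
pub const VTF_SECTORS: u64 = 8;
pub const RESET_VECTOR: u64 = 0xffff_fff0;
pub const FOUR_GIB: u64 = 1 << 32;

/// Size of the window copied from SPI flash, in bytes.
pub fn vtf_len() -> u64 {
    VTF_SECTORS * SECTOR
}

/// First address past the end of the window.
pub fn vtf_end() -> u64 {
    VTF_BASE + vtf_len()
}

/// Offset of the reset vector inside the window.
pub fn reset_vector_offset() -> u64 {
    RESET_VECTOR - VTF_BASE
}

/// Bytes available from the reset vector to the end of the window.
pub fn bytes_at_reset_vector() -> u64 {
    vtf_end() - RESET_VECTOR
}
```

The window is exactly 4096 bytes and ends exactly at the 4 GiB boundary, so the reset vector sits at offset 0xff0 with 16 bytes to spare. That is only enough room for a jump back into the rest of the window.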
