Tegra Boot Flow

Revision History

Version	Date	Description
1	2012/12/07	Initial Release.
2	2014/09/26	git URL updates. Point out that bootloaders can be loaded to IRAM rather than SDRAM. Minor updates to cover up to Tegra124.
3	2015/11/30	Add description of redundancy algorithms. Mention Tegra210. Reworded for consistency with more recent CPU naming choices.

Introduction

This document explains the basic principles of how Tegra boots. It is intended to provide a high-level overview of the boot process, mainly from a software- or firmware-developer’s perspective, deferring to other sources for details.

This document applies to Tegra20, Tegra30, Tegra114, Tegra124, and Tegra210.

Hardware Architecture

Tegra SoCs contain the following components related to the boot process:

The main CPU complex, referred to as the CCPLEX. The CCPLEX typically runs the system’s primary software stack.
A boot CPU, variously known across Tegra generations as BPMP-Lite (Boot and Power Management Processor), AVP (Audio-Video Processor), or occasionally as COP (a legacy name meaning CO-Processor). This processor implements the initial boot process. This processor typically is not the same architecture or implementation as the CCPLEX; on all current Tegra variants, it is an ARM7TDMI.
A boot ROM, containing the initial code that implements the boot process. This ROM is embedded into the SoC itself, and is not externally accessible.
A small embedded RAM, the IRAM. This RAM is typically dedicated for use by the AVP, and is used for state storage during the boot process, and some flashing processes.
Various peripheral controllers, such as eMMC, NAND, and SPI flash. These provide access to the boot memory device. The boot memory contains the BCT and bootloader.
USB controllers.
A Power Management Controller, or PMC. This is separate from any board-level PMIC (Power Management Integrated Circuit), which typically implements voltage regulators and related functionality.
Fuses; factory-programmable read-only data embedded into the SoC.
Straps; signals on the Tegra package which may be pulled weakly high or low during the boot process to communicate information to Tegra.

Boot Process

When Tegra is powered on, the boot CPU executes code from the boot ROM. The CCPLEX is not powered and does not execute code.

The boot ROM determines which boot memory device to use by reading a combination of fuses and/or straps. Various types of memory are supported, such as eMMC, NAND, or SPI flash.

Production systems will hard-code the boot memory device. Reference or development boards may support booting from multiple different memory types, and hence provide jumpers or switches to influence which boot memory to use.

Once the boot memory device is determined, the boot ROM will initialize the appropriate peripheral controller, and start reading data from the boot memory. The first piece of information to be read is the BCT.

The BCT indicates:

How to configure the boot memory for optimal access.
How to configure SDRAM (optional).
Where in boot memory the bootloader image is located.
The SDRAM (or IRAM) location to load the bootloader into.
The entry point for the bootloader.

For further details, see BCT Overview.

The boot ROM processes the BCT as follows:

If no valid BCT can be found, enters USB recovery mode (RCM).
Re-programs the boot memory controller according to the parameters specified in the BCT.
(If the BCT contains SDRAM configuration parameters): Programs the SDRAM controller according to the data specified in the BCT. This is the first point at which SDRAM can be accessed.
Reads the bootloader from boot memory into RAM, and validates the image.
If no valid bootloader could be found, enters USB recovery mode (RCM).
Jumps to the bootloader entry point.
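The steps above can be summarized as a short control-flow sketch. This is an illustrative model only, not boot ROM code: the Bct class, its field names, and the find_bct and load_bootloader callbacks are hypothetical stand-ins for the hardware operations described in the list.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Bct:
    # Hypothetical, simplified view of the BCT contents described in the text.
    boot_device_params: dict
    sdram_params: Optional[dict]  # None when the BCT carries no SDRAM configuration
    load_address: int
    entry_point: int


def boot_rom_flow(find_bct, load_bootloader, log):
    """Model the boot ROM's BCT processing.

    find_bct() returns a Bct or None; load_bootloader(bct) returns True when
    a bootloader was read into RAM and validated. Each step is appended to log.
    Returns "recovery" or ("jump", entry_point).
    """
    bct = find_bct()
    if bct is None:
        log.append("enter RCM: no valid BCT")
        return "recovery"

    log.append("reprogram boot memory controller")

    # SDRAM is only programmed when the BCT actually carries parameters for it.
    if bct.sdram_params is not None:
        log.append("program SDRAM controller")

    if not load_bootloader(bct):
        log.append("enter RCM: no valid bootloader")
        return "recovery"

    log.append("jump to bootloader")
    return ("jump", bct.entry_point)
```

Note that the model never touches the CCPLEX: control passes to the bootloader while still running on the boot CPU.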

The bootloader load address will typically be within SDRAM, which is why the BCT contains SDRAM controller configuration data. However, the bootloader load address may instead be located within IRAM. In this case:

The BCT need not contain valid SDRAM controller configuration data, since the boot ROM need not access SDRAM.
If the BCT does not contain valid SDRAM controller configuration data, and SDRAM access is required, the bootloader will need to initialize the SDRAM controller itself.

Observe that the boot ROM simply jumps to the bootloader’s entry point. At this point, all code is still running on the boot CPU. The boot ROM does not boot the CCPLEX. Doing so may require board-specific configuration of the PMIC or other voltage regulators.

BCT Handoff

The boot ROM leaves a copy of the BCT in IRAM. This is useful because the BCT structure contains spare space, known as the "customer data", which may be used to communicate device- or software-specific data to software components that run after the boot ROM. The boot ROM itself does not use this data at all. One example is the ODMDATA field which, purely by software convention, indicates settings such as which UART to use as the debug serial port, or the SDRAM size. Note that typical software stacks on more recent SoCs use the SDRAM controller’s configuration registers to determine the SDRAM size, avoiding the need to encode redundant information into the ODMDATA.

Bootloader Responsibilities

The boot ROM loads and executes the BCT-defined bootloader on the boot CPU. The CCPLEX is left disabled. If the bootloader wishes to execute on the CCPLEX, or wishes other software to execute on the CCPLEX, then explicit steps must be taken to activate the CCPLEX and cause it to execute that specific code.

Redundancy

The boot ROM will attempt to read the BCT and bootloader from multiple locations in boot memory. If a particular copy is missing, cannot be read due to I/O errors, or fails validation due to corruption or tampering, the boot ROM attempts to load a redundant copy instead. The exact search algorithm used differs between BCTs and bootloaders.

The search algorithms use two properties of the boot memory; block size and page size. The values of these properties are either hard-coded based on memory type, or determined at run-time by querying the memory device.

Type	Property	Source
eMMC	Block size	Hard-coded: 16KiB
eMMC	Page size	READ_BL_LEN field of the CSD
SPI	Block size	Hard-coded: 32KiB
SPI	Page size	2KiB or 16KiB based on fuses
NAND	Block size	BLOCK_SIZE value returned by READID command
NAND	Page size	PAGE_SIZE value returned by READID command

A memory device is considered to consist of a series of blocks, each of which consists of a series of pages.
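As a small illustration of this block/page model, the following sketch converts a (block, page) pair into a byte offset. The function names are hypothetical, and the sketch assumes that the block size is a whole multiple of the page size.

```python
def pages_per_block(block_size, page_size):
    """Number of pages in one block; assumes the sizes divide evenly."""
    if page_size <= 0 or block_size % page_size != 0:
        raise ValueError("block size must be a whole multiple of page size")
    return block_size // page_size


def byte_offset(block, page, block_size, page_size):
    """Byte offset of the given page within the given block of the device."""
    if not 0 <= page < pages_per_block(block_size, page_size):
        raise ValueError("page index lies outside the block")
    return block * block_size + page * page_size
```

For example, with the SPI values from the table (32KiB blocks) and 2KiB pages, each block holds 16 pages, and page 3 of block 1 starts at byte offset 38912.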

BCT Redundancy

For the purposes of this discussion, a variable PagesPerBct is defined as the minimum number of whole pages that a BCT can fit into.
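PagesPerBct is simply the BCT size divided by the page size, rounded up. A minimal sketch follows; the BCT sizes used in the usage note are made-up values for illustration, since the real size depends on the SoC generation.

```python
def pages_per_bct(bct_size, page_size):
    """Minimum number of whole pages needed to hold a BCT of bct_size bytes."""
    if bct_size <= 0 or page_size <= 0:
        raise ValueError("sizes must be positive")
    # Integer ceiling division: round up to the next whole page.
    return (bct_size + page_size - 1) // page_size
```

With a hypothetical 6000-byte BCT and 2KiB pages this gives 3 pages; a BCT of exactly 4096 bytes needs 2.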

The boot ROM first attempts to read the BCT from block 0, starting at page 0. If a valid BCT is found here, it is used, and searching terminates.

Next, the boot ROM attempts to read the BCT from block 0, starting at page PagesPerBct. If a valid BCT is found here, it is used, and searching terminates.

Next, the boot ROM attempts to find a "journal block" containing valid BCTs. A "journal block" is defined as a block that contains a valid BCT starting at page 0.

The boot ROM searches at most the first 64 blocks in the boot memory for such a journal block. It is considered an error if no such journal block is found. See the discussion of error handling later in this document.

Once a journal block is located, the boot ROM searches through the pages in the block, starting at multiples of PagesPerBct, until either the whole block has been processed, or an invalid BCT is found. The last valid BCT found is used.

While attempting to read the BCT from the boot memory, the bad block table cannot be used, since this table is itself part of the BCT.
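The search order described above can be sketched as follows. This is a model under stated assumptions, not the boot ROM implementation: read_bct(block, page) is a hypothetical callback that returns a BCT object when a valid BCT starts at that location and None otherwise (read error or failed validation). Whether the real boot ROM re-examines block 0 during the journal search is not stated; the sketch includes it simply because the text says "the first 64 blocks".

```python
MAX_JOURNAL_SEARCH_BLOCKS = 64  # search limit stated in the text


def find_bct(read_bct, pages_per_block, pages_per_bct):
    """Return the BCT the search would select, or None if none is found."""
    # Step 1: block 0, page 0.
    bct = read_bct(0, 0)
    if bct is not None:
        return bct

    # Step 2: block 0, starting at page PagesPerBct.
    bct = read_bct(0, pages_per_bct)
    if bct is not None:
        return bct

    # Step 3: look for a journal block, i.e. a block with a valid BCT at page 0.
    for block in range(MAX_JOURNAL_SEARCH_BLOCKS):
        last_valid = read_bct(block, 0)
        if last_valid is None:
            continue
        # Walk the block in PagesPerBct steps; stop at the first invalid BCT
        # or when another BCT would no longer fit. The last valid one wins.
        page = pages_per_bct
        while page + pages_per_bct <= pages_per_block:
            candidate = read_bct(block, page)
            if candidate is None:
                break
            last_valid = candidate
            page += pages_per_bct
        return last_valid

    return None  # error case: no journal block within the search limit
```

A simple way to exercise the sketch is to back read_bct with a dictionary keyed by (block, page), where missing keys model invalid or unreadable locations.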

Bootloader redundancy

The BCT contains an ordered array of information about available bootloaders. The various entries can reference either redundant copies of the same bootloader binary, or different versions of the bootloader for fail-safe upgrade purposes.

At a very high level, the boot ROM employs the following algorithm to load a bootloader into RAM and execute it:

for each bootloader entry in the BCT (in BCT array order):
    attempt to load the bootloader into RAM
    if that succeeded:
        jump to that bootloader
// no bootloader could be loaded
enter recovery mode

This basic algorithm is complicated by the bootloader version field. In the simplest case where all bootloader entries contain the same version number, the algorithm devolves to the simplified form above.

By default, the boot ROM will only consider bootloader entries with a version field that matches the version field of the first entry, and will stop iterating through the entries if a mismatch is found. The intent is to ensure that if some subset of the bootloader entries are upgraded, and hence the version field of their entries is modified, then the boot ROM will only boot the most recent version of the bootloader. This prevents an accidental rollback to an earlier version of the bootloader in the face of boot memory read errors, corruption, or tampering. Observe that this relies on upgraded bootloader entries being placed contiguously at the start of the array.

This behavior may be overridden via the "enable fail back" flag. This flag may be set either in the BCT itself, or at run-time via PMC scratch0 register bit 2, followed by a system reset. If this flag is enabled, the boot ROM considers all bootloader entries irrespective of version number, thus allowing fallback to earlier bootloader versions if the newer versions experience boot memory read errors, or hash validation failures.
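The effect of the version field and the "enable fail back" flag on which entries are considered can be sketched as below. The function name and the representation of the BCT array as a plain list of version numbers are assumptions made for illustration.

```python
def candidate_entries(versions, enable_fail_back=False):
    """Return the indices of bootloader entries the boot ROM would consider.

    versions lists the version field of each entry, in BCT array order.
    """
    if enable_fail_back:
        # All entries are eligible, irrespective of version number.
        return list(range(len(versions)))

    candidates = []
    for index, version in enumerate(versions):
        # Default behavior: stop at the first entry whose version differs
        # from that of the first entry.
        if version != versions[0]:
            break
        candidates.append(index)
    return candidates
```

For a BCT whose entries carry versions [2, 2, 1, 1], only entries 0 and 1 are considered by default, while all four are considered when fail back is enabled. With versions [2, 1, 2], only entry 0 is considered by default, which shows why upgraded entries must be contiguous at the start of the array.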

In all cases, whenever the boot ROM is attempting to load a particular bootloader entry into RAM, it considers all bootloader entries that have the same version number to be redundant copies of the same binary. If a boot memory read error occurs while reading a particular bootloader entry, the boot ROM will attempt to re-read the failed pages from the next bootloader entry, if it has an identical version number. Reading will continue from that bootloader entry until the entire bootloader has been read, or until another error causes the boot ROM to switch to yet another redundant copy.

This scheme enables the boot ROM to read a minimal set of pages from boot memory in the face of errors. Any error causes one single additional read of just the failed page(s) from a redundant copy, in a single pass through the bootloader binary. If this scheme were not implemented, any failure would cause the boot ROM to switch to attempt to load the next bootloader entry from scratch, repeating all successful reads prior to the error, thus slowing the boot process.

Please observe that this scheme is still safe, albeit perhaps not optimal, in the case where the bootloader version field is set to the same value for all entries even when different entries actually reference different bootloader binaries. If boot memory read errors cause the boot ROM to read portions of the bootloader binary from different bootloader entries, and those different bootloader entries actually refer to different bootloader binaries, then the bootloader hash validation will fail. This will cause the boot ROM to attempt to load the next bootloader entry, at which time the entire bootloader binary will be read from a single bootloader entry, and hence hash validation will succeed.
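The page-level fallback and the hash-validation safety net can be modeled as follows. This is a sketch under several assumptions: the Entry type is hypothetical, a page value of None models a read error, SHA-256 stands in for whatever hash the boot ROM actually uses, and the assembled image is validated against the hash recorded for the entry at which loading started.

```python
import hashlib
from typing import List, NamedTuple, Optional


class Entry(NamedTuple):
    version: int
    pages: List[Optional[bytes]]  # None models a page that fails to read
    digest: bytes                 # expected hash of the complete image


def read_page(entry, page_no):
    """Return the page data, or None on a (modeled) read error."""
    if page_no >= len(entry.pages):
        return None
    return entry.pages[page_no]


def load_entry(entries, start):
    """Load the bootloader starting at entries[start] in a single pass.

    On a read error, only the failed page is re-read from the next entry
    with the same version; reading then continues from that copy.
    Returns the image bytes, or None if loading or validation fails.
    """
    current = start
    image = []
    for page_no in range(len(entries[start].pages)):
        while True:
            data = read_page(entries[current], page_no)
            if data is not None:
                image.append(data)
                break
            nxt = current + 1
            if nxt >= len(entries) or entries[nxt].version != entries[start].version:
                return None  # no further redundant copy to fall back to
            current = nxt
    joined = b"".join(image)
    if hashlib.sha256(joined).digest() != entries[start].digest:
        return None  # e.g. pages were mixed from different binaries
    return joined
```

If two entries share a version but actually hold different binaries, a mixed read fails validation and load_entry returns None for the first entry; calling it again with the next entry as the starting point then reads the whole image from that single copy.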

Error Handling and Recovery Mode

Errors may occur during the boot process, such as a missing BCT, BCT hash validation failure, bootloader hash validation failure, etc. In these cases, the boot ROM enters recovery mode (RCM).

Other situations may also cause entry into recovery mode:

A recovery mode strap exists. If this is asserted, recovery mode will be entered unconditionally. This would usually be asserted by the user pressing a button, or some system management controller asserting the strap.
If Tegra PMC register scratch0 bit 1 is set at power-up, recovery mode will be entered. This register bit is not cleared when Tegra resets, so any software may set this bit, then reboot, to request recovery mode.

On non-engineering parts, recovery mode enables the USB1 port in device mode, and accepts commands in the "Tegra RCM" protocol. This primarily allows code to be downloaded into IRAM, and executed on the boot CPU. This code may then implement any function desired, e.g. directly re-flashing the boot memory, implementing more advanced USB protocols, or potentially even fully booting the system itself.

Note that when the system enters recovery mode, it is quite likely that a valid BCT has not been processed, and hence SDRAM is not accessible. The downloaded code must somehow configure the SDRAM controller itself, if SDRAM is to be used.

Security

The Tegra SoC supports various security modes. Some of these modes require the BCT, bootloader, and/or RCM protocol messages to be encrypted and/or signed with a potentially device-specific key. Details of these features are beyond the scope of this document. These security features are quite unlikely to be enabled on any developer or reference board.

Further Information

BCT location in IRAM

See the U-Boot source code. In particular, search for any use of NVBOOTINFOTABLE_BCTPTR and/or bct_start.

Booting the CCPLEX

See the U-Boot source code. For Tegra124 and earlier, the U-Boot SPL executes on the boot CPU and boots the CCPLEX. The main U-Boot binary executes on the CCPLEX.

U-Boot source code

source browsing: http://git.denx.de/?p=u-boot.git;a=summary
git download: git://git.denx.de/u-boot.git

RCM protocol

See the tegrarcm source code at the URLs below. Note that tegrarcm implements multiple USB protocols; RCM to talk to the boot ROM, and additional protocols (e.g. Nv3p) to communicate with the executable that is downloaded to IRAM.

source browsing: https://github.com/NVIDIA/tegrarcm
git download: https://github.com/NVIDIA/tegrarcm.git