MBED, FlashIAP, TDBStore and STM32F4 internal flash

(Also: deriving your own class from SlicingBlockDevice)

It wasn’t completely trivial to have all of them work together. Either the documentation doesn’t dig deep enough into some hidden assumptions, or I was just too dumb to find it.

Anyway, the problem is as follows: I want to store configuration data (such as calibration values or the navplan) in the persistent memory of a robot. Very occasionally, the data has to be updated remotely in the field, but it’s mostly read. Power can fail at any time (it’s solar in my case), so writing must be robust to power failure. Also, I need to add to or change the list of stored parameters during development, while retaining the data already stored in the persistent memory at an earlier stage. So a bit more flexibility than simple addresses would be welcome: a key / value system would do the trick.

Mbed 5 comes with a component that does exactly that: TDBStore. It works on a (string) key / value scheme. It’s resistant to power failure: should power fail during a write, the database won’t be updated, but it won’t be corrupted either. To do that, TDBStore needs to store two copies of the database, “in modification” and “committed” in some sense. The copies have to be contiguous (the second copy comes immediately after the first one). Beyond requiring twice as much room in the underlying storage (flash, for example), this contiguity will become important later. Each copy occupies a whole number of sectors. In this context, a sector is a contiguous block of memory that has to be erased all at once. And, of course, both copies must be of the same size.
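To make the cost of the “two copies, whole sectors” rule concrete, here is a small host-side sketch of the arithmetic. It is only my reading of the constraints described above, not TDBStore’s internal code; the names round_up and two_copy_footprint are made up for this example, and TDBStore’s own per-record overhead is ignored.

```cpp
#include <cassert>
#include <cstddef>

// Round `bytes` up to a whole number of erase units (sectors).
constexpr std::size_t round_up(std::size_t bytes, std::size_t erase_size) {
    return ((bytes + erase_size - 1) / erase_size) * erase_size;
}

// Storage needed for two equal, contiguous copies of a database of
// `db_bytes` bytes on a device that erases `erase_size` bytes at a time.
// Hypothetical helper: it only models the sizing rule, nothing else.
constexpr std::size_t two_copy_footprint(std::size_t db_bytes,
                                         std::size_t erase_size) {
    return 2 * round_up(db_bytes, erase_size);
}

// A 4KB database on a device with 1KB sectors needs 8KB...
static_assert(two_copy_footprint(4 * 1024, 1024) == 8 * 1024,
              "two copies of four 1KB sectors");
// ...but the same database on 128KB sectors needs 256KB.
static_assert(two_copy_footprint(4 * 1024, 128 * 1024) == 256 * 1024,
              "two copies of one 128KB sector");
```

The second assertion is the one that hurts on a chip with large erase units: the footprint is driven by the sector size, not by the amount of data.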

TDBStore in itself isn’t tied to a particular storage. We have to provide one in the form of an Mbed BlockDevice, such as an external flash, an internal memory block (volatile in that case), or the MCU internal flash.

My MCU is the STM32F4, the STM32F446RE on a Nucleo board to be precise, which comes with 512KB of flash, and my program + bootloader uses only a fraction of it. This is why it’s interesting to store the data in the internal flash of the MCU, rather than an external SD Card / flash component which would be an additional potential point of failure.

The structure of the flash memory in the STM32F446RE is the following:

Sector #   Address      Size (KB)
0          0x08000000   16
1          0x08004000   16
2          0x08008000   16
3          0x0800C000   16
4          0x08010000   64
5          0x08020000   128
6          0x08040000   128
7          0x08060000   128

Thus a total of 512KB in 8 sectors. As we can see from the addressing, the sectors are contiguous. In its internal memory, the STM32F446RE has the ability to write bytes one by one, while on SD cards it’s much more common to have to write an entire sector of 512 bytes or more at once. However the erase size is the entire sector, for example 128KB in the case of sectors 5 to 7, and a sector must be erased at least once before being used.
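The sector layout is easy to get wrong by one hex digit, so here is a host-side sketch that encodes the table and answers “which sector does this address fall in?”. The names FlashSector, kSectors, sector_of and total_flash_size are hypothetical helpers for this post, not part of Mbed or the ST HAL.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct FlashSector {
    std::uint32_t address;  // start address of the sector
    std::uint32_t size;     // sector size in bytes
};

// STM32F446RE internal flash layout, copied from the table above.
constexpr FlashSector kSectors[] = {
    {0x08000000u,  16u * 1024u},
    {0x08004000u,  16u * 1024u},
    {0x08008000u,  16u * 1024u},
    {0x0800C000u,  16u * 1024u},
    {0x08010000u,  64u * 1024u},
    {0x08020000u, 128u * 1024u},
    {0x08040000u, 128u * 1024u},
    {0x08060000u, 128u * 1024u},
};
constexpr std::size_t kSectorCount = sizeof(kSectors) / sizeof(kSectors[0]);

// Index of the sector containing `addr`, or -1 if `addr` is outside the flash.
inline int sector_of(std::uint32_t addr) {
    for (std::size_t i = 0; i < kSectorCount; ++i) {
        if (addr >= kSectors[i].address &&
            addr - kSectors[i].address < kSectors[i].size) {
            return static_cast<int>(i);
        }
    }
    return -1;
}

// Sum of all sector sizes, in bytes.
inline std::uint32_t total_flash_size() {
    std::uint32_t total = 0;
    for (std::size_t i = 0; i < kSectorCount; ++i) total += kSectors[i].size;
    return total;
}
```

For example, sector_of(0x08060000) gives 7, and any address from 0x08080000 upwards gives -1 because it lies past the end of the 512KB.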

First approach (program #1)

Mbed puts your program at the beginning of the flash memory. The object code of my program is about 100KB. We get there very quickly when using some standard libraries, such as printf and other stuff, which quickly grow to a few dozen kilobytes. The net result is that the first 5 sectors (128KB in total) are used, and we’re left with 3 x 128KB.
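The “first 5 sectors are used” claim is just a running sum over the sector sizes. A minimal sketch, assuming the image starts at the very beginning of the flash and is contiguous (sectors_used and free_kb_after are names I made up for the illustration):

```cpp
#include <cassert>
#include <cstddef>

// Sector sizes in KB, in address order (STM32F446RE).
constexpr unsigned kSectorKB[] = {16, 16, 16, 16, 64, 128, 128, 128};
constexpr std::size_t kNumSectors = sizeof(kSectorKB) / sizeof(kSectorKB[0]);

// Number of leading sectors touched by an image of `image_kb` KB
// placed at the start of the flash.
inline std::size_t sectors_used(unsigned image_kb) {
    unsigned covered = 0;
    std::size_t n = 0;
    while (n < kNumSectors && covered < image_kb) {
        covered += kSectorKB[n];
        ++n;
    }
    return n;
}

// KB available in the sectors the image does not touch at all.
inline unsigned free_kb_after(unsigned image_kb) {
    unsigned free_kb = 0;
    for (std::size_t i = sectors_used(image_kb); i < kNumSectors; ++i) {
        free_kb += kSectorKB[i];
    }
    return free_kb;
}
```

With a 100KB image, sectors_used(100) is 5 and free_kb_after(100) is 384, i.e. the three 128KB sectors. Anything above 128KB spills into sector 5 and costs another 128KB of potential database room.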

“Hey, that’s more than I need”, you think. “Furthermore, it makes sense to put the database at the end of the flash, so the address doesn’t have to change as long as there’s enough room for both the .text segment and my database”.

Now comes the important point. When starting, TDBStore scans both copies of the database (you remember there were two of them, right?) to determine which is the committed one, and erases the other copy for a fresh start. Actually, it has an “erase as you go” algorithm to avoid taking too much time, since the MCU may be locked while erasing on some architectures.

If we have a flash card with a 512-byte or 1KB sector which we can write and erase, this is all very well. But here the erase unit is the entire 128KB sector, and we need two of them! So all in all, with a naive first approach, we have to dedicate 256KB of our precious flash memory, which may be a lot.

Okay, but this is a start. Here’s what it looks like:

#include "mbed.h"
#include <FlashIAPBlockDevice.h>
#include <TDBStore.h>

#include <flash_data.h> // ADDR_FLASH_SECTOR_n definitions

static DigitalOut led(LED1);

// Serial console at 115200 baud
static Serial pc(SERIAL_TX, SERIAL_RX, 115200);

// The database occupies sectors 6 and 7 (2 x 128KB)
#define FIRST_DB_BLOCK (ADDR_FLASH_SECTOR_6)

static FlashIAPBlockDevice full_flash(FIRST_DB_BLOCK, 256 * 1024);
static TDBStore nvStore(&full_flash);

static uint32_t val;

int main() {
  full_flash.init();
  int res = nvStore.init();
  printf("init result = %d\n", res);

  // Read the counter stored under the key "_0000"
  res = nvStore.get("_0000", &val, sizeof(val));
  printf("get result = %d, value = %lu\n", res, (unsigned long)val);

  // Increment it and write it back
  val += 1;
  res = nvStore.set("_0000", &val, sizeof(val), 0);
  printf("set result = %d\n", res);

  // Blink forever
  for(;;) {
    led = !led;
    wait_us(1000000);
  }
}

mbed_app.json

Including the necessary components

However, the code isn’t the end of the story. First, the FLASH and FLASHIAP components aren’t included by default in the Mbed build, so we have to ask for them. This is done in mbed_app.json with the target.features_add and target.components_add directives.

Partitioning the flash to protect your database from the linker

Our code and initialized data (the .text segment) and the database share the same flash memory, so it’s a good idea to tell the linker not to overwrite the database. Mbed puts the .text segment at the beginning of the flash (this is why we put the database at the end, remember?). So all we have to do is tell the linker not to use too much space, which is also done in mbed_app.json. In this first example, we’re using the last two 128KB sectors (256KB), which leaves 256KB (0x40000 in hex) for the .text segment. We tell the linker not to use more by adding a target.restrict_size directive to mbed_app.json.
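The value given to target.restrict_size is simply the distance between the start of the flash and the start of the database. A tiny compile-time check of that arithmetic, with kFlashBase and restrict_size_for being names invented for this sketch:

```cpp
#include <cassert>
#include <cstdint>

// Start of the internal flash on the STM32F446RE.
constexpr std::uint32_t kFlashBase = 0x08000000u;

// Size to give to target.restrict_size so that the application region
// ends exactly where the database begins (hypothetical helper).
constexpr std::uint32_t restrict_size_for(std::uint32_t db_start) {
    return db_start - kFlashBase;
}

// Database starting at sector 6: the application gets 256KB.
static_assert(restrict_size_for(0x08040000u) == 0x40000u,
              "database in sectors 6 and 7");
// Database starting at sector 7: the application gets 384KB.
static_assert(restrict_size_for(0x08060000u) == 0x60000u,
              "database in sector 7 only");
```

If the database start address ever moves, the restrict_size value has to move with it, otherwise the linker and the database will disagree about who owns the sector in between.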

Resulting mbed_app.json

Given all that, here’s what we get:

{
    "target_overrides": {
        "NUCLEO_F446RE": {
            "target.features_add": ["STORAGE"],
            "target.components_add": ["FLASH", "FLASHIAP"],
            "target.restrict_size": "0x40000"
        }
    }
}

Compiling and running

When compiling, the first thing we might notice is that, by virtue of target.restrict_size, the build system takes some special precautions regarding the memory organization. We see this at the beginning of the compilation:

Using ROM regions application, post_application in this build.
  Region application: size 0x40000, offset 0x8000000
  Region post_application: size 0x40000, offset 0x8040000

At the end of the compilation, there’s a “Merging Regions” activity, and if the .text segment was so large that it couldn’t fit into the flash we’ve left for the system, we would have a failure at this point.

We can also have a look at the linker script. In a GCC environment, it’s in BUILD/*/GCC_ARM/.link_script.ld. At the beginning, we can see the following lines:

MEMORY
{
  FLASH (rx) : ORIGIN = 0x8000000, LENGTH = 0x40000
  RAM (rwx) : ORIGIN = 0x200001C8, LENGTH = 128k - (0x1C4+0x4)
}

They tell the linker there is 256KB (0x40000 hex) of flash, so we’re confident our database is out of the compiler / loader’s reach.

If we upload the code to our board and read the serial port, we’ll get something like this:

init result = 0
get result = 0, value = 1634754933
set result = 0

Note that 0 is the return code for “no error”.

A few things to observe:

  1. The value is incremented every time the program is run.
  2. The value persists across power off!
  3. The value isn’t initialized, so the value we get the very first time is random (writing some code to initialize it to a known value is left as an exercise).
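For the “left as an exercise” part, one possible shape is a read-or-initialize helper. The sketch below is written as a template so that it can be tried on a PC: get_or_init is a name I made up, and FakeStore is a one-slot stand-in that only mimics the get / set calls used in the program above (it ignores the key). I’m assuming the helper would also work on a TDBStore because it only uses those two calls with the same arguments, and it treats any non-zero result from get as “not there yet”, which is a simplification.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Reads `key` into `value`. If the read fails, stores `fallback` under
// `key` and returns it in `value`. Returns 0 on success, otherwise the
// error code of the failed write.
template <typename Store>
int get_or_init(Store &store, const char *key,
                std::uint32_t &value, std::uint32_t fallback) {
    std::uint32_t tmp = 0;
    if (store.get(key, &tmp, sizeof(tmp)) == 0) {  // 0 means "no error"
        value = tmp;
        return 0;
    }
    value = fallback;
    return store.set(key, &value, sizeof(value), 0);
}

// Hypothetical single-slot store, for host-side testing only.
struct FakeStore {
    bool present = false;
    std::uint32_t stored = 0;

    int get(const char *, void *buffer, std::size_t size) {
        if (!present || size < sizeof(stored)) return -1;
        std::memcpy(buffer, &stored, sizeof(stored));
        return 0;
    }
    int set(const char *, const void *buffer, std::size_t size, std::uint32_t) {
        if (size != sizeof(stored)) return -1;
        std::memcpy(&stored, buffer, sizeof(stored));
        present = true;
        return 0;
    }
};
```

On the first call the store is empty, so the fallback is written and returned; on later calls the stored value wins and the fallback is ignored.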

You can play with the code a little bit, for example if you set FIRST_DB_BLOCK to ADDR_FLASH_SECTOR_7 instead of _6, there is no sector available after ADDR_FLASH_SECTOR_7 to store the second copy of the database, and the code will fail at runtime with some exception like this:

++ MbedOS Error Info ++
Error Status: 0x80FF0101 Code: 257 Module: 255
Error Message: Underlying BD must have flash attributes
Location: 0x80031BB
Error Value: 0x0
Current Thread: main  Id: 0x20001300 Entry: 0x8004C73 StackSize: 0x1000 StackMem: 0x20001AC0 SP: 0x200029DC 
For more info, visit: https://mbed.com/s/error?error=0x80FF0101&tgt=NUCLEO_F446RE
-- MbedOS Error Info --

Saving some memory (program #2)

Let’s go a step further.

First, we might want several databases, but with two sectors for each database, we may quickly run out of them.

Even if one database is enough for our application, we may not want to sacrifice half of our precious 512KB of flash memory. I certainly don’t.

There’s a way to logically partition a sector so that we can use less memory. The corresponding class is called SlicingBlockDevice. SlicingBlockDevice wraps around a lower-level class, such as FlashIAPBlockDevice, and exposes only a part of it. However, when it comes to the erase step, SlicingBlockDevice calls the underlying FlashIAPBlockDevice erase method, where the “whole sector at once” limitation (which is a hardware limitation) still applies.

Fortunately, the problem has been anticipated by the architects of the mbed Storage API. By making all the interface methods virtual, it is possible to derive from SlicingBlockDevice and to override the erase behavior with very little effort. The principle is to replace the erase related methods with methods that

  1. return appropriate values when asked about the erase size, is_valid_erase and so on, emulating a smaller sector size
  2. take appropriate action when asked to erase an (emulated, small) sector. Here, appropriate action means writing the erase value (the value that appears in the memory when it is erased) in the emulated sector.

However, it should be noted that this doesn’t fully emulate the behaviour of the flash API. Let’s say we’re working on sector 7: it has to be initialized (really erased) at least once in the life of the chip, something that the native FlashIAP would do and that this emulation doesn’t. So the sector has to have been initialized at some point in the past, for example by running the first program in this post.

Here’s what our derived class looks like. I’ve chosen a 1024-byte sector size. To avoid being too inefficient, a 64-byte array filled with the erase value is reserved once in memory, and the pseudo-erase process copies this array to the simulated sector (instead of writing 1 byte 1024 times). This part isn’t reentrant, but this is an example, not production code.

#define VIRTUAL_SECTOR_SIZE  (1024)
#define VIRTUAL_ERASE_SUB_SIZE (64) // Must divide VIRTUAL_SECTOR_SIZE (and be compatible with program)

class _MyBlockDevice: public SlicingBlockDevice {
public:
  _MyBlockDevice(BlockDevice *bd, bd_addr_t start, bd_addr_t end=0): SlicingBlockDevice(bd, start, end) { }
  // Pseudo-erase: writes the erase value instead of erasing the real sector
  virtual int erase(bd_addr_t addr, bd_size_t size);
  // Report the emulated (small) sector size instead of the real one
  virtual bd_size_t get_erase_size () const { return VIRTUAL_SECTOR_SIZE; }
  virtual bd_size_t get_erase_size (bd_addr_t addr) const;
  virtual bool is_valid_erase (bd_addr_t addr, bd_size_t size) const {
    return size % VIRTUAL_SECTOR_SIZE == 0 && is_valid_program(addr, size);
  }
};

// The first two bytes differ until the block has been filled with the erase value
static uint8_t _eraseBlock[VIRTUAL_ERASE_SUB_SIZE] = { 1, 2 };

int _MyBlockDevice::erase(bd_addr_t addr, bd_size_t size) {
  if (!is_valid_erase(addr, size)) return -1;
  if (_eraseBlock[0] != _eraseBlock[1]) { // erase block not initialized yet
    const uint8_t erase_value = get_erase_value();
    for (size_t i = 0; i < VIRTUAL_ERASE_SUB_SIZE; ++i) _eraseBlock[i] = erase_value;
  }
  // Write the erase value over the emulated sector(s), 64 bytes at a time
  for(; size > 0; addr += VIRTUAL_ERASE_SUB_SIZE, size -= VIRTUAL_ERASE_SUB_SIZE) {
    int res = program(_eraseBlock, addr, VIRTUAL_ERASE_SUB_SIZE);
    if (res != 0) return res; // propagate programming errors
  }
  return 0;
}

bd_size_t _MyBlockDevice::get_erase_size(bd_addr_t addr) const {
  return is_valid_program(addr, VIRTUAL_SECTOR_SIZE) ? VIRTUAL_SECTOR_SIZE : 0;
}
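To check the address arithmetic of the pseudo-erase without a board, here is a small model that runs on a PC, with a std::vector standing in for the sliced device. It only mirrors the bookkeeping of the class above (whole virtual sectors, 64-byte chunks); it says nothing about how real flash cells react to being programmed. The erase value 0xFF is an assumption (it is what flash usually reports), and the model is slightly stricter than the class: it also requires the start address to be aligned on a virtual sector. All names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

const std::size_t kVirtualSectorSize = 1024;  // emulated erase unit
const std::size_t kEraseSubSize = 64;         // chunk written per step
const std::uint8_t kEraseValue = 0xFF;        // assumed erase value

// Accept an erase only if it covers whole virtual sectors, starts on a
// virtual sector boundary and stays inside the device.
inline bool is_valid_virtual_erase(std::size_t device_size,
                                   std::size_t addr, std::size_t size) {
    return size > 0 &&
           addr % kVirtualSectorSize == 0 &&
           size % kVirtualSectorSize == 0 &&
           addr <= device_size && size <= device_size - addr;
}

// Fill [addr, addr + size) with the erase value, one chunk at a time.
// Returns 0 on success, -1 if the request is rejected.
inline int emulated_erase(std::vector<std::uint8_t> &mem,
                          std::size_t addr, std::size_t size) {
    if (!is_valid_virtual_erase(mem.size(), addr, size)) return -1;
    for (std::size_t off = 0; off < size; off += kEraseSubSize) {
        for (std::size_t i = 0; i < kEraseSubSize; ++i) {
            mem[addr + off + i] = kEraseValue;
        }
    }
    return 0;
}
```

Erasing the second virtual sector of a 4KB buffer touches bytes 1024 to 2047 only; a request for 100 bytes, or one starting at offset 512, is refused.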

And here is the full program. Note that, unlike program #1, we’re using only sector 7, which is divided in 2 by the magic of our new class.

#include "mbed.h"
#include <FlashIAPBlockDevice.h>
#include <SlicingBlockDevice.h>
#include <TDBStore.h>

#include <flash_data.h> // ADDR_FLASH_SECTOR_n definitions

DigitalOut led(LED1);

// Serial console at 115200 baud
static Serial pc(SERIAL_TX, SERIAL_RX, 115200);

#define FIRST_DB_BLOCK (ADDR_FLASH_SECTOR_7)

#define VIRTUAL_SECTOR_SIZE  (1024)
#define VIRTUAL_ERASE_SUB_SIZE (64) // Must divide VIRTUAL_SECTOR_SIZE (and be compatible with program)

class _MyBlockDevice: public SlicingBlockDevice {
public:
  _MyBlockDevice(BlockDevice *bd, bd_addr_t start, bd_addr_t end=0): SlicingBlockDevice(bd, start, end) { }
  // Pseudo-erase: writes the erase value instead of erasing the real sector
  virtual int erase(bd_addr_t addr, bd_size_t size);
  // Report the emulated (small) sector size instead of the real one
  virtual bd_size_t get_erase_size () const { return VIRTUAL_SECTOR_SIZE; }
  virtual bd_size_t get_erase_size (bd_addr_t addr) const;
  virtual bool is_valid_erase (bd_addr_t addr, bd_size_t size) const {
    return size % VIRTUAL_SECTOR_SIZE == 0 && is_valid_program(addr, size);
  }
};

// The first two bytes differ until the block has been filled with the erase value
static uint8_t _eraseBlock[VIRTUAL_ERASE_SUB_SIZE] = { 1, 2 };

int _MyBlockDevice::erase(bd_addr_t addr, bd_size_t size) {
  if (!is_valid_erase(addr, size)) return -1;
  if (_eraseBlock[0] != _eraseBlock[1]) { // erase block not initialized yet
    const uint8_t erase_value = get_erase_value();
    for (size_t i = 0; i < VIRTUAL_ERASE_SUB_SIZE; ++i) _eraseBlock[i] = erase_value;
  }
  // Write the erase value over the emulated sector(s), 64 bytes at a time
  for(; size > 0; addr += VIRTUAL_ERASE_SUB_SIZE, size -= VIRTUAL_ERASE_SUB_SIZE) {
    int res = program(_eraseBlock, addr, VIRTUAL_ERASE_SUB_SIZE);
    if (res != 0) return res; // propagate programming errors
  }
  return 0;
}

bd_size_t _MyBlockDevice::get_erase_size(bd_addr_t addr) const {
  return is_valid_program(addr, VIRTUAL_SECTOR_SIZE) ? VIRTUAL_SECTOR_SIZE : 0;
}

// Only sector 7 (128KB) is used, seen through the 1KB virtual sectors
static FlashIAPBlockDevice full_flash(FIRST_DB_BLOCK, 128 * 1024);
static _MyBlockDevice flash(&full_flash, 0, 128 * 1024);
static TDBStore nvStore(&flash);

static uint32_t val = 0;

int main() {
  flash.init(); // Calls full_flash.init()
  int res = nvStore.init();
  printf("init result = %d\n", res);

  // Read the counter stored under the key "_0000"
  res = nvStore.get("_0000", &val, sizeof(val));
  printf("get result = %d, value = %lu\n", res, (unsigned long)val);

  // Increment it and write it back
  val += 1;
  res = nvStore.set("_0000", &val, sizeof(val), 0);
  printf("set result = %d\n", res);

  // Blink forever
  for(;;) {
    led = !led;
    wait_us(1000000);
  }
}

This also means that we put the following in mbed_app.json:

"target.restrict_size": "0x60000"

and thus, we have 384KB for our .text segment (and use only 128KB for the two copies of the database).

It can be noticed that _MyBlockDevice relies upon Mbed’s SlicingBlockDevice for addressing, and that SlicingBlockDevice deals in logical addresses. This means that, for example, when the underlying FlashIAPBlockDevice starts at the physical address 0x08060000 (an argument passed to its constructor) and the slice starts at offset 0, the address of the first byte from the SlicingBlockDevice point of view is 0x0, and this property “propagates” in some sense to _MyBlockDevice.
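The logical-to-physical relationship can be written down as a one-liner. This is only a sketch of how I understand the layering (physical base of the flash device, plus the start offset of the slice, plus the logical address); to_physical is a made-up name, not an Mbed function.

```cpp
#include <cassert>
#include <cstdint>

// Physical address behind logical address `addr`, as seen through a
// slice starting at offset `slice_start` of a device based at `base`.
constexpr std::uint32_t to_physical(std::uint32_t base,
                                    std::uint32_t slice_start,
                                    std::uint32_t addr) {
    return base + slice_start + addr;
}

// Logical 0x0 in a slice starting at offset 0 of sector 7.
static_assert(to_physical(0x08060000u, 0u, 0u) == 0x08060000u,
              "first byte of sector 7");
// The second 1KB virtual sector starts at logical 0x400.
static_assert(to_physical(0x08060000u, 0u, 0x400u) == 0x08060400u,
              "second virtual sector");
```

So every address passed to erase, program or read in the derived class is an offset within the slice, never a 0x08xxxxxx value.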

Conclusion

The combination of TDBStore and the internal flash of a chip gives a powerful, flexible, highly reliable and yet cheap application-level non-volatile storage. Since TDBStore assumes that two consecutive, dedicated, fully erasable areas can be allocated, some adaptation layer may be necessary between TDBStore and the flash access API. However, this can be done easily by deriving from SlicingBlockDevice.
