nRF9160: Memfault

The Memfault sample shows how to use the Memfault SDK in an nRF Connect SDK application to collect coredumps and metrics. The sample connects to an LTE network and sends the collected data to Memfault’s cloud using HTTPS.

Requirements

The sample supports the following development kits:

Hardware platforms   PCA        Board name          Build target
Thingy:91            PCA20035   thingy91_nrf9160    thingy91_nrf9160_ns
nRF9160 DK           PCA10090   nrf9160dk_nrf9160   nrf9160dk_nrf9160_ns

Before using the Memfault platform, you must register an account on the Memfault registration page and create a new project in Memfault.

The sample is configured to compile and run as a non-secure application on nRF91’s Cortex-M33. Therefore, it automatically includes the Secure Partition Manager that prepares the required peripherals to be available for the application.

You can also configure it to use TF-M instead of Secure Partition Manager.

To get access to all the benefits, such as up to 100 free connected devices, register on the Memfault registration page.

Overview

In this sample, the Memfault SDK is used as a module in nRF Connect SDK to collect coredumps, reboot reasons, metrics, and trace events from devices and send them to the Memfault cloud. See Memfault terminology for more details on the various Memfault concepts.

Metrics

The sample adds properties specific to the application, while the Memfault SDK integration layer in nRF Connect SDK adds the system property metrics. See Memfault: Collecting Device Metrics for details on how metrics work and how they are implemented. Some metrics are collected by the Memfault SDK directly. Other metrics are specific to nRF Connect SDK and are enabled by default:

  • LTE metrics:

    • Enabled and disabled using CONFIG_MEMFAULT_NCS_LTE_METRICS.

    • Ncs_LteTimeToConnect - Time from the point when the device starts to search for an LTE network until the time when it gets registered with the network.

    • Ncs_LteConnectionLossCount - The number of times that the device has lost the LTE network connection after the initial network registration.

  • Stack usage metrics:

In addition to capturing the metrics provided by the Memfault SDK integration layer in nRF Connect SDK, the sample also shows how to capture an application-specific metric. This metric is defined in samples/nrf9160/memfault/config/memfault_metrics_heartbeat_config.def:

  • Switch1ToggleCount - The number of times Switch 1 has been toggled on an nRF9160 DK.
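As a rough illustration of how such a counter metric can be updated, the sketch below increments Switch1ToggleCount from a button callback. It assumes the metric is a counter that is bumped with memfault_metrics_heartbeat_add(). The callback name on_buttons_changed, the SWITCH1_MSK bit mask, and the host-side stand-ins for the Memfault API are hypothetical and exist only so that the snippet compiles without the SDK. The sample's actual handler may differ.

```c
#include <stdint.h>

/*
 * Host-side stand-ins (assumptions) so this sketch compiles without the
 * Memfault SDK. In firmware, remove them and include
 * "memfault/metrics/metrics.h", which provides the real
 * MEMFAULT_METRICS_KEY() and memfault_metrics_heartbeat_add().
 */
enum metric_id { METRIC_Switch1ToggleCount, METRIC_COUNT };
static int32_t metric_values[METRIC_COUNT];
#define MEMFAULT_METRICS_KEY(name) METRIC_##name

static int memfault_metrics_heartbeat_add(enum metric_id key, int32_t amount)
{
	if ((unsigned int)key >= (unsigned int)METRIC_COUNT) {
		return -1; /* unknown metric */
	}
	metric_values[key] += amount;
	return 0; /* 0 means success, as in the SDK */
}

/* Hypothetical bit mask for Switch 1 in the button state word. */
#define SWITCH1_MSK (1u << 2)

/* Button callback: count every change of Switch 1, in either direction. */
static int on_buttons_changed(uint32_t button_states, uint32_t has_changed)
{
	(void)button_states; /* the direction of the toggle does not matter */

	if (has_changed & SWITCH1_MSK) {
		return memfault_metrics_heartbeat_add(
			MEMFAULT_METRICS_KEY(Switch1ToggleCount), 1);
	}
	return 0;
}
```

In the Memfault SDK, a metric used this way is declared in the heartbeat configuration file with a line such as MEMFAULT_METRICS_KEY_DEFINE(Switch1ToggleCount, kMemfaultMetricType_Unsigned). The metric type shown here is an assumption.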

Error Tracking with trace events

The sample implements a user-defined trace reason for demonstration purposes. The trace reason is called Switch2Toggled, and is collected every time Switch 2 is toggled on an nRF9160 DK. In addition to detection of the event, the trace includes the current switch state. See Memfault: Error Tracking with Trace Events for information on how to configure and use trace events.
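The following sketch shows one way a trace event that carries the switch state could be recorded. It assumes the Memfault SDK macro MEMFAULT_TRACE_EVENT_WITH_STATUS() is used. The sample itself may use a different trace macro. The handler name on_switch2_changed, the SWITCH2_MSK bit mask, and the host-side stand-in for the macro are hypothetical and only make the snippet compile without the SDK.

```c
#include <stdint.h>

/*
 * Host-side stand-in (assumption) so this sketch compiles without the
 * Memfault SDK. In firmware, remove it and include
 * "memfault/core/trace_event.h", which provides the real macro.
 */
struct trace_record {
	const char *reason; /* name of the trace reason */
	int32_t status;     /* status code attached to the event */
};
static struct trace_record last_trace;
static unsigned int trace_count;

#define MEMFAULT_TRACE_EVENT_WITH_STATUS(reason_, status_) \
	do {                                                \
		last_trace.reason = #reason_;               \
		last_trace.status = (status_);              \
		trace_count++;                              \
	} while (0)

/* Hypothetical bit mask for Switch 2 in the button state word. */
#define SWITCH2_MSK (1u << 3)

/* Record a trace event each time Switch 2 changes, with its new state. */
static void on_switch2_changed(uint32_t button_states, uint32_t has_changed)
{
	if (has_changed & SWITCH2_MSK) {
		int32_t state = (button_states & SWITCH2_MSK) ? 1 : 0;

		MEMFAULT_TRACE_EVENT_WITH_STATUS(Switch2Toggled, state);
	}
}
```

In the Memfault SDK, the trace reason itself is declared in the user trace reason file with MEMFAULT_TRACE_REASON_DEFINE(Switch2Toggled).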

Coredumps

Coredumps can be triggered either by using the Memfault shell command mflt crash, or by pressing a button:

  • Button 1 triggers a stack overflow

  • Button 2 triggers a NULL pointer dereference

These faults cause crashes that are captured by Memfault. After rebooting, the crash data can be sent to the Memfault cloud for further inspection and analysis. See Memfault documentation for more information on the debugging possibilities offered by the Memfault platform.

The sample enables Memfault shell by default. The shell offers multiple test commands to test a wide range of functionality offered by Memfault SDK. Run the command mflt help in the terminal for more information on the available commands.

Configuration

The Memfault SDK allows the configuration of some of its options using Kconfig. To configure the options in the SDK that are not available for configuration using Kconfig, use samples/nrf9160/memfault/config/memfault_platform_config.h. See Memfault SDK for more information.

See Configuring your application for information about how to permanently or temporarily change the configuration.

Minimal setup

To send data to the Memfault cloud, a project key must be configured using CONFIG_MEMFAULT_NCS_PROJECT_KEY.
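
For example, the key can be set in prj.conf. The value below is a placeholder, not a real key. Replace it with the project key of your own Memfault project:

```cfg
# prj.conf
# Placeholder value: replace with the project key of your Memfault project.
CONFIG_MEMFAULT_NCS_PROJECT_KEY="<your-project-key>"
```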

Note

The Memfault SDK requires certificates for the HTTPS transport. By default, the nRF Connect SDK integration layer for Memfault SDK provisions these certificates automatically to sec tags 1001 to 1005. If other certificates exist at these sec tags, HTTPS uploads will fail.

Additional configuration

There are two sources for Kconfig options when using Memfault SDK in nRF Connect SDK:

  • Kconfig options defined within the Memfault SDK.

  • Kconfig options defined in the nRF Connect SDK integration layer of the Memfault SDK. These configuration options are prefixed with CONFIG_MEMFAULT_NCS.

Check and configure the following options in Memfault SDK that are used by the sample:

If CONFIG_MEMFAULT_ROOT_CERT_STORAGE_NRF9160_MODEM is enabled, the TLS certificates used for HTTPS uploads are provisioned to the nRF9160 modem when memfault_zephyr_port_install_root_certs() is called.

Check and configure the following options for Memfault that are specific to nRF Connect SDK:

If CONFIG_MEMFAULT_NCS_INTERNAL_FLASH_BACKED_COREDUMP is enabled, CONFIG_PM_PARTITION_SIZE_MEMFAULT_STORAGE can be used to set the flash partition size for the flash storage.
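
A minimal prj.conf fragment for this case might look as follows. The partition size is an illustrative value only, not a recommendation. Choose a size that fits your partition layout and the amount of coredump data you expect:

```cfg
# prj.conf
# Store coredumps in internal flash.
CONFIG_MEMFAULT_NCS_INTERNAL_FLASH_BACKED_COREDUMP=y
# Illustrative size only; adjust to your flash layout.
CONFIG_PM_PARTITION_SIZE_MEMFAULT_STORAGE=0x10000
```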

Configuration files

If you only want to do a quick test with the sample, disable the CONFIG_MEMFAULT_USER_CONFIG_ENABLE option in prj.conf to avoid adding the user configuration files. Otherwise, follow the instructions below.
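
For the quick-test case, the option can be disabled with a single line in prj.conf:

```cfg
# prj.conf
# Build without the user configuration files in the config folder.
CONFIG_MEMFAULT_USER_CONFIG_ENABLE=n
```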

Memfault SDK requires three files in the include path during the build process. Add a new folder into your project called config and add the following three files:

  • memfault_platform_config.h - Sets Memfault SDK configurations that are not covered by Kconfig options

  • memfault_metrics_heartbeat_config.def - Defines application-specific metrics

  • memfault_trace_reason_user_config.def - Defines application-specific trace reasons

For more information, see the Memfault nRF Connect SDK integration guide. You can use the files in the nRF9160: Memfault sample as a reference. To have these configuration files in the include path, add the following line to the CMakeLists.txt file:

zephyr_include_directories(config)

Building and running

This sample can be found under samples/nrf9160/memfault in the nRF Connect SDK folder structure.

See Building and programming an application for information about how to build and program the application.

Testing

Before testing, ensure that your device is configured with the project key of your Memfault project. After programming the sample to your development kit, test it by performing the following steps:

  1. Connect to the kit with a terminal emulator (for example, PuTTY). See How to connect with PuTTY for the required settings.

  2. Observe that the sample starts. The following is sample output on the terminal:

    *** Booting Zephyr OS build v2.4.99-ncs1-4934-g6d2b8c7b17aa  ***
    <inf> <mflt>: Reset Reason, RESETREAS=0x0
    <inf> <mflt>: Reset Causes:
    <inf> <mflt>:  Power on Reset
    <inf> <mflt>: GNU Build ID: a09094cdf9da13f20719f87016663ab529b71267
    <inf> memfault_sample: Memfault sample has started
    
  3. The sample connects to an available LTE network, which is indicated by the following message:

    <inf> memfault_sample: Connecting to LTE network, this may take several minutes...
    
  4. When the connection is established, the sample displays the captured LTE time-to-connect metric (Ncs_LteTimeToConnect) on the terminal:

    <inf> memfault_sample: Connected to LTE network. Time to connect: 3602 ms
    
  5. Subsequently, all captured Memfault data will be sent to the Memfault cloud:

    <inf> memfault_sample: Sending already captured data to Memfault
    <dbg> <mflt>: Response Complete: Parse Status 0 HTTP Status 202!
    <dbg> <mflt>: Body: Accepted
    <dbg> <mflt>: Response Complete: Parse Status 0 HTTP Status 202!
    <dbg> <mflt>: Body: Accepted
    <dbg> <mflt>: No more data to send
    
  6. Upload the symbol file generated from your build to your Memfault account so that the information from your application can be parsed. The symbol file is located in the build folder: memfault/build/zephyr/zephyr.elf:

    1. In a web browser, navigate to Memfault.

    2. Log in to your account and select the project you created earlier.

    3. Navigate to Fleet > Devices in the left menu.

      You can see your newly connected device and the software version in the list.

    4. Click on the software version number for your device and then the Upload button to upload the symbol file.

  7. Back in the terminal, press <TAB> on your keyboard to confirm that the Memfault shell is working. The available shell commands are displayed:

    uart:~$
      clear              device             flash              help
      history            kernel             log                mcuboot
      mflt               mflt_nrf           nrf_clock_control  resize
      shell
    
  8. Learn about the available Memfault shell commands by issuing the command mflt help.

  9. Press Button 1 or Button 2 to trigger a stack overflow or a NULL pointer dereference, respectively.

  10. Explore the Memfault user interface to look at the errors and metrics that have been sent from your device.

Dependencies

The sample requires the Memfault SDK, which is part of nRF Connect SDK’s West manifest, and will be downloaded automatically when west update is run.

This sample uses the following nRF Connect SDK libraries and drivers:

It uses the following sdk-nrfxlib library:

In addition, it uses the following sample: