Floating Point Services

The kernel allows threads to use floating point registers on board configurations that support these registers.

Note

Floating point services are currently available only for boards based on ARM Cortex-M SoCs supporting the Floating Point Extension, the Intel x86 architecture, ARCv2 SoCs supporting the Floating Point Extension, and RISC-V SoCs with floating point hardware. The services provided are architecture specific.

The kernel does not support the use of floating point registers by ISRs.

Concepts

The kernel can be configured to provide only the floating point services required by an application. Three modes of operation are supported, which are described below. In addition, on x86 the kernel’s support for the SSE registers can be included or omitted, as desired.

No FP registers mode

This mode is used when the application has no threads that use floating point registers. It is the kernel’s default floating point services mode.

If a thread uses any floating point register, the kernel generates a fatal error condition and aborts the thread.

Unshared FP registers mode

This mode is used when the application has only a single thread that uses floating point registers.

On x86 platforms, the kernel initializes the floating point registers so they can be used by any thread (initialization is skipped on ARM Cortex-M platforms and ARCv2 platforms). The floating point registers are left unchanged whenever a context switch occurs.

Note

The behavior is undefined if two or more threads attempt to use the floating point registers, because the kernel does not attempt to detect (or prevent) multiple threads from using these registers.

Shared FP registers mode

This mode is used when the application has two or more threads that use floating point registers. Depending upon the underlying CPU architecture, the kernel supports one or more of the following thread sub-classes:

  • non-user: A thread that cannot use any floating point registers

  • FPU user: A thread that can use the standard floating point registers

  • SSE user: A thread that can use both the standard floating point registers and SSE registers

The kernel initializes the floating point registers and enables access to them so that any thread can use them. It then saves and restores these registers during context switches, which ensures that the computations performed by each FPU user or SSE user are not affected by the computations performed by the other users.

ARM Cortex-M architecture (with the Floating Point Extension)

On the ARM Cortex-M architecture with the Floating Point Extension, the kernel treats all threads as FPU users when shared FP registers mode is enabled. This means that any thread is allowed to access the floating point registers. The ARM kernel automatically detects that a given thread is using the floating point registers the first time the thread accesses them.

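Pretag a thread that intends to use the FP registers by using one of the techniques listed below.

  • A statically-created ARM thread can be pretagged by passing the K_FP_REGS option to K_THREAD_DEFINE.

  • A dynamically-created ARM thread can be pretagged by passing the K_FP_REGS option to k_thread_create().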

Pretagging a thread with the K_FP_REGS option instructs the MPU-based stack protection mechanism to properly configure the size of the thread’s guard region to always guarantee stack overflow detection.
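The following is a minimal sketch of pretagging a thread with K_FP_REGS, both for a statically defined thread and for one created at run time. It only builds inside a Zephyr application. The header path <zephyr/kernel.h> assumes a recent source tree (older trees use <kernel.h>), and the names fp_worker, fp_static_tid, fp_dyn_stack, fp_dyn_thread and start_fp_dynamic_thread, the 1024-byte base stack size, and the priority are hypothetical values chosen for illustration.

```c
#include <zephyr/kernel.h>

/* Hypothetical base stack size plus the 72 bytes needed for the FP context. */
#define FP_WORKER_STACK_SIZE (1024 + 72)
/* Hypothetical priority. */
#define FP_WORKER_PRIORITY   5

/* Thread entry point that performs floating point arithmetic. */
static void fp_worker(void *p1, void *p2, void *p3)
{
    volatile float acc = 0.0f;

    ARG_UNUSED(p1);
    ARG_UNUSED(p2);
    ARG_UNUSED(p3);

    for (;;) {
        acc += 0.5f;
        k_sleep(K_MSEC(100));
    }
}

/* Statically created thread, pretagged with K_FP_REGS. */
K_THREAD_DEFINE(fp_static_tid, FP_WORKER_STACK_SIZE, fp_worker,
                NULL, NULL, NULL, FP_WORKER_PRIORITY, K_FP_REGS, 0);

/* Dynamically created thread, pretagged the same way. */
K_THREAD_STACK_DEFINE(fp_dyn_stack, FP_WORKER_STACK_SIZE);
static struct k_thread fp_dyn_thread;

void start_fp_dynamic_thread(void)
{
    k_thread_create(&fp_dyn_thread, fp_dyn_stack,
                    K_THREAD_STACK_SIZEOF(fp_dyn_stack),
                    fp_worker, NULL, NULL, NULL,
                    FP_WORKER_PRIORITY, K_FP_REGS, K_NO_WAIT);
}
```

The same option argument is used on the other architectures described in this document. Only the extra stack space differs.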

During thread context switching the ARM kernel saves the callee-saved floating point registers, if the switched-out thread has been using them. Additionally, the caller-saved floating point registers are saved on the thread’s stack. If the switched-in thread has been using the floating point registers, the kernel restores the callee-saved FP registers of the switched-in thread and the caller-saved FP context is restored from the thread’s stack. Thus, the kernel does not save or restore the FP context of threads that are not using the FP registers.

Each thread that intends to use the floating point registers must provide an extra 72 bytes of stack space where the callee-saved FP context can be saved.

Lazy stacking is currently enabled in Zephyr applications on the ARM Cortex-M architecture, which minimizes interrupt latency when the floating point context is active.

If an ARM thread does not require use of the floating point registers any more, it can call k_float_disable(). This instructs the kernel not to save or restore its FP context during thread context switching.

ARCv2 architecture

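On the ARCv2 architecture, the kernel treats each thread as a non-user or FPU user, and the thread must be tagged by one of the following techniques:

  • A statically-created ARC thread can be tagged by passing the K_FP_REGS option to K_THREAD_DEFINE.

  • A dynamically-created ARC thread can be tagged by passing the K_FP_REGS option to k_thread_create().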

If an ARC thread does not require use of the floating point registers any more, it can call k_float_disable(). This instructs the kernel not to save or restore its FP context during thread context switching.

During thread context switching the ARC kernel saves the callee-saved floating point registers, if the switched-out thread has been using them. Additionally, the caller-saved floating point registers are saved on the thread’s stack. If the switched-in thread has been using the floating point registers, the kernel restores the callee-saved FP registers of the switched-in thread and the caller-saved FP context is restored from the thread’s stack. Thus, the kernel does not save or restore the FP context of threads that are not using the FP registers. An extra 16 bytes (single floating point hardware) or 32 bytes (double floating point hardware) of stack space is required to load and store floating point registers.

RISC-V architecture

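On the RISC-V architecture, the kernel treats each thread as a non-user or FPU user, and the thread must be tagged by one of the following techniques:

  • A statically-created RISC-V thread can be tagged by passing the K_FP_REGS option to K_THREAD_DEFINE.

  • A dynamically-created RISC-V thread can be tagged by passing the K_FP_REGS option to k_thread_create().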

If a RISC-V thread no longer requires the use of the floating point registers, it can call k_float_disable(). This instructs the kernel not to save or restore its FP context during thread context switching. This function can only be called from the thread itself.

During thread context switching the RISC-V kernel saves the callee-saved floating point registers, if the switched-out thread is tagged with K_FP_REGS. Additionally, the caller-saved floating point registers are saved on the thread’s stack. If the switched-in thread has been tagged with K_FP_REGS, then the kernel restores the callee-saved FP registers of the switched-in thread and the caller-saved FP context is restored from the thread’s stack. Thus, the kernel does not save or restore the FP context of threads that are not using the FP registers. An extra 84 bytes (single floating point hardware) or 164 bytes (double floating point hardware) of stack space is required to load and store floating point registers.

x86 architecture

On the x86 architecture the kernel treats each thread as a non-user, FPU user or SSE user on a case-by-case basis. A “lazy save” algorithm is used during context switching which updates the floating point registers only when it is absolutely necessary. For example, the registers are not saved when switching from an FPU user to a non-user thread, and then back to the original FPU user. The following table indicates the amount of additional stack space a thread must provide so the registers can be saved properly.

Thread type    FP register use    Extra stack space required
-----------    ---------------    --------------------------
cooperative    any                0 bytes
preemptive     none               0 bytes
preemptive     FPU                108 bytes
preemptive     SSE                464 bytes
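The “lazy save” behavior described before the table can be illustrated with a small host-side model. This is a toy sketch and not the kernel's implementation. The types toy_thread and toy_cpu and the function toy_switch_to are hypothetical, and a single double stands in for the whole floating point register set.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model of a thread; not a kernel data structure. */
struct toy_thread {
    bool fp_user;       /* true if tagged as an FPU or SSE user */
    double saved_regs;  /* stand-in for the thread's saved FP context */
};

/* Hypothetical toy model of the CPU's floating point state. */
struct toy_cpu {
    double fp_regs;               /* stand-in for the live FP registers */
    struct toy_thread *fp_owner;  /* thread whose values are live in fp_regs */
    unsigned int saves;           /* number of FP context saves performed */
};

/* Switch to `next`, touching the FP registers only when required. */
void toy_switch_to(struct toy_cpu *cpu, struct toy_thread *next)
{
    if (!next->fp_user || cpu->fp_owner == next) {
        /* Non-user, or the registers already hold this thread's values. */
        return;
    }

    if (cpu->fp_owner != NULL) {
        /* A different FP user owns the registers: save its context first. */
        cpu->fp_owner->saved_regs = cpu->fp_regs;
        cpu->saves++;
    }

    /* Load the incoming FP user's context and record the new owner. */
    cpu->fp_regs = next->saved_regs;
    cpu->fp_owner = next;
}
```

In this model, switching from an FPU user to a non-user and back to the same FPU user performs no save at all. A save happens only when a second FP user is scheduled.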

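The x86 kernel automatically detects that a given thread is using the floating point registers the first time the thread accesses them. The thread is tagged as an SSE user if the kernel has been configured to support the SSE registers, or as an FPU user if the SSE registers are not supported. If this would result in a thread that is an FPU user being tagged as an SSE user, or if the application wants to avoid the exception handling overhead involved in auto-tagging threads, it is possible to pretag a thread using one of the techniques listed below.

  • A statically-created x86 thread can be pretagged by passing the K_FP_REGS or K_SSE_REGS option to K_THREAD_DEFINE.

  • A dynamically-created x86 thread can be pretagged by passing the K_FP_REGS or K_SSE_REGS option to k_thread_create().

  • An already-created x86 thread can pretag itself once it has started by passing the K_FP_REGS or K_SSE_REGS option to k_float_enable().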

If an x86 thread uses the floating point registers infrequently it can call k_float_disable() to remove its tagging as an FPU user or SSE user. This eliminates the need for the kernel to take steps to preserve the contents of the floating point registers during context switches when there is no need to do so. When the thread again needs to use the floating point registers it can re-tag itself as an FPU user or SSE user by calling k_float_enable().
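As a rough sketch of this pattern, the helper below tags the current thread only for the duration of an occasional computation. It only builds inside a Zephyr application configured for shared FP registers mode. The function name scale_reading and the scaling factor are hypothetical, the header path assumes a recent source tree, and the k_float_enable() call follows the signature given in the API reference of this document. The helper assumes the calling thread is not already tagged when it is entered.

```c
#include <zephyr/kernel.h>

/* Hypothetical helper: scale a raw reading using floating point arithmetic. */
int scale_reading(int raw)
{
    int scaled;

    /* Tag the current thread as an FPU user before touching FP registers. */
    k_float_enable(k_current_get(), K_FP_REGS);

    /* Convert back to an integer while the thread is still tagged. */
    scaled = (int)((float)raw * 0.25f);

    /* Remove the tag so the kernel stops preserving this thread's FP context. */
    k_float_disable(k_current_get());

    return scaled;
}
```

Returning an integer rather than a float keeps the result out of the floating point registers after the thread has removed its tag.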

Implementation

Performing Floating Point Arithmetic

No special coding is required for a thread to use floating point arithmetic if the kernel is properly configured.

The following code shows how a routine can use floating point arithmetic to avoid overflow issues when computing the average of a series of integer values.

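int average(const int *values, int num_values)
{
    /* A double accumulator cannot overflow when summing int values. */
    double sum = 0.0;
    int i;

    /* Avoid dividing by zero when the series is empty. */
    if (num_values <= 0) {
        return 0;
    }

    for (i = 0; i < num_values; i++) {
        sum += values[i];
    }

    /* Round to nearest; adding 0.5 assumes a non-negative average. */
    return (int)((sum / num_values) + 0.5);
}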
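To see why the double accumulator matters, the sketch below pairs a check for integer accumulator overflow with the floating point approach used by average() above. The names int_sum_would_overflow and average_fp are hypothetical helpers written for this illustration.

```c
#include <limits.h>
#include <stdbool.h>

/* Returns true if summing the values in an int accumulator would overflow. */
bool int_sum_would_overflow(const int *values, int num_values)
{
    int sum = 0;

    for (int i = 0; i < num_values; i++) {
        /* Check before adding, because signed overflow is undefined in C. */
        if ((values[i] > 0 && sum > INT_MAX - values[i]) ||
            (values[i] < 0 && sum < INT_MIN - values[i])) {
            return true;
        }
        sum += values[i];
    }

    return false;
}

/* Same approach as average(): accumulate in a double, then round. */
int average_fp(const int *values, int num_values)
{
    double sum = 0.0;

    for (int i = 0; i < num_values; i++) {
        sum += values[i];
    }

    return (int)((sum / num_values) + 0.5);
}
```

For a series such as two copies of INT_MAX, an int accumulator overflows on the second addition, while the double accumulator holds the exact sum and the computed average is INT_MAX.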

Suggested Uses

Use the kernel floating point services when an application needs to perform floating point operations.

Configuration Options

To configure unshared FP registers mode, enable the CONFIG_FPU configuration option and leave the CONFIG_FPU_SHARING configuration option disabled.

To configure shared FP registers mode, enable both the CONFIG_FPU configuration option and the CONFIG_FPU_SHARING configuration option. Also, ensure that any thread that uses the floating point registers has sufficient added stack space for saving floating point register values during context switches, as described above.
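The extra stack space figures given in the architecture sections above can be collected into a small helper when sizing thread stacks. This is only a sketch: the enum fp_stack_case and the functions fp_extra_stack_bytes and fp_thread_stack_size are hypothetical names, not kernel APIs, and the byte counts are the ones quoted in this document.

```c
#include <stddef.h>

/* Hypothetical identifiers for the cases listed in this document. */
enum fp_stack_case {
    FP_CASE_NONE,            /* thread not using FP registers, or x86 cooperative */
    FP_CASE_ARM_CORTEX_M,    /* ARM Cortex-M FP user */
    FP_CASE_ARC_SINGLE,      /* ARCv2, single floating point hardware */
    FP_CASE_ARC_DOUBLE,      /* ARCv2, double floating point hardware */
    FP_CASE_RISCV_SINGLE,    /* RISC-V, single floating point hardware */
    FP_CASE_RISCV_DOUBLE,    /* RISC-V, double floating point hardware */
    FP_CASE_X86_PREEMPT_FPU, /* x86 preemptive FPU user */
    FP_CASE_X86_PREEMPT_SSE  /* x86 preemptive SSE user */
};

/* Extra stack bytes needed to hold the saved floating point context. */
size_t fp_extra_stack_bytes(enum fp_stack_case c)
{
    switch (c) {
    case FP_CASE_ARM_CORTEX_M:    return 72;
    case FP_CASE_ARC_SINGLE:      return 16;
    case FP_CASE_ARC_DOUBLE:      return 32;
    case FP_CASE_RISCV_SINGLE:    return 84;
    case FP_CASE_RISCV_DOUBLE:    return 164;
    case FP_CASE_X86_PREEMPT_FPU: return 108;
    case FP_CASE_X86_PREEMPT_SSE: return 464;
    default:                      return 0;
    }
}

/* Total stack size for a thread whose non-FP needs are base_bytes. */
size_t fp_thread_stack_size(size_t base_bytes, enum fp_stack_case c)
{
    return base_bytes + fp_extra_stack_bytes(c);
}
```

For example, a preemptive x86 SSE user that otherwise needs 1024 bytes of stack would be given 1488 bytes.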

Use the CONFIG_SSE configuration option to enable support for SSEx instructions (x86 only).
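As an illustration, the options above might be set as follows, assuming the application keeps its Kconfig settings in a prj.conf file. The fragment selects shared FP registers mode. For unshared FP registers mode, leave CONFIG_FPU_SHARING disabled instead.

```cfg
# Enable floating point support and allow multiple threads to use it.
CONFIG_FPU=y
CONFIG_FPU_SHARING=y

# x86 only: also enable support for SSEx instructions.
CONFIG_SSE=y
```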

API Reference

group float_apis

Functions

void k_float_enable(struct k_thread *thread, unsigned int options)

Enable preservation of floating point context information.

This routine informs the kernel that the specified thread (which may be the current thread) will be using the floating point registers. The options parameter indicates which floating point register sets will be used by the specified thread:

  • K_FP_REGS indicates x87 FPU and MMX registers only

  • K_SSE_REGS indicates SSE registers (and also x87 FPU and MMX registers)

Invoking this routine initializes the thread’s floating point context info to that of an FPU that has been reset. The next time the thread is scheduled by z_swap() it will either inherit an FPU that is guaranteed to be in a “sane” state (if the most recent user of the FPU was cooperatively swapped out) or the thread’s own floating point context will be loaded (if the most recent user of the FPU was preempted, or if this thread is the first user of the FPU). Thereafter, the kernel will protect the thread’s FP context so that it is not altered during a preemptive context switch.

Warning

This routine should only be used to enable floating point support for a thread that does not currently have such support enabled already.

Return

N/A

Parameters
  • thread: ID of thread.

  • options: Registers to be preserved (K_FP_REGS or K_SSE_REGS).