Brief description:
We cannot fully (or easily) use the Cortex-M floating-point features in Zephyr RTOS applications that run together with TF-M.
Detailed Description of the case:
In a Zephyr multi-threaded environment we normally enable the FP co-processor and the advanced FP context-control features of Cortex-M:
- Automatic state preservation
- Lazy stacking
Zephyr threads are all free to use the floating point registers. Automatic state preservation ensures the caller-saved registers are preserved on the thread's stack, and the regular context-switch routines ensure the FP callee-saved registers are also preserved in the thread's dedicated container for callee-saved context. This has been fairly stable on devices without TrustZone-M enabled.
The situation, however, becomes challenging when Zephyr applications run in the Non-Secure domain, with TF-M running in the Secure domain:
The interesting use case is when Zephyr Non-Secure threads with an active FP context (CONTROL.FPCA = 1) make secure calls to the TF-M services. In such a scenario, the TF-M secure threads may need to preserve the FP context themselves, when a Non-Secure interrupt preempts them and attempts to access the FP registers. Normally Zephyr HW interrupts do not access the FP registers, but this is not guaranteed. And, of course, the PendSV interrupt, which handles the thread context switch, does access the FP registers, because it may need to save and restore the FP callee-saved registers.
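For reference, the "active FP context" condition above is a single bit in the CONTROL register. A minimal, host-testable sketch follows; the macro and helper names are hypothetical (not Zephyr or TF-M APIs), and on target the value would come from reading CONTROL, for example through the CMSIS `__get_CONTROL()` intrinsic:

```c
#include <stdbool.h>
#include <stdint.h>

/* CONTROL.FPCA is bit 2 of the CONTROL register on Cortex-M cores with an FPU. */
#define SKETCH_CONTROL_FPCA_MSK (1u << 2)

/*
 * Hypothetical helper: returns true when the given CONTROL value
 * indicates an active FP context (FPCA = 1).
 */
static inline bool sketch_fp_context_active(uint32_t control)
{
    return (control & SKETCH_CONTROL_FPCA_MSK) != 0u;
}
```

A Non-Secure thread for which this returns true before a secure call is the case this report is about.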
Currently this scenario will immediately trigger a TF-M system crash, via a Secure NOCP UsageFault (no co-processor), as TF-M does not enable the FP co-processor. This is noticed frequently in Zephyr applications with vanilla TF-M versions.
I suppose that this problem could be solved by enabling the FP co-processor in TF-M unconditionally (currently, this is under the FPU_USED pre-processor macro). Alternatively, there needs to be a way for TF-M to know that the Non-Secure application is built with floating-point support, and to enable the FP co-processor in the SCB->CPACR_S register.
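To make the proposed change concrete: enabling the FP co-processor amounts to granting full access to CP10 and CP11 in CPACR. The sketch below only computes and checks the register value, so it can be tested on a host; the names are hypothetical and this is not TF-M's actual code. On target the result would be written to the Secure CPACR from Secure privileged code, followed by DSB and ISB barriers.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * CPACR fields CP10 (bits 21:20) and CP11 (bits 23:22).
 * A value of 0b11 in each field grants full access to the FPU.
 */
#define SKETCH_CPACR_CP10_CP11_FULL (0xFu << 20)

/* Hypothetical helper: CPACR value with the FP co-processor enabled. */
static inline uint32_t sketch_cpacr_with_fp_enabled(uint32_t cpacr)
{
    return cpacr | SKETCH_CPACR_CP10_CP11_FULL;
}

/* Hypothetical helper: true only if both CP10 and CP11 have full access. */
static inline bool sketch_cpacr_fp_enabled(uint32_t cpacr)
{
    return (cpacr & SKETCH_CPACR_CP10_CP11_FULL) == SKETCH_CPACR_CP10_CP11_FULL;
}
```

If either field is left at a lower access level, an FP access is expected to raise the NOCP UsageFault described above.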
However, even if we solve this problem, TF-M would still need to properly preserve the FP caller-saved context when switching between threads in Secure PendSV, so that Non-Secure interrupts (and potential Non-Secure reschedule actions) do not interfere with FP state preservation. What we see today is that the Non-Secure FP context is not preserved during TF-M thread switches. This leads to inconsistent states, for example the lazy state preservation active bit being set in Secure Thread mode, which should never be the case.
This occurs because the TF-M context switch routine simply loads the stacked EXC_RETURN value of the thread to be switched in, potentially leaving the FPCCR.LSPACT bit set in Secure Thread mode (!!) and the caller-saved registers unstacked.
This will lead to system crashes if a preempting Non-Secure ISR attempts to use the floating point registers, because the FPCAR register most likely contains a stale address, pointing to an area not reserved for FP context preservation.
It is not enough to clear the FPCCR.LSPACT bit in Secure PendSV; this would mean that a Non-Secure ISR/Thread could eventually corrupt a previous FP context. It seems to me we need to store the FP context in PendSV, before switching out a thread with an active FP context (EXC_RETURN.FType = 0).
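The two conditions discussed here can be expressed as simple bit tests. This is only a host-testable sketch with hypothetical names, not the TF-M scheduler code. It assumes the stale state arises when the thread being switched in has FType = 1 (no FP frame) while FPCCR.LSPACT is still set from the previous context.

```c
#include <stdbool.h>
#include <stdint.h>

/* EXC_RETURN.FType is bit 4: 0 means the stacked frame includes FP state. */
#define SKETCH_EXC_RETURN_FTYPE_MSK (1u << 4)
/* FPCCR.LSPACT is bit 0: lazy FP state preservation is still pending. */
#define SKETCH_FPCCR_LSPACT_MSK (1u << 0)

/*
 * Hypothetical helper: true when the thread being switched out has an
 * active FP context (FType = 0), so PendSV should store it first.
 */
static inline bool sketch_must_save_fp_on_switch_out(uint32_t exc_return)
{
    return (exc_return & SKETCH_EXC_RETURN_FTYPE_MSK) == 0u;
}

/*
 * Hypothetical helper: true for the inconsistent state described above,
 * i.e. LSPACT still set while the incoming thread has no FP frame.
 * (Assumption: this is how the stale LSPACT survives the exception return.)
 */
static inline bool sketch_stale_lspact(uint32_t incoming_exc_return,
                                       uint32_t fpccr)
{
    bool incoming_has_fp_frame =
        (incoming_exc_return & SKETCH_EXC_RETURN_FTYPE_MSK) == 0u;
    bool lspact = (fpccr & SKETCH_FPCCR_LSPACT_MSK) != 0u;

    return lspact && !incoming_has_fp_frame;
}
```

A check like the second helper could serve as a debug assertion in PendSV; the first reflects the condition under which the FP context would be stored before the switch.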
Assessment
This is a serious problem IMHO. It rules out using the Cortex-M FP-related optimizations (automatic and lazy state preservation) and forces Zephyr to manually save and restore the FP context before and after a secure call, respectively.
Let me know if you need more information.
/Ioannis