
Supporting Non-Secure RTOS applications, integrated with Trusted Firmware-M, which want to use the FPU
Closed, Resolved (Public)

Description

Brief description:

We cannot fully (and easily) utilize the Cortex-M floating-point features in Zephyr RTOS applications that run together with TF-M.

Detailed Description of the case:

In a Zephyr multi-threaded environment we normally enable the FP co-processor and the advanced context-control features in Cortex-M:

  • Automatic state preservation
  • Lazy stacking

Zephyr threads are all free to use the floating point registers; the automatic state preservation ensures the caller-saved registers are preserved in the thread's stack and the regular context-switch routines ensure the FP callee-saved registers are also preserved in the thread's dedicated container for callee-saved context. This has been fairly stable on non TrustZone-M enabled devices.

The situation, however, becomes challenging when Zephyr applications are running in the Non-Secure domain, with TF-M running in the Secure domain:

The interesting use case is when Zephyr Non-Secure threads with an active FP context (CONTROL.FPCA = 1) make secure calls to the TF-M services. In such a scenario, the TF-M secure threads may need to preserve the FP context themselves, during a Non-Secure interrupt that attempts to access the FP registers. Normally Zephyr HW interrupts do not access the FP registers, but this is not guaranteed. And, of course, the PendSV interrupt, which handles the thread context switch, does access the FP registers, because it may need to save and restore the FP callee-saved registers.

Currently this scenario will immediately trigger a TF-M system crash, via a Secure NOCP UsageFault (no co-processor), as TF-M does not enable the FP co-processor. This is noticed frequently in Zephyr applications with vanilla TF-M versions.
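When triaging a crash like this one, the NOCP flag can be decoded from the raw fault status value. The sketch below assumes the Armv8-M layout in which UFSR occupies the upper halfword of CFSR and NOCP is UFSR bit 3. The helper name is hypothetical, and for a Secure fault it is the Secure-banked CFSR value that would be passed in.

```c
#include <stdbool.h>
#include <stdint.h>

/* UFSR is the upper 16 bits of CFSR; NOCP is UFSR bit 3 (CFSR bit 19). */
#define CFSR_UFSR_SHIFT 16u
#define UFSR_NOCP_MASK  (1u << 3)

/* Hypothetical helper: true if a raw CFSR value reports a
 * No-Coprocessor UsageFault. */
static bool cfsr_reports_nocp(uint32_t cfsr)
{
    uint32_t ufsr = cfsr >> CFSR_UFSR_SHIFT;
    return (ufsr & UFSR_NOCP_MASK) != 0u;
}
```

A fault handler could pass the CFSR value it reads into this helper to confirm that the fault really is NOCP before looking further at the FP state.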

I suppose that this problem could be solved by enabling the FP co-processor in TF-M unconditionally (currently, this is under the FPU_USED pre-processor macro). Alternatively, there needs to be a way for TF-M to know that the Non-Secure application is built with floating-point support, and to enable the FP co-processor in the SCB->CPACR_S register.

However, even if we solve this problem, the TF-M would still need to properly preserve the FP caller-saved context, when switching between threads in Secure PendSV, so Non-Secure interrupts (and potential Non-Secure reschedule actions) would not interfere with FP state preservation. What we see today is that the Non-Secure FP context is not preserved during TF-M thread switches, leading to weird situations such as when the lazy state preservation active bit is set in Secure thread mode; which should not be the case.

The reason this occurs is because the TF-M context switch routine simply loads the stacked EXC_RETURN value of the thread to be switched-in, potentially leaving the FPCCR.LSPACT bit set in Secure Thread mode (!!), and the caller-saved registers unstacked.

This will lead to system crashes, if a preempting Non-Secure ISR attempts to use the floating point registers, because the FPCAR register is most likely containing trash information, pointing to an area not reserved for FP context preservation.

It is not enough to clear the FPCCR.LSPACT bit in Secure PendSV; this would mean that a Non-Secure ISR/Thread could eventually corrupt a previous FP context. It seems to me we need to store the FP context in PendSV, before switching out a thread with an active FP context (EXC_RETURN.FType = 0).
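The decision described here, whether an outgoing thread owns FP state that must be saved, can be sketched as a check on the stacked EXC_RETURN value. This is only an illustration: `struct thread_ctx` and the function name are hypothetical and are not TF-M's scheduler structures. The bit position is the Armv8-M EXC_RETURN.FType bit (bit 4).

```c
#include <stdbool.h>
#include <stdint.h>

/* EXC_RETURN bit 4 (FType): 0 = extended frame with FP state,
 * 1 = standard frame without FP state. */
#define EXC_RETURN_FTYPE_MASK (1u << 4)

/* Hypothetical per-thread record; not the TF-M thread structure. */
struct thread_ctx {
    uint32_t exc_return; /* EXC_RETURN captured when the thread was preempted */
};

/* True if the outgoing thread has an active FP context that would need to
 * be saved before another thread is switched in. */
static bool needs_fp_save_on_switch_out(const struct thread_ctx *t)
{
    return (t->exc_return & EXC_RETURN_FTYPE_MASK) == 0u;
}
```

In a PendSV handler, a true result would be the point at which the FP registers are stored, rather than only clearing LSPACT.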

Assessment

This is a serious problem IMHO. It rules out using the Cortex-M FP-related optimizations (automatic and lazy state preservation) and forces Zephyr to manually save and restore the FP context before and after a secure call, respectively.

Let me know if you need more information.

/Ioannis

Event Timeline

ioannisg created this task. Apr 27 2021, 9:12 AM
davidwang added a subscriber: davidwang.

hi, Ioannisg

For the concern you mentioned: yes, we need to add extra steps in the secure scheduler. I am still working on this part.

  1. When the secure side makes secure calls:
     a. If lazy FP is disabled, hardware will push/pop the FP context automatically during exception entry/return.
     b. If lazy FP is enabled: for isolation level 1, the secure scheduler will save and restore the FP context, but not invalidate it; for isolation levels 2 and 3, the secure scheduler will trigger lazy FP stacking, and hardware will push the FP context to the thread's stack and invalidate it automatically.
  2. When the non-secure side makes secure calls, it will SG to the secure world in tfm_nspm_thread_entry, and then make secure calls in the same way as above. The non-secure FP context can be restored on the bxns lr back to the non-secure side.

For your crash problem: if FP (hardware) is enabled only on the NS side, and not on the S side, there should be no crash even when the non-secure side makes secure calls (currently TF-M only supports software FP by default).
Since you mentioned that the crash is a Secure NOCP UsageFault (no co-processor), I think you must have made some change on the TF-M side to enable FP support, right?
Could you let me know exactly what change you made to TF-M? This information will be helpful for investigating whether we can find a way to mitigate the problem.
1. Which Cortex-M core?
2. The TF-M changeset you are using, and the compiler version.
3. Changes you made to TF-M while integrating it into your project.
4. More information about the crash:

a. IPC mode or library mode? Which isolation level?
b. Detailed sequence of actions between NS and S when the crash occurs.
c. Did you use FP in the SPM (secure partition manager), in secure partitions, or in both?
d. Does the crash occur only with lazy FP enabled? What happens with lazy FP disabled?
e. Which PSA calls cause the crash, or is it all PSA calls?
f. How often does the crash occur? Always, or only sometimes?
g. Which fault is entered? The values of the registers and the stack frame in memory at crash time are very useful for analysis.
h. Other information, if possible.

Thank you!

Hi Feder,

For your crash problem, if only enabled FP (hardware) on NS side, but not enable FP (hardware) on S side, there should be no crash even non-secure doing secure calls (Currently TF-M only support software FP by default).

Right, when only Non-Secure application is using the FPU, there should be no issues.

As you mentioned crash happens for Secure NOCP UsageFault (no co-processor), I think you should make some change on TF-M side to enable FP support, right?

The required change is that TF-M enables FP usage as well, by writing to the CPACR_S register. There are several options here:

  • enable FPU unconditionally
  • enable FPU based on a user configuration that signifies that the NS Application uses the FP registers _and_ is able to switch to a secure call with CONTROL.FPCA set to 1 (otherwise, there is no problem, actually)
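The second option could be sketched as follows. The `ns_uses_fpu` flag stands in for a user or build configuration and is purely hypothetical (it is not an existing TF-M option); the function only computes the register value, using the CPACR layout in which coprocessor n has a 2-bit access field at bits [2n+1:2n].

```c
#include <stdbool.h>
#include <stdint.h>

/* CPACR has one 2-bit access field per coprocessor: CPn sits at
 * bits [2n+1:2n]; the value 3 grants full access. */
#define CPACR_CP_FULL_ACCESS(n) (3u << ((n) * 2u))

/* ns_uses_fpu is a hypothetical configuration flag, not a TF-M setting. */
static uint32_t cpacr_s_value(uint32_t cpacr, bool ns_uses_fpu)
{
    if (!ns_uses_fpu) {
        return cpacr; /* leave the Secure FPU access bits untouched */
    }
    /* Grant full access to CP10 and CP11, which together control the FPU. */
    return cpacr | CPACR_CP_FULL_ACCESS(10u) | CPACR_CP_FULL_ACCESS(11u);
}
```

Passing `true` unconditionally corresponds to the first option in the list.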

Could you let me know what's the exactly change you made on TF-M? These information will be helpful for investigation to see whether we can find a way to mitigate the problem.

I did not make any changes to TF-M. All I did was enable and use the FP registers in the Non-Secure application, also enabling automatic state preservation and lazy stacking (ASPEN, LSPEN). I also allowed Non-Secure threads with an active FP context to make PSA calls.

In detail:

1.Which cortex-m core?

M33

TF-M changeset you are using and compiler version.

Using TF-M 1.3.0

Detailed sequence of actions between NS and S when crash.

a. Boot TF-M
b. Boot Non-Secure application. Set CPACR (allow FP usage), Set FPCCR .ASPEN and .LSPEN to 1
c. Switch to a non-secure thread that uses FP instructions. CONTROL.FPCA will be set to 1, indicating an active FP context.
d. Do a PSA call
e. While inside the PSA call, trigger a Non-Secure interrupt that makes use of FP instructions (I am using Non-Secure PendSV)
f. crash with Secure UsgFault, NOCP

Changes you did for TF-M while integrated to your project.

None

IPC mode or library mode? Which isolation level?

Does not matter

Did you use FP in SPM(secure partition manager) or use FP in secure partitions? Or both?

No

Which PSA calls causes the crash, or all PSA calls?

Does not matter. But it has to be a call that gets interrupted by the Non-Secure application.

How about the occurrence of crash? Always crash or sometimes?

Always

Which fault entered? Value of registers and stack frame in memory at crash time are very useful for analysis.

Secure No-Coprocessor UsageFault error

Crash only for Lazy FP enabled? How about the status if Lazy FP disabled?

I have only tried with LSPEN set to 1. Without LSPEN, I guess that the stacking will occur during the Non-Secure exception entry, and this will end up with the same UsageFault.

Once more, I am stressing that it does not seem to be enough to enable the CPACR in Secure domain. We need to actually save the FP registers in the secure thread context-switch, if the thread is actually switched-out.

ioannisg triaged this task as High priority. May 3 2021, 8:21 AM

Setting this to High for now, but feel free to re-triage if this was not appropriate.

Hi @ioannisg,
FYI, Feder is on holiday and will be back in the office on 10th May.
Thanks.

hi, ioannisg

If you didn't change TF-M while integrating it into your project, a PSA call (handler mode) cannot be interrupted by a non-secure interrupt as you described, because non-secure exceptions are de-prioritized (AIRCR.PRIS = 1) in TF-M.
A non-secure interrupt can only become active when the system is in thread mode.
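For reference, the AIRCR.PRIS setting mentioned here can be illustrated with a small value-level sketch. It assumes the Armv8-M AIRCR layout (PRIS at bit 14, and a write key of 0x05FA in the upper halfword); the helper names are hypothetical and this is not how TF-M itself is written.

```c
#include <stdbool.h>
#include <stdint.h>

#define AIRCR_VECTKEY   (0x05FAu << 16) /* key that must accompany every write */
#define AIRCR_PRIS_MASK (1u << 14)      /* prioritize Secure exceptions */

/* True if Non-Secure exceptions are de-prioritized in this AIRCR value. */
static bool aircr_pris_is_set(uint32_t aircr)
{
    return (aircr & AIRCR_PRIS_MASK) != 0u;
}

/* Build the value to write back: keep the low control bits, replace the
 * read-back key status with the write key, and set PRIS. */
static uint32_t aircr_write_value_with_pris(uint32_t aircr_read)
{
    return (aircr_read & 0x0000FFFFu) | AIRCR_VECTKEY | AIRCR_PRIS_MASK;
}
```

Note that PRIS only lowers the priority of Non-Secure exceptions relative to Secure ones; it does not stop a Non-Secure interrupt from preempting Secure thread mode.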

Based on your answer:

  1. You didn't "use FP in SPM(secure partition manager) or use FP in secure partitions".
  2. You didn't change TF-M while integrating into your project.
  3. Crash only happens in non-Secure interrupt that makes use of FP instructions.

So your application is a non-Secure application using the FPU, and there should be no issues.

Besides the FP register configuration you mentioned for the non-secure application, did you add the corresponding compiler and linker flags for your non-secure interrupt source file, such as -mfloat-abi=hard -mfpu=fpv5-sp-d16 (GCC)?

Hi Feder,

  1. I have not changed TF-M at all while integrating it with Zephyr. I am using an upstream TF-M version.
  2. There is no problem with the linker and ABIs; I compile Zephyr and TF-M with soft FP.

If you didn't change TF-M while integrating into your project, PSA call(handler mode) cannot be interrupted by non-secure interrupt like you mentioned, the reason is non-secure exceptions are de-prioritized (AIRCR.PRIS = 1) in TF-M.

I did not mean to say that Secure handler mode is interrupted.

Non-secure interrupt can only be active when system in thread mode.

Exactly. This is my use case. Let me explain this in a bit more detail, to avoid confusion.

Steps:
a. Boot TF-M
b. Boot Non-Secure application. Zephyr in my use-case. Inside Zephyr boot phase, I set CPACR (to allow FP usage). I also set FPCCR .ASPEN and .LSPEN to 1. These bits enable FP automatic state preservation and lazy stacking, respectively.
c. Inside my Non-Secure application, I switch to a non-secure thread that uses FP instructions. This means that CONTROL.FPCA will be set to 1, indicating an active FP context.
d. Then, I am calling a secure service from my non-secure thread application. E.g. I am calling psa_hash_compute()
e. While still doing the above secure service, so, before the psa_hash_compute function returns, a Non-Secure interrupt is triggered and becomes active. This is allowed, because the background state is Secure Thread mode.
f. Inside the non-secure interrupt I make use of FP instructions.
g. Crash with Secure UsgFault, NOCP is observed.
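Step b of this sequence can be sketched at the register-value level. The bit positions assume the Armv8-M FPCCR layout (ASPEN at bit 31, LSPEN at bit 30); the function name is hypothetical and the sketch only computes the value that would be written to the Non-Secure FPCCR.

```c
#include <stdint.h>

#define FPCCR_ASPEN_MASK (1u << 31) /* automatic state preservation enable */
#define FPCCR_LSPEN_MASK (1u << 30) /* lazy state preservation enable */

/* Return the FPCCR value with automatic state preservation and lazy
 * stacking enabled, leaving all other bits as they were. */
static uint32_t fpccr_with_lazy_stacking(uint32_t fpccr)
{
    return fpccr | FPCCR_ASPEN_MASK | FPCCR_LSPEN_MASK;
}
```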

Could you look at this use case and see if you also get the same processor error?

If you don't see any issues, could you suggest an explanation for why I am observing a Secure UsageFault with the NOCP flag, while I am only using an upstream TF-M version?

Thanks!

federliangarm added a comment. Edited May 13 2021, 1:35 AM

Hi, Ioannisg

The first thing I want to confirm is "I compile zephyr and Tf-M with soft FP". As you know, TF-M uses soft FP by default.

  1. But for Zephyr, are you using soft FP or hardware FP?
  2. Is it possible for you to share the compile options and linker options for the source file containing the crashing "Non-Secure interrupt"?
  3. Is it possible to show the assembly code for the "Non-Secure interrupt"?
  4. Please try disabling lazy FP stacking, to see whether it still crashes.

Hi,

But for Zephyr, are you using soft FP or hardware FP?

I am using soft FP for Zephyr. You can see the setting here:
https://github.com/zephyrproject-rtos/zephyr/blob/f8ac3a49ec95f4bf98bbd1f2b827c421f654d48b/arch/arm/core/aarch32/Kconfig#L241

Hard ABI is disabled when building with TF-M. Otherwise we cannot link TF-M with Zephyr libraries.

Is it possible for you to share the compile options and linker options for the source file including the "Non-Secure interrupt" crashing?

You can browse all the settings here: https://github.com/zephyrproject-rtos/zephyr/tree/master/cmake
And the gcc-specific settings here: https://github.com/zephyrproject-rtos/zephyr/blob/master/cmake/compiler/gcc/target_arm.cmake

Is it possible to show the assembly code for the "Non-Secure interrupt"?

Yeap. The Non-Secure interrupt is the PendSV handler in Zephyr:

https://github.com/zephyrproject-rtos/zephyr/blob/f8ac3a49ec95f4bf98bbd1f2b827c421f654d48b/arch/arm/core/aarch32/swap_helper.S#L112

The link shows the line that triggers the fault.

Please have a try to disable lazy FP stacking, to see whether still crash.

Will do, and let you know.

Yes, it crashes even without Lazy Stacking. It is a bit more deterministic as is. It crashes in the first secure exception entry, after the transition to secure domain from non-secure.

federliangarm added a comment. Edited May 13 2021, 9:28 AM

hi, Ioannisg

This is the explanation for "FP_SOFTABI" in Zephyr.
"config FP_SOFTABI
	bool "Floating point Soft ABI"
	help
	  This option selects the Floating point ABI in which hardware floating
	  point instructions are generated but soft-float calling conventions."

The definition of FP_SOFTABI in Zephyr is not the same as the default "software FP" option in TF-M.

For GCC compiler, please check here https://gcc.gnu.org/onlinedocs/gcc/ARM-Options.html.
"-mfloat-abi=name
Specifies which floating-point ABI to use. Permissible values are: ‘soft’, ‘softfp’ and ‘hard’.
Specifying ‘soft’ causes GCC to generate output containing library calls for floating-point operations. ‘softfp’ allows the generation of code using hardware floating-point instructions, but still uses the soft-float calling conventions. ‘hard’ allows generation of floating-point instructions and uses FPU-specific calling conventions."

TF-M currently supports "-mfloat-abi=soft" as the default, but doesn't support the setting used in Zephyr, "-mfloat-abi=softfp", because they are totally different things.

For "-mfloat-abi=softfp", the compiler generates hardware floating-point instructions but still uses the software float calling conventions.
TF-M doesn't support FP instructions with the "-mfloat-abi=softfp" option. This should be the reason why you got the crash when doing the secure call.
For confirmation, please double check whether there are FP instructions in the assembly code of the secure exception.
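The three GCC float ABIs quoted above can be told apart from inside a translation unit. The sketch below assumes the ACLE feature macros `__ARM_FP` (hardware FP instructions available) and `__ARM_PCS_VFP` (FP registers used for argument passing); the enum and function names are hypothetical. The classification itself is a plain function so that the logic can be checked on any host.

```c
#include <stdbool.h>

enum float_abi { FLOAT_ABI_SOFT, FLOAT_ABI_SOFTFP, FLOAT_ABI_HARD };

/* soft:   no hardware FP instructions, library calls instead
 * softfp: hardware FP instructions, soft-float calling convention
 * hard:   hardware FP instructions, FP registers used for arguments */
static enum float_abi classify_float_abi(bool hw_fp_instructions,
                                         bool vfp_calling_convention)
{
    if (!hw_fp_instructions) {
        return FLOAT_ABI_SOFT;
    }
    return vfp_calling_convention ? FLOAT_ABI_HARD : FLOAT_ABI_SOFTFP;
}

/* Classify the current translation unit from the ACLE macros
 * (an assumption of this sketch; meaningful on 32-bit Arm targets only). */
static enum float_abi translation_unit_float_abi(void)
{
#if defined(__ARM_FP)
    const bool hw_fp = true;
#else
    const bool hw_fp = false;
#endif
#if defined(__ARM_PCS_VFP)
    const bool vfp_pcs = true;
#else
    const bool vfp_pcs = false;
#endif
    return classify_float_abi(hw_fp, vfp_pcs);
}
```

Under these assumptions, a TF-M object built with the default options would classify as soft, and a Zephyr object built with FP_SOFTABI as softfp.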

ioannisg added a comment. Edited May 13 2021, 5:56 PM

TF-M currently support "-mfloat-abi=soft" as default, but doesn't support the setting in Zephyr "-mfloat-abi=softfp". Because they are totally different things.

But why? These ABIs are link-compatible, I do not see the problem. GCC allows these binaries to link successfully.

Can you please point me to the TF-M documentation, where it is stated that "TF-M does not support compiling Non-Secure application binaries with soft FP"?

I could not find any relevant information (but may have missed that). I could only find:

if(TFM_SYSTEM_FP)
    message(FATAL_ERROR "Hardware FPU is currently not supported in TF-M")
endif()

This suggests that TF-M does not support Hardware FPU, but I do not think this means TF-M does not allow the non-secure applications to use the FPU.
TF-M actually implies that Non-Secure applications are allowed to use the FPU:

See the code snippet below [in tfm_arch_config_extensions()]:

#if defined(__ARM_ARCH_8_1M_MAIN__) || defined(__ARM_ARCH_8M_MAIN__)
    /* Permit Non-secure access to the Floating-point Extension.
     * Note: It is still necessary to set CPACR_NS to enable the FP Extension in
     * the NSPE. This configuration is left to NS privileged software.
     */
    SCB->NSACR |= SCB_NSACR_CP10_Msk | SCB_NSACR_CP11_Msk;
#endif

So, according to my understanding, TF-M currently intends to allow Non-Secure applications to make use of the FPU. And of course, such applications need to compile with softfp to do that. So the use case I have presented to you is a valid use case.

If TF-M does not allow non-secure applications to use the FPU at all, that should be stated in the documentation and the tfm_arch_config_extensions() should not touch the NSACR register. But, honestly, that would be a severe limitation to Non-Secure applications on SoCs that have an FPU co-processor. And I do not believe this could have been the original intention.

TF-M doesn't support FP instructions with "-mfloat-abi=soft" option. This should be the reason why you got the crash when doing secure call.

I have been trying to explain to you that I compile TF-M with soft; I do not change the default settings. And I compile Zephyr with softfp, and as expected, I do not get any link errors. Maybe you are under the impression that if I use softfp for Zephyr, then this is also used in the TF-M compilation? I can confirm that the TF-M binary is still compiled with soft, as is the default configuration :)

Therefore, the TF-M code does not use FPU instructions at all. The reason for the crash is that Cortex-M tries to stack FP registers because the LSPACT flag is left set, by the core, while in secure thread mode.

For confirmation, please double check whether there is FP instructions in the assemble code of the secure exception.

I confirm that the assembly code of any secure exceptions do not have any FP instructions. There are no FP instructions at all in the TF-M binary.

hi, Ioannisg

TF-M currently support "-mfloat-abi=soft" as default, but doesn't support the setting in Zephyr "-mfloat-abi=softfp". Because they are totally different things.

But why? These ABIs are link-compatible, I do not see the problem. GCC allows these binaries to link successfully.

I mean that TF-M's own current FP option is "-mfloat-abi=soft", not the "-mfloat-abi=softfp" that is used in Zephyr. The setting used for Zephyr does not matter.

Can you please point me to the TF-M documentation, where it is stated that "TF-M does not support compiling Non-Secure application binaries with soft FP"?

I could not find any relevant information (but may have missed that). I could only find:

if(TFM_SYSTEM_FP)
    message(FATAL_ERROR "Hardware FPU is currently not supported in TF-M")
endif()

This suggests that TF-M does not support Hardware FPU, but I do not think this means TF-M does not allow the non-secure applications to use the FPU.
TF-M actually implies that Non-Secure applications are allowed to use the FPU:
See the code snippet below [in tfm_arch_config_extensions()]:

#if defined(__ARM_ARCH_8_1M_MAIN__) || defined(__ARM_ARCH_8M_MAIN__)
    /* Permit Non-secure access to the Floating-point Extension.
     * Note: It is still necessary to set CPACR_NS to enable the FP Extension in
     * the NSPE. This configuration is left to NS privileged software.
     */
    SCB->NSACR |= SCB_NSACR_CP10_Msk | SCB_NSACR_CP11_Msk;
#endif

So, according to my understanding, TF-M currently intends to allow Non-Secure applications to make use of the FPU. And of course, such applications need to compile with softfp to do that. So the use case I have presented to you is a valid use case.
If TF-M does not allow non-secure applications to use the FPU at all, that should be stated in the documentation and the tfm_arch_config_extensions() should not touch the NSACR register. But, honestly, that would be a severe limitation to Non-Secure applications on SoCs that have an FPU co-processor. And I do not believe this could have been the original intention.

Of course, TF-M supports working with NS application which uses FP.

TF-M doesn't support FP instructions with "-mfloat-abi=soft" option. This should be the reason why you got the crash when doing secure call.

This was a typo, already corrected: TF-M doesn't support FP instructions with the "-mfloat-abi=softfp" option. That guess was based on whether there are FP instructions in the secure exception. Since you confirmed that there are no FP instructions in the secure exception, just forget it.

Therefore, the TF-M code does not use FPU instructions at all. The reason for the crash is that Cortex-M tries to stack FP registers because the LSPACT flag is left set, by the core, while in secure thread mode.

Do you have FP instructions in the secure thread? If there are no FP instructions, the processor will not stack the FP registers even if the LSPACT flag is set.
I tried this on my side on the STM32L562e_dk board with the TF-M repo: I enabled "-mfloat-abi=softfp" on the NS side, used the TF-M default "-mfloat-abi=soft" option, changed the FP registers in a non-secure thread first, and then did a PSA call. When the system is in secure thread mode, no crash is found even with FPCCR_S.LSPACT = 1.

For confirmation, please double check whether there is FP instructions in the assemble code of the secure exception.

I confirm that the assembly code of any secure exceptions do not have any FP instructions. There are no FP instructions at all in the TF-M binary.

If there are no FP instructions in the secure exception, it doesn't make sense that the system crashes at a secure exception as you mentioned before.

Yes, it crashes even without Lazy Stacking. It is a bit more deterministic as is. It crashes in the first secure exception entry, after the transition to secure domain from non-secure.

The problem you reported looks very strange. I think we should first locate which assembly line causes the crash. You can check the ReturnAddress (at address $MSP_NS+0x18, $PSP_S+0x18, or $PSP_NS+0x18) on the stack to get some clues when the crash happens.
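The +0x18 offset mentioned here corresponds to the ReturnAddress slot of the basic exception stack frame (R0, R1, R2, R3, R12, LR, ReturnAddress, xPSR). A minimal sketch, with hypothetical type and function names, assuming the stack pointer passed in points at the start of that basic frame:

```c
#include <stddef.h>
#include <stdint.h>

/* Basic (integer-only) exception stack frame, lowest address first. */
struct basic_exception_frame {
    uint32_t r0;
    uint32_t r1;
    uint32_t r2;
    uint32_t r3;
    uint32_t r12;
    uint32_t lr;
    uint32_t return_address; /* offset 0x18: where execution was interrupted */
    uint32_t xpsr;
};

/* Read the interrupted program counter from a stacked frame. */
static uint32_t stacked_return_address(const struct basic_exception_frame *frame)
{
    return frame->return_address;
}
```

One caveat under the same assumptions: when Secure code is preempted by a Non-Secure exception, Armv8-M may stack additional state context below this frame, in which case the frame no longer starts at the stack pointer and the offset from $PSP_S is larger than 0x18.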
You can also contact support@arm.com for help.

Do you have FP instruction in secure thread? If there is no FP instruction, the processor will not stack FP registers even LSPACT flag is set.
I tried this on my side on STM32L562e_dk board with TF-M repo: enable "-mfloat-abi=softfp" in NS side and use TF-M default "-mfloat-abi=soft" option, and changed FP registers in non-secure thread first, then do PSA call. When system is in secure thread mode, no crash found even FPCCR_S.LSPACT = 1.

Right. No FP instructions in the secure (TF-M) thread.

So first of all, if you acknowledge that FPCCR_S.LSPACT is 1, in secure thread mode, this is already a "problem". This bit should only be set to 1 if lazy state preservation is active, and that should only be the case in Handler mode. Not in Thread mode.

Now: what you need to try here is the scenario I described above. In brief, you need to get a Non-Secure interrupt while you are in Secure thread mode, and _that_ interrupt needs to access the FP registers. Then, if LSPACT is 1, you will activate lazy stacking in the background state, which is secure. And this will trigger the fault.
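The condition argued in this comment, LSPACT being set while the processor is in Thread mode, can be expressed as a simple check on register values. The sketch assumes FPCCR.LSPACT at bit 0 and the IPSR exception number in bits [8:0], where 0 means Thread mode; the function name is hypothetical, and it encodes the expectation stated in this comment rather than an architectural rule.

```c
#include <stdbool.h>
#include <stdint.h>

#define FPCCR_LSPACT_MASK   (1u << 0) /* lazy state preservation active */
#define IPSR_EXCEPTION_MASK 0x1FFu    /* exception number; 0 = Thread mode */

/* True if lazy state preservation is still marked active although the
 * processor is running in Thread mode. */
static bool lspact_set_in_thread_mode(uint32_t fpccr, uint32_t ipsr)
{
    bool thread_mode = (ipsr & IPSR_EXCEPTION_MASK) == 0u;
    return thread_mode && (fpccr & FPCCR_LSPACT_MASK) != 0u;
}
```

A debug assertion on the Secure side could evaluate this with the Secure-banked FPCCR value just before a thread is resumed.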

If there is no FP instruction in secure exception, it doesn't make sense that system crash at secure exception as you mentioned before.

It crashes, though, with the error information that I mentioned.

It looks very strange for the problem you reported. I think first we should locate which assembly line cause the crash. You can check the ReturnAddress(address of $MSP_NS+0x18, or $PSP_S+0x18, or $PSP_NS+0x18) on the stack to get some clue when crash happens.

I have already posted the line that causes the crash. It is in the Non-Secure PendSV exception, an FP instruction that loads the FP registers.

You can also contact support@arm.com for help.

Well, I do not believe this is a problem with the ARM architecture. I believe it is a TF-M problem that needs to be addressed.

I tried this on my side on STM32L562e_dk board with TF-M repo: enable "-mfloat-abi=softfp" in NS side and use TF-M default "-mfloat-abi=soft" option, and changed FP registers in non-secure thread first, then do PSA call. When system is in secure thread mode, no crash found even FPCCR_S.LSPACT = 1.

By the way, this is not a complete setup: you also need to enable ASPEN and LSPEN in Non-Secure mode.

Here's the list of steps, again:

a. Compile TF-M as is, and boot TF-M
b. Boot Non-Secure application. Set CPACR (allow FP usage), Set FPCCR .ASPEN and .LSPEN to 1
c. Switch to a non-secure thread that uses FP instructions. CONTROL.FPCA will be set to 1, indicating an active FP context. You can do this with any FP instructions, e.g. vldmia { }.
d. Do a PSA call
e. While inside the PSA call, observe that CONTROL.FPCA is still set to 1. Then trigger a Non-Secure interrupt that makes use of FP instructions (I am using Non-Secure PendSV, and there's a vstm instruction)
f. crash with Secure UsgFault, NOCP

hi, Ioannisg

I tried on my side with steps above, and I still cannot reproduce your issue.
Please confirm NSACR register is configured correctly, and check the FP related registers before the FP instruction causing crash.

Here is the test evidence on my side.

  1. Compile TF-M as default (-mfloat-abi=soft), and the Non-secure side with -mfloat-abi=softfp.
  2. NSACR register (screenshot not preserved)
  3. CPACR register (screenshot not preserved)
  4. FPCCR register, .ASPEN and .LSPEN set to 1 (screenshot not preserved)
  5. Switch to a non-secure thread that uses FP instructions, change FP registers, CONTROL.FPCA will be set to 1.
  6. Do a PSA call, CONTROL.FPCA will be 0 after the SVC instruction.
  7. Trigger Non-secure Timer interrupt and test vstm FP instruction in this non-secure handler, no crash found.

Before the vstm FP instruction, the FP registers are as follows (screenshot not preserved).

After the vstm FP instruction, the FP caller-saved registers are stacked and invalidated, because lazy FP stacking is enabled.

Hi Feder,

I created a very small application in TF-M which follows exactly the steps that I have described to you.
You can build the TF-M and application binaries using the following branches (or cherry-pick the relevant commits on the top):

Secure (TF-M): https://github.com/ioannisg/trusted-firmware-m/commits/test_fpu
Non-Secure (tf-m-tests): https://github.com/ioannisg/tf-m-tests/commits/test_fpu_usage

Note that the selected HW interrupt that I used corresponds to the nRF5340 board. For your STM, you can use any IRQ line that is available.

You need to build this with PSA_API=ON, TEST_S=OFF, TEST_NS=OFF of course. I also set BL2=OFF and used GCC toolchain.

Please, try to build and run it, and confirm whether or not the crash occurs.

Besides trying to reproduce this scenario, I recommend that you take a look at the Exception Entry section in the ARM manual, where you can find some information on the root cause of the issue we are discussing here.

Looking forward to your testing!

hi, Ioannisg

Please try to add code below just before setting NSACR in tfm_arch_config_extensions() and let me know the result. Thanks!

/* Enable Secure privileged and unprivileged access to the FP Extension */
SCB->CPACR |= (3U << 10U*2U)     /* enable CP10 full access */
              | (3U << 11U*2U);  /* enable CP11 full access */

Hi Feder, correct; setting CPACR_S solves the problem of Secure No-Coprocessor Usage Fault.

However, this is only a partial solution. That is because a Secure thread re-scheduling will clear the .FPCA flag, but leave the LSPACT set, meaning that a NS IRQ with FP instructions will trigger again an FP stacking. But this FP stacking will be done on the memory where FPCAR is pointing at, and FPCAR is only updated in exception entry, if .FPCA is set. As a result, it does not seem that you avoid a stack corruption.

Let me know if you've understood this argumentation, or you need more information.

Any updates here, Feder?

However, this is only a partial solution. That is because a Secure thread re-scheduling will clear the .FPCA flag, but leave the LSPACT set, meaning that a NS IRQ with FP instructions will trigger again an FP stacking. But this FP stacking will be done on the memory where FPCAR is pointing at, and FPCAR is only updated in exception entry, if .FPCA is set. As a result, it does not seem that you avoid a stack corruption.

Let me know if you've understood this argumentation, or you need more information.

hi, Ioannisg
Did you face the issue you mentioned above? If yes, please list the detailed steps to reproduce the issue.

ioannisg added a comment. Edited Jun 1 2021, 12:08 PM

Hi Feder,
Honestly, I've only faced this issue when doing some advanced scheduling manipulations on the Non-Secure side (modifying LSPACT, FPCA, etc.), so not in mainline TF-M/RTOS use cases. Please disregard it for now.

The actual problem with clearing CONTROL.FPCA during Secure thread re-scheduling is the following: if a Non-Secure thread can preempt Secure threads and get scheduled in, we have no information that the scheduled-out thread has an active FP context. This is because FPCA is the only non-banked information that we can use. LSPACT is banked, so a Non-Secure thread cannot read LSPACT_S. Lacking this information, the Non-Secure RTOS may just generate an FP context, effectively corrupting the FP callee-saved registers (S16-S31). This is not related to stacking.

This is a problem today, and this blocks Zephyr from efficiently using the FPU in applications combined with TFM.

This is why I mentioned that enabling CPACR in Secure side only partially solves the problem of "allowing NS applications to efficiently use FPU together with TFM". But at least, it solves a crash, so I think we should fix this for 1.4 release.

hi, Ioannisg

Thank you for letting us know about the problem you hit when using TF-M; this is helpful for all stakeholders of TF-M.

As you know, the secure and non-secure sides are closely related in TF-M. In the latest TF-M v1.3 release, FP support (on either the secure or the non-secure side) is not officially announced, so the problem you reported is not a TF-M bug.
But FP support is already in our plan. I am still working on it, and I expect it should be available later, in the v1.4 release.

For your concern:

The actual problem with clearing the CONTROL.FPCA during Secure thread re-scheduling is the following: if Non-Secure thread may preempt Secure threads and get scheduled-in, we do not have an information that the scheduled-out thread has an active FP context. This is because FPCA is the only non-banked information that we can use. LSPACT is banked, so a Non-Secure thread cannot read LSPACT_S. As a result of no information, the NonSecure RTOS may just generate an FP context effectively corrupting the FP callee-saved registers (S16-S31). This is not related to stacking.

Are you saying that a non-secure scheduled-out thread may lose CONTROL.FPCA after secure PSA calls?

Based on the current design in TF-M v1.3, where FP support is disabled in TF-M:

  1. Secure threads cannot be preempted by non-secure threads. The secure scheduler and the non-secure scheduler are isolated. The secure scheduler in SPM is a passive scheduler which can only be activated by a request from the non-secure side. In other words, while secure threads are running, the non-secure side is "blocked". Secure threads can only be interrupted by exceptions from the secure or non-secure side. Likewise, the secure side is "blocked" while non-secure threads are running.
  2. When a non-secure thread with an FP context makes a PSA call, CONTROL_NS.FPCA is cleared temporarily at SVC exception entry, and this bit is recovered from ~EXC_RETURN[4] after unstacking on SVC exception return; please check PopStack in the Arm®v8-M Architecture Reference Manual. The non-secure side still has the FP indicator flag .FPCA; it is not lost even when interactions happen between secure and non-secure.
  3. Since FP support is not enabled on the secure side, the FP context from the non-secure side is not touched by the secure world.

If FP support is enabled in future, same processing as above.

  1. Then in Non-secure scheduler, EXC_RETURN[4] can be used to check whether current switch-out thread uses FP.
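The relationship referred to in this comment, FPCA being recovered as the inverse of EXC_RETURN[4] on exception return, can be modelled as a small value-level function. This is only an illustrative model of that one step, with a hypothetical function name; it assumes CONTROL.FPCA at bit 2 and EXC_RETURN.FType at bit 4.

```c
#include <stdint.h>

#define CONTROL_FPCA_MASK     (1u << 2) /* FP context active */
#define EXC_RETURN_FTYPE_MASK (1u << 4) /* 0 = extended (FP) frame */

/* Model of the FPCA update on exception return:
 * CONTROL.FPCA takes the inverse of EXC_RETURN[4]. */
static uint32_t control_after_exception_return(uint32_t control,
                                               uint32_t exc_return)
{
    if ((exc_return & EXC_RETURN_FTYPE_MASK) == 0u) {
        return control | CONTROL_FPCA_MASK;  /* FP frame: FPCA becomes 1 */
    }
    return control & ~CONTROL_FPCA_MASK;     /* standard frame: FPCA becomes 0 */
}
```

A Non-Secure scheduler that inspects EXC_RETURN[4] of the thread being switched out is relying on the same bit.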

Hope this could answer your concern.

Hi,

As you know, for secure and non-secure side are closely related in TF-M. In latest TF-M v1.3 release, FP support (in secure side either non-secure side) are not official announced, so the problem you reported is not a TF-M bug.

So, I reported a TF-M system crash that occurs when a Non-Secure application (and only the Non-Secure application) uses the FPU. And as far as I understand, you've reproduced it. And this is because TF-M does not enable the FPU in CPACR. I am fine if you do not want to call it a TF-M bug, as long as it is noted in the "known issues" that TF-M does not currently *fully* support Non-Secure applications that use the FPU. I have not seen such a note anywhere in the documentation (but might have missed it, of course, so if there is such a note, please post the link here for my reference). Without such a note, TF-M gives its users the impression that NS applications may use the FPU freely, which is clearly not the case.

Happy that you are working on FP support for TF-M, although, this is not what we are interested in; what we want is just to allow NS use of the FPU, in builds with TFM. But if your work fixes also our problem, I am still very happy, and looking forward to your progress.

federliangarm closed this task as Resolved. Jun 8 2021, 3:24 AM

Hi @ioannisg,

If you are still interested, we have an ongoing patch for adding an option to enable FPU coprocessors CP10/CP11: https://review.trustedfirmware.org/c/TF-M/trusted-firmware-m/+/15243

Thanks,
Lingkai