Hi,
Using GNAT v11.2/0-4 for ARM (Alire), the following function Unbiased_Rounding for Float works as expected.
--  Needs: with System.Machine_Code; use System.Machine_Code;
function Unbiased_Rounding (X : Float) return Float is
   Y : Float;
begin
   Asm ("vrintn.f32 %0,%1",
        Outputs => Float'Asm_Output ("=t", Y),
        Inputs  => Float'Asm_Input ("t", X));
   return Y;
end Unbiased_Rounding;
According to Machine Constraints (Using the GNU Compiler Collection (GCC)),
the constraint t means "VFP floating-point registers s0-s31. Used for 32 bit values" and the constraint w means "VFP floating-point registers d0-d31 and the appropriate subset d0-d15 based on command line options. Used for 64 bit values only".
Therefore we wrote our Long_Float version as
function Unbiased_Rounding (X : Long_Float) return Long_Float is
   Y : Long_Float;
begin
   Asm ("vrintn.f64 %0,%1",
        Outputs => Long_Float'Asm_Output ("=w", Y),
        Inputs  => Long_Float'Asm_Input ("w", X));
   return Y;
end Unbiased_Rounding;
However, this fails to compile.
GNAT 11.2/0-4 (Alire) complains:
Error: invalid instruction shape -- `vrintn.f64 s14,s14'
presumably because the operands are S registers rather than double-precision D registers.
Is this a bug or have we misunderstood something?
Nice idea, but our Long_Float is not Float. We can use Long_Float just fine: GNAT produces the expected code. Our only problem is when we try to use the built-in assembler to efficiently implement the attribute Unbiased_Rounding with the ARM FPv5 VRINTN instruction.
I am neither an Ada nor an ARM assembly expert. But following your explanation, it seems that the compiler is using the "s" (single-precision) registers for your Long_Float values. Since you did not post the definition of Long_Float, I assume the compiler noticed that whatever was defined could fit in 32 bits.
In this case I would recommend that you try telling the compiler to explicitly use 64-bit sizes/logic for the Long_Float type. The way one would go about doing it is with the Size aspect. Here is a mockup:
subtype My_Long_Float is Long_Float range -1000.0 .. 1000.0
with Size => 64;
in this case
Long_Float'Size = 64; -- True
Also, the standard indicates that the compiler has to support Long_Float if the machine has a precision of 11 or more digits (see Floating Point Types). Are you compiling against a generic 32-bit ARM machine? If you are, maybe GCC/GNAT is treating Long_Float as Float and does not generate the code you want.
Dear Fer,
When compiling, we specify the GCC switch -d16 to inform the compiler that our target has 16 D (double-precision) registers. The Ada code that we write using Long_Float generates machine code that uses 64-bit D registers, as expected.
The only problem we have is when we try to use the built-in assembler.
Best wishes,
Ahlan
Oops, indeed, "should support" is not a requirement.
Sadly, I am not knowledgeable enough to help you with that then… Could you try a different compiler version, just to make sure that it is not a bug in v11?
The solution is to use %P to access operands constrained with "w".
Try as I might, I can't find this wonderful secret documented anywhere.
I stumbled on the solution in the NXP forum, where jingpan replied to a question on how to use the ARM VSQRT instruction for double.
When using the inline assembler from C with named parameters, you need to access parameters constrained by "w" (i.e. D registers) using %P[name] rather than %[name] as everywhere else.
With positional parameters, one needs to use %Pn rather than %n.
And yes, it must be a capital P.
I fail to understand why one needs to do this because surely the assembler already knows that the parameter has been constrained to a D register - but I guess this is just an additional quirk to an already very quirky assembler.
My GNAT Ada code to implement the Unbiased_Rounding attribute efficiently using the VRINTN instruction is therefore
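--  Needs: with System.Machine_Code; use System.Machine_Code;
subtype T is Long_Float;
function Unbiased_Rounding (X : T) return T is
   Y : T;
begin
   --  %P0/%P1 make GCC print the D-register names for the "w" operands
   Asm ("vrintn.f64 %P0,%P1",
        Outputs => T'Asm_Output ("=w", Y),
        Inputs  => T'Asm_Input ("w", X));
   return Y;
end Unbiased_Rounding;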
Of course, we wouldn't have to resort to assembler at all had there been a built-in intrinsic for VRINTN, as there is for all the other VRINT instructions. But I guess that is hoping for too much.
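For anyone wanting to sanity-check such an assembler wrapper, here is a minimal sketch of a portable reference test built on the language-defined attribute Long_Float'Unbiased_Rounding, which rounds to the nearest integer with ties going to the even value. The names Check_Unbiased_Rounding, Reference, Sample and Samples are hypothetical and only for illustration. On the ARM target one could, presumably, swap the body of Reference for a call to the Asm-based function above and expect the same results.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Check_Unbiased_Rounding is
   subtype T is Long_Float;

   --  Portable reference: the language-defined attribute
   --  (round to nearest, ties to even). Hypothetical helper name.
   function Reference (X : T) return T is (T'Unbiased_Rounding (X));

   type Sample is record
      Input, Expected : T;
   end record;

   --  Halfway cases go to the even neighbour; other values go to the
   --  nearest integer.
   Samples : constant array (Positive range <>) of Sample :=
     ((0.5, 0.0), (1.5, 2.0), (2.5, 2.0), (3.5, 4.0),
      (-0.5, 0.0), (-1.5, -2.0), (-2.5, -2.0),
      (2.4, 2.0), (2.6, 3.0));
begin
   for S of Samples loop
      --  Only checked when assertions are enabled (e.g. -gnata)
      pragma Assert (Reference (S.Input) = S.Expected);
      Put_Line (T'Image (S.Input) & " ->" & T'Image (Reference (S.Input)));
   end loop;
end Check_Unbiased_Rounding;
```

Note that pragma Assert is ignored by GNAT unless assertions are enabled, for example with the -gnata switch.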