I started with SPARK, and I have a comment and a question.
The comment first: It’s really nice how easy it is to set up and use GNATprove in Alire. And I’m positively surprised how small the step is to go from Ada to SPARK and get proofs. Very cool!
Now the question, because of course I don’t know nearly enough about how the provers actually work.
I’m stating a post-condition that cannot be proven, so I started adding assertions to figure out where it fails. I have an array type with indexes from 1 to 2000. I’m looping over such an array, increasing a counter for each element (except the first one) if it satisfies a condition. I want to have a post-condition on the upper bound of the counter in the end and am trying to figure out how to formulate a loop invariant.
When I’m asserting that the counter is <= the input’s length after each iteration, it tells me the assertion might fail. This is too hard, I guess? Or am I missing something obvious? Is there another way to specify such a loop invariant about the upper bound of a variable?
Here is the relevant part of the code:
package A is
Number_Of_Lines : constant Natural := 2000;
-- The number of lines in the input file and thus
-- the number of provided depth measurement values.
type Measurements is array (1 .. Number_Of_Lines) of Natural;
end A;
---
with A; use A;
package body B with
SPARK_Mode => On
is
function Number_Of_Increases (Values : in Measurements) return Natural with
Global => null,
Depends => (Number_Of_Increases'Result => Values),
Post => Number_Of_Increases'Result < Values'Length; -- this is what I try to state
function Number_Of_Increases (Values : in Measurements) return Natural is
Count : Natural := 0;
Previous_Value : Natural := 0;
begin
for I in Values'Range loop
declare
Value : constant Natural := Values (I);
begin
if I > 1 and Value > Previous_Value then
Count := Count + 1;
end if;
Previous_Value := Value;
end;
pragma Assert (Count >= 0);
pragma Assert (Count <= Values'Length); -- assertion might fail
end loop;
return Count;
end Number_Of_Increases;
end B;
Also, is there a resource you would recommend where I can read up on which provers are used (the gnatprove.out file says CVC5) and how they prove different things? I have the impression that most beginner questions are of the kind “Why can’t it prove this or that?”
(I’m currently working my way through the AdaCore tutorial, which I find excellent to get started. but of course it doesn’t go into great detail.)
I think I understand it: GNATprove does not keep track of how variables change from one iteration of the loop to the next. It proves loop invariants by proving that the invariant holds in the first iteration, and if it held after any one iteration, it also holds after the next one. Which for the upper bound doesn’t work - for this you would need to know how often you loop.
What indeed does work is adding an assumption that makes the iterative step provable:
function Number_Of_Increases (Values : in Measurements) return Natural with
Post => Number_Of_Increases'Result <= Values'Length;
function Number_Of_Increases (Values : in Measurements) return Natural is
Count : Natural := 0;
begin
for I in Values'Range loop
pragma Assume (Count < Values'Length);
...
-- in one path, Count is increased by 1,
-- in the other one it is not updated
pragma Loop_Invariant (Count <= Values'Length);
end loop;
return Count;
end Number_Of_Increases;
That makes the post-condition much less helpful, but it makes sense.
If I remove the post-condition, however, I still have an upper bound issue, because GNATprove says:
As a general rule, loop invariants need to involve the loop index, and the proof requires it is true for I = 1st value, and if the invariant is true for I-1, it can be proved it is true for I. So your loop invariant of Count <= I is exactly the right thing for this example, since it can be proved for I = Values’First, and can be proved for I (when I greater than Values’First) assuming it is true for I-1.
Here is my “unprofessional” answer, as I do not work with SPARK or know it well. But GNATProve has different prove values. They indicate to GNATProve how hard it has to try before it gives up. AFAIK, in the highest setting it is able to use CVC5, Z3 and alter-ego to prove the whole thing.
Also, you can enable counter-proves, which will tell you what/how SPARK thinks that your contracts will break. It is a great learning resource!
I’m not sure but in this video assuming it is the right one. Rod talks about how different parts of SparkNaCl are proven by different provers and taking some time to do so. He says something like, so long as everything can be proven by atleast one prover then you are covered. He is going beyond Silver and proving some correctness properties which probably explains why proving takes a fair bit of time.
May I follow up with a question about loop invariants in nested loops?
If I have a two-dimensional array Input and walk over all rows and columns, it seems that an inner loop invariant is enough, if it uses both loop variables? Here is an example of what seems straightforward enough for me, but apparently not for the prover:
Count : Natural := 0;
for Row in Input'Range (1) loop
for Col in Input'Range (2) loop
if Some_Condition_Is_Satisfied then
Count := @ + 1;
end if;
pragma Loop_Invariant (Count <= Row * Col);
end loop;
end loop;
I was getting ok with loop invariants in simple loops, but the nested one throws me off again. So I’d appreciate any hint what to consider or take care of in this case.
(The actual dimension in my example is 140 x 140, although that shouldn’t play a role once it’s too big for unrolling, right?)
Indeed 140 is too big for unrolling (I think the limit is 20? To be sure that unrolling doesn’t come into play, you can use --no-loop-unrolling).
For SPARK, a loop is like a procedure - imagine putting the entire loop body in a separate procedure. To write a loop invariant is like writing a single formula that could be both a precondition and postcondition of the procedure at the same time. Note that you need to “shift” loop indices and the like to make the analogy work.
Generally you need a loop invariant for both loops as both are their own “procedures” if we assume the above analogy. In your case, the loop invariant for the outer loop assumes you have done the previous rows entirely, so it’s something like
Thanks a lot for taking the time; I really appreciate it! It makes sense to have one for the outer loop with Count depending on Row, and one for the inner loop with Count depending on Row and Col. I still don’t get why it thinks “loop invariant might fail in first iteration”, but I’ll play around with it a bit more.