# [Solved] Improving performance on a vector: help needed

Hello

I’m trying to implement the Ulam Number task on Rosetta Code. I have a working implementation but it is much, much slower than the usual suspects, despite being identical (as far as I can tell).

Investigation suggests this is due to the following loop:

``````for Ith in 1 .. Numbers.Last_Index - 1 loop
Sieve (U + Numbers (Ith)) := Sieve (U + Numbers (Ith)) + 1;
end loop;
``````

At one point, this loop takes 9 seconds on Ada, compared to a fraction of a second for the usual suspects. Here is the comparable code for one of the usual suspects:

``````        for (int i = 0; i < ulen - 1; ++i)
++sieve[u + ulams[i] - 1];
``````

The entire program takes Ada a little more than 9 minutes, while the usual suspect takes 30 seconds. I understand this isn’t semantically equivalent, but a ~20x slowdown seems like a problem.

Some data that may or may not be relevant:

• `Numbers` and `Sieve` are both `Ada.Containers.Vector` types whose elements are `Positive`.
• The compiler tells me I can’t use Ada 2022’s `@` symbol in the assignment because the left hand side evaluates to a reference. That reminds me that vectors aren’t arrays, and makes me suspect a lot of overhead behind the scenes.

Some ways I tried to mitigate this:

• Change the loop to

``````for Number of Numbers loop
Sieve (U + Number) := Sieve (U + Number) + 1;
end loop;
Sieve (U + Numbers.Last_Element) :=
Sieve (U + Numbers.Last_Element) - 1;
``````

That by itself cut the runtime of the one run of the loop from 9 seconds to 5 and confirms my suspicions at least partly.

(As an aside, it would be nice if there were a way to specify that I want a slice of the vector, so that I don’t have to add in that post-loop assignment – and I do have to add that to maintain correctness. We can do that with arrays; is there a way to do it with vectors?)

• Use an array instead of a vector, reassigning the array on each iteration. I’m not even sure that would help, but even if it did, I must be working on this too late at night, because I know I’ve done something like this before, but:

• When I try to overwrite the old `Sieve` with the new one, a runtime error arises, I imagine because the old `Sieve` was initialized with a fixed size different from the new one.
• I tried using a discriminated type where the array’s length was the discriminant, but that gave me the same trouble.

Questions:

• Am I using the vector incorrectly above? Is there a better library function to access and modify an element?
• Can someone point me to an example where a discriminated record is used to create and reassign arrays, they way I’m sure I remember doing once before? or correct my memory by saying that cannot, in fact, be done?
• I tried to initialize an array using the new Ada 2022 iterator approach, something like this:
``````--  Old_Size is the value of Old_Values, of smaller dimension than the one we're generating
[ for Ith in 1 .. Old_Size => Old_Values (Ith), others => 0 ]
``````
…but the compiler complained that I had to use either a dynamic initialization or something else (can’t quite remember what). I tried to look this up in the ARM but didn’t see what it meant.
• Suppose I want to intiialize a vector with a lot of 0’s. That doesn’t seem to be possible? That is, a vector is initialized and resized via `Set_Length`, but there’s no way to say, “I want all the values to be …”?
• Any advice on eliminating the remaining bottleneck?

I haven’t posted the code anywhere online yet, but can if necessary.

Vectors do have a lot of internal checks (less so than other containers though), so you’ll probably need to disable them. I don’t have experience on how to do that, but I know that there is some mechanism for it. May have to look at the GNAT RM to see or take a look at their source code for the vector package and see if it is something obvious.

One thing I would recommend is to try the more obtuse way of Replace_Element and Element operations rather than indexed assignment. Indexed assignment generates a “reference type” which may have finalization happening in the background. Also use the versions that take indexes in rather than cursors (to avoid the extra cursor validity checks).

So `Sieve (U + Number) := Sieve (U + Number) + 1;`

becomes

`Sieve.Replace_Element(U+Number, Sieve.Element(U+Number)+1);`

That could remove a lot of the checks, but you are also duplicating the addition, so you might also consider doing the addition of `U + Number` just once and using a variable for the indexes. Note though, that it all depends on how GNAT optimizes the indexed assignment method, so I don’t know which is faster.

You still have checks for the U+Number potentially, so if you know they will never fail, you can optimize it with `pragma Suppress`

Yeah, you may need to use an indefinite holder to do that or do you own dynamic memory handling. I tend to prefer the indefintie holder since it does all the work for you, but it may be too slow depending on how you are using it.

``````function To_Vector
(New_Item : Element_Type;
Length   : Count_Type) return Vector
with Pre  => Length <= Maximum_Length or else raise Constraint_Error,
Post =>
To_Vector'Result.Length = Length and then
not Tampering_With_Elements_Prohibited (To_Vector'Result)
and then
not Tampering_With_Cursors_Prohibited (To_Vector'Result)
and then
To_Vector'Result.Capacity >= Length;
``````
1 Like

Tampering checks are known to cause some overhead. I would recommend suppressing them:

Replacing the standard vector implementation in the standard library with a vector implementation from the Ada traits containers might give better performance as well:

1 Like

with Ada 2022, you could get a Stable view of the container instead.
Search for “stable” in the latest RM:

1 Like

With performance it’s also best to check your compile settings and try a few profiling tools. Without the code and a profile I can’t suggest any improvements.

I would recommend running with Linux’s perf, or depending on your hardware, VTune or get a flame graph with AMD’s uProf.

1 Like

Thanks to everyone. I had forgotten about suppressing checks. I didn’t even know you could suppress tampering checks. (Talk about something I’d have abused very much in the past…! )

• Suppressing merely the tampering checks got us to 130 seconds.
• Suppressing all checks got us down to roughly 90 seconds.

This was with `-Og`, so I also tried it with `-O2`.

• Suppressing merely the tampering checks got us to 104 seconds.
• Suppressing all checks got us down to roughly 20 seconds.

Nice! I’ll reply inmore detail later.

I don’t know if you blog at all, but it might be a fun article to write about and can show your experience tackling it.

Alas, I have to resize this container on occasion, unless I know its final size in advance, which I don’t. That’s an interesting tool for future work, though.

The tech lead at work uses flame graphs with Rust. I think he uses a package pre-baked for llvm. I wasn’t aware of uProf, and I’ve wanted to look more into this. Thanks.

Then destroy (finalize) the stabilized view before resizing.

I see what you mean. I hadn’t read far enough into it. Thanks; I will explore that.

My initial approach would be to use Iterate over Numbers, and apply Update_Element to the corresponding element of Sieve. You’d probably need some extra effort to avoid processing the last element of Numbers.

Actually, don’t bother… I just tried to construct an example to post here, and it seems the Stable container view is another Ada 2022 feature not (yet?) implemented in GNAT