Bug(s) in my Ada program for Number Sequence (OEIS:A033919 - Odd k for which k+2^m is composite for all m < k) Rosetta Code

dmitry-kazakov · November 3, 2024, 7:22pm

It is still Ada.
Anyway, here is an implementation based on 10**39 base (128-bit unsigned). It does not deal with signed integers as you do not need them:

   function Value (S : String) return Valid_Big_Integer is
      package Conversions is new Unsigned_Conversions (Unsigned_128);
      use Conversions;
      Base   : constant Valid_Big_Integer :=
                        Conversions.To_Big_Integer (10**39);
      Result : Valid_Big_Integer;
      Chunk  : Unsigned_128 := 0;
      Length : Natural := 0;
      First  : Boolean := True;
   begin
      if S'Length = 0 then
         raise Constraint_Error;
      end if;     
      for Index in S'Range loop
         case S (Index) is
            when '0'..'9' =>
               Chunk := Chunk * 10
                      + (  Character'Pos (S (Index))
                        -  Character'Pos ('0')
                        );
            when '_' =>
               if Index in S'First | S'Last  or else S (Index - 1) = '_'
               then
                  raise Constraint_Error;
               end if;
            when others =>
               raise Constraint_Error;
         end case;
         if Length < 38 then
            Length := Length + 1;
         else
            if First then
               Result := To_Big_Integer (Chunk);
               First  := False;
            else
               Result := Result * Base + To_Big_Integer (Chunk);
            end if;
            Chunk  := 0;
            Length := 0;
         end if;
      end loop;
      if Length > 0 then
         if First then
            Result := To_Big_Integer (Chunk);
         else
            Result := Result
                    * To_Big_Integer (Unsigned_128'(10) ** Length)
                    + To_Big_Integer (Chunk);
         end if;
      end if;
      return Result;
   end Value;

Irvise · November 3, 2024, 10:44pm

Thank you a lot @dmitry-kazakov!

I think I will need a few more digits though. I have left the PC running fro a few hours and the factor for 50216813883093446110686315385661331328818843555712276103941 (I have counted 59 digits here) is greater than 4_136_965_712_881, so at least the factor would fit in there, but not the actual number…

I think it would be a nice exercise for me to take your code and expand it to 256 or 512 bits though

Best,
Fer

dmitry-kazakov · November 4, 2024, 8:28am

It is easy if you have Unsigned_256 or Unsigned_512. You need to change 39 and 38=39-1 to the number of decimal places fitting into 256 or 512 respectively.
BTW, I miscalculated the number for 128-bit. So change 39 to 38 and 38 to 37!

simonjwright · November 4, 2024, 1:56pm

I made a version using GNATCOLL_GMP and trying hard to avoid finalizations (using procedure interfaces rather than functions).

Source here - tidy, but casing could be better (ada-ts-mode fumbling).

To get wedged at k=773 takes 2.5s with this, compared to 9.5s with the original Big_Integer implementation. Still not going to give us a result like Go’s 2s for 5 results!

dmitry-kazakov · November 4, 2024, 4:06pm

I cannot tell about GNATCOLL but the implementation of Big_Integer under Windows uses 32-bit words as “digits” and 64-bit as double digits (needed for multiplication). It could be 64 and 128 as 128-bit arithmetic is available.

The the major design issue is lack of in-place operations and Big_Integer x “digit” in-place operations, e.g.

procedure Add (L : in out Big_Integer; R : Unsigned_64);

I do not understand why the standard library does not have them, because in most algorithms you could take advantage of them.

dmitry-kazakov · November 4, 2024, 8:14pm

There is an alternative implementation of big numbers in Ada here:

Maybe it is worth to check too?
[edited, I botched the link]

JC001 · November 5, 2024, 10:22am

There is also PragmARC.Unbounded_Numbers.*.

dmitry-kazakov · November 5, 2024, 10:51am

Yes, I remembered it, but then could not find big numbers there. PragmARC is an excellent library, but it is difficult to navigate. A searchable index and topic list would help greatly.

Anyway, it seems that PragmARC uses Vector for an implementation and has no in-place big-to-digit operations.

So far there are two approaches:

An array of “digits” (unsigned words)
A vector or other dynamically allocated container of “digits”

The second, of course, has a considerable penalty, the first either has an upper limit or becomes inefficient for very large numbers.

It would be very interesting to see a comparison of existing Ada big numbers libraries.

Ret_Build_Engineer · November 5, 2024, 10:37pm

Hi Jeff, Fer, Simon, Dmitry;

I have re-implemented my initial program as version #2 using PragmArc.Unbounded_Numbers and I am having the same problem.

I’m looking closer at some of the other improvements Fer made to my code and I’m experimenting with them…

I still wonder if there is one or more bugs in my implementation (translation) of the Python Rosetta Code solution…

Thanks for all the attention to this issue!

R. B. E.

Irvise · November 5, 2024, 10:48pm

Well… I just spent the good part of the evening doing just that You can see the results here! GitHub - Irvise/Ada_Unbounded_Perf_Comparison: Performance comparison of some Ada implementations of Big_Num
Just do a alr build and you will have the binaries under ./bin I did not try the crypto library as it is not available in Alire.

From the tests I have done (very basic and stupid) Pragmarc is twice slower than the Ada.Big_Num implementation, which it itself is about 2 to 3 times slower than GMPs implementation.

Cheers you all!

Ret_Build_Engineer · November 5, 2024, 11:46pm

Hi Fer;

I think I misled you by my use of the Max_Sequence_Number. It is meant to be the position in the sequence, not the maximum value that can be generated. So I should have used a different identifier to not have confused you. I don’t think that this has a bearing on the problem at hand, but it is important.

So, the maximum would be 5 (773 2131 2491 4471 5101), not 800.

I still think that there must be a bug in my code as it should have printed 773 as the first number in the sequence. Either that or there is even a more basic bug in the big number library, where numeric comparasins are broken.

Again,
Thanks for all the interesting insights and comments…

R. B. E.

dmitry-kazakov · November 6, 2024, 11:24am

How do you explain the results? GMP is actually written is Assembly!

simonjwright · November 6, 2024, 11:47am

Maybe it’s using a generic build rather than one optimised for the processor? Really, I have no idea

dmitry-kazakov · November 6, 2024, 12:04pm

It looks that there is none. However GMP uses heap and GNATColl binding use controlled wrappers. These might be primary candidates for a slowdown, but Go cannot do it without using a heap either. There must be some other reason.