Hi,
I wrote a benchmarking program for sorting algorithms that creates arrays of up to 1024 elements, here of Positive, generated with Ada.Numerics.Discrete_Random.
I’m having the strangest bug so far: it can happen in multiple places and seems to depend on the state of several variables that do not interact. This is absolutely bonkers.
It’s all generic and worked before with String, also up to 1024 elements. But on top of the exceptions mentioned above, I also get a “range check failed” in some places. I wonder whether Random sometimes produces a value very close or equal to Positive'Last, and some algorithm makes a temporary variable exceed that… But I constrained the subtype given to the RNG and nothing changed.
This is as good a time as any to start delving into debugging tools, something I have never done before…
This time, short of sending multiple files, I can’t provide an example because I can’t seem to pin down the problem. These are the usual overflow/memory issues, right? Also, there are no access types anywhere, but there is recursion in some places, though it worked with very long strings before.
The weirdest part is that it panics within the first 1 to 3 iterations, so with very small array sizes!
How do I narrow down what causes things like “raised STORAGE_ERROR : SIGBUS: possible stack overflow”?
Or should I start by allocating more stack space?
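On Linux, one quick way to test the stack-space idea is to raise the process stack limit for a single run. Below is a minimal sketch, assuming a bash- or dash-like shell and an executable named test_all_sorts (a name guessed from the GDB output later in the thread). The 65536 KiB value is arbitrary. As far as I know, this limit only governs the main (environment) task's stack; other tasks get their size from Storage_Size.

```shell
#!/bin/sh
# Show the current soft stack limit (KiB on Linux; may print "unlimited").
current=$(ulimit -s)
echo "current stack limit: $current"

# Assumed executable name; replace with your own.
prog=./test_all_sorts

# Raise the limit inside a subshell so the change does not leak into the
# calling shell. 65536 KiB (64 MiB) is an arbitrary illustrative value;
# the request is refused if it exceeds the hard limit.
if [ -x "$prog" ]; then
    ( ulimit -s 65536 && "$prog" ) || echo "run failed or the limit was refused"
else
    echo "$prog not found; skipping the run"
fi
```

If the crash still shows up at the same tiny array sizes with a much larger limit, stack exhaustion is probably a symptom rather than the cause.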
I would recommend that you compile your code with debugging enabled, with checks enabled, and with all the sanitizing options enabled. Also, compile with -Og so that debugging is simpler.
Then run your program in GDB and tell GDB to catch assertions and catch exceptions (or something like that). With the previous flags, the moment something goes wrong, the debugger will halt the program and tell you exactly which line caused the error. Then I would recommend that you print the values of the variables on that line and check whether anything looks off or odd.
With this method I was able to debug an issue in a 30 kLOC program in 5 minutes, and I had never even looked at the source before!
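As a rough illustration of that advice with GNAT, here is one possible set of switches. It is a sketch, not a definitive list: the file name test_all_sorts.adb is an assumption, and the script only prints the command if gnatmake or the source file is missing.

```shell
#!/bin/sh
# Assumed main unit name; replace with your own.
main=test_all_sorts.adb

# -g            debug information for GDB
# -Og           optimize without hurting the debugging experience
# -gnata        enable assertions (pragma Assert, contracts)
# -gnato        enable overflow checks (already the default in recent GNAT)
# -gnatVa       enable all validity checks
# -gnateE       add extra detail to some exception messages
# -fstack-check raise Storage_Error on stack overflow
flags="-g -Og -gnata -gnato -gnatVa -gnateE -fstack-check"

if command -v gnatmake >/dev/null 2>&1 && [ -f "$main" ]; then
    # $flags is left unquoted on purpose so it splits into separate switches.
    gnatmake $flags "$main"
else
    echo "would run: gnatmake $flags $main"
fi
```

Note that -Og alone does not emit debug information; -g is what lets GDB map addresses back to source lines.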
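The GDB commands meant here are probably `catch exception` and `catch assert`. A minimal sketch of a non-interactive session follows. The executable name and the command-file name gdb_cmds.txt are assumptions, and `start` is issued first because the Ada catchpoints sometimes cannot be set before the program is loaded.

```shell
#!/bin/sh
# Assumed executable name; replace with your own.
prog=./test_all_sorts
cmds=gdb_cmds.txt   # hypothetical name for the GDB command file

# Write the GDB commands: stop at the main procedure, arm the Ada
# catchpoints, continue until something is raised, then show the call
# chain and the local variables of the innermost frame.
cat > "$cmds" <<'EOF'
start
catch exception
catch assert
continue
backtrace
info locals
EOF

if command -v gdb >/dev/null 2>&1 && [ -x "$prog" ]; then
    gdb -batch -x "$cmds" "$prog"
else
    echo "would run: gdb -batch -x $cmds $prog"
fi
```

When the innermost frame is inside the runtime (an s-*.adb file), the backtrace still lists the frames from your own code further down. In an interactive session, `up` or `frame N` moves to them before printing variables.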
If you are using GNAT, you should compile with -fstack-check so that you get the Ada semantics of raising Storage_Error if you attempt to overflow the stack.
Program received signal SIGSEGV, Segmentation fault.
0x000000000041f7f8 in system.secondary_stack.ss_release (m=…) at s-secsta.adb:975
warning: 975 s-secsta.adb: No such file or directory
It doesn’t tell me where the failure starts in my code, though. But others have had the same issue. I read that the GPR attribute Library_Auto_Init can help, but I don’t use project files here, and frankly I don’t want to bother every time I have a bit of recursion or just a few thousand integers. Also, are my switches enough: “gnatmake -gnatX0 -Og -fstack-check”?
Well, I deactivated the secondary stack with a restriction pragma, and it compiles. It still crashes, but gdb no longer mentions s-secsta. Instead of a segfault I get SIGBUS.
Using host libthread_db library “/lib/x86_64-linux-gnu/libthread_db.so.1”.
Catchpoint 4 (signal SIGBUS), 0x0000000000405fe2 in test_all_sorts.repeat_test ()
By uncommenting all the statements, I get a wider range of exceptions, and it also stops at different points depending on the run:
Program received signal SIGSEGV, Segmentation fault.
0x0000000500000020 in ?? ()
(gdb) run
Program received signal SIGSEGV, Segmentation fault.
system.random_numbers.random (gen=…) at s-rannum.adb:186
186 in s-rannum.adb
Are you linking to any foreign code/libraries?
This kinda looks like Spooky Action at a Distance, which can happen if you’re playing with C and do something with a bad pointer.
I have never touched C, and there is no dynamic allocation here. Well, not on my side anyway. Forget it, I can perfectly well believe in a literal ghost in the machine at this point, one bent on messing with me.