Nice work @Verisimilitude!
I gave your program a spin.
Here is how it went:
I downloaded both the program and the trie library, and tested it with the supplied data/weather_stations.csv file. However, I had to delete the initial comments of that file and bump the number of allowed city names to 40_000, as that file has very few duplicates (it has about 45k entries). I love Ada for how easy it is to debug.
Finding the "comments" issue was easy, and finding that the supplied sample file had more than 10_000 distinct entries was also fairly simple thanks to exceptions and GDB's catch exception and catch assert commands.

The command I used during debugging was: gnatmake -g3 -ggdb -f 1brc.adb -o 1brc_debug
For the optimised binary I used: gnatmake -march=native -O3 -f -gnato0 -gnat2022 -gnatB -gnatp -flto 1brc.adb -o 1brc_optim -largs -s
I could not measure a difference between the two binaries with such a small input. Most of the time is spent printing to the terminal x) My hardware supports AVX-512 (AMD Ryzen 7840HS). However, I found that not many AVX-512 instructions were generated, while AVX2/AVX was used extensively.
I am trying to fix my Java/Maven installation so that I can generate a 1B sample file and test it a bit more.
Thank you!!!
Update:
I managed to get my Java and Maven system going and I generated a 1 billion entry file (13 gigs). I ran the provided baseline program to have a verified result. Here is what I got.
The debug build takes about 2 mins 20 secs. The optimised build takes 1 min 57 secs. The provided baseline program (Java) takes 2 mins 30 secs.
However, the results of your program are not the same as the baseline… I ran a diff to compare the two outputs, and some numbers are clearly different from the baseline (it is not just formatting). Here is an example:
Baseline
{Abha=-32.3/18.0/67.2, Abidjan=-20.6/26.0/76.0, Abéché=-22.2/29.4/76.1, Accra=-30.1/26.4/78.2, Addis Ababa=-31.8/16.0/66.9,
Optimised/Debug
{Abha=-32.3/ 67.2/ 17.9, Abidjan=-20.6/ 76.0/ 26.0, Abéché=-22.2/ 76.1/ 29.4, Accra=-30.1/ 78.2/ 26.4, Addis Ababa=-31.8/ 66.9/ 15.9,
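Apart from the order of the fields being different (your output looks like min/max/mean, while the baseline prints min/mean/max) and some extra whitespace (which I believe comes from Ada's libraries…), the first result is incorrect. The average temp for Abha should be 18.0, but the Ada program says it is 17.9… Addis Ababa shows the same pattern (16.0 in the baseline, 15.9 in yours). Maybe there is some issue with fixed point precision?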
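To illustrate the kind of thing I suspect, here is a minimal sketch in Python (not Ada, and not your actual code). The readings are made up, and I am assuming the baseline rounds the mean to the nearest tenth the way Java's Math.round does (floor(x + 0.5)). If the mean is instead truncated to one decimal, you get exactly this sort of off-by-0.1 difference:

```python
import math

def mean_truncated(total_tenths: int, count: int) -> float:
    # Drop the fractional tenth (round toward zero), as a plain
    # truncating conversion would do.
    return math.trunc(total_tenths / count) / 10

def mean_rounded(total_tenths: int, count: int) -> float:
    # Round to the nearest tenth, ties toward +infinity: floor(x + 0.5).
    # Assumption: this mirrors what the Java baseline does.
    return math.floor(total_tenths / count + 0.5) / 10

# Hypothetical station with three readings: 17.9, 18.0 and 18.0 degrees,
# stored as integer tenths of a degree.
total_tenths = 179 + 180 + 180  # 539
count = 3                       # true mean is 17.9666...

print(mean_truncated(total_tenths, count))  # 17.9
print(mean_rounded(total_tenths, count))    # 18.0
```

This is only a guess at the cause; a fixed point type with too coarse a delta for the running sum or the division could produce the same symptom.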
I hope this helps,
Fer