First time hearing of TDFAs. This seems like a really interesting project! I would love to help/contribute, but I do not know enough about TDFA theory and time is already kinda short.
The papers she references seem easy enough to learn TDFA from. I’ll try to fit in the time to learn a bit about it. Seems really interesting! I’ll put this project in my ever-expanding “todo” list.
Hi! The author of the program here. This program was created mostly as a prototype implementation of the TDFA algorithms to help me understand them. I had a need for creating my own portable TDFA-based lexer generator, but I first wanted to create a prototype so I could figure out how it actually works.
I write most of my portable software in C89, as I like supporting old or unusual computers and operating systems (VM/CMS, MVS, VMS, old unices, old NT, etc.). I did not want to create the prototype in C though, as C is hard to use and hard to read, and Ada’s large standard library allows for much quicker prototyping. It also means I have to debug fewer memory errors and can just focus on the actual bugs in the algorithms. I published this prototype so that anyone else learning about TDFA can have an easier-to-read implementation, as the main implementation (RE2C) doesn’t have the most readable source code. Reading the source code of an actual implementation is necessary to fully understand the algorithms, as the 2022 paper often leaves out details or is occasionally just incorrect (the author has been very helpful in correcting these issues though!). This is also the reason I chose a public-domain-equivalent license: anyone can use it for anything without fear of licensing issues.
I don’t really plan on “supporting” the program much; I’ll correct any mistakes I discover, but I will likely not add any new features or anything. For me, it mostly serves as a blueprint that I can rewrite in C, and turn into an actual lexer generator. I do wish that I could just use Ada (or really anything other than C) everywhere, but unfortunately C is what ended up being the lingua franca, and makes it possible to port my programs almost anywhere.
Overall, I am glad I chose Ada for the prototype, but I would probably not choose Ada 2022 specifically again; I found roughly 6 different GNAT compiler bugs (ranging from crashes to miscompilations) that I had to fix or work around. I’ve reported some of these bugs, and I even submitted a patch for one of them. Some of them I haven’t reported yet because I haven’t prepared a reproducer and a proper report, and I’m not sure if it’s a good idea to bombard the GNAT bug tracker with 4 different bug reports all at once. Some of them are also pretty benign and only happen with horribly incorrect code (e.g. swapped arguments for 'Reduce). A good chunk of these bugs were related to Ada 2022 features, but some were Ada 2012 ones; I really wonder how some of them weren’t found earlier. I guess most of AdaCore’s customers support legacy Ada 83 or 95 codebases?
Although it’s called Ada 2022, it wasn’t finalized until mid-2023. It’s possible a lot of your bugs have already been reported; I’ve reported a few bugs myself, and as I recall it has taken the GNAT maintainers a while to fix them. (That’s not a criticism, BTW; just an observation.)
Some new language features are still not even preliminarily implemented in GNAT. As a simple example, I was reading the Barnes book a couple of weeks ago and noticed that Ada 2022 now lets you end a record declaration with end record Type_Name; instead of just end record;. I was really excited about that (simple minds, simple pleasures, what can I say) and went to try it out immediately, only to discover that GNAT rejects that, even as of 15.2.1. I haven’t reported it because I figure they’re working on more important things, like last summer’s work implementing the new parallel features.
The ones I reported (Bug 123138, Bug 123589) hadn’t been found yet, but I haven’t checked the other ones (one was a weird ICE when using a specific overloaded function call in a selected component in an iterated component association, and other weird stuff like that). Given how specific they usually are, I’m guessing they haven’t been found and I just keep stumbling upon weird bugs somehow. I’ll probably report them sometime soon.
As a former AdaCore customer, I can confirm: yes, we used Ada 2005. As an Ada developer, I dislike most of the newer stuff and find the language evolving in the wrong direction.