You might be interested in the following comparison between both: https://www.adalog.fr/publicat/ASIS-LAL.pdf
I've been questioning the utility of LibAdaLang myself, apart from the usability. TBH, I would rather have some sort of DIANA-like IR which is amenable to DB storage (namely a graph DB), deconstructed structurally, so that the structures themselves can be manipulated and/or queried.
This itself would be for the purpose of tooling, but in particular proving (as SPARK and model-checkers do) and meta-programming; a LOT of code could be shared between the DB engine and the solver(s), because both use "unification" to solve queries, which means that if those components were SPARK-proven, you would get multiplicative results. (The other cool thing about deconstructing things at a structural level is that parallel structures of constructions can be translated between each other; as an example, consider the parallel between OOP classes and generics. (See "Genericity Rehabilitated".) Being able to translate between these, as well as having a library of algorithms and data structures, would, IMO, be a killer feature for development.)
I mean, imagine being able to say something like SELECT * FROM Algorithms WHERE Tag LIKE "%sort%" AND Big_Theta = "N**2"
or similar (giving the SQL its own DIANA-like IR means that we could do all the same tricks to the query itself), and imagine how useful it would be to do searches and transformations (which are provably correct) on those structures.
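As a toy illustration of that kind of query (the schema, table, and rows below are entirely invented for the example), the catalogue can be mocked up with Python's sqlite3:

```python
import sqlite3

# Hypothetical algorithm catalogue; schema and entries are invented.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE Algorithms (Name TEXT, Tag TEXT, Big_Theta TEXT)")
db.executemany(
    "INSERT INTO Algorithms VALUES (?, ?, ?)",
    [
        ("Bubble_Sort",   "sort,comparison", "N**2"),
        ("Merge_Sort",    "sort,comparison", "N*log(N)"),
        ("Linear_Search", "search",          "N"),
    ],
)

# The query from the post, almost verbatim.
rows = db.execute(
    "SELECT Name FROM Algorithms WHERE Tag LIKE '%sort%' AND Big_Theta = 'N**2'"
).fetchall()
print(rows)  # -> [('Bubble_Sort',)]
```

The interesting part, of course, is not SQLite but what the rows would hold: structured IR objects rather than strings.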
I actually doubt that these are so great; IMO, the problems are the dogged insistence on source being unstructured text, the reliance on external configuration (e.g. the file system and permissions; just look at how a bad install leads to so much frustration with GNAT… and look at how painful the bootstrapping process can be), and the general design forcing repetitive work. (To be fair, much of this is also present in other compiler systems; it's just so frustrating that the benefits of Ada aren't fed back into Ada's own environment.)
Given a code-structure IR like DIANA, defined in such a way that it cannot store [syntactically] invalid code, and "folding" defined-to-be-the-same constructs into a single form (though perhaps with an attribute saying which form it was), you both reduce the repetitive work (of parsing) and the possible errors of forgetting to handle a defined-same construct. You also get the benefit of being able to annotate on the IR the work that's already been done, such as SPARK proof or fuzzing or other testing. This is, ultimately, the problem with the UNIX design philosophy and the "everything is text" mindset: it works against correctness and robustness. (e.g. Consider "small tools, pipes, and scripting" and how, at every step, you lose your type information and have to re-parse the outputs-as-inputs ad hoc. This is not a pie-in-the-sky hypothetical: there are security exploits because tool B runs on the output of tool A but assumes that the information in column 3 is a non-negative integer.)
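That loss of type information down a pipe can be sketched in a few lines; the two "tools" and the column meanings here are hypothetical:

```python
# Tool A emits structured rows as flat text; tool B must re-parse them
# and re-assert, ad hoc, an assumption the type system once carried.
rows = [("alice", "login", 3), ("bob", "login", -1)]  # third field: a count

# The text handoff flattens every field to a string.
text = "\n".join(f"{user} {event} {count}" for user, event, count in rows)

def tool_b(line: str) -> int:
    count = int(line.split()[2])  # ad-hoc re-parse of column 3
    assert count >= 0, "tool B assumed column 3 was a non-negative integer"
    return count

parsed, errors = [], []
for line in text.splitlines():
    try:
        parsed.append(tool_b(line))
    except AssertionError as exc:
        errors.append(str(exc))

# The bad row slips past tool A and only fails (or, worse, doesn't) in tool B.
print(parsed, errors)
```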
Could Langkit and LKQL be good technologies, with the implementation/design of these tools simply doing things the wrong way? Sure. …but we're lying to ourselves if we say that the sprawl and bootstrap difficulties of GNAT (and its tools) don't indicate some underlying design interplay. And we're lying if we don't acknowledge that the helper tools' own designs impact the design of the programs built on them.
TL;DR: Hearing about and discussing the state of GNAT, bootstrapping, and/or implementing new features (like parallel) indicates to me some underlying structural/system design issues.
PS: Something like parallel should be EASY to bootstrap, and would be if that aforementioned DIANA-like IR were OOP and the root object had an execute interpretation function on the base/abstract type: the execution task would simply be called on the parallel construct, e.g. one for each section of the parallel block, one for each "chunk" of the parallel for-loop, etc. Then, for "native" code, let the runtime handle it with OpenMP/CUDA/whatever, just like Ada did with the task features, so that executing on single-core vs multi-core is trivial.
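A minimal sketch of that idea, with invented node names, and Python threads standing in for whatever the runtime would really use (OpenMP, CUDA, Ada tasks):

```python
from concurrent.futures import ThreadPoolExecutor

class Node:
    """Root of the hypothetical IR: every construct knows how to execute itself."""
    def execute(self, env):
        raise NotImplementedError

class Assign(Node):
    def __init__(self, name, value):
        self.name, self.value = name, value
    def execute(self, env):
        env[self.name] = self.value

class ParallelBlock(Node):
    """One execution task per section; a native backend could map this to OpenMP etc."""
    def __init__(self, *sections):
        self.sections = sections
    def execute(self, env):
        with ThreadPoolExecutor() as pool:
            list(pool.map(lambda s: s.execute(env), self.sections))

env = {}
ParallelBlock(Assign("x", 1), Assign("y", 2)).execute(env)
print(env["x"], env["y"])  # -> 1 2
```

Note that the single-core case needs nothing special: the same execute calls simply run sequentially.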
From the paper:
LibAdalang
Pros: LibAdalang is able to work on incomplete/incorrect code and provides sophisticated support for the concrete representation of the program, as well as editing and modifying the original text. It processes the latest version of the language.

Cons: As there is no connection to the compiler, there is no guarantee that LibAdalang's view of the program corresponds to the compiler's view or to the Ada standard. An analysis tool cannot rely on the fact that there is no diagnosis to trust that Ada rules are being obeyed; therefore the tool should be run only on programs that have been successfully compiled with a full Ada compiler.

The typing system and the distance between Ada's formal definition and the analysis packages, as well as the lack of a number of useful features, make it less fit for deep analysis of Ada code.

The development of LibAdalang is fully under the control of AdaCore without external review, and of course it is not a standard, nor expected to become one.
Those are pretty big cons, and pretty much confirm my intuition above.
Can you elaborate? Most of the cons in this paper seem either irrelevant or almost deceptive.
Hi, Libadalang team lead here.
Jean-Pierre Rosen is entitled to his opinion on Libadalang. However, please note that we have successfully based many analysis tools on top of it.
Porting GNATcheck to Libadalang was far from a disappointing effort: we implemented an interpreted/JITted query language in the process, LKQL, that allowed us to develop many of the checks that AdaControl had in a pretty short span of time. It also didn't take 5 years; please don't spread misinformation.
One thing that is true is that, for design reasons, the LAL tree is very syntactic. However, you have many queries that allow you to access the semantic part of the code.
But things like syntactic call expressions are not de-sugared, which can be less practical for some applications like code analyzers/static analysis, something that we're aware of.
This is a design trade-off, and allows us to have syntax-focused projects such as pretty-printers/IDEs/etc.
This is something we're aware of, and we have plans to work on a more abstract IR for code-analysis purposes.
As for what @OneWingedShark is saying… It's hard to judge what somebody is saying when he's living in a fantasy realm and not in the real world. Sources as text is the world we live in as professional tool developers. Whether a different structure would be better or not is up for debate, and completely irrelevant if we want to support our existing customers with multi-million-line source-text codebases.
is to remove gnat and gprbuild from Debian, so "apt install" won't work
That's definitely not part of AdaCore's plans, and I can say that with almost absolute certainty…
As Simon said, anyway we plan to keep contributing our code to gcc, and what distro maintainers do with it downstream of that is out of our control. So even if AdaCore wanted to, it couldnât.
- The "lack of connection to the compiler" implies an ad-hoc error-detection system; meaning separate code bases, meaning they can be out of sync;
- The "typing system and the distance" paragraph indicates a separate-from-the-compiler checking of language rules, incurring the same problems as above;
- The "fully under control" paragraph is not necessarily bad, but it is a consideration for anyone who incorporates it as a dependency. (Anyone remember DirectMusic?)
- The "should only be run on programs that have been successfully compiled" point goes back to the observation about designs in my previous post.
IMO, it's much better to have the "hard work" portions of the initial phases of compilation separate and distinct from the analysis portions. (That is to say, tokenization & parsing, at the least, should be done once, in common, with the IR or AST being what is passed to analysis/optimization tools.) This allows you to separate the "downstream" tooling from error detection/correction and the handling of syntactically/semantically incorrect code.
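The shape being argued for ("parse once, analyze many") can be sketched with Python's own ast module standing in for the shared front end:

```python
import ast  # Python's parser stands in for a common Ada front end

source = "def f(x):\n    return x + 1\n"
tree = ast.parse(source)  # lexing and parsing happen exactly once

# Two independent "downstream tools" consume the same tree; neither
# ever sees the source text, so neither can mis-parse it.
def count_functions(tree):
    return sum(isinstance(n, ast.FunctionDef) for n in ast.walk(tree))

def referenced_names(tree):
    return sorted({n.id for n in ast.walk(tree) if isinstance(n, ast.Name)})

print(count_functions(tree), referenced_names(tree))  # -> 1 ['x']
```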
…passing IRs to subsequent compiler phases was done in the past, when computers didn't have a lot of memory. As far as I know, the "small GNAT tools" all force re-reading, re-tokenizing, and re-parsing to run any one of them.
Also, I completely fail to understand why many programmers dismiss the idea of a DB-based IDE "because of existing codebases"… is it because you can't imagine a small external-world interfacing module to import/export? …or is it because "No! Now I can't use vi!!!"?
I feel like this con is more of a red herring. I haven't kept up with ASIS in a long time, but if the paper is to be believed, they abandoned maintaining the standard after Ada95, not updating it for newer versions of the language. AdaCore was already having to add non-standard stuff to its ASIS implementation to keep up with Ada 2005 and beyond, so I don't find that much different from AdaCore providing Libadalang, especially since ASIS is still available for the compilers that stop at Ada95.
That said, if the ASIS standard has been updated past Ada95 and is still being updated, then it can be more of a talking point. Anyone know if it is or not?
Yep.
The con aspect here is more about the varying contexts of external parties and the possible-downsides of being proprietary/non-standardized.
I would be very interested in this, especially if it is done in an open, "anyone can use it" manner… in fact, I proposed such an IR for Alire during its design phase, pointing out the advantages of an IR which is:
- Designed to "fold" defined-as-same constructs to a common representation;
  - Ex: Return 2;, Return Result : Constant Integer := 2;, and Return Result : Constant Integer := 2 do NULL; end return; should all resolve into the same IR-object[-class], because they are LRM-defined to be the same, with the distinctions being attributes indicating: does the return value have a name? if so, what? and are there child statements? (i.e. extended return), and so on.
  - This allows for simpler handling: you only have one IR node to handle, instead of X-many similar constructs, which reduces the "surface area" for bugs or "impedance mismatch".
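A sketch of what such folding might look like; the node type and its attributes are invented for illustration:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ReturnNode:
    """One IR node for every surface spelling of a return statement."""
    value: str                             # the returned expression
    name: Optional[str] = None             # object name, if one was given
    body: Optional[list] = None            # child statements of an extended return

simple   = ReturnNode(value="2")                          # Return 2;
named    = ReturnNode(value="2", name="Result")           # Return Result : Constant Integer := 2;
extended = ReturnNode(value="2", name="Result", body=[])  # ... do NULL; end return;

# A tool handles one node kind, never three surface forms.
def lower(node: ReturnNode) -> str:
    return f"return {node.value}"

assert lower(simple) == lower(named) == lower(extended) == "return 2"
```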
- Designed to be amenable to storage in a DB;
- This allows for efficient querying,
- and efficient storage,
- and likely allows structural transformation, as in "I want this type from a previous project, but as a singleton… ok, I can transform it from a [tagged-]record into a package."
- If the IR is OOP-based, then having an execute interpretation method on the base/abstract type is enough to run the IR, meaning that bootstrapping is reduced to:
  1. getting the compiler to emit this IR, and then
  2. running the interpreter/execute,
  3. writing the native backend,
  4. using #2 & #3 to generate the native-producing compiler.
  - BONUS: This also allows new features, like parallel, to be run interpreted and then "bootstrapped out to native" more easily… perhaps as simply as implementing the feature in the runtime and re-linking.
- If, as with DIANA, you (a) design so that it is impossible for non-valid code to be stored, and (b) implement "folding" of non-semantically-meaningful elements (e.g. spaces vs tabs), you get several good consequences:
  - Reduction of redundant work such as lexing and parsing,
  - Elimination of handling erroneous constructs, and
  - Presentation of these known-good constructs to tools.
Honestly, it was such a disappointment when Alire rejected the idea and basically made itself [at MVP] a glue-code, also-ran, interface-to-GitHub package manager. We could have had a system where it would be impossible to serve out non-compilable [or, at least, unparsable] code and which, eventually, could have the capability to query: is dependency X compatible with my codebase? and to which versions?
I can see that, though to be fair, that means it is only a relative con for Libadalang for Ada83/95 at that point. Anything past that, and both ASIS and Libadalang would share that same con (would it still be considered a con in a comparison if both sides shared the same issue?). Also, after Ada95 it becomes less of a con for the tools and more of a con for using the ASIS standard at all, unless there is a modern standard available?
AFAIK, the standard could be revived/updated (I don't know that it would need a lot of work to get it to Ada 2005; 2012 or 2022 would be more; I suspect 2005 would be really minute, with most of the work for 2012 being aspects; 2022 would be parallel and other extensions), but it seems to be more of a query system/API than an IR… which is fine, and good as a standard for tooling, but really is completely different from the qualities an IR would bring; almost orthogonal, TBH.
Another toolset worth mentioning that uses Libadalang but is not from AdaCore: Renaissance-Ada:
For instance, one of those tools suggests (among other things) Ada 2012 features like "if expressions" in places where they apply.
It also finds patterns where membership tests can be used:
if Code = SP_Open or Code = SP_Create or Code = SP_Append then
becomes:
if Code in SP_Open | SP_Create | SP_Append then
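A toy rewriter in the same spirit (this is NOT Renaissance-Ada's implementation; the function and its string-level matching are invented for illustration):

```python
import re

# Collapse a chain of equality tests against one variable
# into an Ada membership test, or leave the condition alone.
def to_membership(cond: str) -> str:
    parts = [p.strip() for p in cond.split(" or ")]
    matches = [re.fullmatch(r"(\w+)\s*=\s*(\w+)", p) for p in parts]
    if len(matches) < 2 or any(m is None for m in matches):
        return cond  # pattern does not apply
    if len({m.group(1) for m in matches}) != 1:
        return cond  # different left-hand sides; not a membership chain
    lhs = matches[0].group(1)
    return f"{lhs} in " + " | ".join(m.group(2) for m in matches)

print(to_membership("Code = SP_Open or Code = SP_Create or Code = SP_Append"))
# -> Code in SP_Open | SP_Create | SP_Append
```

A real implementation would of course match on the Libadalang tree rather than on text, which is exactly the kind of work such tools do.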