You might be interested in the following comparison between both: https://www.adalog.fr/publicat/ASIS-LAL.pdf
I've been questioning the utility of LibAdaLang myself, apart from the usability. TBH, I would rather have some sort of DIANA-like IR which is amenable to DB storage (namely a graph DB), deconstructed structurally, so that the structures themselves can be manipulated and/or queried.
This itself would be for the purpose of tooling, but in particular proving (as SPARK and model-checkers do) and meta-programming; a LOT of code could be shared between the DB engine and the solver(s), because both use "unification" to solve queries, which means that if those components were SPARK-proven, you would get multiplicative results. (The other cool thing about deconstructing things at a structural level is that parallel structures of constructions can be translated between each other; as an example, consider the parallel between OOP classes and generics. (See "Genericity Rehabilitated".) Being able to translate between these, as well as having a library of algorithms and data structures, would, IMO, be a killer feature for development.)
I mean, imagine being able to say something like SELECT * FROM Algorithms WHERE Tag LIKE "%sort%" AND Big_Theta = "N**2"
or similar (giving the SQL its own DIANA-like IR means that we could do all the same tricks to the query itself), and imagine how useful it would be to do searches and transformations (which are provably correct) on those structures.
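As a toy illustration of that kind of query (the schema, table, and rows below are entirely invented for the example), the catalogue can be mocked up with Python's sqlite3:

```python
import sqlite3

# Hypothetical algorithm catalogue; schema and entries are invented.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE Algorithms (Name TEXT, Tag TEXT, Big_Theta TEXT)")
db.executemany(
    "INSERT INTO Algorithms VALUES (?, ?, ?)",
    [
        ("Bubble_Sort",   "sort,comparison", "N**2"),
        ("Merge_Sort",    "sort,comparison", "N*log(N)"),
        ("Linear_Search", "search",          "N"),
    ],
)

# The query from the post, almost verbatim.
rows = db.execute(
    "SELECT Name FROM Algorithms WHERE Tag LIKE '%sort%' AND Big_Theta = 'N**2'"
).fetchall()
print(rows)  # -> [('Bubble_Sort',)]
```

The interesting part, of course, is not SQLite but what the rows would hold: structured IR objects rather than strings.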
I actually doubt that these are so great; IMO, the problems are the dogged insistence on source being unstructured text, the reliance on external configuration (e.g. the file system and permissions; just look at how a bad install leads to so much frustration with GNAT… and look at how painful the bootstrapping process can be), and the general design forcing repetitive work. (To be fair, much of this is also present in other compiler systems; it's just so frustrating that the benefits of Ada aren't fed back into Ada's own environment.)
Given a code-structure IR like DIANA, defined in such a way that it cannot store [syntactically] invalid code, and "folding" defined-to-be-the-same constructs into a single form (though perhaps with an attribute saying which form it was), you both reduce the repetitive work (of parsing) and the possible errors of forgetting to handle a defined-same construct. You also get the benefit of being able to annotate on the IR the work that's already been done, such as SPARK proof or fuzzing or other testing. This is, ultimately, the problem with the UNIX design philosophy and the "everything is text" mindset: it works against correctness and robustness. (e.g. Consider "small tools, pipes, and scripting" and how, at every step, you lose your type information and have to re-parse the outputs-as-inputs ad hoc. This is not a pie-in-the-sky hypothetical: there are security exploits because tool B runs on the output of tool A but assumes that the information in column 3 is a non-negative integer.)
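That loss of type information down a pipe can be sketched in a few lines; the two "tools" and the column meanings here are hypothetical:

```python
# Tool A emits structured rows as flat text; tool B must re-parse them
# and re-assert, ad hoc, an assumption the type system once carried.
rows = [("alice", "login", 3), ("bob", "login", -1)]  # third field: a count

# The text handoff flattens every field to a string.
text = "\n".join(f"{user} {event} {count}" for user, event, count in rows)

def tool_b(line: str) -> int:
    count = int(line.split()[2])  # ad-hoc re-parse of column 3
    assert count >= 0, "tool B assumed column 3 was a non-negative integer"
    return count

parsed, errors = [], []
for line in text.splitlines():
    try:
        parsed.append(tool_b(line))
    except AssertionError as exc:
        errors.append(str(exc))

# The bad row slips past tool A and only fails (or, worse, doesn't) in tool B.
print(parsed, errors)
```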
Could Langkit and LKQL be good technologies, with the implementation/design of these tools simply doing things the wrong way? Sure. …but we're lying to ourselves if we say that the sprawl and bootstrap difficulties of GNAT (and its tools) don't indicate some underlying design interplay. And we're lying if we don't acknowledge that the helper tools' own designs impact the design of the programs built on them.
TL;DR: Hearing about and discussing the state of GNAT, bootstrapping, and/or implementing new features (like parallel) indicates to me some underlying structural/system design issues.
PS: Something like parallel should be EASY to bootstrap, and would be if that aforementioned DIANA-like IR were OOP and the root object had an execute interpretation function on the base/abstract type: the execution task would simply be called on the parallel construct, e.g. one for each section of the parallel block, one for each "chunk" of the parallel for-loop, etc. Then, for "native" code, let the runtime handle it with OpenMP/CUDA/whatever, just like Ada did with the task features, so that executing on single-core vs multi-core is trivial.
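A minimal sketch of that idea, with invented node names, and Python threads standing in for whatever the runtime would really use (OpenMP, CUDA, Ada tasks):

```python
from concurrent.futures import ThreadPoolExecutor

class Node:
    """Root of the hypothetical IR: every construct knows how to execute itself."""
    def execute(self, env):
        raise NotImplementedError

class Assign(Node):
    def __init__(self, name, value):
        self.name, self.value = name, value
    def execute(self, env):
        env[self.name] = self.value

class ParallelBlock(Node):
    """One execution task per section; a native backend could map this to OpenMP etc."""
    def __init__(self, *sections):
        self.sections = sections
    def execute(self, env):
        with ThreadPoolExecutor() as pool:
            list(pool.map(lambda s: s.execute(env), self.sections))

env = {}
ParallelBlock(Assign("x", 1), Assign("y", 2)).execute(env)
print(env["x"], env["y"])  # -> 1 2
```

Note that the single-core case needs nothing special: the same execute calls simply run sequentially.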
From the paper:
LibAdalang
Pros: LibAdalang is able to work on incomplete/incorrect code and provides sophisticated support for the concrete representation of the program, as well as editing and modifying the original text. It processes the latest version of the language.

Cons: As there is no connection to the compiler, there is no guarantee that LibAdalang's view of the program corresponds to the compiler's view or to the Ada standard. An analysis tool cannot rely on the fact that there is no diagnosis to trust that Ada rules are being obeyed; therefore the tool should be run only on programs that have been successfully compiled with a full Ada compiler.

The typing system and the distance between Ada's formal definition and the analysis packages, as well as the lack of a number of useful features, make it less fit for deep analysis of Ada code.

The development of LibAdalang is fully under the control of AdaCore without external review, and of course it is not a standard, nor expected to become one.
Those are pretty big cons, and pretty much confirm my intuition above.
Can you elaborate? Most of the cons in this paper seem either irrelevant or almost deceptive.
Hi, Libadalang team lead here.
Jean-Pierre Rosen is entitled to his opinion on Libadalang. However, please note that we have successfully based many analysis tools on top of it.
Porting GNATcheck to Libadalang was far from a disappointing effort: we implemented an interpreted/JITted query language in the process, LKQL, that allowed us to develop many of the checks that AdaControl had in a pretty short span of time. It also didn't take 5 years; please don't spread misinformation.
One thing that is true is that, for design reasons, the LAL tree is very syntactic. However, you have many queries that allow you to access the semantic part of the code.
But things like syntactic call expressions are not de-sugared, which can be less practical for some applications like code analyzers/static analysis, something that we're aware of.
This is a design trade-off, and allows us to have syntax-focused projects such as pretty-printers/IDEs/etc.
This is something we're aware of, and we have plans to work on a more abstract IR for code-analysis purposes.
As for what @OneWingedShark is saying… It's hard to judge what somebody is saying when he's living in a fantasy realm and not in the real world. Sources as text is the world we live in as professional tool developers. Whether a different structure would be better or not is up for debate, and completely irrelevant if we want to support our existing customers with multi-million-line source-text codebases.
is to remove gnat and gprbuild from Debian, so "apt install" won't work
That's definitely not part of AdaCore's plans, and I can say that with almost absolute certainty…
As Simon said, anyway we plan to keep contributing our code to gcc, and what distro maintainers do with it downstream of that is out of our control. So even if AdaCore wanted to, it couldnât.
- The "lack of connection to the compiler" implies an ad-hoc error-detection system; meaning separate code bases, meaning they can be out of sync;
- The "typing system and the distance" paragraph indicates a separate-from-the-compiler checking of language rules, incurring the same problems as above;
- The "fully under control" paragraph is not necessarily bad, but it is a consideration for anyone who incorporates it as a dependency. (Anyone remember DirectMusic?)
- The "should only be run on programs that have been successfully compiled" point goes back to the observation about designs in my previous post.
IMO, it's much better to have the "hard work" portions of the initial phases of compilation separate and distinct from the analysis portions. (That is to say, tokenization & parsing, at the least, should be done once, in common, with the IR or AST being what is passed to analysis/optimization tools.) This allows you to separate the "downstream" tooling from error detection/correction and the handling of syntactically/semantically incorrect code.
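The shape being argued for ("parse once, analyze many") can be sketched with Python's own ast module standing in for the shared front end:

```python
import ast  # Python's parser stands in for a common Ada front end

source = "def f(x):\n    return x + 1\n"
tree = ast.parse(source)  # lexing and parsing happen exactly once

# Two independent "downstream tools" consume the same tree; neither
# ever sees the source text, so neither can mis-parse it.
def count_functions(tree):
    return sum(isinstance(n, ast.FunctionDef) for n in ast.walk(tree))

def referenced_names(tree):
    return sorted({n.id for n in ast.walk(tree) if isinstance(n, ast.Name)})

print(count_functions(tree), referenced_names(tree))  # -> 1 ['x']
```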
…passing IRs to subsequent compiler phases was done in the past, when computers didn't have a lot of memory. As far as I know, the "small GNAT tools" all force re-reading, re-tokenizing, and re-parsing to run any one of them.
Also, I completely fail to understand why many programmers dismiss the idea of a DB-based IDE "because of existing codebases"… is it because you can't imagine a small external-world interfacing module to import/export? …or is it because "No! Now I can't use vi!!!"?
I feel like this con is more of a red herring. I haven't kept up with ASIS in a long time, but if the paper is to be believed, they abandoned maintaining the standard after Ada95, not updating it for newer versions of the language. AdaCore was already having to add non-standard stuff to its ASIS implementation to keep up with Ada 2005 and beyond, so I don't find that much different from AdaCore providing Libadalang, especially since ASIS is still available for the compilers that stop at Ada95.
That said, if the ASIS standard has been updated past Ada95 and is still being updated, then it can be more of a talking point. Anyone know if it is or not?
Yep.
The con aspect here is more about the varying contexts of external parties and the possible-downsides of being proprietary/non-standardized.
I would be very interested in this, especially if it is done in an open, "anyone can use it" manner… in fact, I proposed such an IR for Alire during its design phase, pointing out the advantages of an IR which is:
- Designed to "fold" defined-as-same constructs to a common representation;
  - Ex: Return 2;, Return Result : Constant Integer := 2;, and Return Result : Constant Integer := 2 do NULL; end return; should all resolve into the same IR-object[-class], because they are LRM-defined to be the same, with the distinctions being attributes indicating: does the return value have a name? if so, what? and are there child statements? (i.e. extended return), and so on.
  - This allows for simpler handling: you only have one IR node to handle, instead of X-many similar constructs, which reduces the "surface area" for bugs or "impedance mismatch".
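A sketch of what such folding might look like; the node type and its attributes are invented for illustration:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ReturnNode:
    """One IR node for every surface spelling of a return statement."""
    value: str                             # the returned expression
    name: Optional[str] = None             # object name, if one was given
    body: Optional[list] = None            # child statements of an extended return

simple   = ReturnNode(value="2")                          # Return 2;
named    = ReturnNode(value="2", name="Result")           # Return Result : Constant Integer := 2;
extended = ReturnNode(value="2", name="Result", body=[])  # ... do NULL; end return;

# A tool handles one node kind, never three surface forms.
def lower(node: ReturnNode) -> str:
    return f"return {node.value}"

assert lower(simple) == lower(named) == lower(extended) == "return 2"
```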
- Designed to be amenable to storage in a DB;
- This allows for efficient querying,
- and efficient storage,
- and likely allows structural transformation, as in "I want this type from a previous project, but as a singleton… ok, I can transform it from a [tagged-]record into a package."
- If the IR is OOP-based, then having an execute interpretation method on the base/abstract type is enough to run the IR, meaning that bootstrapping is reduced to:
  1. getting the compiler to emit this IR, and then
  2. running the interpreter/execute,
  3. writing the native backend,
  4. using #2 & #3 to generate the native-producing compiler.
  - BONUS: This also allows new features, like parallel, to be run interpreted and then "bootstrapped out to native" more easily… perhaps as simply as implementing the feature in the runtime and re-linking.
- If, as with DIANA, you (a) design so that it is impossible for non-valid code to be stored, and (b) implement "folding" of non-semantically-meaningful elements (e.g. spaces vs tabs), you get several good consequences:
  - Reduction of redundant work such as lexing and parsing,
  - Elimination of handling erroneous constructs, and
  - Presentation of these known-good constructs to tools.
Honestly, it was such a disappointment when Alire rejected the idea and basically made itself [at MVP] a glue-code, also-ran, interface-to-GitHub package manager. We could have had a system where it would be impossible to serve out non-compilable [or, at least, unparsable] code and which, eventually, could have the capability to query: is dependency X compatible with my codebase? and to which versions?
I can see that, though to be fair, that means it is only a relative con for Libadalang for Ada83/95 at that point. Anything past that, and both ASIS and Libadalang would share that same con (would it still be considered a con in a comparison if both sides shared the same issue?). Also, after Ada95 it becomes less of a con for the tools and more of a con for using the ASIS standard at all, unless there is a modern standard available?
AFAIK, the standard could be revived/updated (I don't know that it would need a lot of work to get it to Ada 2005; 2012 or 2022 would be more; I suspect 2005 would be really minute, with most of the work for 2012 being aspects; 2022 would be parallel and other extensions), but it seems to be more of a query system/API than an IR… which is fine, and good as a standard for tooling, but really is completely different from the qualities an IR would bring; almost orthogonal, TBH.
Another toolset worth mentioning that uses Libadalang but is not from AdaCore: Renaissance-Ada:
For instance, one of those tools suggests (among other things) Ada 2012 features like "if expressions" in places where they apply.
It also finds patterns where membership tests can be used:
if Code = SP_Open or Code = SP_Create or Code = SP_Append then
becomes:
if Code in SP_Open | SP_Create | SP_Append then
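A toy rewriter in the same spirit (this is NOT Renaissance-Ada's implementation; the function and its string-level matching are invented for illustration):

```python
import re

# Collapse a chain of equality tests against one variable
# into an Ada membership test, or leave the condition alone.
def to_membership(cond: str) -> str:
    parts = [p.strip() for p in cond.split(" or ")]
    matches = [re.fullmatch(r"(\w+)\s*=\s*(\w+)", p) for p in parts]
    if len(matches) < 2 or any(m is None for m in matches):
        return cond  # pattern does not apply
    if len({m.group(1) for m in matches}) != 1:
        return cond  # different left-hand sides; not a membership chain
    lhs = matches[0].group(1)
    return f"{lhs} in " + " | ".join(m.group(2) for m in matches)

print(to_membership("Code = SP_Open or Code = SP_Create or Code = SP_Append"))
# -> Code in SP_Open | SP_Create | SP_Append
```

A real implementation would of course match on the Libadalang tree rather than on text, which is exactly the kind of work such tools do.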