Anyone interested in resurrecting an open source strict Ada 83 (mil-std 1815A) compiler?

VMo · August 7, 2024, 10:49am

Hello !
I am not very competent in licensing matters; some people here will probably have better suggestions than mine.
The archive.adaic link you mentioned does indeed seem to point to the original source code. I have a backup of the original William Easton / Peregrine work, and I carefully keep the Walnut Creek CDs from which I extracted the original source.

Until now I have never published anything about the system I derived from Peregrine’s work, so I did not care much about licence notices.
Of course, if this work on an Ada 83 compiler derived from Peregrine’s DIANA system is made public, due mention of origin and licensing terms will be included.

The original work has been transformed a lot, and if the original licensing terms from Peregrine allow it, it seems to me that a GPL licence would be appropriate for the present derived work, so that the source code of further derivations has to be published.
Comments from every competent participant are welcome on this issue.

As for coding style, I have an open mind, and I think the GNAT style-checking approach is excessive. My own style has changed with age, and I am aware that there is more than one way to write an understandable system.
My advice would be to avoid code that is too densely written; that is why I tend to use tabs of size 10 to spread syntax elements over a 100-character line width.
Package specifications benefit from compact vertical writing because it facilitates conceptual and structural comprehension. Bodies pose more of a problem of algorithmic grasp, which benefits from more vertically spread writing.
Purely left-based indentation is not always the best idea for loop or block labels; centering those labels sometimes draws attention to them more effectively.
Comments should not be systematic. For a time I filled lines with comments from column 120 to 250, before I had to acknowledge that I mostly did not read them when I came back to the source code later.

Coding style is not everything: the choice of clear-cut concepts and good names for them also affects comprehension.

The ultimate criterion is how understandable and maintainable the system is. A good test is: write a piece of software, set it aside for several months or years, then work on it again. If you understand it quickly and get back into the software easily, it was well done. Everything that poses comprehension problems should probably have been done differently. Of course this test is problematic from an engineering perspective, where time is often critical, but in an academic career such things can be practised and give some information on what is beneficial and what is not.

As with the licence question, coding advice from experienced people is welcome. If some kind of coding standardization is recognized as a real benefit, why not, but I myself tend to be rather liberal on this coding style issue.

VMo · August 7, 2024, 11:17am

Some considerations on natural-language matters. Every sensible person will recognize that English is the vehicular language, especially in computer science. Nonetheless, it should not be neglected that, when possible, an expression in one’s mother tongue is more easily understood. To gain an in-depth understanding of LRM 1815A, I translated chapters of it into my mother tongue. I think it would be a good thing to split the documentation by language and keep each version when translations are available.

As long as I was alone, commenting in my own language with some bizarre experimental code layout trials posed no problem. It is clear that if other people are interested in participating, I’ll come back to English comments and respect some agreed conventions.
I had a tendency to over-comment, with the idea of printing source from column 1 to 120 on one page and comments up to column 250 on another. It was not a very good idea. If source code is to be managed collectively, it would probably be better to remove comments completely and have separate descriptive documentation for packages and subprograms. The problem is that it would be nice to have some integration of the two sides, source code and multilingual documentation.

If somebody has practical experience of efficient software documentation and software structure representation (structural package graphs…), advice is welcome here too.

OneWingedShark · August 26, 2024, 4:36am

Thank you for looking at Byron.
You are right about it getting stalled on parsing: a combination of a bout of depression and “life happening” afterwards sapped a lot of motivation… and, TBH, since then the question of “Will there be any payoff?” has haunted me.

@Lucretia is right that I wanted to use a DIANA-inspired (though OOP) IR, in part because the idea of DIANA really appeals to my design sensibilities, and in part because it could be a foundational technology providing a huge leap in quality over other programming-language utilities. Example: the Alire management system, had it used such an IR, would be able to (a) reject non-parsable code, keeping the repository clean and preventing “Dave broke the build” commits, and (b) use a DB not only to store, but to version, and possibly even flag/prohibit incompatible dependency sets.

Anyway, my main plan would still work for you; all we would have to do is:

1. Write a DIANA to SeedForth (a tokenized, minimal Forth) backend;
2. Write a SeedForth interpreter;
3. Translate the compiler source (with the backend from #1), yielding SF/Ada;
4. Translate the interpreter of #2, yielding SF/SF;
5. Implement the SeedForth words (about 30) for whatever new processor;
6. Combining #4 & #5 yields a SeedForth interpreter on the new processor;
7. Using #6, we can then interpret #3 (SF/Ada) and compile on the new HW;
8. If native code is needed, a new backend can be done;
9. Combining #7 & #8 yields a native-code compiler on that HW.

Technically, at #6, you could just be satisfied with that: using the SeedForth as a bytecode and the interpreter as a VM; but for a real Ada experience you really want some of the other stuff, the idea that the APSE kind of pointed to — if the system is implemented within/upon a database, then the “Library” becomes merely a subsystem within the DB… and you could also store things like “recipes”, algorithms, and even do version-control workspaces as outlined in Workspaces and Experimental Databases: Automated Support for Software Maintenance and Evolution.
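To make the "bytecode plus small interpreter" idea of step #6 a bit more concrete, here is a minimal sketch of a tokenized stack interpreter in Python. The token numbers and the four primitive words are invented for illustration and are not SeedForth's actual token set; only the overall shape (a small table of primitives that has to be re-implemented per processor) is taken from the plan above.

```python
# Hypothetical token numbers for illustration only; NOT SeedForth's encoding.
LIT, ADD, MUL, HALT = 0, 1, 2, 3


def run(tokens):
    """Interpret a flat list of tokens and return the final data stack."""
    stack = []
    pc = 0
    while pc < len(tokens):
        tok = tokens[pc]
        pc += 1
        if tok == LIT:
            # The cell following LIT is an inline literal.
            stack.append(tokens[pc])
            pc += 1
        elif tok == ADD:
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif tok == MUL:
            right = stack.pop()
            left = stack.pop()
            stack.append(left * right)
        elif tok == HALT:
            break
        else:
            raise ValueError(f"unknown token {tok}")
    return stack


# (2 + 3) * 4 expressed as a token stream.
program = [LIT, 2, LIT, 3, ADD, LIT, 4, MUL, HALT]
result = run(program)
print(result)
```

Porting such an interpreter to a new processor only means re-implementing the handful of primitive branches, which is presumably why the count of about 30 words in step 5 matters.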

The benefits of a DB-based IDE are plenty, especially because you can share components of the DB-engine and static-analysis and proof-systems, all three — this RFI response details how such a system (with VHDL and PL/SQL) could be created and used to good effect.

VMo · August 27, 2024, 10:19am

Hello,

Thanks for your answer. As I said, the Ada83 DIANA front end, derived from Peregrine and extensively modified, is available and translates itself, which is already an achievement because the translator itself is complicated.
So, apart from verification and ease of use, I consider the front-end question mainly solved for Ada83, which is a complicated and useful enough language.

On the back-end question I oscillated between a three-address IR for optimization and a stack IR as used by Polish students; I also thought about Forth and various other possibilities such as Qube. But I finally obtained my first executable a few days ago by producing stack machine fasmg macro code. The approach is defensible even if it produces a lot of source text, because it allows some direct debugging by reading.

fasmg is an assembly engine which produces an output byte stream from a sophisticated preprocessed/interpreted/compiled macro language. I think it is a better intermediate target than producing C, for example, and a stack IR can be translated immediately to target machine code through fasm macros. Even without optimization, some acceptable x86 code can be obtained. This path is better than virtual machines, which pose problems with interpreter/hardware decoupling (branch prediction desync, for example).

As a simple example, here is my first executable running without a segfault (!), from test.adb:

PROCEDURE TEST IS
  I	: INTEGER := 0;
  J	: INTEGER := 1;
BEGIN
  I := J + 1;
END;

is compiled by the front end to DIANA, and the DIANA is then converted to fasmg macro text (TEST.DCL.COD file):

namespace TEST
elab:
  virtual at 8
    LOC::
  end virtual
	LINK 1,	loc_siz

  virtual LOC
    align_w
    I_disp = $
    dw ?
  end virtual
	LDI	 0
	STw  1,	I_disp

  virtual LOC
    align_w
    J_disp = $
    dw ?
  end virtual
	LDI	 1
	STw  1,	J_disp

  virtual LOC
    loc_siz = $
  end virtual
begin:
	LDw  1,	J_disp
	LDI	 1
	ADD
	STw  1,	I_disp
	UNLINK 1
	RTD
end namespace

This text is included in a fasmg “launch file”, which first includes the stack machine macros for LDw, ADD, and so on, as well as a simple ELF64 header/program header.

include '../../src/code_gen/fasmg/codi_x86_64.fas'

	CALL	TEST.elab
	SYSEXIT

include 'TEST.DCL.COD'

Assembling the “launch file” with fasmg then produces a directly working x86-64 ELF64 executable (shown without the 120-byte header):

0000000000000078 <.data+0x78>:
  78:   e8 07 00 00 00          call   0x84
  7d:   6a 3c                   push   $0x3c
  7f:   58                      pop    %rax
  80:   31 ff                   xor    %edi,%edi
  82:   0f 05                   syscall
  84:   41 51                   push   %r9
  86:   49 89 e1                mov    %rsp,%r9
  89:   48 83 ec 0c             sub    $0xc,%rsp
  8d:   50                      push   %rax
  8e:   6a 00                   push   $0x0
  90:   58                      pop    %rax
  91:   66 41 89 41 f8          mov    %ax,-0x8(%r9)
  96:   58                      pop    %rax
  97:   50                      push   %rax
  98:   6a 01                   push   $0x1
  9a:   58                      pop    %rax
  9b:   66 41 89 41 f6          mov    %ax,-0xa(%r9)
  a0:   58                      pop    %rax
  a1:   50                      push   %rax
  a2:   66 41 8b 41 f6          mov    -0xa(%r9),%ax
  a7:   50                      push   %rax
  a8:   6a 01                   push   $0x1
  aa:   58                      pop    %rax
  ab:   5b                      pop    %rbx
  ac:   01 d8                   add    %ebx,%eax
  ae:   66 41 89 41 f8          mov    %ax,-0x8(%r9)
  b3:   58                      pop    %rax
  b4:   4c 89 cc                mov    %r9,%rsp
  b7:   41 59                   pop    %r9
  b9:   c3                      ret
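The displacements in this listing can be checked by hand. Below is a minimal Python sketch of what the `virtual LOC` blocks appear to compute. It assumes that `align_w` aligns to 2 bytes and that `dw ?` reserves 2 bytes; both are inferred from the 16-bit `mov %ax` stores in the disassembly, not from the macro definitions, which are not shown here. The function name `layout` is hypothetical.

```python
def layout(names, base=8, word=2):
    """Assign frame displacements the way the virtual LOC blocks seem to.

    base mirrors 'virtual at 8'; word is the assumed size of one 'dw' slot.
    """
    offset = base
    disp = {}
    for name in names:
        # align_w: round the current offset up to a multiple of the word size
        offset = (offset + word - 1) // word * word
        # X_disp = $ : record the current offset for this variable
        disp[name] = offset
        # dw ? : reserve one word
        offset += word
    # The final offset plays the role of loc_siz.
    return disp, offset


disp, loc_siz = layout(["I", "J"])
print(disp, loc_siz)
```

Under these assumptions I lands at 8, J at 10 and loc_siz is 12, which agrees with `-0x8(%r9)`, `-0xa(%r9)` and `sub $0xc,%rsp` in the disassembly.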

It is still a very imperfect work in progress, but it works.
The next objective is to modify the target stack management (using rax as the top-of-stack register interferes with calls) and to produce the hello executable from:

with TEXT_IO;
use  TEXT_IO;
PROCEDURE TEST_1 IS
  MSG	:constant STRING	:= "Bonjour";
BEGIN
  PUT_LINE( MSG );
END;

Though simple, it implies using actual TEXT_IO code, which is somewhat more complex than just a PUT_STR macro parametrizing a write syscall.

VMo · September 6, 2024, 5:44pm

Hello to all !

Just a short post to say that the test_1.adb “hello world” (Bonjour) program has been compiled and produced a viable x86-64 executable.

A Machine_Code package has been added and enabled in the Ada83/DIANA translator.

Then a Text_IO package skeleton with a Put_Line procedure (syscall through machine code insertions) was correctly compiled to fasmg macro engine text, TEXT_IO.FINC (a fasmg include file).
The Test_1 procedure calling Text_IO.Put_Line was compiled to TEST_1.FINC.

Finally, the fasmg assembler took the launcher file test_1.fas, which includes TEST_1.FINC, itself including TEXT_IO.FINC, and produced an ELF64 executable of 257 bytes (120 bytes of ELF header, 122 bytes of code and 15 bytes of Ada string constant) that works as expected.

Some interesting characteristics:

- No linker is necessary: the assembler engine, with include files and 4 passes, resolves the labels and does the section work.
- A trick with conditional assembly based on used identifiers eliminates the code of unused package procedures.
- Elaboration is done by straight flow from the ELF entry point into the elaboration of the Ada program procedure and then the execution of its body.
- The Machine_Code package strictly conforms to LRM 13.8 (record aggregates, no GNU-like oddities).

So it seems a “short path” from Ada83/DIANA to target code via fasmg macro engine intermediate code is really practicable.
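The effect of the conditional-assembly trick can be illustrated outside fasmg. This is a rough Python sketch, not the actual macro code: it keeps only the procedures reachable from the entry point, which is the result the identifier-based conditional assembly is described as achieving. The call graph and the extra Text_IO procedure names are made up for the example.

```python
def used_procedures(calls, entry):
    """Return the set of procedures reachable from the entry procedure."""
    used = set()
    todo = [entry]
    while todo:
        name = todo.pop()
        if name in used:
            continue
        used.add(name)
        # Every procedure referenced by a kept procedure must be kept too.
        todo.extend(calls.get(name, ()))
    return used


# Hypothetical call graph: TEST_1 only references PUT_LINE.
calls = {
    "TEST_1": ["TEXT_IO.PUT_LINE"],
    "TEXT_IO.PUT_LINE": [],
    "TEXT_IO.PUT": [],
    "TEXT_IO.NEW_LINE": [],
}
emitted = sorted(used_procedures(calls, "TEST_1"))
print(emitted)
```

Only the procedures in `emitted` would end up in the executable; the other Text_IO bodies are skipped, which is how the hello program can stay so small even though it includes a whole package.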

Next step: compiling some simple useful program with loops and floating-point computations.

Exosvs · September 20, 2024, 1:19am

Hi there. Are you still working on this? I see recent updates in the link.

I’m very interested in this

VMo · September 20, 2024, 6:21pm

Hello !
Yes, I am still working on this, but as I am alone I periodically take a break by working on Kalinda OS or physics alternately.

In fact I have worked on Ada 83 and this project for many years, first understanding and transforming the DIANA translator from Peregrine. The virtual page pointer system was completely revised a few years ago to remove some negative-number tricks which obscured the system.

I was somewhat blocked by the complexity of building a back end for the translator, until I discovered the powerful features of the FASM assembly engine.

In the last few weeks I have been thinking about the problem of implementing generics for Text_IO. I have some ideas, but a few points are still unclear about how I can compile the generic once and use one standard code body for the various instantiations.

At the same time I added loops and began coding indexed arrays.

For now the code is a direct translation to stack IR and x86-64 machine code; at some point a three-address IR will have to be inserted to optimize the code (index computations and loops lead to suboptimal stack code).
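To show what inserting a three-address IR could look like, here is a small Python sketch that turns a linear stack-IR sequence into three-address statements by simulating the operand stack symbolically. It is only an illustration under simplifying assumptions: the operand format is reduced to variable names (the level argument of `LDw`/`STw` is dropped), only four of the stack operations are handled, and the temporary names `t1`, `t2`, … are invented.

```python
def to_three_address(stack_code):
    """Convert a linear stack-IR sequence into three-address statements."""
    out = []
    stack = []   # symbolic operand stack holding names or literals
    temps = 0
    for op, *args in stack_code:
        if op == "LDI":            # push an immediate value
            stack.append(str(args[0]))
        elif op == "LDw":          # push a variable
            stack.append(args[0])
        elif op == "ADD":          # pop two operands, name the result
            right = stack.pop()
            left = stack.pop()
            temps += 1
            temp = f"t{temps}"
            out.append(f"{temp} := {left} + {right}")
            stack.append(temp)
        elif op == "STw":          # pop into a variable
            out.append(f"{args[0]} := {stack.pop()}")
        else:
            raise ValueError(f"unsupported op {op}")
    return out


# I := J + 1, as in the body of the TEST procedure.
code = [("LDw", "J"), ("LDI", 1), ("ADD",), ("STw", "I")]
tac = to_three_address(code)
print("\n".join(tac))
```

Once the code is in this form, the push/pop traffic visible in the current x86-64 output disappears from the representation, and operands can be assigned to registers by a later pass.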

If you are interested, feel free to take a look at the framagit project. If you need some explanations, tell me. Surely some organization will have to be arranged to work as a team on this project.

jelle · December 17, 2024, 10:41am

I really love the idea of having a lightweight open source Ada 83 compiler. It’s a shame that there don’t seem to be many open source ones available. And I mean open source in the OSI definition. From the looks of it (I’m no lawyer) the “DIANA” project as present on the Walnut Creek disc 1 from March 1995 doesn’t have a license. This means that it’s in fact “All rights reserved” and you can’t really modify it and distribute the code online. You could probably argue that personal use falls within fair use, especially since it was distributed on that CD-ROM, but I personally wouldn’t be comfortable putting time or effort into this without having the licensing issue solved.

It might be interesting to try to get hold of the people involved in writing this and find out where the copyright now lies. This also goes for other Ada 83 compilers. From a “tech historian” perspective it’s such a shame that so many of those aren’t available anymore. I would love to see how they were made and maybe breathe some new life into them. Maybe we could make a small “paleotechnologist group” that tries to save old Ada compilers from complete disappearance?

Licensing stuff aside, I would really love to fiddle around with an Ada 83 compiler that runs on modern hardware. There’s amazing value in the original spec and I think you can do an immense amount of productive work in it. It would also be nice to have some alternatives to GNAT. I think the language community would be served by having multiple open source Ada compilers that have people maintaining them. Sometimes I take a small look at Pascal and feel a slight envy that there are so many open source compilers available. That is until I write anything in Pascal and immediately start longing for that sweet Ada goodness.

pyj · December 17, 2024, 2:47pm

The modern spec feels enormous, but there’s a lot of notes clarifying things from the original, and it seems like there might be other simplifications that they’ve added.

I am only a few weeks and a few thousand lines of Ada deep into my own LLVM front-end for Ada. The compiler does so much, it is very much a non-trivial amount of work.

jelle · December 17, 2024, 7:43pm

I really love the Ada specs, all of them. I wish every language was this well-designed and properly documented. But a “non-trivial” amount of work is putting it very mildly. I think even building a nice Ada83 compiler from scratch is a sabbatical-sized project.

Is there a place where we can follow your work on your own frontend?

pyj · December 17, 2024, 7:57pm

Let’s be really clear about expectations here…

Outside of doing the two interpreters in “Crafting Interpreters” I’ve never written a compiler before.
I have zero formal training, either from undergrad or graduate courses in doing this.
I’m a C++ programmer who writes Ada as a hobby. I have no idea what “idiomatic Ada” actually looks like.

Considering the above, I have no idea how far I’m going to get. I wasn’t going to tell anybody, but figured I’d say something since I’m starting to test emitted LLVM IR with bbt now, and folks have asked for a new front-end especially with LLVM many times.

If I can actually pass an ACATS test, then I’ll make the repo public.

jelle · December 17, 2024, 8:59pm

Awesome that you’re working on this! Just enjoy the ride and don’t feel bad if you don’t get as far as you intended. Just programming for the sheer fun of it can be incredibly fulfilling and we shouldn’t let pesky goal-driven thinking get in the way of that.

Regarding non-paid projects I’m all in on the Andreas Kling mindset

jere · December 17, 2024, 10:21pm

On my end, I am just enjoying reading your journey updates. I myself have toyed with the idea, but I definitely lack the skill, so it is interesting to see a different person’s experience.

Side note: if I ever sound too pushy with my questions, let me know. Earnestly I am just interested in hearing the story/journey.

LionelDraghi · December 17, 2024, 11:36pm

I won’t participate in this adventure, but I highly sympathize with the project!

I’m of the same generation as Vincent, and remember my first professional experiences on VMS with DEC Ada. I also remember the French AFNOR translation of the standard, how beautiful I found the examples and explanations, and how often I opened it at random for the pleasure.

When diving into the following versions of the language, I was also very sensitive to the examples (the rationales, the discussions in the AI…). Each time I saw a new syntax that was clearer, shorter and less error-prone, I felt the desire to go through my existing code and use this beautiful new feature wherever possible.
But at the same time, I found the overall language more and more complex. If you consider for example what surrounds the generalized iterators, from a language evolution perspective, I am impressed. Building that with minimal modifications to the language, well played!
The result is excellent from the point of view of the user of the data structure. But what complexity from the point of view of the one writing the data structure.

When I was using Ada professionally (until 2007), I could claim to understand most of it (excluding the specialized annexes). This is no longer possible: too complex, too big.

So, I can understand the charm of Ada83, because it was still of human size, possible to grasp without being a lawyer.

But I am more interested in Randy’s approach, which I saw today on comp.lang.ada (it may be an old post): thinking about which traits of the language could be removed or changed to get a great simplification without too much functional loss.
Maybe it’s time to accept abandoning upward compatibility and create a new language.
We have 40 years of background, we can do both better and simpler.
At least, this is my feeling.

And one of the goals could be to have a definition of the same order of size as Ada 83.

But, anyway, I support your project !
Go Ada 83, go DIANA!

My two cents

ebriot · December 18, 2024, 7:25am

Hello,
Your post indicates that what you are most missing at this point is code generation and optimisation. Perhaps you could have an intermediate goal: build a new LSP (language-server-protocol) server for Ada83. This “only” requires lexical, syntactic and semantic analysis, and when integrated in editors (either modern like Visual Studio Code, or older-looking like vim and Emacs) it would be able to flag all deviations from the Ada83 standard, provide jump-to-declaration, and other services.

Then for the time being you can use GNAT -gnat83 to actually compile your code, with the warm feeling that it doesn’t have any GNAT extensions.

(I am suggesting that with the idea that having a concrete, published tool is always more motivating than aiming too high and failing to deliver)
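For a sense of how little plumbing the LSP side needs once the analysis exists, here is a hedged Python sketch of the message an Ada 83 checker would send to an editor. The `Content-Length` framing and the `textDocument/publishDiagnostics` notification are part of the Language Server Protocol; the file URI, the tool name `ada83-check`, the positions and the message text are placeholders invented for the example.

```python
import json


def frame(message):
    """Wrap a JSON-RPC message in the LSP base-protocol header."""
    body = json.dumps(message).encode("utf-8")
    # Content-Length counts the bytes of the body, not the characters.
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body


# One diagnostic; LSP positions are zero-based line/character pairs.
diagnostic = {
    "range": {
        "start": {"line": 4, "character": 2},
        "end": {"line": 4, "character": 9},
    },
    "severity": 1,                 # 1 means Error in LSP
    "source": "ada83-check",       # hypothetical tool name
    "message": "construct is not part of Ada 83",
}

notification = {
    "jsonrpc": "2.0",
    "method": "textDocument/publishDiagnostics",
    "params": {"uri": "file:///tmp/test.adb", "diagnostics": [diagnostic]},
}

packet = frame(notification)
print(packet.decode("utf-8"))
```

Everything interesting (deciding what to put in `diagnostics`) comes from the lexical, syntactic and semantic analysis that a DIANA front end already performs.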

VMo · December 18, 2024, 11:07am

Hello Jelle and all !

Thank you for your interest. I will answer each point.

Licensing issues. I am not a specialist, but here are some indications. The DIANA front end I extensively restructured is from the PAL library maintained by R. L. Conn during the 1990s.

link :
https://www.semanticscholar.org/paper/What-users-should-know-about-the-Public-Ada-Library-Conn/650fd7a29d82bda1d9d75b8058f2551f34135a0c

says that :

“All items in the PAL have been released to the public with unlimited distribution and arc freeware in most cases (the exceptions are shareware).”

I found this on arc shareware (from ARC Shareware License):

The following is the text of the original ARC shareware license:

You may copy and distribute this program freely, provided that: 1) No fee is charged for such copying and distribution, and 2) It is distributed ONLY in its original, unmodified state. If you like this program, and find it of use, then your contribution will be appreciated. You may not use this product in a commercial environment without paying a license fee of $35. Site licenses and commercial distribution licenses are available. A program disk and printed documentation are available for $50. If you fail to abide by the terms of this license, then your conscience will haunt you for the rest of your life.

I think it does not prohibit completely overhauling old pieces of source code to make them usable thirty years later for non-commercial use.
The only contact mentioned was Bill Easton at Peregrine Systems, a firm since bought by Hewlett.
Bill Easton, Peregrine Systems (703)689-1168, easton@ida.org (does not work anymore) or easton@access.digex.net (no rejection but no answer to date).

Paleotechnologist group.
This is an interesting idea, something in line with this remarkable work:
https://datamuseum.dk/wiki/Rational/R1000s400
We could set up a document and source base on early Ada 83 compiler developments, collecting data from willing participants in those old projects.
We could name the project “Ada 83 memory”.

jelle · December 18, 2024, 12:18pm

Nice! The PAL library sounds similar to Pascal’s SWAG which I used to study back in the day. Maybe we could simply take a similar approach with regards to licensing if we can’t get a hold of the copyright holders? I think they solved it pretty nicely: swag/LICENSE.md at 58c540d0f96c9e34c70ca8b491027671e790775e · delphidabbler/swag · GitHub

I love the “Ada 83 memory” name. We could use the modus operandi of the other digital preservation projects that try to do similar stuff. That datamuseum site looks great! It would be neat if we could have a similar wiki describing Ada’s early days.

Lucretia · December 18, 2024, 1:50pm

I wouldn’t even bother. I’d rewrite a diana implementation from the specs which are available.

Lucretia · December 18, 2024, 2:02pm

I don’t necessarily lack the skill as I’ve always been interested in compilers and wanted to tailor my degree to that, but that didn’t turn out too well…

This is something I’ve been arguing for for god knows how many years, but Randy and others have always ~~been reluctant~~ refused point blank to do this. Can you point me to the post? My machine died so the one I’m using (which is really kind of painful and laggy to use) doesn’t have usenet, so it needs to be a web archive.

Maybe ~~Orenda~~ Titania could fill that void as someone (pyj) is working on an LLVM front end which I was going to try years ago.

Can you put it up somewhere, I’d be interested in taking a look, as I have a project which I was going to build into mine which will be an external tool first and I’d like to be able to test it with a real compiler.

JeremyGrosser · December 18, 2024, 5:29pm

7 posts were split to a new topic: A New Ada Subset