As I am struggling with my own Ada 83 compiler, here are my thoughts on a Forth backend, its alternatives, and compiler bootstrapping.
So you want a verifiable compiler that does nothing behind your back. In any case you will need an intermediate representation (DIANA in my case). If you target the 9X or 2X versions of the language, it will be more complex; good luck.
The DIANA structure can be dumped and scrutinized because I have the Ada 83 front-end source in Ada 83, so I know what it produces, and I have tools to inspect the DIANA it generates. Yes, the front-end and the fasmg code generator are compiled with GNAT (in fact I can half-compile the compiler to DIANA with my own front-end; that works, but the result does not execute, of course, since the code generator is incomplete). But even if I had a “noxious GNAT” that tried to add code behind my back, the DIANA structure could not be affected in an unverifiable way.
Now that you have your intermediate representation semantically checked, how do you produce an executable from it in a transparent and quick way (not necessarily optimized, but acceptable)?
The most direct way I have found is to write custom macro assembly text, which is assembled to native machine code, with the program's data structuring made explicit. This text can be read and visually verified.
From DIANA, I chose to generate fasmg macro text (flat assembler g is a general assembly engine, not an assembler specialized for one machine), because some macros can define the operations of a stack machine while other macros define data structuring on the stack and co-stack.
Each Ada unit is compiled to a .finc fasmg text. The program starts by including a .finc that defines the operation macros (which emit native binary code after an ELF header), and the .finc files of the other withed units or subunits are included as well. This fasmg text is assembled by fasmg, which directly produces an ELF executable for whatever machine you like: the target depends on the binary emitted by the macros, that is, on the first .finc include (“codi_x86_64.finc” in my case, on Ubuntu Linux).
Forth would not be a convenient choice: it lacks a way to express data structuring, which is paramount when generating code for a stack-intensive language such as Ada 83 (and, I suppose, later versions).
To make the point explicit, here is an example with the source file “lis_caractere.adb” (in the bin directory of my framagit project):
with TEXT_IO;
use TEXT_IO;
-------------
procedure LIS_CARACTERE
is -------------
   C : CHARACTER;
begin
   PUT( " Bonjour " );
   NEW_LINE(2);
LIRE_UN:
   loop
      PUT( " Entrez un caractere ! " );
      GET( C );
      PUT( " Vous avez tape : " );
      PUT( C );
      NEW_LINE;
      exit when C = 'q';
   end loop LIRE_UN;
end LIS_CARACTERE;
-------------
This is compiled (translated to DIANA, then processed by code_gen with option W) with the command line
a83.sh ./ ./lis_caractere.adb W
(The framagit bin directory is used for tests and contains the ADA__LIB.)
This gives the following macro text (in “LIS_CARACTERE.FINC”, in the ADA__LIB directory):
include 'TEXT_IO.FINC'
PRO LIS_CARACTERE_L1 ;---------- PRO LIS_CARACTERE
ELB 1 ; BODY ELAB
VAR C_disp, b ; variable bool char
; end elab
begin: ;---------- BDY INSTRUCTIONS
STR STR_L2, ' Bonjour ' ; constante string=' Bonjour '
LCA STR_L2_ptr
CALL TEXT_IO. ,PUT_L56
LI 2
CALL TEXT_IO. ,NEW_LINE_L26
LIRE_UN:
STR STR_L4, ' Entrez un caractere : ' ; constante string=' Entrez un caractere : '
LCA STR_L4_ptr
CALL TEXT_IO. ,PUT_L56
LVA 1, C_disp
CALL TEXT_IO. ,GET_L50
STR STR_L5, ' Vous avez tape : ' ; constante string=' Vous avez tape : '
LCA STR_L5_ptr
CALL TEXT_IO. ,PUT_L56
Lb 1, C_disp
CALL TEXT_IO. ,PUT_L52
LI 1
CALL TEXT_IO. ,NEW_LINE_L26
Lb 1, C_disp
LI 113
CEQ
BT L3
BRA LIRE_UN
L3: ; post loop LIRE_UN
UNLINK 1
RTD
excep:
endPRO ;---------- end PRO LIS_CARACTERE
There is an include for TEXT_IO, which is too long to show here (I have been struggling with text_io for some time…).
The main procedure is started by a standard launcher in the file “LIS_CARACTERE.fas”, which pulls in the stack machine macros via “codi_x86_64.finc”:
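To make the end-of-loop test in the listing concrete, here is a minimal Python model of the five instructions Lb 1, C_disp / LI 113 / CEQ / BT L3 / BRA LIRE_UN. It is only a sketch, not the actual macros: the semantics written in the comments, the frame layout (level 0 for globals, level 1 for the procedure), the value of C_DISP and the names run_exit_test and frames_with_c are all assumptions made for the illustration.

```python
# Simplified model of the loop-exit test in LIS_CARACTERE.FINC.
# Assumed semantics (NOT taken from codi_x86_64.finc):
#   Lb level, disp : push the byte variable at (frame level, displacement)
#   LI n           : push the immediate n
#   CEQ            : pop two values, push True if they are equal
#   BT label       : pop a value, jump to label if it is true
#   BRA label      : jump unconditionally

C_DISP = 0  # hypothetical displacement of C inside its frame


def run_exit_test(frames):
    """Run the five instructions once and return the label that is reached."""
    program = [
        ("Lb", 1, C_DISP),
        ("LI", 113),          # 113 = ord('q')
        ("CEQ",),
        ("BT", "L3"),
        ("BRA", "LIRE_UN"),
    ]
    stack = []
    for instr in program:
        op = instr[0]
        if op == "Lb":
            level, disp = instr[1], instr[2]
            stack.append(frames[level][disp])
        elif op == "LI":
            stack.append(instr[1])
        elif op == "CEQ":
            right = stack.pop()
            left = stack.pop()
            stack.append(left == right)
        elif op == "BT":
            if stack.pop():
                return instr[1]
        elif op == "BRA":
            return instr[1]
    raise RuntimeError("fell off the end of the program")


def frames_with_c(ch):
    """Hypothetical frames: level 0 = globals, level 1 = LIS_CARACTERE locals."""
    return {0: bytearray(), 1: bytearray([ord(ch)])}


if __name__ == "__main__":
    print(run_exit_test(frames_with_c("q")))  # C holds 'q'
    print(run_exit_test(frames_with_c("a")))  # C holds any other character
```

The part worth looking at is the (level, displacement) pair used by Lb: the variable C is found by its place in a frame, not by its position on the evaluation stack.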
include '../../src/code_gen/fasmg/codi_x86_64.finc'
LINK 0, loc_siz
include 'LIS_CARACTERE.FINC'
CALL , LIS_CARACTERE_L1
SYSEXIT
virtual VARzone
loc_siz = $ ; only here is the size of the globals computed; it is propagated back to the LINK 0
end virtual
Now, with the command
fasm lis_caractere.fas
a 1.8 KB ELF executable “lis_caractere” is produced directly! And it works (except for a minor problem in the damned text_io, which emits an extraneous line feed, but I know why).
All stack machine instructions are defined as macros in codi_x86_64.finc.
For example, LI (load immediate, as in LI 1) is:
; -------------
macro LI? val ; Load Immediate
; -------------
db 0x48, 0xC7, 0xC0 ; mov rax, val
dd val
PUSH_RAX
end macro
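As a cross-check of the bytes emitted by the db and dd lines of LI, the following Python sketch rebuilds the same encoding: 48 C7 C0 followed by a 32-bit little-endian immediate is the x86-64 encoding of mov rax, imm32, with the immediate sign-extended to 64 bits. The function name encode_li is hypothetical, the signed range check is a simplification of what dd accepts, and PUSH_RAX is left out because its definition is not shown here.

```python
import struct


def encode_li(val):
    """Return the bytes produced by the `db`/`dd` lines of the LI macro.

    REX.W (0x48) + opcode 0xC7 /0 + ModRM 0xC0 selects `mov rax, imm32`;
    the immediate is stored little-endian and sign-extended by the CPU.
    PUSH_RAX is a separate macro and is not modelled here.
    """
    # Assumption: only the signed 32-bit range is accepted in this sketch.
    if not -2**31 <= val < 2**31:
        raise ValueError("immediate does not fit in a signed 32-bit field")
    return bytes([0x48, 0xC7, 0xC0]) + struct.pack("<i", val)


if __name__ == "__main__":
    print(encode_li(1).hex())    # bytes for LI 1
    print(encode_li(113).hex())  # bytes for LI 113, the code of 'q'
```

Comparing this output with a hex dump of the assembled executable is one way to verify by hand that the macro emits what its comment says.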
fasmg produces the x86-64 binary as an ELF executable with a single RWX segment (…).
Sorry for the length, but this gives a fairly detailed account of the hard-won solution of a poor lonesome Ada 83 man.
Now, you will notice that I use a stack machine that is not far from Forth machines such as the J1, P16, Steamer S21 and company. There is no secret there. But Forth has no explicit means of structuring Ada variables on the stack.
You will suffer with that…