It really comes from the foundational definitions of Forth: in Forth a “word” is either (a) a sequence of words to execute, or (b) a chunk of machine instructions to execute. From this we get two things: (1) a Forth is very simple, and very low-level/hardware-oriented; and (2) by this definition, a Forth is also extremely portable, since porting it to a completely different architecture is just a matter of implementing the “core words” that all the other words ultimately devolve into sequences of.
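As a rough sketch of that two-way definition (my own illustration, not SeedForth or any real Forth source; Word_Entry, Word_Ref, and the field names are all hypothetical), a dictionary entry is either a primitive backed by machine code, or a “colon definition” that is just a sequence of other words:

with System;

package Forth_Words is
   --  Illustrative only: a dictionary entry is either a primitive
   --  (machine code to jump to) or a colon definition (a sequence of
   --  other words to execute in order).
   type Word_Kind is (Primitive, Colon_Definition);

   type Word_Ref  is new Positive;  --  index of a word in the dictionary
   type Word_List is array (Positive range <>) of Word_Ref;

   type Word_Entry (Kind : Word_Kind; Length : Natural) is record
      --  Length is only meaningful for colon definitions.
      case Kind is
         when Primitive =>
            Code : System.Address;                 --  entry point of the machine code
         when Colon_Definition =>
            Body_Words : Word_List (1 .. Length);  --  the words this word executes
      end case;
   end record;
end Forth_Words;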
In standard Forth, words are essentially a dictionary look-up of text. (Meaning a name can be many bytes long, e.g. some_really_long_operation_that_describes_everything_it_does, or as short as +.) The execution of Forth involves a stack machine, so the sequence “12 34 +” would be parsed as: 12, pushed on the stack; 34, pushed on the stack; then +, which takes the top two items, adds them together, and pushes the result on the stack: 46.
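To make the stack-machine reading concrete, here is a minimal, self-contained sketch in Ada (my own illustration; the fixed-depth stack and the Push/Pop helpers are assumptions for the example, not how any particular Forth does it):

procedure Stack_Demo is
   --  A tiny fixed-depth data stack, purely for illustration.
   type Cell is new Integer;  --  stand-in for a Forth cell
   Stack : array (1 .. 64) of Cell;
   Top   : Natural := 0;

   procedure Push (Value : Cell) is
   begin
      Top := Top + 1;
      Stack (Top) := Value;
   end Push;

   function Pop return Cell is
      Value : constant Cell := Stack (Top);
   begin
      Top := Top - 1;
      return Value;
   end Pop;
begin
   --  “12 34 +”: push 12, push 34, then + pops both and pushes the sum.
   Push (12);
   Push (34);
   Push (Pop + Pop);
   pragma Assert (Stack (Top) = 46);  --  the result left on the stack
end Stack_Demo;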
With SeedForth, the words are tokenized, meaning that you could have, say, 0..2**6-indexed words and use the “leftover bits” of your wordsize to index a user-dictionary, if you provide a user-dictionary at all. What this means is that your indices into the dictionary are of a uniform length, and that the entries therein are lists of uniformly-sized elements; i.e.:
Core_Word_Count  : Constant := 2**6;
System_Cell_Size : Constant := 8;

Subtype Word_Length is Natural range 0..Natural'Pred( 2**System_Cell_Size );

-- A token is one cell: the non-positive values index the core words,
-- the positive values index the user-dictionary.
Type Token is range -Core_Word_Count..2**(System_Cell_Size - 1) - 1
  with Size => System_Cell_Size;

Subtype Core_Token is Token range Token'First..0;
Subtype User_Token is Token range Token'Succ(Core_Token'Last)..Token'Last;

-- Instantiate a MAP package with User_Token as the index, and an array of Token as the element.
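For that last comment, here is a sketch of one way to do it (my assumption; Token_Sequence and User_Dictionary are names I’m inventing, and Ada.Containers.Indefinite_Ordered_Maps is just one convenient MAP package), continuing from the declarations above:

with Ada.Containers.Indefinite_Ordered_Maps;  -- context clause for the enclosing unit

-- A user-dictionary entry is simply the token-sequence of a definition.
Type Token_Sequence is array (Positive range <>) of Token;

-- Map each user token to the definition it names.
Package User_Dictionary is new Ada.Containers.Indefinite_Ordered_Maps
  (Key_Type => User_Token, Element_Type => Token_Sequence);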
So, once you get your SeedForth tokenized, you have what’s essentially a bytecode, albeit for a very simple (and thus easily provable) conceptual virtual machine. (The details can be so closely integrated with the bare metal that there’s no real “virtual” to speak of.) Those core words are now constants, in this case things like -26, -3, and -18; this means that you now have a stable instruction-stream, and all you need to do to port it is implement those core words.
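A minimal sketch of what the resulting token interpreter’s dispatch could look like, reusing the Token/Core_Token/User_Token declarations and the hypothetical Token_Sequence from above (the particular core-token values and the Execute_Core/Definition_Of helpers are my own inventions, not SeedForth’s actual assignments):

-- Hypothetical core-token assignments, purely for illustration;
-- the real SeedForth numbering will differ.
Op_Dup  : constant Core_Token := -3;
Op_Plus : constant Core_Token := -18;

-- Assumed helpers (bodies not shown): Execute_Core runs one machine-level
-- primitive; Definition_Of returns a user word's token-sequence.
procedure Execute_Core (Op : Core_Token);
function Definition_Of (Word : User_Token) return Token_Sequence;

procedure Interpret (Program : Token_Sequence) is
begin
   for T of Program loop
      case T is
         when Core_Token => Execute_Core (T);               -- primitive: run it directly
         when User_Token => Interpret (Definition_Of (T));  -- user word: expand its definition
      end case;
   end loop;
end Interpret;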
In practice: you feed your SPARK-proved SeedForth interpreter to your SPARK-proved SeedForth-producing backend to get a verifiable SeedForth-in-SeedForth, letting you leverage that to get a verified interpreter in itself; that is, a verified-correct interpreter.
Then you feed your Ada compiler, whole, into the SeedForth-producing backend; this gives you the Ada compiler in SeedForth, which allows you to run your verified compiler on the above verified interpreter. (You can now produce code from the new architecture, and code to run on the new architecture.)
With these in place, getting verified native-code execution on the new hardware is a matter of implementing a SPARK-proved native-code backend for the compiler, then running that through the above to get a native-code-producing SeedForth Ada compiler. (Feed it to itself again, and you have a native-code-producing, native-code Ada compiler.)
TL;DR — Using SeedForth means: (1) there is no C, at all, in the dependencies; (2) the simplified nature makes it easier to verify; (3) the token-stream can be used as a bytecode; (4) you get a very portable verified VM for the “bytecode” of #3; and (5) you can, as needed, either (a) interpret your Ada programs on this VM, achieving high portability with little work, or (b) implement just the verified backend, and generate a verified native compiler on the new architecture using all the work that’s already been done.
TL;DR for the TL;DR — It allows high portability, after the initial work, with the option to “half-port” (interpret) your Ada programs.