YouTube video on trusting software

I came across this video, which makes some good points on trusting software:
Can We Trust Rust?
While the particular topic is Rust, there are some good observations, observations which provide some insight on how to avoid the issues.

On the technical side:

  1. Having multiple implementations; and/or
  2. Having source-code availability, and
  3. Having a non-selfhosting compiler (i.e. bootstrappable from some other interpreter/translator).

I haven’t watched the video yet but I think the concerns about a self hosting compiler are overblown. There are other ways of analysing generated machine code for compromise. In fact Dewar on the mailing list made much rejoicing when Gnat was finally self hosted because it simplified so much apparently.

Yes, that’s because with a self-hosted compiler you can use the language’s own facilities to implement them. For example:

Integer_Value : Integer renames Integer'Value( Get_Text(Token) );

As you can see, the above takes care of parsing an integer without the usage of RegEx or other pattern-matching, and precisely to the implementation of Integer'Value according to the run-time library. This is exactly why self-hosting improves the engineering aspect, simplifying so much. (Although, TTBOMK, this sort of trick of pushing things to the run-time library isn’t typically used in Ada compiler construction commercially, as opposed to explicitly implementing in-source.)

Do note one thing, though: the run-time can be done in another language. This actually could be viewed as ā€œhalf-bootstrappingā€, and thus dismissed because it adds ā€œanother pieceā€ into the problem of bootstrapping.

Kind-of, but not really: once you subsume into code-generation the ā€œspecial casesā€ it can follow along, implicitly embedding itself into the language. — As a simple example, consider the above trick of using Integer'Value and ā€œpushing things to the run-timeā€: if we have an Integer'Value that returns 0 instead of raising Constraint_Error on non-integral strings we have introduced erroneous behavior that propegates along until Integer'Value is re-coded.

The paper Reflections on Trusting Trust [1984] lays out how, upon control of OS and compiler code, the end-result is a compiler that inserts a ā€œmaster-controlā€ password into the ā€˜login’ function for the OS secretly and ā€œwithout any codeā€, as all that code has been bootstrapped into the compiler.

Ada has facilities for mitigating this, however: (1) the optional Annex H (High Integrity Systems) provides pragmas Reviewable and Inspection_Point, and (2) you could conceivably use SPARK to prove generated code matches input.

Yes, my plan for Byron includes an executable IR, which can be run ā€œas an interpreterā€, an implementation of SeedForth in Ada, and a SeedForth-generating backend. Combined together this means that bootstrapping is essentially re-compiling these in various combinations. (Then, to get native-code compilation, all you have to do is write a code-generator for the new architecture and ā€œdo the recompile danceā€ until you get a native-code Ada-translator that emits native code.)

Combined with Annex H, I think this nicely solves all the possible issues: the introduction of SeedForth as [essentially] a virtual-machine byte-code ā€œbreaks the cycleā€ and allows for not only ā€œinspectonā€, but ā€œmulti-platformā€ as the whole process is itself relies on the interpretation of SeedForth… something which, if you don’t trust the Ada-interpreter, you could reasonably implement (it’s about 30-35 ā€˜words’ [Forth parlance for instructions]) and bootstrap from there.

1 Like