I don’t know why that should be a huge problem: you already know how big the stack-frame-equivalent is, so you pop that amount off the stack, just as you would clean up the stack-frame for a normal compiler.
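A minimal sketch of that idea, assuming a simple list-backed stack and a frame size that is known when the call is set up. The names `call_with_frame`, `frame_size`, and `body` are hypothetical, made up for illustration:

```python
def call_with_frame(stack, frame_size, body):
    """Reserve frame_size slots, run body, then pop exactly that many slots."""
    base = len(stack)
    stack.extend([0] * frame_size)        # reserve the frame's slots
    result = body(stack, base)            # body uses stack[base : base + frame_size]
    # Cleanup: the size is already known, so just drop that many slots.
    # (Assumes body leaves the stack balanced, as compiled code normally would.)
    del stack[len(stack) - frame_size:]
    return result


def add_locals(stack, base):
    # Two "locals" living in the frame.
    stack[base] = 2
    stack[base + 1] = 3
    return stack[base] + stack[base + 1]


stack = [99]                              # something the caller already had on the stack
result = call_with_frame(stack, 2, add_locals)
```

After the call, the caller's stack is back to what it was before the frame was pushed.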
There are several ways you can do it:
- Using a High Level Language as a Cross Assembler
- Target a VM
  - P-Code is simple, but less well-known now;
  - JVM is ubiquitous, making for a good cross-compile bootstrap platform;
  - DOTNET is interesting, but more limited (there are implementations on non-MS OSes);
  - Forth/SeedForth, very simple and small;
  - LOLITA - A Low Level Intermediate Language for Ada+ (This paper is the only ref to it, though, TTBOMK)
- Interpret/execute the IR directly. (This is technically what Graal/Truffle does, though the IR is essentially the AST.)
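To make the "interpret/execute the IR directly" option concrete, here is a rough sketch of a tiny stack-based IR being run by a loop. The instruction set (`push`, `add`, `mul`) and the `run` function are invented for illustration and don't correspond to P-Code, JVM bytecode, or any of the other options listed:

```python
def run(program):
    """Execute a list of (opcode, *operands) tuples on an operand stack."""
    stack = []
    for op, *args in program:
        if op == "push":
            stack.append(args[0])
        elif op == "add":
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)
        elif op == "mul":
            b = stack.pop()
            a = stack.pop()
            stack.append(a * b)
        else:
            raise ValueError(f"unknown op: {op}")
    return stack.pop()                    # result is whatever is left on top


# IR for the expression (2 + 3) * 4
program = [("push", 2), ("push", 3), ("add",), ("push", 4), ("mul",)]
result = run(program)
```

The same IR could later be fed to a real back end; the interpreter just gets you something executable early on.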