C++ build time anecdote
However, the team I work with collaborates with a team that works in C++, and the C++ team has much, much longer build times: on average, ours are less than 10 minutes (from scratch); theirs are roughly two hours every time they change something, no matter how small.
As someone who works on huge C++ projects, this is a red flag to me. Most projects aren't like this; builds are usually much, much quicker. The worst I've dealt with was ~30 minutes, caused by working in core headers on core types that most of the project used. Seconds are far more common, and sometimes minutes.
C/C++ builds are extremely parallelizable, but sensitive to SSD/HDD speeds, so I'd be curious what hardware they're throwing at it. A great way to slow down a big C++ build is to run it on an HDD.
Physical Design
A specification can be auto-generated from the body
In some ways yes, in other ways, no.
Due to the nature of the interface/implementation split, you often must provide certain information for the compiler to make decisions, such as the size of a type used on the stack. Rust would call this `Sized`; in Ada you'd put the type definition in the `private` section of the module; and in C++ this is part of why `private:` is a section in a class declaration in a header file, even though clients don't logically need it.
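To make that concrete, here's a minimal C++ sketch (all names like `Widget` are hypothetical). The compiler can only compute `sizeof(Widget)` for a client declaring one on the stack if the private members are visible in the header:

```cpp
// widget.h -- hypothetical public header; all names are made up.
#pragma once
#include <string>

class Widget {
public:
    Widget();
    void frob();

private:
    // Clients never touch these members, but the compiler must still see
    // them: without them it cannot compute sizeof(Widget), which it needs
    // the moment a client writes `Widget w;` on the stack.
    int counter_ = 0;
    std::string label_;
};
```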
There are also reasons not to auto-generate a spec from the body: it can leak design information from your libraries, such as types and functions private to that library. You also often don't want to export all your header files, both to avoid confusing users and to prevent others from depending on them.
In Rust you can control this with `pub` and `pub(crate)`; in C/C++ it's done with anonymous namespaces and `static` functions/variables. You can't do this generically for C/C++ though, because anonymous namespaces are relatively new (a decade or so, IIRC). For Ada, you can't because it follows the linear sequence of elaborations model and doesn't have conditional compilation to remove elements, so I think it'd be harder to generate the correct interface if you have multiple implementations (e.g. Windows/Mac/Linux versions of the same module).
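For illustration, a sketch of the C/C++ side (the filenames and function names are made up). Both an anonymous namespace and `static` give internal linkage, keeping helpers out of the library's exported symbols:

```cpp
// library.cpp -- hypothetical translation unit; all names are made up.

// Anonymous namespace: everything inside has internal linkage, so no
// other translation unit (or library client) can link against it.
namespace {
    int helper_call_count = 0;

    int clamp_to_limit(int value) {
        return value > 100 ? 100 : value;
    }
}  // namespace

// `static` at namespace scope gives the same internal linkage.
static void log_call() {
    ++helper_call_count;
}

// The only externally visible symbol; its declaration would live in the
// library's public header.
int library_process(int value) {
    log_call();
    return clamp_to_limit(value);
}
```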
I highly doubt that separating spec and implementation has a significant effect on compile time.
C++ and Ada both exhibit what's called "physical design", in addition to "logical design": the gist is that the file an element is put in affects the program structure as a whole. Lakos' book, "Large Scale C++ Software Design", is old, but goes into exceptional detail on this and on how to "levelize" designs to prevent compilation problems from spiraling out of control. (Some of the stuff in that book is out of date, but it's still a great resource.) In C++ this manifests because each translation unit is compiled down to an individual object file, which affects optimization and other concerns. (I'm omitting details here on "unity builds", which were much more popular before link-time optimization (LTO).)
Most programming languages don't deal with issues arising from physical design. The header/source split with multiple object files originated because entire programs couldn't be kept in memory; nowadays it helps with build parallelization. Reading a file off an SSD really isn't that bad these days… Lakos' book recommends redundant header guards around every include, since at the time reopening a header just to skip over everything inside its guards caused a lot of slowdown. The usual modern technique is to just use `#pragma once` instead.
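A sketch of the two idioms side by side (header and macro names are hypothetical):

```cpp
// big_header.h -- classic internal include guard:
#ifndef BIG_HEADER_H
#define BIG_HEADER_H
// ... declarations ...
#endif  // BIG_HEADER_H

// consumer.cpp -- the redundant *external* guard Lakos recommends: the
// includer tests the macro itself, so an already-seen header is never
// reopened and re-scanned just to be skipped.
#ifndef BIG_HEADER_H
#include "big_header.h"
#endif

// Modern replacement: a single line at the top of the header, and no
// external guard needed at the include site.
#pragma once
```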
I guess you might technically be able to reverse engineer the C++ name mangling and the data in object files to get this, since I think calling convention/parameter/return types could be inferred. Integer sizes especially could change per platform, which might get ugly. I'm not sure how you'd handle aliased names in this case, since I don't think they'd show up.
If you're struggling with C++ compilation times, its header/source split gives you tools and techniques to throw at the problem:
- minimize includes in header files
- prefer forward declarations over headers
- minimize inlining (note that function definitions written inside a class declaration are implicitly inline)
- minimize usage of templates (a common source of pain in the 90s, and coming back due to often-heavy template usage)
- "pointer-to-implementation (PIMPL pattern) to reduce re-compilation times for heavily changed elements on which a lot of things depend.
An advantage Ada and C++ have is that you can often prevent rebuilds of large numbers of components when specs/header files don't change. You can often limit changes to source files, so only the affected object files get rebuilt and linked.
From a tooling side, you should definitely put include-what-you-use into a CI job to give you a running log of what is actually being used. I've gotten a pretty decent speed boost just from its recommendations. Every project should have it set up.
On Linux, ccache used to be a must-have; I'm not sure if there's a more modern replacement.
Bazel uses caching and does an excellent job of parallelizing builds.
FASTBuild lives up to its name: it has caching and distributed builds built in. Even without distributed builds, a project I converted over to FASTBuild from CMake got a 4x faster clean build. It's amazing to me that more people don't use this tool.
Bazel and FASTBuild both have excellent profiling capabilities.
Rust compilation speed is a known issue, and the idea of a complete rewrite of the compiler has been floated. Rust also heavily uses monomorphization (C++-template-like compile-time code generation), which leads to bloat. It is super convenient, though, to just have a `.rs` file with everything in it (including unit tests!) and not have to worry about moving templates to a header file, etc.
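Since the mechanism is the same as C++ template instantiation, a C++ sketch shows the effect (names are illustrative); Rust does the analogous thing for each concrete type a generic function is used with:

```cpp
#include <string>
#include <vector>

// Each distinct T stamps out a separate copy of this function's machine
// code in the final binary -- the same effect Rust's monomorphization
// has on generic functions.
template <typename T>
T largest(const std::vector<T>& values) {
    T best = values.front();
    for (const T& v : values) {
        if (best < v) best = v;
    }
    return best;
}

int main() {
    // Three instantiations -> three compiled copies: largest<int>,
    // largest<double>, and largest<std::string>.
    largest(std::vector<int>{3, 1, 2});
    largest(std::vector<double>{0.5, 2.5});
    largest(std::vector<std::string>{"x", "y"});
}
```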
The one really good argument I remember seeing for separate spec files is that it forces a software engineering team to think about what they’re doing before they do it, which is especially valuable when they won’t be able to change it after sharing it. That’s a compelling argument.
If the exported headers of a library don't change, then you know that a component linking in your library won't have to recompile, though it may relink. This is useful when you're deploying new artifacts to other people, or pulling in new versions of artifacts, and you don't have to worry about a full compilation. You just leave "stitching everything back up" to the linker.
Summary
This is a complicated topic touching many elements of the lifecycle of shared libraries, static libraries, and final executables.