Experiences With Large Projects

Ada is often represented as a language for programming in the large. Can someone here with experience in large Ada projects talk about what that experience was like? What was different from large projects in other languages? What were the primary pain points?

Large may be defined as generously as you choose, but I’m imagining projects with one or more of:

  1. Dozens of contributors
  2. Hundreds of thousands to millions of lines of code
  3. Production critical for years
  4. Expansive and changing scope
  5. Stringent performance requirements such as latency or throughput goals
1 Like

Dear @cowile, welcome to the forum!

I cannot answer this question from first-hand experience; however, I do know quite a few Ada programmers working in very large codebases (2+ million lines).

TL;DR: it is just great!

About your points:

  1. Dozens of contributors: I guess that having a good SCM would help manage all the people. Nevertheless, I assume you are referring to the typical case of “Bob changed X things without notifying Alice, will things break?”.
    • Ada’s very strong typing, the compiler’s sanity checks, extra features (such as liquid types and contracts) and, of course, SPARK will ensure that random wrong changes either do not compile, get caught very early (by runtime checks during testing or production), or are simply not allowed by SPARK (the prover).
  2. A lot of SLOC: Ada was not the pioneer of module systems, but back in the 80s it was one of the first languages to have one, and it was quite useful and natural. Segmenting large projects by module/functionality is very easy, recommended and scalable. Newer languages (read: post-2010s) have realised how important this is and are also using modules (see Rust). Even C++ finally got some basic module functionality, just 40 years after Ada… Maybe using C as the baseline design was not a good idea… and for Rust, maybe creating an improved version of C++ was also not a good idea…
  3. Production critical: pleaaaaase, is this even a question?? :wink: See the above answer and also take a look at the Ariane rocket program, Airbus, Boeing, Paris Metro, etc.
  4. The modular system of Ada allows changes to be contained. Breaking changes (read: requirements that change the scope) can be detected early by the compiler if they are trivial. The techniques mentioned in the bullet under point 1 should provide even more assurance and earlier detection of clashes between changes. If you use generics, type reflection/attributes, etc., a simple change in a source file will automatically propagate throughout the rest of your program without you having to do anything!
  5. Ada generates incredibly performant code. Full stop. Really! Take strings (a built-in type, none of the std:: bullshit): they are not null/\0 terminated, their length is part of the type itself. Algorithms on structures whose size is always known are incredibly performant. Use Godbolt/Compiler Explorer to see how Ada code compares to other languages. Also, take another look at my answer to your second point; that should give you a really good idea of how deep you can go with Ada.
    • Just a word of caution: use user-defined types for your data, as this may help the compiler generate substantially more performant code (see the sketch after this list). If you see that the performance/assembly of Ada is substantially worse than that of C/C++/Rust, it may be that those languages allow undefined behaviour and are taking advantage of it, while Ada forbids it!
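A minimal sketch of that last piece of advice (type and object names here are made up for illustration): range-constrained types and statically sized arrays give the compiler exact knowledge of the data, so many checks can be folded away or shown to be trivially satisfied.

with Ada.Text_IO;

procedure Typed_Data is
   type Sensor_Id is range 1 .. 64;                   --  only 64 valid ids
   type Reading   is digits 6 range -100.0 .. 100.0;  --  bounded measurement
   type Readings  is array (Sensor_Id) of Reading;    --  size known statically

   Samples : constant Readings := (others => 0.0);
   Peak    : Reading := Reading'First;
begin
   for Id in Sensor_Id loop             --  the index can never leave its range
      if Samples (Id) > Peak then       --  so this check is trivially satisfied
         Peak := Samples (Id);
      end if;
   end loop;
   Ada.Text_IO.Put_Line (Reading'Image (Peak));
end Typed_Data;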

I hope this answers your points. Please, post any questions, issues or anecdotes that you may have while learning Ada. A great place to get started learning it is https://learn.adacore.com/

Best regards,
Fer

5 Likes

The big difference with language X (choose one with conditional compilation, “include”s, global scope by default, pointers everywhere, soft typing, …) is modularity!
The bigger the team and the project size, the bigger the difference. While you get modularity fully integrated with Ada and minimal recompilation at each build, the X guys wait for whole rebuilds each time they change a curly bracket - plus using, configuring and scripting external tools that try to figure out dependencies. It’s a lot of work for them!

Then there is Ada’s waterproof type system and the low usage of pointers. The X guys worry about getting memory dumps, doing forensics on them, etc.

Another general difference is that you need to be much less defensive (thanks to Ada’s type system, but also to myriad details like the syntax of the for loop, where you can’t do dumb things like writing the exit condition wrong, jumping in steps of 2, or tampering with the loop parameter).
This confidence is both a blessing and (potentially) a curse: with Ada you are much more relaxed thanks to the early bug detection. But you (individually or collectively) may become too relaxed and a bit spoiled. That’s the potential curse.
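To make the for-loop point concrete, a tiny sketch (names invented for illustration): the bounds are stated once, there is no hand-written exit condition, and the loop parameter is a constant inside the loop.

with Ada.Text_IO;

procedure Loop_Demo is
   Values : constant array (1 .. 10) of Integer := (others => 1);
   Sum    : Integer := 0;
begin
   for I in Values'Range loop
      Sum := Sum + Values (I);
      --  I := I + 1;   --  rejected: the loop parameter cannot be assigned
   end loop;
   Ada.Text_IO.Put_Line (Integer'Image (Sum));
end Loop_Demo;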

4 Likes

My experience is similar to zertovich’s. Starting with the architecture phase, and continuing throughout the design phase, design decisions are formalized in package specifications. These are then enforced by the compiler when the project gets to code generation. In effect, there was no software integration effort needed, since it had been done by the compiler during implementation. Compared to other languages, especially those that would happily compile obsolete calls to subprograms whose specifications had changed, this represented a significant savings.

Of course, I’ve also seen projects that treated package specifications like C header files: an inconvenience required by the compiler, but not something that anyone ever read. They also tended not to use user-defined numeric types, and made heavy use of access types. They did not see as much advantage from Ada.

3 Likes

Ada has made you lazy and careless. You can write programs in C that are just as safe by the simple application of super-human diligence.

E. Robert Tisdale

3 Likes

My normal day job is jumping into large codebases (millions of lines, usually C++) and helping out.

I’ve never written Ada professionally, but I did add MDX to the Ada Reference Manual outputter in 2023 to be able to make ada-lang.io’s ARM output. It’s only about 50k LoC, but has this comment showing it started in 2000:

    -- Edit History:
    --
    --  3/ 9/00 - RLB - Created base program.
    --  4/14/00 - RLB - Created from analysis program.
    --  4/18/00 - RLB - Added scanning pass.

-- ... SNIP A BUNCH OF LINES ...

    --  7/26/23 - RLB - Updated copyright date and address.
  1. A lot of problem detail is better hidden in packages due to the language structure.

Ada splits code into specifications and body implementations; when written for GNAT, each is in a separate file (.ads and .adb respectively). This, combined with encapsulation in Ada being at the package level (types are opaque or non-opaque, visibility is based on package relations), not the class level (C++, C#), means a lot of implementation detail stays in the related implementation files. It also means types related to the same subproblem will be in the same file, and you don’t need to hop between multiple files to see the entire picture. 99% of the time, just seeing the spec file is all you need, and that spec file is much shorter.

In C++, C#, or Java, you start peeling back classes and interfaces and it takes a while to get to an executable line of code, because you need a constellation of objects to do anything meaningful. In Ada, I find that subproblems end up as distinct packages and the focus is on functional behavior, not on playing connect the dots with types.

In C++ or Rust, you’ll often deal with related types (types within types). In Ada, types cannot declare additional types inside themselves; everything is declared at the package level, so I often find an entire subproblem solved in one implementation with all the types used there, rather than jumping across five files to understand a high-level flow path.
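As a sketch of what that looks like (a hypothetical package; under GNAT the file would be stacks.ads): the entire public surface of the subproblem fits in one short spec, while the record layout and any helper subprograms live only in the body (stacks.adb).

package Stacks is

   type Stack is private;   --  opaque: clients see operations, not the layout

   procedure Push (S : in out Stack; Value : Integer);
   procedure Pop  (S : in out Stack; Value : out Integer);
   function  Is_Empty (S : Stack) return Boolean;

private

   type Int_Array is array (1 .. 100) of Integer;

   type Stack is record
      Data : Int_Array;
      Top  : Natural range 0 .. 100 := 0;
   end record;

end Stacks;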

  2. A lack of macros and Turing-complete templates means less “magic code.”

I’m confident I could take almost any programmer, toss them into an Ada codebase, and they’d be able to read almost any part of the entire codebase within a few weeks. There are only a few secret handshakes in generics code, like (<>), that you have to look up.

C++ and Rust let you really shoot yourself in the foot with complicated macros. Ruby on Rails is built on this sort of magic via metaprogramming as well. Rails, C++ and Rust do a lot of cool stuff with macros and metaprogramming, but it can hinder understanding of what is actually going on.

Ada doesn’t have macros, so what you read is what you get. There’s no wild __VA_ARGS__ usage in macros, no std::enable_if SFINAE, no macro_rules! code with its own wild and different rules. Ada generics occur at the package and function (or procedure) level, so the little bit of “this is what we’re generic over” is the weird part and the rest is “just code.” You don’t end up in angle-bracket hell; instead you get a different kind of hell where you have to explicitly instantiate generics, but then you use them just like non-generic code.
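A minimal sketch of that workflow using a standard-library generic: the instantiation is the one “weird” line, and everything after it reads like ordinary code.

with Ada.Containers.Vectors;
with Ada.Text_IO;

procedure Generic_Demo is
   --  The explicit instantiation: name the generic and supply its parameters.
   package Integer_Vectors is new Ada.Containers.Vectors
     (Index_Type => Positive, Element_Type => Integer);

   V : Integer_Vectors.Vector;
begin
   --  From here on, it is used like any non-generic package.
   V.Append (42);
   Ada.Text_IO.Put_Line (Integer'Image (V.Last_Element));
end Generic_Demo;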

The downside of this is that sometimes you’ll see code generators to get around a lot of boilerplate (like in AUnit for unit testing).

Another downside is that straightforwardness sometimes just isn’t convenient. In Trendy Test, I abuse the exception mechanism for flow control in various ways to handle test registration and running. If I had the macro power of C++ or the metaprogramming of Ruby, the ergonomics would be much better.

  3. Constraints help

You can embed a lot of information in Ada types directly. Two different ints might have different semantic meanings and shouldn’t be mixed, and Ada lets you express that, like a built-in Rust newtype or a Go type definition. Ada leans on function overloading, so you can use that system to allow only meaningful operations: it makes sense to multiply Meters_Per_Second by Seconds to get Meters. If you screw something up with a derived type, the compiler has your back, and the meaning is embedded in the program.
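A hedged sketch of that idea (all names invented): distinct derived types plus one overloaded operator, so only the physically meaningful product compiles.

with Ada.Text_IO;

procedure Units_Demo is
   type Meters            is new Float;
   type Seconds           is new Float;
   type Meters_Per_Second is new Float;

   --  The only product that makes sense gets an operator; nothing else mixes.
   function "*" (Speed : Meters_Per_Second; Time : Seconds) return Meters is
     (Meters (Float (Speed) * Float (Time)));

   Speed    : constant Meters_Per_Second := 5.0;
   Elapsed  : constant Seconds := 3.0;
   Distance : constant Meters  := Speed * Elapsed;
   --  Oops : constant Meters := Elapsed * Elapsed;  --  rejected by the compiler
begin
   Ada.Text_IO.Put_Line (Meters'Image (Distance));
end Units_Demo;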

This also goes for pre/postconditions, which live on the specification rather than as an assert hidden inside an implementation. Often I didn’t have to bother looking at the implementation of something at all, which saved time.
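For instance, a hypothetical spec fragment (the package and its operation are made up): the contract sits on the visible declaration, so neither a reader nor the prover has to open the body.

package Accounts is

   type Balance is delta 0.01 digits 12;

   procedure Withdraw (Current : in out Balance; Amount : Balance)
     with Pre  => Amount > 0.0 and then Amount <= Current,
          Post => Current = Current'Old - Amount;

end Accounts;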

  4. General nestability of packages/functions can lead to insanity

The one big issue I’ve seen in Ada is the ability to nest functions (using that term here instead of “subprogram”) inside of functions, combined with the lack of closures in the language. One project I looked to contribute to had multiply nested local functions, with a gigantic enclosing function whose actual statements mostly started halfway across my editor.

This is exaggerated, but something like this:

procedure Foo is
   procedure Foo1 is
      procedure Foo2 is
         procedure Foo3 is
         begin
            null;  --  ...
         end Foo3;
      begin
         null;  --  ... some actual executable code, far to the right
      end Foo2;
   begin
      Foo2;
   end Foo1;
begin
   Foo1;
end Foo;
5 Likes

It depends on what you mean by “the package level”: every declaration can be local (to packages or subprograms); that includes types, generics, …, and it holds within nested packages, functions, etc. :slight_smile:
You can have an Ada program without packages but with custom types:

with Ada.Text_IO;

procedure Hello is

  type ABC is (a, b, c);
  
  procedure Nested (p : ABC) is
  
    type HW is (hello, world);
    
  begin
    for i in HW loop
      Ada.Text_IO.Put (p'Image & ' ' & i'Image & ' ');
    end loop;
  end;
    
begin
  for p in ABC loop
    Nested (p);
  end loop;
end Hello;

General nestability of packages/functions can lead to insanity

But on the bright side, nesting is very practical. Some subprograms make sense only within a certain context. You also save tons of parameters - an advantage in readability and performance.
Of course, each feature can be misused.
Note that AdaControl has rules about limiting the nesting:

Max_Nesting (x4) Controls scopes nested more deeply than a given limit.

3 Likes

You’re exactly right.

What I was trying to do was contrast with nested types as in C++ (struct Foo { struct Bar {}; };). You can declare types inside declare blocks or function, procedure or task bodies, but only those inside packages are visible from outside their containing elements AFAIK, and hence usable from outside and hence part of the programming interface. They are also visible inside nested subprograms.

C++ and Rust really do seem to love their related types (like $whatever_container::iterator, $whatever_type::reference) whereas in Ada these types would likely be declared at the same level (Ada.Containers.Vectors.Iterator, Ada.Containers.Vectors.Reference_Type).

Helper subprograms definitely make sense in some circumstances. I find it hard to read more than one level of lexically nested subprograms. As for efficiency, it depends on how many parameters there are, how often you’re calling them, the weight of your parameters, whether the compiler will inline, and what the compiler actually outputs. In this case I’m not sure; I’d have to find or write something with deeply nested subprograms, measure, and look at some assembly.

2 Likes

Thanks for the answers!

Would you happen to know how fast those projects build, or how much unit testing there is compared to relying on the compiler and formal methods?

Thanks. Yes, I read through Introduction To Ada, Advanced Journey With Ada, and Introduction To SPARK. I’ve done some small projects with what I learned, contributing an Alire package called format_strings (AI-assisted) and doing a couple of Rosetta Code problems that had no Ada solutions yet.

But since I don’t see a lot of opportunities to work on big projects with other people, I created this thread to ask people who have. I contribute to the Linux kernel as part of my day job and am very interested in OS dev. I’ve written some basic OSes and would love to try doing one in Ada. I remember trying something like that about ten years ago during a previous spark (lol) of Ada interest, but I ran into the problem of not knowing how to cut down the language runtime to work on bare metal, and gave up for lack of good resources on how to do it.

This was something I definitely appreciated about Ada. Some categories of mistake are eliminated by thoughtful syntax; for example, you have to explicitly specify what block you are closing with end if; or end loop;. That’s a big improvement over C’s “brackets optional for one-line bodies” rule. And the additional characters don’t bother me, because it’s more comfortable to type than bracket languages.

Thanks. These were the kind of things I was looking for when asking about pain points, though the latter seems like a problem in any language.

2 Likes

Ada is quite modular, so once you build the entire project, the rest should just be incremental builds. There should be little to no cascade effects.

NVIDIA used formal methods extensively for their OS. Eurocontrol, AFAIK, uses ASIS. Sadly, ASIS does not support newer Ada versions and was removed from GCC in v10 (though Eurocontrol still gets support from AdaCore, as they pay for their tools). ASIS (the Ada Semantic Interface Specification) was an API that let tools query the compiler’s semantic view of Ada sources; AdaCore would like people to start using Libadalang instead. Also, keep in mind that Ada, when compiled with GCC or LLVM (oh yes, we have an LLVM backend!), can use the sanity checks and tooling they offer on top of the language spec. You can also use any other analysis tools, such as Valgrind, etc.

Additionally, when dealing with memory, you may not be aware that Ada has supported “storage pools” (known as arenas in other languages) since Ada 95, with subpools added in Ada 2012. They can also serve as an advanced tool to improve safety, performance and debuggability. You can learn much more about the Ada memory model in this outstanding presentation by Jean-Pierre. Also, keep in mind that GNAT offers finalizable types (not yet standard) since v15, which are basically simple controlled types that are compatible with embedded systems, have much better-defined behaviour and can be analyzed by SPARK. Also, SPARK can now prove complete memory correctness à la Rust’s borrow checker (including leaks and lifetimes, unlike Rust).
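As a small usage sketch (GNAT-specific; the package and type names are invented): routing one access type’s allocations through GNAT’s debug pool, which can detect invalid deallocations and report on outstanding allocations.

with GNAT.Debug_Pools;

package Pool_Demo is

   Pool : GNAT.Debug_Pools.Debug_Pool;

   type Buffer is array (1 .. 1024) of Integer;
   type Buffer_Access is access Buffer;
   for Buffer_Access'Storage_Pool use Pool;  --  every "new Buffer" goes via Pool

end Pool_Demo;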

Oh, thank you for contributing to Linux, I rock it every day ^^

There are quite a few interesting OS projects in Ada. The biggest, and probably the one you would like to dive deep into, is Ironclad with its Gloire distribution. Ironclad is a *NIX kernel written in Ada/SPARK, similar to Linux, and it can now run a ton of typical *NIX programs, such as video games :smiley:. It is not huge, only about 40k lines, but it can already do a lot. It also implements its own RTS (runtime system). You can see an introduction to it here. Oh, and it is mostly written by a couple of 25(±)-year-olds!

Another interesting OS is HiRTOS, a hard real-time kernel written in SPARK and fully formally verifiable, for ARM/RISC-V devices. It is about the same size as Ironclad, but completely different in nature, targets and design.

Finally, a commercial open-source Ada/SPARK micro/separation kernel is Muen; we know from disclosed information that it is being used in encryption devices and telecommunications/telephone systems. After cloning the kernel with all of its submodules, there are 158 kLOC of Ada (the kernel itself is only 6200 LOC).

Wait, what? The runtimes can be zero/light (no runtime whatsoever), extremely basic (allowing for secondary stack management, memory features…), light-tasking (think of the restricted tasking profiles Ravenscar and Jorvik, see section D.13) or full-tasking (Windblows, Macs, Linux, *BSDs…). You do not have to cut down the runtime; you only have to use the features that can be used with the runtime you are targeting.

For example, if you want to use a light runtime, you must simply not use any tasking or advanced memory thingies. You can take a look at the impressive, great, wonderful and amazing SweetAda project. In total it has 100 kLOC. It is a framework to create bare-metal applications for a f*ck ton of boards and architectures. It truly is an amazing project, and it is done by a single incredible person. Runtime-wise, it only supports zero/light and secondary-stack (selected arches) runtimes, though the goal is to support tasking. It also has some drivers for common peripherals and general virtual thingies, such as FAT support.

Another great project is bb-runtimes, a collection of arches/boards/FPGAs with several runtimes; several have light-tasking or even full-tasking runtimes. It also brings some peripherals into the mix. This is probably the most complete runtime-filled repo out there.

You can also check the runtimes for Ironclad, HiRTOS and Muen if you are interested. Also, a nice bit of trivia: the Ravenscar and Jorvik profiles can be verified by SPARK, meaning that if you stick to them, SPARK can prove the correct behaviour of your multitasking applications!
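For reference, restricting a program to one of those light-tasking profiles is a one-line configuration pragma (under GNAT, typically placed in gnat.adc):

pragma Profile (Ravenscar);
--  or, for the slightly relaxed Ada 2022 profile:
--  pragma Profile (Jorvik);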

I hope this provides a useful answer to your needs :slight_smile:

Best regards,
Fer

4 Likes

That sounds like a cool job!

Some get avoided entirely as well. There’s a killer feature I never see anyone talk about: you can specify elaboration dependencies between Ada packages, so their initialization (run by the environment task before the main procedure) happens in a well-defined order, avoiding the static initialization order fiasco.
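A hedged sketch of what that looks like (the Logging package, its Handle type and Open function, and the App package are all invented for illustration): an elaboration pragma in the context clause makes the dependency explicit.

with Logging;
pragma Elaborate_All (Logging);
--  Logging, and everything Logging depends on, is guaranteed to be
--  elaborated before this body's own initialization runs.

package body App is

   Trace : constant Logging.Handle := Logging.Open ("app.log");

end App;

pragma Elaborate_Body, plus the Pure and Preelaborate categorization pragmas, give finer-grained control over the same problem.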

IME, just the baseline Ada experience prevents you from shooting yourself in the foot 95% of the time, e.g. you cannot return a reference to an in out parameter or a stack variable unless you explicitly mark it as aliased.

Build times are pretty good compared to “I changed a low-level header file and had to wait 30 minutes for a build.” I have heard extensive formal-methods proofs can take a long time (hours), though GNAT Studio has ways to target analyses. My verified postfix calculator takes about a minute or so on a 5950X, but proofs seem to parallelize really well. Trendy Test runs all tests randomized and in parallel; the catch is that if you have unit tests depending on module state, the result isn’t well defined unless you mark those tests as “run in serial.”

I’ve been playing around more with LLMs and vibe-coded decent Ada versions of Rust’s Rc and Box equivalents in about 10 minutes. A big reason I still do Ada hobby dev is that the spec/implementation firewall gives a place to cordon off vibe-coded elements until you have time to go through them. I’m wondering if the era of programming languages focusing on “expressiveness” is over, and the era of those focusing on semantic correctness (like Ada) is here.

4 Likes

Thanks for all the links. My memory is fuzzy but I think I just never found any of the documentation relating to zero footprint profiles.

I’m also curious about any support that makes debugging easier. Does it play well with GDB and are there other more useful debugging tools out there?

Interesting. Where can I read more about this feature?

Even before LLMs I noticed a (re)surging interest in strong, static type systems. Haskell gained some popularity in the mid-2010s before falling off (presumably happy to stay an ivory-tower language avoiding success at all costs), and Rust keeps growing. I think everyone is running from the horrors of the scripting-language-dominated web era. My view from inside Big Tech is that the dominant dynamic languages, JavaScript and Python, are banned in their naked form and must use typed variants.

Ada has benefited from this trend and IMO ends up looking like a language ahead of its time. And the over-complexity criticisms don’t stick anymore because if anything Ada feels simpler than a lot of more modern languages.

There will absolutely be synergy between LLMs and languages with a strong correctness focus. Like you say, it is easier to section off the vibe-coded bits, and good compiler feedback helps keep the LLMs on track for longer tasks. With LLMs, the thing that most pushed people towards “expressive” languages, the extra typing, is a non-issue.

3 Likes

There is no documentation for the “zero”/light runtimes because… that is what they represent: the absolute lack of any runtime x)

GDB has built-in support for Ada OOTB and it works wonders. I always launch GDB with catch assert and catch exception, run my Ada program (built with checks enabled, which is the default in GCC) and let it tell me where things fail. It is literally the most pleasurable debugging experience I have ever had. Additionally, it supports all the features of GDB, so you can also do remote debugging with OpenOCD on embedded boards; I have used it with great success. SweetAda has built-in support for it for its targets. Also, obviously, you can debug QEMU binaries, etc. Ada is no different from C/C++, and I would dare to say it has even better features than them, because of the language design and because of the extra small features in GDB. Also, you can deploy Ada programs in Renode if you want to give them a test without the real hardware. Here is a small blog detailing Ada and Renode.

This is a good place ^^

110% right!

4 Likes

There is some runtime support in the light runtimes, and even in the zero ones. It is just all independent of clocks etc., and so can be made to run much more easily on any chip and across chip families. For example: array bounds checks, 'Image, 'Value, etc.

Hopefully these may be of some use for Linux, but I find Linux drivers a pain to read with all the abstractions, never mind trying to integrate Ada code for many targets, so good luck.

2 Likes

Ada was designed with reducing the lifetime cost of large projects as a primary goal. I’m not sure any other language was actually even designed as such, but certainly not so carefully.

I can’t even find a date on it, but it seems some were against Ada quite early on. I guess they liked getting monopolies by using application-specific languages. Wow, were they grasping at straws if this is the best they could come up with. Some features should only be used sparingly to alleviate a particular pain point, and some have been improved upon since then, too.

https://dl.acm.org/doi/pdf/10.1145/1041326.1041327

4 Likes

Perhaps the most astounding example is the YF-22 development: here are the [HTML] slides detailing how effective Ada is in large projects. The integration time given is remarkable, especially if you’ve done any commercial integration of C/C++, PHP, or JavaScript.

2 Likes

There’s also this one from AdaCore.

Looking at these things from an outside point of view, they look like language proselytization propaganda.

However, after writing a whole bunch of Ada code and then coming back to it multiple years later and seeing how easy things are to change, my skepticism has considerably waned.

It doesn’t matter if you’re sitting on a gold mine if no one knows about it. A massive problem is that Ada’s vernacular predates or conflicts with a lot of modern terminology (storage_element, task, protected object, 'class isn’t a “class”, limited type, tagged instead of class, entry, “entry families”, “controlled type” instead of “RAII”, access instead of “handle” or “pointer” – yes, I know they’re not exactly the same, “generic packages” – the closest thing might be an OCaml signature?). I tried to address some of these in my FOSDEM 2022 talk.

2 Likes

A lot of these aren’t the same; as an example, “class” and “'Class” are, respectively:

  1. A syntactic construct defining an OOP type.
  2. The set of this type and all types derived therefrom.

Likewise, “access” and “pointer”:

  1. A referential type,
  2. A location whose contents are an address referring to some other object.

Note that while “accesses” are typically realized as pointers, nothing constrains them to be so. (That is, you could in theory define an access type whose values are the Nybble bit patterns 0..15, indexing into a pool/array of 16 items.)

“Limited” vs “copy-constructor”:

  1. Limited indicates a type without a built-in assignment operation; this means that it can be: uncopyable, deep-copied or shallow-copied (the last two explicitly, via a user-defined subprogram).
  2. A copy-constructor is essentially the same as the optional user-defined subprogram.

tagged instead of class

  1. Tagged denotes an OOP-object, in particular the mechanism whereby the runtime determines the type of a given OOP-object;
  2. Class (in most OOP languages) is a syntactic construct defining a type.

controlled type instead of RAII

  1. Controlled is not entirely equivalent to RAII; controlled types can (for instance) be used to reference-count, to provide some scoped finalization, or perhaps as a debugging device.
  2. Strictly speaking RAII is [only] about pairing constructors to destructors, ensuring that the object is created only when/if the constructor completes.

“Storage element”:

  1. Cell: In Forth, the system chunk-of-memory which is the most natural (typically smallest addressable unit),
  2. Ada: the same as above, the system’s default addressable unit.
3 Likes

I’m so confused by your reply, I think you interpreted what I wrote backwards.

That’s what I was saying.

My point about RAII is that if you have those mechanisms, you can use them for other things, or even asymmetrically (e.g. destructor-only behavior). The other term I’ve run into for this is “scope-based resource management” (there’s supposedly other nuance to that term), though I’ve never seen it in Ada docs either.

I learned Ada, but I write C++ all the time, and you see things like scope-based locks (std::scoped_lock), profiler scopes (TRACE_EVENT), reference counting, running a deferred lambda on scope exit, etc., as a result of having both constructors and destructors (though controlled types use Initialize/Finalize).

A problem in C++ is that you don’t entirely know whether you’re going to get additional behavior on construction or destruction, which is usually why you often see class used where types have behavior (or data encapsulation) and struct for plain data (even though class and struct are almost entirely equivalent). Ada makes this explicit: if it’s not some flavor of Controlled, there’s no lifetime-based behavior.
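In Ada terms, the scope-based idiom looks roughly like this (a sketch; the Mutex type and the Acquire/Release operations are hypothetical): a Limited_Controlled extension whose Initialize/Finalize pair brackets the critical section.

with Ada.Finalization;

package Scoped_Locks is

   type Mutex is limited private;
   procedure Acquire (M : in out Mutex);
   procedure Release (M : in out Mutex);

   --  Declaring a Scoped_Lock acquires the mutex; leaving its scope
   --  releases it, even if the scope is left via an exception.
   type Scoped_Lock (M : not null access Mutex) is
     new Ada.Finalization.Limited_Controlled with null record;

   overriding procedure Initialize (L : in out Scoped_Lock);  --  calls Acquire
   overriding procedure Finalize   (L : in out Scoped_Lock);  --  calls Release

private

   type Mutex is limited record
      Locked : Boolean := False;
   end record;

end Scoped_Locks;

Usage is then just declaring L : Scoped_Lock (My_Mutex'Access) (with My_Mutex declared aliased) at the top of a block, and the release is guaranteed on every exit path.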

1 Like

On a “normal” PC, a ~3 million LOC project takes a few seconds to build when you come back after, say, a week’s absence and catch up with other people’s changes. For a full build (usually unnecessary), it depends on the PC: a 4-core takes 7 minutes, a 24-core takes 45 seconds.

5 Likes