Irenic language comparisons: Ada and Rust, AoC overall

jere · October 2, 2024, 10:10pm

I don’t really view the optional pattern as a set or collection though. It’s more of a “is there something there or not and if so, here is some data”. You can definitely represent that with an array, but I generally view arrays as sets or collections of homogeneous objects instead of a singular object comprised of heterogeneous components (or the lack of any object).

Again, not saying it’s wrong to represent it with an array, it just feels off to me. It might be the fact that it’s a singular object (or none) vs a collection of objects (or none). Not sure. It just feels off.
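For readers who want to see the two views side by side, here is a minimal Rust sketch. The function names (`first_even`, `first_even_as_vec`, `describe`) are hypothetical and only illustrate the distinction: an optional value says "one thing or nothing", while a collection of zero or one elements can encode the same answer but does not say so in its type.

```rust
/// Hypothetical lookup: the first even number, if there is one.
/// The return type states "a single value, or nothing".
fn first_even(values: &[i32]) -> Option<i32> {
    values.iter().copied().find(|v| v % 2 == 0)
}

/// The same answer encoded as a collection holding zero or one elements.
/// Nothing in the type stops a caller from expecting more than one.
fn first_even_as_vec(values: &[i32]) -> Vec<i32> {
    values
        .iter()
        .copied()
        .filter(|v| v % 2 == 0)
        .take(1)
        .collect()
}

/// Consuming the optional form: the match has to handle both cases.
fn describe(result: Option<i32>) -> String {
    match result {
        Some(v) => format!("found {}", v),
        None => String::from("nothing there"),
    }
}
```

Both encodings carry the same information; the difference is only in what the type communicates to the reader.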

cantanima · October 3, 2024, 6:55am

Thanks to everyone for the comments and suggestions. I’ve made all the requested changes. It took a while because I was waiting for some free time to look up the switches used by the compilers (per @jere’s suggestion) but finally broke down and did it at an ungodly hour.

I’ll still take suggestions or comments. You might see other changes since I plan to give it another once-over myself, then post it to a Rust forum to see what suggestions they have.

I did share it with some coworkers but haven’t heard back and don’t think they’ve read it. (It is long, after all.)

Oh – I’ll probably add a link to AdaCore’s recent blog post, which also compares Ada, Spark, and Rust… a weird sort of serendipity.

JC001 · October 3, 2024, 9:31am

I would say the main differences between Ada and Rust are that Ada is a high-level language that emphasizes ease of reading over ease of writing, and Rust is a low-level language that emphasizes ease of writing over ease of reading.

initial, international development culminated in an 1983 ISO standard

The 1983 standard was ANSI/MIL-STD-1815A. It was adopted unchanged as an ISO standard in 1987.

I’m glad to see you think Ada will still be used in 20012.

There was no standard in 2002. TC1 to Ada 95 came out in 2000, but if we include TCs then there’s one for Ada 12 as well.

How did I come up with digits 18?

By looking at System.Max_Digits? digits System.Max_Digits is portable.

cantanima · October 3, 2024, 4:24pm

I’ll add the Ada part for sure. Thanks.

Whoops. Thanks for the corrections (on 20012 and 2002 as well); will make.

What’s “TC1”?

Sigh. Thanks; I was very unaware of that. I’ve been wondering for years how to get that. It’s probably in a very obvious place in the RM, too – just not obvious enough for s like me.

jere · October 3, 2024, 4:31pm

It stands for Technical Corrigendum 1

A corrigendum is a thing to be corrected, typically an error in a printed book ( see here ). So TC1 is the first correction to the Ada95 standard. It’s weaker than an amendment (which is what Ada2005 technically is)

There’s also a TC1 for Ada2012 (the /4 in the standard per my original post)

JC001 · October 4, 2024, 8:57am

ARM 3.5.7/6, Floating Point Types

kevlar700 · October 4, 2024, 9:18pm

I like how Ada.Calendar says at the time of writing. Time is still negative as it isn’t 2150 yet.

cantanima · October 5, 2024, 1:07am

Document has been updated again.

I’m a bit perplexed on where I came up with the years I had for Ada revisions. I know I found them somewhere, but I decided to go with (a) @JC001’s corrections, further modified by (b) what’s listed on a webpage at adaic. I’ve linked to that webpage, so at least if I copied it wrong or misunderstood again, people should be able to make sense of it.

I’ll still take comments and corrections, of course, but hopefully it’s done. Thanks to everyone for the suggestions.

JC001 · October 5, 2024, 9:47am

You can index Rust types with non-integers

When presented with the advantages of Ada’s user-defined numeric types over the use of machine-specific types, C++ and Rust developers often say, “You can do that in X,” and go on to show how. The important thing, for C++, is that no one ever does, and your comments just before seem to indicate that this is the case for Rust, too. If you can say, “Rust developers commonly use technique A to do X” then that seems like a meaningful comparison, but if it’s not common usage then I don’t think you should mention it.

I’m trying to use only safe, stable, standard features

I’ve seen a couple of surveys of many real-world Rust projects, which reported that all of them made extensive use of Rust’s unsafe features. From this I concluded that safe Rust is only suitable for toy problems, and real-world Rust is no safer than Ada. (Of course, it could be that these projects all have incompetent developers and could be developed with safe Rust.) Certainly most AoC problems are toy problems. Other Rust developers presented with the results of these surveys have said, “We only use unsafe Rust for X,” from which I conclude that their projects confirm the survey’s results. I’d be interested in your thoughts on this.

Ada has a bit of an unfair advantage

I disagree. Ada has an advantage, resulting, as you go on to say, from its design as a high-level language. There’s nothing unfair about one design approach having an advantage over another.

jere · October 5, 2024, 7:45pm

I’ll disagree here on the concept of leaving out things that are less common IF your intent is to also entice non-Ada programmers to be interested in Ada or look into Ada. Language comparisons are a minefield (there’s a reason a lot of places forbid them in discussions). So when you do one, you want to be open and honest, and unfortunately leaving out things because they aren’t common will come across to some folks as either dishonest (like you are intentionally trying to make the other languages look worse) or that you lack experience in the languages you are discussing because you didn’t know the feature existed (and this can lead to them not even reading the rest of your post).

I think it is fine to leave in the notes of the other less common language features. You can always leave in a comment saying you think they are less common idioms or something similar.

Ideally you want the reader to decide if it is a meaningful comparison or not, you don’t want to pre decide that for them.

That said, if this is purely for Ada only folk, it’s less of a worry.

cantanima · October 5, 2024, 8:39pm

That’s quite a surprise. As far as I know, we don’t ever use unsafe in the main project I’ve worked with, which is highly non-trivial, involves tens or even hundreds of thousands of lines of code (and would be much larger if not for the use of macros), replaces code written in Java, and replaces a project written in C++. It makes me wonder what domain those studies were surveying, at what period of Rust’s development, how experienced the programmers were, and/or what definition of “unsafe” was used. After all, Rust defines what it means by unsafe.

If it turns out that the surveys were looking at, say, tokio, well, that seems about as silly as saying that Ada guarantees safety as long as you don’t use pointers at all, but hey, the standard library makes extensive use of pointers behind the scenes, so Ada isn’t really safe.

I’ll grant this caveat: the world is full of C/C++ programmers who think that "Real programmers use C/C++"™. They have flooded the world with libraries and tools that, like it or not, humbler programmers such as myself must use. – And yes, we really must. There’s no way around it, not in any practical sense: I only have so many years to live, never mind to deliver a product. There’s no way I’m going to reimplement, say, the Gnu Linear Programming Kit in a reasonable timeframe. (Genuine example from a previous career.)

To interface with GLPK from Rust, I’d have to use unsafe. If you want to hold that against Rust, fine. But – speak of the devil – I once generated a glpk interface for Ada, ironically enough for the Advent of Code, since I wasn’t able to find an Ada-native linear programming toolkit. gnat did a very nice job of it, but every last one of Ada’s safety guarantees went out the window the moment I invoked one of those interface functions. So that doesn’t strike me as a salient complaint.

I regret that this reply comes off as “tu, quoque”-ish, and I apologize for that, because I really don’t mean it that way. It’s just that the very vague claim that “[all] real-world Rust projects” studied in a couple of surveys make “extensive” use of unsafe is both so nebulous and so incongruent with my experience that I don’t know how else to reply.

cantanima · October 5, 2024, 8:55pm

I hope you don’t mind if I separate the rest of my reply from the other part.

I think you’ve misunderstood, and if that’s the case I’m very open to any wording you’d suggest. After all, this is one place where I much prefer Ada’s approach, as it’s a very nice example of how it’s so much better than doing it in just about any way you’d do it in Rust. Indeed, that’s the entire point of including it, as well as of showing how to do it in Rust, which in all honesty feels as clunky as I hope it looks.

(Would it help if I point out that I’ve now shared this at the main Rust language forum, where it’s receiving some attention and comments? Tryin’ ta do my small part for Ada publicity!)

(Added later: I think you may not be giving due weight to how I pointed out how hard it was to accomplish it in Rust: for example, Rust’s enumerations give you almost nothing for free; yes, there are crates you can sort-of use to automate this, but they impose certain restrictions; etc.)

Again, I’m willing to reword this, but when I hear “unfair” I don’t hear it in the negative (whiny?) way you seem to. There’s a way of using “unfair” where one is pointing out precisely that it’s inherent. For example: a mathematician has an unfair advantage over an English major in an integration bee, and … well, I was about to say vice-versa for a spelling bee, but in all honesty my admittedly limited experience has suggested modern English majors aren’t so good at that, either.

I hope that explains? If you still think I should strike “unfair”, I will.

Lucretia · October 5, 2024, 9:11pm

Languages with an unfair advantage are those which are poorer quality but have more users.

JC001 · October 6, 2024, 12:15pm

That’s interesting. As I said, some Rust defenders of the faith responded to this by admitting that their projects also used unsafe Rust.

I know that I saw multiple such papers, but right now all I can find is this one, which surveyed five open-source projects four years ago (there’s another by the same authors in 2024, but it surveys the same projects). I recall another that surveyed around twenty, but I can’t find it right now. The definition of “unsafe” seems to be the use of the unsafe keyword, which they subdivide into unsafe functions, unsafe regions, and unsafe traits. The five projects total over 800 KLOC and contained nearly 5000 unsafe instances, for slightly over six uses per KLOC, which I think qualifies as extensive use.

I agree that using dynamic data structures from the standard library involves pointers. Obviously if you use bindings to C libraries all pretensions of safety go out the window. My interest is the claim that real-world Rust is safer than Ada, which does not appear to be the case.

JC001 · October 6, 2024, 12:53pm

which typically is to not do it at all, IIUC. I was commenting on the general attitude of not doing things that involve lots of writing, rather than going to the effort to obtain the advantages that involve lots of writing. Predefined numeric types rather than the equivalent of user-defined numeric types was the example I used. For the case of using an enumeration type to index an array, that is usually implementing a map (such as your example with Direction and Energized), so presumably Rust users would typically use a map from the standard library, though that’s pure guessing on my part, as you don’t show a Rust equivalent of your Energized map AFAICT.

I presume that “this” refers to your article.

This only seems important if people typically do accomplish it in Rust, which I gather is not the case.

When I hear or use “unfair advantage” it means sneaky or cheating: The student had an unfair advantage on the exam because he had seen the questions in advance. I would not use it to describe an inherent advantage such as the advantage of a mathematician over a non-mathematician in an integration bee (do those exist?). The mathematician has an advantage, but it’s not unfair.

We have two opinions, which differ. A decision should probably be made based on a larger sample of native English speakers.

Lucretia · October 6, 2024, 1:52pm

Not strictly true. The binding can ensure the Ada on top is NOT a point of failure, by adding types over the top of the C, this is what I try to do in SDL. You can’t guarantee the C isn’t a shit show underneath, obviously.

cantanima · October 6, 2024, 8:45pm

This will be a long reply, so I’ll break it into sections.

Your interest

My interest is the claim that real-world Rust is safer than Ada, which does not appear to be the case.

Did I make that claim? I do not want to make that claim, at least not “on average”. Let me know if something I wrote implies that.

I do think Rust’s memory safety guarantees are more effective than Ada’s, and apparently the Ada community agrees, since they adapted ownership for Spark – do I understand that correctly? But I know nothing about concurrency, and Ada has a lot of things I like and Rust flat-out lacks.

The article

That’s a great article. I’ve only skimmed it, but it’s illuminating and I plan to share it. Thank you!

As far as I know – I may be forgetting something, but I’ve reviewed a lot of our code – the team I work on does not engage in these practices.

Part of it may be the application domain we work in; part of it may be our fortune in having a very disciplined and intelligent tech lead.

The projects they study

The applications

Servo is a web browser.
TiKV is a key-value store.
Parity Ethereum is blockchain.
Redox and Tock are operating systems.

IMHO that’s quite a curious collection. On the other hand, there’s certainly nothing invalid about the selection, as I’ve seen a lot of blockchain demand for Rust programmers at LinkedIn, and there are annoying Rustaceans who act as if Rust solves your safety problems everywhere (e.g., the moderator of a Rust group I’m in), so it’s absolutely worth studying. But I don’t think it’s representative of most Rust development work.

The libraries

We do use Rand and Rayon. I’d argue (admittedly without concrete evidence at hand) that their use of unsafe is comparable to the use of pointers in Ada’s standard library.
I’m very familiar with Lazy_static. We avoid it. I don’t think it’s a blanket prohibition, but I’ve had code rejected because it used Lazy_static.

That said, I will argue that this, too, is comparable to the use of pointers in Ada’s standard library. However, in this case I have a concrete idea why. Rust forbids modifying module-local globals – something Ada allows, as I highlighted in my writeup – but the point of Lazy_static is that the language might not be able to initialize an otherwise immutable static variable. For example: in Ada you can initialize a constant vector at compile-time, which is really nice. (Maybe it’s not performed until run-time, but you can at least specify its values.) You can’t do that in Rust. So Lazy_static has to break the rule, exactly once, and never again. This falls under what I’d call “safe” unsafe. Doesn’t bother me, but I’ll grant there could be other issues.
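A minimal sketch of the "initialize exactly once, then never modify" idea described above. Note the swap: it uses the standard library's `std::sync::OnceLock` rather than the lazy_static crate under discussion, and the names `PRIMES`, `SQUARES`, and `squares` are hypothetical. It assumes a toolchain recent enough to have `OnceLock` in the standard library.

```rust
use std::sync::OnceLock;

// A fixed-size array of plain values can be a genuine compile-time constant.
const PRIMES: [u32; 5] = [2, 3, 5, 7, 11];

// A non-empty Vec needs a heap allocation, so it cannot be written as a
// `const` or `static` initializer. The cell starts empty instead.
static SQUARES: OnceLock<Vec<u32>> = OnceLock::new();

/// Returns the shared vector, building it on the first call only.
/// Every later call returns the same, immutable data.
fn squares() -> &'static [u32] {
    SQUARES.get_or_init(|| PRIMES.iter().map(|p| p * p).collect())
}
```

Callers only ever see a shared reference, so the single initialization is the one point where the "global is immutable" rule is relaxed.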
I’ve never heard of Crossbeam or Threadpool, or if so I don’t remember it. Either way, I can’t comment.

Servo

The paper sparked my curiosity regarding Servo. I’d heard it was Mozilla’s attempt to use Rust to replace Firefox’s C/C++ code, but eventually spun off because… I dunno, but Mozilla sort of flung Rust to the mercy of the wind a few years ago. But I thought it was more or less moribund.

Turns out it’s not! Its GitHub repo shows a lot of recent work, including downloadable releases.

A search for unsafe turns up about 100 usages. On closer inspection, at least a few of them are in reference to something else; e.g.,

  /// The response is cross-origin and did not pass CORS checks. It is unsafe
  /// to expose pixel data to the requesting environment.

Some locations have a comment to this effect:

  // Returning Handles directly from Heap values is inherently unsafe, but here it's
  // always done via rooted JsTimers, which is safe.

That reminds me of comments we have in our code, where to get around the linter warnings we have to add a # Panics or # Errors section to a function’s documentation, which goes something to the effect of:

  // Does not panic, because we check the value first, but clippy thinks it does.

There are things like this:

  #[allow(unsafe_code)]
  unsafe extern "C" fn empty(extra: *const c_void) -> bool

…which I wonder whether they need long-term. And then there’s stuff like this:

  unsafe { &mut *(ptr as *mut dyn Flow) }

…which is an unsafe cast. I wonder why they need it; we don’t do that in our project. I agree that would be unsafe.

Reasons the paper identifies for using unsafe

Not a complete review; just a few things that caught my attention:

The most common purpose of the unsafe usages is to reuse existing code (42%), for example, to convert a C-style array to Rust’s variable-size array (called slice), to call functions from external libraries like glibc.

Not surprising. It’s one reason we avoid interacting with C/C++ as much as possible. I think I mentioned that a couple of our dependencies are on libraries that wrap C/C++, but for one of these we (and possibly a lot of others) are waiting for another library developer to finish a long-in-progress pure-rust implementation.

Another common purpose of using unsafe code is to improve performance (22%).

I’m a little surprised by this. Perhaps I shouldn’t be, given that two of their choices are embedded operating systems (Redox and Tock), but they go on to add:

Our experiments show that unsafe memory copy with ptr::copy_nonoverlapping() is 23% faster than the slice::copy_from_slice() in some cases. Unsafe memory access with slice::get_unchecked() is 4-5× faster than the safe memory access with boundary checking. Traversing an array by pointer computing (ptr::offset()) and dereferencing is also 4-5× faster than the safe array access with boundary checking. The remaining unsafe usages include bypassing Rust’s safety rules to share data across threads (14%) and other types of Rust compiler check bypassing.

This astonishes me because one of the points our tech lead hammers is that if you use the standard library correctly, you manage to evade index validity checks in a completely safe way. As I recall, a good example is to replace something like this:

for i in 4..10 {
    do_something_with(&mut vector[i], i); // suppose you need i for some reason
}

…with this:

vector.iter_mut()
    .enumerate()
    .take(10)
    .skip(4)
    .for_each(|(ith, element)| do_something_with(element, ith));

The first version checks the index on every attempt to access vector[i]. The second does not, and is thus significantly faster, at least in our experience. I don’t know the details, but I think it’s comparable to Ada’s for element of Vector, which (I hope) also doesn’t need to perform a validity check on each access.
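Here is a compilable version of the two loops, as a sketch. `do_something_with` is a hypothetical stand-in (it just adds the index to the element), and the function names are made up for the comparison. Whether the optimizer actually removes the bounds checks depends on the compiler and the surrounding code, so treat any speed difference as something to measure rather than assume.

```rust
/// Hypothetical stand-in for the thread's `do_something_with`.
fn do_something_with(element: &mut u64, index: usize) {
    *element += index as u64;
}

/// Indexed loop: each `vector[i]` is a bounds-checked access,
/// and it panics if the slice has fewer than 10 elements.
fn indexed_version(vector: &mut [u64]) {
    for i in 4..10 {
        do_something_with(&mut vector[i], i);
    }
}

/// Iterator chain over the same positions (4 through 9).
/// No indexing happens, so there is no per-access index check to fail;
/// on a short slice it simply stops early instead of panicking.
fn iterator_version(vector: &mut [u64]) {
    vector
        .iter_mut()
        .enumerate()
        .take(10)
        .skip(4)
        .for_each(|(ith, element)| do_something_with(element, ith));
}
```

One behavioral difference worth knowing: on a slice shorter than 10 elements the two versions are not equivalent, since only the indexed one panics.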

(To that end, does Ada’s standard library provide a function to return a slice of a vector? I have this vague memory that it does, but it doesn’t call it a slice, and I don’t find it at the moment.)

Rust’s rule checks are sometimes too strict and … it is useful to provide an alternative way to escape these checks.

This agrees with (some of) my experience, but I’ve often felt that way about Ada’s tampering checks – which doubtless have saved my bacon many more times than I am aware of.

We also found that the scope of lifetime in Rust is difficult to reason about, especially when combined with unsafe code, and wrong understanding of lifetime causes many memory-safety issues.

If it’s difficult for programmers to reason about, it’s probably difficult for the compiler to reason about, as well. (That makes me wonder, though: doesn’t Ada have a similar principle of lifetimes, though not perhaps as pervasive? I feel as if I’ve run into it before there, as well.)

The second part of that statement does surprise me, but not very much.

OK, that’s a lot, perhaps far more than interests anyone here. I apologize if so. I do appreciate the article. If you find any of the others, let me know. If they’re not available publicly but you can give me a DOI, let me know; when sufficiently motivated I can acquire articles of this sort.

cantanima · October 6, 2024, 9:38pm

Yeah, that sort of situation doesn’t arise much in our project. I have a vague memory that I would have liked to use it once, but that’s it. Can’t remember why. Perhaps Rust provides sufficient idioms, and I’m guilty of using the Ada mindset.

We do use the standard library a lot, including maps, so you may be right. I don’t like it as much as the use of an enumeration-indexed array because you can’t define a compile-time map in Rust. Possibly an argument that I’m approaching it with an Ada mindset. You could get around it with a lazy_static but… well, see above.
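For comparison, one way to approximate an enumeration-indexed array in safe Rust is to wrap a fixed-size array and implement the indexing traits by hand. This is only a sketch: `Direction` and `Energized` borrow the names from the discussion, but the variants, the `ALL` list, and the `ALL_OFF` constant are assumptions made for illustration. It also shows the boilerplate that Ada provides for free.

```rust
use std::ops::{Index, IndexMut};

/// Assumed variants; the article's actual Direction type may differ.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Direction {
    North,
    East,
    South,
    West,
}

impl Direction {
    /// Rust enums do not enumerate themselves, so the list is written by hand.
    const ALL: [Direction; 4] = [
        Direction::North,
        Direction::East,
        Direction::South,
        Direction::West,
    ];
}

/// One flag per direction, stored in a plain fixed-size array.
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
struct Energized([bool; 4]);

impl Energized {
    /// A fixed-size array wrapper can be a compile-time constant.
    const ALL_OFF: Energized = Energized([false; 4]);
}

impl Index<Direction> for Energized {
    type Output = bool;
    fn index(&self, direction: Direction) -> &bool {
        // Fieldless enum variants convert to 0..=3 in declaration order.
        &self.0[direction as usize]
    }
}

impl IndexMut<Direction> for Energized {
    fn index_mut(&mut self, direction: Direction) -> &mut bool {
        &mut self.0[direction as usize]
    }
}
```

With this in place, `energized[Direction::South] = true` reads much like the Ada version, though the array length and the variant list have to be kept in sync manually.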

Yep!

(whoops… roughly an hour after i drafted this, and apparently went to the Rust forum to check on their replies, i come back and find it still open. hope i’m not about to make a double post, sorry if so)

JC001 · October 7, 2024, 10:32am

No, nor did I intend to imply that you had. I also oversimplified the claim that I am interested in. I should have said the claim often made by Rust proponents that Rust is safer than Ada, when by “Rust” they mean the safe subset, and I’m interested in its real-world use. When pressed, they generally admit that they use Rust’s unsafe features.

For handling pointers to objects, safe Rust is safer than Ada, but that seems like a meaningless comparison. Rust is a low-level language with pointers to objects everywhere that is a big safety improvement over other low-level languages with pointers to objects everywhere such as C, but it’s still a low-level language with pointers to objects everywhere. Much better is to move to a high-level language where pointers to objects are rarely needed (so rarely that “never” is a good approximation). When I do need to use pointers to objects, I encapsulate them, and am able to prove memory safety to my satisfaction.

I don’t really think that adding ownership/borrowing to SPARK was needed. What is needed is a collection of dynamic data structures with formal proofs of memory safety.

IIUC, it’s possible to use C++ without pointers, arrays, and predefined numeric types, using the standard library to obtain arrays with bounds checking and user-defined numeric types, and produce code that is as safe as Ada (though much less easy to read), and there are a few people who do use it this way, but the vast majority do not. Perhaps your team is a similar outlier in the Rust world.

Ada has the concept of scope, which is similar. It seems easier for most people to understand than visibility, except when access-to-object types are used. Since low-level languages need pointers to objects everywhere, it is perhaps reasonable that the concept is more difficult for them.

There’s this 2024 paper by the same authors that expands on the results of their earlier paper to develop Rust checkers, then applies the checkers to an additional 12 projects. It found errors in those projects, too, further indicating that real-world Rust may not be as safe as some claim.

kevlar700 · October 8, 2024, 11:05am

I feel like this should count as Ada. Or at least it would be beneficial to point out. Correct me if I’m wrong, but it seems that using Spark just to the minimum level of enabling borrow checking is more straightforward than using Rust. The flow analysis also provides the benefit of memory leak prevention.