This will be a long reply, so I’ll break it into sections.
Your interest
My interest is the claim that real-world Rust is safer than Ada, which does not appear to be the case.
Did I make that claim? I do not want to make that claim, at least not “on average”. Let me know if something I wrote implies that.
I do think Rust’s memory safety guarantees are more effective than Ada’s, and apparently the Ada community agrees, since they adapted ownership for SPARK. Do I understand that correctly? But I know nothing about concurrency, and Ada has a lot of things I like and Rust flat-out lacks.
The article
That’s a great article. I’ve only skimmed it, but it’s illuminating and I plan to share it. Thank you!
As far as I know – I may be forgetting something, but I’ve reviewed a lot of our code – the team I work on does not engage in these practices.
Part of it may be the application domain we work in; part of it may be our fortune in having a very disciplined and intelligent tech lead.
The projects they study
The applications
- Servo is a web browser.
- TiKV is a key-value store.
- Parity Ethereum is a blockchain client.
- Redox and Tock are operating systems.
IMHO that’s quite a curious collection. On the other hand, there’s certainly nothing invalid about the selection, as I’ve seen a lot of blockchain demand for Rust programmers on LinkedIn, and there are annoying Rustaceans who act as if Rust solves your safety problems everywhere (e.g., the moderator of a Rust group I’m in), so it’s absolutely worth studying. But I don’t think it’s representative of most Rust development work.
The libraries
- We do use Rand and Rayon. I’d argue (admittedly without concrete evidence at hand) that their use of unsafe is comparable to the use of pointers in Ada’s standard library.
- I’m very familiar with Lazy_static. We avoid it. I don’t think it’s a blanket prohibition, but I’ve had code rejected because it used Lazy_static.
  That said, I will argue that this, too, is comparable to the use of pointers in Ada’s standard library. However, in this case I have a concrete idea why. Rust forbids modifying module-local globals – something Ada allows, as I highlighted in my writeup – but the point of Lazy_static is that the language might not be able to initialize an otherwise immutable static variable. For example: in Ada you can initialize a constant vector at compile-time, which is really nice. (Maybe it’s not performed until run-time, but you can at least specify its values.) You can’t do that in Rust. So Lazy_static has to break the rule, exactly once, and never again. This falls under what I’d call “safe” unsafe. Doesn’t bother me, but I’ll grant there could be other issues.
- I’ve never heard of Crossbeam or Threadpool, or if so I don’t remember it. Either way, I can’t comment.
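To make the “break the rule exactly once” idea from the Lazy_static point concrete, here is a minimal sketch. It swaps in the standard library’s std::sync::OnceLock (stable since Rust 1.70) rather than the Lazy_static macro itself, and the PRIMES table and primes() accessor are hypothetical names invented for illustration:

```rust
use std::sync::OnceLock;

/// Hypothetical lookup table. A `Vec` needs a heap allocation, so it
/// cannot be written directly as the initializer of a plain `static`.
static PRIMES: OnceLock<Vec<u32>> = OnceLock::new();

/// Returns the table, building it on the first call and reusing the
/// same allocation on every later call.
fn primes() -> &'static [u32] {
    PRIMES.get_or_init(|| vec![2, 3, 5, 7, 11])
}
```

Every caller goes through primes(); the closure runs at most once, and afterwards the vector is only ever read.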
Servo
The paper sparked my curiosity regarding Servo. I’d heard it was Mozilla’s attempt to use Rust to replace Firefox’s C/C++ code, but eventually spun off because… I dunno, but Mozilla sort of flung Rust to the mercy of the wind a few years ago. But I thought it was more or less moribund.
Turns out it’s not! Its GitHub repo shows a lot of recent work, including downloadable releases.
A search for unsafe turns up about 100 usages. On closer inspection, at least a few of them are in reference to something else; e.g.,
/// The response is cross-origin and did not pass CORS checks. It is unsafe
/// to expose pixel data to the requesting environment.
Some locations have a comment to this effect:
// Returning Handles directly from Heap values is inherently unsafe, but here it's
// always done via rooted JsTimers, which is safe.
That reminds me of comments we have in our code, where to get around the linter warnings we have to add a # Panics or # Errors section to a function’s documentation, which goes something to the effect of:
// Does not panic, because we check the value first, but clippy thinks it does.
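As a sketch of what such a documentation section can look like: the function below is hypothetical, and it assumes the lint involved is Clippy’s missing_panics_doc, which flags public functions that call expect or unwrap without a # Panics section.

```rust
/// Parses the leading ASCII digit of `text`, if there is one.
///
/// # Panics
///
/// Does not panic in practice: `expect` is reached only after the
/// character has been checked to be an ASCII digit.
pub fn leading_digit(text: &str) -> Option<u32> {
    let first = text.chars().next()?;
    if !first.is_ascii_digit() {
        return None;
    }
    // `to_digit(10)` returns `Some` for every ASCII digit, so this cannot fail.
    Some(first.to_digit(10).expect("checked to be an ASCII digit"))
}
```

Real code would simply return first.to_digit(10) here; the expect is kept only to show the pattern of a call the linter flags even though a prior check rules the panic out.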
There are things like this:
#[allow(unsafe_code)]
unsafe extern "C" fn empty(extra: *const c_void) -> bool
…which I wonder whether they need long-term. And then there’s stuff like this:
unsafe { &mut *(ptr as *mut dyn Flow) }
…which is an unsafe cast. I wonder why they need it; we don’t do that in our project. I agree that would be unsafe.
Reasons the paper identifies for using unsafe
Not a complete review; just a few things that caught my attention:
The most common purpose of the unsafe usages is to reuse existing code (42%), for example, to convert a C-style array to Rust’s variable-size array (called slice), to call functions from external libraries like glibc.
Not surprising. It’s one reason we avoid interacting with C/C++ as much as possible. I think I mentioned that a couple of our dependencies are on libraries that wrap C/C++, but for one of these we (and possibly a lot of others) are waiting for another library developer to finish a long-in-progress pure-Rust implementation.
Another common purpose of using unsafe code is to improve performance (22%).
I’m a little surprised by this. Perhaps I shouldn’t be, given that two of their choices are operating systems (Redox and Tock, the latter aimed at embedded devices), but they go on to add:
Our experiments show that unsafe memory copy with ptr::copy_nonoverlapping() is 23% faster than the slice::copy_from_slice() in some cases. Unsafe memory access with slice::get_unchecked() is 4-5× faster than the safe memory access with boundary checking. Traversing an array by pointer computing (ptr::offset()) and dereferencing is also 4-5× faster than the safe array access with boundary checking. The remaining unsafe usages include bypassing Rust’s safety rules to share data across threads (14%) and other types of Rust compiler check bypassing.
This astonishes me because one of the points our tech lead hammers is that if you use the standard library correctly, you can avoid index validity checks in a completely safe way. As I recall, a good example is to replace something like this:
for i in 4..10 {
    do_something_with(&mut vector[i], i); // suppose you need i for some reason
}
…with this:
vector.iter_mut()
    .enumerate()
    .take(10)
    .skip(4)
    .for_each(|(ith, element)| do_something_with(element, ith));
The first version checks the index on every attempt to access vector[i]. The second does not, and is thus significantly faster, at least in our experience. I don’t know the details, but I think it’s comparable to Ada’s for element of Vector, which (I hope) also doesn’t need to perform a validity check on each access.
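For anyone who wants to try the two loops above side by side, here is a self-contained sketch. The body of do_something_with is a hypothetical stand-in (it just adds the index to the element), and the sketch only shows that the two forms produce the same result; whether the bounds checks actually disappear depends on the optimizer and is not measured here.

```rust
/// Hypothetical stand-in for the real work: adds the index to the element.
fn do_something_with(element: &mut u32, index: usize) {
    *element += index as u32;
}

/// Indexed form: every `vector[i]` is a bounds-checked access, and the
/// loop panics if the slice has fewer than 10 elements.
fn update_by_index(vector: &mut [u32]) {
    for i in 4..10 {
        do_something_with(&mut vector[i], i);
    }
}

/// Iterator form: `enumerate` supplies the index, `take(10)` stops after
/// position 9, and `skip(4)` drops positions 0 through 3.
fn update_by_iterator(vector: &mut [u32]) {
    vector
        .iter_mut()
        .enumerate()
        .take(10)
        .skip(4)
        .for_each(|(ith, element)| do_something_with(element, ith));
}
```

One behavioural difference worth keeping in mind: on a slice shorter than 10 elements the indexed loop panics, while the iterator chain quietly stops at the end of the slice.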
(To that end, does Ada’s standard library provide a function to return a slice of a vector? I have this vague memory that it does, but it doesn’t call it a slice, and I don’t find it at the moment.)
Rust’s rule checks are sometimes too strict and … it is useful to provide an alternative way to escape these checks.
This agrees with (some of) my experience, but I’ve often felt that way about Ada’s tampering checks – which doubtless have saved my bacon many more times than I am aware of.
We also found that the scope of lifetime in Rust is difficult to reason about, especially when combined with unsafe code, and wrong understanding of lifetime causes many memory-safety issues.
If it’s difficult for programmers to reason about, it’s probably difficult for the compiler to reason about, as well. (That makes me wonder, though: doesn’t Ada have a similar principle of lifetimes, though not perhaps as pervasive? I feel as if I’ve run into it before there, as well.)
The second part of that statement does surprise me, but not very much.
OK, that’s a lot, perhaps far more than interests anyone here. I apologize if so. I do appreciate the article. If you find any of the others, let me know. If they’re not available publicly but you can give me a DOI, let me know; when sufficiently motivated I can acquire articles of this sort.