Opening a file with Ada.Text_IO.Open (...) using the GNAT compiler makes a heap allocation

While doing Advent of Code challenges and making sure the solutions make no heap allocations at run time except during application start-up (elaboration time), I discovered that opening a file with Ada.Text_IO.Open (…) generates a heap allocation. This means that if one does not close the file, either deliberately or by mistake because an exception was raised, the application leaks not only file handles but also memory.
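To make the leak scenario concrete, here is a minimal sketch of one way to guarantee the file is closed even when an exception is raised while reading. The procedure name Count_Lines and the file name "input.txt" are made up for illustration; only the Ada.Text_IO calls are standard.

```ada
with Ada.Text_IO;

procedure Count_Lines is
   use Ada.Text_IO;

   Input : File_Type;
   Count : Natural := 0;
begin
   --  "input.txt" is a hypothetical file name. If Open itself raises
   --  (for example Name_Error), the file is not open and there is
   --  nothing to close.
   Open (Input, In_File, "input.txt");

   begin
      while not End_Of_File (Input) loop
         Skip_Line (Input);
         Count := Count + 1;
      end loop;
   exception
      when others =>
         --  Release the file handle (and whatever the run-time
         --  allocated for it) before propagating the exception.
         Close (Input);
         raise;
   end;

   Close (Input);
   Put_Line ("Lines:" & Natural'Image (Count));
end Count_Lines;
```

A controlled type wrapping the File_Type would do the same job with less repetition, at the price of a little more machinery.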

One may wonder why this is so. I did not expect a heap allocation when opening a file. Maybe this applies only to the GNAT implementation of the standard library package Ada.Text_IO, and not, for example, to the Ada.Streams.Stream_IO package (I haven’t checked!). One reason I can imagine is the use of Taft types (opaque types). This means that File_Type is declared in the public part of the Text_IO package specification, but its exact details are defined in the body of the Text_IO package. That is smart, because one only needs to replace the body of the Text_IO package when porting the standard library to another platform or operating system. The downside is that Taft types, introduced in Ada 83, are closely tied to heap allocation (my experience is that it is hard to avoid heap allocations when using Taft types).
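For readers who have not met Taft types, here is a generic sketch of the idiom. It is not taken from the GNAT sources; the package Handles and everything in it are invented for illustration. The two units are shown together and would need to be split into handles.ads and handles.adb (for example with gnatchop).

```ada
--  handles.ads
package Handles is
   type Handle is limited private;

   procedure Open  (H : in out Handle; Name : String);
   procedure Close (H : in out Handle);
private
   --  Incomplete type: its completion is deferred to the package body.
   type Control_Block;
   --  Clients only ever hold a pointer to the hidden record.
   type Handle is access Control_Block;
end Handles;

--  handles.adb
with Ada.Unchecked_Deallocation;

package body Handles is
   --  Platform-specific details live here and can change without
   --  touching the specification (the components are hypothetical).
   type Control_Block is record
      Descriptor  : Integer := -1;
      Name_Length : Natural := 0;
   end record;

   procedure Free is
     new Ada.Unchecked_Deallocation (Control_Block, Handle);

   procedure Open (H : in out Handle; Name : String) is
   begin
      --  The size of Control_Block is unknown to clients, so the
      --  object cannot live on their stack: it is allocated here.
      H := new Control_Block'(Descriptor  => 0,
                              Name_Length => Name'Length);
   end Open;

   procedure Close (H : in out Handle) is
   begin
      Free (H);
   end Close;
end Handles;
```

Because the specification only knows that a Handle is a pointer to something, the natural way to create that something is an allocator in the body, which is where the heap allocation comes from.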

Is there any other reason for making a heap allocation when opening a file?

The Text_IO application interface (Get procedures, etc.) is designed to scan and interpret an input text file character by character. It seems likely that opening a Text_IO file for input allocates a buffer so that the implementation of Text_IO can read larger blocks of text from the OS file system, instead of calling the OS to read each character separately. Similarly for text output, writing instead of reading. While the buffer could be a component of Text_IO.File_Type, perhaps heap allocation is preferred because some Ada implementations do not support large local stack allocations.

Thanks for sharing your thoughts, Niklas. I’m not sure about Text_IO being buffered. I didn’t see any buffer when looking at the GNAT implementation of the standard library, although I agree with you that it should be buffered. The impression I did get is that an effort has been made to reuse as much code as possible between all the IO packages.

One day I may be tempted to write my own Ada binding to the operating system for file input-output and see if I can avoid unnecessary heap allocations.

I had fun building a macOS replacement, leak_detector, for gnatmem (which seems never to have been available for macOS, don’t know why).

Turns out there are 3 allocations for opening a file: one for the file control block, one for the “form” string, and one for the full filename; all get freed on program exit.

Looks as though buffering for read is done using OS primitives.

We used to use valgrind to detect memory leaks, but it is slow. We recently started using LeakSanitizer (which comes along with AddressSanitizer). It requires recompiling/instrumenting the code, so you are not quite testing the code you will be running in production, but it is very fast (less than a 5% increase in the time to run our tests).
To use it (at least on Linux; I do not know whether you need anything else on macOS):

apt install libasan6
gnatmake main.adb -cargs -fsanitize=address -largs -fsanitize=address

(Of course it works with gprbuild too; I just did not want to write the project file :)
