How to use strings in Ada?

simonjwright · October 15, 2024, 9:32pm

There’s nothing wrong with heap allocation; for a short-lived not-too-large program, let the OS clean up on program exit.

For a long-lived program, deallocation needs to be taken care of. Most of the time, you can rely on well-written libraries, like Ada.Strings.Unbounded or the indefinite containers.
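
As a minimal sketch of what relying on the library looks like (the procedure name `Unbounded_Demo` is made up for illustration), `Ada.Strings.Unbounded` grows the string on the heap and releases the storage automatically when the object goes out of scope:

```ada
with Ada.Strings.Unbounded; use Ada.Strings.Unbounded;
with Ada.Text_IO;

procedure Unbounded_Demo is
   --  The library allocates and frees the underlying buffer for us.
   Line : Unbounded_String := To_Unbounded_String ("veni");
begin
   Append (Line, " vidi");
   Append (Line, " vici");
   Ada.Text_IO.Put_Line (To_String (Line));               --  veni vidi vici
   Ada.Text_IO.Put_Line (Natural'Image (Length (Line)));  --  14
end Unbounded_Demo;
```

No explicit deallocation appears anywhere; the buffer is reclaimed when `Line` is finalized at the end of the procedure.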

gast · October 15, 2024, 10:34pm

Of course, in Ada you can have a specially designed memory pool to improve performance, such as a mark-and-release pool.

Is there any sample code for such an implementation?
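
A rough sketch of what such a pool could look like, not taken from this thread or from any particular library: the names `Arenas`, `Arena`, `Mark` and `Release` are invented here, and the only standard pieces assumed are `System.Storage_Pools.Root_Storage_Pool` and the address arithmetic in `System.Storage_Elements`. Allocation bumps an index into a fixed buffer; `Release` rolls the index back to an earlier mark.

```ada
with Ada.Text_IO;
with System.Storage_Elements; use System.Storage_Elements;
with System.Storage_Pools;

procedure Arena_Demo is

   package Arenas is
      type Arena (Capacity : Storage_Count) is
        new System.Storage_Pools.Root_Storage_Pool with record
           Data : Storage_Array (1 .. Capacity);
           Top  : Storage_Count := 0;  --  storage elements currently in use
        end record;

      overriding procedure Allocate
        (Pool      : in out Arena;
         Address   : out System.Address;
         Size      : Storage_Count;
         Alignment : Storage_Count);

      --  Individual deallocation does nothing; storage returns on Release.
      overriding procedure Deallocate
        (Pool      : in out Arena;
         Address   : System.Address;
         Size      : Storage_Count;
         Alignment : Storage_Count) is null;

      overriding function Storage_Size (Pool : Arena) return Storage_Count is
        (Pool.Capacity);

      function Mark (Pool : Arena) return Storage_Count is (Pool.Top);
      procedure Release (Pool : in out Arena; To : Storage_Count);
   end Arenas;

   package body Arenas is
      overriding procedure Allocate
        (Pool      : in out Arena;
         Address   : out System.Address;
         Size      : Storage_Count;
         Alignment : Storage_Count)
      is
         Align : constant Storage_Count := Storage_Count'Max (Alignment, 1);
         Base  : constant System.Address := Pool.Data'Address + Pool.Top;
         Skew  : constant Storage_Offset := Base mod Align;
         --  Padding needed to bring Base up to the requested alignment.
         Pad   : constant Storage_Count :=
           (if Skew = 0 then 0 else Align - Skew);
      begin
         if Pool.Top + Pad + Size > Pool.Capacity then
            raise Storage_Error;
         end if;
         Address  := Base + Pad;
         Pool.Top := Pool.Top + Pad + Size;
      end Allocate;

      procedure Release (Pool : in out Arena; To : Storage_Count) is
      begin
         --  Everything allocated after the mark becomes free again.
         Pool.Top := Storage_Count'Min (To, Pool.Top);
      end Release;
   end Arenas;

   Pool : Arenas.Arena (Capacity => 4096);

   type String_Access is access String;
   for String_Access'Storage_Pool use Pool;

   Checkpoint : Storage_Count;
   S          : String_Access;
begin
   Checkpoint := Arenas.Mark (Pool);
   S := new String'("veni vidi vici");
   Ada.Text_IO.Put_Line (S.all);
   Arenas.Release (Pool, Checkpoint);
   S := null;  --  S.all must not be used after Release
   Ada.Text_IO.Put_Line (Storage_Count'Image (Arenas.Mark (Pool)));
end Arena_Demo;
```

The design trades safety for speed: nothing stops a dangling access value from being used after `Release`, so the caller has to drop such values by hand, as the demo does with `S := null`.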

stevelitt · October 15, 2024, 11:51pm

[dmitry-kazakov] How would I implement your type Roman_Latin if I wanted UTF-8 characters? Thanks.

AJ-Ianozi · October 16, 2024, 2:22am

Coincidentally, @dmitry-kazakov wrote one of the most influential string libraries with UTF-8 support in Ada back in the day: Strings_Edit, part of the Goliath known as Simple Components.

jere · October 16, 2024, 2:33am

You could check out Dmitry’s library at: Simple Components for Ada

dmitry-kazakov · October 16, 2024, 11:49am

That is a very different question. If you wanted an enumeration type encoded in a specific way, e.g. UTF-8, then Ada has a representation clause for that:

   type Roman_Latin is
        (  ' ','a','b','c','d','e','f','g','h','i','j','k','l',
           'm','n','o','p','q','r','s','t','v','x','y','z'
        );
   for Roman_Latin use
        (  'a' => 97,  'b' => 98,  'c' => 99,  'd' => 100,
           'e' => 101, 'f' => 102, 'g' => 103, 'h' => 104,
           'i' => 105, 'j' => 106, 'k' => 107, 'l' => 108,
           'm' => 109, 'n' => 110, 'o' => 111, 'p' => 112,
           'q' => 113, 'r' => 114, 's' => 115, 't' => 116,
           'v' => 118, 'x' => 120, 'y' => 121, 'z' => 122,
           ' ' => 32
        );
   type Roman_String is array (Positive range <>) of Roman_Latin;

   A : constant Roman_String := "veni vidi vici";

Here the internal codes of the Roman_Latin characters are their Unicode code points.
What you cannot do is make Roman_Latin a subtype of Character or Wide_Character. The Ada type system supports only range constraints, no holes.
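
To see the effect of such a representation clause, here is a small sketch on a cut-down, hypothetical type `Vowel` (not part of the code above). It assumes the `'Enum_Rep` attribute, which GNAT has long provided and which Ada 2022 standardizes; `'Pos` still reports the position in the declaration, while `'Enum_Rep` reports the internal code:

```ada
with Ada.Text_IO;

procedure Enum_Rep_Demo is
   --  Hypothetical four-letter type, encoded with Unicode code points.
   type Vowel is ('a', 'e', 'i', 'o');
   for Vowel use ('a' => 97, 'e' => 101, 'i' => 105, 'o' => 111);
begin
   --  Position number: 'e' is the second literal, so position 1.
   Ada.Text_IO.Put_Line (Integer'Image (Vowel'Pos ('e')));
   --  Internal code as given in the representation clause: 101.
   Ada.Text_IO.Put_Line (Integer'Image (Vowel'Enum_Rep ('e')));
end Enum_Rep_Demo;
```

The clause changes only how values are stored; ordering, `'Succ`, `'Pred` and array indexing keep working on position numbers.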

evanescente-ondine · October 16, 2024, 12:42pm

I think it’s bad form… Whenever I can reasonably predict a size, or the number of loop iterations a problem requires, I beat myself up until I get the algorithm perfect. Isn’t this similar to the discussion over garbage collection? I fear some practices make us lazy, and then we wonder why lightning-fast, complete graphical OSes used to fit in a few MB.

OneWingedShark · October 16, 2024, 11:44pm

IIRC, you can do that with the Static_Predicate aspect.

-- Subtype renaming for convenient switching of subtype derivation.
Subtype Code_Points is Wide_Wide_Character;

-- Subtypes defining the parts.
Subtype Upper is Code_Points range 'A'..'Z';
Subtype Lower is Code_Points range 'a'..'z';
Subtype Digit is Code_Points range '0'..'9';
Subtype Other is Code_Points
   with Static_Predicate => Other in ' ' | '.' | ',' | '?' | '(' | ')' | '"' | ''' | ':';

-- Combined subtype.
Subtype New_Roman_Latin is Code_Points
  with Static_Predicate => New_Roman_Latin in Upper | Lower | Digit | Other;

dmitry-kazakov · October 17, 2024, 7:30am

Yes, I forgot that. Then it would be:

   subtype Roman_Latin is Character
      with Static_Predicate => Roman_Latin in
           ' '|'a'|'b'|'c'|'d'|'e'|'f'|'g'|'h'|'i'|'j'|'k'|'l'|
           'm'|'n'|'o'|'p'|'q'|'r'|'s'|'t'|'v'|'x'|'y'|'z';

OneWingedShark · October 17, 2024, 6:25pm

You can simplify that:

   subtype Roman_Latin is Character
      with Static_Predicate => Roman_Latin in ' '|'a'..'z';

dmitry-kazakov · October 17, 2024, 6:59pm

The Latin alphabet has no U or W.

   subtype Roman_Latin is Character
      with Static_Predicate => Roman_Latin in
           ' ' | 'a'..'t'|'v'|'x'..'z';
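
A short sketch of how a predicated subtype like this can be used, assuming an Ada 2012 compiler; the helper `Is_Roman` and the procedure name are made up for illustration. A membership test against the subtype evaluates the predicate, so the holes at 'u' and 'w' are respected:

```ada
with Ada.Text_IO;

procedure Predicate_Demo is
   subtype Roman_Latin is Character
      with Static_Predicate => Roman_Latin in
           ' ' | 'a'..'t' | 'v' | 'x'..'z';

   --  True when every character of Text satisfies the predicate.
   function Is_Roman (Text : String) return Boolean is
     (for all C of Text => C in Roman_Latin);
begin
   Ada.Text_IO.Put_Line (Boolean'Image (Is_Roman ("veni vidi vici")));  --  TRUE
   Ada.Text_IO.Put_Line (Boolean'Image (Is_Roman ("wunderbar")));       --  FALSE
end Predicate_Demo;
```

Because the predicate is static, the subtype can also be used as a choice in a `case` statement, which a Dynamic_Predicate would not allow.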