How to make a subtype of Character containing just 'a', 'b', 'c', and 'q'?

I’ve tried a whole bunch of things, read the docs, searched the web, experimented, and asked ChatGPT, but I still can’t figure out how to make a subtype of Character that has only ‘a’, ‘b’, ‘c’ and ‘q’. Is this possible, and if so, how do I do it?

Hi!

You can use the Static_Predicate aspect. Here is a piece of code I copied from the Telegram chat group; it handles whitespace, so you will need to adapt it to your characters.

subtype Whitespace is Character
  with Static_Predicate => Whitespace in ASCII.HT | ASCII.VT | ASCII.CR | ' ';

I hope this helps. Also, Character is just an enumeration as far as I know, so you can play around with that too.

Best,
Fer


Thanks Fer,

It turns out that you can find several different syntaxes for Static_Predicate. Inspired by your Whitespace example, I mixed and matched several non-compiling syntaxes until I got the following, which compiles on my system:

   subtype Valid_Choices is Character
         with Static_Predicate =>
         Valid_Choices in
         'a' | 'b' | 'c' | 'd' | 'q';

Thanks for steering me in the right direction!
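
For the four characters in the original question, a minimal sketch might look like the following. The names Choice_Demo, Menu_Choice and Describe are made up for illustration, and the Static_Predicate aspect assumes an Ada 2012 or later compiler.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Choice_Demo is
   --  Hypothetical subtype: only 'a', 'b', 'c' and 'q' satisfy the predicate.
   subtype Menu_Choice is Character
     with Static_Predicate => Menu_Choice in 'a' | 'b' | 'c' | 'q';

   procedure Describe (C : Character) is
   begin
      --  A membership test against the subtype evaluates its predicate.
      if C in Menu_Choice then
         Put_Line (C & " is a valid choice");
      else
         Put_Line (C & " is not a valid choice");
      end if;
   end Describe;
begin
   Describe ('a');
   Describe ('d');
   Describe ('q');
end Choice_Demo;
```

Because the predicate is static, a subtype like this can also be used as a choice in a case statement.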


You could even use ranges in the predicate, by the way. The correct syntax is given in the Reference Manual. Here is the link to that page, followed by its example: 3.2 Types and Subtypes | Ada Programming Language

subtype Basic_Letter is Character -- See A.3.2 for "basic letter".
   with Static_Predicate => Basic_Letter in 'A'..'Z' | 'a'..'z' | 'Æ' | 
                             'æ' | 'Ð' | 'ð' | 'Þ' | 'þ' | 'ß';

Exactly so!
Well, mostly… Ada has a special notion of a “character type” which is any enumeration where at least one element thereof is a character-literal surrounded by single apostrophes — the thing that’s special about a “character type” is that a value of an array of that type can be denoted with double-quotes… so:

  Type Example is ( One, Two, Three, '.' );

   Value : Array(positive range <>) of Example:=
      "..." & One &"..."; -- is shorthand for:
      -- ('.','.','.',One,'.','.','.')

But, yes, it is an enumeration.


Character is a magic enumeration type, since it has features that you can’t implement in an enumeration type of your own. But character types that we can implement are just enumeration types.

What “magic” properties does Character have that user-defined character-types do not?

That you can represent an array of characters as a string literal, for example.

No, string literals work for any string type.

As for how Character is magic, what is the literal for Character'Val (0)?

It’s not a character-literal.
The definition, say ASCII (for simplicity), would be:

Type ASCII is 
 (NUL, -- NUL, a non-printable (and non-literalable) control, #0.
       -- other control characters
  ' ', -- SPACE, the first printable (and character-literal), #32.
  '!', -- EXCLAMATION-MARK, also a character-literal, #33.
       -- other printable characters: '"'..'~', #34..#126.
  DEL  -- Also a non-printable control-character, #127.
 ) with Size => 7;

There is no magic here.
In fact, I’ve used custom character-sets where I did something like:

 Type Custom is
 ( 'A', 'B', 'C', 'D',
   'E', 'F', 'G', 'H',
   Tab,
   'I', 'J', 'K'
);

Or something similar, so that I could have the horizontal-tab character available… Just because Custom'Val(8) is Tab (and therefore not representable as a character-literal) doesn’t make it magic. (And, yes, I realize that overlaying the character-type on ASCII is data-punning and not great practice.)

I believe my example showed just that.
That some character-type might have elements which are not representable by a character-literal is irrelevant; arguably, making them representable would introduce an undecidable problem. For example: “Given a program text which has a string containing something like a Unicode left-to-right mark, is this meant for display purposes, or for data purposes?”

Having this control-code as a non-literal means that any string containing it has to construct it using the name; so something like:

-- U+200E is the LEFT-TO-RIGHT MARK (U+202E would be the RIGHT-TO-LEFT OVERRIDE).
Left_To_Right : Constant Wide_Wide_Character:= Wide_Wide_Character'Val(16#0000_200E#);
Example : Constant Wide_Wide_String:=
             "Something" & Left_To_Right & "Else";

Thus we prevent injection of unwanted data/behaviors into our string-literals.

No, and I did not say “character literal”.

Your examples are standard character types. “Magic” is something the language can do that a user of the language cannot do. Since your examples are things that a user of the language can do, they are clearly not magic.

In your ASCII example, the literal for ASCII'Val (0) is NUL. In your Custom example, the literal for Custom'Val (8) is Tab.

But what is the literal for Character'Val (0)?

As per Standard:

type Character is
     (nul,     soh,   stx,    etx,      eot,   enq,    ack,   bel,   --0 (16#00#) .. 7 (16#07#)
      bs,      ht,    lf,     vt,       ff,    cr,     so,    si,    --8 (16#08#) .. 15 (16#0F#)

The literal for Character'Val(0) is nul.

Let’s see:

     1. package Character_Magic is
     2.    C : constant Character := nul;
                                     |
        >>> error: "nul" is not visible
        >>> error: non-visible declaration in package Standard.ASCII
     
     3. end Character_Magic;

So nul is not the literal for Character'Val (0).

Note that in ARM A.1, nul, soh, and the other non-graphic values of Character are in italics. As stated in ARM 3.5.2, “Each of the nongraphic characters … has a corresponding language-defined name, which is not usable as an enumeration literal … these names are given in the definition of type Character in A.1, “The Package Standard”, but are set in italics.”

As we have seen, nul “is not usable as an enumeration literal”. So, what is the literal for Character'Val (0)?

with Ada.Text_IO;

procedure Example is
  C1 : Character renames Character'value( "NUL" );
  C2 : Character renames Character'val( 0 );
  Eq : Boolean   renames "="(C1, C2);
begin
 Ada.Text_IO.Put_Line( "C1 = C2: " & Eq'Image );
end Example;

Output:

C1 = C2: TRUE

So, obviously, NUL is recognized as the appropriate value corresponding to Character'Val( 0 ).

"NUL" is recognized as an input to Character'Value that yields Character'Val (0). This is expected from ARM 3.5.2 (and demonstrated by your example).

We know that NUL is not a literal of type Character (from ARM 3.5.2, and demonstrated by my example).

It should be a simple matter to state the literal for Character'Val (0) (if one exists), yet no one has done so. It should be clear from ARM 3.5.2 that this is because there is no such literal.
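
As a rough illustration of the distinction being argued here, the sketch below names the value at position 0 in three ways, none of which is an enumeration literal of Character: the 'Val attribute, the constant Ada.Characters.Latin_1.NUL, and the constant NUL in the obsolescent package Standard.ASCII. The procedure and object names (Nul_Demo, From_Val and so on) are made up for this example.

```ada
with Ada.Text_IO; use Ada.Text_IO;
with Ada.Characters.Latin_1;

procedure Nul_Demo is
   package Latin_1 renames Ada.Characters.Latin_1;

   --  Three ways to denote the value; none of them is a literal of Character.
   From_Val     : constant Character := Character'Val (0);
   From_Latin_1 : constant Character := Latin_1.NUL;  --  a named constant
   From_ASCII   : constant Character := ASCII.NUL;    --  constant in Standard.ASCII
begin
   --  All three denote the same value.
   Put_Line (Boolean'Image (From_Val = From_Latin_1
                            and then From_Latin_1 = From_ASCII));

   --  'Image yields the language-defined name, which is an image, not a literal.
   Put_Line (Character'Image (From_Val));
end Nul_Demo;
```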

So now the question becomes: How can a user of Ada create an enumeration type with a value that has no literal?

Easily:

type Illiteral is (X, Y, Z);

Think of literals as belonging to a universal type, implicitly converted to the actual type when that type inherits that sort of literal. Yes, yes, inheritance again.

It is a far less magic model of thought which also explains why:

type I is range 1..2;
X : I := 3; -- Raises Constraint_Error

the above is legal, rather than a syntax error.
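
A small sketch of that behaviour, wrapping the declaration in a block so the exception can be handled and reported. The procedure name Range_Demo is an assumption for illustration, and a compiler may well warn at compile time that Constraint_Error will be raised.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Range_Demo is
   type I is range 1 .. 2;
begin
   declare
      X : I := 3;  --  legal Ada, but the range check fails at run time
   begin
      Put_Line (I'Image (X));  --  not reached
   end;
exception
   --  The exception raised while elaborating X propagates out of the block.
   when Constraint_Error =>
      Put_Line ("Constraint_Error raised");
end Range_Demo;
```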

In dealing with universal types, Ada deploys a more powerful type system of implicit type conversions, one that the ARM tries to hide. But sometimes the truth leaks out…

No one has provided an example of this because the language does not allow it. This is why Character is a magic enumeration type.

String, on the other hand, is not magic, yet a lot of people seem to have a problem with that.