Scalar object 'Valid attribute

I would like to better understand the rationale for the 'Valid attribute for scalar objects, as it seems backward to me. Take the case of using Unchecked_Conversion to assign an integer value to an enumeration object whose type has a representation clause giving each literal an integer value. The unchecked conversion is performed, giving the object a (potentially erroneous) value; then 'Valid is applied to check whether the object is erroneous and, if so, the error is handled or corrected.

It seems it would be easier if 'Valid could be applied to the enumeration type, so that the integer could be checked with My_Enum'Valid (My_Int) before the assignment and erroneous objects could be avoided altogether. This could be wrapped in a simple function:

function To_Protocol_ID (id : C.unsigned_short) return Protocol_ID is
   (if Protocol_ID'Valid (id) then Protocol_ID'enum_val (id) else Protocol_None);

The best general solution I came up with is a Checked_Conversion generic, wrapping Unchecked_Conversion where an enumeration default value is returned in the case where 'Valid returns false. The conversion default is specified at instantiation or as a parameter replacing the instantiated default at checked conversion time.

   generic
      type Source is (<>);
      type Target is (<>);
      Target_Default : Target;
   function Checked_Conversion (S : Source;
                                Default : Target := Target_Default) return Target;

   function Checked_Conversion (S : Source;
                                Default : Target := Target_Default) return Target is
      function To_Target is new Ada.Unchecked_Conversion (Source, Target);
      Obj : constant Target := To_Target (S);
   begin
      return (if Obj'Valid then Obj else Default);
   end Checked_Conversion;

Is there a better way? If not, why doesn’t Ada have Ada.Checked_Conversion?

Below is test code using the generic and its output.

with Ada.Text_IO; use Ada.Text_IO;
with Ada.Unchecked_Conversion;
with Interfaces.C;

procedure Test_Valid_Generic is

   package C renames Interfaces.C;

   type Protocol_ID is (Protocol_None, Protocol_A, Protocol_B, Protocol_C);
   for Protocol_ID use (Protocol_None => 0, Protocol_A => 10,
                        Protocol_B => 20, Protocol_C => 30);
   for Protocol_ID'Size use C.unsigned_short'size;  -- prevents compiler warning

   id_valid   : C.unsigned_short := 10;
   id_invalid : C.unsigned_short := 15;

   generic
      type Source is (<>);
      type Target is (<>);
      Target_Default : Target;
   function Checked_Conversion (S : Source;
                                Default : Target := Target_Default) return Target;

   function Checked_Conversion (S : Source;
                                Default : Target := Target_Default) return Target is
      function To_Target is new Ada.Unchecked_Conversion (Source, Target);
      Obj : constant Target := To_Target (S);
   begin
      return (if Obj'Valid then Obj else Default);
   end Checked_Conversion;

   function To_Protocol_ID is new Checked_Conversion
                                 (C.unsigned_short, Protocol_ID, Protocol_None);

begin
   Put_Line ("Testing 'Valid attribute on Protocol_ID enumeration type...");
   Put_Line ("   id_valid : C.unsigned_short = " & id_valid'image);
   Put_Line ("   To_Protocol_ID (id_valid) = " & To_Protocol_ID (id_valid)'image);
   New_Line;
   Put_Line ("   id_invalid : C.unsigned_short = " & id_invalid'image);
   Put_Line ("   To_Protocol_ID (id_invalid) = " & To_Protocol_ID (id_invalid)'image);
   New_Line;
   Put_Line ("   id_invalid : C.unsigned_short = " & id_invalid'image);
   Put_Line ("   To_Protocol_ID (id_invalid) = " & To_Protocol_ID (id_invalid, Default => Protocol_B)'image);

end Test_Valid_Generic;

$ ./test_valid_generic
Testing 'Valid attribute on Protocol_ID enumeration type...
   id_valid : C.unsigned_short =  10
   To_Protocol_ID (id_valid) = PROTOCOL_A

   id_invalid : C.unsigned_short =  15
   To_Protocol_ID (id_invalid) = PROTOCOL_NONE

   id_invalid : C.unsigned_short =  15
   To_Protocol_ID (id_invalid) = PROTOCOL_B

I tend to agree with you, but there are some reasons it is how it is.
IIUC, the reason it applies to the value rather than the type/subtype is to support things like (a) memory overlays and (b) memory-mapped I/O. I believe the rationale was that you could define your [sub]type with all the correct values, use an overlay, mapped I/O, or unchecked conversion on (e.g.) incoming data, and then apply Val'Valid to that object, rather than something like:

declare
   Raw_Data : Integer with Import, Address => IO_Address;
begin
   --  Target_Type'Valid is the hypothetical type-level attribute, not real Ada
   if Target_Type'Valid (Raw_Data) then
      -- …
   end if;
end;

As you can see, the problem with a Type'Valid attribute is that it would hide a conversion, test the result for validity, and then return only a Boolean, meaning that you lose the converted value even when it is valid and have to convert again. We could have had it work on the type if we'd had attributes that denote functions with an out parameter, but 'Valid predates Ada 2012.

@sttaft would have more details on it.

Attribute 'Valid is meant for types whose machine representation may include patterns that are not valid for the type. No less, no more.

All misuses of 'Valid are on you. Normally you should never use 'Valid.

There is no way a compiler could convert a numeric type to enumeration without additional knowledge.

Regarding your example, under the assumption that the ID corresponds to the Protocol_ID position, the proper method is

   function To_ID (Position : Unsigned_Short) return Protocol_ID is
   begin
      --  Reject positions beyond the last literal of Protocol_ID
      if Position > Protocol_ID'Pos (Protocol_ID'Last) then
         raise Constraint_Error with "Invalid argument";
      else
         return Protocol_ID'Val (Position);
      end if;
   end To_ID;

As you can see, the problem with a Type'Valid attribute is that it would hide a conversion, test the result for validity, and then return only a Boolean, meaning that you lose the converted value even when it is valid and have to convert again. We could have had it work on the type if we'd had attributes that denote functions with an out parameter, but 'Valid predates Ada 2012.

I am not familiar with these type conversion mechanics and thought this check was pretty simple. In the case of Protocol_ID, just take the candidate value, look it up in a table containing the valid values of 10, 20, and 30, and return a Boolean, which avoids creating an erroneous object and/or raising Constraint_Error. A valid object of type Protocol_ID can then be created without an exception occurring.
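The lookup described above can be sketched in Ada 2012 as a membership test on the raw integer, before any Protocol_ID object exists. This is only a sketch: Check_Before_Convert and Is_Known_ID are hypothetical names, and the list of codes is copied by hand from the representation clause in the test program (including 0 for Protocol_None).

```ada
with Ada.Text_IO; use Ada.Text_IO;
with Interfaces.C;

procedure Check_Before_Convert is
   package C renames Interfaces.C;

   --  Hypothetical helper: True when Raw is one of the codes that appear
   --  in the Protocol_ID representation clause (0, 10, 20, 30).
   function Is_Known_ID (Raw : C.unsigned_short) return Boolean is
     (Raw in 0 | 10 | 20 | 30);
begin
   Put_Line (Boolean'Image (Is_Known_ID (10)));  --  prints TRUE
   Put_Line (Boolean'Image (Is_Known_ID (15)));  --  prints FALSE
end Check_Before_Convert;
```

The drawback is that the codes now live in two places, the representation clause and the membership test, and nothing forces them to stay in step.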

IMO, enumeration types have been broken since the language's inception, as you can't (easily) model values of this nature, and it looked like 'Valid, 'Enum_Rep, and 'Enum_Val had resolved this. If someone would be kind enough to point me to the correct section of the ARM / rationale document(s), I can just read about it. I've been away from Ada since 2005 and have lost track of the specification documents. A quick Google search did not provide any results; I will try again.

No, the ID does not correspond to the position in Protocol_ID. In the example, I focused on doing the conversion to a valid value and did not explain the larger context.

The example is a distillation of network packet reception code where a packet is received from a network and validated. As part of this process, the protocol's ID must be correctly identified. The protocol field is 16 bits wide with 2^16 possible values; in the example only 10, 20 & 30 are valid values, and any packet not having one of them should be quickly discarded. It seemed to me that the new enumeration attributes would support this, and I'm still of that opinion.

In the example, id_valid and id_invalid represent two notional examples of a protocol ID field. The field is not received from the actual hardware, but from C library code that interfaces with the hardware. Initially, I thought I would validate the field by applying Protocol_ID'Valid (id). I now realize that an entire packet header can be overlaid and Packet_Header.ID'Valid applied to the header field of type Protocol_ID to determine whether it is valid. The packet can then be kept or discarded without an exception being raised.
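A minimal sketch of that overlay approach follows. The two-field Packet_Header layout, the Raw buffer, and the procedure name are assumptions made up for illustration, not the real packet format; the record representation clause also assumes 8-bit storage units and a 16-bit C.unsigned_short.

```ada
with Ada.Text_IO; use Ada.Text_IO;
with Interfaces.C;

procedure Overlay_Sketch is
   package C renames Interfaces.C;

   type Protocol_ID is (Protocol_None, Protocol_A, Protocol_B, Protocol_C);
   for Protocol_ID use (Protocol_None => 0, Protocol_A => 10,
                        Protocol_B => 20, Protocol_C => 30);
   for Protocol_ID'Size use C.unsigned_short'Size;

   --  Hypothetical header: a protocol ID followed by a length field.
   type Packet_Header is record
      ID     : Protocol_ID;
      Length : C.unsigned_short;
   end record;
   for Packet_Header use record
      ID     at 0 range 0 .. 15;
      Length at 2 range 0 .. 15;
   end record;

   --  Stand-in for the bytes handed over by the C library.
   type Raw_Words is array (1 .. 2) of C.unsigned_short;
   Raw : aliased Raw_Words := (15, 64);

   --  Overlay the typed header on the raw words; nothing is copied.
   Header : Packet_Header with Import, Address => Raw'Address;
begin
   --  'Valid reads only the representation of the ID field.
   if Header.ID'Valid then
      Put_Line ("keep packet: " & Protocol_ID'Image (Header.ID));
   else
      Put_Line ("discard packet");
   end if;
end Overlay_Sketch;
```

With the raw value 15 in the ID slot, the 'Valid test is expected to fail and the discard branch to run, matching the id_invalid case in the test program above.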

If you want to convert some sparse number to enumeration use a case statement.
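A sketch of the case-statement approach, assuming the same codes as the example and no representation clause on the type. Mapping unknown codes to Protocol_None is an assumption borrowed from the original post; raising an exception in the others branch would be an equally reasonable choice. Case_Sketch is a hypothetical name.

```ada
with Ada.Text_IO; use Ada.Text_IO;
with Interfaces.C;

procedure Case_Sketch is
   package C renames Interfaces.C;

   --  No representation clause: the wire codes appear only in the decoder.
   type Protocol_ID is (Protocol_None, Protocol_A, Protocol_B, Protocol_C);

   function To_Protocol_ID (Raw : C.unsigned_short) return Protocol_ID is
   begin
      case Raw is
         when 10     => return Protocol_A;
         when 20     => return Protocol_B;
         when 30     => return Protocol_C;
         when others => return Protocol_None;  --  assumed fallback
      end case;
   end To_Protocol_ID;
begin
   Put_Line (Protocol_ID'Image (To_Protocol_ID (10)));  --  prints PROTOCOL_A
   Put_Line (Protocol_ID'Image (To_Protocol_ID (15)));  --  prints PROTOCOL_NONE
end Case_Sketch;
```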

If I want to use that case statement in more than one location, it needs to be wrapped in a function (it should be wrapped in a function even if only used once). This is the sort of thing that was done for decades prior to the 'Valid, 'Enum_Rep, and 'Enum_Val attributes. It's frustrating to do it the old way, as the enumeration type and its representation clause already have all the information needed; the programmer just lacked access to it. The external case statement replicates that information.

What is the difference between the user-defined function and 'Enum_Val?

S'Enum_Val denotes a function with the following specification:

function S'Enum_Val (Arg : universal_integer) return S'Base

This function returns a value of the type of S whose representation value equals the value of
Arg. For the evaluation of a call on S'Enum_Val, if there is no value in the base range of its
type with the given representation value, Constraint_Error is raised.
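To make the quoted rule concrete, here is a small sketch that wraps 'Enum_Val and turns the Constraint_Error into a default value. It assumes a compiler that supports 'Enum_Val (Ada 2022, or GNAT, where it has long been available as an implementation-defined attribute); Enum_Val_Sketch and the Protocol_None fallback are assumptions for illustration.

```ada
with Ada.Text_IO; use Ada.Text_IO;

procedure Enum_Val_Sketch is
   type Protocol_ID is (Protocol_None, Protocol_A, Protocol_B, Protocol_C);
   for Protocol_ID use (Protocol_None => 0, Protocol_A => 10,
                        Protocol_B => 20, Protocol_C => 30);

   function To_Protocol_ID (Raw : Integer) return Protocol_ID is
   begin
      --  Looks the code up in the representation clause.
      return Protocol_ID'Enum_Val (Raw);
   exception
      when Constraint_Error =>
         --  No literal has this representation value.
         return Protocol_None;
   end To_Protocol_ID;
begin
   Put_Line (Protocol_ID'Image (To_Protocol_ID (10)));
   Put_Line (Protocol_ID'Image (To_Protocol_ID (15)));
end Enum_Val_Sketch;
```

Unlike the hand-written case statement, the codes are stated only once, in the representation clause, at the cost of using an exception for the invalid path.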

Wouldn’t the compiler and runtime system provide the optimal data structure and function for implementing the storage and lookup of the representation clause values? Especially if the number of values changed and/or the density of values changed?

Why?

What I find frustrating is that Ada programmers keep on misusing representation clauses. They should never be used in implementations of network protocols. I have implemented dozens of them and never used representation clauses. Not once.

No. The target Ada type must be optimal for implementing the problem space logic. Storage layout is an implementation detail irrelevant to that. This is the main logical argument against representation clauses. A practical argument is that they never work well as your case vividly demonstrated.

Why?

If you wanted to do the identifier validation in more than one place; I don't have a good example for this. Also, the case statement could get lengthy if the number of identifiers is large, so a function improves code clarity. For example, there are 200+ link-level identifiers.

What I find frustrating is that Ada programmers keep on misusing representation clauses. They should never be used in implementations of network protocols. I have implemented dozens of them and never used representation clauses. Not once.

You used the term “misuse” a couple of times but never explained what you mean by it, so it's just an opinion. I believe you are capable of implementing network protocols in binary code, assembly language, C, C++, Ada … The realm of what's possible is irrelevant and not a proof of correctness. We know Ada representation clauses are not required to implement network protocols, since (some) protocols have been around a lot longer than Ada.

No. The target Ada type must be optimal for implementing the problem space logic. Storage layout is an implementation detail irrelevant to that. This is the main logical argument against representation clauses. A practical argument is that they never work well as your case vividly demonstrated.

My argument is against the way enumeration types were implemented prior to Ada 2022. With the (relatively) new attributes, I expect that argument is moot and enumerations are now a great way to represent these types of identifiers.

Please explain why you would not want to use strong typing and the expressive power of enumerations to represent identifiers and have the compiler/runtime implement it for you.

Because such examples do not exist. You read a packet, you decode it, once. End of story.

Yes, after all everything is Turing Complete…

But they do represent the protocol ID regardless of encoding. There is no reason to use the protocol encoding for language objects. This has nothing to do with strong typing anyway. It is about how such objects are constructed. Representation clauses and unchecked conversions are the worst possible methods of constructing language objects.