Yes, I think that I was expecting it to automatically generate a “>” operator. Got it now, thanks to you, OneWingedShark, and dmitry-kazakov.
Oh, thank you.
I’m glad it helps.
In the program I've been working on, the one that sparked this whole discussion, I am trying to populate an array of records. I read the file line by line and parse each line on a delimiter. I don't know the length of the text between delimiters, which is why I was using unbounded strings. Is there a way around this? I guess I could figure out the maximum size each delimited entry could potentially be and then allocate that much. For some reason I thought that this wasn't going to work, but it did! I had assumed I needed unbounded strings because of the variable string lengths. I appreciate the tip, as I don't like wasting memory/space and like to be efficient.
Open (File_Handle, Mode => In_File, Name => In_File_Name);

type Order_Entry is record
   Order_Name   : Wide_String (1 .. 60);  -- the name is usually 3 times longer than the status
   Order_Status : Wide_String (1 .. 20);
end record;

Order_Array : array (Integer range 1 .. 5) of Order_Entry;

while not End_Of_File (File_Handle) loop
   Parsed_Field_List :=
      PragmARC.Line_Fields.Parsed (Get_Line (File_Handle), '|');
   Order_Array (Index).Order_Name   := Parsed_Field_List (1);
   Order_Array (Index).Order_Status := Parsed_Field_List (2);
   Index := Index + 1;
end loop;
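One caveat with the fixed-length components above: a plain assignment raises Constraint_Error whenever the parsed field is not exactly 60 (or 20) characters long. Below is a minimal sketch of one way to pad or truncate into the fixed field, assuming the parsed text has already been converted to a Wide_String; Store_Name is a made-up helper, not part of PragmARC.

with Ada.Strings;            use Ada.Strings;
with Ada.Strings.Wide_Fixed; use Ada.Strings.Wide_Fixed;

--  Hypothetical helper: copy a variable-length field into the fixed-length
--  Order_Name component, blank-padding short values and truncating long ones.
procedure Store_Name (Into : in out Order_Entry; Field : Wide_String) is
begin
   Move (Source  => Field,
         Target  => Into.Order_Name,
         Drop    => Right,        --  silently drop anything past 60 characters
         Justify => Left,
         Pad     => Wide_Space);  --  blank-fill shorter values
end Store_Name;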
Here is the implementation I meant:
- kept sorted
- has embedded indices: by name and by status
- no memory overhead
You can walk the list in either name order or status order.
The name index is the one that keeps the elements. It ignores Status in comparisons, so names are unique.
The status index compares first by status and then by name. For example, with the entries (A, shipped), (B, pending) and (C, shipped), the name index yields A, B, C while the status index yields B, A, C.
with Ada.Finalization;
with Generic_Set;

package Order_Lists is
   type Order_Entry (Name_Length, Status_Length : Natural) is record
      Name   : String (1..Name_Length);
      Status : String (1..Status_Length);
   end record;

   type Order_List is tagged limited private;

   procedure Add (List : in out Order_List; Name, Status : String);
   function Get_By_Name (List : Order_List; Index : Positive)
      return Order_Entry;
   function Get_By_Status (List : Order_List; Index : Positive)
      return Order_Entry;
   function Get_Size (List : Order_List) return Natural;

private
   type By_Name_Ptr is access Order_Entry;
   function "<" (Left, Right : By_Name_Ptr) return Boolean;
   function Equal (Left, Right : By_Name_Ptr) return Boolean;
   package By_Name_Sets is new Generic_Set (By_Name_Ptr, null, "=" => Equal);

   type By_Status_Ptr is access all Order_Entry;
   function "<" (Left, Right : By_Status_Ptr) return Boolean;
   package By_Status_Sets is new Generic_Set (By_Status_Ptr, null);

   type Order_List is
      new Ada.Finalization.Limited_Controlled with
   record
      By_Name   : By_Name_Sets.Set;
      By_Status : By_Status_Sets.Set;
   end record;

   overriding procedure Finalize (List : in out Order_List);
end Order_Lists;
The body
with Ada.Unchecked_Deallocation;

package body Order_Lists is
   procedure Free is
      new Ada.Unchecked_Deallocation (Order_Entry, By_Name_Ptr);

   function "<" (Left, Right : By_Name_Ptr) return Boolean is
   begin
      if Left = null then
         return Right /= null;
      elsif Right = null then
         return False;
      else
         return Left.Name < Right.Name;
      end if;
   end "<";

   function Equal (Left, Right : By_Name_Ptr) return Boolean is
   begin
      if Left = null then
         return Right = null;
      elsif Right = null then
         return False;
      else
         return Left.Name = Right.Name;
      end if;
   end Equal;

   function "<" (Left, Right : By_Status_Ptr) return Boolean is
   begin
      if Left = null then
         return Right /= null;
      elsif Right = null then
         return False;
      elsif Left.Status = Right.Status then
         return Left.Name < Right.Name;
      else
         return Left.Status < Right.Status;
      end if;
   end "<";

   function "=" (Left, Right : By_Status_Ptr) return Boolean is
   begin
      return Left.all = Right.all;
   end "=";

   procedure Add (List : in out Order_List; Name, Status : String) is
      Item : By_Name_Ptr :=
         new Order_Entry'(Name'Length, Status'Length, Name, Status);
   begin
      if List.By_Name.Is_In (Item) then
         Free (Item);
         raise Constraint_Error with Name & " is already in the list";
      end if;
      List.By_Name.Add (Item);
      List.By_Status.Add (Item.all'Unchecked_Access);
   end Add;

   function Get_By_Name (List : Order_List; Index : Positive)
      return Order_Entry is
   begin
      return List.By_Name.Get (Index).all;
   end Get_By_Name;

   function Get_By_Status (List : Order_List; Index : Positive)
      return Order_Entry is
   begin
      return List.By_Status.Get (Index).all;
   end Get_By_Status;

   function Get_Size (List : Order_List) return Natural is
   begin
      return List.By_Name.Get_Size;
   end Get_Size;

   procedure Finalize (List : in out Order_List) is
      Item : By_Name_Ptr;
   begin
      for Index in 1..List.By_Name.Get_Size loop
         Item := List.By_Name.Get (Index);
         Free (Item);
      end loop;
   end Finalize;
end Order_Lists;
This is a test program. It is a bit long because it contains full error diagnostics. It also illustrates how to avoid tokenizing: there is never any need to extract fields into separate objects or to copy them, even though the fields are strings.
It reads the file and prints the contents sorted by status.
Note that there is no End_Of_File test; the loop simply runs until Get_Line raises End_Error.
with Ada.Text_IO;            use Ada.Text_IO;
with Ada.Command_Line;       use Ada.Command_Line;
with Ada.Exceptions;         use Ada.Exceptions;
with Strings_Edit;           use Strings_Edit;
with Strings_Edit.Integers;  use Strings_Edit.Integers;
with Order_Lists;

procedure Test is
   function Where (Line, Column : Positive) return String is
   begin
      return Image (Line) & ':' & Image (Column);
   end Where;

   File   : File_Type;
   Orders : Order_Lists.Order_List;
   Count  : Natural := 0;
begin
   case Argument_Count is
      when 1 =>
         begin
            Open (File, In_File, Argument (1));
         exception
            when others =>
               Put_Line ("Cannot open file: " & Argument (1));
               Set_Exit_Status (Failure);
               return;
         end;
      when others =>
         Put_Line ("Usage: test <file-name>");
         Set_Exit_Status (Failure);
         return;
   end case;
   begin
      loop
         Count := Count + 1;
         declare
            Line         : constant String := Get_Line (File);
            Pointer      : Integer := Line'First;
            Name_Start   : Integer;
            Name_Stop    : Integer;
            Status_Start : Integer;
            Status_Stop  : Integer;
         begin
            Get (Line, Pointer, SpaceAndTab); -- Skip blanks
            if Pointer > Line'Last then
               raise Data_Error with
                     "Name is expected at " & Where (Count, Pointer);
            end if;
            Name_Start := Pointer;
            Name_Stop  := Pointer - 1;
            while Pointer <= Line'Last loop
               case Line (Pointer) is
                  when '|' =>
                     Pointer := Pointer + 1;
                     exit;
                  when ' ' | Character'Val (9) =>  -- Space or HT
                     null; -- Truncate trailing blanks
                  when others =>
                     Name_Stop := Pointer;
               end case;
               Pointer := Pointer + 1;
            end loop;
            if Name_Start > Name_Stop then
               raise Data_Error with
                     "Name is empty at " & Where (Count, Name_Start);
            end if;
            Get (Line, Pointer, SpaceAndTab);
            if Pointer > Line'Last then
               raise Data_Error with
                     "Status is expected at " & Where (Count, Pointer);
            end if;
            Status_Start := Pointer;
            Status_Stop  := Pointer - 1;
            while Pointer <= Line'Last loop
               case Line (Pointer) is
                  when '|' =>
                     raise Data_Error with
                           "Unexpected field at " & Where (Count, Pointer);
                  when ' ' | Character'Val (9) =>  -- Space or HT
                     null;
                  when others =>
                     Status_Stop := Pointer;
               end case;
               Pointer := Pointer + 1;
            end loop;
            if Status_Start > Status_Stop then
               raise Data_Error with
                     "Status is empty at " & Where (Count, Status_Start);
            end if;
            Orders.Add
            (  Line (Name_Start..Name_Stop),
               Line (Status_Start..Status_Stop)
            );
         end;
      end loop;
   exception
      when End_Error =>
         Close (File);
      when Error : others =>
         Put ("Line " & Image (Count) & " error: ");
         Put (Exception_Information (Error));
         New_Line;
         Close (File);
         Set_Exit_Status (Failure);
         return;
   end;

   for Index in 1..Orders.Get_Size loop
      Put ("Name: " & Orders.Get_By_Status (Index).Name);
      Put (", Status: " & Orders.Get_By_Status (Index).Status);
      New_Line;
   end loop;
   Set_Exit_Status (Success);
exception
   when Error : others =>
      Put_Line ("Error:" & Exception_Information (Error));
      Set_Exit_Status (Failure);
end Test;
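For a concrete picture, here is a hypothetical pipe-delimited input the program above would accept:

Widget assembly | pending
Gear housing | shipped
Axle mount | pending

and the output it would print, ordered by status and then by name within equal statuses:

Name: Axle mount, Status: pending
Name: Widget assembly, Status: pending
Name: Gear housing, Status: shipped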
Sure; the way to get around this is to lean into the nature of the file you already have: that is, you can use fixed strings, sized as needed.
As an example:
With Ada.Containers.Indefinite_Vectors;
With Ada.Text_IO;  Use Ada.Text_IO;

Package Example is

   Type Names(First_Length, Last_Length : Natural) is record
      First_Name : String(1..First_Length);
      Last_Name  : String(1..Last_Length);
   end record;

   Package Name_List is new Ada.Containers.Indefinite_Vectors
     ([...] Element_Type => Names [...]);

   Function Get_Names(Input : in out File_Type) return Name_List.Vector;

End Example;
And we can stash all the “ugly bits” in the implementation:
With Ada.Strings.Fixed;

Package Body Example is

   -- Note the nice interface.
   Function Get_Names(Input : in out File_Type) return Name_List.Vector is

      -- And here we make advantageous use of defaulting.
      Function Parse(Text : String := Get_Line(Input)) return Names is
         Delimiter : Constant String := ", ";
         Comma     : Natural renames Ada.Strings.Fixed.Index
                       ( Pattern => Delimiter, Source => Text );
      Begin
         -- We assume a form of LASTNAME, FIRSTNAME
         if Comma not in Positive then
            Raise Constraint_Error with "Malformed line: " & Text;
         end if;

         From_Form:
         Declare
            Last  : String renames Text( Text'First..Natural'Pred(Comma) );
            First : String renames Text( Comma+Delimiter'Length..Text'Last );
         Begin
            Return Names'( First_Length => First'Length, Last_Length => Last'Length,
                           First_Name   => First,        Last_Name    => Last );
         End From_Form;
      End Parse;

   Begin
      Return Result : Name_List.Vector := Name_List.Empty_Vector do
         Loop
            Result.Append( Parse ); -- We just keep appending new names until we run out.
         End Loop;
      Exception
         when End_Error => Close( Input );
      End return;
   End Get_Names;

End Example;
And that is how you can avoid using Unbounded_Strings, how you can “stash” tricky/ugly bits, and how you can leverage (a) renames, (b) defaults, and (c) exceptions to make the core-logic straightforward.
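For what it is worth, a minimal driver for the package above might look like this; the procedure name and file name are made up for illustration, and note that Get_Names closes the file itself once Get_Line raises End_Error:

with Ada.Text_IO; use Ada.Text_IO;
with Example;

procedure Demo is
   Input : File_Type;
begin
   Open (Input, In_File, "names.txt");  -- hypothetical input file
   declare
      People : constant Example.Name_List.Vector := Example.Get_Names (Input);
   begin
      -- Print each record; every field is exactly as long as the parsed text.
      for Person of People loop
         Put_Line (Person.First_Name & ' ' & Person.Last_Name);
      end loop;
   end;
end Demo;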
This is a lot to digest. I haven't had the opportunity to try out this code, but I appreciate it. At first glance, it seems to add a lot more complexity to my code. Tokenizing (parsing the table data into a record) seems more intuitive and easier to read, but I'm open to exploring this new approach. I'd be interested to measure the memory overhead and processing time of my implementation against yours and see how much this approach reduces them. I don't know the best way to benchmark my code; I was going to start another thread on that, e.g. how to see the memory usage of an Ada executable. One thing I found out last week is that my antivirus program goes berserk when running the Ada executable, so that is why my processor load was so high. I had thought it was because I wrote a really poorly coded program.
Tokenizing parses into strings. You then have to do a second pass to re-parse those strings, e.g. trimming them or extracting numbers. Then you need to add error handling. Only then can you judge readability.
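To make the point concrete, here is a small sketch of that second pass over tokens; Token_List and the sample values are made up, only the standard packages are real. Each token still has to be trimmed and converted, and each conversion needs its own error handling:

with Ada.Strings;           use Ada.Strings;
with Ada.Strings.Fixed;     use Ada.Strings.Fixed;
with Ada.Strings.Unbounded; use Ada.Strings.Unbounded;
with Ada.Text_IO;           use Ada.Text_IO;

procedure Second_Pass_Demo is
   -- Pretend these came back from a tokenizer's first pass.
   type Token_List is array (Positive range <>) of Unbounded_String;
   Tokens : constant Token_List :=
      (To_Unbounded_String ("  Widget  "), To_Unbounded_String (" 42 "));

   Name : constant String := Trim (To_String (Tokens (1)), Both); -- second pass: trim
   Qty  : Natural;
begin
   begin
      Qty := Natural'Value (Trim (To_String (Tokens (2)), Both)); -- second pass: convert
   exception
      when Constraint_Error =>
         raise Data_Error with "field 2 is not a number";
   end;
   Put_Line (Name & ':' & Natural'Image (Qty));
end Second_Pass_Demo;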
This is the problem with most code that uses libraries: they get you 80% of the way there, and the remaining 20% requires a total redesign. Be it tokens or containers. The worst offenders are of course XML, JSON and similar meaningless stuff. They are OK for run-once programs, though.
Interesting, but unlikely, unless you have to parse megabytes of text data, e.g. log files. In that case tokenizing and Unbounded_String really can bring the CPU to its knees.