Little help about binary trees

evanescente-ondine · September 2, 2024, 2:10am

Hi,
I’m progressing steadily though at my pace in the Adacraft book, now finishing chapter 13, about recursivity (among other things) out of 19.
I can see clear as day that at some tasks, in particular devising algorithms, I suck really, really hard. My mind lacks the imagination. I can apply well something I learnt but not intuit. Can someone give me some hint for the following ?

Write a program which asks the user to pick an animal and then tries to identify it by asking a series of yes/no questions (see exercise 3.3). Use a record containing a string and two pointers. If the pointers are null, the string is the name of an animal; if not, the string is a question to ask and the pointers point to further records of the same type, one for a ‘yes’ answer and one for a ‘no’. The program should ask questions and follow the appropriate pointers until an animal’s name (e.g. ‘aardvark’) is reached, at which point the question ‘Is it an aardvark?’ should be asked. If the user responds ‘no’, the program should ask for the name of the animal

Then tree keeps on growing whenever the computer guesses wrong and asks for the real name and a new question.

The data structure described in exercise 11.2 is known as a binary tree. It is a fundamentally recursive structure; the node at the root of the tree points to two other nodes which can be considered to be the roots of two smaller subtrees with the same structure. Modify your solution to exercise 11.2 using a recursive algorithm for searching the tree.

I made the iterative version but I don’t see how recursion or binary tree search apply here. I couldn’t see for generating anagrams either, and took a whole hour reading explanations before it clicked… I’m that slow for this kind of things.

I hope you don’t mind assistng for non strictly Ada stuff when I freeze.

bracke · September 2, 2024, 6:11am

Hi

I would create a procedure that handles one node and takes the node as parameter (the first call would take the root node). It then calls it self whenever it has to “move on” to another node and takes that node as a parameter.

The procedure has to figure out whether the animal has been identified or not and thus whether it should make another recursive call.

Hope it helps.

Kind regards
Bent Bracke

cantanima · September 2, 2024, 3:58pm

Don’t let yourself get too frustrated; binary trees and recursion are both very good at deflating most aspiring programmers’ egos. Once we get used to them, they’re not so bad, but only relatively speaking, and I know a few people who’ve been in the business a long while and still tremble at the sight of them.

To @bracke’s hint I would add that recursive functions are almost always simple in their structure. Given an input I, the recursive function F:

Tests if I satisfies a condition. (Often called the “terminal condition”.) In the classical case of the Fibonacci sequence, if I = 1 or I = 2
1. If so, it returns either I or some easily-computed value based on it. Fibonacci: then return 1;.
2. If not, it recurses; that is, it calls itself with one or more different parameters typically obtained by modifying I. Fibonacci: else return F (I - 1) + F (I - 2);.

That’s it. That’s the essence of recursion. If you’re familiar with induction in mathematics, it’s basically the same thing. (If not, don’t worry about it. Induction deflates most aspiring mathematicians’ egos, too.)

Common problems people have with recursion:

they fail to test for every potential terminal condition;
they call the function with the same input parameter, leading to an infinite loop.

In your case, the recursive function, which I’ll call F, takes one input parameter, the record type described in the problem, which I’ll call Node. It is a wee bit more complicated than the skeleton I sketched above:

Test terminal condition: Are both pointer fields null?
1. yes: The node’s string field is an animal, so ask if it’s the animal and obtain the user’s reply.
  - If yes, return the animal’s name.
  - If no,
    - Ask for the actual name.
    - Overwrite the node’s string field.
    - Only then return it.
2. no: The node’s string field is a yes/no question, so:
  - Display the question and ask for the user’s reply.
  - Use the user’s reply to recurse (call itself) with the correct pointer field, something akin to:
```
if Answer = 'Y' then
    return F (Node.Affirmative);
else
    return F (Node.Negative);
end if;
```

I’m afraid that may be simultaneously too much and too little. (I’m also afraid I may have completely misunderstood.) Don’t hesitate to follow up.

evanescente-ondine · September 2, 2024, 5:32pm

Huh, so I’m nothing special in my difficulties. Dunno I should rejoice because I’m not dumb, or depress because they’re likely to persist forever at least to some degree.

I see. The assignment’s goal was simple. But because of it I don’t feel recursion as “natural” at all for this problem while it did feel so when tackling things like syntax checking. If the procedure’s job is too simple, iteration seems cleaner and easier to troubleshoot.

jere · September 2, 2024, 7:13pm

If you want a good example of how to do simple recursion, here’s a practical example.

If you use Ada 95 only, one thing you often can miss out on is the Get_Line function in Ada.Text_IO which wasn’t added until Ada 2005. They could have added it in Ada 95 but didn’t. One way to implement it yourself is with recursion. Ada95 defines a Get_Line procedure like this:

procedure Get_Line(Item : out String;
                   Last : out Natural);

Reference: Input-Output of Characters and Strings

Where Last is the last index of the string that was updated (for cases where the input is shorter than the supplied length of the string. Consider by starting with a basic use of Get_Line in Ada95 without recursion. You have to decide on a maximum line length and hope it is enough:

function Ada_95_Get_Line return String is
        Buffer_Length : constant := 80;
        Result : String(1..Buffer_Length);
        Last   : Natural := 0;
begin
   Get_Line(Result, Last);
   return Result(1..Last);
end Ada_95_Get_Line

But what happens if the line supplied is more than 80 characters? Using this simplistic method we would mistakenly not read the whole line and may not get what we are looking for. You can however modify this to use recursion and get the whole line.

First you need a terminating condition. How do we tell if we reached the end of the line? Well the answer in this case revolves around the Last parameter of Get_Line. If that value is smaller than the buffer size that means the code encounted an end of the line before it finished the Get_Line call. So our terminating condition can be: “If Last is the exact size of the buffer then that means we didn’t encounter the end of line condition, otherwise we did”. In code that could look like this:

if Last = Buffer_Length then
   -- Didn't find EOL, so need to do some recursion here
else
   -- Did find the EOL so need to terminate 
   -- the recursion and send the result
end if;

Modifying the old non recursive example to include recursion:

function Ada_95_Get_Line return String is
    Buffer_Length : constant := 80;
    Result : String(1..Buffer_Length);
    Last   : Natural := 0;
begin
    Get_Line(Result, Last);
    if Last = Buffer_Length then
        return Result & Ada_95_Get_Line;  -- Recursion
    else
        return Result(1..Last); -- Terminate the recursion
    end if;
 end Ada_95_Get_Line;

Now this will read up to 80 characters without recursion but if the line is longer, it will recurse until the whole line is read. If you want to adjust the value so that you have even less chance of recursion you can increase the buffer size to a value

jere · September 2, 2024, 7:41pm

This is a side discussion about the direction the book took (this is not an evaluation of your approach at all):

I know you are following a book there, but I am surprised at how the book approaches the solution. Using things like "if the pointers are null, the string is a name of an animal, … " seem very un-Ada like, so I am surprised the book went that route. If you are willing to entertain a different method on the side that doesn’t match the book exactly, here are my thoughts:

Since there are two distinct node types, I would have gone with an enumeration for the kinds of nodes and a discriminated record to handle each situation. Something like this:

    -- Types for the overall design
    type Node_Kind is (Question, Answer);
    type Question_Result is (No, Yes);
    
    -- Forward declaration and node access type
    type Node;
    type Node_Access is not null access Node;
    
    -- Holds links to various question results
    type Node_Links is array(Question_Result) of Node_Access;
    
    -- Main node type
    type Node(Kind : Node_Kind := Question) is record
        case Kind is
            when Question => 
                Question : Unbounded_String;
                Links    : Node_Links;
            when Answer =>
                Answer   : Unbounded_String;
        end case;
    end record;

The more interesting parts here:

The record discriminant is defaulted. This allows me to use it in an array which is useful for holding a list of links to other nodes.
I use an enumeration for the answer type. This will allow us to directly convert user input to an array index later. You can take the attribute'Value on the input string and if it matches an enumeration name (case insensitive), then it will convert it for you.
My list of node links is indexed off the above enumeration, this can simplify selection logic later by using array indexing instead of IF/ELSE and CASE statements.
I used Unbounded_String since the questions and answers can be any length

For the recursion, now your terminating condition is now if the discriminant is Answer to indicate an answer node.

NOTE: The Node_Access type declaration includes not null which isn’t Ada95 conformant, but this can be removed if you want to either keep it Ada95 conformant.

Since I am using Unbounded_Strings, I generally like to have some helper functions to convert to/from strings and I find the To_String/To_Unbounded_String clutter up too much for me. If they don’t for you or if you just prefer to use them, then the next part is optional:

    -- Utility conversion operations
    function "+"(Value : String) return Unbounded_String
        renames To_Unbounded_String;
    function "+"(Value : Unbounded_String) return String
        renames To_String;

That lets me do things like +"Convert this to Unbounded_String" and vice versa. Again this is optional and too taste.

Now if you want to see my solution to the recursion problem you stated above, I’ll include it in a spoiler section below. It’ll be formulated around my changed interpretation of the problem:


    function Recursive_Guess(Node : Node_Access) return String is
    begin
        -- Terminating condition for recursion
        if Node.Kind = Answer then
            return "My guess is " & (+Node.Answer);
        end if;
        
        -- ask question
        Put(+Node.Question & " ");
        
        -- loop until a valid answer (yes/no) is given
        loop declare
            -- Take the user input here at initialization
            Answer_Str : constant String := Get_Line;
            Answer     : Question_Result;
        begin
            -- append newline to standard out after
            -- the question is answered
            New_Line; 
            
            -- Convert answer to enumeration. Invalid answers
            -- are handled below in an exception handler
            Answer := Question_Result'Value(Answer_Str);
            
            -- Valid answer at this point so call this
            -- function again recursively
            return Recursive_Guess(Node.Links(Answer));
        
        exception
            when Constraint_Error => 
                -- Wrong answer so put error message and loop
                -- again for correct answer
                Put("Invaild Answer, put yes or no: ");
                
        end; end loop; -- end for declare block and loop
        
    end Recursive_Guess;

The nice thing about this process is that the logic is:

Read a line of text as the answer
Convert the line to an enumeration value directly
Select the node based on the enumeration value created

If the user puts in a non yes/no answer, then the exception handler fires, puts an error message and allows the code to loop to try for another answer.

NOTE: Get_Line is not Ada95 conformant, but you can replace that call with the Get_Line function example I talked about in the previous post.

Building a tree can be a bit more tedious when you use variant records, but I really like creating constructor functions for each situation to help make this less painful. For an Answer node, consider:

    function New_Answer
        (Answer_Str : String) 
         return Node_Access 
    is begin
        return new Node'(Kind   => Answer, 
                         Answer => +Answer_Str);
    end New_Answer;

For a new Question:

    function New_Question
        (Question_Str : String;
         No_Node      : Node_Access;
         Yes_Node     : Node_Access)
         return Node_Access
    is begin
        return new Node'(Kind     => Question, 
                         Question => +Question_Str, 
                         Links    =>
                            (No  => No_Node, 
                             Yes => Yes_Node));
    end New_Question;

That allows a pretty simple decision tree like this:

    Root : constant Node_Access :=
        New_Question("Is it a mammal?",
            No_Node => New_Question("Is it a fish?",
                No_Node  => New_Answer("Beetle"),
                Yes_Node => New_Answer("Trout")),
            Yes_Node => New_Answer("Cow"));

Which can be more readable than trying to create the variant records in place.

The full code is below (in spoiler block), tested with the jdoodle online Ada compiler:

with Ada.Text_IO; use Ada.Text_IO;
with Ada.Strings.Unbounded; use Ada.Strings.Unbounded;

procedure jdoodle is

    -- Only needed for Ada_95
    function Ada_95_Get_Line return String is
        Buffer_Length : constant := 80;
        Result : String(1..Buffer_Length);
        Last   : Natural := 0;
    begin
        Get_Line(Result, Last);
        if Last = Buffer_Length then
            return Result & Ada_95_Get_Line;
        else
            return Result(1..Last);
        end if;
    end Ada_95_Get_Line;

    -- Types for the overall design
    type Node_Kind is (Question, Answer);
    type Question_Result is (No, Yes);
    
    -- Forward declaration and node access type
    type Node;
    type Node_Access is not null access Node;
    
    -- Holds links to various question results
    type Node_Links is array(Question_Result) of Node_Access;
    
    -- Main node type
    type Node(Kind : Node_Kind := Question) is record
        case Kind is
            when Question => 
                Question : Unbounded_String;
                Links    : Node_Links;
            when Answer =>
                Answer   : Unbounded_String;
        end case;
    end record;
    
    -- Utility conversion operations
    function "+"(Value : String) return Unbounded_String
        renames To_Unbounded_String;
    function "+"(Value : Unbounded_String) return String
        renames To_String;
    
    -- Work horse function
    function Recursive_Guess(Node : Node_Access) return String is
    begin
        -- Terminating condition for recursion
        if Node.Kind = Answer then
            return "My guess is " & (+Node.Answer);
        end if;
        
        -- ask question
        Put(+Node.Question & " ");
        
        -- loop until a valid answer (yes/no) is given
        loop declare
            -- Take the user input here at initialization
            Answer_Str : constant String := Get_Line;
            Answer     : Question_Result;
        begin
            -- append newline to standard out after
            -- the question is answered
            New_Line; 
            
            -- Convert answer to enumeration. Invalid answers
            -- are handled below in an exception handler
            Answer := Question_Result'Value(Answer_Str);
            
            -- Valid answer at this point so call this
            -- function again recursively
            return Recursive_Guess(Node.Links(Answer));
        
        exception
            when Constraint_Error => 
                -- Wrong answer so put error message and loop
                -- again for correct answer
                Put("Invaild Answer, put yes or no: ");
                
        end; end loop; -- end for declare block and loop
        
    end Recursive_Guess;
    
    -- Utility node construction functions
    function New_Answer
        (Answer_Str : String) 
         return Node_Access 
    is begin
        return new Node'(Kind   => Answer, 
                         Answer => +Answer_Str);
    end New_Answer;
        
    function New_Question
        (Question_Str : String;
         No_Node      : Node_Access;
         Yes_Node     : Node_Access)
         return Node_Access
    is begin
        return new Node'(Kind     => Question, 
                         Question => +Question_Str, 
                         Links    =>
                            (No  => No_Node, 
                             Yes => Yes_Node));
    end New_Question;
                     
    -- Test tree
    Root : constant Node_Access :=
        New_Question("Is it a mammal?",
            No_Node => New_Question("Is it a fish?",
                No_Node  => New_Answer("Beetle"),
                Yes_Node => New_Answer("Trout")),
            Yes_Node => New_Answer("Cow"));
    
begin
    Put_Line(Recursive_Guess(Root));
end jdoodle;

evanescente-ondine · September 2, 2024, 8:54pm

Damn, well, thank you very much. I’ll study it religiously.
Yes I didn’t expect this methodology of theirs either. I find the 'Value trick very elegant. As for Ada 95… I always use the latest constructions, including the extensions likely to be included in the next version (or that should be included anyway) so “sticking to the rules”, whatever rules they may be, has never my thing.

jere · September 2, 2024, 9:10pm

I couldn’t remember who, but someone was very adamant about only 95 solutions for a few things, so figured I would toss that in there in case they happened upon this thread.

cantanima · September 3, 2024, 2:03am

Whoah whoah whoah again. I said some of my coworkers. That doesn’t mean you’re condemned to suffer from them.

I had trouble with recursion for years myself, but more because I misunderstood a professor’s remark that every recursion can be turned into an iteration, and vice versa, and too much recursion can blow the stack, so I tried to apply iteration to every problem, even those where recursion was a much, much better choice. In one program I even parsed general mathematical expressions with iteration. What a nightmare.

You’re right!

JC001 · September 3, 2024, 11:58am

You might be interested in the 1989 Ada Letters article Variable-Length String Input in Ada.

JC001 · September 3, 2024, 12:37pm

Note that you can implement binary trees without access types. It may not be a good approach, but it’s possible.

Conceptually, a binary tree is

type Tree is record
   Value : Element;
   Left  : Tree;
   Right : Tree;
end record;

and we presume an operation

function Is_Empty (T : in Tree) return Boolean;

Then you want a procedure such as

procedure Guess (T : in out Tree) with Pre => not Is_Empty (T);

procedure Guess (T : in out Tree) is
   ...
begin -- Guess
   if Is_Empty (T.Left) then -- T contains an animal
      -- Deal with animal case

      return;
   end if;

   -- T contains a question
   -- Ask the question and get an answer of 'N' or 'Y' in Response
   if Response = 'N' then
      Guess (T => T.Left);
   else
      Guess (T => T.Right);
   end if;
end Guess;

Dealing with an animal may involve creating a new Tree and inserting it between T’s parent and T, so the reality will likely be more complex than this. As such, it doesn’t seem like a great example for introducing binary trees. Using an ordered binary tree to sort input would be better.

OneWingedShark · September 3, 2024, 2:22pm

I mean, you took math, right?
Remember proofs? In particular, Proof by Induction?
(P(x) -> P(x+1); & P(1), thus proving for all x where x is positive.)

It’s EXACTLY the same thing.

Example:

Function "**"( Left, Right : Natural ) return Integer is
Begin
  case Right is
    when 0 => Return (if Left in Positive then 1 else raise Constraint_Error);
    when 1 => Return Left;
    when 2 => Return Left*Left;
    when others =>
      Declare
         Odd_Fixup : Constant Natural := 
            (if Right mod 2 in Positive then Left else 1);
      Begin
        Return (Left**(Right/2)) * Odd_Fixup;
      End;
End "**";

So, what this does is takes advantage of the fact that X**(2Y) = (X**Y)**2, allowing for logrithmic multipclation-calls rather than linear.

evanescente-ondine · September 3, 2024, 2:34pm

I did, I did… But just because you can read Shakespear, even studied the aliterations and word plays, doesn’t mean you can write it.

OneWingedShark · September 3, 2024, 2:49pm

The thing to note is that everything devolves down to, and is built off of, the base-case(s):

When Right is 0: Return 1, unless Left is zero, because that’s undefined.
When Right is 1: Return the value.
When Right is 2: Return the square of the value.

Now, technically, we don’t need that last case: if our non-base-case operations simply divide the exponent by two and square the result, eventually you get down to the exponent being 2, which is (X**1) ** 2.

We could also get rid of the first base-case by making Right a Positive subtype… but that might be a bit too restrictive.

evanescente-ondine · September 3, 2024, 11:37pm

I edited to add a new question and a name when the guess is wrong, and I must say the recursive version is rather elegant actually. Also, using a mutable type with definite fields (Unbounded_string) instead of a string makes editing the field SO MUCH easier than getting climbing back on the tree than changing existing pointers !!
One more assignement done.

jere · September 6, 2024, 10:58pm

Out of curiosity, have you had any time to put some thought into how to handle the more complex container types (including say balanced binary trees). I like the idea of avoiding manual dynamic memory allocation when possible.

The main problem I find is that if you have a container that needs circular references (double linked lists for example, maybe AVL and Red/Black trees too?), then using holders becomes problematic as one nodes parent link is a duplicate copy of the the actual parent node instead of the same holder for both perspectives.

In more traditional methods, you use a weak reference (to break the cycle) to get around this, but if we are trying to avoid access types, that might be tougher here.

OneWingedShark · September 6, 2024, 11:49pm

If you’re doing non-/minimally-variable balanced trees you can use the “array trick”.

Generic
   Type Element is private;
   Type Index is (<>);
   Type Vector is array (Index range <>) of Element;
Package Example is
   -- Index-1 is always the root; this is a perfectly balanced tree.
   Type Balanced_Tree(<>) is private;
   Function Initialize( Data : Vector ) return Balanced_Tree
     with Pre => (for some X in 0..10 => Vector'Length = Natural'Pred(2**X))
                 or else Vector'Length = 0;
   Function  Get  ( Object : in     Balanced_Tree; Index : Positive ) return Element;
   Procedure Set  ( Object : in out Balanced_Tree; Index : Positive; Value : Element);
   Function  Left ( Object : in     Balanced_Tree; Index : Positive ) return Element;
   Function  Right( Object : in     Balanced_Tree; Index : Positive ) return Element;
Private
   Type Balanced_Tree is array (Positive range <>) of Element;   
End Example;
-----------------------------------------
Package Body Example is
   Function Left( Index: Positive ) return Positive is ( 2*Index );
   Function Right( Index: Positive ) return Positive is ( Positive'Succ(2*Index) );
   
   Function  Left ( Object : in     Balanced_Tree; Index : Positive ) return Element is
   ( Object(Left(Index)) );
   Function  Right( Object : in     Balanced_Tree; Index : Positive ) return Element is
   ( Object(Right(Index)) );

   Function Initialize( Data : Vector ) return Balanced_Tree is
      Count : Natural:= Natural'First;
   Begin
      Return Result : Balanced_Tree(1..Data'Length) do
         For Item of Data loop
            Count:= Natural'Succ( Count );
            Result( Count ):= Item;
         End loop;
      End return;
   End Initialize;

   Function  Get( Object : in     Balanced_Tree; Index : Positive ) return Element is
   Begin
      Return Object( Index );
   End Get;

   Procedure Set( Object : in out Balanced_Tree; Index : Positive; Value : Element) is
   Begin
      Object(Index):= Value;
   End Set;
End Example;

JC001 · September 7, 2024, 11:11am

Typically holders are not a good choice for such structures, but holders are not the only container available. To implement S-expressions, for example, you might use Ada.Containers.Indefinite_Doubly_Linked_Lists (one would not try to implement linked lists, since they already exist in the standard library), though they do not have such references. And sometimes, access types cannot be avoided, but they can generally be encapsulated and hidden.

simonjwright · September 7, 2024, 12:36pm

Just for info, you might be interested in the Booch95 version of AVL trees:

user spec and body
implementation spec and body