with Ada.Strings.Unbounded; use Ada.Strings.Unbounded;
with Ada.Text_IO; use Ada.Text_IO;
with GNAT.RegPat; use GNAT.RegPat;

procedure test is
   RE      : constant Pattern_Matcher := Compile("^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})/(\d{1,2})$");
   ip4     : Unbounded_String;
   matches : Match_Array(1..5);
begin
   ip4 := To_Unbounded_String("10.41.129.37/129");
   Match(RE, To_String(ip4), matches);
   Dump(RE);
   Put_Line(matches(5).First'Image);
   Put_Line(matches(5).Last'Image);
   Put_Line(To_String(Unbounded_Slice(ip4, matches(5).First, matches(5).Last)));
   if matches(matches'Last) /= No_Match then
      Put_Line("OMG");
   end if;
end;
It looks like the non-greedy version does what you expect. According to the documentation, /(\d{1,2})$ is the greedy version, but when I tested with /(\d{1,2}?)$ (the non-greedy version according to the docs) there was no 5th match. So possibly there is a mistake in the documentation (at least in the file s-regpat.ads).
With that change the output is:
:
0
0
raised CONSTRAINT_ERROR : regpat.adb:18 range check failed
As @simonjwright mentioned, Matches (i) for i > 0 refers to the groups in the regular expression, whilst Matches (0) refers to the whole match. If Matches (0) is No_Match, the contents of the group entries are unreliable, whether you use greedy or non-greedy expressions (I confused some terms in my previous greedy/non-greedy comment).
Try this version with and without the final 9 in the ip4 constant:
with Ada.Text_IO; use Ada.Text_IO;
with GNAT.RegPat; use GNAT.RegPat;

procedure test is
   RE      : constant Pattern_Matcher := Compile("^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})/(\d{1,2})$");
   ip4     : constant String := "10.41.129.37/129";
   matches : Match_Array(0..5);
begin
   Match(RE, ip4, matches);
   for m of matches loop
      if m = No_Match then
         Put_Line ("no match");
      else
         Put_Line (ip4 (m.First .. m.Last));
      end if;
   end loop;
end;
Author of g-regpat here (though that was close to 25 years ago, amazing…)
I think the last answer is correct: you need to test whether Matches(0) /= No_Match before you look at any of the other match groups.
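A minimal sketch of that check, reusing the pattern and input from the snippets earlier in this thread. The procedure name Guarded_Match is only illustrative, and the sketch assumes the match array starts at index 0 so that the whole-match entry is available:

```ada
with Ada.Text_IO; use Ada.Text_IO;
with GNAT.Regpat; use GNAT.Regpat;

procedure Guarded_Match is
   RE : constant Pattern_Matcher :=
     Compile ("^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})/(\d{1,2})$");
   Input   : constant String := "10.41.129.37/129";
   --  Index 0 holds the whole match, 1 .. 5 hold the groups.
   Matches : Match_Array (0 .. 5);
begin
   Match (RE, Input, Matches);

   --  Look at the whole-match entry first; only trust the group
   --  entries when it reports a match.
   if Matches (0) = No_Match then
      Put_Line ("no match");
   else
      for I in 1 .. Matches'Last loop
         Put_Line (Input (Matches (I).First .. Matches (I).Last));
      end loop;
   end if;
end Guarded_Match;
```

With the three-digit prefix length in this input the pattern should not match, so the group entries are never read. Dropping the final 9 should exercise the other branch.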
Though I would actually qualify this as a bug in g-regpat.adb, in that it should likely reset everything to No_Match to avoid such ambiguities, as is done in multiple other cases.
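Until something like that is done in the library, the reset can be approximated on the caller's side. The following is only a sketch: Match_Or_Reset is a hypothetical helper, not part of GNAT.Regpat, and it can only detect a failed match when the caller's array starts at index 0:

```ada
with GNAT.Regpat; use GNAT.Regpat;

--  Hypothetical wrapper around GNAT.Regpat.Match (not a library routine).
procedure Match_Or_Reset
  (Self    : Pattern_Matcher;
   Data    : String;
   Matches : out Match_Array)
is
begin
   Match (Self, Data, Matches);

   --  If the whole-match entry says "no match", overwrite every entry
   --  so that no stale or partial group location can be read later.
   --  Arrays that do not start at 0 are left as Match filled them.
   if Matches'First = 0 and then Matches (0) = No_Match then
      Matches := (Matches'Range => No_Match);
   end if;
end Match_Or_Reset;
```

The aggregate uses Matches'Range rather than an others choice because the formal parameter is of an unconstrained array type, where others is not allowed in an assignment.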
In practice, though, I think they should obsolete g-regpat.adb and replace it with a binding to libpcre2. The latter is, I believe, used by gcc itself, so it would not be an extra dependency, and its regexp engine is way more advanced and efficient (while having compatible syntax, so existing Ada code would mostly not be impacted). I always meant to do that, but did not have time.