My Perl regular expression is matching much more than it should in UE 21 and 22.
<cc>(.*Mass[^<(]*Mass)\.( .*)</cc>
is matching this in my input file:
<cc>O. Ahlborg & Sons v. Massachusetts Heavy Indus., Inc., 65 Mass. App. Ct. 385, 392 (2006)</cc>. See <readme cite_65><cc>Harrison v. Roncone, 447 Mass. 1001 (2006)</cc> (where case involves multiple claims and multiple parties, judgment dismissing fewer than all claims or parties is interlocutory and is not immediately appealable absent determination by trial judge that there is no just reason for delay and upon express direction for the entry of final judgment). See also <readme cite_66><cc>Trenz v. Family Dollar Stores of Mass., Inc., 73 Mass. App. Ct. 610, 613-615 (2009)</cc>
I'm trying to match only the content between <cc></cc> that has two occurrences of "Mass" contained within it, so only the first <cc> above should be matched. I would have expected the [^<(] to have stopped the match.
The ^ in the character set doesn't seem to have an effect.
Removing the parentheses doesn't help, nor does using the non-greedy ? character.
Am I misunderstanding something, or is this a bug of some type?
This behavior occurs in UE 21.30.0.124 and 22.20.0.34.
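For comparison, here is a minimal sketch that runs the same pattern through Python's re module. This is an assumption-laden check: Python's engine is not the one UE uses, and the input is an abbreviated copy of my line with the long parenthetical shortened. It shows the same over-matching, which suggests the two unrestricted .* parts (not the [^<(] set) are what cross the tag boundaries. The variant with [^<] in place of . is only a guess at a fix and has not been verified in UE.

```python
import re

# Abbreviated copy of the input line (the long parenthetical is shortened).
text = (
    "<cc>O. Ahlborg & Sons v. Massachusetts Heavy Indus., Inc., "
    "65 Mass. App. Ct. 385, 392 (2006)</cc>. See <readme cite_65>"
    "<cc>Harrison v. Roncone, 447 Mass. 1001 (2006)</cc> (where case "
    "involves multiple claims). See also <readme cite_66>"
    "<cc>Trenz v. Family Dollar Stores of Mass., Inc., "
    "73 Mass. App. Ct. 610, 613-615 (2009)</cc>"
)

# The pattern from the post: both ".*" parts may run across "<" and "</cc>".
original = re.compile(r"<cc>(.*Mass[^<(]*Mass)\.( .*)</cc>")

# Hypothetical variant: "[^<]*" instead of ".*" so nothing can cross a tag.
constrained = re.compile(r"<cc>([^<]*Mass[^<(]*Mass)\.( [^<]*)</cc>")

wide = original.search(text)
narrow = [m.group(0) for m in constrained.finditer(text)]

# True when the original pattern swallowed the whole line.
print(wide.group(0) == text)

# Each constrained match stays inside a single <cc>...</cc> element.
for hit in narrow:
    print(hit)
```

With this variant, each match stays within one <cc> element. Note that the third <cc> also contains "Mass" twice, so it is matched as well as the first.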
Thanks for any help!
Steve