Perl RegExs misbehaving in v12.10b

    Sep 05, 2006#1

    I've found some anomalous behaviour in Perl RegExs... (v12.10b)

    As I read the documentation, if I search for [a-g] I should only find the lowercase letters "a" to "g". If I search for [A-G], then it should only find the uppercase letters "A" to "G". This, however, seems to work ONLY if the Match Case option is checked... This is a bug, is it not?

    Can anyone confirm this?

    Also, if the search is for [A-G] with the Match Case option checked, the find works correctly (say, using F3). However, if you want to check that the find worked by using "Highlight all items found", then the wrong items are highlighted.

    Can anyone confirm this behaviour?

    If confirmed I'll report it...

    Example failing RegEx:

    Replace: (((?:)|[(glu])[bcigors])w([A-Z][A-Z_0-9]+)
    with: \1_\L\3

    will convert: gswHELLO_THERE to gs_hello_there
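
To make the intent of the replace above concrete, here is a minimal sketch in Python rather than in UltraEdit's Perl engine. Python's `re` module has no `\L` escape, so the lowercasing of group 3 is done in a replacement function. The names `PATTERN` and `convert` are made up for this illustration and say nothing about how UltraEdit implements the replace.

```python
import re

# Same search string as in the post. Group 1 is the prefix (e.g. "gs"),
# group 3 is the upper-case name that follows the literal "w".
PATTERN = r"(((?:)|[(glu])[bcigors])w([A-Z][A-Z_0-9]+)"


def convert(text, match_case=True):
    """Emulate the replacement \\1_\\L\\3: keep group 1, add "_", lowercase group 3."""
    flags = 0 if match_case else re.IGNORECASE
    return re.sub(
        PATTERN,
        lambda m: m.group(1) + "_" + m.group(3).lower(),
        text,
        flags=flags,
    )


if __name__ == "__main__":
    print(convert("gswHELLO_THERE"))
    print(convert("gswHELLO_THERE", match_case=False))
```

For this particular input the result does not depend on the case option, because the name after "w" is already upper case. The option only matters for input such as "gswhello", which the case-sensitive pattern leaves untouched.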


    I've found some more anomalous behaviour in Perl RegExs... (v12.10b)

    If I use the Perl RegEx above that works with a replace on the current file, it WON'T work if I scope it to "Open Files".

    Can anyone confirm this too?

    If confirmed I'll report it...

    TIA,
    Paolo
    There is no such thing as an inconsistently correct system...
    Therefore, aim for consistency; in the expectation of reaching correctness!

      Sep 05, 2006#2

      The match case issue is not a bug. This was already discussed in the topic Replacing "[A-Z]" to "_[A-Z]".

      The second issue really is a bug, and it also exists in v12.10a. With the search string [A-Z]+ and the options Match Case, Regular Expressions (Perl engine) and Highlight All Items Found checked, not only strings of uppercase letters are highlighted but also all strings of lowercase letters. This is definitely a bug of the Perl engine only. With the legacy Unix or UltraEdit engine, Highlight All Items Found works perfectly with this search string.
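
As a rough illustration of the difference between the expected and the observed highlighting, the sketch below uses Python's `re` module as a stand-in for the Perl engine. The sample text and the helper name `highlighted` are assumptions for this example only. The case-insensitive call merely mimics what the faulty highlighting looks like; it is not a claim about the actual cause inside UltraEdit.

```python
import re


def highlighted(text, match_case):
    """Return the strings that a search for [A-Z]+ would mark in text."""
    flags = 0 if match_case else re.IGNORECASE
    return re.findall(r"[A-Z]+", text, flags=flags)


sample = "Find ALL upper CASE runs"

# What Highlight All Items Found should mark with Match Case checked.
expected = highlighted(sample, match_case=True)

# What the highlighting looks like when case is ignored.
case_ignored = highlighted(sample, match_case=False)

if __name__ == "__main__":
    print(expected)
    print(case_ignored)
```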


      But much more frustrating for you will be that in v12.10b a normal Find with the Perl engine active no longer finds anything when executed from within a macro. This is a new bug in v12.10b which, I think, breaks most macros that use PerlReOn.

      For example, open a file which contains the two characters "in" in sequence somewhere. Now try the following macro:

      InsertMode
      ColumnModeOff
      HexOff
      PerlReOn
      Top
      Find "in"

      You will see that after macro execution the cursor is at the top of the file, even though the file contains "in". The normal, non-regular-expression Find with the Perl engine active simply no longer finds anything. This is a new bug in v12.10b. I'm currently writing a bug report email about this new issue to IDM. With UnixReOn or UnixReOff the normal Find works perfectly in the macro.

      Edited on 2006-10-28: The bug described above, which affects only UltraEdit v12.10b and UEStudio v6.00, is fixed in UE v12.20 and UES v6.10.
      Best regards from an UC/UE/UES for Windows user from Austria

        Sep 05, 2006#3

        Mofi wrote: The match case issue is not a bug. This was discussed already at Replacing "[A-Z]" to "_[A-Z]".
        With respect, Mofi, it IS a bug (Bug: the manifestation of a defect).

        It is either a bug in the code or a bug in the documentation, but a bug nonetheless. As far as I could tell, this significant behaviour is not highlighted anywhere, and I had a good look at the Boost RegEx Library documentation.

        As with the other thread you referred to, this is behaviour which (without adequate documentation) violates the "Principle of least astonishment".

        However, as always, thanks for your help in sorting out the scope of the problem.

        Paolo

          Sep 05, 2006#4

          I have moved the content of your second thread, about the replace in "Open Files", into this thread to collect all Perl problems of v12.10b here. I confirm this bug too. This "Perl regex replace in all open files" bug also exists in v12.10a.

          About the match case issue:

          Where in the help of UltraEdit/UEStudio is [A-Z] (or [a-z]) explained? I can't find it, and I want to report this in the email I have prepared about small mistakes in the help. The documentation of another program cannot be applied 100% to UltraEdit/UEStudio, even if the help of UE/UES refers to that program. The other program may simply have no separate Match Case option.

          For me it has always been clear, ever since I started using UltraEdit (with UltraEdit style regex), that [A-Z] and [a-z] depend on the Match Case option and that [A-Z] or [a-z] is equivalent to [A-Za-z] if Match Case is not active. You can see this also in the function strings in the wordfile. Case-sensitive languages often use [A-Za-z], whereas case-insensitive languages often have only [A-Z].

          But maybe IDM really should mention somewhere in the help about regular expressions (UltraEdit, Unix and Perl) that the set of characters matched by [A-Z] or [a-z] depends on the Match Case option.

            Sep 05, 2006#5

            At least it shows (once more) that IDM did not implement the Perl regexp stuff very professionally, IMHO...
            Don't know which version-tracking software they use internally ;-)
            Normally using all newest english version incl. each hotfix. Win 10 64 bit

              Sep 06, 2006#6

              Mofi wrote: For me it was always clear since I use UltraEdit (with UltraEdit style RegEx) that [A-Z] and [a-z] depends on the Match Case option and [A-Z] or [a-z] is equal with [A-Za-z] if Match Case is not active. You can see this also in the wordfile for the function strings. Case-sensitive languages use often [A-Za-z] where non case-sensitive languages has often only [A-Z].

              But maybe IDM should really mention in the help about the regular expressions (UltraEdit, Unix and Perl) anywhere that the amount of characters found by [A-Z] or [a-z] depends on option Match Case.
              Well Mofi, one of the reasons we write code as we do (full of bugs) is that we don't put ourselves in the place of the new user often enough.

              It took me over twenty years to realize we design in bugs... (we could just as easily - sometimes more easily - design out bugs).

              It did not seem strange to you (which is fair enough), but you will concede, surely, that if the original Boost Libraries didn't suggest this behaviour (of linking [a-zA-Z] behaviour to Match Case), then, as Bego suggested, it strongly looks as though IDM didn't implement the library in the generally accepted way. Now this may have been BAD: broken as designed! However, if IDM deliberately broke the "interface", then they should have told us very clearly that that's what they'd done...

              Fixing the documentation would go a long way to restoring the "Principle of least astonishment"...

              My AU$0.05 cents...

              Paolo

                Sep 06, 2006#7

                Hi

                Another strange 'bug' is that when you do a find in an open file, the display 'shudders' when scrolling, which it doesn't do if Perl Regular Expressions is not activated. Does anyone else get this behaviour?

                I am still quite disappointed with the Perl regular expression implementation. It was one of the changes of Version 12 but is still not working correctly.

                  Sep 07, 2006#8

                  Mofi wrote: But much more frustrating for you will be that in v12.10b a normal Find with Perl engine active does not find anything anymore when executed from within a macro. This is a new bug of v12.10b which I think makes most macros with PerlReOn not working anymore.
                  Yes, but going back to 12.10a doesn't help either: the find works there, but the macro then won't execute what comes after it...

                  I'm in the proverbial catch-22 - I can't go back and there's no point in going forward!

                  IDMen, please please please take more care with your changes!

                  We need a functional Perl Find/Replace ASAP...

                  Paolo

                    Sep 07, 2006#9

                    Example failing RegEx:

                    Replace: (((?:)|[(glu])[bcigors])w([A-Z][A-Z_0-9]+)
                    with: \1_\L\3

                    will convert: gswHELLO_THERE to gs_hello_there
                    Paolo, are you saying UE should or shouldn't replace gswHELLO_THERE with gs_hello_there?

                    When I tested in UE 12.10b, gswHELLO_THERE was replaced with gs_hello_there, as it should, whether Match Case was selected or not.

                    I also tested with Perl v5.8.4, which resulted in the same replacement, whether case sensitive or insensitive matching used:
                    perl -le '$_="gswHELLO_THERE"; s/(((?:)|[(glu])[bcigors])w([A-Z][A-Z_0-9]+)/\1_\L\3/; print' #case sensitive
                    gs_hello_there

                    perl -le '$_="gswHELLO_THERE"; s/(((?:)|[(glu])[bcigors])w([A-Z][a-z_0-9]+)/\1_\L\3/i; print' #case insensitive - note the a-z
                    gs_hello_there


                    I agree with Mofi regarding the Match Case issue: not a bug. Leaving Match Case unchecked is the same as supplying the i (case-insensitive matching) modifier in Perl. This means m/[a-z]/i is identical to m/[A-Z]/i, which in turn is identical to m/[a-zA-Z]/. If this is not documented clearly in the UE docs, it should be, but it was always clear to me as well.
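
The same equivalence can be checked outside Perl. The following is a small sketch using Python's `re` module, where `re.IGNORECASE` plays the role of the i modifier (or of an unchecked Match Case option). The sample string is an arbitrary ASCII example chosen for this illustration.

```python
import re

sample = "aZ09_mQ"

# [a-z] and [A-Z] with case ignored, and [a-zA-Z] with case respected.
lower_ignore_case = re.findall(r"[a-z]", sample, flags=re.IGNORECASE)
upper_ignore_case = re.findall(r"[A-Z]", sample, flags=re.IGNORECASE)
both_match_case = re.findall(r"[a-zA-Z]", sample)

# With case respected, each class only matches its own letters.
lower_match_case = re.findall(r"[a-z]", sample)
upper_match_case = re.findall(r"[A-Z]", sample)

if __name__ == "__main__":
    print(lower_ignore_case, upper_ignore_case, both_match_case)
    print(lower_match_case, upper_match_case)
```

For ASCII text all three of the first searches return the same letters, which is the behaviour described above for an unchecked Match Case option.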

                    - still no sig

                      Sep 07, 2006#10

                      PaoloFCantoni wrote: I've found some more anomalous behaviour in Perl RegExs... (v12.10b)

                      If I use the Perl RegEx above that works with a replace on the current file, it WON'T work if I scope it to "Open Files".

                      Can anyone confirm this too?
                      I believe I've seen this too. Anyone else?

                        Sep 11, 2006#11

                        Hi xilduq,

                        Sorry for the delay, I've been out of Net contact for a while.

                        I am saying that UE should replace gswHELLO_THERE with gs_hello_there.

                        As to the Match Case problem, I agree that it is reasonable for UE to behave this way, but it is sufficiently different that it should be spelt out more clearly in the documentation. From a Goal Oriented Design perspective, if I uncheck Match Case with regular expressions, a suppressible warning should pop up indicating that the expressions will behave differently. In fact, a similar suppressible pop-up should appear for non-regular-expression searches. My point is that there really need to be two messages, one for RE-based searches and the other for non-RE searches.

                        Paolo

                          Sep 21, 2006#12

                          For what it's worth, BBEdit (which I regard very highly) for Mac behaves the same way. I don't like this behavior, but it doesn't appear to be UE's behavior alone.