Case sensitivity of Perl regular expressions in UE/UES

    Apr 16, 2010 #1

    If you have ever used Perl regular expressions to search in UE, you might have noticed that even
    if your regular expression uses case-sensitive tokens, the results may include words whose
    case does not match the pattern.
    For example:

    If you search for " [A-Z]+\d+ " you would normally expect to find words that consist of uppercase letters
    followed by at least one digit, like HOME2, ABD445, etc.
    Unfortunately UE's interpretation of that regular expression is not compliant with Perl rules, and the search can match
    words like home2, hOMe3, ad77, as well as HOME2, ABD445, etc.

    The only way to force any regular expression test in UE to adhere to true REGEX standards is to check the "Match Case" option.
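
The behaviour described above is easy to reproduce outside UE. The following is a minimal sketch in Python, whose `re` module is only a stand-in for UE's engine (an assumption for illustration, not UE's implementation). `re.IGNORECASE` plays the role of leaving "Match Case" unchecked, and the pattern is used without the surrounding spaces.

```python
import re

words = "home2 hOMe3 ad77 HOME2 ABD445"
pattern = r"[A-Z]+\d+"

# Case-insensitive matching: stand-in for "Match Case" unchecked.
insensitive = re.findall(pattern, words, re.IGNORECASE)

# Case-sensitive matching: stand-in for "Match Case" checked.
sensitive = re.findall(pattern, words)

print(insensitive)
print(sensitive)
```

With the flag set, all five words match; without it, only HOME2 and ABD445 do.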

    To me this behaviour contradicts what everybody else understands as "Perl regular expressions", but unfortunately the UE team
    does not want to change it since, according to an email I got from them, "the thinking was that these two options would conflict with each other".
    By the two options they mean the "Match Case" option and the true regex syntax.

    The same case-sensitivity problem also affects all other places in UE where Perl regular expressions can be used.
    For example, if Perl regexes are used in wordfiles, they too will ignore case, and that cannot be explained
    by any possible conflict with the "Match Case" option since it does not exist for wordfiles.

    Since this was a decision made by the UE team, they will not change it unless enough people complain.
    Personally I believe that since regular expressions are an advanced feature, they should not contain "shortcuts"
    meant to make sure that novices do not get stung.

    Please write a letter to UE if you agree with me on this subject.

      Apr 16, 2010 #2

      I disagree.

      First, "Perl regex" is a commonly used term that might better be called "Perl style regular expression". It doesn't mean that a native Perl regex can be used in an editor that supports Perl regexes (otherwise you'd need to add / around the regex and use mode modifiers like /m etc.). It just means that the basic syntax of the regular expressions follows Perl regex rules. UE and most other editors that support "Perl regexes" don't implement the entire engine (omitting features like Unicode properties \p{Letter} etc.). It's not a patented name.

      Second, case-insensitive search is the default in nearly every editor, word processor etc. It would be highly counterintuitive to change that simply because the user has switched from plain text to regex search.

      Just my $.02
      Tim

        Apr 16, 2010 #3

        I can only add that (1) case sensitivity is a very basic feature of regular expressions that is present in practically all versions of them,
        and (2) the user who is just switching from plain text to regex search is likely to read about the way they work, and will rather
        be surprised by them not working as they should.

        On top of that, UE's help shows examples like:

        For example [abc], will match any of the characters 'a', 'b', or 'c'.
        For example [a-c] will match any single character in the range 'a' to 'c'.

        Note that even UE's own help does not mention at all that [abc] can also match B.

          Apr 17, 2010 #4

          The feature is there, and there is a checkbox to enable it.

          Show me one editor that changes case sensitivity behavior when switching from plain text search to regex search. It would be very bad usability practice to change a setting implicitly when the user selects a different option elsewhere (unless the selected option leaves no other choice, which definitely isn't the case here).

          What should UE do? Set a checkmark in the "case sensitivity" checkbox when you select "regular expressions" and clear it again when you select plain text search? What if you had case sensitivity enabled in plain text search, then switch it off during regex search, and then select plain text search again? Should it turn it on again? Leave it off?

            Apr 17, 2010 #5

            A workaround to operate with case-sensitive tokens in a Perl regex, regardless of the "Match Case" setting in UE, is to prefix the expression with the (?-i) modifier, which turns off case insensitivity.

            Example:

            (?-i)[A-Z]+\d+

            Read on at Regular-expressions.info / modifiers...
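
To see the effect of switching case insensitivity off from inside the pattern, here is a small Python sketch. It is a stand-in under stated assumptions, not UE's engine: Python's `re` module (3.6 or later), as far as I know, accepts only the scoped form `(?-i:...)` rather than the bare `(?-i)` prefix shown above, so the sketch wraps the letter class in a scoped group.

```python
import re

words = "home2 hOMe3 ad77 HOME2 ABD445"

# re.IGNORECASE stands in for a search with "Match Case" unchecked.
# (?-i:...) switches case insensitivity off for the enclosed part only.
pattern = re.compile(r"(?-i:[A-Z]+)\d+", re.IGNORECASE)

matches = pattern.findall(words)
print(matches)
```

Even though the search as a whole is case-insensitive, only HOME2 and ABD445 are reported, because the letter class is matched case-sensitively.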

              Apr 17, 2010 #6

              Pietzcker and jorrasdk have explained nearly everything. I just want to add something. UltraEdit uses the Boost regular expression library. I don't know which version of the library the IDM developers have integrated, so I can't tell you which version of the documentation you really have to read. However, for your case sensitivity question the latest one can be used.

              The Boost regular expression library documentation has a table of contents page. There is also a PDF version of the entire documentation.

              An important page for users is Perl Regular Expression Syntax. At the top of this page you can read that the search function can be called with different options. One of them is icase, which corresponds to /i in a Perl script and which UltraEdit uses when the search option Match Case is NOT enabled. If you always want case-sensitive searches, no problem: enable this option once and you have it, unless you have enabled the auto-reset setting for the Match Case option in the configuration dialog.
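
The relationship between the checkbox and the engine option can be pictured as follows. This is a hypothetical Python sketch, not UE or Boost code: `find_all` and its `match_case` parameter are names invented here, and `re.IGNORECASE` stands in for Boost's `icase` option. It uses the [abc] example quoted from the help earlier in this thread.

```python
import re

def find_all(pattern, text, match_case=False):
    """Hypothetical model of an editor search: the pattern is unchanged,
    only the option passed alongside it differs."""
    # Ignore case unless the (hypothetical) Match Case setting is on.
    flags = 0 if match_case else re.IGNORECASE
    return re.findall(pattern, text, flags)

print(find_all(r"[abc]", "B"))                   # Match Case off
print(find_all(r"[abc]", "B", match_case=True))  # Match Case on
```

The point of the sketch is that the same pattern string yields a match or no match depending solely on an option supplied by the caller.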

              jorrasdk has already posted one method to define the case sensitivity inside a search string. Perhaps you can also use \u or the equivalent [[:upper:]] in the search string to find only uppercase letters independent of the Match Case option.
              Best regards from an UC/UE/UES for Windows user from Austria

                Apr 30, 2010 #7

                Using UE's Help and typing in "boost", I was able to get this from the documentation.


                Boost Software License
                Version 1.0 - August 17th, 2003

                Permission is hereby granted, free of charge, to any person or organization obtaining a copy of the software and accompanying documentation covered by this license (the "Software") to use, reproduce, display, distribute, execute, and transmit the Software, and to prepare derivative works of the Software, and to permit third-parties to whom the Software is furnished to do so, all subject to the following:

                The copyright notices in the Software and this entire statement, including the above license grant, this restriction and the following disclaimer, must be included in all copies of the Software, in whole or in part, and all derivative works of the Software, unless such copies or derivative works are solely in the form of machine-executable object code generated by a source language processor.

                THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, TITLE AND NON-INFRINGEMENT. IN NO EVENT SHALL THE COPYRIGHT HOLDERS OR ANYONE DISTRIBUTING THE SOFTWARE BE LIABLE FOR ANY DAMAGES OR OTHER LIABILITY, WHETHER IN CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

                  Apr 30, 2010 #8

                  I believe that's just the version of the license - the code is versioned separately.