"." (dot) in Perl regular expressions doesn't include newline characters CRLF?


    Apr 27, 2012 #1

    Hello

    I was looking for the following part...

    Code:

    <h2 class="titleMain">.+?<span>(.+?)</span>
    ... but UE doesn't seem to let the catch-all "." character match CRLF:
    Search string '<h2 class="titleMain">.+?<sp...' not found!
    Before I upgrade to a more recent release of UltraEdit, I need to make sure that
    • UltraEdit 15.10 indeed does not let "." match CRLF
    • this issue is solved in the latest release of UltraEdit.
    Thank you.


      Apr 27, 2012 #2

      Yes, . does not match newline characters (U+000D carriage return \r, U+000A line feed \n, U+2028 line separator, U+2029 paragraph separator) by default, and that has not changed in the latest version of UltraEdit.

      Whether . matches newline characters or not is controlled by the flag match_not_dot_newline of the Boost regular expression library used by UltraEdit. An option of this kind is common to Perl regular expression implementations; what is not common is its default value. In UltraEdit the flag is set to true, as that is safer and most users do not want to search/replace across multiple lines.

      However, it is quite easy to control this flag by starting the search string with (?s). This little expression at the beginning of the search string tells the Boost regular expression library within UltraEdit to treat the flag match_not_dot_newline as false for this search/replace. See the chapter Modifiers in the Boost documentation on Perl Regular Expression Syntax.
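
The effect of (?s) can also be tried outside UltraEdit. Below is a minimal sketch in Python, assuming its re module as a stand-in for the Boost library. The two engines are not identical (Python's dot excludes only \n by default and still matches \r), and the sample text is made up for illustration, but the modifier works the same way on the search string from the question:

```python
import re

# Hypothetical sample text with CRLF line endings (not from the original post).
text = '<h2 class="titleMain">\r\n  <span>Hello</span>\r\n</h2>'

# The search string from the question.
pattern = r'<h2 class="titleMain">.+?<span>(.+?)</span>'

# Without the modifier the dot cannot cross the line feed, so nothing is found.
no_flag = re.search(pattern, text)

# With (?s) at the very start, the dot also matches newline characters.
with_flag = re.search(r'(?s)' + pattern, text)

print(no_flag)             # None
print(with_flag.group(1))  # Hello
```

Note that the modifier has to be the first thing in the search string, as described above.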

      A common solution for matching any character independent of the flag match_not_dot_newline is to use the character set definition [\s\S] instead of a dot. It matches any whitespace character and any non-whitespace character, which means really any character. [\w\W] also matches any character (any word character and any non-word character).
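
The character set approach can be sketched the same way. This example again assumes Python's re module as a stand-in for the Boost library in UltraEdit, and the sample text is hypothetical; it only illustrates that [\s\S] and [\w\W] cross line endings without any modifier:

```python
import re

# Hypothetical two-line sample text with a CRLF line ending.
text = 'first line\r\nsecond line'

# A plain dot stops at the line feed, so this search finds nothing.
dot = re.search(r'first.+second', text)

# [\s\S] and [\w\W] match any character, newlines included, with no flag set.
any_space = re.search(r'first[\s\S]+second', text)
any_word = re.search(r'first[\w\W]+second', text)

print(dot)                        # None
print(repr(any_space.group(0)))   # 'first line\r\nsecond'
print(repr(any_word.group(0)))    # 'first line\r\nsecond'
```

Since these sets do not depend on any flag, a search string written this way behaves the same whatever the default for the dot is.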


        May 02, 2012 #3

        Thanks Mofi, problem solved.