
Perl regex replace all not working on file encoded with Chinese code page 950 (Big5) in UES 17.00 (fixed)


    May 04, 2017#1

    UEStudio '17 (x64)  version: 17.00.0.16

    "Replace All" does not work with a "Regular Expression (Perl)" search.
    is2.png (18.75KiB)


      Re: Perl regex replace all not working on file encoded with Chinese code page 950 (Big5) in UES 17.00

      May 04, 2017#2

      What does the file contain?

      What do you expect this regular expression to find?

      Please post the answers to these questions as text, not as an uploaded screenshot. We don't like having to retype into UE/UES what you already have as text but, for some unknown reason, posted as unusable pixel data.

      The search expression [\.]{3,}[\s \.]* can also be written as [.]{3,}[\s.]*. The dot has no special meaning inside square brackets (a character class), and \s matches any whitespace character, including the space and the newline characters carriage return and line feed, so the extra space in the class is redundant.

      The search expression could also be written as \.{3,}[\s.]* because there is no need for a character class containing just a single character.

      But the expression is still not a good one. It greedily searches for a string of 3 or more dots followed by 0 or more whitespace characters or dots. That does not make much sense because the character sets of the two parts overlap. What about using the search expression \.{3}[\s.]* to find a string consisting of exactly 3 dots followed by any further dots or whitespace characters?
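
The equivalence of these spellings can be checked outside the editor. This is only a minimal sketch in Python, assuming Python's `re` module handles these character classes and quantifiers the same way as the Perl engine in UE/UES (the thread does not confirm that); the sample string is made up for illustration:

```python
import re

# The spellings of the search expression discussed above.
patterns = [
    r"[\.]{3,}[\s \.]*",  # original expression
    r"[.]{3,}[\s.]*",     # dot needs no escape inside a character class
    r"\.{3,}[\s.]*",      # no character class needed for a single character
    r"\.{3}[\s.]*",       # exactly 3 dots; the class picks up any further dots
]

# Hypothetical sample text, not taken from the user's file.
sample = "title ..... . ... 12\nnext .. x .... 3"

results = [re.findall(p, sample) for p in patterns]
for p, found in zip(patterns, results):
    print(p, found)

# Every spelling finds exactly the same strings.
assert all(found == results[0] for found in results)
```

The run of only two dots after "next" is skipped by all four patterns, because each of them requires at least three consecutive dots to start a match.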
      Best regards from an UC/UE/UES for Windows user from Austria


        Re: Perl regex replace all not working on file encoded with Chinese code page 950 (Big5) in UES 17.00

        May 04, 2017#3

        Hi,

        In my view, this is not a regular expression issue.

        When I search with [\.]{3,}[\s \.]* it finds the strings as expected. That is not the problem.
        But when I click the Replace All button to replace all occurrences of the strings found before, it reports, for example, 10 items replaced.
        Yet nothing actually changes on screen! So I have to use the Replace (R) button to replace them one by one.

        PS: In the previous version the Replace All button worked fine.

        Here is an example:

        ch1xxxx .........................23
        ch2xxxx ......................, ............................... 24
        ch3xxxx ? ................................................ 26
        如何在綱頁上加入程式 .................................... 27
        9FEBJavaScript$..................................;....29
        第一個 JavaScript 程式 ............... . ........................... I. . . 32
        在綱貞上顥示訊患.................................................. 34
        引用挪部 JavaScript 擋案 ............................................ 35
        追蹤錨誤 error track ...........................ˉ .....ˉ ......................... 37
        Firefox 的 JavaScript 主捚台 ....................................... 38
        ch11xxxx .................................. 40
        
        ues_1.GIF (228.28KiB)


          Re: Perl regex replace all not working on file encoded with Chinese code page 950 (Big5) in UES 17.00

          May 05, 2017#4

          I was not able to reproduce this issue using English 32-bit UEStudio v17.00.0.16. I did the following:
          • Started UEStudio with default settings.
          • Converted the new file from ASCII to Unicode.
          • Copied the sample text from the browser and pasted it into the new file.
          • Set the caret on line 7 at column 30.
          • Pressed Ctrl+R to open the Replace window and pasted [\.]{3,}[\s \.]* into the Find what edit field.
          • Entered ; into Replace with, checked the option Regular expressions and selected Perl.
            The option Replace all is from top of file was already checked and all other options were unchecked.
          • Clicked the Replace all button; the status bar at the bottom displayed 15 items replaced on the left side.
          Immediately after clicking Replace all, the document window displayed:

          ch1xxxx ;23
          ch2xxxx ;, ;24
          ch3xxxx ? ;26
          如何在綱頁上加入程式 ;27
          9FEBJavaScript$;;;29
          第一個 JavaScript 程式 ;I. . . 32
          在綱貞上顥示訊患;34
          引用挪部 JavaScript 擋案 ;35
          追蹤錨誤 error track ;ˉ ;ˉ ;37
          Firefox 的 JavaScript 主捚台 ;38
          ch11xxxx ;40
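
The same replace-all can be cross-checked outside the editor. This is a rough sketch in Python, assuming Python's `re` module treats this expression like the Perl engine in UEStudio (an assumption, not something verified here); the sample text is the one posted above:

```python
import re

# Sample text as posted in the thread.
sample = """\
ch1xxxx .........................23
ch2xxxx ......................, ............................... 24
ch3xxxx ? ................................................ 26
如何在綱頁上加入程式 .................................... 27
9FEBJavaScript$..................................;....29
第一個 JavaScript 程式 ............... . ........................... I. . . 32
在綱貞上顥示訊患.................................................. 34
引用挪部 JavaScript 擋案 ............................................ 35
追蹤錨誤 error track ...........................ˉ .....ˉ ......................... 37
Firefox 的 JavaScript 主捚台 ....................................... 38
ch11xxxx .................................. 40
"""

# Same search and replace strings as in the steps above.
result, count = re.subn(r"[\.]{3,}[\s \.]*", ";", sample)

print(count, "items replaced")
print(result)
```

`re.subn` returns both the modified text and the number of replacements, so the count can be compared with the number shown in the status bar.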
          

          Wait. I have just seen in your last image that the file is not Unicode encoded. It is encoded with code page 950 (ANSI/OEM Traditional Chinese Big5).

          So I converted the file from UTF-16 LE to ANSI with code page 950 using the encoding selector in the status bar, replaced the file contents with a new copy from the browser, set the caret again on line 7 at column 30 and clicked Replace all.

          The status bar indicated 14 items replaced (not 15!), but the document window still displayed the unmodified text.

          Next I converted the file to ANSI with code page 1252 (ANSI - Latin I) and replaced the file contents with a new copy from the browser. I clicked No on the warning that the Unicode characters in the clipboard can't be converted to code page 1252, which also asked whether the file should be converted to UTF-8. Then I set the caret again on line 7 at column 30 and clicked Replace all.

          The status bar indicated 15 items replaced and the modified file was displayed as expected in the document window.

          Conclusion: There is a bug in UEStudio v17.00.0.16 when running a Perl regular expression Replace all on an ANSI file encoded with code page 950.

          Please report this issue by email to IDM support. I won't do that myself because I never edit ANSI encoded Chinese files.
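
For anyone stuck on an affected version, the replacement can also be done outside the editor. This is only a sketch in Python, assuming the file really is encoded with code page 950 (Python's `cp950` codec) and that Python's `re` module behaves like the editor's Perl engine for this expression; the function name `replace_dot_leaders` and the file name in the usage note are hypothetical:

```python
import re
from pathlib import Path

# Simplified spelling of the search expression from this thread.
PATTERN = re.compile(r"\.{3,}[\s.]*")

def replace_dot_leaders(path, replacement=";", encoding="cp950"):
    """Replace every run of dots in a code page 950 file and return the count."""
    file = Path(path)
    # Work on raw bytes so the existing line endings are left untouched.
    text = file.read_bytes().decode(encoding)
    new_text, count = PATTERN.subn(replacement, text)
    file.write_bytes(new_text.encode(encoding))
    return count
```

A call such as `replace_dot_leaders("contents.txt")` rewrites the file in place, so it is best tried on a copy first.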

          Update: This bug was fixed in UE v24.20.0.62 and UEStudio v18.00.0.04.