UE13.20+2, UTF8 handling

    Oct 24, 2007 #1

    Hello UTF experts. I need a sanity check regarding UTF handling in UE13.20+2.

    I have a script that writes to the output window. The text that gets written originates from a DOS-ASCII file. Everything is displayed correctly in the output window.

    For instance:

    addMapField("sk-nøgtxt-10", skNoegtxt_10);

    If I select this line, copy it to the clipboard and paste it into an empty DOS-ASCII file, I get:

    addMapField("sk-nÃ¸gtxt-10", skNoegtxt_10);

    OK, fair enough. The scripting engine and/or the UE outputWindow method write() writes UTF-encoded text to the output window, and when I paste it into a non-UTF document, some characters are "corrupted".
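A minimal sketch of why the pasted text looks corrupted, assuming the output window holds UTF-8 and the non-UTF document is interpreted as Windows-1252 (the code page is an assumption, not something stated above). The two bytes that UTF-8 uses for "ø" are shown as two separate characters:

```python
# Assumption: the target document is read as Windows-1252 (cp1252).
original = 'addMapField("sk-nøgtxt-10", skNoegtxt_10);'

# "ø" becomes the two bytes 0xC3 0xB8 in UTF-8.
utf8_bytes = original.encode("utf-8")

# Reading those bytes one at a time as cp1252 yields two characters per "ø".
misread = utf8_bytes.decode("cp1252")

print(misread)
```

The misread string is one character longer than the original, because the single "ø" is displayed as two characters.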

    In UE13.10+4 I would then just perform a File - Conversions - UTF-8 to ASCII even though the file format is already ASCII. The conversion was able to find and fix characters, so I would end up with the correct ASCII data:

    addMapField("sk-nøgtxt-10", skNoegtxt_10);
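Outside the editor, the effect of that conversion can be approximated as follows. This is only a sketch: `utf8_to_ansi` is a hypothetical helper, and Windows-1252 as the ANSI code page is an assumption. It turns the displayed characters back into their raw bytes and then reads those bytes as UTF-8:

```python
def utf8_to_ansi(text: str, ansi: str = "cp1252") -> str:
    """Reinterpret UTF-8 byte sequences that were pasted as ANSI characters.

    Hypothetical helper; the default code page is an assumption.
    """
    raw = text.encode(ansi)      # recover the bytes behind the displayed characters
    return raw.decode("utf-8")   # read those bytes as the UTF-8 they really are


pasted = 'addMapField("sk-n\u00c3\u00b8gtxt-10", skNoegtxt_10);'
print(utf8_to_ansi(pasted))
```

The helper raises an exception if the text does not consist of valid UTF-8 sequences, which is a deliberate choice so that already correct text is not silently altered.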

    And then I could get on with my "ASCII" work. (I do not work with UTF files in general.)

    But in UE13.20+2 the menu option File - Conversions - UTF-8 to ASCII is greyed out when I'm in a non-UTF file. So suddenly I can't convert the UTF characters pasted into a non-UTF document :-(

    The hotfix UE13.20+2 lists a number of UTF-related items:
    * Read only UTF-8, UTF-16BE, and ASCII Escaped Unicode files
    * Find/replace and UTF-8 to ASCII conversion
    * Save As fails to convert line endings with Unicode files

    Why is UTF-8 to ASCII suddenly greyed out? Do you think it's an error in the UE13.20+2 hotfix, or an error that has finally been fixed? If someone agrees that it is an error in UE13.20+2, I will report it to IDM Support.

    Workaround for UE13.20+2: To get on, I first have to paste the clipboard contents into a blank non-UTF file, save it to the hard disk as DOS-ASCII, then reopen the file and rely on the UTF auto-detection to flag it as a UTF file. Then I can convert to ASCII with File - Conversions - UTF-8 to ASCII and copy/paste the string into my target file.

      Oct 24, 2007 #2

      Yes, there are lots of Unicode-related changes in UE v13.20 hotfix +2, and I also don't know what they are for or why they were made. Until UE v13.20+1 it was always possible to run UTF-8 to ASCII on an ASCII file.

      From my point of view this was very useful in a situation like yours, or when a file with UTF-8 encoded characters was not detected as a UTF-8 file (detection disabled, or no BOM, no UTF-8 declaration, and the first UTF-8 encoded character lying beyond the first few KB of the file).
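A rough sketch of how such a detection can miss a file. This is not UltraEdit's actual algorithm; `looks_like_utf8` and its `scan_limit` parameter are hypothetical and only illustrate the idea of checking for a BOM and then scanning a limited window at the start of the file:

```python
import codecs


def looks_like_utf8(data: bytes, scan_limit: int = 4096) -> bool:
    """Guess whether data is UTF-8 by looking only at its beginning.

    Hypothetical heuristic for illustration, not UltraEdit's implementation.
    """
    if data.startswith(codecs.BOM_UTF8):
        return True
    head = data[:scan_limit]
    if head.isascii():
        # Nothing in the scanned window distinguishes UTF-8 from plain ASCII.
        return False
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        # final=False tolerates a multi-byte sequence cut off by the window.
        decoder.decode(head, final=False)
    except UnicodeDecodeError:
        return False
    return True


# The first non-ASCII character sits beyond the scanned window.
late = b"a" * 5000 + "ø".encode("utf-8")
print(looks_like_utf8(late))                        # window too small
print(looks_like_utf8(late, scan_limit=len(late)))  # whole file scanned
```

With the default window the file is treated as non-UTF even though it contains a UTF-8 encoded character further down, which is the situation where a manual UTF-8 to ASCII conversion is needed.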

      It looks like the whole Unicode handling has been changed, or IDM is currently working on it. As a result, it is impossible to use a character > 0x7F in search or replace strings on Windows 98 SE. Only on Windows XP does the search for characters like äöüÄÖÜß still work. I have reported this Win98-only problem with non-ASCII characters. This problem is the reason why I downgraded to UE v13.10a+2 and UES v6.30a+2 for my normal work and use the latest version only for tests or when answering forum questions. I hope the IDM developers get the Unicode handling working again soon.

      I suggest writing an email to IDM support and telling IDM that it was good that the UTF-8 to ASCII conversion was not disabled for ASCII files.

      Admittedly, if the UTF-8 to ASCII conversion is executed on a file that has already been converted once, you get a destroyed file, and it is not possible to undo the change.
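The risk of converting twice can be illustrated with a small sketch. Both helpers are hypothetical, Windows-1252 is assumed as the ANSI code page, and the lossy error handling is an assumption chosen to show the irreversibility; UltraEdit's exact behavior on such input may differ:

```python
def utf8_to_ansi_lossy(text: str, ansi: str = "cp1252") -> str:
    """Convert without validation; invalid bytes become U+FFFD (assumed behavior)."""
    return text.encode(ansi, errors="replace").decode("utf-8", errors="replace")


def is_mojibake(text: str, ansi: str = "cp1252") -> bool:
    """Return True only if the text decodes cleanly as UTF-8 and changes in doing so."""
    try:
        return text.encode(ansi).decode("utf-8") != text
    except (UnicodeEncodeError, UnicodeDecodeError):
        return False


once = utf8_to_ansi_lossy("n\u00c3\u00b8g")  # first run repairs the text
twice = utf8_to_ansi_lossy(once)             # second run destroys the "ø"

print(repr(once), repr(twice))
print(is_mojibake("n\u00c3\u00b8g"), is_mojibake(once))
```

After the second run the original character is gone and cannot be recovered from the result. A check like `is_mojibake` before converting is one possible way to guard against that.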
      Best regards from an UC/UE/UES for Windows user from Austria

        Oct 24, 2007 #3

        Mofi wrote: I suggest writing an email to IDM support and telling IDM that it was good that the UTF-8 to ASCII conversion was not disabled for ASCII files.
        I will do that. Thanks for your feedback Mofi!

          Dec 01, 2007 #4

          With UE v13.20+4 it is still not possible to convert UTF-8 encoded text in an ASCII/ANSI file directly to real ASCII/ANSI with UTF-8 to ASCII. I found only one workaround:

          1. Copy the complete UTF-8 encoded text in the ASCII/ANSI file to the clipboard.
          2. Use File - Conversions - UNICODE/ASCII/UTF-8 to UTF-8 (ASCII Editing).
          3. Select all with Ctrl+A and replace the selection with Ctrl+V, pasting the original UTF-8 text from the clipboard.
          4. Use File - Conversions - UTF-8 to ASCII to get the original text correctly in ANSI.

          That is a lot of steps compared with the single UTF-8 to ASCII command in previous versions. :?
          Best regards from an UC/UE/UES for Windows user from Austria