I answer the second question first.
UltraEdit for Windows v24.00 and UEStudio v17.00 and all later versions are fully Unicode-aware applications supporting scripts encoded with:
- ANSI using the system code page for GUI applications, or
- UTF-8 with or without BOM, or
- UTF-16 Little Endian with or without BOM.
This also means that on execution of the script Unicode characters
- are read correctly from the script file and written into JavaScript strings independent of the encoding of the script file, and
- can be read from an opened file into a JavaScript string, and
- can be written from a JavaScript string into an opened file, and
- can be copied from a JavaScript string to the active clipboard, and
- can be read from the active clipboard and written into a JavaScript string.
But it is also possible with previous versions of UltraEdit supporting only ANSI encoded scripts to run finds/replaces and Find in Files/Replace in Files with characters not supported by the system code page, by using a Perl regular expression and encoding those characters with their hexadecimal byte values. I wrote the script UnicodeStringToPerlRegExp.js for converting a Unicode character sequence, a UTF-8 byte stream, or even an ANSI character stream into a Perl regular expression string. So a Perl regular expression Replace in Files can search for the bytes of UTF-8 encoded characters and replace them with the bytes of a UTF-8 encoded replace string. For example, UTF-8 encoded ä can be searched for with \xC3\xA4 and replaced by UTF-8 encoded Ω using \xE2\x84\xA6, without using the Use encoding option at all.
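A minimal sketch of such a byte level Perl regular expression Replace in Files in a script could look like the code below. The directory, file mask and the other option values are just example assumptions and not taken from a real project.
Code: Select all
// Select the Perl regular expression engine for all following finds/replaces.
UltraEdit.perlReOn();

// Configure Replace in Files; directory and file type are example values only.
UltraEdit.frInFiles.directoryStart="C:\\Temp\\Test\\";
UltraEdit.frInFiles.searchInFilesTypes="*.txt";
UltraEdit.frInFiles.searchSubs=false;
UltraEdit.frInFiles.matchCase=true;
UltraEdit.frInFiles.matchWord=false;
UltraEdit.frInFiles.regExp=true;
UltraEdit.frInFiles.useEncoding=false;

// Replace UTF-8 encoded ä (bytes 0xC3 0xA4) by UTF-8 encoded Ω
// (bytes 0xE2 0x84 0xA6) on byte level.
UltraEdit.frInFiles.replace("\\xC3\\xA4","\\xE2\\x84\\xA6");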
A script file can also be UTF-8 encoded with UE < v24.00 and UES < v17.00 as long as it has no BOM (byte order mark) and UTF-8 encoded characters exist only in comments or find/replace strings. Please see the topic UltraEdit.clipboardContent not supporting Chinese characters? for more details on what is possible regarding Unicode characters in UltraEdit scripts with UE < v24.00 and UES < v17.00.
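For example, a tiny non regular expression replace like the following sketch can be run also with UE < v24.00 if the script file is saved as UTF-8 without BOM, because the two non-ASCII characters occur only inside the find/replace string literals. The option values are example assumptions, and whether the replace really matches depends on the encoding of the file being edited, as explained in the referenced topic.
Code: Select all
// Script saved as UTF-8 without BOM: ä and Ω below are stored as UTF-8 bytes.
UltraEdit.ueReOn();
UltraEdit.activeDocument.findReplace.regExp=false;
UltraEdit.activeDocument.findReplace.matchCase=true;
UltraEdit.activeDocument.findReplace.matchWord=false;
UltraEdit.activeDocument.findReplace.replaceAll=true;
UltraEdit.activeDocument.findReplace.replace("ä","Ω");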
The first question was interesting as nobody has asked that before. I quickly created a script which on execution creates an ANSI, a UTF-8 and a UTF-16 encoded file with just a few very short lines containing two characters with a code value greater than decimal 127. Then I tested multiple simple, non regular expression Replace in Files, all executed manually with Use encoding set to Auto-detect, to see how those replaces work on the three differently encoded files.
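A reduced sketch of such a file creating test script could look like the code below. The file names, directory and the written lines are just example values of mine, and whether a BOM is written on save depends on the save configuration.
Code: Select all
// Create a small ANSI encoded test file with two non-ASCII characters.
UltraEdit.newFile();
UltraEdit.activeDocument.write("Line 1 with ä\r\nLine 2 with ß\r\n");
UltraEdit.activeDocument.saveAs("C:\\Temp\\Test\\TestAnsi.txt");

// Create the same content UTF-8 encoded.
UltraEdit.newFile();
UltraEdit.activeDocument.write("Line 1 with ä\r\nLine 2 with ß\r\n");
UltraEdit.activeDocument.ASCIIToUTF8();
UltraEdit.activeDocument.saveAs("C:\\Temp\\Test\\TestUtf8.txt");

// Create the same content UTF-16 LE encoded.
UltraEdit.newFile();
UltraEdit.activeDocument.write("Line 1 with ä\r\nLine 2 with ß\r\n");
UltraEdit.activeDocument.ASCIIToUnicode();
UltraEdit.activeDocument.saveAs("C:\\Temp\\Test\\TestUtf16.txt");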
I was astonished to see with UE v25.20.0.88 that characters were replaced by UTF-8 encoded characters in the very small ANSI encoded file, resulting in the file finally containing both ANSI and UTF-8 encoded characters. But this happens only on the very small ANSI encoded file with just 208 bytes. The same characters in the same ANSI encoded character block in a larger ANSI encoded file with 119 KB were replaced correctly and are encoded in ANSI after the replace. It looks like the Auto-detect encoding setting of Find/Replace in Files needs a certain amount of bytes in a file to correctly detect that a file is ANSI and not UTF-8 encoded.
The UTF-8 encoding of the UTF-8 encoded file with just 211 bytes and no BOM was always correctly detected and the file correctly updated by the Replace in Files executed manually by me. And the UTF-16 LE encoded file with BOM was also always correctly updated by all Replace in Files.
I have to find out with more experiments which amount of bytes is required by UltraEdit to detect the ANSI encoding of small ANSI encoded files on running a Find/Replace in Files with Use encoding enabled and set to Auto-detect.
It was also interesting for me that the small ANSI encoded file with just 208 bytes was always opened as ANSI and never as UTF-8 encoded. So the Replace in Files encoding auto-detection works a bit differently than the encoding auto-detection on opening a file, which I did not expect.
Next I recorded the Replace in Files executed manually with UE v25.20.0.88 into a macro and played the recorded macro after restoring the differently encoded files back to their original contents. That worked as expected and produced the same file contents as the manually executed Replace in Files before.
I looked at the macro code and could see the value -2 for the option Auto-detect. So I modified the initially created script and added the Replace in Files with exactly the same options as used manually before and recorded into the macro. The two encoding options were written by me into the script file as:
Code: Select all
UltraEdit.frInFiles.useEncoding=true;
UltraEdit.frInFiles.encoding=-2;
But that was not a good idea because UltraEdit crashed on script execution when executing the first UltraEdit.frInFiles.replace with those parameters. I restarted UltraEdit and executed the script again, and UltraEdit crashed again. Of course I will report this crash by email to IDM support.
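For context, the Replace in Files block in the script is essentially like the following sketch. The directory, file mask and search/replace strings are placeholders here; only the two encoding lines are exactly the ones quoted above which result in the crash on calling replace.
Code: Select all
UltraEdit.ueReOn();                                      // simple, non regular expression replace
UltraEdit.frInFiles.directoryStart="C:\\Temp\\Test\\";   // placeholder directory
UltraEdit.frInFiles.searchInFilesTypes="*.txt";          // placeholder file mask
UltraEdit.frInFiles.searchSubs=false;
UltraEdit.frInFiles.matchCase=true;
UltraEdit.frInFiles.matchWord=false;
UltraEdit.frInFiles.regExp=false;
UltraEdit.frInFiles.useEncoding=true;                    // enable Use encoding ...
UltraEdit.frInFiles.encoding=-2;                         // ... with value -2 for Auto-detect as in the macro
UltraEdit.frInFiles.replace("ä","Ω");                    // placeholder strings; this call crashes UltraEdit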
Conclusion: The encoding option Auto-detect in combination with the option Use encoding is currently not usable in an UltraEdit script, only in an UltraEdit macro or manually.
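My assumption for a workaround until this crash is fixed, not verified by further tests: keep the Use encoding option disabled in scripts.
Code: Select all
// Assumption: leave Use encoding disabled in scripts until the crash is fixed.
UltraEdit.frInFiles.useEncoding=false;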