What I think happened on your computer:
While using an old version of UltraEdit, you clicked by mistake on the menu item OEM Character Set in menu View. With this option enabled, UltraEdit inserts characters with a code value greater than 127 decimal, like éàèù (French) or äöüÄÖÜß (German), with the code value taken from the system OEM code page, which is defined by the Windows Region and Language settings.
But OEM Character Set is a command to toggle an option. It does not convert the characters in the active file. It just enables writing text from now on using the system OEM code page instead of the system ANSI code page, which is the default.
This option is very useful when writing batch files, where the batch file code must be written using the OEM code page, for example when text containing éàèùäöüÄÖÜß should be output to the console window with the command echo on batch file execution. But this option is definitely not useful when writing or editing HTML files.
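To illustrate the difference, here is a small Python sketch. It assumes code page 850 as the OEM code page and Windows-1252 as the ANSI code page, which is typical for a Western European Windows setup; the code pages on your system may differ.

```python
# Illustration only: cp850 is ASSUMED as OEM code page and cp1252 as ANSI
# code page. Both depend on the Windows Region and Language settings.
text = "éàèùäöüÄÖÜß"

ansi_bytes = text.encode("cp1252")
oem_bytes = text.encode("cp850")

# The same characters are stored with different byte values.
print("ANSI:", " ".join(f"{b:02X}" for b in ansi_bytes))
print("OEM: ", " ".join(f"{b:02X}" for b in oem_bytes))

# Bytes written with the OEM code page, but interpreted as ANSI,
# result in wrong characters being displayed.
misread = oem_bytes.decode("cp1252", errors="replace")
print(misread)
```

This is why text typed with the OEM option enabled looks wrong in any program that interprets the file as ANSI, for example a browser.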
The option toggled by OEM Character Set is set per file extension group. The file extension groups can be configured at Advanced - Settings/Configuration - Editor - Word Wrap/Tab Settings. I suppose you have only the group Default. Therefore this option is now enabled for all non-Unicode files.
UltraEdit also has the commands OEM to ANSI and ANSI to OEM to convert everything in the active file to ANSI or OEM. Those two commands are, for example, in menu File in submenu Conversions when using UE < v23.00 or UE v24.00 with traditional menus.
However, since UE v14.10 the command OEM Character Set still exists in UltraEdit, but is no longer available in any menu or toolbar, or in the customization dialogs for menu, toolbar and ribbon. I suppose that IDM removed this command because too many users enabled it by mistake, although it is very useful for users like me who often write batch files. See the forum topics "Manual customization of command OEM Character Set" and "UE/UES configuration to edit batch files (*.bat, *.cmd) by default with OEM character set".
You don't need this option as it is not useful when editing HTML files. So you need to toggle off this option in the old version of UltraEdit by clicking once again in menu View on menu item OEM Character Set.
But if you are now using UE v24.00 with the configuration taken over from the old version of UltraEdit, you have to toggle off this option by editing the INI file, as the command does not exist anymore in UE v24.00 in any menu, toolbar or ribbon. I wrote an instruction on how to add this command to a toolbar, but I suggest not doing all those steps as you do not really need this command.
Do the following to disable OEM Character Set in the INI file of UltraEdit v24.00:
1. Exit all running instances of UltraEdit.
2. Copy %APPDATA%\IDMComp\UltraEdit to the clipboard, paste this folder path with Ctrl+V into the address bar of Windows Explorer and press key RETURN to open this folder, which is hidden by default.
   There is at least one, but more likely several uedit*.ini files. 64-bit UltraEdit v24.00 uses only uedit64u.ini and 32-bit UltraEdit v24.00 uses only uedit32u.ini. Any other uedit*.ini files are from previous versions of UltraEdit.
3. Open the INI file used by UltraEdit v24.00 in Windows Notepad.
4. Search for the setting Force OEM, which exists as Force OEM=, Force OEM2=, Force OEM3=, ... in section [Settings].
5. Modify each Force OEM value from 1 to 0. I suppose there is only Force OEM=1, which needs to be modified to Force OEM=0.
6. Save and close the edited INI file.
7. Redo the steps 3 to 6 for the other files if you plan to uninstall UE v24.00 and reinstall the old UltraEdit version.
8. Start UltraEdit. The accented characters are now inserted into non-Unicode files again using the system ANSI code page, i.e. Windows-1252 in your case.
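If you prefer a script over editing by hand, the change to the Force OEM values could be sketched in Python as below. The function name disable_force_oem is hypothetical, and the function works only on text already loaded into a string. Reading and writing the INI file with the correct encoding is left to you, as I don't state here which encoding the file has. Make a backup of the INI file first and run this only while UltraEdit is not running.

```python
import re


def disable_force_oem(ini_text: str) -> str:
    """Return ini_text with every Force OEM, Force OEM2, ... value of 1
    in section [Settings] changed to 0. Other sections are not modified."""
    result = []
    in_settings = False
    for line in ini_text.splitlines(keepends=True):
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            # A section header: remember if it is the [Settings] section.
            in_settings = stripped.lower() == "[settings]"
        elif in_settings:
            line = re.sub(r"^(\s*Force OEM\d*\s*=\s*)1(\s*)$", r"\g<1>0\2", line)
        result.append(line)
    return "".join(result)


# Made-up sample content, not taken from a real uedit64u.ini.
sample = "[Settings]\nForce OEM=1\nForce OEM2=1\nForce OEM3=0\n[Other]\nForce OEM=1\n"
print(disable_force_oem(sample))
```

The sample shows that only the entries in section [Settings] are changed and that line endings are kept as they are.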
ISO 8859-1 has no character defined in code value range 7F to 9F, while Windows-1252 contains in this range Western European characters like €. So be careful with charset=iso-8859-1 in the header of an HTML file and do not insert a character of this range without encoding it with the appropriate HTML entity, i.e. use &euro; instead of €. Otherwise your HTML file is not really valid because it uses characters not defined in the declared character set. It would be better to use charset=windows-1252 when inserting € not encoded as an HTML entity into an ANSI encoded HTML file.
Well, in practice it does not really matter for displaying the HTML file in a browser if € is in the HTML file as a single byte with hexadecimal code value 80 while the HTML file contains the character set declaration charset=iso-8859-1. It is standard that all browsers interpret an HTML file with charset=iso-8859-1 identically to an HTML file with charset=windows-1252 and convert the bytes correctly to Unicode for displaying. The browser manufacturers know that most HTML writers do not know what charset= really means and that there are differences between ISO 8859-1 and Windows-1252.
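The difference between the two character sets can be seen with a short Python sketch. Note that Python's iso-8859-1 codec is strict and does not apply the browser behavior described above, so it shows what the byte 80 means when the declaration is taken literally.

```python
raw = b"\x80"  # the single byte stored for € in a Windows-1252 encoded file

as_latin1 = raw.decode("iso-8859-1")  # taken literally: a control code, not €
as_cp1252 = raw.decode("cp1252")      # what browsers actually do

print(f"iso-8859-1:   U+{ord(as_latin1):04X}")
print(f"windows-1252: U+{ord(as_cp1252):04X} {as_cp1252}")
```

With the strict interpretation the byte becomes the control code U+0080, while the Windows-1252 interpretation gives the euro sign U+20AC.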
Important to know for every HTML/XHTML writer:
The character set declaration in an HTML file defines how to interpret the bytes in the HTML file. It does not define which character set to use for displaying the HTML file contents. All browsers convert the text to display or print to Unicode on loading the HTML file. So charset= defines for the browser how to interpret the bytes of the HTML file and not how to display it. An HTML file with German text containing only ASCII characters, because äöüÄÖÜß€ and other characters with a code value greater than 7F are encoded with HTML entities, can be declared with charset=us-ascii and is nevertheless displayed correctly by any browser, as the browser converts the byte stream with the HTML entities correctly to Unicode for displaying the text.
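A minimal Python sketch of this two-step processing, with a made-up German sample text: the bytes are first interpreted according to the declared character set (here US-ASCII), then the HTML entities are resolved to Unicode characters. A real browser does much more than html.unescape, so this shows only the principle.

```python
import html

# Made-up sample: pure ASCII bytes, all non-ASCII characters as HTML entities.
raw = b"Gr&uuml;&szlig;e aus K&ouml;ln: 5 &euro;"

markup = raw.decode("ascii")      # step 1: interpret the bytes as declared
displayed = html.unescape(markup)  # step 2: resolve entities to Unicode

print(displayed)  # Grüße aus Köln: 5 €
```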