Format (encoding) not preset correctly in Save As dialog (fixed)

    Aug 03, 2011#1

    Hi,

    I need to know how UES handles byte order marks. For example, if I open a file from another source which does not contain a BOM, will UES add one? Generally speaking, will UES keep BOMs as it found them? Can I add or remove them?

    Thanks,

    --Brian

      Aug 03, 2011#2

      A byte order mark already present in a file on opening is kept on save (with Ctrl+S) by default.

      For UTF-8 files without BOM there are two configuration settings:
      • Write UTF-8 BOM header to all UTF-8 files when saved
      • Write UTF-8 BOM on new files created within this program (if above is not set)
      By default neither setting is enabled, so UTF-8 files opened without a BOM are also saved (with Ctrl+S) without a BOM.
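
To check what an editor actually wrote, it helps to look at the first bytes of the saved file. The following is a minimal Python sketch, not anything UltraEdit provides; the names `detect_bom` and `detect_bom_in_file` are made up for illustration.

```python
import codecs

# Known byte order marks. The UTF-32 LE mark starts with the same two bytes
# as the UTF-16 LE mark, so the longer marks are checked first.
BOMS = [
    (codecs.BOM_UTF32_LE, "UTF-32 LE"),
    (codecs.BOM_UTF32_BE, "UTF-32 BE"),
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_LE, "UTF-16 LE"),
    (codecs.BOM_UTF16_BE, "UTF-16 BE"),
]


def detect_bom(data: bytes):
    """Return the name of the BOM at the start of data, or None if there is none."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None


def detect_bom_in_file(path):
    """Read only the first four bytes of a file and report its BOM."""
    with open(path, "rb") as handle:
        return detect_bom(handle.read(4))
```

Running `detect_bom_in_file` on a file before and after saving shows whether the save added, kept or removed the mark.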

      UTF-16 files opened without a BOM are by default also saved without a BOM, unless the entire file is sorted, in which case a BOM is added in the background. The workaround is to select everything with Ctrl+A and then sort, which sorts the entire file without adding the BOM. Update: Since UltraEdit for Windows v23.20 and UEStudio v16.20, a BOM is no longer added in the background when sorting an entire UTF-16 LE encoded file without BOM.

      The Save As dialog has the Format option, where you can select the encoding with which a Unicode file should be saved and whether it is saved with or without a BOM. So the Save As command can be used to remove or add a BOM.

      You can use the Replace In Files command to remove a BOM from a batch of files; see for example the topic "Byte Order Marker (BOM) query?"
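
Outside the editor, the same batch removal can be sketched in a few lines of Python. This is an illustration only, not the Replace In Files approach itself; the function names, the directory argument and the `*.txt` pattern are assumptions, and it is worth trying on a copy of the files first because they are rewritten in place.

```python
import codecs
from pathlib import Path


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (EF BB BF); other data is unchanged."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def strip_bom_in_tree(root, pattern="*.txt"):
    """Remove the UTF-8 BOM from every matching file below root.

    Returns the list of files that were actually rewritten.
    """
    changed = []
    for path in sorted(Path(root).rglob(pattern)):
        if not path.is_file():
            continue
        data = path.read_bytes()
        stripped = strip_utf8_bom(data)
        if stripped != data:
            path.write_bytes(stripped)
            changed.append(path)
    return changed
```

Working on raw bytes leaves the rest of each file, including its line terminators, untouched.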

        Aug 05, 2011#3

        The "Save As" action always chooses UTF-8 (with BOM), and I have to mouse around and reset it in every case.

        Since the UTF-8 with BOM is a brain-dead format, which is also incompatible with many applications, I NEVER want a BOM on UTF-8.
        There are no 16-bit entities in the UTF-8 encoding, so it is completely extraneous. It is not "erroneous" only because the UNICODE standards committee made a mistake to allow it.

        I want the default to be UTF-8 without BOM.
        In fact, I would like to remove the UTF-8 with BOM encoding completely from UltraEdit!

        Is it possible to change the default for "Save As"?

          Aug 05, 2011#4

          I'm currently using English UltraEdit v17.10.0.1015 on Windows XP SP3, and the Format option in the Save As dialog is always set to Default. That means the file is saved with the encoding indicated in the status bar, and for UTF-8 files the BOM is handled according to the two settings for newly created files and for UTF-8 files opened without a BOM. As far as I know, UltraEdit v17.00 no longer remembers the last used value of the Format option. Whenever I open the Save As dialog, Default is preselected for the Format option.

          arofer, you have not posted which version of UltraEdit and which operating system you use. The Save As dialog of UE v17.00 and later depends on the version of Windows. Two different Save As dialogs are implemented in UE v17.00 and later versions, one for Windows XP and earlier and another one for Windows Vista and later. Perhaps something differs there that results in UTF-8 always being preselected.

          Wait a moment, I just noticed something interesting. While the Format option is always set to Default when I open the Save As dialog, I can see that UltraEdit remembers the last used Format option in uedit32.ini with the value File Format= in section [Settings]. I don't know why this is done, because on my installation this value is never applied. I will ask IDM by email about this value in uedit32.ini and will post the reply here.

            Aug 08, 2011#5

            I am on version 17.10.0.1015 of UltraEdit on Windows 7 Professional.
            The UTF-8 settings are:
            New File Creation: UTF-8
            File Handling/Save: all options are deselected

            I never see "Default" in the Save As dialog. I always see "UTF-8", which puts a BOM on the file.

            UTF-8 with BOM is incompatible with most programs, since the BOM is essentially useless in this encoding.
            Hence text processing programs such as editors, compilers, etc. tend to assume no one with an ounce of sense would use it.
            Search the web for "UTF8 BOM" and you can see the numerous problems it causes.
            It seems like UltraEdit is one of the very few programs in the world that uses this file format regularly, for no good reason I can discern.
            It is unnecessary for detection of utf-8 file encoding, which is always done correctly on files without the BOM.
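
For readers wondering how detection without a BOM can work at all: a common heuristic is to try a strict UTF-8 decode and see whether it fails. The Python sketch below shows that heuristic only; it is an assumption for illustration and says nothing about how UltraEdit itself detects encodings. The name `looks_like_utf8` is made up.

```python
def looks_like_utf8(data: bytes) -> bool:
    """Heuristic: True if data is valid UTF-8 under strict decoding.

    Pure ASCII also passes, because ASCII is a subset of UTF-8. Text in a
    single-byte code page with non-ASCII characters usually fails, because
    its bytes rarely form valid UTF-8 sequences.
    """
    try:
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True
```

The check should be run on the whole file content; cutting the data in the middle of a multi-byte character makes valid UTF-8 look invalid.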

            We really like UltraEdit and therefore we are quite baffled about this apparent lack of good judgment.

              Aug 08, 2011#6

              I have not yet received a reply from IDM. I can only tell you what I see with UltraEdit v17.10.0.1015 on Windows XP SP3 x86.
              • If I select Create new files as UTF-8 at Configuration - Editor - New File Creation,
              • and Write UTF-8 BOM header to all UTF-8 files when saved
                and Write UTF-8 BOM on new files created within this program
                are both not enabled at Configuration - File Handling - Save,
              • and I press Ctrl+N to open a new file which is encoded in UTF-8 with DOS line terminators as indicated in the status bar at bottom with U8-DOS,
              • and enter some text with characters with decimal value greater than 127,
              • and press key F12 to open Save As dialog for saving the new file,
              • the Format option is set to Default and the file is saved as UTF-8 file without BOM.
              I agree that UTF-8 encoded files are usually used without a BOM and that a BOM causes problems in many applications that do not support Unicode files. UTF-8 encoding without BOM is often used because
              • many applications like compilers and interpreters still do not really support Unicode files. UTF-8 encoding makes it possible to use those applications nevertheless, because UTF-8 encoded files can be interpreted even when only single-byte, null-terminated strings can be processed.
              • UTF-8 encoding saves storage space and reduces data transfer volume for many text files in comparison to UTF-16, because most text files contain mainly ASCII characters and only a few other Unicode characters.
              However, a text editor should nevertheless support UTF-8 with BOM.
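
Both points can be checked with a short Python sketch. The helper names `encoded_sizes` and `has_embedded_nul` are made up for illustration, and the sample string is arbitrary.

```python
def encoded_sizes(text: str) -> dict:
    """Return the encoded length in bytes for UTF-8 and UTF-16 LE (no BOM)."""
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16-le": len(text.encode("utf-16-le")),
    }


def has_embedded_nul(text: str, encoding: str) -> bool:
    """True if the encoded bytes contain a zero byte.

    A zero byte ends the string early in code that only handles
    single-byte, null-terminated strings.
    """
    return b"\x00" in text.encode(encoding)


sample = "int main() { return 0; } // Grüße"
sizes = encoded_sizes(sample)
nul_in_utf8 = has_embedded_nul(sample, "utf-8")
nul_in_utf16 = has_embedded_nul(sample, "utf-16-le")
```

For mostly ASCII text like the sample, UTF-16 needs roughly twice the bytes and contains zero bytes, while UTF-8 does not. The size advantage does not hold for every text: characters that need three bytes in UTF-8 but two in UTF-16 reverse it.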

              I just tested UltraEdit with the default settings by renaming uedit32.ini (usually located in directory %APPDATA%\IDMComp\UltraEdit) and starting UltraEdit, which then created a completely new INI file. The two Write UTF-8 BOM ... settings are not checked by default. The default encoding type for new files is ANSI, so the user must change this setting if new files should be encoded in UTF-8. After making this change, opening a new file, entering some text, opening the Save As dialog, entering a file name and saving, the new file was saved on my hard disk as a UTF-8 encoded file with DOS line terminators and without BOM.

              So UltraEdit is configured by default to save UTF-8 files without BOM, at least on Windows 2000 / XP. Only when a UTF-8 file with BOM is opened, modified and saved does it still contain the BOM after saving.

              Perhaps the Save As dialog for Windows 7 and Vista currently works differently from the Save As dialog for Windows XP and 2000.

              Is there any other user using UE v17.10.0.1015 on Windows 7 / Vista who can confirm this behavior?

              In the meantime, while waiting for a reply from IDM support, I suggest that you check your uedit32.ini. Open it with Notepad while UltraEdit is not running. Search for File Format= and set the value to 0 (zero). Make sure there is no second File Format= in uedit32.ini. Save uedit32.ini, close Notepad and check whether UltraEdit now always has Default set in the Save As dialog.
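
The same manual edit can be sketched as a small Python function that works on the text of the INI file. This is an illustration under assumptions: the file is taken to be a plain `[Section]` / `key=value` INI file, the function name `set_ini_value` is made up, and reading and writing the real uedit32.ini (whose text encoding is not stated in this topic) is left to the caller. As with Notepad, UltraEdit should not be running.

```python
def set_ini_value(text: str, section: str, key: str, value: str) -> str:
    """Return text with key=value set inside [section]; other lines are untouched.

    Every occurrence of key within the section is rewritten, which also
    covers an accidental second entry. Line endings are normalized to "\n".
    """
    out = []
    in_section = False
    replaced = False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            # A new section header starts; remember whether it is ours.
            in_section = stripped[1:-1] == section
        elif in_section and "=" in stripped:
            name = stripped.split("=", 1)[0].strip()
            if name == key:
                line = f"{key}={value}"
                replaced = True
        out.append(line)
    if not replaced:
        raise KeyError(f"{key!r} not found in [{section}]")
    return "\n".join(out) + "\n"
```

Calling `set_ini_value(text, "Settings", "File Format", "0")` corresponds to the edit described above.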

              If that does not work and UTF-8 is still preselected in the Save As dialog, close UE, rename uedit32.ini to, for example, uedit32_bak.ini, start UE (which creates a new uedit32.ini), select encoding type UTF-8 for new files in the configuration, open a new file (the one already open is an ANSI file), enter some text and save it. Is the file now saved as a UTF-8 file without BOM?

                Aug 09, 2011#7

                Okay I fixed it.
                First, I changed the Uedit32.ini as you recommended. No change in behavior.

                Second, I renamed the Uedit32.ini and with save as, I got the Default encoding.
                I changed this to "utf8 no bom" and that became the new defaulted "save as" encoding.

                I diffed the old Uedit32.ini and the new one and noticed 3 significant changes. The new ones were:

                Default File Type=0
                Default UTF8=0
                File Format=4

                By reverting to the old Uedit32.ini and changing these three settings, my default encoding is now (hurrah!) "utf8 no bom".
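
The comparison of the two INI files can also be done with a short Python sketch based on the standard `configparser` module. It is an illustration only: `read_settings` and `diff_settings` are made-up names, and it assumes the file content parses as a regular INI file, which may not hold for every line of a real uedit32.ini.

```python
import configparser


def read_settings(ini_text: str, section: str = "Settings") -> dict:
    """Parse INI text and return the key/value pairs of one section."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.optionxform = str  # keep key case, e.g. "Default UTF8"
    parser.read_string(ini_text)
    return dict(parser[section]) if parser.has_section(section) else {}


def diff_settings(old_text: str, new_text: str, section: str = "Settings") -> dict:
    """Return {key: (old value, new value)} for every key that differs."""
    old = read_settings(old_text, section)
    new = read_settings(new_text, section)
    return {
        key: (old.get(key), new.get(key))
        for key in sorted(old.keys() | new.keys())
        if old.get(key) != new.get(key)
    }
```

A key that exists in only one of the two files shows up with `None` on the other side.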

                So, thanks for the hints.
                Apparently, the first time you do "save as" plants in concrete the defaulted encoding, which can then only be fixed with a change to Uedit32.ini outside of UltraEdit.

                As far as your other comments go, I would agree that text processors "should" support "utf-8 with BOM", if only because others have made the mistake of creating files in this encoding. My main complaint was that UltraEdit was DEFAULTING to this rotten encoding, causing me many headaches.

                It's not only text processors that have "no support for Unicode" that do not support "utf-8 with BOM". It's also text processors with robust Unicode support that fail to support "utf-8 with BOM". Take, for example, the standard Sun java compiler (!!!). The reason for this lack of support is therefore not due to not embracing Unicode, but rather because of a recognition that "utf8-with-BOM" is a backwater, rarely-used, inferior encoding that uses a redundant and useless marker only because of some mistake made by a standards committee.

                BTW, did you ever wonder what it might mean if the bytes in the BOM in utf8 were switched? It's like one hand clapping.

                May I suggest that the BOM in "utf8-with-BOM" stands for "Backwater Obsolete Mark" rather than "byte order mark", which is a misnomer.

                  Aug 09, 2011#8

                  The configuration setting Encoding Type of New File Creation is saved in uedit32.ini with two settings (two for backward compatibility):

                  For ANSI:
                  [Settings]
                  Default UTF8=0
                  Default Unicode=0

                  For UTF-8:
                  [Settings]
                  Default UTF8=1
                  Default Unicode=0

                  For Unicode:
                  [Settings]
                  Default UTF8=0
                  Default Unicode=1

                  I have already explained that the last used Format is remembered with:

                  [Settings]
                  File Format=index of last Format selection

                  The configuration setting Default file type for new files of DOS/Unix/Mac Handling is stored with:

                  [Settings]
                  Default File Type=0

                  Value 0 is for DOS, value 1 for UNIX and value 2 for MAC.
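
The value combinations listed above can be turned into a small read-only lookup. The Python sketch below only restates the mapping from this post; the function name `describe_new_file_settings` is made up, and any combination not listed here is reported as "unknown" rather than guessed.

```python
# (Default UTF8, Default Unicode) -> encoding type for new files
NEW_FILE_ENCODINGS = {
    ("0", "0"): "ANSI",
    ("1", "0"): "UTF-8",
    ("0", "1"): "Unicode",
}

# Default File Type -> line terminator type for new files
LINE_TERMINATORS = {"0": "DOS", "1": "UNIX", "2": "MAC"}


def describe_new_file_settings(settings: dict) -> dict:
    """Translate raw [Settings] values into readable names.

    settings maps key names to string values as read from the INI file.
    Missing keys or unlisted combinations yield "unknown".
    """
    flags = (settings.get("Default UTF8"), settings.get("Default Unicode"))
    file_type = settings.get("Default File Type")
    return {
        "encoding": NEW_FILE_ENCODINGS.get(flags, "unknown"),
        "line_terminator": LINE_TERMINATORS.get(file_type, "unknown"),
    }
```

For example, `{"Default UTF8": "1", "Default Unicode": "0", "Default File Type": "0"}` describes new files created as UTF-8 with DOS line terminators.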

                  It should not be required to change any of these settings manually and directly in uedit32.ini.

                  Today I received the reply from IDM to my email with the questions:
                  Mofi wrote: What is the purpose of File Format= in uedit32.ini?

                  Is it possible that Encoding in Save As dialog on Windows 7 / Vista is set to this value while this is not done for Save As dialog on Windows 2000 / XP?
                  My email contained not just these two questions, but also some additional information and the link to this topic.
                  IDM wrote: Thank you for your message. Yes, this value is used to track the Encoding value last used in the Save As dialog so that this can be remembered and provided the next time the Save As dialog is invoked.
                  So we now know that UltraEdit should remember the last used Format selection and preselect that encoding the next time the Save As dialog is opened. This definitely does not work in UE v17.10.0.1015 on Windows XP / 2000, because Default is always set in the Save As dialog although the last used format option is remembered correctly. I tested this with several archived versions of UltraEdit and found that preselecting the last used format option has not worked since UltraEdit v16.00.0.1025. It's really interesting that nobody reported this until now. It looks like most users don't need a conversion from the default encoding type for new files to a different encoding on first save.

                  And it looks like the opposite is the case on Windows 7 / Vista. The remembered encoding is preset in the Save As dialog, but the dialog no longer correctly remembers the last used format selection. You should report this by email to IDM support. I don't want to do this myself because, having no computer running Windows 7 or Windows Vista, I can't verify whether this issue is fixed in a future version. Thanks.

                    Aug 09, 2011#9

                    I believe I have found an (additional) problem.
                    If "Default UTF8=1", I cannot get the "Save As" default setting to move off of "UTF-8 with BOM".

                    It looks like this is a bug in UltraEdit, so I am bailing out and going to support with this.

                    I sent in my settings and example to IDMCOMP support.
                    They verified that this is indeed a bug and will let me know when it gets fixed.

                      Aug 12, 2011#10

                      Thank you Mofi and arofer for getting to the bottom of this!

                      --Brian

                        Sep 02, 2011#11

                        The issue on Windows XP / 2000 that Format is always set to Default instead of the last used format (encoding) is fixed in UE v17.20.

                          Sep 08, 2011#12

                          Okay. I just downloaded Version 17.20.0.1014 to my Windows 7 Pro, and it seems to have fixed the BOM default problem. The default encoding is now set to UTF8-no BOM, unlike the prior version, which stubbornly stuck at UTF-8 WITH BOM.

                          It baffles me, however, as to why IDM, which is otherwise so clever in their product development, are so supportive of a brain-dead (utf8 with bom) encoding. It's like they have a blind spot in their vision.

                          For example, the entry labeled "UTF-8" is actually "UTF-8 WITH (obsolete, unsupported, and unnecessary) BOM".
                          And, the entry labeled "UTF-8 WITH NO BOM" is actually the standard, supported form of UTF-8 file encoding.
                          This nomenclature is backwards at best. The oddball is the UTF-8 WITH BOM.

                          Anyway, thanks to IDM for fixing this problem, which has cost me a great deal of frustration.

                            May 03, 2019#13

                            So - 8 years and 9 versions later (v26) I have the same(?) problem.

                            I have two files:

                            Jim.txt in ANSI
                            Jim.xml in UTF-8/BOM

                            I open and edit them.
                            Then I want to press F12 to save them under Joe.txt and Joe.xml, both in their current format.
                            But in one file it always wants to change the format. (I suppose "Default" is the last used "Save as - Encoding".)

                            What's the current state of this topic?
                            As written above? Manual editing of the INI? Or some settings I didn't find?
                            How to keep the files in the format they are?
                            UE 26.20.0.74 German / Win 10 x 64 Pro

                              May 04, 2019#14

                              The oldest UltraEdit version I have archived is 10.10c, which does not have the Format option in the Save As dialog.

                              The next archived UltraEdit version is 11.10c, which already has the Format option in the Save As dialog. So I ran tests on Windows 7 x64 with the Windows Classic desktop theme to find out which versions of UltraEdit for Windows remember the last used Format or Encoding selection in the Save As dialog and which versions always set this option to Default. The results are:
                              • UE v11.10c to UE v15.20.0.1027 remember last used Format selection in Save As dialog.
                              • UE v16.00.0.1025 to v16.30.0.1003 set Format always to Default on opening Save As dialog.
                              • UE v17.00.0.1025 introduced Save As dialog on Windows Vista and later Windows versions with option Encoding. The Save As dialog on Windows XP and former versions was still with option Format. The two different Save As dialogs depending on version of Windows were kept up to UE v22.20.0.49 which was the last released UltraEdit version running also on Windows XP. I ran the tests additionally on Windows XP for those versions of UltraEdit.
                                • UE v17.00.0.1025 to v17.10.0.1025 remember last used Encoding selection in Save As dialog on Windows Vista and later Windows versions.
                                  UE v17.00.0.1025 to v17.10.0.1025 set Format always to Default on opening Save As dialog on Windows XP and former Windows versions.
                                • UE v17.20.0.1016 to v17.30.0.1016 remember last used Encoding selection in Save As dialog on Windows Vista and later Windows versions.
                                  UE v17.20.0.1016 to v17.30.0.1016 remember last used Format selection in Save As dialog on Windows XP and former Windows versions.
                                • UE v18.00.0.1021 to v22.20.0.49 set Encoding always to Default on opening Save As dialog on Windows Vista and later Windows versions.
                                  UE v18.00.0.1021 to v22.20.0.49 set Format always to Default on opening Save As dialog on Windows XP and former Windows versions.
                              • UE v23.00.0.42 to v26.00.0.72 (currently latest version) remember last used Encoding selection in Save As dialog.
                              The preferred Encoding selection should always be Default, which results in saving a file under a new (or the same) name with the encoding the file already has.
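
For quick reference, the test results listed above can be written down as a lookup table. The Python sketch below only restates those results; the names `SAVE_AS_BEHAVIOR` and `remembers_last_encoding` are made up, versions are given as (major, minor) tuples, and `None` marks combinations that the list does not cover, such as Windows XP results for versions before 17.00 or versions outside the tested ranges.

```python
# (first version, last version, Windows Vista and later, Windows XP and earlier)
# True  = the last used Format/Encoding selection is remembered
# False = the option is always set to Default
# None  = not covered by the listed tests
SAVE_AS_BEHAVIOR = [
    ((11, 10), (15, 20), True, None),
    ((16, 0), (16, 30), False, None),
    ((17, 0), (17, 10), True, False),
    ((17, 20), (17, 30), True, True),
    ((18, 0), (22, 20), False, False),
    ((23, 0), (26, 0), True, None),
]


def remembers_last_encoding(version, vista_or_later=True):
    """Look up the tested Save As behavior for a (major, minor) version tuple."""
    for first, last, on_vista, on_xp in SAVE_AS_BEHAVIOR:
        if first <= version <= last:
            return on_vista if vista_or_later else on_xp
    return None
```

For example, `remembers_last_encoding((17, 10), vista_or_later=False)` looks up the Windows XP result for v17.10.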

                              So with Default, the ANSI encoded Jim.txt is saved as ANSI encoded Joe.txt, and Jim.xml encoded in UTF-8 without or with BOM is saved as Joe.xml encoded in UTF-8 without or with BOM respectively, according to my test with UE v26.00.0.72 on Windows 7, provided the two UTF-8 configuration settings are both unchecked, as they are by default.
                              Best regards from an UC/UE/UES for Windows user from Austria

                                May 06, 2019#15

                                For heaven's sake - how many versions have you installed? 😊

                                Thanks - I think that with "Default" (German: Standard) it's solved.
                                UE 26.20.0.74 German / Win 10 x 64 Pro