inflated file size with UTF-8 encoding

Newbie

    Feb 04, 2006 #1

    Hello,

    When I edit a UTF-8 encoded file, the file size that UltraEdit displays in the status bar is about twice the actual size (number of chars * 2 + 2, the extra 2 perhaps for the BOM?).

    Is there a setting that can turn this off?

    Grand Master

      Feb 04, 2006 #2

      No, this cannot be turned off. What do you know about UTF-8 encoding?

      A short (hopefully correct) explanation: In UTF-8, all ASCII characters, i.e. those with a hexadecimal code value lower than 80 (decimal 128), are stored as a single byte. All other characters are encoded with 2, 3 or 4 bytes.
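To make the byte counts concrete, here is a minimal Python sketch (not anything UltraEdit itself runs). The helper name `utf8_len` and the sample characters are only illustrative choices.

```python
def utf8_len(ch: str) -> int:
    """Return how many bytes a single character occupies in UTF-8."""
    return len(ch.encode("utf-8"))


# One sample per UTF-8 length class: ASCII, Latin-1 range, rest of BMP, beyond BMP.
samples = ["A", "ä", "€", "😀"]
lengths = {ch: utf8_len(ch) for ch in samples}

for ch, n in lengths.items():
    print(f"U+{ord(ch):04X} takes {n} byte(s) in UTF-8")
```

Only characters below hex 80 fit into one byte; everything else needs a multi-byte sequence.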

      For editing, UltraEdit must temporarily convert the UTF-8 file to Unicode, i.e. to UTF-16, where every character is encoded with 2 bytes (characters outside the Basic Multilingual Plane take 4). You can see this by clicking Edit - Hex Functions - Hex Edit (UE < v13.00); click Hex Edit again to switch back to normal editing. The status bar always shows the size of the file as it is currently being edited, so the displayed size is correct: at that moment it really is a UTF-16 file with a BOM (+2 bytes). It is converted back to UTF-8 when the file is saved.
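The size difference can be reproduced outside the editor. The following is a rough Python sketch, assuming the editing buffer is UTF-16 little-endian with a 2-byte BOM as described above; the function names and the sample text are hypothetical and not part of UltraEdit.

```python
import codecs


def utf8_size(text: str) -> int:
    """Size in bytes of the text as saved in UTF-8 (no BOM)."""
    return len(text.encode("utf-8"))


def utf16_size_with_bom(text: str) -> int:
    """Size in bytes of the text as UTF-16 LE plus the 2-byte BOM (assumed buffer format)."""
    return len(codecs.BOM_UTF16_LE) + len(text.encode("utf-16-le"))


text = "Hello, world"  # 12 ASCII characters
on_disk = utf8_size(text)
in_editor = utf16_size_with_bom(text)

print(f"UTF-8 on disk:       {on_disk} bytes")
print(f"UTF-16 while edited: {in_editor} bytes")
```

For pure ASCII text this matches the formula from the question: number of characters * 2 + 2.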
      Best regards from an UC/UE/UES for Windows user from Austria