How to know encoding of a file?

Newbie

    Feb 17, 2015#1

    Hi

    Is it possible to find out the encoding of a text file when opening it (Unicode, UTF-8, etc.)?

    I know the file can be converted, but how can I tell what the original encoding was?

    Many thanks

    Advanced User

      Feb 18, 2015#2

      Automatically detecting the encoding of a file is not an easy problem. Raymond Chen of Microsoft has written about how Notepad tries to deal with it and the various issues that crop up:

      - Some files come up strange in Notepad
      - The Notepad file encoding problem, redux

      And his articles don't even get into the additional ambiguities of trying to detect non-ANSI/non-UNICODE single-byte or multi-byte encodings.

      To quote from the end of the second article:
      The point is that no matter how you decide to resolve the ambiguity, somebody will win and somebody else will lose. And then people can start experimenting with the "losers" to find one that makes your algorithm look stupid for choosing "incorrectly".
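
To make the ambiguity concrete, here is a minimal Python sketch of the usual first steps: check for a byte order mark (BOM), then test whether the bytes are valid ASCII or UTF-8. The function name `guess_encoding` and the fallback label `unknown-8bit` are made up for illustration; this is not Notepad's or any other editor's actual algorithm. Anything that lands in the fallback could be any single-byte or multi-byte code page, which is exactly where guessing starts.

```python
import codecs

# BOM signatures, longest first so that UTF-32 LE (FF FE 00 00)
# is not mistaken for UTF-16 LE (FF FE).
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def guess_encoding(data: bytes) -> str:
    """Best-effort guess from raw bytes (hypothetical helper, a sketch only)."""
    # 1. A BOM is the only near-certain signal.
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    # 2. Pure 7-bit data is valid in almost every encoding; call it ASCII.
    try:
        data.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass
    # 3. Valid UTF-8 is unlikely to occur by accident in longer texts.
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # 4. Some 8-bit code page; which one cannot be known from the bytes alone.
        return "unknown-8bit"
```

For a real file, something like `guess_encoding(open(path, "rb").read())` would be the typical call. Even step 1 is only a guess: a UTF-16 LE file that happens to start with a NUL character after its BOM looks like UTF-32 LE to this sketch.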

      Grand Master

        Feb 18, 2015#3

        UltraEdit has built-in encoding detection for UTF-8, UTF-16 Little Endian, UTF-16 Big Endian, and ASCII Escaped Unicode. It also supports the character set declaration in HTML and XHTML files (the short HTML5 declaration with v24.00 or later) and the encoding declaration in XML files.

        So let UltraEdit auto-detect the encoding, then look at the status bar at the bottom of the window, which displays the encoding of the active file, or the code page for a non-Unicode file.
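
For anyone who wants to check a declared encoding outside the editor, the declaration-based part can be approximated in a few lines of Python. This is only a rough sketch and not UltraEdit's implementation: the function name `declared_encoding` and both regular expressions are assumptions made for illustration, and the approach only works when the start of the file is in an ASCII-compatible encoding.

```python
import re
from typing import Optional

# <?xml version="1.0" encoding="UTF-8"?>
XML_DECL = re.compile(
    rb'<\?xml[^>]*\bencoding\s*=\s*["\']([A-Za-z0-9._-]+)["\']'
)
# <meta charset="utf-8">  or
# <meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
META_CHARSET = re.compile(
    rb'<meta[^>]*\bcharset\s*=\s*["\']?([A-Za-z0-9._-]+)', re.IGNORECASE
)


def declared_encoding(head: bytes) -> Optional[str]:
    """Return the encoding named in an XML or HTML declaration, if any.

    `head` should be the first few kilobytes of the file, read in binary
    mode. Hypothetical helper: it reports what the file claims, which is
    not necessarily how the file was actually saved.
    """
    for pattern in (XML_DECL, META_CHARSET):
        match = pattern.search(head)
        if match:
            return match.group(1).decode("ascii").lower()
    return None
```

A typical call would be `declared_encoding(open(path, "rb").read(4096))`. A result of `None` means no declaration was found, not that the file has no particular encoding.
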
        Best regards from an UC/UE/UES for Windows user from Austria