How to know encoding of a file?

Post Feb 17, 2015 #1

Hi

I'd like to know whether it's possible to find out the encoding of a text file when opening it (Unicode, UTF-8, etc.).

I know it's possible to convert a file, but how can I tell what its original encoding is?

Many thanks

Post Feb 18, 2015 #2

Automatically detecting the encoding of a file is not an easy problem. Raymond Chen of Microsoft has written about how Notepad tries to deal with it and the various issues that crop up:

- Some files come up strange in Notepad
- The Notepad file encoding problem, redux

And his articles don't even get into the additional ambiguities of trying to detect non-ANSI/non-UNICODE single-byte or multi-byte encodings.

To quote from the end of the second article:
The point is that no matter how you decide to resolve the ambiguity, somebody will win and somebody else will lose. And then people can start experimenting with the "losers" to find one that makes your algorithm look stupid for choosing "incorrectly".
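To make the ambiguity concrete, here is a minimal Python sketch (my own illustration, not taken from the articles). The sample bytes are an arbitrary choice; the point is only that the same bytes decode without error under two different encodings, so the bytes alone can't tell a detector which one was intended.

```python
# The same 12 bytes are valid in more than one encoding.
data = b"hello world!"

as_ascii = data.decode("ascii")      # 12 one-byte characters
as_utf16 = data.decode("utf-16-le")  # 6 two-byte characters, no error raised

print(len(as_ascii), repr(as_ascii))
print(len(as_utf16), repr(as_utf16))
```

Both decodes succeed, so a detector has to guess, and whichever way it guesses, some file will prove it wrong.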

Post Feb 18, 2015 #3

UltraEdit has built-in encoding detection for UTF-8, UTF-16 Little Endian, UTF-16 Big Endian, and ASCII Escaped Unicode. It also honors the character set declaration in HTML and XHTML files (the short HTML5 declaration with v24.00 or later) and the encoding declaration in XML files.

So let UltraEdit auto-detect the encoding, then look at the status bar at the bottom of the window, where the encoding (or the code page, for a non-Unicode file) of the active file is displayed.
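If you want to check a file outside the editor, the following is a rough Python sketch of the general idea: look for a byte order mark, and otherwise test whether the bytes are valid UTF-8. This is not UltraEdit's actual algorithm, which I can't show here; `guess_encoding` is a hypothetical helper name, and the sketch makes no attempt to identify a specific ANSI code page.

```python
import codecs
from typing import Optional

# Byte order marks for the Unicode encodings mentioned above.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]

def guess_encoding(data: bytes) -> Optional[str]:
    """Return a best guess for the encoding of data, or None if unknown."""
    # 1. A BOM at the start of the file is the strongest hint.
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    # 2. No BOM: valid UTF-8 (which includes plain ASCII) is a good sign.
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # 3. Probably a single-byte or multi-byte code page; the bytes
        #    alone don't say which one.
        return None
```

Usage would be something like `guess_encoding(open("file.txt", "rb").read())`, where `file.txt` is a placeholder path. A result of `None` means only that the file is neither BOM-marked nor valid UTF-8; which code page it uses still has to come from context.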