Display of non-ASCII glyphs in the non-hex display section of hex edit mode

herrblint · Apr 13, 2019, 12:51 UTC · #1

Is there a setting that would cause the hex editor to use the same font the main editor is using, in the non-hex display section of the file?

For example the normal text edit mode of UltraEdit displays the glyph properly:

■ (Black Square, Unicode 25A0)

But when hex edit mode is invoked, that glyph is no longer shown. The non-hexadecimal section renders it as â- followed by a space.

Mofi · Apr 13, 2019, 15:00 UTC · #2

Your question makes it clear to me that you have not understood how hex edit mode works.

I recommend first reading the introductory chapter of the UltraEdit power tip Working with Unicode in UltraEdit/UEStudio.

A UTF-16 Little Endian with BOM encoded text file contains the sentence This character ■ is a black square. as follows:

00000000h: FF FE 54 00 68 00 69 00 73 00 20 00 63 00 68 00 ; ÿþT.h.i.s. .c.h.
00000010h: 61 00 72 00 61 00 63 00 74 00 65 00 72 00 20 00 ; a.r.a.c.t.e.r. .
00000020h: A0 25 20 00 69 00 73 00 20 00 61 00 20 00 62 00 ;  % .i.s. .a. .b.
00000030h: 6C 00 61 00 63 00 6B 00 20 00 73 00 71 00 75 00 ; l.a.c.k. .s.q.u.
00000040h: 61 00 72 00 65 00 2E 00                         ; a.r.e...

The same text is stored in a UTF-8 encoded text file as:

00000000h: 54 68 69 73 20 63 68 61 72 61 63 74 65 72 20 E2 ; This character â
00000010h: 96 A0 20 69 73 20 61 20 62 6C 61 63 6B 20 73 71 ; –  is a black sq
00000020h: 75 61 72 65 2E                                  ; uare.

And the same text looks as follows in a UTF-16 Big Endian with byte order mark encoded text file:

00000000h: FE FF 00 54 00 68 00 69 00 73 00 20 00 63 00 68 ; þÿ.T.h.i.s. .c.h
00000010h: 00 61 00 72 00 61 00 63 00 74 00 65 00 72 00 20 ; .a.r.a.c.t.e.r. 
00000020h: 25 A0 00 20 00 69 00 73 00 20 00 61 00 20 00 62 ; % . .i.s. .a. .b
00000030h: 00 6C 00 61 00 63 00 6B 00 20 00 73 00 71 00 75 ; .l.a.c.k. .s.q.u
00000040h: 00 61 00 72 00 65 00 2E                         ; .a.r.e..
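
The three dumps above can be reproduced outside UltraEdit. The following is a minimal sketch in Python (not something UltraEdit itself provides; the variable names are only illustrative) that encodes the same sentence in the three encodings and prints the bytes stored for the black square in each of them:

```python
# Sketch: how U+25A0 is stored in the three encodings shown above.
text = "This character \u25a0 is a black square."

# The byte order marks are prepended by hand to match the dumps.
utf16le = b"\xff\xfe" + text.encode("utf-16-le")
utf8 = text.encode("utf-8")  # no BOM, as in the UTF-8 dump
utf16be = b"\xfe\xff" + text.encode("utf-16-be")

square = "\u25a0"
print(square.encode("utf-16-le").hex(" "))  # a0 25
print(square.encode("utf-8").hex(" "))      # e2 96 a0
print(square.encode("utf-16-be").hex(" "))  # 25 a0

# File sizes in bytes for the three variants.
print(len(utf16le), len(utf8), len(utf16be))
```

The single character is therefore two or three separate bytes in the file, and hex edit mode shows each of them on its own.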

In hex edit mode the bytes of a file are no longer interpreted as characters; they are interpreted as bytes. After the semicolon, each byte is displayed with its ASCII representation so that at least ASCII text in any type of file (binary or text) is easily readable. No characters are displayed in hex edit mode, just the bytes of the file, one by one, without any decoding.

For that reason it is not possible to get the character with Unicode code point U+25A0 displayed in the ASCII representation area. That area shows every byte of the file individually and does not combine bytes into characters according to the character encoding of the file.
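
To illustrate the byte-by-byte display, here is a rough Python sketch of such a hex view. It is not UltraEdit's actual implementation. The function names `ascii_column` and `hex_dump` are hypothetical, and Windows-1252 is only assumed as the single-byte code page for the right-hand column; the code page really used depends on the system and configuration.

```python
def ascii_column(data: bytes, codepage: str = "cp1252") -> str:
    """Render every byte on its own; multi-byte sequences are never combined."""
    out = []
    for b in data:
        if b < 0x20 or b == 0x7F:
            out.append(".")  # control bytes are shown as a dot
        else:
            # Assumption: a single-byte code page maps each byte to one glyph.
            out.append(bytes([b]).decode(codepage, errors="replace"))
    return "".join(out)


def hex_dump(data: bytes, width: int = 16) -> str:
    """Build lines in the 'offset: hex bytes ; text' layout used above."""
    pad = width * 3 - 1  # width of a full hex column
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02X}" for b in chunk)
        lines.append(f"{offset:08x}h: {hex_part:<{pad}} ; {ascii_column(chunk)}")
    return "\n".join(lines)


data = "This character \u25a0 is a black square.".encode("utf-8")
print(hex_dump(data))
```

Under these assumptions the three UTF-8 bytes E2 96 A0 come out as â, an en dash and a no-break space, which matches what was reported in the first post: the text column never sees a character U+25A0, only three unrelated bytes.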