User to user discussion and support for UltraEdit, UEStudio, UltraCompare, and other IDM applications.

Help with setting up and configuring custom user tools in UltraEdit (based on command line input)
9 posts Page 1 of 1
No matter how I adjust the options, HTML Tidy always turns my web page into unrecognizable characters, as in the attached image. Does anyone know what's wrong? Thanks.
[Attachment: 646464.jpg (67.69 KiB)]
Please attach to your next post your HTML (PHP) input file and the entire section [HTMLTidyOptions] from %appdata%\IDMComp\UltraEdit\uedit32.ini. Without them there is little chance of finding the reason for this output. Please also tell us which version of UltraEdit you use. Is it the English UltraEdit version 16.10.0.1035?

The output I get in the output window when parsing one of my HTML files with HTML Tidy of UE v16.10.0.1035 is only

HTML Tidy Parsing ...OK

and the tidy document in the ** HTML Tidy Output ** document window is well encoded and displayed.

My HTML file contains in head section the line

<meta http-equiv="content-type" content="text/html; charset=iso-8859-1">

and the file is an ANSI file using this code page. To be more precise, it is an ASCII file, because HTML entities are used for all non-ASCII letters.
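
A minimal sketch of why such a file counts as plain ASCII: once every non-ASCII letter is written as a character reference, the bytes are the same whether they are read as ASCII, ISO-8859-1 or UTF-8. Python is used only for illustration, and the sample string is made up, not taken from the file discussed here.

```python
# Sample text, made up for illustration (not from the HTML file in question).
text = "Müller Möhre Öse"

# Replace every non-ASCII character with a numeric character reference.
ascii_bytes = text.encode("ascii", errors="xmlcharrefreplace")
print(ascii_bytes)  # b'M&#252;ller M&#246;hre &#214;se'

# Pure ASCII bytes decode to the same text under all three encodings.
decoded = {enc: ascii_bytes.decode(enc) for enc in ("ascii", "iso-8859-1", "utf-8")}
same = len(set(decoded.values())) == 1
print(same)  # True
```

With a file like this, the encoding chosen for the output window cannot garble anything, which would explain why no problem shows up in this test.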
Best regards from Austria
I fixed this:
I downloaded a portable trial version of UE, and it works there. So I mirrored every configuration item from it and found that I had set UE to always create new files in UTF-8 format.
After changing that to ASCII (the default) it works well. I don't know why, though. Thank you :D
Thanks for posting what caused the strange display. HTML Tidy always outputs in ASCII/ANSI. UltraEdit should therefore either ignore the setting for new files when capturing the output of HTML Tidy into a new file, or correctly convert the Tidy output from ANSI to UTF-8 or UTF-16. I will report this issue by email to IDM.

Edited on 2010-08-31: With UltraEdit v16.20.0.1009 this issue is partly fixed. If new files are UTF-8 encoded files according to the configuration setting, UltraEdit converts the new file first to ASCII / ANSI before writing the output of HTML Tidy to the new file and therefore the parser output is readable in the new window. But if new files are UTF-16 encoded files according to the configuration setting, the HTML Tidy output in the new file is still wrong and so not readable at all.

Edited on 2010-12-03: The remaining problem with UTF-16 as default for new files is fixed with UltraEdit v16.30.0.1000. Now the HTML Tidy output file is always an ASCII / ANSI file if the input HTML file is also an ASCII / ANSI file independent of the configuration setting for the encoding type of new files.
UltraEdit v16.30.0.1000 introduced the new feature that an HTML file encoded in UTF-8 results in a UTF-8 Tidy output file, independent of the configuration setting for the encoding type of new files and independent of the Tidy option char-encoding. But UE v16.30.0.1000 does not display the UTF-8 encoded output correctly as a Unicode file. The UTF-8 encoded output file must be saved, closed and re-opened to have it displayed correctly. I reported this new issue by email to IDM.
Best regards from Austria
I use UE 14.20 German to write HTML with German umlauts (ä, ö, ü, ...)

When I use HTML Tidy they are converted from
Müller Möhre Öse

to
Müller Möhre Öse

How can I avoid this?
Here are my settings:
http://imageshack.us/photo/my-images/217/ultraedithtmltidy.png

I tried different settings for "Char encoding", but there was no difference ...

Peter
You have probably read the explanation of the char-encoding option. UltraEdit's dialog is not up to date and does not list all possible values for this option. If you want full control, store the HTML Tidy options for which you want non-default values in a text file and specify this file, instead of using the options you can configure in the dialog, which are stored in uedit32.ini.

According to your report I suppose your HTML file is encoded in UTF-8. So you should see U8-DOS or U8-UNIX in the status bar at the bottom of the UltraEdit window. And the head section of your HTML file probably (and hopefully) also contains

<meta http-equiv="Content-Type" content="text/html; charset=utf-8" >

or something similar, depending on the document type. In this case HTML Tidy also outputs everything in UTF-8. But your version of UltraEdit does not recognize that the captured output is encoded in UTF-8, so everything captured from HTML Tidy is written into an ANSI file. If you save the HTML Tidy output as a file, close the document window and reopen the file, you will see that the German umlauts look fine, because now UltraEdit recognizes the characters as UTF-8 encoded.
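
The effect described above can be reproduced outside UltraEdit. The following is only a sketch: it assumes the ANSI code page is Windows-1252 (typical for a German Windows installation) and uses a made-up sample string.

```python
original = "Müller Möhre Öse"

# HTML Tidy writes the text as UTF-8 bytes ...
utf8_bytes = original.encode("utf-8")

# ... but the editor interprets those bytes with an ANSI code page
# (Windows-1252 is an assumption here).
misread = utf8_bytes.decode("cp1252")
print(misread)  # MÃ¼ller MÃ¶hre Ã–se

# The bytes themselves are intact: reading them as UTF-8 restores the text.
reread = utf8_bytes.decode("utf-8")
print(reread == original)  # True
```

Nothing is lost in the captured output itself; only the interpretation is wrong, which is why saving and reopening the file as UTF-8 is enough.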

Newer versions of UltraEdit support UTF-8 encoded HTML Tidy output, see my post above.
Hello Mofi

Thanks for the info and the links.

Mofi wrote: According to your report I suppose your HTML file is encoded in UTF-8. So you should see U8-DOS or U8-UNIX in the status bar at the bottom of the UltraEdit window. And the head section of your HTML file probably (and hopefully) also contains

<meta http-equiv="Content-Type" content="text/html; charset=utf-8" >

Yes. U8-DOS and charset=utf-8.

But nevertheless ..

HTML-Tidy reports:
HTML Tidy Parsing ...
line 1 column 1 - Warning: specified input encoding (utf-8) does not match actual input encoding (utf-16)
...
Info: Doctype given is "-//W3C//DTD HTML 4.01 Transitional//EN"
Info: Document content looks like HTML 4.01 Transitional

And saving and reopening the file, as you recommended above, does not help.
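
The warning suggests that the bytes handed to HTML Tidy were UTF-16 even though the status bar shows U8-DOS. One way to check what is really stored on disk is to look for a byte order mark (BOM). This is only a sketch; `sniff_bom` is a hypothetical helper name, and it covers just UTF-8 and UTF-16.

```python
import codecs

def sniff_bom(data: bytes) -> str:
    """Guess the encoding from a leading byte order mark (BOM)."""
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8"
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    # No BOM: could still be UTF-8 without BOM, ANSI or plain ASCII.
    return "no BOM"

print(sniff_bom("Möhre".encode("utf-8-sig")))  # utf-8
print(sniff_bom("Möhre".encode("utf-16")))     # utf-16
print(sniff_bom("Möhre".encode("utf-8")))      # no BOM
```

For a real file, pass its first bytes, for example `sniff_bom(open(path, "rb").read(4))`. If the file on disk turns out to be UTF-8, the UTF-16 data was presumably produced on the way from the editor to HTML Tidy.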

Edit: I have now copied the section [HTMLTidyOptions] from uedit32.ini into a new CFG file. Then I changed

char-encoding=0   (which is not defined) to ascii, to utf8 and to nothing (I removed the line)


Result: all the same ...

The HTML Tidy report also contains:
line 61 column 21 - Warning: <img> attribute "gelï¿¿ï¿¿schte" lacks value

So it already has a problem here ...

Peter
Partial solution:

a) The CFG has to use ":" instead of =

wrong:
char-encoding=utf8
correct:
char-encoding: utf8

An example can be found here: http://www.w3.org/People/Raggett/tidy

b) The result is now that umlauts in strings are no longer replaced, but umlauts in file names are still replaced (percent-encoded). The output still has to be saved before the strings are displayed correctly (see postings above):

Characters displayed in the result of Tidy:
                            <img alt="Hier fehlt ein Bild." title=
                            "Abfrage zum endgültigen Löschen des Ordners " src=
                            "media/Outlook_L%C3%B6schen.png">

Characters displayed in the file after "save as .. UTF-8":
                            <img alt="Hier fehlt ein Bild." title=
                            "Abfrage zum endgültigen Löschen des Ordners " src=
                            "media/Outlook_L%C3%B6schen.png">
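
For anyone else copying the options out of uedit32.ini: the change from "=" to ":" described under a) can be scripted. This is only a sketch. The option names in `ini_text` are ordinary HTML Tidy options chosen as an example, and the real keys and values in [HTMLTidyOptions] may look different (for instance the numeric char-encoding=0 mentioned above), so the values may still need manual correction.

```python
import configparser

# Hypothetical excerpt; the real keys and values in uedit32.ini may differ.
ini_text = """
[HTMLTidyOptions]
char-encoding=utf8
indent=auto
wrap=72
"""

parser = configparser.ConfigParser(interpolation=None)
parser.read_string(ini_text)

# Emit each option in the "name: value" form the Tidy config file needs.
lines = [f"{name}: {value}" for name, value in parser["HTMLTidyOptions"].items()]
tidy_cfg = "\n".join(lines) + "\n"
print(tidy_cfg)
```

The resulting text can be saved as the CFG file and specified in UltraEdit instead of the dialog options.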

Peter
I found an alternative:
http://int64.org/projects/htmltrim

I found it via a link from the W3C.

It is 7 years old, but it has a nice GUI and a lot of options which can be stored in project files. Maybe it is also based on HTML Tidy, but the umlauts are OK now. I will now try to edit and check my files with UE and the included HTML Tidy, and then do the pretty-printing with HTMLTRIM.

Peter