Hello!
Well, after a lot of work posting content to online CMS systems from Microsoft Word files that clients send me, I have come up with this:
First, save the file in Microsoft Word as Filtered HTML.
This produces fairly clean HTML, but it still contains a lot of inline formatting (class and style attributes, for example) that needs to be edited out.
I wrote this macro to help.
Ordered and unordered lists still have to be set up manually, but that is a lot better than deleting all the formatting by hand.
I have attached a sample of the original Word file and the cleaned-up result of the macro.
There may be shorter ways using complex pattern matching, but probably anyone can understand this one.
Here is the macro:
InsertMode
ColumnModeOff
HexOff
Top
UltraEditReOn
Find "<html>"
Key DEL
UltraEditReOn
Find "<head>"
UltraEditReOn
StartSelect
Find Select "</head>"
EndSelect
Key DEL
Top
UltraEditReOn
Find "<body"
UltraEditReOn
StartSelect
Find Select ">"
EndSelect
Key DEL
Loop 0
Top
UltraEditReOn
Find "class="
IfFound
StartSelect
Find Select ">"
Key LEFT ARROW
EndSelect
Key DEL
Else
ExitLoop
EndIf
EndLoop
Loop 0
Top
UltraEditReOn
Find "style="
IfFound
StartSelect
Find Select ">"
Key LEFT ARROW
EndSelect
Key DEL
Else
ExitLoop
EndIf
EndLoop
Top
UltraEditReOn
Find "</body>"
Key DEL
Top
UltraEditReOn
Find "</html>"
Key DEL
Top
Loop 0
Find "<span"
IfFound
StartSelect
Find Select ">"
EndSelect
Key DEL
"<span>"
Else
ExitLoop
EndIf
EndLoop
Top
Loop 0
Find "<p "
IfFound
StartSelect
Find Select ">"
EndSelect
Key DEL
"<p>"
Else
ExitLoop
EndIf
EndLoop
Top
PerlReOn
Find MatchCase RegExp "^(?:[\t ]*(?:\r?\n|\r))+"
Replace All ""
UltraEditReOn
Top
Find "<p>"
Key DEL
"<h3>"
Find "</p>"
Key DEL
"</h3>"
If anyone has any suggestions, please let me know.
Mark
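A possible shorter variant using pattern matching, as a sketch only: the attribute-stripping loops could probably be collapsed into a handful of Perl regular expression replaces. It uses only commands that already appear in the macro above (PerlReOn, Find RegExp, Replace All). The patterns themselves are assumptions about what Word's filtered HTML looks like, and they have not been run against the attached sample file.

```
InsertMode
ColumnModeOff
HexOff
PerlReOn
Top
Find RegExp "<head>[\s\S]*?</head>"
Replace All ""
Top
Find RegExp "</?(?:html|body)[^>]*>"
Replace All ""
Top
Find RegExp "<span\s[^>]*>"
Replace All "<span>"
Top
Find RegExp "<p\s[^>]*>"
Replace All "<p>"
Top
Find RegExp "\s+(?:class|style)=[^>]*"
Replace All ""
Top
Find MatchCase RegExp "^(?:[\t ]*(?:\r?\n|\r))+"
Replace All ""
UltraEditReOn
Top
```

Like the loops in the macro above, the class/style pattern removes everything from the attribute up to the closing ">", so any attribute that follows class= or style= in the same tag is lost as well. The step that turns the first paragraph into an h3 heading is not included here and would stay as in the macro above.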
- Marketing_Newsletter.doc (48 KiB): Microsoft Word document to be saved as a filtered HTML file with MS Word
- Cleaned_up_Fixed_Marketing_Newsletter.htm (13.93 KiB): resulting HTML file after running the macro to clean up the HTML file created by Microsoft Word