Macro for deletion of unnecessary tags in an HTML file at current position of text cursor

Newbie

Post #1, May 02, 2024

I would like a macro that deletes a tag pair based on the current position of the caret, no matter where the text cursor is:
  • at the beginning of an opening tag;
  • inside an opening tag;
  • between opening and closing tag;
  • at the beginning of a closing tag;
  • inside a closing tag.
The macro should delete both the opening and the closing tag.

Here is an example HTML file.

Code:

<html>
<head>
<title>EXAMPLE</title>
</head>
<body>
<p class="indent">EXAMPLE FILE</p>

<p>===Unnecessary tags==</p>

<p>These are useless tags. Opening and closing tags should be deleted together.</p>

<p><b>Aaa</b>
should be just
Aaa</p>

<p><i>Bbb</i>
should be just
Bbb</p>

<p><u>Ccc</u>
should be just
Ccc</p>

<p class="indent">Example <a id="rref6" href="#ref6">6</a>
should be just
Example 6</p>

</body>
</html>

Here is the macro I have so far. It works only when the caret is at the opening tag; in that case both tags are deleted together.

Code:

InsertMode
ColumnModeOff
HexOff
PerlReOn
Find RegExp Up "(?<=<)\w+"
Copy
UltraEditReOn
Find RegExp "</^c>"
Key DEL
UltraEditReOn
Find RegExp Up "<^c*>"
Key DEL

Grand Master

Post #2, May 04, 2024

Here is a macro for this task. It was developed quickly and tested only briefly with UltraEdit for Windows v2024.0.0.28. It can be used for HTML, XHTML and XML files.

The macro has no knowledge of HTML, XHTML or XML. It does not check whether the text to delete is really a tag, and it does not analyze the document structure to find the matching tag: it simply deletes the next closing tag or the previous opening tag with the same name. In an HTML file it does not recognize an opening tag like <li> whose optional end tag </li> is omitted, so it cannot delete only the opening tag in this case. It also does not detect whether the first tag found is an empty tag like <br> or <br />. Anyone using this macro should know where the caret is when it is executed and what the macro really does. Unwanted deletions can of course be reverted with the Undo command.

Code:

InsertMode
ColumnModeOff
HexOff
UltraEditReOn
IfCharIs "<"
Else
Find MatchCase RegExp "[<>]"
IfNotFound
Find MatchCase RegExp Up "[<>]"
IfNotFound
ExitMacro
EndIf
EndIf
Key LEFT ARROW
IfCharIs ">"
Find MatchCase Up "<"
IfNotFound
ExitMacro
EndIf
Key LEFT ARROW
EndIf
EndIf
Clipboard 9
Key RIGHT ARROW
IfCharIs "/"
Find RegExp "[a-z^-]+"
IfFound
Copy
Else
ExitMacro
EndIf
Find MatchCase Up "<"
Key LEFT ARROW
Find MatchCase RegExp "</[~>]+>"
Replace ""
Find RegExp Up "<^c[^t^n^r >]"
IfFound
Key LEFT ARROW
Find MatchCase Up "<"
Key LEFT ARROW
Find MatchCase RegExp "<[~>]+>"
Replace ""
EndIf
Else
Find RegExp "[a-z^-]+"
IfFound
Copy
Else
ExitMacro
EndIf
Find MatchCase Up "<"
Key LEFT ARROW
Find MatchCase RegExp "<[~>]+>"
Replace ""
Find RegExp "</^c[^t^n^r ]++>"
Replace ""
EndIf
ClearClipboard
Clipboard 0

The code is easier to understand with indentation.

Code:

InsertMode
ColumnModeOff
HexOff
UltraEditReOn
IfCharIs "<"
Else
    Find MatchCase RegExp "[<>]"
    IfNotFound
        Find MatchCase RegExp Up "[<>]"
        IfNotFound
            ExitMacro
        EndIf
    EndIf
    Key LEFT ARROW
    IfCharIs ">"
        Find MatchCase Up "<"
        IfNotFound
            ExitMacro
        EndIf
        Key LEFT ARROW
    EndIf
EndIf
Clipboard 9
Key RIGHT ARROW
IfCharIs "/"
    Find RegExp "[a-z^-]+"
    IfFound
        Copy
    Else
        ExitMacro
    EndIf
    Find MatchCase Up "<"
    Key LEFT ARROW
    Find MatchCase RegExp "</[~>]+>"
    Replace ""
    Find RegExp Up "<^c[^t^n^r >]"
    IfFound
        Key LEFT ARROW
        Find MatchCase Up "<"
        Key LEFT ARROW
        Find MatchCase RegExp "<[~>]+>"
        Replace ""
    EndIf
Else
    Find RegExp "[a-z^-]+"
    IfFound
        Copy
    Else
        ExitMacro
    EndIf
    Find MatchCase Up "<"
    Key LEFT ARROW
    Find MatchCase RegExp "<[~>]+>"
    Replace ""
    Find RegExp "</^c[^t^n^r ]++>"
    Replace ""
EndIf
ClearClipboard
Clipboard 0
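
For readers who want to follow the logic outside UltraEdit, the steps of the macro can be modeled roughly in Python. This is only a sketch and not a port: the function name delete_tag_pair, the caret given as a character offset and the returned string are assumptions made for illustration, and the name check is a bit stricter than the macro's Find because the name must start directly after < or </.

```python
import re

# Element-name characters accepted here: letters and the hyphen.
NAME = re.compile(r"[a-z\-]+", re.IGNORECASE)


def delete_tag_pair(text, caret):
    """Return text without the tag at or around caret and without its partner."""
    # Step 1: move to the "<" of the tag the caret belongs to.
    if not text.startswith("<", caret):
        found = re.compile(r"[<>]").search(text, caret)
        if found:
            pos = found.start()
        else:
            # Nothing to the right of the caret, so look to the left.
            pos = max(text.rfind("<", 0, caret), text.rfind(">", 0, caret))
        if pos < 0:
            return text
        if text[pos] == ">":
            # The caret was inside a tag: go back to its "<".
            pos = text.rfind("<", 0, pos)
            if pos < 0:
                return text
        caret = pos

    # Step 2: read the element name; a "/" after "<" marks a closing tag.
    closing = text.startswith("/", caret + 1)
    name_match = NAME.match(text, caret + (2 if closing else 1))
    tag_end = text.find(">", caret)
    if name_match is None or tag_end < 0:
        return text
    name = re.escape(name_match.group())

    # Step 3: delete the tag at the caret.
    text = text[:caret] + text[tag_end + 1:]

    if closing:
        # Step 4a: delete the nearest preceding opening tag with that name.
        openers = list(
            re.finditer(rf"<{name}[\t\n\r >]", text[:caret], re.IGNORECASE)
        )
        if openers:
            start = openers[-1].start()
            end = text.find(">", start)
            if end >= 0:
                text = text[:start] + text[end + 1:]
    else:
        # Step 4b: delete the next closing tag with that name.
        rest = re.sub(
            rf"</{name}[\t\n\r ]*>", "", text[caret:], count=1, flags=re.IGNORECASE
        )
        text = text[:caret] + rest
    return text


sample = "<p><b>Aaa</b></p>"
# Offsets 3, 4, 7, 9 and 11 cover the five caret positions from the first post.
for offset in (3, 4, 7, 9, 11):
    print(delete_tag_pair(sample, offset))  # prints <p>Aaa</p> each time
```

Like the macro, this sketch does not analyze the structure. With nested elements of the same name, for example <div>a<div>b</div>c</div> and the caret on the outer opening tag, the inner </div> is removed, so Undo is still the safety net.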

Newbie

Post #3, May 04, 2024

Thank you. It works properly.

Post #4, Nov 28, 2024

I have been using this macro for a long time and it works perfectly. Thanks for that.
But now I have a problem. It would be great if you could turn this macro into a script so that I don't have to load it again and again.

Grand Master

Post #5, Nov 28, 2024

One or more macros can be stored together in a single macro file. A macro file containing all frequently needed macros can be configured to load automatically when UltraEdit / UEStudio starts, so the macros are available from the beginning without first loading the macro file manually. See this forum topic for details. Executing a macro from a macro file loaded automatically on startup is faster than executing a script from the script list, so the macro solution is better than a script solution for this type of task.

However, UltraEdit / UEStudio is installed with a tag list file that contains the tag groups UE/UES Macro Commands and UE/UES Script Commands. I use both when writing macros or scripts from scratch. Open the Tag List view, select the tag group for the script commands, and replace one macro command after the other with the corresponding script command. A tag has the same name in both tag groups if the macro command is also available as a script command. It should therefore be no problem for any user with a little knowledge of JavaScript syntax to rewrite a macro as a script.

Newbie

Post #6, Jan 17, 2025

I have been using this macro for a long time and it works very well, but it does not work properly on tags like the one below. I would be happy if you could fix it.

Code:

<article-title>Effect of number of blades on aerodynamic forces on a straight-bladed Vertical Axis Wind Turbine</article-title>

Grand Master

Post #7, Jan 17, 2025

In the macro code in my first post, I changed the line

Code:

Find RegExp "[a-z]+"

to

Code:

Find RegExp "[a-z^-]+"

Other characters can be added to this character class, such as an underscore or the digits 0-9. The character - has the special meaning from-to (a range) inside a character class definition and must therefore be escaped: with ^ in an UltraEdit regular expression and with a backslash in Unix or Perl regular expressions.
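
The effect of the hyphen inside the character class can also be checked outside UltraEdit. Here is a small sketch with Python's re module, which uses Perl-like syntax, so the hyphen is escaped with a backslash instead of ^. The variable names are only for illustration.

```python
import re

# Letters only: the match stops at the first hyphen.
letters_only = re.compile(r"[a-z]+", re.IGNORECASE)
# Hyphen escaped with a backslash: the whole element name is matched.
letters_and_hyphen = re.compile(r"[a-z\-]+", re.IGNORECASE)
# Further characters such as the underscore and the digits can be added.
extended = re.compile(r"[a-z0-9_\-]+", re.IGNORECASE)

name = "article-title"
short = letters_only.match(name).group()      # "article"
full = letters_and_hyphen.match(name).group() # "article-title"
heading = extended.match("h1>").group()       # "h1"
```

This is the reason why the macro found only article as element name before the character class was extended.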

I recommend adding to this character class only the characters actually used in element names in the files on which the macro is run. The XML specification allows more, as can be read on the W3C page XML Elements in the chapter XML Naming Rules.
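
To find out which characters are really used in element names, the files can be scanned once before extending the character class. Below is a minimal sketch in Python, assuming the markup is available as a string. The function name extra_name_characters and its simple pattern are made up for this example. It is not a real XML parser, so text like 1<2 can produce false hits.

```python
import re
import string


def extra_name_characters(markup):
    """Return the sorted non-letter characters found in element names."""
    # Take the text after "<" or "</" up to whitespace, ">", "/" or "<".
    # Comments (<!-- -->) and processing instructions (<? ?>) are skipped
    # because "!" and "?" are not allowed as first character.
    names = re.findall(r"</?([^\s<>/!?]+)", markup)
    extras = {ch for name in names for ch in name
              if ch not in string.ascii_letters}
    return sorted(extras)


example = "<article-title>X</article-title><h1>Y</h1><my_tag/><!-- note -->"
print(extra_name_characters(example))  # prints ['-', '1', '_']
```

Every character in the result is a candidate for the character class in the macro. In an UltraEdit regular expression, remember to escape the hyphen as ^-.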