How do I match all the content between and including <style> then several lines then </style>

Post #1, May 15, 2021

I'm trying to build a series of regular expressions to remove unwanted HTML from .docx files converted to .html.
Most of the patterns I have written work, but...
How do I match all the content between and including <style>, then several lines, then </style>?
I've tried <style*</style> and <style>[*^p]</style> with no success.
It seems I don't understand how to match all of the text, including all the newlines.
I'm using the UltraEdit regular expression syntax.
Any help would be appreciated. (I'm a bit new to regular expressions in UltraEdit.)
Thanks

Post #2, May 15, 2021

Hi,

I would try this Perl solution
Just replace XXX with your desired tag.

BR, Fleggy

EDIT:
Or, if there are no nested tags, this simple Perl regex should be sufficient:
(?s)<style>.*?</style>
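
For anyone who wants to try that pattern outside UltraEdit, here is a minimal sketch in Python, whose re module accepts the same (?s) and .*? syntax. The function name and the sample text are made up for illustration. Note that the pattern only matches a bare <style> start tag, so a tag with attributes such as <style type="text/css"> would need the pattern adjusted.

```python
import re

# Same pattern as the Perl regex above.
# (?s) makes "." match newlines as well; ".*?" is non-greedy, so the match
# stops at the first </style> instead of running on to the last one.
STYLE_BLOCK = re.compile(r"(?s)<style>.*?</style>")


def strip_style_blocks(html: str) -> str:
    """Return html with every <style>...</style> block removed."""
    return STYLE_BLOCK.sub("", html)


# Hypothetical sample input, only for demonstration.
sample = "<head>\n<style>\np { margin: 0; }\n</style>\n<title>Doc</title>\n</head>"
cleaned = strip_style_blocks(sample)
print(cleaned)
```

The line breaks around the removed block stay in place, so an empty line is left where the STYLE element used to be.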

Post #3, May 15, 2021

Perl regular expression replaces, as suggested by Fleggy, are the best way to quickly select and delete the entire STYLE element, provided it is not too large (several MB). The UltraEdit regular expression engine is not really capable of doing that.

A solution that always works is the Find + Select feature of UltraEdit. It works even if the STYLE element to remove has thousands of lines and hundreds of MB. First, a non-regular expression find for <style is run to find and select the start tag. Next, a second non-regular expression find searches for the next occurrence (not the matching occurrence) of the end tag </style> while the SHIFT key is held down on clicking the Next button in the find dialog window. UltraEdit then extends the existing selection from the start tag up to the end of the end tag found by the second find, selecting everything in between. The selection is removed by pressing the DEL key. Doing that while recording a macro produces this macro code:

InsertMode
ColumnModeOff
HexOff
UltraEditReOn
Find "<style"
UltraEditReOn
StartSelect
Find Select "</style>"
EndSelect
Key DEL

The recorded macro code can be improved to:

InsertMode
ColumnModeOff
HexOff
UltraEditReOn
Top
Find "<style"
IfFound
StartSelect
Find Select "</style>"
EndSelect
Delete
EndIf

The Find + Select feature can also be used with a regular expression find. This use case just does not need one.
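
The logic of the improved macro can be sketched outside UltraEdit as well. The following Python function is only an illustration of the same idea with plain string searches, not a translation of the macro commands. The function name and its parameters are hypothetical, and leaving the text unchanged when no end tag is found is an assumption of this sketch, not necessarily what the macro does in that case.

```python
def delete_first_element(text: str, start: str = "<style", end: str = "</style>") -> str:
    """Delete everything from the first start string through the next end string."""
    # Plain, non-regex search for the start tag, like: Find "<style"
    begin = text.find(start)
    if begin == -1:
        return text  # nothing found, like the IfFound check in the macro

    # Search for the NEXT occurrence of the end tag after the start tag,
    # like: Find Select "</style>"
    stop = text.find(end, begin + len(start))
    if stop == -1:
        return text  # assumption: leave the text alone if the end tag is missing

    # Remove the span from the start tag up to the end of the end tag.
    return text[:begin] + text[stop + len(end):]


# Hypothetical sample input, only for demonstration.
page = 'a<style type="text/css">\nb\n</style>c'
print(delete_first_element(page))
```

Because the search is for <style without the closing angle bracket, a start tag with attributes is found too. Passing other start and end strings makes the same function usable for other elements.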

Post #4, May 15, 2021

Thanks! The second is great in that it gives me a template to set up a number of deletes in a single macro!