How do I match all the content between and including <style> then several lines then </style>

    May 15, 2021 #1

    I'm trying to build a series of regular expressions to remove unwanted HTML from .docx files converted to .html.
    Most of the patterns I have written work, but...
    How do I match all the content between and including <style>, then several lines, then </style>?
    I've tried <style*</style> and <style>[*^p]</style> with no success.
    It seems I don't understand how to match all the text, including the newlines.
    I'm using the UltraEdit regular expression syntax.
    Any help would be appreciated. (I'm a bit new to regular expressions in UltraEdit.)
    Thanks

      May 15, 2021 #2

      Hi,

      I would try this Perl solution
      Just replace XXX with your desired tag.

      BR, Fleggy

      EDIT:
      or if there are no nested tags then this simple Perl regex should be sufficient:
      (?s)<style>.*?</style>
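
For readers who want to try the pattern outside UltraEdit, here is a minimal sketch using Python's `re` module. This is an assumption-laden stand-in, not UltraEdit's Perl engine, although `(?s)` and the non-greedy `.*?` behave the same way here. The function name `strip_style_blocks` and the sample HTML are hypothetical. Note that the pattern as written only matches a bare `<style>` start tag without attributes.

```python
import re

# The pattern from the post: (?s) lets "." match newlines, and ".*?" is
# non-greedy, so each match stops at the first closing tag it reaches.
STYLE_BLOCK = re.compile(r"(?s)<style>.*?</style>")


def strip_style_blocks(html: str) -> str:
    """Remove every <style>...</style> element, including the tags (hypothetical helper)."""
    return STYLE_BLOCK.sub("", html)


# Hypothetical sample input with two separate style elements.
sample = (
    "<head>\n"
    "<style>\n"
    "p { margin: 0; }\n"
    "</style>\n"
    "<title>Demo</title>\n"
    "<style>\n"
    "h1 { color: red; }\n"
    "</style>\n"
    "</head>\n"
)

cleaned = strip_style_blocks(sample)
print(cleaned)
```

Because the quantifier is non-greedy, the two elements are removed separately and the `<title>` line between them survives. A greedy `.*` would delete everything from the first `<style>` to the last `</style>`.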

        May 15, 2021 #3

        The Perl regular expression replaces suggested by Fleggy, rather than an UltraEdit regular expression, are the best way to quickly select the entire STYLE element and delete it, provided the element is not too large (several MB). The UltraEdit regular expression engine is not really capable of doing that.

        A solution that always works is the Find + Select feature of UltraEdit. It works even if the STYLE element to remove has thousands of lines and hundreds of MB. First, run a non-regular expression find for <style to find and select the start tag. Next, run another non-regular expression find for the next occurrence (not the matching occurrence) of the end tag </style>, holding the SHIFT key while clicking the Next button in the find dialog window. UltraEdit then expands the existing selection from the start tag up to the end of the end tag found by the second find, selecting everything in between. Pressing the DEL key removes the selection. Doing this while recording a macro produces the following macro code:

        InsertMode
        ColumnModeOff
        HexOff
        UltraEditReOn
        Find "<style"
        UltraEditReOn
        StartSelect
        Find Select "</style>"
        EndSelect
        Key DEL
        
        The recorded macro code can be improved to:

        InsertMode
        ColumnModeOff
        HexOff
        UltraEditReOn
        Top
        Find "<style"
        IfFound
        StartSelect
        Find Select "</style>"
        EndSelect
        Delete
        EndIf
        
        The Find + Select feature can be used also with a regular expression find. It is just not necessary for this use case to use a regular expression find.
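
The logic of the improved macro (find the start tag, extend the selection to the next end tag, delete) can be sketched in Python for comparison. This is only an illustration under stated assumptions, not UltraEdit code: the function name `delete_first_block` is hypothetical, the search is case-sensitive, and leaving the text unchanged when no end tag exists is a choice made for this sketch.

```python
def delete_first_block(text: str, start: str = "<style", end: str = "</style>") -> str:
    """Delete from the first `start` through the next `end` (hypothetical helper)."""
    begin = text.find(start)          # like: Top, then Find "<style"
    if begin == -1:                   # like: IfFound failing, nothing to delete
        return text
    stop = text.find(end, begin)      # next occurrence of the end tag, not a matched pair
    if stop == -1:                    # assumption of this sketch: leave the text unchanged
        return text
    # Drop everything from the start tag through the end of the end tag.
    return text[:begin] + text[stop + len(end):]


# Hypothetical sample input; the start tag carries an attribute on purpose.
sample = (
    "<head>\n"
    '<style type="text/css">\n'
    "p { margin: 0; }\n"
    "</style>\n"
    "<title>Demo</title>\n"
    "</head>\n"
)

result = delete_first_block(sample)
print(result)
```

Searching for `<style` without the closing `>` means a start tag with attributes is found as well, which plain `<style>` would miss.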
        Best regards from an UC/UE/UES for Windows user from Austria

          May 15, 2021 #4

          Thanks! The second is great in that it gives me a template to set up a number of deletes in a single macro!