A little help for an existing Regular Expression


    12:39 - 15 days ago#1

    Some time ago, very valuable help was given on this forum for cleaning up subtitle text with a Perl regular expression (thank you).

    If I may, I would like to ask for help with the following.

    The regular expression is:

    Code:

    ^(?:(?:[\t ]*|\d+|[0-2][0-9]:[0-5][0-9]:[0-5][0-9].*)(?:\r?\n|\r))+
    The file looks like this:

    Code:

    1
    00:00:01,355 --> 00:00:05,025
    <font color="#000000">This is the first spoken sentence.
    </font>
    
    2
    00:00:05,025 --> 00:00:08,362
    <font color="#000000">Something, compared to “something,”
    </font>
    
    
    When I run Replace All with the Perl regular expression engine selected and the Replace With field left blank, this is the result:
    <font color="#000000">This is the first spoken sentence.
    </font>
    <font color="#000000">Something, compared to “something,”
    </font>
    What change is needed to also remove the font tags, so that the result is:

    Code:

    This is the first spoken sentence.
    Something, compared to “something,”
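
The behaviour described above can be reproduced outside the editor. The following is a minimal Python sketch, assuming Python's `re` module treats this pattern the same way as the editor's Perl engine (in Python, `^` only matches at every line start when `re.MULTILINE` is set). The names `ORIGINAL`, `SAMPLE` and `apply_original` are made up for this illustration. The verbose layout also shows why the font lines survive: none of the three alternatives can match a line that starts with `<`.

```python
import re

# The original expression, spread out with re.VERBOSE so each part can be read.
ORIGINAL = re.compile(
    r"""
    ^                                           # start of a line (re.MULTILINE)
    (?:
        (?:
            [\t ]*                              # an empty or whitespace-only line
          | \d+                                 # a subtitle counter such as 1 or 2
          | [0-2][0-9]:[0-5][0-9]:[0-5][0-9].*  # a timing line, rest of line included
        )
        (?:\r?\n|\r)                            # the line terminator
    )+                                          # one or more such lines in a row
    """,
    re.MULTILINE | re.VERBOSE,
)

# Sample subtitle text taken from the post, with \n line endings assumed.
SAMPLE = (
    "1\n"
    "00:00:01,355 --> 00:00:05,025\n"
    '<font color="#000000">This is the first spoken sentence.\n'
    "</font>\n"
    "\n"
    "2\n"
    "00:00:05,025 --> 00:00:08,362\n"
    '<font color="#000000">Something, compared to “something,”\n'
    "</font>\n"
    "\n"
)


def apply_original(text: str) -> str:
    """Replace every match with nothing, like Replace All with an empty replacement."""
    return ORIGINAL.sub("", text)


if __name__ == "__main__":
    print(apply_original(SAMPLE), end="")
```

Counter lines, timing lines and blank lines are removed, while the lines carrying the font tags are left in place.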

    Mofi

      16:46 - 15 days ago#2

      The search expression can be extended with two more OR alternatives, one for the opening font tag and one for the closing font tag together with its line terminator:

      Code:

      ^(?:(?:[\t ]*|\d+|[0-2][0-9]:[0-5][0-9]:[0-5][0-9].*)(?:\r?\n|\r))+|<font color="[^"]+">|</font>(?:\r?\n|\r)
      Best regards from an UC/UE/UES for Windows user from Austria
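
To check the extended expression without the editor, here is a minimal Python sketch. It assumes Python's `re` module handles this pattern like the editor's Perl engine, with `re.MULTILINE` set so that `^` matches at each line start. The names `EXTENDED`, `SAMPLE` and `strip_subtitles` are hypothetical, and the sample uses `\n` line endings.

```python
import re

# The expression from the post, split over three raw strings for readability.
EXTENDED = re.compile(
    r'^(?:(?:[\t ]*|\d+|[0-2][0-9]:[0-5][0-9]:[0-5][0-9].*)(?:\r?\n|\r))+'
    r'|<font color="[^"]+">'       # opening font tag with a color attribute
    r'|</font>(?:\r?\n|\r)',       # closing font tag plus its line terminator
    re.MULTILINE,
)

# Sample subtitle text from the first post.
SAMPLE = (
    "1\n"
    "00:00:01,355 --> 00:00:05,025\n"
    '<font color="#000000">This is the first spoken sentence.\n'
    "</font>\n"
    "\n"
    "2\n"
    "00:00:05,025 --> 00:00:08,362\n"
    '<font color="#000000">Something, compared to “something,”\n'
    "</font>\n"
    "\n"
)


def strip_subtitles(text: str) -> str:
    """Replace every match with nothing, like Replace All with an empty replacement."""
    return EXTENDED.sub("", text)


if __name__ == "__main__":
    print(strip_subtitles(SAMPLE), end="")
```

With this sample only the two spoken lines remain, each ending with its own line break.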

      uedit32

        22:05 - 15 days ago#3

        Perfect. Thank you

          18:15 - 10 days ago#4

          I found that the following still remains with the updated expression:

          Code:

          Half of the <i><font size="16"><font face="Arial">Translation.
          </font></font></i>I took it
          instead of

          Code:

          Half of the Translation.
          I took it

          Mofi

            4:56 - 10 days ago#5

            The Perl regular expression search string below also removes all font and italic tags.

            Code:

            ^(?:(?:[\t ]*|\d+|[0-2][0-9]:[0-5][0-9]:[0-5][0-9].*)(?:\r?\n|\r))+|(?:<(?:font|i)[^>]*>)+|(?:</(?:font|i)>)+(?:\r?\n|\r)?
            Best regards from an UC/UE/UES for Windows user from Austria
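
The generalized expression can be tried on the nested-tag sample with a minimal Python sketch. It assumes Python's `re` module handles the pattern like the editor's Perl engine, with `re.MULTILINE` set so that `^` matches at each line start. The names `TAG_PATTERN`, `NESTED` and `strip_tags` are hypothetical.

```python
import re

# Counter, timing and blank lines, plus runs of opening or closing font/i tags.
TAG_PATTERN = re.compile(
    r'^(?:(?:[\t ]*|\d+|[0-2][0-9]:[0-5][0-9]:[0-5][0-9].*)(?:\r?\n|\r))+'
    r'|(?:<(?:font|i)[^>]*>)+'                # one or more opening tags, any attributes
    r'|(?:</(?:font|i)>)+(?:\r?\n|\r)?',      # closing tags, line terminator optional
    re.MULTILINE,
)

# Sample with nested tags from the post, with \n line endings assumed.
NESTED = (
    'Half of the <i><font size="16"><font face="Arial">Translation.\n'
    "</font></font></i>I took it\n"
)


def strip_tags(text: str) -> str:
    """Replace every match with nothing, like Replace All with an empty replacement."""
    return TAG_PATTERN.sub("", text)


if __name__ == "__main__":
    print(strip_tags(NESTED), end="")
```

Making the trailing line terminator optional is what lets the closing tags be removed even when text follows them on the same line, as in `</i>I took it`. Note that `<(?:font|i)[^>]*>` matches any tag whose name merely starts with `i` or `font`, which is probably harmless in subtitle files.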

            uedit32

              7:22 - 4 days ago#6

              Thank you