delete text between two words

    Jul 16, 2006#1

    I need to remove the text between [lang_en] and [/lang_en].

    example:

    blabla [lang_en]Genre: Sci-Fi / Action / Adventure[/lang_en] blabla
    [lang_en]Country:[/lang_en] USA / Canada

    I am testing this macro:

    InsertMode
    ColumnModeOff
    HexOff
    UnixReOff
    Top
    Loop 
    Find RegExp "lang_en[~,]+/lang_en"
    IfFound
    Delete
    Else
    ExitLoop
    EndIf
    EndLoop
    
    but the result is this :(

    blabla [] USA / Canada

      Jul 16, 2006#2

      I found something in the topic "Delete all lines between 2 pre-defined texts" and adapted it into this:

      InsertMode
      ColumnModeOff
      HexOff
      Top
      Loop 
      Find "[lang_en]"
      IfNotFound
      ExitLoop
      EndIf
      EndIf
      StartSelect
      Find Select "[/lang_en]"
      IfSel
      EndSelect
      Delete
      Else
      EndSelect
      ExitLoop
      EndIf
      EndLoop
      Find "[lang_sk]"
      Replace All ""
      Find "[/lang_sk]"
      Replace All ""
      
      It is working great, but it is too complicated.

        Jul 16, 2006#3

        Fine, you are now searching, thinking and trying before asking. So I will help you a little.

        First, your macro contains one EndIf too many.

        Second, you can speed up your macro if you first delete all [lang_en]...[/lang_en] strings that fit on a single line by using a single regular expression Replace All (in UltraEdit style).

        Find RegExp "^[lang_en^]*^[/lang_en^]"
        Replace All ""

        This regex will delete occurrences like:

        [lang_en]normal single line string[/lang_en]
        [lang_en]single line string with [ inside[/lang_en]
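
For readers who do not have UltraEdit at hand, here is a minimal sketch of the same single-line replace in Python. It is only an approximation under stated assumptions: the pattern is a translation into Python's re syntax, not UltraEdit syntax, and the non-greedy `.*?` is my choice because the thread does not say how greedy the UltraEdit `*` is. The name strip_single_line is hypothetical.

```python
import re

# Assumed Python translation of the UltraEdit expression "^[lang_en^]*^[/lang_en^]".
# "." does not match a newline by default, so every match stays within one line.
SINGLE_LINE = re.compile(r"\[lang_en\].*?\[/lang_en\]")


def strip_single_line(text: str) -> str:
    """Remove every [lang_en]...[/lang_en] string that fits on a single line."""
    return SINGLE_LINE.sub("", text)


sample = (
    "blabla [lang_en]Genre: Sci-Fi / Action / Adventure[/lang_en] blabla\n"
    "[lang_en]Country:[/lang_en] USA / Canada\n"
    "[lang_en]single line string with [ inside[/lang_en]\n"
)

if __name__ == "__main__":
    print(strip_single_line(sample))
```

A block that spans several lines is left untouched by this pattern, which is why the other steps are still needed.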


        You can also remove all strings between [lang_en]...[/lang_en] even if they are spanned over multiple lines with a single regular expression, but only if there is no [ character within the string.

        Find RegExp "^[lang_en^][~^[]+^[/lang_en^]"
        Replace All ""

        This regex will delete occurrences like:

        [lang_en]line 1 of a multi-line string
        line 2 of a multi-line string
        line 3 of a multi-line string
        [/lang_en]
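
The multi-line variant can be sketched the same way. Again this is an assumed Python translation, not UltraEdit syntax: `[^\[]+` stands in for `[~^[]+`, and the function name is hypothetical. The sketch shows both the success case and the limitation with a [ character inside the block.

```python
import re

# Assumed Python translation of "^[lang_en^][~^[]+^[/lang_en^]".
# A negated character class also matches newline characters, so the match may span lines.
MULTI_LINE = re.compile(r"\[lang_en\][^\[]+\[/lang_en\]")


def strip_blocks_without_bracket(text: str) -> str:
    """Remove [lang_en]...[/lang_en] blocks that contain no '[' character."""
    return MULTI_LINE.sub("", text)


multi = (
    "[lang_en]line 1 of a multi-line string\n"
    "line 2 of a multi-line string\n"
    "[/lang_en]\n"
    "kept"
)
nested = (
    "[lang_en]line 1\n"
    "line 2 with a [ character inside\n"
    "[/lang_en]\n"
    "kept"
)

if __name__ == "__main__":
    print(repr(strip_blocks_without_bracket(multi)))   # block removed
    print(repr(strip_blocks_without_bracket(nested)))  # block not matched, text unchanged
```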


        But to make sure that really every string between [lang_en]...[/lang_en] is deleted, finally the loop must be used to also delete strings like:

        [lang_en]line 1 of a multi-line string
        line 2 of a multi-line string with a [ character inside
        because of a nested other [tag]...[/tag] block
        [/lang_en]


        The whole macro, with the macro property Continue if a Find with Replace not found enabled, looks as follows:

        InsertMode
        ColumnModeOff
        HexOff
        UnixReOff
        Top
        Find RegExp "^[lang_en^]*^[/lang_en^]"
        Replace All ""
        Find RegExp "^[lang_en^][~^[]+^[/lang_en^]"
        Replace All ""
        Loop
        Find "[lang_en]"
        IfNotFound
        ExitLoop
        EndIf
        StartSelect
        Find Select "[/lang_en]"
        IfFound
        EndSelect
        Delete
        Else
        EndSelect
        ExitLoop
        EndIf
        EndLoop
        Top

        Add UnixReOn or PerlReOn (v12+ of UE) at the end of the macro if you do not use UltraEdit style regular expressions by default - see search configuration. Macro command UnixReOff sets the regular expression option to UltraEdit style.
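
To make the logic of the final loop easier to follow, here is a rough Python sketch of what the Loop ... EndLoop part does: find the start string, find the end string after it, delete everything from one to the other, and stop as soon as either string is missing. It is an illustration only, not a replacement for the macro; the function name delete_blocks and its parameters are hypothetical.

```python
def delete_blocks(text: str, start: str = "[lang_en]", end: str = "[/lang_en]") -> str:
    """Delete every start...end block, including blocks that contain '[' or newlines."""
    pos = 0
    while True:
        begin = text.find(start, pos)
        if begin == -1:
            break  # corresponds to IfNotFound / ExitLoop after Find "[lang_en]"
        finish = text.find(end, begin + len(start))
        if finish == -1:
            break  # end string missing: leave the rest untouched and exit the loop
        # Corresponds to the selection from the start string to the end string plus Delete.
        text = text[:begin] + text[finish + len(end):]
        pos = begin  # continue searching from the position of the deleted block
    return text


if __name__ == "__main__":
    demo = "a [lang_en]x\ny [tag]z[/tag]\n[/lang_en] b [lang_en]unclosed"
    print(delete_blocks(demo))
```

Plain string search is used here on purpose, like the non-regex Find commands in the loop, so nested [tag]...[/tag] blocks do not matter.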

        I have described the Find Select method at: How to select everything between two predefined strings in different lines?

        Special note:
        I have written once that IfFound is not possible after a Find Select "" command. This is not true anymore. IDM seems to have fixed this bug. I don't know at which version. For this example it is important to use IfFound instead of IfSel because the first searched string [lang_en] is still selected even if the second searched string is not found. If you have a version of UltraEdit where the IfFound does not work after Find Select "", you can first unselect the first search string and move the cursor to the start of the first search string. For this example the loop with a possible workaround would look like:

        Loop
        Find "[lang_en]"
        IfNotFound
        ExitLoop
        EndIf
        EndSelect
        Key LEFT ARROW
        Find Up "["
        EndSelect
        Key LEFT ARROW

        StartSelect
        Find Select "[/lang_en]"
        IfSel
        EndSelect
        Delete
        Else
        EndSelect
        ExitLoop
        EndIf
        EndLoop

        A second solution for the UNSELECT AND MOVE BACK code is:

        EndSelect
        Key LEFT ARROW
        Key Ctrl+LEFT ARROW
        Key LEFT ARROW
        Key Ctrl+LEFT ARROW
        Key LEFT ARROW

        Best regards from an UC/UE/UES for Windows user from Austria

          Jul 21, 2006#4

          Mofi, hello!
          I am a new learner.
          I don't clearly understand what "^[lang_en^]*^[/lang_en^]" and "^[lang_en^][~^[]+^[/lang_en^]" mean.
          Can you explain them in a little more detail?

            Jul 21, 2006#5

            I tried what you suggested
            (Find RegExp "^[lang_en^]*^[/lang_en^]"
            Replace All "")
            but [lang_en]and[/lang_en] is replaced with "".

              Jul 21, 2006#6

              cwq2008119 wrote: can you explain them little detail?
              You should read the articles about the Find command and about Regular Expressions in the help of UltraEdit.

              The UltraEdit style regular expression search string "^[lang_en^]*^[/lang_en^]" means the following:

              Find within a single line a string which starts with the string [lang_en] and ends with the string [/lang_en]. The asterisk between these 2 strings means 0 or more occurrences of any character except a new line character (CR or LF). The characters [ and ] are regular expression characters, but in this regular expression they should be interpreted as normal characters. So they must be escaped with the ^ character.


              The UltraEdit style regular expression search string "^[lang_en^][~^[]+^[/lang_en^]" means the following:

              Find within the whole text a block which starts with the string [lang_en] and ends with the string [/lang_en]. The [~^[]+ expression between the two strings means 1 or more occurrences of any character matched by the expression within the brackets. The expression within [] means: any character except [, because [ is the first character of the end string. This also includes the new line characters CR and LF, which is the reason why it finds a block and not only a string within a line.

              cwq2008119 wrote: [lang_en]and[/lang_en] is replaced with ""
              That's exactly what the replace should do: delete the whole string that starts with [lang_en] and ends with [/lang_en]. If you want to delete only the characters between the two strings, you only have to write Replace All "[lang_en][/lang_en]".
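
The difference between the two replace strings can be sketched in Python as follows. This is an assumed translation of the single-line expression into Python's re syntax, with a non-greedy `.*?` chosen by me; the function names are hypothetical.

```python
import re

# Assumed Python translation of the single-line expression "^[lang_en^]*^[/lang_en^]".
BLOCK = re.compile(r"\[lang_en\].*?\[/lang_en\]")


def delete_whole_block(text: str) -> str:
    """Like Replace All "": the tags and the text between them disappear."""
    return BLOCK.sub("", text)


def delete_only_content(text: str) -> str:
    """Like Replace All "[lang_en][/lang_en]": only the text between the tags disappears."""
    return BLOCK.sub("[lang_en][/lang_en]", text)


if __name__ == "__main__":
    line = "[lang_en]Country:[/lang_en] USA / Canada"
    print(repr(delete_whole_block(line)))
    print(repr(delete_only_content(line)))
```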

                Jul 22, 2006#7

                Thank you!
                Something more about "^[lang_en^]*^[/lang_en^]" and
                "^[lang_en^][~^[]+^[/lang_en^]":
                In ^[lang_en^], does the first ^ stand in front of the first character of
                the start string, and the second ^ in front of the last character?

                In "^[lang_en^][~^[]+^[/lang_en^]",
                why does [~^[] exclude only [ (the first character of the start string)?

                Can you recommend a programming language to learn?
                How about Perl (Practical Extraction and Report Language)?

                  Jul 22, 2006#8

                  You are driving me crazy and you have not read the help articles!

                  The ^ character is an escape character for UltraEdit style regular expression characters like the \ is for Unix/Perl style regex or C/C++.

                  ^ in an UltraEdit style regular expression simply means that the next character in the string should be interpreted as normal character and not with its regular expression meaning. Sorry, but that should not be too difficult to understand.

                  [lang_en] in a regular expression search would search for a character which is either a l or a or n or g or _ or e (and its uppercase equivalents too if Match Case is not used). ^[lang_en^] in an UltraEdit style regular expression search string means simply search for the string [lang_en] and not for a single character.
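
The same distinction exists in other regular expression engines, which may make it easier to experiment. Below is a small sketch in Python, where the escape character is \ instead of ^; it is an analogy under that assumption, not UltraEdit syntax.

```python
import re

# Unescaped brackets define a character class: ONE character out of l, a, n, g, _, e.
char_class = re.compile(r"[lang_en]")

# Escaped brackets are normal characters: the literal string [lang_en].
literal = re.compile(r"\[lang_en\]")

text = "[lang_en]Country:[/lang_en]"

if __name__ == "__main__":
    print(char_class.findall("angle"))  # every letter of "angle" is in the class
    print(char_class.match("["))        # None, "[" itself is not in the class
    print(literal.findall(text))        # the literal tag is found once
```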

                  Once again. Read the help and play a little to understand. Learning by doing is the best method to learn something. I did not have a teacher or a book. I just learned it by trial and error.

                    Jul 22, 2006#9

                    You are driving me crazy and you have not read the help articles!
                    :D
                    Software For Metalworking
                    http://closetolerancesoftware.com

                      Jul 23, 2006#10

                      Mofi,
                      thank you!
                      I see.
                      Find RegExp ^[abc^] searches for [abc],
                      and Find RegExp ^abc^ searches for abc.