Tagged expression with an OR


    Nov 09, 2007#1

    I have been through the FAQ and Power Tips and have searched this forum, but I was not able to find what I am looking for...

    I am using UE 11.20+4 with Unix regexp

    I have several text files that I need to manipulate. Here is a bogus sample for discussion purposes:

    abc bbb [abcdefg:hijk] qwe bbb ccc
    asd
    aaa bbb THEWORD aaa bbb
    qwe
    aaa bbb [bbcdefg:hijk] aaa bbb ccc
    asd
    aaa bbb THEWORD aaa bbb THEWORD
    qwe THEWORD


    I want the resulting file to look like this:

    abcdefg:hijk,THEWORD
    bbcdefg:hijk,THEWORD THEWORD THEWORD

    I am basically looking for any text within the square brackets and I am also looking for a specific string ("THEWORD" in this example)

    I wanted to use tagged expression search/replace but couldn't get past the search part:

    (\[.+\]|THEWORD)

    To test the search component, I tried "List Lines Containing String", expecting a list of all lines that have either a string within square brackets or "THEWORD" in them. BUT it only shows the lines that contain "THEWORD". "\[.+\]|" works fine by itself, but not when I add it to the OR search expression.

    I want to delete all text from the beginning of the line up to the "[", and all characters from the "]" to (and INCLUDING) the end of the line, EXCEPT for any occurrences of "THEWORD".

    I also want to make sure that on lines that do NOT contain square brackets but do contain "THEWORD", all characters OTHER THAN the occurrences of "THEWORD" are deleted.

    I figured I could use tagged expressions (e.g. \1) to help, although I guess I would need to strip out the square brackets afterwards if I did it that way.


    Would appreciate some help...!

    With Regards-
    Sam


      Nov 10, 2007#2

      There's always a risk of getting the wrong feedback when using bogus data. I'm not sure I understand how your "before" example ends up as the result shown. The macro below produces the result you want for the sample, but I'm not sure it will necessarily work on your "live" data.

      InsertMode
      ColumnModeOff
      HexOff
      UnixReOn
      Top
      Find "^p"
      Replace All " "
      Find "THEWORD"
      Replace All "ÿ "
      Find RegExp "^[^ÿ\[]*(.)"
      Replace All "\1"
      Find "["
      Replace All "^p["
      Find RegExp "\][^ÿ\r\n]*"
      Replace All "] "
      Find RegExp "ÿ[^ÿ\r\n]*"
      Replace All "ÿ "
      Find "ÿ"
      Replace All "THEWORD"
      Find "["
      Replace All ""
      Find "]"
      Replace All ","


      Note: "THEWORD" is temporarily replaced with the character "ÿ", which is assumed not to occur in your data. If this character can occur in your data or is problematic in your code page, choose another one instead.

      I repeat the macro with comments:

      InsertMode
      ColumnModeOff
      HexOff
      UnixReOn
      Top
      Find "^p"
      Join all text onto one line (assuming DOS line endings)
      Replace All " "
      Find "THEWORD"
      Temporarily replacing THEWORD
      Replace All "ÿ "
      Find RegExp "^[^ÿ\[]*(.)"
      From start to first ÿ or [
      Replace All "\1"
      Find "["
      Insert a line break before each [
      Replace All "^p["
      Find RegExp "\][^ÿ\r\n]*"
      Remove text after ] to the end of line or first ÿ
      Replace All "] "
      Find RegExp "ÿ[^ÿ\r\n]*"
      Remove text after ÿ to next ÿ or end of line
      Replace All "ÿ "
      Find "ÿ"
      Change ÿ back to THEWORD
      Replace All "THEWORD"
      Find "["
      Remove [
      Replace All ""
      Find "]"
      Remove ] and insert comma
      Replace All ","


      If this is not exactly what you wanted, either change the macro yourself or post some real examples of your data in a code block.
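
As a cross-check for the macro's output, the same extraction can be sketched outside UltraEdit. The Python below is not a translation of the macro above. It is a minimal sketch that scans the text once for bracketed tags and for occurrences of the word. The names `summarize` and `SAMPLE`, and the rule that occurrences before the first tag are ignored, are assumptions made for illustration.

```python
import re

SAMPLE = """abc bbb [abcdefg:hijk] qwe bbb ccc
asd
aaa bbb THEWORD aaa bbb
qwe
aaa bbb [bbcdefg:hijk] aaa bbb ccc
asd
aaa bbb THEWORD aaa bbb THEWORD
qwe THEWORD
"""

def summarize(text, word="THEWORD"):
    """Return one 'tag,WORD WORD ...' line per [bracketed] tag in text."""
    # Group 1 captures the text inside square brackets; the other
    # alternative is the literal word (escaped so it is never a pattern).
    token = re.compile(r"\[([^\]\n]+)\]|" + re.escape(word))
    records = []  # each entry is [tag, number of word hits after the tag]
    for match in token.finditer(text):
        if match.group(1) is not None:
            records.append([match.group(1), 0])
        elif records:
            # Assumption: hits before the first tag belong to no tag and are dropped.
            records[-1][1] += 1
    return [tag + "," + " ".join([word] * hits) for tag, hits in records]

for line in summarize(SAMPLE):
    print(line)
```

With the bogus sample from the first post this prints `abcdefg:hijk,THEWORD` and `bbcdefg:hijk,THEWORD THEWORD THEWORD`. Because everything that is neither a tag nor the word is skipped, text to the right of the last occurrence is dropped as well.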


        Nov 11, 2007#3

        Thank you very much for your time. I have never used macros before, so I will need to do some learning to understand what you have done.

        As an aside, I am curious: if you or someone else could help me understand why my search string did not work as I expected, I would also be grateful!

        i.e. (\[.+\]|THEWORD)


          Nov 11, 2007#4

          Because OR expressions support only literal text, as in (A|B), with no wildcards etc.


            Nov 11, 2007#5

            It's interesting that when I tried version 11.20b, the expression (\[.+\]|THEWORD) would not find text matching \[.+\], but when I tried it in version 12.20 with regexp set to Perl compatible, it matched both terms, i.e. it also matched any text for \[.+\].

            Jane


              Nov 11, 2007#6

              Yes, Perl regexp is more powerful than legacy Unix style regexp.
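
The behaviour reported for the Perl compatible engine can be illustrated outside UltraEdit. The sketch below uses Python's `re` module, which is a Perl-style engine but not UltraEdit's own, so it only shows how such an engine treats the alternation. The helper name `lines_containing` is an assumption that loosely imitates "List Lines Containing String".

```python
import re

# The expression from the first post: a bracketed string OR the literal word.
PATTERN = re.compile(r"(\[.+\]|THEWORD)")

def lines_containing(text):
    """Return every line in text that has at least one match of PATTERN."""
    return [line for line in text.splitlines() if PATTERN.search(line)]

sample = """abc bbb [abcdefg:hijk] qwe bbb ccc
asd
aaa bbb THEWORD aaa bbb
qwe"""

print(lines_containing(sample))
```

Both the bracket line and the THEWORD line are listed. Note that `.+` is greedy, so on a line with two bracketed strings the match runs from the first `[` to the last `]`.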


                Nov 11, 2007#7

                So the (\[.+\]|THEWORD) search WILL work with 12.20 Perl regexp?

                Thanks for your posts!

                  Nov 12, 2007#8

                  jorrasdk - EXCELLENT! FANTASTIC. Macros are my new friend! I just have to figure out what that code means! (thanks for your comments). EXCELLENT!

                  Your code works perfectly for me except for the last line of my text documents: it does not delete the text to the right of the last occurrence of "THEWORD".

                  Your point about bogus data is well taken... my example did not CONTAIN any text to the right of the last occurrence of "THEWORD"... but assuming that it did, does anyone have an easy fix to delete that text too?

                  Thanks again-
                  Sam


                    Nov 12, 2007#9

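                    Try this macro. The new code, shown in red in the original post, is the block from Bottom up to the first Top and the lines from Key END to the final Top.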

                    InsertMode
                    ColumnModeOff
                    HexOff
                    UnixReOn
                    Bottom
                    Find Up "THEWORD"
                    IfFound
                    Key LEFT ARROW
                    Key RIGHT ARROW
                    SelectToBottom
                    IfSel
                    Find RegExp "(\[.*\])"
                    Replace All SelectText "\1"
                    IfNotFound
                    Delete
                    EndIf
                    EndSelect
                    EndIf
                    EndIf

                    Top
                    Find "^p"
                    Replace All " "
                    Find "THEWORD"
                    Replace All "ÿ "
                    Find RegExp "^[^ÿ\[]*(.)"
                    Replace All "\1"
                    Find "["
                    Replace All "^p["
                    Find RegExp "\][^ÿ\r\n]*"
                    Replace All "] "
                    Find RegExp "ÿ[^ÿ\r\n]*"
                    Replace All "ÿ "
                    Find "ÿ"
                    Replace All "THEWORD"
                    Find "["
                    Replace All ""
                    Find "]"
                    Replace All ","
                    Key END
                    IfColNum 1
                    DeleteLine
                    EndIf
                    Bottom
                    IfColNumGt 1
                    InsertLine
                    EndIf
                    Top

                    Best regards from an UC/UE/UES for Windows user from Austria


                      Nov 13, 2007#10

                      Thanks Mofi! PERFECT!

                        Nov 13, 2007#11

                        Hi folks - instead of hardcoding "THEWORD", I would like to highlight a word within the text file before running the macro. I've been messing around with ^s and ^c, but don't seem to be making progress. I've downloaded Mofi's macro examples and have been looking at the help files and searching this forum, but I am spending a LOT of time and not feeling any closer. I found a comment that Mofi made back in 2004:

                        "I noticed, that ^c and ^s are simply not working in a regular expression search with Unix style." I wasn't sure whether this had any relevance.

                        Can anyone please help me convert Mofi's macro above so that instead of hardcoding "THEWORD", the macro will instead use the highlighted text?

                        With Regards-
                        Sam


                          Nov 14, 2007#12

                          Yes, ^c works only in UltraEdit style regular expressions or in non regular expression finds/replaces. In your macro THEWORD is used 3 times, each time in a non regular expression find or replace. So you only have to replace THEWORD with ^c and, of course, copy the selected text to a clipboard first, preferably a user clipboard (clipboard 9 below) rather than the Windows clipboard. The macro now runs only if something is selected when it starts.

                          IfSel
                          Clipboard 9
                          Copy

                          InsertMode
                          ColumnModeOff
                          HexOff
                          UnixReOn
                          Bottom
                          Find Up "^c"
                          IfFound
                          Key LEFT ARROW
                          Key RIGHT ARROW
                          SelectToBottom
                          IfSel
                          Find RegExp "(\[.*\])"
                          Replace All SelectText "\1"
                          IfNotFound
                          Delete
                          EndIf
                          EndSelect
                          EndIf
                          EndIf
                          Top
                          Find "^p"
                          Replace All " "
                          Find "^c"
                          Replace All "ÿ "
                          Find RegExp "^[^ÿ\[]*(.)"
                          Replace All "\1"
                          Find "["
                          Replace All "^p["
                          Find RegExp "\][^ÿ\r\n]*"
                          Replace All "] "
                          Find RegExp "ÿ[^ÿ\r\n]*"
                          Replace All "ÿ "
                          Find "ÿ"
                          Replace All "^c"
                          Find "["
                          Replace All ""
                          Find "]"
                          Replace All ","
                          Key END
                          IfColNum 1
                          DeleteLine
                          EndIf
                          Bottom
                          IfColNumGt 1
                          InsertLine
                          EndIf
                          Top
                          ClearClipboard
                          Clipboard 0
                          EndIf

                          Best regards from an UC/UE/UES for Windows user from Austria
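
In a non regular expression find, the selected text is taken literally, which is why a selection containing characters such as "." or "+" is safe there. For anyone post-processing the same files outside UltraEdit, the same idea can be sketched in Python. This is an illustration only, not part of the macro; the function name `mark_selected` and its `marker` parameter are assumptions.

```python
import re

def mark_selected(text, selected, marker="\u00ff"):
    """Replace each literal occurrence of `selected` in text with `marker`."""
    if not selected:
        # Same idea as the macro's IfSel guard: without a selection, do nothing.
        return text
    # re.escape makes characters such as "." or "+" match only themselves.
    # The lambda keeps backslashes in `marker` from being read as escapes.
    return re.sub(re.escape(selected), lambda _match: marker, text)

print(mark_selected("a.b axb a.b", "a.b", marker="#"))
```

Without `re.escape`, the pattern `a.b` would also match `axb`; with it, only the literal selection is replaced and the example prints `# axb #`.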


                            Nov 14, 2007#13

                            Thanks Mofi! Absolutely WONDERFUL! I don't yet understand it, but it's WONDERFUL!!

                            I am a little embarrassed to ask the forum for more help on this macro - I have been reading up on macros to try to get up to speed, but I am having a tough time finding the pieces I need to understand to do what I want to do...

                            After the macro runs, I will end up with something like this:

                            abcdefg:hijk,THEWORD
                            bbcdefg:hijk,THEWORD THEWORD THEWORD
                            qweqwe:qwer, THEWORD THEWORD

                            What I REALLY need (which I have tried to do manually AFTER running this amazing macro) is a result that looks like this:

                            abcdefg:hijk,THEWORD
                            bbcdefg:hijk,THEWORD
                            bbcdefg:hijk,THEWORD(2)
                            bbcdefg:hijk,THEWORD(3)
                            qweqwe:qwer, THEWORD
                            qweqwe:qwer, THEWORD(2)

                            With large files, I have found how "human" I am in making errors as I am copying and pasting and typing... trying to get the above result.

                            I spent some time looking over posts having to do with duplicate values and duplicate lines hoping to get a clue (most posts had to do with deleting) - but this is still way beyond me... so all help is appreciated
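
The numbering step described above can be sketched outside UltraEdit. The Python below is a minimal illustration, not a macro. The function name `expand_counts` is hypothetical, and it assumes that each line has the form `tag,WORD WORD ...` and that the tag itself contains no comma.

```python
def expand_counts(lines, word="THEWORD"):
    """Turn 'tag,WORD WORD WORD' into one line per occurrence: WORD, WORD(2), WORD(3)."""
    out = []
    for line in lines:
        tag, comma, rest = line.partition(",")
        if not comma:
            out.append(line)  # no comma: pass the line through unchanged
            continue
        hits = rest.split().count(word)
        # Keep whatever whitespace followed the comma in the input line.
        pad = rest[: len(rest) - len(rest.lstrip())]
        # Lines with zero hits produce no output lines.
        for n in range(1, hits + 1):
            suffix = "" if n == 1 else "({})".format(n)
            out.append(tag + "," + pad + word + suffix)
    return out

macro_output = [
    "abcdefg:hijk,THEWORD",
    "bbcdefg:hijk,THEWORD THEWORD THEWORD",
    "qweqwe:qwer, THEWORD THEWORD",
]
for line in expand_counts(macro_output):
    print(line)
```

For the three lines of macro output quoted above, this yields the six numbered lines listed as the wanted result, including the space after the comma on the `qweqwe:qwer` lines.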


                            The rest I can do manually if need be, but if anyone insists on helping me even further... :D

                            I need the file sorted with some more slicing and dicing for THIS final result:

                            "THEWORD","
                            abcdefg:hijk
                            bbcdefg:hijk
                            qweqwe:qwer
                            "
                            "THEWORD(2)","
                            bbcdefg:hijk
                            qweqwe:qwer
                            "
                            "THEWORD(3)","
                            bbcdefg:hijk
                            "
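
The regrouping into the final layout can be sketched the same way. This Python is an illustration under assumptions, not a macro: the function name `group_by_label` is hypothetical, each input line is assumed to have the form `tag,LABEL`, groups are emitted in order of first appearance, and the tags inside each group are sorted.

```python
def group_by_label(lines):
    """Group 'tag,LABEL' lines by LABEL and emit the quoted block layout."""
    groups = {}  # label -> list of tags, in order of first appearance
    for line in lines:
        tag, comma, label = line.partition(",")
        if comma:
            groups.setdefault(label.strip(), []).append(tag.strip())
    out = []
    for label, tags in groups.items():
        out.append('"{}","'.format(label))  # opening line: "LABEL","
        out.extend(sorted(tags))            # one tag per line
        out.append('"')                     # closing quote on its own line
    return out

numbered = [
    "abcdefg:hijk,THEWORD",
    "bbcdefg:hijk,THEWORD",
    "bbcdefg:hijk,THEWORD(2)",
    "bbcdefg:hijk,THEWORD(3)",
    "qweqwe:qwer, THEWORD",
    "qweqwe:qwer, THEWORD(2)",
]
print("\n".join(group_by_label(numbered)))
```

Order of first appearance is used for the groups because sorting the labels as plain text would place a label such as `THEWORD(10)` before `THEWORD(2)`.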

                            Mofi - your macro alone has saved me a tremendous amount of time - even if someone does not help me further, I am still way ahead here. Thanks to all for your posts.

                            And what an amazing product!

                            With Regards-
                            Sam