Delete Lines Not Containing Character

Basic User

    Feb 13, 2013#1

    Hello,
    I want to be able to delete all lines not containing a certain character, in this case an UnderScore "_".

    If someone can help with this I'd appreciate it.
    Thanks in advance…

    Power User

      Feb 13, 2013#2

      One way would be to open the Find dialog, enter your character (an underscore in this case) and then click the button for "show lines". This displays all lines containing the underscore and hides all other lines. Then you can go to the Edit menu, Delete, Delete All Hidden Lines. You will be left with only the lines containing the underscore.

      Cheers...

      Frank

      Grand Master

        Feb 13, 2013#3

        As you can read in the topic "How to delete all lines NOT containing specific word or string or expression?", you don't need a macro anymore for this task.

        All you need is to run a Perl regular expression Replace All from the top of the file with the search string ^(?:(?!_).)*$\r\n and an empty replace string.

        As macro code:

        InsertMode
        ColumnModeOff
        HexOff
        Bottom
        IfColNumGt 1
        InsertLine
        IfColNumGt 1
        DeleteToStartofLine
        EndIf
        EndIf
        Top
        PerlReOn
        Find MatchCase RegExp "^(?:(?!_).)*$\r\n"
        Replace All ""
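For anyone who wants to try the expression outside UltraEdit, here is a minimal sketch in Python. It assumes that Python's re module is a fair stand-in for the Perl engine here, which is my assumption and not something stated in this thread. One known difference: Python's $ only recognizes \n as a line end, so the sketch works on LF-terminated text and uses \n instead of \r\n. The function name and the sample text are made up for illustration.

```python
import re

# Same idea as ^(?:(?!_).)*$\r\n, adapted for LF-terminated text because
# Python's $ (with re.MULTILINE) only matches before "\n", not before "\r\n".
NO_UNDERSCORE_LINE = re.compile(r"^(?:(?!_).)*$\n", re.MULTILINE)


def delete_lines_without_underscore(text: str) -> str:
    """Remove every LF-terminated line that contains no underscore."""
    return NO_UNDERSCORE_LINE.sub("", text)


sample = "a_b\nabc\nx_y\n\nfoo\n"
cleaned = delete_lines_without_underscore(sample)
print(repr(cleaned))
```

Empty lines are removed as well, because they contain no underscore either. A last line without a line termination is not matched, which is presumably why the macro first makes sure the file ends with one.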

        Basic User

          Feb 14, 2013#4

          First of all I should have noted that I was trying to add this functionality to an existing macro. Sorry, that was my fault. :|

          Secondly thanks Frank for showing me how to process the lines without a macro.

          Thirdly Mofi, it works very well. :)

          Why do you use Perl expressions over the regular?

          Many thanks for your help,
          JH

          Grand Master

            Feb 14, 2013#5

            The UltraEdit and Unix regular expressions are simply not powerful enough to do the negative find in one step.

            However, in your special case, where only the existence of a single character and not of a string has to be checked, the replace is also possible with the UltraEdit or Unix engines.

            The last 3 macro commands can be

            with the UltraEdit engine:

            UltraEditReOn
            Find MatchCase RegExp "%[~_^p]++^p"
            Replace All ""

            With the Perl engine (the same expression also works with the Unix engine when the first command is replaced by UnixReOn):

            PerlReOn
            Find MatchCase RegExp "^[^_\r\n]*\r\n"
            Replace All ""

            With the Unix engine:

            UnixReOn
            Find MatchCase RegExp "^[^_\p]*\p"
            Replace All ""

            The Perl regular expression found by pietzcker is more complex, but also more general, as it works for "lines NOT containing string" and not just for "lines NOT containing character".
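The character class variant can be tried in the same way. This is a small sketch in Python, again assuming that Python's re module behaves like the Perl engine for this simple expression; the sample text is made up.

```python
import re

# Character class variant: a run of characters that are neither "_" nor
# new line characters, followed by the CRLF line termination.
NO_UNDERSCORE_CRLF = re.compile(r"^[^_\r\n]*\r\n", re.MULTILINE)

sample = "keep_me\r\ndrop\r\n\r\nalso_keep\r\n"
cleaned = NO_UNDERSCORE_CRLF.sub("", sample)
print(repr(cleaned))
```

This works without a lookahead only because a single character can be excluded by a negated character class. For a whole string there is no such class, which is where the lookahead expression is needed.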

            Basic User

              Feb 14, 2013#6

              Mofi,
              thank you for the different code examples.

              It would seem that if a person is going to concentrate on learning one type of Expression it would be Perl. I will never have the time to become proficient at more than one.
              Is this a logical conclusion?

              What function does the period have in the expression?

              Find MatchCase RegExp "^(?:(?!_).)*$\r\n"

              Thanks again for your expertise,
              JH

              Grand Master

                Feb 15, 2013#7

                The UltraEdit and Unix regular expression syntaxes are easy to learn. All you need to know is the small set of special characters described in the help of UltraEdit and listed when you click the button with the triangle in the Find/Replace window while the UltraEdit or Unix regular expression option is enabled.

                But the Perl regular expression engine is much, much more powerful than the other two, much older regular expression engines. So you might start by learning the Perl regexp syntax. Be aware, though, that explanations of the Perl syntax fill entire books or complete websites and not just a single small table. Nevertheless, it is worth learning the syntax if you often need to reformat text files.

                pietzcker posted the expression used here in the topic "How to delete all lines NOT containing specific word or string or expression?" and also named the webpage from which he got it.

                In Perl (and Unix) syntax the dot simply means any character except carriage return and line-feed. I must add "usually", because an addition at the beginning of a search string can make the dot match new line characters as well, see the topic '"." in Perl regular expressions doesn't include CRLFs?'

                The combination .* means 0 or more characters except new line characters. In this expression a closing parenthesis sits between these two special characters, which is necessary so that the negative lookahead is checked before every single character.
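A short Python sketch of the dot behavior. I am assuming that the "addition at beginning of a search string" is the inline flag (?s); it is not named in this post. Note also that Python's dot excludes only the line-feed by default, not the carriage return, so this illustrates the principle and not UltraEdit's exact behavior.

```python
import re

text = "a\nb"

# By default the dot does not match a line-feed, so nothing is found.
default_match = re.search(r"a.b", text)

# With the inline (?s) flag at the start of the expression,
# the dot also matches new line characters.
dotall_match = re.search(r"(?s)a.b", text)

print(default_match)
print(dotall_match is not None)
```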

                This expression is definitely an expert expression and very hard to understand for a beginner.

                ^ means start search at beginning of a line.

                (?:...) is a non-capturing (non-tagging) group. The * after the closing parenthesis of this group, followed by $\r\n, means that the expression inside the group must match 0 or more times up to the carriage return and line-feed pair at the end of the line. If the expression inside the non-capturing group fails anywhere within the line, the find result is negative for this line.

                (?!...) is a negative lookahead expression. Lookahead and lookbehind are special features of the Perl regular expression engine which are not all supported by other regular expression engines using Perl syntax. At the time of writing, the RegExp object of JavaScript core supports lookahead, but not lookbehind. The QRegExp class of Qt 4.8 also supports lookahead, but not lookbehind. The .NET Regex class supports both lookahead and lookbehind. In this search string the negative lookahead is applied at every character of the line because it is checked before each dot.

                The expression simply means:
                1. Begin the search at the beginning of a line.
                2. Find any character except new line characters.
                3. Verify that this character and the characters to its right do not match the string in the negative lookahead expression.
                4. If this character and the characters to its right do match the string in the negative lookahead, the find is negative for this line; continue the find at step 1 on the next line unless the last line of the file has already been reached.
                5. Otherwise continue at step 2 if there is one more character that is not a carriage return or line-feed.
                6. Finally, if carriage return and line-feed are found, the find is positive, so the replace string is applied to the found string (= the entire line).
                7. After the replace, continue at step 1 unless the end of the file is reached.
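The steps above can be sketched as an explicit loop. This is only an illustration in Python of how the expression can be read, not how the regular expression engine is implemented. The function names line_lacks and delete_lines_without are hypothetical, the needle is assumed to be a non-empty string, and the sample text is made up. The last lines cross-check the loop against the regular expression, written without $ because Python's $ does not treat \r\n as a line end.

```python
import re


def line_lacks(line: str, needle: str) -> bool:
    """Walk one line (without its terminator) like ^(?:(?!needle).)*$ does."""
    pos = 0
    while pos < len(line):
        # Negative lookahead: the find fails if needle starts right here.
        if line.startswith(needle, pos):
            return False
        pos += 1  # The dot consumes one character, then the check repeats.
    return True  # Reached the end of the line without seeing needle.


def delete_lines_without(text: str, needle: str) -> str:
    """Drop every terminated line that does not contain needle (non-empty)."""
    kept = []
    # Note: splitlines() also splits on a few rarer separators than CR/LF.
    for line in text.splitlines(keepends=True):
        body = line.rstrip("\r\n")
        terminated = len(body) < len(line)
        if terminated and line_lacks(body, needle):
            continue  # Corresponds to replacing the found line with nothing.
        kept.append(line)
    return "".join(kept)


sample = "error: a\r\nok\r\nan error here\r\n\r\nlast"
loop_result = delete_lines_without(sample, "error")

# Cross-check against the regex form for the same search string.
pattern = re.compile(r"^(?:(?!" + re.escape("error") + r").)*\r\n", re.MULTILINE)
regex_result = pattern.sub("", sample)

print(repr(loop_result))
print(loop_result == regex_result)
```

As with the expression itself, a last line without a line termination is never deleted, and the same code works for a single character such as "_" as the needle.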