Tapatalk

How to search for 2 words in a file?

    Nov 26, 2014 #1

    Is it possible to search for 2 or more words in files?

      Nov 26, 2014 #2

      If you want to search an opened file, or all files in a directory, for any word from a list of words, that is no problem. Use the Perl regular expression engine with the search expression \<(?:word1|word2|word3|word4|AndSoOn)\> in the Find or Find in Files command.

      Explanation:

      \< ... beginning of a word.

      (?:...) ... non-capturing group used here for the OR expression of the words.

      | ... simply means OR.

      \> ... end of a word.
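      For readers who want to try the same idea outside the editor, here is a minimal sketch in Python. It is only an approximation of the expression above: Python's re module does not support \< and \>, so \b (word boundary) is used instead, and the function name build_any_word_pattern is made up for this illustration.

```python
import re

def build_any_word_pattern(words):
    """Compile a pattern that matches any one of the given whole words."""
    # re.escape guards against words that contain regex metacharacters.
    alternation = "|".join(re.escape(w) for w in words)
    # \b replaces \< and \>, which Python's re module does not support.
    return re.compile(r"\b(?:" + alternation + r")\b")

pattern = build_any_word_pattern(["word1", "word2", "word3"])
sample = "first word2, then word10, then word3"
print(pattern.findall(sample))  # ['word2', 'word3']
```

      word10 is not reported because the word boundary after word1 is missing there, which is the same effect \> has in the editor's expression.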

      But the number of words is not unlimited. The regular expression engine has an internal limit which depends on the length of the words. Up to 50 words are usually no problem.

      It is much more difficult to search for 2 words which must both exist in a file, in any order. Regular expression engines are not really designed for such imprecise searches.

      But the Perl search expression (?s)\<(word1|word2)\>(?:.(?!\1))*?\<(?:word1|word2)\> could be used for this task.

      Explanation:

      (?s) ... the . also matches carriage return and line-feed, so the two words can be on different lines.

      \< ... beginning of a word.

      (...) ... a capturing group used here for the OR expression of the words.

      | ... simply means OR.

      \> ... end of a word.

      (?:...) ... non-capturing group used for matching any character 0 or more times, as long as the first word found does not occur again.

      . ... any character. Carriage return and line-feed are usually not matched by ., but they are matched here because of (?s) at the beginning of the search string.

      (?!\1) ... this is a negative lookahead with a back-reference. It checks that the text following the currently matched character is NOT once again the word already found by the first OR expression in the first capturing group.

      *? ... means 0 or more times non-greedy applied to expression in non-capturing group.

      The rest of the expression was already explained above.
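      The expression can be tried in Python as well. The following is a sketch under assumptions, not the editor's own behavior: \b stands in for \< and \>, re.DOTALL stands in for (?s), and the helper name build_two_word_pattern and the sample words alpha and beta are made up for this illustration.

```python
import re

def build_two_word_pattern(word_a, word_b):
    """Compile a pattern spanning from one of the two words to the other one."""
    alternation = re.escape(word_a) + "|" + re.escape(word_b)
    return re.compile(
        # Group 1 captures whichever word comes first. The lazy group then
        # consumes characters as long as that same word does not follow.
        r"\b(" + alternation + r")\b(?:.(?!\1))*?\b(?:" + alternation + r")\b",
        re.DOTALL,  # let . match line terminators, like (?s)
    )

pattern = build_two_word_pattern("alpha", "beta")

for text in ("alpha one\ntwo beta", "beta x alpha", "alpha alpha"):
    match = pattern.search(text)
    print(repr(text), "->", "both words found" if match else "no match")
```

      The third sample contains only one of the two words (twice), so the negative lookahead prevents a match.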

      This expression works only if there is not too much data between the 2 words. In a test, a search with this expression succeeded with 878 KB between the 2 words, but failed with 885 KB between them, showing the error message "An unknown error occurred while accessing %temp%\filename.000". The limit for the data between the 2 words is surely not a simple KB limit.

      I would not use a regular expression search to find files which must contain 3 or more words in any order. It would be possible, but the expression becomes very complex.
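      Outside the editor, a script can sidestep that complexity by running one simple whole-word search per word instead of one combined expression. This is a different technique from the expression above, shown only as a sketch; the function name contains_all_words is made up, and \b is used for the word boundaries.

```python
import re

def contains_all_words(text, words):
    """Return True if every word occurs at least once as a whole word."""
    # One independent search per word, so order and distance do not matter.
    return all(
        re.search(r"\b" + re.escape(w) + r"\b", text) is not None
        for w in words
    )

print(contains_all_words("gamma alpha\nbeta", ["alpha", "beta", "gamma"]))  # True
print(contains_all_words("gamma alpha", ["alpha", "beta", "gamma"]))        # False
```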

      But if you know the order of the words in the file, and all 2, 3, 4, ... words must be found within a small block of a few KB, a Perl regular expression find is no problem. Regular expression engines are designed for such precise searches.
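      A search for words in a known order could look like the following Python sketch. The helper name build_ordered_pattern, the max_gap parameter and its default of 2048 characters are assumptions chosen for this illustration; \b again replaces \< and \>.

```python
import re

def build_ordered_pattern(words, max_gap=2048):
    """Match the words in the given order, at most max_gap characters apart."""
    # Lazy, bounded gap between neighbouring words (max_gap is an assumed limit).
    gap = r".{0,%d}?" % max_gap
    parts = [r"\b" + re.escape(w) + r"\b" for w in words]
    # re.DOTALL lets the gap run across line breaks.
    return re.compile(gap.join(parts), re.DOTALL)

ordered = build_ordered_pattern(["open", "read", "close"], max_gap=20)

print(ordered.search("open file\nread data\nclose") is not None)  # True
print(ordered.search("close read open") is not None)              # False
```

      Bounding the gap keeps the amount of text the engine has to scan between two words small, which fits the point about searching within a small block.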

        Nov 27, 2014 #3

        Thanks, Mofi.
        The explanation is very clear.
        Maurizio