Extracting SHA1 hashes matched by a regular expression from file

    Feb 05, 2018#1

    Hi,

    I wanted to extract some SHA1 hashes from random text files. So, let's say I have a file like this:

    Code:

    a c0c13ddb9e5ec04ab1561bfaebe4920a6f1bdda9
    asklskalk
     as dsads fcaa0fed49330b5d07E1c7a300a545cf41ec8df0 8dkds8z
     sadadasda
    
    asad 1b866886c46430c871a91ead6e607fad9e291de4 fcaa0fed49330b5d07E1c7a300a545cf41ec8df0
    
    lll
    As SHA1 hashes are always 40 characters long and use only hexadecimal digits, I can extract them with a simple grep command:

    Code:

    $ grep -Eo '[[:xdigit:]]{40}' teste.txt
    c0c13ddb9e5ec04ab1561bfaebe4920a6f1bdda9
    fcaa0fed49330b5d07E1c7a300a545cf41ec8df0
    1b866886c46430c871a91ead6e607fad9e291de4
    fcaa0fed49330b5d07E1c7a300a545cf41ec8df0
    I've been trying to simulate this grep command (in particular the -o option, which makes grep print only the matching part of the line) with UltraEdit scripting. It took me a while because I found no easy way to know when the find command has gone back to the first occurrence found (as UE find is circular). So I don't know when to stop a loop, as UltraEdit.activeDocument.isFound() will always return true unless I delete or change the occurrences found, which I don't want to do.

    The only way I found is to save the position of the first match and then keep comparing the other ones with it. It's not elegant, though. My current working code looks like this:

    Code:

    var doc = UltraEdit.activeDocument;
    var i, first;
    
    doc.top();
    doc.findReplace.matchWord=true;
    doc.findReplace.searchDown=true;
    doc.findReplace.regExp=true;
    
    UltraEdit.clearClipboard();
    
    // The find wraps around at end of file, so remember the position of the
    // first match and stop as soon as the search arrives there again.
    for (i=0, first=0; ; i++) {
       doc.findReplace.find("[[:xdigit:]]{40}");
    
       if (! doc.isFound() || first == doc.currentPos)
          break;
    
       if (i == 0)
          first = doc.currentPos;
       else
          UltraEdit.clipboardContent += "\n";   // line break between matches
    
       doc.copyAppend();
    }
    
    UltraEdit.outputWindow.write(i + " occurrences found.");
    
    // Put the collected matches into a new file.
    if (i > 0) {
       UltraEdit.newFile();
       doc.paste();
    }
    So, is there any easier way to accomplish this? I'm using the latest version of UE on Mac.
    Thanks in advance,
    Fernando

      Feb 06, 2018#2

      I use only UltraEdit for Windows, so I can only hope that what I write below is also true for UltraEdit for Mac.
      1. There is the setting Continue find at end of file in Configuration/Preferences at Search - Miscellaneous. Uncheck it to prevent a downward find from continuing at the top of the file after reaching the end of the file, and an upward find from continuing at the end of the file after reaching the top of the file.
      2. When a find or replace is run from within a macro or script, UltraEdit (for Windows) never continues the find/replace at the top/end of the file. This avoids an endlessly running find/replace executed in a loop, provided the script writer evaluates the return value of the find/replace functions correctly.
      3. The find/replace scripting functions themselves return true/false, so a find/replace function can be used directly in an IF condition in a script. The isFound() function exists mainly for historical reasons: the set of macro commands was initially implemented 1:1 as scripting commands. I never use isFound() and isNotFound() in scripts and instead evaluate the return value of findReplace.find() and findReplace.replace(). So I don't know whether isFound() and isNotFound() work as they should and return true/false depending on the last find/replace result.
      4. I have already written grep replacement scripts, see Find strings with a regular expression and output them to new file. Running this script on the posted example with the search string \b[0-9A-Fa-f]{40}\b creates a new file with the lines:

        Code:

        c0c13ddb9e5ec04ab1561bfaebe4920a6f1bdda9
        fcaa0fed49330b5d07E1c7a300a545cf41ec8df0
        1b866886c46430c871a91ead6e607fad9e291de4
        fcaa0fed49330b5d07E1c7a300a545cf41ec8df0
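      Points 2 and 3 above can be sketched as a loop that simply ends when findReplace.find() returns false. This is only a minimal sketch: collectMatches is a hypothetical helper name, not part of the UltraEdit API, and it assumes that a find run from a script stops at the end of the file on Mac as it does on Windows.

```javascript
// Sketch only: collectMatches is a hypothetical helper, not an UltraEdit function.
// `doc` is expected to behave like UltraEdit.activeDocument.
function collectMatches(doc, searchString) {
   var matches = [];

   doc.top();                          // start searching at top of file
   doc.findReplace.searchDown = true;
   doc.findReplace.regExp = true;

   // Assumption: find() returns true for every further match below the caret
   // and false once no more match exists up to end of file, ending the loop.
   while (doc.findReplace.find(searchString)) {
      matches.push(doc.selection);     // the found string is the current selection
   }
   return matches;
}

// Inside UltraEdit the helper could be used like this.
if (typeof UltraEdit !== "undefined") {
   UltraEdit.perlReOn();               // select the Perl regular expression engine
   var found = collectMatches(UltraEdit.activeDocument, "\\b[[:xdigit:]]{40}\\b");
   UltraEdit.outputWindow.write(found.length + " occurrences found.");
}
```

      With this approach there is no need to remember the position of the first match, because the loop condition is the return value of the find itself.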
      Note: The search string [[:xdigit:]]{40} works with the Perl regular expression engine of UltraEdit (from the Boost library), although it would be better to use \b[[:xdigit:]]{40}\b to match only strings of exactly 40 hexadecimal characters. But the regular expression support of the JavaScript core used by the scripts I wrote does not support [:xdigit:], which is why [0-9A-Fa-f] or [\dA-Fa-f] must be used with, for example, FindStringsToNewFile.js.
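      The difference described in the note can be tried out in plain JavaScript without any UltraEdit function. The following is a small sketch, with extractHashes being a hypothetical helper name and the sample text taken from the first post:

```javascript
// Plain JavaScript sketch; extractHashes is a hypothetical helper name.
function extractHashes(text) {
   // [0-9A-Fa-f] is used instead of [[:xdigit:]] because the POSIX class
   // is not supported by JavaScript regular expressions.
   // \b on both sides rejects longer runs of word characters.
   return text.match(/\b[0-9A-Fa-f]{40}\b/g) || [];
}

// Sample lines from the question.
var sample = [
   "a c0c13ddb9e5ec04ab1561bfaebe4920a6f1bdda9",
   "asklskalk",
   " as dsads fcaa0fed49330b5d07E1c7a300a545cf41ec8df0 8dkds8z",
   " sadadasda",
   "asad 1b866886c46430c871a91ead6e607fad9e291de4 fcaa0fed49330b5d07E1c7a300a545cf41ec8df0"
].join("\n");

var hashes = extractHashes(sample);   // array with one entry per match
```

      The "|| []" part is there because match() returns null instead of an empty array when nothing is found.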
      Best regards from an UC/UE/UES for Windows user from Austria

        Feb 07, 2018#3

        Thanks for your answer! Your script works much better :)