Script to count instances of text


    9:54 - May 22#1

    I'm new to creating scripts in UE, but I have created macros in the past. What I want to achieve is not possible with macros, so I'm hoping for some help here.
    I want to create a script that counts the occurrences of search strings that are created dynamically,
    "[Node 001]" up to "[Node 100]", and then writes the counts into a new file, for example:
    Node 001 found x times
    This would help me significantly in analyzing my logs.
    Does anyone perhaps already have such a script?

    Art

    Mofi

      17:00 - May 22#2

      I have several ideas for how this task could be done with an UltraEdit script. Here is one that should always work, independent of the version of UltraEdit, the operating system, and the size of the active file, which can even be multiple GB. It also works whether the file is saved on disk or is just a new, unnamed file in UltraEdit. The disadvantage of this solution is its low efficiency, caused by the many document window updates during script execution. Other solutions would be faster, but I cannot write them without knowing the version of UltraEdit, the typical file size, and whether the file is always a named file stored on storage media (hard disk, network drive, or other storage media).

      Code:

      if (UltraEdit.document.length > 0)  // Is any file opened?
      {
         // Define environment for this script.
         UltraEdit.insertMode();
         if (typeof(UltraEdit.columnModeOff) == "function") UltraEdit.columnModeOff();
         else if (typeof(UltraEdit.activeDocument.columnModeOff) == "function") UltraEdit.activeDocument.columnModeOff();
      
         // Define the parameters for running in a loop from top to bottom in active
         // file a case-sensitive, non regular expression find to count the occurrences
         // of each searched string created dynamically in the outer loop below.
         UltraEdit.ueReOn();
         UltraEdit.activeDocument.findReplace.mode=0;
         UltraEdit.activeDocument.findReplace.matchCase=true;
         UltraEdit.activeDocument.findReplace.matchWord=false;
         UltraEdit.activeDocument.findReplace.regExp=false;
         UltraEdit.activeDocument.findReplace.searchDown=true;
         if (typeof(UltraEdit.activeDocument.findReplace.searchInColumn) == "boolean") {
            UltraEdit.activeDocument.findReplace.searchInColumn=false;
         }
      
         var nCount;
         var sFind;
         var sNumber;
         var sResult;
         var sZeros = "000";
         var asResults = [];
      
         for (var nNumber = 1; nNumber <= 100; ++nNumber)
         {
            // Convert the current integer number into a string using decimal system.
            sNumber = nNumber.toString(10);
            // Create the string to find by inserting the correct number of leading
            // zeros depending on length of current number string (2, 1, or 0 zeros).
            sFind = "[Node " + sZeros.substr(sNumber.length) + sNumber + "]";
      
            // Initialize the counter variable, move caret to top of active file
            // and run in a loop the find until searched string is not found
            // anymore from current caret position to end of file with
            // counting the number of positive matches.
            nCount = 0;
            UltraEdit.activeDocument.top();
            while (UltraEdit.activeDocument.findReplace.find(sFind)) nCount++;
      
            // Create the result string to write finally into a new file and
            // append this string to the array of strings with the results.
            sResult = sFind.substr(1,sFind.length-2) + " found " + nCount.toString(10) + " time";
            if (nCount != 1) sResult += "s";
            asResults.push(sResult);
         }
      
         // Append an empty string to the results string array for having terminated
         // also the last line in the new file with carriage return and line-feed.
         asResults.push("");
      
         // Create a new ANSI encoded file with DOS line terminators
         // independent on what is configured in settings for a new file.
         UltraEdit.newFile();
         UltraEdit.activeDocument.unicodeToASCII();
         if(typeof(UltraEdit.activeDocument.UTF8ToASCII) == "function")
         {
            UltraEdit.activeDocument.UTF8ToASCII();
         }
         UltraEdit.activeDocument.unixMacToDos();
      
         // Join the result strings together with carriage return and line-feed
         // between each of the strings in the array, write the resulting
         // multi-line string into the new file and move caret to top.
         UltraEdit.activeDocument.write(asResults.join("\r\n"));
         UltraEdit.activeDocument.top();
      }
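
The zero-padding step from the loop above can also be looked at in isolation. This is only a minimal sketch in plain JavaScript, runnable outside UltraEdit; the function name buildNodeSearchStrings is hypothetical and not part of the UltraEdit scripting API.

```javascript
// Hypothetical helper (not an UltraEdit API): builds the search strings
// "[Node 001]" up to "[Node 100]" with the padding idea used in the script.
function buildNodeSearchStrings(first, last) {
  var zeros = "000";
  var result = [];
  for (var n = first; n <= last; ++n) {
    var digits = n.toString(10);
    // Skip as many zeros as the number has digits:
    // "1" gets "00", "10" gets "0", "100" gets no leading zero.
    result.push("[Node " + zeros.slice(digits.length) + digits + "]");
  }
  return result;
}

var searchStrings = buildNodeSearchStrings(1, 100);
console.log(searchStrings[0] + " ... " + searchStrings[searchStrings.length - 1]);
```

Each of these strings is what the script passes to the find in its inner loop.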
      
      Best regards from an UC/UE/UES for Windows user from Austria

      Art312134

        20:02 - May 22#3

        Thanks!

        I'm working with UE v24.

        The logs I want to check are between 50 k and 500 k lines of text and approximately 3 to 30 MB in size.

        I tested your script on a file with 45 k lines and 3 MB, but it freezes UE after one pass through the entire text. Sometimes it freezes on the first attempt; another time it got stuck when starting from the first line for the second time.
        The file is on a network location but can be copied to a local SSD to improve performance.

        Mofi

          6:56 - May 23#4

          The file size is problematic for loading the entire file contents as one large string into the memory of the JavaScript core engine and running the finds directly on that string. The most efficient method is therefore not possible. With the window updates, it can take several minutes to run all the finds on such files.
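
For files that are small enough to be held in memory, counting directly on one large string could look roughly like the following. This is only a sketch under that assumption; countOccurrences is a hypothetical helper name, and how the text gets into the string (for example from the active document) is deliberately left out.

```javascript
// Hypothetical helper: counts non-overlapping, case-sensitive occurrences
// of a plain (non regular expression) string in a text held in memory.
function countOccurrences(text, needle) {
  if (needle.length === 0) return 0; // Avoid an endless loop on an empty string.
  var count = 0;
  var pos = text.indexOf(needle);
  while (pos !== -1) {
    ++count;
    // Continue the search behind the string just found.
    pos = text.indexOf(needle, pos + needle.length);
  }
  return count;
}

var sampleLog = "[Node 001] up\r\n[Node 002] up\r\n[Node 001] down\r\n";
console.log("Node 001 found " + countOccurrences(sampleLog, "[Node 001]") + " times");
```

No document window is involved here, which is why this kind of approach avoids the window updates, at the price of needing the whole file in memory.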

          It could perhaps be made faster by using Find in Files and reading the total number of found strings from the output window, provided the log file is always a named file on a network drive or a local drive. But then the script depends on the configured Find output format for the file summary, which is generally not good.

          I would need the exact version of UltraEdit as displayed in the About window, where the version can also be selected and copied to the clipboard with Ctrl+C. Then I could restore exactly the same version from my personal archives and try the script on a file with 500,000 lines.

          But honestly, I would do that job with a simple batch file or a PowerShell script instead of an UltraEdit script. UltraEdit is a graphical user interface application, and it is very difficult to do this task with an UltraEdit script without document window refreshes. These refreshes make the script execution time very long in comparison to a program that just searches for a string and counts the found occurrences without any graphical window updates.

          Here is a batch file solution in which just the log file name on the third line must be adapted. It produces a .txt file with the results, named after the batch file and stored in the directory of the batch file.

          Code:

          @echo off
          setlocal EnableExtensions DisableDelayedExpansion
          set "LogFile=C:\Temp\Test.log"
          if exist "%LogFile%" goto ProcessFile
          echo ERROR: File "%LogFile%" does not exist.
          echo(
          pause
          exit /B
          :ProcessFile
          set "TempFile=%TEMP%\%~n0.tmp"
          set "ResultsFile=%~dpn0.txt"
          copy "%LogFile%" "%TempFile%" >nul
          setlocal EnableDelayedExpansion
          (for /L %%I in (1,1,100) do set "Number=00%%I" & for /F "tokens=3 delims=:" %%J in ('%SystemRoot%\System32\find.exe /C "[Node !Number:~-3!]" "!TempFile!"') do if not "%%J" == " 1" (echo Node !Number:~-3! found%%J times) else echo Node !Number:~-3! found%%J time)>"!ResultsFile!"
          endlocal
          del "%TempFile%"
          endlocal
          
          Please note that the Windows command FIND counts just the lines containing the searched string and not the number of found strings. The result is wrong if a line contains a searched string more than once.
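
The difference between counting lines and counting strings can be illustrated with a small sketch in plain JavaScript. Both function names and the sample text are made up for illustration only.

```javascript
// Hypothetical helper: counts the lines containing the string at least once,
// which is the kind of result a line-based count reports.
function countMatchingLines(text, needle) {
  var lines = text.split(/\r\n|\r|\n/);
  var count = 0;
  for (var i = 0; i < lines.length; ++i) {
    if (lines[i].indexOf(needle) !== -1) ++count;
  }
  return count;
}

// Hypothetical helper: counts every non-overlapping occurrence of the string.
function countAllOccurrences(text, needle) {
  return needle.length === 0 ? 0 : text.split(needle).length - 1;
}

// The first line contains the searched string twice.
var sample = "[Node 001] start [Node 001] retry\r\n[Node 002] ok\r\n[Node 001] done\r\n";
console.log("Lines: " + countMatchingLines(sample, "[Node 001]"));
console.log("Occurrences: " + countAllOccurrences(sample, "[Node 001]"));
```

For this sample the line count is 2 while the occurrence count is 3, so the two results only agree as long as no line contains the searched string more than once.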

          Art312134

            22:09 - May 23#5

            Many thanks. I adjusted the script slightly so that the log file is passed as a command line parameter:

            Code:

            @echo off
            setlocal EnableExtensions DisableDelayedExpansion
            set "LogFile=%1"
            if exist "%LogFile%" goto ProcessFile
            echo ERROR: File "%LogFile%" does not exist.
            echo(
            pause
            exit /B
            :ProcessFile
            set "TempFile=%TEMP%\%~n0.tmp"
            set "ResultsFile=%~dpn0.txt"
            copy "%LogFile%" "%TempFile%" >nul
            setlocal EnableDelayedExpansion
            (for /L %%I in (1,1,100) do set "Number=00%%I" & for /F "tokens=3 delims=:" %%J in ('%SystemRoot%\System32\find.exe /C "[Node !Number:~-3!]" "!TempFile!"') do if not "%%J" == " 1" (echo Node !Number:~-3! found%%J times) else echo Node !Number:~-3! found%%J time)>"!ResultsFile!"
            endlocal
            del "%TempFile%"
            uedit64.exe %ResultsFile%
            endlocal
            
            The text is indeed searched at line level. For this purpose, it doesn't matter how many strings are found on the same line.

            Mofi

              5:06 - May 24#6

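              The batch file code with the small modification is not correct now. The batch file does not work correctly if it is called with the log file name enclosed in " because the file name contains a space or one of these characters: &()[]{}^=;!'+,`~

              In that case %1 expands with the surrounding double quotes, so "%LogFile%" ends up with doubled quotes. %~f1 removes the surrounding quotes and expands to the full qualified file name. The results file name should also be enclosed in double quotes when it is passed to uedit64.exe.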

              The correct batch file code would be:

              Code:

              @echo off
              setlocal EnableExtensions DisableDelayedExpansion
              set "LogFile=%~f1"
              if defined LogFile goto CheckFile
              echo ERROR: Batch file "%~nx0" must be run with log file name as argument.
              echo(
              pause
              exit /B 2
              :CheckFile
              if exist "%LogFile%" goto ProcessFile
              echo ERROR: File "%LogFile%" does not exist.
              echo(
              pause
              exit /B 1
              :ProcessFile
              set "TempFile=%TEMP%\%~n0.tmp"
              set "ResultsFile=%~dpn0.txt"
              copy "%LogFile%" "%TempFile%" >nul
              setlocal EnableDelayedExpansion
              (for /L %%I in (1,1,100) do set "Number=00%%I" & for /F "tokens=3 delims=:" %%J in ('%SystemRoot%\System32\find.exe /C "[Node !Number:~-3!]" "!TempFile!"') do if not "%%J" == " 1" (echo Node !Number:~-3! found%%J times) else echo Node !Number:~-3! found%%J time)>"!ResultsFile!"
              endlocal
              del "%TempFile%"
              uedit64.exe "%ResultsFile%"
              endlocal
              