Script to rearrange lines of a large report

    Jun 27, 2015 #1

    I wrote a JavaScript script to rearrange and concatenate lines in a text document in the following fashion:

    Code: Select all

    0
    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    To:

    Code: Select all

    0 4 8
    1 5 9
    2 6 10
    3 7 11
    The script does exactly what I want. However, it keeps crashing on a file of 58.1 MB. Any help?

    Code: Select all

    var strings; //variable which holds selection
    var stringArray = new Array(); //create array to hold string values
    var arrayLength = 0; //array length
    var lineTerminator = "\r\n"; //line terminator character you may need to change this depending on the file type
    var lineNumber = 8; //beginning line number
    var bottomLineNum; //last line in report
    
    //--- Find last line in document ---
    UltraEdit.activeDocument.bottom();
    bottomLineNum = UltraEdit.activeDocument.currentLineNum;
    
    //--- While starting line number < last line number---
    while(lineNumber < bottomLineNum)
    {
       //--- Move cursor to first line to be acted on, column 1----
       UltraEdit.activeDocument.gotoLine(lineNumber, 1);
    
       //--- Select the chunk of the text file to be fixed---
       UltraEdit.activeDocument.startSelect();
       for(var i = 0; i < 16; i++) {
          UltraEdit.activeDocument.key("DOWN ARROW");
       }
       UltraEdit.activeDocument.endSelect();
    
       //Get selection
       strings = UltraEdit.activeDocument.selection;
       //split string at line terminator characters
       stringArray = strings.split(lineTerminator);
       arrayLength = stringArray.length
    
       for (var x = 0; x < arrayLength/4; ++x) {
          UltraEdit.activeDocument.write(stringArray[x] + stringArray[x + 4] + stringArray[x + 8] + '\n');
       }
       UltraEdit.activeDocument.write('\n');
       lineNumber = lineNumber + 17;
    }


      Jun 27, 2015 #2

      This script can't work as expected.

      For example on a file containing

      Code: Select all

      Header line 1
      Header line 2
      Header line 3
      Header line 4
      Header line 5
      Header line 6
      Header line 7
      Data line 01
      Data line 02
      Data line 03
      Data line 04
      Data line 05
      Data line 06
      Data line 07
      Data line 08
      Data line 09
      Data line 10
      Data line 11
      Data line 12
      Data line 13
      Data line 14
      Data line 15
      Data line 16
      Data line 17
      Data line 18
      Data line 19
      Data line 20
      Data line 21
      Data line 22
      Data line 23
      
      the output of the script is

      Code: Select all

      Header line 1
      Header line 2
      Header line 3
      Header line 4
      Header line 5
      Header line 6
      Header line 7
      Data line 01Data line 05Data line 09
      Data line 02Data line 06Data line 10
      Data line 03Data line 07Data line 11
      Data line 04Data line 08Data line 12
      Data line 05Data line 09Data line 13
      
      Data line 17
      Data line 18
      Data line 19
      Data line 20
      Data line 21
      Data line 22
      Data line 23
      
      with the caret blinking at the beginning of the line containing Data line 17 after script execution has finished.

      Your script completely ignores that the number of lines in the file decreases while the script is reformatting it. Determining the number of lines once at the beginning and looping until the current line number is greater than or equal to that initial last line number is therefore a wrong loop condition.
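
      The mismatch can be made concrete with some simple bookkeeping. The sketch below is plain JavaScript and does not use the UltraEdit scripting API. It assumes that each pass replaces 16 selected lines with 6 written lines (5 joined lines plus 1 blank line), as in the output shown above; the function name trackLoop is made up for illustration.

```javascript
// Bookkeeping sketch only: models line counts, does not touch any document.
// Assumption: every pass replaces 16 selected lines with 6 written lines.
function trackLoop(initialLineCount, passes) {
  const LINES_SELECTED = 16; // lines selected per pass
  const LINES_WRITTEN = 6;   // 5 joined lines + 1 blank line
  const STEP = 17;           // lineNumber increment used by the original script
  const staleBottom = initialLineCount; // measured once, never updated
  let actualLineCount = initialLineCount;
  let lineNumber = 8;        // first data line, as in the original script
  const log = [];
  for (let pass = 1; pass <= passes; pass++) {
    actualLineCount -= LINES_SELECTED - LINES_WRITTEN;
    lineNumber += STEP;
    log.push({
      pass: pass,
      nextLineNumber: lineNumber,
      actualLineCount: actualLineCount,
      loopWouldContinue: lineNumber < staleBottom,
      targetBeyondEndOfFile: lineNumber > actualLineCount
    });
  }
  return log;
}

console.log(trackLoop(30, 1));
```

      Taking 30 as the line count of the example file, the next target after the first pass is line 25 while only 20 lines are left, yet the comparison against the stale value 30 still lets the loop continue.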

      On each loop run 16 lines are selected, but only 3 lines at a time are joined into one output line. With this algorithm, data lines 05 and 09 appear twice in the output and data lines 14, 15 and 16 are removed completely. I think this is not what you really want.
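
      The duplicated and dropped lines follow directly from the index arithmetic. Here is a minimal re-creation of the inner loop in plain JavaScript, without any UltraEdit calls; the function name originalJoin is hypothetical, and the input is assumed to be 16 lines that each end with a DOS line terminator.

```javascript
// Plain JavaScript re-creation of the original inner loop (no UltraEdit calls).
function originalJoin(selection, lineTerminator) {
  // 16 terminated lines split into 17 elements; the last one is an empty string.
  const parts = selection.split(lineTerminator);
  const rows = [];
  // 17 / 4 = 4.25, so x runs from 0 to 4: five rows instead of four.
  for (let x = 0; x < parts.length / 4; ++x) {
    rows.push(parts[x] + parts[x + 4] + parts[x + 8]);
  }
  return rows;
}

// Build "Data line 01" ... "Data line 16", each terminated with CRLF.
let selection = "";
for (let n = 1; n <= 16; n++) {
  selection += "Data line " + String(n).padStart(2, "0") + "\r\n";
}

const rows = originalJoin(selection, "\r\n");
console.log(rows.length); // 5
console.log(rows[4]);     // Data line 05Data line 09Data line 13
```

      The highest index ever read is 4 + 8 = 12, which is Data line 13, so the array elements holding Data line 14 to 16 are never written back.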

      The crash most likely occurs because of an uncaught out-of-memory situation. On a file of more than 50 MB the script makes thousands of small selections, and each selected string is loaded into memory. I'm not sure, but I think the JavaScript interpreter has a fixed maximum heap size that is not increased automatically. I don't know the actual limit, but I think it is quite low.

      It is not clear to me how many lines of a block should really be reformatted: 16 lines reformatted to 5 x 3 lines with 1 line replaced by a blank line, or just 12 lines reformatted to 4 x 3 lines as in your example.

      The script below uses three UltraEdit tagged regular expression Replace All commands to rearrange the data lines of the report. It would of course also be possible to use three Unix or Perl regular expression Replace All commands with capturing groups and back-references.

      Code: Select all

      if (UltraEdit.document.length > 0)  // Is any file opened?
      {
         // Define environment for this script.
         UltraEdit.insertMode();
         UltraEdit.columnModeOff();
      
         var nHeaderLinesCount = 7;  // Number of header lines to ignore.
      
         // Move caret to bottom of the active file.
         UltraEdit.activeDocument.bottom();
         var nLineCount = UltraEdit.activeDocument.currentLineNum;
      
         // Has the file more lines than number of header lines at all?
         if (nLineCount > nHeaderLinesCount)
         {
            // Make sure last line of file has also a line termination.
            if (UltraEdit.activeDocument.isColNumGt(1))
            {
               nLineCount++;
               UltraEdit.activeDocument.insertLine();
               if (UltraEdit.activeDocument.isColNumGt(1))
               {
                  UltraEdit.activeDocument.deleteToStartOfLine();
               }
            }
            if (((nLineCount - nHeaderLinesCount - 1) % 12) == 0)
            {
               // There are several methods to determine line terminator type. But
               // as caret is already at end of file after last line termination,
               // it is best to simply select the last line termination and get
               // it into a string variable.
               UltraEdit.activeDocument.startSelect();
               UltraEdit.activeDocument.key("UP ARROW");
               UltraEdit.activeDocument.key("END");
               UltraEdit.activeDocument.endSelect();
               var sLineTerm = UltraEdit.activeDocument.selection;
      
               // Define UltraEdit as regular expression engine and all parameters
               // to run 3 UE regular expression Replace All from current position
               // in file using tagged expressions to reformat the data lines.
               UltraEdit.ueReOn();
               UltraEdit.activeDocument.findReplace.mode=0;
               UltraEdit.activeDocument.findReplace.matchCase=false;
               UltraEdit.activeDocument.findReplace.matchWord=false;
               UltraEdit.activeDocument.findReplace.regExp=true;
               UltraEdit.activeDocument.findReplace.searchDown=true;
               UltraEdit.activeDocument.findReplace.searchInColumn=false;
               UltraEdit.activeDocument.findReplace.preserveCase=false;
               UltraEdit.activeDocument.findReplace.replaceAll=true;
               UltraEdit.activeDocument.findReplace.replaceInAllOpen=false;
      
               // Define the 3 UE regular expression search strings necessary for
               // reformatting the data lines of the report with DOS line terminator.
               var sSearch1 = "%^(?++^)^p^(?++^p?++^p?++^p^)^(?++^)^p^(?++^p?++^p?++^p^)^(?++^p^)^(?++^p?++^p?++^)$";
               var sSearch2 = "%^(?++^p?++^)^p^(?++^p?++^p^)^(?++^)^p^(?++^p?++^p^)^(?++^p^)^(?++^p?++^)$";
               var sSearch3 = "%^(?++^p?++^p?++^)^p^(?++^)^p^(?++^)^p^(?++^)^p^(?++^p^)^(?++^)$";
      
               var bMAC = false;
               if (sLineTerm == "\n")
               {
                  UltraEdit.outputWindow.write("Line terminator type is UNIX.");
                  sSearch1 = sSearch1.replace(/\^p/g,"^n");
                  sSearch2 = sSearch2.replace(/\^p/g,"^n");
                  sSearch3 = sSearch3.replace(/\^p/g,"^n");
               }
               else if (sLineTerm == "\r")
               {
                  // For a MAC file the replaces always failed even with using ^r
                  // instead of ^p. Therefore a MAC file is converted to DOS as
                  // workaround and after the replaces back to MAC.
                  UltraEdit.outputWindow.write("Line terminator type is MAC.");
                  UltraEdit.activeDocument.unixMacToDos();
                  bMAC = true;
               }
               else
               {
                  UltraEdit.outputWindow.write("Line terminator type is DOS.");
               }
      
               // Move caret now to beginning of first line after the header lines.
               UltraEdit.activeDocument.gotoLine(nHeaderLinesCount+1,1);
      
               // Run the 3 UltraEdit tagged regular expression Replace All.
               UltraEdit.activeDocument.findReplace.replace(sSearch1,"^1 ^3 ^5^2^4^6");
               UltraEdit.activeDocument.findReplace.replace(sSearch2,"^1 ^3 ^5^2^4^6");
               UltraEdit.activeDocument.findReplace.replace(sSearch3,"^1 ^3 ^5^2 ^4 ^6");
      
               if (bMAC)   // Convert a MAC file back from DOS to MAC.
               {
                  UltraEdit.activeDocument.dosToMac();
               }
               UltraEdit.activeDocument.top();
            }
            else
            {
               UltraEdit.outputWindow.clear();
               UltraEdit.outputWindow.showStatus=false;
               UltraEdit.outputWindow.write("Error: The number of data lines is not an exact multiple of 12.");
               UltraEdit.outputWindow.showWindow(true);
            }
         }
         else
         {
            UltraEdit.outputWindow.clear();
            UltraEdit.outputWindow.showStatus=false;
            UltraEdit.outputWindow.write("Error: The active file has fewer lines than the number defined for the header.");
            UltraEdit.outputWindow.showWindow(true);
         }
      }
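
      To make the three replaces easier to follow, here is a sketch that translates the three UltraEdit expressions into JavaScript RegExp objects with capturing groups and applies them to a plain string. It runs outside of UltraEdit and is not a drop-in replacement for the script above; the names pass1, pass2, pass3 and rearrange are made up, and DOS line terminators are assumed.

```javascript
// Regex source fragments (assumption: DOS line terminators).
const T = "\\r\\n"; // line terminator, UltraEdit: ^p
const L = ".*";     // content of one line, UltraEdit: ?++
const LT = L + T;   // one line including its terminator

// Pass 1 matches 12 lines and joins lines 1, 5 and 9 into the first line.
const pass1 = new RegExp(
  "^(" + L + ")" + T + "(" + LT + LT + LT + ")" +
  "(" + L + ")" + T + "(" + LT + LT + LT + ")" +
  "(" + LT + ")" + "(" + LT + LT + L + ")$", "gm");

// Pass 2 matches the remaining 10 lines and joins lines 2, 6 and 10.
const pass2 = new RegExp(
  "^(" + LT + L + ")" + T + "(" + LT + LT + ")" +
  "(" + L + ")" + T + "(" + LT + LT + ")" +
  "(" + LT + ")" + "(" + LT + L + ")$", "gm");

// Pass 3 matches the remaining 8 lines and joins 3, 7, 11 and 4, 8, 12.
const pass3 = new RegExp(
  "^(" + LT + LT + L + ")" + T + "(" + L + ")" + T +
  "(" + L + ")" + T + "(" + L + ")" + T +
  "(" + LT + ")" + "(" + L + ")$", "gm");

// dataText: the data lines only (header lines already excluded).
function rearrange(dataText) {
  return dataText
    .replace(pass1, "$1 $3 $5$2$4$6")
    .replace(pass2, "$1 $3 $5$2$4$6")
    .replace(pass3, "$1 $3 $5$2 $4 $6");
}

// Example input: 12 data lines with CRLF line terminators.
let sample = "";
for (let n = 1; n <= 12; n++) {
  sample += "Data line " + String(n).padStart(2, "0") + "\r\n";
}
console.log(rearrange(sample));
```

      The replacement strings are the same as in the script, with $n in place of ^n. In JavaScript the dot matches neither \r nor \n, so each .* stays within a single line.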
      
      With this script a file containing the lines:

      Code: Select all

      Header line 1
      Header line 2
      Header line 3
      Header line 4
      Header line 5
      Header line 6
      Header line 7
      Data line 01
      Data line 02
      Data line 03
      Data line 04
      Data line 05
      Data line 06
      Data line 07
      Data line 08
      Data line 09
      Data line 10
      Data line 11
      Data line 12
      Data line 13
      Data line 14
      Data line 15
      Data line 16
      Data line 17
      Data line 18
      Data line 19
      Data line 20
      Data line 21
      Data line 22
      Data line 23
      Data line 24
      Data line 25
      Data line 26
      Data line 27
      Data line 28
      Data line 29
      Data line 30
      Data line 31
      Data line 32
      Data line 33
      Data line 34
      Data line 35
      Data line 36
      
      is reformatted to:

      Code: Select all

      Header line 1
      Header line 2
      Header line 3
      Header line 4
      Header line 5
      Header line 6
      Header line 7
      Data line 01 Data line 05 Data line 09
      Data line 02 Data line 06 Data line 10
      Data line 03 Data line 07 Data line 11
      Data line 04 Data line 08 Data line 12
      Data line 13 Data line 17 Data line 21
      Data line 14 Data line 18 Data line 22
      Data line 15 Data line 19 Data line 23
      Data line 16 Data line 20 Data line 24
      Data line 25 Data line 29 Data line 33
      Data line 26 Data line 30 Data line 34
      Data line 27 Data line 31 Data line 35
      Data line 28 Data line 32 Data line 36
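
      The expected result can also be checked independently of UltraEdit with a small array-based sketch. It assumes the 12-line interpretation (4 rows of 3 columns per block) and that all lines fit into memory as an array, so it is meant for verifying small samples, not as a fix for the crash on the large file. The function name rearrangeBlocks and its parameters are hypothetical.

```javascript
// Rearranges data lines in blocks of rows * cols lines.
// Header lines are copied unchanged.
function rearrangeBlocks(lines, headerCount, rows, cols) {
  const header = lines.slice(0, headerCount);
  const data = lines.slice(headerCount);
  const blockSize = rows * cols;
  if (data.length % blockSize !== 0) {
    throw new Error("The number of data lines is not an exact multiple of " + blockSize + ".");
  }
  const out = header.slice();
  for (let start = 0; start < data.length; start += blockSize) {
    for (let r = 0; r < rows; r++) {
      const fields = [];
      // Column c of row r comes from line start + c * rows + r of the block.
      for (let c = 0; c < cols; c++) {
        fields.push(data[start + c * rows + r]);
      }
      out.push(fields.join(" "));
    }
  }
  return out;
}

// Example: 7 header lines followed by 36 data lines.
const lines = [];
for (let h = 1; h <= 7; h++) lines.push("Header line " + h);
for (let n = 1; n <= 36; n++) lines.push("Data line " + String(n).padStart(2, "0"));
console.log(rearrangeBlocks(lines, 7, 4, 3).join("\n"));
```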
      