How to split a file with a script based on a specified condition



    Oct 07, 2008#1

    I am trying to figure out a way to split a large file into smaller files. The main file can be 7+ MB in size, and I need to break it into files of roughly 1 or 2 MB. The problem is that the file needs to be broken up at specific points (from the start of an XML tag to the end of one). For example, a start element in the file looks like this: <CALL>9148933855122|110|||1| and an end element looks like this: </CALL><CALL>9148933854616|110|||1| Notice that the end tag is on the same line as the start of the next element.

    Each start-to-end element is about 100 KB, so 10 would be about 1 MB. I would like to process the file and create files with 10-20 calls per file, naming each file CALLnn.inserterror, where nn is the file number. Does anybody know how to do this?


      Oct 08, 2008#2

      That could be done with a set of macros, and some similar examples have already been posted. You can find these posts with the advanced forum search (see the top right corner of this page): enter split file, select the macro forum, and run the search.

      I could also develop the macros you need because it is quite simple. But before I do this, can you tell us which version of UltraEdit you use, and therefore whether a script could be used instead of macros? Your task requires a variable (the counter) and nested loops. Macros support neither variables nor nested loops, which is why I wrote about a set of macros. With a script the task is pretty easy to implement.
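
To illustrate the counter and the nested loops, here is a minimal sketch in plain JavaScript that works on a string instead of an open document. It uses no UltraEdit objects, and the function name `splitAfterOccurrences` and its parameters are hypothetical names chosen only for this illustration. It assumes the split point is directly after each found split string.

```javascript
// Plain JavaScript sketch (no UltraEdit objects). All names are hypothetical.
// Splits text after every perFile occurrences of splitString.
function splitAfterOccurrences(text, splitString, perFile) {
   if (perFile < 1 || splitString.length === 0) {
      throw new Error("perFile must be >= 1 and splitString must not be empty");
   }
   var chunks = [];
   var start = 0;                      // Start offset of the current chunk.
   while (start < text.length) {       // Outer loop: one pass per output chunk.
      var found = 0;                   // The counter a macro cannot hold.
      var pos = start;
      while (found < perFile) {        // Inner loop: count the split strings.
         var hit = text.indexOf(splitString, pos);
         if (hit < 0) {                // No further split string: take the rest.
            pos = text.length;
            break;
         }
         pos = hit + splitString.length;  // Split directly after the match.
         found++;
      }
      chunks.push(text.substring(start, pos));
      start = pos;
   }
   return chunks;
}
```

Called with a text containing three calls, the split string "</CALL>" and 2 as the count, the function should return two chunks: the first with two calls and the second with the remaining one.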
      Best regards from an UC/UE/UES for Windows user from Austria


        Oct 08, 2008#3

        Thank you, a script would be fine if you could help with this. I am using UltraEdit-32 Professional Text/Hex Editor Version 13.20a. I appreciate any help you can give. Thanks.


          Oct 10, 2008#4

          Here is a script that splits a file after x occurrences of a specified string into several numbered files. I have tested the script on a small example file and hope it also works for your much larger file. I developed and tested the script with UE v13.20a+1.

          Code:

          var FoundsPerFile = 20;      // Global setting for number of found split strings per file.
          var SplitString = "</CALL";  // String where to split. The split occurs after next character.
          
          /* Find the tab index of the active document */
          // Copied from https://forums.ultraedit.com/viewtopic.php?f=52&t=4571
          function getActiveDocumentIndex () {
             var tabindex = -1; /* start value */
          
             for (var i = 0; i < UltraEdit.document.length; i++)
             {
                if (UltraEdit.activeDocument.path==UltraEdit.document[i].path) {
                   tabindex = i;
                   break;
                }
             }
             return tabindex;
          }
          
          if (UltraEdit.document.length) { // Is any file open?
             // Set working environment required for this job.
             UltraEdit.insertMode();
             if (typeof(UltraEdit.columnModeOff) == "function") UltraEdit.columnModeOff();
             else if (typeof(UltraEdit.activeDocument.columnModeOff) == "function") UltraEdit.activeDocument.columnModeOff();
             UltraEdit.activeDocument.hexOff();
             UltraEdit.ueReOn();
          
             // Move cursor to top of active file and run the initial search.
             UltraEdit.activeDocument.top();
             UltraEdit.activeDocument.findReplace.searchDown=true;
             UltraEdit.activeDocument.findReplace.matchCase=true;
             UltraEdit.activeDocument.findReplace.matchWord=false;
             UltraEdit.activeDocument.findReplace.regExp=false;
             // If the string to split is not found in this file, do nothing.
             if (UltraEdit.activeDocument.findReplace.find(SplitString)) {
                // This file is probably the correct file for this script.
                var FileNumber = 1;    // Counts the number of saved files.
                var StringsFound = 1;  // Counts the number of found split strings.
                var NewFileIndex = UltraEdit.document.length;
                /* Get the path of the current file to save the new
                   files in the same directory as the current file. */
                var SavePath = "";
                var LastBackSlash = UltraEdit.activeDocument.path.lastIndexOf("\\");
                if (LastBackSlash >= 0) {
                   LastBackSlash++;
                   SavePath = UltraEdit.activeDocument.path.substring(0,LastBackSlash);
                }
                /* Get the active file index in case more than 1 file is open and the
                   current file does not get the focus back after closing the new files. */
                var FileToSplit = getActiveDocumentIndex();
                // Always use clipboard 9 for this script and not the Windows clipboard.
                UltraEdit.selectClipboard(9);
                // Split the file after every x found split strings until source file is empty.
                while (1) {
                   while (StringsFound < FoundsPerFile) {
                      if (UltraEdit.document[FileToSplit].findReplace.find(SplitString)) StringsFound++;
                      else {
                         UltraEdit.document[FileToSplit].bottom();
                         break;
                      }
                   }
                   // End the selection of the find command.
                   UltraEdit.document[FileToSplit].endSelect();
                   // Move the cursor right to include the next character and unselect the found string.
                   UltraEdit.document[FileToSplit].key("RIGHT ARROW");
                   // Select from this cursor position everything to top of the file.
                   UltraEdit.document[FileToSplit].selectToTop();
                   // Is the file not already empty?
                   if (UltraEdit.document[FileToSplit].isSel()) {
                      // Cut the selection and paste it into a new file.
                      UltraEdit.document[FileToSplit].cut();
                      UltraEdit.newFile();
                      UltraEdit.document[NewFileIndex].setActive();
                      UltraEdit.activeDocument.paste();
                      /* Add line termination on the last line and remove automatically added indent
                         spaces/tabs if auto-indent is enabled if the last line is not already terminated. */
                      if (UltraEdit.activeDocument.isColNumGt(1)) {
                         UltraEdit.activeDocument.insertLine();
                         if (UltraEdit.activeDocument.isColNumGt(1)) {
                            UltraEdit.activeDocument.deleteToStartOfLine();
                         }
                      }
                      // Build the file name for this new file.
                      var SaveFileName = SavePath + "CALL";
                      if (FileNumber < 10) SaveFileName += "0";
                      SaveFileName += String(FileNumber) + ".inserterror";
                      // Save the new file and close it.
                      UltraEdit.saveAs(SaveFileName);
                      UltraEdit.closeFile(SaveFileName,2);
                      FileNumber++;
                      StringsFound = 0;
                      /* Delete the line termination in the source file
                         if last found split string was at end of a line. */
                      UltraEdit.document[FileToSplit].endSelect();
                      UltraEdit.document[FileToSplit].key("END");
                      if (UltraEdit.document[FileToSplit].isColNumGt(1)) {
                         UltraEdit.document[FileToSplit].top();
                      } else {
                         UltraEdit.document[FileToSplit].deleteLine();
                      }
                   } else break;
                }  // Loop executed until source file is empty!
          
                // Close source file without saving and re-open it.
                var NameOfFileToSplit = UltraEdit.document[FileToSplit].path;
                UltraEdit.closeFile(NameOfFileToSplit,2);
                /* The following code line could be commented if the source
                   file is not needed anymore for further actions. */
                UltraEdit.open(NameOfFileToSplit);
          
                // Free memory and switch back to Windows clipboard.
                UltraEdit.clearClipboard();
                UltraEdit.selectClipboard(0);
             }
          }
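
The path and file name handling of the script above can also be tried outside UltraEdit. The following is only a sketch in plain JavaScript: the helper names `getDirectory` and `buildSplitFileName` are hypothetical and not part of the UltraEdit scripting API, and the logic simply mirrors the `lastIndexOf("\\")` and `FileNumber < 10` parts of the script.

```javascript
// Hypothetical helpers mirroring the naming part of the script above.

// Returns the directory part of a Windows path including the trailing
// backslash, or an empty string if the path contains no backslash.
function getDirectory(fullPath) {
   var lastBackSlash = fullPath.lastIndexOf("\\");
   return (lastBackSlash >= 0) ? fullPath.substring(0, lastBackSlash + 1) : "";
}

// Builds CALLnn.inserterror with a leading zero for file numbers below 10.
function buildSplitFileName(savePath, fileNumber) {
   var name = savePath + "CALL";
   if (fileNumber < 10) name += "0";
   return name + String(fileNumber) + ".inserterror";
}
```

As in the script, only numbers below 10 get a leading zero, so file number 100 and above would simply produce a three-digit number in the name.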


            Nov 02, 2008#5

            Thanks, this worked great.


              Dec 28, 2011#6

              I am a long-time lurker (I thought I registered years ago).
              But MOFI, you are beyond awesome. Every time I come back here, year after year, you are writing such wonderful scripts for everyone.

              You go way beyond pointing them to a similar script and saying, "Well, take this or that and figure it out," and usually end up writing their exact script for them.
              I really hope people appreciate that.


                Dec 28, 2011#7

                Thanks for your praise of my support here. I think it is better to give a questioner exactly what was asked for than to give just hints about what could be done. In my view, it is easier for a beginner to learn from a well-written and commented script than to take the first steps the hard way by working something out from just a few hints. This saves the questioner's time and also mine and that of other experienced forum members, because normally there is no second, third, ... question until the task is really done, as often happens when only hints are given.