Goto a midway point in large file

Newbie

    Aug 21, 2013 #1

    We got stuck with several very large (> 400 MB) XML files that we have to split in half. Unfortunately the files contain no line feeds, so everything is on line 1. I tried going to the end of the file, taking the column number there, dividing it by 2, and then requesting a jump to that column number, but it consistently takes me only to some point in the first wrapped line.

    There are no anchor words to use, since these are XML files with as many as 20,000 repeated tags. Scrolling is painfully slow due to the sheer size of the file.

    Is there something else I can do to get to the midway point of a large file so I can insert a page break? Once I have a page break I can easily break up the files as necessary; it is just getting to the initial midway point that is the problem.

    Thanks in advance,
    Pam

    Grand Master

      Aug 22, 2013 #2

      Usually the command XML Convert to CR/LFs in menu Format is used to convert an XML file without line breaks and proper indentation into a well-formatted, readable XML file. I have never executed it on such a large XML file, but it is worth trying.
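To illustrate the general idea of adding line breaks to single-line XML, here is a minimal sketch in plain JavaScript. It is not how UltraEdit implements the command and it does no indentation; the function name `breakBetweenTags` is hypothetical, and the sketch assumes the content is available as a string and that the sequence `><` only occurs between tags (it would also be changed inside comments or CDATA sections).

```javascript
// Hypothetical helper (not part of the UltraEdit scripting API):
// inserts a line feed between every pair of directly adjacent tags,
// i.e. every "><" becomes ">\n<". No indentation is applied.
function breakBetweenTags(xml) {
   return xml.replace(/></g, ">\n<");
}

// Example: a tiny single-line document gets one tag boundary per line break.
console.log(breakBetweenTags("<a><b>x</b></a>"));
```

Once the file has line breaks, the middle line number can be used as the split point.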

      Here is a little UltraEdit script which positions the caret in the middle of a file:

      if (UltraEdit.document.length > 0)  // Is any file opened?
      {
         // Get size of file in bytes.
         var nFileSize = UltraEdit.activeDocument.fileSize;
         // Calculate half of file size taking into account that JavaScript
         // creates a float result instead of an integer result when file
         // size is odd which must be avoided as result must be an integer.
         var nHalfSize = (nFileSize % 2) ? (nFileSize+1)/2 : nFileSize/2;
         UltraEdit.activeDocument.gotoPos(nHalfSize);
      }
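A plain midpoint can fall in the middle of a tag. As a rough sketch of how a split position could be moved to a tag boundary, the following plain JavaScript (not using the UltraEdit scripting API) searches forward from the midpoint for the next `>`. The function name `findSplitOffset` is hypothetical, and the sketch assumes the content is available as a string and that `>` only appears as the end of a tag.

```javascript
// Hypothetical helper: returns an offset at or after the midpoint of
// `text` that sits immediately after a '>' character, so that a split
// at this offset does not cut a tag in two. Falls back to the plain
// midpoint if no '>' follows the midpoint.
function findSplitOffset(text) {
   var nHalf = Math.ceil(text.length / 2);   // integer midpoint
   var nTagEnd = text.indexOf(">", nHalf);   // next tag end after midpoint
   return (nTagEnd < 0) ? nHalf : nTagEnd + 1;
}

var sample = "<a><b>one</b><b>two</b></a>";
var nSplit = findSplitOffset(sample);
console.log(sample.substring(0, nSplit));    // first part
console.log(sample.substring(nSplit));       // second part
```

Splitting after a tag end does not make each half a well-formed XML document on its own; it only avoids breaking a tag apart.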

      Newbie

        Aug 22, 2013 #3

        Yes, you came to the same conclusion that I did. BTW, reformatting a 350 MB file takes about 8 minutes. Once it is reformatted, I can find the middle line number and obtain an anchor value. The script is great; when I get a chance I will add it. Thanks for your input and help.
        Pam