Split large 2 GB text files

Newbie

    Sep 18, 2006#1

    Hi All

    I have a large text file (which I receive once a month) that is over 2 GB and contains about 30 million names and addresses.

    I need to split the file into, say, 5 chunks by line number so that I can import it into my database app. (I normally do this manually.)

    I know there is software out there to do this, but if I can do it for free and learn something in the process, I'll at least try.

    I've not used macros in UE yet, so if someone can point me in the right direction, that would be great.

    Cheers

      Re: Split Large 2 Gig text files

      Sep 19, 2006#2

      Hi

      I've been fiddling but can't get it to work. Here's what I've done so far on a 3 million line test file:

      InsertMode
      ColumnModeOff
      HexOff
      UnixReOff
      GotoLine 1000000
      SelectToBottom
      Delete
      SaveAs "1-1000000.txt"
      CloseFile
      Open "Test.txt"
      GotoLine 1000000
      SelectToTop
      Delete
      GotoLine 1000001
      SelectToBottom
      Delete
      SaveAs "1000000-2000000.txt"
      CloseFile
      Open "Test.txt"
      GotoLine 2000000
      SelectToTop
      Delete
      SaveAs "2000000-3000000.txt"
      CloseFile


      I've done it this way because my clipboard can't hold much.

      It runs up until the first Open "Test.txt" and then falls over.

      Please help

      Grand Master

        Sep 19, 2006#3

        Your macro code looks quite good. Because I don't have a 2 GB text file, I can only guess what the problem is: the synchronization between the end of opening the extremely large file and the continuation with the next macro command. Maybe the command Top after every Open helps, perhaps inside a loop with a defined count to create a wait loop.

        Hopefully you have also chosen the option Open file without temp file but NO prompt (CAUTION: Edits are permanent, decreases load time for large files) at Configuration - File Handling - Temporary Files, which you will definitely need here.

        Normally it is also good, when editing such large files, to check the configuration option Disable line numbers at Configuration - Editor Display - Miscellaneous to decrease the loading time. But you can't do that here because you need the line numbers, so UE has to scan the whole file for CRLF.
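To illustrate why line numbers cost load time: finding where line N starts means scanning every byte before it for line terminators. Below is a minimal Python sketch of that scan, plus the arithmetic for the chunk size. It is not how UltraEdit does it internally; the function names `count_lines` and `lines_per_chunk` and the 1 MB block size are assumptions made for this example.

```python
def count_lines(path, block_size=1024 * 1024):
    """Count lines by scanning the file in fixed-size blocks for line feeds."""
    count = 0
    last_byte = b""
    with open(path, "rb") as f:
        while True:
            block = f.read(block_size)
            if not block:
                break
            # CRLF and LF terminated lines both end with a line feed.
            count += block.count(b"\n")
            last_byte = block[-1:]
    # A final line without a trailing line feed still counts as a line.
    if last_byte and last_byte != b"\n":
        count += 1
    return count


def lines_per_chunk(total_lines, chunks):
    """Smallest chunk size that covers all lines in `chunks` pieces (ceiling division)."""
    return -(-total_lines // chunks)
```

For the file described in the first post, `lines_per_chunk(30_000_000, 5)` gives 6,000,000 lines per chunk.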

          Nov 03, 2006#4

          A second suggestion: insert the following after every Open command:

          Bottom
          GetValue "Enter 0 when bottom of file reached!"
          Key BACKSPACE

          Maybe this user interaction helps UE.


          Third suggestion: make the split manually by using Edit - Select Range and File - Save Selection As.


          Fourth suggestion: check the configuration option Disable line numbers and, instead of GotoLine, use the following macro code:

          Loop 1000000
          Key DOWN ARROW
          EndLoop

          Very slow, but it may work.
          Best regards from an UC/UE/UES for Windows user from Austria

          Newbie

            Nov 07, 2006#5

            Hi Mofi

            Thanks for your help

            Is there not a SaveSelectionAs "filename.txt" command or workaround?


            tx

            Grand Master

              Nov 07, 2006#6

              No, there is no SaveSelectionAs command, because in a macro the command sequence

              Copy
              NewFile
              Paste
              SaveAs "filename"

              normally does the same. But in your case it will fail with hundreds of MB on the clipboard.

              Well, you could create a submacro which does, for example, the following:

              IfEof
              Else
              StartSelect
              Loop 200
              Key PGDN
              EndLoop
              Copy
              EndSelect
              PreviousWindow
              Paste
              NextWindow
              EndIf

              And in your main macro you repeat the following sequence as often as needed:

              NewFile
              NextWindow
              PlayMacro 500 "Case sensitive name of already existing submacro"
              PreviousWindow
              SaveAs "filename01.txt"
              CloseFile

              But I don't understand why you do not use one of the dozens of free file splitting tools available on the web, which would do this job much faster than an editor ever can, because a file splitting tool does not have to display the content of the huge file.
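As a rough illustration of why a dedicated splitter is faster, here is a minimal Python sketch that streams the file line by line and never holds more than one line in memory. It is only a sketch, not one of the tools mentioned above; the function name `split_by_lines` and the output naming scheme (source name plus a two-digit counter, in the style of "filename01.txt") are assumptions made for this example.

```python
from pathlib import Path


def split_by_lines(source, lines_per_chunk, out_dir="."):
    """Split `source` into numbered files of at most `lines_per_chunk` lines each.

    Returns the list of chunk paths that were written.
    """
    if lines_per_chunk < 1:
        raise ValueError("lines_per_chunk must be at least 1")
    source = Path(source)
    out_dir = Path(out_dir)
    written = []
    out = None
    try:
        # Binary mode keeps CRLF line endings and the text encoding untouched.
        with source.open("rb") as src:
            for index, line in enumerate(src):
                # Start a new chunk file every `lines_per_chunk` lines.
                if index % lines_per_chunk == 0:
                    if out is not None:
                        out.close()
                    name = f"{source.stem}{len(written) + 1:02d}{source.suffix}"
                    chunk_path = out_dir / name
                    out = chunk_path.open("wb")
                    written.append(chunk_path)
                out.write(line)
    finally:
        if out is not None:
            out.close()
    return written
```

For the 3 million line test file from the earlier post, `split_by_lines("Test.txt", 1000000)` would write Test01.txt, Test02.txt and Test03.txt next to the script's working directory.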

              Newbie

                Nov 08, 2006#7

                Thanks Mofi

                Yep, you're right. I'll just buy the software or do it manually; it only takes about 10 minutes, but I thought it would be interesting to see if it was possible in UE.

                Thanks for your help


                Stu