Parse and process tab-delimited text file

    Feb 14, 2013 #1

    Disclaimer
    Please accept my apologies if this is a question that has been asked before. I've checked the forums, but could not find a solution or thread related to my issue.

    I have a text file in the following format:
    • table_name_01 [tab] column_name_01
      table_name_01 [tab] column_name_02
      table_name_01 [tab] column_name_03
      ...
      ...
      ...
      table_name_99 [tab] column_name_99
    Here is what I am attempting to do:
    • read each tab-delimited line
      create a new document based on the table_name
      put the column_name in the table_name document
      read the next line
      if a document of table_name already exists, append the column_name
      if a document of table_name does not exist, create a new document based on the table_name
      put the column_name in the table_name document
    do this until the end of file

    I am brand new to UE scripting, and not very familiar with JS scripting. Any and all help is greatly appreciated.

    Thank you in advance.

      Feb 15, 2013 #2

      Which file extensions should the files have?

      After the script has finished, should the file table_name_01 contain

      column_name_01[tab]column_name_02[tab]column_name_03

      or

      column_name_01
      column_name_02
      column_name_03


      Can it be assumed that no table name contains a character that is not valid in a file name, such as : / \ < > etc., or must the script check for such characters and replace them with an underscore or another character?

      Last, how large is the input file? A few KB, a few MB, several hundred MB, or even GB? That makes a big difference to the implementation.
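
If table names could contain such characters, the replacement step might look like the minimal sketch below. The function name sanitizeFileName and the default underscore replacement are assumptions made for illustration; the character set is the one Windows reserves in file names. It is plain JavaScript and does not depend on the UltraEdit object.

```javascript
// Hypothetical helper (not from the thread): replace characters that are
// not allowed in Windows file names with a replacement character.
function sanitizeFileName(name, replacement) {
  if (replacement === undefined) replacement = "_";
  // Windows reserves < > : " / \ | ? * and the control characters 0x00-0x1F.
  return name.replace(/[<>:"\/\\|?*\x00-\x1F]/g, replacement);
}

// Example usage with made-up table names:
var safeName = sanitizeFileName("dbo:orders/2013");   // "dbo_orders_2013"
var unchanged = sanitizeFileName("table_name_01");    // "table_name_01"
```

A script could pass each table name through such a helper before building the path for the save command.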

        Feb 15, 2013 #3

        For small files, use this script:

        if (UltraEdit.document.length > 0) {
           // Define the environment for the script.
           UltraEdit.insertMode();
           if (typeof(UltraEdit.columnModeOff) == "function") UltraEdit.columnModeOff();
           else if (typeof(UltraEdit.activeDocument.columnModeOff) == "function") UltraEdit.activeDocument.columnModeOff();
           UltraEdit.activeDocument.hexOff();
           // Select all and load the file contents into an array of lines.
           UltraEdit.activeDocument.selectAll();
           var allTables = {};  // Maps each table name to an array of its column names.
           if (UltraEdit.activeDocument.isSel()) {
              // Splitting on "\r\n" assumes DOS/Windows line terminators.
              var asLines = UltraEdit.activeDocument.selection.split("\r\n");
              UltraEdit.activeDocument.top();  // Discards the selection.
              for (var nLineNum = 0; nLineNum < asLines.length; nLineNum++) {
                 var asLineVals = asLines[nLineNum].split("\t");
                 // Ignore lines that are not exactly "table_name[tab]column_name".
                 if (asLineVals.length == 2) {
                    if (allTables[asLineVals[0]] == null) {
                       // First occurrence of this table name.
                       allTables[asLineVals[0]] = [];
                    }
                    allTables[asLineVals[0]].push(asLineVals[1]);
                 }
              }
              // Now write one file per table.
              for (var _table in allTables) {
                 UltraEdit.newFile();
                 var colSep = "\t"; //may change it to \r\n
                 // Write every column except the last, each followed by the separator ...
                 for (var _tableCol = 0; _tableCol < allTables[_table].length - 1; _tableCol++) {
                    UltraEdit.activeDocument.write(allTables[_table][_tableCol] + colSep);
                 }
                 // ... then the last column without a trailing separator.
                 UltraEdit.activeDocument.write(allTables[_table][_tableCol]);
                 UltraEdit.saveAs("c:\\temp\\" + _table + ".txt");
                 // Save mode 2 closes the file without saving it again.
                 UltraEdit.closeFile("c:\\temp\\" + _table + ".txt", 2);
              }
           }
        }
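
The grouping step of the script can be tried outside UltraEdit as a plain JavaScript function. The sketch below is an illustration under assumptions, not part of the posted script: the name groupColumnsByTable and the returned object shape are hypothetical, and unlike the script it accepts DOS, Unix, and old Mac line terminators.

```javascript
// Hypothetical stand-alone version of the grouping logic.
// Input: text with one "table_name<TAB>column_name" pair per line.
// Output: the table names in order of first appearance plus a map
// from each table name to its column names.
function groupColumnsByTable(text) {
  var tables = {};   // table name -> array of column names
  var order = [];    // table names in order of first appearance
  var lines = text.split(/\r\n|\r|\n/);
  for (var i = 0; i < lines.length; i++) {
    var fields = lines[i].split("\t");
    if (fields.length !== 2) continue;   // ignore blank or malformed lines
    var table = fields[0];
    if (!Object.prototype.hasOwnProperty.call(tables, table)) {
      tables[table] = [];
      order.push(table);
    }
    tables[table].push(fields[1]);
  }
  return { tables: tables, order: order };
}

// Example usage with made-up data:
var sample = "table_name_01\tcolumn_name_01\r\n" +
             "table_name_01\tcolumn_name_02\r\n" +
             "table_name_02\tcolumn_name_01\r\n";
var grouped = groupColumnsByTable(sample);
// grouped.order is ["table_name_01", "table_name_02"]
// grouped.tables["table_name_01"] is ["column_name_01", "column_name_02"]
```

Keeping the parsing separate from the UltraEdit calls makes it possible to check the grouping on sample data before any files are written.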
        

          Feb 15, 2013 #4

          Thank you for your reply.
          Mofi wrote: Which file extensions should the files have?
          A simple .TXT or .SQL extension would suffice.

          Mofi wrote: After the script has finished, should the file table_name_01 contain

          column_name_01[tab]column_name_02[tab]column_name_03

          or

          column_name_01
          column_name_02
          column_name_03

          It should contain the following:
          column_name_01
          column_name_02
          column_name_03

          Mofi wrote: Can it be assumed that no table name contains a character that is not valid in a file name, such as : / \ < > etc., or must the script check for such characters and replace them with an underscore or another character?
          This would be a correct assumption, for both table names and field names.

          Mofi wrote: Last, how large is the input file? A few KB, a few MB, several hundred MB, or even GB? That makes a big difference to the implementation.
          I can't envision the input file ever being more than 5 MB in size. The input file is manually created (for now) by copying specific columns from an Excel spreadsheet and pasting them into a text file.

            Feb 15, 2013 #5

            Thank you so very much for this, Jaretin. With the slight change of var colSep = "\t"; to var colSep = "\r\n"; it is exactly what I was looking for.
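
The separator change works because the script writes the separator only between column names and never after the last one. In plain JavaScript, Array.prototype.join behaves the same way, as the small sketch below suggests; formatColumns is a hypothetical name used for illustration and is not part of the posted script.

```javascript
// Hypothetical helper: build the text of one table file from its column names.
// join() places the separator between items only, so there is no trailing separator.
function formatColumns(columns, separator) {
  return columns.join(separator);
}

var columns = ["column_name_01", "column_name_02", "column_name_03"];
var tabSeparated = formatColumns(columns, "\t");     // all columns on one line
var onePerLine = formatColumns(columns, "\r\n");     // one column per line
```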