Tapatalk

Batch Convert to Character Delimited

Batch Convert to Character Delimited

3
NewbieNewbie
3

    Sep 12, 2014#1

    Hello

    Is there a way to convert several fixed column files to Character delimited at the same time with the same column size parameters?

    Thank you

    6,685587
    Grand MasterGrand Master
    6,685587

      Sep 13, 2014#2

      The commands Convert to Fixed Column and Convert to Character Delimited are designed for manual usage only. They are not available as macro or scripting command.

      But fixed column files can be often also converted to character delimited using 1 or more Perl regular expression replaces using backreferences.

      Therefore it could be possible to batch convert fixed column files to character delimited using an UltraEdit macro/script which contains a set of Perl regular expression replaces on all open files or all files in a folder.

      Post the field widths of the data in the fixed column files, and which separator to use for character delimited file, and also an example line as code block, if you need help on the Perl regular expression search and replace strings for this batch conversion task.
      Best regards from an UC/UE/UES for Windows user from Austria

      3
      NewbieNewbie
      3

        Sep 15, 2014#3

        Hello Mofi,
        thanks for the reply. I would appreciate your help with the Perl regular expressions as I am not familiar with them, though I have used UltraEdit's extensively.

        The files to convert have 34 columns with the following sizes:

        2,5,2,2,4,1,20,30,60,8,7,20,10,20,10,20,10,20,10,20,10,20,10,15,15,60,60,2,40,1,30,3,3,8

        The separator would be a comma and I am attaching a sample file (deleted later).

        Again, thanks for your help.

        6,685587
        Grand MasterGrand Master
        6,685587

          Sep 15, 2014#4

          Here is an UltraEdit script using Perl regular expression to convert your fixed column files to CSV files.

          The directory C:\\Temp\\ and the file type *.txt must be perhaps adapted.

          Code: Select all

          /* Script to convert a file with fixed columns widths (line lengths) to CSV.
          
             Each line has a fixed length of 558 characters with 34 data columns.
          
             The field widths of the columns are:
          
             Column: 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34
             Width:  2, 5, 2, 2, 4, 1,20,30,60, 8, 7,20,10,20,10,20,10,20,10,20,10,20,10,15,15,60,60, 2,40, 1,30, 3, 3, 8
          
             How to get the values for first capturing group in the lines below?
          
             1. Copy the width values into a new file.
             2. Replace all commas by \r\n respectively ^p.
             3. Enable column mode with Alt+C or Column - Column Mode.
             4. Select the 2 columns for first n rows.
             5. Use Column - Sum Columns/Selection to get the field widths sum.
          */
          
          // Replace comma by ; or \\t (not \t) for a semi-colon or tab as separator.
          var sSeparator = ",";
          
          // Conversion to CSV is done on all files in a directory
          // using several Perl regular expression Replace in Files.
          UltraEdit.perlReOn();
          UltraEdit.frInFiles.regExp=true;
          UltraEdit.frInFiles.matchCase=true;
          UltraEdit.frInFiles.matchWord=false;
          UltraEdit.frInFiles.filesToSearch=0;
          UltraEdit.frInFiles.directoryStart="C:\\Temp\\";
          UltraEdit.frInFiles.searchInFilesTypes="*.txt";
          UltraEdit.frInFiles.openMatchingFiles=false;
          UltraEdit.frInFiles.ignoreHiddenSubs=false;
          UltraEdit.frInFiles.unicodeSearch=false;
          UltraEdit.frInFiles.preserveCase=false;
          UltraEdit.frInFiles.searchSubs=false;
          UltraEdit.frInFiles.logChanges=true;
          
          UltraEdit.outputWindow.write("Inserting the temporary separators ...");
          
          // Up to 9 capturing groups can be used in one replace. Therefore several
          // replaces using backreferences are necessary to insert from right to
          // left the separator.
          
          // It is not possible to insert directly commas (or semi-colons, tabs) as
          // separator in case of some field values contain the separator. Therefore
          // a different separator is used which most likely does not exist in the
          // fixed column file - the vertical tab (hexadecimal code value 0x0B).
          
          // The simple approach uses the values of the field widths.
          // UltraEdit.frInFiles.replace("^(.{411})(.{60})(.{2})(.{40})(.)(.{30})(.{3})(.{3})(.{8})",         "\\1\\v\\2\\v\\3\\v\\4\\v\\5\\v\\6\\v\\7\\v\\8\\v\\9");
          // UltraEdit.frInFiles.replace("^(.{251})(.{10})(.{20})(.{10})(.{20})(.{10})(.{15})(.{15})(.{60})", "\\1\\v\\2\\v\\3\\v\\4\\v\\5\\v\\6\\v\\7\\v\\8\\v\\9");
          // UltraEdit.frInFiles.replace("^(.{134})(.{7})(.{20})(.{10})(.{20})(.{10})(.{20})(.{10})(.{20})",  "\\1\\v\\2\\v\\3\\v\\4\\v\\5\\v\\6\\v\\7\\v\\8\\v\\9");
          // UltraEdit.frInFiles.replace("^(.{7})(.{2})(.{2})(.{4})(.)(.{20})(.{30})(.{60})(.{8})",           "\\1\\v\\2\\v\\3\\v\\4\\v\\5\\v\\6\\v\\7\\v\\8\\v\\9");
          // UltraEdit.frInFiles.replace("^(.{2})", "\\1\\v");
          
          // More efficient is the approach below as last capturing group in above
          // replaces does not change anything. But the calculation of the field
          // widths is more difficult with this solution.
          UltraEdit.frInFiles.replace("^(.{351})(.{60})(.{60})(.{2})(.{40})(.)(.{30})(.{3})(.{3})"       /* 8*/, "\\1\\v\\2\\v\\3\\v\\4\\v\\5\\v\\6\\v\\7\\v\\8\\v\\9\\v");
          UltraEdit.frInFiles.replace("^(.{221})(.{10})(.{20})(.{10})(.{20})(.{10})(.{20})(.{10})(.{15})"/*15*/, "\\1\\v\\2\\v\\3\\v\\4\\v\\5\\v\\6\\v\\7\\v\\8\\v\\9\\v");
          UltraEdit.frInFiles.replace("^(.{36})(.{30})(.{60})(.{8})(.{7})(.{20})(.{10})(.{20})(.{10})"   /*20*/, "\\1\\v\\2\\v\\3\\v\\4\\v\\5\\v\\6\\v\\7\\v\\8\\v\\9\\v");
          UltraEdit.frInFiles.replace("^(.{2})(.{5})(.{2})(.{2})(.{4})(.)"                               /*20*/,"\\1\\v\\2\\v\\3\\v\\4\\v\\5\\v\\6\\v");
          
          /* Remove spaces at beginning and end of a field value by
          
             a) removing all spaces at beginning of a line,
             b) removing spaces after separator using a positive lookbehind,
             c) removing spaces before separator using a positive lookahead,
             d) removing spaces at end of a line.
          
          Note: \v in replace string means vertical tab, but in search string
                it means any vertical whitespace character. Therefore \x0B is
                used in the search strings for referencing a vertical tab.
          */
          UltraEdit.outputWindow.write("Trim the field values ...");
          UltraEdit.frInFiles.replace("^ +|(?<=\\x0B) +| +(?=\\x0B)| +$", "");
          
          // Escape every double quote in the field values with a double quote.
          UltraEdit.outputWindow.write("Escaping double quotes ...");
          UltraEdit.frInFiles.regExp=false;
          UltraEdit.frInFiles.replace('"', '""');
          
          // Put all field values with 1 or more double quotes or separators
          // completely into double quotes to get finally a valid CSV file.
          UltraEdit.outputWindow.write('Double quote field values with '+sSeparator+' or " ...');
          UltraEdit.frInFiles.regExp=true;
          UltraEdit.frInFiles.replace('(?:^|(?<=\\x0B))([^\\r\\n\\x0B"'+sSeparator+']*["'+sSeparator+'][^\\r\\n\\x0B]*)', '"\\1"');
          
          // And finally replace the vertical tabs by the real separator.
          UltraEdit.outputWindow.write("Replace temporary separators by real separator ...");
          UltraEdit.frInFiles.replace("\\x0B", sSeparator);
          
          Update: Script replacement for command Convert to Character Delimited contains information about a script and a link to the page on where this script can be downloaded for general conversion of a fixed column file to a CSV file using the technique as demonstrated here. This enhanced version of the script asks the script user for the field widths and the separator like the command Convert to Character Delimited in menu Column does and then converts the file with lines of fixed length to a comma separated values file.
          Best regards from an UC/UE/UES for Windows user from Austria

          3
          NewbieNewbie
          3

            Sep 17, 2014#5

            Hello Mofi,

            Thank you very much, this worked like a charm. I really appreciate the help.

            Greetings