Inserting commas at regular intervals, multiple per line

Inserting commas at regular intervals, multiple per line

2
NewbieNewbie
2

    Nov 08, 2008#1

    I've got a rather large text file, and it appears to simply have data printed in columns separated by variable numbers of spaces. I'd like to add comma separation at columns: 16, 55, 62, 72, 82, 98, 99, 100, 120, 134, 135 in each and every row.

    Can I use the default UltraEdit insert function for this?

    If I can, I'm not seeing it how.

    Here's a sanitized example of my current data:

    Code: Select all

    ABC           123 11 12345                           0237 04ABC       123       12345 #1234      016006000500065065671234347487538B21
    Here's a sanitized example of what I'm looking for

    Code: Select all

    ABC           ,123 11 12345                           ,0237 04,ABC       123       ,12345 #1234     , ,016006000500065065671234347487538,B,21
    My example may be a couple of characters off, but I believe it gives the idea. I've got a few thousand lines like this to process. Any clues?

    Appendix: Environment
    Host PC is running WinXP Sp3
    UltraEdit is version 14.20.0.1035

    6,603548
    Grand MasterGrand Master
    6,603548

      Nov 08, 2008#2

      The first method is to use the column mode.
      • Click on Column - Column Mode to activate the column mode editing.
      • Set in the first line the cursor to the first column where you want to insert a comma.
      • Click on Column - Insert/Fill Columns, enter the comma and press button OK.
      • Move the cursor to the next column where you want to insert a comma.
      • Click on Column - Insert/Fill Columns (use hotkey!) and press button OK.
      • Redo step 4 and 5 until all commas are inserted.
      • Turn column mode off with a click on Column - Column Mode.
      Big advantage of this method: You can be sure to insert every comma always at the correct column.

      The second method is to use Column - Convert to Character Delimited, a command written to convert a fixed width file into a CSV file. It is fast and easy to handle, but you need to know how many characters every field has, for example according to your question 16, 55-16 = 39, 62-55 = 7, ... So it is possible that you insert a comma at the wrong column when you make a mistake on calculating the field widths. Also after executing the command you can't undo the changes. Therefore it is good advice to always save a file before running this command to be able to use File - Revert to Saved if the result is not what has been expected.

      Hint: Before opening the dialog, select the characters of every field once. At bottom of the UltraEdit main window in the status bar is displayed how many bytes are selected. For an ASCII/ANSI file this is the field width, for a Unicode file the number of selected bytes must be divided by 2. So you can easily find out and note anywhere the widths of every field.

      The third method would be to use a Perl regular expression with following syntax for the search string:

      ^(.{x})(.{x})(.{x})(.{x})(.{x})

      and for the replace string:

      \1,\2,\3,\4,\5,\6,

      ^ start every search a start of a line.
      (...) means store the string found by the expression inside () temporarily for re-use it in the replace step (or even as back reference in the search).
      . means any single character except new line characters - all characters except CR and LF.
      {x} means the preceding expression - here any character except CRLF - x times. So x would be for your task 16, 55-16 = 39, 62-55 = 7, ...

      This method is definitely the most complicated method to convert a file from fixed column to CSV.

      A fourth method would be to use Insert String At Every Increment. Similar to the first method you would have to place the cursor always at correct column position in the first line, insert there the comma and then run this command for inserting the comma in all other lines too. The file increment value would be the line length including the 2 not displayed line ending characters CR and LF (for DOS files). And of course this increment value increases with every execution by 1 because of the inserted comma.
      Best regards from an UC/UE/UES for Windows user from Austria

      2
      NewbieNewbie
      2

        Nov 08, 2008#3

        Mofi,

        Thanks a great deal.
        I'm now using the second method you just outlined.
        Looks like UltraEdit is hard at work.
        I wonder how long this will take with a 41 MB file.
        Oh well. I hope the weather's good in Austria!

        3
        NewbieNewbie
        3

          Jun 18, 2009#4

          Hi Mofi,

          I ran into problem when I hit the 10th comma. It seems to replace the string in \1. How can I fix this?

          6,603548
          Grand MasterGrand Master
          6,603548

            Jun 18, 2009#5

            You can use only up to 9 tags as documented in help. Reason is quite simple. How should any regular expression engine determine how it should interpret \10 - as tagged string one and zero character or as tagged string 10? A regular expression engine can't do that. And you can believe me that I have often used \10 to append a zero to a string, for example to convert numbers like 1.5, 1.00 and 1.23 to 1.50, 1.00, 1.23.

            Simply run 2 replaces. The first one for the commas 10 to X and the second one for the commas 1 to 9.

            Or if your file is a fixed column width file and you want to convert it to a CSV file, use special function Column - Convert to Character Delimited.
            Best regards from an UC/UE/UES for Windows user from Austria

            3
            NewbieNewbie
            3

              Jun 18, 2009#6

              Thanks Mofi. The convert to character delimited works for me.

              I tried to record convert to character delimited but it does not add the commas when I run the macro.

              So I went to try the RegExp. When I run the macro, no commas inserted. What am I doing wrong?

              Code: Select all

              InsertMode
              ColumnModeOff
              HexOff
              DeleteLine
              DeleteLine
              DeleteLine
              DeleteLine
              DeleteLine
              StartSelect
              Key RIGHT ARROW
              Key RIGHT ARROW
              ColumnModeOn
              SelectToBottom
              Key DEL
              ColumnModeOff
              Bottom
              Key Ctrl+END
              DeleteLine
              Key UP ARROW
              DeleteLine
              DeleteLine
              DeleteLine
              PerlReOn
              Find RegExp "(.{125})(.{15})"
              Replace All "\1,\2,"
              PerlReOn
              Find RegExp "(.{20})(.{26})(.{3})(.{14})(.{3})(.{14})(.{10})(.{11})(.{14})"
              Replace All "\1,\2,\3,\4,\5,\6,\7,\8,\9,"

              6,603548
              Grand MasterGrand Master
              6,603548

                Jun 25, 2009#7

                There is no macro command for Convert to Character Delimited. This function can't be used from within macros or scripts (too many variables - see dialog). If you want to do this conversion with a macro because you need it regularly, you have to develop a macro or script which does the same with regular expression replaces.

                You have already done it. I don't really understand what your macro does. First it deletes 6 lines. I guess it does this at top of the file, but there is no command Top, so it could be also anywhere else. Next it looks like you want to delete the first 2 columns. Your approach works if the last line of the file has a line ending. But better would be to use the command ColumnDelete 2 with the cursor at top of the file. What you next do at end of the file is mysterious for me.

                However, the replace commands fail because the cursor is at end of the file. The special replace option Replace All is from top of file is never active when a replace is used from within a macro or script to avoid endless loops. Use the command Top before running the replaces and the macro should work.
                Best regards from an UC/UE/UES for Windows user from Austria

                3
                NewbieNewbie
                3

                  Jun 29, 2009#8

                  Hi Mofi,
                  I now understand what I did wrong. I was able to correct the macro based on your recommendation and it is working perfectly. This was my first attempt to whip up a macro to format a text file. Once again, thank you for your assistance and patience.