Move Data to Delimited Format

Move Data to Delimited Format

6
NewbieNewbie
6

    Feb 13, 2013#1

    I know this has come up over and over (sorry MOFI) but I can't seem to find what I need in the forums (or in my notes).
    I need to run a macro to rip through some files to convert ROW DATA to a DELIMITED FORMAT data.
    I have about 30 files containing 100000 records. They need to be converted to a delimited format to import into SQL Server.

    DATA TO CONVERT (REPEATED 100000 TIMES FOR EACH RECORD IN THE FILE)
    -----------------------------------
    ...
    BEGIN:>>DocTypeName: CS - ZZ Archive Results
    [6]Last Name: SHMOE
    [5]First Name: JOE
    [1]P#: ZHELDHL22344
    [9]Date: 02/13/2004
    [2]Account/Client #: ASD2342113423docdate: 04/12/2004
    >>FileTypeNum: 1
    >>FullPath: w:\wil\ber\knuck\les.txt
    ...

    DESTINATION FORMAT:
    -----------------------------------
    [1]|[2]|[BLANK]|[BLANK]|[5]|[6]|[BLANK]|[BLANK]|[9]

    MOFI: Do you have a macro you could copy-and-paste for me? I promise I'll put it in my notes and not lose it =:)

    Thanks,

    -Allen

    6,605547
    Grand MasterGrand Master
    6,605547

      Feb 14, 2013#2

      Try this macro:

      Code: Select all

      InsertMode
      ColumnModeOff
      HexOff
      UltraEditReOn
      Bottom
      IfColNumGt 1
      InsertLine
      EndIf
      Top
      TrimTrailingSpaces
      Find MatchCase RegExp "%Last Name: +^(*^)^pFirst Name: +^(*^)^pP#: +^(*^)^pDate: +^(*^)^pAccount/Client #: +^(*^)$"
      Replace All "!^3|^5|||^2|^1|||^4"
      Find MatchCase RegExp "%[~!]*^p"
      Replace All ""
      Find MatchCase RegExp "%!"
      Replace All ""
      It converts

      Code: Select all

      BEGIN:>>DocTypeName: CS - ZZ Archive Results
      Last Name: SHMOE
      First Name: JOE
      P#: ZHELDHL22344
      Date: 02/13/2004
      Account/Client #: ASD2342113423docdate: 04/12/2004
      >>FileTypeNum: 1
      >>FullPath: w:\wil\ber\knuck\les.txt
      to

      Code: Select all

      ZHELDHL22344|ASD2342113423docdate: 04/12/2004|||JOE|SHMOE|||02/13/2004

      6
      NewbieNewbie
      6

        Feb 15, 2013#3

        PERFECT!!
        Thank you Mofi!

        I've tucked that safely into my notes for the next time I need to do this (in 2yrs) =:)

        -Allen

          Feb 18, 2013#4

          Mofi,

          I have 253 files to run this against.. (ugh).
          What would be the best way to wrap this into a FindInFiles loop?

          Thanks again,

          -Allen

          6,605547
          Grand MasterGrand Master
          6,605547

            Feb 19, 2013#5

            If you are not sure that last line of every file has a DOS line terminator, run a Perl regular expression Replace in Files with search string (.)$(?!\r\n) and replace string \1\r\n

            That appends a DOS line terminator at end of those files not already ending with a DOS line terminator.

            To trim all spaces/tabs at end of the lines use an UltraEdit regular expression Replace in Files with search string [ ^t]+$ and an empty replace string. With the Perl regexp engine the search string would be [ \t]+$

            Next run simply 3 UltraEdit regular expression Replace in Files using the search and replace strings as in the macro.

            As macro:

            PerlReOn
            ReplInFiles MatchCase RegExp "Full directory path ending with\" "*.*" "(.)$(?!\r\n)" "\1\r\n"
            UltraEditReOn
            ReplInFiles MatchCase RegExp "Full directory path ending with\" "*.*" "[ ^t]+$" ""
            ReplInFiles MatchCase RegExp "Full directory path ending with\" "*.*" "%Last Name: +^(*^)^pFirst Name: +^(*^)^pP#: +^(*^)^pDate: +^(*^)^pAccount/Client #: +^(*^)$" "!^3|^5|||^2|^1|||^4"
            ReplInFiles MatchCase RegExp "Full directory path ending with\" "*.*" "%[~!]*^p" ""
            ReplInFiles MatchCase RegExp "Full directory path ending with\" "*.*" "%!" ""

            6
            NewbieNewbie
            6

              Feb 19, 2013#6

              Again.. thank you.

              This is AWESOME!

                Feb 19, 2013#7

                I'm sorry...
                I've hit another road block and I can't slice this at the source and I can't get my mind around doing this in UltraEdit..

                The data I'm getting contains rows of similar data. I want to "filter" the useless data but can't figure out how to do this in a macro.

                EXAMPLE OF THE DATA I'M LEFT WITH:
                ---------------------------------------
                Last Name: SHMOE|First Name: Joe|P#: ZHELDHL22344|Data1: GOOD DATA|Data2: GOOD DATA
                Last Name: SHMOE|First Name: Joe|P#: ZHELDHL22344|Data1: BAD DATA|Data2: BAD DATA
                Last Name: QUEUE|First Name: Suzy|P#: BELDDHL2454544|Data1: GOOD DATA|Data2: GOOD DATA
                Last Name: QUEUE|First Name: Suzy|P#: BELDDHL2454544|Data1: BAD DATA|Data2: BAD DATA
                ...
                I want to keep the first record and toss out the second.

                If I could just use the tag expression in the SEARCH I would resolve this.
                Example of what I would like to do (but UE won't let me):

                WHAT I WOULD LIKE TO DO:
                ----------------------------------
                Find MatchCase RegExp "Last Name: ^(*^)|First Name: ^(*^)|P#: ^(*^)Last Name: +^1First Name: +^2P#: +^3"
                Replace All "Last Name: ^1|First Name: ^2|P#: ^3"
                ...
                The macro searches twice but only returns the first data found.

                Am I just making this too hard?

                How would you come at this?

                Thanks again,

                -Allen

                6,605547
                Grand MasterGrand Master
                6,605547

                  Feb 20, 2013#8

                  This feature of using a tagged string already in search string is called back referencing. Only the most powerful Perl regular expression engine supports back referencing. The legacy UltraEdit and Unix regular expression engines do not support referencing a tagged string already in search string.

                  PerlReOn
                  Find MatchCase RegExp "^(.*Last Name: )(.*)(\|First Name: )(.*)(\|P#: )(.*)(\|.*)\r\n.*Last Name: \2\|First Name: \4\|P#: \6\|.*$"
                  Replace All "\1\2\3\4\5\6\7"

                  The backslash left to every | is necessary as character | in a Perl regular expression means OR and therefore this character must be escaped to find simply character |.

                  .* is in Perl syntax the same as just * in UltraEdit syntax.

                  Depending on your data the regular expression search and replace string could be less complex:

                  PerlReOn
                  Find MatchCase RegExp "^(.*Last Name: .*\|First Name: .*\|P#: .*\|)(.*)\r\n\1.*$"
                  Replace All "\1\2"