Match data from 2 different structures (see example)

    Dec 22, 2006 #1

    Below are snippets of data from the 2 file structures I have.
    The first structure is simply a telephone number. The 2nd is structured differently: it also contains a phone number, but with two characters in front of it and a comma and a domain name after it. Please look at the samples below. I need to flag the numbers in the first file (which has only phone numbers) that also appear in the 2nd file, and append the matching domain name from the 2nd file to them.
    I have 86 files of the 2nd kind, which are almost 200 MB each and have about 3 million lines.

    Any and all help would be greatly appreciated. Please provide as many detailed instructions as possible.

    Thanks!!!!!!


    File 1:

    2012001988
    2012221122
    2012222222
    2012222222
    2012243310
    2012259297
    2012259553
    2012323494
    2012407625
    2012436385
    2012436579
    2012440499
    2012445058
    2012616081
    2012621297
    2012647883


    File 2:

    AA2013162981,myboostmobile.com
    AA2013162982,myboostmobile.com
    AA2013162983,myboostmobile.com
    AA2013162984,myboostmobile.com
    AA2013162985,myboostmobile.com
    AA2013162986,myboostmobile.com
    AA2013162987,myboostmobile.com
    AA2013162988,myboostmobile.com
    AA2013162989,myboostmobile.com
    AA2013162990,myboostmobile.com
    AA2013162991,myboostmobile.com
    AA2013162992,myboostmobile.com
    AA2013162993,myboostmobile.com
    AA2012221122,myboostmobile.com
    AA2013162995,myboostmobile.com
    AA2013162996,myboostmobile.com
    AA2013162997,myboostmobile.com
    AA2013162998,myboostmobile.com
    AA2013162999,myboostmobile.com
    AA2013163000,vtext.com
    AA2013163001,vtext.com
    AA2013163002,vtext.com
    AA2013163003,vtext.com
    AA2013163004,vtext.com
    AA2013163005,vtext.com
    AA2013163006,vtext.com
    AA2013163007,vtext.com
    AA2013163008,vtext.com

      Dec 22, 2006 #2

      Okay, here is a macro which should do this job on your large files. It is good that you mentioned the file sizes, because I would have written the macro differently if the files were not so big.

      File 1, which contains only the phone numbers, must have the focus. File 2 must also already be open. No other file should be open.

      First, the macro trims all trailing spaces in file 1 (to be safe) and inserts a new line at the end of the file with the special marker character ». It then copies the whole content of file 1 to user clipboard 9 and pastes it at the top of file 2. This is done to avoid window switching, which normally decreases macro execution speed. Hopefully file 1 is not too big for clipboard copying.

      Now in file 2 a loop is executed until the marker character is found at the start of a line.

      In the loop, the current line is first marked with the character # and the phone number is selected and copied to clipboard 9. Next, a find is executed for this phone number followed by a comma.

      If this search string is found, the comma and the following domain name are copied to user clipboard 8. Then, with clipboard 9 reactivated, a find upwards moves the cursor back to the current phone number from file 1. There the comma and the domain name are appended to the line, and the line mark remains.

      If the search for the phone number followed by a comma was not successful, the line mark is removed.

      The last command in the loop moves the cursor down to the next phone number from file 1 or to the special marker character.

      After the loop has finished, the macro selects the possibly modified content from file 1 in file 2, cuts it to user clipboard 9 and pastes it over the still existing selection in file 1. So file 2 was temporarily modified, but now contains the same content as before the start of the macro.

      In file 1 the last line with the special marker character is deleted, the cursor is positioned back to top and the 2 used clipboards are cleared to free memory.
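
For readers who want to check the expected result outside UltraEdit, the effect described above can be sketched in Python. This is only a minimal sketch, not the macro itself: the function name `flag_and_append` is made up for illustration, and it assumes every record in file 2 has exactly two prefix characters before the number and a comma before the domain, as in the samples.

```python
def flag_and_append(numbers, records):
    """Return file 1 lines; matched numbers become '#number,domain'."""
    domains = {}
    for rec in records:
        key, sep, domain = rec.strip().partition(",")
        if sep and len(key) > 2:
            # Assumption: two prefix characters (e.g. "AA") precede the number.
            # setdefault keeps the first hit, like the first Find match.
            domains.setdefault(key[2:], domain)

    result = []
    for num in numbers:
        num = num.strip()
        if num in domains:
            # Matched line: keep the '#' mark and append ',domain'.
            result.append(f"#{num},{domains[num]}")
        else:
            # Unmatched line: left without a mark.
            result.append(num)
    return result
```

Called with the sample data, only 2012221122 would get the # mark and the appended domain; all other numbers from file 1 stay unchanged.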

      The macro property Continue if a Find with Replace not found must be checked for this macro.

      InsertMode
      ColumnModeOff
      HexOff
      TrimTrailingSpaces
      Bottom
      IfColNum 1
      "»
      "
      Else
      "
      »
      "
      EndIf
      SelectAll
      Clipboard 9
      Copy
      NextWindow
      Top
      Paste
      Top
      Loop
      IfCharIs "»"
      ExitLoop
      EndIf
      "#"
      StartSelect
      Key END
      Copy
      EndSelect
      Find "^c,"
      IfFound
      Key LEFT ARROW
      StartSelect
      Key END
      Clipboard 8
      Copy
      EndSelect
      Clipboard 9
      Find Up "#^c"
      Key LEFT ARROW
      Key RIGHT ARROW
      Clipboard 8
      Paste
      Clipboard 9
      Key HOME
      Else
      Key HOME
      Key DEL
      EndIf
      Key DOWN ARROW
      EndLoop
      Key DOWN ARROW
      SelectToTop
      Cut
      PreviousWindow
      Paste
      Key UP ARROW
      DeleteLine
      Top
      ClearClipboard
      Clipboard 8
      ClearClipboard
      Clipboard 0

      Because Christmas is coming soon, here is a second macro which does the same as the one above but uses window switching. You can use this one if file 1 is also very large. I would be really interested in which version is faster. Can you run both macros on the same files, measure the execution times and post them?

      The macro property Continue if a Find with Replace not found must be checked for this macro too.

      InsertMode
      ColumnModeOff
      HexOff
      TrimTrailingSpaces
      Bottom
      IfColNum 1
      "»
      "
      Else
      "
      »
      "
      EndIf
      Top
      NextWindow
      Top
      PreviousWindow
      Loop
      IfCharIs "»"
      ExitLoop
      EndIf
      "#"
      StartSelect
      Key END
      Copy
      EndSelect
      NextWindow
      Find "^c,"
      IfFound
      Key LEFT ARROW
      StartSelect
      Key END
      Copy
      EndSelect
      Top
      PreviousWindow
      Key LEFT ARROW
      Key RIGHT ARROW
      Paste
      Key HOME
      Else
      PreviousWindow
      Key HOME
      Key DEL
      EndIf
      Key DOWN ARROW
      EndLoop
      DeleteLine
      Top
      ClearClipboard
      Clipboard 0

      And here again is the macro above, but now without the special marker character because IfEof is used instead. This will not work if file 1 is a Unicode file, because UltraEdit v12.20b does not correctly identify the end of file in Unicode files. So the marker character in a new line at the end of the file, as used in the macro above, is my workaround for running a macro to the end of a file when I don't know the file format (Unicode or ASCII).

      The macro property Continue if a Find with Replace not found must be checked for this macro too.

      InsertMode
      ColumnModeOff
      HexOff
      TrimTrailingSpaces
      Bottom
      IfColNumGt 1
      InsertLine
      EndIf
      Top
      NextWindow
      Top
      PreviousWindow
      Loop
      IfEof
      ExitLoop
      EndIf
      "#"
      StartSelect
      Key END
      Copy
      EndSelect
      NextWindow
      Find "^c,"
      IfFound
      Key LEFT ARROW
      StartSelect
      Key END
      Copy
      EndSelect
      Top
      PreviousWindow
      Key LEFT ARROW
      Key RIGHT ARROW
      Paste
      Key HOME
      Else
      PreviousWindow
      Key HOME
      Key DEL
      EndIf
      Key DOWN ARROW
      EndLoop
      Top
      ClearClipboard
      Clipboard 0

      Best regards from an UC/UE/UES for Windows user from Austria

        Dec 22, 2006 #3

        It seems like there has got to be a different, simpler way.
        I do not necessarily need to append the data from the 2nd set of files to the ACTUAL 1st file. I simply need to end up with a list of the records from list one that appear somewhere in list 2 (list 2 is 86 files), with the extra data from list 2 appended to the matching records of list one... even if that means making a list 3.

        Is there a way I can take list one and go line by line: take line one from list one, search list 2 (86 LARGE files) for any match of that string and, if there is a match, append it to a new file, then search again for the next line of list one inside list 2, and so on?
        I want the least amount of user action.
        I do not mind if it takes a few days, so long as it gives me the end result I am looking for.
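
One way to get such a "list 3" with almost no user action is sketched below in Python. It is a hedged sketch under assumptions, not a tested solution for the real data: the function name and the file pattern are hypothetical, list one is assumed to fit in memory, and the records in list 2 are assumed to look like the samples (two prefix characters, number, comma, domain). Instead of searching all 86 files once per number, it turns the idea around and reads each big file only once while looking every line up in list one.

```python
import glob


def match_to_new_file(list_one_path, list_two_pattern, out_path):
    """Write 'number,domain' to out_path for every list 2 record whose
    number is in list one. Returns the number of lines written."""
    # Load list one into a set for fast membership tests.
    with open(list_one_path, encoding="utf-8", errors="replace") as f:
        wanted = {line.strip() for line in f if line.strip()}

    hits = 0
    with open(out_path, "w", encoding="utf-8") as out:
        # out_path must not match list_two_pattern, or it would be read too.
        for path in sorted(glob.glob(list_two_pattern)):
            with open(path, encoding="utf-8", errors="replace") as big:
                for line in big:  # streamed line by line, never fully loaded
                    key, sep, domain = line.strip().partition(",")
                    number = key[2:]  # assumption: 2 prefix characters
                    if sep and number in wanted:
                        out.write(f"{number},{domain}\n")
                        hits += 1
    return hits
```

A call could look like `match_to_new_file("list1.txt", "list2_*.txt", "list3.txt")`, where all three names are placeholders. A number that occurs in several list 2 records is written once per record.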

        Your replies (all of you!) are welcome and suggestions/help TOTALLY appreciated!!!

          Dec 23, 2006 #4

          This sounds more like a job for a database, especially when the files are big and if you need to do this regularly.
          As the data is one entry per line, you can easily import it into any database, e.g. all data of type 1 into table1 and all data of type 2 into table2. Then it's a matter of a simple select statement in a loop to generate a 3rd table with data of types 1 and 2 combined as needed. I guess any database engine will do this much faster than UE.
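
As an illustration of the database idea, here is a small sketch using SQLite through Python's standard `sqlite3` module. The table names follow the post; the column names, the index name and the function name are assumptions made for this example. It uses a single JOIN rather than a select statement in a loop, and an in-memory database, which would have to be replaced by a database file for data of the size discussed here.

```python
import sqlite3


def join_in_sqlite(numbers, records):
    """Return sorted (number, domain) pairs for numbers present in both lists."""
    con = sqlite3.connect(":memory:")  # assumption: use a file path for real data
    con.execute("CREATE TABLE table1 (number TEXT)")
    con.execute("CREATE TABLE table2 (number TEXT, domain TEXT)")

    con.executemany("INSERT INTO table1 VALUES (?)", [(n,) for n in numbers])

    rows = []
    for rec in records:
        key, sep, domain = rec.strip().partition(",")
        if sep:
            rows.append((key[2:], domain))  # assumption: 2 prefix characters
    con.executemany("INSERT INTO table2 VALUES (?, ?)", rows)

    # The index lets the join look up each number instead of scanning table2.
    con.execute("CREATE INDEX idx_table2_number ON table2 (number)")

    result = con.execute(
        "SELECT DISTINCT t1.number, t2.domain "
        "FROM table1 AS t1 JOIN table2 AS t2 ON t2.number = t1.number "
        "ORDER BY t1.number, t2.domain"
    ).fetchall()
    con.close()
    return result
```

DISTINCT collapses duplicates such as the repeated 2012222222 in file 1, so each matching number and domain pair is returned once.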

            Dec 23, 2006 #5

            A database program would really do this job much faster than UltraEdit.

            Have you understood and tried my macros? Which one is faster?

            Because of your very large files, it is maybe really better to use macro 2 or 3. The modified first file is not saved at the end, so you can delete all remaining lines not marked with a # at the start of the line, then remove the # character itself from the start of all lines and save the file with a new name.
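
The clean-up step just described (keep only the lines marked with #, then strip the mark) could also be done outside the editor. A minimal Python sketch, with the function name `keep_marked` chosen only for illustration:

```python
def keep_marked(lines):
    """Keep only lines starting with '#' and remove that leading '#'."""
    return [line[1:] for line in lines if line.startswith("#")]
```

Applied to the lines of the modified file 1, this returns only the matched records in the form number,domain; the result can then be saved under a new name.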

            List 2 is 86 files - no problem. Use the old DOS command copy to copy all 86 files together into a REALLY big single file - see "Appending text files" and "Want to combine many files into one file" - and then use one of my macros. You should also read the page titled Large File Handling in the help of UltraEdit and configure UltraEdit accordingly.
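
If the copy command is not at hand, combining the files can also be sketched in Python. This is an assumption-laden alternative to the DOS command, not a replacement for the linked topics: the function name is hypothetical, and the sketch adds a line break after any file that does not end with one, so that the last record of one file is not joined to the first record of the next.

```python
import shutil


def concatenate(paths, combined_path):
    """Append all files in paths, in order, into combined_path."""
    # combined_path must not be one of the input paths.
    with open(combined_path, "wb") as out:
        for path in paths:
            with open(path, "rb") as src:
                # Copies in chunks, so a 200 MB file is never fully in memory.
                shutil.copyfileobj(src, out)
                if src.tell() > 0:
                    # Check the last byte of the source file.
                    src.seek(-1, 2)
                    if src.read(1) != b"\n":
                        out.write(b"\n")
```

The list of paths could come from `sorted(glob.glob(...))` with a pattern that matches the 86 files.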

              Jan 26, 2007 #6

              Thank you!!! The 2nd one worked great for me and very quickly. I have similar large files and it is beautiful!