I use both UE ver 13.10 and UE3 13.10a+2 with the UE regex engine. I process very large log files and have to massage the data before importing it into a database. I use UE macros extensively in this process.
The log files are tab-delimited files of email traffic for a large company. The first issue I am trying to solve is converting the date, which appears twice in each entry. It is in the format: 2007-8-29
Months and days 1 through 9 are not zero-padded in the MM and DD fields. I want to convert all of these dates to MM/DD/YYYY.
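To make the target of the date conversion concrete, here is a minimal sketch in Python rather than UE macro code. The function name `convert_dates` is made up for illustration, and zero-padding the month and day is an assumption based on the MM/DD/YYYY target.

```python
import re

# Matches dates such as 2007-8-29: a 4-digit year, then a 1-2 digit month and day.
DATE_RE = re.compile(r"\b(\d{4})-(\d{1,2})-(\d{1,2})\b")

def convert_dates(line: str) -> str:
    """Rewrite every YYYY-M-D date in the line as MM/DD/YYYY (hypothetical helper)."""
    def repl(match: re.Match) -> str:
        year, month, day = match.groups()
        # Assumption: the MM and DD fields should be zero-padded to two digits.
        return f"{int(month):02d}/{int(day):02d}/{year}"
    return DATE_RE.sub(repl, line)

print(convert_dates("2007-8-29\t0:0:0 GMT"))
```

The pattern requires a four-digit year followed by two hyphen-separated numbers, so times such as 0:0:0 and host names such as vlb-smtpin-08 are left alone.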
The next issue is more complex. An email address appears twice in each line, and a domain also appears in the message ID. I want to split each email address into user name and domain (two fields), but I do not want to change the message ID, which also has a domain in it. A simple search and replace on the @ splits all three, which goes too far. As you can see, there are three @ in each line. The one in the middle of the line, identified as @mdm.gbl, is the one I do not want changed. BTW, this @mdm.gbl changes and therefore is not the same in each message. In the examples below I want to divide [email protected] and [email protected]
See the three sample extractions of the data below:
Example #1:
2007-8-28 23:59:59 GMT XXX.XXX.21.161 vlb-smtpin-08.ns.cs.theircompany.com - XBH-ENT-01 XX.XXX.14.47 [email protected] 1019 [email protected] 3 0 13036 1 2007-8-28 23:59:59 GMT 0 Version: 6.0.3790.3959 - Fw: Message Subject - Mesa, AZ - 37/hr - Contract Opening - APPLY NOW [email protected] -
Example #2:
2007-8-28 23:59:59 GMT XXX.XXX.21.161 vlb-smtpin-05.ns.cs.theircompany.com - XBH-ENT-01 XX.XXX.14.47 [email protected] 1025 [email protected] 3 0 13036 1 2007-8-28 23:59:59 GMT 0 Version: 6.0.3790.3959 - Fw: Message Subject - Mesa, AZ - 37/hr - Contract Opening - APPLY NOW [email protected] -
Example #3:
2007-8-29 0:0:0 GMT XXX.XXX.21.161 vlb-smtpin-07.ns.cs.theircompany.com - XBH-ENT-01 XX.XXX.14.47 [email protected] 1024 [email protected] 3 0 13036 1 2007-8-28 23:59:59 GMT 0 Version: 6.0.3790.3959 - Fw: Message Subject - Mesa, AZ - 37/hr - Employment Opening - APPLY NOW [email protected] -
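The address split described above can be sketched the same way, again in Python rather than as a UE macro. This is only a sketch under stated assumptions: each line contains exactly three @ characters, the middle one belongs to the message ID, and the user name and domain should be separated by a tab. The function name `split_addresses` and the sample addresses are invented for illustration and do not come from the logs.

```python
def split_addresses(line: str, sep: str = "\t") -> str:
    """Replace the first and last '@' with sep, leaving the middle one intact.

    Hypothetical helper. Assumes exactly three '@' per line, with the middle
    one belonging to the message ID; other lines are returned unchanged.
    """
    if line.count("@") != 3:
        return line  # unexpected layout: leave the line for manual review
    first = line.find("@")
    last = line.rfind("@")
    return line[:first] + sep + line[first + 1:last] + sep + line[last + 1:]

# Made-up sample line (not real log data).
sample = "XX.XXX.14.47\tbob@example.com\t1019\tABC123@mdm.gbl\t3\talice@example.org\t-"
print(split_addresses(sample))
```

Lines that do not have exactly three @ (for instance, when a subject contains one) are passed through untouched so that they can be checked by hand.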