Hello, and first of all, sorry for my poor English; I'm French. Thanks in advance for any help I can get here.
Here is the problem:
I have to work on huge files (4,000,000 lines and more). They are UNIX-type text files. We generally use egrep to extract all the lines of the file that match a label, or a list of labels arranged in a column in another file. But on our UNIX version, egrep is limited in the number of lines it accepts in the "list of extractions to do" file.
I think UltraEdit is able to do that.
Example of a "work" file
The files look like this (small extract):
.
.
.
2006/02/28-10:36:42.719 PATMEF:Name=HB5020X size= 80
2006/02/28-10:36:42.719 PATMEF:Name=HB5020X size= 80
2006/02/28-10:36:42.719 PATMEF:Name=HB5020X size= 80
2006/02/28-10:36:42.719 PATMMO:Name=PC000WL PhyVal=40470 Label/Unit= Limit= 0 Status=GO (GO ) RawVal=40470 PCU Acquisition line 30
2006/02/28-10:36:42.719 PATMMO:Name=PC0023V PhyVal=100.2318 Label/Unit=V Limit= 5 Status=GO (GO ) RawVal=534 PCU 100V Bus Voltage
2006/02/28-10:36:42.719 PATMEF:Name=HB5020X size= 80
2006/02/28-10:36:42.719 PATMMO:Name=PC000WL PhyVal=40469 Label/Unit= Limit= 0 Status=GO (GO ) RawVal=40469 PCU Acquisition line 30
2006/02/28-10:36:42.719 PATMMO:Name=PC0023V PhyVal=100.0441 Label/Unit=V Limit= 5 Status=GO (GO ) RawVal=533 PCU 100V Bus Voltage
2006/02/28-10:36:42.719 PATMEF:Name=HB5020X size= 80
2006/02/28-10:36:42.719 PATMEF:Name=HB5020X size= 80
2006/02/28-10:36:42.792 PATMMO:Name=HM1002L PhyVal=0 Label/Unit= Limit= 0 Status=GO (GO ) RawVal=0 Data Filed Header Flag
2006/02/28-10:36:42.792 PATMMO:Name=HM1003L PhyVal=2047 Label/Unit=IDLE Limit= 0 Status=GO (GO ) RawVal=2047 Application Process Identifier
2006/02/28-10:36:42.792 PATMMO:Name=HM1007L PhyVal=0 Label/Unit= Limit= 0 Status=GO (GO ) RawVal=0 Source Sequence Count
2006/02/28-10:36:42.792 PATMMO:Name=HM1008L PhyVal=807 Label/Unit= Limit= 0 Status=GO (GO ) RawVal=807 Packet Length
2006/02/28-10:36:44.854 PATMMO:Name=HM1003L PhyVal=********** Label/Unit= Limit= 0 Status=UND(GO ) RawVal=1791 Application Process Identifier
2006/02/28-10:36:44.854 PATMMS:Name=HM1003L PhyVal=********** Label/Unit= RawVal=1791 Status=UND(GO )
2006/02/28-10:36:44.854 PATMMO:Name=HM1008L PhyVal=5 Label/Unit= Limit= 0 Status=GO (GO ) RawVal=5 Packet Length
2006/02/28-10:36:44.856 PATMMO:Name=HM1003L PhyVal=********** Label/Unit= Limit= 0 Status=UND(UND) RawVal=1790 Application Process Identifier
2006/02/28-10:36:44.856 PATMMO:Name=HM1008L PhyVal=3 Label/Unit= Limit= 0 Status=GO (GO ) RawVal=3 Packet Length
2006/02/28-10:36:44.857 PATMMO:Name=HM1002L PhyVal=1 Label/Unit= Limit= 0 Status=GO (GO ) RawVal=1 Data Filed Header Flag
2006/02/28-10:36:44.857 PATMMO:Name=HM1003L PhyVal=104 Label/Unit=AOCS_S_D_02 Limit= 0 Status=GO (UND) RawVal=104 Application Process Identifier
2006/02/28-10:36:44.919 PATMEF:Name=HB5020X size= 80
.
.
.
I want to extract all the lines matching HM1008L and all the lines matching HB5020X into a single file.
The list of strings to be extracted is stored in a file, one string per line, with a maximum of approximately 100 lines.
The extracted lines have to stay in the same order as in the original file (by the date in the first column).
So for my example, I would like to see in the new file:
2006/02/28-10:36:42.719 PATMEF:Name=HB5020X size= 80
2006/02/28-10:36:42.719 PATMEF:Name=HB5020X size= 80
2006/02/28-10:36:42.719 PATMEF:Name=HB5020X size= 80
2006/02/28-10:36:42.719 PATMEF:Name=HB5020X size= 80
2006/02/28-10:36:42.719 PATMEF:Name=HB5020X size= 80
2006/02/28-10:36:42.719 PATMEF:Name=HB5020X size= 80
2006/02/28-10:36:42.792 PATMMO:Name=HM1008L PhyVal=807 Label/Unit= Limit= 0 Status=GO (GO ) RawVal=807 Packet Length
2006/02/28-10:36:44.854 PATMMO:Name=HM1008L PhyVal=5 Label/Unit= Limit= 0 Status=GO (GO ) RawVal=5 Packet Length
2006/02/28-10:36:44.919 PATMEF:Name=HB5020X size= 80
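
To make the request concrete, the extraction described above can be sketched as a single streaming pass over the work file. This is only a minimal Python sketch, not an UltraEdit or egrep solution, and it rests on assumptions: the labels are plain literal strings (not regular expressions), a line is kept if it contains any label anywhere, and the files can be decoded as Latin-1. The function names `load_labels` and `extract_matching_lines` are made up for illustration. Because lines are written in the order they are read, the output keeps the original order without any sorting.

```python
import re


def load_labels(list_path):
    """Read one label per line from the list file, skipping blank lines."""
    with open(list_path, encoding="latin-1") as f:
        return [line.strip() for line in f if line.strip()]


def extract_matching_lines(work_path, list_path, out_path):
    """Copy every line of work_path that contains any label to out_path.

    Lines are streamed one at a time, so a file with millions of lines is
    never loaded into memory, and the original line order is preserved.
    Returns the number of lines written.
    """
    labels = load_labels(list_path)
    if not labels:
        raise ValueError("the label list is empty")

    # Assumption: labels are literal text, so each one is escaped before
    # being joined into a single alternation such as "HM1008L|HB5020X".
    pattern = re.compile("|".join(re.escape(label) for label in labels))

    kept = 0
    # newline="" leaves the UNIX line endings untouched on read and write.
    with open(work_path, encoding="latin-1", newline="") as src, \
         open(out_path, "w", encoding="latin-1", newline="") as dst:
        for line in src:
            if pattern.search(line):
                dst.write(line)
                kept += 1
    return kept
```

It could be called, for example, as `extract_matching_lines("work.txt", "labels.txt", "extract.txt")`, where all three file names are placeholders and `labels.txt` would contain HM1008L and HB5020X on separate lines.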