A question about splitting files

Newbie

    Oct 11, 2006#1

    I am new to this forum and started learning how to write macros yesterday. I am stuck on a macro for one particular task. I have a huge number of files in the following format:


        254      0     14      AUG    1998
          1  99999  91348   6.97N158.22E    46   2301
          2    100    100    850     76      7      3
          3          PTPN                32767     kt
          9   1008     46    292    244     90      4
    ...................................
        254     12     14      OCT    1998
          1  99999  91348   6.97N158.22E    46   1101
          2    100    100     50     77      7      3
          3          PTPN                32767     kt
          9   1009     46    250    236    150      4
    ...................................
        254     12     14      OCT    1999
          1  99999  91348   6.97N158.22E    46   1101
          2    100    100     50     77      7      3
          3          PTPN                32767     kt
          9   1009     46    250    236    150      4
    ...................................
    I want to split them into separate files, one per year. I tried to write some macros, but they all failed. Can anyone point me in the right direction? Thanks very much.

    Grand Master

      Oct 11, 2006#2

      Your task is similar to Splitting based on content. I could easily adapt my macro from the linked thread to your needs if you tell me the exact criteria that identify the first line and the last line of a year block, so that the block can be selected and saved to a new file with the year as the file name.

      But with the explanation of the macro in the linked thread, you may well find the solution yourself.
      Best regards from an UC/UE/UES for Windows user from Austria

      Newbie

        Oct 12, 2006#3

        Thanks for your direction, mofi! I am trying to split the files by year, where the first line of each block looks like "254       X     XX      MON    YEAR". I tried to use a regular expression in the Find command to match the "MON    YEAR" string, to make sure the number found is a year and not a data field. But because I am a beginner, I ran into a lot of problems doing what I want, and not only with the regular expression. So would you please give me more detailed directions? Thanks very much!

        The sample content of the file looks like the following:


            254      0     14      JAN    1999           
              1  99999  91348   6.97N158.22E    46   2301
              2    100    323    881     68      7      3
              3          PTPN                32767     kt
              9   1007     46    296    226     40      5
              4   1000    100    282    212     50      7
            254      0     11      SEP    1999           
              1  99999  91348   6.97N158.22E    46   2301
              2    100    100     50     55      7      3
              3          PTPN                32767     kt
            254     12     11      SEP    1999           
              1  99999  91348   6.97N158.22E    46   1102
              2    100    118    879     63      7      3
              3          PTPN                32767     kt
              9   1007     46    232    230    140      3
              4   1000     98    248    229    130      4
            254      0     11      APR    2000           
              1  99999  91348   6.97N158.22E    46   2301
              2    100    133    850     46      7      3
              3          PTPN                32767     kt
              9   1008     46    246    237    140      8
              4   1000    108    246    231  32767  32767
            254     12     11      APR    2000           
              1  99999  91348   6.97N158.22E    46   1100
              2    100    100     50     50      7      3
              3          PTPN                32767     kt
              9   1006     46    242    239    130      4
              4   1000     94    250    225    125      6
              6    976    304  32767  32767     90     16
              6    943    609  32767  32767     90     17
              4    925    777    212    178    100     18
              6    911    914  32767  32767    105     19
              6    879   1219  32767  32767    110     19
              4    850   1509    186    140    105     18
              6    819   1828  32767  32767    100     15
            254      0     12      APR    2000           
              1  99999  91348   6.97N158.22E    46   2302
              2    100    100    850     54      7      3
              3          PTPN                32767     kt
              9   1007     46    244    232    140      4
              4   1000     99    240    220    135      7
              6    977    304  32767  32767    100     18
        And I want each split file to start with a line such as:


            254      0     12      APR    2000  
        Thanks again for your help!

        Grand Master

          Oct 12, 2006#4

          Okay, here is the macro you need. It uses UltraEdit-style regular expressions, which this macro requires because ^c (the clipboard content) cannot be used within a regular expression with the Unix or Perl engine. If you prefer the Unix or Perl regex engine by default, insert the appropriate macro command for that engine before ExitMacro and at the end of the macro.

          The regular expression search string looks strange because I have restricted it as much as possible so that it really finds only the year lines. The three character sets match JAN, FEB, MAR, APR, MAY, JUN, JUL, AUG, SEP, OCT, NOV, DEC (and some other crazy letter combinations that hopefully do not occur).
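To see how permissive the three character sets are, here is a small Python sketch. Python is not used in this thread; the pattern is an assumed translation of the UltraEdit-style search string into Python `re` syntax, and the names `HEADER_RE`, `MONTHS` and `header_year` are made up for illustration.

```python
import re

# Assumed Python translation of the UltraEdit-style search string.
# "254+" is carried over as written in the macro (one or more trailing 4s).
HEADER_RE = re.compile(
    r"^[ \t]*254+[ \t]+[0-9]+[ \t]+[0-9]+[ \t]+"
    r"[ADFJMNOS][ACEOPU][BCGLNPRTVY][ \t]+((?:19|20)[0-9][0-9])"
)

MONTHS = "JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC".split()


def header_year(line):
    """Return the year as a string if the line is a year header, else None."""
    match = HEADER_RE.match(line)
    return match.group(1) if match else None


# Every real month abbreviation is accepted by the three character sets.
all_months_match = all(
    header_year(f"    254      0     14      {month}    1998") == "1998"
    for month in MONTHS
)

# A nonsense combination such as "DAY" is accepted as well.
nonsense_year = header_year("    254      0     14      DAY    1998")

# An ordinary data line is rejected because it does not start with 254.
data_year = header_year("      1  99999  91348   6.97N158.22E    46   2301")
```

The restriction on the leading 254 and on the century (19 or 20) is what keeps ordinary data lines from being mistaken for year lines; the month sets only narrow the match further.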

          Compared to the macro in Splitting based on content, only a few lines have been deleted or modified (marked green).

          The ExitMacro part after the first search is a safety check. It prevents the macro from running on the wrong file.

          The macro property "Continue if a Find with Replace not found" (also named "Continue if search string not found") must be checked for this macro.

          InsertMode
          ColumnModeOff
          HexOff
          UnixReOff
          Bottom
          IfColNum 1
          Else
          "
          "
          EndIf
          Loop
          GotoBookMark
          IfEof
          ExitLoop
          Else
          ToggleBookmark
          Bottom
          EndIf
          EndLoop
          Top
          ToggleBookmark
          Find MatchCase RegExp "%[ ^t]++254+[ ^t]+[0-9]+[ ^t]+[0-9]+[ ^t]+[ADFJMNOS][ACEOPU][BCGLNPRTVY][ ^t]+^{19^}^{20^}[0-9][0-9]"
          IfNotFound
          ExitMacro
          EndIf

          Clipboard 9
          Key LEFT ARROW
          SelectWord
          Copy
          EndSelect
          Loop
          Find MatchCase RegExp "%[ ^t]++254+[ ^t]+[0-9]+[ ^t]+[0-9]+[ ^t]+[ADFJMNOS][ACEOPU][BCGLNPRTVY][ ^t]+^c"
          IfFound
          Key LEFT ARROW
          Else
          Find MatchCase RegExp "%[ ^t]++254+[ ^t]+[0-9]+[ ^t]+[0-9]+[ ^t]+[ADFJMNOS][ACEOPU][BCGLNPRTVY][ ^t]+^{19^}^{20^}[0-9][0-9]"
          IfFound

          Key HOME
          IfColNumGt 1
          Key HOME
          EndIf
          Else
          Bottom
          EndIf

          ToggleBookmark
          Clipboard 8
          GotoBookMarkSelect
          Copy
          EndSelect
          ToggleBookmark
          GotoBookMark
          NewFile
          Paste
          Clipboard 9
          Paste
          ".csv"
          StartSelect
          Key HOME
          Cut
          EndSelect
          SaveAs "^c"
          CloseFile
          IfEof
          ExitLoop
          Else
          Key END
          Find MatchCase RegExp "%[ ^t]++254+[ ^t]+[0-9]+[ ^t]+[0-9]+[ ^t]+[ADFJMNOS][ACEOPU][BCGLNPRTVY][ ^t]+^{19^}^{20^}[0-9][0-9]"
          Key LEFT ARROW
          SelectWord
          Copy
          EndSelect
          EndIf
          EndIf
          EndLoop
          ToggleBookmark
          ClearClipboard
          Clipboard 8
          ClearClipboard
          Clipboard 0
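For readers who want to check the result outside UltraEdit, the splitting idea can be sketched in Python. This is not the macro's implementation, only a rough equivalent under stated assumptions: the header pattern is simplified to any three capital letters for the month, lines before the first header are skipped, and blocks of the same year are appended together even when they are not adjacent. The names `split_by_year` and `write_year_files` are hypothetical; the `<year>.csv` file name follows what the macro does.

```python
import os
import re

# Simplified header pattern (assumption): 254, two numbers, a 3-letter month, a year.
HEADER_RE = re.compile(
    r"^[ \t]*254[ \t]+[0-9]+[ \t]+[0-9]+[ \t]+[A-Z]{3}[ \t]+((?:19|20)[0-9]{2})"
)


def split_by_year(lines):
    """Group lines into {year: [lines]}; each header line switches the current year."""
    blocks = {}
    current_year = None
    for line in lines:
        match = HEADER_RE.match(line)
        if match:
            current_year = match.group(1)
        if current_year is None:
            continue  # assumption: ignore anything before the first header
        blocks.setdefault(current_year, []).append(line)
    return blocks


def write_year_files(blocks, directory="."):
    """Write each year's lines to <year>.csv in the given directory."""
    for year, block in blocks.items():
        path = os.path.join(directory, year + ".csv")
        with open(path, "w", newline="") as out:
            out.writelines(block)
```

A possible use would be `write_year_files(split_by_year(open("input.txt").readlines()))`, where `input.txt` is a placeholder file name.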

          Newbie

            Oct 13, 2006#5

            Thanks very much, mofi! It really helped me a lot!

            Now I have another problem: I want to replace strings such as " 6.97N158.22E" with "6.97 158.22 " (replacing each letter with a space). How should I write the regular expression for the Replace command?

            Grand Master

              Oct 13, 2006#6

              With the UltraEdit-style regular expression engine, enter the following in the Replace dialog, with the Match Case option checked:

              Find: ^([0-9]+.[0-9][0-9]^)[A-Z]^([0-9]+.[0-9][0-9]^)[A-Z]
              Replace: ^1 ^2 

              Note: There is a single trailing space after ^2.

              Instead of [A-Z] you can also use only the upper case letters allowed for coordinates, which I think is what these values are ([NS] for the first and [EW] for the second?).
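The same replacement can be sketched in Python for comparison. This is an assumed translation into Python `re` syntax, not something from the UltraEdit dialog; it uses the narrower [NS] and [EW] sets suggested above, and the names `COORD_RE` and `strip_hemisphere_letters` are made up for illustration.

```python
import re

# Two decimal numbers, each followed by one hemisphere letter.
# The dot is escaped because in Python syntax a bare "." matches any character.
COORD_RE = re.compile(r"([0-9]+\.[0-9][0-9])[NS]([0-9]+\.[0-9][0-9])[EW]")


def strip_hemisphere_letters(line):
    """Replace each hemisphere letter with a single space, keeping the numbers."""
    return COORD_RE.sub(r"\1 \2 ", line)
```

Because each letter is replaced by exactly one space, the line length and the column positions of the following fields stay the same.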