Tapatalk

How to sort lines with tab separated values?

How to sort lines with tab separated values?

9

    Jul 29, 2021#1

    I just updated UE to the newest version available 28.10.0.98 (64-bit) yesterday. I noticed to my surprise that UE can't sort correctly my file with the settings I use.

    What I need is to sort from column 33 to the end, which is a time format: YYMM tab HHMM
    I tried with column 33 - 36, based on tab stop value = 8. The sorting should be done from a selected text as shown.

    No matter what I have done this just doesn't work. I googled around and found a similar thread from 2015. I'm surprised that such bug has been around for such long time. The former UE version I used on this PC was v19 released in 2013, and it did the sorting job perfectly.

    BR Lars
    contact.txt (582 Bytes)   0
    contact_p01.png (47.93KiB)

    6,685587
    Grand MasterGrand Master
    6,685587

      Jul 30, 2021#2

      There is used in UltraEdit for Windows < v23.20 an internal function for sorting lines which takes horizontal tabs and the configured tab stop value into account. So the sort with the settings to sort according to columns 33 to 36 works with these versions of Windows.

      UltraEdit for Windows v23.20 introduced a new sort with much more features using the third-party executable sort.exe for sorting lines or entire files which is installed with UltraEdit in the subdirectory GNU\sort in the program files directory of UltraEdit. I suggest to look on More information about the sort function, especially about numeric and column sort and how to sort words on lines and on the power tip How to sort a text file or CSV file with UltraEdit.

      There is now the special sort Tab delimited sort for a file with tab separated values (TSV files) and a Custom delimited sort for CSV files using a comma or a semicolon or a vertical bar as separator. But the Tab delimited sort cannot be used in your case as the tabs are not used to separate the values, but to align the text in the file. So the number of tabs varies between the lines.

      The problem here with this text file using horizontal tabs for aligning the text is that the columns are for sort.exe in real the number of characters in a line. The horizontal tabs are not interpreted as placeholders for 1 to n (8 in your case) spaces as the internal sort of UltraEdit for Windows < v23.20 does. That makes the sort very fast, but makes it not really possible to directly sort files like the attached one using horizontal tabs for text alignment.

      For sorting a text file with a varying number of horizontal tabs on each line to align the text in columns according to a specific column range it is necessary with UE v28.10.0.98 to do the following:
      1. Convert all tabs to spaces in active file or current selection
        In ribbon mode click on ribbon tab Format in second group Convert on the second item Tabs / spaces.
        In toolbar/menu mode with contemporary menus click in menu Format in the submenu Tabs / spaces on the first menu item Tabs to spaces.
        In toolbar/menu mode with traditional menus click in menu Format on the menu item Tabs to spaces.
         
      2. Run a column (character count) based sort on entire file or current selection
        In ribbon mode click on ribbon tab Edit in last but one group Sort on the down arrow of the item Sort and click in popup menu on the second item Advanced sort/options.
        In toolbar/menu mode with contemporary menus click in menu Edit in the submenu Sort on the second menu item Advanced sort/options.
        In toolbar/menu mode with traditional menus click in menu File in the submenu Sort on the second menu item Advanced sort/options.

        The options to use for your sort based on columns 33 to 36, i.e. just YYMM, are as follows:
        Sort order: Ascending
        Remove duplicates: unchecked
        Ignore case: unchecked
        Numeric sort: unchecked
        Tab delimited sort: unchecked
        Custom delimited sort: unchecked
        Key 1: Start column 33 and End column 36
        Key 2: Start column 1 and End column 32
        Key 3: Start column 0 and End column 0
        Key 4: Start column 0 and End column 0
        Use locale (slower): unchecked
        Key 2 would be with Start column 41 and End column 44 and Key 3 would be with Start column 1 and End column 32 for a sort taking additionally also HHMM into account.
         
      3. Convert all series of spaces back to tabs in active file or current selection
        In ribbon mode click on ribbon tab Format in second group Convert on the down arrow of the second item Tabs / spaces and click in popup menu on second item Spaces to tabs (all).
        In toolbar/menu mode with contemporary menus click in menu Format in the submenu Tabs / spaces on the second menu item Spaces to tabs (all).
        In toolbar/menu mode with traditional menus click in menu Format on the menu item Spaces to tabs (all).
      Well, in this use case the number of horizontal tabs on the lines to sort is constant for the first six data columns. For that reason it is also possible to use a tab delimited sort with following advanced sort options:

      Sort order: Ascending
      Remove duplicates: unchecked
      Ignore case: unchecked
      Numeric sort: unchecked
      Tab delimited sort: checked
      Custom delimited sort: is grayed out
      1. Field: 5 with Start character 0 and End character 0
      2. Field: 6 with Start character 0 and End character 0
      3. Field: 1 with Start character 0 and End character 0
      4. Field: 3 with Start character 0 and End character 0
      Use locale (slower): unchecked

      Start character 0 and End character 0 means the entire tab separated field value is taken into account for the sort.
      Best regards from an UC/UE/UES for Windows user from Austria

      9

        Aug 01, 2021#3

        Thank you for your answer :-) The change with Tabs to spaces (selected text for sort), solved the issue, so UE sorted correctly as needed.
        I did use the same setting with start from column 33 and then -1 in the end column field, and that worked fine.

        BR Lars