More information about the sort function, especially about numeric and column sort and how to sort words on lines


    Dec 10, 2020#1

    Hello,

    I've read the descriptions about the sorting function:
    However, I still have some basic questions/misunderstandings:

    1) Line sorting

    I've tried to sort the words within each line (rather than sorting by column), but I didn't succeed.

    Let's take an example of lines in UltraEdit:

    Code: Select all

    beta gamma alpha
    zebra tree system
    bag store car
    I would like to sort every line alphabetically, and expect the following result:

    Code: Select all

    alpha beta gamma
    system tree zebra
    bag car store
    Can we do that without a script? Or is the sort only designed for lines in entire file / selected block?

    2) Numeric sort

    Could you please give me a simple example of that function?

    The description is not clear to me, and when I try to use it, nothing changes in my data.

    Thanks a lot for your support!


      Dec 10, 2020#2

      The Sort feature always sorts entire lines in the selected block or the entire file. It never sorts words (space/tab separated strings) within a line. An UltraEdit script would be required for such a sort of the words within each line.

      Here is an example demonstrating numeric and column-based sorts.

      The file contains the following lines:

      Code: Select all

      10 number 10
      3  number 1000
      2 number 1
      2  number 1000
      1  number 100
      The result is as follows for a standard ascending sort with none of the checkbox options checked, key 1 starting at column 1 and ending at column -1, and the start and end columns of the other three keys all set to 0:

      Code: Select all

      1  number 100
      10 number 10
      2  number 1000
      2 number 1
      3  number 1000
      That is a simple string-based sort of the lines according to the code values of the characters on each line. The first three characters of the lines listed above have the following hexadecimal code values:

      Code: Select all

      31 20 20
      31 30 20
      32 20 20
      32 20 6E
      33 20 20
      So a simple, non-numeric sort compares two strings (lines) character by character and decides from the code values of the characters which of the two strings is less than or equal to the other.

      The result is as follows for an ascending sort with the option Numeric sort checked and all other checkbox options unchecked, key 1 covering columns 1 to 2, key 2 covering column 3 to -1, and the start and end columns of the other two keys all set to 0:
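As a rough illustration of such a comparison by character codes, here is a minimal sketch in plain JavaScript, meant to be run outside UltraEdit (for example with Node.js). The function name compareByCodeValues is made up for this example, and the code only mimics the idea; it is not how UltraEdit or sort.exe implements the sort.

```javascript
// Hypothetical helper: compare two lines character by character
// by the code values of their characters (UTF-16 code units here).
function compareByCodeValues(sLeft, sRight)
{
   var nCount = Math.min(sLeft.length, sRight.length);
   for (var nIndex = 0; nIndex < nCount; nIndex++)
   {
      var nDiff = sLeft.charCodeAt(nIndex) - sRight.charCodeAt(nIndex);
      if (nDiff !== 0) return nDiff;
   }
   // All compared characters are equal: the shorter line is the lesser one.
   return sLeft.length - sRight.length;
}

var asLines = [
   "10 number 10",
   "3  number 1000",
   "2 number 1",
   "2  number 1000",
   "1  number 100"
];

// Sort a copy so that the original order stays available.
var asSorted = asLines.slice().sort(compareByCodeValues);
```

With the five example lines, asSorted has the same order as the string sort result listed above, because a space (hex 20) is less than both the digit 0 (hex 30) and the letter n (hex 6E).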

      Code: Select all

      1  number 100
      2  number 1000
      2 number 1
      3  number 1000
      10 number 10
      In this case the lines are sorted first numerically according to the one- or two-digit number at the beginning of each line, and next according to the strings from column 3 to the end of the line. Lines with an identical number at the beginning keep their order in this case.

      For a numeric sort it is necessary to specify the columns that contain the number, because the strings in these columns are converted to numbers. It is important for the numeric sort that the specified columns contain only digits and leading/trailing spaces/horizontal tabs, optionally a decimal point and decimal places. One or more thousands separators are possible, too. The character to the left of the first digit can be a - sign (the ASCII hyphen-minus character, not the Unicode minus sign).

      The result is as follows for an ascending sort with the option Numeric sort checked and all other checkbox options unchecked, key 1 covering columns 10 to 14, key 2 covering columns 1 to 2, and the start and end columns of the other two keys all set to 0:
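The requirements on a numeric key can be sketched in plain JavaScript as follows. This is only an illustration under assumptions: the function name parseNumericKey is hypothetical, the comma is assumed to be the thousands separator and the point the decimal point, and the pattern is deliberately lenient about where the separators are placed. The real conversion is done by the sort program and may differ in details.

```javascript
// Hypothetical helper: convert the text in columns nStartCol..nEndCol
// (1-based, inclusive; -1 means end of line) to a number.
// Returns null if the key does not have the described numeric form.
// Assumption: "," is the thousands separator and "." the decimal point.
function parseNumericKey(sLine, nStartCol, nEndCol)
{
   var sKey = (nEndCol < 0) ? sLine.substring(nStartCol - 1)
                            : sLine.substring(nStartCol - 1, nEndCol);

   // Optional leading spaces/tabs, optional ASCII hyphen-minus, digits
   // with optional thousands separators, optional decimal places,
   // optional trailing spaces/tabs.
   if (!/^[ \t]*-?\d[\d,]*(\.\d+)?[ \t]*$/.test(sKey)) return null;

   // Remove the thousands separators before the conversion.
   return parseFloat(sKey.replace(/,/g, ""));
}
```

For example, parseNumericKey("1  number 100", 10, 14) evaluates the string " 100" and returns 100, while parseNumericKey("2 number 1", 3, -1) returns null because "number 1" is not a number.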

      Code: Select all

      2 number 1
      10 number 10
      1  number 100
      2  number 1000
      3  number 1000
      In this case the lines are sorted first according to the second number in each line in columns 10 to 14 (a leading space is ignored on conversion of the number strings to numbers) and, if these numbers are identical, second according to the first number in each line in columns 1 and 2 (a trailing space is ignored on conversion).

      The column numbers are shown in the status bar at the bottom of the main application window. Before opening the dialog window Advanced Sort/Options for a column/numeric sort, it is advisable to place the caret at each relevant column and note down the column number shown in the status bar.
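The two-key numeric sort described above can be reproduced with a small plain JavaScript sketch (run outside UltraEdit). The function names numericKey and sortByTwoNumericKeys are made up for this illustration, and the column numbers are hard-coded for the example lines.

```javascript
// Hypothetical helper: numeric value of columns nStartCol..nEndCol
// (1-based, inclusive). parseFloat ignores leading spaces and stops at
// the first character that is not part of the number.
function numericKey(sLine, nStartCol, nEndCol)
{
   return parseFloat(sLine.substring(nStartCol - 1, nEndCol));
}

// Sort ascending by key 1 (columns 10 to 14) and, on identical values
// of key 1, by key 2 (columns 1 to 2). A sorted copy is returned.
function sortByTwoNumericKeys(asLines)
{
   return asLines.slice().sort(function (sLeft, sRight)
   {
      var nDiff = numericKey(sLeft, 10, 14) - numericKey(sRight, 10, 14);
      if (nDiff !== 0) return nDiff;
      return numericKey(sLeft, 1, 2) - numericKey(sRight, 1, 2);
   });
}

var asTwoKeyResult = sortByTwoNumericKeys([
   "10 number 10",
   "3  number 1000",
   "2 number 1",
   "2  number 1000",
   "1  number 100"
]);
```

Only the two lines ending with 1000 have the same value for key 1, so key 2 decides their order (2 before 3).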

        Dec 10, 2020#3

        The following UltraEdit script can be used to sort the words on every (selected) line in a not too large file without sorting the lines in the file.

        Code: Select all

        if (UltraEdit.document.length > 0)  // Is any file opened?
        {
           // Define environment for this script.
           UltraEdit.insertMode();
           if (typeof(UltraEdit.columnModeOff) == "function") UltraEdit.columnModeOff();
           else if (typeof(UltraEdit.activeDocument.columnModeOff) == "function") UltraEdit.activeDocument.columnModeOff();
        
           // Select the entire file on nothing selected on script start.
           if (!UltraEdit.activeDocument.isSel())
           {
              UltraEdit.activeDocument.selectAll();
           }
        
           // Is there something selected (file not empty)?
           if (UltraEdit.activeDocument.isSel())
           {
              // Define the string terminating a line in active file.
              var sLineTerm;
              if (UltraEdit.activeDocument.lineTerminator < 1) sLineTerm = "\r\n";
              else if (UltraEdit.activeDocument.lineTerminator == 1) sLineTerm = "\n";
              else sLineTerm = "\r";
        
              // Get all selected lines loaded as an array of strings.
              var asLines = UltraEdit.activeDocument.selection.split(sLineTerm);
        
              // Sort the strings consisting of non-whitespace characters of each line.
              for (var nLine = 0; nLine < asLines.length; nLine++)
              {
                 var asWords = asLines[nLine].match(/\S+/g);
                 if (asWords != null)       // Are there "words" at all?
                 {
                    if (asWords.length > 1) // Are there at least two words?
                    {
                       // Sort them case-sensitive and join them with a space between.
                       asWords.sort();
                       asLines[nLine] = asWords.join(" ");
                    }
                 }
              }
        
              // Join the lines together to a large string and overwrite selected
              // text by the lines on which the "words" are simply sorted.
              UltraEdit.activeDocument.write(asLines.join(sLineTerm));
           }
        }
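
The core of the script can also be tried outside UltraEdit, for example with Node.js. The following sketch wraps the same split, match, sort and join steps in a stand-alone function; the name sortWordsOnLines is made up for this example and no UltraEdit object is used.

```javascript
// Hypothetical stand-alone version of the per-line word sort.
// sText is the text to process, sLineTerm the line terminator string.
function sortWordsOnLines(sText, sLineTerm)
{
   var asLines = sText.split(sLineTerm);
   for (var nLine = 0; nLine < asLines.length; nLine++)
   {
      // Get all strings consisting of non-whitespace characters.
      var asWords = asLines[nLine].match(/\S+/g);
      if (asWords !== null && asWords.length > 1)
      {
         // Case-sensitive sort, words joined with a single space.
         asLines[nLine] = asWords.sort().join(" ");
      }
   }
   return asLines.join(sLineTerm);
}

var sWordSortResult = sortWordsOnLines(
   "beta gamma alpha\nzebra tree system\nbag store car", "\n");
```

For the three lines from the question, sWordSortResult contains the lines "alpha beta gamma", "system tree zebra" and "bag car store". Because sort() without a compare function orders by character codes, words starting with an uppercase letter would be placed before all words starting with a lowercase letter.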
        
        Best regards from an UC/UE/UES for Windows user from Austria


          Dec 11, 2020#4

          Thank you very much, Mofi, your explanation is super clear!

          If I understand correctly, we have to choose either a numeric or an alphabetic sort.

          So it is not possible to sort numerically on the first column and then, as a second sort, alphabetically on the strings in the second column.

          For example, let's take this initial data:

          Code: Select all

          2  bumber 1
          10 number 10
          1  number 100
          2  aumber 1000
          3  number 1000
          What we expect is that the string "aumber" is sorted before "bumber":

          Code: Select all

          1  number 100
          2  aumber 1000
          2  bumber 1
          3  number 1000
          10 number 10


            Dec 11, 2020#5

            Yes, you are right. It is not possible with UltraEdit v27.10.0.164 to sort lines numerically according to the number at the beginning of the lines (key 1) and, for identical numbers, alphabetically according to the strings defined by key 2. If the string of a key cannot be successfully converted to a number, the order of the two compared lines is not modified at all.

            The result is as follows for your example on running an ascending sort with the option Numeric sort checked and all other checkbox options unchecked, key 1 covering columns 1 to 2, key 2 covering column 3 to -1, and the start and end columns of the other two keys all set to 0:
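This behavior can be modelled with a short plain JavaScript sketch (run outside UltraEdit). It rests on two assumptions that are not confirmed for UltraEdit itself: a key that cannot be converted to a number is counted as 0, and lines with equal keys keep their original order. The function names are made up for this illustration, and parseFloat is more lenient than a real numeric sort.

```javascript
// Assumption: a key that is not a number counts as 0, so two
// non-numeric keys always compare as equal.
function numericKeyOrZero(sLine, nStartCol, nEndCol)
{
   var sKey = (nEndCol < 0) ? sLine.substring(nStartCol - 1)
                            : sLine.substring(nStartCol - 1, nEndCol);
   var nValue = parseFloat(sKey);
   return isNaN(nValue) ? 0 : nValue;
}

// Numeric sort with key 1 = columns 1 to 2 and key 2 = column 3 to end.
function numericSortKeepingOrder(asLines)
{
   // Remember the original position of every line so that lines with
   // equal keys keep their order even if Array.sort is not stable.
   var aoItems = [];
   for (var nLine = 0; nLine < asLines.length; nLine++)
   {
      aoItems.push({ sLine: asLines[nLine], nPos: nLine });
   }

   aoItems.sort(function (oLeft, oRight)
   {
      var nDiff = numericKeyOrZero(oLeft.sLine, 1, 2) -
                  numericKeyOrZero(oRight.sLine, 1, 2);
      if (nDiff === 0)
      {
         nDiff = numericKeyOrZero(oLeft.sLine, 3, -1) -
                 numericKeyOrZero(oRight.sLine, 3, -1);
      }
      return (nDiff !== 0) ? nDiff : oLeft.nPos - oRight.nPos;
   });

   var asSorted = [];
   for (var nItem = 0; nItem < aoItems.length; nItem++)
   {
      asSorted.push(aoItems[nItem].sLine);
   }
   return asSorted;
}
```

In this model the strings of key 2 never influence the result for the example data, because none of them is a number. The two lines starting with 2 therefore stay in the order they have in the input.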

            Code: Select all

            1  number 100
            2  bumber 1
            2  aumber 1000
            3  number 1000
            10 number 10
            Better would be:

            Code: Select all

            1  number 100
            2  aumber 1000
            2  bumber 1
            3  number 1000
            10 number 10
            Another example:

            Code: Select all

            2  bumber 1
            We try to explain it.
            10 number 10
            1  number 100
            How does sort work?
            2  aumber 1000
            3  number 1000
            The result is as follows with same sort options as above:

            Code: Select all

            We try to explain it.
            How does sort work?
            1  number 100
            2  bumber 1
            2  aumber 1000
            3  number 1000
            10 number 10
            Better would be:

            Code: Select all

            1  number 100
            2  aumber 1000
            2  bumber 1
            3  number 1000
            10 number 10
            How does sort work?
            We try to explain it.
            From a programmer's point of view I do not really understand why the sort does not fall back to a string sort if the conversion from string to number fails. The string comparison could be done for two strings that cannot both be converted successfully to numbers, with or without ignoring case, and could even be locale-specific. The sort option Use locale (slower) is currently disabled when Numeric sort is checked and vice versa. This would not be necessary if, with the numeric sort option enabled, a string comparison were done for those strings that cannot be converted successfully to numbers.

            Since UltraEdit for Windows v23.20 the sort is done by sort.exe in the subdirectory GNU, while in former versions it was done by code inside UltraEdit. sort.exe can also be used directly from a command prompt window to sort an entire file. Running sort.exe --help briefly describes the available options. More information about the options can be read at Sort text files. I have just read in that manual that a general numeric sort is available, which converts the strings to double-precision floating-point numbers rather than integers, and that there is also a special numeric sort which also sorts numbers with decimal places.
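Such a fallback could look like the following plain JavaScript sketch (run outside UltraEdit). It is only one possible reading of the idea: the function names are made up, the rule that a number sorts before a non-number is an assumption taken from the "Better would be" listings above, and the string comparison is a simple case-sensitive comparison by character codes without any locale support.

```javascript
// Hypothetical helper: text of columns nStartCol..nEndCol
// (1-based, inclusive; -1 means end of line).
function keyText(sLine, nStartCol, nEndCol)
{
   return (nEndCol < 0) ? sLine.substring(nStartCol - 1)
                        : sLine.substring(nStartCol - 1, nEndCol);
}

// Compare two key strings numerically if both are numbers and as
// strings if neither is a number. Assumption: a number sorts before
// a non-number.
function compareKeysWithFallback(sLeft, sRight)
{
   var rNumber = /^\s*-?\d+(\.\d+)?\s*$/;
   var bLeftIsNumber = rNumber.test(sLeft);
   var bRightIsNumber = rNumber.test(sRight);

   if (bLeftIsNumber && bRightIsNumber)
   {
      return parseFloat(sLeft) - parseFloat(sRight);
   }
   if (bLeftIsNumber) return -1;
   if (bRightIsNumber) return 1;

   // Fallback: case-sensitive string comparison by character codes.
   if (sLeft < sRight) return -1;
   if (sLeft > sRight) return 1;
   return 0;
}

// Key 1 = columns 1 to 2, key 2 = column 3 to end of line.
function sortWithFallback(asLines)
{
   return asLines.slice().sort(function (sLeft, sRight)
   {
      var nResult = compareKeysWithFallback(keyText(sLeft, 1, 2),
                                            keyText(sRight, 1, 2));
      if (nResult !== 0) return nResult;
      return compareKeysWithFallback(keyText(sLeft, 3, -1),
                                     keyText(sRight, 3, -1));
   });
}
```

Applied to the seven lines of the second example, this sketch produces the order shown in the last "Better would be" listing: the numbered lines first, with aumber before bumber, followed by the two text lines in alphabetical order.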

            Special numeric sort example:

            Code: Select all

            3.2 line 1
            -10 line 2
            1.2 line 3
            2.4 line 4
            2.1 line 5
            The result of a numeric sort according to columns 1 to 4 done with UltraEdit:

            Code: Select all

            -10 line 2
            1.2 line 3
            2.1 line 5
            2.4 line 4
            3.2 line 1
            The first point on how a general numeric sort is done in the referenced manual is:
            • Lines that do not start with numbers (all considered to be equal).
            That explains why the order of lines does not change if a string defined by the keys cannot be converted successfully to a number. However, as I found out with Process Monitor, UltraEdit runs sort.exe with the option --numeric-sort and not with the option --general-numeric-sort when the option Numeric sort is checked for the sort of the example above. But it looks like the same first rule applies when a key string is not a number and the sort runs with the option --numeric-sort.

            PS: I learned a lot about sorting lines with UltraEdit while answering your questions, as most of what I wrote here was not known to me before.


              Dec 11, 2020#6

              Thanks a lot, Mofi! You have really mastered UE.