UE symbol explanations for line terminators / line endings

681
Advanced User

    Feb 21, 2012 #1

    Is there a reference anywhere for symbols in UE?
    I did a couple of searches in the help and here but could not find something useful.

    I have saved a .docx file to text and when I open the text file I see these symbols for EOL


    When I paste parts of this text into a fresh Word file, Word does not interpret the EOL properly.
    I am also aware of possible conversions as shown below.
    I tried the "UNIX/MAC to DOS" conversion and it seemed to help.
    Are the symbols highlighted in yellow then UNIX or MAC end of line characters?

    6,610548
    Grand Master

      Feb 22, 2012 #2

      With View - Show Line Endings (UE v13.10 and later) or View - Show Spaces/Tabs (UE before v13.10) enabled, the following symbols are displayed for the various line termination types:

      The paragraph sign ¶ (decimal 182, hex. 0xB6) is used for DOS line terminations, which consist of a carriage return plus a line-feed.

      The not sign ¬ (decimal 172, hex. 0xAC) is used for UNIX line terminations, which consist of a line-feed only.

      The plus-minus sign ± (decimal 177, hex. 0xB1) is used for MAC line terminations, which consist of a carriage return only.
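
      To check which of these three terminator types a file actually contains, the bytes can be counted directly. The following is only a minimal Python sketch; the function name count_line_endings is hypothetical and has nothing to do with UltraEdit itself.

```python
def count_line_endings(data: bytes) -> dict:
    """Count DOS (CRLF), UNIX (LF only) and MAC (CR only) line terminators."""
    crlf = data.count(b"\r\n")
    # Every CRLF pair also contains one CR and one LF, so subtract the pairs
    # to get the terminators that stand alone.
    lf_only = data.count(b"\n") - crlf
    cr_only = data.count(b"\r") - crlf
    return {"DOS": crlf, "UNIX": lf_only, "MAC": cr_only}


# Made-up sample with one terminator of each type.
sample = b"dos line\r\nunix line\nmac line\rlast line"
print(count_line_endings(sample))
```

      For a real file the data should be read in binary mode, for example with open(path, "rb"), because text mode may translate the line endings before they can be counted.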

      The symbols displayed depend on the font and the selected script (code page). That is why I have also added the decimal and hexadecimal code values of the characters: the symbols can look different with a font other than Courier New or a script other than Western (ANSI 1252, Latin I).
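
      The dependency on the code page can be illustrated by decoding the same three byte values with two different code pages. This is only a sketch in Python; code page 437 is my own choice for the comparison and is not mentioned anywhere in UltraEdit's settings above.

```python
# The three code values used for the DOS, UNIX and MAC line ending symbols.
codes = bytes([0xB6, 0xAC, 0xB1])

western = codes.decode("cp1252")  # Western (ANSI 1252, Latin I)
oem_us = codes.decode("cp437")    # an OEM code page, chosen only for contrast

print(western)  # paragraph sign, not sign, plus-minus sign
print(oem_us)   # same byte values, different glyphs
```

      The byte values are identical in both cases; only the characters they map to change.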

      Carriage return is often abbreviated as CR. It must be written as \r in Unix/Perl regular expressions and in JavaScript strings, and as ^r in UltraEdit regular expressions and in non-regular-expression finds/replaces.

      Line-feed is often abbreviated as LF. It must be written as \n in Unix/Perl regular expressions and in JavaScript strings, and as ^n in UltraEdit regular expressions and in non-regular-expression finds/replaces.

      ^p can be used for the pair carriage return + line-feed in UltraEdit regular expressions and in non-regular-expression finds/replaces. In Unix regular expressions \p can be used for this control character pair. In JavaScript and in Perl regular expressions \r\n must be used because there is no separate definition for this character sequence.
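
      Outside of UltraEdit the same escapes can be used to normalize mixed line endings. The following Python sketch does something comparable to a "UNIX/MAC to DOS" conversion; it is not how UltraEdit implements that command, and the function name to_dos is hypothetical. Python's re module has no shorthand for the pair either, so \r\n is written out.

```python
import re


def to_dos(text: str) -> str:
    """Convert every CRLF, lone CR, or lone LF to CRLF."""
    # \r\n is listed first so an existing pair is matched as one unit
    # instead of being converted twice.
    return re.sub(r"\r\n|\r|\n", "\r\n", text)


mixed = "one\ntwo\rthree\r\nfour"
print(repr(to_dos(mixed)))
```

      Because an existing CRLF is matched as a whole, running the conversion a second time leaves the text unchanged.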


      A carriage return alone is often used by Microsoft Office applications for line breaks within a paragraph or table cell. In word processing applications a line break is not the same as the end of a paragraph. In MS Word a line break can be inserted within a paragraph with Shift+Return. (In its binary storage format (*.doc), MS Word uses 0x0B (vertical tab) for a line break and 0x0D (carriage return) for the end of a paragraph.)

      In MS Excel, a line break inserted within a cell with Alt+Return results, when the table is saved as a CSV file, in a line-feed without a preceding carriage return inside the value. Table data copied from MS Word or MS Excel is always placed in CSV format in the text version of the clipboard.

      According to the CSV specification it is no problem at all if a field value contains CR, LF or CRLF as a line break, as long as the field value is enclosed in double quotes. That does not make the CSV invalid. But many programmers have implemented CSV file reading poorly and interpret every line terminating character as the end of a data row. Many programmers have also coded the export of CSV files poorly: field values containing line terminating characters are not enclosed in double quotes (with quotes inside such values escaped by an additional double quote character).
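
      The difference between a naive reader and a proper CSV parser can be shown with a short Python sketch using the standard csv module. The sample data is made up for illustration.

```python
import csv
import io

# Made-up CSV data: row 1 has an LF inside a quoted value,
# row 2 has double quotes escaped by doubling them.
raw = 'id,comment\r\n1,"first line\nsecond line"\r\n2,"say ""hi"""\r\n'

# Naive reading: every line terminator is taken as the end of a data row.
naive_rows = raw.splitlines()

# Proper reading: newline="" passes the line endings through untouched,
# so the csv module can keep the LF inside the quoted field.
parsed_rows = list(csv.reader(io.StringIO(raw, newline="")))

# Proper writing: values with line terminators or quotes get enclosed in
# double quotes, and embedded quotes are doubled.
out = io.StringIO(newline="")
csv.writer(out).writerow(["1", "a\nb", 'say "hi"'])
written = out.getvalue()

print(len(naive_rows), len(parsed_rows))
print(repr(written))
```

      The naive split sees four lines where the parser returns three rows, which is exactly the kind of reading error described above.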

      For details on CSV see Wikipedia article Comma-separated values.

      681
      Advanced User

        Re: UE symbol explanations for line terminators / line endings - Thanks Mofi

        Feb 22, 2012 #3

        As ever thank you Mofi. Great reference with even better useful detail about Word/Excel. Thank you sir :)