NULL character cannot be used in clipboard in text mode



    Dec 25, 2008#1

    UltraEdit.clipboardContent won't accept NULL char :(
    e.g.:

    Code:

    UltraEdit.clipboardContent = "Hello\0world\n";
    UltraEdit.activeDocument.paste();
    Result: Hello

    Fix that plz!


      Dec 25, 2008#2

      Hi, this is a user-to-user forum. You probably want to read what's written on top of every page here :)


        Dec 27, 2008#3

        NULLs are not possible because the JavaScript core engine handles strings as NULL-terminated character arrays, so the first NULL byte in a character array automatically ends the string. IDM can't change this, and of course NULL bytes should never be used or present in text files.
        Best regards from an UC/UE/UES for Windows user from Austria


          Dec 27, 2008#4

          pietzcker wrote:Hi, this is a user-to-user forum. You probably want to read what's written on top of every page here :)
          Thanks, I wrote to the support about the issue...
          Mofi wrote:NULLs are not possible because the JavaScript core engine handles strings as NULL-terminated character arrays, so the first NULL byte in a character array automatically ends the string. IDM can't change this, and of course NULL bytes should never be used or present in text files.
          And I got a similar reply, which is incorrect -_-'
          Javascript can handle NULL chars.
          Write the following in your address bar and press enter:
          javascript:var a="zero\0zero";alert(a.length);
          You get 9, not 4!
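The same check can be made without an alert box, which sidesteps any differences in how a dialog renders the character. A minimal sketch in plain JavaScript (the helper name `describeString` is made up for this illustration):

```javascript
// Hypothetical helper: report what the engine really stores for a string.
function describeString(s) {
  var codes = [];
  for (var i = 0; i < s.length; i++) {
    codes.push(s.charCodeAt(i)); // numeric code of each character
  }
  return { length: s.length, nulIndex: s.indexOf("\0"), codes: codes };
}

var info = describeString("zero\0zero");
// info.length is 9 and info.nulIndex is 4; the second "zero" is still in info.codes.
```

Inspecting character codes instead of displaying the string keeps the result independent of whatever function does the output.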

          Also, try doing the following:
          Go to UltraEdit, copy a null char using HEX mode, then execute the following code (in HEX mode):

          Code:

          UltraEdit.activeDocument.paste();
          UltraEdit.clipboardContent = UltraEdit.clipboardContent;
          UltraEdit.activeDocument.paste();
          To save you time: it will paste only the first time.
          Now, does it seem logical to you?
          I'd call it a bug/incorrect behavior.


            Dec 28, 2008#5

            Yes, for your example the JavaScript engine creates a byte array of the correct size, and the length property contains the correct number of bytes in the array. But the array's length property is not equal to the string length in your example. It is the size of the array, not the length of the string. Try the following in an HTML file:

            Code:

            <script type="text/javascript">
               var Text1="zero\0zero";
               alert("Text 1 is: "+Text1);
               alert("Text 1 length is: "+String(Text1.length));
               var Text2 = "Text is: "+Text1+" and text length is: "+String(Text1.length);
               alert(Text2);
               alert("Text 2 length is: "+String(Text2.length));
            </script>
            and you will see only "Text 1 is: zero" in the first message box and "Text 1 length is: 9" in the second message box.
            The third alert also outputs only "Text is: zero": the strings are concatenated correctly, but the NULL character inside the string variable acts as the terminating character when printing. This is confirmed by the fourth message showing "Text 2 length is: 40".

            And yes, the clipboard buffer is just a dynamic memory buffer which can also hold binary data. The clipboard is not a string array but a byte array, so it can contain all values from 0 to 255, and these are (normally) not interpreted specially on access the way the bytes in a string array are.

            UltraEdit.clipboardContent = UltraEdit.clipboardContent;

            I'm not surprised that this clears the clipboard content, because the clipboard is a dynamically allocated memory buffer managed with malloc() and free() in C or new and delete in C++ (which also use malloc() and free() in the background). So when you assign (= copy) a buffer to the clipboard, the first function called is free() to release the existing memory buffer of the (active) clipboard. Now the clipboard buffer pointer is NULL, and therefore the subsequent function calls for getting the size in bytes of the new data, allocating a buffer of the required size from the heap and copying the data must fail. So your line is the same as UltraEdit.clearClipboard(); with the only difference being that you also test the error handling of the functions used in the background for managing dynamic memory buffers.

            However, the problem you have here is not the JavaScript engine and how it handles bytes with value 0. The problem is that, in text mode, UltraEdit stops copying characters from the clipboard buffer to the file buffer at the first occurrence of a NULL byte. From my point of view this is correct behavior, because NULL bytes are binary bytes and should not exist in a text file (except in UTF-16 files). For example, if you create a new file, write some text, switch to hex edit mode, replace one character with 00, select all bytes in hex edit mode and copy them, turn off hex editing and paste the clipboard content into the file in text edit mode, UltraEdit pastes only the text to the left of the NULL byte. But when you switch to hex edit mode again and paste the clipboard content, you will see that the clipboard really contains all bytes, because all of them are inserted in hex edit mode.


              Dec 28, 2008#6

              Sorry, but I have to disagree...
              Mofi wrote:But the array's length property is not equal to the string length in your example.
              Yes it does, it always does.
              Try the following HTML file:

              Code:

              <script type="text/javascript">
                 var Text1="one\0two";
                 alert("Text 1 is: "+Text1);
                 alert("Text 1 starting from 4-th char is: "+Text1.substr(4));
              </script>
              Actually, the first alert prints different text in different browsers: IE and Opera print "one", while Firefox and Chrome print "onetwo". That's because the alert function works differently in these browsers, that's all. JavaScript has all the data, and the second alert proves that.
              Mofi wrote:UltraEdit.clipboardContent = UltraEdit.clipboardContent;

              I'm not surprised that this clears the clipboard content, because the clipboard is a dynamically allocated memory buffer managed with malloc() and free() in C or new and delete in C++ (which also use malloc() and free() in the background). So when you assign (= copy) a buffer to the clipboard, the first function called is free() to release the existing memory buffer of the (active) clipboard.
              That's not related to the topic. C and Javascript have different rules, although they have a similar syntax.
              Mofi wrote:The problem is that, in text mode, UltraEdit stops copying characters from the clipboard buffer to the file buffer at the first occurrence of a NULL byte. From my point of view this is correct behavior, because NULL bytes are binary bytes and should not exist in a text file (except in UTF-16 files).
              That's not the problem. Actually, although I don't agree with that behavior either, it's logical in a way, and I can accept it. The problem is that it won't work even in HEX mode, where it is supposed to. But the real problem is the assignment of a string to UltraEdit.clipboardContent (which brings us back to my first post).

              Now, try this code in UE:

              Code:

              var str;
              
              str = "test";
              UltraEdit.clipboardContent = "test";
              UltraEdit.activeDocument.write(str.length+";"+UltraEdit.clipboardContent.length+"\n");
              str = "test\0with NULL byte";
              UltraEdit.clipboardContent = "test\0with NULL byte";
              UltraEdit.activeDocument.write(str.length+";"+UltraEdit.clipboardContent.length+"\n");
              The output is:
              4;4
              19;4
              Which proves that the behavior of UltraEdit.clipboardContent is not a correct behavior of a Javascript string, and (first post again) - "UltraEdit.clipboardContent won't accept NULL char"
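The 19;4 result is what one would expect if the assigned string passes through a C-style, NUL-terminated copy somewhere along the way. The sketch below only models that effect in plain JavaScript; `copyLikeCString` is a hypothetical name, and how UltraEdit actually implements `clipboardContent` is not known from this thread.

```javascript
// Hypothetical model of a NUL-terminated copy: keep everything before the first "\0".
function copyLikeCString(s) {
  var end = s.indexOf("\0");
  return end === -1 ? s : s.substring(0, end);
}

var str = "test\0with NULL byte";
var modelled = copyLikeCString(str);

// Same shape as the script output above: full length, then truncated length.
var report = str.length + ";" + modelled.length;
```

A string without a NUL character passes through this model unchanged, which matches the 4;4 line of the output.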


                Dec 30, 2008#7

                999999 wrote:Actually, the first alert prints different text in different browsers: IE and Opera print "one", while Firefox and Chrome print "onetwo". That's because the alert function works differently in these browsers, that's all. JavaScript has all the data, and the second alert proves that.
                Yes, that's what I wanted to explain. The array contains all bytes, but how the bytes are interpreted on read depends on the interpreting functions. In C/C++ a NULL-terminated string (not a Unicode string) is always printed from the first byte to the first occurrence of a byte with value 0. This is of course a very dangerous definition if the NULL byte was somehow overwritten beforehand, which can easily happen during string manipulation. IE and Opera (= my preferred browser) handle it this way in the alert() function. Unicode string arrays contain an extra length property which is also used for printing Unicode strings. It looks like Firefox uses this length information for printing in the alert() function. I don't know which implementation is correct for JavaScript, and of course it is not important for the issue here.
                999999 wrote:That's not related to the topic. C and Javascript have different rules, although they have a similar syntax.
                Yes, that is not related to the topic. But you asked why the second paste inserted nothing, and I gave you the answer. UltraEdit.clipboardContent is not a core JavaScript function; it is a function implemented by IDM and written in C++. However, I'm quite sure that the methods for handling dynamic memory buffers are the same in all programming languages, and of course the JavaScript engine used by UE/UES is probably also written in C/C++. But I don't know exactly.
                999999 wrote:Which proves that the behavior of UltraEdit.clipboardContent is not a correct behavior of a Javascript string, and (first post again) - "UltraEdit.clipboardContent won't accept NULL char"
                As I can see from your code, I have to agree that the string with the NULL byte is handled differently by the clipboardContent function written by IDM than by the dynamically created string variables of the JavaScript core engine. And I now agree that this may be a shortcoming in the program code of UE/UES. But only the IDM developers can look into this issue and find out where the truncation of the string with the NULL byte occurs.

                But I guess the priority to look into this issue is very low for IDM because who really needs to use the clipboard with strings including NULL bytes with that method?


                  Dec 30, 2008#8

                  Mofi wrote:Yes, that's what I wanted to explain. The array contains all bytes, but how the bytes are interpreted on read depends on the interpreting functions. In C/C++ a NULL-terminated string (not a Unicode string) is always printed from the first byte to the first occurrence of a byte with value 0. ... Unicode string arrays contain an extra length property which is also used for printing Unicode strings.
                  Just so you know: the influence of the NULL char is not related to whether the string is Unicode or not. It depends on the program's implementation. For example, the Windows API has ASCII and Unicode versions of its functions, and both check for a NULL terminator (for Unicode it's a double NULL char). But the low-level functions use a structure consisting of a Unicode string and its length.
                  If you are interested, you can read more about it here: http://technet.microsoft.com/en-us/sysi ... 6.aspx#ECC
                  Mofi wrote:But I guess the priority to look into this issue is very low for IDM because who really needs to use the clipboard with strings including NULL bytes with that method?
                  I do :)
                  Actually what I wanted to do is write a script that will convert an ASCII file with this kind of format:
                  12 34 56 78 90
                  To a binary file, but I don't see another way except the clipboard (write() won't work in HEX mode either).
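Parsing that format is the easy part and works in plain JavaScript; a minimal sketch follows. `hexTextToBytes` is a hypothetical helper name, and getting the resulting byte values into the document is still the open question of this thread.

```javascript
// Hypothetical helper: turn "12 34 56 78 90" into an array of byte values.
function hexTextToBytes(text) {
  var tokens = text.split(/\s+/);
  var bytes = [];
  for (var i = 0; i < tokens.length; i++) {
    if (tokens[i] === "") continue;              // skip leading/trailing whitespace
    if (!/^[0-9A-Fa-f]{2}$/.test(tokens[i])) {
      throw new Error("Not a two-digit hex value: " + tokens[i]);
    }
    bytes.push(parseInt(tokens[i], 16));         // "00" becomes 0 without any trouble
  }
  return bytes;
}

var bytes = hexTextToBytes("12 34 56 78 90"); // [0x12, 0x34, 0x56, 0x78, 0x90]
```

Keeping the values as numbers rather than as a string avoids the NULL character question until the moment the data has to be written somewhere.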


                    Dec 31, 2008#9

                    Aha! You want to write a script which converts an ASCII string with hexadecimal values into binary data.

                    What about using a special ASCII character sequence such as #!? as a placeholder for the NULL character, and finally running a hexadecimal replace all in hex edit mode to convert every occurrence of this sequence into a NULL byte as a workaround? For example, with the following scripting code at the end of your script:

                    Code:

                    UltraEdit.ueReOn();
                    UltraEdit.activeDocument.hexOn();                        // Turn on hex edit mode.
                    UltraEdit.activeDocument.top();                          // Set cursor to top of the file.
                    UltraEdit.activeDocument.findReplace.searchAscii=false;  // Run replace with hexadecimal values.
                    UltraEdit.activeDocument.findReplace.matchCase=false;
                    UltraEdit.activeDocument.findReplace.searchDown=true;
                    UltraEdit.activeDocument.findReplace.replaceAll=true;    // Replace all occurrences in the current file.
                    UltraEdit.activeDocument.findReplace.regExp=false;
                    UltraEdit.activeDocument.findReplace.mode=0;           
                    UltraEdit.activeDocument.findReplace.replace("23213F", "00");  // Replace all  #!?  with a NULL byte.
                    You can use a longer, more complicated place holder string if you think this sequence could exist in the data.
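To pick a placeholder and build the hexadecimal search string for it, something like the following could be used. This is a sketch in plain JavaScript; `toHexString` and `isSafePlaceholder` are hypothetical helpers, and the sketch assumes the placeholder consists of single-byte ASCII characters.

```javascript
// Hypothetical helper: hex digits of an ASCII string, e.g. for a hex search field.
function toHexString(asciiText) {
  var hex = "";
  for (var i = 0; i < asciiText.length; i++) {
    var code = asciiText.charCodeAt(i);
    if (code > 0x7F) throw new Error("Not ASCII at index " + i);
    hex += (code < 16 ? "0" : "") + code.toString(16).toUpperCase();
  }
  return hex;
}

// Hypothetical check: the placeholder must not already occur in the data.
function isSafePlaceholder(data, placeholder) {
  return data.indexOf(placeholder) === -1;
}

var searchHex = toHexString("#!?"); // "23213F"
```

Running `isSafePlaceholder` on the source text before inserting placeholders shows whether a longer sequence is needed.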

                    You may also want to look at Paste hexadecimal.


                      Jan 01, 2009#10

                      OK, great, I've just found out that UltraEdit won't paste binary data at all, if it was assigned in the script.

                      Go to HEX mode, copy FF byte, and run this code in HEX mode:

                      Code:

                      UltraEdit.activeDocument.paste(); // works
                      
                      UltraEdit.clipboardContent = UltraEdit.clipboardContent;
                      UltraEdit.activeDocument.paste(); // works
                      
                      UltraEdit.clipboardContent = "\xFF";
                      UltraEdit.activeDocument.paste(); // pastes 0x79
                      
                      UltraEdit.clipboardContent = String.fromCharCode(0xFF);
                      UltraEdit.activeDocument.paste(); // pastes 0x79
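As far as the script engine is concerned, the last two assignments hand over the same one-character string, so the difference from the manually copied byte presumably arises after the assignment. A quick check in plain JavaScript (the variable names are only for illustration):

```javascript
var fromEscape = "\xFF";
var fromCharCode = String.fromCharCode(0xFF);

var sameString = (fromEscape === fromCharCode); // both are the single character U+00FF
var charCode = fromEscape.charCodeAt(0);        // 255
var charCount = fromEscape.length;              // 1
```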


                        Jan 02, 2009#11

                        Please report this program bug by email to IDM. Thanks.


                          Jan 02, 2009#12

                          999999 wrote:OK, great, I've just found out that UltraEdit won't paste binary data at all, if it was assigned in the script
                          ...
                          Strange. It works for me: 0xFF is pasted into the document (DOS) four times. Using UE 14.20.1.1001 (English edition) on WinXP 2002 SP3.