UltraCompare 6.00.1.1059 fails to detect differences in large binary files

7
NewbieNewbie
7

    Jan 05, 2011#1

    It appears that UltraCompare 6.00.1.1059 fails to detect differences in large binary files. Is this a known problem? See below for details.
    I have two .iso images of SUSE Linux downloaded 3 months apart using 2 different versions of Internet Download Manager (a different IDM ;-)).
    In "properties" they both claim to have the same number of bytes:

         ARay@Gromit /cygdrive/i/ray/Downloads $ ls -al ./linux/openSUSE-11.3-DVD-i586.iso
         -rwxrwxrwx 1 ARay None 4346398720 Oct 22 19:15 ./linux/openSUSE-11.3-DVD-i586.iso
         
         ARay@Gromit /cygdrive/i/ray/Downloads $ ls -al ./general/openSUSE-11.3-DVD-i586.iso
         -rwxrwxrwx 1 ARay None 4346398720 Jan  4 13:10 ./general/openSUSE-11.3-DVD-i586.iso
    However, they have different MD5 checksums (computed with the Microsoft tool on the DOS command line and confirmed with another tool):

           I:\Ray\Downloads\microsoft>fciv.exe "I:\Ray\Downloads\linux\openSUSE-11.3-DVD-i586.iso"
           //
           // File Checksum Integrity Verifier version 2.05.
           //
    md5>   56ee0b2ecfe7b6803e4bade99d4a5daf i:\ray\downloads\linux\opensuse-11.3-dvd-i586.iso
           
           I:\Ray\Downloads\microsoft>fciv.exe "I:\Ray\Downloads\general\openSUSE-11.3-DVD-i586.iso"
           //
           // File Checksum Integrity Verifier version 2.05.
           //
    md5>   adc4bc430e6d2ea9c0d52e47f0bbf50a i:\ray\downloads\general\opensuse-11.3-dvd-i586.iso
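
For readers without fciv.exe, the same check can be sketched in Python. This is a minimal sketch, not the tool used above: the function name `md5_of_file` and the 1 MiB chunk size are assumptions chosen for illustration. Reading in chunks keeps memory use constant even for a 4 GB image.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns b"" at end of file.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling `md5_of_file` on each of the two ISO paths and comparing the returned strings would reproduce the check; two different digests mean the contents differ even though the sizes match.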

    But when I use UltraCompare (UltraEdit version 14.20.0.1033, UltraCompare 6.00.1.1059) I get:

          First file name: I:\Ray\Downloads\linux\openSUSE-11.3-DVD-i586.iso
          Second file name: I:\Ray\Downloads\General\openSUSE-11.3-DVD-i586.iso
          
          Summary:
          ```````````````````
          The files are identical
    But when I use the GNU "cmp" command under Cygwin I get:

          ARay@Gromit /cygdrive/i/ray/Downloads $ cmp ./linux/openSUSE-11.3-DVD-i586.iso ./General/openSUSE-11.3-DVD-i586.iso
          ./linux/openSUSE-11.3-DVD-i586.iso ./General/openSUSE-11.3-DVD-i586.iso differ: char 291805821, line 1238971   
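
What cmp does here can be approximated with a short Python sketch. It is an illustration only, not how cmp or UltraCompare is implemented: `compare_files` is a hypothetical name, and the offset it returns is 0-based, whereas cmp reports 1-based positions. Python integers do not overflow, so offsets beyond 4 GB are handled without special care.

```python
def compare_files(path_a, path_b, chunk_size=1024 * 1024):
    """Return (first_diff_offset, diff_count) for two files.

    first_diff_offset is 0-based and is None when no byte differs.
    Only the common length of the two files is compared.
    """
    first_diff = None
    diff_count = 0
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            n = min(len(a), len(b))
            if n == 0:
                break
            # Cheap whole-chunk comparison first; scan bytes only on mismatch.
            if a[:n] != b[:n]:
                for i in range(n):
                    if a[i] != b[i]:
                        if first_diff is None:
                            first_diff = offset + i
                        diff_count += 1
            offset += n
    return first_diff, diff_count
```

The whole-chunk test means identical regions cost one fast comparison per chunk, which matters when only a small fraction of a multi-gigabyte file differs.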

    So my question: Is there anything I should know about UltraCompare (later edit: Pro, not Lite), such as file size limits?
    (It seems to finish too quickly for my liking.)

    6,603548
    Grand MasterGrand Master
    6,603548

      Jan 05, 2011#2

      UC Lite has limits on file size, as you can read on the page UltraCompare Lite vs. UltraCompare Pro in the line Large file support. ISO images are large files, so it is possible that the UC Lite installed with UltraEdit compares just the beginning of the ISO files, if at all. According to the UltraCompare help page UltraCompare Versions, the Lite version does not support binary file comparison, at least not the currently latest UC Lite version 6.40.1.1005. I'm not sure, but I think some older versions of UC Lite also support binary comparison of small files.

      BTW: Do you know the Windows console commands comp and fc, which can be used to run a quick file comparison in a console window? Run comp /? or fc /? in a console window to get help on the command line options of these 2 Windows commands.
      Best regards from an UC/UE/UES for Windows user from Austria

      7
      NewbieNewbie
      7

        Jan 07, 2011#3

        Mofi, thank you for your response. I am sorry that I failed to mention that the UltraCompare I am using is purchased and not Lite. It is licensed to me.
        I was aware of the Lite limitations. I did not know about the Windows compare tools. However, I am much more familiar with the Bash command window under Cygwin, with its superb history facility, and the GNU tools such as cmp, diff, less, grep etc. I find the DOS window very hard work.

        I did eventually use cmp to get the differences between the files (there are 351662 bytes different between the files), but I paid for a windowing program, UltraCompare, which I expected to give me a graphical indication of differences, not to have to go back to creating an intermediate difference file with cmp and using UltraEdit to explore it.

        I am somewhat unwilling to pay for an "upgrade" to Ultracompare without any guarantee that the fault has been fixed; that is why I asked if it was a known fault hoping to discover whether it had been fixed or not in the latest version of Ultracompare. If it has not been fixed, it is not a simple matter to give the developers two 4 gigabyte files to test it on!
        Thanks again for your response, it is good that someone takes an interest in such things...

        Regards, Ray (currently in Scotland).

        6,603548
        Grand MasterGrand Master
        6,603548

          Jan 07, 2011#4

          Okay. I don't really know if UltraCompare Professional or UC3 v6.00.1.1059 has a bug in comparing large binary files. I have never done this before. I have only 1 large binary file of 2.14 GB (a backup image of my system drive) on my computer's hard disk (and on another hard disk). I have now created a copy of the file on another drive and modified some bytes in the middle and near the end of this copy using UltraEdit. Then I closed UltraEdit and compared the large files using the release candidate of UltraCompare v8.00. UltraCompare found the 14 different bytes.

          I found the following in the list of hotfixes and UltraCompare histories published by IDM, which I collect in a text file on my computer.

          UC v7.10.0.1020 updated on 2010-05-10
          Fixed memory/crash issue with compare of extremely large (multi-GB) directories/subdirectories

          UC v6.40.0.1019 updated on 2009-09-24
          Fixed issue where extremely large file/folder compare causes hang

          So it looks like this issue is already fixed. I suggest waiting until v8.00 is released (next week), then downloading the new version and trying the compare with it during the 30-day trial period. If it works and you like the new features too, it may be worth paying for the upgrade. As a registered user of v6.00.1.1059 you might also have the option to update to the latest v6.xx without any additional payment. Please contact IDM support by email with your registration information to clarify that.

          7
          NewbieNewbie
          7

            Jan 07, 2011#5

            Thanks very much Mofi, it sounds like it will be worthwhile downloading. BUT if I do that and it still fails and I decide not to buy it, will I still have my old ultracompare (which remains useful for small text file comparisons) or will that get wiped out when I install the trial version?

            6,603548
            Grand MasterGrand Master
            6,603548

              Jan 07, 2011#6

              I don't use UC3, I'm using UltraCompare Professional. So I can't tell you the exact directories. I post here what must be done for testing a new version of UC Professional with comments what I would expect you have to do for testing a new version of UC3.
              1. Rename, or create a copy, or backup with ZIP or RAR the UltraCompare program directory. For UC Prof. this is usually the directory %ProgramFiles%\IDM Computer Solutions\UltraCompare, for your UC3 version the directory is probably on your USB stick.
              2. Rename, or create a copy, or backup with ZIP or RAR the UltraCompare application data directory containing the files with your customizations (toolbar, menu, keyboard mapping, perhaps INI file, etc.). For UC Prof. this directory is %appdata%\IDM Computer Solutions\UltraCompare. For your UC3 the files must be also on your USB stick. Search for uc3.ini or UCCmds32.cmf to find it.
              3. For UC Prof. the registry key HKEY_CURRENT_USER\Software\IDM Computer Solutions\UltraCompare Pro must be renamed or exported using Regedit. UC3 must work from a stick, so there is surely nothing stored in the registry of the current user. I suppose that UC3 saves all configuration settings by default in an INI file, probably uc3.ini on the stick, as UC Prof. users can do by disabling the configuration setting Use registry for settings (instead of INI). However, from my comparisons of setting changes after updating UltraCompare, I know that no configuration settings have been removed; only new ones have been added. I always compare the changes made after an upgrade to a new major version and log them for myself. This information is sometimes useful when helping other users here in the forum.
              4. Finally, you must back up the installer MSI or installer executable of your current version.
              Now it is possible to install the new version over the old one. When you decide to go back, you have to uninstall the new version, delete the remaining files and directories (see above), reinstall the previously registered version, and finally restore / rename the backups you made before installing the new version. Of course the program files directory should be identical.
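
The directory backups in steps 1 and 2 can be scripted. The following is a minimal sketch under assumptions: `backup_directory` is a hypothetical helper, and the caller supplies the actual paths (for example the program and application data directories named above). It does not touch the registry.

```python
import shutil
import time
from pathlib import Path


def backup_directory(source, backup_root):
    """Copy `source` into a timestamped folder under `backup_root`.

    Returns the path of the new copy. copytree raises an error if the
    target already exists, so an earlier backup is never overwritten.
    """
    source = Path(source)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = Path(backup_root) / f"{source.name}-{stamp}"
    shutil.copytree(source, target)
    return target
```

For step 3, the registry key can be exported with Regedit as described, or from a console with the Windows command `reg export` followed by the key name and an output file name.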

              7
              NewbieNewbie
              7

                Jan 08, 2011#7

                Thanks Mofi, I can do all that and will do so next week when V8 is available. I had not read that UC could run from a usb stick so mine is just solidly in my windows XP computer. I was worried in case there was some complex licensing system such as for Paintshop Pro which is really complex to reverse. Your instructions read very comprehensively, you have obviously put a lot of work into analysing Ultraedit and ultracompare. Thank you for that effort. I will report again at the end of next week after doing the tests.

                Regards Ray

                  Jan 14, 2011#8

                  I took the plunge and bought the latest version of Ultracompare (8.00.0.1010). There is an improvement! Ultracompare 8 now tells me:

                  First file name: I:\Ray\Downloads\linux\openSUSE-11.3-DVD-i586.iso
                  Second file name: I:\Ray\Downloads\General\openSUSE-11.3-DVD-i586.iso
                  350274 : 350274 Byte(s) diff   
                  51081150 Byte(s) match   
                  51431424 : 51431424 Byte(s) total   
                  So at least it does not say that there are no differences!
                  In fact (according to Windows) the two files are identical in size and each contains 4,346,398,720 bytes. Thus UC understates the file size by a factor of about 85. The total number of bytes displayed in both panes of the UC window corresponds to the figure above (the last displayed line is at 0310c7f0 in hex) and not to the file size. No bytes are shown coloured. When I tell UC to go to the first different byte, it goes to the last line of the display. I cannot tell whether any bytes are shown to be different.

                  BUT (and truly amazingly) the number of bytes which differ between the two files exactly corresponds to what cygwin cmp tells me is the number of bytes different! i.e. it would appear that UC is actually doing the comparison - it simply does not show the results.
                  So, I think at this point I have to give up and just say that UC either:
                  a) has a defect in comparing large files
                  or
                  b) is short of resources on my machine and has a defect in not telling me that it is unable to compare such large files.

                  I have run out of time here. UC is still useful to me to compare small text files, but I give up on large binaries. My currently favoured method is to use cmp under cygwin (or on Linux), output all the differences in a huge text file, then use Macros in Ultraedit to condense down the information. Takes some time, but I can do it.

                  Just to be 100% clear the version I report on above is UltraCompare Professional Version 8.00.0.1010

                  PS: I know why there are no differences shown in the panes. The first difference is at byte number 11649A7D and UC only manages to display up to byte number 0310C7F0 - oops!

                  6,603548
                  Grand MasterGrand Master
                  6,603548

                    Jan 15, 2011#9

                    I have two applications installed that are capable of comparing files and showing differences in text or binary mode: my file manager Total Commander (just simple compares, but very quick and easy to use) and UltraCompare (much more powerful with many more options, but slower).

                    When I compare two 2 GB binary files with only a difference at the end, Total Commander just shows a message that the process of displaying the file contents with differences has been aborted because of insufficient memory. However, Total Commander finishes the compare and reports in the same message that the files are different, without any further details.

                    When I compare both 2 GB binary files with UltraCompare, UC also fails to display the difference at the end of the huge files, probably also because it runs out of memory. Unfortunately, UC does not inform the user that displaying the entire files failed because of insufficient memory. UC at least shows the bytes from the beginning of the files up to the position where no more memory was available to load the rest. In the status bar of UC I see

                    7 : 7 Byte(s) diff    -2001674759 Byte(s) match   2293292544 : 2293292544 Byte(s) total
                    The same is displayed in the output window.
                    The difference count is correct, and so are the two total byte counts. The number of matching bytes is wrong because it is printed as a signed instead of an unsigned number. I will report this issue. All byte counters should be unsigned, and all counters should use 64-bit unsigned variables. That is also supported in 32-bit applications with a special data type class, and UC should make use of it for these counters. Looking at your output, I can see that currently only a 32-bit variable is used: because 2^32 + 51431424 = 4346398720, an overflow occurred.
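
The arithmetic behind both wrong numbers can be checked with a few lines of Python. This is only a worked example of 32-bit wraparound applied to the figures quoted in this thread; the helper names `as_uint32` and `as_int32` are made up for illustration and say nothing about UltraCompare's actual source code.

```python
MASK32 = 0xFFFFFFFF  # keeps only the low 32 bits, as a 32-bit counter would


def as_uint32(value):
    """Value as stored in an unsigned 32-bit variable."""
    return value & MASK32


def as_int32(value):
    """Value as stored in a signed 32-bit variable (two's complement)."""
    value &= MASK32
    return value - 2**32 if value >= 2**31 else value


# Ray's 4,346,398,720-byte ISO, truncated to 32 bits:
print(as_uint32(4346398720))     # 51431424, the total UC reported
# Matching bytes in the 2 GB test (total minus 7 differences), printed signed:
print(as_int32(2293292544 - 7))  # -2001674759, the status bar value
```

Both values reproduce the UC output exactly, which supports the overflow explanation for the size and the signed-print explanation for the negative match count.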

                    What both GUI applications miss when comparing such large binary files is what console compare applications with large-file support offer: the ability to output just a simple report with a list of file offsets, the number of different bytes at each of these offsets, and the hex values of the different bytes at these offsets, at least for binary compares with less than 10% different bytes. That is definitely worth suggesting to IDM.

                    Currently UltraCompare does not support any output for a binary compare. Such a feature should be added, especially for comparing large and huge binary files from the command line without showing the results in the GUI, printing them to an output file instead. Perhaps IDM has not yet done this because such compares are rare and therefore not often requested by users, and there are already console applications on every platform capable of creating such a report. It would already be helpful if UC wrote to the output window at which file offsets differences occur and how many bytes are different. Then it would be possible to open both files in UltraEdit and inspect the differences there, as you do. I will send IDM an appropriate enhancement suggestion, but not yet, because IDM is very busy at the moment. If you do that too, there will already be 2 users requesting a simple output for binary comparisons. The more users request it, the higher the priority for IDM to implement it.
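
A report of the kind described (file offset, number of different bytes, hex values) could look like the following Python sketch. It is an illustration of the idea, not a description of any existing UltraCompare feature; `diff_runs` and `format_report` are hypothetical names, and the column layout is an assumption.

```python
def diff_runs(path_a, path_b, chunk_size=1024 * 1024):
    """Yield (offset, bytes_a, bytes_b) for each run of consecutive differing bytes."""
    offset = 0
    run_start = None
    run_a = bytearray()
    run_b = bytearray()
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            n = min(len(a), len(b))
            if n == 0:
                break
            if a[:n] == b[:n]:
                # Whole chunk matches: close a run left open by the previous chunk.
                if run_start is not None:
                    yield run_start, bytes(run_a), bytes(run_b)
                    run_start = None
            else:
                for i in range(n):
                    if a[i] != b[i]:
                        if run_start is None:
                            run_start = offset + i
                            run_a.clear()
                            run_b.clear()
                        run_a.append(a[i])
                        run_b.append(b[i])
                    elif run_start is not None:
                        yield run_start, bytes(run_a), bytes(run_b)
                        run_start = None
            offset += n
    if run_start is not None:
        yield run_start, bytes(run_a), bytes(run_b)


def format_report(runs):
    """One line per run: hex offset, run length, bytes of file A, bytes of file B."""
    return "\n".join(
        f"{offset:010X}  {len(a):6d}  {a.hex()}  {b.hex()}" for offset, a, b in runs
    )
```

A run that spans a chunk boundary is reported as a single entry. For two totally dissimilar files a single run would cover the whole file and be held in memory, so a real tool would need a cap on run length or the 10% limit mentioned above.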

                    7
                    NewbieNewbie
                    7

                      Jan 15, 2011#10

                      Mofi, that was a very thoughtful and well researched reply, thank you. Your analysis of the overflow was impressive, to say the least.

                      I think that you are assuming that, even if they change from 32 bit to 64 bit variables, the application will still not be able to display all the files without running out of resources. You are almost certainly correct, but I suspect that the reason for the current lack of displayed characters is not lack of resources but the overflow that you identified. I.e. the program thinks that it only NEEDS to output 51431424 bytes (= 4346398720 - 2^32) and so only does that. Why else would the erroneous count just exactly equal the number of displayed characters?

                      That being said I think that your suspicions about the likelihood of running out of resources are well founded. IDM would do well to
                      1. fix the 32 to 64 bit problem
                      2. construct some tests with very large files to see just how far they can push UC until it fails
                      3. then make sure that there is a sensible error message
                        or
                      4. Publicize and check for a limit on the file sizes that UC is prepared to handle to avoid upsetting users
                      There is an alternative to simply replicating the textual output method of cmp or the Windows command line equivalent, though it is perhaps more complicated to implement. That is to show a map of differences, i.e. do not show every different byte but just some "block", with the block size depending on the capability to display them, perhaps a bit like disc optimisers. Clicking on a block could open up a "normal" comparison window just for that block. UltraCompare would have to retain a map of differences and of where to locate the blocks in the original files on disc in order to show them at reasonable speed. Thus instead of coloured bytes, there would be coloured blocks.

                      I am presuming here that it is possible to do reasonably efficient random access on a block in a file in Windows ; I am not a Windows programmer.
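
The block map idea, together with the random access it relies on, can be sketched in Python as follows. This is a minimal sketch under assumptions: `block_map` and `read_block` are hypothetical names, and the 1 MiB default block size is arbitrary. Seeking to an arbitrary offset in a regular file is a standard operation, so reading one block does not require reading everything before it.

```python
def block_map(path_a, path_b, block_size=1024 * 1024):
    """Return one boolean per block; True means the block differs."""
    flags = []
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(block_size)
            b = fb.read(block_size)
            if not a and not b:
                break
            flags.append(a != b)
    return flags


def read_block(path, index, block_size=1024 * 1024):
    """Read one block by seeking straight to its offset (random access)."""
    with open(path, "rb") as handle:
        handle.seek(index * block_size)
        return handle.read(block_size)
```

A viewer built on this would keep only the list of flags in memory (about 4,000 entries for a 4 GB file at 1 MiB per block) and call `read_block` on both files when the user clicks a coloured block.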

                      In any case, I agree with you, IDM should change UC to warn users of any display limitation. I did not spot immediately that there were not enough bytes and spent quite a while carefully scrolling through the display looking for differences.

                      Here is my proposal for my submission to IDM (simpler than your version of course):
                      There is a defect in UltraCompare (8.00.0.1010). If you compare large files (e.g. 4 gigabytes), the comparison correctly reports the number of bytes different. However, it does not report the correct size of the files, nor does it display all the bytes of the files. The defect is that there is NO WARNING to the user that this has happened. The very least that you should do is: a) test it with big files with a difference towards the end, b) if there are resource problems on the user's computer which mean that the differences cannot be shown, then warn the user. Don't just fail silently; it is very irritating to end users.

                      It could be that it is impractical for you to show byte by byte windows for huge files for resource reasons. For me though, I bought UC because of the graphic representation - it would be great if it worked. I can use the free cmp under cygwin for a textual difference output. Additionally, scrolling through differences for huge files is almost impossible with the current GUI. Sometimes the minimum mouse movement is huge in terms of bytes and you can spend a long time using page down. So I have a suggestion for an enhancement:

                      Show only "blocks" to be different i.e. replace the byte view with a block view (your choice of block size depending on resource limitations). Clicking on the block would open up another window in the same mode that you now have for files i.e. coloured byte by byte comparison. This assumes that random access to large files is practical for you to implement.

                      I realise that the latter is major work, so at least try to fix the reason for the bad display and improve the warnings.
                      What do you think of those words?
                      Regards, Ray

                      6,603548
                      Grand MasterGrand Master
                      6,603548

                        Jan 16, 2011#11

                        RayFoulkes wrote: "What do you think of those words?"
                        Very good, send that to IDM support by email.

                        Reading data from specific file offsets is no problem. It is not necessary to read file data sequentially. For text files it is necessary to do so to get the correct number of lines, but for binary files it is not. For binary compares of really large files it would definitely make sense to show only the blocks of differences. For a text compare it is already possible to show only differences, but the binary comparison feature lacks that. Just showing the different blocks, with let's say 128 bytes before the first difference within a block and 128 bytes after the last difference within a block, would be much better than showing the complete file contents. The display shows the file offsets, and one or two blank lines could be used as visual separation between the displayed blocks of differences. Showing only blocks of differences would make both visual inspection and navigation easier. Displaying matching binary data is really not needed.
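
The "128 bytes before and after" display suggested above amounts to padding each difference and merging ranges that touch or overlap. A minimal Python sketch, assuming the differences are already known as (offset, length) pairs; the function name `display_ranges` is hypothetical:

```python
def display_ranges(runs, file_size, context=128):
    """Merge (offset, length) difference runs into padded [start, end) ranges.

    Each run is widened by `context` bytes on both sides, clipped to the
    file, and merged with the previous range when the two touch or overlap.
    """
    merged = []
    for offset, length in sorted(runs):
        start = max(0, offset - context)
        end = min(file_size, offset + length + context)
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(r) for r in merged]
```

For example, `display_ranges([(1000, 2), (1100, 1), (5000, 4)], file_size=5050)` returns `[(872, 1229), (4872, 5050)]`: the first two differences are close enough to share one displayed block, and the last block is clipped at the end of the file.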

                        I ran yesterday's compare again with UC and this time watched the system performance tab of Task Manager. After UC finished, 1.6 GB of my 2 GB of physical RAM was still available. I have turned off virtual memory on the hard disk (pagefile.sys). So it looks like UC does not load the entire file contents into RAM for display. The hard disk activity while scrolling makes me think the same. UC loads the file contents in blocks depending on the current file offset, which I could finally verify with Process Monitor. So you are probably correct that the difference at the end is not displayed because of a wrong data type for some variables, and there is not really a memory problem.

                        7
                        NewbieNewbie
                        7

                          Jan 18, 2011#12

                          Mofi - ok, I sent the above (more or less :) ) to IDM support. Let's hope they read it.
                          Your analysis is great. It may mean that there is no real problem in showing byte by byte differences. Your suggestion of showing only differences would need some good indicators (red lines or something) at the places where runs of identical bytes are omitted, because scrolling through a very large file is really quite difficult and it is easy to miss gaps in the offsets.

                          Of course, if there WERE resource problems, it would not really solve that because somebody might compare two totally dissimilar files i.e. NO bytes being the same.

                          Anyway, thanks for all your help. I think between us we have solidly identified a UC problem - let's hope IDM listen to us and fix it. Bye for now.
                          Ray