Odd comparison results for short lines with very similar data

16
Basic User

Post #1, Nov 27, 2013

I have a recurring issue that has been driving me crazy for the longest time.

When I compare two simple lists of numbers, in addition to the obvious differences and matches, I regularly get "partial" hits that make it very difficult to resolve the comparison.

In the attached screenshot, the problems are the "564123" on the left and the "564093" on the right.

Suggestions?
UC_question.jpg (27.04KiB)

6,823625
Grand Master

Post #2, Nov 27, 2013

All text compare applications I know have problems comparing such short lines with similar data. A text compare is designed to find 100% identical blocks, completely different blocks, and similar blocks. With such short lines containing similar text, the comparison result is often not what the user expects, because the user expects each line to be either 100% identical or not identical at all.
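To make the idea of a "similar" line concrete, here is a minimal Python sketch based on the standard library's difflib.SequenceMatcher. It is not the algorithm UltraCompare uses (nothing in this thread documents that); the classify helper and the 0.6 threshold are assumptions chosen only for illustration.

```python
from difflib import SequenceMatcher


def classify(a: str, b: str, threshold: float = 0.6) -> str:
    """Label a pair of lines as identical, similar or different.

    The threshold is a made-up value for this sketch,
    not an UltraCompare setting.
    """
    if a == b:
        return "identical"
    # ratio() is 2 * matching characters / total characters of both lines
    ratio = SequenceMatcher(None, a, b).ratio()
    return "similar" if ratio >= threshold else "different"


print(classify("564075", "564075"))
print(classify("564123", "564093"))
print(classify("564123", "562467"))
```

With six-digit numbers that share a long prefix, only two characters need to differ for a pair to land in the "similar" bucket, which is the kind of partial hit described in the first post.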

The comparison result might be easier to interpret after turning off View - Relational Lines Mode.

But the best approach would be to do such comparisons using an UltraEdit script which outputs what is 100% identical, what exists only in file A, and what exists only in file B, ignoring all similarities.

The UltraEdit Scripts forum already contains such scripts, see for example:

As you can see, most users completely automate such simple comparisons, for which text compare applications are not designed, by using a script.
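The kind of output such a script produces (identical lines, lines only in file A, lines only in file B) can be sketched as follows. This is plain Python rather than an UltraEdit script, and compare_lists is a hypothetical helper, not one of the forum scripts. The sample data are the two lists quoted later in this thread.

```python
def compare_lists(left, right):
    """Split two lists of lines into common, left-only and right-only lines.

    Lines are compared as whole strings, so "564123" and "564093" are
    simply different; similarity is not taken into account.
    """
    left_set, right_set = set(left), set(right)
    common = [line for line in left if line in right_set]
    only_left = [line for line in left if line not in right_set]
    only_right = [line for line in right if line not in left_set]
    return common, only_left, only_right


left = ["564075", "564123", "564211", "564285", "564468", "564495"]
right = ["562467", "562539", "563119", "563354", "563376",
         "564075", "564093", "564285", "564495"]

common, only_left, only_right = compare_lists(left, right)
print("In both files: ", common)
print("Only in file A:", only_left)
print("Only in file B:", only_right)
```

Because the comparison works on whole lines, there is no "partial" category at all: every line ends up in exactly one of the three lists.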

The opposite would be a script like the one posted at "Check for similar lines and mark or output them for manual review", which explicitly finds lines that are very similar or completely identical.
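A rough idea of what a "find similar lines" pass could look like, again in Python and not taken from the forum script mentioned above. The similar_pairs function and its 0.6 threshold are assumptions for this sketch; difflib's ratio is only one of many possible similarity measures.

```python
from difflib import SequenceMatcher


def similar_pairs(left, right, threshold=0.6):
    """Return (left_line, right_line, ratio) for non-identical lines
    whose difflib similarity ratio reaches the assumed threshold."""
    pairs = []
    for a in left:
        for b in right:
            if a == b:
                continue  # identical lines are not interesting here
            ratio = SequenceMatcher(None, a, b).ratio()
            if ratio >= threshold:
                pairs.append((a, b, round(ratio, 2)))
    return pairs


# One line from the left list checked against a few candidates
for pair in similar_pairs(["564123"], ["564093", "564075", "abcdef"]):
    print(pair)
```

With lists of numbers that all begin with the same digits, a pass like this tends to flag many pairs, so the threshold would need tuning for real data.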

16
Basic User

Post #3, Nov 27, 2013

Hmm. I will try the scripts, though it seems like a workaround.

I guess I'm just confused why a text comparison would yield an "almost" or "similar" result instead of either "different" or "match". This now makes me question whether UltraCompare is returning accurate comparisons.

In any case, I'll try the scripts.

6,823625
Grand Master

Post #4, Nov 28, 2013

UltraCompare returns accurate results for those two lists of numbers, and the scripts are not workarounds. As I have already explained, text compare applications like UltraCompare are designed to find and display similar lines as well, not just to report lines as 100% identical or as not identical because at least one character is different. I'm quite sure that if you try other comparison tools designed for text compare, you will get the same result. You simply have the wrong expectations of what a text compare application is designed for, and therefore of what it returns when an application designed for block compares is used on data that has nothing in common with text files containing real text or source code.

481
Basic User

Post #5, Nov 28, 2013

It's a tricky problem for sure. I think Beyond Compare does a better job generally:

Compare.PNG (6.06KiB)

6,823625
Grand Master

Post #6, Nov 28, 2013

UltraCompare displays the same result as Beyond Compare when View - Relational Lines Mode is disabled, as I suggested in my first post.

481
Basic User

Post #7, Nov 29, 2013

Not quite (for me at least). It doesn't display the character differences on 564123, but maybe that's an option somewhere.

6,823625
Grand Master

Post #8, Nov 29, 2013

Okay, one more image, this time for the data

Code:

564075
564123
564211
564285
564468
564495

and

Code:

562467
562539
563119
563354
563376
564075
564093
564285
564495

made from the result displayed by UltraCompare Professional v8.50.0.1026 with Relational Lines Mode disabled.

uc_prof_no_relational_lines.png (1.7KiB)

The colors are not the standard colors of UltraCompare Professional after installation. I have modified some colors at Options - Configuration - Display - Text - Colors & Font. If the digits 12 in 564123 and 09 in 564093 are not displayed differently, the foreground color setting for Difference Text is most likely not well chosen.
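For anyone who wants to check which characters a character-level diff would mark in such a pair, here is a small Python sketch using difflib from the standard library. It only illustrates the principle; differing_spans is a hypothetical helper and says nothing about how UltraCompare computes its highlighting.

```python
from difflib import SequenceMatcher


def differing_spans(a: str, b: str):
    """Return (text_in_a, text_in_b) for every stretch where the lines differ."""
    matcher = SequenceMatcher(None, a, b)
    return [(a[i1:i2], b[j1:j2])
            for tag, i1, i2, j1, j2 in matcher.get_opcodes()
            if tag != "equal"]


print(differing_spans("564123", "564093"))  # [('12', '09')]
```

The leading 564 and the trailing 3 match, so only the two middle digits would be candidates for the difference color.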