Topic: Binary Comparison Software

  • The Shatmaster
  • Pip
  • Group: Premium Member
  • Joined: May 23, 2005
  • Posts: 168
Does anyone know of any software that compares binary files and graphically shows the differences in binary? I found a decent one called Hex Compare but it only shows the binary in Hexadecimal and ASCII and it confuses the shit out of me. It would be a lot easier if I could look at the differences in binary instead but I can't seem to find anything.

If that didn't make any sense, instead of viewing the files in hexadecimal:
03 0A 2F 3B

I need to see it in binary
00000011000010100010111100111011
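
For anyone who wants to check a conversion like the one above by hand, here is a minimal Python sketch. The function name `hex_to_bits` and its `group` parameter are made up for illustration and are not part of any existing tool.

```python
def hex_to_bits(hex_text: str, group: int = 0) -> str:
    """Return the bits of a hex string, optionally split into groups."""
    data = bytes.fromhex(hex_text)  # spaces between the hex bytes are ignored
    bits = "".join(f"{byte:08b}" for byte in data)
    if group <= 0:
        return bits
    return " ".join(bits[i:i + group] for i in range(0, len(bits), group))


print(hex_to_bits("03 0A 2F 3B"))           # 00000011000010100010111100111011
print(hex_to_bits("03 0A 2F 3B", group=8))  # 00000011 00001010 00101111 00111011
```

Grouping by 8 keeps the byte boundaries visible; leaving `group` at 0 gives one continuous bit string.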

  • Avatar of dom
  • Chapter Four: The Imagination And Where It Leads
  • PipPipPipPipPipPipPip
  • Group: Premium Member
  • Joined: Nov 9, 2003
  • Posts: 1022
you realise viewing it in binary is going to be even harder to understand

i strongly suggest you just learn how hexadecimal works and how to figure out the equivalent binary form, because it'll be much easier
Last Edit: December 29, 2007, 02:04:40 pm by dr. ron pual
  • Avatar of JohnnyCasil
  • Comrade!
  • PipPipPipPip
  • Group: Premium Member
  • Joined: Jan 5, 2005
  • Posts: 453
Yeah, I agree with pot noodle, unless you have some reason that you absolutely NEED it in binary, and I can't really think of one. Or, if you could explain why the hexadecimal display confuses you, we could help with that.
  • The Shatmaster
I already know how to convert between different bases. The issue is that it shows the data to me in chunks of 8 bits (two hexadecimal digits), but the data isn't necessarily stored so that every 8 bits is a single, separate piece of information. For instance, if there is a T/F bit squeezed between two Unicode characters, then when I look at the difference in hexadecimal it may show that the sequence of characters after that one true/false bit doesn't match the characters in another file if the T/F bits are different, when in fact the characters match and it's simply the one bit that doesn't. By showing me the difference in binary I'll be able to see exactly where the difference occurs, so if I notice one bit that is different and the 16 bits after it are the same, I can guess that those 16 bits might represent a character or a short integer.
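
A rough sketch of the kind of bit-level comparison described above, in Python. The helper names (`to_bits`, `differing_bits`, `marker_line`) and the three sample bytes are invented for illustration; a real tool would also need to handle files of different lengths, which this sketch simply truncates to the shorter one.

```python
def to_bits(data: bytes) -> str:
    """Render bytes as one continuous string of 0s and 1s."""
    return "".join(f"{byte:08b}" for byte in data)


def differing_bits(a: bytes, b: bytes) -> list[int]:
    """Return bit offsets (0 = first bit of the first byte) where a and b differ."""
    return [i for i, (x, y) in enumerate(zip(to_bits(a), to_bits(b))) if x != y]


def marker_line(a: bytes, b: bytes) -> str:
    """Build a line with '^' under every differing bit."""
    diffs = set(differing_bits(a, b))
    length = min(len(a), len(b)) * 8
    return "".join("^" if i in diffs else " " for i in range(length))


old = bytes.fromhex("41 AB 32")  # sample data, not from a real file
new = bytes.fromhex("41 2B 32")
print(to_bits(old))
print(to_bits(new))
print(marker_line(old, new))
print(differing_bits(old, new))  # [8]
```

In a hex view the middle byte changes from AB to 2B, which looks like a whole-byte change; the bit view shows that only the bit at offset 8 differs.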

Take this as an example. Let's say I have a protocol that does something like this: the first sequence of bits is a sequence of ASCII characters representing the date the data was created. The next single bit is a true/false flag indicating whether the data is encoded and needs to be decoded. The next sequence of characters, up until the escape character, is ASCII version information. The binary would look like this:

Data :
12/1/07 False Version 9.0.0

Binary:
00110001001100100010111100110001
001011110011000000110111
10101011
00110010101110010011100110110100
10110111101101110001000000011100
100101110001100000010111000110000


Hexadecimal (last byte padded out with zero bits):
31322F312F3037AB32B939B4B7B7101C9718171800

If I change that one bit flag, then watch what happens:

Binary:
00110001001100100010111100110001
001011110011000000110111
00101011
00110010101110010011100110110100
10110111101101110001000000011100
100101110001100000010111000110000

Hexadecimal (last byte padded out with zero bits):
31322F312F30372B32B939B4B7B7101C9718171800

It's not immediately apparent in the hexadecimal that only one bit is different: without looking at the binary or thinking hard about it, there's no way to tell that the entire byte hasn't changed. It becomes an even greater challenge when I try to align the two files if one file has an extra bit in there that the other doesn't. If I'm using a tool that views it in hexadecimal, I can't just shift everything one bit to the right; I have to shift it one entire byte to the right, which means that in this particular protocol I couldn't see the ASCII characters that represent the version if I shift it (it would look like garbage if I converted the hexadecimal above to characters: +2?'?????????).
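
To make the example above reproducible, here is a small Python sketch that packs the three fields into a bit stream and then reads the version text back from an arbitrary bit offset. The function names are hypothetical, and padding the last byte with zero bits is an assumption, since the post does not say how the stream ends.

```python
def ascii_bits(text: str) -> str:
    """Render ASCII text as a string of 0s and 1s, 8 bits per character."""
    return "".join(f"{byte:08b}" for byte in text.encode("ascii"))


def pack(date: str, flag: bool, version: str) -> bytes:
    """Pack date, one flag bit and version; pad the tail with zero bits (assumption)."""
    bits = ascii_bits(date) + ("1" if flag else "0") + ascii_bits(version)
    bits += "0" * (-len(bits) % 8)  # fill up to a whole number of bytes
    return int(bits, 2).to_bytes(len(bits) // 8, "big")


def read_ascii(data: bytes, bit_offset: int, count: int) -> str:
    """Read `count` 8-bit characters starting at any bit offset."""
    bits = "".join(f"{byte:08b}" for byte in data)
    chunk = bits[bit_offset:bit_offset + 8 * count]
    return "".join(chr(int(chunk[i:i + 8], 2)) for i in range(0, len(chunk), 8))


with_flag = pack("12/1/07", True, "Version 9.0.0")
without_flag = pack("12/1/07", False, "Version 9.0.0")
print(with_flag.hex().upper())
print(without_flag.hex().upper())
# The date takes 7 * 8 = 56 bits and the flag takes 1, so the version starts at bit 57.
print(read_ascii(with_flag, 57, 13))  # Version 9.0.0
```

Reading from bit 57 instead of a byte boundary is what a byte-oriented hex viewer cannot do, which is why the version shows up as garbage there.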
  • Avatar of JohnnyCasil
I don't see where your problem is. You cannot store an individual bit on its own; the smallest unit you can read or write is 1 byte. The storage space of a boolean value is compiler dependent, but it is usually the native size for the CPU (32-bit or 64-bit).
  • The Shatmaster
I thought you could control individual bits; I didn't know you had to do it a byte at a time. Thanks, Johnny. Now that I know that, my Hex Compare program will work just fine, except that it only displays characters as ASCII instead of Unicode. It's not that big of a deal considering that most of the characters I'm looking for are among the first 128, but it would be nice to use Unicode just in case there are special characters with accents and junk... I may end up writing a different program for parsing text out of the bytes.
Last Edit: December 30, 2007, 03:08:38 am by Fahrenheit Jr, Mr
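
If it helps, pulling text out of raw bytes is only a few lines in Python. This is a sketch, not a finished parser: `decode_text` is a made-up helper, and the encoding of the actual file is unknown, so UTF-8 and UTF-16 are shown only as common candidates.

```python
def decode_text(data: bytes, encoding: str = "utf-8") -> str:
    """Decode bytes as text, substituting U+FFFD for anything undecodable."""
    return data.decode(encoding, errors="replace")


word = "caf\u00e9"                # 'cafe' with an accented e
raw = word.encode("utf-8")        # b'caf\xc3\xa9', the accent takes two bytes
print(decode_text(raw, "ascii"))  # the two accent bytes are not valid ASCII
print(decode_text(raw, "utf-8"))  # the accented character comes back intact
print(decode_text(word.encode("utf-16-le"), "utf-16-le"))
```

An ASCII-only viewer shows the first three letters correctly and then two replacement characters, which matches the behaviour described above for characters outside the first 128.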
  • Avatar of JohnnyCasil
You can set multiple bits within a byte (i.e. bit flags), but like you said, you still have to store the whole byte. Also, I made an error in my previous post. I said that boolean data types are usually 32-bit. As far as MSVS and Java are concerned, they are both stored as an 8-bit value.
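
A quick illustration of bit flags sharing one byte, sketched in Python. The flag names and helper functions are hypothetical and chosen only for the example.

```python
FLAG_ENCODED = 0b0000_0001     # hypothetical flag names
FLAG_COMPRESSED = 0b0000_0010
FLAG_SIGNED = 0b0000_0100


def set_flag(value: int, flag: int) -> int:
    """Turn a flag on without touching the other bits."""
    return value | flag


def clear_flag(value: int, flag: int) -> int:
    """Turn a flag off; the mask keeps the result within one byte."""
    return value & ~flag & 0xFF


def has_flag(value: int, flag: int) -> bool:
    """Check whether a flag is on."""
    return (value & flag) != 0


state = set_flag(0, FLAG_ENCODED)
state = set_flag(state, FLAG_SIGNED)
print(f"{state:08b}")                    # 00000101
print(has_flag(state, FLAG_COMPRESSED))  # False
print(bytes([state]))                    # still written out as one whole byte
```

Up to eight independent true/false values fit in that single byte, but the byte is still the unit that gets read and written.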
  • The Shatmaster
That makes much more sense to me - why would you waste 32 bits on something that effectively only needs one?
  • Avatar of JohnnyCasil
Alignment reasons. It is easier for a 32-bit CPU to process only 32-bit values. This article does a pretty good job explaining it if you are interested in the reading. I just wasn't thinking when I said that it was 32-bit: I was thinking of alignment, but forgetting that for an 8-bit value alignment doesn't really matter as far as read/write access goes.
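
One way to see the padding that alignment adds is Python's `struct` module, which can report the size of a layout with and without native alignment. This is only a sketch: the native figure depends on the platform and compiler, so no specific value is claimed for it here.

```python
import struct

# "b" is a 1-byte integer, "i" is a 4-byte integer, "?" is a boolean.
packed = struct.calcsize("=bi")    # standard sizes, no padding: 1 + 4 = 5
native = struct.calcsize("@bi")    # native alignment; may insert padding after "b"
bool_size = struct.calcsize("=?")  # a boolean in standard size is 1 byte

print(packed, native, bool_size)
```

On many common platforms the native size comes out larger than the packed one, because padding bytes are inserted so the 4-byte integer starts on a 4-byte boundary.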