Question: Write a program to compute a message digest for a file of any type and any size.
A "Message Digest" is a simple concept (a "Message Authentication Code" is the closely related variant that also mixes in a secret key). You take your message and pass it through an algorithm that outputs a short sequence of characters. This sequence of characters is a "fingerprint" for the message. With a good algorithm, any change in the message results in a different fingerprint.
There is no practical way to obtain the original message from its fingerprint, and it is almost impossible to find two messages that yield the same fingerprint (just like trying to find two people who have the same fingerprints).
Message digests are a quick way to check whether a message has been altered. If you compare a digest of the original message with a digest of the message you just received and they match, then you know that the message is unaltered.
For us, a message digest is 8 hexadecimal digits, 0-9, a-f. The file is read from standard input using file redirection. Use the old DOS command window. The program works like this:
C:\> .\a.exe < jeeves.txt
MD value is:
31965eca
Message digests are used to ensure the authenticity of files. Say that I give you a file jeeves.txt but I have secretly altered it. If you compute the message digest and it is different from the original, you know the file has been altered. (Of course, you must reliably know the original digest. But since digests are short, they can be put on a web page where anyone can see them.)
Message digest functions are public knowledge. Everyone knows the algorithm. What makes a digest secure is that it is very hard to change a document without changing the digest. If you are emailed a contract, you can't alter an amount from $10,000 to $100,000 because this would totally alter the digest and it would be obvious there was a change.
Actual digests (such as MD5) are complex and secure. Ours is much less secure. To compute our MD digest of a file: Declare four int variables, s1, s2, s3, s4, each initialized to zero. Declare four int multipliers, m1, m2, m3, m4, initialized to m1 = 3, m2 = 7, m3 = 13, m4 = 23.
For each byte B of the file do the following in sequence:
s1 = (s1 + B*m1) % 256
s2 = (s1+s2 + B*m2) % 256
s3 = (s1+s2+s3 + B*m3) % 256
s4 = (s1+s2+s3+s4 + B*m4) % 256
Each of s1, s2, s3, and s4 gets a new value every time a byte is read from the file. The new values are based on the byte and the current values in the variables. After the last byte has been processed, write out the four sums as eight characters using two hex digits for each:
printf("%02x%02x%02x%02x ", s1, s2, s3, s4 );
A complication: You want to be sure that you read each byte of the input file without alteration. Message digests work with any type of file, including so-called "binary files." Unfortunately, the OS changes some bytes when reading in "text mode". To ensure that each call to getchar() gets one byte of the file exactly as it is in the file, include these header files:
#include <fcntl.h>
#include <io.h>
and at the start of the program set the input mode to "binary" by calling this function:
_setmode( _fileno(stdin), _O_BINARY );
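The above works with my gcc on Windows. _setmode is specific to Windows C runtimes, so it might not compile in other environments; Unix-like systems make no distinction between text and binary mode, and the call can simply be left out there.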
The usual message digest algorithm used throughout the world is MD5. Our message digest function is called just MD because it is far simpler than MD5.
Document your program nicely, use sensible layout and indenting, use sensible variable names, and test thoroughly. You will probably have a single main() function of not many more lines than the pseudo-code above. Turn in your program using the Blackboard assignments tool. Include documentation at the top that identifies the program and its author.
