Question: 5. Repeat the previous question, but now use MD5.


From Wiki: Tic-tac-toe (also known as noughts and crosses or Xs and Os) is a paper-and-pencil game for two players, X and O, who take turns marking the spaces in a 3x3 grid. The player who succeeds in placing three of their marks in a horizontal, vertical, or diagonal row wins the game.

4. Suggest an algorithm that checks whether your 10TB hard drive contains two identical files. You are not allowed to use values provided by the File System/Operating System (such as MD5). The number n of files is about 10^10. Note that 1GB is roughly 10^9 bytes. If there are any identical pairs of files, your algorithm should print the names of one such pair and stop. Suggest a solution that is efficient both in terms of CPU time and in terms of the number of disk access operations (I/O). Your algorithm should be practical for your current desktop or PC.

There are several heuristics that could be efficient in certain scenarios but could also fail miserably. Using the size of the file fails if most of the files are uncompressed images from your camera, since such images all have the same size. Summing up the ASCII values is slightly better, but is iffy for images because the sum depends mostly on the background pixels. You do not have access to the creation date, and even if you did, it would not help if you are looking for copyright violations. In short, try to do better.

5. Repeat the previous question, but now use MD5. MD5 is a hash function that receives a binary file and returns a 128-bit hash key of this file. It is implemented very efficiently in the OS of your computer. For example, in Linux, run md5sum filename (on macOS and BSD the command is md5 filename). The result is usually printed as a string of 32 hex digits.
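One possible approach to questions 4 and 5, offered as a minimal sketch rather than the intended solution: filter candidates in stages of increasing cost (file size, a hash of a short prefix, a hash of the whole file) and confirm any match byte by byte. Everything named here is an assumption made for illustration: the helpers `read_chunks`, `poly_hash`, `md5_hash`, `same_bytes`, and `find_duplicate`, the 4096-byte prefix, and the hash base and modulus. The sketch also assumes that reading the file size is permitted in question 4 (if not, drop the first stage), and it uses Python's `hashlib.md5` as a stand-in for the OS-provided MD5 of question 5. With about 10^10 files the in-memory tables below would not fit on a desktop, so at that scale they would have to be replaced by on-disk sorting or bucketing.

```python
import hashlib
import os
from collections import defaultdict
from typing import Callable, Iterable, Iterator, Optional, Tuple

CHUNK = 1 << 20   # read 1 MiB at a time so memory use stays bounded
PREFIX = 4096     # assumed prefix length for the cheap second filter


def read_chunks(path: str, limit: Optional[int] = None) -> Iterator[bytes]:
    """Yield the file's bytes in pieces, stopping after `limit` bytes if given."""
    remaining = limit
    with open(path, "rb") as f:
        while remaining is None or remaining > 0:
            chunk = f.read(CHUNK if remaining is None else min(CHUNK, remaining))
            if not chunk:
                return
            if remaining is not None:
                remaining -= len(chunk)
            yield chunk


def poly_hash(path: str, limit: Optional[int] = None) -> int:
    """Question 4 variant: hand-rolled polynomial hash of the file contents."""
    h, base, mod = 0, 257, (1 << 61) - 1
    for chunk in read_chunks(path, limit):
        for byte in chunk:  # simple, but slow in pure Python
            h = (h * base + byte + 1) % mod
    return h


def md5_hash(path: str, limit: Optional[int] = None) -> str:
    """Question 5 variant: MD5 digest as a string of 32 hex digits."""
    h = hashlib.md5()
    for chunk in read_chunks(path, limit):
        h.update(chunk)
    return h.hexdigest()


def same_bytes(a: str, b: str) -> bool:
    """Final check: compare two files byte by byte."""
    with open(a, "rb") as fa, open(b, "rb") as fb:
        while True:
            ca, cb = fa.read(CHUNK), fb.read(CHUNK)
            if ca != cb:
                return False
            if not ca:  # both files ended at the same point
                return True


def find_duplicate(paths: Iterable[str],
                   file_hash: Callable[..., object] = poly_hash
                   ) -> Optional[Tuple[str, str]]:
    """Return the names of one pair of identical files, or None."""
    by_size = defaultdict(list)
    for p in paths:  # stage 1: metadata only, no file data is read
        by_size[os.path.getsize(p)].append(p)
    for group in by_size.values():
        if len(group) < 2:
            continue
        by_prefix = defaultdict(list)
        for p in group:  # stage 2: hash only the first PREFIX bytes
            by_prefix[file_hash(p, PREFIX)].append(p)
        for sub in by_prefix.values():
            if len(sub) < 2:
                continue
            by_full = defaultdict(list)
            for p in sub:  # stage 3: hash the whole file
                key = file_hash(p)
                for q in by_full[key]:
                    if same_bytes(q, p):  # rule out hash collisions
                        return q, p
                by_full[key].append(p)
    return None
```

Calling `find_duplicate(paths)` uses the hand-rolled hash (question 4), and `find_duplicate(paths, md5_hash)` uses MD5 (question 5). Only files that share a size and a prefix hash are ever read in full, which keeps I/O low unless many files agree on both. If MD5 behaved like a random function on non-adversarial files, the chance of any accidental collision among n = 10^10 files would be about n^2 / 2^129, roughly 1.5 x 10^-19; the byte-by-byte check is kept anyway because MD5 collisions can be constructed deliberately, and because the 61-bit hand-rolled hash is too short to trust on its own at this scale.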
