In most cases we use hashing to verify integrity. However, there are situations in which we are less concerned with whether two files are identical and more concerned with how similar they are.
Let's dig deeper. For the purpose of this post I will make a copy of my "/var/log/user.log" file and name it "hashing_lab.txt". You may ask why I chose this file. Basically, I need a file of at least 4K to complete this scenario. This file is larger than that, so it's good enough.
Let's verify my file size:
root@securitynik:~# ls -al hashing_lab.txt
From the above we see the file "hashing_lab.txt" has a size of 9612 bytes. This is good enough for us.
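If you would rather script the size check than read it off the "ls" output, here is a minimal sketch. It does not touch the real lab file: the temporary file, the variable names and the 4096-byte threshold (my reading of "at least 4K") are all assumptions made for illustration.

```shell
# Hypothetical stand-in for hashing_lab.txt: a 9612-byte temporary file
lab_file=$(mktemp)
head -c 9612 /dev/zero > "$lab_file"

# wc -c counts bytes; the arithmetic expansion strips any padding in its output
size=$(( $(wc -c < "$lab_file") ))

# Assumed threshold: 4K taken to mean 4096 bytes
if [ "$size" -ge 4096 ]; then
    verdict="large enough"
else
    verdict="too small"
fi
echo "$size bytes: $verdict"

rm -f "$lab_file"
```

To run it against the real file, point "lab_file" at "hashing_lab.txt" and drop the "mktemp", "head" and "rm" lines.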
Let's move on.
As mentioned previously, hashing is typically used for verifying integrity. So let's take an MD5 hash of our file "hashing_lab.txt":
root@securitynik:~# md5sum hashing_lab.txt
This returned the following
Now let's take a copy of this file:
root@securitynik:~# cp hashing_lab.txt hashing_lab.txt.copy
From this copied file, let's grab the hash
root@securitynik:~# md5sum hashing_lab.txt.copy
From the above we get the result
Let's put both files together for clarity.
From this perspective we can see the good of hashing. We were able to verify that the integrity of the file is intact: because the two MD5 hashes match, "hashing_lab.txt.copy" is an exact copy of "hashing_lab.txt".
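The comparison can also be done by the shell instead of by eye. The following is only a sketch under assumptions: it builds its own throwaway files in a temporary directory (random data standing in for the log file), and the variable names are mine, not part of the lab.

```shell
# Hypothetical stand-ins for the lab files, created in a scratch directory
workdir=$(mktemp -d)
head -c 9612 /dev/urandom > "$workdir/hashing_lab.txt"
cp "$workdir/hashing_lab.txt" "$workdir/hashing_lab.txt.copy"

# md5sum prints "<hash>  <filename>"; keep only the hash field
orig_hash=$(md5sum "$workdir/hashing_lab.txt" | awk '{print $1}')
copy_hash=$(md5sum "$workdir/hashing_lab.txt.copy" | awk '{print $1}')

# Identical content gives identical hashes
if [ "$orig_hash" = "$copy_hash" ]; then
    result="match"
else
    result="differ"
fi
echo "original: $orig_hash"
echo "copy:     $copy_hash"
echo "result:   $result"

rm -rf "$workdir"
```

Against the real files, "md5sum hashing_lab.txt hashing_lab.txt.copy" prints both hashes one under the other, which is an easy way to put them together.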
See you in the next post for the bad.