Danni-Tech
Making the Complicated Simple
Key Points from This Video on Hashing:
What is Hashing?
Hashing is the process of running data through a mathematical algorithm to produce a fixed-size numeric value that represents the original data. This value is often referred to as a fingerprint, digest, or simply a hash.
Simple Example of Hashing:
- The word “friend” is hashed by replacing each letter with its alphabetical position (f=6, r=18, i=9, e=5, n=14, d=4) and adding the positions together, giving a hash value of 56.
- Even a small change (e.g., removing a letter to form “fried”) produces a different hash value, 42, demonstrating how hashes detect data changes.
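The letter-position hash from this example can be sketched in a few lines of Python. This is only an illustration: the function name simple_hash and the decision to skip non-letter characters are assumptions, not details taken from the video.

```python
def simple_hash(word: str) -> int:
    """Toy hash: add up each letter's position in the alphabet (a=1 ... z=26).

    Hypothetical helper for illustration; non-letter characters are skipped.
    """
    total = 0
    for ch in word.lower():
        if "a" <= ch <= "z":
            total += ord(ch) - ord("a") + 1
    return total


print(simple_hash("friend"))  # 6 + 18 + 9 + 5 + 14 + 4 = 56
print(simple_hash("fried"))   # 6 + 18 + 9 + 5 + 4 = 42
```

Because addition ignores letter order, an anagram such as “finder” also sums to 56, so a toy like this is useful for building intuition and nothing more.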
Purpose of Hashing:
Hashing is commonly used to verify data integrity by detecting changes in the original data.
Collisions in Hashing:
- A collision occurs when two different inputs produce the same hash value.
- Collisions are unavoidable because hash functions generate a fixed-size digest, which limits the number of possible outputs.
Example of a Collision:
- Using a simple 2-bit hashing algorithm (SimpleHash2), there are only 4 possible outputs: 00, 01, 10, 11.
- Once more than 4 distinct messages are hashed, at least two of them must share the same hash, causing a collision.
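The counting argument can be checked with a small sketch. The definition of SimpleHash2 is not given here, so the simple_hash2 function below is a hypothetical stand-in (sum of the message bytes, modulo 4), and the sample messages are arbitrary.

```python
def simple_hash2(message: str) -> str:
    """Hypothetical 2-bit hash: sum of the UTF-8 bytes, reduced modulo 4."""
    return format(sum(message.encode("utf-8")) % 4, "02b")


def find_collision(messages):
    """Return (first, second, digest) for the first pair sharing a digest."""
    seen = {}
    for message in messages:
        digest = simple_hash2(message)
        if digest in seen:
            return seen[digest], message, digest
        seen[digest] = message
    return None


# Five distinct messages but only four possible digests,
# so at least two messages must land on the same digest.
messages = ["alpha", "bravo", "charlie", "delta", "echo"]
collision = find_collision(messages)
print(collision)
```

Any 2-bit function would behave the same way here: with five inputs and four outputs, find_collision cannot return None.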
Common Hashing Algorithms:
- MD5 – 128-bit digest
- SHA-1 – 160-bit digest
- SHA-224 – 224-bit digest
- SHA-384 – 384-bit digest
- SHA-512 – 512-bit digest
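The digest sizes listed above can be confirmed with Python's standard hashlib module. This is a quick sketch; the helper name digest_bits is made up for the example.

```python
import hashlib

# Algorithm names as hashlib spells them.
ALGORITHMS = ["md5", "sha1", "sha224", "sha384", "sha512"]


def digest_bits(name: str) -> int:
    """Return the digest length in bits for a hashlib algorithm name."""
    # digest_size is reported in bytes, so multiply by 8.
    return hashlib.new(name).digest_size * 8


sizes = {name: digest_bits(name) for name in ALGORITHMS}
for name, bits in sizes.items():
    print(f"{name}: {bits}-bit digest")
```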
Demonstration Using Linux (Ubuntu via WSL):
- The echo command is used to send data to standard output: echo "friend"
- Data can be hashed by piping it into a utility like sha224sum: echo "friend" | sha224sum
- By default, echo appends a newline character, which becomes part of the hashed data. To hash only the word itself, use the -n flag: echo -n "friend" | sha224sum
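The effect of the trailing newline can also be reproduced outside the shell. The sketch below uses Python's hashlib and assumes that echo emits the word followed by a single newline byte, as it does by default in bash.

```python
import hashlib

# Equivalent to: echo "friend" | sha224sum   (echo appends "\n")
with_newline = hashlib.sha224(b"friend\n").hexdigest()

# Equivalent to: echo -n "friend" | sha224sum   (no trailing newline)
without_newline = hashlib.sha224(b"friend").hexdigest()

print(with_newline)
print(without_newline)
print(with_newline == without_newline)  # False: one extra byte changes the digest
```

This is why the two commands in the demo report different hashes for what looks like the same word.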
Key Observations from the Demo:
- Hashes are irreversible – You cannot determine the original message from the hash.
- Small changes produce drastically different hashes – Even a tiny modification alters the entire digest.
- Fixed-length output – Regardless of the input size, the hash output is always a fixed length.
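The last two observations can be checked with a short sketch. The helper names sha224_hex and differing_bits are made up for this example, and the input sizes are arbitrary.

```python
import hashlib


def sha224_hex(data: bytes) -> str:
    """Return the SHA-224 digest of data as a hex string."""
    return hashlib.sha224(data).hexdigest()


def differing_bits(hex_a: str, hex_b: str) -> int:
    """Count the bit positions at which two hex digests differ."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


# Small change in the input, very different digest.
digest_friend = sha224_hex(b"friend")
digest_fried = sha224_hex(b"fried")
changed = differing_bits(digest_friend, digest_fried)
print(f"{changed} of 224 bits differ")

# Fixed-length output: 224 bits is always 56 hex characters.
lengths = [len(sha224_hex(b"x" * n)) for n in (0, 6, 1_000_000)]
print(lengths)  # [56, 56, 56]
```

Irreversibility is not something a snippet can demonstrate directly; the code only shows that the digest stays the same size no matter how much data went in.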