# Birthday paradox and collision resistance

## Hash function

Hash functions are compression functions that compress a message of any length to a message of fixed length (e.g. 128 or 164 bits) - the hash value. One-way hash functions (one-way function) are important in the context of the digital signature application. Hash functions are public, the process is not secret. The security of the one-way hash function lies in its one-way nature. The result is irreversibly dependent on the input. Examples are the outdated MD-4 and MD-5 (Message Digest Algorithm) and SHA (Secure Hash Algorithm - e.g. the also outdated SHA-1, SHA-2 - with the variant SHA-256 - and the SHA based on a different functional principle -3).

### Security features

Hans-Joachim Knobloch (Secorvo Security Consulting, Karlsruhe) describes the security properties of a crypto hash function in the article "Sponge over it - How SHA-3 (Keccak) works" in kes - Die Zeitschrift für Informationssicherheit "1/2013, p. 10 ff as follows:

A hash function h maps a message m of any length (within wide limits) to its hash value x = h (m) of fixed length. In order to be suitable for cryptographic purposes, h must have the following security properties that build on each other:

### One-way feature

In practice, it should not be possible to use a given hash value x0 a message m0 to find whose hash value x0 is, that is for the h (m0) = x0 applies.

"Practically not possible" is usually understood to mean that with a realistically acceptable effort there is only a negligible probability of success - if you had any amount of time or computing power, you could find a suitable message at some point by simply trying it out

This one-way property alone is sufficient for various applications of crypto hash functions, for example for the one-way encryption of passwords.

### Weak collision resistance

It should not be possible in practice to m0 a second message m1 to find for the h (m0) = h (m1) applies. This security property requires more than the one-way property alone, because if the latter were violated, collisions could be found simply by looking at the hash value x0= h (m0) calculates back the given message and would have broken the weak collision resistance at the same time.

### Strong collision resistance

It should not be possible in practice to send two messages to m1 and m2 to find for the h (m1) = h (m2) applies. At first glance, the strong collision resistance doesn't seem to be that different from the weak one. And of course, if the weak collision resistance were breached, you could simply send a message to m2 choose as given and a conflicting message m1 determine.

However, there are also ways to find conflicting messages without one of them being fixed in the first place. One of them is the application of the so-called "birthday paradox". Its name comes from the fact that at first glance it seems astonishing that in a school class with 23 children there is more than 50 percent probability that two of them will celebrate their birthday on the same day of the year.

The apparent paradox is explained by the fact that it is not specified which day of the year should be the common birthday. Using the same principle, a collision can be found in hash functions by keeping messages m1, m2, m3, ... hashes until an mi finds whose hash value is identical to that of any of the preceding i–1 news is. In the case of a hash function whose outputs are n bits long, the probability of finding such a hit is from around 2n / 2-Try not to neglect anymore. Therefore, the output length of a hash function should be chosen twice as long as the key length for symmetric encryption at a comparable security level. In this sense, for example, SHA-256 "fits" AES-128 and SHA-512 to AES-256.

The strong collision resistance is mainly required for signature applications. Otherwise, an attacker could sign a seemingly harmless message for the signer and thereby unknowingly have him sign their “bad twin” at the same time. This is what happened in practical attacks in which trust centers were submitted requests for a “harmless” user certificate, the signature of which was then also valid for a security-critical certificate - for example for an unauthorized sub-certification authority - that the applicant would never normally have received.