Monday, November 25, 2013

Code 59 : Homophonic Cipher

Homophonic Cipher

Background

Homophonic ciphers first appeared in correspondence dating from the year 1401. The cryptologists of that time must therefore already have been aware of how insecure simple monoalphabetic substitutions are. [1] If a plaintext character is always mapped to the same ciphertext character, the cipher is easy to attack with frequency analysis, because each character occurs with a characteristic frequency in a given language.

Principle

There is a simple solution to this problem, and it is surprising that it was not discovered even earlier than it was, approximately 600 years ago. A character is no longer mapped to a single other character, but to one chosen arbitrarily from a set of characters. As already mentioned in the chapter “Frequency Analysis”, the 'E' occurs with a frequency of 17% in the German language and the 'N' with a frequency of 10%. This suggests assigning each plaintext character a number of ciphertext characters that depends on its frequency. If the ciphertext alphabet were composed of the numbers 0-99, we would get a substitution table like this:

[Image: homophonic substitution table (ersetzungstabelle-homophone)]
Fig. 1: Substitution table for a homophonic cipher. [2]
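The mapping of numbers to characters serves as the key in this cipher. The recipient of the encoded message needs exactly this table to decode it.
According to figure 1, "GEHEIMNIS" could be encoded as "943221419383641199", which reads in two-digit groups as 94 32 21 41 93 83 64 11 99. Note that the two E's (32 and 41) and the two I's (93 and 11) are encoded by different numbers.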

Security

If the numbers are assigned to the plaintext characters randomly and in proportion to their frequency, then each individual number occurs in the ciphertext with a frequency of roughly 1%: the 17% share of 'E', for example, is spread over about 17 different numbers. A frequency analysis of single characters is therefore not enough to break the cipher. Nevertheless, the cipher is not as secure as it might seem at first. One reason is the relationship between neighbouring characters in a language. In German, for example, a Q is always followed by a U. Q is also extremely rare, so it can be assumed that it is encoded by only one number. By analogy, we can guess from its frequency that U is encoded by 4 different numbers. So if we find a number that is only ever followed by one of the same 4 numbers, we have a good guess for the numbers that correspond to Q and U. [3]
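
The Q/U observation can be turned into a small search over the ciphertext. The following is a rough sketch under stated assumptions: the ciphertext is already split into a list of numbers, the function names and the thresholds max_followers and min_count are chosen for illustration, and the sample sequence is invented (7 plays the role of Q, 50-53 the role of U).

```python
from collections import defaultdict


def successor_sets(numbers):
    """Map each ciphertext number to the set of numbers that directly follow it."""
    followers = defaultdict(set)
    for current, following in zip(numbers, numbers[1:]):
        followers[current].add(following)
    return dict(followers)


def q_candidates(numbers, max_followers=4, min_count=3):
    """Numbers seen at least min_count times but followed by few distinct numbers."""
    followers = successor_sets(numbers)
    return sorted(
        n for n, nxt in followers.items()
        if numbers.count(n) >= min_count and len(nxt) <= max_followers
    )


# Invented toy ciphertext, not taken from a real message.
sample = [7, 50, 12, 30, 7, 51, 12, 31, 7, 52, 12, 32,
          7, 53, 12, 33, 7, 50, 12, 34]

print(q_candidates(sample))        # [7]
print(successor_sets(sample)[7])   # the four numbers that follow 7
```

On a real ciphertext the thresholds would need tuning, and a candidate found this way is only a hypothesis to be checked against the rest of the text.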

Details

Even though one plaintext character can be mapped to several different ciphertext characters in this cipher, it still belongs to the class of monoalphabetic substitutions. One criterion for a polyalphabetic substitution is that two different plaintext characters can be mapped to the same ciphertext character. That is not the case here: every ciphertext character stands for exactly one plaintext character, and the ciphertext alphabet is simply larger than the plaintext alphabet.

References

1 Kippenhahn, Rudolf: "Verschlüsselte Botschaften", Nikol, 2006, p. 127
2 Kryptographiespielplatz, 2009-02-13
3 Singh, Simon: "Geheime Botschaften", Carl Hanser Verlag, 1999, p. 75
