Code 10 : Autokey Cipher ~ FACT, INFORMATION, TRUTH

Monday, October 14, 2013


9:40 PM

To encrypt a plaintext message using the Vigenère Cipher, one locates the row with the first letter to be encrypted, and the column with the first letter of the keyword. The ciphertext letter is located at the intersection of the row and column. This continues for the entire length of the message.
Running key code chart

An Autokey cipher is identical to the Vigenère cipher, except that instead of building the key by repeating one word over and over, the key is constructed by placing the keyword in front of the actual plaintext message.
For example, if your plaintext message was:

This is a secret message

And your keyword was "zebra", then the key actually used for encryption would be:

zebrathisisasecretmessage
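The difference between the two keys can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and it assumes that spaces and punctuation are dropped and that only as many key letters are kept as there are message letters (so the last few letters of the full key above are never used).

```python
import string

def letters_only(text: str) -> str:
    """Lower-case the text and drop everything except a-z."""
    return "".join(ch for ch in text.lower() if ch in string.ascii_lowercase)

def vigenere_keystream(keyword: str, plaintext: str) -> str:
    """Vigenère: repeat the keyword until it covers the message."""
    n = len(letters_only(plaintext))
    repeats = n // len(keyword) + 1
    return (keyword.lower() * repeats)[:n]

def autokey_keystream(keyword: str, plaintext: str) -> str:
    """Autokey: the keyword followed by the message itself."""
    message = letters_only(plaintext)
    return (keyword.lower() + message)[:len(message)]

print(vigenere_keystream("zebra", "This is a secret message"))
# zebrazebrazebrazebra
print(autokey_keystream("zebra", "This is a secret message"))
# zebrathisisasecretme
```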

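Enciphering the message is then performed using exactly the same method as the Vigenère Cipher. Deciphering uses the same table as well, but the key has to be rebuilt letter by letter as the plaintext is recovered.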

Autokey Cipher

In general, the term autokey refers to any cipher where the key is based on the original plaintext, and the Autokey Cipher is one such example. In its simplest form, it was first described by Girolamo Cardano, and consisted of using the plaintext itself as the keystream. However, since there was no key involved in this system, it suffered the same major flaw as the Atbash and the Trithemius Ciphers: if you knew it had been used, it was trivial to decode.

The most famous version of the Autokey Cipher, however, was described by Blaise de Vigenère in 1586 (the same man to whom the simpler Vigenère Cipher was later misattributed). This cipher incorporates a keyword in the creation of the keystream, as well as the original plaintext.

Encryption
Encryption using the Autokey Cipher is very similar to the Vigenère Cipher, except in the creation of the keystream.

The keystream is made by starting with the keyword or keyphrase, and then appending the plaintext itself.

We then use a Tabula Recta to find the keystream letter across the top, and the plaintext letter down the left, and use the crossover letter as the ciphertext letter.
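A Tabula Recta lookup is the same thing as adding the two letter positions modulo 26. The sketch below builds the table explicitly to make that visible. The names TABULA_RECTA and lookup are chosen for this illustration only.

```python
import string

ALPHABET = string.ascii_uppercase

# Row r of the Tabula Recta is the alphabet shifted left by r places.
TABULA_RECTA = [ALPHABET[r:] + ALPHABET[:r] for r in range(26)]

def lookup(plain_letter: str, key_letter: str) -> str:
    """Plaintext letter selects the row (left edge), key letter the column (top)."""
    row = ALPHABET.index(plain_letter.upper())
    col = ALPHABET.index(key_letter.upper())
    return TABULA_RECTA[row][col]

print(lookup("m", "k"))  # W
print(lookup("e", "i"))  # M
```

Because the table is symmetric, swapping the roles of row and column gives the same ciphertext letter.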

As an example we shall encode the plaintext "meet me at the corner" using the keyword king. First we must generate the keystream, which starts with the keyword, and then continues with the plaintext itself, getting kingmeetme....

The keystream in the Autokey Cipher starts with the keyword, and is then followed by the plaintext itself.

With the keystream generated, we use the Tabula Recta, just like for the Vigenère Cipher. We find K across the top, and M down the left side. The ciphertext letter is "W".

For the second letter, "e", we go to I across the top, and E down the left to get the ciphertext letter "M".

Continuing in this way we get the ciphertext "WMRZYIEMFLEVHYRGF".

The Tabula Recta is used in the same way as we used it for encrypting the Vigenère Cipher.

The plaintext, keystream and ciphertext generated using the Autokey Cipher.
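The whole encryption can be written as a short function. This is a minimal sketch rather than a definitive implementation: the name autokey_encrypt is made up here, and it assumes the keyword contains only letters and that spaces in the message are simply dropped.

```python
import string

def autokey_encrypt(plaintext: str, keyword: str) -> str:
    """Encrypt with the Autokey Cipher; returns upper-case ciphertext."""
    letters = [ch for ch in plaintext.lower() if ch in string.ascii_lowercase]
    # Keystream: the keyword first, then the plaintext letters themselves.
    keystream = list(keyword.lower()) + letters
    out = []
    for p, k in zip(letters, keystream):
        shift = ord(k) - ord("a")
        out.append(chr((ord(p) - ord("a") + shift) % 26 + ord("A")))
    return "".join(out)

print(autokey_encrypt("meet me at the corner", "king"))
# WMRZYIEMFLEVHYRGF
```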

Decryption
To decrypt a ciphertext using the Autokey Cipher, we start just as we did for the Vigenère Cipher: find the first letter of the key across the top, find the ciphertext letter down that column, and take the plaintext letter at the far left of this row. This letter is not only the next plaintext letter; we also add it to the end of the keystream, as we shall need it later. We continue in this way, adding each decoded letter to the end of the keystream.

We shall decrypt the ciphertext "QNXEPKMAEGKLAAELDTPDLHN" which has been encrypted using the keyword queen. We start with the information shown in the table below.

The ciphertext and keyword. We will fill the rest of the keystream as we find the plaintext.

We look along the top row to find the letter from the keystream, Q. We look down this column (in yellow) and find the ciphertext letter "Q" (in green). We then go along this row (in blue) to the left hand edge, and the letter here (in purple) is the plaintext letter. In this case it is "a".

We now add this to the end of the keystream, as well as to the plaintext row.

We have added the first letter from the plaintext, and appended this to the end of the keystream as well.

In the same way as above, we find the keystream letter U, and find the ciphertext letter "N" in this column. We then follow this row to find the plaintext letter "t".

Again we add this plaintext letter to the end of the keystream.

With the second letter of the plaintext filled in.

We then continue in the same way to retrieve the plaintext "attack the east wall at dawn".

With all the keystream completed, we can decipher the whole message.
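The decryption procedure, including the step of feeding each recovered letter back into the keystream, can be sketched as follows. The function name autokey_decrypt is an assumption for this example, and it expects a non-empty keyword made of letters only.

```python
import string

def autokey_decrypt(ciphertext: str, keyword: str) -> str:
    """Decrypt an Autokey ciphertext; returns lower-case plaintext without spaces."""
    keystream = list(keyword.lower())  # grows as plaintext is recovered
    plaintext = []
    letters = [ch for ch in ciphertext.upper() if ch in string.ascii_uppercase]
    for i, c in enumerate(letters):
        shift = ord(keystream[i]) - ord("a")
        p = chr((ord(c) - ord("A") - shift) % 26 + ord("a"))
        plaintext.append(p)
        keystream.append(p)  # the recovered letter becomes a later key letter
    return "".join(plaintext)

print(autokey_decrypt("QNXEPKMAEGKLAAELDTPDLHN", "queen"))
# attacktheeastwallatdawn
```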

Discussion
The Autokey Cipher generates its keystream in a much more secure way than the Vigenère Cipher, which is striking given that the Vigenère Cipher was believed to be unbreakable for over 200 years. The weakness of the Vigenère Cipher was the repeating nature of the keystream, which allowed us to work out the length of the keyword and thus perform frequency analysis on the different parts.

The Autokey Cipher does not suffer from this weakness, as its keystream does not repeat. However, even though it is more secure, the Autokey Cipher can still be broken. The weakness here is that it is likely that some common words will have been used in the plaintext, and thus also in the keystream. For example "the" is likely to appear in the keystream somewhere, and so by trying this everywhere we can identify other bits of likely plaintext, and put these back in the keystream, and so on.

As an example, we have intercepted the message "PKBNEOAMMHGLRXTRSGUEWX", and we know an Autokey Cipher has been used. We are going to have a look to see if the word "the" produces any leads. If the word appears in the plaintext, then it is also likely to appear in the keystream. We start by putting "the" in every possible position in the keystream, to see if we get any fragments that make sense.

We place the word "THE" in the keystream at every point possible. We then decrypt the message in each case to get lots of trigrams of possible plaintext.

Some more of the possibilities for positions of "THE" in the keystream.

The final options for the positions of "THE" in the keystream.
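Trying "THE" at every keystream position is easy to automate. The sketch below uses a hypothetical helper, drag_crib, that decrypts the three ciphertext letters under each placement and lists the resulting trigrams for inspection.

```python
def drag_crib(ciphertext: str, crib: str) -> list[tuple[int, str]]:
    """Place the crib at each keystream position and decrypt the letters under it."""
    c = ciphertext.upper()
    crib = crib.upper()
    results = []
    for pos in range(len(c) - len(crib) + 1):
        fragment = "".join(
            chr((ord(ch) - ord(k)) % 26 + ord("a"))
            for ch, k in zip(c[pos:pos + len(crib)], crib)
        )
        results.append((pos, fragment))
    return results

for pos, fragment in drag_crib("PKBNEOAMMHGLRXTRSGUEWX", "the"):
    print(pos, fragment)
```

Positions are counted from 0 here. In this output position 8 gives "tac", position 14 gives "ako", position 16 gives "zzq" and position 18 gives "bxs".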

With this done, we identify the most likely plaintext fragments. For example, "bxs" and "zzq" are very unlikely plaintext, but "tac" and "ako" are more likely possibilities. We shall start with "tac". We know that, since it is an Autokey Cipher, if "tac" is plaintext it will also appear in the keystream. Also, if "THE" is in the keystream it appears in the plaintext.

If the keyword had length 4, then the "t" of "the" in the plaintext will be 4 places to the left of the "T" in "THE" in the keystream, and similarly for "tac". Putting this information in the grid we get the following table. The red letters are the information we can then work out using the Tabula Recta.

Keyword of length 4. The plaintext is 4 places further left than the corresponding keystream.

From this we would have "yxr" as some plaintext, which seems unlikely. So we try a different length of keyword. It is likely it is somewhere between 3 and 12 letters long. We shall look at the next couple.

Keyword of length 5.

Keyword of length 6.

We can continue down this route, but it does not get us anywhere. The hopeful "IGA" in the keystream (and keyword if it is of length 6), seems less likely with "arq" in the plaintext.
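The keyword-length checks can be reproduced with a short sketch. The helper names subtract and try_length are made up for this illustration, and positions are counted from 0. For a guessed keyword length, "the" must sit that many places earlier in the plaintext, and the fragment found under "THE" must reappear that many places later in the keystream.

```python
def subtract(cipher_fragment: str, known: str) -> str:
    """Subtract known letters from ciphertext letters modulo 26 (lower-case result)."""
    return "".join(
        chr((ord(c) - ord(k)) % 26 + ord("a"))
        for c, k in zip(cipher_fragment.upper(), known.upper())
    )

def try_length(ciphertext: str, crib: str, pos: int, length: int) -> tuple[str, str]:
    """Return (earlier keystream, later plaintext) implied by one keyword length.

    pos is where the crib is assumed to sit in the keystream.
    """
    n = len(crib)
    fragment = subtract(ciphertext[pos:pos + n], crib)  # plaintext under the crib
    # The crib is plaintext `length` places earlier, which reveals keystream there.
    back = pos - length
    earlier = subtract(ciphertext[back:back + n], crib).upper() if back >= 0 else ""
    # The fragment is keystream `length` places later, which reveals plaintext there.
    fwd = pos + length
    later = subtract(ciphertext[fwd:fwd + n], fragment)
    return earlier, later

CIPHERTEXT = "PKBNEOAMMHGLRXTRSGUEWX"
for length in (4, 5, 6):
    print(length, try_length(CIPHERTEXT, "the", 8, length))
```

For the "tac" position (8), length 4 yields the plaintext "yxr", and length 6 yields the keystream "IGA" together with the plaintext "arq". The same function can be called with position 14 to examine "ako".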

The plaintext "tac" has not helped us, so let's go back and try "ako". We do the same thing, but this time with the position of "THE" that produced "ako".

Keyword of length 4. "NEN" is possible for plaintext, but "uui" seems unlikely.

Keyword of length 5. "emj" is not a plausible ending for a plaintext.

Keyword of length 6. Both bits of possible plaintext here are plausible. Worth further investigation.

With this last one, we get "TAC" which is a possible piece of plaintext, and "wn" finishing the message, which could also work. With this, we decide to investigate a little bit more along this line of inquiry. Just as we did before, if "TAC" is in the keystream, it must be in the plaintext, so we can add it to the grid, and use it to work out some more keystream.

Adding the "tac" to the plaintext allows us to reveal some more of the keystream.

The revealed letters "INC" are the third, fourth and fifth letters of the keystream, and as we are working with a keyword of length 6, they would be in the keyword, not the plaintext. We can then think about words of length 6 with these letters (or use a crossword solver), and we find the most plausible is probably prince or flinch. We try the former of these.

The keyword prince gives us a first word "attack".
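Checking a candidate keyword only needs the first six ciphertext letters, since those are the ones encrypted with the keyword itself. A small sketch, with first_word as a made-up helper name:

```python
CIPHERTEXT = "PKBNEOAMMHGLRXTRSGUEWX"

def first_word(ciphertext: str, keyword: str) -> str:
    """Decrypt only the letters that are covered by the keyword itself."""
    return "".join(
        chr((ord(c) - ord(k)) % 26 + ord("a"))
        for c, k in zip(ciphertext.upper(), keyword.upper())
    )

for candidate in ("prince", "flinch"):
    print(candidate, first_word(CIPHERTEXT, candidate))
# prince attack
# flinch kztach
```

Only the first candidate produces readable text, which is what makes it the more convincing choice.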

As this has produced a word that makes sense, it is very likely that we have found the keyword. We can now continue to decode the message by adding the rest of the known plaintext to the keystream, or we can simply decrypt it now that we know the keyword.

We can add the plaintext to the keystream to continue to decrypt.

Finally, we retrieve the plaintext "attack at the break of dawn".

There are several parts to this system that worked well in this example. The first word we chose to check, "THE", was indeed in the plaintext. In reality, it may take a few goes to find a word that does appear. We also found a sensible plaintext segment on our second go with "ako". We could have tried many other possibilities before getting to this one. The final guess of the keyword relied on it being a word. To make the encryption more secure, the sender might have used a nonsensical 'word', which would have slowed us down as well.

Although there are difficulties in using this method, and it is quite long-winded to do by hand, with the help of a computer we can identify the possibilities very quickly.
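As a rough idea of what such computer help could look like, the sketch below takes one guessed plaintext word, one guessed position for it in the plaintext, and one guessed keyword length, and fills in every letter that follows from those guesses. The function name propagate is hypothetical, positions are counted from 0, and the sketch assumes the guessed word is no longer than the keyword. Judging which results look like English is still left to the reader.

```python
def propagate(ciphertext: str, crib: str, plain_pos: int, length: int) -> tuple[str, str]:
    """Spread one guessed plaintext fragment through an autokey ciphertext.

    Returns (keyword_pattern, plaintext_pattern) with '?' for unknown letters.
    """
    c = [ord(ch) - ord("A") for ch in ciphertext.upper()]
    plain = [None] * len(c)
    for offset, ch in enumerate(crib.lower()):
        plain[plain_pos + offset] = ord(ch) - ord("a")

    # Forward: plaintext letter i is the key letter for position i + length.
    for i in range(len(c) - length):
        if plain[i] is not None:
            plain[i + length] = (c[i + length] - plain[i]) % 26

    # Backward: the key letter at position i is plaintext letter i - length.
    for i in range(len(c) - 1, length - 1, -1):
        if plain[i] is not None:
            plain[i - length] = (c[i] - plain[i]) % 26

    # Within the first `length` positions the key letters belong to the keyword.
    keyword = "".join(
        "?" if plain[i] is None else chr((c[i] - plain[i]) % 26 + ord("A"))
        for i in range(min(length, len(c)))
    )
    text = "".join("?" if p is None else chr(p + ord("a")) for p in plain)
    return keyword, text

CIPHERTEXT = "PKBNEOAMMHGLRXTRSGUEWX"
for length in range(3, 13):
    print(length, *propagate(CIPHERTEXT, "the", 8, length))
```

With "the" at plaintext position 8 and a keyword of length 6, this gives the keyword pattern ??INC? and the plaintext pattern ??tac???the???ako???wn, matching the grid built by hand above.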
