Solving the Caesar Shift cipher

Caesar Shift is one of the oldest means of encrypted communication. It is a type of substitution cipher. The technique is as follows:

Given a message: CAESAR and a key: 12, we shift each letter of the message by key mod 26 (key%26). Key values should be between 0 – 25 (0 meaning no shift). However, we use mod 26 to accommodate key values greater than 26, so that the mod function maps it back to the range 0 – 25. The shift can be positive or negative, meaning for each letter in the message, we can replace it with a letter that comes ‘key = 12’ places after it or before it respectively.

Positive Caesar shift on message: CAESAR with key: 12:

C → O

A → M

E → Q

S → E

A → M

R → D

So the encoded message is: OMQEMD

Notice that while shifting letters, whenever we reach Z, we circle around from A, as in the case of S and R above.

Similarly, negative Caesar shift would give:

C → Q

A → O

E → S

S → G

A → O

R → F

So the encoded message is: QOSGOF

Notice that a positive encryption key ‘k’ corresponds to a negative encryption key of 26 – k.

In our example above, using a positive key of 12 and a negative key of (26 – 12) = 14 would result in the same encoding: OMQEMD.

There are many variations to the Caesar cipher: In one of them, the keys vary with each letter. The variation of the keys may be represented by a function, like f(key) = (key + 1), where the key increases with each letter.

Here are a few sources on the Caesar cipher:

http://en.wikipedia.org/wiki/Caesar_cipher

http://www.simonsingh.net/The_Black_Chamber/caesar.html

http://www.cs.trincoll.edu/~crypto/historical/caesar.html

The Idea:

We have an encoded message and we do not know the key. We can find either the positive key or the negative key, as the two are related. We also need a dictionary source (a text file consisting of all words in the English language) to help in the decoding process.

The program works as follows:

→ Try all keys from 1 to 25

→ After decoding with a particular key ‘k’ check the dictionary file for the decoded word.

If it exists, it is a legitimate word. Output the word and key ‘k’ and continue.

Else continue without any output.

For large words and sentences, the procedure can be quite time-consuming. Most of the time is used by the program in dictionary lookup.

A faster solution would be to remove the dictionary lookup altogether and let the program output the decoded word for each key k from 1 – 25. The output words can then be checked by the user for legitimacy.

Source Code:

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

FILE *WORDLIST;
int flag = 0;

int main()
{

	int len;
	int key, i;
	char word[50];

	clock_t start, end;

	WORDLIST = fopen("dictionary.txt", "r");

	system("clear");
	printf("Enter word (in small letters):\n>> ");
	scanf("%s", word);
	len=strlen(word);
	len++;

	char *buffer = malloc(sizeof(100));

	for(key=1; key<26; key++)
	{
		unsigned char temp[len]; 	/* Unsigned is necessary else program fails when value of temp reaches 128 */
		strcpy(temp, word);

		for(i=0; i<len-1; i++)
		{
			temp[i]=word[i]+key;
			if(temp[i]>=123) 	/* If word[i] crosses the character 'z' */
			{
				temp[i]=97+temp[i]-123; /* then replace it with corresponding character from beginning of alphabet */
			}
		}

		while(fscanf(WORDLIST, "%s", buffer) != EOF)
		{
			if(strcmp(buffer, temp) == 0)
			{
				flag = 1;
				printf("Match: %s\n", buffer);
			}
		}

		rewind(WORDLIST);

	}

if(flag == 0) printf("\nNo matches found.\n");

fclose(WORDLIST);

return 0;

}
Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s