C0SM0
- Mar 28, 2022
- 6 min read

Vigenere Cipher - CodeX

// Breaking the Vigenere Cipher with the CodeX...

Hey Hackers! Welcome back to the CodeX project, a Python suite for breaking ciphers and cryptographic algorithms. The Vigenere Cipher, also referred to as a "polyalphabetic cipher", is the latest cipher we will attempt to tackle. It was invented by Giovan Battista Bellaso in 1553, but Blaise de Vigenere later published a more advanced version of the cipher. The Vigenere Cipher is a symmetric encryption cipher, meaning it uses the same key to encrypt and decrypt text. The Vigenere Cipher is actually my favorite cipher, mainly because of how cool it is to break [more on that later ;) ].

// Indexing:

The Vigenere Cipher uses an index chart in which each letter has an index to represent it [and vice versa]: A is 0, B is 1, and so on up to Z at 25.

For the sake of this example, we will use “KEY” as our key and “ENCRYPT” as our plaintext. Note that each letter in our key and plaintext has an index.

//Basic Encryption:

The encryption process for the Vigenere Cipher is fairly simple. We take each index of our plaintext, iterate through our key, and add the indexes together to get a new value. Our key is "KEY" [10, 4, 24], so we add 10 to our first index, 4 to our second index, 24 to our third index, and cycle back to 10 for the fourth index. Keep in mind that if we add a key index to a plaintext index and it goes over 25, it needs to cycle back from 0 in order to keep it within the index range of 0 - 25.

Going down each column, we can see how the encryption process works.
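To make the arithmetic concrete, here is a minimal sketch of the same steps using "KEY" and "ENCRYPT". The helper names `to_indexes` and `encrypt_indexes` are made up for this illustration and are not part of the CodeX.

```python
import string

LETTERS = string.ascii_uppercase  # 'A' -> 0 ... 'Z' -> 25

def to_indexes(text):
    """Map each uppercase letter to its index on the chart."""
    return [LETTERS.index(ch) for ch in text]

def encrypt_indexes(plain, key):
    """Add the repeating key indexes to the plaintext indexes, wrapping past 25."""
    plain_idx = to_indexes(plain)
    key_idx = to_indexes(key)
    return [(p + key_idx[i % len(key_idx)]) % 26 for i, p in enumerate(plain_idx)]

cipher_idx = encrypt_indexes("ENCRYPT", "KEY")
ciphertext = "".join(LETTERS[i] for i in cipher_idx)

print(to_indexes("KEY"))      # [10, 4, 24]
print(to_indexes("ENCRYPT"))  # [4, 13, 2, 17, 24, 15, 19]
print(cipher_idx)             # [14, 17, 0, 1, 2, 13, 3]
print(ciphertext)             # ORABCND
```

The third letter is where the wrap-around kicks in: C (2) plus Y (24) is 26, which cycles back to 0, giving "A".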

//Basic Decryption:

The decryption process is the same as the encryption process, but we just invert our steps. We start with the ciphertext and subtract the key indexes from the ciphertext indexes. If a result drops below 0, it wraps around from 25 so that it stays within the index range of 0 - 25. Then we match our new indexes to characters on our index chart.
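As a quick sketch of the reverse direction, we can take "ORABCND" [which is what "ENCRYPT" becomes under the key "KEY"] and subtract the key indexes again. The function name `decrypt_text` is hypothetical and only handles uppercase letters, to keep the example short.

```python
import string

LETTERS = string.ascii_uppercase  # 'A' -> 0 ... 'Z' -> 25

def decrypt_text(cipher, key):
    """Subtract the repeating key indexes from the ciphertext indexes, modulo 26."""
    out = []
    for i, ch in enumerate(cipher):
        shift = LETTERS.index(key[i % len(key)])
        # Python's % always returns a non-negative result here, so -24 % 26 == 2
        out.append(LETTERS[(LETTERS.index(ch) - shift) % 26])
    return "".join(out)

recovered = decrypt_text("ORABCND", "KEY")
print(recovered)  # ENCRYPT
```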

//Python Encryption:

Because we don't want to do this by hand, we have Python to automate a lot of this for us. I will link my code here because the code is too complex to dive into completely in this article. Here is some pseudo code we can use when developing. This won't necessarily work if you just throw it into a file and run it, since it was designed to be implemented into the CodeX. But it should be able to explain the logic that you can incorporate into your own code.

# symbols that can't be processed through the cipher
SYMBOLS= ['\n', '\t', ' ', '.', '?', '!', ',', '/', '\\', '<', '>', '|','[', ']', '{', '}', '@', '#', '$', '%', '^', '&', '*', '(', ')','-', '_', '=', '+', '`', '~', ':', ';', '"', "'", '0', '1', '2', '3','4', '5', '6', '7', '8', '9']

LETTERS='ABCDEFGHIJKLMNOPQRSTUVWXYZ'

def encrypt_vigenere(plain_content, encryption_key, print_cnt):
    # output variable
    output = [] 
    index = 0

    # format key
    key = encryption_key.upper()
   
    # ciphering process
    for character in plain_content:
        num = LETTERS.find(character.upper())

        # starts encryption
        if num != -1:
            num += LETTERS.find(key[index])
            num %= len(LETTERS)
            
            # check for symbols
            if character in SYMBOLS:
                output.append(character)

            # check for uppercase
            elif character.isupper():
                output.append(LETTERS[num])

            # check for lowercase and others
            else:
                output.append(LETTERS[num].lower())

            # advance to the next key letter, wrapping back to the start of the key
            index = (index + 1) % len(key)

        # non-letter characters are passed through unchanged
        else:
            output.append(character)
    
    # outputs content to cli
    if print_cnt is True:
        print(f'Encrypted Content:\n{"".join(output)}\n')

    # outputs content to file; print_cnt is treated as the output file path
    else:
        with open(print_cnt, 'w') as f:
            f.write(''.join(output))
        print('Output written to file successfully')

The code is essentially iterating through every character in our plaintext. Each character is matched to its appropriate index, which is added to the appropriate key index. We take the resulting value [our new index] and associate it with its appropriate character.

// Decrypting With Python:

The decryption process is the same as above, but we subtract the key index from the ciphertext index.

# decryption process
def decrypt_vigenere(cipher_content, encryption_key, print_cnt):
    # output variable
    output = [] 
    index = 0

    # format key
    key = encryption_key.upper()
   
    # deciphering process
    for character in cipher_content:
        num = LETTERS.find(character.upper())

        # starts decryption
        if num != -1:
            num -= LETTERS.find(key[index])
            num %= len(LETTERS)
            
            # check for symbols
            if character in SYMBOLS:
                output.append(character)

            # check for uppercase
            elif character.isupper():
                output.append(LETTERS[num])

            # check for lowercase and others
            else:
                output.append(LETTERS[num].lower())

            # advance to the next key letter, wrapping back to the start of the key
            index = (index + 1) % len(key)

        # non-letter characters are passed through unchanged
        else:
            output.append(character)
    
    # outputs content to cli
    if print_cnt is True:
        print(f'Decrypted Content:\n{"".join(output)}\n')

    # outputs content to file; print_cnt is treated as the output file path
    else:
        with open(print_cnt, 'w') as f:
            f.write(''.join(output))
        print('Output written to file successfully')

//Cryptographic Wordlisting:

One of the key things I want the CodeX project to excel at is its ability to break ciphers. Sure, we can encrypt and decrypt. But if we don't know the key, we are stuck. So, I have been sure to include features in the CodeX project that allow us to decrypt ciphers without the key.

If you want to learn more about how we can break ciphers, you can check out our course on breaking ciphers here.

One of the ways we can break the Vigenere cipher is through cryptographic wordlisting [a dictionary attack]. This is basically the ability to iterate through a list of possible keys and attempt to decrypt the text using each of those keys. The implementation would look something like:

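for line in wordlist:
    # each line is a candidate key; strip the trailing newline before using it
    decrypt_vigenere(ciphertext, line.strip(), True)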
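Trying every key is only half the job, because something still has to decide which output is actually readable. Here is a minimal, self-contained sketch of one way to do that, assuming we score each candidate by how many of its words show up in a known vocabulary. The names `vigenere_decrypt`, `dictionary_attack`, and `COMMON_WORDS` are hypothetical and not taken from the CodeX, and the tiny vocabulary is a stand-in for a real English word list.

```python
import string

LETTERS = string.ascii_uppercase

# Tiny stand-in vocabulary for scoring; a real run would load a full word list.
COMMON_WORDS = {"THE", "AND", "ATTACK", "AT", "DAWN"}

def vigenere_decrypt(cipher, key):
    """Return the decrypted text as an uppercase string; non-letters pass through."""
    key = key.upper()
    out, k = [], 0
    for ch in cipher.upper():
        if ch in LETTERS:
            shift = LETTERS.index(key[k % len(key)])
            out.append(LETTERS[(LETTERS.index(ch) - shift) % 26])
            k += 1  # the key only advances on letters
        else:
            out.append(ch)
    return "".join(out)

def dictionary_attack(cipher, wordlist):
    """Try every candidate key and keep the one whose output has the most known words."""
    best_key, best_text, best_score = None, None, -1
    for line in wordlist:
        candidate = line.strip()
        if not candidate.isalpha():
            continue  # skip blank lines and keys with non-letters
        text = vigenere_decrypt(cipher, candidate)
        score = sum(word in COMMON_WORDS for word in text.split())
        if score > best_score:
            best_key, best_text, best_score = candidate, text, score
    return best_key, best_text

key, text = dictionary_attack("LXFOPV EF RNHR", ["password\n", "lemon\n", "key\n"])
print(key, text)  # lemon ATTACK AT DAWN
```

Word counting is a crude score, so on real ciphertext it is worth printing the top few candidates instead of trusting a single winner.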

//Letter Frequencies:

The concept of breaking the Vigenere Cipher with letter frequencies is what makes it my favorite cipher. Letter frequencies describe how commonly certain letters appear in text. We can break this cipher by combining these frequencies with the mathematical processes behind the cipher. This can be done in two steps.

1. Acquiring the key length
2. Discovering the key indexes

If we take our ciphertext and shift it against itself, we can find letters that line up on the same indexes. We can use this process to determine the key length. Let’s use “CCEAVCCFTEBYNIN” as our ciphertext.

A coincidence is a position where the same character appears at the same index in both lines. When we match each shifted line to our original ciphertext, we can see characters that appear in the same index [highlighted in white]. We will mark our greatest valued coincidences in yellow.

//Discovering Coincidences:

If we look at the greatest valued numbers that appear within our ciphertext, we can see the distance between them is four; therefore, our key likely has a length of four. The bigger the text, the more coincidences we’ll find. Most programs that break the Vigenere Cipher will require a minimum of 300 characters of ciphertext.
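Counting coincidences is easy to script. The sketch below assumes a coincidence means the ciphertext and a copy of itself moved over by `shift` positions hold the same letter, comparing only the part where the two lines overlap. The function name `count_coincidences` is made up for this example.

```python
def count_coincidences(text, shift):
    """Count positions where text[i] equals text[i + shift]."""
    return sum(1 for a, b in zip(text, text[shift:]) if a == b)

ciphertext = "CCEAVCCFTEBYNIN"
counts = {shift: count_coincidences(ciphertext, shift) for shift in range(1, 9)}
print(counts)  # {1: 2, 2: 1, 3: 0, 4: 1, 5: 2, 6: 1, 7: 1, 8: 0}
```

With this convention the two largest counts land on shifts 1 and 5, which are four apart. On a ciphertext this short the counts are mostly noise, so treat the result as an illustration of the method and not as proof of the key length.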

Let’s take our ciphertext and mark every fourth character, starting from the first index. Realistically speaking, our ciphertext would be much longer, so let’s assume we marked every fourth index within the whole ciphertext message. Let's count how many times each character appears in those marked positions.

We can now discover the letter frequencies of the ciphertext using the following equation.
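Here is a minimal sketch of that counting step, assuming a letter's frequency is simply the number of times it appears in the marked positions divided by the number of marked positions. The function name `column_frequencies` and its `start` parameter are hypothetical.

```python
from collections import Counter

def column_frequencies(ciphertext, key_length, start=0):
    """Take every key_length-th character from `start` and return its letter frequencies."""
    column = ciphertext[start::key_length]
    counts = Counter(column)
    total = len(column)
    # sorted() puts the letters in alphabetical order
    freqs = {letter: count / total for letter, count in sorted(counts.items())}
    return column, freqs

column, freqs = column_frequencies("CCEAVCCFTEBYNIN", 4)
print(column)  # CVTN
print(freqs)   # {'C': 0.25, 'N': 0.25, 'T': 0.25, 'V': 0.25}
```

Passing `start=1`, `start=2`, or `start=3` gives the characters that were encrypted with the second, third, and fourth key letters.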

// English Alphabet Frequencies:

Each letter in the alphabet has a certain frequency describing how commonly it appears in text. We can use these frequencies along with our aforementioned ciphertext frequencies to break the Vigenere Cipher.

//Discovering The Index:

In order to obtain the index of the key, sort the ciphertext characters in alphabetical order. We can then line up each ciphertext frequency with the matching alphabetical frequency. Multiply each pair of frequencies together [product] and add the products together [sum]. After each sum, we shift the ciphertext frequencies by one place and repeat the process until we have cycled through all of the frequencies. Because our key has a size of four, we run this whole process four times, once for each position in the key.

//Examining Our Products:

The shift with the highest sum is the first index of our four letter key. We can see that the first index of our key is zero. In the Vigenere Cipher, the letter at index zero is “A”. To get the other indexes, we simply repeat this process using every fourth character starting from the second, third, and fourth index of the ciphertext.
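The product and sum steps can be sketched in a few lines. This is only an illustration under some assumptions: the English frequencies are approximate, commonly cited percentages, the names `score_shifts` and `best_shift` are made up, and the demo column is a short made-up message shifted by "K" [index 10], not the ciphertext from the walkthrough.

```python
import string
from collections import Counter

LETTERS = string.ascii_uppercase

# Approximate English letter frequencies in percent, A through Z.
ENGLISH_FREQ = [8.17, 1.49, 2.78, 4.25, 12.70, 2.23, 2.02, 6.09, 6.97,
                0.15, 0.77, 4.03, 2.41, 6.75, 7.51, 1.93, 0.10, 5.99,
                6.33, 9.06, 2.76, 0.98, 2.36, 0.15, 1.97, 0.07]

def score_shifts(column):
    """For each of the 26 shifts, sum the products of ciphertext and English frequencies."""
    counts = Counter(column)
    total = len(column)
    scores = []
    for shift in range(26):
        score = sum((count / total) * ENGLISH_FREQ[(LETTERS.index(letter) - shift) % 26]
                    for letter, count in counts.items())
        scores.append(score)
    return scores

def best_shift(column):
    """Return the shift with the highest sum, which is the likely key index."""
    scores = score_shifts(column)
    return max(range(26), key=lambda s: scores[s])

# Demo column: every letter of a short message shifted by 10 ("K").
column = "".join(LETTERS[(LETTERS.index(c) + 10) % 26] for c in "MEETMEATTHETREE")
shift = best_shift(column)
print(shift, LETTERS[shift])  # 10 K
```

Running `best_shift` on each of the four columns of a real ciphertext would give the four key indexes. Short columns can produce the wrong winner, which is part of why longer ciphertexts work better.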

//Automation:

Now, that was a lot of math that nobody wants to do manually. Thankfully, because it's just math, we can "easily" implement it into Python [it was not easy]. Again, the code is too complex to showcase in this article, so I have left the link to our implementation here. Thanks for reading, and as always,

Happy Hacking!
