# Introduction

I set out to build a web application that could encrypt plaintext and decrypt ciphertext using the Shift, Vigenère and Affine cipher. I wanted to build my skills in the backend and frontend so I settled upon using Flask as a lightweight backend micro-service that would handle all the encryption and decryption computation paired with a React frontend, sending http request with Axios.

I followed this guide to setup my project skeleton, which showed me how to setup a Flask project and connect it to a React project, and host it all on Heroku for free.

With that all complete and my skeleton code pushed to Github, I created a Github project to track my progress and list out all the features I was hoping to incorporate in my web app.

## Core Features

- Web application built with React and Flask hosted on Heroku.
- Shift cipher algorithm for encryption and decryption with key.
- Vigenère cipher algorithm for encryption and decryption with key.
- Affine cipher algorithm for encryption and decryption with key.

## Advanced Features

- Auto decryption algorithms for all three ciphers.
- Vigenère cipher algorithm to decrypt ciphertext only knowing the length of the key.

# Flask Backend

## API

Now that I had my skeleton complete, I started with building the backend API for expected requests, which would be routed to the appropriate cipher algorithm for encryption/decryption.

Encryption routes will return a dictionary containing the ciphertext as a String, while decryption routes will return a dictionary containing a list of decryptions in JSON format to allow me to parse and map the decryptions so I can display them in React.

## Cipher Algorithms

**Shift Cipher**

**Vigenère Cipher**

**Affine Cipher**

After I had written and tested the cipher algorithms for correctness, the next step was to build the frontend components in React that would use them.

# React Frontend

## Components

In order to navigate between the different ciphers on a single page application, I used Reacts HashRouter to display the appropriate components.

Once I had each “page” in place, I continued to build off my skeleton code to create a general template for how each cipher encrypt/decrypt component would be structured and operate. In general, I needed a form to gather user input, state management to handle form changes, a button to handle form submittal, and some logic to know when to display a response.

Once I had created a component for each ciphers’ encryption and decryption, the core functionality of the web app was complete — I was able to accurately encrypt plaintext and decrypt ciphertext when given a key for each cipher by sending a POST request with data to the backend for computation, receiving a response and displaying it in the browser.

# Advanced Features

## Auto-Decryption Shift Cipher

The first advanced feature I wanted to add was giving the user the ability to auto-decrypt a shift cipher without knowing the key. The brute-force solution to this problem is elementary; try all possible shift values on the ciphertext and display the plaintext, one of which will look like English.

In order to improve upon this, I ran each decryption through a python language detection library, PyCLD2, which would try and detect the language of the text as well as rate the reliability of its detection results.

If any reliable decryptions were found, those were returned. Otherwise all 26 decryptions would be returned, ordered by their decryption score calculated by PyCLD2.

## Auto-Decryption Affine Cipher

The second advanced feature I added was decrypting an Affine cipher automatically. This solution was straightforward as there are only 12 values in the domain for alpha, and 26 values in the domain for beta, allowing us to brute-force a decryption by trying all 12 * 26 = 312 possible keys.

After we generate a list of all possible decryptions, run them through the PyCLD2 to find the most likely decryptions. Again, if any reliable decryptions were found, those were returned, otherwise all 312 decryptions would be returned order by their decryption score calculated by PyCLD2.

## Auto-Decryption Vigenère Cipher

The last advanced feature I wanted to add was auto-decrypting a Vigenère cipher. I was able to do this in two ways:

- The user enters the ciphertext and the length of the key.
- The user enters the ciphertext without knowing the key or length of the key.

For both of these options, results can vary depending on the length of the ciphertext and length of the key. In general, a longer ciphertext leads to a higher likelihood of a correct decryption/key detection due to having more data to analyze.

To generate possible decryptions/keys, I performed a Kasiski Examination paired with a chi squared supported frequency analysis, and applied a lighweight python language detection library, PyCLD2. These three tools worked together to first determine possible key lengths, then identify the best possible letters for each position in the key, and lastly rate the quality of the decryption using a potential key.

## Kasiski Examination

- The first step in the kasiski Examination is to search the ciphertext for any repeated trigrams (any length 3 substring that occurs more than once).
- Once the repeated trigrams have been located, calculate the distance between the start of each trigram, and the matching trigram that follows. Do this for all trigrams.
- After all the distances have been found between repeated trigrams, list out all the factors, excluding 1, for all of the distances.
- Count the frequency of all found factors, the most frequent factor is a good estimation for the length of the Vigenère key.

**I slightly modified my approach in the follwing ways —**

*In order to help with shorter ciphertext lengths, if no repeated trigrams were found, search for repeated bigrams, decreasing to 1 until repeats were found.**Additionally, on the first key length guess attempt, only search for keys with a length greater than 3, as factors 2 and 3 tend to occur at a high frequency for larger key lengths, overshadowing the importance of the larger factors.**Lastly, I return a list of the top two most likely key lengths to increase my chances at an accurate decryption, making this tradeoff for an increase in computation/hitting potential memory constraints on Heroku.*

Once I have found the possible key lengths using the Kasiski Examination, I generate a list of only the likely keys for each length:

- Split ciphertext into n subgroups, where n is the length of the key, placing every n’th character into the subgroup, starting at index 0 to n-1.

2. Run the subgroup through a no key, shift cipher decryption to identify which shift value(s) best matches an english sentence using a frequency analysis. Calculating the chi squared value for that shift, and keeping the shifts that have a chi squared value lower than 25. If none, take top two shifts.

## Chi Squared formula

*The closer the observed frequency of each letter in the decryption is to the expected frequency of the same letter in English leads to a lower value.**The chi squared is the sum of all these values.**The smaller the chi squared value is for a decryption, the more likely it could be an English sentence, and therefore the shift value that produced this decryption could be our key.*

Example: Lets say I find that [10, 0, 23] meet the criteria for good shift values for the first letter

3. Translate the shift value(s) to its respective letter in the alphabet.

Example: [‘k’, ‘a’, ‘w’]

4. Repeat steps 2 and 3 n times, generating a list of lists, where each nested list represents a list of good potential letters in that position of the key.

Example: [[‘k’, ‘a’, ‘w’], [‘e’], [‘x’, ‘y’]]

5. Iterate through the previously made list of lists, creating a single list containing all possible keys using the identified letter(s) at each position.

Example: [“kex”, “key”, “aex”, “aey”, “wex”, “wey”]

7. Repeated steps 1 through 6 for each key length, generating a list of all possible decryptions for all found keys/key lengths.

8. Determine reliability of each decryption by running through the language detection library, providing an overall score for each decryption/key.

9. If none are found to reliably translate to English, or do not meet the minimum score threshold of 1000, try generating keys again with smaller key lengths: [1, 2, 3]

10. Return all the reliable decryptions found, if none, return all the decryptions found.

# Resources

**Setup Project Skeleton**

**Kasiski Examination for Vigenère auto-decrypt**

**Github Repo**

**Website**