The frequency-analysis-simulator from varunsingh87

frequency-analysis-simulator's Issues

Data visualizer

Graph the data in a new executable Java program.

GUI for Data Collection

Goal

Make it easy to select the factors for cipher-decryption data collection through a friendly graphical interface.

Requirements

Tabs for Cipher type: Vigenere and simple substitution
Widget for selecting parameters: key, Caesar decryption algorithm, key length calculation algorithm
Widget for inputs, grouped into sets, and option to add new inputs
Button to perform the data collection

Rewrite simple substitution cipher

The current implementation of the simple substitution cipher takes at least O(n^3) time, which is startlingly inefficient. Rewrite SimpleSubstitutionCipher to use chi-square scoring, frequency analysis, and a tree of possibilities first, and consult an indexed dictionary only after those steps.
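As a rough illustration of the chi-square scoring step named above, the sketch below compares the letter counts of a candidate plaintext with expected English frequencies. The class name ChiSquareScorer, its method names, and the frequency table are assumptions made for this example and are not taken from the repository. It covers only the scoring step, not the tree search or the dictionary lookup.

```java
import java.util.Locale;

/** Hypothetical helper (not from the repository): scores text against English letter frequencies. */
final class ChiSquareScorer {
    // Approximate relative frequencies of A..Z in English text, in percent (assumed values).
    private static final double[] ENGLISH_PERCENT = {
        8.17, 1.49, 2.78, 4.25, 12.70, 2.23, 2.02, 6.09, 6.97, 0.15, 0.77, 4.03, 2.41,
        6.75, 7.51, 1.93, 0.10, 5.99, 6.33, 9.06, 2.76, 0.98, 2.36, 0.15, 1.97, 0.07
    };

    /** Counts occurrences of A-Z, ignoring case and every non-letter character. */
    static int[] letterCounts(String text) {
        int[] counts = new int[26];
        for (char c : text.toUpperCase(Locale.ROOT).toCharArray()) {
            if (c >= 'A' && c <= 'Z') {
                counts[c - 'A']++;
            }
        }
        return counts;
    }

    /** Chi-square statistic of the text's letter counts; a lower value means closer to English. */
    static double chiSquare(String text) {
        int[] counts = letterCounts(text);
        int total = 0;
        for (int n : counts) {
            total += n;
        }
        if (total == 0) {
            return Double.POSITIVE_INFINITY; // no letters, so nothing to compare
        }
        double sum = 0.0;
        for (int i = 0; i < 26; i++) {
            double expected = total * ENGLISH_PERCENT[i] / 100.0;
            double diff = counts[i] - expected;
            sum += diff * diff / expected;
        }
        return sum;
    }
}
```

A search over candidate keys could rank each partial decryption by chiSquare and expand only the lowest-scoring branches, which is one way to keep the tree of possibilities small before the dictionary is consulted.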

Integrate this into the UI (frequencyanalysissimulator.presentation.main.Main) and the dataanalysis package, with a button for choosing the substitution cipher. Given a ciphertext without the key, the algorithm should find the plaintext efficiently enough that it seems instantaneous when run. The UI can also report various properties of the result where relevant.

Index of coincidence gives negative value in dataset

According to William Friedman's test, the index of coincidence of a polyalphabetic ciphertext should be less than 0.0660. Every input in the dataset should have this property because each one is encrypted with the Vigenère cipher. Instead, some values at the beginning of the data collection for friedman_kerckhoff are large and negative, because the index of coincidence is greater than kappa_r (the probability of drawing a given letter uniformly at random from the case-insensitive English alphabet, 1/26).

The bug should not be in the Friedman test itself, because that formula was copied directly. Possible sources of the bug:

Vigenère encryption, in which case the entire dataset must be recollected (all experimental groups, not just the ones using Friedman) once tests prove that the encryption is perfectly accurate
Index of coincidence calculation (though it also appears to be taken directly from the formula)
Floating-point precision
Use of cipherText versus letterOnlyCipherText
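To help narrow down the candidates listed above, here is a minimal sketch of the two calculations involved, assuming the textbook definitions: the index of coincidence as the sum of n_i(n_i - 1) over N(N - 1), and the Friedman key-length estimate as (kappa_p - kappa_r) / (observed - kappa_r). The class FriedmanSketch, its method names, and the constant values are assumptions for illustration and may differ from the repository's code. Under these definitions the index of coincidence itself cannot be negative, so a negative value would have to come from a derived quantity; in this form of the estimate the sign flips when the observed index drops below kappa_r, which may be worth checking against the direction stated in the report.

```java
import java.util.Locale;

/** Hypothetical sketch (not the repository's code) of the index of coincidence and Friedman estimate. */
final class FriedmanSketch {
    static final double KAPPA_P = 0.0660;     // assumed English value, matching the threshold in the report
    static final double KAPPA_R = 1.0 / 26.0; // uniform random selection from 26 letters

    /** Index of coincidence over letters only; case, spaces, and punctuation are ignored. */
    static double indexOfCoincidence(String text) {
        long[] counts = new long[26];
        long n = 0;
        for (char c : text.toUpperCase(Locale.ROOT).toCharArray()) {
            if (c >= 'A' && c <= 'Z') {
                counts[c - 'A']++;
                n++;
            }
        }
        if (n < 2) {
            return 0.0; // undefined for fewer than two letters; 0 is a convention chosen here
        }
        long matchingPairs = 0;
        for (long k : counts) {
            matchingPairs += k * (k - 1);
        }
        return (double) matchingPairs / (n * (n - 1));
    }

    /** Friedman key-length estimate; negative whenever observedIoc is below KAPPA_R. */
    static double estimateKeyLength(double observedIoc) {
        return (KAPPA_P - KAPPA_R) / (observedIoc - KAPPA_R);
    }
}
```

Because indexOfCoincidence filters to letters before counting, passing it cipherText or letterOnlyCipherText gives the same result; a calculation that divides by the unfiltered length instead would not, which is one way the last candidate could matter.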

Account for spaces and punctuation when analyzing data

Right now DataCollector sets the ciphertext length by taking a substring of the original input, but that input includes punctuation and spaces, so the recorded lengths (and the data derived from them) are erroneous. Use a different algorithm, or modify the current one to ignore spaces and punctuation when counting the length, unless the effect is negligible.
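One possible approach, sketched below, is to count only letters when truncating the input, so that a requested length of n always yields n letters regardless of spacing and punctuation. The class LetterOnlyText and its methods are hypothetical names for this example; how DataCollector actually takes its substring is not shown in the issue, so this is an assumption about the shape of the fix rather than a patch.

```java
/** Hypothetical helper (not from the repository) for measuring and truncating text by letters only. */
final class LetterOnlyText {
    private static boolean isAsciiLetter(char c) {
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
    }

    /** Returns the text with everything except A-Z and a-z removed. */
    static String lettersOnly(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (isAsciiLetter(c)) {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    /**
     * Returns the shortest prefix of text that contains letterCount letters,
     * keeping any spaces and punctuation inside it. If the text has fewer
     * letters than requested, the whole text is returned.
     */
    static String prefixWithLetters(String text, int letterCount) {
        if (letterCount <= 0) {
            return "";
        }
        int seen = 0;
        for (int i = 0; i < text.length(); i++) {
            if (isAsciiLetter(text.charAt(i)) && ++seen == letterCount) {
                return text.substring(0, i + 1);
            }
        }
        return text;
    }
}
```

For comparison, "Hi, you!".substring(0, 3) contains only two letters, whereas prefixWithLetters("Hi, you!", 3) extends the prefix until the third letter is included.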

Variants of Vigenere cipher

Key as data