
varunsingh87 / frequency-analysis-simulator


A Java program that decrypts cryptograms without keys using frequency analysis

Home Page: https://varunsingh87.github.io/Frequency-Analysis-Simulator/

License: GNU General Public License v2.0

Java 100.00%
frequency-analysis cryptanalysis java cipher frequency-analysis-simulator monoalphabetic-cipher vigenere-cipher computer-science-project data-visualization


frequency-analysis-simulator's Issues

GUI for Data Collection

Goal

Let users easily select the factors for data collection on cipher decryption through a friendly graphical interface.

Requirements

  • Tabs for cipher type: Vigenère and simple substitution
  • Widget for selecting parameters: key, Caesar decryption algorithm, key length calculation algorithm
  • Widget for inputs, grouped into sets, and option to add new inputs
  • Button to perform the data collection

Rewrite simple substitution cipher

The current implementation of the simple substitution cipher runs in at least O(n^3) time, which is startlingly inefficient. Rewrite SimpleSubstitutionCipher using chi-square scoring, frequency analysis, and a tree of possibilities, falling back on an indexed dictionary only after these.

Integrate this into the UI (frequencyanalysissimulator.presentation.main.Main) and the dataanalysis package, with a button to choose the substitution cipher. The algorithm should find the plaintext from a ciphertext without the key, efficiently enough to seem instantaneous when run. The UI can also display various properties where relevant.
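The frequency-analysis and chi-square steps named above could be sketched roughly as follows. This is a minimal illustration, not the project's implementation: the class name FrequencyScoring and its methods are hypothetical, and the frequency table holds commonly quoted approximate values for English. It builds a first-guess key by frequency rank and scores text with chi-square so that a later search (for example the tree of possibilities) could rank candidates.

```java
import java.util.Arrays;

// Hypothetical helper, not part of the repository.
final class FrequencyScoring {
    // Approximate English letter frequencies (percent) for A..Z.
    static final double[] ENGLISH_PERCENT = {
        8.167, 1.492, 2.782, 4.253, 12.702, 2.228, 2.015, 6.094, 6.966,
        0.153, 0.772, 4.025, 2.406, 6.749, 7.507, 1.929, 0.095, 5.987,
        6.327, 9.056, 2.758, 0.978, 2.360, 0.150, 1.974, 0.074
    };
    // English letters from most to least frequent, per the table above.
    static final String ENGLISH_ORDER = "ETAOINSHRDLCUMWFGYPBVKJXQZ";

    /** Counts ASCII letters case-insensitively; everything else is skipped. */
    static int[] letterCounts(String text) {
        int[] counts = new int[26];
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c >= 'a' && c <= 'z') c = (char) (c - 'a' + 'A');
            if (c >= 'A' && c <= 'Z') counts[c - 'A']++;
        }
        return counts;
    }

    /** Chi-square distance from English; lower means more English-like. */
    static double chiSquare(String text) {
        int[] counts = letterCounts(text);
        int total = 0;
        for (int c : counts) total += c;
        if (total == 0) return Double.POSITIVE_INFINITY;
        double score = 0.0;
        for (int i = 0; i < 26; i++) {
            double expected = total * ENGLISH_PERCENT[i] / 100.0;
            double diff = counts[i] - expected;
            score += diff * diff / expected;
        }
        return score;
    }

    /**
     * First-guess key: key[cipherLetter - 'A'] is the guessed plaintext
     * letter, pairing the k-th most frequent cipher letter with the k-th
     * most frequent English letter. Ties keep alphabetical order.
     */
    static char[] initialGuess(String cipherText) {
        int[] counts = letterCounts(cipherText);
        Integer[] order = new Integer[26];
        for (int i = 0; i < 26; i++) order[i] = i;
        // Arrays.sort on objects is stable, so equal counts stay in A..Z order.
        Arrays.sort(order, (a, b) -> Integer.compare(counts[b], counts[a]));
        char[] key = new char[26];
        for (int rank = 0; rank < 26; rank++) {
            key[order[rank]] = ENGLISH_ORDER.charAt(rank);
        }
        return key;
    }
}
```

Both methods make a single pass over the text plus constant work over the 26 letters, so they stay linear in the input length. The rank-based guess is usually wrong for several letters on short texts, which is why it is only a starting point for further search.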

Index of coincidence gives negative value in dataset

According to William Friedman's test, the index of coincidence for a polyalphabetic cipher should be less than 0.0660. Every input in the dataset should have this property, because all of them are encrypted with the Vigenère cipher. Instead, some entries at the beginning of the data collection for friedman_kerckhoff have large negative values, because the index of coincidence is greater than kappa_r (the probability of uniform random selection from the case-insensitive English alphabet).

The bug is unlikely to be in the Friedman test itself, because the formula was copied directly. Possible sources of the bug:

  • Vigenère encryption, in which case the entire dataset must be recollected (all experimental groups, not just the ones using Friedman) once tests prove that the Vigenère encryption is perfectly accurate.

  • The index of coincidence calculation (though it also appears to be taken directly from the formula).

  • Floating-point precision.

  • Use of cipherText vs. letterOnlyCipherText.
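To help narrow down the candidates above, here is a minimal sketch of the textbook index of coincidence and Friedman key-length estimate. The class FriedmanSketch and its method names are hypothetical and are not taken from the repository; KAPPA_P uses the 0.0660 figure quoted in this issue. Only ASCII letters are counted, so the result does not depend on whether cipherText or letterOnlyCipherText is passed in.

```java
// Hypothetical helper, not part of the repository.
final class FriedmanSketch {
    // Value for English plaintext as quoted in the issue text.
    static final double KAPPA_P = 0.0660;
    // Probability that two uniformly random letters of a 26-letter alphabet match.
    static final double KAPPA_R = 1.0 / 26.0;

    /**
     * Index of coincidence: sum of f_i * (f_i - 1) over N * (N - 1),
     * where f_i is the count of letter i and N is the number of letters.
     * Non-letters are skipped. Returns NaN for fewer than two letters.
     */
    static double indexOfCoincidence(String text) {
        long[] counts = new long[26];
        long n = 0;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c >= 'a' && c <= 'z') c = (char) (c - 'a' + 'A');
            if (c >= 'A' && c <= 'Z') {
                counts[c - 'A']++;
                n++;
            }
        }
        if (n < 2) return Double.NaN;
        long sum = 0;
        for (long f : counts) sum += f * (f - 1);
        // Integer arithmetic up to the final division avoids rounding drift.
        return (double) sum / (double) (n * (n - 1));
    }

    /** Textbook Friedman estimate: (kappa_p - kappa_r) / (kappa_o - kappa_r). */
    static double estimateKeyLength(double observedIc) {
        return (KAPPA_P - KAPPA_R) / (observedIc - KAPPA_R);
    }
}
```

Computed this way, the index of coincidence itself always lies between 0 and 1, so a negative number can only come from a later step. With the textbook estimate shown here, the result is negative when the observed index falls below kappa_r and grows very large when the two are nearly equal; it may be worth checking which of these cases the early friedman_kerckhoff inputs fall into.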

Account for spaces and punctuation when analyzing data

DataCollector currently derives the cipher length by substringing the original input, but that input includes punctuation and spaces, so the data is erroneous. Use a different algorithm, or modify the current one, to ignore spaces and punctuation when counting the length, unless the effect is negligible.
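One way the length handling could ignore spaces and punctuation is sketched below. The class LetterOnlyLength and its methods are hypothetical and do not reflect how DataCollector is actually written; the sketch assumes that only ASCII letters count toward the cipher length.

```java
// Hypothetical helper, not part of the repository.
final class LetterOnlyLength {
    /** True for ASCII letters only; digits, spaces, and punctuation are excluded. */
    static boolean isLetter(char c) {
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
    }

    /** Number of letters in the text, ignoring every other character. */
    static int letterCount(String text) {
        int n = 0;
        for (int i = 0; i < text.length(); i++) {
            if (isLetter(text.charAt(i))) n++;
        }
        return n;
    }

    /** The text with every non-letter removed. */
    static String lettersOnly(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (isLetter(c)) sb.append(c);
        }
        return sb.toString();
    }

    /**
     * Shortest prefix of the text that contains the requested number of
     * letters, with spaces and punctuation inside it kept intact.
     * Returns the whole text if it has fewer letters than requested.
     */
    static String prefixWithLetters(String text, int letters) {
        if (letters <= 0) return "";
        int seen = 0;
        for (int i = 0; i < text.length(); i++) {
            if (isLetter(text.charAt(i)) && ++seen == letters) {
                return text.substring(0, i + 1);
            }
        }
        return text;
    }
}
```

For example, prefixWithLetters("Hi, there!", 3) yields "Hi, t", which contains exactly three letters, whereas a plain substring(0, 3) would yield "Hi," with only two.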
