txtCompare

Tools for text comparison

intertextFinder

How to use (with texts provided in the .txt format, UTF-8 encoding):

  • put the corpus into a folder named TXT in the same folder as this script
  • put the texts to analyze into a folder named todo in the same folder as this script
  • run the script from the console: python intertextFinder.py
  • for each text file in the todo folder, a new file with the same name and the extra extension .html is added to the same folder; open it in a web browser to see the results
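
The layout described in the steps above can be checked before running the script. The following is a minimal sketch and is not part of intertextFinder.py: the helper name `expected_outputs` is hypothetical, and it assumes only what the steps state, namely a `TXT` folder, a `todo` folder, and one output file per input file, named by appending `.html` to the input file name.

```python
from pathlib import Path


def expected_outputs(script_dir):
    """Return the .html paths that should appear in todo/ after a run.

    Hypothetical helper: it only mirrors the naming rule from the README
    (same name plus the extra extension .html) and does not run the tool.
    """
    base = Path(script_dir)
    corpus, todo = base / "TXT", base / "todo"
    for folder in (corpus, todo):
        if not folder.is_dir():
            raise FileNotFoundError(f"missing folder: {folder}")
    # "poem.txt" is expected to produce "poem.txt.html" in the same folder.
    return [p.with_name(p.name + ".html") for p in sorted(todo.glob("*.txt"))]
```

Calling `expected_outputs(".")` from the script's folder lists the result files to look for once the run has finished.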

Demo of the result at https://philippegambette.github.io/txtCompare/intertextFinder

pairwiseMedite

Automatically call MEDITE at http://obvil.lip6.fr/medite/:

  • to run pairwise comparisons of all UTF-8 encoded text files placed in a folder named files, next to the script, run: python pairwiseMedite.py
  • to compare two long text files that were previously split into smaller parts (file1-1.txt compared with file2-1.txt, file1-2.txt with file2-2.txt, etc., if file1 and file2 are given as input file names): python pairwiseMedite.py L_education_sentimentale_1870.txt L_education_sentimentale_1880.txt
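
The two modes above differ only in how the pairs of files are chosen. As an illustration, and not as the actual code of pairwiseMedite.py, the pairing could be sketched as follows. The function names `all_pairs` and `part_pairs` are hypothetical, and the sketch assumes that the parts are named after the input file name without its .txt extension and that they sit in a folder passed as `folder`.

```python
from itertools import combinations
from pathlib import Path


def all_pairs(folder):
    """Every unordered pair of .txt files in `folder` (first mode)."""
    return list(combinations(sorted(Path(folder).glob("*.txt")), 2))


def part_pairs(file1, file2, folder="."):
    """Pair file1-1.txt with file2-1.txt, file1-2.txt with file2-2.txt, etc.

    Assumption: parts are numbered from 1 without gaps; pairing stops at the
    first number for which either part is missing.
    """
    stem1, stem2 = Path(file1).stem, Path(file2).stem
    pairs = []
    n = 1
    while True:
        a = Path(folder) / f"{stem1}-{n}.txt"
        b = Path(folder) / f"{stem2}-{n}.txt"
        if not (a.exists() and b.exists()):
            break
        pairs.append((a, b))
        n += 1
    return pairs
```

With n files in the folder, the first mode yields n(n-1)/2 comparisons, so the number of MEDITE calls grows quickly with the size of the folder.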

Requires Python 3 and Selenium with the Firefox browser (https://selenium-python.readthedocs.io/installation.html).

Demo of the result at https://philippegambette.github.io/txtCompare/pairwiseMedite

To split long files into smaller parts, insert the character ? each time you want to split the file, then run the script decoupeOuvrage.py with the filename as input: python decoupeOuvrage.py L_education_sentimentale_1870.txt
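
For readers who want to see what such a split amounts to, here is a minimal sketch. It is not the code of decoupeOuvrage.py: the function names are hypothetical, the split character is passed in as `marker` rather than hard-coded, and the output naming (`name-1.txt`, `name-2.txt`, ...) is an assumption based on the part names expected by pairwiseMedite.py.

```python
from pathlib import Path


def split_text(text, marker):
    """Split `text` at every occurrence of `marker`.

    Design choice of this sketch: surrounding whitespace is stripped and
    empty parts are dropped.
    """
    return [part.strip() for part in text.split(marker) if part.strip()]


def write_parts(filename, marker):
    """Write each part next to the source file as <stem>-<n>.txt (assumed naming)."""
    source = Path(filename)
    parts = split_text(source.read_text(encoding="utf-8"), marker)
    written = []
    for n, part in enumerate(parts, start=1):
        target = source.with_name(f"{source.stem}-{n}.txt")
        target.write_text(part, encoding="utf-8")
        written.append(target)
    return written
```

Both versions of a text should be split at corresponding places, so that part n of one file is compared with part n of the other.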

sankeyCompare

Visualize how the order of texts differs between two collections of texts (for example, two editions of a collection of poems or short stories) with a Sankey diagram, built in JavaScript/jQuery from a spreadsheet file.
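
The tool itself is written in JavaScript/jQuery and reads a spreadsheet. The Python sketch below only illustrates the kind of data such a diagram needs, one link per text joining its position in the first collection to its position in the second. The function name `order_links` and the input format (two lists of titles) are assumptions made for the example, not the tool's actual spreadsheet layout.

```python
def order_links(edition_a, edition_b):
    """One link per text found in both editions.

    Each link is (title, position in A, position in B), positions starting
    at 1. Texts present in only one edition produce no link.
    """
    position_b = {title: i for i, title in enumerate(edition_b, start=1)}
    return [
        (title, i, position_b[title])
        for i, title in enumerate(edition_a, start=1)
        if title in position_b
    ]
```

A text that keeps its position gives a horizontal link; crossing links show texts that were reordered between the two collections.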

Demo of the result at https://philippegambette.github.io/txtCompare/sankeyCompare

visuLexique

Visualisation of how the frequencies of words taken from two input word lists evolve along a text.
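
One simple way to obtain such a curve is to cut the text into consecutive windows and count, in each window, the words belonging to each list. The sketch below is an illustration under that assumption. It is not the code of visuLexique, and the function name, the tokenisation rule and the `window` parameter are all hypothetical.

```python
import re


def frequencies_along_text(text, list_a, list_b, window=1000):
    """Relative frequency of each word list in consecutive windows.

    Returns one (start token index, frequency of list A, frequency of list B)
    tuple per window of `window` tokens; the last window may be shorter.
    """
    tokens = re.findall(r"\w+", text.lower())
    set_a = {w.lower() for w in list_a}
    set_b = {w.lower() for w in list_b}
    rows = []
    for start in range(0, len(tokens), window):
        chunk = tokens[start:start + window]
        count_a = sum(t in set_a for t in chunk)
        count_b = sum(t in set_b for t in chunk)
        rows.append((start, count_a / len(chunk), count_b / len(chunk)))
    return rows
```

Plotting the two frequency columns against the start index shows where in the text each word list dominates.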

Tool available online, with a demo on the Memoirs of Marguerite de Valois, at https://philippegambette.github.io/txtCompare/visuLexique
