correctly's Introduction

CorrectLy - The English text corrector

CorrectLy is an NLP-based spelling and grammar correction tool that accepts whole documents as well as raw text and returns the corrected text. This automated proofreading tool can fix misspelled words, correct verb forms to match the sentence tense, fix preposition-noun agreement, and suggest a corrected sentence structure. CorrectLy is built in Python, is data-driven, and relies on core NLP techniques.

NLP

The project makes extensive use of the following Python NLP libraries:

  • spaCy (splits text into sentences, tokenizes sentences, generates POS tags and identifies determiners)
  • NLTK (helps with tokenizing and with visualizing the sentence structure tree, and ships a large collection of corpora)
  • language_check (spelling-correction library with extensive support for simple grammar suggestions and punctuation errors)
  • pattern (a CLiPS product that conjugates verbs - it forms the correct verb structure based on tense, person, number and mood)
  • sympound (another algorithm-based spelling-correction library, which also accepts a custom dictionary)
  • numpy (for minor mathematical calculations and in-memory storage of dictionaries)

Algorithms

Grammar correction algorithms are implemented with help from these libraries. There are algorithms for:

  • Spelling correction (what are yuo doign in hte collrge -> What are you doing in the college.)
  • V-V correction (He is play in the garden. -> He is playing in the garden.)
  • V-V-V correction (Harry has been watched movie since afternoon. -> Harry has been watching movie since afternoon.)
  • Preposition correction (The children are sitting on the room. -> The children are sitting in the room.)
  • Sentence structure correction (I am looking at boy. -> I am looking at the boy.)
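
To illustrate the spelling-correction step, here is a minimal sketch of edit-distance-based correction using only the Python standard library. It is not the project's implementation (which relies on language_check and sympound): the tiny vocabulary, its frequencies, and the function names `edits1`, `correct_word` and `correct_text` are all assumptions made for this example.

```python
import string

# Hypothetical toy vocabulary with made-up frequencies (not project data).
WORD_FREQ = {
    "what": 50, "are": 80, "you": 90, "doing": 30,
    "in": 100, "the": 120, "college": 10,
}


def edits1(word):
    """Return every string one edit (delete, transpose, replace, insert) away."""
    letters = string.ascii_lowercase
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [l + r[1:] for l, r in splits if r]
    transposes = [l + r[1] + r[0] + r[2:] for l, r in splits if len(r) > 1]
    replaces = [l + c + r[1:] for l, r in splits if r for c in letters]
    inserts = [l + c + r for l, r in splits for c in letters]
    return set(deletes + transposes + replaces + inserts)


def correct_word(word):
    """Pick the most frequent known word within one edit; else keep the word."""
    lower = word.lower()
    if lower in WORD_FREQ:
        return word
    candidates = [w for w in edits1(lower) if w in WORD_FREQ]
    if not candidates:
        return word
    return max(candidates, key=WORD_FREQ.get)


def correct_text(text):
    """Correct each whitespace-separated token independently."""
    return " ".join(correct_word(token) for token in text.split())


print(correct_text("what are yuo doign in hte collrge"))
```

A real corrector would use a full dictionary and also consider context, capitalization and punctuation, which this sketch ignores.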

Getting started

The project is built entirely in Python 3, with Flask as the backend framework. To install all the dependencies, clone the repository, navigate into it and run make install. To start the application, run make start or python3 app.py, then open localhost:5000 in a browser.

The application can be used as:

  1. Raw text entered in the text box.
  2. A DocX document uploaded and processed with all text formatting preserved. The spelling- and grammar-corrected document is returned in the DocX format.

The application outputs the corrected document / raw text with some statistics:

  • Number of errors of each type
  • Total number of errors (indicates the severity of the document)
  • A table showing each incorrect sentence structure alongside the corrected sentence structure
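
The statistics above could be gathered along the following lines. This is a minimal sketch rather than the project's code: the record layout (the `type`, `before` and `after` keys) and the function name `summarize` are assumptions made for illustration.

```python
from collections import Counter


def summarize(corrections):
    """Build per-type counts, a total, and a before/after table.

    `corrections` is a hypothetical list of dicts, one per fix, with the
    keys "type", "before" and "after".
    """
    per_type = Counter(c["type"] for c in corrections)
    return {
        "per_type": dict(per_type),
        "total": sum(per_type.values()),
        "table": [(c["before"], c["after"]) for c in corrections],
    }


# Example records taken from the sample sentences in this README.
sample = [
    {"type": "verb", "before": "He is play in the garden.",
     "after": "He is playing in the garden."},
    {"type": "preposition", "before": "The children are sitting on the room.",
     "after": "The children are sitting in the room."},
    {"type": "structure", "before": "I am looking at boy.",
     "after": "I am looking at the boy."},
]

report = summarize(sample)
print(report["per_type"], report["total"])
```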

To Do

  • Improve the dataset: preposition and sentence structure correction are data-driven, so the more data they have, the better they work.
  • Replace the algorithms with a neural network-integrated approach: the system's suggestions still aren't very natural, and it might break in extreme cases.

correctly's People

Contributors

rounakdatta, sauravcr7

correctly's Issues

Suggest to loosen the dependency on textblob

Hi, your project CorrectLy (commit id: b682254) requires "textblob==0.15.1" as a dependency. After analyzing the source code, we found that the following versions of textblob are also suitable: 0.9.0, 0.9.1, 0.10.0, 0.11.0, 0.11.1, 0.12.0, 0.13.0, 0.13.1, 0.14.0, 0.15.0, 0.15.2, 0.15.3, 0.16.0, 0.17.0 and 0.17.1. None of the functions that you use from the package, whether directly (1 API: textblob.blob.TextBlob.__init__) or indirectly (propagating to 5 internal textblob APIs and 0 external APIs), have changed in these versions, so your usage is not affected.

Therefore, we believe it is quite safe to loosen your dependency on textblob from "textblob==0.15.1" to "textblob>=0.9.0,<=0.17.1". This will improve the applicability of CorrectLy and reduce the chance of future dependency conflicts with other projects.

May I open a pull request to loosen the dependency on textblob?

By the way, could you please tell us whether such an automatic dependency-analysis tool would be helpful for making dependencies easier to maintain during your development?
