
redpen-cc / redpen


RedPen is an open source proofreading tool to check if your technical documents meet the writing standard. RedPen supports various markup text formats (Markdown, Textile, AsciiDoc, Re:VIEW, reStructuredText and LaTeX).

Home Page: https://redpen.cc

License: Apache License 2.0

Shell 0.29% Java 93.80% CSS 0.61% HTML 1.25% JavaScript 3.68% Batchfile 0.17% TeX 0.12% C++ 0.08%
asciidoc latex linter markdown proofreading restructuredtext wiki

redpen's People

Contributors

akinomurasame, alterakey, angryziber, cmoen, duck8823, gautela, gerryhocks, gitter-badger, hirokiky, johtani, karronoli, kenhys, kubosho, mattn, midnightsuyama, mocobeta, nacyot, norm-ideal, ocadaruma, paclearner, raccoonyy, sankichi92, taher-ghaleb, taisa831, takahi-i, tiqwab, tokorom, toshihiko-yamazaki, tsuyoshicho, yusuke


redpen's Issues

Handle special usages of period

Periods are used not only at the end of a sentence but also in other ways. The following are special usages.

  • digits
    • 123.4893
  • special words
    • Mr. Mrs.

The parsers should handle these cases instead of treating every period as the end of a sentence.
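
As a rough illustration of the cases listed above, the following sketch skips periods that sit between two digits or that follow a known abbreviation. The class name, method names, and the abbreviation list are hypothetical and are not taken from the DocumentValidator code base.

```java
import java.util.Set;

final class PeriodSketch {
    // Hypothetical abbreviation list; a real one would likely be configurable.
    private static final Set<String> ABBREVIATIONS = Set.of("Mr", "Mrs");

    /** Returns the index of the first period that ends a sentence, or -1 if none is found. */
    static int findSentenceEnd(String text) {
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) != '.') {
                continue;
            }
            if (isDecimalPoint(text, i) || followsAbbreviation(text, i)) {
                continue; // special usage, not the end of a sentence
            }
            return i;
        }
        return -1;
    }

    // A period with a digit on both sides is treated as a decimal point (e.g. 123.4893).
    private static boolean isDecimalPoint(String text, int i) {
        return i > 0 && i + 1 < text.length()
                && Character.isDigit(text.charAt(i - 1))
                && Character.isDigit(text.charAt(i + 1));
    }

    // A period directly after a listed word (e.g. "Mr") is treated as part of the abbreviation.
    private static boolean followsAbbreviation(String text, int i) {
        int start = i;
        while (start > 0 && Character.isLetter(text.charAt(start - 1))) {
            start--;
        }
        return ABBREVIATIONS.contains(text.substring(start, i));
    }
}
```

With this sketch, the periods in "Mr." and "123.4893" are skipped, and the first reported position is the period that closes the sentence.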

Configuration: End of sentence characters

DV currently supports the period, question mark, and exclamation mark as end-of-sentence characters. Some users may want to add other characters, such as the colon, as end-of-sentence characters.

This issue adds a configuration option for the end-of-sentence characters for such users.
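
One possible shape for such a setting is a splitter that receives the end-of-sentence characters as a parameter. This is only a sketch: the class name and constructor are assumptions, and how the value would be read from a configuration file is left out.

```java
import java.util.ArrayList;
import java.util.List;

final class ConfigurableSplitterSketch {
    // Characters that close a sentence, e.g. ".?!" or ".?!:" (hypothetical setting).
    private final String endCharacters;

    ConfigurableSplitterSketch(String endCharacters) {
        this.endCharacters = endCharacters;
    }

    /** Splits text after every configured end-of-sentence character. */
    List<String> split(String text) {
        List<String> sentences = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < text.length(); i++) {
            if (endCharacters.indexOf(text.charAt(i)) >= 0) {
                String sentence = text.substring(start, i + 1).trim();
                if (!sentence.isEmpty()) {
                    sentences.add(sentence);
                }
                start = i + 1;
            }
        }
        // Keep trailing text that has no closing character.
        String rest = text.substring(start).trim();
        if (!rest.isEmpty()) {
            sentences.add(rest);
        }
        return sentences;
    }
}
```

Constructed with ".?!", the splitter keeps "Note: this works." as one sentence; constructed with ".?!:", the same input is split after the colon as well.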

Check if Markdown underlined headers can be used

Markdown has two styles of headers ("underlined" and "atx"). Currently DV works with atx-style headers, and that style is tested.

We should check whether the library used in the MarkdownParser supports the underlined syntax. If it does, tests for that style should be added.

Provide Japanese character setting file

Configuration for Japanese can differ considerably from English configuration. Although the default settings of DocumentValidator should be for English, configuring it for Japanese should not be difficult.

To make configuration for Japanese easy, I will provide sample setting files for Japanese documents.

Exclude Main class

In order to use DocumentValidator as a library, the Main class should be excluded. The Main class would become a submodule of the main project.

Katakana SpellCheck Validator

Check Japanese Katakana words using edit distance to detect spelling variations such as インデクス and インデックス.
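
The edit distance mentioned here can be computed with the standard Levenshtein algorithm. The sketch below is a generic implementation, not the validator's actual code; the class name, method names, and the maxDistance threshold are assumptions.

```java
final class KatakanaDistanceSketch {
    /** Levenshtein distance: minimum number of single-character insertions, deletions, or substitutions. */
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Two different words within maxDistance edits are reported as possible spelling variants. */
    static boolean isLikelyVariant(String a, String b, int maxDistance) {
        return !a.equals(b) && editDistance(a, b) <= maxDistance;
    }
}
```

For インデクス and インデックス the distance is 1 (one inserted ッ), so a threshold of 1 would flag the pair as variants.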

Central configuration file

Currently DocumentValidator has two configuration files (validator.conf and character-table.conf), and there should also be configuration for the parsers. However, too many configuration files reduce usability.

I will add a central configuration file for DocumentValidator that contains the sub-configurations such as validator.conf and character-table.conf.

Enhancement of Wiki Parser

I will support the following common elements in this issue.

  • Inline formatting (Issue #41)
    • Bold
    • Italics
    • Italic and bold
    • Underlined text
    • Strikethrough text
  • Comments (Issue #39)
  • Lists (Issue #35)
    • Bullet lists
    • Numbered lists
  • Links (Issue #33)
    • External links
    • Internal links
    • Images

Add return code

DocumentValidator currently always returns 0, even when validation errors are found. As a result, we cannot tell whether there were errors without inspecting the output file or the terminal.

If it returns an error code (1) when errors are found, shell scripts and wrapper tools can see the result.
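
A minimal sketch of the idea follows. It assumes the validation errors are available as a list when the command-line entry point finishes; the class and method names are hypothetical.

```java
import java.util.List;

final class ExitCodeSketch {
    /** Returns 0 when no validation errors were found and 1 otherwise. */
    static int exitCodeFor(List<String> validationErrors) {
        return validationErrors.isEmpty() ? 0 : 1;
    }

    // A command-line entry point could end with:
    //     System.exit(ExitCodeSketch.exitCodeFor(errors));
    // so that a calling shell script can branch on "$?".
}
```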

Improve handling of links in Wikiparser

Wikiparser extracts the links in a document so that a validator can check them.

Currently, when there is a link block such as [label | url], Wikiparser extracts the label of the link. However, the label cannot be used to validate the link URL. Therefore the parser should extract the URL of the link, not the label.
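
The following sketch shows one way to pull the URL part out of [label | url] blocks with a regular expression. It is an illustration only: the class name is hypothetical, and it handles only the two-part form quoted above, not other wiki link syntaxes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class WikiLinkSketch {
    // Group 1 is the label, group 2 is the URL of a "[label | url]" block.
    private static final Pattern LINK = Pattern.compile("\\[([^\\]|]*)\\|([^\\]]*)\\]");

    /** Returns the URL part of every [label | url] block in the line. */
    static List<String> extractUrls(String line) {
        List<String> urls = new ArrayList<>();
        Matcher matcher = LINK.matcher(line);
        while (matcher.find()) {
            urls.add(matcher.group(2).trim());
        }
        return urls;
    }
}
```

For the line "See [RedPen | https://redpen.cc] for details." the sketch returns the URL rather than the label "RedPen".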

Support question mark and exclamation mark as the full stop

The StringUtils.getSentenceEndPosition method should detect the exclamation mark and the question mark as the end of a sentence. When these characters are not detected as sentence end positions, the parsers do not split the sentences.

For example, given the following string, the parsers do not detect two sentences.

Is this a apple? Yes it is.

SpaceBeginningOfSentenceValidator always reports error for headings

SpaceBeginningOfSentenceValidator always reports an error when the document contains headings. For example, after adding SpaceBeginningOfSentenceValidator to the validation-config.xml file and processing an input document containing the following line, SpaceBeginningOfSentenceValidator reports the error message "Space not exist the beginning of sentence".

# this is a heading

However, heading content does not need to start with a space, so this behavior is not expected.

SentenceIterator should check sentences in special tags

Currently SentenceIterator does not check sentences in headers or list elements. I will make SentenceIterator check the sentences contained in such tags.

Before fixing this issue, Header elements and ListElements should be extended to have a Sentence element.

Link and URL Validator

Markdown and Textile formats allow internal links and external URLs in the content. Sometimes links are down and users cannot go to the pages the links point to. This validator checks whether the internal and external links are alive.

Add Sentence object into ValidationError

Currently ValidationError does not carry Sentence information except in the error messages created by the Validators. To make the errors more readable, I will split the error message into an error type and sentence information.

Specifically, I will add a "sentence" member to ValidationError, and the error messages in the validators will be simplified by removing the sentence information.

Handle Tables in MarkdownParser

The current MarkdownParser treats tables in a Markdown document as sentences. As a result, the registered validators process the tables as input sentences and report errors. This behavior is not expected, since a table is not a sentence (although each element could contain sentences).

To suppress these warnings, the parser needs to be changed so that tables are not added as sentences.

Use StringBuffer in parsers

Parsers in DocumentValidator build text with String operations, which allocate many intermediate substrings. This code is inefficient. The parsers could be made more efficient by using StringBuffer.
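
As an illustration of the pattern, the sketch below joins the lines of a paragraph into one string with a single growing buffer instead of repeated String concatenation. It uses StringBuilder, the unsynchronized counterpart of StringBuffer with the same append API; the class and method are hypothetical and do not come from the parsers.

```java
final class JoinLinesSketch {
    /** Joins trimmed lines with single spaces using one growing buffer. */
    static String joinLines(String[] lines) {
        StringBuilder buffer = new StringBuilder();
        for (String line : lines) {
            if (buffer.length() > 0) {
                buffer.append(' ');
            }
            buffer.append(line.trim());
        }
        return buffer.toString();
    }
}
```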

Define default symbol table

DocumentValidator should have a default symbol (character) table. If there are default values for common symbols (such as comma and colon), configuration becomes simpler, since users need to override a default value only when a modification is needed.

Add Validator name into ValidationError

The current output of ValidationError does not contain the name of the Validator that reported it, so users do not know which validator a ValidationError came from.

If the validator name is added to ValidationError, users can tell which validator produced each error.

Handle quotation mark

If the left quotation mark is registered with the same character as the right quotation mark, the two cannot be distinguished.

For example, with the following settings, DocumentValidator assumes that the left quotation mark '"' needs a space not only before it but also after it.

<character name="LEFT_DOUBLE_QUOTATION_MARK" value="\'"  before-space="true" />
<character name="RIGHT_DOUBLE_QUOTATION_MARK" value="\'" after-space="true" />  

Katakana end hyphen validator for Japanese text

Validate the end hyphens of Katakana words in Japanese documents. Japanese Katakana words vary in whether they end with a hyphen. For example, "computer" is written in Katakana as either "コンピュータ" (without hyphen) or "コンピューター" (with hyphen). This validator checks whether the ending of each Katakana word matches the predefined standard.
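
A minimal sketch of such a check follows. It assumes, purely for illustration, that the predefined standard is "no end hyphen"; the actual standard is not stated in this issue, and the class and method names are hypothetical.

```java
final class KatakanaEndHyphenSketch {
    // Katakana-Hiragana prolonged sound mark, the "hyphen" at the end of コンピューター.
    private static final char PROLONGED_SOUND_MARK = '\u30FC';

    /** Returns true if the word ends with the prolonged sound mark. */
    static boolean endsWithHyphen(String word) {
        return !word.isEmpty() && word.charAt(word.length() - 1) == PROLONGED_SOUND_MARK;
    }

    /** Assumed standard for this sketch: Katakana words must not end with a hyphen. */
    static boolean matchesStandard(String word) {
        return !endsWithHyphen(word);
    }

    /** Returns the word with a single trailing hyphen removed, as a suggested spelling. */
    static String withoutEndHyphen(String word) {
        return endsWithHyphen(word) ? word.substring(0, word.length() - 1) : word;
    }
}
```

Under this assumed standard, コンピュータ passes and コンピューター is flagged, with コンピュータ as the suggested form.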

Enrich Javadoc

  • Enrich description in javadoc
  • Follow standard Javadoc style (e.g. start with capital character)

Support language dependent default settings

DocumentValidator has global default settings in the DefaultSymbol class, and users override the default values if needed. These settings work well for European languages but differ considerably from what Japanese or Chinese documents need.

I will add language-dependent settings, especially for Asian languages.
