
arabic-tags's Introduction

Project Description

We're working on two projects: NahwApp and ArabicTagging (this one). NahwApp quizzes students on I'rab, and to do so it needs an Arabic corpus tagged with grammatical explanations as well as technical information. This project helps build that corpus by providing the tools to tag Arabic text.

The tool has two phases, which must be completed in order:

  1. Manuscript
  2. Editing

Manuscript phase

The manuscript phase is where we insert the actual Arabic. The tool ensures that no illegal characters make it into the corpus, and it also prevents double spaces (as well as other forms of whitespace) and short vowels. While this phase should mostly consist of copying and pasting text with minor tweaks, the tool also lets you write your own text easily within the website.
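The cleanup described for the manuscript phase could be sketched roughly as follows. This is only an illustration, not the project's actual code: the function names, the diacritic range treated as "short vowels", and the set of characters treated as legal are all assumptions made for the example.

```python
import re

# Assumption: "short vowels" means the Arabic diacritics U+064B (fathatan)
# through U+0652 (sukun), which includes shadda.
SHORT_VOWELS = re.compile(r"[\u064B-\u0652]")

# Any run of whitespace (spaces, tabs, newlines) collapses to one space.
WHITESPACE = re.compile(r"\s+")

# Assumption: legal characters are Arabic letters U+0621-U+064A, a plain
# space, and a handful of Latin and Arabic punctuation marks.
LEGAL = re.compile(r"[\u0621-\u064A .,:!\u060C\u061B\u061F]")


def clean_manuscript(text: str) -> str:
    """Strip short vowels and normalize whitespace (hypothetical helper)."""
    text = SHORT_VOWELS.sub("", text)
    return WHITESPACE.sub(" ", text).strip()


def illegal_characters(text: str) -> list[str]:
    """Return every character that falls outside the assumed legal set."""
    return [ch for ch in text if not LEGAL.fullmatch(ch)]
```

A caller might run `clean_manuscript` on pasted text first and then reject the input if `illegal_characters` returns anything.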

Editing phase

The text from the Manuscript phase is given to an AI to vowelize and tokenize (credits to CAMeL Lab). After that, the user can correct the short vowels and insert grammatical data for each word. Once the text is ready, it can be exported as JSON and used in any project. At the moment, however, only NahwApp can read the data.

arabic-tags's People

Contributors

amrojjeh

arabic-tags's Issues

Text wrapping

The div and the textarea wrap differently when LTR text is at the end. This is low priority, as it shouldn't be a huge issue, but it is worth noting.

Improve editing performance

While the parsing algorithm is fast, the rendering stage is very slow, sometimes taking 60 milliseconds when there are a lot of spans to create. This is low priority, since this stage is not really meant to be used as an editing ground or for long text, but it would still be very nice to improve.

Change routing to /excerpt/plain

The other two routes would be /excerpt/technical and /excerpt/grammatical. /excerpt/ would just redirect to /excerpt/plain if there's an id; otherwise it would redirect to /.
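The redirect rule for /excerpt/ can be sketched as a small pure function. This is a hypothetical illustration: the function name is invented here, and passing the id as an `id` query parameter is an assumption, since the issue does not say how the id is carried.

```python
from typing import Optional
from urllib.parse import urlencode


def resolve_excerpt_redirect(excerpt_id: Optional[str]) -> str:
    """Pick the redirect target for a request to /excerpt/ (hypothetical)."""
    if excerpt_id:
        # Assumption: the id travels as a query parameter named "id".
        return "/excerpt/plain?" + urlencode({"id": excerpt_id})
    # No id (missing or empty): fall back to the home page.
    return "/"
```

Keeping the decision in one function like this would make the rule easy to test separately from whatever web framework serves the routes.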
