Comments (2)
Hello, we currently do not support this, but it can be accomplished with something like the function below:
import re

# Matches any character followed by two or more copies of itself (3+ in a row).
_REP_AR_RE = re.compile(r'(.)\1{2,}')


def remove_repetitions_ar(s, policy=1):
    """Reduces the repeated characters (more than two repeated)
    from an Arabic string to one or two characters based on the
    optional specified policy.

    Args:
        s (:obj:`str`): The string to be normalized.
        policy (:obj:`int`, optional):
            The reduction policy. If policy=`1` the repeated characters will
            be reduced to `1` character. If policy=`2` the repeated characters
            will be reduced to `2` characters. Defaults to `1`.

    Returns:
        :obj:`str`: The normalized string.
    """
    if policy == 1:
        # Keep a single copy of the repeated character.
        return _REP_AR_RE.sub(r'\1', s)
    elif policy == 2:
        # Keep two copies of the repeated character.
        return _REP_AR_RE.sub(r'\1\1', s)
    else:
        raise ValueError("Policy value should be either 1 or 2!")

>>> remove_repetitions_ar('مرحباااا')
'مرحبا'
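Note that the `.` in the pattern above matches any character except a newline, so runs of digits or punctuation are collapsed as well (for example, `'1000'` becomes `'10'`). If that is not wanted, a variant along the following lines may help. This is only a sketch: the function name `remove_repetitions_ar_only` is hypothetical and not part of CAMeL Tools, and it assumes that "Arabic letter" means the range U+0621 to U+064A (which includes the tatweel, U+0640).

```python
import re

# Assumption: Arabic letters are taken to be U+0621..U+064A (includes tatweel U+0640).
# Only runs of three or more identical characters from that range are matched.
_AR_LETTER_REP_RE = re.compile(r'([\u0621-\u064A])\1{2,}')


def remove_repetitions_ar_only(s, policy=1):
    """Hypothetical variant: reduce runs of 3+ identical Arabic letters
    to `policy` copies (1 or 2), leaving all other characters untouched."""
    if policy not in (1, 2):
        raise ValueError("Policy value should be either 1 or 2!")
    # Replace each matched run with `policy` copies of the repeated letter.
    return _AR_LETTER_REP_RE.sub(lambda m: m.group(1) * policy, s)


word = 'مرحب' + 'ا' * 4  # the example word from above, with four trailing alefs
print(remove_repetitions_ar_only(word))            # one alef kept
print(remove_repetitions_ar_only(word, policy=2))  # two alefs kept
print(remove_repetitions_ar_only('1000 !!!'))      # digits and punctuation unchanged
```

Whether digits, punctuation, or diacritics should also be normalized depends on the data, so the character range may need adjusting.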
Hope this is helpful.
from camel_tools.
Yes, it helps a lot.
I asked because I saw camel_tools.morphology.errors.MorphologyError in the docs,
so I thought that module might be for errors like repeated characters.
Unfortunately, the docs don't have enough examples.
So, is there any module in CAMeL Tools that checks for grammatical or orthographic errors and corrects them?