areebbeigh / profanityfilter

A universal Python library for detecting and filtering profanity

Home Page: https://pypi.python.org/pypi/profanityfilter

License: BSD 3-Clause "New" or "Revised" License

Python 98.28% Batchfile 0.91% Shell 0.81%

profanityfilter universal-python-library profanity-detection profanity

profanityfilter's Introduction

profanityfilter

A universal Python library for detecting and/or filtering profane words.

PyPI: https://pypi.python.org/pypi/profanityfilter
Doc: https://areebbeigh.github.io/profanityfilter/

Installation

> pip install profanityfilter

Usage

from profanityfilter import ProfanityFilter

pf = ProfanityFilter()

pf.censor("That's bullshit!")
> "That's ********!"
pf.set_censor("@")
pf.censor("That's bullshit!")
> "That's @@@@@@@@!"
pf.define_words(["icecream", "choco"])
pf.censor("I love icecream and choco!")
> "I love ******** and *****!"
pf.is_clean("That's awesome!")
> True
pf.is_clean("That's bullshit!")
> False
pf.is_profane("Profane shit is not good")
> True

pf_custom = ProfanityFilter(custom_censor_list=["chocolate", "orange"])
pf_custom.censor("Fuck orange chocolates")
> "Fuck ****** **********"

pf_extended = ProfanityFilter(extra_censor_list=["chocolate", "orange"])
pf_extended.censor("Fuck orange chocolates")
> "**** ****** **********"

Console Executable

profanityfilter -h
> usage: profanityfilter-script.py [-h] [-t TEXT | -f PATH] [-o OUTPUT_FILE]
>                                  [--show]
>
> Profanity filter console utility
>
> optional arguments:
>   -h, --help            show this help message and exit
>   -t TEXT, --text TEXT  Test the given text for profanity
>   -f PATH, --file PATH  Test the given file for profanity
>   -o OUTPUT_FILE, --output OUTPUT_FILE
>                         Write the censored output to a file
>   --show                Print the censored text

Contributing

Fork
Add changes
Add unit tests
Make a pull request :)

I encourage you to fork this repo and expand it in any way you like. Pull requests are welcome!

Additional Info

Developer: Areeb Beigh [email protected]
GitHub Repo: https://github.com/areebbeigh/profanityfilter/

profanityfilter's Issues

ModuleNotFoundError: No module named 'hunspell_serializable'

I got this error when running the code.

Traceback (most recent call last):
  File "/home/nitin/PycharmProjects/filter/test.py", line 1, in <module>
    from profanity_filter import ProfanityFilter
  File "/home/nitin/anaconda3/envs/filter/lib/python3.7/site-packages/profanity_filter/__init__.py", line 1, in <module>
    from profanity_filter.console import main
  File "/home/nitin/anaconda3/envs/filter/lib/python3.7/site-packages/profanity_filter/console.py", line 4, in <module>
    from profanity_filter.profanity_filter import ProfanityFilter
  File "/home/nitin/anaconda3/envs/filter/lib/python3.7/site-packages/profanity_filter/profanity_filter.py", line 22, in <module>
    from profanity_filter import spacy_utlis
  File "/home/nitin/anaconda3/envs/filter/lib/python3.7/site-packages/profanity_filter/spacy_utlis.py", line 6, in <module>
    from profanity_filter.spacy_component import SpacyProfanityFilterComponent
  File "/home/nitin/anaconda3/envs/filter/lib/python3.7/site-packages/profanity_filter/spacy_component.py", line 9, in <module>
    from profanity_filter.types_ import Language
  File "/home/nitin/anaconda3/envs/filter/lib/python3.7/site-packages/profanity_filter/types_.py", line 7, in <module>
    from hunspell_serializable import HunSpell
ModuleNotFoundError: No module named 'hunspell_serializable'

no output

This seems to be yet another absolutely useless bundle of code.
For the life of me, I cannot get a single bit of output, not even errors.

Wrong boolean value for .is_clean/.is_profane

Hi!
I installed the module with "!pip install profanityfilter" and ran the examples for .is_clean() and .is_profane():

pf.is_clean("That's awesome!")
pf.is_clean("That's bullshit!")
pf.is_profane("Profane shit is not good")

The results didn't match the ones you listed (see attached screenshot).

Uncensoring a Word

Is there any way I can uncensor the words fuck or shit?

Provide a way to know which words were censored

I would like to be able to collect the words that ProfanityFilter.censor censors in my logs, so I can decide whether or not to whitelist them as they come up.
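
Until the library exposes this, one way to get the list is to scan the text separately before censoring it. The following is only a sketch: find_flagged_words and the word list are hypothetical names made up for illustration, not part of profanityfilter, and the tokenization is deliberately simple.

```python
import re


def find_flagged_words(text, bad_words):
    """Return the tokens in `text` that match an entry in `bad_words`.

    Hypothetical helper, not part of profanityfilter. Matching is
    case-insensitive and works on whole tokens only.
    """
    bad = {w.lower() for w in bad_words}
    # A token is a run of letters, digits, or apostrophes.
    tokens = re.findall(r"[A-Za-z0-9']+", text)
    return [t for t in tokens if t.lower() in bad]


# Assumed word list, for illustration only.
flagged = find_flagged_words("I love icecream and Choco!", ["icecream", "choco"])
print(flagged)  # ['icecream', 'Choco']
```

The returned tokens keep their original casing, so they can be written to a log as they appeared in the input.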

Rework

Hey @areebbeigh I was intending on just adding pluralization support, but I ended up reworking a lot of the logic. I won't make a PR until we have a chance to chat (and until I get it ironed out).

I converted the plugin to a class structure, so ideally you could do:

pf = ProfanityFilter()
pf.append_words(['more', 'profane', 'words'])

print(pf.censor('some of these words are fucking profane'))

> some of these ***** are ******* *******

Take a look at the changes I made in this commit (I didn't do it atomically because I wasn't expecting to do so much! I got caught up in it.): brandonsturgeon@8e34e92

Hi, I'd like to use this to check whether usernames include profane words. However, a username is a single word, and is_profane doesn't detect profane words that are joined to other words. Can you add support for this? For example, if the entered username is f***youusername, it should be flagged as profane.
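
A possible workaround outside the library is a plain substring check. This is a minimal sketch, assuming a hypothetical helper contains_profanity and a made-up word list; it does not use or change is_profane.

```python
def contains_profanity(username, bad_words):
    """Return True if any entry of `bad_words` occurs anywhere in `username`.

    Hypothetical helper, not part of profanityfilter. The comparison is
    case-insensitive and ignores word boundaries entirely.
    """
    lowered = username.lower()
    return any(word.lower() in lowered for word in bad_words)


# Assumed word list, for illustration only.
words = ["icecream", "choco"]
print(contains_profanity("IcecreamLover42", words))  # True
print(contains_profanity("vanilla_fan", words))      # False
```

Substring matching trades precision for recall: it also flags harmless names that happen to contain a listed word, so it is probably best kept as a separate, opt-in check.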

Regex characters in "bad words" are not escaped

"Bad words" containing characters significant in regular expressions are not escaped, which results in them not being detected correctly. For example, 13i+ch is actually looking for strings containing "13ich", "13iich", "13iiiiiiich", etc.

The logic to add word boundaries also does not take into consideration the possibility of having non-"word" characters at the start or end of the string. For example, @$$, even if it were escaped, has no word boundaries at all, and so \b@$$\b will only match if it is entirely contained within another word.

We have fixed this in qld-gov-au@96324f8
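
For illustration, both problems can be handled with re.escape and lookaround assertions in place of \b. This is a sketch of the general technique, not necessarily what the commit above does; compile_bad_word is a hypothetical name.

```python
import re


def compile_bad_word(word):
    """Build a pattern that matches `word` literally as a standalone token.

    Hypothetical helper, not part of profanityfilter.
    """
    # re.escape turns regex metacharacters such as + and $ into literals.
    escaped = re.escape(word)
    # (?<!\w) and (?!\w) only require that no word character is adjacent,
    # so they also work when the word starts or ends with a symbol.
    return re.compile(r"(?<!\w)" + escaped + r"(?!\w)", re.IGNORECASE)


plus_word = compile_bad_word("13i+ch")
print(bool(plus_word.search("what a 13i+ch")))   # True
print(bool(plus_word.search("what a 13iiich")))  # False

symbols = compile_bad_word("@$$")
print(bool(symbols.search("you @$$!")))          # True
print(bool(symbols.search("a@$$b")))             # False
```

With \b the last two results would be reversed, because \b needs a word character on exactly one side of the boundary.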

Remove a word?

Is there any way to remove a word from the list of bad words? For example, in my application that generates memes, I don't want to block "damn" because it's a key word in several memes.
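
One workaround that relies only on the custom_censor_list argument shown in the README is to build your own list with the unwanted entries filtered out. This is a sketch: without_allowed and the base list are hypothetical, and you would have to supply the real base list yourself.

```python
def without_allowed(base_words, allowed):
    """Return `base_words` minus any entry listed in `allowed`.

    Hypothetical helper, not part of profanityfilter. The comparison is
    case-insensitive and the original order is preserved.
    """
    allowed_lower = {w.lower() for w in allowed}
    return [w for w in base_words if w.lower() not in allowed_lower]


# Assumed base list, for illustration only.
base = ["damn", "icecream", "choco"]
custom = without_allowed(base, ["damn"])
print(custom)  # ['icecream', 'choco']

# The result could then be passed to the constructor, as in the README:
#   pf = ProfanityFilter(custom_censor_list=custom)
```

Because custom_censor_list replaces the default list in the README example, only the words that remain in the filtered list would be censored.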

Missing dependency on package "inflection"

Traceback (most recent call last):
  File "/usr/lib/python3.5/runpy.py", line 184, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.5/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/al/Source/pimodisco/pimodisco/__main__.py", line 6, in <module>
    from pimodisco.filter import filter
  File "/home/al/Source/pimodisco/pimodisco/filter.py", line 4, in <module>
    from profanityfilter import ProfanityFilter
  File "/home/al/.local/lib/python3.5/site-packages/profanityfilter/__init__.py", line 1, in <module>
    from .profanityfilter import *
  File "/home/al/.local/lib/python3.5/site-packages/profanityfilter/profanityfilter.py", line 4, in <module>
    import inflection