rrenaud / gibberish-detector

597.0 597.0 153.0 2.88 MB

A small program to detect gibberish using a Markov Chain

License: MIT License

Python 100.00%

gibberish-detector's People

Contributors

Stargazers

Watchers

Forkers

gwho bpgergo kentwelcome jesse-osiecki vipulkohli zhao9jack craftsliu abegining ljdawn darkpicnic askebos tplink32 rtkrruvinskiy rtkwlf ulysses425 jinglingshu anonymoustian s lopuhin yu-maaa bryanoltman zhangyangisme utopiahxm glennbull cgerson vathsanvk buptdjd simplgy pirate lucienh lomascolo jackieleewelas tuan1101 ebubekirbbr ericxsun pjelement jasonshaw01 pavel-slepenkov chichoo subramanyata caohy1988 python3pkg songofhack pjanata ksmaheshkumar vandonova zhangyingwei-resources kuldeepmarganache genjiluo hyabcd efeducationfirstmobile avaudioplayer ghowlett windream swuecho qiun upman prashanttz alexlongguo ashimiblessing richardgiddings boxu-zhang zhangjunqiang lftuo gregkaleka sanjingshou11 yuhuichina blackskies xiaomale papiot oowoodone fud ricalanis primekun gaojie-wang yocoa kaanuki bhumijgupta muellpanda soco-ai rbahumi a-new yangchen1995 grepler liukai2008 dingsiwei amitness pombredanne guihui oliverhuangchao fineguy jamiecollinson wqj111186 antichown yb7 niharika161 5l1v3r1 shauryauppal-1mg redp4th dpfergusonak

gibberish-detector's Issues

big.txt

Thanks for this code.
I'm curious, where does big.txt come from?
Is there a specific set of datasets you combined to get it?

Python -> JavaScript port

Hello 👋
I want to say thanks for the job you did!
I created a line-by-line port of Gibberish-Detector in JavaScript, and it is also published on npm. I would appreciate a review or any suggestions. Additionally, I'd like to ask whether it would be possible and appropriate to link from your README to my implementation.

Incompatible with Python3

Some of the functions used, such as xrange, are incompatible with Python 3.
Perhaps add Python 3 compatible files in a separate folder in the repo.
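
As an alternative to a separate folder, a small shim can keep one file running on both versions. This is only a sketch: the `ngram` helper is a hypothetical stand-in for any loop that uses `xrange`, not a copy of the project's code.

```python
# Hypothetical compatibility shim: Python 3 has no xrange, but its
# built-in range is already lazy, so it can stand in directly.
try:
    xrange  # exists on Python 2
except NameError:
    xrange = range  # Python 3 fallback


def ngram(n, text):
    """Yield every n-character window of text (illustrative helper)."""
    for start in xrange(0, len(text) - n + 1):
        yield text[start:start + n]


pairs = list(ngram(2, "abcd"))
print(pairs)
```

With the shim in place, the same loop body runs unchanged on either interpreter.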

Can I make this model adaptable to the words I want by adding text to big.txt?

It reports the word "ok" as gibberish. Can anyone please help me with this?
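
To illustrate why extra training text could help, here is a toy character-bigram Markov model. Everything in it is an assumption made for the example (the alphabet, the smoothing prior of 10, the tiny corpora, the function names); it is not the project's actual training code. It only shows that adding lines containing "ok" raises the score of the "o" to "k" transition.

```python
import math

ACCEPTED = "abcdefghijklmnopqrstuvwxyz "  # assumed alphabet for the sketch


def normalize(line):
    """Lowercase the line and keep only characters in the alphabet."""
    return [c for c in line.lower() if c in ACCEPTED]


def train(lines, prior=10):
    """Count character transitions, then convert each row to log-probs."""
    counts = {a: {b: prior for b in ACCEPTED} for a in ACCEPTED}
    for line in lines:
        chars = normalize(line)
        for a, b in zip(chars, chars[1:]):
            counts[a][b] += 1
    model = {}
    for a, row in counts.items():
        total = float(sum(row.values()))
        model[a] = {b: math.log(c / total) for b, c in row.items()}
    return model


def avg_transition_prob(text, model):
    """Geometric mean of the transition probabilities along text."""
    chars = normalize(text)
    pairs = list(zip(chars, chars[1:]))
    if not pairs:
        return 0.0
    log_sum = sum(model[a][b] for a, b in pairs)
    return math.exp(log_sum / len(pairs))


# Made-up corpora: the base text never contains "ok", the extra text does.
base_corpus = ["the quick brown fox jumps over the lazy dog"] * 50
extra_lines = ["ok that is ok with me", "ok sure"] * 50

before = avg_transition_prob("ok", train(base_corpus))
after = avg_transition_prob("ok", train(base_corpus + extra_lines))
print(before, after)
```

A two-letter input has a single transition, so its score depends entirely on how often that one pair appears in the training text. That may be why very short words are easy to misclassify.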

Repetition Detection needed--Vowel/Consonant Algorithm good

I would greatly appreciate it if you could devise some code for the case of "the the the the the the the the the a a a a a". I will send you a small donation if you can effectively solve this problem.
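
One possible approach, sketched here as an assumption rather than anything the project provides, is to measure how much of the input consists of repeated words. Each repeated word is made of perfectly normal letter transitions, so a character-level model alone may not catch it. The function names and the `max_ratio` and `min_words` defaults are hypothetical and would need tuning.

```python
def repetition_ratio(text):
    """Fraction of words that repeat an earlier word (0.0 = all unique)."""
    words = text.lower().split()
    if not words:
        return 0.0
    return 1.0 - len(set(words)) / float(len(words))


def looks_repetitive(text, max_ratio=0.5, min_words=4):
    """Flag text whose words are mostly repeats; skip very short inputs."""
    words = text.split()
    return len(words) >= min_words and repetition_ratio(text) > max_ratio


sample = "the the the the the the the the the a a a a a"
normal = "the quick brown fox jumps over the lazy dog"
print(looks_repetitive(sample), looks_repetitive(normal))
```

Such a check could run alongside the Markov score, with the input rejected if either test fails.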

Licence missing

Hi!

Thanks!

Threshold selection

Threshold selection only uses two very small datasets. Is this method too simple? Is there a more appropriate way to select the threshold?
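
For discussion, here are two simple ways a threshold could be derived from scored "good" and "bad" samples. Both are sketches under assumptions: the scores are invented, the function names are hypothetical, and neither is claimed to be what the project does. The first takes the midpoint between the lowest good score and the highest bad score. The second sweeps the candidate values and keeps the one with the best accuracy, which copes better when the two sets overlap.

```python
def midpoint_threshold(good_scores, bad_scores):
    """Halfway between the worst good score and the best bad score."""
    return (min(good_scores) + max(bad_scores)) / 2.0


def best_accuracy_threshold(good_scores, bad_scores):
    """Try each observed score as a cutoff; keep the most accurate one."""
    candidates = sorted(set(good_scores) | set(bad_scores))
    total = float(len(good_scores) + len(bad_scores))
    best_t, best_acc = None, -1.0
    for t in candidates:
        # Scores at or above the cutoff are treated as "good".
        correct = sum(s >= t for s in good_scores)
        correct += sum(s < t for s in bad_scores)
        acc = correct / total
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t, best_acc


# Invented scores; note that one bad sample outscores one good sample.
good = [0.050, 0.046, 0.061, 0.039]
bad = [0.012, 0.020, 0.041, 0.009]

mid = midpoint_threshold(good, bad)
best_t, best_acc = best_accuracy_threshold(good, bad)
print(mid, best_t, best_acc)
```

With larger labelled sets, the sweep could be run on a held-out split so the chosen cutoff is not fitted to the same few examples it is evaluated on.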

Video link needs an update

Peter's talk can be found at https://youtu.be/yvDCzhbjYWs, the current link doesn't work for me.

Output Gibberish part of Input

Hello, and thank you for writing this fantastic program!

Do you think there could be a way for this program to output parts of the input that are gibberish, while leaving out the parts of the strings that "make sense" based on its model, instead of simply answering "True" or "False"? Or simply highlight the gibberish parts in some way?

If this was made possible, any string could be parsed to have gibberish parts deleted out. Would be a good way to sanitize a lot of file content that's meant for human reading but has junk strings in it for whatever reason.
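
A rough sketch of the idea: score each word separately and separate the ones that fall below a cutoff. The scorer here is a deliberately crude vowel-ratio heuristic, a hypothetical placeholder so the example runs on its own; in practice the model's own scoring function would be passed in as `score`. The names and the threshold value are assumptions.

```python
def vowel_ratio(word):
    """Toy stand-in scorer: share of letters that are vowels."""
    letters = [c for c in word.lower() if c.isalpha()]
    if not letters:
        return 0.0
    return sum(c in "aeiou" for c in letters) / float(len(letters))


def split_gibberish(text, score=vowel_ratio, threshold=0.2):
    """Return (text with low-scoring words removed, list of those words)."""
    kept, flagged = [], []
    for word in text.split():
        if score(word) >= threshold:
            kept.append(word)
        else:
            flagged.append(word)
    return " ".join(kept), flagged


clean, junk = split_gibberish("please read xkcdzqrt the manual bnmvcx now")
print(clean)
print(junk)
```

Scoring word by word is the simplest granularity. A sliding character window would be another option for junk embedded inside longer tokens.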

Thanks again for writing this in the first place. I think you could expand this project to become something majorly useful. Who wouldn't want to be able to detect gibberish in their strings! I would love to contribute to this project, and I have recently started learning Python, so hopefully I can someday. I tried to understand your code, but unfortunately I am still too much of a noob to do it.
