satyapavan / high-speed-data-sanitizing-engine

A data sanitizing engine that replaces sensitive data. Many tools already exist for this, but they all depend on patterned or structured data. This engine aims to solve the problem for unstructured data.

License: GNU General Public License v3.0

Languages: Java 54.48%, JavaScript 41.29%, HTML 4.23%
Topics: engine, data, masking, high-speed, high-performance, data-masking

high-speed-data-sanitizing-engine's Introduction

log-sanitizing-engine

Agenda

To enable a log sanitizing system that replaces sensitive data with masked information. Many tools already exist for this, but they all depend on patterned or structured data such as IP addresses, passwords, and telephone numbers. This engine aims to provide masking for text that adheres to no pattern, like our greedytext.
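One way to mask text that follows no pattern is to match it against a dictionary of known sensitive terms in a single pass. The sketch below is only an illustration of that idea using the Aho-Corasick algorithm (linked under Good Reads); it is not taken from this repository. The class name `DictionaryMasker`, its `mask` method, and the sample terms are all hypothetical.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

/** Hypothetical sketch: masks every occurrence of a known term using Aho-Corasick. */
final class DictionaryMasker {
    private static final class Node {
        final Map<Character, Node> next = new HashMap<>();
        Node fail;        // longest proper suffix of this node's path that is also a trie path
        int longestMatch; // length of the longest term ending at this node, 0 if none
    }

    private final Node root = new Node();

    DictionaryMasker(List<String> terms) {
        // Step 1: insert every term into a trie.
        for (String term : terms) {
            if (term.isEmpty()) continue;
            Node node = root;
            for (char c : term.toCharArray()) {
                node = node.next.computeIfAbsent(c, k -> new Node());
            }
            node.longestMatch = Math.max(node.longestMatch, term.length());
        }
        // Step 2: breadth-first pass to set failure links.
        Queue<Node> queue = new ArrayDeque<>();
        for (Node child : root.next.values()) {
            child.fail = root;
            queue.add(child);
        }
        while (!queue.isEmpty()) {
            Node node = queue.remove();
            for (Map.Entry<Character, Node> e : node.next.entrySet()) {
                Node child = e.getValue();
                Node f = node.fail;
                while (f != root && !f.next.containsKey(e.getKey())) f = f.fail;
                Node target = f.next.get(e.getKey());
                child.fail = (target != null) ? target : root;
                // A term ending at the failure node also ends here, as a suffix.
                child.longestMatch = Math.max(child.longestMatch, child.fail.longestMatch);
                queue.add(child);
            }
        }
    }

    /** Returns text with every character covered by a term replaced by maskChar. */
    String mask(String text, char maskChar) {
        boolean[] hit = new boolean[text.length()];
        Node node = root;
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            while (node != root && !node.next.containsKey(c)) node = node.fail;
            Node step = node.next.get(c);
            node = (step != null) ? step : root;
            // Mark the longest term ending at i; shorter ones ending here are its suffixes.
            for (int j = i - node.longestMatch + 1; j <= i; j++) hit[j] = true;
        }
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            out.append(hit[i] ? maskChar : text.charAt(i));
        }
        return out.toString();
    }
}
```

For example, `new DictionaryMasker(Arrays.asList("greedytext")).mask("user greedytext logged in", '*')` would replace the ten characters of the term with asterisks and leave the rest of the line untouched. The automaton is built once and reused for every log line, so the scan advances through the text one character at a time regardless of dictionary size; only the marking loop adds work proportional to the length of each matched term.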

Good Reads

TODO: compare the performance of multi-pattern string matching in Java vs Python

https://stackoverflow.com/questions/19829892/java-regular-expressions-performance-and-alternative

https://stackoverflow.com/questions/1326682/java-replacing-multiple-different-substring-in-a-string-at-once-or-in-the-most

https://github.com/almondtools/stringbench

https://github.com/intel/hyperscan

https://www.hyperscan.io/

Good - https://stackoverflow.com/questions/8845245/high-performance-mass-short-string-search-in-python

https://en.wikipedia.org/wiki/Aho%E2%80%93Corasick_algorithm

https://en.wikipedia.org/wiki/Commentz-Walter_algorithm

http://se.ethz.ch/~meyer/publications/string/string_matching.pdf

https://wiomoc.de/aho-corasick-viz/

https://github.com/cloudflare/ahocorasick

Good - https://stackoverflow.com/questions/1250399/algorithm-for-linear-pattern-matching

https://en.wikipedia.org/wiki/Trie

https://pypi.org/project/pytst/

https://stackoverflow.com/questions/49950747/why-is-string-comparison-so-fast-in-python

https://stackoverflow.com/questions/42742810/speed-up-millions-of-regex-replacements-in-python-3

https://stackoverflow.com/questions/6690739/high-performance-fuzzy-string-comparison-in-python-use-levenshtein-or-difflib

https://bergvca.github.io/2017/10/14/super-fast-string-matching.html

https://wiki.python.org/moin/PythonSpeed/PerformanceTips

https://www.quora.com/What-are-the-best-open-source-high-performance-string-matching-libraries

https://stackoverflow.com/questions/18340097/what-is-the-fastest-substring-search-method-in-java

https://stackoverflow.com/questions/7505160/high-performance-simple-java-regular-expressions

https://stackoverflow.com/questions/11663648/high-speed-string-matching-algorithms

https://johannburkard.de/software/stringsearch/

https://www.theserverside.com/discussions/thread/61661.html

high-speed-data-sanitizing-engine's People

Contributors

satyapavan
