Giter Club home page Giter Club logo

regexp's Introduction

regexp

Build Status

This is a regular expression library for Haskell that focuses on higher level operations like computing the intersection of regular expressions or deciding whether two regular expressions match the same set of strings. This is in stark contrast to pretty much every single regular expression library out there (including ones for other languages), which are only concerned with matching strings. Unfortunately, deprioritizing string matching means it isn't very efficient, so if that's all you need, you should use a different library.

Here is a summary of supported features:

  • Intersection and complement
  • Derivatives à la Brzozowski
  • Equivalence checking
  • Solving systems of linear equations with regular expression coefficients (which can be used to implement intersection, complement, and more)
  • Arbitrary alphabets, even infinite ones!

Usage and Development

We use Stack so it's pretty much trivial to get started. If you don't have Stack already, install it and set it up by running

stack setup

in your shell. You only need to do this once. Then, you can run

stack repl

to be dropped in GHCi where you can play around with the library. This will install all dependencies, build the library, and do whatever is necessary so everything "Just Works™".

stack haddock --open regexp

will open the documentation in your browser and

stack test

will run the test suite.

Stack is all about reproducible builds, so you should not run into any issues.

Examples

Load up the library in GHCi:

stack repl

Creating Regular Expressions

The simplest regular expressions are rZero, which matches no strings, and rOne, which matches only the empty string. You can combine regular expressions using rPlus and rTimes (choice and sequencing). Kleene star is implemented by rStar. For example:

rOne
rStar rZero
rOne `rTimes` (rZero `rPlus` rOne)

are all valid expressions, though they are boring since they are all equivalent to rOne. More interesting expressions can be constructed using character classes. Standard Haskell string notation is interpreted as a character class containing all characters in the string. For example,

"abc" :: RegExp Char

is the (regular expression formed by) the character class containing the characters a, b, and c. This expression will match single character strings "a", "b", and "c" and nothing else. Note that the type annotation is required for Haskell to interpret the string as a regular expression.

String Matching

We can check that this is indeed how character classes behave by trying to match them against strings:

matches ("abc" :: RegExp Char) "a"
==> True

matches ("abc" :: RegExp Char) "ab"
==> False

matches (rStar "abc" :: RegExp Char) "ab"
==> True

Checking Equivalence

We can check if two regular expressions are equivalent and get a counter example in the case they are not:

equivalent ("abc" :: RegExp Char) ("abc" `rPlus` rZero) 
==> Right ()

equivalent ("abc" :: RegExp Char) ("ab") 
==> Left "c"

equivalent (rStar "abc" :: RegExp Char) ("abc" `rTimes` rStar "abc") 
==> Left ""

Intersection and Complement

We can compute intersections and complements:

intersection (rStar "ab" :: RegExp Char) (rStar "a") 
==> rStar "a"

intersection (rStar "a" :: RegExp Char) (RegExp.Operations.complement $ rStar "a") 
==> rZero

regexp's People

Contributors

cacay avatar mcschroeder avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.