Lambda Soup (version 0.6.1)

Lambda Soup is a functional HTML scraping and manipulation library for OCaml aimed at being easy to use.

[GIF: Lambda Soup usage example]

Lambda Soup is simple. It provides a set of elementary traversals for getting from node to node, familiar functional combinators such as filter, map, and fold, and support for all CSS selectors that still make sense when not running in a browser (and a few obvious extensions on top of that).

Here is a trivial self-contained example:

"<p class='Hello'>World!</p>" |> parse $ ".Hello" |> R.leaf_text;;
- : string = "World!"

And, a mutation:

let soup = parse "<p class='Hello'>World!</p>" in
wrap (soup $ ".Hello" |> R.child) (create_element "strong");
soup |> to_string;;
- : string = "<p class=\"Hello\"><strong>World!</strong></p>"
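The selectors and combinators mentioned above can be combined in the same pipeline style. The following is a minimal sketch, not taken from the library's documentation: the sample HTML and the names `html`, `classed_items`, and `total_length` are made up for illustration, and it assumes the `Soup` module has been opened, as the snippets above also do.

```ocaml
open Soup

(* Hypothetical input, made up for this example. *)
let html =
  "<ul><li class='a'>one</li><li>two</li><li class='a'>three</li></ul>"

(* Select every <li>, keep those that carry a class attribute,
   then read the text of each remaining element. *)
let classed_items =
  parse html $$ "li"
  |> filter (fun li -> attribute "class" li <> None)
  |> to_list
  |> List.map R.leaf_text

(* Fold over all <li> elements, summing the lengths of their text. *)
let total_length =
  parse html $$ "li"
  |> fold (fun acc li -> acc + String.length (R.leaf_text li)) 0

let () =
  List.iter print_endline classed_items;
  Printf.printf "%d\n" total_length
```

Here `$$` selects all matching elements, whereas `$` in the earlier examples selects exactly one. The `R` functions raise an exception when a result is missing, so the plain, option-returning variants such as `attribute` are the safer choice when a node may be absent.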

For some more examples, see the Lambda Soup postprocessor that runs on Lambda Soup's own documentation after it is generated by ocamldoc.

The library is tested thoroughly.

Lambda Soup is based on Markup.ml. As a consequence, it resolves entity references, detects character encodings automatically, and converts everything to UTF-8. And, you can use Lambda Soup on XML, by parsing the XML with Markup.ml and feeding the signals to Lambda Soup.
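A minimal sketch of the XML route described above, assuming the `markup` package is installed alongside `lambdasoup` and that `Soup.from_signals` accepts the signal stream produced by Markup.ml. The XML string and the names `xml` and `soup` are made up for illustration.

```ocaml
open Soup

(* Hypothetical XML input, made up for this example. *)
let xml = "<feed><entry><title>Lambda Soup</title></entry></feed>"

(* Parse with Markup.ml's XML parser, then hand the resulting
   signal stream to Lambda Soup to build a document. *)
let soup =
  Markup.string xml
  |> Markup.parse_xml
  |> Markup.signals
  |> from_signals

(* From here on, the usual selectors and traversals apply. *)
let () = soup $ "title" |> R.leaf_text |> print_endline
```

Only the parsing step differs from the HTML case. The example keeps element names lowercase so that selector matching does not depend on how name case is handled.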


Installing

opam install lambdasoup

Starting from scratch

To use Lambda Soup interactively as in the GIF at the top of this README, you need to have done something like this:

your-package-manager install ocaml opam
opam init
eval `opam config env`          # Or restart your shell
opam install lambdasoup

and make sure your ~/.ocamlinit file looks something like this:

let () =
  try Topdirs.dir_directory (Sys.getenv "OCAML_TOPLEVEL_PATH")
  with Not_found -> ()
;;

#use "topfind";;

Then, run ocaml -short-paths to start the top-level, and scrape away!
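Once the top-level is running, the library still has to be loaded before the examples in this README will work. A minimal sketch of the first two phrases of a session, assuming findlib was loaded by the ~/.ocamlinit shown above:

```ocaml
#require "lambdasoup";;   (* load the library through findlib *)
open Soup;;               (* bring parse, $, R, and the rest into scope *)
```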


Depending

Lambda Soup uses semantic versioning, but is currently in 0.x.x. For now, the minor version number will be incremented on breaking changes. So, to give yourself a chance to review the changelog before your code breaks, put the following constraint on Lambda Soup: lambdasoup {< "0.7.0"}.


Documentation

Lambda Soup's interface consists of one module Soup, whose signature is documented here.


Developing

See CONTRIBUTING. All feedback is welcome – open an issue on GitHub, or send me an email at [email protected]. If you find yourself repeatedly writing the same helper on top of Lambda Soup's functions, perhaps we should add it to Lambda Soup.
