Giter Club home page Giter Club logo

ocaml-avro's Introduction

Avro build

  • avro is the runtime library, should be reasonably small (depends only on camlzip)
  • avro-compiler can read a json schema and generate code from it

The online documentation can be found here

Supported features

  • code generation from json schema (using avro-compiler)
    • NOTE: default values are not supported yet.
  • object container file format
  • binary encoding
  • schema evolution (in particular the schema is emitted as-is, nothing is stripped)
  • single object file format
  • sorting
  • json encoding
  • compression codecs:
    • null
    • deflate
    • snappy (optional codec)

Schemas

Schemas are json files (described here). Some examples can be found in tests/.

For example, the schema:

{
  "name": "employee",
  "type": "record",
  "fields": [
    {"name": "firstname", "type": "string"},
    {"name": "lastname", "type": "string"},
    {"name": "age", "type": "int"},
    {"name": "friendliness", "type": "float"},
    {"name": "job", "type": [
      {"name": "programmer", "type": "record",
        "fields": [{"name": "language", "type": {"type":"enum", "name":"language","symbols": ["cpp", "ocaml", "java", "python"]}}]},
      {"name": "RH", "type": "record",
        "fields": [{"name": "hair_spikiness", "type": {"type":"fixed", "size": 4, "name":"spikiness"}}]}
    ] }
  ]
}

will be compiled to the types:

type nonrec language =   | Cpp | Ocaml | Java | Python

type nonrec programmer = { language: language; }

type nonrec rH = { hair_spikiness: string; }
type nonrec union_0 =  
  | C_programmer of programmer 
  | C_RH of rH 
                       
type nonrec t = {
  firstname: string; 
  lastname: string; 
  age: int; 
  friendliness: float; 
  job: union_0; 
}

Compression

New codecs can be registered in Avro.Obj_container_format.Codec. Currently it's quite naive as it goes from a string to a string, instead of being fully streaming. This might change before 1.0.

module Codec : sig
  type t

  (**)

  val register :
    name:string ->
    compress:(string -> string) ->
    decompress:(string -> string) ->
    unit -> t
  (** Register decompression codecs. Defaults are "null" and "deflate". *)
end

Examples

See some examples from the tests/ directory:

  • tests/records.json and tests/records_test.ml that use it. you can see what the schema compiler will produce:

    $ ./compiler.sh ./tests/records.json
    (* generated by avro-compiler *)
    
    open Avro
    
    module Str_map = Map.Make(String)
    
    let schema = "{\"type\":\"record\",\"name\":\"test\",\"fields\":[{\"type\":\"long\",\"name\":\"a\"},{
    \"type\":\"string\",\"name\":\"b\"}]}"
    
    type nonrec t = { a: int64;  b: string; }
    
    let read (input:Input.t) : t =
      (let a = Input.read_int64 input in
       let b = Input.read_string input in
       { a;b })
    
    let write (out:Output.t) (self:t) : unit =
      ((let self = self.a in Output.write_int64 out self);
        (let self = self.b in Output.write_string out self);
        )
    
  • similarly for tests/employee.json and tests/employee_test.ml that use it

License

MIT license.

ocaml-avro's People

Contributors

c-cube avatar tmcgilchrist avatar

Watchers

James Cloos avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.