scramjet's Introduction

What does it do?

Scramjet is a powerful yet simple framework written on top of node.js object streams, somewhat similar to the well-known event-stream or highland modules, but with a much simpler API and written fully in ES6. It is built upon the logic behind three well-known JavaScript array operations, namely map, filter and reduce. This means that if you've ever performed operations on an Array in JavaScript, you already know Scramjet like the back of your hand.
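
For reference, here is how those three operations look on a plain Array. This is ordinary JavaScript rather than Scramjet code, and the sample data is made up for illustration.

```javascript
// Hypothetical sample data: a few parking lots with free-space counts.
const parkings = [
    { name: "A", free: 12 },
    { name: "B", free: 0 },
    { name: "C", free: 30 },
];

// filter: keep only the entries that still have free spaces
const available = parkings.filter((p) => p.free > 0);

// map: turn each remaining entry into a new value
const names = available.map((p) => p.name);

// reduce: fold all entries into a single accumulator
const totalFree = parkings.reduce((sum, p) => sum + p.free, 0);

console.log(names, totalFree);
```

The stream methods listed in the API section follow the same shape, except that the items arrive over time instead of sitting in memory.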

The main advantage of scramjet is running asynchronous operations on your data streams. It lets you perform transformations both synchronously and asynchronously through the same API, so you can "map" your stream from whatever source and call any number of APIs consecutively.

The benchmarks are published in the scramjet-benchmark repo.

Example

How about a CSV parser of all the parkings in the city of Wrocław from http://www.wroclaw.pl/open-data/...

const request = require("request");
const StringStream = require("scramjet").StringStream;

let columns = null;
request.get("http://www.wroclaw.pl/open-data/opendata/its/parkingi/parkingi.csv")
    .pipe(new StringStream())
    // one chunk per CSV line
    .split("\n")
    // each line becomes an array of fields
    .parse((line) => line.split(";"))
    // remember the header row as the column names
    .pop(1, (data) => columns = data)
    // turn each row into an object keyed by column name
    .map((data) => columns.reduce((acc, id, i) => (acc[id] = data[i], acc), {}))
    .on("data", console.log.bind(console));
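
The map step in the example above is the least obvious part, so here it is in isolation as plain JavaScript. The column names and rows below are invented for illustration; the real CSV may use different headers.

```javascript
// Hypothetical header and rows, already split on ";" as in the pipeline above.
const columns = ["name", "free"];
const rows = [
    ["Parking A", "120"],
    ["Parking B", "35"],
];

// Same reduce as in the example: pair each column name with the value
// at the same index and collect the pairs into one object.
const toObject = (data) =>
    columns.reduce((acc, id, i) => (acc[id] = data[i], acc), {});

const records = rows.map(toObject);
console.log(records);
```

Note that the values stay strings; any numeric conversion would be an extra step.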

API Docs

Here's the list of the exposed classes; please review the specific documentation for details.

Note that:

  • Most of the methods take a callback argument that operates on the stream items.
  • The callback, unless it's stated otherwise, will receive an argument with the next chunk.
  • If you want to perform your operations asynchronously, return a Promise, otherwise just return the right value.

The quick reference of the exposed classes:

DataStream ⇐ stream.PassThrough

DataStream is the primary stream type for Scramjet. When you parse your stream, just pipe it in; you can then perform calculations on the data objects streamed through your flow.

Detailed DataStream docs here

| Method | Description | Example |
| --- | --- | --- |
| new DataStream(opts) | Create the DataStream. | DataStream example |
| dataStream.debug(func) ⇒ DataStream | Injects a debugger statement when called. | debug example |
| dataStream.use(func) ⇒ * | Calls the passed function in place with the stream as the first argument and returns the result. | use example |
| dataStream.group(func) ⇒ DataStream | Groups execution by key in a single thread | group example |
| dataStream.tee(func) ⇒ DataStream | Duplicate the stream | tee example |
| dataStream.slice(start, end, func) ⇒ DataStream | Gets a slice of the stream to the callback function. | slice example |
| dataStream.accumulate(func, into) ⇒ Promise | Accumulates data into the object. | accumulate example |
| dataStream.reduce(func, into) ⇒ Promise | Reduces the stream into a given accumulator | reduce example |
| dataStream.reduceNow(func, into) ⇒ * | Reduces the stream into the given object, returning it immediately. | reduceNow example |
| dataStream.remap(func, Clazz) ⇒ DataStream | Remaps the stream into a new stream. | remap example |
| dataStream.flatMap(func, Clazz) ⇒ DataStream | Takes any method that returns any iterable and flattens the result. | flatMap example |
| dataStream.each(func) ↩︎ | Performs an operation on every chunk, without changing the stream | |
| dataStream.map(func, Clazz) ⇒ DataStream | Transforms stream objects into new ones, just like Array.prototype.map | map example |
| dataStream.assign(func) ⇒ DataStream | Transforms stream objects by assigning the properties from the returned | assign example |
| dataStream.filter(func) ⇒ DataStream | Filters object based on the function outcome, just like Array.prototype.filter | filter example |
| dataStream.shift(count, func) ⇒ DataStream | Shifts the first n items from the stream and pipes the rest | shift example |
| dataStream.separate() ⇒ MultiStream | Splits the stream two ways | separate example |
| dataStream.toBufferStream(serializer) ⇒ BufferStream | Creates a BufferStream | toBufferStream example |
| dataStream.stringify(serializer) ⇒ StringStream | Creates a StringStream | stringify example |
| dataStream.toArray(initial) ⇒ Promise | Aggregates the stream into a single Array | |
| DataStream.fromArray(arr) ⇒ DataStream | Create a DataStream from an Array | fromArray example |

StringStream ⇐ DataStream

A stream of string objects for further transformation on top of DataStream.

Detailed StringStream docs here

| Method | Description | Example |
| --- | --- | --- |
| new StringStream(encoding) | Constructs the stream with the given encoding | StringStream example |
| stringStream.shift(bytes, func) ⇒ StringStream | Shifts given length of chars from the original stream | shift example |
| stringStream.split(splitter) ⇒ StringStream | Splits the string stream by the specified regexp or string | split example |
| stringStream.match(splitter) ⇒ StringStream | Finds matches in the string stream and streams the match results | match example |
| stringStream.toBufferStream() ⇒ BufferStream | Transforms the StringStream to BufferStream | toBufferStream example |
| stringStream.parse(parser) ⇒ DataStream | Parses every string to object | parse example |
| StringStream.SPLIT_LINE | A handy split-by-line regex to quickly get a line-by-line stream | |
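
To illustrate why splitting a string stream is more than calling split on each chunk, here is a minimal sketch in plain JavaScript. It is not Scramjet's implementation; splitChunks is a hypothetical helper and the input chunks are made up. The point is that a line can straddle two chunks, so the unfinished tail has to be carried over.

```javascript
// Sketch only: splits an iterable of string chunks into lines, carrying
// the unfinished tail of each chunk over to the next one.
function* splitChunks(chunks, splitter = "\n") {
    let rest = "";
    for (const chunk of chunks) {
        const parts = (rest + chunk).split(splitter);
        rest = parts.pop(); // the last part may be an incomplete line
        yield* parts;
    }
    if (rest !== "") yield rest; // flush whatever is left at the end
}

// Chunk boundaries deliberately fall in the middle of lines.
const lines = [...splitChunks(["id;na", "me\n1;fo", "o\n2;bar"])];
console.log(lines);
```

The helper also accepts a regular expression as the splitter, since String.prototype.split does.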

BufferStream ⇐ DataStream

A facilitation stream created for easy splitting or parsing of buffers.

Detailed BufferStream docs here

| Method | Description | Example |
| --- | --- | --- |
| new BufferStream(opts) | Creates the BufferStream | BufferStream example |
| bufferStream.shift(chars, func) ⇒ BufferStream | Shift given number of bytes from the original stream | shift example |
| bufferStream.split(splitter) ⇒ BufferStream | Splits the buffer stream into buffer objects | split example |
| bufferStream.breakup(number) ⇒ BufferStream | Breaks up a stream apart into chunks of the specified length | breakup example |
| bufferStream.toStringStream(encoding) ⇒ StringStream | Creates a string stream from the given buffer stream | toStringStream example |
| bufferStream.parse(parser) ⇒ DataStream | [Parallel] Parses every buffer to object | parse example |
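
The idea behind breaking a stream into chunks of a fixed length can be sketched in plain Node.js as follows. This is an illustration under assumptions, not Scramjet's code: the standalone breakup function is hypothetical, and emitting a shorter final chunk is a choice made here that the source does not specify.

```javascript
// Sketch only: regroups an iterable of Buffers into Buffers of exactly
// `size` bytes (size must be a positive integer). If bytes remain at the
// end, they are emitted as one shorter chunk (an assumption of this sketch).
function* breakup(buffers, size) {
    let pending = Buffer.alloc(0);
    for (const buf of buffers) {
        pending = Buffer.concat([pending, buf]);
        while (pending.length >= size) {
            yield pending.subarray(0, size);
            pending = pending.subarray(size);
        }
    }
    if (pending.length > 0) yield pending;
}

const pieces = [...breakup([Buffer.from("abcde"), Buffer.from("fgh")], 3)]
    .map((b) => b.toString());
console.log(pieces);
```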

MultiStream

An object consisting of multiple streams that can be refined or muxed.

Detailed MultiStream docs here

| Method | Description | Example |
| --- | --- | --- |
| new MultiStream(streams, options) | Creates an instance of MultiStream with the specified stream list | MultiStream example |
| multiStream.map(aFunc) ⇒ MultiStream | Returns new MultiStream with the streams returned by the transform. | map example |
| multiStream.filter(func) ⇒ MultiStream | Filters the stream list and returns a new MultiStream with only the | filter example |
| multiStream.dedupe(cmp) ⇒ DataStream | Removes duplicate items from stream using the given hash function | dedupe example |
| multiStream.mux(cmp) ⇒ DataStream | Muxes the streams into a single one | mux example |
| multiStream.add(stream) | Adds a stream to the MultiStream | add example |
| multiStream.remove(stream) | Removes a stream from the MultiStream | remove example |
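
As a rough illustration of removing duplicates with a hash function, here is a sketch over plain iterables. It is not how MultiStream is implemented; the standalone dedupe function, the hash callback and the sample items are all hypothetical, and the two sources are simply concatenated instead of being read as streams.

```javascript
// Sketch only: yields each item once, using hash(item) as its identity.
function* dedupe(items, hash) {
    const seen = new Set();
    for (const item of items) {
        const key = hash(item);
        if (seen.has(key)) continue; // already emitted an item with this key
        seen.add(key);
        yield item;
    }
}

// Two made-up sources that overlap on id 2.
const sourceA = [{ id: 1 }, { id: 2 }];
const sourceB = [{ id: 2 }, { id: 3 }];

const unique = [...dedupe([...sourceA, ...sourceB], (item) => item.id)];
console.log(unique.map((item) => item.id));
```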

Browserifying

Scramjet works in the browser too. There's a nice, self-contained sample in samples/browser; just run it:

    git clone https://github.com/signicode/scramjet.git
    cd scramjet
    npm install .
    cd samples/browser
    npm start # point your browser to http://localhost:30035 and open console

If you need your scramjet version for the browser, grab browserify and just run:

    browserify lib/index --standalone scramjet -o /path/to/your/browserified-scramjet.js

With this you can run your transformations in the browser and use websockets to send them back and forth. If you do and it fails for some reason, please remember to file an issue, as no one person can test all the use cases and I am but one person.

Usage

Scramjet uses functional programming to run transformations on your data streams in a fashion very similar to the well known event-stream node module. Most transformations are done by passing a transform function. You can write your function in two ways:

  1. Synchronous

Example: a simple stream transform that outputs a stream of objects with the same id property and the length of the value string.

   datastream.map(
       (item) => ({id: item.id, length: item.value.length})
   )

  2. Asynchronous (using Promises)

Example: a simple stream that fetches the URL mentioned in the incoming object.

   datastream.map(
       (item) => new Promise((resolve, reject) => {
           request(item.url, (err, res, data) => {
               if (err)
                   reject(err); // will emit an "error" event on the stream
               else
                   resolve(data);
           });
       })
   )

The actual logic of this transform function is as if you passed your function to the then method of a Promise resolved with the data from the input stream.
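
That behaviour can be sketched with Node's built-in stream module. This is an illustration of the idea, not Scramjet's implementation: mapStream is a hypothetical helper, and the sample items and the 10 ms delay are made up.

```javascript
const { Readable, Transform } = require("stream");

// Sketch only: an object-mode Transform that applies `func` to each chunk
// as if it were passed to `.then()` of a Promise resolved with that chunk.
function mapStream(func) {
    return new Transform({
        objectMode: true,
        transform(chunk, _encoding, callback) {
            Promise.resolve(chunk)
                .then(func) // plain return values and Promises both work here
                .then(
                    (result) => callback(null, result),
                    (err) => callback(err) // surfaces as an "error" event
                );
        },
    });
}

// Usage: one synchronous and one asynchronous transform, chained.
const done = new Promise((resolve, reject) => {
    const results = [];
    Readable.from([{ id: 1, value: "abc" }, { id: 2, value: "de" }])
        .pipe(mapStream((item) => ({ id: item.id, length: item.value.length })))
        .pipe(mapStream((item) => new Promise((res) => setTimeout(() => res(item), 10))))
        .on("data", (item) => results.push(item))
        .on("end", () => resolve(results))
        .on("error", reject);
});

done.then((items) => console.log(items));
```

Because the function runs inside a then handler, a thrown exception and a rejected Promise are handled the same way: both end up as an error on the stream that ran the function.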

License and contributions

As of version 2.0 Scramjet is MIT Licensed.

Help wanted

The project needs your help! There's lots of work to do: transforming and muxing, joining and splitting, browserifying, modularizing, documenting and reporting issues.

If you want to help and be part of the Scramjet team, please reach out to me, signicode on GitHub, or email me: [email protected].
