HashDiff

HashDiff is a Ruby library to compute the smallest difference between two hashes.

Docs: Documentation

Why HashDiff?

Given two Hashes A and B, sometimes you face the question: what's the smallest modification that can be made to change A into B?

An algorithm that answers this question has to do the following:

  • Generate a list of additions, deletions and changes, so that A + ChangeSet = B and B - ChangeSet = A.
  • Compute recursively -- Arrays and Hashes may be nested arbitrarily in A or B.
  • Compute the smallest change -- it should recognize similar child Hashes or child Arrays between A and B.
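
To make the first requirement concrete, here is a minimal sketch of applying and reverting a change set. It assumes flat (non-nested) hashes with string keys, and that each change is an array of the form ['-', key, old], ['+', key, new] or ['~', key, old, new]. The helpers apply_changes and revert_changes are hypothetical names used for illustration only; they are not HashDiff's implementation.

```ruby
# Hypothetical helpers for illustration; flat hashes with string keys only.

# A + ChangeSet = B
def apply_changes(hash, changes)
  result = hash.dup
  changes.each do |op, key, *values|
    case op
    when '-' then result.delete(key)          # ['-', key, old]
    when '+' then result[key] = values.first  # ['+', key, new]
    when '~' then result[key] = values.last   # ['~', key, old, new]
    end
  end
  result
end

# B - ChangeSet = A: undo the changes in reverse order
def revert_changes(hash, changes)
  result = hash.dup
  changes.reverse_each do |op, key, *values|
    case op
    when '-' then result[key] = values.first  # restore the removed value
    when '+' then result.delete(key)          # drop the added key
    when '~' then result[key] = values.first  # restore the old value
    end
  end
  result
end

a = { 'x' => 3, 'z' => 45 }
b = { 'y' => 3, 'z' => 30 }
changes = [['-', 'x', 3], ['~', 'z', 45, 30], ['+', 'y', 3]]

patched  = apply_changes(a, changes)   # equal to b
restored = revert_changes(b, changes)  # equal to a
```

Because every change records the old value as well as the new one, the same list can be replayed in either direction.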

HashDiff answers the question above using an opinionated approach:

  • A Hash can be represented as a list of (dot-syntax-path, value) pairs. For example, {a:[{c:2}]} can be represented as [["a[0].c", 2]].
  • The change set can be represented using the dot-syntax representation. For example, [['-', 'b.x', 3], ['~', 'b.z', 45, 30], ['+', 'b.y', 3]].
  • It compares Arrays using the LCS (longest common subsequence) algorithm.
  • It recognizes similar Hashes in an Array using a similarity value (0 < similarity <= 1).
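
The path representation described above can be sketched in a few lines. The helper flatten_paths below is a hypothetical name and is not part of HashDiff; it only shows how nested Hashes and Arrays might map to (path, value) pairs, assuming keys are joined with a delimiter and array elements are addressed as [index].

```ruby
# Hypothetical helper, not part of HashDiff: flatten nested Hashes and Arrays
# into [path, value] pairs, joining keys with a delimiter and writing array
# positions as [index].
def flatten_paths(obj, prefix = '', delimiter = '.')
  case obj
  when Hash
    obj.flat_map do |key, value|
      path = prefix.empty? ? key.to_s : "#{prefix}#{delimiter}#{key}"
      flatten_paths(value, path, delimiter)
    end
  when Array
    obj.each_with_index.flat_map do |value, index|
      flatten_paths(value, "#{prefix}[#{index}]", delimiter)
    end
  else
    # Leaf value: emit a single [path, value] pair
    [[prefix, obj]]
  end
end

pairs = flatten_paths({ a: [{ c: 2 }], b: { x: 3 } })
# pairs == [["a[0].c", 2], ["b.x", 3]]
```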

Usage

To use the gem, add the following to your Gemfile:

gem 'hashdiff'

Quick Start

Diff

Two simple hashes:

a = {a:3, b:2}
b = {}

diff = HashDiff.diff(a, b)
diff.should == [['-', 'a', 3], ['-', 'b', 2]]

More complex hashes:

a = {a:{x:2, y:3, z:4}, b:{x:3, z:45}}
b = {a:{y:3}, b:{y:3, z:30}}

diff = HashDiff.diff(a, b)
diff.should == [['-', 'a.x', 2], ['-', 'a.z', 4], ['-', 'b.x', 3], ['~', 'b.z', 45, 30], ['+', 'b.y', 3]]

Arrays in hashes:

a = {a:[{x:2, y:3, z:4}, {x:11, y:22, z:33}], b:{x:3, z:45}}
b = {a:[{y:3}, {x:11, z:33}], b:{y:22}}

diff = HashDiff.best_diff(a, b)
diff.should == [['-', 'a[0].x', 2], ['-', 'a[0].z', 4], ['-', 'a[1].y', 22], ['-', 'b.x', 3], ['-', 'b.z', 45], ['+', 'b.y', 22]]

Patch

Patch example:

a = {a: 3}
b = {a: {a1: 1, a2: 2}}

diff = HashDiff.diff(a, b)
HashDiff.patch!(a, diff).should == b

Unpatch example:

a = [{a: 1, b: 2, c: 3, d: 4, e: 5}, {x: 5, y: 6, z: 3}, 1]
b = [1, {a: 1, b: 2, c: 3, e: 5}]

diff = HashDiff.diff(a, b) # diffing two arrays is OK
HashDiff.unpatch!(b, diff).should == a

Options

There are six options available: :delimiter, :similarity, :strict, :numeric_tolerance, :strip and :case_insensitive.

:delimiter

You can specify :delimiter to be something other than the default dot. For example:

a = {a:{x:2, y:3, z:4}, b:{x:3, z:45}}
b = {a:{y:3}, b:{y:3, z:30}}

diff = HashDiff.diff(a, b, :delimiter => '\t')
diff.should == [['-', 'a\tx', 2], ['-', 'a\tz', 4], ['-', 'b\tx', 3], ['~', 'b\tz', 45, 30], ['+', 'b\ty', 3]]

:similarity

In cases where you have similar hash objects in arrays, you can pass a custom value for :similarity instead of the default 0.8. This is interpreted as a ratio of similarity (default is 80% similar, whereas :similarity => 0.5 would look for at least a 50% similarity).
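
To give a feel for what a similarity ratio means, here is a rough sketch. The measure used (shared key/value pairs divided by all distinct pairs) is an assumption chosen for illustration; HashDiff's own calculation may differ, and the helpers similarity and similar? are hypothetical names.

```ruby
# Hypothetical measure for illustration only; HashDiff may compute similarity
# differently. Ratio of shared key/value pairs to all distinct pairs.
def similarity(a, b)
  pairs_a = a.to_a
  pairs_b = b.to_a
  total = (pairs_a | pairs_b).size
  return 1.0 if total.zero? # two empty hashes count as identical

  (pairs_a & pairs_b).size.to_f / total
end

# Two hashes are treated as "the same element, modified" when their ratio
# reaches the threshold (0.8 mirrors the default mentioned above).
def similar?(a, b, threshold = 0.8)
  similarity(a, b) >= threshold
end

first  = { x: 11, y: 22, z: 33 }
second = { x: 11, z: 33 }

ratio = similarity(first, second) # 2 shared pairs out of 3 distinct pairs
loose  = similar?(first, second, 0.5)
strict = similar?(first, second)
```

Under this measure the two hashes match at a threshold of 0.5 but not at 0.8, which is the kind of trade-off the :similarity option controls.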

:strict

The :strict option, which defaults to true, specifies whether numeric types are compared on type as well as value. By default, an Integer (a Fixnum on Ruby versions before 2.4) will never be equal to a Float (e.g. 4 != 4.0). Setting :strict to false makes the comparison looser (e.g. 4 == 4.0).
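
The distinction can be illustrated in plain Ruby. The helper numbers_equal? below is a hypothetical name and only a sketch of the idea, assuming strict mode means "same class and same value"; it is not HashDiff's actual code.

```ruby
# Hypothetical helper, not HashDiff's code: strict mode requires the same
# numeric class as well as the same value.
def numbers_equal?(a, b, strict: true)
  return a == b unless strict

  a.class == b.class && a == b
end

strict_result = numbers_equal?(4, 4.0)                 # false: Integer vs Float
loose_result  = numbers_equal?(4, 4.0, strict: false)  # true: 4 == 4.0 in Ruby
same_type     = numbers_equal?(4, 4)                   # true
```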

:numeric_tolerance

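The :numeric_tolerance option treats two numbers as equal when their difference is within the given tolerance. In the example below, 3.75 and 3.76 are within 0.1 of each other, so only x is reported.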

a = {x:5, y:3.75, z:7}
b = {x:6, y:3.76, z:7}

diff = HashDiff.diff(a, b, :numeric_tolerance => 0.1)
diff.should == [["~", "x", 5, 6]]

:strip

The :strip option strips leading and trailing whitespace from strings before comparing them.

a = {x:5, s:'foo '}
b = {x:6, s:'foo'}

diff = HashDiff.diff(a, b, :numeric_tolerance => 0.1, :strip => true)
diff.should == [["~", "x", 5, 6]]

:case_insensitive

The :case_insensitive option makes string comparisons ignore case.

a = {x:5, s:'FooBar'}
b = {x:6, s:'foobar'}

diff = HashDiff.diff(a, b, :numeric_tolerance => 0.1, :case_insensitive => true)
diff.should == [["~", "x", 5, 6]]
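
As a rough mental model for :strip and :case_insensitive, both can be thought of as normalizing strings before an ordinary equality check. The helper strings_equal? below is a hypothetical name and a sketch under that assumption, not HashDiff's implementation.

```ruby
# Hypothetical helper, not HashDiff's code: normalize strings according to the
# two options, then compare. Non-string values fall back to plain ==.
def strings_equal?(a, b, strip: false, case_insensitive: false)
  return a == b unless a.is_a?(String) && b.is_a?(String)

  if strip
    a = a.strip
    b = b.strip
  end
  if case_insensitive
    a = a.downcase
    b = b.downcase
  end
  a == b
end

plain    = strings_equal?('foo ', 'foo')                             # false
stripped = strings_equal?('foo ', 'foo', strip: true)                # true
caseless = strings_equal?('FooBar', 'foobar', case_insensitive: true) # true
```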

Specifying a custom comparison method

It's possible to specify how the values of a key should be compared by passing a block to diff. The block returns true when the two values should be treated as equal.

a = {a:'car', b:'boat', c:'plane'}
b = {a:'bus', b:'truck', c:' plan'}

diff = HashDiff.diff(a, b) do |path, obj1, obj2|
  case path
  when  /a|b|c/
    obj1.length == obj2.length
  end
end

diff.should == [['~', 'b', 'boat', 'truck']]

The parameters yielded to the comparison block are |path, obj1, obj2|, where path is the key (or delimited compound key) of the value being compared. When comparing elements in an array, the path has the format array[*]. For example:

a = {a:'car', b:['boat', 'plane'] }
b = {a:'bus', b:['truck', ' plan'] }

diff = HashDiff.diff(a, b) do |path, obj1, obj2|
  case path
  when 'b[*]'
    obj1.length == obj2.length
  end
end

diff.should == [["~", "a", "car", "bus"], ["~", "b[1]", "plane", " plan"], ["-", "b[0]", "boat"], ["+", "b[0]", "truck"]]

When a comparison block is given, it takes priority over the other specified options. If the block returns a value other than true or false, the two values are compared using the other specified options.
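
The fallback rule can be sketched as follows. The helper equal_with_block? is a hypothetical name, and plain == stands in for "the other specified options"; this is an illustration of the rule, not HashDiff's code.

```ruby
# Hypothetical helper: consult the block first; if it returns anything other
# than true or false (for example nil), fall back to the default comparison,
# which is plain == in this sketch.
def equal_with_block?(path, a, b)
  if block_given?
    verdict = yield(path, a, b)
    return verdict if verdict == true || verdict == false
  end
  a == b
end

# Only has an opinion about the 'name' path; returns nil for everything else.
same_length = lambda do |path, x, y|
  x.length == y.length if path == 'name'
end

by_block    = equal_with_block?('name', 'car', 'bus', &same_length) # true
by_fallback = equal_with_block?('id', 'car', 'bus', &same_length)   # false
no_block    = equal_with_block?('id', 'car', 'car')                 # true
```

Returning nil from the block is therefore a convenient way to say "no opinion" for paths the block does not care about.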

Sorting arrays before comparison

An order difference alone between two arrays can create too many diffs to be useful. Consider sorting them prior to diffing.

a = {a:'car', b:['boat', 'plane'] }
b = {a:'car', b:['plane', 'boat'] }

HashDiff.diff(a, b) # => [["+", "b[0]", "plane"], ["-", "b[2]", "plane"]]

b[:b].sort!

HashDiff.diff(a, b) # => []

License

HashDiff is distributed under the MIT-LICENSE.