odiff

A value difference generator that can generate a list of differences between one javascript value and another. It recursively finds differences within arrays and objects.

Example

var a = [{a:1,b:2,c:3},              {x:1,y: 2, z:3},              {w:9,q:8,r:7}]
var b = [{a:1,b:2,c:3},{t:4,y:5,u:6},{x:1,y:'3',z:3},{t:9,y:9,u:9},{w:9,q:8,r:7}]

var diffs = odiff(a,b)

/* diffs now contains:
[{type: 'add', path:[], index: 2, vals: [{t:9,y:9,u:9}]},
 {type: 'set', path:[1,'y'], val: '3'},
 {type: 'add', path:[], index: 1, vals: [{t:4,y:5,u:6}]}
]
*/

Motivation

This differencing algorithm is intended to make object differences easy to manage when you need to update an object in a way other than simply copying the reference. An example is if you need to create a database query to update a record based on the changes between two objects. It also works in a basic way on primitives (though it does no string differencing).

While this algorithm puts in some effort to make the number of change records minimal, it by no means generates an absolutely minimal set of changes. It also doesn't handle compressing string differences in any way. For these reasons, this algorithm is not ideal for use in sending changes over the wire, especially if your data contains lots of small changes to large strings.

Other algorithms I found either had undesirable behavior when dealing with array inserts ( flitbit/diff, cosmicanant/recursive-diff, ackrause/ObjectCompare, thomseddon/docdiff ), had weird nested difference formats that make things harder ( NV/objectDiff.js ), didn't describe their behavior at all ( aogriffiths/jsondiff-js, benjamine/JsonDiffPatch ), or all three. Also some only work on objects (not arrays) ( Evaw/objectDiff, omgaz/js-diff ).

Install

npm install odiff

Usage

Accessing odiff:

// typescript
import odiff from "odiff";

// node.js
var odiff = require('odiff')

// amd
require.config({paths: {odiff: '../generatedBuilds/odiff.umd.js'}})
require(['odiff'], function(odiff) { /* your code */ })

// global variable
<script src="odiff.umd.js"></script>
odiff; // odiff.umd.js can define odiff globally if you really
       //   want to shun module-based design

Using odiff:

odiff(valueA, valueB) - Returns an array of changes that, when applied to valueA, will turn that value into valueB. The results are also built so that you can pick and choose which changes to apply: as long as you apply the selected changes in order, they will work properly. Each element in the resulting array has the following members:

  • type - Either "set", "unset", "add", or "rm"
  • path - An array representing the path from the root object. For example, ["a",2,"b"] represents valueA.a[2].b. Will be an empty array if the change applies to the top-level object (ie valueA directly).
  • val - The value the indicated property was changed to. Only defined for the "set" type.
  • index - The index at which an item was added or removed from an array. Only defined for the "add" and "rm" types.
  • vals - An array of the values added to or removed from the indicated property. Only defined for the "add" and "rm" types.
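
The members above are enough to replay a diff by hand. What follows is a minimal sketch, not part of odiff: `applyChanges` is a hypothetical helper, and it assumes that "add" inserts `vals` at `index`, that "rm" removes `vals.length` items starting at `index` (the 1.3.0+ record shape), and that "set"/"unset" paths end with the key being changed. The change records are the ones from the Example section, written out by hand so the snippet runs without the library.

```javascript
// Hypothetical helper (not part of odiff): applies change records in order.
// Nested arrays/objects are mutated in place; the root is returned because a
// 'set' on the empty path replaces the whole value.
function applyChanges(root, changes) {
  // Follow a path such as ['a', 2, 'b'] down from the given value.
  const walk = (value, path) => path.reduce((node, key) => node[key], value);

  for (const change of changes) {
    const { type, path } = change;
    if (type === 'add') {
      // 'add' paths point at the array itself; insert vals at index.
      walk(root, path).splice(change.index, 0, ...change.vals);
    } else if (type === 'rm') {
      // Assumption: 'rm' records carry vals (odiff 1.3.0 and later).
      walk(root, path).splice(change.index, change.vals.length);
    } else if (type === 'set' && path.length === 0) {
      // A 'set' on the empty path replaces the top-level value.
      root = change.val;
    } else {
      // 'set' and 'unset' paths end with the key being changed.
      const parent = walk(root, path.slice(0, -1));
      const key = path[path.length - 1];
      if (type === 'set') parent[key] = change.val;
      else delete parent[key];
    }
  }
  return root;
}

// Change records copied from the Example section.
const a = [{a: 1, b: 2, c: 3}, {x: 1, y: 2, z: 3}, {w: 9, q: 8, r: 7}];
const diffs = [
  {type: 'add', path: [], index: 2, vals: [{t: 9, y: 9, u: 9}]},
  {type: 'set', path: [1, 'y'], val: '3'},
  {type: 'add', path: [], index: 1, vals: [{t: 4, y: 5, u: 6}]}
];

const result = applyChanges(a, diffs);
console.log(JSON.stringify(result));
```

Because the sketch mutates its input, clone valueA first if you still need the original.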

Odiff also exposes two functions used internally:

odiff.equal(a,b) - Returns true if the two values are equal, false otherwise. NaN is treated as equal to itself.

odiff.similar(a,b) - Returns true if the two values are similar, false otherwise. "Similar" is defined as having fewer than two shallow inner values different (as long as not 100% of the values are different) or having fewer than 10% of its shallow values different. NaN is treated as equal to itself.
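
To illustrate what "NaN is treated as equal to itself" means in practice, here is a rough sketch of a NaN-aware comparison. It is not odiff's source: `deepEqual` is a hypothetical stand-in, and it assumes a deep structural comparison over plain Objects, Arrays, Dates, and primitives.

```javascript
// Hypothetical stand-in for the behaviour described above; not odiff's code.
function deepEqual(a, b) {
  // NaN !== NaN in JavaScript, so treat two NaNs as equal explicitly.
  if (Number.isNaN(a) && Number.isNaN(b)) return true;
  if (a === b) return true;

  // Anything that is not a non-null object has already been compared above.
  if (a === null || b === null) return false;
  if (typeof a !== 'object' || typeof b !== 'object') return false;

  // Dates compare by timestamp.
  if (a instanceof Date || b instanceof Date) {
    return a instanceof Date && b instanceof Date && a.getTime() === b.getTime();
  }

  // An Array and a plain Object are never equal, even when both are empty.
  if (Array.isArray(a) !== Array.isArray(b)) return false;

  // Compare own enumerable keys (array indexes included) recursively.
  const keysA = Object.keys(a);
  const keysB = Object.keys(b);
  if (keysA.length !== keysB.length) return false;
  return keysA.every(
    (key) => Object.prototype.hasOwnProperty.call(b, key) && deepEqual(a[key], b[key])
  );
}

console.log(NaN === NaN);                                  // false
console.log(deepEqual(NaN, NaN));                          // true
console.log(deepEqual({a: [1, NaN]}, {a: [1, NaN]}));      // true
console.log(deepEqual([], {}));                            // false
```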

Algorithm behavior

The differencing algorithm is intended to run well on values that are pretty similar. Some properties of the algorithm:

  • If a value is inserted into or removed from the middle of an array, the algorithm should generate an 'add' change item, rather than a string of 'set' change items
  • If an array element is changed only a little bit, a sequence of change items is generated to change that element. A "little bit" is defined as fewer than two shallow inner values being changed or fewer than 10% of its shallow values being changed.
  • If an array element is changed a lot (defined as the opposite of "a little bit" above), a single change item will be generated to reset that complex value. This is intended as a trade off between number of changes and "size" of each change.
  • The change items are ordered such that if you apply the changes as written to valueA in order, you will get valueB. It does this by reversing the order in which array changes are listed.
  • The 3 types of values the algorithm recognizes are Objects, Arrays, and atomic primitives. Only Objects and Arrays are recursively analyzed.
  • Strings and numbers are treated as atomic primitives - if a single character of a string changes, the whole new string is written to the val member of the change item.
  • NaN is treated as equal to itself for the purposes of this difference.
  • Objects with circular references aren't supported (yet)
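
The point about reversed ordering can be seen with plain Array splice calls. This is a small sketch rather than odiff itself: `applyAdds` is a hypothetical helper, and the two insertions are written by hand with indexes that refer to positions in the original array ['b'].

```javascript
// Two insertions that should turn ['b'] into ['a', 'b', 'c', 'd'].
// Both indexes are positions in the ORIGINAL array.
const adds = [
  {index: 0, vals: ['a']},
  {index: 1, vals: ['c', 'd']}
];

// Hypothetical helper: apply insertions in the order given, on a copy.
function applyAdds(original, changes) {
  const copy = original.slice();
  for (const {index, vals} of changes) {
    copy.splice(index, 0, ...vals);
  }
  return copy;
}

// Lowest index first: the first insert shifts 'b', so index 1 is now stale.
const forward = applyAdds(['b'], adds);
console.log(forward);   // [ 'a', 'c', 'd', 'b' ]

// Highest index first: earlier positions are untouched when each change runs.
const reversed = applyAdds(['b'], adds.slice().reverse());
console.log(reversed);  // [ 'a', 'b', 'c', 'd' ]
```

Listing the highest-index change first means no change invalidates the index of a change that comes after it.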

Design Decisions

Why not use JSONPatch format?

  • The JSON Pointer format uses weird escape codes, I assume because doing that saves a couple bytes here and there
  • The remove item can't specify a count, so if you remove a big sequence from an array, you get a ton of little removal events (strange since this totally negates the bytes saved by JSON Pointer)
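
To make the comparison concrete, here is a sketch that converts one "rm" change item into the equivalent JSON Patch operations. `rmToJsonPatch` and `escapeSegment` are hypothetical helpers, not part of odiff; the escaping follows JSON Pointer (RFC 6901), where "~" becomes "~0" and "/" becomes "~1".

```javascript
// Escape one path segment for JSON Pointer: '~' first, then '/'.
function escapeSegment(segment) {
  return String(segment).replace(/~/g, '~0').replace(/\//g, '~1');
}

// Hypothetical converter: one odiff-style 'rm' record in, JSON Patch ops out.
function rmToJsonPatch(change) {
  const base = change.path.map((segment) => '/' + escapeSegment(segment)).join('');
  const ops = [];
  for (let i = 0; i < change.vals.length; i++) {
    // Each removal shifts later elements down, so the same index is reused.
    ops.push({op: 'remove', path: base + '/' + change.index});
  }
  return ops;
}

// A hand-written change item that removes three elements starting at index 1.
const rm = {type: 'rm', path: ['a/b', 'list'], index: 1, vals: ['b', 'c', 'd']};
const ops = rmToJsonPatch(rm);
console.log(ops);
// [ { op: 'remove', path: '/a~1b/list/1' },
//   { op: 'remove', path: '/a~1b/list/1' },
//   { op: 'remove', path: '/a~1b/list/1' } ]
```

One change item becomes three patch operations, and the "/" inside the key "a/b" has to be escaped as "~1".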

Todo

  • Use object-traverse to refactor - mostly because it supports circular reference detection

How to Contribute!

Anything helps:

  • Creating issues (aka tickets/bugs/etc). Please feel free to use issues to report bugs, request features, and discuss changes
  • Updating the documentation: ie this readme file. Be bold! Help create amazing documentation!
  • Submitting pull requests.

How to submit pull requests:

  1. Please create an issue and get my input before spending too much time creating a feature. Work with me to ensure your feature or addition is optimal and fits with the purpose of the project.
  2. Fork the repository
  3. Clone your forked repo onto your machine and run npm install at its root
  4. If you're going to work on multiple separate things, it's best to create a separate branch for each of them
  5. Edit!
  6. If it's a code change, please add to the unit tests (at test/odiffTest.js) to verify that your change works
  7. When you're done, run the unit tests and ensure they all pass
  8. Commit and push your changes
  9. Submit a pull request: https://help.github.com/articles/creating-a-pull-request

Change Log

  • 1.5.0 - Adding the ability to detect empty object-type changes (between Array and Object)
  • 1.4.4 - Fixing a bug in the similar function that would cause unoptimized diffs.
  • 1.4.3 - Fixing a bug that caused an incorrect removal index in a particular case.
  • 1.4.1 - Correcting typescript type file.
  • 1.4.0 - Modified the way typescript types are exported.
  • 1.3.0 - Added 'vals' to the 'rm' diff type. Also deprecated 'num'.
  • 1.2.0 - Adding support for typescript (typescript definition).
  • 1.1.0 - Adding support for Date object comparisons.
  • 1.0.0 - Fixing bug in rm where the index was previously the last index removed and changed to the index being the first item removed.
  • 0.1.0 - Adding "unset" diff type (so undefined and unset keys can be distinguished).
  • 0.0.2 - fixing bug related to isNaN being garbage.
  • 0.0.1 - first commit!

License

Released under the MIT license: http://opensource.org/licenses/MIT

odiff's Issues

Issue with similar function for arrays

Hello,

I believe there may be a bug with the similar function for comparing two arrays. However, I could just be reading the code wrong. See the following code snippet that I have copied below: https://github.com/Tixit/odiff/blob/master/odiff.js#L193-L203.

var tenPercent = a.length/10
var notEqual = Math.abs(a.length-b.length) // initialize with the length difference
for(var n=0; n<a.length; n++) {
    if(equal(a[n],b[n])) {
        if(notEqual >= 2 && notEqual > tenPercent || notEqual === a.length) {
            return false
        }

        notEqual++
    }
}

The variable that is incremented is named notEqual, and yet it is used to count the number of equal entries. Should this be changed to check for != instead?

var tenPercent = a.length/10
var notEqual = Math.abs(a.length-b.length) // initialize with the length difference
for(var n=0; n<a.length; n++) {
    if(!equal(a[n],b[n])) {
        if(notEqual >= 2 && notEqual > tenPercent || notEqual === a.length) {
            return false
        }

        notEqual++
    }
}

Thanks,
~ Paddy L.

Help - how do I actually use the diff between two objects

Hi,

Maybe a stupid question, but I would like to know if this is something that odiff can do ...
I have two objects a and b, and I want to get a resulting third object that contains only the contents that differ between a and b (or, in other words, the variables that are not in both a and b).

My objects can be rather complicated with nested json-sourced data, but here's a simple example:

let a = {x: 1, y: 2, z: 3};
let b = {y: 2};
diff(a, b)
// {x: 1, z: 3}

Is this something that can be done with odiff? If so, how do I call it?

If not, do you have any suggestions for other methods or libraries for this?

Thanks!!

Filter diff operation by path

I came to a point in a project I am developing where I'd need odiff to "ignore" an object entirely. ie:

// Original
{
  normal: { a: 1, b: 2 },
  ignore: { a: 1, b: 2 }
}

// New
{
  normal: { a: 1, b: 999 },
  ignore: { a: 1, b: 999 }
}

The odiff of those might look like:

[
  { type: 'SET', path: ['normal', 'b'], val: 999 },
  { type: 'SET', path: ['ignore', 'b'], val: 999 },
]

While I'd need to "ignore" the ignore path to produce something like this:

[
  { type: 'SET', path: ['normal', 'b'], val: 999 },
  { type: 'SET', path: ['ignore'], val:  { a: 1, b: 999 } },
]

I believe that deep-diff has a prefilter option to do that.
Is there any way to do that with odiff as it is today? Would it make sense instead to attempt a PR to add the "prefilter" thingy?

Thanks!

Adding Removed Values

I'm not a fan of the structure of returns being different depending on what happened (add/removes/updates/etc). I've been using deep-diff for the last few years and the format of outputs makes it easy/predictable to handle the results.

When an object is removed, you do not list the value of what was removed. I am writing an automatic patchnotes generator based on changes in data. It describes in natural English what changes are occurring to the data. Reverse looking up paths with the index and path is a nuisance considering you have access to the values when performing your lookups. From a technical debt standpoint it would make more sense to have the values in "rm" modes included like you do with "add".

Any chance of adding values to the "rm" type?

Index for adding multiple things in an array before and after existing.

The index for items that are added after an item already in the array is wrong.

example:

odiff({ a: ['b'] }, { a: ['a', 'b', 'c', 'd'] })

result:

[ { "type": "add", "path": [ "a" ], "index": 1, "vals": [ "c", "d" ] }, { "type": "add", "path": [ "a" ], "index": 0, "vals": [ "a" ] } ]

the index of "vals": [ "c", "d" ] should be 2.

`index` for `rm` appears to show _last_ index removed, not the first

I found this while trying to get the previous values that were set to show an actual +/- diff.

const originalArray = ['a', 'b', 'c', 'd', 'e'];
const newArray = ['a', 'e'];
const diff = odiff(originalArray, newArray);

Will return:

[
  {
    "type": "rm",
    "path": [],
    "index": 3,
    "num": 3
  }
]

num is correct, 3 elements were removed. index is the last index that was removed; d in this case. I would expect index to be 1, the first element removed.

If index were 1, then it would be trivial to get the original elements removed:

// odiff returns an array of change records, so read the first one
const { index, num } = diff[0];
const elementsRemoved = originalArray.slice(index, index + num);

Package is missing license type

There is no license line in your package.json. Your package fails an acceptable license test in my build pipeline because of this.

rm bug report

odiff version: 1.4.2

const a = [
  {},
  {},
  {
    "b": null,
    "i": 2587.884,
    "j": 89.2944,
    "k": 1254.2556,
    "l": 1880.97
  },
  {
    "b": null,
    "i": 386.11559999999986,
    "j": 1102.9512,
    "k": 1252.6019999999999,
    "l": 814.398
  },
  {
    "b": null,
    "i": 386.11559999999986,
    "j": 150.4776,
    "k": 1252.6019999999999,
    "l": 814.398
  },
  {}
]
const b = [
  {},
  {},
  {
    "b": "603c8f2d-db93-45ec-bea3-f6cd2f6be5b7",
    "i": 2587.825865625,
    "j": 89.2944,
    "k": 1254.37186875,
    "l": 1880.97
  },
  {
    "b": "3c1a45a6-b776-4a54-852a-6a4103105ddc",
    "i": 386.11559999999986,
    "j": 570.9921868791004,
    "k": 1252.6019999999999,
    "l": 1878.3160262417994
  },
  {
    "b": "5f3f02f7-e0e4-4a01-8126-73e33845f6cd",
    "i": 386.11559999999986,
    "j": -381.4814131208996,
    "k": 1252.6019999999999,
    "l": 1878.3160262417994
  },
  {}
]
console.log(odiff(a, b))

The result is

[{
  "type": "rm",
  "path": [],
  "index": 0,
  "num": 3,
  "vals": [{}, {}, {
    "b": null,
    "i": 2587.884,
    "j": 89.2944,
    "k": 1254.2556,
    "l": 1880.97
  }]
}, {
  "type": "add",
  "path": [],
  "index": 2,
  "vals": [{
    "b": "603c8f2d-db93-45ec-bea3-f6cd2f6be5b7",
    "i": 2587.825865625,
    "j": 89.2944,
    "k": 1254.37186875,
    "l": 1880.97
  }, {
    "b": "3c1a45a6-b776-4a54-852a-6a4103105ddc",
    "i": 386.11559999999986,
    "j": 570.9921868791004,
    "k": 1252.6019999999999,
    "l": 1878.3160262417994
  }, {
    "b": "5f3f02f7-e0e4-4a01-8126-73e33845f6cd",
    "i": 386.11559999999986,
    "j": -381.4814131208996,
    "k": 1252.6019999999999,
    "l": 1878.3160262417994
  }]
}]

The index of rm object is wrong.
