azaslavsky / domjson
Convert DOM trees into compact JSON objects, and vice versa, as fast as possible.
Home Page: http://azaslavsky.github.io/domJSON/
License: Other
Hi,
I was wondering how I can get the full DOM for a page. I tried passing in the html element and the body element, but it crashed on both.
Thanks
It's confusing, since we also have an option type called "filterList." Also, opts.domProperties will no longer accept booleans.
Useful data types are agent, which will track the user agent that produced the object, and versions, which stores the version of domJSON that created the object.
domProperties are parsed when using an inclusive filterList (or boolean false)
createNode function to concatenate children, then append them using innerHTML?
Two demos: one to simply show the JSON output, and another to demonstrate batch updates via Web Workers
e.g.:
const dom = await JSDOM.fromURL('http://...')
const parsed = toJSON(dom.window.document) // domjson.toJSON()
Error:
TypeError: Cannot read property 'href' of undefined
at D:\git\food-search\node_modules\domjson\dist\domJSON.js:19:28
at domJSON (D:\path\to\my\project\node_modules\domjson\dist\domJSON.js:7:23)
at Object.<anonymous> (D:\path\to\my\project\node_modules\domjson\dist\domJSON.js:15:3)
at Module._compile (module.js:641:30)
at Module._extensions..js (module.js:652:10)
at Object.require.extensions.(anonymous function) [as .js]
Currently only works well in Chrome and Canary. Lots of failed tests on Firefox (!!), and god only knows about IE10, let alone older versions.
Some stuff to include in the docs:
Under usage there is:
var someDOMElement = document.getElementById('sampleId'); var jsonOutput = domJSON.toJSON(myDiv);
First we select an element by ID, then myDiv appears out of nowhere... I am confused. Can you be more specific or descriptive? Beginner here.
Also expand performance
Both external and internal dimensions, if they are available
Basically, it sets the serial property ("innerHTML", in this case), which writes HTML, and then does so again with the modified child nodes. The best approach is probably to look at the metadata and avoid all serial information when creating nodes.
Says it all - write the unit tests.
Great tool! However, I was wondering:
domJSON defines the tags <link> and <script> as 'disallowed', and states the following:
A list of disallowed HTMLElement tags - there is no flexibility here, these cannot be processed by domJSON for security reasons!
What's the reason behind this? What would be the security implications of just serializing them as well?
opts.nodeTypes should replace opts.htmlOnly, which is too narrow of a spec. opts.tagNames should be a FilterList type field to specify which HTML tags to include/exclude, and should default to true.
You have a typo in the UMD script here (dmoJSON should be domJSON): https://github.com/azaslavsky/domJSON/blob/master/src/domJSON.js#L24
Also, in CJS this is not window, so the script fails here (win.location is undefined): https://github.com/azaslavsky/domJSON/blob/master/src/domJSON.js#L55
I can easily submit a PR for these, but I think it might be better if you wrote the code as CJS and then ran it through browserify to get UMD. If you are interested, I can submit a PR.
Currently, properties that accept a list of fields can accept values of true for all fields, false for none, an array listing which fields to save (aka boolean intersect; ex: ['a', 'b', 'c',...]), or an array led by a boolean false that specifies which fields NOT to save (aka boolean difference; ex: [false, 'a', 'b', 'c',...]).
I would like these fields to accept objects in the following format:
{
intersect: true, //defaults to true
fields: ['a', 'b', 'c'] //if empty, defaults to false
}
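One way to support both the current values and the proposed object format is to normalize everything into a single shape first. The sketch below is only an illustration: `normalizeFilterList` and `isFieldAllowed` are hypothetical names, not part of domJSON, and it assumes that an object with an empty `fields` list and the default `intersect: true` means "save nothing".

```javascript
// Hypothetical helper (not domJSON API): turn any accepted value into
// { intersect: boolean, fields: string[] }.
function normalizeFilterList(value) {
  // true = every field: exclude nothing
  if (value === true) return { intersect: false, fields: [] };
  // false (or missing) = no fields: include nothing
  if (value === false || value == null) return { intersect: true, fields: [] };
  if (Array.isArray(value)) {
    // A leading boolean false marks a "difference" list
    return value[0] === false
      ? { intersect: false, fields: value.slice(1) }
      : { intersect: true, fields: value.slice() };
  }
  // Proposed object format: intersect defaults to true, fields to an empty list
  return {
    intersect: value.intersect !== false,
    fields: Array.isArray(value.fields) ? value.fields.slice() : []
  };
}

// Hypothetical helper: decide whether one field should be saved.
function isFieldAllowed(filter, name) {
  var listed = filter.fields.indexOf(name) !== -1;
  return filter.intersect ? listed : !listed;
}
```

With this shape, the rest of the code only ever has to check one pair of properties, whichever input style the caller used.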
I work on a code editor and have the following question:
domJSON accepts an HTML element as an argument for domJSON.toJSON(element)
Can I pass a pure HTML string instead? domJSON.toJSON('<p>some code</p>')
I tried that and it logs the error "length is undefined", which may be because it expects an HTML element, not a string.
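A possible workaround, sketched here under the assumption that the code runs in a browser (or another environment with a DOM), is to parse the string into an element first and pass that element to domJSON.toJSON. `htmlToElement` is a hypothetical helper, not part of domJSON; it relies only on the standard `<template>` element.

```javascript
// Hypothetical helper (not domJSON API): parse an HTML string and return
// its first element, using the standard <template> element.
function htmlToElement(html, doc) {
  var template = doc.createElement('template');
  // Trim so that leading whitespace does not become the first child
  template.innerHTML = html.trim();
  return template.content.firstElementChild;
}

// Intended browser usage (assumes a global `document` and `domJSON`):
//   var element = htmlToElement('<p>some code</p>', document);
//   var json = domJSON.toJSON(element);
```

Note that this returns only the first top-level element, so a string with several sibling elements would need a wrapper element around it.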
First of all, this is a wonderful library. My only request is to add the XPath of each HTML element to the JSON. This is not an issue, just an enhancement request.
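Until something like this exists in the library, an XPath can be computed from the element itself before serializing. The function below is a minimal sketch, not domJSON code: `getXPath` is a hypothetical name, and it builds a simple positional path (tag name plus index among same-tag siblings) from standard DOM properties.

```javascript
// Hypothetical helper (not domJSON API): build a positional XPath such as
// /html[1]/body[1]/p[2] by walking up through parentNode.
function getXPath(el) {
  var segments = [];
  // nodeType 1 is an element; the walk stops at the document node
  for (var node = el; node && node.nodeType === 1; node = node.parentNode) {
    var index = 1;
    // Count earlier siblings that share this tag name
    for (var sib = node.previousElementSibling; sib; sib = sib.previousElementSibling) {
      if (sib.tagName === node.tagName) index++;
    }
    segments.unshift(node.tagName.toLowerCase() + '[' + index + ']');
  }
  return '/' + segments.join('/');
}
```

The resulting string could then be stored next to the serialized node by the calling code.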
When using .toJSON with computedStyle: true, computed styles are copied to the [node].style object. However, when using .toDOM, only the [node].attributes.styles text is added to the element.
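As a stopgap, the saved style object could be re-applied by hand after rebuilding the element. This is only a sketch: `applyStyleObject` is a hypothetical helper, and it assumes the saved style is a flat object of property names and values, which may not match what domJSON actually stores.

```javascript
// Hypothetical helper (not domJSON API): copy a flat { property: value }
// object onto an element's inline style.
function applyStyleObject(el, styleObj) {
  Object.keys(styleObj || {}).forEach(function (prop) {
    // Assumption: prop is a name the style object accepts, e.g. "backgroundColor"
    el.style[prop] = String(styleObj[prop]);
  });
  return el;
}
```

Writing every computed property back as an inline style can be slow and verbose, so filtering the object down to the properties that matter first is probably worthwhile.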
It would be nice to have an option which returns only a minimal set of the DOM node, such as
https://github.com/azaslavsky/domJSON
or even better:
For example, JsonML outputs dom to json like this:
var jsonMl = ["table",{"class":"MyTable","style":"background-color:yellow"},
["tbody",
["tr",
["td",{"class":"MyTD","style":"border:1px solid black"},
"#550758"],
["td",{"class":"MyTD","style":"background-color:red"},
"Example text here"]
],
["tr",
["td",{"class":"MyTD","style":"border:1px solid black"},
"#993101"],
["td",{"class":"MyTD","style":"background-color:green"},
"127624015"]
],
["tr",
["td",{"class":"MyTD","style":"border:1px solid black"},
"#E33D87"],
["td",{"class":"MyTD","style":"background-color:blue"},
"\u00A0",
["span",{"id":"mySpan","style":"background-color:maroon;color:#fff !important"},"\u00A9"],
"\u00A0"
]
]
]
];
Your compiler returns the nodes within a group such as "childNodes" and the like, which is noisy.
Please provide an option to turn the DOM node into what JsonML produces, but with performance in mind :)
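In the meantime, a JsonML-style array could be derived from the existing output in a post-processing step. The sketch below is not domJSON code: `toJsonML` is a hypothetical function, and the input shape (`nodeType`, `tagName`, `attributes` as a plain object, `childNodes`, `nodeValue`) is an assumption about the serialized nodes, so it may need adjusting.

```javascript
// Hypothetical converter (not domJSON API): turn a serialized node of the
// assumed shape into a JsonML array [tagName, attributes?, ...children].
function toJsonML(node) {
  if (node.nodeType === 3) return node.nodeValue; // text node becomes a string
  if (node.nodeType !== 1) return null;           // skip comments and other types
  var out = [node.tagName.toLowerCase()];
  var attrs = node.attributes || {};
  // JsonML omits the attributes object when there are no attributes
  if (Object.keys(attrs).length > 0) out.push(attrs);
  (node.childNodes || []).forEach(function (child) {
    var converted = toJsonML(child);
    if (converted !== null) out.push(converted);
  });
  return out;
}
```

Because it runs over plain objects, this conversion does not need a DOM and could also run inside a Web Worker.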
Hi! Great library.
Not sure if this is a bug or a browser issue, but I think you might be interested - I seem to be having issues JSONifying a few types of input elements. Here is a JSFiddle demonstrating what can happen.
I'm using Chrome 43.0.2357.124 (64-bit).
It's as if checkbox (and radio, file, etc.) will raise an exception if you so much as access selectionStart, like so:
'selectionStart' in node // true
node['selectionStart'] // DOMException
... which is more or less what I think is happening here.
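Since the `in` check succeeds but the read itself throws, one defensive option is to wrap each property read in a try/catch. The helper below is a minimal sketch with a hypothetical name (`safeRead`); it is not how domJSON currently reads properties.

```javascript
// Hypothetical helper (not domJSON API): read a property, returning a
// fallback if the getter throws (as selectionStart does on a checkbox).
function safeRead(node, key, fallback) {
  try {
    return node[key];
  } catch (err) {
    return fallback;
  }
}
```

A copy loop that uses this could skip any property for which the fallback comes back, instead of aborting the whole serialization.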
Two demos: one for DOM snapshots, and one for WebWorkers optimization
This took me a second to realize while using the plugin.
https://developer.mozilla.org/en-US/docs/Web/API/Node/replaceChild
Instead of
var DOMDocumentFragment = domJSON.toDOM(jsonOutput); someDOMElement.parentNode.replaceChild(someDOMElement, DOMDocumentFragment);
it should be
var DOMDocumentFragment = domJSON.toDOM(jsonOutput); someDOMElement.parentNode.replaceChild(DOMDocumentFragment, someDOMElement);
at least in Chrome/60.0.3112.113 in the console
So that developers have a solid idea of how each option affects performance.
Hi, in IE 11 within the copyJSON function IE stops partway into the for ... in and reports InvalidStateError. Specifically when trying to access node[n] here:
if (opts.cull) {
if (node[n] || node[n] === 0 || node[n] === false) {
copy[n] = node[n];
}
} else {
copy[n] = node[n];
}
I got this error:
>npx jest
FAIL src/reviewer.test.ts (5.44s)
× renders test site (23ms)
● renders test site
TypeError: Cannot read property 'href' of undefined
7 |
> 8 | const domJSON = require('domjson');
| ^
at node_modules/domjson/dist/domJSON.js:19:28
at node_modules/domjson/dist/domJSON.js:7:23
at Object.<anonymous> (node_modules/domjson/dist/domJSON.js:15:3)
at Object.<anonymous> (src/reviewer.test.ts:8:21)
If I call require('jsdom-global')() as suggested in #26 (Usage w/ Jsdom: TypeError: Cannot read property 'href' of undefined), jest breaks on my teardown with this:
FAIL src/reviewer.test.ts
● Test suite failed to run
TypeError: Illegal invocation
at removeEventListener (node_modules/jsdom/lib/jsdom/living/generated/EventTarget.js:131:15)
Can this library do web session recording like: https://github.com/rrweb-io/rrweb ?
Currently, only the package.json and bower.json versions are bumped during the gulp bump task - make sure to do it for the source and dist as well!
Absolute paths are a little hard to do right now. The new API will probably split absolute paths in styles and attributes into two separate option properties. Absolutes for attributes will be handled with a FieldList type input, which will specify the attributes to perform the path check on.
I tried to use it in a Node.js environment, together with 'puppeteer'.
Here is my demo:
const puppeteer = require('puppeteer');
const domJSON = require('domjson');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto('http://lishi.tianqi.com/beijing/201612.html');
  const ulsObjArr = await page.evaluate(() => {
    const uls = document.querySelectorAll('div.tqtongji2 ul');
    return Array.prototype.map.call(uls, function(){ return domJSON.toJSON(ul) });
  });
  console.log('====================================');
  console.log(ulsObjArr);
  console.log('====================================');
  await browser.close();
})();
and it fails with an error:
win.location.href is not defined
Did I do something wrong? How can I solve it?
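One thing worth trying is to load the library inside the page rather than with require() in Node, because the callback given to page.evaluate runs in the browser and cannot see Node variables. The sketch below assumes that the dist build exposes a global `domJSON` when loaded as a plain script, which is not confirmed here; `serializeInPage` is a hypothetical helper name, while `page.addScriptTag` and `page.evaluate` are standard puppeteer calls.

```javascript
// Hypothetical helper: inject a script file into the page, then serialize
// every element matching `selector` inside the browser context.
async function serializeInPage(page, scriptPath, selector) {
  // Load the library into the page itself instead of into Node
  await page.addScriptTag({ path: scriptPath });
  // This function runs in the browser; `sel` is passed in as an argument
  return page.evaluate(function (sel) {
    var nodes = document.querySelectorAll(sel);
    return Array.prototype.map.call(nodes, function (el) {
      // Assumption: the injected script defined a global `domJSON`
      return domJSON.toJSON(el);
    });
  }, selector);
}

// Possible usage (paths are assumptions):
//   const scriptPath = require.resolve('domjson'); // resolves the path without running the file
//   const result = await serializeInPage(page, scriptPath, 'div.tqtongji2 ul');
```

Each element is passed to the map callback as a named parameter, so the value handed to toJSON is always defined.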
Make the library compatible with AMD, CommonJS/Browserify, and standalone usage by adding a factory.
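A common pattern for this is a UMD-style wrapper that passes the global object into a factory, so the library never assumes that `this` is `window`. The following is a generic sketch rather than domJSON's actual source: `createLib` and `exampleLib` are placeholder names, and the guard around `location` is one possible way to avoid the "Cannot read property 'href' of undefined" failure under CommonJS.

```javascript
// Placeholder factory: receives the global object explicitly.
function createLib(win) {
  // Guard: under Node/CommonJS there may be no location object at all
  var baseHref = win && win.location ? win.location.href : null;
  return {
    baseHref: baseHref,
    hasDOM: !!(win && win.document)
  };
}

// UMD-style registration: AMD, then CommonJS, then a browser global.
(function (root, factory) {
  if (typeof define === 'function' && define.amd) {
    define([], function () { return factory(root); });
  } else if (typeof module === 'object' && module.exports) {
    module.exports = factory(root);
  } else {
    root.exampleLib = factory(root);
  }
})(typeof globalThis !== 'undefined' ? globalThis : this, createLib);
```

Features that genuinely need a DOM can then check `hasDOM` and fail with a clear message instead of crashing at load time.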