iabudiab / htmlkit
An Objective-C framework for your everyday HTML needs.
License: MIT License
When I parse an XHTML document, I discovered that the tags in the <head> are not being closed:
<head>
<title>page</title>
<meta charset="utf-8"></meta>
<link href="css/template.css" rel="stylesheet" type="text/css"></link>
</head>
the innerHTML returns:
<head>
<title>page</title>
<meta charset="utf-8">
<link href="css/template.css" rel="stylesheet" type="text/css">
</head>
and for example if I query [document querySelector:@"link"].outerHTML I get
<link href="css/template.css" rel="stylesheet" type="text/css">
[document querySelector:@"meta"].innerHTML
<object returned empty description>
[document querySelector:@"meta"].outerHTML
<meta charset="utf-8">
Title is correct, though.
My XHTML is not valid anymore and the Webview fails parsing it. Is there a way to avoid this loss of information?
thanks!
After deleting this statement, I can use the command "import <HTMLKit/HTMLKit.h>". Or is there another way to use this kit?
HTMLNode.m, line 128
- (HTMLElement *)nextSiblingElement
{
HTMLNode *node = self.previousSibling;
while (node && node.nodeType != HTMLNodeElement) {
node = node.nextSibling;
}
return node.asElement;
}
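Because the iteration starts at the previous sibling, the "next sibling element" ends up being the element itself (or the previous sibling, when that one is an element). Possibly a typo: the starting point should be self.nextSibling.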
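The intended traversal can be sketched in isolation. This is a minimal illustration, not HTMLKit's code: DemoNode, DemoNextSiblingElement and DemoLinkSiblings are hypothetical names, only Foundation is used, and ARC is assumed.

```objc
#import <Foundation/Foundation.h>

// Hypothetical minimal node type standing in for HTMLNode.
@interface DemoNode : NSObject
@property (nonatomic, assign) BOOL isElement;
@property (nonatomic, weak) DemoNode *previousSibling;
@property (nonatomic, strong) DemoNode *nextSibling;
@end

@implementation DemoNode
@end

// Walks forward, starting at the sibling after `node`, until an element is found.
static DemoNode *DemoNextSiblingElement(DemoNode *node)
{
    DemoNode *current = node.nextSibling; // start at the next sibling, not the previous one
    while (current && !current.isElement) {
        current = current.nextSibling;
    }
    return current;
}

// Links the given nodes into one sibling chain, in array order.
static void DemoLinkSiblings(NSArray<DemoNode *> *nodes)
{
    for (NSUInteger i = 0; i + 1 < nodes.count; i++) {
        nodes[i].nextSibling = nodes[i + 1];
        nodes[i + 1].previousSibling = nodes[i];
    }
}
```

With an element, a text node and a second element linked in that order, the function skips the text node and returns the second element; for the last element it returns nil.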
If you load a lot of html into HTMLParser, it uses upwards of 500-600MB for parsing at peak. More autorelease pools need to be added. I found a couple of places that are the worst offenders:
- (void)HTMLInsertionModeInBody:(HTMLToken *)token
- (NSString *)innerHTML
HTMLTokenizer nextObject
Also, for a high number of nodes, always creating NSMutableOrderedSet is not ideal - that uses about 40-50MB.
Let's say you have the following HTML:
<html>
<body>
<div> &lt;example&gt; </div>
</body>
</html>
Then on:
HTMLDocument *doc = [HTMLDocument documentWithString:html];
and printing out the HTML via doc.rootElement.outerHTML you get partially decoded entities:
<html><head></head><body> <div>&lt;example></div> </body></html>
where &lt; is left as is, and &gt; is decoded.
Not sure what the correct thing to do here is.
I've created a small testing app with HTMLKit inside it. I want to present a UITableView with all the HTML in it. I load the HTML as follows:
NSString *htmlString = @"<div><div><p>Test!</p></div><p>Test!</p><h1>HTMLKit</h1><p>Hello there!</p></div>";
// Via parser
HTMLParser *parser = [[HTMLParser alloc] initWithString:htmlString];
HTMLDocument *document = [parser parseDocument];
HTMLElement *head = document.body;
[self.items addObject:[self addElement:head]];
HTMLElement *body = document.body;
[self.items addObject:[self addElement:body]];
And these are the addElement related functions:
- (Entry *)addElement:(HTMLElement *)element {
Entry *entry = [[Entry alloc] init];
if (element.childElementsCount > 0) {
entry.tags = [self enumerate:element];
}
entry.tag = element.outerHTML;
return entry;
}
- (NSArray *)enumerate:(HTMLElement *)element {
NSMutableArray *items = [@[] mutableCopy];
for (int i = 0; i < element.childElementsCount; i++) {
HTMLElement *child = [element childElementAtIndex:i];
[items addObject:[self addElement:child]];
}
return items;
}
This has the screenshot below as a result:
Ideally, I'd like for <body> to not contain the <div><div>..etc. What's the best way to achieve this with HTMLKit?
Hi, I have an app using HTMLKit. When I submitted it to the App Store for review, the App Store reported that "querySelectAll: " references a non-public API. Is that really true? If so, is it possible to fix it?
HTMLKit erroneous HTMLElement attribute value serialization
This is basically the same bug as in #16 but with HTMLElement attribute values:
Serializing the following HTML:
<body key="& testing 0x00A0"></body>
would produce:
<body key="& testing 0x00A0"></body>
Whereas it should be:
<body key="&amp; testing &nbsp;"></body>
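The expected escaping can be sketched with plain Foundation string replacement. DemoEscapeAttributeValue is a hypothetical helper, not part of HTMLKit, and it only covers the ampersand, non-breaking space and double quote replacements used when serializing attribute values.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: escapes a string for use inside a double-quoted attribute value.
static NSString *DemoEscapeAttributeValue(NSString *value)
{
    NSMutableString *escaped = [value mutableCopy];
    // "&" is replaced first so the entities inserted below are not escaped a second time.
    [escaped replaceOccurrencesOfString:@"&"
                             withString:@"&amp;"
                                options:NSLiteralSearch
                                  range:NSMakeRange(0, escaped.length)];
    // U+00A0 NO-BREAK SPACE becomes a named entity.
    [escaped replaceOccurrencesOfString:@"\u00A0"
                             withString:@"&nbsp;"
                                options:NSLiteralSearch
                                  range:NSMakeRange(0, escaped.length)];
    // A double quote would otherwise terminate the attribute value.
    [escaped replaceOccurrencesOfString:@"\""
                             withString:@"&quot;"
                                options:NSLiteralSearch
                                  range:NSMakeRange(0, escaped.length)];
    return escaped;
}
```

The order of the replacements matters: escaping "&" last would turn the freshly inserted "&nbsp;" into "&amp;nbsp;".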
When I write the code as follows, it works:
CSSSelector * gtSelector(NSInteger index)
{
NSString *name = [NSString stringWithFormat:@":gt(%ld)", (long)index];
return namedBlockSelector(name, ^BOOL(HTMLElement * _Nonnull element) {
NSUInteger elementIndex = [element.parentElement indexOfChildNode:element];
if (index > 0) {
return elementIndex > index;
} else {
return (elementIndex > index && elementIndex < element.parentElement.childNodesCount);
}
});
}
I installed your library as you suggested using CocoaPods, but after I tried to include it in my project via my bridging header, I keep getting a bunch of errors saying that *.h files like HTMLElement.h etc. do not exist, even though I can see them in the pod folder. Any idea why?
I'm using Xcode 8 and Swift 2.3.
UPDATE
What I had to do to make it work (for now) was to mark each of these "missing" files as "public" in CocoaPods.
Hello,
I have this error when building project with this library.
https://imgur.com/a/X9nOn
Could you help me resolve this?
Thanks,
...
This looks like a powerful library to navigate around HTML nodes, however what would be the simplest method of obtaining cleaned up 'plain text' from HTML input? I'd like it to preserve any 'invalid' non-html tags such as John Do <[email protected]> and not try to parse it as NSAttributedString's initWithHTML does.
Would be great if we could get support for https://www.w3schools.com/jsref/prop_node_innertext.asp
Example html:
nbsp.zip
Text Content shows nbsp; nbsp; nbsp; nbsp; nbsp; nbsp; nbsp; nbsp; nbsp; nbsp; nbsp; nbsp; nbsp; nbsp; nbsp; nbsp; nbsp; nbsp; nbsp; nbsp; nbsp; nbsp; nbsp; nbsp; nbsp; nbsp; nbsp; nbsp; nbsp; nbsp; nbsp; nbsp; nbsp; nbsp; nbsp; nbsp; nbsp; nbsp; nbsp; nbsp; nbsp; nbsp; nbsp; nbsp; nbsp; nbsp; nbsp; nbsp; nbsp; nbsp; nbsp; nbsp;
on a hidden div.
Inner Text is much cleaner:
Introducing you to:
JLL IDEAs
PropTech Challenge
Hi,
JLL IDEAs - A PropTech Innovation Challenge in partnership with AGNIi and Startup India is inviting applications for exciting technologies, ideas, products and tools which have a use case in Real Estate in areas like Sustainability, Smart Buildings, Real Estate Valuation, Occupancy and Space Planning, Transport/Urban mobility, asset management etc.
Top 3 Winners -
- Cash awards up to INR 25 Lacs
Top 10 winners -
- Co-working space for 1 year
- Opportunity to get business orders from JLL, our partners and clients
- Access to mentorship from top Real Estate experts
Last Date: Extended till 30th September 2019
Good luck!
Team - Startup India
Apply Now
You received this email because you subscribed to our list. You can unsubscribe at any time.
When you do a simple doc.body, the document gets retained by the iterator in root and referenceNode, and the iterator gets retained by the document in attachNodeIterator - hence neither gets deallocated.
- (instancetype)initWithNode:(HTMLNode *)node
showOptions:(HTMLNodeFilterShowOptions)showOptions
filter:(id<HTMLNodeFilter>)filter
{
self = [super init];
if (self) {
_root = node; // retains doc
_filter = filter; // retains doc
_whatToShow = showOptions;
_referenceNode = _root;
_pointerBeforeReferenceNode = YES;
[_root.ownerDocument attachNodeIterator:self]; // doc retains iterator
}
return self;
}
A possible fix is to convert the nodeIterator array into a weak objects NSHashTable.
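How a weak-objects NSHashTable breaks such a cycle can be shown with two stand-in classes. This is a sketch only: DemoDocument and DemoIterator are hypothetical and far simpler than the real HTMLDocument and HTMLNodeIterator, and ARC is assumed.

```objc
#import <Foundation/Foundation.h>

@class DemoIterator;

// Hypothetical stand-in for the document: it tracks attached iterators weakly.
@interface DemoDocument : NSObject
@property (nonatomic, strong, readonly) NSHashTable<DemoIterator *> *iterators;
- (void)attachIterator:(DemoIterator *)iterator;
@end

// Hypothetical stand-in for the iterator: it keeps a strong reference to its root.
@interface DemoIterator : NSObject
@property (nonatomic, strong, readonly) DemoDocument *root;
- (instancetype)initWithRoot:(DemoDocument *)root;
@end

@implementation DemoDocument
- (instancetype)init
{
    self = [super init];
    if (self) {
        // Weak storage: the table does not keep attached iterators alive.
        _iterators = [NSHashTable weakObjectsHashTable];
    }
    return self;
}

- (void)attachIterator:(DemoIterator *)iterator
{
    [self.iterators addObject:iterator];
}
@end

@implementation DemoIterator
- (instancetype)initWithRoot:(DemoDocument *)root
{
    self = [super init];
    if (self) {
        _root = root;               // the iterator retains the document
        [root attachIterator:self]; // the document does not retain the iterator
    }
    return self;
}
@end

// Returns YES when the document is deallocated once both locals go out of scope.
static BOOL DemoDocumentIsReleasedAfterUse(void)
{
    __weak DemoDocument *weakDocument = nil;
    @autoreleasepool {
        DemoDocument *document = [DemoDocument new];
        DemoIterator *iterator = [[DemoIterator alloc] initWithRoot:document];
        weakDocument = document;
        (void)iterator;
    }
    return weakDocument == nil;
}
```

If the table were a strong NSMutableArray instead, the document and the iterator would keep each other alive and the check would return NO.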
Example Stacktrace:
Crashed: com.app.imap.download
0 CoreFoundation 0x19d7b39f4 __CFStringChangeSizeMultiple + 244
1 CoreFoundation 0x19d7ae61c __CFStringAppendBytes + 620
2 CoreFoundation 0x19d79fa78 __CFStringAppendFormatCore + 12648
3 CoreFoundation 0x19d7b2224 _CFStringAppendFormatAndArgumentsAux2 + 48
4 CoreFoundation 0x19d6dbcf8 -[__NSCFString appendFormat:] + 100
5 App 0x1029faec4 -[HTMLElement outerHTML] (HTMLElement.m:158)
6 Foundation 0x19e16d7d0 -[NSObject(NSKeyValueCoding) valueForKey:] + 268
7 Foundation 0x19e1c9940 -[NSArray(NSKeyValueCoding) valueForKey:] + 392
8 App 0x102a00848 -[HTMLNode innerHTML] (HTMLNode.m:727)
9 App 0x1029fb16c -[HTMLElement outerHTML] (HTMLElement.m:182)
10 Foundation 0x19e16d7d0 -[NSObject(NSKeyValueCoding) valueForKey:] + 268
...
2548 App 0x102a00848 -[HTMLNode innerHTML] (HTMLNode.m:727)
2549 App 0x1029fb16c -[HTMLElement outerHTML] (HTMLElement.m:182)
2550 Foundation 0x19e16d7d0 -[NSObject(NSKeyValueCoding) valueForKey:] + 268
2551 Foundation 0x19e1c9940 -[NSArray(NSKeyValueCoding) valueForKey:] + 392
2552 App 0x102a00848 -[HTMLNode innerHTML] (HTMLNode.m:727)
2553 App 0x1029fb16c -[HTMLElement outerHTML] (HTMLElement.m:182)
2554 Foundation 0x19e16d7d0 -[NSObject(NSKeyValueCoding) valueForKey:] + 268
2555 Foundation 0x19e1c9940 -[NSArray(NSKeyValueCoding) valueForKey:] + 392
2556 App 0x102a00848 -[HTMLNode innerHTML] (HTMLNode.m:727)
2557 App 0x1029fb16c -[HTMLElement outerHTML] (HTMLElement.m:182)
2558 Foundation 0x19e16d7d0 -[NSObject(NSKeyValueCoding) valueForKey:] + 268
2559 Foundation 0x19e1c9940 -[NSArray(NSKeyValueCoding) valueForKey:] + 392
2560 App 0x102a00848 -[HTMLNode innerHTML] (HTMLNode.m:727)
2561 App 0x1029fb16c -[HTMLElement outerHTML] (HTMLElement.m:182)
2562 Foundation 0x19e16d7d0 -[NSObject(NSKeyValueCoding) valueForKey:] + 268
2563 Foundation 0x19e1c9940 -[NSArray(NSKeyValueCoding) valueForKey:] + 392
2564 App 0x102a00848 -[HTMLNode innerHTML] (HTMLNode.m:727)
2565 App 0x1029fb16c -[HTMLElement outerHTML] (HTMLElement.m:182)
2566 Foundation 0x19e16d7d0 -[NSObject(NSKeyValueCoding) valueForKey:] + 268
2567 Foundation 0x19e1c9940 -[NSArray(NSKeyValueCoding) valueForKey:] + 392
2568 App 0x102a00848 -[HTMLNode innerHTML] (HTMLNode.m:727)
2569 App 0x1029fb16c -[HTMLElement outerHTML] (HTMLElement.m:182)
2570 Foundation 0x19e16d7d0 -[NSObject(NSKeyValueCoding) valueForKey:] + 268
2571 Foundation 0x19e1c9940 -[NSArray(NSKeyValueCoding) valueForKey:] + 392
2572 App 0x102a00848 -[HTMLNode innerHTML] (HTMLNode.m:727)
2573 App 0x1029fb16c -[HTMLElement outerHTML] (HTMLElement.m:182)
2574 Foundation 0x19e16d7d0 -[NSObject(NSKeyValueCoding) valueForKey:] + 268
2575 Foundation 0x19e1c9940 -[NSArray(NSKeyValueCoding) valueForKey:] + 392
2576 App 0x102a00848 -[HTMLNode innerHTML] (HTMLNode.m:727)
2577 App 0x1029fb16c -[HTMLElement outerHTML] (HTMLElement.m:182)
2578 Foundation 0x19e16d7d0 -[NSObject(NSKeyValueCoding) valueForKey:] + 268
2579 Foundation 0x19e1c9940 -[NSArray(NSKeyValueCoding) valueForKey:] + 392
2580 App 0x102a00848 -[HTMLNode innerHTML] (HTMLNode.m:727)
2581 App 0x1029fb16c -[HTMLElement outerHTML] (HTMLElement.m:182)
2582 Foundation 0x19e16d7d0 -[NSObject(NSKeyValueCoding) valueForKey:] + 268
2583 Foundation 0x19e1c9940 -[NSArray(NSKeyValueCoding) valueForKey:] + 392
2584 App 0x102a00848 -[HTMLNode innerHTML] (HTMLNode.m:727)
2585 App 0x1029fb16c -[HTMLElement outerHTML] (HTMLElement.m:182)
2586 Foundation 0x19e16d7d0 -[NSObject(NSKeyValueCoding) valueForKey:] + 268
2587 Foundation 0x19e1c9940 -[NSArray(NSKeyValueCoding) valueForKey:] + 392
2588 App 0x102a00848 -[HTMLNode innerHTML] (HTMLNode.m:727)
2589 App 0x1029fb16c -[HTMLElement outerHTML] (HTMLElement.m:182)
2590 Foundation 0x19e16d7d0 -[NSObject(NSKeyValueCoding) valueForKey:] + 268
2591 Foundation 0x19e1c9940 -[NSArray(NSKeyValueCoding) valueForKey:] + 392
2592 App 0x102a00848 -[HTMLNode innerHTML] (HTMLNode.m:727)
2593 App 0x1029fb16c -[HTMLElement outerHTML] (HTMLElement.m:182)
2594 Foundation 0x19e16d7d0 -[NSObject(NSKeyValueCoding) valueForKey:] + 268
2595 Foundation 0x19e1c9940 -[NSArray(NSKeyValueCoding) valueForKey:] + 392
2596 App 0x102a00848 -[HTMLNode innerHTML] (HTMLNode.m:727)
2597 App 0x1029fb16c -[HTMLElement outerHTML] (HTMLElement.m:182)
2598 Foundation 0x19e16d7d0 -[NSObject(NSKeyValueCoding) valueForKey:] + 268
2599 Foundation 0x19e1c9940 -[NSArray(NSKeyValueCoding) valueForKey:] + 392
2600 App 0x102a00848 -[HTMLNode innerHTML] (HTMLNode.m:727)
2601 App 0x1029fb16c -[HTMLElement outerHTML] (HTMLElement.m:182)
2602 Foundation 0x19e16d7d0 -[NSObject(NSKeyValueCoding) valueForKey:] + 268
2603 Foundation 0x19e1c9940 -[NSArray(NSKeyValueCoding) valueForKey:] + 392
2604 App 0x102a00848 -[HTMLNode innerHTML] (HTMLNode.m:727)
2605 App 0x1029fb16c -[HTMLElement outerHTML] (HTMLElement.m:182)
2606 Foundation 0x19e16d7d0 -[NSObject(NSKeyValueCoding) valueForKey:] + 268
2607 Foundation 0x19e1c9940 -[NSArray(NSKeyValueCoding) valueForKey:] + 392
2608 App 0x102a00848 -[HTMLNode innerHTML] (HTMLNode.m:727)
2609 App 0x1029fb16c -[HTMLElement outerHTML] (HTMLElement.m:182)
2610 Foundation 0x19e16d7d0 -[NSObject(NSKeyValueCoding) valueForKey:] + 268
2611 Foundation 0x19e1c9940 -[NSArray(NSKeyValueCoding) valueForKey:] + 392
2612 App 0x102a00848 -[HTMLNode innerHTML] (HTMLNode.m:727)
2613 App 0x1029fb16c -[HTMLElement outerHTML] (HTMLElement.m:182)
2614 Foundation 0x19e16d7d0 -[NSObject(NSKeyValueCoding) valueForKey:] + 268
2615 Foundation 0x19e1c9940 -[NSArray(NSKeyValueCoding) valueForKey:] + 392
2616 App 0x102a00848 -[HTMLNode innerHTML] (HTMLNode.m:727)
2617 App 0x1029fb16c -[HTMLElement outerHTML] (HTMLElement.m:182)
2618 Foundation 0x19e16d7d0 -[NSObject(NSKeyValueCoding) valueForKey:] + 268
2619 Foundation 0x19e1c9940 -[NSArray(NSKeyValueCoding) valueForKey:] + 392
2620 App 0x102a00848 -[HTMLNode innerHTML] (HTMLNode.m:727)
2621 App 0x1029fb16c -[HTMLElement outerHTML] (HTMLElement.m:182)
2622 Foundation 0x19e16d7d0 -[NSObject(NSKeyValueCoding) valueForKey:] + 268
2623 Foundation 0x19e1c9940 -[NSArray(NSKeyValueCoding) valueForKey:] + 392
2624 App 0x102a00848 -[HTMLNode innerHTML] (HTMLNode.m:727)
2625 App 0x1029fb16c -[HTMLElement outerHTML] (HTMLElement.m:182)
2626 Foundation 0x19e16d7d0 -[NSObject(NSKeyValueCoding) valueForKey:] + 268
2627 Foundation 0x19e1c9940 -[NSArray(NSKeyValueCoding) valueForKey:] + 392
2628 App 0x102a00848 -[HTMLNode innerHTML] (HTMLNode.m:727)
2629 App 0x1029fb16c -[HTMLElement outerHTML] (HTMLElement.m:182)
2630 Foundation 0x19e16d7d0 -[NSObject(NSKeyValueCoding) valueForKey:] + 268
2631 Foundation 0x19e1c9940 -[NSArray(NSKeyValueCoding) valueForKey:] + 392
2632 App 0x102a00848 -[HTMLNode innerHTML] (HTMLNode.m:727)
2633 App 0x1029fb16c -[HTMLElement outerHTML] (HTMLElement.m:182)
2634 Foundation 0x19e16d7d0 -[NSObject(NSKeyValueCoding) valueForKey:] + 268
2635 Foundation 0x19e1c9940 -[NSArray(NSKeyValueCoding) valueForKey:] + 392
2636 App 0x102a00848 -[HTMLNode innerHTML] (HTMLNode.m:727)
2637 App 0x1029fb16c -[HTMLElement outerHTML] (HTMLElement.m:182)
2638 Foundation 0x19e16d7d0 -[NSObject(NSKeyValueCoding) valueForKey:] + 268
2639 Foundation 0x19e1c9940 -[NSArray(NSKeyValueCoding) valueForKey:] + 392
2640 App 0x102a00848 -[HTMLNode innerHTML] (HTMLNode.m:727)
2641 App 0x1029fb16c -[HTMLElement outerHTML] (HTMLElement.m:182)
2642 App 0x1026c74d8 -[MRLinkParser parseUnsubscribeWithDocument:] (MRLinkParser.m:123)
Maybe put in a check for that? Not sure how to reproduce but seeing this out in the wild.
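One way to keep serialization from recursing once per nesting level is to walk the tree with an explicit stack. The sketch below is only an illustration of that technique on a hypothetical DemoElement type (no attributes, no text nodes); it is not how HTMLKit implements outerHTML. ARC is assumed.

```objc
#import <Foundation/Foundation.h>

// Hypothetical minimal element: a tag name plus child elements.
@interface DemoElement : NSObject
@property (nonatomic, copy, readonly) NSString *tagName;
@property (nonatomic, strong, readonly) NSMutableArray<DemoElement *> *children;
- (instancetype)initWithTagName:(NSString *)tagName;
@end

@implementation DemoElement
- (instancetype)initWithTagName:(NSString *)tagName
{
    self = [super init];
    if (self) {
        _tagName = [tagName copy];
        _children = [NSMutableArray array];
    }
    return self;
}
@end

// Serializes the tree iteratively, so the call stack depth stays constant.
static NSString *DemoOuterHTML(DemoElement *root)
{
    NSMutableString *result = [NSMutableString string];
    // The stack holds elements that still have to be opened and,
    // as NSString items, closing tags that still have to be emitted.
    NSMutableArray *stack = [NSMutableArray arrayWithObject:root];
    while (stack.count > 0) {
        id item = stack.lastObject;
        [stack removeLastObject];
        if ([item isKindOfClass:[NSString class]]) {
            [result appendString:item];
            continue;
        }
        DemoElement *element = item;
        [result appendFormat:@"<%@>", element.tagName];
        [stack addObject:[NSString stringWithFormat:@"</%@>", element.tagName]];
        // Push children in reverse so the first child is popped first.
        for (DemoElement *child in element.children.reverseObjectEnumerator) {
            [stack addObject:child];
        }
    }
    return result;
}
```

The stack grows on the heap with the size of the tree, so very deep input costs memory rather than call stack frames.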
frame #8386: 0x00007fff36033e0f CoreFoundation`__RELEASE_OBJECTS_IN_THE_ARRAY__ + 122
frame #8387: 0x00007fff35f32e3e CoreFoundation`-[__NSArrayM dealloc] + 289
frame #8388: 0x00007fff35fd592d CoreFoundation`-[__NSOrderedSetM dealloc] + 157
frame #8389: 0x00007fff6c50dd16 libobjc.A.dylib`object_cxxDestructFromClass(objc_object*, objc_class*) + 83
frame #8390: 0x00007fff6c5076c3 libobjc.A.dylib`objc_destructInstance + 94
frame #8391: 0x00007fff6c50762b libobjc.A.dylib`_objc_rootDealloc + 62
frame #8392: 0x00007fff36033e0f CoreFoundation`__RELEASE_OBJECTS_IN_THE_ARRAY__ + 122
frame #8393: 0x00007fff35f32e3e CoreFoundation`-[__NSArrayM dealloc] + 289
frame #8394: 0x00007fff35fd592d CoreFoundation`-[__NSOrderedSetM dealloc] + 157
frame #8395: 0x00007fff6c50dd16 libobjc.A.dylib`object_cxxDestructFromClass(objc_object*, objc_class*) + 83
frame #8396: 0x00007fff6c5076c3 libobjc.A.dylib`objc_destructInstance + 94
frame #8397: 0x00007fff6c50762b libobjc.A.dylib`_objc_rootDealloc + 62
frame #8398: 0x00007fff36033e0f CoreFoundation`__RELEASE_OBJECTS_IN_THE_ARRAY__ + 122
frame #8399: 0x00007fff35f32e3e CoreFoundation`-[__NSArrayM dealloc] + 289
frame #8400: 0x00007fff35fd592d CoreFoundation`-[__NSOrderedSetM dealloc] + 157
frame #8401: 0x00007fff6c50dd16 libobjc.A.dylib`object_cxxDestructFromClass(objc_object*, objc_class*) + 83
frame #8402: 0x00007fff6c5076c3 libobjc.A.dylib`objc_destructInstance + 94
frame #8403: 0x00007fff6c50762b libobjc.A.dylib`_objc_rootDealloc + 62
frame #8404: 0x00007fff36033e0f CoreFoundation`__RELEASE_OBJECTS_IN_THE_ARRAY__ + 122
frame #8405: 0x00007fff35f32e3e CoreFoundation`-[__NSArrayM dealloc] + 289
frame #8406: 0x00007fff35fd592d CoreFoundation`-[__NSOrderedSetM dealloc] + 157
frame #8407: 0x00007fff6c50dd16 libobjc.A.dylib`object_cxxDestructFromClass(objc_object*, objc_class*) + 83
frame #8408: 0x00007fff6c5076c3 libobjc.A.dylib`objc_destructInstance + 94
frame #8409: 0x00007fff6c50762b libobjc.A.dylib`_objc_rootDealloc + 62
frame #8410: 0x00007fff36033e0f CoreFoundation`__RELEASE_OBJECTS_IN_THE_ARRAY__ + 122
frame #8411: 0x00007fff35f32e3e CoreFoundation`-[__NSArrayM dealloc] + 289
frame #8412: 0x00007fff35fd592d CoreFoundation`-[__NSOrderedSetM dealloc] + 157
frame #8413: 0x00007fff6c50dd16 libobjc.A.dylib`object_cxxDestructFromClass(objc_object*, objc_class*) + 83
frame #8414: 0x00007fff6c5076c3 libobjc.A.dylib`objc_destructInstance + 94
frame #8415: 0x00007fff6c50762b libobjc.A.dylib`_objc_rootDealloc + 62
frame #8416: 0x00007fff36033e0f CoreFoundation`__RELEASE_OBJECTS_IN_THE_ARRAY__ + 122
frame #8417: 0x00007fff35f32e3e CoreFoundation`-[__NSArrayM dealloc] + 289
frame #8418: 0x00007fff35fd592d CoreFoundation`-[__NSOrderedSetM dealloc] + 157
frame #8419: 0x00007fff6c50dd16 libobjc.A.dylib`object_cxxDestructFromClass(objc_object*, objc_class*) + 83
frame #8420: 0x00007fff6c5076c3 libobjc.A.dylib`objc_destructInstance + 94
frame #8421: 0x00007fff6c50762b libobjc.A.dylib`_objc_rootDealloc + 62
frame #8422: 0x00007fff36033e0f CoreFoundation`__RELEASE_OBJECTS_IN_THE_ARRAY__ + 122
frame #8423: 0x00007fff35f32e3e CoreFoundation`-[__NSArrayM dealloc] + 289
frame #8424: 0x00007fff35fd592d CoreFoundation`-[__NSOrderedSetM dealloc] + 157
frame #8425: 0x00007fff6c50dd16 libobjc.A.dylib`object_cxxDestructFromClass(objc_object*, objc_class*) + 83
frame #8426: 0x00007fff6c5076c3 libobjc.A.dylib`objc_destructInstance + 94
frame #8427: 0x00007fff6c50762b libobjc.A.dylib`_objc_rootDealloc + 62
frame #8428: 0x00007fff36033e0f CoreFoundation`__RELEASE_OBJECTS_IN_THE_ARRAY__ + 122
frame #8429: 0x00007fff35f32e3e CoreFoundation`-[__NSArrayM dealloc] + 289
frame #8430: 0x00007fff35fd592d CoreFoundation`-[__NSOrderedSetM dealloc] + 157
frame #8431: 0x00007fff6c50dd16 libobjc.A.dylib`object_cxxDestructFromClass(objc_object*, objc_class*) + 83
frame #8432: 0x00007fff6c5076c3 libobjc.A.dylib`objc_destructInstance + 94
frame #8433: 0x00007fff6c50762b libobjc.A.dylib`_objc_rootDealloc + 62
frame #8434: 0x00007fff36033e0f CoreFoundation`__RELEASE_OBJECTS_IN_THE_ARRAY__ + 122
frame #8435: 0x00007fff35f32e3e CoreFoundation`-[__NSArrayM dealloc] + 289
frame #8436: 0x00007fff35fd592d CoreFoundation`-[__NSOrderedSetM dealloc] + 157
frame #8437: 0x00007fff6c50dd16 libobjc.A.dylib`object_cxxDestructFromClass(objc_object*, objc_class*) + 83
frame #8438: 0x00007fff6c5076c3 libobjc.A.dylib`objc_destructInstance + 94
frame #8439: 0x00007fff6c50762b libobjc.A.dylib`_objc_rootDealloc + 62
frame #8440: 0x00007fff36033e0f CoreFoundation`__RELEASE_OBJECTS_IN_THE_ARRAY__ + 122
frame #8441: 0x00007fff35f32e3e CoreFoundation`-[__NSArrayM dealloc] + 289
frame #8442: 0x00007fff35fd592d CoreFoundation`-[__NSOrderedSetM dealloc] + 157
frame #8443: 0x00007fff6c50dd16 libobjc.A.dylib`object_cxxDestructFromClass(objc_object*, objc_class*) + 83
frame #8444: 0x00007fff6c5076c3 libobjc.A.dylib`objc_destructInstance + 94
frame #8445: 0x00007fff6c50762b libobjc.A.dylib`_objc_rootDealloc + 62
frame #8446: 0x00007fff36033e0f CoreFoundation`__RELEASE_OBJECTS_IN_THE_ARRAY__ + 122
frame #8447: 0x00007fff35f32e3e CoreFoundation`-[__NSArrayM dealloc] + 289
frame #8448: 0x00007fff35fd592d CoreFoundation`-[__NSOrderedSetM dealloc] + 157
frame #8449: 0x00007fff6c50dd16 libobjc.A.dylib`object_cxxDestructFromClass(objc_object*, objc_class*) + 83
frame #8450: 0x00007fff6c5076c3 libobjc.A.dylib`objc_destructInstance + 94
frame #8451: 0x00007fff6c50762b libobjc.A.dylib`_objc_rootDealloc + 62
frame #8452: 0x00007fff36033e0f CoreFoundation`__RELEASE_OBJECTS_IN_THE_ARRAY__ + 122
frame #8453: 0x00007fff35f32e3e CoreFoundation`-[__NSArrayM dealloc] + 289
frame #8454: 0x00007fff35fd592d CoreFoundation`-[__NSOrderedSetM dealloc] + 157
frame #8455: 0x00007fff6c50dd16 libobjc.A.dylib`object_cxxDestructFromClass(objc_object*, objc_class*) + 83
frame #8456: 0x00007fff6c5076c3 libobjc.A.dylib`objc_destructInstance + 94
frame #8457: 0x00007fff6c50762b libobjc.A.dylib`_objc_rootDealloc + 62
frame #8458: 0x00007fff36033e0f CoreFoundation`__RELEASE_OBJECTS_IN_THE_ARRAY__ + 122
frame #8459: 0x00007fff35f32e3e CoreFoundation`-[__NSArrayM dealloc] + 289
frame #8460: 0x00007fff35fd592d CoreFoundation`-[__NSOrderedSetM dealloc] + 157
frame #8461: 0x00007fff6c50dd16 libobjc.A.dylib`object_cxxDestructFromClass(objc_object*, objc_class*) + 83
frame #8462: 0x00007fff6c5076c3 libobjc.A.dylib`objc_destructInstance + 94
frame #8463: 0x00007fff6c50762b libobjc.A.dylib`_objc_rootDealloc + 62
frame #8464: 0x00007fff36033e0f CoreFoundation`__RELEASE_OBJECTS_IN_THE_ARRAY__ + 122
frame #8465: 0x00007fff35f32e3e CoreFoundation`-[__NSArrayM dealloc] + 289
frame #8466: 0x00007fff35fd592d CoreFoundation`-[__NSOrderedSetM dealloc] + 157
frame #8467: 0x00007fff6c50dd16 libobjc.A.dylib`object_cxxDestructFromClass(objc_object*, objc_class*) + 83
frame #8468: 0x00007fff6c5076c3 libobjc.A.dylib`objc_destructInstance + 94
frame #8469: 0x00007fff6c50762b libobjc.A.dylib`_objc_rootDealloc + 62
frame #8470: 0x00007fff36033e0f CoreFoundation`__RELEASE_OBJECTS_IN_THE_ARRAY__ + 122
frame #8471: 0x00007fff35f32e3e CoreFoundation`-[__NSArrayM dealloc] + 289
frame #8472: 0x00007fff35fd592d CoreFoundation`-[__NSOrderedSetM dealloc] + 157
frame #8473: 0x00007fff6c50dd16 libobjc.A.dylib`object_cxxDestructFromClass(objc_object*, objc_class*) + 83
frame #8474: 0x00007fff6c5076c3 libobjc.A.dylib`objc_destructInstance + 94
frame #8475: 0x00007fff6c50762b libobjc.A.dylib`_objc_rootDealloc + 62
frame #8476: 0x00007fff36033e0f CoreFoundation`__RELEASE_OBJECTS_IN_THE_ARRAY__ + 122
frame #8477: 0x00007fff35f32e3e CoreFoundation`-[__NSArrayM dealloc] + 289
frame #8478: 0x00007fff35fd592d CoreFoundation`-[__NSOrderedSetM dealloc] + 157
frame #8479: 0x00007fff6c50dd16 libobjc.A.dylib`object_cxxDestructFromClass(objc_object*, objc_class*) + 83
frame #8480: 0x00007fff6c5076c3 libobjc.A.dylib`objc_destructInstance + 94
frame #8481: 0x00007fff6c50762b libobjc.A.dylib`_objc_rootDealloc + 62
frame #8482: 0x00007fff36033e0f CoreFoundation`__RELEASE_OBJECTS_IN_THE_ARRAY__ + 122
frame #8483: 0x00007fff35f32e3e CoreFoundation`-[__NSArrayM dealloc] + 289
frame #8484: 0x00007fff35fd592d CoreFoundation`-[__NSOrderedSetM dealloc] + 157
frame #8485: 0x00007fff6c50dd16 libobjc.A.dylib`object_cxxDestructFromClass(objc_object*, objc_class*) + 83
frame #8486: 0x00007fff6c5076c3 libobjc.A.dylib`objc_destructInstance + 94
frame #8487: 0x00007fff6c50762b libobjc.A.dylib`_objc_rootDealloc + 62
frame #8488: 0x00007fff36033e0f CoreFoundation`__RELEASE_OBJECTS_IN_THE_ARRAY__ + 122
frame #8489: 0x00007fff35f32e3e CoreFoundation`-[__NSArrayM dealloc] + 289
frame #8490: 0x00007fff35fd592d CoreFoundation`-[__NSOrderedSetM dealloc] + 157
frame #8491: 0x00007fff6c50dd16 libobjc.A.dylib`object_cxxDestructFromClass(objc_object*, objc_class*) + 83
frame #8492: 0x00007fff6c5076c3 libobjc.A.dylib`objc_destructInstance + 94
frame #8493: 0x00007fff6c50762b libobjc.A.dylib`_objc_rootDealloc + 62
frame #8494: 0x00007fff36033e0f CoreFoundation`__RELEASE_OBJECTS_IN_THE_ARRAY__ + 122
frame #8495: 0x00007fff35f32e3e CoreFoundation`-[__NSArrayM dealloc] + 289
frame #8496: 0x00007fff35fd592d CoreFoundation`-[__NSOrderedSetM dealloc] + 157
frame #8497: 0x00007fff6c50dd16 libobjc.A.dylib`object_cxxDestructFromClass(objc_object*, objc_class*) + 83
frame #8498: 0x00007fff6c5076c3 libobjc.A.dylib`objc_destructInstance + 94
frame #8499: 0x00007fff6c50762b libobjc.A.dylib`_objc_rootDealloc + 62
frame #8500: 0x00007fff36033e0f CoreFoundation`__RELEASE_OBJECTS_IN_THE_ARRAY__ + 122
frame #8501: 0x00007fff35f32e3e CoreFoundation`-[__NSArrayM dealloc] + 289
frame #8502: 0x00007fff35fd592d CoreFoundation`-[__NSOrderedSetM dealloc] + 157
frame #8503: 0x00007fff6c50dd16 libobjc.A.dylib`object_cxxDestructFromClass(objc_object*, objc_class*) + 83
frame #8504: 0x00007fff6c5076c3 libobjc.A.dylib`objc_destructInstance + 94
frame #8505: 0x00007fff6c50762b libobjc.A.dylib`_objc_rootDealloc + 62
frame #8506: 0x00007fff36033e0f CoreFoundation`__RELEASE_OBJECTS_IN_THE_ARRAY__ + 122
frame #8507: 0x00007fff35f32e3e CoreFoundation`-[__NSArrayM dealloc] + 289
frame #8508: 0x00007fff35fd592d CoreFoundation`-[__NSOrderedSetM dealloc] + 157
frame #8509: 0x00007fff6c50dd16 libobjc.A.dylib`object_cxxDestructFromClass(objc_object*, objc_class*) + 83
frame #8510: 0x00007fff6c5076c3 libobjc.A.dylib`objc_destructInstance + 94
frame #8511: 0x00007fff6c50762b libobjc.A.dylib`_objc_rootDealloc + 62
frame #8512: 0x00007fff36033e0f CoreFoundation`__RELEASE_OBJECTS_IN_THE_ARRAY__ + 122
frame #8513: 0x00007fff35f32e3e CoreFoundation`-[__NSArrayM dealloc] + 289
frame #8514: 0x00007fff35fd592d CoreFoundation`-[__NSOrderedSetM dealloc] + 157
frame #8515: 0x00007fff6c50dd16 libobjc.A.dylib`object_cxxDestructFromClass(objc_object*, objc_class*) + 83
frame #8516: 0x00007fff6c5076c3 libobjc.A.dylib`objc_destructInstance + 94
frame #8517: 0x00007fff6c50762b libobjc.A.dylib`_objc_rootDealloc + 62
frame #8518: 0x00007fff6c52152a libobjc.A.dylib`AutoreleasePoolPage::releaseUntil(objc_object**) + 134
frame #8519: 0x00007fff6c507c30 libobjc.A.dylib`objc_autoreleasePoolPop + 175
frame #8520: 0x00007fff35aa998b CoreData`developerSubmittedBlockToNSManagedObjectContextPerform + 411
frame #8521: 0x0000000103c8f84f libdispatch.dylib`_dispatch_client_callout + 8
frame #8522: 0x0000000103c96df1 libdispatch.dylib`_dispatch_lane_serial_drain + 777
frame #8523: 0x0000000103c97ba8 libdispatch.dylib`_dispatch_lane_invoke + 438
frame #8524: 0x0000000103ca5045 libdispatch.dylib`_dispatch_workloop_worker_thread + 676
frame #8525: 0x0000000103d1b0b3 libsystem_pthread.dylib`_pthread_wqthread + 290
frame #8526: 0x0000000103d1af1b libsystem_pthread.dylib`start_wqthread + 15
Impossible to change an attribute of a cloned element if its original element had attributes.
An example:
HTMLElement *element = [HTMLElement new];
element.elementId = @"originalId";
HTMLElement *clone = [element cloneNodeDeep:YES];
NSString *cloneId = @"cloneId";
clone.elementId = cloneId;
The last line raises the NSInvalidArgumentException:
-[__NSSingleEntryDictionaryI setObject:forKeyedSubscript:]: unrecognized selector sent to instance 0x7fcbd143e150
The root cause is the HTMLElement copy method:
- (id)copyWithZone:(NSZone *)zone
{
...
copy->_attributes = [_attributes copy];
...
}
If _attributes is not nil, the copy will be an immutable NSDictionary, not an HTMLOrderedDictionary.
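The difference between copy and mutableCopy can be checked with a plain NSMutableDictionary. This is a sketch under the assumption that the attributes dictionary follows the usual Foundation copy behavior, as the exception above suggests; DemoCopyAcceptsMutation is a hypothetical helper.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: tries to mutate the dictionary and reports whether that worked.
static BOOL DemoCopyAcceptsMutation(NSDictionary *dictionary)
{
    @try {
        // Immutable dictionaries raise an exception here.
        [(NSMutableDictionary *)dictionary setObject:@"cloneId" forKey:@"id"];
        return YES;
    } @catch (NSException *exception) {
        return NO;
    }
}
```

For an NSMutableDictionary named attributes, [attributes copy] yields a dictionary that rejects the mutation, while [attributes mutableCopy] yields one that accepts it.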
Can this CSS selector be supported: [property=og:url]?
URL:
http://www.87wx.net/book/88577/
When I use this selector, I can't get the target.
But when I query with jsoup, it works.
let element:HTMLElement = HTMLElement(tagName: "div")
element.innerHTML = "Line<br/>Breaks"
print("\(element.textContent)")
output:
LineBreaks
desired:
Line\nBreaks
At least this is how NSAttributedString's initWithHTML works. Anything I need to do to get this to work properly?
When I try to add HTMLKit as a dependency I get an error message at the Resolving package dependencies stage:
because HTMLKit >=0.9.4 contains incompatible tools version and root depends on HTMLKit 3.1.0..<4.0.0, version solving failed.
Tested on Xcode 11.4 and 12 beta.
The problem manifests itself when reading an HTML file and outputting it again to an HTML string.
<!DOCTYPE html>
<head>
<title>debug</title>
</head>
<body>
<svg id="draw_area" width="600" height="800" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" version="1.1">
<image id="overlay_img" xlink:href="foo.png" x="0" y="0" width="600" height="800" />
</svg>
</body>
</html>
NSString *dpath = [[NSBundle mainBundle] pathForResource:@"debug" ofType:@"html"];
NSString *dcontent = [NSString stringWithContentsOfFile:dpath
encoding:NSUTF8StringEncoding error:nil];
HTMLDocument* ddocument = [HTMLDocument documentWithString:dcontent];
dcontent = [[ddocument documentElement] innerHTML];
After the above code, the variable dcontent contains:
<head>
<title>debug</title>
</head>
<body>
<svg id="draw_area" width="600" height="800" xmlns="http://www.w3.org/2000/svg" xmlns xlink="http://www.w3.org/1999/xlink" version="1.1">
<image id="overlay_img" xlink href="foo.png" x="0" y="0" width="600" height="800"></image>
</svg>
</body>
Note how xmlns:xlink is converted to xmlns xlink and xlink:href to xlink href after outputting the content to HTML again. This broke the svg.
Every once in a while, I get a crash on an "internal consistency" error.
My app is fetching some HTML pages in a background process and pulling out some interesting stuff. A few times, I've had the app (in the simulator) crash on the following:
2019-07-13 21:30:37.026686-0400 dreamwidth[65452:90258127] *** Terminating app due to uncaught exception 'NSInternalInconsistencyException', reason: '*** -[NSHashTable NSHashTable {
[10] <HTMLNodeIterator: 0x60000eae1680>
}
] count underflow'
*** First throw call stack:
(
0 CoreFoundation 0x000000010af6a6fb __exceptionPreprocess + 331
1 libobjc.A.dylib 0x00000001098abac5 objc_exception_throw + 48
2 CoreFoundation 0x000000010af6a555 +[NSException raise:format:] + 197
3 Foundation 0x000000010827f9d7 hashProbe + 407
4 Foundation 0x000000010827fe5c -[NSConcreteHashTable removeItem:] + 49
5 myapp 0x0000000106d01e3d -[HTMLDocument detachNodeIterator:] + 93
6 myapp 0x0000000106d115b7 -[HTMLNodeIterator dealloc] + 87
I suspect that it's something to do with timing: it doesn't happen all the time, but when it does, it crashes my app. I suspect that the HTML document is being deallocated when this happens.
I've tried to avoid iterating on, say, .childNodes to avoid creating node iterators, but the crash still happens every so often.
Xcode 8.3 has an issue with the default modulemap file in the sources folder. The workaround in #12 was to rename the file and point Xcode to it. This causes the SwiftPM build to throw lots of warnings because of the generated umbrella header.
The workaround should be reverted once Xcode fixes the issue.
Hi
I am implementing text highlight feature for my UIWebView page, and I wanted to use your library to find all DOM ranges for the desired text and then use rangy library to highlight them.
Could you assist me?
File HTMLNodeFilter.m:
@interface HTMLNodeFilterBlock ()
{
BOOL (^ _block)(HTMLNode *);
}
@end
The _block must return an HTMLNodeFilterValue instead of BOOL. Possibly a typo.
This has no visible effect on simulators, but it leads to wrong behavior on devices.
Example:
HTMLDocument *document = [HTMLDocument documentWithString:@"<div id=\"id\"></div>"];
NSString *divId = @"id";
HTMLNodeFilterBlock *filter = [HTMLNodeFilterBlock filterWithBlock:^HTMLNodeFilterValue(HTMLNode * _Nonnull node) {
HTMLElement *element = (HTMLElement *)node;
return [element.elementId isEqualToString:divId] ? HTMLNodeFilterAccept : HTMLNodeFilterSkip;
}];
HTMLNodeIterator *iterator = [document nodeIteratorWithShowOptions:HTMLNodeFilterShowElement filter:filter];
HTMLElement *element = (HTMLElement*)iterator.nextObject;
On a simulator, the element is the div. But on a device, the element is the html.
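A plausible explanation for the difference is that BOOL is not the same underlying type on every target: on some it is a signed char, on others a C bool. The sketch below is a simplified model of that, not a reproduction of the actual call through the mismatched block type. The DemoFilterValue enum is hypothetical and its numeric values are an assumption that follows the DOM NodeFilter constants (accept 1, reject 2, skip 3).

```objc
#import <Foundation/Foundation.h>
#include <stdbool.h>

// Hypothetical mirror of the filter results (values assumed from the DOM NodeFilter constants).
typedef NS_ENUM(NSUInteger, DemoFilterValue) {
    DemoFilterAccept = 1,
    DemoFilterReject = 2,
    DemoFilterSkip = 3
};

// Models a target where BOOL is a signed char: small values pass through unchanged.
static DemoFilterValue DemoStoreAsSignedChar(DemoFilterValue value)
{
    signed char stored = (signed char)value;
    return (DemoFilterValue)stored;
}

// Models a target where BOOL is a C bool: every non-zero value becomes 1.
static DemoFilterValue DemoStoreAsBool(DemoFilterValue value)
{
    bool stored = (bool)value;
    return (DemoFilterValue)stored;
}
```

In this model a skip result survives the signed char round trip but comes back as accept after the bool round trip, which would match the iterator accepting the first element it visits (the html) instead of skipping ahead to the div.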
HTMLRanges are being retained by the document when attached on initialization. This is basically the same as in #4.
The NSString+HTMLKit.h and NSCharacterSet+HTMLKit.h categories contain generic named category methods. They are causing collisions with other methods I've already been using. It would be great if you could prefix the methods to reduce potential collisions. Perhaps the prefix could be "html_" or something similar. Thanks!
Could we please discuss implementing the DOM range feature? I would be pleased to contribute to this part of the project.
Hello,
First of all, a very nice library you have created here. I am using it for a feature in my app where I need to load an HTML string, select nodes with particular tags, and fetch the name and value attributes from those tags. This is how far I have got:
HTMLParser *parser = [[HTMLParser alloc]initWithString : htmlString];
HTMLElement *htmlElement = [[HTMLElement alloc] initWithTagName:@"//input[@type = 'hidden']"];
NSArray *nodes = [parser parseFragmentWithContextElement : htmlElement];
for(HTMLNode *node in nodes)
{
// Here I want to have the attributes of the node
// e.g. something like node.Attributes["name"].Value and node.Attributes["value"].Value
}
It would be great if you could help me out and guide me in the right direction as I am quite new to iOS.
Thank you.