Comments (6)
This seems to work using the XML parser instead of the HTML one, but you do need to specify the namespace correctly:
name: "Factorio Release"
url: 'https://forums.factorio.com/app.php/feed/forum/3'
filter:
  - xpath:
      path: '//atom:entry[1]/atom:title/text()'
      method: xml
      namespaces:
        atom: 'http://www.w3.org/2005/Atom'
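To illustrate why the namespace mapping matters, here is a minimal sketch using Python's standard-library ElementTree (not lxml, which urlwatch uses). The sample feed and the helper name `first_entry_title` are made up for illustration. ElementTree only supports a subset of XPath (no `text()`), so the sketch reads `.text` from the matched element instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical, minimal Atom feed used only for this illustration.
FEED = b"""<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example feed</title>
  <entry><title>First post</title></entry>
  <entry><title>Second post</title></entry>
</feed>"""

# The prefix ("atom") is arbitrary; only the namespace URI has to match the feed.
NAMESPACES = {"atom": "http://www.w3.org/2005/Atom"}


def first_entry_title(feed_xml, path, namespaces=None):
    """Return the text of the first element matching `path`, or None."""
    root = ET.fromstring(feed_xml)
    node = root.find(path, namespaces)
    return None if node is None else node.text


# With the prefix bound to the Atom namespace, the entry title is found.
print(first_entry_title(FEED, ".//atom:entry[1]/atom:title", NAMESPACES))

# Without a namespace, "entry" refers to an element in no namespace,
# which does not exist in an Atom feed, so nothing matches.
print(first_entry_title(FEED, ".//entry[1]/title"))
```

Every element in an Atom feed lives in the `http://www.w3.org/2005/Atom` namespace, so an unprefixed path matches nothing when the document is parsed as XML.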
from urlwatch.
This works great! So my problem is solved, but I'm not sure whether the issue should be left open, since it should probably also work with xpath without switching the method?
from urlwatch.
I'm not sure. You're trying to parse XML with an HTML parser. From what I could see it should work, but it doesn't.
A simple test case using lxml etree on its own would be a good start: open an issue on the lxml bug tracker with sample code and see what happens.
I don't see anything wrong with how urlwatch is using the library, but I'm not an expert.
from urlwatch.
I don't know either. But according to Wikipedia, XPath stands for "XML Path Language", and I also found lots of XML examples without even searching for them. Maybe the library used is not designed for XML? But that does not really make sense either. Let's keep this open for the moment and see what the dev(s) have to say about it.
from urlwatch.
By default urlwatch uses the HTMLParser class from lxml etree. My example switches it to the XML parser.
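As a rough standalone sketch of that switch (this is not urlwatch's actual code; the helper names `make_parser` and `run_xpath` and the sample feed are hypothetical), the parser choice in lxml could look like this:

```python
from lxml import etree

# Hypothetical, minimal Atom feed used only for this illustration.
FEED = b"""<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example feed</title>
  <entry><title>First entry</title></entry>
  <entry><title>Second entry</title></entry>
</feed>"""

ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}


def make_parser(method):
    """Return an lxml parser for the method name 'html' or 'xml'."""
    if method == "html":
        return etree.HTMLParser()
    if method == "xml":
        return etree.XMLParser()
    raise ValueError(f"unknown method: {method!r}")


def run_xpath(data, path, method="html", namespaces=None):
    """Parse `data` with the chosen parser and return XPath results as strings."""
    root = etree.fromstring(data, parser=make_parser(method))
    return [str(result) for result in root.xpath(path, namespaces=namespaces)]


# The XML parser keeps the Atom namespace, so the prefixed path matches.
print(run_xpath(FEED, "//atom:entry[1]/atom:title/text()",
                method="xml", namespaces=ATOM_NS))
```

Calling `run_xpath` with `method="html"` on the same data is a quick way to compare what the two parsers produce, which is roughly the kind of test case suggested above for the lxml bug tracker.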
from urlwatch.
FYI: https://bugs.launchpad.net/lxml/+bug/2067707
from urlwatch.