Comments (12)
For which URL does this happen?
from urlwatch.
The text that causes the error is in the footer of the webpage.
from urlwatch.
OK, I've tested with this page, and it properly specifies UTF-8 both in the HTTP header and in the meta http-equiv tag. Do you have any filters enabled for this URL? If so, which?
from urlwatch.
Yes, html2text.
from urlwatch.
What is your system locale set to? You can check with the following snippet:
$ python3
[...]
>>> import sys
>>> sys.getdefaultencoding()
'utf-8'
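Note that on Python 3, sys.getdefaultencoding() returns 'utf-8' regardless of the locale, so it may not reveal a locale problem on its own. A minimal sketch that collects the other encodings Python derives from the environment; the helper name encoding_report is made up for illustration and is not part of urlwatch:

```python
import locale
import sys


def encoding_report():
    """Collect the encodings Python is currently using (hypothetical helper)."""
    return {
        # Always 'utf-8' on Python 3, independent of the locale.
        "default": sys.getdefaultencoding(),
        # Encoding used for file names.
        "filesystem": sys.getfilesystemencoding(),
        # Encoding derived from the locale settings (LANG, LC_ALL, ...).
        "preferred": locale.getpreferredencoding(False),
        # Encoding used when printing to stdout; may be None if stdout is replaced.
        "stdout": getattr(sys.stdout, "encoding", None),
    }


if __name__ == "__main__":
    for name, value in encoding_report().items():
        print(f"{name}: {value}")
```

If "preferred" or "stdout" shows something other than UTF-8, the locale or the way the process was started is the more likely culprit.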
from urlwatch.
Also UTF-8:
>>> import sys
>>> sys.getdefaultencoding()
'utf-8'
from urlwatch.
Do you have the problem only when running urlwatch as a cron job?
If so, look for the solution in #48.
from urlwatch.
It looks like Python's stdout is set to latin-1 for some reason. Can you please run the following?
$ python3
[...]
>>> import sys, os
>>> sys.stdout.encoding
'UTF-8'
>>> os.environ.get('PYTHONIOENCODING', '')
''
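To illustrate why a latin-1 stdout would matter (assuming the error in question is a UnicodeEncodeError, which this thread does not state explicitly): some characters simply have no latin-1 representation. A small sketch, with can_encode as a hypothetical helper name:

```python
def can_encode(text, encoding):
    """Return True if every character in text is representable in encoding."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


if __name__ == "__main__":
    sample = "2010\u20132024"  # contains an en dash (U+2013)
    print(can_encode(sample, "utf-8"))
    print(can_encode(sample, "iso-8859-1"))
```

The sample string is invented; any page text containing characters outside latin-1 would behave the same way when printed to a latin-1 stdout.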
from urlwatch.
$ python3
>>> import sys, os
>>> sys.stdout.encoding
'ISO-8859-1'
>>> os.environ.get('PYTHONIOENCODING', '')
''
from urlwatch.
Try running urlwatch with PYTHONIOENCODING="UTF-8" urlwatch. That should solve your problem. I just wonder why your Python uses latin-1 as its stdout encoding...
What distribution are you using?
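For anyone who wants to see the effect of PYTHONIOENCODING without involving urlwatch, here is a rough sketch that starts a child Python interpreter with the variable set and reports the stdout encoding the child ends up with. The function name child_stdout_encoding is an assumption for this example only:

```python
import codecs
import os
import subprocess
import sys


def child_stdout_encoding(io_encoding=None):
    """Run a child Python and return its normalized sys.stdout.encoding."""
    env = dict(os.environ)
    # Start from a clean slate so an inherited value does not interfere.
    env.pop("PYTHONIOENCODING", None)
    if io_encoding is not None:
        env["PYTHONIOENCODING"] = io_encoding
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.stdout.encoding)"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    # Normalize spellings such as 'UTF-8' vs 'utf-8' via the codec registry.
    return codecs.lookup(result.stdout.strip()).name


if __name__ == "__main__":
    print(child_stdout_encoding("UTF-8"))
    print(child_stdout_encoding("latin-1"))
```

Without the variable, the child falls back to whatever the locale and Python version dictate, which is where an unexpected ISO-8859-1 can come from.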
from urlwatch.
Arch Linux
from urlwatch.
PYTHONIOENCODING="UTF-8" urlwatch does the trick for me! Thanks.
from urlwatch.