Comments (14)
What is your system locale and what does your shell script output?
from urlwatch.
» locale
LANG=de_DE.UTF-8
LC_CTYPE=de_DE.UTF-8
LC_NUMERIC=de_DE.UTF-8
LC_TIME=de_DE.UTF-8
LC_COLLATE=de_DE.UTF-8
LC_MONETARY=de_DE.UTF-8
LC_MESSAGES=de_DE.UTF-8
LC_PAPER="de_DE.UTF-8"
LC_NAME="de_DE.UTF-8"
LC_ADDRESS="de_DE.UTF-8"
LC_TELEPHONE="de_DE.UTF-8"
LC_MEASUREMENT="de_DE.UTF-8"
LC_IDENTIFICATION="de_DE.UTF-8"
LC_ALL=
The shell script prints a text file via cat.
from urlwatch.
OK, thanks. And which encoding is the text file in? (The file utility might give a clue, or enca or something.)
from urlwatch.
file says the following:
» file file.txt
file.txt: UTF-8 Unicode text
from urlwatch.
Thanks, I can work with that. It probably makes most sense for urlwatch to treat shell output as being in whatever encoding the system locale is set to; that's the best assumption we can make there.
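To illustrate the idea of decoding shell output with the system encoding, here is a minimal sketch. It is not urlwatch's actual implementation; run_shell_job is a hypothetical helper name, and the use of errors="replace" is an assumption made so that a mismatch between locale and file encoding does not abort the job.

```python
import locale
import subprocess


def run_shell_job(command: str) -> str:
    """Run a shell command and decode its stdout with the locale's encoding.

    Hypothetical helper for illustration only.
    """
    # Encoding derived from LANG / LC_ALL / LC_CTYPE, e.g. 'UTF-8' for de_DE.UTF-8
    encoding = locale.getpreferredencoding(False)
    raw = subprocess.run(
        command, shell=True, stdout=subprocess.PIPE, check=True
    ).stdout
    # errors="replace" substitutes U+FFFD for bytes that are invalid in
    # that encoding instead of raising UnicodeDecodeError
    return raw.decode(encoding, errors="replace")


print(run_shell_job("echo hello"))
```

With a UTF-8 locale this decodes a UTF-8 text file correctly. Under a POSIX/C locale the preferred encoding may be ASCII, in which case non-ASCII bytes would come out as replacement characters.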
from urlwatch.
What does Python 3's sys.getdefaultencoding() give?
% python3
Python 3.4.3 (default, Oct 14 2015, 20:28:29)
[GCC 4.8.4] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.getdefaultencoding()
'utf-8'
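Note that on Python 3, sys.getdefaultencoding() returns 'utf-8' regardless of the locale, so it cannot reveal a locale problem. The values that follow the locale are the preferred encoding and the encoding of sys.stdout. A small sketch to print all three (describe_encodings is a hypothetical helper name):

```python
import locale
import sys


def describe_encodings() -> dict:
    """Collect the encodings that matter when reading and printing text."""
    return {
        # Always 'utf-8' on Python 3, independent of the locale
        "default": sys.getdefaultencoding(),
        # Derived from the locale environment (LANG, LC_ALL, LC_CTYPE)
        "preferred": locale.getpreferredencoding(False),
        # What print() uses; may be None if stdout has been replaced
        "stdout": getattr(sys.stdout, "encoding", None),
    }


for name, value in describe_encodings().items():
    print(f"{name}: {value}")
```

Running this both in an interactive shell and from the environment where the problem occurs (for example a cron job) would show whether the two environments differ.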
from urlwatch.
I cannot reproduce this issue here on Ubuntu:
% hexdump -C ~/foo
00000000 23 21 2f 62 69 6e 2f 73 68 0a 0a 65 63 68 6f 20 |#!/bin/sh..echo |
00000010 22 53 61 70 70 65 72 6c c3 b6 74 3f 22 0a |"Sapperl..t?".|
0000001e
% set | egrep -a "^(LC_|LANG)"
LANG=en_US.UTF-8
LANGUAGE=en_US
LC_ADDRESS=de_AT.UTF-8
LC_IDENTIFICATION=de_AT.UTF-8
LC_MEASUREMENT=de_AT.UTF-8
LC_MONETARY=de_AT.UTF-8
LC_NAME=de_AT.UTF-8
LC_NUMERIC=de_AT.UTF-8
LC_PAPER=de_AT.UTF-8
LC_TELEPHONE=de_AT.UTF-8
LC_TIME=de_AT.UTF-8
% ./urlwatch --list
1: ~/foo
% ./urlwatch
===========================================================================
01. CHANGED: ~/foo
===========================================================================
---------------------------------------------------------------------------
CHANGED: ~/foo
---------------------------------------------------------------------------
--- @ Fri, 12 Feb 2016 10:13:27 +0100
+++ @ Fri, 12 Feb 2016 10:14:08 +0100
@@ -1 +1 @@
-Sapperlöt
+Sapperlöt?
---------------------------------------------------------------------------
--
urlwatch 2.1, Copyright 2008-2016 Thomas Perl
Website: http://thp.io/2008/urlwatch/
watched 1 URLs in 0 seconds
from urlwatch.
>>> sys.getdefaultencoding()
'utf-8'
from urlwatch.
Can you give me your test script?
from urlwatch.
By the way, the byte \xfc is "ü" in Latin-1; on its own it is not valid UTF-8:
>>> b'\xfc'.decode('utf-8')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xfc in position 0: invalid start byte
>>> b'\xfc'.decode('latin-1')
'ü'
from urlwatch.
It seems to have something to do with wrong locales. The urlwatch job is executed by cron, and there I have the following locale settings:
LANG=
LC_CTYPE="POSIX"
LC_NUMERIC="POSIX"
LC_TIME="POSIX"
LC_COLLATE="POSIX"
LC_MONETARY="POSIX"
LC_MESSAGES="POSIX"
LC_PAPER="POSIX"
LC_NAME="POSIX"
LC_ADDRESS="POSIX"
LC_TELEPHONE="POSIX"
LC_MEASUREMENT="POSIX"
LC_IDENTIFICATION="POSIX"
LC_ALL=
I will try to set the locale in the script, but I am not sure whether that also works for urlwatch.
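Environment variables set in a wrapper script are inherited by the processes it starts, so they should reach a Python program such as urlwatch as well. The sketch below checks this by starting a child Python interpreter with extra variables and asking it for its stdout encoding. child_stdout_encoding is a hypothetical helper; PYTHONIOENCODING is a standard CPython environment variable that overrides the stdio encoding. The result for locale variables alone (for example LC_ALL=C) depends on the Python version and the locales installed, so no output is claimed for that case.

```python
import os
import subprocess
import sys


def child_stdout_encoding(extra_env: dict) -> str:
    """Start a child Python with extra environment variables and
    return the encoding it reports for its own sys.stdout."""
    env = dict(os.environ)
    env.update(extra_env)
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.stdout.encoding)"],
        stdout=subprocess.PIPE,
        env=env,
        check=True,
    )
    return result.stdout.decode("ascii").strip()


# The child sees the variable set by its parent
print(child_stdout_encoding({"PYTHONIOENCODING": "utf-8"}))
# Compare with a bare C locale (result varies by Python version)
print(child_stdout_encoding({"LC_ALL": "C"}))
```

Exporting a UTF-8 locale (or PYTHONIOENCODING) in the script that cron runs, or in the crontab itself, would therefore be expected to affect urlwatch in the same way.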
from urlwatch.
Now I get the mail with the changes, but also an exception in the cron log:
Traceback (most recent call last):
  File "/usr/bin/urlwatch", line 376, in <module>
    main(parser.parse_args())
  File "/usr/bin/urlwatch", line 343, in main
    report.finish()
  File "/usr/lib/python3.5/site-packages/urlwatch/handler.py", line 128, in finish
    ReporterBase.submit_all(self, self.job_states, duration)
  File "/usr/lib/python3.5/site-packages/urlwatch/reporters.py", line 81, in submit_all
    cls(report, cfg, job_states, duration).submit()
  File "/usr/lib/python3.5/site-packages/urlwatch/reporters.py", line 296, in submit
    print(self._green(line))
UnicodeEncodeError: 'ascii' codec can't encode character '\xfc' in position 6: ordinal not in range(128)
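In this traceback, '\xfc' is the Unicode character U+00FC ("ü"), and print() fails because stdout is an ASCII text stream under the POSIX locale. The following sketch reproduces that behaviour with an in-memory stream and shows two ways a stream can be made tolerant. write_line is a hypothetical helper and the sample string is made up; this is not how urlwatch's reporter is written.

```python
import io


def write_line(line: str, encoding: str, errors: str = "strict") -> bytes:
    """Write a line to an in-memory text stream and return the bytes
    that a stream with this encoding and error handler would emit."""
    buffer = io.BytesIO()
    stream = io.TextIOWrapper(
        buffer, encoding=encoding, errors=errors, newline="\n"
    )
    stream.write(line + "\n")
    stream.flush()
    return buffer.getvalue()


try:
    # Same failure mode as print() on an ASCII stdout
    write_line("Prüfung", "ascii")
except UnicodeEncodeError as exc:
    print("ascii stream fails:", exc)

# A UTF-8 stream encodes the character as two bytes
print(write_line("Prüfung", "utf-8"))
# An ASCII stream with a lenient error handler escapes it instead
print(write_line("Prüfung", "ascii", errors="backslashreplace"))
```

On Python 3.7 and later, sys.stdout.reconfigure(encoding="utf-8") applies the same change to the real stdout; on the Python 3.5 shown in the traceback that method is not available.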
from urlwatch.
OK, I solved it by disabling detailed console output in the config:
details: false
from urlwatch.
Opened #51 as a possible fix for users in case somebody else runs into this.
from urlwatch.