Comments (6)
Just ran into this myself. It would be great if there were some option to fix this! Perhaps an option to print the error message in the cell (in red, so it's clear it's not the literal text)?
I'm working with the Firefox history database, so sadly removing the malformed data is not an option :(
from litecli.
Do you have an example value that I could use to reproduce this?
I uploaded an example database file here: https://hack.wesleyac.com/test.sqlite, using the invalid Unicode value \xc3\x28. Let me know if that's sufficient for you :)
Thank you @WesleyAC. I was able to reproduce the issue. The fix is now in a PR (pending review from other core devs).
Long form description of what is going on:
It turns out the sqlite3 library for Python decodes text as UTF-8 by default, which works fine since SQLite stores everything as UTF-8. But as you pointed out, invalid Unicode values can sneak in. Thankfully the Python library allows overriding the decoder it uses. So I've caught the exception and applied latin-1 decoding. Unfortunately this is a batch process which means, if a single value has an invalid byte value, the whole set has to use the fallback encoding of latin-1.
It seems to work well for now, but I can't use it to highlight the invalid value in red.
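For anyone following along, here is a minimal sketch of the fallback described above. It is not the code from the PR: the helper name fetch_with_fallback and the demo table are made up for illustration, and it assumes the decoder override is sqlite3's Connection.text_factory. The exact exception raised on a bad value may depend on the Python version and connection setup, so the sketch catches both sqlite3.OperationalError and UnicodeDecodeError.

```python
import sqlite3


def fetch_with_fallback(conn, sql):
    """Fetch all rows, retrying the whole query with latin-1 if UTF-8 decoding fails."""
    try:
        return conn.execute(sql).fetchall()
    except (sqlite3.OperationalError, UnicodeDecodeError):
        # Assumption: a decoding failure surfaces as one of these two exceptions.
        # Unrelated errors (e.g. bad SQL) are simply raised again by the retry.
        original_factory = conn.text_factory
        conn.text_factory = lambda raw: raw.decode("latin-1")
        try:
            return conn.execute(sql).fetchall()
        finally:
            conn.text_factory = original_factory


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (v TEXT)")
conn.execute("INSERT INTO demo VALUES ('héllo')")                # valid UTF-8
conn.execute("INSERT INTO demo VALUES (CAST(X'c328' AS TEXT))")  # invalid UTF-8
rows = fetch_with_fallback(conn, "SELECT v FROM demo ORDER BY rowid")
print(rows)
```

Note that the valid value 'héllo' also comes back mangled (as 'hÃ©llo'), which is the batch limitation: one bad value forces latin-1 on every value in the result set.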
> Unfortunately this is a batch process which means, if a single value has an invalid byte value, the whole set has to use the fallback encoding of latin-1.
Seems we can use decode('utf-8', 'backslashreplace') to avoid this issue:
>>> b'\xf0\x9f\x98\x8a\x80abc'.decode('utf-8', 'backslashreplace')
'😊\\x80abc'
>>> b'\xf0\x9f\x98\x8a\x80abc'.decode('latin-1')
'ð\x9f\x98\x8a\x80abc'
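A rough sketch of how that could be wired into sqlite3, assuming Connection.text_factory is the hook for it (the demo table is hypothetical, and this is not what litecli currently does). Because the factory runs once per value, only the invalid bytes get escaped:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Decode each TEXT value on its own; invalid bytes become \xNN escapes.
conn.text_factory = lambda raw: raw.decode("utf-8", "backslashreplace")

conn.execute("CREATE TABLE demo (v TEXT)")
conn.execute("INSERT INTO demo VALUES ('héllo')")                # valid UTF-8
conn.execute("INSERT INTO demo VALUES (CAST(X'c328' AS TEXT))")  # invalid UTF-8
rows = conn.execute("SELECT v FROM demo ORDER BY rowid").fetchall()
print(rows)
```

Here 'héllo' comes back intact and the bad value comes back as a literal backslash followed by xc3(, so valid rows are no longer affected by a bad neighbour.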
I just dug into this issue a little. The root cause is:
- SQLite uses a dynamic type system (the declared type is a recommendation, not a requirement). Although UTF-8 is the default encoding for the TEXT type, SQLite does not check that a value is valid UTF-8 when it is inserted.
- Python's sqlite library decodes TEXT columns as UTF-8 by default. When it encounters an invalid UTF-8 byte sequence, it throws a UnicodeDecodeError: 'utf-8' codec can't decode byte ... error.
@amjith's CR fixed this issue by catching the UnicodeDecodeError and then falling back to decoding as latin-1.
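The two points above can be reproduced with a small script. This is only a sketch with a made-up demo table: it stores the invalid bytes \xc3\x28 as TEXT via CAST, reads them back raw, and then tries the default decoding. The exception type you see may differ by Python version, so both candidates are caught here as an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (v TEXT)")
# SQLite accepts the bytes as TEXT without validating them as UTF-8.
conn.execute("INSERT INTO demo VALUES (CAST(X'c328' AS TEXT))")
stored_type = conn.execute("SELECT typeof(v) FROM demo").fetchone()[0]

# Read the value without decoding to see what is actually stored.
conn.text_factory = bytes
raw_value = conn.execute("SELECT v FROM demo").fetchone()[0]

# Back to the default str factory, which decodes as UTF-8 and fails.
conn.text_factory = str
try:
    conn.execute("SELECT v FROM demo").fetchone()
    decode_failed = False
except (sqlite3.OperationalError, UnicodeDecodeError) as exc:
    decode_failed = True
    print(type(exc).__name__, exc)

print(stored_type, raw_value, decode_failed)
```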