python-iconv's Introduction

Iconv-based codec library for Python

Written by Martin v. Loewis
Ported to Python 3 by Bodo Graumann

This package provides a set of codecs to Python based on the underlying iconv library of the operating system, as available on glibc 2, Solaris, or other Unix variants. It consists of two modules: iconv and iconvcodec. For common usage the codec interface is more convenient and should be preferred.

Installation

To install the module, use:

pip install python-iconv

This package requires at least Python 3.6.

Module iconv

The iconv module exposes a global function to create iconv objects:

open(tocode, fromcode)

Return descriptor for character set conversion. If the conversion of fromcode to tocode is not known to the system, a ValueError is raised.

Iconv objects provide a single method to convert a string:

iconv(in[, outlen[, count_only]])

Return the string resulting from the conversion of in. The parameter in must be a byte string. It is the caller's responsibility to guarantee that the internal representation of the in object indeed uses fromcode of the Iconv object. The parameter outlen represents an estimate of the resulting string size in bytes. If the buffer is too small, an exception is raised. If count_only is set, no conversion is attempted, but the number of necessary bytes is returned.
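
A minimal sketch of the two calls described above, assuming python-iconv is installed and the C library knows both ISO-8859-1 and UTF-8. The helper name convert and the size estimate passed as outlen are illustrative choices, not part of the package. The estimate is passed explicitly because the converted text can need more bytes than the input.

```python
try:
    import iconv  # C extension module installed by python-iconv
except ImportError:
    iconv = None  # lets this sketch be imported where the package is missing


def convert(data: bytes, tocode: str, fromcode: str) -> bytes:
    """Convert data from fromcode to tocode using an iconv object."""
    if iconv is None:
        raise RuntimeError("python-iconv is not installed")
    # open() raises ValueError if the system does not know this conversion.
    converter = iconv.open(tocode, fromcode)
    # Assumption: four output bytes per input byte, plus some slack, is a
    # generous enough estimate for outlen.
    return converter.iconv(data, 4 * len(data) + 16)
```

With the package installed, convert(b"caf\xe9", "UTF-8", "ISO-8859-1") should return the UTF-8 bytes for "café", in which the é takes two bytes.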

In case of an error, the iconv method raises the exception iconv.error. This exception has four arguments:

  • the error string as returned from strerror
  • the error number
  • the number of input bytes processed
  • the output string produced so far
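
The four arguments listed above can be given names when handling a failure. The sketch below is only an illustration: IconvFailure and unpack_error are hypothetical helpers, and a plain Exception stands in for iconv.error so that the code runs without the package. The sample values are the ones quoted in one of the issue reports further down.

```python
from typing import NamedTuple


class IconvFailure(NamedTuple):
    """Hypothetical container for the four arguments of iconv.error."""

    message: str    # the error string as returned from strerror
    errno: int      # the error number
    consumed: int   # the number of input bytes processed
    partial: bytes  # the output string produced so far


def unpack_error(exc: BaseException) -> IconvFailure:
    """Give names to the positional arguments of an iconv.error instance."""
    return IconvFailure(*exc.args)


# Stand-in for an exception that would normally be caught with
# `except iconv.error as exc`.
sample = Exception("Argument list too long", 7, 3, b"(TM)")
failure = unpack_error(sample)
print(failure.errno, failure.consumed, failure.partial)
```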

Module iconvcodec

This module encapsulates the iconv module into a set of codecs. To use it, simply import it. As a result, the C library's codecs will be available:

b"Hello".decode("T.61")
"World".encode("JOHAB")

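Which codec names are available depends on the C library, so it can help to test for a name before relying on it. The following sketch uses only the standard codecs module; codec_available is a hypothetical helper, and the import is guarded so the code also runs where python-iconv is not installed.

```python
import codecs

try:
    import iconvcodec  # noqa: F401  importing it makes the iconv codecs available
except ImportError:
    iconvcodec = None  # only Python's built-in codecs are available then


def codec_available(name: str) -> bool:
    """Return True if Python can find a codec called name."""
    try:
        codecs.lookup(name)
    except LookupError:
        return False
    return True


for name in ("UTF-8", "T.61"):
    print(name, codec_available(name))
```
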
Contributing

Contributions are always welcome. Setting up a local dev environment is as simple as:

python -m venv env
source env/bin/activate
pip install -e .
python -m unittest

Code should be auto-formatted with black.

pip install black
black *.py

Publishing

We currently only publish source distributions.

pip install twine
python setup.py sdist
twine upload dist/*

License

This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 3 of the License, or (at your option) any later version.

Author

Bodo Graumann

python-iconv's Issues

Windows 10 compatibility?

Using Python 3.10, running
pip install python-iconv
failed with
error: microsoft visual c++ 14.0 or greater is required. get it with "microsoft c++ build tools": https://visualstudio.microsoft.com/visual-cpp-build-tools/

After installing the default workload for "Desktop development with C++" via Visual Studio Build Tools the result is different after
pip install python-iconv

C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools>pip install python-iconv
Defaulting to user installation because normal site-packages is not writeable
Collecting python-iconv
  Using cached python-iconv-1.1.2.tar.gz (17 kB)
  Preparing metadata (setup.py) ... done
Building wheels for collected packages: python-iconv
  Building wheel for python-iconv (setup.py) ... error
  error: subprocess-exited-with-error

  × python setup.py bdist_wheel did not run successfully.
  │ exit code: 1
  ╰─> [14 lines of output]
      running bdist_wheel
      running build
      running build_py
      creating build
      creating build\lib.win32-3.10
      copying iconvcodec.py -> build\lib.win32-3.10
      running build_ext
      building 'iconv' extension
      creating build\temp.win32-3.10
      creating build\temp.win32-3.10\Release
      C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Tools\MSVC\14.29.30133\bin\HostX86\x86\cl.exe /c /nologo /Ox /W3 /GL /DNDEBUG /MD -IC:\Program Files\Python310\include -IC:\Program Files\Python310\Include -IC:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Tools\MSVC\14.29.30133\include -IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\ucrt -IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\shared -IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\um -IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\winrt -IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\cppwinrt -IC:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Tools\MSVC\14.29.30133\include -IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\ucrt -IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\shared -IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\um -IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\winrt -IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\cppwinrt /Tciconvmodule.c /Fobuild\temp.win32-3.10\Release\iconvmodule.obj
      iconvmodule.c
      iconvmodule.c(1): fatal error C1083: Cannot open include file: 'iconv.h': No such file or directory
      error: command 'C:\\Program Files (x86)\\Microsoft Visual Studio\\2019\\BuildTools\\VC\\Tools\\MSVC\\14.29.30133\\bin\\HostX86\\x86\\cl.exe' failed with exit code 2
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for python-iconv
  Running setup.py clean for python-iconv
Failed to build python-iconv
Installing collected packages: python-iconv
  Running setup.py install for python-iconv ... error
  error: subprocess-exited-with-error

  × Running setup.py install for python-iconv did not run successfully.
  │ exit code: 1
  ╰─> [14 lines of output]
      running install
      running build
      running build_py
      creating build
      creating build\lib.win32-3.10
      copying iconvcodec.py -> build\lib.win32-3.10
      running build_ext
      building 'iconv' extension
      creating build\temp.win32-3.10
      creating build\temp.win32-3.10\Release
      C:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Tools\MSVC\14.29.30133\bin\HostX86\x86\cl.exe /c /nologo /Ox /W3 /GL /DNDEBUG /MD -IC:\Program Files\Python310\include -IC:\Program Files\Python310\Include -IC:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Tools\MSVC\14.29.30133\include -IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\ucrt -IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\shared -IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\um -IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\winrt -IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\cppwinrt -IC:\Program Files (x86)\Microsoft Visual Studio\2019\BuildTools\VC\Tools\MSVC\14.29.30133\include -IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\ucrt -IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\shared -IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\um -IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\winrt -IC:\Program Files (x86)\Windows Kits\10\include\10.0.19041.0\cppwinrt /Tciconvmodule.c /Fobuild\temp.win32-3.10\Release\iconvmodule.obj
      iconvmodule.c
      iconvmodule.c(1): fatal error C1083: Cannot open include file: 'iconv.h': No such file or directory
      error: command 'C:\\Program Files (x86)\\Microsoft Visual Studio\\2019\\BuildTools\\VC\\Tools\\MSVC\\14.29.30133\\bin\\HostX86\\x86\\cl.exe' failed with exit code 2
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: legacy-install-failure

× Encountered error while trying to install package.
╰─> python-iconv

note: This is an issue with the package mentioned above, not pip.
hint: See above for output from the failure.

Python 3.11: iconvmodule.c:166:26: error: lvalue required as left operand of assignment

python-iconv fails to build with Python 3.11:

$ python3 --version
Python 3.11.0a1

$ python3 setup.py build
running build
running build_py
running build_ext
building 'iconv' extension
gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -fPIC -I/usr/local/include/python3.11 -c iconvmodule.c -o build/temp.linux-x86_64-3.11/iconvmodule.o
iconvmodule.c: In function ‘Iconv_iconv’:
iconvmodule.c:112:35: warning: passing argument 2 of ‘iconv’ from incompatible pointer type [-Wincompatible-pointer-types]
  112 |     iresult = iconv(self->handle, &inbuf, &inbuf_size, &outbuf, &outbuf_size);
      |                                   ^~~~~~
      |                                   |
      |                                   const char **
In file included from iconvmodule.c:1:
/usr/include/iconv.h:42:54: note: expected ‘char ** restrict’ but argument is of type ‘const char **’
   42 | extern size_t iconv (iconv_t __cd, char **__restrict __inbuf,
      |                                    ~~~~~~~~~~~~~~~~~~^~~~~~~
iconvmodule.c:122:17: warning: comparison of integer expressions of different signedness: ‘size_t’ {aka ‘long unsigned int’} and ‘int’ [-Wsign-compare]
  122 |     if (iresult == -1) {
      |                 ^~
iconvmodule.c: In function ‘PyInit_iconv’:
iconvmodule.c:166:26: error: lvalue required as left operand of assignment
  166 |     Py_TYPE(&Iconv_Type) = &PyType_Type;
      |                          ^
error: command '/usr/bin/gcc' failed with exit code 1

This seems to be caused by python/cpython@f3fa63e:

Convert the Py_TYPE() and Py_SIZE() macros to static inline functions.
The Py_SET_TYPE() and Py_SET_SIZE() functions must now be used to set an object type and size.

ASCII//TRANSLIT not found with python 3.9

In python 3.9 the ASCII//TRANSLIT codec cannot be loaded anymore:

> python -m unittest
.....EE
======================================================================
ERROR: test_incremental_encode (test_iconvcodec.TestIconvcodecModule)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/bodo/Libraries/python-iconv/test_iconvcodec.py", line 21, in test_incremental_encode
    encoder = codecs.getincrementalencoder("ASCII//TRANSLIT")()
  File "/usr/lib/python3.9/codecs.py", line 986, in getincrementalencoder
    encoder = lookup(encoding).incrementalencoder
LookupError: unknown encoding: ASCII//TRANSLIT

======================================================================
ERROR: test_transliterate (test_iconvcodec.TestIconvcodecModule)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/bodo/Libraries/python-iconv/test_iconvcodec.py", line 17, in test_transliterate
    bytestring = string.encode("ASCII//TRANSLIT")
LookupError: unknown encoding: ASCII//TRANSLIT

----------------------------------------------------------------------
Ran 7 tests in 0.003s

FAILED (errors=2)
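
The report does not name a cause. One plausible explanation, offered here as an assumption, is that since Python 3.9 codecs.lookup() normalizes the encoding name before handing it to registered search functions, so a search function no longer sees the // separator that iconv expects. The sketch below records the name a search function actually receives; the function name and the made-up encoding name X-DEMO//TRANSLIT are illustrative only.

```python
import codecs


def name_passed_to_search_functions(encoding: str) -> str:
    """Return the name that codecs.lookup() hands to a codec search function."""
    seen = []

    def recorder(name):
        seen.append(name)
        return None  # "not my codec", so the lookup fails normally

    codecs.register(recorder)
    try:
        codecs.lookup(encoding)
    except LookupError:
        pass  # expected: the demo encoding does not exist
    finally:
        # codecs.unregister() only exists on Python 3.10 and later.
        if hasattr(codecs, "unregister"):
            codecs.unregister(recorder)
    return seen[0]


print(name_passed_to_search_functions("X-DEMO//TRANSLIT"))
```

On Python 3.9 and later this should print a lower-case name with underscores in place of the hyphen and slashes; on older versions the slashes are passed through.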

Convert files

Could you give example code to convert complete files or offer a function for that?
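
The package does not document a file-level function, but one can be sketched on top of the codec interface. In the example below, convert_file is a hypothetical helper: it reads the whole file into memory, decodes it with one codec and encodes it with another. The import of iconvcodec is guarded so that the sketch also runs with Python's built-in codecs alone.

```python
import os
import tempfile

try:
    import iconvcodec  # noqa: F401  adds the C library's codecs if installed
except ImportError:
    pass  # fall back to Python's built-in codecs


def convert_file(src, dst, fromcode, tocode):
    """Re-encode the file src from fromcode to tocode and write it to dst."""
    with open(src, "rb") as infile:
        text = infile.read().decode(fromcode)
    with open(dst, "wb") as outfile:
        outfile.write(text.encode(tocode))


# Demonstration on a temporary ISO-8859-1 file.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "latin1.txt")
    dst = os.path.join(tmp, "utf8.txt")
    with open(src, "wb") as f:
        f.write("café\n".encode("ISO-8859-1"))
    convert_file(src, dst, "ISO-8859-1", "UTF-8")
    with open(dst, "rb") as f:
        converted = f.read()
print(converted)
```

Because the file is read in one piece, this approach suits files that fit comfortably in memory; very large files would need a chunked variant.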

Encoding fails when output is longer than input

The following script raises an unexpected error:

#!/usr/bin/env python3

import iconvcodec

failmode_even = False

tm = '™'
if failmode_even:
    tm += ' '

tm_bin = tm.encode('ASCII//TRANSLIT')

print(tm_bin)

(in case GitHub doesn't support UTF-8 here, the character in the tm string is U+2122).

I'm using Python 3.8.5 on Ubuntu 20.04, with python-iconv 1.1.0 installed by pip.

There are two possible exceptions raised depending on whether the input string length is even or odd (change failmode_even in the example script to toggle them) but the underlying cause appears to be the same. Looks to me like a faulty (or maybe outdated?) assumption about the width of characters in Python strings but I wasn't able to figure out a fix after a few minutes of cursory exploration.

Exceptions:

Traceback (most recent call last):
  File "/home/nwoods/.local/lib/python3.8/site-packages/iconvcodec.py", line 10, in encode
    return encoder.iconv(msg), len(msg)
iconv.error: ('Argument list too long', 7, 0, b'')

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/nwoods/.local/lib/python3.8/site-packages/iconvcodec.py", line 17, in encode
    out1, len1 = encode(msg[inlen:], errors)
TypeError: slice indices must be integers or None or have an __index__ method

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "./tmfail.py", line 7, in <module>
    tm_bin = tm.encode('ASCII//TRANSLIT')
TypeError: encoding with 'ASCII//TRANSLIT' codec failed (TypeError: slice indices must be integers or None or have an __index__ method)

or

Traceback (most recent call last):
  File "/home/nwoods/.local/lib/python3.8/site-packages/iconvcodec.py", line 10, in encode
    return encoder.iconv(msg), len(msg)
iconv.error: ('Argument list too long', 7, 3, b'(TM)')

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/nwoods/.local/lib/python3.8/site-packages/iconvcodec.py", line 13, in encode
    assert inlen % 2 == 0
AssertionError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "./tmfail.py", line 11, in <module>
    tm_bin = tm.encode('ASCII//TRANSLIT')
AssertionError: encoding with 'ASCII//TRANSLIT' codec failed (AssertionError: )

Thanks for your help, let me know if there are any more details I can provide.
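
Error number 7 with the message "Argument list too long" is E2BIG on Linux, which iconv reports when the output buffer is too small. One possible workaround when calling the iconv module directly, shown here as a sketch and not as the project's fix, is to retry with a larger outlen. The names convert_with_retry, FakeIconvError and fake_convert are hypothetical; the fake converter stands in for an Iconv object's iconv method so the code runs without the package.

```python
import errno


def convert_with_retry(convert, data, error_type, outlen=None, limit=1 << 24):
    """Call convert(data, outlen), doubling outlen while it fails with E2BIG."""
    if outlen is None:
        outlen = max(len(data), 16)
    while True:
        try:
            return convert(data, outlen)
        except error_type as exc:
            # The second exception argument is the error number.
            too_small = len(exc.args) > 1 and exc.args[1] == errno.E2BIG
            if not too_small or outlen >= limit:
                raise
            outlen *= 2


class FakeIconvError(Exception):
    """Stand-in for iconv.error."""


def fake_convert(data, outlen):
    """Stand-in converter that needs room for the four bytes of b"(TM)"."""
    result = b"(TM)"
    if outlen < len(result):
        raise FakeIconvError("Argument list too long", errno.E2BIG, 0, b"")
    return result


converted = convert_with_retry(
    fake_convert, "\u2122".encode("utf-8"), FakeIconvError, outlen=1
)
print(converted)
```

With python-iconv installed, the same helper could be called as convert_with_retry(iconv.open("ASCII//TRANSLIT", "UTF-8").iconv, data, iconv.error).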
