Giter Club home page Giter Club logo

Comments (17)

exarkun avatar exarkun commented on July 20, 2024

Please be explicit. Which methods?

from pyopenssl.

tisdall avatar tisdall commented on July 20, 2024

OpenSSL.SSL.Context.use_privatekey_file - "keyfile must be a byte string"
OpenSSL.SSL.Context.use_certificate_file - "certfile must be bytes or unicode"
OpenSSL.SSL.Context.load_verify_locations - "cafile must be None or a byte string"

Three different methods all accepting a file name and each has different requirements. There are other methods I haven't used that I didn't check. I think use_privatekey_file is the one that threw an exception when I switched from 0.13.1 to 0.14 as I was using a unicode string.

from pyopenssl.

exarkun avatar exarkun commented on July 20, 2024

Thank you.

from pyopenssl.

flavio avatar flavio commented on July 20, 2024

This looks like a duplicate of issue #15.

from pyopenssl.

tisdall avatar tisdall commented on July 20, 2024

@flavio - looks related, but not a duplicate. My issue is that there's no consistency between what the different methods take as far as "string" type. Also, that it changed between the versions and broke some of my existing code.

from pyopenssl.

exarkun avatar exarkun commented on July 20, 2024

Considering the direction Python has chosen for handling paths, I think the sensible thing for pyOpenSSL is to:

  • accept bytes or unicode for any path value
  • automatically encode unicode to bytes using sys.getfilesystemencoding()

It would be nice to have a function which performed the necessary check and encoding for a single value. This could be used throughout pyOpenSSL anywhere an API accepts a path.

from pyopenssl.

tisdall avatar tisdall commented on July 20, 2024

I've been using something like this:

file_name_encoding = sys.getfilesystemencoding()
if file_name_encoding is None:
    file_name_encoding = sys.getdefaultencoding()

As per the docs saying "Return the name of the encoding used to convert Unicode filenames into system file names, or None if the system default encoding is used." (https://docs.python.org/2/library/sys.html#sys.getfilesystemencoding)

from pyopenssl.

tisdall avatar tisdall commented on July 20, 2024

how is this for a utility function:

from six import PY3, binary_type, text_type

def filename_to_bytes(filename):
    if isinstance(filename, binary_type):
        return filename
    elif isinstance(filename, text_type):
        fs_encoding = sys.getfilesystemencoding()
        if fs_encoding is None:
            fs_encoding = sys.getdefaultencoding()
        return filename.encode(fs_encoding)
    else:
        raise TypeError("filenames must either be a string or bytes")

from pyopenssl.

exarkun avatar exarkun commented on July 20, 2024

I'm not sure it makes sense to apply sys.getdefaultencoding() to filesystem paths. If there's no indication of how path names are represented in the underlying system then it's probably better not to guess or to make a very simple, safe guess (like "ascii").

Another thing to consider is that, according to the documentation (https://docs.python.org/3/library/sys.html#sys.getfilesystemencoding), getfilesystemencoding cannot return None in Python 3.2 and newer. That means the case where getfilesystemencoding() returns None is limited to Python 2.6 and Python 2.7 - where support for unicode path names is perhaps not something folks expect, anyway.

from pyopenssl.

tisdall avatar tisdall commented on July 20, 2024

As I quoted, the docs say it returns None when the file encoding is the
system default... So, when it's None I use the system default, no? But,
yes, it seems to only matter with py2.
On Mar 29, 2015 3:16 PM, "Jean-Paul Calderone" [email protected]
wrote:

I'm not sure it makes sense to apply sys.getdefaultencoding() to
filesystem paths. If there's no indication of how path names are
represented in the underlying system then it's probably better not to guess
or to make a very simple, safe guess (like "ascii").

Another thing to consider is that, according to the documentation (
https://docs.python.org/3/library/sys.html#sys.getfilesystemencoding),
getfilesystemencoding cannot return None in Python 3.2 and newer. That
means the case where getfilesystemencoding() returns None is limited to
Python 2.6 and Python 2.7 - where support for unicode path names is
perhaps not something folks expect, anyway.


Reply to this email directly or view it on GitHub
#62 (comment).

from pyopenssl.

exarkun avatar exarkun commented on July 20, 2024

So, when it's None I use the system default, no?

I don't think so, no. sys.getdefaultencoding() isn't the "system default" and it isn't particularly related to path names. It's just "ascii" on Python 2 and "utf-8" on Python 3 - unless someone screws around with it (using sys.setdefaultencoding).

I don't think that making the fallback be either "ascii" or "utf-8" depending on the Python version is a good idea, I don't think that tying it to the Python "default encoding" (which itself is simply a bad idea) is particularly useful, and I don't think that letting people screw with it by using sys.setdefaultencoding will be particularly beneficial.

I'm not entirely decided on whether hard-coding "ascii" or "utf-8" makes more sense.

The advantage of ascii is that if the encoding succeeds you can be pretty sure you're addressing the file the user meant to address (there are ascii-incompatible encodings but ... for all intents and purposes they are unused). The disadvantage is that many possible path names will not be encodable using ascii. The mitigating factor, perhaps, is that if you're using non-ascii path names your environment really, really should be set up so that sys.getfilesystemencoding() returns something (even if it isn't the right thing, but that's not pyOpenSSL's problem, that's just the downside of how Python has decided to handle path names).

The advantage of utf-8 is that it can encode anything. The disadvantage is that it might be the wrong encoding and so the referenced file won't be found.

What's worse? Having UnicodeEncodeError raised? Or an OpenSSL error indicating file-not-found because the wrong filename was used?

from pyopenssl.

tisdall avatar tisdall commented on July 20, 2024

Well, wouldn't the best thing be to mimic open()'s functionality? I'd say if you pass a string into open and it couldn't find the file due to encoding issues, then it'd be too much to expect any third party to do it better.

I guess when I came across this issue, that was the most surprising thing... I expected that I could pass it a string in the same way I had been doing with open() and it would work.

from pyopenssl.

exarkun avatar exarkun commented on July 20, 2024

Well, wouldn't the best thing be to mimic open()'s functionality?

Yes. That's fine. How does open work? The library documentation doesn't actually say. All that's documented on https://docs.python.org/3/library/functions.html#open is:

file is either a string or bytes object giving the pathname (absolute or relative
to the current working directory) of the file to be opened or an integer file
descriptor of the file to be wrapped.

from pyopenssl.

tisdall avatar tisdall commented on July 20, 2024

^_^ I knew that was coming next...

I've tried making sense of https://hg.python.org/cpython/file/00c982c9f681/Python/fileutils.c but it's a little beyond me. PyUnicode_EncodeFSDefault says it uses the filesytem default encoding but if it's not set that it defaults to the current locale encoding. That may be what getdefaultencoding is already doing in py3k instead of returning None.

from pyopenssl.

tisdall avatar tisdall commented on July 20, 2024

sorry.. I meant that's what getfilesystemencoding is doing...

from pyopenssl.

tisdall avatar tisdall commented on July 20, 2024

It seems like Py2 does what you're saying... if it has no value for Py_FileSystemDefaultEncoding it defaults to ASCII only. So, with no default encoding and something not covered by ASCII you get an UnicodeEncodeError when using open(). I'm not sure how to confirm that, though.

So, maybe this is a good starting point:

from six import PY3, binary_type, text_type

def filename_to_bytes(filename):
    if isinstance(filename, binary_type):
        return filename
    elif isinstance(filename, text_type):
        return filename.encode(sys.getfilesystemencoding() or 'ascii')
    else:
        raise TypeError("filenames must either be a string or bytes")

from pyopenssl.

tisdall avatar tisdall commented on July 20, 2024

This seems to be fixed by #209 so I'll close it.

from pyopenssl.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.