Comments (21)
Here's a proposal for TLS extensions to the connection scope, for both HTTP
and WebSockets (and, potentially, for other protocols that use TLS):
- tls_used (bool) -- True if this connection is over TLS, False otherwise. Optional; defaults to False.
- tls_client_cert_chain (Iterable[Unicode string]) -- An iterable of Unicode strings, where each string is a PEM-encoded x509 certificate. The first certificate is the client certificate. Any subsequent certificates are part of the certificate chain sent by the client, with each certificate signing the preceding one. Only applicable if the connection is over TLS; for non-TLS connections, or if the client did not provide a client certificate, it will be an empty iterable. Some web server implementations may be unable to provide this (e.g. if TLS is terminated by a separate proxy or load balancer). Optional; defaults to an empty list.
- tls_server_cert (Optional[Unicode string]) -- The PEM-encoded x509 certificate sent by the server when establishing the TLS connection. Only applicable if the connection is over TLS; for non-TLS connections it will be None. Some web server implementations may be unable to provide this (e.g. if TLS is terminated by a separate proxy or load balancer). Optional; defaults to None.
- tls_client_cert_error (Optional[Unicode string]) -- None if a client certificate was provided and successfully verified, or was not provided. If a client certificate was provided but verification failed, this is a non-empty string containing an error message or error code indicating why verification failed; the details are web server specific. Most web server implementations will reject the connection if client certificate verification fails, instead of setting this value. However, some may be configured to allow the connection anyway. This is especially useful when testing that client certificates are supported properly by the client - it allows a response containing an error message that can be presented to a human, instead of just refusing the connection. Optional; defaults to None.
- tls_version (Optional[int]) -- The TLS version in use, as the unsigned integer version number defined in the TLS specifications. Common values include 0x0303 for TLS 1.2 and 0x0304 for TLS 1.3. If TLS is not in use, set to None. Some web server implementations may be unable to provide this (e.g. if TLS is terminated by a separate proxy or load balancer); in that case set to None. Optional; defaults to None.
- tls_cipher_suite (Optional[Tuple[int, int]]) -- The TLS cipher suite that is being used. This is a pair of unsigned integers as specified in the relevant RFC, for example [0x13, 0x01] for TLS_AES_128_GCM_SHA256. If TLS is not in use, set to None. Some web server implementations may be unable to provide this (e.g. if TLS is terminated by a separate proxy or load balancer); in that case set to None. Optional; defaults to None.
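To make the proposal concrete, here is a minimal sketch of an ASGI app consuming the proposed scope keys. This is not from any spec and no server implements these keys yet; the key names are the proposal above, and the .get() defaults mirror the proposal's stated defaults:

```python
# Hypothetical sketch: an ASGI HTTP app reading the proposed TLS scope keys.
# tls_used, tls_client_cert_chain and tls_version are the names proposed in
# this thread, not part of the current ASGI spec.

async def app(scope, receive, send):
    tls_used = scope.get("tls_used", False)          # proposed default: False
    chain = list(scope.get("tls_client_cert_chain", []))  # default: empty
    version = scope.get("tls_version")               # proposed default: None

    if tls_used and version is not None:
        line = "TLS version 0x%04x, %d client cert(s)" % (version, len(chain))
    else:
        line = "No TLS information available"

    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [(b"content-type", b"text/plain")],
    })
    await send({"type": "http.response.body", "body": line.encode()})
```

Because the keys are optional, the same app runs unchanged against servers that do not provide them.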
from asgiref.
My two cents on this would be "let's define the exact variables needed for client certificates". Passing a raw handle to a transport isn't going to be portable across different HTTP versions.
from asgiref.
This ticket is basically awaiting someone to sit down and fully write out the spec as a pull request, so we can get direct feedback on the format.
from asgiref.
Thanks for your suggestions.
My concerns about having just a pre-parsed representation are:
-
The format you've proposed doesn't include all the fields from the certificate. It might be OK for your use cases, but it's not sufficient for other use-cases. (E.g. I'm testing that the TLS client certificate complies with various standards, so my tests need access to pretty much all fields of the certificate).
-
What happens when a newly-written specification adds new features to the certificate? By definition, none of the existing parsers will handle it, and there may be a long delay until the data format spec and all existing implementations have been updated. This is also a significant amount of ongoing work for the spec writers and ASGI web server authors, to support specs that may be of interest to only a small number of people. They might also be proprietary specs, so I suspect open source developers would not be keen to add support for them.
-
There would need to be a specification for the data format. The documentation for getpeercert() doesn't actually specify the data format in a way that would allow someone else to reimplement it and get the same result. And I think that's important: we can't assume everyone uses Python's "ssl" module to implement TLS, especially for high performance web servers that are unlikely to be written in Python. Writing that spec is a chunk of work.
-
It makes it harder for every ASGI web server to implement this standard, as they have to parse the certificate correctly to generate the parsed data.
So my preference is for ASGI to just to provide the PEM data, and then middleware/frameworks (e.g. Django) can provide access to a parsed version if they wish to. In this case, the API to the parsed data would be part of Django so could be extended/changed as needed following Django's versioning rules, without worrying about having to update all ASGI web servers and frameworks at the same time.
As an alternative, if you really want some pre-parsed data at the ASGI level, can I suggest having both the full PEM data, and a pre-parsed version that just contains a few key fields to support the most common use-cases? That eliminates my objections (1) and (2), it means people doing weird and/or cutting-edge things can parse the PEM certificate to get all the data they want. You would still need to write a spec, and all ASGI web servers would have to implement it. In this case, I suggest that perhaps the Subject fields from the client certificate are probably all you need - that will cover the most common use case of "I just want to identify the user".
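The division of labour suggested above (ASGI server supplies PEM only; middleware or framework layers a parsed view on top) could be sketched roughly as below. The class and the parser argument are hypothetical names invented for illustration; parser stands in for something like cryptography.x509.load_pem_x509_certificate and is injected so the sketch itself stays dependency-free:

```python
# Hypothetical middleware sketch: the ASGI server only provides PEM strings in
# tls_client_cert_chain (the proposed key); this middleware adds a parsed view
# under an invented key, tls_client_cert_parsed, for downstream code.

class ParsedClientCertMiddleware:
    def __init__(self, app, parser):
        self.app = app
        self.parser = parser  # callable: PEM string -> parsed certificate object

    async def __call__(self, scope, receive, send):
        chain = scope.get("tls_client_cert_chain") or []
        # Parse eagerly for simplicity; a real middleware might parse lazily,
        # and could evolve its parsed format under the framework's own
        # versioning rules without touching the ASGI servers.
        scope = dict(scope,
                     tls_client_cert_parsed=[self.parser(pem) for pem in chain])
        await self.app(scope, receive, send)
```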
from asgiref.
This would be tremendously useful. Mutual TLS for authentication is only partially useful without being able to identify the client. Uvicorn (and presumably other ASGI servers) refuses to implement anything like this until it is listed in the ASGI specification, but unfortunately this discussion seems to have died. What can we do to resurrect it?
from asgiref.
Hm, one problem with this is that the ASGI events are deliberately limited to a JSON-plus-bytes-type set of potential values; passing a full object like this is sort of difficult to specify.
Is there any prior art in WSGI apps for doing similar things? I totally see the use case for information from client certificates especially, but I'm wary of specifying it rather than leaving it to servers to experiment with and decide on what would be best.
from asgiref.
It looks like uwsgi provides a nonstandard HTTPS_CLIENT_CERTIFICATE, with the raw bytes of the client cert. I was unable to find anything in gunicorn implementing client certificates, except for this gist where someone implemented a custom worker to access information from the client cert. So it looks like the Python ecosystem doesn't yet have much prior art with regards to accessing the TLS session information.
I'm certainly not wedded to the very wide interface suggested by my sample patch. I suggest taking a look at specifying access to some subset of BaseTransport.get_extra_info(), and moving from there.
from asgiref.
Right, I think some level of that information would be a useful optional extra to the scope for servers to provide, though like with the raw path info, it's impossible to provide in all cases (e.g. where a loadbalancer is terminating SSL).
I'm not particularly deep in this one, so while I like the idea, I'd rather someone else comes up with the proposed set of things to include, the reasoning about why they are, and then the PR to add it to the spec.
from asgiref.
Apache's mod_ssl has a way to do this, which some apps will be using already. Please consider reusing some of this interface.
See the "Environment Variables" section of:
https://httpd.apache.org/docs/2.4/mod/mod_ssl.html
In particular:
- SSL_CLIENT_CERT_CHAIN_0 - the client certificate
- SSL_CLIENT_CERT_CHAIN_<n> - PEM-encoded certificates in client certificate chain that was sent by the client. <n> is a positive nonzero integer. SSL_CLIENT_CERT_CHAIN_<n+1> is the certificate that signed SSL_CLIENT_CERT_CHAIN_<n>. The last entry in the chain should be signed by a known CA root.
- SSL_CLIENT_VERIFY - Client cert verification status:
- "NONE" if there is no client certificate
- "SUCCESS" if client certificate validation succeeded
- "GENEROUS" ??? Not sure when Apache uses this; probably not needed ???
- "FAILED:<reason>" if client certificate validation failed and the webserver is configured to pass the request to ASGI anyway; where <reason> is the reason for the verification failure.
- SSL_SERVER_CERT - PEM-encoded server certificate
Also worth considering, although perhaps not using directly:
- SSL_PROTOCOL - The SSL protocol version (SSLv3, TLSv1, TLSv1.1, TLSv1.2) (Alternatively, would probably be better if ASGI passed this as a numeric TLS version number)
- SSL_CIPHER / SSL_CIPHER_USEKEYSIZE - The cipher being used (not sure if this is the best format - would probably be better if ASGI specified the numeric TLS identifier for the cipher suite instead).
The many other fields that Apache support are mainly extracting certain fields from the certificates. A Python app can just parse the PEM-encoded certificates to get those fields, so I don't think they add enough value to justify the cost of supporting them in ASGI servers and middleware.
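As a rough illustration of how mod_ssl's interface lines up with the keys proposed earlier in this thread, here is a hypothetical translation function. The SSL_* names follow the mod_ssl documentation (where SSL_CLIENT_CERT holds the client certificate and SSL_CLIENT_CERT_CHAIN_<n> the chain); the tls_* keys are only this thread's proposal:

```python
# Sketch only: map Apache mod_ssl environment variables to the scope keys
# proposed in this thread. Neither side of this mapping is a finished spec.

def mod_ssl_env_to_scope(env):
    scope = {"tls_used": "SSL_PROTOCOL" in env}

    # mod_ssl docs: SSL_CLIENT_CERT is the client certificate itself,
    # SSL_CLIENT_CERT_CHAIN_0, _1, ... are the rest of the chain.
    chain = []
    if "SSL_CLIENT_CERT" in env:
        chain.append(env["SSL_CLIENT_CERT"])
        n = 0
        while f"SSL_CLIENT_CERT_CHAIN_{n}" in env:
            chain.append(env[f"SSL_CLIENT_CERT_CHAIN_{n}"])
            n += 1
    scope["tls_client_cert_chain"] = chain
    scope["tls_server_cert"] = env.get("SSL_SERVER_CERT")

    # SSL_CLIENT_VERIFY is "NONE", "SUCCESS", or "FAILED:<reason>"; only the
    # failure reason maps onto the proposed tls_client_cert_error.
    verify = env.get("SSL_CLIENT_VERIFY", "NONE")
    scope["tls_client_cert_error"] = (
        verify[len("FAILED:"):] if verify.startswith("FAILED:") else None
    )
    return scope
```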
from asgiref.
@Jon-Work I like what you've got here, but instead of PEM-encoded certs (or a binary value as proposed in the issue), it seems that a dictionary similar to what is returned by getpeercert() would work better. This would eliminate the need to decode the certs to get the necessary info. Here's a (lightly edited) example I pulled from SSLObject.getpeercert():
{
    'subject': (
        (('domainComponent', 'com'),),
        (('domainComponent', 'foobar'),),
        (('organizationalUnitName', 'Users'),),
        (('commonName', 'jwatson'),),
        (('emailAddress', '[email protected]'),)
    ),
    'issuer': (
        (('domainComponent', 'com'),),
        (('domainComponent', 'foobar'),),
        (('commonName', 'FB-SUB-CA'),)
    ),
    'version': 3,
    'serialNumber': 'ABCDEF0123456789',
    'notBefore': 'Mar 2 15:54:35 2020 GMT',
    'notAfter': 'Mar 2 15:54:35 2021 GMT',
    'subjectAltName': (
        ('othername', '<unsupported>'),
        ('email', '[email protected]')
    ),
    'OCSP': ('http://ca.foobar.com/ocsp',),
    'caIssuers': ('http://ca.foobar.com/FB-SUB-CA.crt',),
    'crlDistributionPoints': ('http://ca.foobar.com/crl/FB-CA.crl',)
}
I could be missing something, but some of that looks messy to me, e.g. the unnecessary nested tuples in the subject and issuer sections. So something like this, perhaps?
{
    'subject': (
        ('domainComponent', 'com'),
        ('domainComponent', 'foobar'),
        ('organizationalUnitName', 'Users'),
        ('commonName', 'jwatson'),
        ('emailAddress', '[email protected]')
    ),
    'issuer': (
        ('domainComponent', 'com'),
        ('domainComponent', 'foobar'),
        ('commonName', 'FB-SUB-CA')
    ),
    'version': 3,
    'serialNumber': 'ABCDEF0123456789',
    'notBefore': 'Mar 2 15:54:35 2020 GMT',
    'notAfter': 'Mar 2 15:54:35 2021 GMT',
    'subjectAltName': (
        ('othername', '<unsupported>'),
        ('email', '[email protected]')
    ),
    'OCSP': ('http://ca.foobar.com/ocsp',),
    'caIssuers': ('http://ca.foobar.com/FB-SUB-CA.crt',),
    'crlDistributionPoints': ('http://ca.foobar.com/crl/FB-CA.crl',)
}
Looking at the docs for getpeercert(), it seems that the extra level of tuples is 'as designed'. I don't get why it's necessary, but perhaps it should just be left as-is.
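For what it's worth, the extra tuple level in getpeercert() appears to exist because each element of 'subject' is one RDN (relative distinguished name), and an RDN can, in rare cases, contain several attribute pairs at once; flattening as proposed would lose that grouping. A small sketch of the flattening, for the common single-valued case:

```python
# Sketch: collapse getpeercert()-style names ((('k', 'v'),), ...) into the
# flatter (('k', 'v'), ...) form proposed above. Safe when every RDN holds a
# single attribute pair, which is the common case; multi-valued RDNs would
# lose their grouping.

def flatten_name(name):
    return tuple(pair for rdn in name for pair in rdn)

subject = (
    (('domainComponent', 'com'),),
    (('commonName', 'jwatson'),),
)
# flatten_name(subject) -> (('domainComponent', 'com'), ('commonName', 'jwatson'))
```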
from asgiref.
@Jon-Work You have some fair points here, to be sure, but I'm not sure I follow all of it. The assumption I am working from is that since client certificates are validated during the handshake, any server that supports them needs to be able to parse the incoming certificate. From what I've seen in the ASGI servers I have looked at, this happens before the scope is created. I'm not sure how that works if the server doesn't understand the format of the certificate being presented. Are you expecting the client certificate to be a pass-through? I can only see that being feasible if the client certificate is not evaluated during the handshake, which I don't believe would be a proper implementation of TLS. Based on all that, it doesn't seem like a major burden to format the information in the certificate as a dictionary based on the relevant RFC(s).
Having said all that, I can definitely see how, for a more intense use case like yours, having the actual certificate in its full form would be useful or even necessary. And having the PEM sure beats having no information about the client certificate at all. And yes, for my use case (which I imagine is the most common) I am only interested in the subject. The issuer chain is also desirable, since it is not strictly the case that all subjects will be unique across issuers, and it can be useful to define authorizations based on issuer.
So I'm on board with the PEM approach. Would you be willing to lend a hand in defining a POC/PR? @hashbrowncipher has already identified where this change would need to be made, but I haven't seen any response to your proposal. The only part I would need to work out is how to convert the binary result into a PEM-encoded string (this might be trivial; I haven't begun to look).
from asgiref.
Re the server not understanding a certificate: a certificate contains fields that are marked as either "critical" or "non-critical". If the server (or anyone else parsing a certificate) doesn't understand a "non-critical" field, it's supposed to ignore it; it's only when the server doesn't understand a "critical" field that it's supposed to reject the certificate. But for what I'm doing, I might understand fields that the server doesn't.
I haven't tried this, but according to the docs, the widely used "cryptography" library can do PEM/DER conversions: https://cryptography.io/en/latest/x509/reference/
from cryptography.x509 import load_pem_x509_certificate
from cryptography.x509 import load_der_x509_certificate
from cryptography.hazmat.backends import default_backend
from cryptography.hazmat.primitives.serialization import Encoding
# Load a PEM certificate (should start "-----BEGIN CERTIFICATE-----"):
cert = load_pem_x509_certificate(pem_data_bytes, default_backend())
# Convert to a DER (binary format) certificate:
result_bytes = cert.public_bytes(Encoding.DER)
# Load a DER (binary format) certificate:
cert = load_der_x509_certificate(der_data_bytes, default_backend())
# Convert to a PEM certificate (should start "-----BEGIN CERTIFICATE-----"):
result_bytes = cert.public_bytes(Encoding.PEM)
For your use case, would adding the following extra field be suitable?
tls_client_cert_names (Iterable[Iterable[Tuple[str, str]]]) -- An iterable of x509 Distinguished Names, where each name is represented as an iterable of parts. Each part consists of a kind (common name, country, etc. (TODO: need to define a canonical list of these strings and put it in the spec)) and the value of that part. (TODO: details about decoding or not decoding IDNs, etc.) The first entry is the subject of the client certificate; the second entry is the issuer of the client certificate. Any subsequent entries are part of the certificate chain sent by the client, with each name being the issuer of the certificate that has a subject matching the preceding name. Only applicable if the connection is over TLS; for non-TLS connections, or if the client did not provide a client certificate, it will be an empty iterable. Some web server implementations may be unable to provide this (e.g. if TLS is terminated by a separate proxy or load balancer). Optional; defaults to an empty list. If tls_client_cert_chain is provided and non-empty then this field must be provided, and must contain information that is consistent with tls_client_cert_chain.
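To illustrate the shape that field would take (the names below are invented for illustration, not from any real certificate), a client cert issued by an intermediate CA might yield:

```python
# Illustrative value only: tls_client_cert_names for a client certificate
# issued by a hypothetical intermediate CA "FB-SUB-CA" chaining to
# "FB-ROOT-CA". Entry 0 is the subject; each later entry is the issuer of
# the certificate named by the preceding entry.

tls_client_cert_names = [
    [("commonName", "jwatson"), ("organizationalUnitName", "Users")],  # subject
    [("commonName", "FB-SUB-CA")],    # issuer of the client certificate
    [("commonName", "FB-ROOT-CA")],   # issuer of the intermediate
]
```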
I'm happy to help, but my availability may be limited, so it would be better if you took the lead on this, if you're willing to do that?
Hopefully the bullet points I've written can become part of a spec PR with just minor editing, I tried to write them in "spec language".
from asgiref.
@Jon-Work Sorry for the delayed response. I got a little busy on Friday. I will take a crack at it once I have a little time and follow up. Thanks.
from asgiref.
Very interested in seeing any progress on this proposal. Adding this to the spec would seem to provide one of the final missing pieces of information about a connection to the server.
As for the formatting, I am inclined to lean towards what @Jon-Work has proposed, using PEM-encoded strings, as it seems the most portable for the future: effectively providing the raw data to the server/middleware and allowing each server/middleware implementation to determine the format that is provided to the end user.
from asgiref.
I've had a go at a proposal for this. See pull request. Comments and feedback welcome.
from asgiref.
Pull request updated. Comments and feedback welcome.
from asgiref.
@James-Matthew-Watson An implementation of the proposed spec would be very useful - it would prove that it's actually possible to implement the spec. It's likely that while implementing it, you'll have comments on the spec that can make it better.
(The above applies even if the implementation code is a bit hacky - it matters that you correctly implement the spec, not how you do that).
from asgiref.
We have currently implemented support for getting client certificate details with NGINX + proxy headers (PEM-encoded). That method is missing the actual chain, but it gets us forward, and NGINX does the certificate chain validation.
While looking at the problem I found that ASGI is missing a feature for passing this through directly, which is how I found this issue.
While looking at the comments here, I wanted to point out that the JWS spec uses JSON for the representation of the chain, in the x5c field: https://datatracker.ietf.org/doc/html/rfc7515#section-4.1.6
What I find nice in the JWS spec is that the whole chain is provided, and a simple encoding of DER is used to keep it compact.
In case you are not aware, DER is the "raw" binary form of the certificate, and PEM is the same data base64-encoded with additional line breaks, a header, and a footer. The PEM format is often used in files because it allows a whole chain to be provided, while with DER there is no common way of doing that.
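The DER/PEM relationship described above can be shown with the standard library alone, since PEM is just base64-encoded DER wrapped in a header and footer (this sketch skips X.509 parsing entirely and only handles the encoding):

```python
# Stdlib-only sketch of the DER <-> PEM relationship: no certificate parsing,
# just base64 plus the PEM header/footer and 64-character line wrapping.

import base64
import textwrap

def der_to_pem(der_bytes):
    b64 = base64.b64encode(der_bytes).decode("ascii")
    lines = textwrap.wrap(b64, 64)  # PEM bodies use 64-character lines
    return ("-----BEGIN CERTIFICATE-----\n"
            + "\n".join(lines)
            + "\n-----END CERTIFICATE-----\n")

def pem_to_der(pem_str):
    body = "".join(
        line for line in pem_str.splitlines() if not line.startswith("-----")
    )
    return base64.b64decode(body)
```

For real certificates, the cryptography calls shown earlier in the thread are the safer route, since they validate the structure as well.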
In our use case we have custom extended key usages in the certificates; we check whether certain extended key usages are allowed, and then fish computer vs. user certificate details from the SAN field. In theory we could inspect a parsed form of the certificate, but somehow it feels better to have the exact client certificate (chain) and analyze that. Using the cryptography library as illustrated above is quite easy.
What is very important is that technical client certificate validation is done properly before forwarding the details to the application. This removes the need for the application to do full chain validation, especially in the case where full chain information is not available.
Is there a need for help with this topic?
We have played with the stack nginx + uvicorn + starlette + fastapi.
from asgiref.
This can be closed, I guess?
from asgiref.