Pinniped Logo

Overview

Pinniped provides identity services to Kubernetes.

  • Easily plug external identity providers into Kubernetes clusters while offering a simple install and configuration experience. Leverage first-class integration with Kubernetes and the kubectl command line.
  • Give users a consistent, unified login experience across all your clusters, including on-premises and managed cloud environments.
  • Securely integrate with an enterprise IDP using standard protocols or use secure, externally managed identities instead of relying on simple, shared credentials.

To learn more, please visit the Pinniped project's website, https://pinniped.dev.

Getting started with Pinniped

Care to kick the tires? It's easy to install and try Pinniped.

Discussion

Got a question, comment, or idea? Please don't hesitate to reach out via GitHub Discussions, GitHub Issues, or the #pinniped channel in the Kubernetes Slack workspace. Join our Google Group to receive updates and meeting invitations.

Contributions

Pinniped is better because of our contributors and maintainers. It is because of you that we can bring great software to the community.

Want to get involved? Contributions are welcome.

Please see the contributing guide for more information about reporting bugs, requesting features, building and testing the code, submitting PRs, and other contributor topics.

Adopters

Some organizations and products using Pinniped are featured in ADOPTERS.md. Add your own organization or product here.

Reporting security vulnerabilities

Please follow the procedure described in SECURITY.md.

License

Pinniped is open source and licensed under Apache License Version 2.0. See LICENSE.

Copyright 2020-2022 the Pinniped contributors. All Rights Reserved.

Pinniped Issues

Distribute an official Pinniped Helm chart

Is your feature request related to a problem? Please describe.
I'd like an official Helm chart so I can easily install and manage Pinniped, especially with a GitOps workflow.

Describe the solution you'd like
Create an official Helm chart.

Describe alternatives you've considered
Writing my own, but an official chart would be better maintained and more stable.

Are you considering submitting a PR for this feature?

Sure

  • How will this project improvement be tested?
    • Helm lint
  • How does this change the current architecture?
    • It probably doesn't
  • How will this change be backwards compatible?
    • It doesn't change anything in that way
  • How will this feature be documented?
    • I don't understand this question

Additional context

Issue federation ID tokens with group claim as an array

What happened?

  • @enj and I were working on #322.
  • We noticed that we issue downstream ID tokens with a groups claim formatted either as an array or a string:
    groupsAsArray, okAsArray := groupsAsInterface.([]string)
    groupsAsString, okAsString := groupsAsInterface.(string)
  • We should be consistent about the data type for our downstream groups claim.

What did you expect to happen?

  • We should be consistent about the data type we use for our downstream groups claim.
  • We should always use an array.

What is the simplest way to reproduce this behavior?

In what environment did you see this bug?

  • Pinniped server version: v0.3.0
  • Pinniped client version: n/a
  • Pinniped container image (if using a public container image): n/a
  • Pinniped configuration (what IDP(s) are you using? what downstream credential minting mechanisms are you using?): n/a
  • Kubernetes version (use kubectl version): n/a
  • Kubernetes installer & version (e.g., kubeadm version): n/a
  • Cloud provider or hardware configuration: n/a
  • OS (e.g: cat /etc/os-release): n/a
  • Kernel (e.g. uname -a): n/a
  • Others: n/a

What else is there to know about this bug?

  • We should also try to cast groupsAsInterface to an []interface{} in the case that the upstream token has been unmarshaled that way (see the sketch below).
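
A minimal sketch of that normalization (toGroupsSlice is a hypothetical helper name, not existing Pinniped code); it returns the groups claim as a []string regardless of whether the upstream claim was decoded as []string, string, or []interface{}:

// toGroupsSlice normalizes a decoded "groups" claim into a []string.
// It handles the three shapes mentioned above: []string, string, and
// []interface{} (the shape encoding/json produces by default).
func toGroupsSlice(groupsAsInterface interface{}) []string {
    switch groups := groupsAsInterface.(type) {
    case []string:
        return groups
    case string:
        return []string{groups}
    case []interface{}:
        result := make([]string, 0, len(groups))
        for _, g := range groups {
            if s, ok := g.(string); ok {
                result = append(result, s)
            }
        }
        return result
    default:
        return nil
    }
}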

Document the deployment, configuration, and usage of the Supervisor

Acceptance Criteria

GIVEN I want to learn about the Pinniped Supervisor
WHEN I go to the Pinniped README.md
THEN I can learn a little bit about what the Supervisor is and why I should care
AND I can still learn a little bit about what the Concierge is and why I should care (this documentation should already exist!)
AND I know where to go to get a deeper understanding of the Supervisor
AND I know where to go to get a deeper understanding of the Concierge (this documentation should already exist!)
AND I know where to go to run a demo of the Supervisor
AND When I run the demo of the Supervisor, it works

Supervisor token refresh should fail when the upstream refresh token no longer works for LDAP

Purpose: Today, the duration of a downstream Supervisor refresh session is 9 hours. Once you have logged in to your upstream IDP, your downstream session remains valid for 9 hours even if your session or user account in the upstream IDP is revoked. Instead, during the next downstream token refresh, the Supervisor should notice that your upstream session is no longer valid and the refresh should fail, which will cause the CLI to start a new login flow for the user.

Given that I am running a Supervisor with an upstream LDAP or AD IDP configured,
And given that I have already used the Pinniped CLI to get tokens from the Supervisor,
And given that I wait long enough for my downstream access token and my upstream refresh token to expire (but less than 9 hours) or I otherwise cause my upstream session to expire (revoke the session or disable the user account or whatever)
When I use the Pinniped CLI to refresh my downstream session
Then I am returned to the login page of my provider, and I can successfully log in again to get new tokens

Notes:

  • Refreshing group membership is out of scope and will be addressed by a future issue
  • As part of this issue, the Supervisor refresh session duration can be extended to more like 12 or 16 hours, which would be a reasonable default considering that the upstream IDP can effectively override it by using shorter refresh session durations if desired. 12 or 16 hours is long enough to cover a whole work day, but short enough that it will not cover any part of the next work day.
  • Possible implementation for LDAP (see the sketch after this list):
    • Record DN on initial login
    • Assert that the LDAP entry still exists
    • Assert that the configured UID field is unchanged in LDAP (i.e. assert that our downstream sub claim would not change)
    • Assert that the configured Username field is unchanged in LDAP
  • Possible implementation for AD:
    • All of the same checks as LDAP
    • Assert that the user is not disabled via the userAccountControl field
    • Assert that the user's password has not changed since the initial login via the pwdLastSet field
    • Assert that the user's password has not expired since the initial login via the userAccountControl field
    • Assert that the user’s account is not currently locked out via the userAccountControl field
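
A rough sketch of those LDAP checks using the go-ldap client, assuming hypothetical names (validateUpstreamLDAPEntry and its parameters are illustrative; the real implementation would live inside the Supervisor's upstream LDAP code and the AD-specific checks are omitted):

package upstreamldap

import (
    "fmt"

    "github.com/go-ldap/ldap/v3"
)

// validateUpstreamLDAPEntry re-checks the LDAP entry recorded at the initial login.
// dn, uidAttr, usernameAttr, wantUID, and wantUsername are illustrative parameter
// names, not Pinniped's actual API.
func validateUpstreamLDAPEntry(conn *ldap.Conn, dn, uidAttr, usernameAttr, wantUID, wantUsername string) error {
    search := ldap.NewSearchRequest(
        dn,                   // base: the DN recorded during the initial login
        ldap.ScopeBaseObject, // look only at that entry
        ldap.NeverDerefAliases, 0, 0, false,
        "(objectClass=*)",
        []string{uidAttr, usernameAttr},
        nil,
    )
    result, err := conn.Search(search)
    if err != nil {
        return fmt.Errorf("upstream LDAP search failed: %w", err)
    }
    if len(result.Entries) != 1 {
        return fmt.Errorf("upstream LDAP entry no longer exists")
    }
    entry := result.Entries[0]
    if entry.GetAttributeValue(uidAttr) != wantUID {
        return fmt.Errorf("upstream UID changed, so the downstream sub claim would change")
    }
    if entry.GetAttributeValue(usernameAttr) != wantUsername {
        return fmt.Errorf("upstream username changed")
    }
    return nil
}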

Fake issue for testing, please ignore

Purpose: This is a fake issue for testing.

GIVEN that I have a fake issue
WHEN I look at the issue
THEN I can easily see that it is fake and should be ignored.

CI: do not gate integration tests running on other tests

TODO

Is your feature request related to a problem? Please describe.
A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]

Describe the solution you'd like
A clear and concise description of what you want to happen.

Describe alternatives you've considered
A clear and concise description of any alternative solutions or features you've considered.

Are you considering submitting a PR for this feature?

  • How will this project improvement be tested?
  • How does this change the current architecture?
  • How will this change be backwards compatible?
  • How will this feature be documented?

Additional context
Add any other context or screenshots about the feature request here.

`pinniped get-kubeconfig` double-prints error messages

What happened?

I run pinniped get-kubeconfig and it prints the error message twice.

pivotal@akeesler-a01:enhancements$ pinniped get-kubeconfig
Error: required flag(s) "token" not set
required flag(s) "token" not set
pivotal@akeesler-a01:enhancements$ pinniped get-kubeconfig --token whatever --pinniped-namespace integration --idp-type webhook
Error: no identity providers were found in namespace "integration"
no identity providers were found in namespace "integration"

What did you expect to happen?

I run pinniped get-kubeconfig and it prints the error message once.

pivotal@akeesler-a01:enhancements$ pinniped get-kubeconfig
Error: required flag(s) "token" not set
pivotal@akeesler-a01:enhancements$ pinniped get-kubeconfig --token whatever --pinniped-namespace integration --idp-type webhook
Error: no identity providers were found in namespace "integration"

What is the simplest way to reproduce this behavior?

Run pinniped get-kubeconfig.

In what environment did you see this bug?

  • Pinniped server version: n/a
  • Pinniped client version: https://github.com/vmware-tanzu/pinniped/releases/tag/v0.1.0
  • Pinniped container image (if using a public container image): n/a
  • Pinniped configuration (what IDP(s) are you using? what downstream credential minting mechanisms are you using?): n/a
  • Kubernetes version (use kubectl version): n/a
  • Kubernetes installer & version (e.g., kubeadm version): n/a
  • Cloud provider or hardware configuration: n/a
  • OS (e.g: cat /etc/os-release): n/a
  • Kernel (e.g. uname -a): n/a
  • Others: n/a

What else is there to know about this bug?

Whoever fixes this might want to check and see if our CLI double prints error messages in other cases.
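
One common cause of this symptom in cobra-based CLIs (stated here as an assumption about the root cause, not a confirmed diagnosis) is that the command's RunE returns an error which cobra prints, and then main prints it again. A minimal sketch of the usual fix is to silence cobra's own printing so the error is reported exactly once:

package main

import (
    "fmt"
    "os"

    "github.com/spf13/cobra"
)

func main() {
    cmd := &cobra.Command{
        Use:           "pinniped",
        SilenceErrors: true, // don't let cobra print the error itself...
        SilenceUsage:  true, // ...or dump usage text on every error
    }
    if err := cmd.Execute(); err != nil {
        fmt.Fprintf(os.Stderr, "Error: %v\n", err) // ...so it is printed exactly once here
        os.Exit(1)
    }
}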

Add integration test for Pinniped Supervisor/JWTAuthenticator end-to-end flow

Is your feature request related to a problem? Please describe.

We don't currently have an integration test that tests the end-to-end flow of 1) obtaining a token set from the Pinniped Supervisor and 2) using that token set to authenticate to a cluster via a JWTAuthenticator. We should add this as an integration test, since it is quite possibly the most common Pinniped architecture that folks will deploy.

Describe the solution you'd like

We should have an integration test that 1) logs into the Pinniped Supervisor and gets a token set issued for a particular cluster and 2) uses that token set to authenticate to the particular cluster via a JWTAuthenticator.

Describe alternatives you've considered

None.

Are you considering submitting a PR for this feature?

  • How will this project improvement be tested?
    It is a test, so N/A.
  • How does this change the current architecture?
    It is a test, so N/A.
  • How will this change be backwards compatible?
    It is a test, so N/A.
  • How will this feature be documented?
    It is a test, so N/A.

Additional context

This came up during development of #258. We punted on this to try to optimize for the shortest path to a working system.

Context deadline exceeded during credential exchange

I'm consistently seeing the 30s context deadline exceeded the first time I exchange credentials, followed by successful responses taking 6-7 seconds afterwards.

$ time PINNIPED_NAMESPACE=pinniped-concierge PINNIPED_IDP_TYPE=webhook PINNIPED_IDP_NAME=local-user-authenticator PINNIPED_TOKEN=pinny-the-seal:password123 PINNIPED_K8S_API_ENDPOINT=https://127.0.0.1:46863 PINNIPED_AUTHENTICATOR_TYPE=webhook PINNIPED_AUTHENTICATOR_NAME=local-user-authenticator pinniped exchange-credential

failed to get credential: could not login: Post "https://127.0.0.1:46863/apis/login.concierge.pinniped.dev/v1alpha1/namespaces/pinniped-concierge/tokencredentialrequests": context deadline exceeded
0.02s user 0.01s system 0% cpu 30.024 total

$ time PINNIPED_NAMESPACE=pinniped-concierge PINNIPED_IDP_TYPE=webhook PINNIPED_IDP_NAME=local-user-authenticator PINNIPED_TOKEN=pinny-the-seal:password123 PINNIPED_K8S_API_ENDPOINT=https://127.0.0.1:46863 PINNIPED_AUTHENTICATOR_TYPE=webhook PINNIPED_AUTHENTICATOR_NAME=local-user-authenticator pinniped exchange-credential
{"kind":"ExecCredential","apiVersion":"client.authentication.k8s.io/v1beta1", ...}}
0.01s user 0.00s system 0% cpu 6.988 total

$ time PINNIPED_NAMESPACE=pinniped-concierge PINNIPED_IDP_TYPE=webhook PINNIPED_IDP_NAME=local-user-authenticator PINNIPED_TOKEN=pinny-the-seal:password123 PINNIPED_K8S_API_ENDPOINT=https://127.0.0.1:46863 PINNIPED_AUTHENTICATOR_TYPE=webhook PINNIPED_AUTHENTICATOR_NAME=local-user-authenticator pinniped exchange-credential
{"kind":"ExecCredential","apiVersion":"client.authentication.k8s.io/v1beta1", ...}}
0.02s user 0.01s system 0% cpu 7.066 total

What did you expect to happen?

I expected a cold request to be much quicker than the 30s deadline (and I guess from the deadline that it's not meant to take that long?). Let me know if this is known and expected, or if it's not reproducible and is just my local kind cluster.

Please be specific and include proposed behavior!

The request to exchange-credential returns within a second, preferably much less.

What is the simplest way to reproduce this behavior?

As outlined above, call pinniped exchange-credential explicitly.

In what environment did you see this bug?

pinniped version
version.Info{Major:"0", Minor:"2", GitVersion:"v0.2.0", GitCommit:"1223cf78770d85ba4a56b261b05b0b7216942067", GitTreeState:"clean", BuildDate:"2020-11-03T15:41:04Z", GoVersion:"go1.15.3", Compiler:"gc", Platform:"linux/amd64"}
  • Pinniped server version: 0.2.0
  • Pinniped client version: 0.2.0
  • Pinniped container image (if using a public container image): docker.io/getpinniped/pinniped-server:v0.2.0
  • Pinniped configuration (what IDP(s) are you using? what downstream credential minting mechanisms are you using?): See above env vars for exchange-credential.
  • Kubernetes version (use kubectl version):
    Client Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.2", GitCommit:"52c56ce7a8272c798dbc29846288d7cd9fbae032", GitTreeState:"clean", BuildDate:"2020-04-16T11:56:40Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}
    Server Version: version.Info{Major:"1", Minor:"18", GitVersion:"v1.18.2", GitCommit:"52c56ce7a8272c798dbc29846288d7cd9fbae032", GitTreeState:"clean", BuildDate:"2020-04-30T20:19:45Z", GoVersion:"go1.13.9", Compiler:"gc", Platform:"linux/amd64"}
  • Kubernetes installer & version (e.g., kubeadm version): kind v0.8.0 go1.14.2 linux/amd64
  • Cloud provider or hardware configuration: Dell Precision 5540
  • OS (e.g: cat /etc/os-release):
    NAME="Ubuntu"
    VERSION="20.04.1 LTS (Focal Fossa)"
    ID=ubuntu
    ID_LIKE=debian
    PRETTY_NAME="Ubuntu 20.04.1 LTS"
    VERSION_ID="20.04"
    HOME_URL="https://www.ubuntu.com/"
    SUPPORT_URL="https://help.ubuntu.com/"
    BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
    PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
    VERSION_CODENAME=focal
    UBUNTU_CODENAME=focal
  • Kernel (e.g. uname -a): Linux hpmor4 5.4.0-52-generic #57-Ubuntu SMP Thu Oct 15 10:57:00 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
  • Others:

What else is there to know about this bug?

Also, I don't see anything relevant in the pinniped-concierge logs:

k -n pinniped-concierge logs deployment/pinniped-concierge | tail -n 10
Found 3 pods, using pod/pinniped-concierge-6d9f5db45f-wkpxm
I1109 22:15:56.642997       1 apiservice_updater.go:71] apiServiceUpdaterController Sync successfully updated API service
I1109 22:15:57.075550       1 webhookcachefiller.go:77] webhookcachefiller-controller "msg"="added new webhook authenticator" "endpoint"="https://local-user-authenticator.local-user-authenticator.svc/authenticate" "webhook"={"name":"local-user-authenticator","namespace":"pinniped-concierge"} 
I1109 22:18:56.639313       1 certs_observer.go:66] certsObserverController Sync updated certs in the dynamic cert provider
I1109 22:18:56.639409       1 certs_expirer.go:88] certsExpirerController Sync found a renew delta of -487h25m36.360595607s
I1109 22:18:56.641064       1 apiservice_updater.go:71] apiServiceUpdaterController Sync successfully updated API service
I1109 22:18:56.993453       1 webhookcachefiller.go:77] webhookcachefiller-controller "msg"="added new webhook authenticator" "endpoint"="https://local-user-authenticator.local-user-authenticator.svc/authenticate" "webhook"={"name":"local-user-authenticator","namespace":"pinniped-concierge"} 
I1109 22:21:56.639700       1 certs_observer.go:66] certsObserverController Sync updated certs in the dynamic cert provider
I1109 22:21:56.639949       1 certs_expirer.go:88] certsExpirerController Sync found a renew delta of -487h22m36.360064181s
I1109 22:21:56.643493       1 apiservice_updater.go:71] apiServiceUpdaterController Sync successfully updated API service
I1109 22:21:56.992338       1 webhookcachefiller.go:77] webhookcachefiller-controller "msg"="added new webhook authenticator" "endpoint"="https://local-user-authenticator.local-user-authenticator.svc/authenticate" "webhook"={"name":"local-user-authenticator","namespace":"pinniped-concierge"} 

Design: safe handling of usernames and groups from different IDPs

TODO

Is your feature request related to a problem? Please describe.
A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]

Describe the solution you'd like
A clear and concise description of what you want to happen.

Describe alternatives you've considered
A clear and concise description of any alternative solutions or features you've considered.

Are you considering submitting a PR for this feature?

  • How will this project improvement be tested?
  • How does this change the current architecture?
  • How will this change be backwards compatible?
  • How will this feature be documented?

Additional context
Add any other context or screenshots about the feature request here.

Measure the latency of our CI process over time.

Is your feature request related to a problem? Please describe.
Whenever I look at our CI history, I can't tell at a glance whether the CI process has gotten slower over time.

Describe the solution you'd like
I'd like to see a chart showing long-term trends in the CI timing.

Add minimal impersonation proxy as an alternative strategy for concierge on managed clusters

Purpose: This will allow Pinniped to work on more kinds of clusters. Today, the concierge only provides access to a cluster if it can access the cluster signing key, which is not possible on many cloud providers' clusters. An impersonation proxy will allow the user to provide a JWT token to make calls to the Kubernetes API server without the concierge having to issue a cluster-specific certificate.

Given that I have a GKE cluster configured with Pinniped
And given that I have configured my kubeconfig to use the impersonation proxy as the kube API server
And given that I have configured my kubeconfig to use the Pinniped CLI with an OIDC provider
When I run kubectl get commands
Then my commands execute as my identity from the OIDC provider
And my commands are affected by the RBAC settings for my identity

Notes: Additional scope will be part of later stories.

  • #67 was a draft version that we can look at for reference.
  • Configuration of the proxy is out of scope.
  • Group membership is out of scope.
  • JWT validation is out of scope.
  • Assume that the username will always be in the username claim.
  • Do the simplest possible thing to get kubectl to accept our TLS configuration (or lack thereof); a general client-go impersonation sketch follows these notes.
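
For background on what the proxy would do on the user's behalf, here is a minimal client-go sketch of Kubernetes impersonation in general (not Pinniped's implementation); the kubeconfig path, username, and group are placeholder values:

package main

import (
    "context"
    "fmt"

    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/client-go/kubernetes"
    "k8s.io/client-go/rest"
    "k8s.io/client-go/tools/clientcmd"
)

func main() {
    // Load whatever credential the proxy itself holds (assumption: a kubeconfig file).
    config, err := clientcmd.BuildConfigFromFlags("", "/path/to/kubeconfig")
    if err != nil {
        panic(err)
    }
    // Impersonate the end user; client-go sends these as the
    // Impersonate-User and Impersonate-Group headers.
    config.Impersonate = rest.ImpersonationConfig{
        UserName: "pinny@example.com",    // hypothetical username from the OIDC provider
        Groups:   []string{"developers"}, // hypothetical group membership
    }
    client, err := kubernetes.NewForConfig(config)
    if err != nil {
        panic(err)
    }
    pods, err := client.CoreV1().Pods("default").List(context.TODO(), metav1.ListOptions{})
    if err != nil {
        panic(err)
    }
    fmt.Printf("saw %d pods as the impersonated user\n", len(pods.Items))
}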

Consider adding caching to `dynamicOpenIDConnectECDSAStrategy` to reduce memory overhead

Is your feature request related to a problem? Please describe.
Every time dynamicOpenIDConnectECDSAStrategy.GenerateIDToken() is called (i.e., every time we issue an ID token), we call compose.NewOpenIDConnectECDSAStrategy(s.fositeConfig, key). We don't necessarily need to call this constructor every time, since the s.fositeConfig and key may be the same as the last time that dynamicOpenIDConnectECDSAStrategy.GenerateIDToken() was called.

See https://github.com/vmware-tanzu/pinniped/pull/249/files#diff-81c6a81fe5a22ee2ceab532b3aba3af58fa78b283bd98c07db18950aa5ea9f00R51.

Describe the solution you'd like
Cache the values of compose.NewOpenIDConnectECDSAStrategy(s.fositeConfig, key) per s.fositeConfig and key so we don't allocate a whole new struct every time dynamicOpenIDConnectECDSAStrategy.GenerateIDToken() is called.
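
A rough sketch of that caching, with hypothetical names (cachingStrategyProvider, newStrategy, idTokenStrategy); the real version would also compare s.fositeConfig, not just the signing key, before deciding to rebuild:

package strategy

import (
    "crypto/ecdsa"
    "sync"
)

// idTokenStrategy stands in for the value returned by
// compose.NewOpenIDConnectECDSAStrategy; the concrete type lives in fosite.
type idTokenStrategy interface{}

// newStrategy stands in for the real, comparatively expensive constructor call.
func newStrategy(key *ecdsa.PrivateKey) idTokenStrategy {
    return struct{}{} // placeholder for compose.NewOpenIDConnectECDSAStrategy(fositeConfig, key)
}

// cachingStrategyProvider memoizes the constructed strategy per signing key so
// that repeated GenerateIDToken calls with the same key reuse a single value.
type cachingStrategyProvider struct {
    mu           sync.Mutex
    lastKey      *ecdsa.PrivateKey
    lastStrategy idTokenStrategy
}

func (c *cachingStrategyProvider) get(key *ecdsa.PrivateKey) idTokenStrategy {
    c.mu.Lock()
    defer c.mu.Unlock()
    if c.lastStrategy == nil || c.lastKey != key {
        c.lastKey = key
        c.lastStrategy = newStrategy(key) // only rebuild when the key actually changes
    }
    return c.lastStrategy
}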

Describe alternatives you've considered
Right now, we aren't doing this because we are trying to simplify our implementation. It could be the case that this is OK and doesn't cause the memory pressure that we think it does. But it seems wasteful to call compose.NewOpenIDConnectECDSAStrategy(s.fositeConfig, key) many times where we could just reuse the return value from the first call.

Are you considering submitting a PR for this feature?

  • How will this project improvement be tested?

  • Unit test that checks a local cache for the dynamicOpenIDConnectECDSAStrategy to make sure it is using the cache properly.

  • How does this change the current architecture?

  • It doesn't.

  • How will this change be backwards compatible?

  • It doesn't affect user-facing functionality.

  • How will this feature be documented?

  • It is an internal implementation change, so it doesn't necessarily need any documentation. However, we could add a release note that we have reduced memory usage after this change.

Additional context

This came about when @margocrawf and I were working on the token endpoint.

Remove the custom labels feature

Is your feature request related to a problem? Please describe.
No, this would be a refactoring with no change in behavior. The code would be simplified.

Describe the solution you'd like
Use the "middleware" approach that we use to add owner references to also add custom labels when the operator chose to initially deploy the Concierge and/or Supervisor apps with custom labels (which is a feature of our ytt templates). Remove the code that explicitly passes around the custom labels into the controllers and remove the related unit tests from the controllers.

Describe alternatives you've considered
None.

Are you considering submitting a PR for this feature?
This is open for anyone to work on.

  • How will this project improvement be tested?
    One possibility is a new integration test similar to the existing uninstall tests.

  • How does this change the current architecture?
    It moves some code around, but no major changes.

  • How will this change be backwards compatible?
    The approach described above should be backwards compatible, I think. Resources that had custom labels applied by an older version of the software would still have them applied after upgrade. Newer versions would still apply custom labels, just using a new code path.

  • How will this feature be documented?
    It is already documented.

UpstreamOIDCProvider has incorrect documentation about client secret type

What happened?

What did you expect to happen?

  • The docs should refer to the correct secret type, "secrets.pinniped.dev/oidc-client".

What is the simplest way to reproduce this behavior?

In what environment did you see this bug?

  • Pinniped server version:
  • Pinniped client version:
  • Pinniped container image (if using a public container image):
  • Pinniped configuration (what IDP(s) are you using? what downstream credential minting mechanisms are you using?):
  • Kubernetes version (use kubectl version):
  • Kubernetes installer & version (e.g., kubeadm version):
  • Cloud provider or hardware configuration:
  • OS (e.g: cat /etc/os-release):
  • Kernel (e.g. uname -a):
  • Others:

What else is there to know about this bug?

LDAP login works when the LDAPIdentityProvider configuration uses the simplest (TBD) LDAP options

Purpose: After arriving at the LDAP login page, I would like to be able to complete the login by using my correct LDAP account username and password.

Given that I have an LDAPIdentityProvider configured with Pinniped
And given I have only used the most straightforward configuration options (specific options TBD)
And given I have configured my kubeconfig to use pinniped
And given I have run kubectl and been presented with the LDAP login page
When I enter my correct username and password and submit the form
Then my browser is redirected to the page that says "login was successful and you may close the browser"
And then my kubectl command continues successfully using my identity from LDAP

Notes: For this issue, we should choose whichever LDAP configuration is the easiest to implement and only support those configuration options as part of this issue. Other configuration options that we wish to support should be added as new issues.

Add Kubernetes 1.20 codegen

Is your feature request related to a problem? Please describe.

  • Kubernetes 1.20 came out recently.
  • I'd like for Pinniped to support the latest Kubernetes.

Describe the solution you'd like

  1. Add Kubernetes 1.20 generated code
  2. Update any hardcoded references to 1.19 in our code (like here and here and here)

Describe alternatives you've considered

N/A

Are you considering submitting a PR for this feature?

  • How will this project improvement be tested?
  • New CI jobs that run our integration tests on 1.20.
  • How does this change the current architecture?
  • Hopefully not at all...
  • How will this change be backwards compatible?
  • Hopefully it will be without us having to do anything...
  • How will this feature be documented?
  • Release note?

Additional context

N/A

Test epic

This is a test epic created in ZenHub to see how it looks in GitHub.

JWTAuthenticator takes 10 seconds to warm up

What happened?

When I create/update a JWTAuthenticator, it takes 10 seconds to initialize since we use Kube's oidc.Authenticator. Therefore, when I create a login.concierge.pinniped.dev/v1alpha1.TokenCredentialRequest in that 10 second window, I get a response with "authentication failed" and an underlying log message of "oidc: authenticator not initialized".

What did you expect to happen?

I want to be able to start using a JWTAuthenticator immediately after creating/updating it.

What is the simplest way to reproduce this behavior?

  1. Create a JWTAuthenticator.
  2. Within 10 seconds, create a TokenCredentialRequest referring to that JWTAuthenticator with a valid ID token.
  3. You should see that authentication failed.
  4. To convince yourself that the ID token is valid, wait 10 seconds, and then create the TokenCredentialRequest again, and see that it succeeds.

In what environment did you see this bug?

  • Pinniped server version:
  • Pinniped client version:
  • Pinniped container image (if using a public container image):
  • Pinniped configuration (what IDP(s) are you using? what downstream credential minting mechanisms are you using?):
  • Kubernetes version (use kubectl version):
  • Kubernetes installer & version (e.g., kubeadm version):
  • Cloud provider or hardware configuration:
  • OS (e.g: cat /etc/os-release):
  • Kernel (e.g. uname -a):
  • Others:

What else is there to know about this bug?

This came up during development of #258. We wanted to use Kube's oidc.Authenticator to get the benefit of that code, but the 10 second warm-up delay is a bummer.

Design: browser-less (and possibly password-less) CLI SSO

TODO

Is your feature request related to a problem? Please describe.
A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]

Describe the solution you'd like
A clear and concise description of what you want to happen.

Describe alternatives you've considered
A clear and concise description of any alternative solutions or features you've considered.

Are you considering submitting a PR for this feature?

  • How will this project improvement be tested?
  • How does this change the current architecture?
  • How will this change be backwards compatible?
  • How will this feature be documented?

Additional context
Add any other context or screenshots about the feature request here.

Add an integration test helper to assert that no pods restart during the test.

Is your feature request related to a problem? Please describe.
Occasionally, there could be a bug in the Pinniped supervisor or concierge servers which is triggered by the integration test suite and causes the affected pod to crash and be restarted by Kubernetes. This can lead to situations where we don't notice a bug because Kubernetes very helpfully restarts the container and our tests tolerate some eventual consistency (e.g., using require.Eventually() assertions).

Describe the solution you'd like
We should have some test assertion helper that checks on the Pinniped pods before and after each test to validate that they have not been restarted. This is similar to the library.DumpLogs() helper we have already, and should be a straightforward extension of that code.
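
A minimal sketch of such a helper, assuming hypothetical names (assertNoRestartsDuringTest and restartCounts are illustrative, not existing code in the library package); it snapshots container restart counts when the test starts and fails the test if any count grows by the time it finishes:

package library

import (
    "context"
    "testing"

    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/client-go/kubernetes"
)

// assertNoRestartsDuringTest snapshots container restart counts in the given
// namespace and registers a cleanup that fails the test if any count grew.
func assertNoRestartsDuringTest(t *testing.T, client kubernetes.Interface, namespace string) {
    t.Helper()
    before := restartCounts(t, client, namespace)
    t.Cleanup(func() {
        after := restartCounts(t, client, namespace)
        for key, count := range after {
            if count > before[key] {
                t.Errorf("container %s restarted during the test (%d -> %d)", key, before[key], count)
            }
        }
    })
}

func restartCounts(t *testing.T, client kubernetes.Interface, namespace string) map[string]int32 {
    t.Helper()
    pods, err := client.CoreV1().Pods(namespace).List(context.TODO(), metav1.ListOptions{})
    if err != nil {
        t.Fatalf("could not list pods: %v", err)
    }
    counts := map[string]int32{}
    for _, pod := range pods.Items {
        for _, status := range pod.Status.ContainerStatuses {
            counts[pod.Name+"/"+status.Name] = status.RestartCount
        }
    }
    return counts
}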

Describe alternatives you've considered
We could deploy our code under test in pods with restartPolicy: Never, but we'd need to significantly refactor the deployment YAML.

Are you considering submitting a PR for this feature?

Not immediately; please leave a comment if you'd like to take a shot at this.

  • How will this project improvement be tested?

    It's test code, so we can probably just do some manual testing (e.g., insert a panic() into the supervisor code and make sure the tests fail).

  • How does this change the current architecture?

    This is a minor change.

  • How will this change be backwards compatible?

    This is a minor change to tests, no backwards compatibility concerns.

  • How will this feature be documented?

    No documentation needed.

Generated secrets have "blockOwnerDeletion: true", but they don't need it.

What happened?

When the supervisor is installed and configured, several secrets are generated with blockOwnerDeletion: true, meaning they would block the deletion of the owning object.

In fact, the owning Deployment doesn't have a foregroundDeletion finalizer, so this setting isn't doing anything anyway.

What did you expect to happen?

We don't need to set blockOwnerDeletion: true; we can just let the secrets get cleaned up asynchronously after the owner is gone.
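
For reference, a minimal sketch of building an owner reference that simply omits blockOwnerDeletion (illustrative only; Pinniped's controllers may construct owner references differently):

package owner

import (
    appsv1 "k8s.io/api/apps/v1"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// ownerRefWithoutBlocking builds an OwnerReference to the given Deployment.
// Leaving BlockOwnerDeletion unset (nil) means the generated Secret never
// blocks deletion of its owner; it is simply garbage collected afterwards.
func ownerRefWithoutBlocking(deployment *appsv1.Deployment) metav1.OwnerReference {
    return metav1.OwnerReference{
        APIVersion: "apps/v1",
        Kind:       "Deployment",
        Name:       deployment.Name,
        UID:        deployment.UID,
        // BlockOwnerDeletion and Controller are intentionally left nil here.
    }
}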

What is the simplest way to reproduce this behavior?

Install the supervisor and configure a FederationDomain. You should see several Secret resources created with blockOwnerDeletion set to true.

In what environment did you see this bug?

  • Pinniped server version: v0.3.0

What else is there to know about this bug?

@enj originally noticed this (thanks @enj!).

Enable audit logging for all of our test environments

Is your feature request related to a problem? Please describe.

  • @enj and I were debugging a mysteriously deleted Secret, and we had a really hard time figuring out why it was getting deleted.
  • We enabled audit logging, and immediately discovered what entity was deleting the Secret and we were able to figure out our bug.
  • More generally: it would be helpful when debugging test environments to have an audit log to help us understand what is going on.

Describe the solution you'd like

  • Enable kube-apiserver audit logs in our test environments (i.e., our test kind clusters).
  • We can write this audit log to a file inside of the kind docker container.

Describe alternatives you've considered

  • None.

Are you considering submitting a PR for this feature?

  • How will this project improvement be tested?
  • Manually checking that audit logs are being populated after this fix goes in.
  • How does this change the current architecture?
  • It doesn't change our source code architecture, as it is a test change.
  • It will fill up our kind cluster disks more quickly, but these disks are ephemeral as they are inside of the kind container.
  • How will this change be backwards compatible?
  • Yes - this is a purely additive test change.
  • How will this feature be documented?
  • Perhaps we should have some sort of "how to debug PR test failures" section in our CONTRIBUTING.md?

Additional context
Here is what @enj and I did to enable audit logs in one of our kind clusters.

  1. SSH into the VM on which our test kind cluster was running.
  2. Exec into the kind container.
  3. cd /etc/kubernetes
  4. Create an audit-policy.yaml file, something like the below.
apiVersion: audit.k8s.io/v1beta1
kind: Policy
metadata:
  name: Default
# Don't generate audit events for all requests in RequestReceived stage.
omitStages:
- "RequestReceived"
rules:
# Don't log requests for events
- level: None
  resources:
  - group: ""
    resources: ["events"]
# Don't log authenticated requests to certain non-resource URL paths.
- level: None
  userGroups: ["system:authenticated", "system:unauthenticated"]
  nonResourceURLs:
  - "/api*" # Wildcard matching.
  - "/version"
  - "/healthz"
  - "/readyz"
# A catch-all rule to log all other requests at the Metadata level.
- level: Metadata
  # Long-running requests like watches that fall under this rule will not
  # generate an audit event in RequestReceived.
  omitStages:
  - "RequestReceived"
  5. Add the --audit-policy-file=/etc/kubernetes/audit-policy.yaml flag to the manifests/kube-apiserver.yaml command array (surely there is a way in kind to do this).
  6. Add the --audit-log-path=/var/log/kube-audit.log flag to the manifests/kube-apiserver.yaml command array (surely there is a way in kind to do this).
  7. Add volumeMounts and volumes for those files (surely there is a way in kind to do this).
   volumeMounts:
    - mountPath: /var/log
      name: log
    - mountPath: /etc/kubernetes/audit-policy.yaml
      name: audit
      readOnly: true
...

  volumes:
  - hostPath:
      path: /var/log
      type: DirectoryOrCreate
    name: log
  - hostPath:
      path: /etc/kubernetes/audit-policy.yaml
      type: File
    name: audit

CA bundle missing from pinniped cli generated kubeconfig when using minikube

What happened?

When following the installation demo on minikube, everything installs correctly and runs, but when using the generated kubeconfig, it throws an unknown authority error.

When I look at the generated kubeconfig, the PINNIPED_CA_BUNDLE env var is an empty string. Adding the correct CA pem to the kubeconfig resolves the unknown authority error.

What did you expect to happen?

PINNIPED_CA_BUNDLE should be populated automatically by pinniped get-kubeconfig

What is the simplest way to reproduce this behavior?

Start a minikube cluster and walk through the installation demo steps.

In what environment did you see this bug?

  • Pinniped server version: v0.2.0
  • Pinniped client version: v0.2.0
  • Kubernetes version (use kubectl version): v1.19.4
  • Kubernetes installer & version (e.g., kubeadm version): minikube v1.15.1

Reduce the number of SAR checks we make

We currently make a lot of remote calls back to the Kube API server for authz checks. At the bare minimum we should stop making these calls for:

authorizationOptions := genericapiserveroptions.NewDelegatingAuthorizationOptions().
    WithAlwaysAllowPaths("/healthz", "/healthz/").
    WithAlwaysAllowGroups(user.SystemPrivilegedGroup)

We should also create a wrapper authorizer.AuthorizerFunc that skips authz for TokenCredentialRequest since it is a pre-authentication API.
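
A rough sketch of that wrapper, hedged: tokenCredentialRequestAuthorizer is an illustrative name rather than existing Pinniped code, and the authorizer.Authorizer method signature can differ slightly between Kubernetes library versions:

package authz

import (
    "context"

    "k8s.io/apiserver/pkg/authorization/authorizer"
)

// tokenCredentialRequestAuthorizer skips delegated SubjectAccessReview checks
// for the TokenCredentialRequest API, since it is a pre-authentication API,
// and defers to the delegate authorizer for everything else.
type tokenCredentialRequestAuthorizer struct {
    delegate authorizer.Authorizer
}

func (a *tokenCredentialRequestAuthorizer) Authorize(ctx context.Context, attrs authorizer.Attributes) (authorizer.Decision, string, error) {
    if attrs.IsResourceRequest() &&
        attrs.GetAPIGroup() == "login.concierge.pinniped.dev" &&
        attrs.GetResource() == "tokencredentialrequests" {
        return authorizer.DecisionAllow, "always allowing the pre-authentication API", nil
    }
    return a.delegate.Authorize(ctx, attrs)
}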

Pinniped should support configuration to modify the API group for all outbound API requests.

Currently, Pinniped can only be installed once per cluster. In particular, the concierge has issues because of the singleton, cluster-scoped APIService object and CRDs.

We should allow the API groups to be configured so that instead of .pinniped.dev, an administrator can specify some custom suffix.

Acceptance

Given that I have a running cluster
When I install Pinniped with an api_suffix=pinniped.mydomain.com
And I pass some new --api-suffix flag to the CLI when generating a kubeconfig
Then I see that all CRDs and aggregated APIs on the cluster fall under that suffix
And I can use the concierge and supervisor as usual

Actually respect `klog` logging flags

What is the problem that you wish to solve?

  • I wish there was a way to enable more verbose logging in pinniped components so that I could better understand what is going on in the code.
  • There are a couple places in our code where we use verbose logging (e.g., here).
  • There are a couple places in our code where we would like to use more verbose logging (e.g., observing when a controller does specific stuff and how long that takes).

What is the best solution to the above problem?

  • Reuse klog logging flags.
  • Our logging solution, klog, automatically provides some flags to enable this verbose logging (see k8s.io/component-base.init); a minimal wiring sketch follows this list.
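
A minimal sketch of wiring klog's verbosity flags (such as -v) into a command's flag set; this is generic klog/pflag usage, not a description of how Pinniped's commands are currently built:

package main

import (
    goflag "flag"

    "github.com/spf13/pflag"
    "k8s.io/klog/v2"
)

func main() {
    // Register klog's flags (-v, -vmodule, etc.) on the standard flag set,
    // then merge them into the pflag set used for command-line parsing.
    klog.InitFlags(nil)
    pflag.CommandLine.AddGoFlagSet(goflag.CommandLine)
    pflag.Parse()

    klog.V(4).Info("this message only appears when run with -v=4 or higher")
}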

What are the alternative solutions that you have considered?

  • Enable verbose logging all the time - seems like we don't always want verbose logging since this often obscures the most interesting log messages to 90% of the population.

How will this project improvement be tested?

Good question...

In what environment do you hope to see this improvement?

  • Pinniped server version: latest
  • Pinniped client version: latest
  • Pinniped container image (if using a public container image): latest
  • Pinniped configuration (what IDP(s) are you using? what downstream credential minting mechanisms are you using?): all IDPs
  • Kubernetes version (use kubectl version): latest
  • Kubernetes installer & version (e.g., kubeadm version): latest
  • Cloud provider or hardware configuration: all (this solution should be platform agnostic)
  • OS (e.g: cat /etc/os-release): all (this solution should be platform agnostic)
  • Kernel (e.g. uname -a): all (this solution should be platform agnostic)
  • Others:

What else is there to know about this improvement?

None.

The `TestSupervisorLogin` integration test can be flaky.

What happened?

The TestSupervisorLogin test failed on a PR CI test run:

=== RUN   TestSupervisorLogin
    supervisor_login_test.go:41: created test OIDCProvider supervisor/test-oidc-provider-zrtnr
    supervisor_login_test.go:79: created test client credentials Secret test-client-creds-fthnc
    supervisor_login_test.go:82: created test UpstreamOIDCProvider test-upstream-v7p7j
    supervisor_login_test.go:92: 
        	Error Trace:	supervisor_login_test.go:92
        	Error:      	Not equal: 
        	            	expected: 302
        	            	actual  : 422
        	Test:       	TestSupervisorLogin
    supervisor_login_test.go:41: cleaning up test OIDCProvider supervisor/test-oidc-provider-zrtnr
--- FAIL: TestSupervisorLogin (1.96s)

What did you expect to happen?

The test should succeed!

What is the simplest way to reproduce this behavior?

I think we should be able to reliably reproduce this flake if we add an artificial delay to the upstream-observer controller sync method.

In what environment did you see this bug?

This occurred on the PR tests for commit ad1bc6c.

What else is there to know about this bug?

We can probably fix this by adding the appropriate require.Eventually(...) call to the assertion block that's failing.
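
As an illustration of that approach (a generic testify sketch, not the actual test code; requireEventuallyRedirect and doAuthorizeRequest are hypothetical helpers), the failing assertion could be retried until the upstream-observer controller has synced:

package integration

import (
    "net/http"
    "testing"
    "time"

    "github.com/stretchr/testify/require"
)

func requireEventuallyRedirect(t *testing.T, doAuthorizeRequest func() *http.Response) {
    t.Helper()
    require.Eventually(t, func() bool {
        // Keep retrying until the upstream-observer controller has synced and
        // the authorize endpoint responds with the expected redirect.
        resp := doAuthorizeRequest()
        defer resp.Body.Close()
        return resp.StatusCode == http.StatusFound // 302 instead of 422
    }, time.Minute, 250*time.Millisecond)
}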

Pinniped should support configuration to modify the recognized API groups for all inbound API requests.

Currently, Pinniped can only be installed once per cluster. In particular, the concierge has issues because of the singleton, cluster-scoped APIService object and CRDs.

We should allow the API groups to be configured so that instead of .pinniped.dev, an administrator can specify some custom suffix.

This is related to #319, but for inbound requests (aggregated API requests to our server). These should be configurable to recognize multiple API group suffixes. Supporting multiple suffixes allows a more graceful migration from existing installations to a new configuration that changes the suffix.

Acceptance

Given that I have a running cluster
When I install Pinniped with an api_suffix=pinniped.mydomain.com and recognize_additional_api_suffix=pinniped.dev
And I do not pass the new --api-suffix flag to the CLI when generating a kubeconfig
Then I see two APIService objects in the Kubernetes API, one for each suffix
And I see that my login still works under the old API group
And I can use the concierge as usual

Add demo for using Pinniped with an upstream OIDC provider

Is your feature request related to a problem? Please describe.
Our OIDC federation design is not trivial. It will help folks to build a mental model of how it works by following a demo (like we have with the Concierge/WebhookAuthenticator).

Describe the solution you'd like
A demo that shows how to set up the Supervisor and the Concierge (using a JWTAuthenticator) to validate Supervisor tokens.

Describe alternatives you've considered
N/A

Are you considering submitting a PR for this feature?

  • How will this project improvement be tested?
  • How does this change the current architecture?
  • How will this change be backwards compatible?
  • How will this feature be documented?

Additional context
Add any other context or screenshots about the feature request here.

Simple browser-based LDAP login page

Purpose: LDAP identity providers will not have their own web pages configured for us to redirect to, so we will need to present pinniped users with our own simple login page that prompts them for their LDAP login.

Given that I have an LDAPIdentityProvider configured with Pinniped
And given I have configured my kubeconfig to use pinniped
When I run kubectl commands
Then my browser opens a login page hosted by the supervisor that prompts me for my LDAP username and password

Styling is out of scope for this story.

Design: multiple IDP support

TODO

Is your feature request related to a problem? Please describe.
A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]

Describe the solution you'd like
A clear and concise description of what you want to happen.

Describe alternatives you've considered
A clear and concise description of any alternative solutions or features you've considered.

Are you considering submitting a PR for this feature?

  • How will this project improvement be tested?
  • How does this change the current architecture?
  • How will this change be backwards compatible?
  • How will this feature be documented?

Additional context
Add any other context or screenshots about the feature request here.

Refactor docs/website to organize and avoid duplication.

Is your feature request related to a problem? Please describe.

Our markdown documentation and website source code have a lot of duplicate content right now.

Describe the solution you'd like

We should find some clean way of making a single canonical source for our core documentation. This might be as simple as deleting the non-website copy of docs, or it could be something along the lines of an expanded build script for the website.

Integration tests: mark more tests as parallel

TODO

Is your feature request related to a problem? Please describe.
A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]

Describe the solution you'd like
A clear and concise description of what you want to happen.

Describe alternatives you've considered
A clear and concise description of any alternative solutions or features you've considered.

Are you considering submitting a PR for this feature?

  • How will this project improvement be tested?
  • How does this change the current architecture?
  • How will this change be backwards compatible?
  • How will this feature be documented?

Additional context
Add any other context or screenshots about the feature request here.

Add a concierge backend which uses service account tokens and an enhanced kubectl

Is your feature request related to a problem? Please describe.
I would like to use Pinniped on my managed Kubernetes cluster (e.g., cloud provider) where the existing Concierge TokenCredentialRequest API is inoperable. I would also like to do this without needing to provision additional load balancer or ingress resources on my cluster.

Describe the solution you'd like
We could make this work with a custom kubectl binary which adds one small enhancement to the ExecCredential API: the ability to set additional headers.

The flow would look something like:

  1. User installs custom build of kubectl or maybe some future upstream version which includes this enhancement.
  2. User downloads the Pinniped kubeconfig for their cluster where the concierge is running.
  3. User runs kubectl [...], which invokes the pinniped login [...] ExecCredential plugin.
  4. The pinniped login command logs in the user and then sends a TokenCredentialRequest to the concierge.
  5. The concierge authenticates the user's token and creates several cluster objects (see the client-go sketch after this list):
    1. A ServiceAccount named $user-service-account in the concierge namespace.
    2. A ClusterRole which grants the impersonate verb for the user's authenticated username and groups.
    3. A ClusterRoleBinding which binds the ClusterRole to the ServiceAccount.
  6. The concierge uses the TokenRequest API to request a bound service account token for the newly-created ServiceAccount, which it returns in the status of the TokenCredentialRequest along with the authenticated username and groups.
  7. The pinniped login command returns an ExecCredential status containing the token and extra Impersonate-User and Impersonate-Group headers.
  8. When the custom kubectl receives the ExecCredential status, it makes requests using the ServiceAccount token but impersonating the user.
  9. The Kubernetes audit log contains the ephemeral ServiceAccount and the impersonated user.
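
A rough client-go sketch of step 5, using illustrative names only (ensureImpersonationObjects is not existing Pinniped code; error handling, caching, and group impersonation are simplified or omitted):

package concierge

import (
    "context"

    corev1 "k8s.io/api/core/v1"
    rbacv1 "k8s.io/api/rbac/v1"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/client-go/kubernetes"
)

// ensureImpersonationObjects sketches step 5: a per-user ServiceAccount, a
// ClusterRole granting "impersonate" for that user, and a binding between them.
// A real version would also grant impersonation of the resource "groups" for
// the user's group memberships, and would cache/reuse these objects.
func ensureImpersonationObjects(ctx context.Context, client kubernetes.Interface, namespace, username string) error {
    name := username + "-service-account"

    sa := &corev1.ServiceAccount{ObjectMeta: metav1.ObjectMeta{Name: name, Namespace: namespace}}
    if _, err := client.CoreV1().ServiceAccounts(namespace).Create(ctx, sa, metav1.CreateOptions{}); err != nil {
        return err
    }

    role := &rbacv1.ClusterRole{
        ObjectMeta: metav1.ObjectMeta{Name: name},
        Rules: []rbacv1.PolicyRule{{
            Verbs:         []string{"impersonate"},
            APIGroups:     []string{""},
            Resources:     []string{"users"},
            ResourceNames: []string{username}, // only this user's authenticated username
        }},
    }
    if _, err := client.RbacV1().ClusterRoles().Create(ctx, role, metav1.CreateOptions{}); err != nil {
        return err
    }

    binding := &rbacv1.ClusterRoleBinding{
        ObjectMeta: metav1.ObjectMeta{Name: name},
        Subjects: []rbacv1.Subject{{
            Kind:      rbacv1.ServiceAccountKind,
            Name:      name,
            Namespace: namespace,
        }},
        RoleRef: rbacv1.RoleRef{
            APIGroup: rbacv1.GroupName,
            Kind:     "ClusterRole",
            Name:     name,
        },
    }
    _, err := client.RbacV1().ClusterRoleBindings().Create(ctx, binding, metav1.CreateOptions{})
    return err
}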

Some other notes:

  • The objects in step 5 can also be cached and re-used across multiple logins.
  • The kubectl enhancement is something we might be able to ship upstream. It would add a new requestHeaders field in the ExecCredential status.
  • Our ExecCredential plugin could try to detect when it's running with an unsupported version of kubectl. In these cases we can prompt the user to install our custom version or ask them to add --as flags (or the equivalent YAML) to make things mostly work even with current 1.20 upstream kubectl.

Describe alternatives you've considered

Impersonation proxy via LoadBalancer service

This design is an alternative to the impersonation proxy design from #67/#339, which does not require changes to kubectl but does require additional ingress/load balancer resources.

Impersonation proxy via federated API

We attempted to work around this by serving the impersonation proxy via a federated API by creating an APIService. However, this does not work because when our ExecCredential plugin returns a token, it can only be sent in the Authorization header which is rejected by the federation API server and not passed through to the concierge.

Current certificate-based TokenCredentialRequest API

All of these designs are an alternative backend for the current TokenCredentialRequest API which mints short-lived certificates for users. This is only possible on clusters that allow TLS authentication and support some method of signing a short-lived certificate.

Are you considering submitting a PR for this feature?
Yes, but we need to make a team decision about which direction we want to go.

  • How will this project improvement be tested?
    This would need extensive integration tests as well as acceptance environments in all major cloud providers.

  • How does this change the current architecture?
    The current TokenCredentialRequest behavior would still be supported, at least initially.

  • How will this change be backwards compatible?
    Yes.

  • How will this feature be documented?
    We should consider writing some new documentation describing how to use Pinniped in managed cluster environments.

Additional context

N/A

In local development flows, Tilt is unable to perform live-reload.

What happened?

When I'm using Tilt to do local development (./hack/kind-up.sh && ./hack/tilt-up.sh), when I edit a .go source file, I see that Tilt is falling back to a full rebuild:

Will copy 1 file(s) to container: e708205333
- '/Users/moyerm/vmware-tanzu/pinniped/hack/lib/tilt/build/pinniped-supervisor' --> '/usr/local/bin/pinniped-supervisor'
tar: usr/local/bin/pinniped-supervisor: Cannot open: File exists
tar: Exiting with failure status due to previous errors

Live Update failed with unexpected error:
	command terminated with exit code 2
Falling back to a full image build + deploy

The full rebuild still works correctly, but is somewhat slow.

What did you expect to happen?

I expected Tilt to perform a live reload without needing to build fresh images or rollout new pods, in most cases. This should be much faster.

What is the simplest way to reproduce this behavior?

Run ./hack/kind-up.sh && ./hack/tilt-up.sh, then edit a Go source file under one of the Pinniped components. You can see the above error message in the Tilt output (web or console streaming).

In what environment did you see this bug?

I see this on the current main branch head (385d2db).

What else is there to know about this bug?

We had one attempted fix in 2e50e8f, but I think we missed another bit of config which still causes our pods to run as non-root users:

spec:
  securityContext:
    runAsUser: 1001
    runAsGroup: 1001

I believe we need to add ytt conditionals for excluding those lines in our Tilt deployment. There are similar lines in the concierge and local-user-authenticator deployments as well.
