Comments (4)
Are you deploying Kubeflow with kubeflow-manifests? Have you checked the disk usage of the volume? The default volume is 10Gi. Without further information I can only guess: maybe the disk is full after 10-15 days?
from oidc-authservice.
Yes, I deployed using the Kubeflow manifests, specifically https://github.com/kubeflow/manifests/tree/v1.6.0.
Yes, I have already checked the disk usage. The PVC used by authservice is 10G, and the only file in it is "data-db", which was only 5 MB at the time of the issue. Overall volume usage also looks good.
We are experiencing the same issue in our environment as well. The "Failed to Save State in Store: Input/Output Error" error keeps showing up for the authservice pod, even though all other components seem to be running fine. Upon launching the Kubeflow environment and adding five users, we have encountered a recurring 403 error on the Kubeflow Dex login page, even when no users were logged in.
Environment:
- Kubernetes version: v1.19.2
- Kubeflow version: v1.4.1 from https://github.com/kubeflow/manifests/tree/v1.4.1
Pod Information:
- Pod Name: authservice-0
- Namespace: istio-system
- Container Image: gcr.io/arrikto/kubeflow/oidc-authservice:28c59ef
Issue Details:
- The authservice-0 pod within the istio-system namespace shows no anomalies in resource usage. CPU and memory consumption appear to be normal.
# kubectl top pod authservice-0 -n istio-system
NAME CPU(cores) MEMORY(bytes)
authservice-0 1m 3Mi
- The associated persistent volume claim (PVC) has the expected data in the NFS storage, and the data.db file appears to be intact.
# ls -lh /export/kubernetes/istio-system-authservice-pvc-pvc-3e8dd897-4478-40c5-a007-e1d1aa55f734
total 24K
-rw-r--r-- 1 systemd-network tss 32K Jul 24 05:25 data.db
- The observed issue seems to be different from the "boltdb memory/PVC bloat bug" mentioned in issues #2 and #88, as the resource consumption appears normal.
Error Logs:
# kubectl logs authservice-0 -n istio-system
time="2023-07-24T05:25:20Z" level=info msg="Starting readiness probe at 8081"
time="2023-07-24T05:25:20Z" level=info msg="No USERID_TOKEN_HEADER specified, using 'kubeflow-userid-token' as default."
time="2023-07-24T05:25:20Z" level=info msg="No SERVER_HOSTNAME specified, using '' as default."
time="2023-07-24T05:25:20Z" level=info msg="No SERVER_PORT specified, using '8080' as default."
time="2023-07-24T05:25:20Z" level=info msg="No SESSION_MAX_AGE specified, using '86400' as default."
time="2023-07-24T05:25:20Z" level=info msg="Starting web server at :8080"
time="2023-07-24T05:47:51Z" level=error msg="Failed to save state in store: error trying to save session: input/output error" ip=192.168.200.145 request=/
time="2023-07-24T05:48:29Z" level=error msg="Failed to save state in store: error trying to save session: input/output error" ip=192.168.200.145 request=/
time="2023-07-24T05:50:10Z" level=error msg="Failed to save state in store: error trying to save session: input/output error" ip=192.168.200.145 request=/
time="2023-07-24T05:50:25Z" level=error msg="Failed to save state in store: error trying to save session: input/output error" ip=192.168.200.145 request=/
time="2023-07-24T05:55:03Z" level=error msg="Failed to save state in store: error trying to save session: input/output error" ip=192.168.200.145 request=/
time="2023-07-24T05:55:04Z" level=error msg="Failed to save state in store: error trying to save session: input/output error" ip=192.168.200.145 request=/
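To see how often the error recurs and over what time window, the log lines above can be tallied with a short script. This is a minimal sketch that assumes the logs keep the `key=value` layout shown here; the function names `parse_line` and `summarize_errors` are illustrative and not part of oidc-authservice.

```python
import re
from collections import Counter

# Matches key=value pairs; a value is either a double-quoted string or a bare token.
# Escaped characters inside quoted values are kept as-is (no unescaping).
_PAIR = re.compile(r'(\w+)=("(?:[^"\\]|\\.)*"|\S+)')


def parse_line(line):
    """Return a dict of the key=value fields found in one log line."""
    fields = {}
    for key, value in _PAIR.findall(line):
        if len(value) >= 2 and value.startswith('"') and value.endswith('"'):
            value = value[1:-1]
        fields[key] = value
    return fields


def summarize_errors(lines):
    """Count error-level messages and track first/last timestamp per message."""
    counts = Counter()
    span = {}
    for line in lines:
        fields = parse_line(line)
        if fields.get("level") != "error":
            continue
        msg = fields.get("msg", "")
        stamp = fields.get("time")
        counts[msg] += 1
        first, _ = span.get(msg, (stamp, None))
        span[msg] = (first, stamp)
    return counts, span
```

One way to use it is to save the output of `kubectl logs authservice-0 -n istio-system` to a file and pass the open file object to `summarize_errors`, then compare the first timestamp of each error against the pod start time.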
Additional Content:
After setting the log level of oidc-authservice to DEBUG, I rechecked the logs when the error occurred again. I discovered that the error originates in boltstore/reaper, which is responsible for releasing unnecessary resources (removing expired sessions, per the first log line below), rather than in the use of boltdb for session management itself.
2023/08/01 03:22:10 boltstore: remove expired sessions error: input/output error
time="2023-08-01T03:22:57Z" level=warning msg="Request doesn't have a valid session." ip=192.168.200.15 request=/logout
time="2023-08-01T03:22:57Z" level=error msg="Failed to save state in store: error trying to save session: input/output error" ip=192.168.200.15 request=/
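Since the store file lives on NFS and the failure is an input/output error, it may be worth checking whether basic file operations succeed on that volume at all. The sketch below rests on an assumption not confirmed in this thread: that a BoltDB-style store locks its database file and syncs it to disk on commit, and that one of those operations is what fails. The function name `probe_volume` is hypothetical, and the probe does not exercise memory-mapped access.

```python
import errno
import fcntl
import os
import tempfile


def probe_volume(directory):
    """Try write, fsync, and an exclusive flock on a scratch file in `directory`.

    Returns a list of (step, error) tuples, where error is None on success or
    the symbolic errno name (for example "EIO") on failure. An OSError raised
    while creating or removing the scratch file propagates to the caller.
    """
    results = []
    fd, path = tempfile.mkstemp(prefix="io-probe-", dir=directory)
    try:
        steps = (
            ("write", lambda: os.write(fd, b"x" * 4096)),
            ("fsync", lambda: os.fsync(fd)),
            ("flock", lambda: fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)),
        )
        for step, action in steps:
            try:
                action()
                results.append((step, None))
            except OSError as exc:
                name = errno.errorcode.get(exc.errno, str(exc.errno))
                results.append((step, name))
    finally:
        os.close(fd)      # closing the descriptor also releases the lock
        os.unlink(path)   # remove the scratch file
    return results
```

Run it against the directory where the PVC is mounted, from a pod or host that uses the same NFS mount options as authservice. A step that reports "EIO" would point at the storage layer rather than at oidc-authservice.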