Comments (10)
That seems reasonable to me. I can look at adding in that functionality, but it might not be for a few days. If you need it sooner I definitely welcome contributions!
from k8s-scheduled-volume-snapshotter.
Sounds good! I've merged the PR and created a new release with the changes: v0.15.0
from k8s-scheduled-volume-snapshotter.
Could you provide the YAML for one or two of the snapshots from your screenshot above that are being created every 15 minutes? My best guess is that something is wrong with this logic for deciding whether to create a new snapshot, but the YAML of a snapshot in a bad state would help confirm whether that is the case.
from k8s-scheduled-volume-snapshotter.
Actually, I believe it is trying to create a new snapshot because of this logic: it sees that no snapshot has been successfully created within the last 24 hours, so it attempts to create a new one. It would be interesting to see the state of these snapshots (i.e. the YAML) to check whether they are in an error state or something else.
from k8s-scheduled-volume-snapshotter.
@ryaneorth sorry for the delay, I'm back at work today. Thanks for your answers. Yes, here it is:
this is a working one:
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  creationTimestamp: "2024-03-23T03:40:07Z"
  deletionGracePeriodSeconds: 0
  deletionTimestamp: "2024-03-23T07:40:08Z"
  finalizers:
  - snapshot.storage.kubernetes.io/volumesnapshot-bound-protection
  generation: 2
  labels:
    envName: lab
    frequency: hourly
    scheduled-volume-snapshot: common-mongodb-snapshots-hourly
  name: common-mongodb-snapshots-hourly-1711165207
  namespace: lab
  resourceVersion: "379413284"
  uid: 5ccf69b1-bb3c-4786-b401-91a95cd42ec6
spec:
  source:
    persistentVolumeClaimName: datadir-common-mongodb-hidden-0
  volumeSnapshotClassName: lab-common-mongodb-snapshots
status:
  boundVolumeSnapshotContentName: snapcontent-5ccf69b1-bb3c-4786-b401-91a95cd42ec6
  creationTime: "2024-03-23T04:04:00Z"
  readyToUse: true
  restoreSize: "61743560"
this is a broken one:
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  creationTimestamp: "2024-04-02T07:41:21Z"
  finalizers:
  - snapshot.storage.kubernetes.io/volumesnapshot-as-source-protection
  - snapshot.storage.kubernetes.io/volumesnapshot-bound-protection
  generation: 1
  labels:
    envName: lab
    frequency: hourly
    scheduled-volume-snapshot: common-mongodb-snapshots-hourly
  name: common-mongodb-snapshots-hourly-1712043681
  namespace: lab
  resourceVersion: "398226498"
  uid: f869c9fb-a398-43c6-944e-52a696fee29f
spec:
  source:
    persistentVolumeClaimName: datadir-common-mongodb-hidden-0
  volumeSnapshotClassName: lab-common-mongodb-snapshots
status:
  boundVolumeSnapshotContentName: snapcontent-f869c9fb-a398-43c6-944e-52a696fee29f
  error:
    message: 'Failed to check and update snapshot content: failed to take snapshot
      of the volume 172.16.0.102#mnt/kube_data#mongodb/lab/datadir-common-mongodb-hidden-0_pvc-0132f065-a5d2-490b-903c-2be6345dbd1e#pvc-0132f065-a5d2-490b-903c-2be6345dbd1e#:
      "rpc error: code = Internal desc = failed to mount src nfs server: rpc error:
      code = Aborted desc = An operation with the given Volume ID 172.16.0.102#mnt/kube_data#mongodb/lab/datadir-common-mongodb-hidden-0_pvc-0132f065-a5d2-490b-903c-2be6345dbd1e#pvc-0132f065-a5d2-490b-903c-2be6345dbd1e#
      already exists"'
    time: "2024-04-02T07:42:13Z"
  readyToUse: false
Unfortunately I lost the older ones: rotation ran over the weekend and some were deleted.
from k8s-scheduled-volume-snapshotter.
@ryaneorth I even tried cleaning up every broken snapshot, but as soon as it starts creating new ones again, they fail too. The only thing I didn't clean up was the NFS storage where the snapshots live; I was waiting for your feedback in case you needed some logs from it to reproduce this behaviour.
from k8s-scheduled-volume-snapshotter.
Thanks @fragolinux for those details. That confirms why it keeps trying to create the snapshot: the code I linked to in my last comment checks whether a snapshot was successfully created within the time threshold specified by the scheduled volume snapshot. Because your snapshots are failing, that check never passes, so it will keep trying to create them.
How would you expect the scheduled volume snapshotter to behave in this case when there are failed snapshots? Would you expect it to not continue to try to create them?
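For readers following along, the retry behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names (`parse_time`, `is_successful`, `should_create_snapshot`) and the plain-dict snapshot representation are assumptions made for the example. Only the `creationTimestamp` and `readyToUse` fields come from the YAML posted in this thread.

```python
from datetime import datetime, timedelta, timezone


def parse_time(value: str) -> datetime:
    """Parse a timestamp in the form shown in the YAML, e.g. 2024-04-02T07:41:21Z."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def is_successful(snapshot: dict) -> bool:
    """Treat a snapshot as successful only if status.readyToUse is true."""
    return snapshot.get("status", {}).get("readyToUse") is True


def should_create_snapshot(snapshots: list, threshold: timedelta, now: datetime) -> bool:
    """Return True unless a successful snapshot exists within the threshold.

    Failed snapshots never satisfy the check, so while every attempt fails,
    each run decides to create another snapshot.
    """
    cutoff = now - threshold
    for snap in snapshots:
        created = parse_time(snap["metadata"]["creationTimestamp"])
        if is_successful(snap) and created > cutoff:
            return False
    return True


# Hypothetical usage with the failed snapshot from this thread:
now = datetime(2024, 4, 2, 8, 0, 0, tzinfo=timezone.utc)
broken = {
    "metadata": {"creationTimestamp": "2024-04-02T07:41:21Z"},
    "status": {"readyToUse": False},
}
print(should_create_snapshot([broken], timedelta(hours=1), now))
```

Under these assumptions, a recent snapshot with `readyToUse: false` does not count, which matches the repeated creation attempts reported above.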
from k8s-scheduled-volume-snapshotter.
Hi, thanks
What about a flag to choose what to do: keep trying, or maybe delete the failed, offending ones?
No idea which is safer, just sharing my first thoughts.
from k8s-scheduled-volume-snapshotter.
@fragolinux - does it look like the changes I made in PR #27 will meet your needs?
from k8s-scheduled-volume-snapshotter.
Hi @ryaneorth, yes, thanks! I think that allowing deletion of the failed ones (with an additional flag) is beyond the scope of your project. I need to implement some sort of monitoring to be alerted if something fails anyway. My problem was that the NFS storage I was using was full (filled by the failed snapshots themselves), so it could be useful to get rid of them, as they're of no use and just waste storage space.
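As a starting point for that kind of monitoring or cleanup, failed snapshots could be picked out by their status. The sketch below is an assumption-laden illustration, not part of the project: `failed_snapshots` is a hypothetical helper, and the snapshots are plain dicts shaped like the YAML earlier in this thread (a `status.error` block together with `readyToUse: false`). How the objects are fetched from the cluster is left out.

```python
def failed_snapshots(snapshots: list) -> list:
    """Return (name, error message) pairs for snapshots that report an error.

    A snapshot counts as failed when status.error is present and
    status.readyToUse is not true, as in the broken example in this thread.
    """
    failed = []
    for snap in snapshots:
        status = snap.get("status", {})
        error = status.get("error")
        if error and not status.get("readyToUse", False):
            failed.append((snap["metadata"]["name"], error.get("message", "")))
    return failed


# Hypothetical usage with trimmed-down versions of the two snapshots above:
snapshots = [
    {
        "metadata": {"name": "common-mongodb-snapshots-hourly-1711165207"},
        "status": {"readyToUse": True},
    },
    {
        "metadata": {"name": "common-mongodb-snapshots-hourly-1712043681"},
        "status": {
            "readyToUse": False,
            "error": {"message": "Failed to check and update snapshot content"},
        },
    },
]
for name, message in failed_snapshots(snapshots):
    print(f"{name}: {message}")
```

The resulting list could feed an alert or a manual cleanup step; whether deleting such snapshots automatically is safe depends on the storage driver and is not something this sketch decides.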
from k8s-scheduled-volume-snapshotter.