
Comments (4)

CrypticCabub avatar CrypticCabub commented on August 10, 2024

Hi @neilromatowski0373

Instances failing to start because of missing KMS permissions is a common issue, and it is described in our implementation guide (IG) here:
https://docs.aws.amazon.com/solutions/latest/instance-scheduler-on-aws/troubleshooting.html#encrypted-ec2-instances-not-starting

As for the debug log you are referring to, this is actually a perfectly normal state of the solution as this log refers to 3 distinct values used in a scheduling decision: The current state of the schedule, the state of the schedule during the last execution, and the current state of the instance.

Seeing that the current and previous states of the schedule are both "running" and the instance's actual state is "stopped" indicates to the scheduler that the instance was stopped for some reason outside of normal scheduling. This can be either due to a start failure as is the case in your situation, or more commonly, due to the instance being stopped by manual user intervention outside of the normal operation of the schedule. Normal behavior in this scenario is to take no action (we attempt to avoid overwriting manual user action), but you can override this behavior by setting the "enforced" flag on the schedule to true. This will cause the scheduler to always take action when the current schedule state differs from the actual instance state regardless of whether we detect the instance as having been started/stopped by manual action.
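To make the decision described above concrete, here is a minimal sketch in Python. It is not the solution's actual code: the names `State`, `Action`, and `decide` are hypothetical, and the rule that a change in schedule state triggers an action is an assumption inferred from the explanation above.

```python
from enum import Enum


class State(Enum):
    RUNNING = "running"
    STOPPED = "stopped"


class Action(Enum):
    START = "start"
    STOP = "stop"
    NONE = "none"


def decide(current_schedule, previous_schedule, instance_state, enforced=False):
    """Illustrative scheduling decision based on the three logged values.

    Hypothetical helper, not the scheduler's real implementation.
    """
    # The instance already matches the schedule: nothing to do.
    if instance_state == current_schedule:
        return Action.NONE

    # Assumption: the schedule state changed since the last execution,
    # so this is a normal scheduled transition.
    schedule_changed = current_schedule != previous_schedule

    # With the enforced flag set, act whenever the schedule state and
    # the instance state differ, even if a manual action is suspected.
    if schedule_changed or enforced:
        return Action.START if current_schedule is State.RUNNING else Action.STOP

    # Schedule unchanged but the instance differs: treat it as a manual
    # action (or an earlier start failure) and leave the instance alone.
    return Action.NONE
```

With schedule states of "running" and "running" and an instance state of "stopped", this sketch returns `Action.NONE` by default and `Action.START` when `enforced=True`, which matches the behavior described above.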

We are currently planning to improve the clarity of these debug messages in an upcoming release.

from instance-scheduler-on-aws.

neilromatowski0373 avatar neilromatowski0373 commented on August 10, 2024

Hi @CrypticCabub

Thank you for your prompt and considered response; I appreciate it. I think we need to build a check into our initial set-up to ensure that the role has the required KMS permissions, so the problem does not occur in the first place :). As for the logic, it makes sense not to 'intervene'. I am not keen on enforcing the schedule, as there could be valid reasons for an instance having been stopped manually.
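A set-up check along these lines could be sketched as below. This is only a rough illustration, not a tested implementation: `policy_allows`, the example policy, and the example key ARN are hypothetical, and `kms:CreateGrant` is an assumed action name that should be confirmed against the troubleshooting guide linked above. The sketch inspects a policy document that has already been loaded as a dict and ignores Deny statements, conditions, and KMS key policies.

```python
from fnmatch import fnmatchcase


def _as_list(value):
    """IAM accepts either a single value or a list for these fields."""
    if value is None:
        return []
    return [value] if isinstance(value, (str, dict)) else list(value)


def policy_allows(policy_document, action, resource):
    """Return True if any Allow statement matches the action and resource.

    Simplification: ignores Deny statements, Condition blocks, NotAction,
    NotResource, and KMS key policies, all of which affect real evaluation.
    """
    for statement in _as_list(policy_document.get("Statement")):
        if statement.get("Effect") != "Allow":
            continue
        # Action names are matched case-insensitively, with "*" wildcards.
        action_ok = any(
            fnmatchcase(action.lower(), pattern.lower())
            for pattern in _as_list(statement.get("Action"))
        )
        # Resource ARNs are matched case-sensitively, with "*" wildcards.
        resource_ok = any(
            fnmatchcase(resource, pattern)
            for pattern in _as_list(statement.get("Resource"))
        )
        if action_ok and resource_ok:
            return True
    return False


# Hypothetical key ARN and policy, for illustration only.
EXAMPLE_KEY_ARN = "arn:aws:kms:us-east-1:111122223333:key/example-key-id"
EXAMPLE_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "kms:CreateGrant",  # assumed action; verify in the guide
            "Resource": EXAMPLE_KEY_ARN,
        }
    ],
}

if __name__ == "__main__":
    if policy_allows(EXAMPLE_POLICY, "kms:CreateGrant", EXAMPLE_KEY_ARN):
        print("role policy appears to cover the key")
    else:
        print("role policy is missing the KMS permission")
```

A check like this would only flag obviously missing statements; a real validation would also need to account for the key policy on the KMS key itself.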

I was hoping that we could leverage a centralized log view to report on scheduler state, perhaps as an extension to the App Insights dashboard.

image

We have a large number of linked accounts. Admittedly, we are still at a very early stage of our instance scheduling implementation, but we have been on the back foot, reacting to users telling us that their instances (the ones in the KMS scenario) have gone down and are not coming back up on schedule.

Thank you for taking the time to listen and get back to me.


CrypticCabub avatar CrypticCabub commented on August 10, 2024

No problem! It looks like you are already using the per-schedule metrics, which should be able to provide some of the info you are looking for. As I indirectly mentioned in my previous response, we are currently evaluating ways to improve observability for the solution and would love any additional feedback on how our customers want to be able to observe and monitor the solution, as well as what they are currently doing for this purpose.


CrypticCabub avatar CrypticCabub commented on August 10, 2024

Closing due to inactivity. Please reopen if you have any further questions or feedback.

