Comments (21)

Toputnal commented on July 19, 2024

I'll dig into CloudWatch and its integration with OpsGenie.

from avalon.

Toputnal commented on July 19, 2024

It looks like there is a way to set up SNS topics in CloudWatch so that CloudWatch sends alerts to OpsGenie through OpsGenie's API. I don't have anything to monitor in my test environment yet, but I will set up something simple and test the CloudWatch integration. If all goes well, we'll need to configure our production Avalon instance's CloudWatch with SNS alerts to OpsGenie's web API. @d-venckus @davidschober @mbklein If we do this, we can set up OpsGenie to alert whoever, whenever, via OpsGenie's integrated scheduler.
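A rough sketch of that wiring with boto3, for anyone following along. The topic name and the endpoint URL are placeholders, not values from our account; the real URL (including its API key) would come from the CloudWatch integration page in OpsGenie. The client is passed in so the function can be exercised without AWS credentials.

```python
def wire_topic_to_opsgenie(sns_client, topic_name, opsgenie_endpoint):
    """Create an SNS topic and subscribe an HTTPS endpoint to it.

    sns_client is expected to behave like boto3.client("sns").
    """
    # create_topic is idempotent: an existing topic with this name is reused.
    topic_arn = sns_client.create_topic(Name=topic_name)["TopicArn"]
    # Every message published to the topic is POSTed to the endpoint.
    sns_client.subscribe(
        TopicArn=topic_arn,
        Protocol="https",
        Endpoint=opsgenie_endpoint,
    )
    return topic_arn


# Hypothetical usage (requires boto3 and AWS credentials):
#   import boto3
#   wire_topic_to_opsgenie(
#       boto3.client("sns"),
#       "avalon-opsgenie-alerts",              # placeholder topic name
#       "https://<opsgenie-integration-url>",  # placeholder endpoint
#   )
```

Note that an HTTPS subscription has to be confirmed by the endpoint before SNS starts delivering messages to it.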

Toputnal commented on July 19, 2024

Oh, and I should mention that if an alert clears itself in CloudWatch, that will automatically update and close the alert in OpsGenie, assuming we get the integration set up as I mentioned above.
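For the auto-close to have anything to act on, the alarm presumably needs to publish its OK transition to the same topic as its ALARM transition. A minimal sketch of such an alarm definition follows; the instance ID, topic ARN, alarm name, period, and threshold are all illustrative assumptions, and the EC2 `StatusCheckFailed` metric is just one example of something to watch.

```python
def status_check_alarm(instance_id, topic_arn):
    """Build keyword arguments for CloudWatch put_metric_alarm."""
    return {
        "AlarmName": f"{instance_id}-status-check-failed",  # hypothetical naming scheme
        "Namespace": "AWS/EC2",
        "MetricName": "StatusCheckFailed",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Maximum",
        "Period": 60,                # seconds per datapoint (assumed)
        "EvaluationPeriods": 2,      # consecutive periods to evaluate (assumed)
        "Threshold": 1.0,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "AlarmActions": [topic_arn],  # notify when the alarm fires
        "OKActions": [topic_arn],     # notify again when it clears
    }


# Hypothetical usage (requires boto3 and AWS credentials):
#   import boto3
#   boto3.client("cloudwatch").put_metric_alarm(
#       **status_check_alarm("i-0123456789abcdef0", "arn:aws:sns:...")
#   )
```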

d-venckus commented on July 19, 2024

Toputnal commented on July 19, 2024

I have created a Topic in our AWS instance so that any messages that go to @mbklein's email also alert OpsGenie, theoretically. We could use some kind of artificial alert to ensure things are working, but I'll leave what that test might be to @mbklein. @davidschober
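One candidate for the artificial alert is CloudWatch's `SetAlarmState` call, which temporarily forces an existing alarm into a chosen state and triggers its actions; the alarm returns to its real state at its next evaluation. This is only a sketch: the alarm name is a placeholder, and the client is passed in so the function can be tried against a stand-in.

```python
def fire_test_alert(cloudwatch_client, alarm_name):
    """Force an existing alarm into ALARM so its actions (SNS, etc.) run once.

    cloudwatch_client is expected to behave like boto3.client("cloudwatch").
    """
    cloudwatch_client.set_alarm_state(
        AlarmName=alarm_name,
        StateValue="ALARM",
        StateReason="Manual test of the SNS to OpsGenie path",
    )


# Hypothetical usage (requires boto3 and AWS credentials):
#   import boto3
#   fire_test_alert(boto3.client("cloudwatch"), "<existing-alarm-name>")
```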

davidschober commented on July 19, 2024

@Toputnal Not sure what you mean by the last comment. I think we need to ensure the main systems are up and responding:

  • Fedora
  • Avalon Web
  • Avalon Workers
  • SOLR

Is that what you're looking for?
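To make "up and responding" concrete, here is a small standard-library sketch of an HTTP probe over a set of endpoints. All hostnames and paths in the usage comment are hypothetical, and the workers are left out because they may not expose an HTTP endpoint at all.

```python
import urllib.error
import urllib.request


def http_ok(url, timeout=5.0):
    """Return True if the URL answers with a 2xx or 3xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (urllib.error.URLError, OSError, ValueError):
        # Covers HTTP errors, refused connections, timeouts, and malformed URLs.
        return False


def probe_all(targets, probe=http_ok):
    """Map each service name to the result of probing its URL."""
    return {name: probe(url) for name, url in targets.items()}


# Hypothetical usage; none of these URLs come from the thread:
#   probe_all({
#       "Fedora": "https://fedora.example.edu/rest",
#       "Avalon Web": "https://avalon.example.edu/",
#       "SOLR": "https://solr.example.edu/solr/avalon/admin/ping",
#   })
```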

Toputnal commented on July 19, 2024

Toputnal commented on July 19, 2024

@davidschober ^

d-venckus commented on July 19, 2024

Will OpsGenie also alert MBK a second time, Jim? Is that the idea? To test OpsGenie connectivity fully?

Toputnal commented on July 19, 2024

Eventually, yes, @d-venckus, but for now I just wanna see some Blinkin' lights! :-)

davidschober commented on July 19, 2024

Got it. Can someone document what we're monitoring?

Toputnal commented on July 19, 2024

I'm not certain what the existing monitoring that is set up in AWS is actually monitoring (that is, which alarms are currently set up to email @mbklein). I'll dig in and see what I can find (without changing anything, at first, obviously). If I accidentally break anything, our SOP is to blame @egspoony ;-) (Back me up on this, @d-venckus!)

Toputnal commented on July 19, 2024

I have set up alerts for all the services listed above, minus RDS, which is provided to us as a service rather than being something we run on top of an EC2 instance. Does what I've added look good to you, @d-venckus @davidschober @mbklein?

Toputnal commented on July 19, 2024

I have set up a Dashboard in CloudWatch called "JRBDashboard" which shows the things that are now set up to alert OpsGenie, so that's probably a good place to look at our current AWS/OpsGenie integration. @d-venckus @mbklein @davidschober

davidschober commented on July 19, 2024

@mbklein I'll let you take a peek and OK it. @Toputnal and @d-venckus, can you write up some brief docs on what we are monitoring, how to set it up, etc.? You can put it at https://github.com/nulib/repodev_planning_and_docs/wiki/AVR-Technical-Documentation#monitoring-via-cloudwatch or you can send me a doc and I can c&p.

Toputnal commented on July 19, 2024

I will update the doc you listed above @davidschober as soon as we stabilize the thresholds of the things we are monitoring. Currently, we have quite a bit of "noise" which we are still ironing out.
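One common way to cut this kind of noise is to require several breaching datapoints within a window before alarming (CloudWatch exposes this as `DatapointsToAlarm` out of `EvaluationPeriods`). The pure-Python sketch below only illustrates the idea on made-up numbers; it is a simplification and ignores details such as how CloudWatch treats missing data.

```python
def m_of_n_alarm(values, threshold, datapoints_to_alarm, evaluation_periods):
    """Return one alarm state per sample.

    A sample is in alarm when at least `datapoints_to_alarm` of the most
    recent `evaluation_periods` samples (including itself) exceed `threshold`.
    """
    states = []
    for i in range(len(values)):
        window = values[max(0, i - evaluation_periods + 1): i + 1]
        hits = sum(1 for v in window if v > threshold)
        states.append(hits >= datapoints_to_alarm)
    return states


# Made-up CPU percentages: one isolated spike, then a sustained run.
samples = [10, 95, 12, 96, 97, 98, 20]
noisy = m_of_n_alarm(samples, 90, datapoints_to_alarm=1, evaluation_periods=1)
calmer = m_of_n_alarm(samples, 90, datapoints_to_alarm=3, evaluation_periods=5)
```

With 1 of 1, the isolated spike at the second sample alarms on its own; with 3 of 5, only the sustained run does, at the cost of reacting a few samples later and clearing more slowly.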

davidschober commented on July 19, 2024

Thanks @Toputnal

davidschober commented on July 19, 2024

@Toputnal I forget: are we closing this and creating a "create final CloudWatch monitoring" issue with the findings? That seems to make sense to me.

Toputnal commented on July 19, 2024

davidschober commented on July 19, 2024

Toputnal commented on July 19, 2024

All the alerts I created are still in place after the burn-down and rebuild. Yay! Moving to Review.
