Comments (21)
I'll dig into CloudWatch and its integration with OpsGenie.
from avalon.
It looks like there is a way to set up SNS topics in CloudWatch so that CloudWatch sends alerts to OpsGenie via OpsGenie's API. I don't have anything to monitor in my test environment yet, but I will set up something simple and test the CloudWatch integration. If all goes well, we'll need to configure CloudWatch for our production Avalon instance with SNS alerts to OpsGenie's web API. @d-venckus @davidschober @mbklein If we do this, we can set up OpsGenie to alert whoever, whenever, via OpsGenie's integrated scheduler.
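For anyone following along, here is a minimal sketch of what that SNS-to-OpsGenie subscription could look like from Python. The endpoint URL, the `apiKey` query parameter, and the function names are assumptions for illustration; the real URL and key should be taken from the OpsGenie integration settings. `subscribe` uses boto3's `sns.subscribe` and needs AWS credentials, so only the parameter builder runs without them.

```python
from urllib.parse import urlencode

# Assumed endpoint for OpsGenie's CloudWatch integration; confirm it in the
# OpsGenie integration settings before use.
OPSGENIE_CLOUDWATCH_URL = "https://api.opsgenie.com/v1/json/cloudwatch"


def build_subscription(topic_arn: str, api_key: str) -> dict:
    """Return keyword arguments for an SNS HTTPS subscription."""
    endpoint = f"{OPSGENIE_CLOUDWATCH_URL}?{urlencode({'apiKey': api_key})}"
    return {"TopicArn": topic_arn, "Protocol": "https", "Endpoint": endpoint}


def subscribe(topic_arn: str, api_key: str) -> str:
    """Create the subscription; requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder works without AWS access

    sns = boto3.client("sns")
    response = sns.subscribe(**build_subscription(topic_arn, api_key))
    return response["SubscriptionArn"]
```

The topic ARN is whatever topic the CloudWatch alarms publish to; the HTTPS subscription has to be confirmed by the endpoint before SNS delivers messages to it.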
Oh, and I should mention that if an alert clears itself in CloudWatch, that will automatically update and close the alert in OpsGenie, assuming we get the integration set up as I mentioned above.
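To illustrate the auto-close behavior, here is a rough sketch of how a CloudWatch alarm notification maps to an alert action. The mapping itself happens inside OpsGenie and may be configurable there; the function name `opsgenie_action` and the returned labels are hypothetical. `NewStateValue` is the field CloudWatch puts in the SNS message body.

```python
import json


def opsgenie_action(sns_message: str) -> str:
    """Map a CloudWatch alarm notification (JSON text) to an alert action."""
    alarm = json.loads(sns_message)
    state = alarm.get("NewStateValue")
    if state == "ALARM":
        return "create"  # alarm fired: open an alert
    if state == "OK":
        return "close"  # alarm cleared: close the matching alert
    return "ignore"  # e.g. INSUFFICIENT_DATA: assumed to need no action


# Hypothetical notification body for an alarm that has recovered.
example = json.dumps({"AlarmName": "example-alarm", "NewStateValue": "OK"})
print(opsgenie_action(example))
```

This only works end to end if the alarm publishes to the topic on both the ALARM and OK transitions.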
I have created a Topic in our AWS instance so that any messages that go to @mbklein's email should also alert OpsGenie, at least in theory. We could use some kind of artificial alert to make sure things are working, but I'll leave what that test might be to @mbklein. @davidschober
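One possible artificial alert, sketched here as a suggestion only: CloudWatch's `SetAlarmState` API can temporarily force an existing alarm into the ALARM state, which should push a notification through the topic. The alarm name and function names below are hypothetical, and `trigger_test_alert` needs boto3 and AWS credentials.

```python
VALID_STATES = {"OK", "ALARM", "INSUFFICIENT_DATA"}


def build_test_alarm_state(alarm_name: str, state: str = "ALARM") -> dict:
    """Return keyword arguments for CloudWatch's set_alarm_state call."""
    if state not in VALID_STATES:
        raise ValueError(f"unknown alarm state: {state}")
    return {
        "AlarmName": alarm_name,
        "StateValue": state,
        "StateReason": "Manual test of the OpsGenie integration",
    }


def trigger_test_alert(alarm_name: str) -> None:
    """Force the alarm state; requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder works without AWS access

    cloudwatch = boto3.client("cloudwatch")
    cloudwatch.set_alarm_state(**build_test_alarm_state(alarm_name))
```

The forced state lasts only until CloudWatch next evaluates the alarm, so a healthy alarm should return to OK on its own, which would also exercise the close path.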
@Toputnal Not sure what you mean by the last comment. I think we need to ensure the main systems are up and responding:
- Fedora
- Avalon Web
- Avalon Workers
- SOLR
Is that what you're looking for?
Will OpsGenie also alert MBK a second time, Jim? Is that the idea? To test OpsGenie connectivity fully?
Eventually, yes, @d-venckus, but for now I just wanna see some Blinkin' lights! :-)
Got it. Can someone document what we're monitoring?
I'm not certain what the existing monitoring in AWS actually covers (that is, which alarms are currently set up to email @mbklein). I'll dig in and see what I can find (without changing anything at first, obviously). If I accidentally break anything, our SOP is to blame @egspoony ;-) (Back me up on this, @d-venckus!)
I have set up alerts for all the services listed above, minus RDS, which is provided to us as a service rather than being something we run on top of an EC2 instance. Does what I've added look good to you, @d-venckus, @davidschober, @mbklein?
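For documentation purposes, here is a minimal sketch of what one such alarm definition might look like, assuming each service runs on its own EC2 instance and is watched through the standard `StatusCheckFailed` metric. The instance IDs, topic ARN, alarm names, period, and threshold are placeholders, not the values actually configured. The dictionary matches the keyword arguments of boto3's `put_metric_alarm`.

```python
def build_status_alarm(service: str, instance_id: str, topic_arn: str) -> dict:
    """Return put_metric_alarm arguments for an EC2 status-check alarm."""
    return {
        "AlarmName": f"{service}-status-check-failed",
        "Namespace": "AWS/EC2",
        "MetricName": "StatusCheckFailed",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Maximum",
        "Period": 60,  # seconds per evaluation window (placeholder)
        "EvaluationPeriods": 2,  # consecutive failing windows (placeholder)
        "Threshold": 1.0,
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "AlarmActions": [topic_arn],  # notify the topic when the alarm fires
        "OKActions": [topic_arn],  # notify again on recovery
    }


# Placeholder instance IDs and topic ARN, for illustration only.
TOPIC = "arn:aws:sns:us-east-1:123456789012:example-topic"
SERVICES = {
    "fedora": "i-00000000000000001",
    "avalon-web": "i-00000000000000002",
    "avalon-workers": "i-00000000000000003",
    "solr": "i-00000000000000004",
}
alarms = [build_status_alarm(name, iid, TOPIC) for name, iid in SERVICES.items()]
```

Listing the topic under both `AlarmActions` and `OKActions` is what lets a recovery notification reach OpsGenie as well as the initial alert.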
I have set up a Dashboard in CloudWatch called "JRBDashboard" which shows what we are now set up to alert OpsGenie about, so that's probably a good place to look at our current AWS/OpsGenie integration. @d-venckus @mbklein @davidschober
@mbklein I'll let you take a peek and OK it. @Toputnal and @d-venckus can you write up some brief docs on what we are monitoring, how to set it up, etc.? You can put it at https://github.com/nulib/repodev_planning_and_docs/wiki/AVR-Technical-Documentation#monitoring-via-cloudwatch or you can send me a doc and I can c&p.
I will update the doc you listed above @davidschober as soon as we stabilize the thresholds of the things we are monitoring. Currently, we have quite a bit of "noise" which we are still ironing out.
Thanks @Toputnal
@Toputnal I forget: are we closing this and creating a "create final CloudWatch monitoring" issue with the findings? That seems to make sense to me.
All the alerts I created are still in place after the burn down and rebuild. Yay! Moving to Review.