red-hat-data-services / odh-deployer
The odh-deployer image creates a custom resource for the image in operator image in odh-operator-allinone
License: Apache License 2.0
We want to create a fully automated test that validates the Prometheus alert rules: when a PR is opened, the test runs and checks whether the changes would cause the alerts to fire.
The alert rules are defined in the odh-deployer Prometheus ConfigMap, while components such as the CodeFlare Operator, MCAD, and others live in their own separate repos, which makes it more challenging to run these tests against all components.
We need to revisit the Grafana dashboard once we have the DB monitoring metrics ready and do the following:
Error Measurement for JupyterHub Database
RHODS creates the following namespaces:
redhat-ods-applications
redhat-ods-monitoring
redhat-ods-operator
rhods-notebooks
rhods-notebooks uses a different prefix from the other namespaces that RHODS creates, which makes it harder to search for and find the namespaces related to RHODS.
We require Prometheus to actively look for the pods and fire alerts if the conditions are met. The CodeFlare and MCAD components need to expose a metrics endpoint before the alerts can pass.
Alerts can be added to test whether Prometheus is successfully scraping the endpoints and to meet the SLIs/SLOs.
Currently ODH Application, ODH Documentation, and ODH Quickstart CRDs and CRs are stored in the deployer repo. For consistency and maintainability we should remove these resources from the deployer repo and move them over to the manifest repo. This will bring us in line with what is currently being done with other CRDs within the project.
Here is the PR with the current implementation #260
This will require updates to the deployer bash script as well.
Grafana needs the definition of its data source (in our case Prometheus) to be provided in the grafana-datasources secret. In the secret, we provide a bearer token from a service account. We need some sort of templating/automation that creates this secret with the right bearer token every time the script is deployed. We aren't tracking this secret in git currently, so once we have a solution we need to add it to this repo.
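One possible shape for that automation is sketched below in Python. It is only an illustration, not the project's implementation: the function name `build_datasource_secret`, the `datasources.yaml` key, the data source name, the default namespace, and the example URL are all assumptions. The data source body follows Grafana's provisioning format, where a custom `Authorization` header carries the bearer token, and it is emitted as JSON, which is also valid YAML.

```python
import json


def build_datasource_secret(token, prometheus_url, namespace="redhat-ods-monitoring"):
    """Build a Secret manifest (as a dict) holding a Grafana data source definition.

    `token` is the service account bearer token, obtained separately.
    The key name "datasources.yaml" and the default namespace are assumptions.
    """
    datasources = {
        "apiVersion": 1,
        "datasources": [
            {
                "name": "prometheus",  # hypothetical data source name
                "type": "prometheus",
                "access": "proxy",
                "url": prometheus_url,
                # Grafana sends this header with every query to Prometheus.
                "jsonData": {"httpHeaderName1": "Authorization"},
                "secureJsonData": {"httpHeaderValue1": f"Bearer {token}"},
            }
        ],
    }
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": "grafana-datasources", "namespace": namespace},
        "type": "Opaque",
        # stringData lets the API server handle the base64 encoding.
        "stringData": {"datasources.yaml": json.dumps(datasources, indent=2)},
    }
```

Printing `json.dumps(build_datasource_secret(token, url))` yields a manifest that could, for example, be piped to `oc apply -f -` from the deployer script, so the token never needs to be committed to git.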
We would like to create and run unit tests on the rules that are added. The simplest way to achieve this is to use promtool. promtool requires the alerts to be in their own YAML file, as the tool cannot parse them directly from the ConfigMap.
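As a rough sketch of the extraction step, the Python below pulls one rules entry out of a ConfigMap and writes it to its own file. It assumes the ConfigMap is available as JSON (for instance from `oc get configmap ... -o json`) so that only the standard library is needed; the function names and the key passed in are hypothetical, and the real key in the odh-deployer ConfigMap may differ.

```python
import json


def extract_rules(configmap_json, key):
    """Return the rules text stored under `key` in a ConfigMap given as JSON."""
    configmap = json.loads(configmap_json)
    if configmap.get("kind") != "ConfigMap":
        raise ValueError("expected a ConfigMap manifest")
    data = configmap.get("data", {})
    if key not in data:
        raise KeyError(f"{key!r} not found; available keys: {sorted(data)}")
    return data[key]


def write_rules(configmap_json, key, out_path):
    """Write the extracted rules to `out_path` so promtool can read them."""
    rules = extract_rules(configmap_json, key)
    with open(out_path, "w", encoding="utf-8") as f:
        f.write(rules)
    return out_path
```

The resulting file can then be validated with `promtool check rules <file>` and referenced from a unit test file run with `promtool test rules <test file>`.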
The Dockerfile sets the HOME env var to /root. Does this imply the image is running as root? Does it need to? If it isn't running as root, HOME should be set somewhere else.