layer5io / meshery-smp-action
GitHub Action for pipelining microservices and Kubernetes performance testing with Meshery
Home Page: https://layer5.io/projects/nighthawk
License: Apache License 2.0
Current tests use bash scripts to deploy service meshes and apps, but they should be using `mesheryctl mesh` and `mesheryctl app` instead.
Performance test profiles are currently named `{service-mesh}-{load-generator}-{test-configuration}.yaml`, which is unwieldy and hard to understand.
The goal of this issue is to discuss and decide on a profile naming scheme that makes it clear what is being tested.
Some CNCF runners are not being removed after tests finish, and their number gradually increases over time.
We can delete them manually, but it's better to make sure they are removed automatically.
The same problem occurred with Equinix server deletion.
We should add retries and confirmations to ensure CNCF runners and Equinix machines are removed.
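A minimal sketch of the retry-and-confirm idea: `delete_runner` is a placeholder for the real teardown command (e.g. an Equinix Metal API call or the CNCF runner deregistration step), and `RETRY_DELAY` is a hypothetical knob for the back-off interval.

```shell
# Sketch only: delete_runner stands in for the real teardown command.
delete_with_retry() {
  local target="$1" attempts=0 max_attempts=5
  until delete_runner "$target"; do
    attempts=$((attempts + 1))
    if [ "$attempts" -ge "$max_attempts" ]; then
      echo "failed to remove $target after $max_attempts attempts" >&2
      return 1
    fi
    # back off before retrying; RETRY_DELAY can be overridden in tests
    sleep "${RETRY_DELAY:-30}"
  done
  echo "$target removed"
}
```

The confirmation half would be a follow-up query (list remaining runners/devices) asserting the target no longer appears.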
The project's current issue templates are missing an open invitation link where new contributors can join the community's Figma team and view user interface designs and other UX projects.
Each template that references Figma in its resources section should have the invite link added:
- 🎨 Wireframes and [designs for Meshery UI](https://www.figma.com/file/SMP3zxOjZztdOLtgN4dS2W/Meshery-UI) in Figma [(open invite)](https://www.figma.com/team_invite/redeem/qJy1c95qirjgWQODApilR9)
Acceptance Tests
All references to Figma include the "open invite" link.
The Scheduled Benchmark Tests workflow creates dynamic test names based on the configuration of the test. Current format is shown below.
So, a sample test name now is: istio-fortio-load-test.yaml
Remove the `.yaml` extension from the test name and only include the rest of the file name.
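Stripping the extension is a one-line change with shell parameter expansion; `profile_file` here is an illustrative stand-in for the name the workflow generates.

```shell
# Illustrative; profile_file stands in for the workflow's generated name.
profile_file="istio-fortio-load-test.yaml"
test_name="${profile_file%.yaml}"   # strip the trailing .yaml extension
echo "$test_name"                   # istio-fortio-load-test
```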
Current State:
Template content in the readme.
Desired State:
Project-specific content in the readme.
Contributor Resources
Currently we are
The instructions are mentioned here
Ideally both of these tasks should be automated. We can use Terraform and its existing support for the Equinix APIs to achieve this:
https://github.com/equinix/terraform-provider-equinix
https://github.com/equinix/cloud-provider-equinix-metal
https://github.com/machulav/ec2-github-runner#example
Successful action runs with complete automation would solve this issue
@leecalcote @gyohuangxin would creating a runner on demand (i.e. after starting a workflow) mean that the self-hosted runner itself would not be needed, given that we have to register a self-hosted runner to a repository first?
https://docs.github.com/en/actions/hosting-your-own-runners/adding-self-hosted-runners
The application deployed by the SMP GitHub action is not reachable by Meshery.
Root cause: `minikube tunnel` is easily killed.

Tests currently run on a 24-hour period.
As the project ramps the diversity of testing, it would be good to get these results generated more frequently in order to iterate more quickly on test harness updates and performance test profiles.
Increase frequency of self-hosted performance tests to once per hour
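For reference, the change would be a one-line edit to the workflow's trigger; this is a sketch, and the cron expression `0 * * * *` (top of every hour) is an assumption about the desired cadence:

```yaml
on:
  schedule:
    # hourly instead of daily; adjust once the harness is stable
    - cron: '0 * * * *'
```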
The SMP benchmark tests that run as GitHub Actions are currently failing.
We want to figure out what is causing this issue and fix it so that correct, error-free performance tests are published on the SMP Dashboard.
Currently we use `app onboard` to deploy manifests.
We should use `pattern apply` where possible, as that gives us flexibility and configurability.
We have included Istio, Linkerd, and OSM performance tests.
We should enable performance tests for the other service meshes listed on https://smp-spec.io/dashboard.
Writing bash scripts for each mesh is time-consuming, so we should use `mesheryctl` to deploy them.
We should also use `mesheryctl app onboard` to deploy sample apps once #48 is ready.
We use `mesheryctl pattern apply` to deploy applications and manifests on the Istio mesh, but it fails sometimes, even though we increased the sleep time: https://github.com/layer5io/meshery-smp-action/runs/8251690332?check_suite_focus=true#step:5:44
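Rather than a longer fixed sleep, the step could retry on failure. A sketch: `apply_cmd` wraps the real call so the retry logic can be exercised without a live Meshery server, and the pattern file path is a placeholder.

```shell
# Sketch of a retry wrapper around the flaky step; the file path is a
# placeholder, not the real pattern file.
apply_cmd() { mesheryctl pattern apply -f "$1"; }

retry_apply() {
  local file="$1" tries="${2:-5}" delay="${3:-20}"
  local i=1
  while [ "$i" -le "$tries" ]; do
    if apply_cmd "$file"; then
      return 0
    fi
    echo "pattern apply failed (attempt $i/$tries), retrying in ${delay}s" >&2
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
}
```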
If any job in this workflow fails, an email should be sent to [email protected] with details of the failure - https://github.com/meshery/meshery/blob/master/.github/workflows/build-and-release-stable.yml
Multiple issues of the Scheduled Benchmark Tests on Self Hosted Runner generate different results
In this run just one combination failed: https://github.com/layer5io/meshery-smp-action/runs/7770247406 (linkerd-wrk-soak)
In this run a couple failed: https://github.com/layer5io/meshery-smp-action/actions/runs/2832186044
And in this run the runner startup itself failed: https://github.com/layer5io/meshery-smp-action/actions/runs/2833335073
The behaviour should be consistent across Scheduled Benchmark Test runs on self-hosted runners. Given that we are working with external hardware where connections can fail, ideally we should have a retry mechanism, and we should work to considerably reduce the inconsistency.
In the logs, authentication seems to be a common culprit:
Opening Meshery (http://localhost:31391) in browser.
Failed to open Meshery in browser, please point your browser to http://localhost:31391 to access Meshery.
authentication failed: Get "http://localhost:31391/api/providers": dial tcp [::1]:31391: connect: connection refused
Verifying prerequisites...
Authentication token not found. please supply a valid user token with the --token (or -t) flag. or login with `mesheryctl system login`
Onboarding application... Standby for few minutes...
Error: Authentication token not found. please supply a valid user token with the --token (or -t) flag. or login with `mesheryctl system login`
This is seen for Istio:
Opening Meshery (http://192.168.49.2:32398/) in browser.
Failed to open Meshery in browser, please point your browser to http://192.168.49.2:32398/ to access Meshery.
Verifying prerequisites...
Adapter for required mesh not found
Onboarding application... Standby for few minutes...
rpc error: code = Unknown desc = no matches for kind "Gateway" in version "networking.istio.io/v1alpha3"
Error from server (NotFound): namespaces "istio-system" not found
Error from server (NotFound): namespaces "istio-system" not found
Service Mesh: Istio - ISTIO
Gateway URL: http://192.168.49.2:
Current State:
No newcomers-alert.yml
Out of date slack.yml
Desired State:
Updated newcomers-alert.yml and slack.yml
Contributor Resources
We had an implementation of running SMP on a self-hosted runner (#39), but the configuration of the self-hosted runner is hardcoded as "c3.small.x86".
We should make the self-hosted runner configurable via the workflow's options, e.g. server type, location, etc.
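One way to surface these options is through workflow inputs. This is a sketch only; the input names (`server_type`, `location`) and defaults are assumptions, not the workflow's actual option names:

```yaml
on:
  workflow_dispatch:
    inputs:
      server_type:
        description: 'Equinix Metal server plan'
        default: 'c3.small.x86'
      location:
        description: 'Equinix Metal metro'
        default: 'da'
```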
The Scheduled Benchmark Tests workflow runs performance benchmark tests at regular intervals and captures the test results.
It runs the tests defined in these two test configuration files.
These test configurations are not yet defined properly.
Define proper test configurations to run benchmark tests.
Update the test configuration files linked above.
Description
Provide a sample performance test which would be done using this action so that users can consider what runner types and benchmark configurations should be used in other tests.
The Meshery SMP Action is failing on some CI runs because the Kubernetes cluster in the workflow is inaccessible.
The error must be rectified so that CI runs are free of errors.
Currently only the Istio adapter has a pattern file and hence can use the `pattern apply` construct. For the Linkerd and OSM tests we need to use `app onboard` to apply their demo Kubernetes manifests.
We should run a common `pattern apply` method across all meshes.
Runs showing that `pattern apply` works for all meshes.
As we start using this action, we also need to take note of environment specifications like the test configurations and GitHub runner specs, as well as review the results of the performance benchmarks in this environment.
The scheduled tests that run multiple times a day have faced a few challenges. Notably, one of those challenges is in the cleanup phase once a test is complete. Currently, it is frequently the case that some number of the bare metal servers used for testing are orphaned and not decommissioned at the end of each test. This leaves an inordinate number of bare metal servers unnecessarily unavailable for use by other projects.
@vielmetti has been most helpful in identifying ways to mitigate this from happening.
All resources provisioned for a scheduled test are subsequently decommissioned at the end of that same test.
Recently @vielmetti pointed this out:
You can create servers that will auto-delete themselves at a time certain, perfect for test runs. See https://deploy.equinix.com/developers/docs/metal/deploy/spot-market/#spot-market-request-creation. You want the “end_at” parameter on the API endpoint for device creation
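A sketch of how that could look, following the quoted docs: compute an end time one hour ahead and pass it as `end_at` at device creation. `PROJECT_ID` and `METAL_AUTH_TOKEN` are placeholders, the plan/metro/OS values are illustrative, and the request only fires when a token is set.

```shell
# Compute an auto-delete time one hour from now (GNU date).
end_at="$(date -u -d '+1 hour' '+%Y-%m-%dT%H:%M:%SZ')"

# Placeholder credentials; request is skipped when no token is set.
if [ -n "${METAL_AUTH_TOKEN:-}" ]; then
  curl -s -X POST "https://api.equinix.com/metal/v1/projects/${PROJECT_ID}/devices" \
    -H "X-Auth-Token: ${METAL_AUTH_TOKEN}" \
    -H 'Content-Type: application/json' \
    -d "{\"plan\":\"c3.small.x86\",\"metro\":\"da\",\"operating_system\":\"ubuntu_22_04\",\"end_at\":\"${end_at}\"}"
fi
```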
A panic error occurs when running the benchmark test either on a GitHub runner or a CNCF cluster runner.
Logs for GitHub runner:
https://github.com/layer5io/meshery-smp-action/runs/5493977248?check_suite_focus=true#step:6:1573
Logs for CNCF cluster runner:
https://github.com/gyohuangxin/meshery-smp-action/runs/5462624684?check_suite_focus=true#step:6:1174
Regarding #38, we have implemented running SMP on a self-hosted CNCF cluster, and the code has been merged into the self-hosted branch. We should cherry-pick it to the master branch.
Description
This repository is meant for the Meshery GitHub Action for performing SMP tests.
Using some boilerplate code from https://github.com/layer5io/meshery-smi-conformance-action, initialize this action to use `mesheryctl perf` subcommands for creating a performance test.
Currently we arbitrarily wait 10 minutes for the server to be provisioned and started.
We want to optimise our waiting time by polling the state variable reported by the machine.
The logic for this can be added in the above-mentioned bash script.
A little experimentation might be required to tune the polling interval. (We do not want to cause anything that might look like a DoS attack.)
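A minimal sketch of the polling loop: `get_state` is a placeholder for the real API call (e.g. a curl to the Equinix Metal device endpoint returning the `state` field), and `"active"` is assumed to be the ready state.

```shell
# Sketch only: get_state stands in for the real device-state API call.
wait_until_active() {
  local device="$1" timeout="${2:-600}" interval="${3:-15}" waited=0
  while [ "$waited" -lt "$timeout" ]; do
    if [ "$(get_state "$device")" = "active" ]; then
      return 0
    fi
    sleep "$interval"            # poll gently; don't hammer the API
    waited=$((waited + interval))
  done
  echo "timed out waiting for $device to become active" >&2
  return 1
}
```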
A link to a successful self-hosted workflow run will be key to getting your changes accepted.
These versions are highlighted in the readme as being used by the action currently:
minikube version: 'v1.21.0'
kubernetes version: 'v1.20.7'
The latest versions of these tools are:
minikube version: 'v1.30.1'
kubernetes version: 'v1.27.3'
List this repo's name in the slack.yml workflow.
Currently the tests run with the sample application of each particular service mesh, which differs per mesh.
Add the sample application as a configurable item and run tests across multiple applications for each service mesh.
A new field should be added here, and the scripts need to be changed to take in this dynamic value.
Open Service Mesh is an archived project. Its performance testing here can be removed.