Comments (3)
Procedure to implement Gatling
Objective: To monitor the performance of our system and detect imminent failures before they occur, we plan to deploy the following tools for system analysis.
- Gatling: A load-testing tool that mimics user behavior. (Heavier load can be applied to the system by increasing the number of synthetic users.)
- Grafana: A dashboard tool for real-time monitoring of system performance. Grafana queries Prometheus to retrieve the relevant metrics, and these queries can be adjusted for different analyses.
- Prometheus: A system that gathers metrics and stores them in a time-series database.
Pre-requisite
- An AWS EKS cluster should be up and running to serve the 4 microservices: db (database service), s1 (user service), s2 (music service), and s3 (playlist service).
a. Start a fresh EKS cluster.
make -f eks.mak start
- Ensure AWS DynamoDB is initialized. The tables have to be available for db (database service) to serve the other 3 microservices (s1/s2/s3).
aws dynamodb list-tables
- Provision the cluster. This includes:
a. Create a namespace within the cluster in which applications will be placed.
kubectl create ns c756ns
kubectl config set-context --current --namespace=c756ns
b. Provision the Kubernetes cluster. This includes “Installing Istio”, “Installing Prometheus stack by calling obs.mak recursively”, and “Deploying and monitoring the four microservices”.
make -f k8s.mak provision
- Get the Grafana URL used to access the dashboard.
make -f k8s.mak grafana-url
• User: admin
• Password: prom-operator
Note: The hostname is obtained from the istio-system namespace.
kubectl get -n istio-system svc/grafana-ingress -o jsonpath="{.status.loadBalancer.ingress[0]['ip','hostname']}"
Parameters:
1: path to kubectl
2: namespace
3: the resource to query (typically an svc)
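The jsonpath expression above reads the ip or hostname of the first load-balancer ingress entry. As a rough illustration of that lookup, the Python sketch below performs a similar extraction on the JSON printed by kubectl get ... -o json. The function name and the sample hostname are hypothetical and not part of the project's tooling, and unlike the jsonpath expression this sketch returns only one value, preferring ip.

```python
import json


def ingress_address(service_json: str) -> str:
    """Return the ip (or, failing that, the hostname) of the first ingress entry."""
    svc = json.loads(service_json)
    ingress = svc.get("status", {}).get("loadBalancer", {}).get("ingress", [])
    if not ingress:
        # The load balancer may not have been assigned an address yet.
        raise ValueError("load balancer has no ingress entry yet")
    first = ingress[0]
    return first.get("ip") or first["hostname"]


# Hypothetical, trimmed output of:
#   kubectl get -n istio-system svc/grafana-ingress -o json
sample = '{"status": {"loadBalancer": {"ingress": [{"hostname": "example.elb.amazonaws.com"}]}}}'
print(ingress_address(sample))
```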
How does Gatling work?
1. We will be using a Gatling docker image which will allow us to create and apply synthetic load to our system.
ghcr.io/scp-2021-jan-cmpt-756/gatling:3.4.2
2. Scenarios of user behavior are defined in “ReadTables.scala”.
a. The original package name is defined, which will be used to trigger a Gatling instance later.
package proj756
b. The required imports for Gatling to work.
import scala.concurrent.duration._
import io.gatling.core.Predef._
import io.gatling.http.Predef._
c. “Utility” object
envVarToInt("USERS", 1)
- Utility to get an Int from an environment variable, with a default value. (e.g., the number of users is set in the “docker container run” command.)
envVar("CLUSTER_IP", "127.0.0.1")
- Utility to get a String from an environment variable, with a default value. (e.g., the cluster IP is set in the “docker container run” command.)
d. “RMusic” and “RUser” objects – Scenarios for the respective services. Each sends an HTTP GET request with a {UUID} repeatedly, once per second.
Note: “eager()” loads the whole data set into memory before the simulation starts, saving disk access at runtime. “random()” randomly picks an entry in the sequence. “circular()” goes back to the top of the sequence once the end is reached.
e. “RUserVarying” and “RMusicVarying” objects – Scenarios for the respective services. Each sends an HTTP GET request with a {UUID} repeatedly, with a varying interval between calls. (Each interval is randomly selected between 1 and 60 seconds.)
f. “ReadTablesSim” class – This class extends the “Simulation” class and defines the HTTP protocol for the simulations. (e.g., the cluster IP is read from the environment variable set in the “docker container run” command.)
g. “ReadUserSim” and “ReadMusicSim” classes – These classes are invoked directly by the “docker container run” command. They inject independent users (as defined in the “RUser” and “RMusic” objects) using the HTTP protocol defined in the “ReadTablesSim” class.
h. “ReadBothVaryingSim” class – This class is invoked directly by the “docker container run” command. It injects concurrent users (as defined in the “RUserVarying” and “RMusicVarying” objects) using the HTTP protocol defined in the “ReadTablesSim” class.
Note: There are 2 types of workload model for injection: open vs. closed.
• Closed systems, where you control the number of concurrent users. The number of concurrent users is capped; at full capacity, a new user can enter the system only once another exits.
• Open systems, where you control the arrival rate of users. Open systems have no control over the number of concurrent users: users keep arriving even when the application has trouble serving them.
Note: For the closed model, we use two methods to inject users.
• constantConcurrentUsers(nbUsers).during(duration): Inject so that the number of concurrent users in the system is constant.
• rampConcurrentUsers(fromNbUsers).to(toNbUsers).during(duration): Inject so that the number of concurrent users in the system ramps linearly from one number to another.
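To make "ramps linearly" concrete, here is a small Python sketch of the target concurrency at a given moment of a ramp. It only illustrates the arithmetic; it is not Gatling's implementation, and the function name is hypothetical.

```python
def ramp_concurrent_users(from_users: int, to_users: int,
                          duration_s: float, t: float) -> float:
    """Target number of concurrent users t seconds into a linear ramp."""
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    # Clamp t so that values outside the ramp window hold the end points.
    t = min(max(t, 0.0), duration_s)
    return from_users + (to_users - from_users) * (t / duration_s)


# Ramp from 1 to 11 users over 10 seconds, sampled every 2.5 seconds.
for second in (0, 2.5, 5, 7.5, 10):
    print(second, ramp_concurrent_users(1, 11, 10, second))
```

A constant-concurrency injection is the special case where the start and end values are equal.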
3. Create a script that will trigger Gatling. (e.g., gatling-1-music.sh)
docker container run --detach --rm \
-v ${PWD}/gatling/results:/opt/gatling/results \
-v ${PWD}/gatling:/opt/gatling/user-files \
-v ${PWD}/gatling/target:/opt/gatling/target \
-e CLUSTER_IP=`tools/getip.sh kubectl istio-system svc/istio-ingressgateway` \
-e USERS=1 \
-e SIM_NAME=ReadMusicSim \
--label gatling \
ghcr.io/scp-2021-jan-cmpt-756/gatling:3.4.2 \
-s proj756.ReadMusicSim
4. To list Gatling containers currently running
tools/list-gatling.sh
5. To stop all the Gatling containers.
tools/kill-gatling.sh
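The USERS and CLUSTER_IP values passed with -e in the script above are what the “Utility” helpers read, falling back to a default when a variable is missing. The Python sketch below mirrors that idea under one assumption: a value that is not a valid integer also falls back to the default. The function names are hypothetical Python counterparts, not the Scala originals.

```python
import os


def env_var(name: str, default: str) -> str:
    """Return the environment variable as a string, or the default if unset."""
    return os.environ.get(name, default)


def env_var_to_int(name: str, default: int) -> int:
    """Return the environment variable as an int, or the default.

    Assumption: an unset variable and a non-integer value both yield the default.
    """
    try:
        return int(os.environ[name])
    except (KeyError, ValueError):
        return default


users = env_var_to_int("USERS", 1)
cluster_ip = env_var("CLUSTER_IP", "127.0.0.1")
print(f"users={users} cluster_ip={cluster_ip}")
```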
from term-project-cloudriven.
Prometheus Basics - Time Series
- We can query Prometheus directly without Grafana.
- We can output metrics to Prometheus.
Prometheus Basics – Two Fundamental Roles
- First, it gathers and records metrics in a time-series database (TSDB), which includes special compression techniques optimized for this type of data.
- Second, it supports queries against that database. It features a query language, PromQL, that meets the specific needs of time series data.
Prometheus Technical Details
- The set of metrics available from a given container is determined by that container, not Prometheus.
- The set of metrics available from our three microservices is defined by the Python client library we use, the Python Prometheus Flask exporter. We may define new metrics for our term project.
Pre-requisite
- Complete the cluster setup described in the Gatling pre-requisites above: start the EKS cluster, confirm the DynamoDB tables are available, create the c756ns namespace, and provision the cluster (make -f k8s.mak provision).
- Get the Prometheus URL, which lets us run queries on Prometheus directly.
make -f k8s.mak prometheus-url
Note: The hostname is obtained from the istio-system namespace.
kubectl get -n istio-system svc/prom-ingress -o jsonpath="{.status.loadBalancer.ingress[0]['ip','hostname']}"
Parameters:
1: path to kubectl
2: namespace
3: the resource to query (typically an svc)
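Once the Prometheus URL is known, queries can also be sent through Prometheus's HTTP API rather than typed into the web UI. The sketch below builds the request URL for an instant query against the /api/v1/query endpoint; the base URL is a placeholder, and the function name is hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def instant_query_url(base_url: str, promql: str) -> str:
    """Build the URL for an instant query against the Prometheus HTTP API."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


# Placeholder host; substitute the value printed by `make -f k8s.mak prometheus-url`.
url = instant_query_url("http://prometheus.example/",
                        'flask_http_request_total{service="cmpt756db"}')
print(url)

# Fetching is left commented out so the sketch runs without a cluster:
# import json, urllib.request
# with urllib.request.urlopen(url) as resp:
#     print(json.load(resp)["data"]["result"])
```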
A query returning a single time series
The following query requests the current values of all time-series that have their service label assigned the string cmpt756db.
flask_http_request_total{service="cmpt756db"}
flask_http_request_total{container="cmpt756db",endpoint="http", instance="10.244.1.10:30002",job="cmpt756db",method="GET",namespace="c756ns", pod="cmpt756db-79ddc5446d-2566f",service="cmpt756db",status="200"}
Instant vector: A query returning multiple time series
This query requests every time series for our sample metric, regardless of its label values. Note that the returned values were not necessarily sampled at the same time; they are simply the most recent sample of each time series.
flask_http_request_total
Range vector: A query returning several values from a single series
Our next query will return to the single time series but we will ask for all the samples over a given time range, returning a range vector.
flask_http_request_total{service="cmpt756db"}[5m]
The [5m] suffix requests all samples from the most recent 5 minutes, ordered from oldest to most recent.
The entries in the Value column will now include both a count and a timestamp, separated by an @ symbol. The timestamp is in seconds since January 1, 1970, GMT. Copy one of the timestamps and paste it into a Unix epoch converter to decode the time into something more understandable.
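Instead of an online converter, the timestamp can be decoded with a couple of lines of Python. This is a minimal sketch; the sample timestamp is made up for illustration and the function name is hypothetical.

```python
from datetime import datetime, timezone


def epoch_to_utc(timestamp: float) -> str:
    """Convert seconds since 1970-01-01 UTC into an ISO 8601 string."""
    return datetime.fromtimestamp(timestamp, tz=timezone.utc).isoformat()


print(epoch_to_utc(1617235200))  # 2021-04-01T00:00:00+00:00
```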
Multiple range vectors
We can also run a query requesting ranges for multiple time series.
flask_http_request_total[5m]
Matching query types to vector type
The PromQL language enforces the distinction between instant and range vectors.
• Aggregation operators such as avg or min can only be applied to instant vectors.
• Functions that compute a value over time, such as increase or rate, can only be applied to range vectors.
The list of PromQL functions specifies for each function whether a vector argument must be instant or range.
Computing a rate across a range (feat. range vectors)
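The rate of HTTP calls per second (the increase over the range divided by the number of seconds)
rate(flask_http_request_total{service="cmpt756db"})
WRONG!!! (rate needs a range vector, but this selector is an instant vector)
rate(flask_http_request_total{service="cmpt756db"}[5m])
CORRECT!!!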
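The reason rate needs a range vector can be seen from a simplified calculation: a per-second rate requires at least two samples of the counter. The Python sketch below is a rough approximation only. It treats a drop in the counter as a reset, and it does not reproduce the extrapolation to the window boundaries that Prometheus applies. The function name and sample values are hypothetical.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, counter value)


def simple_rate(samples: List[Sample]) -> float:
    """Approximate per-second rate of a counter over the given samples."""
    if len(samples) < 2:
        # A single sample (an instant vector) carries no change over time.
        raise ValueError("rate needs at least two samples, i.e. a range vector")
    elapsed = samples[-1][0] - samples[0][0]
    if elapsed <= 0:
        raise ValueError("samples must span a positive time interval")
    increase = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        # If the counter dropped, assume it restarted from zero.
        increase += (cur - prev) if cur >= prev else cur
    return increase / elapsed


# Three samples taken one minute apart: 60 requests in 120 seconds.
print(simple_rate([(0.0, 10.0), (60.0, 40.0), (120.0, 70.0)]))  # 0.5
```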
Computing an average across an instant (feat. instant vectors)
The average number of HTTP requests per time series since each series began.
avg(flask_http_request_total{service="cmpt756db"}[5m])
WRONG!!! (avg needs an instant vector, but the [5m] suffix makes this a range vector)
avg(flask_http_request_total{service="cmpt756db"})
CORRECT!!!
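By contrast, an aggregation such as avg works across series at one instant: it takes the most recent sample of each matching time series and averages them. A minimal Python sketch follows, with made-up label sets and counts and a hypothetical function name.

```python
from typing import Dict


def avg_instant_vector(latest: Dict[str, float]) -> float:
    """Average the most recent sample of each time series."""
    if not latest:
        raise ValueError("no time series matched the selector")
    return sum(latest.values()) / len(latest)


# Hypothetical instant vector: one latest value per label combination.
instant_vector = {
    'method="GET",status="200"': 120.0,
    'method="GET",status="404"': 4.0,
    'method="PUT",status="200"': 26.0,
}
print(avg_instant_vector(instant_vector))  # 50.0
```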
This is considered done. Closing the issue.