
kibana-load-testing's Introduction

kibana-load-testing

Environment requirements

  • Maven 3.3.9+
  • Java (JDK) 8+

Running performance testing on CI

Kibana CI has dedicated jobs to run performance testing for your Kibana branch or Cloud snapshot.

Load testing infrastructure on CI

We are using a bare metal machine EX-62 provided by Hetzner:

  • Intel® Core™ i9-9900K (8 cores)
  • 128 GB DDR4 RAM
  • 2 TB SSD
  • 1 GBit/s-Port connection

Execution is managed by the functional test runner (FTR): we have a custom FTR config file that defines how to start the Elasticsearch and Kibana servers, and a custom runner to start the Gatling simulation. At the moment both ES/Kibana and the Gatling runner are hosted on the same machine.

Running performance testing on your machine

Note: While running a high-load test locally you might face various issues, so we suggest using dedicated machines and making sure you are aware of the required environment tunings to minimise side effects.
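For example, typical Linux tunings for a high-connection load generator look like the following (illustrative values only, not from this repo; check the official Gatling OS tuning guide before applying them on your machine):

ulimit -n 65536                                           # raise the open file / socket limit
sudo sysctl -w net.ipv4.ip_local_port_range="1025 65535"  # widen the ephemeral port range
sudo sysctl -w net.ipv4.tcp_tw_reuse=1                    # allow reusing sockets stuck in TIME_WAIT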

Running simulation against a local instance

  • Start ES and Kibana instances.

Important: Run Kibana without a base path, or add a static one to your kibana.yml (e.g. server.basePath: "/xfh") before starting.

  • Update the Kibana configuration in the src/test/resources/config/local.conf file:
host {
  kibana = "http://localhost:5620" // Kibana base url
  // "http://localhost:5620/xhf" if you start Kibana with static base path
  es = "http://localhost:9220" // ElasticSearch base url
  version = "8.0.0" // Stack version
}

security {
  on = true // false for OSS, otherwise - true
}

auth {
  providerType = "basic"
  providerName = "basic"
  username = "elastic" // user should have permissions to load sample data and access plugins
  password = "changeme"
}
  • Start the test scenario
mvn clean test-compile // if you made any changes to the config or simulations
mvn gatling:test -Dgatling.simulationClass=org.kibanaLoadTest.simulation.branch.DemoJourney // could be any other existing simulation class

Running simulation against existing cloud deployment

  • Create Elastic Cloud deployment
  • Add a new configuration file in src/test/resources/config: cloud-8.5.0.conf
host {
  kibana = "https://gcp-8-5-0-def.kb.us-central1.gcp.cloud.es.io:9243"
  es = "https://gcp-8-5-0-def.es.us-central1.gcp.cloud.es.io"
  version = "8.5.0"
}

security {
  on = true
}

auth {
  providerType = "basic"
  providerName = "cloud-basic"
  username = <username>
  password = <password>
}
  • Start the test scenario with your newly added config:
mvn clean test-compile
mvn gatling:test -Denv=config/cloud-8.5.0.conf -Dgatling.simulationClass=org.kibanaLoadTest.simulation.cloud.LensJourney

Running simulation against newly created cloud deployment

  • Generate API_KEY for your cloud user account
  • Check deployment template at src/test/resources/config/deploy/default.conf
  • Start the test scenario; a new deployment will be created before the simulation and deleted after it finishes
mvn clean test-compile
export API_KEY=<your_cloud_key>
mvn gatling:test -DcloudStackVersion=7.11.0-SNAPSHOT -Dgatling.simulationClass=org.kibanaLoadTest.simulation.cloud.DemoJourney
  • Optionally create a custom deployment configuration and pass it to the command with -DdeploymentConfig=config/deploy/custom.conf, for example:
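A hypothetical custom.conf could mirror the fields used by the default deployment payload (the field names below are inferred from the deployment log further down; check src/test/resources/config/deploy/default.conf for the exact schema):

elasticsearch {
  deployment_template = "gcp-io-optimized"
  memory = 8192
}

kibana {
  memory = 1024
}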

Follow logs to track deployment status:

09:40:23.535 [INFO ] httpClient - preparePayload: Using Config(SimpleConfigObject({"elasticsearch":{"deployment_template":"gcp-io-optimized","memory":8192},"kibana":{"memory":1024},"version":"7.11.0-SNAPSHOT"}))
09:40:23.593 [INFO ] httpClient - createDeployment: Creating new deployment
09:40:29.848 [INFO ] httpClient - createDeployment: deployment b76dd4a9255a417ca133fe8edd8157a2 is created
09:40:29.848 [INFO ] httpClient - waitForClusterToStart: waitTime 300000ms, poolingInterval 20000ms
09:40:30.727 [INFO ] httpClient - waitForClusterToStart: Deployment is in progress... Map(kibana -> initializing, elasticsearch -> initializing, apm -> initializing)
...
09:46:01.211 [INFO ] httpClient - waitForClusterToStart: Deployment is in progress... Map(kibana -> reconfiguring, elasticsearch -> started, apm -> started)
09:46:21.989 [INFO ] httpClient - waitForClusterToStart: Deployment is ready!
...
...
10:01:08.146 [INFO ] i.g.c.c.Controller - StatsEngineStopped
simulation org.kibanaLoadTest.simulation.cloud.DemoJourney completed in 429 seconds
10:01:08.148 [INFO ] httpClient - deleteDeployment: Deployment b76dd4a9255a417ca133fe8edd8157a2
10:01:09.440 [INFO ] httpClient - deleteDeployment: Finished with status code 200

Adding new simulation

The simplest way is to add a new class in the simulation package:

class MySimulation extends BaseSimulation {
  val scenarioName = s"My new simulation ${appConfig.buildVersion}"

  val scn = scenario(scenarioName)
    .exec(
      Login
        .doLogin(
          appConfig.isSecurityEnabled,
          appConfig.loginPayload,
          appConfig.loginStatusCode
        )
        .pause(5 seconds)
    )
    // combine your simulation using existing scenarios or add new ones
    .exec(Discover.doQuery(appConfig.baseUrl, defaultHeaders).pause(5 seconds))
    .exec(...)
    .exec(...)

  // Define load model, check https://gatling.io/docs/current/general/simulation_setup/
  setUp(
    scn
      .inject(
        rampConcurrentUsers(10) to (250) during (4 minutes)
      )
      .protocols(httpProtocol)
  ).maxDuration(10 minutes)
}

In order to run your simulation, use the following command:

mvn gatling:test -Dgatling.simulationClass=org.kibanaLoadTest.simulation.MySimulation

Running simulation from APM traces collected during single user journey run on CI

We created the GenericJourney simulation in order to run scalability testing for a single user journey. The simulation reads a json file with APM traces directly in the Gatling runtime and makes the API calls defined in the file. In order to run it, pass the json file using the following command:

mvn gatling:test -Dgatling.simulationClass=org.kibanaLoadTest.simulation.generic.GenericJourney -DjourneyPath=<path_to_json_file>

It is possible to override the journey config by setting custom values via environment variables: KIBANA_HOST, ES_URL, AUTH_PROVIDER_TYPE, AUTH_PROVIDER_NAME, AUTH_LOGIN, AUTH_PASSWORD.
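For example (illustrative values; presumably only the variables you export are overridden):

export KIBANA_HOST=http://localhost:5601
export AUTH_LOGIN=elastic
export AUTH_PASSWORD=changeme
mvn gatling:test -Dgatling.simulationClass=org.kibanaLoadTest.simulation.generic.GenericJourney -DjourneyPath=<path_to_json_file>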

It is possible to skip unloading kbn & es archives on journey teardown (e.g. if you want to inspect Kibana afterwards):

-DskipCleanupOnTeardown=true

Test results

Gatling generates an html report for each simulation run, available under <project_root>/target/gatling/<simulation>

Open index.html in browser to preview the report.

Open testRun.txt to find out more about the Kibana instance you tested.

Running performance testing from VM

Follow the guide to set up a VM and run tests on it.

Delete your deployments on Elastic cloud

Run the following command to delete all existing deployments:

export API_KEY=<your_key>
mvn exec:java -Dexec.mainClass=org.kibanaLoadTest.deploy.DeleteAll -Dexec.classpathScope=test -Dscope=all

If you don't provide -Dscope=all, only the deployments with the load-testing name prefix will be deleted.


kibana-load-testing's Issues

Split simulation into warm up and main run simulations

I had a discussion with Daniel around benchmarking scenarios and results consistency. There is a high chance of extra noise because we include warm-up data in the final results. Gatling has no dedicated feature for such a split, but we might use one of the options below (see the sketch after the list):

  • split into 2 scenarios, run sequentially
  • mark warm up data as "silent"
  • split into 2 simulation files, e.g. having warmup simulation run before any other simulation.
  • any other option
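A minimal sketch combining the first two options (not existing project code; assumes Gatling 3.4+, where PopulationBuilder.andThen and request-level silent are available):

class WarmupThenMainSimulation extends BaseSimulation {
  // warm-up requests marked "silent" are executed but excluded from the report
  val warmup = scenario("warm up")
    .exec(http("warmup: status").get("/api/status").silent)

  val main = scenario("main run")
    .exec(Discover.doQuery(appConfig.baseUrl, defaultHeaders))

  setUp(
    warmup
      .inject(constantConcurrentUsers(20) during (1 minute))
      // start the main population only after the warm-up population has finished
      .andThen(main.inject(rampConcurrentUsers(20) to (250) during (4 minutes)))
  ).protocols(httpProtocol)
}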

Capture Kibana API calls to update simulations

In order to keep simulations up-to-date we need a mechanism to record/collect API calls for our user-flow simulations, so that we can spot changes and apply them in the simulation classes.

So far I can see several options:

  • Gatling recorder
  • node.js script with puppeteer navigating to pages and collecting API calls

We can store the baseline as json and compare it with the run output.

Run crashed - response.log file is not found in /target

Hi team,

I am receiving the following error when using this tool. Any thoughts?

21:04:26.649 [ERROR] i.g.a.Gatling$ - Run crashed
java.lang.RuntimeException: response.log file is not found in /target
        at org.kibanaLoadTest.helpers.Helper$.moveResponseLogToResultsDir(Helper.scala:229)
        at org.kibanaLoadTest.simulation.BaseSimulation.$anonfun$new$2(BaseSimulation.scala:109)
        at io.gatling.core.scenario.Simulation.$anonfun$executeAfter$1(Simulation.scala:155)
        at io.gatling.core.scenario.Simulation.$anonfun$executeAfter$1$adapted(Simulation.scala:155)
        at scala.collection.immutable.List.foreach(List.scala:333)
        at io.gatling.core.scenario.Simulation.executeAfter(Simulation.scala:155)
        at io.gatling.app.Runner.run0(Runner.scala:87)
        at io.gatling.app.Runner.run(Runner.scala:52)
        at io.gatling.app.Gatling$.start(Gatling.scala:80)
        at io.gatling.app.Gatling$.fromArgs(Gatling.scala:45)
        at io.gatling.app.Gatling$.main(Gatling.scala:37)
        at io.gatling.app.Gatling.main(Gatling.scala)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:64)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:564)
        at io.gatling.mojo.MainWithArgsInFile.runMain(MainWithArgsInFile.java:53)
        at io.gatling.mojo.MainWithArgsInFile.main(MainWithArgsInFile.java:34)
java.lang.reflect.InvocationTargetException
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:64)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:564)
        at io.gatling.mojo.MainWithArgsInFile.runMain(MainWithArgsInFile.java:53)
        at io.gatling.mojo.MainWithArgsInFile.main(MainWithArgsInFile.java:34)
Caused by: java.lang.RuntimeException: response.log file is not found in /target
        at org.kibanaLoadTest.helpers.Helper$.moveResponseLogToResultsDir(Helper.scala:229)
        at org.kibanaLoadTest.simulation.BaseSimulation.$anonfun$new$2(BaseSimulation.scala:109)
        at io.gatling.core.scenario.Simulation.$anonfun$executeAfter$1(Simulation.scala:155)
        at io.gatling.core.scenario.Simulation.$anonfun$executeAfter$1$adapted(Simulation.scala:155)
        at scala.collection.immutable.List.foreach(List.scala:333)
        at io.gatling.core.scenario.Simulation.executeAfter(Simulation.scala:155)
        at io.gatling.app.Runner.run0(Runner.scala:87)
        at io.gatling.app.Runner.run(Runner.scala:52)
        at io.gatling.app.Gatling$.start(Gatling.scala:80)
        at io.gatling.app.Gatling$.fromArgs(Gatling.scala:45)
        at io.gatling.app.Gatling$.main(Gatling.scala:37)
        at io.gatling.app.Gatling.main(Gatling.scala)
        ... 6 more
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  07:42 min
[INFO] Finished at: 2021-08-16T21:04:27+05:30
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal io.gatling:gatling-maven-plugin:3.1.2:test (default-cli) on project kibana-load-test: Gatling failed.: Process exited with an error: -1 (Exit value: -1) -> [Help 1]
org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal io.gatling:gatling-maven-plugin:3.1.2:test (default-cli) on project kibana-load-test: Gatling failed.
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:215)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:156)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:148)
    at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:117)
    at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:81)
    at org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build (SingleThreadedBuilder.java:56)
    at org.apache.maven.lifecycle.internal.LifecycleStarter.execute (LifecycleStarter.java:128)
    at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:305)
    at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:192)
    at org.apache.maven.DefaultMaven.execute (DefaultMaven.java:105)
    at org.apache.maven.cli.MavenCli.execute (MavenCli.java:972)
    at org.apache.maven.cli.MavenCli.doMain (MavenCli.java:293)
    at org.apache.maven.cli.MavenCli.main (MavenCli.java:196)
    at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
    at jdk.internal.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:64)
    at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke (Method.java:564)
    at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced (Launcher.java:282)
    at org.codehaus.plexus.classworlds.launcher.Launcher.launch (Launcher.java:225)
    at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode (Launcher.java:406)
    at org.codehaus.plexus.classworlds.launcher.Launcher.main (Launcher.java:347)
Caused by: org.apache.maven.plugin.MojoExecutionException: Gatling failed.
    at io.gatling.mojo.GatlingMojo.execute (GatlingMojo.java:207)
    at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo (DefaultBuildPluginManager.java:137)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:210)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:156)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:148)
    at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:117)
    at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:81)
    at org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build (SingleThreadedBuilder.java:56)
    at org.apache.maven.lifecycle.internal.LifecycleStarter.execute (LifecycleStarter.java:128)
    at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:305)
    at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:192)
    at org.apache.maven.DefaultMaven.execute (DefaultMaven.java:105)
    at org.apache.maven.cli.MavenCli.execute (MavenCli.java:972)
    at org.apache.maven.cli.MavenCli.doMain (MavenCli.java:293)
    at org.apache.maven.cli.MavenCli.main (MavenCli.java:196)
    at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
    at jdk.internal.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:64)
    at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke (Method.java:564)
    at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced (Launcher.java:282)
    at org.codehaus.plexus.classworlds.launcher.Launcher.launch (Launcher.java:225)
    at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode (Launcher.java:406)
    at org.codehaus.plexus.classworlds.launcher.Launcher.main (Launcher.java:347)
Caused by: org.apache.commons.exec.ExecuteException: Process exited with an error: -1 (Exit value: -1)
    at org.apache.commons.exec.DefaultExecutor.executeInternal (DefaultExecutor.java:404)
    at org.apache.commons.exec.DefaultExecutor.execute (DefaultExecutor.java:166)
    at org.apache.commons.exec.DefaultExecutor.execute (DefaultExecutor.java:153)
    at io.gatling.mojo.Fork.run (Fork.java:167)
    at io.gatling.mojo.GatlingMojo.executeGatling (GatlingMojo.java:302)
    at io.gatling.mojo.GatlingMojo.iterateBySimulations (GatlingMojo.java:237)
    at io.gatling.mojo.GatlingMojo.execute (GatlingMojo.java:195)
    at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo (DefaultBuildPluginManager.java:137)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:210)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:156)
    at org.apache.maven.lifecycle.internal.MojoExecutor.execute (MojoExecutor.java:148)
    at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:117)
    at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject (LifecycleModuleBuilder.java:81)
    at org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build (SingleThreadedBuilder.java:56)
    at org.apache.maven.lifecycle.internal.LifecycleStarter.execute (LifecycleStarter.java:128)
    at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:305)
    at org.apache.maven.DefaultMaven.doExecute (DefaultMaven.java:192)
    at org.apache.maven.DefaultMaven.execute (DefaultMaven.java:105)
    at org.apache.maven.cli.MavenCli.execute (MavenCli.java:972)
    at org.apache.maven.cli.MavenCli.doMain (MavenCli.java:293)
    at org.apache.maven.cli.MavenCli.main (MavenCli.java:196)
    at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0 (Native Method)
    at jdk.internal.reflect.NativeMethodAccessorImpl.invoke (NativeMethodAccessorImpl.java:64)
    at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke (DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke (Method.java:564)
    at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced (Launcher.java:282)
    at org.codehaus.plexus.classworlds.launcher.Launcher.launch (Launcher.java:225)
    at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode (Launcher.java:406)
    at org.codehaus.plexus.classworlds.launcher.Launcher.main (Launcher.java:347)

[load data] add support for create data stream api

Some journeys, e.g. cloud_security_dashboard, require the index to be created using the create data stream API.

ESArchiver.scala should be extended with a create data stream API call for indexes matching the logs-* pattern (a rough sketch follows the error below):

Caused by: java.lang.RuntimeException: co.elastic.clients.elasticsearch._types.ElasticsearchException: [es/indices.create]
failed: [illegal_argument_exception] cannot create index with name [logs-cloud_security_posture.findings_latest-default],
because it matches with template [logs] that creates data streams only, use create data stream api instead
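A rough sketch of a helper that could be added to ESArchiver.scala (hypothetical method, not existing code; assumes the elasticsearch-java client that the exception above comes from):

import co.elastic.clients.elasticsearch.ElasticsearchClient
import co.elastic.clients.elasticsearch.indices.CreateDataStreamRequest

def createDataStreamIfNeeded(client: ElasticsearchClient, indexName: String): Unit = {
  // indexes matching the logs-* pattern are backed by a data stream template,
  // so they must be created via the data stream API instead of the create index API
  if (indexName.startsWith("logs-")) {
    client.indices().createDataStream(
      new CreateDataStreamRequest.Builder().name(indexName).build()
    )
  }
}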

Collect performance metrics during scalability testing

We want to collect metrics to diagnose the system behavior under heavy load and perform Root Cause Analysis for identified problems. You can find a proposal for metric collection in this document.
In the current task, we need to:

  1. Develop a strategy for metric ingestion. Some metrics can be provided by the host server (CPU utilization, OS memory consumption, etc.); the Kibana server will provide others. kibana-load-testing will collect these metrics at a regular interval.
  2. Set up an ingestion pipeline to store metrics for further analysis alongside the load testing results.

gather Elasticsearch and Kibana monitoring data during Cloud load test runs

Similar to #39
It would be really useful to get the monitoring metrics during load test runs. This would show things like the Elasticsearch search queue and rejected searches. It would also show Kibana memory and response times.

It would be a little challenging to look at the Monitoring page alongside our dashboard, so we might re-build some of the monitoring visualizations from the monitoring indices so we can add them to one dashboard.

The problem is, we want to run load tests on the latest master snapshot builds and those are only deployed to Cloud staging. But our kibana-stats.elastic.dev is on Cloud production. We can't configure our load test deployment to send monitoring data from staging to a deployment id in production.
We would need to configure our load test deployment to use the kibana-stats URL, username, password and we don't think we have a way to do that right now.
https://www.elastic.co/guide/en/cloud/current/Deployment_-_CRUD.html#ec_request_example_10

We thought about configuring the load test deployment to keep the monitoring data local and at the end of the job, save a snapshot to cloud storage, and immediately restore that snapshot into kibana-stats.

We also thought about running kibana-stats in Cloud staging so that we could use the deployment id.

/cc @marius-dr @dmlemeshko

[Meta] Track performance metrics for cloud snapshots and master branch

As we agreed, we are going to have different load scenarios for testing cloud snapshots and locally built Kibana: Kibana + ES hosted on the worker can take a higher load, so comparing cloud vs worker results is not meaningful.

We plan to proceed this way:

  • monitor cloud by running daily "cloud scenario" for 8.0 (CI trigger) & 7.x (currently triggered manually)
  • daily monitor master by building Kibana and using the latest ES snapshot, running the "local scenario" for it.
  • build charts in Kibana stats to:
    • track cloud snapshots performance
    • track master build performance
    • compare dev branch vs master

Tasks:

  • define cloud scenario
  • define local scenario
  • set 7.x trigger on CI
  • set master branch trigger on CI
  • configure dashboards on build stats

Cloud testing: failed to create new deployment

Looks like the payload for /api/v1/deployments has changed and we are getting an error:

12:40:00.539 [INFO ] httpClient - createDeployment: Creating new deployment
12:40:01.053 [INFO ] httpClient - createDeployment: Request completed with `Bad Request 400`
12:40:01.056 [ERROR] i.g.a.Gatling$ - Run crashed
java.lang.RuntimeException: Expected field 'id' in '{"error":"BadRequest","msg":"Invalid content type","ok":false}'
	at spray.json.lenses.package$.unexpected(package.scala:34)
	at spray.json.lenses.package$ValidateOption.getOrError(package.scala:68)
	at spray.json.lenses.ScalarLenses$$anon$1.$anonfun$retr$2(ScalarLenses.scala:31)
	at scala.util.Either.flatMap(Either.scala:352)
	at spray.json.lenses.ScalarLenses$$anon$1.$anonfun$retr$1(ScalarLenses.scala:31)
	at spray.json.lenses.LensImpl.tryGet(Lens.scala:46)
	... 23 common frames omitted

As an alternative, it is possible to test an existing (manually created) cloud deployment until the fix is made.

benchmark api/metrics/vis/data vs internal/bsearch

According to @lizozom the key difference between the two is that they use async search differently: we poll for bsearch from the client while the timeseries endpoint is doing server side polling. The difference is mostly due to legacy implementation, and not necessarily technical.

The idea is to benchmark each of these end-points as part of a visualisation loading simulation with the same load setup. We already have TSVB simulations, so we only need to add one that loads viz data via /internal/bsearch.

Liza also suggested using shard delay for this benchmarking.

GatlingTestRunner: add cli to run with custom parameters

Currently we run load testing for a Kibana build via FTR and a custom config. It solves the basic needs, but does not allow passing custom arguments to the maven command.

We need a cli for load testing that allows specifying the scenario, the path to the load testing project, etc., for example:

node scripts/load_testing_runner \
  --relativePath "../kibana-load-testing" \
  --scenario org.kibanaLoadTest.simulation.DemoJourney

[Bug] HttpHelper - Exception occurred when running Kibana with a base path

Summary

A fresh pull and install causes a build failure locally when running against a local instance of Kibana (master).

Details

Setup:

  • ES and APM servers running through Docker as per the apm-integration-testing repo with ./scripts/compose.py start master --no-kibana
  • Kibana running locally with auth (and modified local.conf to use the auth that the apm-integration-testing service provides)
  • Verified that ES and the APM server and Kibana were connected (manually opened Kibana, added the Logs sample data set and navigated around a bit)
  • local clone of this repo, run with:
export env=config/local.conf
mvn gatling:test -Dgatling.simulationClass=org.kibanaLoadTest.simulation.DemoJourney

Result:

[ERROR] i.g.a.Gatling$ - Run crashed
java.lang.RuntimeException: Adding sample data failed:

Expected result:

  • the sample dataset should be added successfully.

The issue appears to be with adding the sample data:

09:57:11.179 [INFO ] Base Simulation - Loading sample data
09:57:11.297 [ERROR] HttpHelper - Exception occurred during loading sample data
09:57:11.299 [ERROR] i.g.a.Gatling$ - Run crashed
java.lang.RuntimeException: Adding sample data failed:
	at org.kibanaLoadTest.helpers.HttpHelper.addSampleData(HttpHelper.scala:94)
Full stack trace:
09:57:10.866 [INFO ] Base Simulation - Running DemoJourney simulation
09:57:11.178 [INFO ] KibanaConfiguration - Base URL = http://localhost:5601
09:57:11.179 [INFO ] KibanaConfiguration - Kibana version = 8.0.0
09:57:11.179 [INFO ] KibanaConfiguration - Security Enabled = true
09:57:11.179 [INFO ] KibanaConfiguration - Auth payload = {"providerType":"basic","providerName":"basic","currentURL":"http://localhost:5601/login","params":{"username":"admin","password":"changeme"}}
09:57:11.179 [INFO ] Base Simulation - Loading sample data
09:57:11.297 [ERROR] HttpHelper - Exception occurred during loading sample data
09:57:11.299 [ERROR] i.g.a.Gatling$ - Run crashed
java.lang.RuntimeException: Adding sample data failed:
	at org.kibanaLoadTest.helpers.HttpHelper.addSampleData(HttpHelper.scala:94)
	at org.kibanaLoadTest.simulation.BaseSimulation.$anonfun$new$1(BaseSimulation.scala:56)
	at io.gatling.core.scenario.Simulation.$anonfun$executeBefore$1(Simulation.scala:154)
	at io.gatling.core.scenario.Simulation.$anonfun$executeBefore$1$adapted(Simulation.scala:154)
	at scala.collection.immutable.List.foreach(List.scala:431)
	at io.gatling.core.scenario.Simulation.executeBefore(Simulation.scala:154)
	at io.gatling.app.Runner.run0(Runner.scala:70)
	at io.gatling.app.Runner.run(Runner.scala:52)
	at io.gatling.app.Gatling$.start(Gatling.scala:80)
	at io.gatling.app.Gatling$.fromArgs(Gatling.scala:45)
	at io.gatling.app.Gatling$.main(Gatling.scala:37)
	at io.gatling.app.Gatling.main(Gatling.scala)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:567)
	at io.gatling.mojo.MainWithArgsInFile.runMain(MainWithArgsInFile.java:50)
	at io.gatling.mojo.MainWithArgsInFile.main(MainWithArgsInFile.java:33)
java.lang.reflect.InvocationTargetException
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:567)
	at io.gatling.mojo.MainWithArgsInFile.runMain(MainWithArgsInFile.java:50)
	at io.gatling.mojo.MainWithArgsInFile.main(MainWithArgsInFile.java:33)
Caused by: java.lang.RuntimeException: Adding sample data failed:
	at org.kibanaLoadTest.helpers.HttpHelper.addSampleData(HttpHelper.scala:94)
	at org.kibanaLoadTest.simulation.BaseSimulation.$anonfun$new$1(BaseSimulation.scala:56)
	at io.gatling.core.scenario.Simulation.$anonfun$executeBefore$1(Simulation.scala:154)
	at io.gatling.core.scenario.Simulation.$anonfun$executeBefore$1$adapted(Simulation.scala:154)
	at scala.collection.immutable.List.foreach(List.scala:431)
	at io.gatling.core.scenario.Simulation.executeBefore(Simulation.scala:154)
	at io.gatling.app.Runner.run0(Runner.scala:70)
	at io.gatling.app.Runner.run(Runner.scala:52)
	at io.gatling.app.Gatling$.start(Gatling.scala:80)
	at io.gatling.app.Gatling$.fromArgs(Gatling.scala:45)
	at io.gatling.app.Gatling$.main(Gatling.scala:37)
	at io.gatling.app.Gatling.main(Gatling.scala)
	... 6 more
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  10.925 s
[INFO] Finished at: 2020-12-05T09:57:11-07:00
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal io.gatling:gatling-maven-plugin:3.1.0:test (default-cli) on project kibana-load-test: Gatling failed.: Process exited with an error: 255 (Exit value: 255) -> [Help 1]
[ERROR]

What does work

  • When I create a new cloud deployment as part of the test run, it works fine.
  • Running the tests against a 7.10.0-SNAPSHOT through Docker locally works without any issues, with apm-integration-testing run with ./scripts/compose.py start 7.10.0 --with-kibana and config/local.conf as:
app {
  host = "http://localhost:5601"
  version = "7.10.0-SNAPSHOT"
}

security {
  on = true
}

auth {
  providerType = "basic"
  providerName = "basic"
  username = "admin"
  password = "changeme"
}

[meta] Gatling limitations & improvements

This issue collects current problems, limitations and possible improvements related to the Gatling tool.

We need to do thorough research in this area to have more ideas around tool & environment tuning so that we minimize the noise introduced by Gatling itself.

  • Investigate socket reuse in Gatling (ephemeral port depletion)
  • Review official OS tuning guide
  • Research an Elasticsearch connector to ingest test results; Gatling has a graphite sender (docs)

Create script to generate Gatling simulation

The Gatling recorder allows generating a simulation out of a HAR file, but it doesn't look like a good solution for us for a few reasons:

  • The generated file has a predefined load setup that can't be changed out of the box
  • We need to replace not only the /bsearch part but also the /login part to get the Cookie header for authentication.
  • The cli in headless mode is quite limited

Overall, it feels like it would be easier to write our own simulation generator in nodejs, avoiding HAR files in favour of whatever format the single user benchmark tool gives back.

Implement custom DataWriter to support journeys with loading static resources

In order to run scalability tests where static resources are loaded, we can't use the default DataWriter.

The default one is not configurable and stores the response body for each request, which causes an out of memory exception.

At the moment we exclude static resources from APM traces and can use the existing DataWriter. But if we want to include them, we need to consider our own implementation. There are already existing ones:

"config/local.conf" settings aren't being used?

Hi, I am getting an error that I don't know how to debug when attempting to run the DemoJourney simulation.

Here is my "local.conf":

% export env=config/local.conf

% cat src/test/resources/config/local.conf
app {
  host = "http://localhost:5601/cry"
  version = "7.12.0"
}

security {
  on = true
}

auth {
  providerType = "basic"
  providerName = "basic"
  username = "admin"
  password = "changeme"
}

And I have a local Kibana and ES 7.12.0 running:

% curl -i http://localhost:5601/cry
HTTP/1.1 302 Found
location: /cry/login?next=%2Fcry%2F
kbn-name: pink.local
kbn-license-sig: 29bcf563e040fd115ec3733e9e9d9fe170e849a278f1ab8f1e38d35adf62fbd2
cache-control: private, no-cache, no-store, must-revalidate
content-length: 0
date: Thu, 01 Apr 2021 23:45:39 GMT
content-type: application/octet-stream
Connection: keep-alive
Keep-Alive: timeout=120

When I run the gatling:test I get (full output is at the bottom):

16:47:22.781 [INFO ] KibanaConfiguration - Getting Kibana status info
16:47:23.889 [ERROR] i.g.a.Gatling$ - Run crashed
java.net.ConnectException: Connection refused
	at java.base/sun.nio.ch.Net.connect0(Native Method)
	at java.base/sun.nio.ch.Net.connect(Net.java:574)
	at java.base/sun.nio.ch.Net.connect(Net.java:563)
	at java.base/sun.nio.ch.NioSocketImpl.connect(NioSocketImpl.java:588)
	at java.base/java.net.SocksSocketImpl.connect(SocksSocketImpl.java:333)
	at java.base/java.net.Socket.connect(Socket.java:648)
	at org.apache.http.conn.socket.PlainConnectionSocketFactory.connectSocket(PlainConnectionSocketFactory.java:75)
	at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:142)
	... 31 common frames omitted
Wrapped by: org.apache.http.conn.HttpHostConnectException: Connect to localhost:5620 [localhost/127.0.0.1, localhost/0:0:0:0:0:0:0:1] failed: Connection refused
...

which suggests my "local.conf" is not being read to use Kibana at port 5601 rather than port 5620.

Can someone help me figure out what is going on? I was able to successfully run these back in December. Thanks.

Full output of the mvn run

[16:45:39 trentm@pink:~/el/kibana-load-testing (git:master)]
% mvn gatling:test -Dgatling.simulationClass=org.kibanaLoadTest.simulation.DemoJourney
[INFO] Scanning for projects...
[INFO]
[INFO] --------------< org.elastic.kibana-test:kibana-load-test >--------------
[INFO] Building kibana-load-test 1.0-SNAPSHOT
[INFO] --------------------------------[ jar ]---------------------------------
[INFO]
[INFO] --- gatling-maven-plugin:3.1.2:test (default-cli) @ kibana-load-test ---
16:47:19,797 |-INFO in ch.qos.logback.classic.LoggerContext[default] - Could NOT find resource [logback-test.xml]
16:47:19,798 |-INFO in ch.qos.logback.classic.LoggerContext[default] - Could NOT find resource [logback.groovy]
16:47:19,798 |-INFO in ch.qos.logback.classic.LoggerContext[default] - Found resource [logback.xml] at [jar:file:/Users/trentm/.m2/repository/io/gatling/gatling-maven-plugin/3.1.2/gatling-maven-plugin-3.1.2.jar!/logback.xml]
16:47:19,799 |-WARN in ch.qos.logback.classic.LoggerContext[default] - Resource [logback.xml] occurs multiple times on the classpath.
16:47:19,799 |-WARN in ch.qos.logback.classic.LoggerContext[default] - Resource [logback.xml] occurs at [jar:file:/Users/trentm/.m2/repository/io/gatling/gatling-maven-plugin/3.1.2/gatling-maven-plugin-3.1.2.jar!/logback.xml]
16:47:19,799 |-WARN in ch.qos.logback.classic.LoggerContext[default] - Resource [logback.xml] occurs at [file:/Users/trentm/el/kibana-load-testing/target/test-classes/logback.xml]
16:47:19,805 |-INFO in ch.qos.logback.core.joran.spi.ConfigurationWatchList@5d740a0f - URL [jar:file:/Users/trentm/.m2/repository/io/gatling/gatling-maven-plugin/3.1.2/gatling-maven-plugin-3.1.2.jar!/logback.xml] is not of type file
16:47:19,848 |-INFO in ch.qos.logback.classic.joran.action.ConfigurationAction - debug attribute not set
16:47:19,849 |-INFO in ch.qos.logback.core.joran.action.AppenderAction - About to instantiate appender of type [ch.qos.logback.core.ConsoleAppender]
16:47:19,853 |-INFO in ch.qos.logback.core.joran.action.AppenderAction - Naming appender as [CONSOLE]
16:47:19,861 |-INFO in ch.qos.logback.core.joran.action.NestedComplexPropertyIA - Assuming default type [ch.qos.logback.classic.encoder.PatternLayoutEncoder] for [encoder] property
16:47:19,901 |-INFO in ch.qos.logback.classic.joran.action.LevelAction - ROOT level set to WARN
16:47:19,901 |-INFO in ch.qos.logback.core.joran.action.AppenderRefAction - Attaching appender named [CONSOLE] to Logger[ROOT]
16:47:19,901 |-INFO in ch.qos.logback.classic.joran.action.ConfigurationAction - End of configuration.
16:47:19,902 |-INFO in ch.qos.logback.classic.joran.JoranConfigurator@214b199c - Registering current configuration as safe fallback point

Java HotSpot(TM) 64-Bit Server VM warning: Option UseBiasedLocking was deprecated in version 15.0 and will likely be removed in a future release.
16:47:21,714 |-INFO in ch.qos.logback.classic.LoggerContext[default] - Could NOT find resource [logback-test.xml]
16:47:21,715 |-INFO in ch.qos.logback.classic.LoggerContext[default] - Could NOT find resource [logback.groovy]
16:47:21,716 |-INFO in ch.qos.logback.classic.LoggerContext[default] - Found resource [logback.xml] at [file:/Users/trentm/el/kibana-load-testing/target/test-classes/logback.xml]
16:47:21,716 |-WARN in ch.qos.logback.classic.LoggerContext[default] - Resource [logback.xml] occurs multiple times on the classpath.
16:47:21,716 |-WARN in ch.qos.logback.classic.LoggerContext[default] - Resource [logback.xml] occurs at [jar:file:/Users/trentm/.m2/repository/io/gatling/gatling-maven-plugin/3.1.2/gatling-maven-plugin-3.1.2.jar!/logback.xml]
16:47:21,716 |-WARN in ch.qos.logback.classic.LoggerContext[default] - Resource [logback.xml] occurs at [file:/Users/trentm/el/kibana-load-testing/target/test-classes/logback.xml]
16:47:21,775 |-INFO in ch.qos.logback.classic.joran.action.ConfigurationAction - debug attribute not set
16:47:21,776 |-INFO in ch.qos.logback.core.joran.action.AppenderAction - About to instantiate appender of type [ch.qos.logback.core.ConsoleAppender]
16:47:21,781 |-INFO in ch.qos.logback.core.joran.action.AppenderAction - Naming appender as [CONSOLE]
16:47:21,788 |-INFO in ch.qos.logback.core.joran.action.NestedComplexPropertyIA - Assuming default type [ch.qos.logback.classic.encoder.PatternLayoutEncoder] for [encoder] property
16:47:21,835 |-INFO in ch.qos.logback.core.joran.action.TimestampAction - Using current interpretation time, i.e. now, as time reference.
16:47:21,836 |-INFO in ch.qos.logback.core.joran.action.TimestampAction - Adding property to the context with key="date" and value="20210401164721" to the LOCAL scope
16:47:21,836 |-INFO in ch.qos.logback.core.joran.action.AppenderAction - About to instantiate appender of type [ch.qos.logback.core.FileAppender]
16:47:21,838 |-INFO in ch.qos.logback.core.joran.action.AppenderAction - Naming appender as [FILE]
16:47:21,839 |-INFO in ch.qos.logback.core.joran.action.NestedComplexPropertyIA - Assuming default type [ch.qos.logback.classic.encoder.PatternLayoutEncoder] for [encoder] property
16:47:21,840 |-INFO in ch.qos.logback.core.FileAppender[FILE] - File property is set to [target/simulation-20210401164721.log]
16:47:21,841 |-INFO in ch.qos.logback.classic.joran.action.RootLoggerAction - Setting level of ROOT logger to INFO
16:47:21,841 |-INFO in ch.qos.logback.core.joran.action.AppenderRefAction - Attaching appender named [CONSOLE] to Logger[ROOT]
16:47:21,841 |-INFO in ch.qos.logback.classic.joran.action.ConfigurationAction - End of configuration.
16:47:21,841 |-INFO in ch.qos.logback.classic.joran.JoranConfigurator@376b4233 - Registering current configuration as safe fallback point

16:47:21.971 [INFO ] i.g.c.c.GatlingConfiguration$ - Gatling will try to use 'gatling.conf' as custom config file.
16:47:22.330 [INFO ] a.e.s.Slf4jLogger - Slf4jLogger started
16:47:22.781 [INFO ] KibanaConfiguration - Getting Kibana status info
16:47:23.889 [ERROR] i.g.a.Gatling$ - Run crashed
java.net.ConnectException: Connection refused
	at java.base/sun.nio.ch.Net.connect0(Native Method)
	at java.base/sun.nio.ch.Net.connect(Net.java:574)
	at java.base/sun.nio.ch.Net.connect(Net.java:563)
	at java.base/sun.nio.ch.NioSocketImpl.connect(NioSocketImpl.java:588)
	at java.base/java.net.SocksSocketImpl.connect(SocksSocketImpl.java:333)
	at java.base/java.net.Socket.connect(Socket.java:648)
	at org.apache.http.conn.socket.PlainConnectionSocketFactory.connectSocket(PlainConnectionSocketFactory.java:75)
	at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:142)
	... 31 common frames omitted
Wrapped by: org.apache.http.conn.HttpHostConnectException: Connect to localhost:5620 [localhost/127.0.0.1, localhost/0:0:0:0:0:0:0:1] failed: Connection refused
	at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:156)
	at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:376)
	at org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:393)
	at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:236)
	at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186)
	at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
	at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
	at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
	at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
	at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:108)
	at org.kibanaLoadTest.helpers.HttpHelper.loginIfNeeded(HttpHelper.scala:29)
	at org.kibanaLoadTest.helpers.HttpHelper.getStatus(HttpHelper.scala:105)
	at org.kibanaLoadTest.KibanaConfiguration.syncWithInstance(KibanaConfiguration.scala:90)
	at org.kibanaLoadTest.simulation.BaseSimulation.<init>(BaseSimulation.scala:72)
	at org.kibanaLoadTest.simulation.DemoJourney.<init>(DemoJourney.scala:7)
	... 17 common frames omitted
Wrapped by: java.lang.reflect.InvocationTargetException: null
	at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:64)
	at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:500)
	at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:481)
	at io.gatling.app.Runner.run0(Runner.scala:65)
	at io.gatling.app.Runner.run(Runner.scala:52)
	at io.gatling.app.Gatling$.start(Gatling.scala:80)
	at io.gatling.app.Gatling$.fromArgs(Gatling.scala:45)
	at io.gatling.app.Gatling$.main(Gatling.scala:37)
	at io.gatling.app.Gatling.main(Gatling.scala)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:64)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:564)
	at io.gatling.mojo.MainWithArgsInFile.runMain(MainWithArgsInFile.java:53)
	at io.gatling.mojo.MainWithArgsInFile.main(MainWithArgsInFile.java:34)
java.lang.reflect.InvocationTargetException
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:64)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:564)
	at io.gatling.mojo.MainWithArgsInFile.runMain(MainWithArgsInFile.java:53)
	at io.gatling.mojo.MainWithArgsInFile.main(MainWithArgsInFile.java:34)
Caused by: java.lang.reflect.InvocationTargetException
	at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:64)
	at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:500)
	at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:481)
	at io.gatling.app.Runner.run0(Runner.scala:65)
	at io.gatling.app.Runner.run(Runner.scala:52)
	at io.gatling.app.Gatling$.start(Gatling.scala:80)
	at io.gatling.app.Gatling$.fromArgs(Gatling.scala:45)
	at io.gatling.app.Gatling$.main(Gatling.scala:37)
	at io.gatling.app.Gatling.main(Gatling.scala)
	... 6 more
Caused by: org.apache.http.conn.HttpHostConnectException: Connect to localhost:5620 [localhost/127.0.0.1, localhost/0:0:0:0:0:0:0:1] failed: Connection refused
	at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:156)
	at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:376)
	at org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:393)
	at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:236)
	at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186)
	at org.apache.http.impl.execchain.RetryExec.execute(RetryExec.java:89)
	at org.apache.http.impl.execchain.RedirectExec.execute(RedirectExec.java:110)
	at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
	at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
	at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:108)
	at org.kibanaLoadTest.helpers.HttpHelper.loginIfNeeded(HttpHelper.scala:29)
	at org.kibanaLoadTest.helpers.HttpHelper.getStatus(HttpHelper.scala:105)
	at org.kibanaLoadTest.KibanaConfiguration.syncWithInstance(KibanaConfiguration.scala:90)
	at org.kibanaLoadTest.simulation.BaseSimulation.<init>(BaseSimulation.scala:72)
	at org.kibanaLoadTest.simulation.DemoJourney.<init>(DemoJourney.scala:7)
	... 17 more
Caused by: java.net.ConnectException: Connection refused
	at java.base/sun.nio.ch.Net.connect0(Native Method)
	at java.base/sun.nio.ch.Net.connect(Net.java:574)
	at java.base/sun.nio.ch.Net.connect(Net.java:563)
	at java.base/sun.nio.ch.NioSocketImpl.connect(NioSocketImpl.java:588)
	at java.base/java.net.SocksSocketImpl.connect(SocksSocketImpl.java:333)
	at java.base/java.net.Socket.connect(Socket.java:648)
	at org.apache.http.conn.socket.PlainConnectionSocketFactory.connectSocket(PlainConnectionSocketFactory.java:75)
	at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:142)
	... 31 more
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  5.313 s
[INFO] Finished at: 2021-04-01T16:47:23-07:00
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal io.gatling:gatling-maven-plugin:3.1.2:test (default-cli) on project kibana-load-test: Gatling failed.: Process exited with an error: 255 (Exit value: 255) -> [Help 1]
[ERROR]
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR]
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException

[Meta] Consume APM data to generate scalability simulations

In order to keep consistency between single user and scalability benchmarking, we agreed that the scalability benchmarking tool will consume the API call sequence from APM data collected during single user benchmarks.

GenericJourney: Use json feeder to auth requests

At the moment, every scenario includes calling the internal/security/login end-point. This has a few downsides:

  • we depend on security service scalability
  • we can't test individual end-points if they require authorisation
  • we can't test an individual user flow without doing a login

The solution is to pre-generate valid cookies and use a Gatling feeder to pass them to user sessions and skip the login.
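A minimal sketch of that approach (hypothetical cookies.json file holding pre-generated session cookies; not existing project code):

// each record in cookies.json is expected to look like {"cookie": "sid=Fe26.2**..."}
val cookieFeeder = jsonFile("data/cookies.json").circular

val scn = scenario("pre-authenticated journey")
  .feed(cookieFeeder)
  .exec(
    http("get tags")
      .get("/api/saved_objects_tagging/tags")
      .header("Cookie", "${cookie}")
  )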

[Meta] Scenario to cover with custom data/saved objects

The purpose of this issue is to list the most interesting scenarios we would like to collect load metrics for.
With recent changes in the current project we can ingest data in esArchives format and import saved objects, so we should be pretty flexible with scenarios.

  • dashboard with multiple map visualisations
  • dashboard with multiple tsvb visualisations
  • dashboard with multiple lens visualisations
  • complex canvas workpad
  • kibana with multiple spaces

gather Elasticsearch and Kibana monitoring data during local build load test runs

Similar to #38, we would like to gather Monitoring metrics during load test runs. This would show things like the Elasticsearch search queue and rejected searches. It would also show Kibana memory and response times.

In the current local load test job we build the current Kibana and get the most recently promoted Elasticsearch build.
To gather monitoring we would need to also get the most recent metricbeat build, configure it, and start it.

Then it should be a relatively simple job of configuring that metricbeat to send Elasticsearch and Kibana monitoring data to kibana-stats.elastic.dev.

APM data too...(separate issue)

/cc @dmlemeshko @marius-dr

Compare Kibana server performance with/without in-built APM

Test environment:

Bare metal machine EX62
OS Linux, Node v16.13.2

Test scenarios:

DiscoverJourney

  • 1 min warm up with 20 conc. users
  • 3 min ramping up from 20 to 700 conc. users

CanvasJourney

  • 1 min warm up with 20 conc. users
  • 3 min ramping up from 20 to 100 conc. users

LensJourney

  • 1 min warm up with 20 conc. users
  • 3 min ramping up from 20 to 700 conc. users

APM configuration:

ELASTIC_APM_ACTIVE: true,
ELASTIC_APM_CONTEXT_PROPAGATION_ONLY: 'false',
ELASTIC_APM_ENVIRONMENT: 'ci',
ELASTIC_APM_TRANSACTION_SAMPLE_RATE: '1.0',
ELASTIC_APM_MAX_QUEUE_SIZE=20480
ELASTIC_APM_CAPTURE_SPAN_STACK_TRACES=false
ELASTIC_APM_METRICS_INTERVAL=60s

Results are available on Kibana-stats.

Example of 2 end-points called during DiscoverJourney (screenshot omitted)

Full results (screenshot omitted)

The slowest end-points (screenshot omitted)

Gatling data retention

We store three kinds of gatling data.

  • user data: gatling-users
  • stats data: gatling-stats
  • metric data: gatling-data

For metric data, we wish to retain data as follows:

  • Rollover after 30 days or 50GB

  • Delete after 1 year

  • After #250 is done,
    we have to remove the index template used:
    GET _index_template/template_with_gatling_data_mappings_1shard_0replicas

[GenericJourney] load esArchives/kbnArchives from journey file

Follow-up to #elastic/kibana/pull/139891

Some journeys use esArchives/kbnArchives to load test data. This information is available in the journey json under the testData property. Relative paths to Kibana folders are defined there, and the $KIBANA_DIR env variable should be used to get the Kibana repo root path.

ESArchiver is already implemented and KbnArchiver is to be added.

Use before/after hooks in GenericJourney to set up and later clean up the test environment.

Identify queries associated with scenario/user workflow in log

As discussed with @dmlemeshko, we would like to identify the ES queries issued by Kibana when a scenario is executed. We will use these queries for Rally load testing. This is possible via elasticsearch.logQueries.

However, we also need to identify which queries are associated with which Kibana API call and scenario, e.g. logging into discover. When load testing with Rally, we would execute those queries associated with a scenario together - thereby simulating a user action. In order to do this, we need some way of grouping queries in the logs.

One proposal here is for the load gen to attach a unique value to a custom header for each scenario - this could just be a generated uuid and/or the name of the scenario. We need to determine if Kibana can be made to log these headers when logging Kibana queries. A change of this value in the log would in turn indicate a new scenario (a minimal sketch of the load-gen side follows).
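A minimal sketch of the load-gen side (the header name and value format are hypothetical, not an agreed convention):

import java.util.UUID

// one id per scenario execution, attached to every request issued by that scenario
val scenarioId = s"discover-${UUID.randomUUID().toString}"

val scn = scenario("discover")
  .exec(
    http("discover query")
      .post("/internal/bsearch")
      .header("x-load-test-scenario", scenarioId)
  )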

@dliappis

Generate scalability tests based on single user benchmarking build artifacts

In elastic/kibana#130777 we pull APM traces collected during single user benchmarking journeys run on the CI job and normalise them into a format representing a sequence of Kibana API calls: each journey has its own json file stored as an artifact of that CI job.

We need to change current scalability benchmarking job to do the following:

  1. Pull artifacts from the latest single user benchmarking job build
  • we can use the buildkite API to get the latest build and fetch artifacts, or try accessing artifacts directly?
  • to use the buildkite API we need an API token
  • we will use a public bucket to store the Kibana build, plugins and jsons with API calls
  2. Generate Gatling simulations using scalability-simulation-generator:
  • node scripts/generate_simulations.js --dir <path to json files> --packageName org.kibanaLoadTest --url <kibana baseUrl>
  3. Download (or build fresh based on the commit hash) the Kibana build used by the single user benchmarking job
  • we can probably start with a fresh build based on the commit hash of the buildkite build
  4. Build this project with the generated simulations inside and run the scalability tests
  • copy the generated scala files under src/test/scala/org/kibanaLoadTest/simulation
  • build the project: mvn clean test-compile
  • run the test: mvn gatling:test -Dgatling.simulationClass=org.kibanaLoadTest.simulation.<generated_simulation_file_name>

switch from esArchiver to saved objects API for .kibana data in functional tests

We currently use esArchiver to load both test data and Kibana data (which are saved objects stored in the .kibana index).
We should switch to loading the Kibana data using the saved object API https://www.elastic.co/guide/en/kibana/master/saved-objects-api.html

  1. We advise users NOT to write directly to the .kibana index
    "Do not write documents directly to the .kibana index. When you write directly to the .kibana index, the data becomes corrupted and permanently breaks future Kibana versions."

  2. The saved object API is the recommended way.

  3. When we use esArchiver to load data, it removes everything else first (except the default space). But using the Saved Object API we could potentially choose to add objects to the existing objects.

Back when most of the functional tests were written, we didn't support exporting and importing everything in the .kibana index. I'm not 100% sure if everything is now supported but I think it should be.

The steps to switch from using esArchiver to the SO API would probably be:

  1. Add a line in each test (or index file) where esArchiver is loading a .kibana index that exports saved objects
  2. run the tests to get all the exported files
  3. change the tests to
    1. remove the original esArchiver load
    2. add a new step to clear other saved objects out? We need to figure this part out.
    3. change the SO export to the SO import
  4. run the tests to verify they pass

Build cgroups to isolate Gatling process

In order to minimise potential noise while running Kibana/ES and the load executor (Gatling) on the same machine, we can limit the number of CPUs the load executor can use.

Setup might look like:

# Isolate the Gatling process to the first 2 cores (2 physical cores)
sudo cset set --set=/benchmark --cpu=0-1
sudo cset proc --exec /benchmark mvn gatling:test -q -Dgatling.simulationClass=${simulationPackage}.${simulationClass}

Cleanup should be executed at the end:

function tearDown() {
    echo "Destroying cgroups"
    sudo cset set --destroy /benchmark

    echo "Setting CPU powersave governor"
    for (( cpu=0; cpu<=${CORE_INDEX}; cpu++ ))
    do
        sudo cpufreq-set -c ${cpu} --min ${MIN_FREQ} --max ${MAX_FREQ} --governor=powersave
    done
}

[cloud] multiple Kibana nodes test

Compare the following configurations on Cloud:

  • Kibana 1 GB RAM | Up to 8 vCPU (2 zones)
  • Kibana 2 GB RAM | Up to 8 vCPU (1 zone)

with the default configuration Kibana 1 GB RAM | Up to 8 vCPU (1 zone)

ESArchiver fails with java.lang.RuntimeException: More than 1 index found in mappings.json

Steps to reproduce:

Run node scripts/run_scalability.js --journey-path x-pack/test/scalability/apis/api.saved_objects_tagging.tags.json

api.saved_objects_tagging.tags.json

{
  "journeyName": "GET /api/saved_objects_tagging/tags",
  "kibanaVersion": "8.7.0-SNAPSHOT",
  "scalabilitySetup": {
    "thresholdSLA": 3000,
    "warmup": [
      {
        "action": "constantUsersPerSec",
        "userCount": 10,
        "duration": "10s"
      }
    ],
    "test": [
      {
        "action": "incrementUsersPerSec",
        "userCount": 10,
        "times": 5,
        "duration": "2s"
      }
    ],
    "maxDuration": "10m"
  },
  "testData": {
    "esArchives": [
      "x-pack/test/functional/es_archives/logstash_functional",
      "test/functional/fixtures/es_archiver/getting_started/shakespeare"
    ],
    "kbnArchives": [
      "x-pack/test/functional/fixtures/kbn_archiver/saved_objects_management/saved_objects_mix.json"
    ]
  },
  "streams": [
    {
      "requests": [
        {
          "date": "2022-11-14T09:31:49.963Z",
          "http": {
            "method": "GET",
            "path": "/api/saved_objects_tagging/tags",
            "headers": {
              "Cookie": "sid=Fe26.2**b4f51707bfe081641d5680f3564a6294e67",
              "Kbn-Version": "8.7.0-SNAPSHOT",
              "Kbn-System-Request": "true",
              "Referer": "http://localhost:5620/app/home",
              "Sec-Fetch-Dest": "empty",
              "Sec-Fetch-Site": "same-origin",
              "X-Kbn-Context": "%7B%22name%22%3A%22home%22%2C%22url%22%3A%22%2Fapp%2Fhome%22%7D",
              "Host": "localhost:5620",
              "Accept-Encoding": "gzip, deflate, br",
              "Pragma": "no-cache",
              "Sec-Fetch-Mode": "cors",
              "Content-Type": "application/json"
            },
            "statusCode": 200
          }
        }
      ]
    }
  ]
}
 proc [scalability-tests]  proc [gatling: test] 14:46:50.790 [INFO ] ESArchiver - [/Users/dmle/github/kibana/x-pack/test/functional/es_archives/logstash_functional] Loading 'mappings.json'
 proc [scalability-tests]  proc [gatling: test] 14:46:50.893 [INFO ] ESArchiver - [/Users/dmle/github/kibana/x-pack/test/functional/es_archives/logstash_functional] Loading 'data.json.gz'
 proc [scalability-tests]  proc [gatling: test] 14:46:51.537 [ERROR] i.g.a.Gatling$ - Run crashed
 proc [scalability-tests]  proc [gatling: test] java.lang.RuntimeException: More than 1 index found in mappings.json
 proc [scalability-tests]  proc [gatling: test] 	at org.kibanaLoadTest.helpers.ESArchiver.load(ESArchiver.scala:71)
 proc [scalability-tests]  proc [gatling: test] 	... 20 common frames omitted
 proc [scalability-tests]  proc [gatling: test] Wrapped by: java.lang.RuntimeException: java.lang.RuntimeException: More than 1 index found in mappings.json
 proc [scalability-tests]  proc [gatling: test] 	at org.kibanaLoadTest.helpers.ESArchiver.load(ESArchiver.scala:81)
 proc [scalability-tests]  proc [gatling: test] 	at org.kibanaLoadTest.simulation.generic.GenericJourney.$anonfun$new$3(GenericJourney.scala:140)
 proc [scalability-tests]  proc [gatling: test] 	at org.kibanaLoadTest.simulation.generic.GenericJourney.$anonfun$new$3$adapted(GenericJourney.scala:140)
 proc [scalability-tests]  proc [gatling: test] 	at org.kibanaLoadTest.simulation.generic.GenericJourney.$anonfun$testDataLoader$1(GenericJourney.scala:42)
 proc [scalability-tests]  proc [gatling: test] 	at org.kibanaLoadTest.simulation.generic.GenericJourney.$anonfun$testDataLoader$1$adapted(GenericJourney.scala:41)
 proc [scalability-tests]  proc [gatling: test] 	at scala.collection.ArrayOps$.foreach$extension(ArrayOps.scala:1321)
 proc [scalability-tests]  proc [gatling: test] 	at org.kibanaLoadTest.simulation.generic.GenericJourney.testDataLoader(GenericJourney.scala:41)
 proc [scalability-tests]  proc [gatling: test] 	at org.kibanaLoadTest.simulation.generic.GenericJourney.$anonfun$new$1(GenericJourney.scala:140)
 proc [scalability-tests]  proc [gatling: test] 	at io.gatling.core.scenario.Simulation.$anonfun$params$16(Simulation.scala:178)
 proc [scalability-tests]  proc [gatling: test] 	at io.gatling.core.scenario.Simulation.$anonfun$params$16$adapted(Simulation.scala:178)
 proc [scalability-tests]  proc [gatling: test] 	at scala.collection.immutable.List.foreach(List.scala:333)
 proc [scalability-tests]  proc [gatling: test] 	at io.gatling.core.scenario.Simulation.$anonfun$params$15(Simulation.scala:178)
 proc [scalability-tests]  proc [gatling: test] 	at io.gatling.app.Runner.run(Runner.scala:56)
 proc [scalability-tests]  proc [gatling: test] 	at io.gatling.app.Gatling$.start(Gatling.scala:91)
 proc [scalability-tests]  proc [gatling: test] 	at io.gatling.app.Gatling$.fromArgs(Gatling.scala:53)
 proc [scalability-tests]  proc [gatling: test] 	at io.gatling.app.Gatling$.main(Gatling.scala:41)
 proc [scalability-tests]  proc [gatling: test] 	at io.gatling.app.Gatling.main(Gatling.scala)
 proc [scalability-tests]  proc [gatling: test] 	at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104)
 proc [scalability-tests]  proc [gatling: test] 	at java.base/java.lang.reflect.Method.invoke(Method.java:577)
 proc [scalability-tests]  proc [gatling: test] 	at io.gatling.plugin.util.MainWithArgsInFile.runMain(MainWithArgsInFile.java:53)
 proc [scalability-tests]  proc [gatling: test] 	at io.gatling.plugin.util.MainWithArgsInFile.main(MainWithArgsInFile.java:34)

Add esArchiver functionality

The Kibana test framework provides an easy way to ingest test data via esArchiver: it loads and unloads test data between tests.

In order to use custom data sets in performance scenarios, we need to add similar functionality here and generate data sets as well (see the reference CLI usage below).
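
For reference, this is roughly how archives are produced and consumed with the Kibana repo's esArchiver CLI (paths, URLs and credentials are placeholders; flags should be checked against the current Kibana scripts):

# Save the listed indices into an archive directory (run from the Kibana repo root)
node scripts/es_archiver save test/my_archive "my-index-*" --es-url http://elastic:changeme@localhost:9220

# Load the archive back before a test run
node scripts/es_archiver load test/my_archive --es-url http://elastic:changeme@localhost:9220 --kibana-url http://elastic:changeme@localhost:5620

Replicating the same archive format (mappings.json plus data.json.gz) would let the same archives be reused by both frameworks.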

Java heap space due to large response.log file

With GenericJourney we can run many requests concurrently, including static bundle ones. Historically we configured Gatling to save executed requests in a response.log file, which we later parse to ingest an individual doc for each request. With static bundles there are 200+ extra requests per user, which causes a Java heap space exception.

We should either reconfigure logging to exclude the response body of each request, or change the ingestion process and stop using the response.log file (i.e. disable file logging entirely).

Gatling index too large

We have one index, gatling-data-2021-11, that holds too much data: currently 134.5gb.

We want it split into monthly chunks, as we think this one large index will make upgrades difficult. The plan (a reindex sketch follows the list):

  • reindex 2021-11 data that fits only in November into a temp 2021-11
  • delete the orig 2021-11
  • reindex the temp 2021-11 back to the orig name
  • clean up gatling-data, which means reindexing any data that is in it, then deleting it and immediately completing #249
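
A minimal sketch of these steps using the Elasticsearch _reindex API (the host is a placeholder, the temp index name is hypothetical, and the timestamp field and date range used to select the November data are assumptions):

# 1. Copy only the November 2021 docs into a temporary index
curl -X POST "https://<es-host>/_reindex" -H "Content-Type: application/json" -d '{
  "source": {
    "index": "gatling-data-2021-11",
    "query": { "range": { "@timestamp": { "gte": "2021-11-01", "lt": "2021-12-01" } } }
  },
  "dest": { "index": "gatling-data-2021-11-temp" }
}'

# 2. Delete the original index
curl -X DELETE "https://<es-host>/gatling-data-2021-11"

# 3. Reindex the temp index back under the original name, then delete the temp index
curl -X POST "https://<es-host>/_reindex" -H "Content-Type: application/json" -d '{
  "source": { "index": "gatling-data-2021-11-temp" },
  "dest": { "index": "gatling-data-2021-11" }
}'
curl -X DELETE "https://<es-host>/gatling-data-2021-11-temp"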

Enable monitoring to Kibana stats

To get more information about Kibana & ES performance during a scenario run, we want to point monitoring for each new deployment at the Kibana stats instance.
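
One possible way to wire this up is through the Elastic Cloud deployments API; the endpoint and payload shape below are an assumption based on the public Cloud API and should be verified against the current docs (deployment IDs are placeholders):

# Assumed payload: ship logs and metrics of a freshly created deployment to the Kibana stats deployment
curl -X PUT "https://api.elastic-cloud.com/api/v1/deployments/<new_deployment_id>" \
  -H "Authorization: ApiKey ${API_KEY}" -H "Content-Type: application/json" \
  -d '{
    "prune_orphans": false,
    "settings": {
      "observability": {
        "logging": { "destination": { "deployment_id": "<kibana_stats_deployment_id>" } },
        "metrics": { "destination": { "deployment_id": "<kibana_stats_deployment_id>" } }
      }
    }
  }'

Alternatively, the same observability block could be included in the deployment-create payload so that new deployments ship monitoring from the start.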

Add more deployment configurations

In order to be more flexible with cloud testing, we need to add more config files that can be passed in CI:

  • Hardware profile
  • Auto-scaling
  • ES nodes size
  • Kibana nodes size

[ci] Run simulation for 7.x branch

Extend the current pipeline to build Kibana from the 7.x branch and run the same simulations that we run for master.
This should help us monitor the performance of local builds between minor versions and compare them with master.
