testground's Issues

Testground daemon runtime

The daemon is a long-lived server process that handles incoming GitHub webhooks and explicit API calls, orchestrates test plan execution, and serves the dashboard view. It needs to model a job queue of some kind. It schedules only against Nomad.

It will need a configuration file of some kind to define things like the Nomad daemon endpoint.
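
A minimal sketch of such a configuration file loader, assuming a TOML file parsed with BurntSushi/toml (the format and all field names are placeholders, nothing is decided yet):

package config

import "github.com/BurntSushi/toml"

// DaemonConfig is a hypothetical shape for the daemon configuration file.
type DaemonConfig struct {
    ListenAddr    string `toml:"listen_addr"`    // address the daemon HTTP API binds to
    NomadEndpoint string `toml:"nomad_endpoint"` // e.g. "http://127.0.0.1:4646"
    GitHubSecret  string `toml:"github_secret"`  // webhook HMAC secret
    QueueSize     int    `toml:"queue_size"`     // max pending jobs in the job queue
}

// Load parses the daemon configuration from a TOML file.
func Load(path string) (*DaemonConfig, error) {
    var cfg DaemonConfig
    if _, err := toml.DecodeFile(path, &cfg); err != nil {
        return nil, err
    }
    return &cfg, nil
}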

Test the TestGround

A few people have reported being uncertain whether TestGround is buildable or whether they just don't have the right setup on their machine.

Given that TestGround is designed to be runnable on a local node for non-exhaustive tests, having a Test Plan that verifies on Travis that TestGround is buildable (and therefore that at least its essential features work) would be a great thing.

Looking for a comment on how urgent this should be and/or whether there is any reason not to do it.

Trace API invocations in state db

Assuming we have a DB where we track the state of the system and jobs (#148), and that we're running in a shared, service setup (Maturity Stage 2, #643), the daemon should audit/trace all incoming testground API calls in the database, for future reference.
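
As a rough illustration, each API invocation could be persisted as a record like the following (field names are hypothetical, not a committed schema):

package state

import "time"

// APIInvocation is a hypothetical audit record persisted for every
// incoming testground API call.
type APIInvocation struct {
    ID         int64     // primary key
    ReceivedAt time.Time // when the daemon received the call
    Endpoint   string    // e.g. "/build", "/run"
    Caller     string    // authenticated user or token identity
    Payload    []byte    // raw request body, for future reference
    Outcome    string    // "accepted", "rejected", "failed"
}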

Review: Chewing strategies for Large DataSets

see: https://github.com/ipfs/testground/blob/02ec37ecee35c5ccd62911eea19a8f275b2903f0/plans/chew-large-datasets/README.md

Test Parameters

  • The directory depths allow specifying file sizes but file sizes can also be specified independently. How are these resolved?
  • As far as I can tell, one currently needs to explicitly list out every file to be added in File Sizes. Instead, we should have a list of [{average: ..., variance: ..., percent: ...}] and a final total file count (where the percent fields add to 100%). Otherwise, generating realistic tests will be really tedious.

For example:

{
  depth: 4,
  numFiles: 1e6, // maybe allow a list and test with each one?
  fileSizes: [
    {average: "1MiB", variance: "10%", percent: "90%"},
    {average: "1KiB", variance: "0%", percent: "10%"},
  ]
}
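
For illustration, a sketch of how such a spec could be resolved into concrete per-file sizes (the field names mirror the example above; the uniform-jitter strategy is just one possible interpretation):

package main

import (
    "fmt"
    "math/rand"
)

// SizeSpec mirrors one entry of the fileSizes list above, with values
// already parsed into bytes and fractions for simplicity.
type SizeSpec struct {
    Average  int64   // e.g. 1 MiB
    Variance float64 // e.g. 0.10 for "10%"
    Percent  float64 // e.g. 0.90 for "90%"
}

// resolveSizes expands the spec into one concrete size per file.
func resolveSizes(numFiles int, specs []SizeSpec) []int64 {
    sizes := make([]int64, 0, numFiles)
    for _, s := range specs {
        count := int(float64(numFiles) * s.Percent)
        for i := 0; i < count; i++ {
            // jitter uniformly within ±variance of the average
            jitter := (rand.Float64()*2 - 1) * s.Variance
            sizes = append(sizes, int64(float64(s.Average)*(1+jitter)))
        }
    }
    return sizes
}

func main() {
    specs := []SizeSpec{
        {Average: 1 << 20, Variance: 0.10, Percent: 0.90},
        {Average: 1 << 10, Variance: 0.0, Percent: 0.10},
    }
    sizes := resolveSizes(1000000, specs)
    fmt.Println("generated", len(sizes), "file sizes")
}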

IPFS MFS Write

- Pin the MFS root hash

The MFS root hash is always pinned (in go-ipfs).

IPFS Url Store / IPFS File Store

To be explicit:

  • Run FileStore doesn't really make sense. I assume you meant: add the files using the filestore (e.g., ipfs add --nocopy).
  • Verifying that all the files are listed involves running ipfs filestore ls to list all blocks and their filenames/urls.

Contrast Testground and Testlab

The libp2p testlab project is a preceding framework initiative to build a system for running large distributed test cases.

The testground spec mentions it in the "Incremental Implementation Plan":

Study how libp2p/testlab can cover the distributed deployment requirements of this design, and understand how it can be reused within this context.

Initial assessment:

  • The current version can be regarded as a domain-specific deployer, capable of launching p2p and ipfs daemons (PR in review) in a Nomad cluster. It does not use Docker, nor does it have support for deploying test logic. It essentially schedules N containers in a cluster and returns the API multiaddrs to the scheduling process, so that the code that launched the deployment can create client stubs for control.
  • In some ways, it can be regarded as a cloud-native iptb.
  • The observability/monitoring/metrics elements are not yet developed.

Testlab's roadmap covers some of the same ground as the spec for Testground. IPFS support using Docker exists as a PR (starts IPFS nodes and connects them, but no scenario yet).

Testground has a similar approach to writing test code, using a config file + go logic.

The Testlab "deployment configuration" JSON file is being used to declaratively describe stages which are transformed via plugins into deployable sets of software controlled by nomad. The test code execution is baked into the "scenario" binary that gets deployed in the final stage.

By contrast, the Testground "manifest" TOML file appears to be used to specify a series of individual tests, but the Go code is responsible for launching any instances itself - for example, the "smlbench" test uses the (currently disabled) "IPTB ensemble" wrapper that is part of the Testground SDK.

There are many other differences. As both projects are out there, and represent different design choices, it might be good to capture the "why?" and how that relates to the core needs and desires for a large scale test framework. There remains the possibility of utilizing some or all of Testlab with Testground ... but they do appear to be competing on the basis of how they are configured and tests are written, so it may make more sense to deprecate one in favour of the other if we are only going to pursue development on one of them.

This issue is a bit of a placeholder ... we could choose to do more "study" (as suggested in the spec) or we could add some documentation.

CLI describe command

Outputs information about a test plan or a test case for now. In the future it may be extended to inspect builders and runners, and their configuration properties.

AWS Tags

If you are creating virtual machines or other resources on AWS, please add some tags so we can track costs ... anything that isn't tagged will be deleted (if we can't figure out what it's being used for)

Edit the table below to add tags:

Tag Key | Tag Value      | Contact  | Description                   | Keep until date
Project | Jim_Testground | @jimpick | Experimenting with Testground | 2019.10.25
Project | Packer         | @jimpick | Building images               | 2019.10.25

archive service: design and implement API endpoint

Design an HTTP API endpoint that can ingest the following artifacts, emitted by the test runner:

  1. Go CPU profiles (e.g. 10 seconds).
  2. Go heap profiles.
  3. Go mutex profile.
  4. Go block profile.
  5. stdout logs.
  6. event logs.

Each archive request will carry, at least:

  1. name of the test case.
  2. test run number.
  3. node tag.
  4. commit hash.
  5. timestamp
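
A sketch of that request metadata as a Go struct (names are illustrative; the actual endpoint, encoding and artifact transport are still to be designed):

package archive

import "time"

// ArchiveRequest is a hypothetical envelope for artifacts ingested by the
// archive service; the artifact body itself would travel alongside it
// (e.g. as a multipart upload).
type ArchiveRequest struct {
    TestCase     string // name of the test case
    RunNumber    int    // test run number
    NodeTag      string // node tag
    CommitHash   string // commit under test
    Timestamp    time.Time
    ArtifactKind string // "cpu-profile", "heap-profile", "mutex-profile", "block-profile", "stdout", "events"
}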

Testground roadmap/timeline

I wanted to clarify and write down our goals/expectations for Testground to ensure we stay on track. This is a continuation of the conversation @Stebalien @jimpick @raulk @daviddias and I had last week.

Goal: Ship the next go-ipfs minor release (using testground to validate healthy network performance at scale) by November 15th, 2019.

Our updated release process takes ~3 weeks from the time we cut our RC, on or before October 25th.

At the time of cutting our RC, all tests need to be passing - aka, testground should have sufficient coverage to validate that master performs as well or better vs prod in simulated network conditions. Master currently has a large backlog of changes, including a large libp2p refactor IIRC, so it will probably take 2-3 weeks to use testground to validate and fix issues in master to ensure the build is green. This means we need to have testground working well enough by ~October 4th that we can start using it (or a portion of it) to identify any bugs introduced in the stabilize branch or other areas that we want to release in 0.5.0.

Does this seem feasible? If not, what do we need to change to hit this deadline? (parallelize more? more knowledge sharing? etc) Our goal was to get a go-ipfs release out in Q3. This new deadline, 11/15, is nearing the end of Q4 (given holidays). Delaying our next release to Q1 2020 would be unacceptable - what support / focus / etc is needed to get us on this trajectory?

Contrast Testground and ipfs/benchmarks

The ipfs/benchmarks project is a preceding framework initiative to build a system for running regularly scheduled test cases.

The ipfs/benchmarks system was primarily built to run several benchmarks for js-ipfs on a nightly basis in order to measure progress / catch regressions. It utilizes a dedicated bare metal server in order to get consistent benchmark runs so that it is possible to compare against historically archived benchmark results with some level of confidence.

ipfs/benchmarks architecture diagram

It also has a benchmarks dashboard (currently down) built with Grafana.

You can see the types of tests that are being run nightly:

ipfs/benchmarks#271 (note: the screenshot here is an enhanced Grafana dashboard using non-production data on a development deployment)

[Screenshot: nightly benchmark results in the enhanced Grafana dashboard]

The following tests are run nightly against js-ipfs built from master:

  • adding files to js-ipfs (multiple sizes, strategies)
  • adding files to go-ipfs (using a stable version, for comparison)
  • extracting a local file from js-ipfs (multiple sizes)
  • transfer between two js-ipfs nodes (no network throttle, multiple sizes, multiplexers, websockets, encryption)
  • transfer from a js-ipfs node to a go-ipfs node (multiple sizes)
  • transfer from a go-ipfs node to a js-ipfs node (multiple sizes)
  • multi-peer transfer - content is on 4 js-ipfs peers, a 5th peer retrieves a file (multiple sizes, multiplexers, websockets, encryption)

Some work was performed to see if we could re-use the infrastructure to also run nightly go-ipfs tests against a go-ipfs built from master (with no new tests), but the changes have not been deployed yet (needs more discussion)

An important feature of ipfs/benchmarks is that Clinic.js is used to generate runtime profiles and flamegraphs of the Node.js tests.

[Screenshots: Clinic.js profile and flamegraph output from a nightly run]

These are saved into IPFS and can be retrieved as HTML files:

https://ipfs.io/ipfs/QmeBBjfQgLfAtPwSKLVLnk6uVMq7gXmQ7omE3oDmCHFwjR/addMultiKb_balanced

Compared to what is currently implemented for Testground, ipfs/benchmarks has:

  • support for nightly scheduled test runs
  • support for building js-ipfs (go-ipfs in a PR)
  • small test suite focused on simple scenarios (mostly js-ipfs)
  • dedicated bare-metal test minion hardware for reproducible results
  • metrics collection database (InfluxDB)
  • javascript tracing/profile (Clinic.js)
  • a dashboard (Grafana) ... currently there is only one dashboard, which is poorly laid out, but Grafana is flexible and it could be greatly improved. Grafana supports multiple users, but the current setup has limited access

Testground appears to be on-track to implement the following differentiating features:

  • support for building Docker containers
  • testing across clusters of machines instead of a single machine
  • complex scenarios
  • ability to test continuously vs. nightly
  • go-ipfs and libp2p tests
  • metrics/results sent to ElasticSearch
  • no dashboards yet, evaluating Kibana ... Grafana is also capable of using ElasticSearch as a data source, so that might also be an option

In terms of maturity, ipfs/benchmarks was started last year, and is a complete working system. It has some minor docker startup issues that occasionally need devops intervention. We have been doing js-ipfs releases even though the system has been offline, so it's likely not a crucial part of the release testing process.

Testground is in a very immature state currently, but it has a planned feature set that will surpass ipfs/benchmarks and is being actively invested in with developer time. It will have a dashboard as well. There is a desire to not have 2 separate sets of dashboards for developers to regularly check.

It's not yet entirely clear if we should be trying to run two separate testing infrastructures, or if we should be actively working to merge them together.

Fake bitswap + datastore for mixed testing

It would be neat to do a "fake bitswap + datastore" thing we could slap into an IPFS test build that would pretend it was transferring over the network at a simulated rate, but wouldn't actually transfer blocks ... it would use an out-of-band backchannel to communicate any hashes that the other end needed.

We could do mixed simulation, but with many lightweight (but real) nodes on a real network for things like package manager use cases with massive numbers of clients.

It's pretty inexpensive to spin up lots of nodes in the cloud using a serverless platform such as Cloud Run (or maybe AWS Fargate), but if we want to send lots of real traffic across multiple regions, that costs $$$.
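
A very rough, self-contained sketch of the idea (deliberately not tied to the real bitswap/exchange interfaces): the fake transfer sleeps for size divided by the simulated rate, while the block bytes are handed over via an out-of-band backchannel.

package fakebitswap

import (
    "context"
    "time"
)

// Block is a stand-in for a real IPFS block: just a key and its payload.
type Block struct {
    Key  string
    Data []byte
}

// FakeExchange pretends to transfer blocks over the network: it waits for
// the time a real transfer would take at the simulated rate, then pulls the
// bytes from an out-of-band backchannel shared by both ends.
type FakeExchange struct {
    RateBytesPerSec float64
    Backchannel     func(ctx context.Context, key string) (Block, error) // out-of-band fetch
}

// GetBlock simulates fetching a block: fetch out of band, sleep for the
// simulated wire time, return the block.
func (f *FakeExchange) GetBlock(ctx context.Context, key string) (Block, error) {
    blk, err := f.Backchannel(ctx, key)
    if err != nil {
        return Block{}, err
    }
    delay := time.Duration(float64(len(blk.Data)) / f.RateBytesPerSec * float64(time.Second))
    select {
    case <-time.After(delay):
        return blk, nil
    case <-ctx.Done():
        return Block{}, ctx.Err()
    }
}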

Produce build outputs with deterministic names for caching

Idea: plan name + hash of build input struct.

For the executable build strategy, we should be storing executables under a user-determined/autogenerated-but-consistent directory, and naming them deterministically.

For the docker build strategy, we should name/tag docker images with the deterministic name.

When the build step runs, we should check if we have a cached artifact.

We should provide a clean command to prune artefacts cached by a builder.
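
One possible way to derive the deterministic name, sketched under the assumption that the build input is a serialisable struct:

package builder

import (
    "crypto/sha256"
    "encoding/hex"
    "encoding/json"
    "fmt"
)

// BuildInput is a stand-in for whatever struct the builder already uses to
// describe a build (plan name, upstream dependency versions, build config...).
type BuildInput struct {
    Plan         string
    Dependencies map[string]string // module path -> commit/version
    BuildConfig  map[string]string
}

// ArtifactName returns "<plan>-<hash>", where the hash is a digest of the
// canonically serialised build input, so identical inputs map to the same
// cached artifact (executable path or docker image tag).
func ArtifactName(in BuildInput) (string, error) {
    b, err := json.Marshal(in) // json.Marshal sorts map keys, so this is stable
    if err != nil {
        return "", err
    }
    h := sha256.Sum256(b)
    return fmt.Sprintf("%s-%s", in.Plan, hex.EncodeToString(h[:8])), nil
}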

smlbench tests do not run

I was looking through the code of the smlbench tests to use them as a reference for Test Case 1. They don't run at the moment, and it is easy to understand why: the manifest is a copy-and-paste of the one from the DHT tests.

Should I take any inspiration at all from these tests? It seems that the way a go-ipfs daemon is being spawned is by programmatically running a wired-in version of IPTB. Is this intended? Is this still the plan, or should we look into spawning a go-ipfs daemon and using go-ipfs-api to operate it?

Combine random/interest dataset plans

There are currently two test plans that cover transferring datasets: interest and random. However, the tests will be identical. The only difference is the input data.

We could simplify them by:

  1. Combining them.
  2. Allowing the user to specify the data source:
  • Generated files as specified in #80
  • An IPFS path.

Migrate sync service from Consul to Redis

The current version of the keyprefix watch in Consul is pretty inefficient.

It is intended to watch a tree node and return updates whenever anything under it changes.

Unfortunately, it sends the entire subtree with every update, instead of the delta. So in a 100,000-node scenario this scales very poorly: each node receives the entire subtree every time a node appears, which is roughly O(n²) traffic per node and O(n³) overall 😨

In all fairness, the Consul community is addressing this by introducing "streaming queries": hashicorp/consul#6310. However, we can't afford to wait for that to land and characterise its performance and feasibility, so we will be migrating this component to Redis.

Various patterns are possible:

  • Redis Streams (X command group).
  • (Sorted) sets/lists with keyspace notifications via pubsub.
  • Strings with keyspace notifications via pubsub.

I'll analyse the tradeoffs in comments.

Related to #23.

Produce docker container for DHT tests

  • Consume a list of upstream dependencies/commits, emit replace directives, and append them to the go.mod file.
  • Template Dockerfile.
  • Trigger Docker build.
  • Collect Docker container ID.

epic: archive service

This service will likely be backed by a blob store. It will index and store CPU, heap, mutex, etc. profiles.

Acquire a domain name

It would be nice to have a URL where we could host things like:

  • dashboard
  • grafana
  • blog for infra test updates
  • sub-namespaces for specific teams / endeavours to post results

Keep in mind that we might want to share resources between IPFS, libp2p, and possibly even Filecoin and community projects / collaborations down the road.

Use Ansible to automate cluster setup

This is close to becoming a PR ... I've been experimenting and prototyping in the aws-ansible branch to learn a bit of Ansible so I can have reproducible cluster setups (a serious pain point for me so far).

I've got playbooks working for:

  • Dynamic Inventory - retrieve the list of machines from the AWS EC2 API that are tagged with the same 'TG' tag as the current host
  • Connectivity Test - ping all the hosts in the inventory
  • Redis - set up the redis.conf config file on the first machine and start the server
  • Filebeat - set up the filebeat.yml from a template populated from an Ansible config stored in an S3 bucket
  • Networking - set up Docker networking on each machine with non-overlapping subnets and GRE tunnels (currently set up for 2 machines, will generalize to 2+ soon)

I'm planning to take these individual scripts and put them into a directory as Ansible "roles" so they can be run all together.

DHT find_peers distributed test over 1000 nodes

A test scenario that we'll use as an example/template to build the system gradually.

  1. Initially this will run within a single process.
  2. Run as multiple processes locally, coordinated by the sync service (Consul). Local test runner.
  3. Deploy to a Nomad cluster. Nomad test runner.

Impl Traffic Shaping

We need an API/tool to declaratively enforce traffic shaping rules via tc. Think of this as Terraform for traffic shaping. It takes an object (serializable into JSON or YAML or whatever), and applies the rules expressed within. It then exposes an API to "release" those rules and revert the network adaptor.
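
A minimal sketch of what the declarative shape and its application could look like; the Rules struct and helpers are hypothetical, only the tc/netem invocations are standard:

package netshape

import (
    "fmt"
    "os/exec"
    "time"
)

// Rules is a hypothetical declarative description of the desired shaping.
type Rules struct {
    Interface string        // e.g. "eth0"
    Latency   time.Duration // added one-way delay
    LossPct   float64       // packet loss percentage
}

// Apply enforces the rules with a netem qdisc on the interface.
func Apply(r Rules) error {
    args := []string{"qdisc", "add", "dev", r.Interface, "root", "netem",
        "delay", fmt.Sprintf("%dms", r.Latency.Milliseconds()),
        "loss", fmt.Sprintf("%.2f%%", r.LossPct)}
    return exec.Command("tc", args...).Run()
}

// Release reverts the interface to its default qdisc.
func Release(iface string) error {
    return exec.Command("tc", "qdisc", "del", "dev", iface, "root").Run()
}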

reporting service: client proxy to inject in test scenarios

The reporting HTTP service will be exposed by the coordinator, but we don't want the test scenarios to deal with raw HTTP calls. They should receive a proxy client that encapsulates the network calls and exposes a nice, simple API.

type Reporter interface {
    RecordMetric(name string, value float64) error
}

// Encapsulates the reporting context and implements Reporter.
type httpReporter struct {
    commit   string
    run      int
    scenario string
}
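
A sketch of what RecordMetric could do under the hood (the endpoint path, payload shape, and the extra endpoint field on httpReporter are assumptions, not the actual design):

// Assumes httpReporter also carries an `endpoint string` field with the
// coordinator's base URL, and that the file imports bytes, encoding/json,
// fmt and net/http.
func (r *httpReporter) RecordMetric(name string, value float64) error {
    payload, err := json.Marshal(map[string]interface{}{
        "commit":   r.commit,
        "run":      r.run,
        "scenario": r.scenario,
        "metric":   name,
        "value":    value,
    })
    if err != nil {
        return err
    }
    resp, err := http.Post(r.endpoint+"/metrics", "application/json", bytes.NewReader(payload))
    if err != nil {
        return err
    }
    defer resp.Body.Close()
    if resp.StatusCode != http.StatusOK {
        return fmt.Errorf("reporting service returned %s", resp.Status)
    }
    return nil
}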

master test plan: sketch out the skeleton

The master test plan contains all the test scenarios that the scheduler will run for every single commit that is scheduled to be canary-tested.

  • Each test scenario can be a struct implementing a TestScenario interface. The TestScenario interface could be mono-method.
  • The master test plan instantiates an IPTB of, say, 16 nodes.
  • The grand total node count can vary over time as the master test plan evolves.
  • Test scenarios should be parallelisable.
  • Test scenarios will "check out" IPFS instances from the IPTB pool, and will conduct tests on them. For example, an "add-then-get" test scenario will check out two instances: the adder and the getter.
  • Something needs to supervise the assignment of available IPFS instances to test scenarios.
  • While those instances remain checked out, they will be unavailable to other concurrent test scenarios.
  • When a test scenario finishes, it's not clear if we should completely dispose of that IPFS instance, or if it can be reused.

The scheduler will run the master test plan N times per commit, in order to acquire various observations. However, the master test plan does not need to know it's being run repeatedly.
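
A rough sketch of the shapes described above (all type and method names are hypothetical):

package master

import "context"

// Instance is a handle to one IPFS node from the IPTB pool.
type Instance interface {
    APIAddr() string
}

// TestScenario is mono-method: it receives the instances it checked out
// and runs against them.
type TestScenario interface {
    Run(ctx context.Context, instances []Instance) error
}

// Pool supervises the assignment of available IPFS instances to scenarios.
type Pool interface {
    // Checkout blocks until n instances are free, then reserves them.
    Checkout(ctx context.Context, n int) ([]Instance, error)
    // Return releases the instances (or disposes of them, TBD).
    Return(instances []Instance)
}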

Test plan executables should generate nomad HCLs for test cases

We want the HCL definition of a nomad test case to be co-located with the test case itself. The way to achieve this is to have the test plan executable print the HCL to stdout.

Introduce a TEST_COMMAND environment variable that can take values: run, schedule.

  • When the value is run, the executable will run the test case designated by TEST_CASE_SEQ, as normal.
  • When the value is schedule, the executable will spit out the Nomad HCL for scheduling the test case designated by TEST_CASE_SEQ, with the parameters conveyed in other environment variables.

The emitted HCL may in itself be a Go template, as there are certain elements that cannot be determined by the test case itself (e.g. test run ID, etc.). We'll need to define the expectations and input/output clearly here.
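
A sketch of the dispatch a test plan executable could perform on start-up (the TEST_COMMAND and TEST_CASE_SEQ names come from this issue; the helper functions are placeholders):

package main

import (
    "fmt"
    "io"
    "os"
)

func main() {
    seq := os.Getenv("TEST_CASE_SEQ")
    switch cmd := os.Getenv("TEST_COMMAND"); cmd {
    case "run", "":
        runTestCase(seq) // execute the test case, as today
    case "schedule":
        emitNomadHCL(os.Stdout, seq) // print the (templated) Nomad HCL for this test case
    default:
        fmt.Fprintf(os.Stderr, "unknown TEST_COMMAND: %q\n", cmd)
        os.Exit(1)
    }
}

func runTestCase(seq string)               { /* run the test case designated by seq */ }
func emitNomadHCL(w io.Writer, seq string) { /* write the HCL job description to w */ }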

github integration: lay the foundation

What we need to do

Write a GitHubBridge component that:

  • sets up a webhook endpoint.
  • subscribes to commits on master, commits on pull requests, and comments on PRs.
  • prints out those events to stdout.

Can be tested manually against a personal repo (don't create test PRs or commits on go-ipfs itself!). Not sure how this can be unit tested; maybe create integration tests that use a personal repo and the GitHub API to make commits, etc. that trigger the events we'll receive via the webhook endpoint. You can register and remove webhooks dynamically via the WebHook API: https://developer.github.com/v3/repos/hooks/.

This integration test will also need to run on Travis (we need to set up Travis -- can do that in this issue too?).
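
A minimal sketch of the webhook endpoint using only the standard library (routing, signature validation via the X-Hub-Signature header, and event-specific parsing are left out):

package main

import (
    "io/ioutil"
    "log"
    "net/http"
)

// webhookHandler prints every GitHub webhook delivery to stdout. GitHub puts
// the event kind in the X-GitHub-Event header (e.g. "push", "pull_request",
// "issue_comment").
func webhookHandler(w http.ResponseWriter, r *http.Request) {
    event := r.Header.Get("X-GitHub-Event")
    body, err := ioutil.ReadAll(r.Body)
    if err != nil {
        http.Error(w, "failed to read body", http.StatusBadRequest)
        return
    }
    log.Printf("received %s event: %s", event, body)
    w.WriteHeader(http.StatusOK)
}

func main() {
    http.HandleFunc("/webhook", webhookHandler)
    log.Fatal(http.ListenAndServe(":8080", nil))
}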

Definition of Done

  • Documented code.
  • Integration tests against a personal repo, using the webhook API to dynamically register a webhook, and the general GitHub API to make actions that lead to webhook notifications.
  • Pull request.
  • Travis config.

epic: test scheduler

The test scheduler is the component of the system that takes a commit hash, checks out the go-ipfs tree, builds any necessary artifacts, and schedules a master test plan run against that build.

The scheduler should be developed as an abstraction, with an initial implementation that schedules test runs locally and in a serial fashion (FIFO queue).

In the near future, it should evolve to schedule test runs on a nomad cluster, so that we can parallelise the live canary testing of multiple commits at the same time.
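
A sketch of that initial serial scheduler, using a buffered channel as the FIFO queue (the Job and Runner shapes are hypothetical):

package scheduler

import "log"

// Job describes one scheduled run: a commit to check out, build and test.
type Job struct {
    Commit string
}

// Runner executes one job (build artifacts + run the master test plan).
type Runner interface {
    Run(job Job) error
}

// Serial schedules runs locally, one at a time, in submission order.
type Serial struct {
    queue chan Job
}

func NewSerial(runner Runner) *Serial {
    s := &Serial{queue: make(chan Job, 64)}
    go func() {
        for job := range s.queue { // FIFO: jobs are consumed in enqueue order
            if err := runner.Run(job); err != nil {
                log.Printf("run for commit %s failed: %v", job.Commit, err)
            }
        }
    }()
    return s
}

// Enqueue adds a job to the tail of the queue.
func (s *Serial) Enqueue(job Job) { s.queue <- job }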

Error while deleting containers

Ran the command from the readme for running the tests locally:

TESTGROUND_BASEDIR=`pwd` testground -vv run dht/lookup-peers --builder=docker:go --runner=local:docker --build-cfg bypass_cache=true

And I get errors about deleting containers. Log below, from after it created all the containers.

log:

305.87214s INFO started containers {"runner": "local:docker", "run_id": "8c5f91b5-4a26-427f-86bf-63e67e823221", "count": 50}
305.90342s ERROR Error: No such container: f60ad6ea40133850ab76fccd0e3dbe195880002db33f88d2132dfd710565f984 {"runner": "local:docker", "run_id": "8c5f91b5-4a26-427f-86bf-63e67e823221"}
github.com/ipfs/testground/pkg/runner.(*LocalDockerRunner).Run
/Users/dietrich/go/src/github.com/ipfs/testground/pkg/runner/local_docker.go:237
github.com/ipfs/testground/pkg/engine.(*Engine).DoRun
/Users/dietrich/go/src/github.com/ipfs/testground/pkg/engine/engine.go:225
github.com/ipfs/testground/cmd.runCommand
/Users/dietrich/go/src/github.com/ipfs/testground/cmd/run.go:128
github.com/urfave/cli.HandleAction
/Users/dietrich/go/src/github.com/urfave/cli/app.go:523
github.com/urfave/cli.Command.Run
/Users/dietrich/go/src/github.com/urfave/cli/command.go:174
github.com/urfave/cli.(*App).Run
/Users/dietrich/go/src/github.com/urfave/cli/app.go:276
main.main
/Users/dietrich/go/src/github.com/ipfs/testground/main.go:37
runtime.main
/usr/local/go/src/runtime/proc.go:203
305.90969s INFO deleting containers {"runner": "local:docker", "run_id": "8c5f91b5-4a26-427f-86bf-63e67e823221", "ids": ["f60ad6ea40133850ab76fccd0e3dbe195880002db33f88d2132dfd710565f984", "18de494c88af62bfb614ae9f9c1ef9f875f6de887383b586d69b2be0921df42d", "110351118efb8c56924b08176b15034b221420280c11286ec76428b758facbd4", "9ae807880b811897909ea91362467f7cdb91f872c30032ef2ea908111094d6dc", "48e64f8cc2954699db0da9c22c83d3f1c377948d2444a6d326bdc73e290670ad", "6016eaab4e6d1bedab9fe06618ecd9edc4b9c5cc7a26ce8b6e6f3d0ecc24a31f", "dd1f01c45b03ac8a23a4b4a9fb3b2b8bdc7b850848ee39d6f8cf7fabb3140ef2", "028443d4686419e13bcb945352cb56258866c47a9c62287ffa78c7de28eda44d", "8cd6b61bdb80bb25ddad18517ac6897f670fe68a380e1be98bbebdb1dc8458a7", "1a38f9c3e3add39ef5315886efc0db229d4639b60d1f1dd3eb0f085a95fd2aa0", "8e5eeb3a0707d4a8b477662f9b66a59822394099ceecf04482f54d893fe64b93", "5cbb26c5d196209d3457ca41c40fc5a5f48e5ce905ef7dceda7336f3a3a328ff", "90f1370c399dc28bca53352bf7bf12b2ab1060c1e39d88465122bd4a58276593", "0eb7d340f8453fe0a4b9b61c68fbbae16b247fa63822082c4fda19fb455fc754", "2eb0db65b7bbd22349527becda0600d86b0b6828f4267e09c04502463c04625e", "21747a4afd2c532701aca25f09feb868eab91e661ebdf3f23ee64d210ec3f122", "e2029145cce0ece593cb979bd9710e6798ad1868d2fce42aa5294fd871ffc098", "9974da148aec9a952208486fd125f3e0623344bb893469a86a24ed200f4d7a9e", "9d089215a44b52495aa5b6392a3ed18aaf8045807970f8437de02c9986bef8c9", "46e6bd86605929ad4b8aeed173c6f04e1377076d70b898ad8cde769b99ecb5c2", "a43c8327f795b83464d89ab1c90a8e5d4b06fd1734b9e0eeef04791266a6f703", "4d300eff9709b28f19d5bc03477010519e5c2f60cc08ed39c90da1f5ebbdbf8c", "6e4fd0993962816884eb305cf77812468bc53a10c16a33576531b942c3dd7fb3", "2737bac62e45936ac9a12dd93367d4fa614a12a5b461d2864e3895a9ba0b25d6", "c8418f988df67b44ad8e4b4c77029792fa19ff61e2eaf81ab6b0b6ced7b523a3", "37c2b0361e3ad53dbb39b6e85310373e9146fcf73b03d4796f2064a47b580156", "8fc69571f791d0016beec8bf9c028158e0160abf87742a70f3da322cf221dd96", "458809f2fcee4a837b95fb0d3cbf300e353290cd37b3e2c7be6c35284b0d5e1e", "9ec5921c95bad596aa4b079e63ca93d31d591cfeef1d9c5679c6659ebc9d2717", "76b9a7821f1d45a8784b1bcf291acb80e04ec003b5dc33db19c2232542241abd", "f953adaaf34ffa4e9be2e92ea6940c68cb8ac1a6b70d709e9d12ecb3e6227ead", "9b99f9b20fcc9a677ea9aa0caeef0ddf08736b96dd360cde454a169b282b2bf0", "b5764f5be1a539dc07efb4cf26dd90728b9dd9dfefda141b87ff3e69fb2176e7", "1dff45985a770f8d8d1e74e29a44e5598aae541239c51d12de123521087e62cd", "4985c012dce1425b634c025d77de53b38a700d3950d0f076e9e9e7215ef9695d", "7bf4f4f4dd9dc16cb1c66510c6bd727bdfe431a1acc41c1b40ec291b126a3ba9", "5d9e66432d9cbb82b5873420d70c6e50b01657309abb9385e6413e72bf726b59", "a04bd280e828af4759a581565a2a1c259667523e19898d988e29ede3b5bbb69c", "227b1d2c8c5aa90f0be7c2d46099aaff5f66cdd691033477d99917b75848ae24", "1b68e36153c9a48747b1faecdd509ceb697e83949ce0c54f68557076134a3ab9", "ada537bc80fdcd021d4eb44e0a1856db6584576ba542a0e9c7e7aa834c9b7d48", "40152c6cd7b1ba7e8351d070843572c840bd26467d4401089456c961497fbf6f", "286bdf4c86cf3b25f8ecbb757914ae326f9c838937f47d3137d2db79f41d5ea5", "0184c77eb0548229e37055aebe18378367b0a433fc02c739c1e7ab707cf19b82", "61bd6ab454fd86af8b01088ac1f94321ca15de605838973d3849d7e8b6e51cec", "72d2ecc82cbc9cac23830befb4da9ede1db2770001f8d029b67b9b16efce5217", "1d90304528b25fac622bd39504f5444f5c3067189b54c25dee17dc192d26209b", "185161f998f8952a632c1ccb322d217bfa758637e50db99fa8338febe7c630ca", "a13096d7ed77a712e1edaa11fb39a8fbfe9b5472ebd592553d416ff2f41f0c05", "eee6c43871d68c558d5b03ddc2732f8b6888ad1b5cc3b565a97e05f0984b9daa"]}
305.91539s DEBUG deleting container {"runner": "local:docker", "run_id": "8c5f91b5-4a26-427f-86bf-63e67e823221", "id": "f60ad6ea40133850ab76fccd0e3dbe195880002db33f88d2132dfd710565f984"}
305.93339s ERROR failed while deleting containers {"runner": "local:docker", "run_id": "8c5f91b5-4a26-427f-86bf-63e67e823221", "error": "Error: No such container: f60ad6ea40133850ab76fccd0e3dbe195880002db33f88d2132dfd710565f984"}
github.com/ipfs/testground/pkg/runner.deleteContainers
/Users/dietrich/go/src/github.com/ipfs/testground/pkg/runner/local_docker.go:286
github.com/ipfs/testground/pkg/runner.(*LocalDockerRunner).Run
/Users/dietrich/go/src/github.com/ipfs/testground/pkg/runner/local_docker.go:238
github.com/ipfs/testground/pkg/engine.(*Engine).DoRun
/Users/dietrich/go/src/github.com/ipfs/testground/pkg/engine/engine.go:225
github.com/ipfs/testground/cmd.runCommand
/Users/dietrich/go/src/github.com/ipfs/testground/cmd/run.go:128
github.com/urfave/cli.HandleAction
/Users/dietrich/go/src/github.com/urfave/cli/app.go:523
github.com/urfave/cli.Command.Run
/Users/dietrich/go/src/github.com/urfave/cli/command.go:174
github.com/urfave/cli.(*App).Run
/Users/dietrich/go/src/github.com/urfave/cli/app.go:276
main.main
/Users/dietrich/go/src/github.com/ipfs/testground/main.go:37
runtime.main
/usr/local/go/src/runtime/proc.go:203
Error: No such container: f60ad6ea40133850ab76fccd0e3dbe195880002db33f88d2132dfd710565f984

Is it known that `exec:go` does not run?

» TESTGROUND_BASEDIR=`pwd` testground run smlbench/store-get-value --builder=exec:go
resolved testground base dir from env variable: /Users/imp/code/go-projects/src/github.com/ipfs/testground
Incorrect Usage: invalid value "exec:go" for flag -builder: allowed values are docker:go; got: exec:go

epic: reporting service

Test scenarios need to record their results and observations in a database, from where the dashboard will read them. The reporting service will be an HTTP API with a proxy object that we'll inject in test scenarios so they can record metrics easily without having to worry about constructing HTTP clients, nor providing the context (i.e. test run number, test case, commit id).

ElasticSearch deployment alternatives

There are many, many different options for deploying an ElasticSearch cluster...

Right now, we're using the "Elastic Cloud" offering from Elastic, for ease of setup, unified billing (via the AWS Marketplace), and because it has the latest features and matches the online documentation. However, it's not open source ... here's the full feature matrix:

https://www.elastic.co/subscriptions (click on "Expand all features")

The "Elastic Cloud" version automates a lot of the tasks involved in running a cluster, but it's not fully managed - there are still admin tasks that somebody will have to perform.

The open source version probably gives us most of what we need for Filebeat/ElasticSearch (needs to be confirmed) ... the Elastic Cloud version of Kibana has a lot of features that aren't in the Open Source version.

"Elastic Cloud" can be purchased directly from Elastic.co or via the AWS Marketplace. If purchased direct, it can be deployed on either AWS or GCP. If purchased via AWS Marketplace, only AWS is available (no surprise) ... the nice thing is that the billing is integrated into the AWS bill. I'm only provisioning a tiny install so far, so I haven't evaluated costs - it's billed hourly on AWS Marketplace based on the size and number of servers used in the cluster.

The Open Source version of the Elastic Stack can be set up manually, and is available in many Linux distributions. Setting up a cluster is a fair amount of effort though. If the volume of data we need to store gets really large, it might be more economical to go this route.

AWS has a competing hosted ElasticSearch offering: https://aws.amazon.com/elasticsearch-service/ ... and is sponsoring an alternative open source distribution to the Elastic.co commercial version: https://opendistro.github.io/for-elasticsearch/

There are also other paid hosted services, eg. https://logz.io/

Using testground to test Bitswap in varying leech / seed combinations

There are a couple of test plans for testing Bitswap performance:

These are good real-world tests that can be used to determine if a change to Bitswap is safe to release.

As mentioned in the above test plans, for Bitswap we are most concerned about

  • overall transfer time
  • overall data transfer (ie how much unique vs duplicate data is transferred)

It would be helpful to also run more focused tests that isolate the effect of changes, in particular to benchmark each combination of L leeches pulling from S seeds:

  • 1 leech pulling from
    • 1 seed
    • 2 seeds
    • 4 seeds
    • 8 seeds
    • 16 seeds
  • 2 leeches pulling from
    • 1 seed
    • 2 seeds
    • 4 seeds
    • 8 seeds
    • 16 seeds
  • 4 leeches ...
  • 8 leeches ...
  • 16 leeches ...

For these tests we would not be concerned with connection management - all nodes would be connected directly to each other.

Conceptualise GitHub automation layer

I regard this as a mux router for GitHub webhook events. Apparently nothing like this exists (I only checked go-land).

In essence, it will be a rulebook (defined in configuration) that enumerates rules like: “on a new commit on go-ipfs/master, trigger this test plan”. Feels a bit like GitHub Actions.

Actually, at the bare minimum we should define a series of commands that trusted committers can trigger via a @testbot mention — e.g. @testbot run <testplan> with <dependency=gitref> <dependency=gitref> <dependency=gitref>

reporting service: design database schema

A metric is defined by:

  • metric name (e.g. time_to_get_1mb_file) -- unique
  • unit: full (e.g. milliseconds), abbreviated (e.g. ms)
  • direction of improvement: +1, -1 (so that the reporting UI can colour it appropriately when rendering deltas)

An observation of a metric (aka result) should be identified by:

  • metric name (e.g. time_to_get_1mb_file)
  • test case name (e.g. add_get_file)
  • test run (e.g. run number #123)
  • value: float

Minimal sync service

Create a wrapper around Consul watches for:

  • adding and removing peers to/from a test scenario.
  • leader election via locks.
  • incrementing state counters (e.g. how many nodes have entered state X).

These are important primitives for distributed test cases.
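
These primitives could surface behind an interface along the following lines (a sketch; names are hypothetical and the implementation would wrap Consul watches, sessions/locks and KV counters):

package sync

import "context"

// Service is a hypothetical facade over the Consul-backed sync primitives.
type Service interface {
    // Subscribe returns a channel of peer join/leave events for a scenario.
    Subscribe(ctx context.Context, scenario string) (<-chan PeerEvent, error)
    // AcquireLeadership blocks until this node holds the scenario lock.
    AcquireLeadership(ctx context.Context, scenario string) (release func(), err error)
    // SignalState increments the counter for a state and returns the new value
    // (e.g. how many nodes have entered state X).
    SignalState(ctx context.Context, scenario, state string) (int64, error)
}

// PeerEvent reports a peer being added to or removed from a scenario.
type PeerEvent struct {
    PeerID string
    Joined bool
}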

Dashboard concept using mock data

Let's make an HTML dashboard that displays the output from some mock test runs.

Primarily, this iteration of the dashboard will be used to further the design discussion.

We know we will be running tests, and collecting data from the tests, and we'll want to make that data viewable somewhere.

We want to further discuss or design additional responsibilities of the dashboard, which might include:

  • high-level overview of all the testing
  • easy pinpointing of improvements or regressions in key metrics
  • ability to view historical graphs of individual metrics
  • ability to navigate to different code branches / pull-requests
  • test plan scheduling management - accept commands to schedule certain test plans to run continuously, periodically, or on an ad-hoc (manual) basis
  • infrastructure resource management - view and/or change cluster setups

@raulk proposed the following for the initial dashboard concept:

A simple matrix like https://build.golang.org/ would work well for us, for now.
• rows: reported result (e.g. time to first provider record, time to get file, time to find peer); grouped by test case (e.g. add and get 8mb file, find random peer, bitswap transfer no provider lookup, bitswap transfer with provider lookup).
• columns: commits that have been tested. potentially with a second layer of nested columns inside each commit, one per test run — if we run each test plan several times in the canary setup, to account for variance
• each cell contains the numeric value of the metric for that test run.
• cell bg are traffic-light colour coded based on if the metric improved or worsened vs. a baseline.
• we pick the baseline in a dropdown (master, commit X, release Y, etc.)

I'd like to propose that we generate some realistic "mock" data for the first batch of tests that we would like to start running, and use that as a basis to rapidly prototype a design.

Contrast Grafana vs. Kibana

This issue is here to collect experiences of Grafana vs. Kibana so we can make some decisions about which ones to use...

Testground census and commands/actions

The testground contains test plans, which in turn contain test cases. Test plans are black boxes with a cleanly defined environment. They can be developed in Go, Shell scripts, JavaScript, etc. Test plans need to enlist with the testground's census.

Test plans have three actions:

  • Build.
  • Schedule.
  • Run.

Build action

The build action needs to take a dependency manifest as an input. Why? Test plans will be triggered against specific commits of upstream dependencies. A Go test plan would, for example, add replace directives to its go.mod file before it calls go build and creates the Docker container.

$ cat << EOF | testground build <testplan>
github.com/libp2p/go-libp2p=18daff102bb2...
github.com/libp2p/go-libp2p-kad-dht=89ac3dbe740...
EOF

^^ This produces a Docker container for the test plan, against those upstream dependencies. Alternatively, we can introduce a binary builder that generates an executable rather than a Docker container, --output=binary.

Schedule action

The schedule action produces the Nomad HCL job descriptions for running a testplan, provided a container ID:

$ testground schedule --container=<container-id> <testplan>[/<testcase>]

Alternatively we could have a local scheduler which produces a shell script to run locally.

Run action

The run action may not belong in test plans, but rather in test runners.

The run action takes a batch of Nomad HCL job descriptions and submits them to the Nomad cluster for execution. TBD.

Pluggable test plans (via manifests)

Currently, test plans are part of the testground source tree. It would be nice to define a clean, lightweight, abstract API between the testground and test plans, such that test plans could be built and loaded as plugins.

https://godoc.org/plugin

EDIT: after a conversation with @Stebalien, Go plugins are difficult and unergonomic to use. Instead, we're detaching test plans by introducing a "test plan manifest" -- which is a descriptor of the test plan, the supported build strategies, run strategies, and its test cases.

Example:

name = "dht"
# hashicorp/go-getter URLs, so in the future we can support fetching test plans
# from GitHub.
source_path = "file:${TESTGROUND_REPO}/plans/dht"

[[ build_strategies ]]
type = "executable:go"
go_version = "1.13"
module_path = "github.com/ipfs/testground/plans/dht"
exec_pkg = "exec"

[[ build_strategies ]]
type = "docker:go"
go_version = "1.13"
module_path = "github.com/ipfs/testground/plans/dht"
exec_pkg = "exec"

[[ run_strategies ]]
type = "local:binary"

[[ run_strategies ]]
type = "local:docker"

[[ run_strategies ]]
type = "distributed:nomad"

[[testcases]]   # seq 0
name = "lookup-peers"
instances = { min = 2, max = 100, default = 50 }

  [testcases.params.bucket_size]
  type = "int"
  desc = "bucket size"
  unit = "peers"

[[testcases]]   # seq 1
name = "lookup-providers"
instances = { min = 2, max = 100, default = 50 }

  [testcases.params.bucket_size]
  type = "int"
  desc = "bucket size"
  unit = "peers"

[[testcases]]   # seq 2
name = "store-get-value"
instances = { min = 2, max = 100, default = 50 }
roles = ["storer", "fetcher"]

  [testcases.params.bucket_size]
  type = "int"
  desc = "bucket size"
  unit = "peers"

Content-based deterministic canonical build ID

Right now the canonical build ID is computed from the build parameters and upstream dependencies. We need to walk the source tree of the test plan and digest it to add a content-addressed component. Otherwise, we can get cache hits even if the source of the test plan has changed.
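
A sketch of that content-addressed component, assuming the digest is a plain SHA-256 over every file's path and contents under the plan's source directory:

package builder

import (
    "crypto/sha256"
    "encoding/hex"
    "io"
    "os"
    "path/filepath"
)

// digestSourceTree walks the plan's source directory in lexical order and
// hashes each file's path and contents, producing a digest that changes
// whenever the source of the test plan changes.
func digestSourceTree(root string) (string, error) {
    h := sha256.New()
    err := filepath.Walk(root, func(path string, info os.FileInfo, err error) error {
        if err != nil || info.IsDir() {
            return err
        }
        f, err := os.Open(path)
        if err != nil {
            return err
        }
        defer f.Close()
        io.WriteString(h, path) // include the path so renames change the digest
        _, err = io.Copy(h, f)
        return err
    })
    if err != nil {
        return "", err
    }
    return hex.EncodeToString(h.Sum(nil)), nil
}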
