twitter / rpc-perf
A tool for benchmarking RPC services
License: Apache License 2.0
Given that size/num are now set for 4 different parameters, I wonder if it makes sense to reconsider how workload is specified. Explicitly: does it make sense to create a separate section about the dataset (key space, key generation scheme, value size, etc.) and the load (rate, and arrival pattern in the future)? @brayniac what do you think?
Originally posted by @thinkingfish in #167
Currently the log level is fixed. This should be turned into an option.
src/buffer.rs:27:5: 32:6 warning: you should consider adding a `Default` implementation for `buffer::Buffer`, #[warn(new_without_default)] on by default
src/buffer.rs:27 pub fn new() -> Buffer {
src/buffer.rs:28 let mut buffer = Vec::<u8>::new();
src/buffer.rs:29 buffer.resize(4, 0);
src/buffer.rs:30
src/buffer.rs:31 Buffer { buffer: buffer }
src/buffer.rs:32 }
src/buffer.rs:27:5: 32:6 help: for further information visit https://github.com/Manishearth/rust-clippy/wiki#new_without_default
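The clippy-suggested fix is a Default implementation that defers to new(). A minimal sketch, reproducing just enough of Buffer to be self-contained (the struct here is a stand-in for the one in src/buffer.rs):

```rust
// Stand-in for rpc-perf's Buffer, trimmed to the relevant parts.
pub struct Buffer {
    buffer: Vec<u8>,
}

impl Buffer {
    pub fn new() -> Buffer {
        let mut buffer = Vec::<u8>::new();
        buffer.resize(4, 0);
        Buffer { buffer }
    }
}

// The addition clippy asks for: Default simply delegates to new().
impl Default for Buffer {
    fn default() -> Buffer {
        Buffer::new()
    }
}

fn main() {
    let b = Buffer::default();
    assert_eq!(b.buffer.len(), 4);
    println!("ok");
}
```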
CI builds are reporting failures for 273.3 due to the following error:
$ cargo fmt -- --write-mode=diff
This version of rustfmt is deprecated. Use rustfmt-nightly. Use --force to run deprecated rustfmt.
usage: rustfmt [options] <file>...
Options:
-h, --help show this message
-V, --version show version information
-v, --verbose print verbose output
-f, --force force rustfmt to actually run
--write-mode [replace|overwrite|display|diff|coverage|checkstyle]
mode to write in (not usable when piping from stdin)
--skip-children
don't reformat child modules
--config-help show details of rustfmt configuration options
--dump-default-config PATH
Dumps the default configuration to a file and exits.
--dump-minimal-config PATH
Dumps configuration options that were checked during
formatting to a file.
--config-path [Path for the configuration file]
Recursively searches the given path for the
rustfmt.toml config file. If not found reverts to the
input file path
--file-lines JSON
Format specified line ranges. See README for more
detail on the JSON format.
The command "cargo fmt -- --write-mode=diff" exited with 1.
cache.2
store build cache
0.01s
9.76s changes detected, packing new archive
.
uploading archive
Done. Your build exited with 1.
On client systems that have multiple NICs/IP addresses, it would be useful to be able to specify which local interface (either ip address or interface name, e.g. eth0) to use for the connection to the server.
An example...I have a 10Gb NIC and a 25Gb NIC on my client system. It is not clear which will be used by rpc-perf.
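A hypothetical config extension for this might look like the sketch below; note that neither key exists in rpc-perf today, both names are assumptions for illustration:

```toml
[general]
threads = 3
connections = 50
# Hypothetical: pin outbound connections to a specific local
# address or interface name.
local-address = "10.0.0.12"
# interface = "eth0"
```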
Allow users to test their Thrift services with rpc-perf. This requires having a way for rpc-perf to understand the Thrift IDL potentially through a TOML config or parsing the IDL directly. Consideration needs to be given to how to generate the actual requests in the workload lib.
Currently the details of each protocol leak into many areas that shouldn't really care such as connection management and workload buffer generation. An example of this tight coupling leading to problems when generating a workload can be found here: each protocol needs to be concerned with types that it probably has no notion of.
I propose changing the structure. A protocol needs to do these things:
This can be balled up into a trait with a relatively simple interface:

trait Protocol {
    fn generate_request(&mut self) -> Vec<u8>;
    fn parse_response(&mut self, bytes: &[u8]) -> ParsedResponse;
}

mod thrift {
    // Could also give a concrete type and box later...
    fn parse_config(config: Toml) -> Box<Protocol>;
}
From there, each module will implement its own protocol specific parsing, types, etc. This moves the coupling of the protocols to only a single point: the initialization of the parsing when a protocol must be selected.
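As a concrete sketch, here is a toy implementation of the proposed trait for a hypothetical line-based "echo" protocol. Everything beyond the trait shape itself (the EchoProtocol type, the ParsedResponse variants) is illustrative, not rpc-perf code:

```rust
// Minimal response classification; the real enum would carry more detail.
#[derive(Debug, PartialEq)]
enum ParsedResponse {
    Ok,
    Incomplete,
}

// The proposed boundary: connection management only sees bytes in and out.
trait Protocol {
    fn generate_request(&mut self) -> Vec<u8>;
    fn parse_response(&mut self, bytes: &[u8]) -> ParsedResponse;
}

// A toy protocol that sends "echo N\r\n" with an incrementing counter.
struct EchoProtocol {
    counter: u64,
}

impl Protocol for EchoProtocol {
    fn generate_request(&mut self) -> Vec<u8> {
        self.counter += 1;
        format!("echo {}\r\n", self.counter).into_bytes()
    }

    fn parse_response(&mut self, bytes: &[u8]) -> ParsedResponse {
        // A response is complete once the line terminator arrives.
        if bytes.ends_with(b"\r\n") {
            ParsedResponse::Ok
        } else {
            ParsedResponse::Incomplete
        }
    }
}

fn main() {
    let mut p = EchoProtocol { counter: 0 };
    assert_eq!(p.generate_request(), b"echo 1\r\n".to_vec());
    assert_eq!(p.parse_response(b"echo 1\r\n"), ParsedResponse::Ok);
    assert_eq!(p.parse_response(b"echo 1"), ParsedResponse::Incomplete);
    println!("ok");
}
```

The caller only ever holds a Box<Protocol>, so connection management and buffer generation never see protocol-specific types.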
Benefits include:
swaisbrot@tw-mbp-swaisbrot:~/workspace/rpc-perf% multirust run nightly cargo build --release
Compiling histogram v0.3.2
Compiling bitflags v0.1.1
Compiling winapi-build v0.1.1
Compiling mpmc v0.1.0 (https://github.com/brayniac/mpmc#0cc33c9d)
Compiling winapi v0.2.4
Compiling regex-syntax v0.2.2
Compiling rustc-serialize v0.3.16
Compiling libc v0.1.12
Compiling getopts v0.2.14
Compiling shuteye v0.0.5
Compiling lazy_static v0.1.15
Compiling clock_ticks v0.0.5
Compiling winapi v0.1.23
Compiling crc v1.2.0
Compiling log v0.3.3
Compiling advapi32-sys v0.1.2
Compiling nix v0.3.9
Compiling kernel32-sys v0.2.1
Compiling memchr v0.1.6
Compiling bytes v0.2.11 (https://github.com/carllerche/bytes?branch=v0.2.x#d9eac250)
Compiling request v0.1.0 (file:///Users/swaisbrot/workspace/rpc-perf)
Compiling aho-corasick v0.3.4
Compiling parser v0.1.0 (file:///Users/swaisbrot/workspace/rpc-perf)
Compiling libc v0.2.2
Compiling slab v0.1.2
Compiling time v0.1.34
Compiling rand v0.3.11
Compiling mio v0.4.3 (file:///Users/swaisbrot/workspace/rpc-perf)
Compiling simple_logger v0.4.0
Compiling ratelimit v0.1.8
Compiling workload v0.1.0 (file:///Users/swaisbrot/workspace/rpc-perf)
Compiling regex v0.1.41
Compiling toml v0.1.23
Compiling rpc-perf v0.1.0 (file:///Users/swaisbrot/workspace/rpc-perf)
Assertion failed: (Ty == resolve(Ty->getRef()) && "type was not uniqued, possible ODR violation."), function getOrCreateTypeDIE, file /Users/rustbuild/src/rust-buildbot/slave/nightly-dist-rustc-mac/build/src/llvm/lib/CodeGen/AsmPrinter/DwarfUnit.cpp, line 713.
Could not compile `rpc-perf`.
swaisbrot@tw-mbp-swaisbrot:~/workspace/rpc-perf% multirust run nightly rustc --version
rustc 1.7.0-nightly (b4707ebca 2015-12-27)
We have found that the total heap being actively used in cache may have a significant impact on cache performance. For testing performance with different cache heap sizes, it would be helpful to make it so that the total data set size being queried by rpc-perf clients is easily configurable.
Typical clients provide the ability to set a timeout on a request; by adding timeout support to rpc-perf we can more closely emulate client behavior for slow requests by abandoning them.
I propose the timeout is implemented at a global level (e.g. a single timeout setting applies to all requests). Until we have pipelining support, the behavior should be to close and re-establish the connection when a timeout occurs.
The install instructions currently recommend 'cargo install', but this version lags behind master and may contain bugs that have since been fixed. Revert the readme to recommend building from master.
Let's add integration tests. Initial thought is to have a mock server for each protocol. Open to other ideas on this.
Include information on multirust, or a link to the official rust install documentation. Let's also specify which rust versions are targeted.
With my current set up, it seems like the open connections stat (calculated as socket_create_count - socket_close_count) is only reflecting the open connections for a single thread. For example, with 3 threads and 50 connections per thread, I am seeing in the rpc perf logs:
2018-08-01 21:07:36.410 INFO [rpc-perf] Config: Threads: 3 Poolsize: 50
...
2018-08-01 21:09:37.449 INFO [rpc-perf] Window: 1
2018-08-01 21:09:37.449 INFO [rpc-perf] Connections: Ok: 0 Error: 0 Timeout: 0 Open: 50
I am seeing on my server that all of the expected connections are there. So it would appear that rpc perf is creating all of the connections it is configured with, but the stat is counting them incorrectly. For reference, here is the config file I am using:
[general]
threads = 3
connections = 50
windows = 60
duration = 60
request-timeout = 200
connect-timeout = 500
[[workload]]
name = "get_counter"
method = "get"
rate = 800
[[workload.parameter]]
style = "random"
size = 25
num = 100000000
regenerate = true
[[workload]]
name = "get_tweet"
method = "get"
rate = 800
[[workload.parameter]]
style = "random"
size = 24
num = 40000000
regenerate = true
[[workload]]
name = "set_counter"
method = "set"
rate = 100
[[workload.parameter]]
style = "random"
size = 25
num = 100000000
regenerate = true
[[workload.parameter]]
style = "random"
size = 8
num = 10
regenerate = false
[[workload]]
name = "set_tweet"
method = "set"
rate = 100
[[workload.parameter]]
style = "random"
size = 24
num = 40000000
regenerate = true
[[workload.parameter]]
style = "random"
size = 1024
num = 10
regenerate = false
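One way to make the open-connections stat reflect the whole process rather than a single thread is a pair of counters shared across all worker threads. A minimal sketch, with names that are illustrative rather than rpc-perf's actual internals:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

// Process-wide counters shared by all worker threads, so that
// open = create - close covers every thread, not just one.
fn open_connections(threads: usize, per_thread: usize) -> usize {
    let created = Arc::new(AtomicUsize::new(0));
    let closed = Arc::new(AtomicUsize::new(0));

    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let created = Arc::clone(&created);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    // Each successful connect bumps the shared counter.
                    created.fetch_add(1, Ordering::Relaxed);
                }
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    created.load(Ordering::Relaxed) - closed.load(Ordering::Relaxed)
}

fn main() {
    // 3 threads x 50 connections should report 150 open, not 50.
    assert_eq!(open_connections(3, 50), 150);
    println!("ok");
}
```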
It is impossible to use the memcache "version" command from rpc-perf
Config used:
[general]
threads = 1
connections = 1
duration = 1
windows = 300
protocol = "memcache"
tcp-nodelay = false
ipv4 = true
ipv6 = false
database = 0
[[workload]]
name = "version"
method = "version"
rate = 1
Expected result: rpc-perf runs and executes the version command against memcache at a rate of 1 qps
Actual result:
2018-02-06 20:45:20.794 INFO [rpc-perf] rpc-perf 2.1.0-pre initializing...
2018-02-06 20:45:20.794 INFO [rpc-perf] feature: asm: disabled
2018-02-06 20:45:20.794 ERROR [rpc-perf] malformed config: 'parameter' must be an array
Currently we use AtomicUsize within the datastructures crate because that was the only available atomic at the time. Rust 1.34 is due to stabilize atomics with different widths. Commit: rust-lang/rust@14b36fb
If we can make the datastructures generic across the different atomic types, users can better tailor the memory consumption of the datastructures by choosing to use smaller atomics, e.g. a histogram of AtomicU32 would be much smaller on x86_64 than one using AtomicUsize.
This is blocked until 1.34 is released as stable.
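One way to get that genericity is a small counter trait implemented for each atomic width; the structure then takes the atomic type as a parameter. This is a sketch of the idea, not the datastructures crate's actual API:

```rust
use std::sync::atomic::{AtomicU32, AtomicU64, Ordering};

// Abstracts over atomic widths so a structure can pick its memory footprint.
trait Counter: Default {
    fn add(&self, value: u64);
    fn get(&self) -> u64;
}

impl Counter for AtomicU32 {
    fn add(&self, value: u64) {
        self.fetch_add(value as u32, Ordering::Relaxed);
    }
    fn get(&self) -> u64 {
        self.load(Ordering::Relaxed) as u64
    }
}

impl Counter for AtomicU64 {
    fn add(&self, value: u64) {
        self.fetch_add(value, Ordering::Relaxed);
    }
    fn get(&self) -> u64 {
        self.load(Ordering::Relaxed)
    }
}

// A toy histogram generic over the counter width.
struct Histogram<C: Counter> {
    buckets: Vec<C>,
}

impl<C: Counter> Histogram<C> {
    fn new(n: usize) -> Self {
        Histogram { buckets: (0..n).map(|_| C::default()).collect() }
    }
    fn increment(&self, bucket: usize) {
        self.buckets[bucket].add(1);
    }
    fn count(&self, bucket: usize) -> u64 {
        self.buckets[bucket].get()
    }
}

fn main() {
    // AtomicU32 buckets take half the space of AtomicU64 on x86_64.
    let h: Histogram<AtomicU32> = Histogram::new(8);
    h.increment(3);
    h.increment(3);
    assert_eq!(h.count(3), 2);
    println!("ok");
}
```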
client.rs fails to reset t0 on reconnect. This may result in misreported connect latencies under some circumstances.
As noted in #61 we currently leak protocol details, such as the need for a "stop" at the end of a struct, to the user through the config. If we make the Type and ThriftType recursive, we may be able to represent the structure more accurately in the config.
Our cargo build profiles are slightly dated and explicitly capture defaults - for instance, the debug profile we define is just the defaults. Reduce the config to the minimum that is necessary for the behaviors we want.
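As a sketch, the trimmed profile section would keep only deliberate deviations from cargo's defaults; the settings shown here are illustrative, not rpc-perf's actual choices:

```toml
# Keep only non-default settings; everything else inherits cargo defaults.
[profile.release]
lto = true
codegen-units = 1
```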
The current warmup behavior is to spin endlessly on the warmup workload(s) until the target hit rate is attained. However, there is the possibility that the target hit rate will never be reached; we should detect this condition by monitoring the hit rate to see if it levels off over some number of windows.
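A plateau check could be as simple as asking whether the last N window measurements stay within a small epsilon of each other. A minimal sketch; the window count and epsilon are illustrative assumptions, not rpc-perf settings:

```rust
// Returns true when the last `n` per-window hit rates span less than
// `epsilon`, i.e. the hit rate has leveled off and warmup should stop.
fn hit_rate_plateaued(windows: &[f64], n: usize, epsilon: f64) -> bool {
    if windows.len() < n {
        return false;
    }
    let recent = &windows[windows.len() - n..];
    let max = recent.iter().cloned().fold(f64::MIN, f64::max);
    let min = recent.iter().cloned().fold(f64::MAX, f64::min);
    max - min < epsilon
}

fn main() {
    // Hit rate climbs, then levels off below the target.
    let windows = [0.50, 0.70, 0.80, 0.84, 0.85, 0.85, 0.85];
    assert!(hit_rate_plateaued(&windows, 3, 0.01));
    // Still climbing early on: no plateau yet.
    assert!(!hit_rate_plateaued(&windows[..3], 3, 0.01));
    println!("plateau detected");
}
```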
Two issues here. We don't report the rpc-perf version early enough in the output. This results in it not being reported if there is an error parsing options or config. We should also add a --version flag for completeness.
Upon quick sifting I was a bit confused in the usage section. After re-reading more slowly, my interpretation is:
From chat, I believe this was correctly interpreted, though turns out there's some stuff that's only available at the cmdline.
In any case, here's a suggested phrasing that I think would have made me not re-read:
rpc-perf can be configured through a combination of a configuration file (in TOML format), and command line arguments. All settings are available in the configuration file, and most are available as command line arguments. Where a setting is specified in both places, the command line takes precedence.
The settings that are only available in the configuration file pertain to XXX, and aren't available at the command line because of YYY.
The general intent is to define profiles, or default configurations, in configuration files and use command line arguments for quick ad-hoc iteration.
rpc-perf/src/config/mod.rs, line 160 (commit fbe2748)
This can overflow when a keyspace config has a value for length greater than 19. It then gives a cryptic error that is hard to debug. I think someone would want to be able to create a workload that has keys larger than 19 bytes.
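The 19 limit is consistent with computing the keyspace size as a power in a u64: 10^19 still fits (u64::MAX is roughly 1.8e19), but 10^20 does not. Assuming that is the computation, checked arithmetic would turn the overflow into a reportable error instead of a cryptic panic:

```rust
// Sketch: assuming the keyspace size is computed as 10^length in a u64,
// checked_pow surfaces the overflow as None instead of panicking.
fn keyspace_size(length: u32) -> Option<u64> {
    10u64.checked_pow(length)
}

fn main() {
    assert!(keyspace_size(19).is_some());
    assert_eq!(keyspace_size(20), None); // would overflow u64
    println!("ok");
}
```

The None case could then be reported as "keyspace length too large for a 64-bit count" rather than a generic overflow.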
While testing with the thrift protocol, we discovered that rpc-perf does not respect the rate parameter specified in the config file.
#61 left trace logging enabled
src/lib.rs:150:16: 153:34 warning: called `map(f).unwrap_or(a)` on an Option value. This can be done more directly by calling `map_or(a, f)` instead, #[warn(option_map_unwrap_or)] on by default
src/lib.rs:150 let seed = parameter.get("seed")
src/lib.rs:151 .and_then(|k| k.as_integer())
src/lib.rs:152 .map(|i| i as usize)
src/lib.rs:153 .unwrap_or(index);
src/lib.rs:150:16: 153:34 note: replace `map(|i| i as usize).unwrap_or(index)` with `map_or(index, |i| i as usize)`
src/lib.rs:150:16: 153:34 help: for further information visit https://github.com/Manishearth/rust-clippy/wiki#option_map_unwrap_or
src/lib.rs:155:16: 158:30 warning: called `map(f).unwrap_or(a)` on an Option value. This can be done more directly by calling `map_or(a, f)` instead, #[warn(option_map_unwrap_or)] on by default
src/lib.rs:155 let size = parameter.get("size")
src/lib.rs:156 .and_then(|k| k.as_integer())
src/lib.rs:157 .map(|i| i as usize)
src/lib.rs:158 .unwrap_or(1);
src/lib.rs:155:16: 158:30 note: replace `map(|i| i as usize).unwrap_or(1)` with `map_or(1, |i| i as usize)`
src/lib.rs:155:16: 158:30 help: for further information visit https://github.com/Manishearth/rust-clippy/wiki#option_map_unwrap_or
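The fix clippy suggests is mechanical: map_or collapses the map().unwrap_or() chain into one call. A self-contained sketch (the helper name and plain Option stand in for the toml lookup in src/lib.rs):

```rust
// Before: param.map(|i| i as usize).unwrap_or(index)
// After:  param.map_or(index, |i| i as usize)
fn seed_from(param: Option<i64>, index: usize) -> usize {
    param.map_or(index, |i| i as usize)
}

fn main() {
    assert_eq!(seed_from(Some(42), 7), 42); // configured seed wins
    assert_eq!(seed_from(None, 7), 7);      // falls back to the index
    println!("ok");
}
```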
Currently, delete_channel() optimistically takes a lock and removes the channel even if it does not exist. This causes no correctness issues, but takes the lock unnecessarily, and forces checks down into the code using the library if taking the lock is to be avoided. https://github.com/twitter/rpc-perf/blob/master/metrics/src/metrics/mod.rs#L98
delete_channel() should check whether the channel exists and avoid taking action if it does not.
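One way to do this is a cheap read-locked existence check before the exclusive lock. A sketch with simplified stand-in types, not the metrics library's actual internals:

```rust
use std::collections::HashMap;
use std::sync::RwLock;

// Simplified stand-in for the metrics registry.
struct Metrics {
    channels: RwLock<HashMap<String, u64>>,
}

impl Metrics {
    fn delete_channel(&self, name: &str) {
        // Read-locked existence check: skip the write lock entirely
        // when the channel is absent.
        if !self.channels.read().unwrap().contains_key(name) {
            return;
        }
        self.channels.write().unwrap().remove(name);
    }
}

fn main() {
    let m = Metrics { channels: RwLock::new(HashMap::new()) };
    m.channels.write().unwrap().insert("latency".to_string(), 0);
    m.delete_channel("latency");
    m.delete_channel("missing"); // early return, no write lock taken
    assert!(m.channels.read().unwrap().is_empty());
    println!("ok");
}
```

Note the check-then-remove is two separate lock acquisitions, so a concurrent insert between them is still removed or kept consistently; the optimization only avoids the write lock in the common absent case.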
Something I've run into hacking on #64 is that the workload::Value enum and request::thrift::ThriftValue enum are nearly identical. This results in tree translation here. By using the same tree structure the copy could be avoided. I'm suggesting making Parameter and Value the 'tree to rule them all', which can be directly used by things like thrift for generating a buffer. IMO this makes sense because all the information harvested from the config parsing should be available in this tree, rendering everything else strictly a subset.
A complication is that the build is split out in such a way that rpcperf_workload depends on rpcperf_request, but the tree is defined in rpcperf_workload, and this results in a cyclic dependency when trying to expose the tree to both modules. Two possible solutions are to make yet another module just for that tree, or to merge the rpcperf_workload and rpcperf_request modules. The workload module is a single file, so I think merging them is the easiest thing to do.
On the other hand, maybe this is not a good idea altogether for reasons I've not yet contemplated.
Readme references using multirust for rust install, which is now deprecated for rustup. Update instructions in the readme.
A few engineers on our team reported that, without specifying the request connect timeout setting, reconnect on timeout falls back to the old behavior without exponential back-off. We should make exponential back-off the default, since it is the most sensible reconnect policy.
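A minimal capped exponential back-off sketch; the base and cap values are illustrative assumptions, not rpc-perf defaults, and a real implementation would also add jitter to avoid synchronized reconnect storms:

```rust
// Capped exponential back-off: delay doubles per attempt up to max_ms.
fn backoff_ms(attempt: u32, base_ms: u64, max_ms: u64) -> u64 {
    // Clamp the shift so large attempt counts cannot overflow.
    let exp = base_ms.saturating_mul(1u64 << attempt.min(16));
    exp.min(max_ms)
}

fn main() {
    assert_eq!(backoff_ms(0, 100, 10_000), 100);     // first retry
    assert_eq!(backoff_ms(1, 100, 10_000), 200);     // doubles
    assert_eq!(backoff_ms(10, 100, 10_000), 10_000); // capped
    println!("ok");
}
```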
It currently keeps trying to run for a long time.
Add a contributors section to the readme
rename nodelay to tcp-nodelay
The command parameter has been renamed method, but the config syntax is not updated. Fix config.rs and main.rs to bring clarity to the choice in naming here.
As noted in #61, the write_* methods could be cleaned up by using the WriteBytesExt trait from byteorder. http://burntsushi.net/rustdoc/byteorder/trait.WriteBytesExt.html#method.write_u16
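For illustration, here is what such a write produces, sketched with std's to_be_bytes to stay dependency-free (with byteorder the equivalent call would be buf.write_u16::<BigEndian>(1024)):

```rust
// Writes a u16 in network (big-endian) byte order, as the thrift and
// memcache binary framing expects.
fn write_u16_be(buf: &mut Vec<u8>, v: u16) {
    buf.extend_from_slice(&v.to_be_bytes());
}

fn main() {
    let mut buf = Vec::new();
    write_u16_be(&mut buf, 1024); // 0x0400 -> [0x04, 0x00]
    assert_eq!(buf, vec![0x04, 0x00]);
    println!("ok");
}
```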
Include a section on typical usage and practices to consider when benchmarking.
Extend rpc-perf to generate random key workloads
connect is a fairly expensive operation for the cache server; at large volume it can easily overwhelm the server thread accepting new connections temporarily, leading to test instabilities. Since it's currently impossible to coordinate different rpc-perf instances at a fine granularity, I think we should introduce a few features into the connect behavior to reduce the chance of "reconnect storms":
Document how to contribute to rpc-perf
mux session layer support would be pretty nice. It is also a pretty complex topic considering it adds a session layer to the protocol stack.
I've started some work on codec support for the mux protocol. It can be found here:
https://github.com/bryce-anderson/rust-mux
Currently it uses a custom buffer type which needs to either be expanded upon or replaced. The Buf type in the bytes crate is pretty close to what I want, but not quite, and the project doesn't seem very active.
Why did you include mio source into the rpc-perf repo instead of using a version from crates.io or linking directly to its git repo? Just curious :)
There's value in exposing statistics for rpc-perf itself so we can observe details of the runtime behavior. Some of the things to expose would mirror the output (rate, percentile latencies, etc), but we could also expose things like work queue overflow.
Update the code to make use of ? instead of the try!() macro.
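The rewrite is mechanical: try!(expr) and expr? are equivalent, and ? is the idiomatic form on modern Rust. A small before/after sketch with an illustrative function:

```rust
use std::num::ParseIntError;

// Before: let n = try!(s.parse::<i64>());
// After: the ? operator propagates the error identically.
fn parse_doubled(s: &str) -> Result<i64, ParseIntError> {
    let n = s.parse::<i64>()?;
    Ok(n * 2)
}

fn main() {
    assert_eq!(parse_doubled("21"), Ok(42));
    assert!(parse_doubled("x").is_err());
    println!("ok");
}
```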
The --endpoint CLI option is marked required, and the argument parsing does not account for the ability to pass endpoints via the config.
The nightly version loses the buffer contents when the response parses as incomplete
Avoid wildcard use statements
My current use case of rpc-perf with explicitly sized dataset involves first configuring rpc-perf to send write heavy traffic to warm up the cache, then restarting rpc-perf with the workload being tested. It would be nice if we had a "warm up" option that would systematically write all keys, then automatically transition to the configured workload.
Better error messaging when the key size is insufficient to contain num strings.
Add microbenchmarks to the codec crate so that we can do performance qualification of future changes and test for potential optimizations.
Problem: Currently, unknown keys in the config do not cause errors. This may lead to confusion for users as they may not realize their config is invalid or uses features not present in the binary they are using.
Solution: Validate the config does not contain any unknown fields. Error out if it does.
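A dependency-free sketch of the validation pass: compare the keys present in a parsed section against a whitelist and error on anything unknown (with a serde-based config, deny_unknown_fields would achieve the same). Key names here are illustrative:

```rust
use std::collections::HashSet;

// Rejects any config key not in the known set, so typos and settings
// from a newer binary fail loudly instead of being silently ignored.
fn validate_keys(present: &[&str], known: &[&str]) -> Result<(), String> {
    let known: HashSet<&str> = known.iter().cloned().collect();
    for key in present {
        if !known.contains(key) {
            return Err(format!("unknown config key: '{}'", key));
        }
    }
    Ok(())
}

fn main() {
    let known = ["threads", "connections", "windows", "duration"];
    assert!(validate_keys(&["threads", "windows"], &known).is_ok());
    // A typo is caught instead of silently ignored.
    assert_eq!(
        validate_keys(&["threads", "theads"], &known),
        Err("unknown config key: 'theads'".to_string())
    );
    println!("ok");
}
```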