Comments (1)
✨✨ Here's an AI-assisted sketch of how you might approach this issue saved by @simon-mo using Copilot Workspace v0.17
Topic
Can benchmark_throughput and benchmark_latency write their metrics to a JSON file?
Before
- No. Currently, benchmark_throughput.py and benchmark_latency.py do not support writing their metrics directly to a JSON file. The scripts print their results to the console instead, as seen in their respective code in benchmarks/benchmark_throughput.py and benchmarks/benchmark_latency.py.
- The benchmark_serving.py script, however, does support saving results to a JSON file, as indicated by its use of JSON file writing in benchmarks/benchmark_serving.py.
After
- Yes. benchmark_throughput.py and benchmark_latency.py now support writing their metrics directly to a JSON file.
- The scripts include a new command-line argument, --output-json, to specify the output file path for the JSON results.
- The JSON output includes key metrics such as average latency, throughput, and detailed per-iteration metrics.
- The benchmark_serving.py script's existing JSON file writing functionality remains unchanged.
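As an illustration of the shape such a file might take for the latency benchmark (the field names here are assumptions for this sketch, not the scripts' confirmed schema):

```json
{
    "avg_latency": 1.0033,
    "latencies": [1.02, 0.98, 1.01]
}
```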
Plan
benchmarks/benchmark_latency.py (CHANGE)
- Add argument parsing for --output-json to specify the output JSON file path
- Implement JSON writing logic to output latency metrics to the specified JSON file
- Ensure existing console output remains unchanged
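A minimal sketch of what this change could look like. The --output-json flag name comes from the plan above; the metric values, field names, and helper structure are placeholders for illustration, not vLLM's actual code.

```python
import argparse
import json


def main(args: argparse.Namespace) -> None:
    # Stand-in per-iteration latencies; the real script measures these by
    # timing generation calls (the values here are placeholder data).
    latencies = [1.02, 0.98, 1.01]  # seconds
    avg_latency = sum(latencies) / len(latencies)

    # Existing console output stays unchanged.
    print(f"Avg latency: {avg_latency} seconds")

    # New: optionally dump the same metrics to a JSON file.
    if args.output_json:
        results = {
            "avg_latency": avg_latency,
            "latencies": latencies,
        }
        with open(args.output_json, "w") as f:
            json.dump(results, f, indent=4)


parser = argparse.ArgumentParser(description="Benchmark request latency.")
parser.add_argument(
    "--output-json",
    type=str,
    default=None,
    help="Path to save the latency results in JSON format.",
)

# Example invocation; a real CLI would call parser.parse_args() on sys.argv.
main(parser.parse_args(["--output-json", "latency_results.json"]))
```

Making the flag default to None keeps the JSON write strictly opt-in, which is what lets the existing console output remain unchanged.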
benchmarks/benchmark_throughput.py (CHANGE)
- Add argument parsing for --output-json to specify the output JSON file path
- Implement JSON writing logic to output throughput metrics to the specified JSON file
- Ensure existing console output remains unchanged
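The throughput script would follow the same pattern. Again, this is a sketch under assumptions: the workload, timing, and JSON field names are illustrative placeholders, with only the --output-json flag taken from the plan.

```python
import argparse
import json
import time


def run_benchmark() -> dict:
    # Placeholder workload standing in for the real generation loop.
    num_requests = 3
    total_num_tokens = 3 * 128
    start = time.perf_counter()
    time.sleep(0.01)  # stand-in for actual inference work
    elapsed = time.perf_counter() - start
    return {
        "elapsed_time": elapsed,
        "num_requests": num_requests,
        "total_num_tokens": total_num_tokens,
        "requests_per_second": num_requests / elapsed,
        "tokens_per_second": total_num_tokens / elapsed,
    }


def main(args: argparse.Namespace) -> None:
    results = run_benchmark()

    # Existing console output stays unchanged.
    print(
        f"Throughput: {results['requests_per_second']:.2f} requests/s, "
        f"{results['tokens_per_second']:.2f} tokens/s"
    )

    # New: optionally write the same metrics to a JSON file.
    if args.output_json:
        with open(args.output_json, "w") as f:
            json.dump(results, f, indent=4)


parser = argparse.ArgumentParser(description="Benchmark throughput.")
parser.add_argument(
    "--output-json",
    type=str,
    default=None,
    help="Path to save the throughput results in JSON format.",
)

# Example invocation; a real CLI would call parser.parse_args() on sys.argv.
main(parser.parse_args(["--output-json", "throughput_results.json"]))
```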
.buildkite/run-benchmarks.sh (CHANGE)
- Add --output-json arguments to the two benchmarks (throughput and latency) and ensure the resulting JSON files are uploaded as Buildkite artifacts
- For both benchmarks (throughput and latency), turn the invocations into multiline bash commands
- Do not change the benchmark serving output
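A rough sketch of the CI script change. The benchmark flags other than --output-json are omitted here, and the output file names are assumptions; buildkite-agent artifact upload is Buildkite's standard artifact-upload command.

```shell
set -ex

# Latency benchmark, as a multiline command writing JSON output.
python3 benchmarks/benchmark_latency.py \
    --output-json latency_results.json

# Throughput benchmark, likewise.
python3 benchmarks/benchmark_throughput.py \
    --output-json throughput_results.json

# Upload the JSON results as build artifacts.
# (benchmark_serving.py's output handling is left untouched.)
buildkite-agent artifact upload latency_results.json
buildkite-agent artifact upload throughput_results.json
```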