Example: Prepare Testbeds
Overview
A testbed is a running, data-loaded DBMS deployment inside the Kubernetes cluster, ready for benchmarking. Preparing a testbed involves the following steps:
Start the SUT (System Under Test — the DBMS container).
Start monitoring.
Run the loading phase (generate and import data).
Run the benchmarking phase.
Collect and evaluate measurements.
Remove all ephemeral components from the cluster.
Normally you want all phases to run in one go. Bexhoma exposes three operating modes that let you stop the process at different points, which is useful for debugging, extending the implementation, or inspecting the DBMS state after loading.
Note: All examples below keep the DBMS running after the experiment finishes. The summary output includes a “Services” section with the kubectl port-forward command to connect to the running instance.
Operating Modes
Every benchmark script (ycsb.py, benchbase.py, hammerdb.py, tpch.py, tpcds.py) takes one of the same three modes as its positional argument:
| Mode | What happens |
|---|---|
| start | Start the SUT only. No data is loaded. Useful to inspect the fresh DBMS or run custom SQL by hand. |
| load | Start the SUT and run the data loading phase. Stops before benchmarking. Useful to pre-populate a volume and check import metrics. |
| run | Full experiment: start SUT, load data (or reuse an existing volume), run the benchmark, collect results. |
Data volumes are identified by DBMS, benchmark, and scale factor.
If a volume already exists and is marked as loaded, the load step is skipped automatically in run mode — the data is reused.
Environment Variables
The examples below assume three node-selector variables, a log directory, and the number of parallel experiment management processes (BEXHOMA_MS).
Adjust the node names to match your cluster, or drop -rnn/-rnl/-rnb entirely to let Kubernetes schedule freely.
BEXHOMA_NODE_SUT="cl-worker11"
BEXHOMA_NODE_LOAD="cl-worker19"
BEXHOMA_NODE_BENCHMARK="cl-worker19"
LOG_DIR="./logs_tests"
BEXHOMA_MS=1 # number of parallel experiment management processes
mkdir -p "$LOG_DIR"
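Since the node selectors are optional, a wrapper script can emit them only when the corresponding variable is set. The following is a minimal sketch, not part of Bexhoma: node_selector_flags is a hypothetical helper name, and it assumes node names contain no whitespace so that its output can be word-split into separate arguments.

```shell
# Emit -rnn/-rnl/-rnb only for the node variables that are set and non-empty.
# Hypothetical helper for illustration; not provided by Bexhoma.
node_selector_flags() {
  flags=""
  [ -n "${BEXHOMA_NODE_SUT:-}" ] && flags="$flags -rnn $BEXHOMA_NODE_SUT"
  [ -n "${BEXHOMA_NODE_LOAD:-}" ] && flags="$flags -rnl $BEXHOMA_NODE_LOAD"
  [ -n "${BEXHOMA_NODE_BENCHMARK:-}" ] && flags="$flags -rnb $BEXHOMA_NODE_BENCHMARK"
  # Strip the leading space before printing.
  echo "${flags# }"
}
```

The output can be spliced unquoted, as $(node_selector_flags), into a command in place of the explicit -rnn/-rnl/-rnb flags. With no variable set it prints an empty line, which leaves scheduling to Kubernetes.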
Reading the Summary Output
After each mode completes, Bexhoma prints a summary. The sections have consistent meaning across all benchmark types:
| Section | Present in | Description |
|---|---|---|
| Workload | all modes | Experiment metadata: type, code, duration, parameter choices, bexhoma version |
| Services | all modes | kubectl port-forward command for connecting to the running instance |
| Connections | all modes | Hardware details of the node hosting the SUT (RAM, CPU, disk, resource requests) |
| Loading | load, run | Import throughput metrics (rows/second, time, pod count) |
| Execution | run | Benchmark results: throughput, latency, error count |
| Workflow | | Planned vs. actual pod configuration; useful for verifying scale-out |
| Ingestion - SUT | load, run | CPU and RAM consumed by the DBMS during the loading phase |
| Ingestion - Loader | load, run | CPU and RAM consumed by the loader pods during the loading phase |
| Execution - SUT | run | CPU and RAM consumed by the DBMS during benchmarking |
| Execution - Benchmarker | run | CPU and RAM consumed by the driver pods during benchmarking |
| Tests | all modes | Automated sanity checks, e.g., no zero throughput, no NaN metrics, workflow matches plan |
A TEST failed line does not necessarily abort the experiment; it flags a condition worth investigating (e.g., a query error in TPC-DS Q90, or a monitoring gap).
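Because a failed test does not necessarily stop the run, it can help to scan the log afterwards. The sketch below counts such lines. count_failed_tests is a hypothetical helper, and it assumes that failed checks appear in the log with the literal text TEST failed.

```shell
# Count the lines containing "TEST failed" in a log file.
# Hypothetical helper for illustration; not provided by Bexhoma.
count_failed_tests() {
  # grep -c prints the number of matching lines. It exits with status 1
  # when that number is 0, so "|| true" keeps this usable under "set -e".
  grep -c "TEST failed" "$1" || true
}

# Usage:
#   count_failed_tests "$LOG_DIR/test_tpcds_run_postgresql.log"
```

A result of 0 means no check was flagged; any other value is a prompt to open the Tests section of that log.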
Inspecting Cluster State
You can list or remove all components belonging to a specific benchmark using its use-case label:
# List components
kubectl get all -l app=bexhoma,usecase=<label>
# Remove all components
kubectl delete all -l app=bexhoma,usecase=<label>
| Benchmark | Label |
|---|---|
| YCSB | |
| Benchbase TPC-C | |
| HammerDB TPC-C | |
| TPC-H | |
| TPC-DS | |
YCSB
YCSB is a key-value workload generator.
Configurations are numbered sequentially per DBMS — the first configuration is PostgreSQL-1, the second PostgreSQL-2, and so on.
Start DBMS
Starts PostgreSQL without loading any data. After this completes you can connect to the instance and run queries manually.
bexhoma ycsb \
-dbms PostgreSQL \
-xwl c \
-m \
-mc \
-ms $BEXHOMA_MS \
-tr \
-rnn $BEXHOMA_NODE_SUT \
start &>$LOG_DIR/test_ycsb_start_postgresql.log
The summary confirms the SUT is running and shows the port-forward command.
Start DBMS and Load Data
Starts PostgreSQL and imports YCSB data using 8 parallel loader pods with 64 threads each.
bexhoma ycsb \
-dbms PostgreSQL \
-xwl c \
-nlp 8 \
-nlt 64 \
-m \
-mc \
-ms $BEXHOMA_MS \
-tr \
-rnn $BEXHOMA_NODE_SUT -rnl $BEXHOMA_NODE_LOAD \
load &>$LOG_DIR/test_ycsb_load_postgresql.log
Key parameters:
-nlp 8: 8 parallel loader pods
-nlt 64: 64 threads per loader pod
The summary’s Loading table shows aggregate throughput across all loader pods. The Ingestion tables show how much CPU and RAM the SUT and loader consumed.
Start DBMS and Load Data and Run Workload
Full experiment: loads data and then runs YCSB workload C with 8 benchmarker pods.
bexhoma ycsb \
-dbms PostgreSQL \
-xwl c \
-nlp 8 \
-nlt 64 \
-nbp 8 \
-nbt 64 \
-m \
-mc \
-ms $BEXHOMA_MS \
-ss \
-tr \
-rnn $BEXHOMA_NODE_SUT -rnl $BEXHOMA_NODE_LOAD -rnb $BEXHOMA_NODE_BENCHMARK \
run &>$LOG_DIR/test_ycsb_run_postgresql.log
Key parameters:
-nbp 8: 8 parallel benchmarker pods
-nbt 64: 64 threads per benchmarker pod
-ss: use a single shared storage volume
The Execution table aggregates throughput across all benchmarker pods. The Workflow section confirms the actual pod count matched the plan.
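The three YCSB commands differ only in which flag groups they add on top of the common ones. As an illustration, the sketch below prints the flag set for each mode. ycsb_flags is a hypothetical helper, not a Bexhoma function, and it leaves out -ms and the node selectors for brevity.

```shell
# Print the YCSB flags for one mode, following the three commands above.
# Hypothetical helper for illustration; -ms and -rnn/-rnl/-rnb are omitted.
ycsb_flags() {
  mode="$1"
  flags="-dbms PostgreSQL -xwl c"
  case "$mode" in
    start) ;;                                                   # SUT only
    load)  flags="$flags -nlp 8 -nlt 64" ;;                     # add loader settings
    run)   flags="$flags -nlp 8 -nlt 64 -nbp 8 -nbt 64 -ss" ;;  # add benchmarker settings
    *)     echo "unknown mode: $mode" >&2; return 1 ;;
  esac
  echo "$flags -m -mc -tr $mode"
}

ycsb_flags load   # prints: -dbms PostgreSQL -xwl c -nlp 8 -nlt 64 -m -mc -tr load
```

Rejecting unknown modes up front avoids starting a deployment with a mistyped argument.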
Benchbase (TPC-C)
Benchbase runs TPC-C transactions.
Configurations are numbered sequentially per DBMS — PostgreSQL-1, PostgreSQL-2, etc.
Start DBMS
bexhoma benchbase \
-dbms PostgreSQL \
-m \
-mc \
-ms $BEXHOMA_MS \
-tr \
-rnn $BEXHOMA_NODE_SUT \
start &>$LOG_DIR/test_benchbase_start_postgresql.log
Start DBMS and Load Data
Imports TPC-C data (scale factor 1) using 8 parallel loader pods.
bexhoma benchbase \
-dbms PostgreSQL \
-nlp 8 \
-nlt 64 \
-m \
-mc \
-ms $BEXHOMA_MS \
-tr \
-rnn $BEXHOMA_NODE_SUT -rnl $BEXHOMA_NODE_LOAD \
load &>$LOG_DIR/test_benchbase_load_postgresql.log
The Loading table reports Throughput [SF/h] — scale factors loaded per hour.
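As a worked example of this unit: assuming the value is simply the scale factor divided by the loading time in hours (the exact formula Bexhoma applies is not shown here), loading scale factor 1 in 120 seconds corresponds to 30 SF/h. sf_per_hour is a hypothetical helper, and the numbers are made up for illustration.

```shell
# Scale factors loaded per hour, from a scale factor and a loading time in seconds.
# Hypothetical helper; assumes throughput = SF / (seconds / 3600).
sf_per_hour() {
  awk -v sf="$1" -v secs="$2" 'BEGIN {
    if (secs <= 0) exit 1          # reject zero or negative durations
    printf "%.2f\n", sf * 3600 / secs
  }'
}

sf_per_hour 1 120   # prints 30.00
```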
Start DBMS and Load Data and Run Workload
Loads data with 1 pod (Benchbase loads serially by design) and runs the workload with 8 benchmarker pods for 5 minutes.
bexhoma benchbase \
-dbms PostgreSQL \
-nlp 1 \
-nlt 64 \
-nbp 8 \
-nbt 64 \
-m \
-mc \
-ms $BEXHOMA_MS \
-ss \
-tr \
-rnn $BEXHOMA_NODE_SUT -rnl $BEXHOMA_NODE_LOAD -rnb $BEXHOMA_NODE_BENCHMARK \
run &>$LOG_DIR/test_benchbase_run_postgresql.log
The Execution table reports:
Throughput (requests/second) and Goodput (requests/second): total and successful transaction rate
Latency Distribution.95th Percentile Latency (microseconds): tail latency
efficiency: how close actual throughput came to the target
HammerDB (TPC-C)
HammerDB runs TPC-C using virtual users (vusers).
Configurations are numbered sequentially per DBMS — PostgreSQL-1, PostgreSQL-2, etc.
Start DBMS
bexhoma hammerdb \
-dbms PostgreSQL \
-m \
-mc \
-ms $BEXHOMA_MS \
-tr \
-rnn $BEXHOMA_NODE_SUT \
start &>$LOG_DIR/test_hammerdb_start_postgresql.log
Start DBMS and Load Data
Imports TPC-C data (1 warehouse = scale factor 1) using a single loader pod.
HammerDB’s loader is inherently single-threaded, so -nlp 1 -nlt 1 is typical.
bexhoma hammerdb \
-dbms PostgreSQL \
-nlp 1 \
-nlt 1 \
-m \
-mc \
-ms $BEXHOMA_MS \
-tr \
-rnn $BEXHOMA_NODE_SUT -rnl $BEXHOMA_NODE_LOAD \
load &>$LOG_DIR/test_hammerdb_load_postgresql.log
The Loading table reports Imported warehouses [1/h].
Start DBMS and Load Data and Run Workload
Loads data and runs TPC-C for 5 minutes with 64 virtual users.
bexhoma hammerdb \
-dbms PostgreSQL \
-nlp 1 \
-nlt 1 \
-nbp 1 \
-nbt 64 \
-m \
-mc \
-ms $BEXHOMA_MS \
-ss \
-tr \
-rnn $BEXHOMA_NODE_SUT -rnl $BEXHOMA_NODE_LOAD -rnb $BEXHOMA_NODE_BENCHMARK \
run &>$LOG_DIR/test_hammerdb_run_postgresql.log
The Execution table reports NOPM (New Orders Per Minute) and TPM (Transactions Per Minute), which are the standard HammerDB TPC-C metrics.
TPC-H
TPC-H is the standard analytical benchmark with 22 queries.
Data is generated by a loader pod using dbgen, then loaded into the SUT.
After loading, the DBMSBenchmarker tool runs the 22 queries and records per-query latency.
Start DBMS
bexhoma tpch \
-dbms PostgreSQL \
-m \
-mc \
-ms $BEXHOMA_MS \
-tr \
-rnn $BEXHOMA_NODE_SUT \
start &>$LOG_DIR/test_tpch_start_postgresql.log
Start DBMS and Load Data
Generates and loads TPC-H data at scale factor 1.
The flags -xii -xic -xis trigger index creation, constraint application, and statistics gathering after the raw data is ingested.
bexhoma tpch \
-dbms PostgreSQL \
-nlp 1 \
-nlt 1 \
-xii -xic -xis \
-m \
-mc \
-ms $BEXHOMA_MS \
-tr \
-rnn $BEXHOMA_NODE_SUT -rnl $BEXHOMA_NODE_LOAD \
load &>$LOG_DIR/test_tpch_load_postgresql.log
The Loading table breaks down time into phases:
| Column | Meaning |
|---|---|
| | Time to generate raw data files with dbgen |
| | Time to import data into the DBMS |
| | Time to apply schema (DDL) |
| | Time to create indexes and constraints |
| | Total loading time |
Start DBMS and Load Data and Run Workload
Loads TPC-H data and runs all 22 queries once.
bexhoma tpch \
-dbms PostgreSQL \
-nlp 1 \
-nlt 1 \
-nbp 1 \
-nbt 64 \
-xii -xic -xis \
-m \
-mc \
-ms $BEXHOMA_MS \
-ss \
-tr \
-rnn $BEXHOMA_NODE_SUT -rnl $BEXHOMA_NODE_LOAD -rnb $BEXHOMA_NODE_BENCHMARK \
run &>$LOG_DIR/test_tpch_run_postgresql.log
The Execution section reports per-query latency in the Latency of Timer Execution table and the TPC-H standard metrics:
| Metric | Meaning |
|---|---|
| | Geometric mean of per-query median runtimes |
| Power@Size | TPC-H Power metric: |
| Throughput@Size | TPC-H Throughput metric across all streams and queries |
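To illustrate the geometric mean of per-query median runtimes, the sketch below computes a geometric mean with awk. geomean is a hypothetical helper and the input values are made up. It takes the medians as given and does not reproduce how Bexhoma or DBMSBenchmarker derive them.

```shell
# Geometric mean of whitespace-separated positive numbers read from stdin,
# computed as exp(mean(log(x))) to avoid overflow from a long product.
# Hypothetical helper for illustration.
geomean() {
  awk '{ for (i = 1; i <= NF; i++) { sum += log($i); n++ } }
       END { if (n == 0) exit 1; printf "%.4f\n", exp(sum / n) }'
}

printf '1\n4\n16\n' | geomean   # prints 4.0000
```

Unlike the arithmetic mean (7 for the same input), the geometric mean is not dominated by a single slow query, which is why it suits a set of queries with very different runtimes.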
TPC-DS
TPC-DS is the decision-support benchmark with 99 queries.
Its structure mirrors TPC-H: dbgen2 generates data, a loader imports it, DBMSBenchmarker runs the queries.
Start DBMS
bexhoma tpcds \
-dbms PostgreSQL \
-m \
-mc \
-ms $BEXHOMA_MS \
-tr \
-rnn $BEXHOMA_NODE_SUT \
start &>$LOG_DIR/test_tpcds_start_postgresql.log
Start DBMS and Load Data
bexhoma tpcds \
-dbms PostgreSQL \
-nlp 1 \
-nlt 1 \
-xii -xic -xis \
-m \
-mc \
-ms $BEXHOMA_MS \
-tr \
-rnn $BEXHOMA_NODE_SUT -rnl $BEXHOMA_NODE_LOAD \
load &>$LOG_DIR/test_tpcds_load_postgresql.log
Start DBMS and Load Data and Run Workload
Loads TPC-DS data and runs all 99 queries. Some queries may fail on certain DBMS (e.g., Q90 raises a division-by-zero on PostgreSQL at SF=1); these are flagged in the Errors section and in Tests.
bexhoma tpcds \
-dbms PostgreSQL \
-nlp 1 \
-nlt 1 \
-nbp 1 \
-nbt 64 \
-xii -xic -xis \
-m \
-mc \
-ms $BEXHOMA_MS \
-ss \
-tr \
-rnn $BEXHOMA_NODE_SUT -rnl $BEXHOMA_NODE_LOAD -rnb $BEXHOMA_NODE_BENCHMARK \
run &>$LOG_DIR/test_tpcds_run_postgresql.log
The Execution section reports per-query latency for all 99 TPC-DS queries and the same Power@Size / Throughput@Size metrics as TPC-H.