Concept: Cluster Configuration
Bexhoma reads all cluster and experiment settings from a file called cluster.config in the working directory.
The file is a Python dict literal (parsed with ast.literal_eval).
A fully-commented template is provided at k8s-cluster.config.
Copy it to cluster.config and adjust the sections below before running any experiment.
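As a rough illustration of what "a Python dict literal parsed with ast.literal_eval" means in practice, the sketch below reads such a file and returns the dict. The helper name load_cluster_config and the throwaway sample file are assumptions made for this example; they are not bexhoma's own loader.

```python
import ast
import tempfile
from pathlib import Path


def load_cluster_config(path):
    """Read a file holding one Python dict literal and return it as a dict.

    Hypothetical helper for illustration; bexhoma's internal loader may differ.
    """
    text = Path(path).read_text(encoding="utf-8")
    config = ast.literal_eval(text)  # evaluates literals only, runs no code
    if not isinstance(config, dict):
        raise ValueError(f"{path} must contain a dict literal")
    return config


# Throwaway file standing in for cluster.config; comments inside are allowed.
sample = """{
    'benchmarker': {
        'resultfolder': '/home/myself/benchmarks',
        'jarfolder': './jars/',  # relative path
    },
    'instances': {},
}"""

with tempfile.TemporaryDirectory() as tmp:
    config_path = Path(tmp) / "cluster.config"
    config_path.write_text(sample, encoding="utf-8")
    config = load_cluster_config(config_path)

print(config["benchmarker"]["jarfolder"])  # prints ./jars/
```

Because ast.literal_eval accepts only literals, a stray function call or variable reference in the file raises an error instead of being executed.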
Top-Level Structure
{
    'benchmarker': { ... }, # local paths used by the orchestrator
    'credentials': { ... }, # Kubernetes cluster access and monitoring config
    'volumes': { ... }, # benchmark data sources and init-script sets
    'instances': { ... }, # legacy (unused in Kubernetes mode)
    'dockers': { ... }, # DBMS configurations (see DBMS.md)
}
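A quick sanity check on a loaded configuration could confirm that the five top-level sections listed above are present. This is a minimal sketch; missing_sections is a hypothetical helper, and whether bexhoma itself requires every section to exist is an assumption.

```python
# Top-level sections named in the structure above.
EXPECTED_SECTIONS = ("benchmarker", "credentials", "volumes", "instances", "dockers")


def missing_sections(config):
    """Return the expected top-level sections that are absent from config."""
    return [name for name in EXPECTED_SECTIONS if name not in config]


# Example dict that lacks the 'dockers' section.
config = {"benchmarker": {}, "credentials": {}, "volumes": {}, "instances": {}}
print(missing_sections(config))  # prints ['dockers']
```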
benchmarker — Local Orchestrator Paths
'benchmarker': {
    'resultfolder': '/home/myself/benchmarks',
    'jarfolder': './jars/'
},
| Key | Description |
|---|---|
| resultfolder | Absolute path on the local machine where experiment results are written. Must exist and be writable. This directory is also mounted into the evaluator container so results are accessible from inside the cluster. |
| jarfolder | Path to the directory holding JDBC driver jars. The example above uses ./jars/. |
credentials — Kubernetes Access and Monitoring
Cluster Contexts
'credentials': {
    'k8s': {
        'appname': 'bexhoma',
        'context': {
            'my-context': {
                'namespace': 'my-namespace',
                'clustername': 'My Cluster',
                'service_sut': '{service}.{namespace}.svc.cluster.local',
                'port': 9091,
            },
        },
        'monitor': { ... }, # see below
    }
}
| Key | Description |
|---|---|
| appname | Label applied to all Kubernetes objects created by bexhoma. Do not change it: it is used for cleanup and status queries. |
| context | Dict of named Kubernetes contexts. The key must match a context name in your kubeconfig. |
| namespace | Kubernetes namespace in which bexhoma deploys all components. Ensure your kubeconfig user has access to this namespace. |
| clustername | Human-readable label shown in experiment summaries and reports. |
| service_sut | DNS name template for reaching the SUT (System Under Test) service from within the cluster. The placeholders {service} and {namespace} are filled in at runtime. |
| port | Port on the local machine used when forwarding traffic to the SUT. |
To use a second cluster simply add another entry under context:
'context': {
    'cluster-a': { 'namespace': 'benchmarks', ... },
    'cluster-b': { 'namespace': 'experiments', ... },
},
Pass -cx cluster-b on the command line to run an experiment on the second cluster.
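To make the relationship between a context entry and the service_sut template concrete, the following sketch picks a context by name and fills in the DNS template. The function resolve_sut_host and the service name bexhoma-sut-example are hypothetical; how bexhoma names its services and performs the substitution internally is not shown here.

```python
def resolve_sut_host(config, context_name, service):
    """Look up a context by name and fill its service_sut DNS template.

    Hypothetical helper for illustration only.
    """
    contexts = config["credentials"]["k8s"]["context"]
    if context_name not in contexts:
        raise KeyError(f"unknown context {context_name!r}; available: {sorted(contexts)}")
    ctx = contexts[context_name]
    return ctx["service_sut"].format(service=service, namespace=ctx["namespace"])


template = "{service}.{namespace}.svc.cluster.local"
config = {
    "credentials": {
        "k8s": {
            "context": {
                "cluster-a": {"namespace": "benchmarks", "service_sut": template},
                "cluster-b": {"namespace": "experiments", "service_sut": template},
            }
        }
    }
}

# 'bexhoma-sut-example' is a made-up service name.
host = resolve_sut_host(config, "cluster-b", "bexhoma-sut-example")
print(host)  # prints bexhoma-sut-example.experiments.svc.cluster.local
```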
monitor — Hardware and Application Metrics
The monitor block sits inside credentials.k8s and controls how Prometheus metrics are collected during experiments.
'monitor': {
    'service_monitoring': 'https://prometheus.mycluster.com/api/v1/',
    'service_monitoring_application': 'http://{service}.{namespace}.svc.cluster.local:9090/api/v1/',
    'extend': 20,
    'shift': 0,
    'metrics': {
        'total_cpu_memory': { ... },
        'total_cpu_util': { ... },
        'total_network_rx': { ... },
        ...
    },
    'postgresql': { 'metrics': { ... } },
    'mysql': { 'metrics': { ... } },
    'pgbouncer': { 'metrics': { ... } },
    'tidb': { 'metrics': { ... } },
    'tikv': { 'metrics': { ... } },
    'pd': { 'metrics': { ... } },
    'yb-master': { 'metrics': { ... } },
    'yb-tserver': { 'metrics': { ... } },
    ...
}
Prometheus endpoints
| Key | Description |
|---|---|
| service_monitoring | URL of the cluster-level Prometheus API. |
| service_monitoring_application | URL template for the per-experiment Prometheus that bexhoma installs itself (used for application-level metrics with -ma). |
Timing adjustments
| Key | Description |
|---|---|
| extend | Number of seconds added to both ends of each monitoring interval. |
| shift | Number of seconds to shift the entire interval forward. |
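The effect of the two timing settings can be sketched as simple arithmetic on an interval's start and end timestamps. The function adjust_interval is hypothetical, and applying both adjustments in a single step is an assumption about how they combine.

```python
def adjust_interval(start, end, extend=0, shift=0):
    """Widen an interval by `extend` seconds on each side, then move it
    forward by `shift` seconds. Timestamps are in seconds.

    Hypothetical helper; the combination of both settings is an assumption.
    """
    return start - extend + shift, end + extend + shift


# A 60-second phase with the example settings extend=20, shift=0.
print(adjust_interval(1000, 1060, extend=20, shift=0))  # prints (980, 1080)

# The same phase, additionally shifted forward by 5 seconds.
print(adjust_interval(1000, 1060, extend=20, shift=5))  # prints (985, 1085)
```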
Hardware metric definitions (metrics)
The metrics dict defines which cluster-level (cAdvisor / node-exporter) metrics to collect for every experiment phase.
Each entry has:
'total_cpu_util': {
    'type': 'cluster', # 'cluster' = hardware metric; 'application' = DBMS-specific
    'active': True, # False = skip this metric
    'metric': 'gauge', # 'gauge' (mean), 'counter' (max−min delta), or 'ratio' (max)
    'query': '<promql>', # PromQL; {configuration}, {experiment}, {host}, {gpuid} are substituted
    'title': 'CPU Utilization',
},
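The placeholder substitution in query can be pictured with plain string replacement. This is only a sketch: fill_query is a hypothetical helper, the PromQL template is an invented example, and the use of str.replace is an assumption (it is chosen here because str.format would also try to interpret PromQL's own label-matcher braces as fields).

```python
def fill_query(query, **values):
    """Replace each {name} placeholder with its value; leave other braces alone.

    Hypothetical helper; bexhoma's own substitution logic may differ.
    """
    for name, value in values.items():
        query = query.replace("{" + name + "}", str(value))
    return query


# Invented PromQL template, not taken from the default config.
template = (
    'sum(rate(container_cpu_usage_seconds_total'
    '{pod=~"(.*){configuration}-{experiment}(.*)"}[1m]))'
)

filled = fill_query(template, configuration="postgresql", experiment=1234)
print(filled)
```

Placeholders that are not passed in (for example {host} or {gpuid}) are left untouched by this sketch.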
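| Field | Values | Effect |
|---|---|---|
| type | cluster | Queried from service_monitoring |
| type | application | Queried from service_monitoring_application |
| active | True / False | Set to False to skip this metric |
| metric | gauge | Aggregated as mean over the interval |
| metric | counter | Aggregated as max − min (delta) over the interval |
| metric | ratio | Aggregated as max over the interval |
| query | PromQL string | Placeholders: {configuration}, {experiment}, {host}, {gpuid} |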
The default set of hardware metrics covers CPU utilization, CPU throttle, memory (working set and cached), network RX/TX, filesystem read/write, I/O wait, and per-core variance.
GPU metrics (DCGM) are present but disabled by default (active: False).
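The three aggregation modes named by the metric field reduce a series of samples to a single number. A minimal sketch follows, assuming the samples have already been fetched as a list of floats; aggregate is a hypothetical helper, not bexhoma's evaluation code.

```python
from statistics import mean


def aggregate(values, kind):
    """Reduce sampled values according to the 'metric' field of a definition.

    Hypothetical helper mirroring the gauge / counter / ratio semantics.
    """
    if not values:
        raise ValueError("no samples to aggregate")
    if kind == "gauge":
        return mean(values)               # average level over the interval
    if kind == "counter":
        return max(values) - min(values)  # growth of a counter over the interval
    if kind == "ratio":
        return max(values)                # peak value over the interval
    raise ValueError(f"unknown metric kind: {kind!r}")


samples = [10.0, 12.0, 17.0]
print(aggregate(samples, "gauge"))    # prints 13.0
print(aggregate(samples, "counter"))  # prints 7.0
print(aggregate(samples, "ratio"))    # prints 17.0
```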
Named application metric sets
In addition to the metrics dict, named sub-dicts define DBMS-specific application metrics that are scraped when application monitoring (-ma) is enabled for a matching DBMS.
Each DBMS configuration (in dockers) references one of these sets by name via its monitor.sut.metrics or monitor.worker.metrics field.
| Name | Used by |
|---|---|
| postgresql | PostgreSQL, PGBouncer (SUT component) |
| pgbouncer | PGBouncer (pool component) |
| mysql | MySQL |
| tidb | TiDB (SQL layer) |
| tikv | TiDB (TiKV storage) |
| pd | TiDB (Placement Driver) |
| yb-master | YugabyteDB (master nodes) |
| yb-tserver | YugabyteDB (tablet servers) |
| | CockroachDB (worker nodes) |
| | Dragonfly |
| | Redis |
Each named set follows the same structure as metrics above.
See Monitoring for details on enabling and interpreting application metrics.
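Looking up a named set and keeping only its active entries might look like the sketch below. The helper active_application_metrics and the metric name example_connections are hypothetical; only the nesting monitor -> set name -> metrics comes from the structure shown earlier.

```python
def active_application_metrics(monitor, set_name):
    """Return the active metric definitions of a named application metric set.

    Hypothetical helper; treats a missing 'active' key as True (an assumption).
    """
    if set_name not in monitor or "metrics" not in monitor[set_name]:
        raise KeyError(f"no application metric set named {set_name!r}")
    metrics = monitor[set_name]["metrics"]
    return {name: spec for name, spec in metrics.items() if spec.get("active", True)}


# Invented example content; real sets are defined in the config template.
monitor = {
    "metrics": {"total_cpu_util": {"type": "cluster", "active": True}},
    "postgresql": {
        "metrics": {
            "example_connections": {"type": "application", "active": True},
            "example_disabled": {"type": "application", "active": False},
        }
    },
}

selected = active_application_metrics(monitor, "postgresql")
print(sorted(selected))  # prints ['example_connections']
```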
volumes — Data Sources and Init Scripts
The volumes section maps each benchmark type to a set of named init-script sequences.
Experiment scripts (tpch.py, ycsb.py, etc.) reference these by the volume key and script-set name.
'volumes': {
    'tpch': {
        'id': '2',
        'initscripts': {
            'Schema': [
                'initschema-tpch.sql',
            ],
            'Index_and_Constraints': [
                'initindexes-tpch.sql',
                'initconstraints-tpch.sql',
            ],
            'Index_and_Constraints_and_Statistics': [
                'initindexes-tpch.sql',
                'initconstraints-tpch.sql',
                'initstatistics-tpch.sql',
            ],
        }
    },
    'tpcds': { 'id': '1', 'initscripts': { ... } },
    'tpcc': { 'id': '1', 'initscripts': { ... } },
    'ycsb': { 'id': '1', 'initscripts': { ... } },
    'benchbase': { 'id': '1', 'initscripts': { ... } },
    ...
}
Volume keys and id
| Field | Description |
|---|---|
| key (e.g., tpch) | Identifier referenced by experiment scripts to locate the correct init-script set |
| id | A numeric tag appended to the Kubernetes PVC name to allow multiple incompatible data formats for the same benchmark to coexist on the cluster (e.g., switching between columnar and row-store schemas without overwriting data). |
initscripts — named script sequences
Each entry under initscripts is a named list of files.
Experiment CLI flags (-xii, -xic, -xis) control which sets are executed and when:
| Flag | Typical script set | When it runs |
|---|---|---|
| (none) | Schema | Before data ingestion: creates the empty schema |
| -xii | | After data ingestion: creates indexes |
| -xic | Index_and_Constraints | After data ingestion: creates indexes and foreign key constraints |
| -xis | Index_and_Constraints_and_Statistics | After indexes/constraints: refreshes query planner statistics |
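Resolving a volume key and script-set name to the ordered list of files is a two-level dictionary lookup. The sketch below uses the tpch entry from the example above; scripts_for is a hypothetical helper, not part of bexhoma's API.

```python
def scripts_for(volumes, benchmark, script_set):
    """Return the files of one named init-script set, in execution order.

    Hypothetical helper for illustration only.
    """
    initscripts = volumes[benchmark]["initscripts"]
    if script_set not in initscripts:
        raise KeyError(
            f"{benchmark!r} has no script set {script_set!r}; "
            f"available: {sorted(initscripts)}"
        )
    return list(initscripts[script_set])  # copy so callers cannot mutate the config


volumes = {
    "tpch": {
        "id": "2",
        "initscripts": {
            "Schema": ["initschema-tpch.sql"],
            "Index_and_Constraints": [
                "initindexes-tpch.sql",
                "initconstraints-tpch.sql",
            ],
        },
    }
}

print(scripts_for(volumes, "tpch", "Index_and_Constraints"))
# prints ['initindexes-tpch.sql', 'initconstraints-tpch.sql']
```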
Scripts are executed in list order. The file suffix determines how they are executed:
| Suffix | Execution |
|---|---|
| .sql | Piped to the DBMS command-line client |
| .sh | Executed as a shell script inside the SUT container |
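Suffix-based dispatch can be sketched as a small classifier. The suffixes .sql and .sh are assumptions inferred from the script names and descriptions in this section, and execution_mode with its return labels is a hypothetical helper that only classifies files; it runs nothing.

```python
from pathlib import PurePosixPath


def execution_mode(filename):
    """Classify an init script by file suffix.

    Hypothetical helper; '.sql' and '.sh' are assumed suffixes.
    """
    suffix = PurePosixPath(filename).suffix.lower()
    if suffix == ".sql":
        return "dbms-client"  # piped to the DBMS command-line client
    if suffix == ".sh":
        return "shell"        # run as a shell script inside the SUT container
    raise ValueError(f"unsupported init script suffix: {filename!r}")


# 'prepare-example.sh' is a made-up file name.
scripts = ["initindexes-tpch.sql", "prepare-example.sh"]
for name in scripts:  # list order is execution order
    print(name, "->", execution_mode(name))
```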
Script files must be present in the benchmark’s experiment config folder, e.g., experiments/tpch/<DBMS>/ for TPC-H.
See DBMS for per-DBMS DDL script locations.
Placeholders in init scripts
Init scripts may use these placeholders, which bexhoma substitutes at runtime:
| Placeholder | Value |
|---|---|
| | Target database name (used in database-per-tenant mode) |
| | Target schema name (used in schema-per-tenant mode) |
Volumes defined in the default config
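| Key | Benchmark | Typical script sets |
|---|---|---|
| tpch | TPC-H | Schema, Index_and_Constraints, Index_and_Constraints_and_Statistics |
| tpcds | TPC-DS | |
| tpcc | HammerDB / Benchbase TPC-C | |
| ycsb | YCSB | |
| benchbase | Benchbase | |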
instances — Legacy IaaS Settings
'instances': {},
This section is a remnant of an earlier IaaS (VM-based) deployment mode. It is not used in Kubernetes mode and should be left empty.
dockers — DBMS Configurations
The dockers section defines how each DBMS is started, connected to, and monitored.
See DBMS for the full reference and per-DBMS configuration snippets.
Minimal Working Configuration
The minimum set of changes required to run the first experiment on a new cluster:
1. Set benchmarker.resultfolder to a local directory that exists and is writable.
2. Under credentials.k8s.context, add an entry whose key matches a context in your kubeconfig, and set namespace to the Kubernetes namespace you have access to.
3. Set credentials.k8s.monitor.service_monitoring to the URL of a Prometheus instance in your cluster, or leave it as a placeholder if you have no preinstalled Prometheus (bexhoma will install one per experiment).
4. Ensure the storage classes referenced in k8s/pvc-bexhoma-results.yml and k8s/pvc-bexhoma-data.yml exist in your cluster and support ReadWriteMany access.
Everything else can be left at its default values for a first run.
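The locally checkable parts of this checklist (the result folder and the context entry) can be verified before starting an experiment. The sketch below is a hypothetical preflight helper and not a bexhoma command; it does not contact the cluster, so Prometheus and storage classes are not covered.

```python
import os
import tempfile


def preflight(config, context_name):
    """Return a list of human-readable problems found in the local settings.

    Hypothetical helper; checks only what can be verified without a cluster.
    """
    problems = []
    folder = config.get("benchmarker", {}).get("resultfolder", "")
    if not os.path.isdir(folder):
        problems.append(f"resultfolder does not exist: {folder!r}")
    elif not os.access(folder, os.W_OK):
        problems.append(f"resultfolder is not writable: {folder!r}")

    contexts = config.get("credentials", {}).get("k8s", {}).get("context", {})
    ctx = contexts.get(context_name)
    if ctx is None:
        problems.append(f"no context entry named {context_name!r}")
    elif not ctx.get("namespace"):
        problems.append(f"context {context_name!r} has no namespace")
    return problems


with tempfile.TemporaryDirectory() as tmp:
    good = {
        "benchmarker": {"resultfolder": tmp},
        "credentials": {"k8s": {"context": {"my-context": {"namespace": "my-namespace"}}}},
    }
    ok_problems = preflight(good, "my-context")
    bad_problems = preflight(good, "other-context")

print(ok_problems)   # prints []
print(bad_problems)  # prints ["no context entry named 'other-context'"]
```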