Concept: DBMS

To include a DBMS in a Kubernetes-based experiment you need:

a Docker image
a JDBC driver (for SQL-based DBMS used with DBMSBenchmarker)
a Kubernetes deployment manifest (YAML template)
a configuration entry in cluster.config (the dockers section)
DDL scripts for schema creation, data loading, and index building

This document covers all DBMS currently supported by bexhoma.

Relational / SQL

PostgreSQL
PostgreSQL + PGBouncer
MySQL
MariaDB
MonetDB
Citus

Distributed SQL / Cloud-Native

YugabyteDB
CockroachDB
TiDB

Key-Value Stores (YCSB)

Dragonfly
DragonflyCluster
Redis (Cluster)

External / Cloud Database Services

DatabaseService

Raw Hardware Benchmarks (no DBMS)

Hardware

Configuration Reference

Every entry in the dockers section of cluster.config may contain the following fields:

Field	Required	Description
`loadData`	yes	Shell command run inside the SUT container to execute an init script. Placeholders: `{scriptname}`, `{database}`, `{schema}`, `{service_name}`, `{namespace}`
`delay_prepare`	no	Seconds to wait after the container starts before bexhoma considers it ready (useful for engines that restart during init, e.g., MySQL InnoDB setup)
`template`	yes	Base connection template passed to DBMSBenchmarker — includes `version`, `alias`, `docker_alias`, `dialect`, and `JDBC` sub-section
`template.JDBC.driver`	yes¹	Fully qualified Java class name of the JDBC driver
`template.JDBC.auth`	yes¹	`[username, password]` pair
`template.JDBC.url`	yes¹	JDBC URL template. Placeholders: `{serverip}`, `{database}`, `{schema}`, `{dbname}`, `{timeout_ms}`, `{namespace}`
`template.JDBC.jar`	yes¹	JDBC jar filename(s) expected in `jarfolder`
`template.JDBC.database`	no	Default database name (substituted into `{database}` in `loadData` and the JDBC URL; overridden per tenant in multi-tenant mode)
`template.JDBC.schema`	no	Default schema name (substituted into `{schema}`; overridden per tenant)
`template.init_SQL`	no	SQL statement executed after connection is established (e.g., `SET optimizer_switch = ...`)
`logfile`	no	Path inside the container to a log file that bexhoma downloads after each experiment
`datadir`	no	Path inside the container to the DBMS data directory; bexhoma measures its size after loading
`attachWorker`	no	Command run on the coordinator to register a new worker node (used for distributed DBMS like Citus, CockroachDB)
`worker_port`	no	Internal port of the worker/coordinator headless service (used for TiDB)
`store_args`	no	Whether to capture and display the container startup arguments in the experiment summary (default: `True`)
`priceperhourdollar`	no	Reserved for cost modelling (currently unused)
`monitor`	no	Application-level monitoring configuration; see Monitoring

¹ Only required for SQL DBMS queried via DBMSBenchmarker. Entries whose SUT is never queried through DBMSBenchmarker — Dragonfly, DragonflyCluster, Redis, Hardware — omit template.JDBC entirely; template itself is still required (at minimum version, alias, docker_alias), and template.auth (a [username, password] pair) becomes required instead — configurations/manifest.py falls back to template['auth'] for BEXHOMA_USER/BEXHOMA_PASSWORD injection whenever template.JDBC is absent.

JDBC URL Placeholders

Placeholder	Substituted with
`{serverip}`	Cluster-internal DNS name of the SUT service
`{database}`	Database name (from `template.JDBC.database`, overridden by tenant settings)
`{schema}`	Schema name (from `template.JDBC.schema`, overridden by tenant settings)
`{dbname}`	Alias for `{database}` (legacy; prefer `{database}`)
`{timeout_ms}`	Query timeout in milliseconds
`{namespace}`	Kubernetes namespace from the cluster config

Host Information Collected Automatically

After the SUT starts, bexhoma collects the following from inside the container and attaches it to the connection record:

Total RAM (from /proc/meminfo)
CPU model and core count (from /proc/cpuinfo)
Kernel version (uname -r)
Disk space used on / and in the data directory (du on datadir)
GPU count and model (via nvidia-smi, if present)
Kubernetes node name (from the pod spec)

This information appears in the Connections section of every experiment summary.

Deployment Manifests

Every DBMS needs a Kubernetes manifest template at k8s/deploymenttemplate-<Key>.yml. The key must match the dockers key in cluster.config exactly. See Deployment Template Conventions for the full authoring rules.

Resource requests, limits, and node selectors can be overridden at runtime via CLI parameters (-rnn, -rnl, -rnb) or programmatically via experiment.set_resources(...). See Deployments for details.

Deployment Template Conventions

This section describes the rules that every k8s/deploymenttemplate-<Key>.yml file must follow. start_sut() in configurations/lifecycle.py reads these files and applies the conventions at runtime to generate the actual Kubernetes resource names and inject environment variables. The full reference (including job templates and non-SUT manifests) is in k8s/README.md.

File naming

The template filename is derived automatically from the dockers key:

k8s/deploymenttemplate-{Key}.yml

The Key must exactly match the key in the dockers section of cluster.config and the value passed as docker= to configurations.default(...). Casing matters: PostgreSQL → deploymenttemplate-PostgreSQL.yml.

Document structure

A SUT deployment template is a multi-document YAML file (documents separated by ---). start_sut() processes all documents in the file. Each document must be one of the following Kubernetes kinds:

Kind	Required?	Notes
`PersistentVolumeClaim`	Optional	Omit when `use_storage=False`; removed automatically when `use_ramdisk=True`
`Service`	Optional	Used for the SUT port and, optionally, for worker headless services
`Deployment`	At least one required	The main DBMS container
`StatefulSet`	Optional	One per distributed worker role (e.g. TiKV, PD)

Recommended order within the file: PVC first, then Service(s), then Deployment/StatefulSet(s).

Mandatory labels

Every document in the template must carry these labels on metadata.labels (and on spec.template.metadata.labels for pod-creating resources):

Label	Placeholder value	Runtime value
`app`	`bexhoma`	Always `bexhoma`
`component`	role-specific (see below)	Never overwritten — used as routing key by `start_sut()`
`configuration`	`default`	DBMS configuration name (e.g. `PostgreSQL`)
`experiment`	`default`	Unique experiment code (timestamp-based string)

Use exactly these placeholder values in the template file; start_sut() replaces all of them at runtime except component.

The `component` label

The component label value on each document tells start_sut() what kind of resource it is and how to name and wire it. Do not change the component value at runtime — it is an identity, not a parameter.

`component` value	Applied to	What `start_sut()` does
`sut`	Deployment, Service, PVC	Names all resources using `generate_component_name(component='sut', ...)`; the resulting name becomes the SUT service hostname
`storage`	PVC	Names the PVC using `generate_component_name(component='storage', ...)`; labels it with load status
`worker`	StatefulSet, Service	Names the StatefulSet using `get_worker_name(component='worker')`; injects `BEXHOMA_WORKER_*` env vars into all containers
`store`	StatefulSet, Service	Same as `worker` but injects `BEXHOMA_STORE_*` env vars
`pd`	StatefulSet, Service	Same pattern; injects `BEXHOMA_PD_*` env vars (used by TiDB)
`pool`	Deployment, Service	Connection pool sidecar (e.g. PGBouncer); named using `generate_component_name(component='pool', ...)`

For every StatefulSet with component: X, start_sut() automatically injects these env vars into all containers:

Env var	Content
`BEXHOMA_{X}_NAME`	StatefulSet name
`BEXHOMA_{X}_SERVICE`	Headless service name
`BEXHOMA_{X}_LIST`	Comma-separated DNS addresses of all pods (`{name}-{i}.{service}`)
`BEXHOMA_{X}_LIST_SPACE`	Same, space-separated

(X is the uppercased component value, e.g. worker → BEXHOMA_WORKER_NAME.)

Runtime name generation

All resource name: fields in the template are placeholders; start_sut() always overwrites them. The generated names follow this formula (from generate_component_name() in configurations/base.py):

{app}-{component}-{configuration}-{experiment}

All lowercase. For workers the formula draws from the storage label rather than the experiment code. Worker PVCs from volumeClaimTemplates follow the fixed prefix bxw-{worker_name}-{i}.

Example SUT name: bexhoma-sut-postgresql-1717000000

Reserved container names

The name: of every container inside a pod spec is read by start_sut() and the evaluator pipeline. These names must be used exactly:

Container name	Pod type	Role
`dbms`	SUT Deployment / StatefulSet	Primary DBMS process. `start_sut()` reads its `args` for the summary and manages its volume mounts.
`monitor-application`	SUT Deployment	Application metrics exporter (e.g. postgres-exporter). Removed when app monitoring is disabled.
`cadvisor`	SUT / worker (sidecar)	Node metrics. Removed when cluster-level monitoring already exists.
`dcgm-exporter`	SUT / worker (sidecar)	GPU metrics. Removed when cluster-level monitoring already exists.
`pool`	Connection-pool Deployment	PGBouncer or equivalent.

For job templates (loading and benchmarking) the reserved names are datagenerator (initContainer), sensor (loader), and dbmsbenchmarker (benchmarker). See k8s/README.md for details.

Reserved volume names

The following volume names in pod specs are matched by name in start_sut():

Volume name	Purpose	Removed when
`benchmark-storage-volume`	Mounts the per-experiment DBMS storage PVC	`use_storage=False`
`benchmark-data-volume`	Mounts the shared read-only dataset PVC (`bexhoma-data`)	`use_distributed_datasource=False`
`bxw`	Worker-node storage in StatefulSet `volumeClaimTemplates`	`use_storage=False`
`dshm`	Shared-memory `emptyDir`	Never removed

Service port names

start_sut() strips Service ports by name when monitoring is disabled. Always name the DBMS connection port port-dbms so it is never stripped.

Port name	Always kept
`port-dbms`	Yes
`port-bus`	Yes (cluster bus for distributed DBMSs)
`port-web`	Yes
`port-monitoring`	Stripped when monitoring inactive
`port-monitoring-application`	Stripped when app monitoring inactive

Minimal SUT template skeleton

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  labels: {app: bexhoma, component: sut, configuration: default, experiment: default}
  name: bexhoma-storage
spec:
  accessModes: [ReadWriteOnce]
  resources:
    requests:
      storage: 50Gi
  storageClassName: shared
---
apiVersion: v1
kind: Service
metadata:
  labels: {app: bexhoma, component: sut, configuration: default, experiment: default}
  name: bexhoma-service
spec:
  ports:
  - {port: 9091, protocol: TCP, name: port-dbms, targetPort: 5432}
  selector: {app: bexhoma, component: sut, configuration: default, experiment: default}
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels: {app: bexhoma, component: sut, configuration: default, experiment: default}
  name: bexhoma-deployment
spec:
  replicas: 1
  selector:
    matchLabels: {app: bexhoma, component: sut, configuration: default, experiment: default}
  template:
    metadata:
      labels: {app: bexhoma, component: sut, configuration: default, experiment: default}
    spec:
      containers:
      - name: dbms
        image: mydbms:latest
        ports:
        - {containerPort: 5432}
        volumeMounts:
        - {mountPath: /var/lib/mydbms/data, name: benchmark-storage-volume}
      volumes:
      - name: benchmark-storage-volume
        persistentVolumeClaim: {claimName: bexhoma-storage}

Copy k8s/deploymenttemplate-Dummy.yml as a starting point for a new DBMS.

PostgreSQL

PostgreSQL is the primary test target for analytical benchmarks (TPC-H, TPC-DS) and transactional benchmarks (Benchbase TPC-C, YCSB).

Deployment template: k8s/deploymenttemplate-PostgreSQL.yml

The default deployment uses a minimal, near-standard PostgreSQL configuration. The only active startup argument is max_connections=640; all other parameters use PostgreSQL defaults. The template includes an extensive set of commented-out tuning knobs (memory, parallelism, WAL, autovacuum, planner cost constants) as a reference — uncomment or override individual parameters at experiment time using --set.

Configuration (from cluster.config):

'PostgreSQL': {
    'loadData': 'psql -U postgres -d {database} < {scriptname}',
    'delay_prepare': 0,
    'template': {
        'version': 'v11.4',
        'alias': 'General-B',
        'docker_alias': 'GP-B',
        'JDBC': {
            'driver': "org.postgresql.Driver",
            'auth': ["postgres", ""],
            'url': 'jdbc:postgresql://{serverip}:9091/{database}?reWriteBatchedInserts=true&currentSchema={schema}',
            'jar': 'postgresql-42.5.0.jar',
            'database': 'postgres',
            'schema': 'public',
        }
    },
    'logfile': '/usr/local/data/logfile',
    'datadir': '/var/lib/postgresql/data/',
    'priceperhourdollar': 0.0,
    'monitor': {
        'sut': {
            'metrics': 'postgresql',
            'blackbox': True,
        },
    },
},

The database and schema fields support multi-tenant mode: bexhoma substitutes the tenant-specific database or schema name when running schema-per-tenant or database-per-tenant experiments.

DDL scripts:

PGBouncer

PGBouncer is the configuration key for a PostgreSQL deployment fronted by a PgBouncer connection pool. The SUT component runs PostgreSQL; the pool component runs PgBouncer. Benchmarker pods connect to PgBouncer, which multiplexes connections to PostgreSQL.

This configuration is used to benchmark the overhead and benefit of connection pooling compared to direct PostgreSQL access. See Example: PGBouncer for a worked example.

Deployment template: k8s/deploymenttemplate-PGBouncer.yml

Configuration (from cluster.config):

'PGBouncer': {
    'loadData': 'psql -U postgres -d {database} < {scriptname}',
    'delay_prepare': 0,
    'template': {
        'version': 'v11.4',
        'alias': 'General-B',
        'docker_alias': 'GP-B',
        'JDBC': {
            'driver': "org.postgresql.Driver",
            'auth': ["postgres", ""],
            'url': 'jdbc:postgresql://{serverip}:9091/{database}?reWriteBatchedInserts=true',
            'jar': 'postgresql-42.5.0.jar',
            'database': 'postgres',
            'schema': 'public',
        }
    },
    'logfile': '/usr/local/data/logfile',
    'datadir': '/var/lib/postgresql/data/',
    'priceperhourdollar': 0.0,
    'monitor': { ... },  # Prometheus discovery for both the sut (postgres_exporter) and pool (pgbouncer_exporter) components
},

The monitor section uses Kubernetes pod discovery to scrape both the PostgreSQL exporter (port 9187 on the SUT pods) and the PgBouncer exporter (port 9127 on the pool pods).

PGBouncer uses the same DDL scripts as PostgreSQL — see the PostgreSQL section above.

MySQL

MySQL is supported for TPC-H and YCSB experiments. The default image runs MySQL 8.4.

Deployment template: k8s/deploymenttemplate-MySQL.yml

The default deployment tuning includes innodb-buffer-pool-size=32G, innodb-write-io-threads=64, innodb-read-io-threads=64, innodb-redo-log-capacity=8G, innodb-flush-log-at-trx-commit=0, and innodb-parallel-read-threads=64. See the MySQL 8.4 documentation for parameter explanations.

Configuration (from cluster.config):

'MySQL': {
    'loadData': 'mysql --local-infile {database} < {scriptname}',
    'delay_prepare': 120,
    'template': {
        'version': 'CE 8.0.36',
        'alias': 'General-C',
        'docker_alias': 'GP-C',
        'dialect': 'MySQL',
        'JDBC': {
            'driver': "com.mysql.cj.jdbc.Driver",
            'auth': ["root", "root"],
            'url': 'jdbc:mysql://{serverip}:9091/{dbname}?rewriteBatchedStatements=true',
            'jar': ['mysql-connector-j-8.0.31.jar', 'slf4j-simple-1.7.21.jar'],
            'database': 'mysql',  # placeholder — must be overwritten per experiment
        }
    },
    'logfile': '/var/log/mysqld.log',
    'datadir': '/var/lib/mysql/',
    'priceperhourdollar': 0.0,
    'monitor': {
        'sut': {
            'metrics': 'mysql',
            'blackbox': False,
        },
    },
},

delay_prepare: 120 makes bexhoma wait 2 minutes before querying the DBMS. This accounts for InnoDB’s buffer pool initialisation, which can cause a brief restart.

DDL scripts:

MariaDB

MariaDB is supported for TPC-H, Benchbase TPC-C, HammerDB TPC-C, and YCSB experiments. It uses a dedicated bexhoma database user rather than root.

Deployment template: k8s/deploymenttemplate-MariaDB.yml

The default deployment sets innodb_log_buffer_size, innodb-write-io-threads=16, and innodb-log-file-size.

Configuration (from cluster.config):

'MariaDB': {
    'loadData': 'mariadb --user=bexhoma --password=password < {scriptname}',
    'template': {
        'version': 'v10.4.6',
        'alias': 'General-A',
        'docker_alias': 'GP-A',
        'dialect': 'MySQL',
        'JDBC': {
            'driver': "org.mariadb.jdbc.Driver",
            'auth': ["bexhoma", "password"],
            'url': 'jdbc:mariadb://{serverip}:9091/{dbname}?rewriteBatchedStatements=true',
            'jar': 'mariadb-java-client-3.1.0.jar',
            'database': 'mariadb',  # placeholder — must be overwritten per experiment
        },
        'init_SQL': "SET optimizer_switch = 'mrr=on,mrr_cost_based=off'",
    },
    'logfile': '/usr/local/data/logfile',
    'datadir': '/var/lib/mysql/',
    'priceperhourdollar': 0.0,
},

The init_SQL statement is executed after each connection is established and enables multi-range read optimisation.

DDL scripts:

MonetDB

MonetDB is a column-store DBMS well-suited for TPC-H analytical workloads.

Deployment template: k8s/deploymenttemplate-MonetDB.yml

Configuration (from cluster.config):

'MonetDB': {
    'loadData': "cd /home/monetdb;printf 'user=monetdb\npassword=monetdb\n' > .monetdb;mclient demo < {scriptname}",
    'template': {
        'version': '11.37.11',
        'alias': 'Columnwise',
        'docker_alias': 'Columnwise',
        'JDBC': {
            'auth': ['monetdb', 'monetdb'],
            'driver': 'org.monetdb.jdbc.MonetDriver',
            'jar': 'monetdb-jdbc-12.2.jre8.jar',
            'url': 'jdbc:monetdb://{serverip}:9091/demo?so_timeout={timeout_ms}',
            'database': 'demo',
        }
    },
    'logfile': '/var/monetdb5/dbfarm/merovingian.log',
    'datadir': '/var/monetdb5/',
    'priceperhourdollar': 0.0,
},

The loadData command first writes a .monetdb credentials file into the container home directory before piping the script to mclient.

DDL scripts:

TPC-H
TPC-DS

Citus

Citus is a distributed extension for PostgreSQL that shards tables across worker nodes. The coordinator node is deployed as the SUT; worker nodes are deployed as a StatefulSet. After each worker pod starts, bexhoma runs the attachWorker command on the coordinator to register it.

Deployment template: k8s/deploymenttemplate-Citus.yml

Configuration (from cluster.config):

'Citus': {
    'loadData': 'psql -U postgres < {scriptname}',
    'attachWorker': "psql -U postgres --command=\"SELECT * from master_add_node('{worker}.{service_sut}', 5432);\"",
    'template': {
        'version': '10.0.2',
        'alias': 'General-B',
        'docker_alias': 'GP-B',
        'JDBC': {
            'driver': "org.postgresql.Driver",
            'auth': ["postgres", "postgres"],
            'url': 'jdbc:postgresql://{serverip}:9091/postgres?loadBalanceHosts=true',
            'jar': 'postgresql-42.5.0.jar',
            'database': 'postgres',
            'schema': 'public',
        }
    },
    'logfile': '/usr/local/data/logfile',
    'datadir': '/var/lib/postgresql/data/',
    'priceperhourdollar': 0.0,
},

DDL scripts:

See Example: Citus for a worked example.

YugabyteDB

YugabyteDB is a distributed, PostgreSQL-compatible DBMS. It consists of master and tablet-server (TServer) components, both deployed as StatefulSets. The JDBC connection targets the TServer service.

Configuration (from cluster.config):

'YugabyteDB': {
    'loadData': 'psql -U yugabyte --host yb-tserver-service.{namespace}.svc.cluster.local --port 5433 < {scriptname}',
    'template': {
        'version': 'v2.17.1',
        'alias': 'Cloud-Native-1',
        'docker_alias': 'CN1',
        'JDBC': {
            'driver': "com.yugabyte.Driver",
            'auth': ["yugabyte", ""],
            'url': 'jdbc:yugabytedb://yb-tserver-service.{namespace}.svc.cluster.local:5433/yugabyte?load-balance=true',
            'jar': 'jdbc-yugabytedb-42.3.5-yb-2.jar',
            'database': 'yugabyte',
            'schema': 'public',
        }
    },
    'logfile': '/usr/local/data/logfile',
    'datadir': '/var/lib/postgresql/data/',
    'priceperhourdollar': 0.0,
    'monitor': { ... },  # Prometheus discovery for yb-master (port 7000) and yb-tserver (port 9000) pods
},

The monitoring section configures Kubernetes pod discovery for both the master component (port 7000) and the TServer component (port 9000), scraping YugabyteDB’s native Prometheus endpoint at /prometheus-metrics.

DDL scripts:

YCSB

See Example: YugabyteDB for a worked example.

CockroachDB

CockroachDB is a distributed, PostgreSQL wire-compatible DBMS. Worker nodes form the CockroachDB cluster; bexhoma registers each worker via the attachWorker command. The PostgreSQL JDBC driver is used for connectivity.

Configuration (from cluster.config):

'CockroachDB': {
    'loadData': 'cockroach sql --host {service_name} --port 9091 --insecure --file {scriptname}',
    'delay_prepare': 120,
    'attachWorker': "",  # CockroachDB auto-discovers cluster members
    'template': {
        'version': 'v24.2.4',
        'alias': 'Cloud-Native-2',
        'docker_alias': 'CN2',
        'JDBC': {
            'driver': "org.postgresql.Driver",
            'auth': ["root", ""],
            'url': 'jdbc:postgresql://{serverip}:9091/defaultdb?reWriteBatchedInserts=true&sslmode=disable',
            'jar': 'postgresql-42.5.0.jar',
            'database': 'defaultdb',
        }
    },
    'logfile': '/usr/local/data/logfile',
    'datadir': '/cockroach/cockroach-data',
    'priceperhourdollar': 0.0,
    'monitor': { ... },  # Prometheus discovery for worker pods at /_status/vars on port 8080
},

The monitoring section scrapes CockroachDB’s built-in Prometheus endpoint (/_status/vars, port 8080) from all worker pods.

DDL scripts:

TPC-C
YCSB

See Example: CockroachDB for a worked example.

TiDB

TiDB is a distributed, MySQL-compatible DBMS. It consists of three components: the TiDB SQL layer (SUT), TiKV storage nodes (worker), and the Placement Driver (PD) coordinator. All three are deployed and monitored separately.

Configuration (from cluster.config):

'TiDB': {
    'loadData': 'mysql -P 4000 -h 127.0.0.1 -D {database} --local-infile < {scriptname}',
    'delay_prepare': 60,
    'template': {
        'version': 'CE 8.0.22',
        'alias': 'General-C',
        'docker_alias': 'GP-C',
        'dialect': 'MySQL',
        'JDBC': {
            'driver': "com.mysql.cj.jdbc.Driver",
            'auth': ["root", "root"],
            'url': 'jdbc:mysql://{serverip}:9091/{dbname}?rewriteBatchedStatements=true',
            'jar': ['mysql-connector-j-8.0.31.jar', 'slf4j-simple-1.7.21.jar'],
            'database': 'test',
        }
    },
    'logfile': '/var/log/mysqld.log',
    'datadir': '/var/lib/mysql/',
    'priceperhourdollar': 0.0,
    'worker_port': 2379,
    'store_args': False,
    'monitor': { ... },  # Prometheus discovery for tidb (port 9500), pd (port 2379), and tikv (port 20180) pods
},

store_args: False suppresses capturing TiDB’s verbose startup arguments in the summary. worker_port: 2379 is the PD headless service port used internally for cluster coordination. The monitoring section scrapes all three components via Kubernetes pod discovery.

DDL scripts:

YCSB

See Example: TiDB for a worked example.

Dragonfly

Dragonfly is a Redis-compatible, high-performance in-memory data store. It is used as the SUT for YCSB key-value workloads as a single-instance deployment.

Configuration (from cluster.config):

'Dragonfly': {
    'loadData': 'redis-cli < {scriptname}',
    'delay_prepare': 0,
    'attachWorker': '',
    'template': {
        'version': 'xxx',
        'alias': 'Key-Value-1',
        'docker_alias': 'KV1',
        'auth': ["root", ""],
    },
    'logfile': '/var/log/redis/redis-server.log',
    'datadir': '/data',
    'priceperhourdollar': 0.0,
    'monitor': { ... },  # Prometheus discovery for sut pods on port 6379 at /metrics
},

No JDBC driver is needed — YCSB communicates via the Redis wire protocol. The monitoring section scrapes Dragonfly’s native Prometheus endpoint at /metrics on port 6379.

DDL scripts:

YCSB

See Example: Dragonfly for a worked example.

DragonflyCluster

DragonflyCluster is the configuration key for Dragonfly deployed as a cluster (multiple worker nodes). The loader and benchmarker connect via the cluster’s headless service name.

Configuration (from cluster.config):

'DragonflyCluster': {
    'loadData': 'redis-cli --host bexhoma-service.{namespace}.svc.cluster.local < {scriptname}',
    'delay_prepare': 0,
    'attachWorker': '',
    'template': {
        'version': 'xxx',
        'alias': 'Key-Value-1',
        'docker_alias': 'KV1',
        'auth': ["root", ""],
    },
    'logfile': '/var/log/redis/redis-server.log',
    'datadir': '/data',
    'priceperhourdollar': 0.0,
    'monitor': { ... },  # Prometheus discovery for worker pods on port 6379 at /metrics
},

The key difference from single-node Dragonfly is that loadData targets the cluster service DNS name rather than localhost. DragonflyCluster uses the same YCSB DDL scripts as single-node Dragonfly.

Redis

Redis is the configuration key for a Redis cluster deployment (multiple shards as workers). It is used as the SUT for YCSB key-value workloads in a distributed setting.

Configuration (from cluster.config):

'Redis': {
    'loadData': 'redis-cli --host bexhoma-service.{namespace}.svc.cluster.local < {scriptname}',
    'delay_prepare': 0,
    'attachWorker': '',
    'template': {
        'version': 'xxx',
        'alias': 'Key-Value-1',
        'docker_alias': 'KV1',
        'auth': ["root", ""],
    },
    'logfile': '/var/log/redis/redis-server.log',
    'datadir': '/data',
    'priceperhourdollar': 0.0,
    'monitor': { ... },  # Prometheus discovery for worker pods on port 9121 (redis_exporter)
},

Monitoring uses a redis_exporter sidecar scraping each worker pod on port 9121.

Redis does not have dedicated DDL scripts; YCSB manages table/key-space creation directly via the Redis wire protocol.

See Example: Redis for a worked example.

DatabaseService

DatabaseService is a generic configuration for external database services — cloud-managed databases or any PostgreSQL-compatible endpoint that is not itself deployed inside the Kubernetes cluster. The JDBC connection and loadData command reference the service by its in-cluster DNS name via {namespace}.

Configuration (from cluster.config):

'DatabaseService': {
    'loadData': 'psql -U postgres --host bexhoma-service.{namespace}.svc.cluster.local --port 9091 < {scriptname}',
    'template': {
        'version': 'v11.4',
        'alias': 'General-B',
        'docker_alias': 'GP-B',
        'JDBC': {
            'driver': "org.postgresql.Driver",
            'auth': ["postgres", ""],
            'url': 'jdbc:postgresql://bexhoma-service.{namespace}.svc.cluster.local:9091/postgres?reWriteBatchedInserts=true',
            'jar': 'postgresql-42.5.0.jar',
            'database': 'postgres',
        }
    },
    'logfile': '/usr/local/data/logfile',
    'datadir': '/var/lib/postgresql/data/',
    'priceperhourdollar': 0.0,
},

DDL scripts:

TPC-H
YCSB

See Example: Cloud Database for a worked example.

Hardware

Hardware is not a DBMS — it benchmarks raw disk I/O (fio), CPU/memory (sysbench), or network latency/throughput (sockperf) on a dedicated SUT container, bypassing any database engine entirely. The fio/sysbench benchmarker connects to the SUT over SSH; sockperf connects directly to one of a fixed pool of persistent sockperf server instances running on the SUT (see below). There is no loadData command and no DDL scripts. Run it with python hardware.py run -dbms Hardware -xht fio ... (see hardware.py -h for the full set of fio/sysbench/sockperf-specific flags).

Deployment template: k8s/deploymenttemplate-Hardware.yml — mounts a PVC at /database on the SUT; the benchmarker’s HARDWARE_TEST_DIR targets a subdirectory of this mount so fio measures the PVC-backed storage instead of the SUT container’s ephemeral filesystem. The same manifest also declares a static pool of SOCKPERF_NUM_SERVERS (default 16) container/Service ports starting at SOCKPERF_BASE_PORT (default 20000), each running one UDP and one TCP sockperf server (started by images/hardware/sut/entrypoint.sh); a benchmarker pod picks its dedicated server via BEXHOMA_CHILD (see images/hardware/benchmarker/run_sockperf.sh), so several benchmarker pods can each reach a separate server instead of contending on one socket.

Configuration (from cluster.config):

'Hardware': {
    'loadData': '',
    'delay_prepare': 0,
    'template': {
        'version': 'v1.0',
        'alias': 'Hardware',
        'docker_alias': 'HW',
        'auth': ['bench', ''],
    },
    'logfile': '',
    'datadir': '/database',
    'priceperhourdollar': 0.0,
},

template.auth is required for every non-JDBC dockers entry — it’s used to populate BEXHOMA_USER/BEXHOMA_PASSWORD env vars injected into every job pod (configurations/manifest.py). Hardware’s benchmarker scripts don’t actually use these two vars (SSH auth goes through BEXHOMA_SUT_USER/ BEXHOMA_SUT_KEY instead), but the manifest builder still requires the key to be present.

No template.JDBC block (see footnote¹ above) — results are parsed from the benchmarker pod’s own log output, not queried through DBMSBenchmarker.

DDL scripts: none — Hardware has no loading phase; experiment_dict_template["loader"] is always empty and cluster.config['volumes']['hardware']['initscripts']['Schema'] is an empty placeholder required only so configurations.default(...) can resolve its (unused) init-script lookup.

Add a New DBMS

To add a DBMS called NewDBMS:

Add a configuration entry in the dockers section of cluster.config following the reference above.
Create a deployment manifest at k8s/deploymenttemplate-NewDBMS.yml.
Copy k8s/deploymenttemplate-Dummy.yml as a starting point and adjust the image, port, and resource defaults.
Add DDL scripts in a subfolder of experiments/, e.g., experiments/tpch/NewDBMS/ for TPC-H.

Register the DBMS in the experiment script (e.g., example.py):

if args.dbms == "NewDBMS":
    name_format = 'NewDBMS-{cluster}'
    config = configurations.default(
        experiment=experiment,
        docker='NewDBMS',
        configuration=name_format.format(cluster=cluster_name),
        alias='DBMS A1'
    )

The docker='NewDBMS' parameter must match the dockers key and the YAML filename.

Add the choice to the CLI argument parser:

parser.add_argument('-dbms', help='DBMS to run the experiment on', choices=['NewDBMS'])

If you need a JDBC driver that is not yet included, please raise an issue at https://github.com/Beuth-Erdelt/Benchmark-Experiment-Host-Manager/issues.

Concept: DBMS

Configuration Reference

JDBC URL Placeholders

Host Information Collected Automatically

Deployment Manifests

Deployment Template Conventions

File naming

Document structure

Mandatory labels

The component label

Runtime name generation

Reserved container names

Reserved volume names

Service port names

Minimal SUT template skeleton

PostgreSQL

PGBouncer

MySQL

MariaDB

MonetDB

Citus

YugabyteDB

CockroachDB

TiDB

Dragonfly

DragonflyCluster

Redis

DatabaseService

Hardware

Add a New DBMS

The `component` label