bexhoma.experiments.tpch module

Experiment class for TPC-H benchmarks.

Provides TpchExperiment, which extends DbmsBenchmarkerExperiment to orchestrate TPC-H data generation, loading, and query execution via the DBMSBenchmarker tool inside a Kubernetes cluster.

Authors: Patrick K. Erdelt
Copyright (C) 2020 Patrick K. Erdelt
SPDX-License-Identifier: AGPL-3.0-or-later
See LICENSE for details.

class bexhoma.experiments.tpch.TpchExperiment(cluster, code=None, queryfile='queries-tpch.config', SF='100', num_experiment_to_apply=1, timeout=7200, script=None)

Bases: DbmsBenchmarkerExperiment

TPC-H experiment: orchestrates data generation, loading, and DBMSBenchmarker query execution inside a Kubernetes cluster.

Registers a TPCH benchmark object and pre-populates the experiment dict template. Workload configuration (modes, info strings, indexing strategies) is delegated to configure_workload().

enable_refresh_stream(template: str = 'jobtemplate-benchmarking-tpch-refresh-PostgreSQL.yml') → None

Add a TPC-H RF1/RF2 refresh stream that runs in parallel with the query stream.

The refresh stream becomes benchmark_run=2 within each client round. Call set_default_benchmarking_parameters() with TPCH_REFRESH_STREAMS and TPCH_REFRESH_STREAM_OFFSET before calling this method so those values reach both the generator initContainer and the loader main container.

Parameters:

template – Kubernetes job-template file for the refresh benchmarker job. Choose the variant that matches the target DBMS: jobtemplate-benchmarking-tpch-refresh-PostgreSQL.yml (the default) or jobtemplate-benchmarking-tpch-refresh-MySQL.yml.
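The call order described above can be sketched as follows. This is a minimal sketch, not bexhoma code: pick_refresh_template and add_refresh_stream are hypothetical helpers, and passing the two parameters as keyword arguments is an assumption about set_default_benchmarking_parameters() that should be checked against its real signature. Only the two template filenames and the two parameter names are taken from the documentation above.

```python
# Template filenames as listed in the parameter description above.
REFRESH_TEMPLATES = {
    "postgresql": "jobtemplate-benchmarking-tpch-refresh-PostgreSQL.yml",
    "mysql": "jobtemplate-benchmarking-tpch-refresh-MySQL.yml",
}


def pick_refresh_template(dbms: str) -> str:
    """Return the refresh job template for a DBMS name (hypothetical helper)."""
    try:
        return REFRESH_TEMPLATES[dbms.lower()]
    except KeyError:
        raise ValueError(f"no refresh template documented for {dbms!r}") from None


def add_refresh_stream(experiment, dbms: str, streams, offset) -> None:
    """Set the refresh parameters first, then enable the refresh stream.

    `experiment` is expected to behave like a TpchExperiment. Passing the
    parameters as keyword arguments is an ASSUMPTION about
    set_default_benchmarking_parameters(); the values are caller-supplied.
    """
    # Parameters must be set before the stream is enabled so that they
    # reach both the generator initContainer and the loader main container.
    experiment.set_default_benchmarking_parameters(
        TPCH_REFRESH_STREAMS=streams,
        TPCH_REFRESH_STREAM_OFFSET=offset,
    )
    experiment.enable_refresh_stream(template=pick_refresh_template(dbms))
```

With a real experiment object, a call such as add_refresh_stream(experiment, "MySQL", streams, offset) would select the MySQL template; the appropriate values for streams and offset depend on the experiment and are not specified here.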

set_queries_full() → None

Switch to the full TPC-H query file covering all 22 queries.

set_queries_profiling() → None

Switch to the abbreviated profiling query file for import validation.

show_summary() → None

Print the TPC-H experiment summary, including the refresh stream section.

If enable_refresh_stream() was called during the live run, a RefreshStreamBenchmark is already in self.benchmarks, and the generic loop inside dbmsbenchmarker.show_summary() places its section directly after the "### Execution Per Phase" heading.

If the summary is produced after the run via bexperiments summary, enable_refresh_stream() has not been called. In that case a temporary RefreshStreamBenchmark is appended to self.benchmarks before delegating to the parent implementation via super(), so the same loop positions the section identically. The temporary entry is removed afterwards.
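The append-then-remove behaviour can be illustrated with stand-in classes. This is a sketch under stated assumptions, not the bexhoma implementation: RefreshStreamSection, BaseExperiment, and SketchExperiment are hypothetical names, and the use of try/finally to guarantee removal is a design choice of this sketch that the documentation above does not confirm.

```python
class RefreshStreamSection:
    """Stand-in for RefreshStreamBenchmark (hypothetical, illustration only)."""

    name = "refresh"


class BaseExperiment:
    """Stand-in for the parent class; its loop emits one section per benchmark."""

    def __init__(self):
        self.benchmarks = []
        self.printed = []

    def show_summary(self):
        self.printed.append("### Execution Per Phase")
        for benchmark in self.benchmarks:
            self.printed.append(f"### {benchmark.name}")


class SketchExperiment(BaseExperiment):
    def show_summary(self):
        # Live run: the section object is already registered, so add nothing.
        already_present = any(
            isinstance(b, RefreshStreamSection) for b in self.benchmarks
        )
        temporary = None
        if not already_present:
            # Post-hoc run: register a temporary entry for the parent loop.
            temporary = RefreshStreamSection()
            self.benchmarks.append(temporary)
        try:
            super().show_summary()
        finally:
            # Remove only the entry added here, even if the parent raised.
            if temporary is not None:
                self.benchmarks.remove(temporary)
```

In both cases the parent loop sees exactly one refresh entry, so the section lands in the same position, and self.benchmarks is left as it was before the call.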