bexhoma.evaluators.tpcc module

Evaluator for HammerDB TPC-C experiments.

Provides TpccEvaluator, which extends LogEvaluator to parse and aggregate the new-orders-per-minute (NOPM) and transactions-per-minute (TPM) throughput results produced by HammerDB.

class bexhoma.evaluators.tpcc.TpccEvaluator(code, path, include_loading=False, include_benchmarking=True, benchmark_run: int = 0)

Bases: LogEvaluator

Evaluator for a HammerDB TPC-C experiment.

Parses per-pod log files to extract NOPM, TPM, and optional latency statistics (CALLS, MIN, AVG, MAX, TOTAL, P99, P95, P50, SD, RATIO) and assembles them into DataFrames. Aggregation over parallel pods follows the same pattern as the other logger-based evaluators.

Parameters:

code – Experiment identifier; also the name of the result sub-folder.
path – Root path that contains the result folders.
include_loading – Whether loading-phase results are expected.
include_benchmarking – Whether benchmarking-phase results are expected.

benchmarking_aggregate_by_parallel_pods(df, columns=['phase'])

Aggregates parallel-pod TPC-C result rows into one row per group.

Groups by columns and applies per-metric aggregation (NOPM/TPM averaged across pods, max for latency percentiles, etc.). Also recomputes efficiency for runs where vusers equal 10× the scale factor.

The phase column holds the phase identifier (configuration-experiment_run-client) and the job column holds the job identifier (configuration-experiment_run-client-benchmark_run).

The default columns=['phase'] groups by phase, producing one row per phase. To keep one row per job, pass columns=['job'].

Parameters:

df (pandas.DataFrame) – Typed TPC-C benchmarking DataFrame.
columns (list[str]) – Grouping columns (default ['phase']).

Returns:

Aggregated DataFrame with one row per group.

Return type:

pandas.DataFrame
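
The grouping described above can be illustrated with plain pandas. This is a minimal sketch, not the library's implementation: the helper name aggregate_by_parallel_pods, the rule table, and the sample identifiers are assumptions, and only the "mean for NOPM/TPM, max for latency percentiles" rules named in the text are modelled (the efficiency recomputation is left out).

```python
import pandas as pd

def aggregate_by_parallel_pods(df, columns=("phase",)):
    """Hypothetical stand-in: collapse parallel-pod rows into one row per group."""
    # Assumed per-metric rules: throughput is averaged, latency percentiles take the max.
    rules = {"NOPM": "mean", "TPM": "mean", "P95": "max", "P99": "max"}
    present = {col: how for col, how in rules.items() if col in df.columns}
    return df.groupby(list(columns), as_index=False).agg(present)

# Two pods ran in parallel in the first phase, one pod in the second.
rows = pd.DataFrame({
    "phase": ["pg-1-1", "pg-1-1", "pg-1-2"],
    "job": ["pg-1-1-1", "pg-1-1-2", "pg-1-2-1"],
    "NOPM": [1000.0, 1200.0, 900.0],
    "TPM": [2300.0, 2700.0, 2100.0],
    "P99": [12.0, 15.0, 11.0],
})

per_phase = aggregate_by_parallel_pods(rows)                  # one row per phase
per_job = aggregate_by_parallel_pods(rows, columns=["job"])   # one row per job
```

Grouping by job keeps the benchmark runs separate, so per_job has as many rows as there are distinct job identifiers.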

benchmarking_set_datatypes(df)

Casts all TPC-C benchmarking result columns to their appropriate data types.

Handles two variants: with latency statistics (CALLS present) and without.

Parameters: df (pandas.DataFrame) – DataFrame of raw TPC-C benchmarking results.
Returns: DataFrame with columns cast to correct types.
Return type: pandas.DataFrame
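
A sketch of how the two variants might be distinguished, assuming the raw values arrive as strings. The helper name set_datatypes and the target types are illustrative assumptions; only the idea of switching on the presence of CALLS comes from the description above.

```python
import pandas as pd

def set_datatypes(df):
    """Hypothetical stand-in: cast raw string columns to numeric types."""
    # Assumed schema for the variant without latency statistics.
    schema = {"NOPM": "int64", "TPM": "int64", "vusers": "int64"}
    # The latency variant is recognised by the presence of the CALLS column.
    if "CALLS" in df.columns:
        schema.update({"CALLS": "int64", "MIN": "float64", "AVG": "float64",
                       "MAX": "float64", "P99": "float64"})
    # Cast only the columns that actually exist in this DataFrame.
    schema = {col: dtype for col, dtype in schema.items() if col in df.columns}
    return df.astype(schema)

raw_plain = pd.DataFrame({"NOPM": ["4420"], "TPM": ["10185"], "vusers": ["10"]})
raw_latency = raw_plain.assign(CALLS=["5000"], AVG=["2.5"], P99=["9.75"])

typed_plain = set_datatypes(raw_plain)
typed_latency = set_datatypes(raw_latency)
```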

get_summary_benchmark_per_connection()

Returns benchmarking results with one row per pod, filtered to the key display columns.

Applies benchmarking_set_datatypes() and selects the columns used for the per-connection summary table (experiment run, terminals, target, client, child, time, errors, throughput, goodput, efficiency, and latency percentiles), then sorts by (experiment_run, client, child).

Returns: DataFrame indexed as "DBMS" with one row per pod, or None if there are no benchmarking results.
Return type: pandas.DataFrame or None
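
The select-then-sort step can be pictured as follows. This is a sketch under assumptions: the helper name summary_per_connection and the reduced column list are hypothetical, and the real method selects more columns than shown here.

```python
import pandas as pd

def summary_per_connection(df):
    """Hypothetical stand-in: keep the display columns and sort, or return None."""
    if df is None or df.empty:
        return None  # mirrors the documented "None if there are no results"
    wanted = ["experiment_run", "client", "child", "NOPM"]  # assumed subset
    selected = df[[col for col in wanted if col in df.columns]]
    return selected.sort_values(["experiment_run", "client", "child"])

pods = pd.DataFrame(
    {
        "experiment_run": [1, 1, 1],
        "client": [2, 1, 1],
        "child": [1, 2, 1],
        "NOPM": [900, 1100, 1000],
        "internal_note": ["a", "b", "c"],  # not a display column, dropped below
    },
    index=pd.Index(["PostgreSQL"] * 3, name="DBMS"),
)

summary = summary_per_connection(pods)
nothing = summary_per_connection(pd.DataFrame())
```

Callers that handle both this method and get_summary_benchmark_per_phase() need to check for None here and for an empty DataFrame there.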

get_summary_benchmark_per_phase()

Returns benchmarking results aggregated over parallel pods, one row per phase.

Applies benchmarking_set_datatypes(), aggregates via benchmarking_aggregate_by_parallel_pods(), and selects the columns used for the per-phase summary table (experiment run, terminals, target, pod count, time, errors, throughput, goodput, efficiency, and latency percentiles), sorted by (experiment_run, target, pod_count).

Returns: DataFrame indexed as "DBMS" with one row per phase, or an empty DataFrame if there are no benchmarking results.
Return type: pandas.DataFrame

get_summary_benchmark_per_phase_multitenant()

Returns TPC-C benchmarking results aggregated per phase and tenant, one row per (phase, tenant_id).

Like get_summary_benchmark_per_phase() but groups by ['phase', 'tenant_id'] so each tenant appears as a separate row.

Returns: DataFrame indexed as "DBMS" with one row per (phase, tenant), or an empty DataFrame if there are no benchmarking results.
Return type: pandas.DataFrame

get_summary_loading_per_run()

Returns loading metrics aggregated per experiment run.

Delegates to get_loading_per_run() (defined in base), which reduces the per-connection loading DataFrame to one row per (code, configuration, experiment_run) and adds a 'Throughput [SF/h]' column.

Returns: DataFrame with one row per experiment run.
Return type: pandas.DataFrame
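
To show what a 'Throughput [SF/h]' column could mean, here is a sketch that reduces per-connection loading rows to one row per run. Everything beyond the grouping keys and the output column name is an assumption: the input columns SF and time_load_s, the choice of the slowest connection as the run's loading time, and the formula SF * 3600 / seconds.

```python
import pandas as pd

def loading_per_run(df):
    """Hypothetical stand-in: one row per (code, configuration, experiment_run)."""
    keys = ["code", "configuration", "experiment_run"]
    per_run = df.groupby(keys, as_index=False).agg(
        SF=("SF", "first"),                  # assumed: scale factor is constant per run
        time_load_s=("time_load_s", "max"),  # assumed: run ends with the slowest loader
    )
    # Scale factors loaded per hour.
    per_run["Throughput [SF/h]"] = per_run["SF"] * 3600.0 / per_run["time_load_s"]
    return per_run

connections = pd.DataFrame({
    "code": ["1700000000", "1700000000"],
    "configuration": ["PostgreSQL-1", "PostgreSQL-1"],
    "experiment_run": [1, 1],
    "SF": [10, 10],
    "time_load_s": [1800.0, 1200.0],
})

loading = loading_per_run(connections)
```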

log_to_df(filename)

Parses a HammerDB TPC-C pod log file into a DataFrame.

Extracts NOPM, TPM, vuser counts, and, when HammerDB time-profile output is present, latency statistics (CALLS, MIN, AVG, MAX, TOTAL, P99, P95, P50, SD, RATIO) for the NEWORD procedure.

Parameters: filename (str) – Absolute path to the HammerDB log file.
Returns: DataFrame with one row per TPC-C result iteration, or an empty DataFrame on parse failure.
Return type: pandas.DataFrame
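
A reduced sketch of the parsing idea, working on log text rather than a file so it stays self-contained. The two regular expressions are assumptions based on the result and virtual-user lines HammerDB typically prints; the helper name parse_log_text is hypothetical, and the latency (time-profile) section is not handled.

```python
import re
import pandas as pd

# Assumed line formats, e.g.
#   "Vuser 1:10 Active Virtual Users configured"
#   "Vuser 1:TEST RESULT : System achieved 4420 NOPM from 10185 PostgreSQL TPM"
VUSERS_RE = re.compile(r"(\d+) Active Virtual Users configured")
RESULT_RE = re.compile(r"System achieved (\d+) NOPM from (\d+) .+? TPM")

def parse_log_text(text):
    """Hypothetical stand-in: one row per result line, empty DataFrame if none."""
    rows = []
    vusers = None
    for line in text.splitlines():
        match = VUSERS_RE.search(line)
        if match:
            vusers = int(match.group(1))  # applies to the next result line
            continue
        match = RESULT_RE.search(line)
        if match:
            rows.append({"vusers": vusers,
                         "NOPM": int(match.group(1)),
                         "TPM": int(match.group(2))})
    return pd.DataFrame(rows, columns=["vusers", "NOPM", "TPM"])

sample_log = """\
Vuser 1:10 Active Virtual Users configured
Vuser 1:TEST RESULT : System achieved 4420 NOPM from 10185 PostgreSQL TPM
Vuser 1:20 Active Virtual Users configured
Vuser 1:TEST RESULT : System achieved 7100 NOPM from 16400 PostgreSQL TPM
"""

parsed = parse_log_text(sample_log)
unparsed = parse_log_text("no results in this log")
```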

record_tests(experiment, df_loading: DataFrame, df_reduced: DataFrame, workflow_actual: dict, workflow_planned: dict, **extra) → None

Records TPC-C pass/fail tests: NOPM throughput and workflow completeness.

Parameters:

experiment – The owning experiment object.
df_loading – Per-run loading DataFrame (unused here).
df_reduced – Per-phase execution DataFrame.
workflow_actual – Reconstructed actual workflow dict.
workflow_planned – Planned workflow dict from workload config.

test_results()

Validates results by reading all pickle files and delegating to the parent check.

Returns: 0 on success, 1 if an exception is raised.
Return type: int
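
The 0/1 return convention can be sketched as a loop that tries to load every pickle file and converts any exception into a failure code. The helper name check_pickles and the "*.pickle" file pattern are assumptions, and the delegation to the parent check is not modelled here.

```python
import glob
import os
import pickle
import tempfile

def check_pickles(folder):
    """Hypothetical stand-in: return 0 if every pickle loads, 1 on any exception."""
    try:
        for path in sorted(glob.glob(os.path.join(folder, "*.pickle"))):
            with open(path, "rb") as handle:
                pickle.load(handle)
        return 0
    except Exception:
        return 1

# A folder with one valid pickle passes.
with tempfile.TemporaryDirectory() as good_dir:
    with open(os.path.join(good_dir, "result.pickle"), "wb") as handle:
        pickle.dump({"NOPM": 4420}, handle)
    status_good = check_pickles(good_dir)

# An empty file cannot be unpickled, so the check fails.
with tempfile.TemporaryDirectory() as bad_dir:
    with open(os.path.join(bad_dir, "broken.pickle"), "wb") as handle:
        handle.write(b"")
    status_bad = check_pickles(bad_dir)
```

An integer status of this kind can be passed directly to sys.exit() in a command-line wrapper.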