bexhoma.evaluators.dbmsbenchmarker module
Evaluator for DBMSBenchmarker experiments.
Provides DbmsBenchmarkerEvaluator, which extends LogEvaluator to parse
and aggregate per-query performance results, warnings, errors, and latency
statistics produced by the DBMSBenchmarker tool.
Authors: Patrick K. Erdelt
Copyright (C) 2020 Patrick K. Erdelt
SPDX-License-Identifier: AGPL-3.0-or-later
See LICENSE for details.
- class bexhoma.evaluators.dbmsbenchmarker.DbmsBenchmarkerEvaluator(code, path, include_loading=False, include_benchmarking=True, benchmark_run: int = 0)
Bases: LogEvaluator
Evaluator for a DBMSBenchmarker experiment.
Wraps a dbmsbenchmarker.inspector.inspector instance and exposes loading times, per-query latency statistics, throughput metrics, warning and error counts, and aggregation over parallel pods.
- Parameters:
code – Experiment identifier; also the name of the result sub-folder.
path – Root path that contains the result folders.
include_loading – Unused; loading is always enabled for this evaluator.
include_benchmarking – Unused; benchmarking is always enabled.
- benchmarking_aggregate_by_parallel_pods(df, columns=['phase'])
Aggregates parallel-pod DBMSBenchmarker result rows into one row per group.
Groups by columns and applies geo-mean for timing/power metrics and max/sum for count metrics. Recomputes Throughput@Size from the aggregated values.
The phase column holds the phase identifier (configuration-experiment_run-client) and the job column holds the job identifier (configuration-experiment_run-client-benchmark_run).
The default columns=['phase'] groups by phase, producing one row per phase. To keep one row per job, pass columns=['job'].
- Parameters:
df (pandas.DataFrame) – Benchmarking DataFrame (output of get_df_benchmarking()).
- Returns:
Aggregated DataFrame with one row per group.
- Return type:
pandas.DataFrame
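The grouping rule can be illustrated without pandas. The following is a minimal sketch, not bexhoma's implementation: plain dicts stand in for DataFrame rows, the column names latency_ms and num_errors are hypothetical examples of a timing metric and a count metric, and the helpers phase_of and aggregate_by are made up for this example. It assumes that a job identifier is the phase identifier plus a trailing benchmark_run component, as described above. The recomputation of Throughput@Size is left out because its formula is not given here.

```python
from collections import defaultdict
from statistics import geometric_mean


def phase_of(job):
    """Drop the trailing benchmark_run component of a job identifier."""
    return job.rsplit("-", 1)[0]


def aggregate_by(rows, column):
    """Collapse all rows sharing the same value of `column` into one row."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[column]].append(row)
    return {
        key: {
            # timing metric (hypothetical column): geometric mean over the group
            "latency_ms": geometric_mean([r["latency_ms"] for r in members]),
            # count metric (hypothetical column): summed over the group
            "num_errors": sum(r["num_errors"] for r in members),
            # number of input rows that were merged
            "rows": len(members),
        }
        for key, members in groups.items()
    }


# Sample data: two parallel pods in the first job, one pod in the second.
rows = [
    {"job": "pg-1-1-1", "latency_ms": 100.0, "num_errors": 1},
    {"job": "pg-1-1-1", "latency_ms": 400.0, "num_errors": 2},
    {"job": "pg-1-1-2", "latency_ms": 200.0, "num_errors": 0},
]
for row in rows:
    row["phase"] = phase_of(row["job"])

per_phase = aggregate_by(rows, "phase")  # analogous to columns=['phase']
per_job = aggregate_by(rows, "job")      # analogous to columns=['job']
```

Grouping by phase merges all three rows into a single entry, while grouping by job keeps the two benchmark runs apart.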
- benchmarking_set_datatypes(df)
Returns the DataFrame, adding a tenant_id column (value -1) when the column is absent so that DataFrames loaded from older pickles remain compatible with aggregation code that expects the column.
DBMSBenchmarker results are otherwise already typed by the inspector; no other conversion is needed.
- Parameters:
df (pandas.DataFrame) – DataFrame of results.
- Returns:
DataFrame with tenant_id guaranteed to be present.
- Return type:
pandas.DataFrame
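The compatibility step amounts to filling in a default where tenant_id is missing. A minimal sketch of that idea, using a list of dicts in place of a DataFrame; the helper name ensure_tenant_id is hypothetical and not part of bexhoma, and the check here is per row rather than per column:

```python
def ensure_tenant_id(rows, default=-1):
    """Return copies of the rows in which every row carries a tenant_id."""
    # The default comes first so an existing tenant_id in the row overrides it.
    return [{"tenant_id": default, **row} for row in rows]


old_rows = [{"job": "pg-1-1-1"}]                  # e.g. from an older pickle
new_rows = [{"job": "pg-1-1-1", "tenant_id": 3}]  # already has the column

patched_old = ensure_tenant_id(old_rows)
patched_new = ensure_tenant_id(new_rows)
```

Rows that already carry a tenant_id keep their value; only rows without one receive -1.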
- get_df_benchmarking()
Returns the DataFrame containing all benchmarking-phase results.
Combines per-query latency statistics, geo-mean execution times, and per-connection timing data from the DBMSBenchmarker inspector into a single DataFrame. Includes tenant_id read from the BEXHOMA_TENANT_ID loading parameter (-1 when absent).
- Returns:
DataFrame with one row per connection/pod, or empty DataFrame on failure.
- Return type:
pandas.DataFrame
- get_df_loading()
Returns the DataFrame containing all loading-phase timing results.
Reads loading time fields (timeGenerate, timeIngesting, timeSchema, timeIndex, timeLoad) from the inspector’s connection data.
- Returns:
DataFrame with one row per DBMS connection indexed as "DBMS".
- Return type:
pandas.DataFrame
- get_query_latencies(query_titles=False)
Returns the mean execution latency per query and DBMS.
- Parameters:
query_titles (bool) – When True, replaces query index labels with human-readable titles from queries.config.
- Returns:
DataFrame of mean latencies (ms) with queries as columns and DBMS as rows.
- Return type:
pandas.DataFrame
- get_summary_benchmark_per_connection()
Returns benchmarking results with one row per pod, filtered to the key display columns.
Applies benchmarking_set_datatypes() and selects the columns used for the per-connection summary table (experiment run, terminals, target, client, child, time, errors, throughput, goodput, efficiency, and latency percentiles), then sorts by (experiment_run, client, child).
- Returns:
DataFrame indexed as "DBMS" with one row per pod, or None if there are no benchmarking results.
- Return type:
pandas.DataFrame or None
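Because this method can return None while the per-phase methods return an empty DataFrame, calling code may want a single check that covers both cases. A minimal sketch with a hypothetical helper, has_results, which is not part of bexhoma; it relies only on len(), which for a pandas DataFrame is the number of rows:

```python
def has_results(result):
    """True if `result` is neither None nor empty."""
    return result is not None and len(result) > 0


# Hypothetical usage, assuming an evaluator instance named `evaluator`:
# df = evaluator.get_summary_benchmark_per_connection()
# if has_results(df):
#     print(df)
```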
- get_summary_benchmark_per_phase()
Returns benchmarking results aggregated over parallel pods, one row per phase.
Applies benchmarking_set_datatypes(), aggregates via benchmarking_aggregate_by_parallel_pods(), and selects the columns used for the per-phase summary table (experiment run, terminals, target, pod count, time, errors, throughput, goodput, efficiency, and latency percentiles), sorted by (experiment_run, target, pod_count).
- Returns:
DataFrame indexed as "DBMS" with one row per phase, or an empty DataFrame if there are no benchmarking results.
- Return type:
pandas.DataFrame
- get_summary_benchmark_per_phase_multitenant()
Returns benchmarking results aggregated per phase and tenant, one row per (phase, tenant_id).
Like get_summary_benchmark_per_phase() but groups by ['phase', 'tenant_id'] so each tenant appears as a separate row.
- Returns:
DataFrame indexed as "DBMS" with one row per (phase, tenant), or an empty DataFrame if there are no benchmarking results.
- Return type:
pandas.DataFrame
- get_summary_loading_per_run()
Returns loading metrics aggregated per experiment run.
Delegates to get_loading_per_run() (defined in base), which reduces the per-connection loading DataFrame to one row per (code, configuration, experiment_run) and adds a 'Throughput [SF/h]' column.
- Returns:
DataFrame with one row per experiment run.
- Return type:
pandas.DataFrame
- get_total_errors(query_titles=False)
Returns the per-query error counts for this experiment.
- Parameters:
query_titles (bool) – When True, replaces query index labels with human-readable titles from queries.config.
- Returns:
DataFrame of error counts with queries as columns and DBMS as rows.
- Return type:
pandas.DataFrame
- get_total_warnings(query_titles=False)
Returns the per-query warning counts for this experiment.
- Parameters:
query_titles (bool) – When True, replaces query index labels with human-readable titles from queries.config.
- Returns:
DataFrame of warning counts with queries as columns and DBMS as rows.
- Return type:
pandas.DataFrame
- load_inspector()
Loads the DBMSBenchmarker inspector for this experiment.
Creates an inspector.inspector rooted at self.path_base, loads the experiment identified by self.code, and stores the result in self.evaluation. Sets self.evaluation to None if loading fails so callers can detect the uninitialized state.
- record_tests(experiment, df_loading: DataFrame, df_reduced: DataFrame, workflow_actual: dict, workflow_planned: dict, **extra) -> None
Record DBMSBenchmarker pass/fail tests.
Tests query metric columns (Geo Times, Power@Size, Throughput@Size), SQL error and warning counts supplied by _show_extra_sections, and workflow completeness.
- Parameters:
experiment – The owning experiment object.
df_loading – Per-run loading DataFrame (unused here).
df_reduced – Per-phase execution DataFrame.
workflow_actual – Reconstructed actual workflow dict.
workflow_planned – Planned workflow dict from workload config.
extra – Must contain num_errors and num_warnings from _show_extra_sections().
- bexhoma.evaluators.dbmsbenchmarker.map_index_to_queryname(numQuery)
Maps a query index string (e.g., 'q1') to a human-readable title from the global query_properties dictionary.
If the title cannot be resolved, the original input string is returned unchanged.
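The fallback behaviour can be sketched as a dictionary lookup with a default. This is an illustration only: the layout of query_properties is not documented here, so the sketch assumes a hypothetical flat mapping from the index string to a title and takes that mapping as an argument instead of reading a global. The function name title_or_index and the sample title are made up for the example.

```python
def title_or_index(num_query, titles):
    """Return the title for num_query, or num_query itself if none is known."""
    return titles.get(num_query, num_query)


# Hypothetical sample mapping; real titles come from the query configuration.
titles = {"q1": "Example title for query 1"}

resolved = title_or_index("q1", titles)
unresolved = title_or_index("q99", titles)  # no title known: falls back to input
```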