bexhoma.evaluators.ycsb module

Evaluator for YCSB experiments.

Provides YcsbEvaluator, which extends LogEvaluator to parse and aggregate operation throughput and latency results produced by the Yahoo Cloud Serving Benchmark (YCSB) tool.

class bexhoma.evaluators.ycsb.YcsbEvaluator(code, path, include_loading=False, include_benchmarking=True, benchmark_run: int = 0)

Bases: LogEvaluator

Evaluator for a YCSB experiment.

Parses per-pod log files to extract operation counts, throughput, and per-operation latency statistics produced by the Yahoo Cloud Serving Benchmark (YCSB) tool. Provides time-series access to per-second throughput for both the benchmarking and loading phases via get_benchmark_logs_timeseries_df_aggregated(), get_loading_logs_timeseries_df_aggregated(), and their *_single variants.

Parameters:

code – Experiment identifier — also the name of the result sub-folder.
path – Root path that contains the result folders.
include_loading – Ignored; loading is always enabled for this evaluator.
include_benchmarking – Ignored; benchmarking is always enabled.

benchmark_logs_to_timeseries_df(list_logs, metric='current_ops_per_sec', aggregate=True, filetype='benchmarker')

Parses benchmarker log files for the given pod IDs and assembles a time-series DataFrame.

Delegates to logs_to_timeseries_df() with filetype='benchmarker'.

Parameters:

list_logs (list[str]) – Pod IDs used to locate matching log files.
metric (str) – Metric to extract (default 'current_ops_per_sec').
aggregate (bool) – Whether to aggregate all pod DataFrames into one.

Returns:

Aggregated DataFrame or list of per-pod DataFrames.

Return type:

pandas.DataFrame or list[pandas.DataFrame]

benchmarking_aggregate_by_parallel_pods(df, columns=['phase'])

Aggregates parallel-pod YCSB benchmarking rows into one row per group.

Groups by columns, then sums counts and throughput, takes the mean of average latencies, and takes the maximum of percentile and maximum latencies.

The phase column holds the phase identifier (configuration-experiment_run-client) and the job column holds the job identifier (configuration-experiment_run-client-benchmark_run).

The default columns=['phase'] groups by phase, producing one row per phase. To keep one row per job, pass columns=['job'].

Parameters:

df (pandas.DataFrame) – Typed YCSB benchmarking DataFrame.
columns (list[str]) – Grouping columns (default ['phase']).

Returns:

Aggregated DataFrame with one row per group.

Return type:

pandas.DataFrame
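The aggregation rule described above (sum for counts and throughput, mean for average latencies, max for percentile and maximum latencies) can be sketched in plain Python. This is an illustration only, not the bexhoma implementation: the helper names pick_rule and aggregate_by, the column names, and the name-based rule selection are all assumptions, and dictionaries stand in for DataFrame rows to keep the sketch dependency-free.

```python
from collections import defaultdict
from statistics import mean


def pick_rule(column):
    """Choose an aggregation function from the column name (assumed naming)."""
    if "Percentile" in column or "MaxLatency" in column:
        return max
    if "AverageLatency" in column:
        return mean
    return sum  # counts and throughput


def aggregate_by(rows, columns=("phase",)):
    """Collapse rows that share the same values in `columns` into one row each."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[c] for c in columns)].append(row)
    result = []
    for key, members in groups.items():
        out = dict(zip(columns, key))
        for col, value in members[0].items():
            # Only numeric, non-grouping columns are aggregated in this sketch.
            if col in columns or not isinstance(value, (int, float)):
                continue
            out[col] = pick_rule(col)([m[col] for m in members])
        result.append(out)
    return result


# Two parallel pods of the same phase (illustrative values and column names).
rows = [
    {"phase": "A-1-1", "job": "A-1-1-1",
     "[OVERALL].Throughput(ops/sec)": 1000.0,
     "[READ].AverageLatency(us)": 200.0,
     "[READ].99thPercentileLatency(us)": 900.0},
    {"phase": "A-1-1", "job": "A-1-1-1",
     "[OVERALL].Throughput(ops/sec)": 1500.0,
     "[READ].AverageLatency(us)": 300.0,
     "[READ].99thPercentileLatency(us)": 1200.0},
]
per_phase = aggregate_by(rows)
```

With these two rows, per_phase holds a single row whose throughput is the sum (2500.0), whose average latency is the mean (250.0), and whose 99th-percentile latency is the maximum (1200.0).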

benchmarking_set_datatypes(df)

Casts all YCSB benchmarking result columns to their appropriate data types.

Only casts operation-specific columns when they are present in the DataFrame.

Parameters: df (pandas.DataFrame) – DataFrame of raw YCSB benchmarking results.
Returns: DataFrame with columns cast to correct types, or original df on error.
Return type: pandas.DataFrame

get_benchmark_logs_timeseries_df_aggregated(metric='current_ops_per_sec', configuration='', client='1', experiment_run='1')

Returns a DataFrame of per-second benchmarking time-series, aggregated across pods.

Retrieves pod IDs from get_df_benchmarking() and delegates to benchmark_logs_to_timeseries_df() with aggregate=True.

Parameters:

metric (str) – YCSB metric to retrieve (default 'current_ops_per_sec').
configuration (str) – Configuration name (e.g. 'PostgreSQL-64-8-196608').
client (str or int) – Client number (default '1').
experiment_run (str or int) – Experiment run number (default '1').

Returns:

DataFrame indexed by second with one metric column and an 'avg' column.

Return type:

pandas.DataFrame

get_benchmark_logs_timeseries_df_single(metric='current_ops_per_sec', configuration='', client='1', experiment_run='1')

Returns a list of per-pod benchmarking time-series DataFrames (one per pod).

Parameters:

metric (str) – YCSB metric to retrieve (default 'current_ops_per_sec').
configuration (str) – Configuration name (e.g. 'PostgreSQL-64-8-196608').
client (str or int) – Client number (default '1').
experiment_run (str or int) – Experiment run number (default '1').

Returns:

List of DataFrames, one per pod, each indexed by second.

Return type:

list[pandas.DataFrame]

get_df_loading()

Returns the DataFrame containing all loading-phase results.

Returns: DataFrame of loading results, or empty DataFrame when unavailable.
Return type: pandas.DataFrame

get_loading_logs_timeseries_df_aggregated(metric='current_ops_per_sec', configuration='', experiment_run='1')

Returns a DataFrame of time series of a metric for the loading phase, aggregated over all pods per second.

Uses get_df_loading() to retrieve the pod list and benchmark_logs_to_timeseries_df() to parse and aggregate the log files. Restricts to a configuration and an experiment run. Aggregation follows the same strategy as for the benchmarking phase: percentiles and maximum by max, minimum by min, average by average, 'current_ops_per_sec' and all others by sum.

Parameters:

metric (str) – Metric to retrieve (default 'current_ops_per_sec').
configuration (str) – Configuration name (e.g. 'PostgreSQL-64-8-196608').
experiment_run (str or int) – Experiment run number (default '1').

Returns:

DataFrame indexed by second with one column for the aggregated metric plus an 'avg' column, or an empty DataFrame when no files are found.

Return type:

pandas.DataFrame

get_loading_logs_timeseries_df_single(metric='current_ops_per_sec', configuration='', experiment_run='1')

Returns a list of DataFrames of time series of a metric for the loading phase, one per pod.

Uses get_df_loading() to retrieve the pod list and benchmark_logs_to_timeseries_df() to parse the log files without aggregation. Restricts to a configuration and an experiment run.

Parameters:

metric (str) – Metric to retrieve (default 'current_ops_per_sec').
configuration (str) – Configuration name (e.g. 'PostgreSQL-64-8-196608').
experiment_run (str or int) – Experiment run number (default '1').

Returns:

List of DataFrames, one per pod, each indexed by second with one metric column.

Return type:

list[pandas.DataFrame]

get_loading_per_connection()

Returns loading metrics for each individual connection, merged with connection metadata and enriched with the scale factor.

Combines the aggregated loading DataFrame (from get_df_loading()) with connection metadata (from get_connections_of_experiment()) on (code, configuration, experiment_run), then normalises the index. Rows for which no loading log was recorded (missing pod_count) are dropped.

Returns: DataFrame with one row per loading run, indexed as {code}-{configuration}-{experiment_run}.
Return type: pandas.DataFrame

get_loading_per_pod()

Returns the raw loading DataFrame with one row per pod.

Returns: DataFrame from get_df_loading(), with one row per loading pod.
Return type: pandas.DataFrame

get_loading_per_run()

Returns loading metrics aggregated per (code, configuration, experiment_run).

Overrides the base implementation to derive 'Throughput [SF/h]' from '[OVERALL].RunTime(ms)' rather than time_load, since YCSB loading results carry wall-clock run time in milliseconds rather than the bexhoma connection-level timing.

Returns: DataFrame with one row per experiment run.
Return type: pandas.DataFrame
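As a rough illustration of deriving a scale-factor-per-hour throughput from a run time in milliseconds, the following sketch converts '[OVERALL].RunTime(ms)' to hours and divides the scale factor by it. The function name throughput_sf_per_hour is hypothetical, and both the exact formula and the origin of the scale factor are assumptions; the source only states which column the value is derived from.

```python
MS_PER_HOUR = 3_600_000  # 1000 ms/s * 3600 s/h


def throughput_sf_per_hour(scale_factor, runtime_ms):
    """Return scale factors loaded per hour, given a run time in milliseconds.

    Assumed formula: SF / (runtime_ms / 3_600_000).
    """
    if runtime_ms <= 0:
        raise ValueError("runtime_ms must be positive")
    return scale_factor / (runtime_ms / MS_PER_HOUR)


# Loading scale factor 1 in 30 minutes corresponds to 2 SF/h.
example = throughput_sf_per_hour(scale_factor=1, runtime_ms=1_800_000)
```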

get_summary_benchmark_per_connection()

Returns benchmarking results with one row per pod, filtered to the key display columns.

Applies benchmarking_set_datatypes() and selects the columns used for the per-connection summary table (experiment run, terminals, target, client, child, time, errors, throughput, goodput, efficiency, and latency percentiles), then naturally sorts by the connection name.

Returns: DataFrame indexed as "DBMS" with one row per pod, or an empty DataFrame if there are no benchmarking results.
Return type: pandas.DataFrame

get_summary_benchmark_per_phase()

Returns benchmarking results aggregated over parallel pods, one row per phase.

Applies benchmarking_set_datatypes(), aggregates via benchmarking_aggregate_by_parallel_pods(), and selects the columns used for the per-phase summary table (experiment run, terminals, target, pod count, time, errors, throughput, goodput, efficiency, and latency percentiles), sorted by (experiment_run, target, pod_count).

Returns: DataFrame indexed as "DBMS" with one row per phase, or an empty DataFrame if there are no benchmarking results.
Return type: pandas.DataFrame

get_summary_benchmark_per_phase_multitenant()

Returns YCSB benchmarking results aggregated per phase and tenant, one row per (phase, tenant_id).

Like get_summary_benchmark_per_phase() but groups by ['phase', 'tenant_id'] so each tenant appears as a separate row.

Returns: DataFrame indexed as "DBMS" with one row per (phase, tenant), or an empty DataFrame if there are no benchmarking results.
Return type: pandas.DataFrame

get_summary_loading_per_connection()

Returns loading metrics aggregated per experiment run.

Delegates to get_df_loading() (defined in base), which reduces the per-connection loading DataFrame to one row per (code, configuration, experiment_run) and adds a 'Throughput [SF/h]' column.

Returns: DataFrame with one row per experiment run.
Return type: pandas.DataFrame

get_summary_loading_per_run()

Returns loading metrics aggregated per experiment run.

Delegates to get_df_loading() (defined in base), which reduces the per-connection loading DataFrame to one row per (code, configuration, experiment_run) and adds a 'Throughput [SF/h]' column.

Returns: DataFrame with one row per experiment run.
Return type: pandas.DataFrame

loading_aggregate_by_parallel_pods(df, columns=['phase'])

Aggregates parallel-pod YCSB loading rows into one row per job.

The phase column stores BEXHOMA_CONNECTION, which is the job identifier (configuration-experiment_run-client-benchmark_run). The default columns=['phase'] therefore groups by job identifier, producing one row per job. To aggregate per phase, pass columns=['configuration', 'experiment_run', 'client'].

Parameters:

df (pandas.DataFrame) – Typed YCSB loading DataFrame.
columns (list[str]) – Grouping columns (default ['phase']).

Returns:

Aggregated DataFrame with one row per group.

Return type:

pandas.DataFrame
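Because the job identifier has the form configuration-experiment_run-client-benchmark_run and configuration names may themselves contain hyphens, the components are easiest to recover by splitting from the right. The sketch below shows this; the helper names split_job_identifier and phase_of are hypothetical and are not part of bexhoma.

```python
def split_job_identifier(job):
    """Split 'configuration-experiment_run-client-benchmark_run' from the right.

    Splitting from the right keeps hyphens inside the configuration name intact.
    """
    configuration, experiment_run, client, benchmark_run = job.rsplit("-", 3)
    return {
        "configuration": configuration,
        "experiment_run": experiment_run,
        "client": client,
        "benchmark_run": benchmark_run,
    }


def phase_of(job):
    """Drop the trailing benchmark_run to obtain the phase identifier."""
    return job.rsplit("-", 1)[0]


parts = split_job_identifier("PostgreSQL-64-8-196608-1-2-3")
phase = phase_of("PostgreSQL-64-8-196608-1-2-3")
```

Here parts["configuration"] is 'PostgreSQL-64-8-196608' and phase is 'PostgreSQL-64-8-196608-1-2', which matches grouping by configuration, experiment_run, and client.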

loading_logs_to_timeseries_df(list_logs, metric='current_ops_per_sec', aggregate=True, filetype='benchmarker')

Parses loader log files for the given pod IDs and assembles a time-series DataFrame.

Delegates to logs_to_timeseries_df() with filetype='loading'.

Parameters:

list_logs (list[str]) – Pod IDs used to locate matching log files.
metric (str) – Metric to extract (default 'current_ops_per_sec').
aggregate (bool) – Whether to aggregate all pod DataFrames into one.

Returns:

Aggregated DataFrame or list of per-pod DataFrames.

Return type:

pandas.DataFrame or list[pandas.DataFrame]

loading_set_datatypes(df)

Casts all YCSB loading result columns to their appropriate data types.

Parameters: df (pandas.DataFrame) – DataFrame of raw YCSB loading results.
Returns: DataFrame with columns cast to correct types.
Return type: pandas.DataFrame

log_to_df(filename)

Parses a YCSB pod log file into a single-row DataFrame.

Extracts connection metadata, benchmark parameters, and per-operation metrics (throughput, latency percentiles) from the YCSB summary output.

Parameters: filename (str) – Absolute path to the YCSB log file.
Returns: Single-row DataFrame of YCSB results, or empty on parse failure.
Return type: pandas.DataFrame

logs_to_timeseries_df(list_logs, metric='current_ops_per_sec', aggregate=True, filetype='benchmarker')

Parses YCSB log files for the given pod IDs and assembles a time-series DataFrame.

Each pod ID in list_logs is resolved to matching log files via a glob pattern that uses filetype to distinguish benchmarker from loading logs. When aggregate is True the per-second values from all pods are combined: percentile/max metrics use element-wise maximum, minimum metrics use element-wise minimum, and all others are summed. When aggregate is False a list of per-pod DataFrames is returned instead.

Parameters:

list_logs (list[str]) – Pod IDs used to locate matching log files.
metric (str) – Metric column to extract (default 'current_ops_per_sec').
aggregate (bool) – Whether to aggregate all pod DataFrames into one.
filetype (str) – Log file prefix: 'benchmarker' or 'loading'.

Returns:

Aggregated DataFrame indexed by 'sec' (with an 'avg' column appended) when aggregate is True, or a list of per-pod DataFrames.

Return type:

pandas.DataFrame or list[pandas.DataFrame]
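The per-second combination across pods described above can be sketched as follows. This is not the bexhoma code: rule_for and aggregate_per_second are hypothetical helpers, the metric name patterns used to pick a rule are assumptions, and plain dictionaries keyed by second stand in for per-pod DataFrames. The 'avg' column is not reproduced here because the source does not define how it is computed.

```python
def rule_for(metric):
    """Map a metric name to a combining function (name patterns are assumptions)."""
    lowered = metric.lower()
    if "min" in lowered:
        return min
    if "max" in lowered or "percentile" in lowered:
        return max
    return sum  # e.g. 'current_ops_per_sec'


def aggregate_per_second(per_pod, metric="current_ops_per_sec"):
    """Combine per-pod {second: value} mappings into one mapping.

    Seconds missing for a pod are simply left out of that second's combination.
    """
    combine = rule_for(metric)
    seconds = sorted(set().union(*per_pod))
    return {
        sec: combine([pod[sec] for pod in per_pod if sec in pod])
        for sec in seconds
    }


pod_a = {10: 100.0, 20: 120.0}
pod_b = {10: 90.0, 20: 110.0, 30: 50.0}
combined = aggregate_per_second([pod_a, pod_b])
# Hypothetical metric name that selects the element-wise maximum.
worst = aggregate_per_second([{10: 5.0}, {10: 7.0}], metric="read_max_latency")
```

Throughput values are summed per second, so combined is {10: 190.0, 20: 230.0, 30: 50.0}, while the maximum-type metric yields {10: 7.0}.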

parse_ycsb_log_file(file_path)

Scans the lines of a YCSB log file and extracts the performance information relevant for time-series analysis. Each line that starts with a timestamp is converted into a dict of measurements (operations, second of measurement, READ latency, …).

Parameters: file_path – Full path of the log file.
Returns: List of dicts of measurements, one entry per parsed line.
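To make the idea of turning a timestamped status line into a dict concrete, here is a minimal regex-based sketch. It is not the bexhoma parser: parse_status_line is a hypothetical helper, the sample line is modeled on typical YCSB status output and is an assumption, and all dict keys other than 'sec' and 'current_ops_per_sec' are invented for illustration.

```python
import re

# Leading timestamp, elapsed seconds, total operations, current throughput.
STATUS = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}:\d{3}) (\d+) sec: "
    r"(\d+) operations; ([\d.]+) current ops/sec"
)
# Per-operation groups such as "[READ: Count=..., Max=..., ...]".
GROUP = re.compile(r"\[([A-Z_-]+): ([^\]]*)\]")


def parse_status_line(line):
    """Return a dict of measurements, or None if the line is not a status line."""
    match = STATUS.match(line)
    if match is None:
        return None
    record = {
        "timestamp": match.group(1),
        "sec": int(match.group(2)),
        "operations": int(match.group(3)),
        "current_ops_per_sec": float(match.group(4)),
    }
    for operation, body in GROUP.findall(line):
        for pair in body.split(", "):
            key, _, value = pair.partition("=")
            try:
                record[f"{operation}_{key}"] = float(value)
            except ValueError:
                continue  # skip anything that is not key=number
    return record


sample = (
    "2024-01-01 12:00:10:123 10 sec: 5000 operations; 500.0 current ops/sec; "
    "est completion in 1 minute "
    "[READ: Count=2500, Max=1200, Min=100, Avg=250.5, 90=400, 99=900]"
)
parsed = parse_status_line(sample)
skipped = parse_status_line("[OVERALL], RunTime(ms), 10000")
```

In this sketch, lines without a leading timestamp or without a throughput value return None and would be left out of the resulting list.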

record_tests(experiment, df_loading: DataFrame, df_reduced: DataFrame, workflow_actual: dict, workflow_planned: dict, **extra) → None

Record YCSB pass/fail tests.

Tests overall throughput for the loading phase (when data is available). When all configurations had loading deactivated (data pre-existing from a reused PVC), the loading throughput test is skipped rather than failed. When benchmarking is active, also tests execution-phase throughput, workflow completeness, and absence of FAILED operation columns.

Parameters:

experiment – The owning experiment object.
df_loading – Per-run loading DataFrame; empty if loading was not active.
df_reduced – Per-phase execution DataFrame.
workflow_actual – Reconstructed actual workflow dict.
workflow_planned – Planned workflow dict from workload config.