bexhoma.evaluators.ycsb module

Evaluator for YCSB experiments.

Provides YcsbEvaluator, which extends LogEvaluator to parse and aggregate operation throughput and latency results produced by the Yahoo Cloud Serving Benchmark (YCSB) tool.

Authors: Patrick K. Erdelt
Copyright (C) 2020 Patrick K. Erdelt
SPDX-License-Identifier: AGPL-3.0-or-later
See LICENSE for details.

class bexhoma.evaluators.ycsb.YcsbEvaluator(code, path, include_loading=False, include_benchmarking=True, benchmark_run: int = 0)

Bases: LogEvaluator

Evaluator for a YCSB experiment.

Parses per-pod log files to extract operation counts, throughput, and per-operation latency statistics produced by the Yahoo Cloud Serving Benchmark (YCSB) tool. Provides time-series access to per-second throughput for both the benchmarking and loading phases via get_benchmark_logs_timeseries_df_aggregated(), get_loading_logs_timeseries_df_aggregated(), and their *_single variants.

Parameters:
  • code – Experiment identifier; also the name of the result sub-folder.

  • path – Root path that contains the result folders.

  • include_loading – Ignored; loading is always enabled for this evaluator.

  • include_benchmarking – Ignored; benchmarking is always enabled.

benchmark_logs_to_timeseries_df(list_logs, metric='current_ops_per_sec', aggregate=True, filetype='benchmarker')

Parses benchmarker log files for the given pod IDs and assembles a time-series DataFrame.

Delegates to logs_to_timeseries_df() with filetype='benchmarker'.

Parameters:
  • list_logs (list[str]) – Pod IDs used to locate matching log files.

  • metric (str) – Metric to extract (default 'current_ops_per_sec').

  • aggregate (bool) – Whether to aggregate all pod DataFrames into one.

Returns:

Aggregated DataFrame or list of per-pod DataFrames.

Return type:

pandas.DataFrame or list[pandas.DataFrame]

benchmarking_aggregate_by_parallel_pods(df, columns=['phase'])

Aggregates parallel-pod YCSB benchmarking rows into one row per group.

Groups rows by columns, then sums counts and throughput, takes the mean of average latencies, and takes the maximum of percentile and maximum latencies.

The phase column holds the phase identifier (configuration-experiment_run-client) and the job column holds the job identifier (configuration-experiment_run-client-benchmark_run).

The default columns=['phase'] groups by phase, producing one row per phase. To keep one row per job, pass columns=['job'].

Parameters:
  • df (pandas.DataFrame) – Typed YCSB benchmarking DataFrame.

  • columns (list[str]) – Grouping columns (default ['phase']).

Returns:

Aggregated DataFrame with one row per group.

Return type:

pandas.DataFrame
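The aggregation rule described for this method (sum for counts and throughput, mean for average latencies, max for percentile and maximum latencies) can be sketched in plain Python. This is a dependency-free illustration of the rule, not the library's implementation; the column names and the helper aggregate_by are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical column names, chosen for illustration only.
SUM_COLS = ("operations", "throughput_ops_per_sec")
MEAN_COLS = ("read_avg_latency_us",)
MAX_COLS = ("read_p99_latency_us", "read_max_latency_us")


def aggregate_by(rows, columns=("phase",)):
    """Collapse per-pod rows into one row per distinct value of `columns`."""
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[c] for c in columns)].append(row)

    aggregated = []
    for key, members in groups.items():
        out = dict(zip(columns, key))
        for col in SUM_COLS:   # counts and throughput add up across pods
            out[col] = sum(r[col] for r in members)
        for col in MEAN_COLS:  # average latencies are averaged
            out[col] = mean(r[col] for r in members)
        for col in MAX_COLS:   # percentile and max latencies take the worst pod
            out[col] = max(r[col] for r in members)
        aggregated.append(out)
    return aggregated


# Two pods that ran in parallel within the same phase.
pods = [
    {"phase": "PostgreSQL-64-8-196608-1-1", "operations": 1000,
     "throughput_ops_per_sec": 500.0, "read_avg_latency_us": 200.0,
     "read_p99_latency_us": 900, "read_max_latency_us": 1500},
    {"phase": "PostgreSQL-64-8-196608-1-1", "operations": 1200,
     "throughput_ops_per_sec": 600.0, "read_avg_latency_us": 300.0,
     "read_p99_latency_us": 1100, "read_max_latency_us": 1400},
]
per_phase = aggregate_by(pods)
```

Taking the maximum of per-pod percentiles gives a conservative upper bound rather than the true percentile over all operations, which cannot be recovered from per-pod summaries alone.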

benchmarking_set_datatypes(df)

Casts all YCSB benchmarking result columns to their appropriate data types.

Only casts operation-specific columns when they are present in the DataFrame.

Parameters:

df (pandas.DataFrame) – DataFrame of raw YCSB benchmarking results.

Returns:

DataFrame with columns cast to correct types, or original df on error.

Return type:

pandas.DataFrame

get_benchmark_logs_timeseries_df_aggregated(metric='current_ops_per_sec', configuration='', client='1', experiment_run='1')

Returns a DataFrame of per-second benchmarking time-series, aggregated across pods.

Retrieves pod IDs from get_df_benchmarking() and delegates to benchmark_logs_to_timeseries_df() with aggregate=True.

Parameters:
  • metric (str) – YCSB metric to retrieve (default 'current_ops_per_sec').

  • configuration (str) – Configuration name (e.g. 'PostgreSQL-64-8-196608').

  • client (str or int) – Client number (default '1').

  • experiment_run (str or int) – Experiment run number (default '1').

Returns:

DataFrame indexed by second with one metric column and an 'avg' column.

Return type:

pandas.DataFrame

get_benchmark_logs_timeseries_df_single(metric='current_ops_per_sec', configuration='', client='1', experiment_run='1')

Returns a list of per-pod benchmarking time-series DataFrames (one per pod).

Parameters:
  • metric (str) – YCSB metric to retrieve (default 'current_ops_per_sec').

  • configuration (str) – Configuration name (e.g. 'PostgreSQL-64-8-196608').

  • client (str or int) – Client number (default '1').

  • experiment_run (str or int) – Experiment run number (default '1').

Returns:

List of DataFrames, one per pod, each indexed by second.

Return type:

list[pandas.DataFrame]

get_df_loading()

Returns the DataFrame containing all loading-phase results.

Returns:

DataFrame of loading results, or empty DataFrame when unavailable.

Return type:

pandas.DataFrame

get_loading_logs_timeseries_df_aggregated(metric='current_ops_per_sec', configuration='', experiment_run='1')

Returns a DataFrame of time series of a metric for the loading phase, aggregated over all pods per second.

Uses get_df_loading() to retrieve the pod list and benchmark_logs_to_timeseries_df() to parse and aggregate the log files. Restricts to a configuration and an experiment run. Aggregation follows the same strategy as for the benchmarking phase: percentiles and maximum by max, minimum by min, average by average, 'current_ops_per_sec' and all others by sum.

Parameters:
  • metric (str) – Metric to retrieve (default 'current_ops_per_sec').

  • configuration (str) – Configuration name (e.g. 'PostgreSQL-64-8-196608').

  • experiment_run (str or int) – Experiment run number (default '1').

Returns:

DataFrame indexed by second with one column for the aggregated metric plus an 'avg' column, or an empty DataFrame when no files are found.

Return type:

pandas.DataFrame

get_loading_logs_timeseries_df_single(metric='current_ops_per_sec', configuration='', experiment_run='1')

Returns a list of DataFrames of time series of a metric for the loading phase, one per pod.

Uses get_df_loading() to retrieve the pod list and benchmark_logs_to_timeseries_df() to parse the log files without aggregation. Restricts to a configuration and an experiment run.

Parameters:
  • metric (str) – Metric to retrieve (default 'current_ops_per_sec').

  • configuration (str) – Configuration name (e.g. 'PostgreSQL-64-8-196608').

  • experiment_run (str or int) – Experiment run number (default '1').

Returns:

List of DataFrames, one per pod, each indexed by second with one metric column.

Return type:

list[pandas.DataFrame]

get_loading_per_connection()

Returns loading metrics for each individual connection, merged with connection metadata and enriched with the scale factor.

Combines the aggregated loading DataFrame (from get_df_loading()) with connection metadata (from get_connections_of_experiment()) on (code, configuration, experiment_run), then normalises the index. Rows for which no loading log was recorded (missing pod_count) are dropped.

Returns:

DataFrame with one row per loading run, indexed as {code}-{configuration}-{experiment_run}.

Return type:

pandas.DataFrame
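The merge described for this method can be sketched as a keyed join in plain Python. This is a simplified illustration under assumptions: the helper name, the scale_factor field, and the sample values are hypothetical, and the real method operates on DataFrames.

```python
KEYS = ("code", "configuration", "experiment_run")


def merge_loading_with_connections(loading_rows, connection_rows):
    """Join loading rows to connection metadata on KEYS.

    Connections without a loading row (or without pod_count) are dropped.
    The result is keyed by '{code}-{configuration}-{experiment_run}'.
    """
    loading_by_key = {tuple(r[k] for k in KEYS): r for r in loading_rows}
    merged = {}
    for conn in connection_rows:
        key = tuple(conn[k] for k in KEYS)
        loading = loading_by_key.get(key)
        if loading is None or loading.get("pod_count") is None:
            continue  # no loading log recorded for this connection
        merged["-".join(str(part) for part in key)] = {**conn, **loading}
    return merged


# Hypothetical sample data: only experiment run 1 has a loading log.
loading = [
    {"code": "1234", "configuration": "PostgreSQL-64-8-196608",
     "experiment_run": 1, "pod_count": 8},
]
connections = [
    {"code": "1234", "configuration": "PostgreSQL-64-8-196608",
     "experiment_run": 1, "scale_factor": 16},
    {"code": "1234", "configuration": "PostgreSQL-64-8-196608",
     "experiment_run": 2, "scale_factor": 16},
]
merged = merge_loading_with_connections(loading, connections)
```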

get_loading_per_pod()

Returns the raw loading DataFrame with one row per pod.

Returns:

DataFrame from get_df_loading(), with one row per loading pod.

Return type:

pandas.DataFrame

get_loading_per_run()

Returns loading metrics aggregated per (code, configuration, experiment_run).

Overrides the base implementation to derive 'Throughput [SF/h]' from '[OVERALL].RunTime(ms)' rather than time_load, since YCSB loading results carry wall-clock run time in milliseconds rather than the bexhoma connection-level timing.

Returns:

DataFrame with one row per experiment run.

Return type:

pandas.DataFrame
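As a worked example of the unit conversion involved, the sketch below derives a throughput in scale factors per hour from a run time in milliseconds. It assumes throughput is simply the scale factor divided by the run time in hours; the exact formula used by the evaluator, and how run times from several pods are combined, is not stated here. The function name is hypothetical.

```python
MS_PER_HOUR = 3_600_000


def throughput_sf_per_hour(scale_factor, runtime_ms):
    """Scale factors loaded per hour, given the run time in milliseconds."""
    hours = runtime_ms / MS_PER_HOUR
    return scale_factor / hours


# Loading scale factor 10 in 30 minutes (1,800,000 ms).
result = throughput_sf_per_hour(10, 1_800_000)
```

With these inputs the run takes half an hour, so the result is 20 SF/h.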

get_summary_benchmark_per_connection()

Returns benchmarking results with one row per pod, filtered to the key display columns.

Applies benchmarking_set_datatypes() and selects the columns used for the per-connection summary table (experiment run, terminals, target, client, child, time, errors, throughput, goodput, efficiency, and latency percentiles), then sorts by (experiment_run, client, child).

Returns:

DataFrame indexed as "DBMS" with one row per pod, or an empty DataFrame if there are no benchmarking results.

Return type:

pandas.DataFrame

get_summary_benchmark_per_phase()

Returns benchmarking results aggregated over parallel pods, one row per phase.

Applies benchmarking_set_datatypes(), aggregates via benchmarking_aggregate_by_parallel_pods(), and selects the columns used for the per-phase summary table (experiment run, terminals, target, pod count, time, errors, throughput, goodput, efficiency, and latency percentiles), sorted by (experiment_run, target, pod_count).

Returns:

DataFrame indexed as "DBMS" with one row per phase, or an empty DataFrame if there are no benchmarking results.

Return type:

pandas.DataFrame

get_summary_benchmark_per_phase_multitenant()

Returns YCSB benchmarking results aggregated per phase and tenant, one row per (phase, tenant_id).

Like get_summary_benchmark_per_phase() but groups by ['phase', 'tenant_id'] so each tenant appears as a separate row.

Returns:

DataFrame indexed as "DBMS" with one row per (phase, tenant), or an empty DataFrame if there are no benchmarking results.

Return type:

pandas.DataFrame

get_summary_loading_per_connection()

Returns loading metrics aggregated per experiment run.

Delegates to get_df_loading() (defined in base), which reduces the per-connection loading DataFrame to one row per (code, configuration, experiment_run) and adds a 'Throughput [SF/h]' column.

Returns:

DataFrame with one row per experiment run.

Return type:

pandas.DataFrame

get_summary_loading_per_run()

Returns loading metrics aggregated per experiment run.

Delegates to get_df_loading() (defined in base), which reduces the per-connection loading DataFrame to one row per (code, configuration, experiment_run) and adds a 'Throughput [SF/h]' column.

Returns:

DataFrame with one row per experiment run.

Return type:

pandas.DataFrame

loading_aggregate_by_parallel_pods(df, columns=['phase'])

Aggregates parallel-pod YCSB loading rows into one row per job.

The phase column stores BEXHOMA_CONNECTION, which is the job identifier (configuration-experiment_run-client-benchmark_run). The default columns=['phase'] therefore groups by job identifier, producing one row per job. To aggregate per phase, pass columns=['configuration', 'experiment_run', 'client'].

Parameters:
  • df (pandas.DataFrame) – Typed YCSB loading DataFrame.

  • columns (list[str]) – Grouping columns (default ['phase']).

Returns:

Aggregated DataFrame with one row per group.

Return type:

pandas.DataFrame
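The difference between grouping loading rows per job and per phase can be illustrated by counting the distinct groups each choice of columns produces. This is a minimal sketch with made-up rows; group_keys is a hypothetical helper, not part of the evaluator.

```python
def group_keys(rows, columns):
    """Return the sorted distinct grouping keys for the given columns."""
    return sorted({tuple(row[c] for c in columns) for row in rows})


base = {"configuration": "PostgreSQL-64-8-196608", "experiment_run": 1, "client": 1}
loading_rows = [
    # Two parallel pods of the same job (benchmark_run 1).
    {**base, "phase": "PostgreSQL-64-8-196608-1-1-1"},
    {**base, "phase": "PostgreSQL-64-8-196608-1-1-1"},
    # One pod of a second job (benchmark_run 2) in the same phase.
    {**base, "phase": "PostgreSQL-64-8-196608-1-1-2"},
]

per_job = group_keys(loading_rows, ["phase"])
per_phase = group_keys(loading_rows, ["configuration", "experiment_run", "client"])
```

Grouping by ['phase'] yields two groups (one per job), while grouping by the three explicit columns yields a single group for the whole phase.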

loading_logs_to_timeseries_df(list_logs, metric='current_ops_per_sec', aggregate=True, filetype='benchmarker')

Parses loader log files for the given pod IDs and assembles a time-series DataFrame.

Delegates to logs_to_timeseries_df() with filetype='loading'.

Parameters:
  • list_logs (list[str]) – Pod IDs used to locate matching log files.

  • metric (str) – Metric to extract (default 'current_ops_per_sec').

  • aggregate (bool) – Whether to aggregate all pod DataFrames into one.

Returns:

Aggregated DataFrame or list of per-pod DataFrames.

Return type:

pandas.DataFrame or list[pandas.DataFrame]

loading_set_datatypes(df)

Casts all YCSB loading result columns to their appropriate data types.

Parameters:

df (pandas.DataFrame) – DataFrame of raw YCSB loading results.

Returns:

DataFrame with columns cast to correct types.

Return type:

pandas.DataFrame

log_to_df(filename)

Parses a YCSB pod log file into a single-row DataFrame.

Extracts connection metadata, benchmark parameters, and per-operation metrics (throughput, latency percentiles) from the YCSB summary output.

Parameters:

filename (str) – Absolute path to the YCSB log file.

Returns:

Single-row DataFrame of YCSB results, or empty on parse failure.

Return type:

pandas.DataFrame

logs_to_timeseries_df(list_logs, metric='current_ops_per_sec', aggregate=True, filetype='benchmarker')

Parses YCSB log files for the given pod IDs and assembles a time-series DataFrame.

Each pod ID in list_logs is resolved to matching log files via a glob pattern that uses filetype to distinguish benchmarker from loading logs. When aggregate is True the per-second values from all pods are combined: percentile/max metrics use element-wise maximum, minimum metrics use element-wise minimum, and all others are summed. When aggregate is False a list of per-pod DataFrames is returned instead.

Parameters:
  • list_logs (list[str]) – Pod IDs used to locate matching log files.

  • metric (str) – Metric column to extract (default 'current_ops_per_sec').

  • aggregate (bool) – Whether to aggregate all pod DataFrames into one.

  • filetype (str) – Log file prefix: 'benchmarker' or 'loading'.

Returns:

Aggregated DataFrame indexed by 'sec' (with an 'avg' column appended) when aggregate is True, or a list of per-pod DataFrames.

Return type:

pandas.DataFrame or list[pandas.DataFrame]
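The per-second combination across pods can be sketched as follows, with each pod represented as a dict mapping second to value. This is an illustration under assumptions: the helper name is hypothetical, the caller picks the combiner (sum, max, or min) instead of deriving it from the metric name, seconds missing from a pod are simply skipped for that pod, and the 'avg' column is omitted because its definition is not given here.

```python
def aggregate_per_second(per_pod_series, combine=sum):
    """Combine per-pod {sec: value} series into one {sec: value} series."""
    seconds = sorted({sec for series in per_pod_series for sec in series})
    return {
        sec: combine([series[sec] for series in per_pod_series if sec in series])
        for sec in seconds
    }


pod_a = {10: 100.0, 20: 120.0}
pod_b = {10: 90.0, 20: 110.0, 30: 50.0}  # this pod ran one interval longer

total = aggregate_per_second([pod_a, pod_b], combine=sum)  # e.g. current_ops_per_sec
worst = aggregate_per_second([pod_a, pod_b], combine=max)  # percentile/max metrics
best = aggregate_per_second([pod_a, pod_b], combine=min)   # minimum metrics
```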

parse_ycsb_log_file(file_path)

Scans the lines of a YCSB log file and extracts the performance information relevant for time-series analysis. Each line starting with a timestamp is converted into a dict of measurements (operations, second of measurement, READ latency, …).

Parameters:

file_path – Full path of the log file.

Returns:

List of dicts of measurements, one entry per parsed line.
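A minimal sketch of this kind of line parsing is shown below. It assumes the common YCSB status-line layout (timestamp, elapsed seconds, operation count, optional current ops/sec, then bracketed per-operation statistics); the regular expressions, the function name, and the resulting dict keys such as READ_Avg are illustrative assumptions and may differ from what parse_ycsb_log_file() actually produces.

```python
import re

# Assumed status-line layout; real logs may vary between YCSB versions.
STATUS_RE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}:\d{3}) "
    r"(?P<sec>\d+) sec: (?P<operations>\d+) operations; "
    r"(?:(?P<current_ops_per_sec>[\d.]+) current ops/sec;)?"
)
OP_RE = re.compile(r"\[(?P<op>[A-Z-]+): (?P<stats>[^\]]*)\]")


def parse_status_lines(lines):
    """Turn each timestamped status line into a dict of measurements."""
    entries = []
    for line in lines:
        match = STATUS_RE.match(line)
        if match is None:
            continue  # not a status line (e.g. the final summary section)
        entry = {"sec": int(match["sec"]), "operations": int(match["operations"])}
        if match["current_ops_per_sec"] is not None:
            entry["current_ops_per_sec"] = float(match["current_ops_per_sec"])
        for op in OP_RE.finditer(line):
            for pair in op["stats"].split(", "):
                name, _, value = pair.partition("=")
                if value:
                    entry[f"{op['op']}_{name}"] = float(value)
        entries.append(entry)
    return entries


sample = [
    "2024-01-01 12:00:00:001 0 sec: 0 operations; est completion in 0 second",
    "2024-01-01 12:00:10:123 10 sec: 59153 operations; 5915.3 current ops/sec; "
    "est completion in 2 minutes "
    "[READ: Count=29677, Max=26335, Min=207, Avg=1095.56, 90=2023, 99=4755] "
    "[UPDATE: Count=29476, Max=30000, Min=250, Avg=1200.5, 90=2100, 99=5000]",
    "[OVERALL], RunTime(ms), 10110",
]
entries = parse_status_lines(sample)
```

Lines that do not begin with a timestamp, such as the final summary rows, are ignored, so the result holds one dict per status interval.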

record_tests(experiment, df_loading: DataFrame, df_reduced: DataFrame, workflow_actual: dict, workflow_planned: dict, **extra) → None

Record YCSB pass/fail tests.

Tests overall throughput for the loading phase (when data is available) and the execution phase, workflow completeness, and absence of FAILED operation columns.

Parameters:
  • experiment – The owning experiment object.

  • df_loading – Per-run loading DataFrame; empty if loading was not active.

  • df_reduced – Per-phase execution DataFrame.

  • workflow_actual – Reconstructed actual workflow dict.

  • workflow_planned – Planned workflow dict from workload config.