bexhoma.experiments module
- Date:
2022-10-01
- Version:
0.6.0
- Authors:
Patrick K. Erdelt
Classes for managing an experiment. This is plugged into a cluster object. It collects some configuration objects. Two examples are included, dealing with TPC-H and TPC-DS tests. Another example concerns a TSBS experiment. Each experiment should also have its own folder containing:
a query file
a subfolder for each dbms that may run this experiment, including schema files
Copyright (C) 2020 Patrick K. Erdelt
This program is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more details.
You should have received a copy of the GNU Affero General Public License along with this program. If not, see <https://www.gnu.org/licenses/>.
- class bexhoma.experiments.benchbase(cluster, code=None, SF='1', num_experiment_to_apply=1, timeout=7200)
Bases:
default
Class for defining a Benchbase experiment. This sets
the folder of the experiment - including the query file and schema information per dbms
name and information about the experiment
additional parameters - here SF (the scaling factor), i.e., the number of rows divided by 10,000
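The relation between SF and the table size can be illustrated with a plain helper (illustrative only, not part of the bexhoma API):

```python
def benchbase_rows(SF):
    # Illustrative helper (not part of bexhoma): Benchbase's SF is the
    # number of rows divided by 10,000, so rows = SF * 10,000.
    # SF is passed as a string, as in the constructor signature above.
    return int(SF) * 10_000

print(benchbase_rows('1'))  # the default SF='1' corresponds to 10,000 rows
```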
- evaluate_results(pod_dashboard='')
Build a DataFrame locally that contains all benchmarking results. This is specific to Benchbase.
- get_parts_of_name(name)
- log_to_df(filename)
- test_results()
Run test script locally. Extract exit code.
- Returns:
exit code of test script
- class bexhoma.experiments.default(cluster, code=None, num_experiment_to_apply=1, timeout=7200, detached=True)
Bases:
object
Class for defining an experiment. Settings here are generic; this class should be subclassed to define specific experiments.
- add_benchmark_list(list_clients)
Adds a list of benchmarker instance counts that are to benchmark the current SUT. Example: [1,2,1] means we sequentially have 1, then 2, and then 1 benchmarker instances. This is applied to all dbms configurations of the experiment.
- Parameters:
list_clients – List of (number of) benchmarker instances
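The sequential semantics of list_clients can be sketched in plain Python (hypothetical names; bexhoma's internals differ, since the real jobs run in Kubernetes):

```python
def run_benchmark_sequence(list_clients, submit):
    # Illustrative sketch: one benchmarking step per entry of list_clients,
    # spawning that many benchmarker instances each time. 'submit' is a
    # callback standing in for the actual job submission.
    log = []
    for step, num_instances in enumerate(list_clients, start=1):
        log.append(submit(step, num_instances))
    return log

# [1, 2, 1] means: first 1 instance, then 2, then again 1
log = run_benchmark_sequence([1, 2, 1],
                             lambda step, n: f"step {step}: {n} instance(s)")
```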
- add_configuration(configuration)
Adds a configuration object to the list of configurations of this experiment. When a new configuration object is instantiated, an experiment object has to be provided; this method is then called automatically.
- Parameters:
configuration – Configuration object
- benchmark_list(list_clients)
Deprecated (no longer used). Runs a given list of benchmarkers applied to all running SUTs of the experiment.
- Parameters:
list_clients – List of (number of) benchmarker instances
- delay(sec, silent=False)
Waits for some time and reports the wait via output. Synonym for wait().
- Parameters:
sec – Number of seconds to wait
silent – True means we do not output anything about this waiting
- end_benchmarking(jobname, config=None)
Ends a benchmarker job. This is for storing or cleaning measures.
- Parameters:
jobname – Name of the job to clean
config – Configuration object
- end_loading(jobname)
Ends a loading job. This is for storing or cleaning measures.
- Parameters:
jobname – Name of the job to clean
- evaluate_results(pod_dashboard='')
Let the dashboard pod build the evaluations. This is specific to dbmsbenchmarker.
All local logs are copied to the pod.
Benchmarker in the dashboard pod is updated (dev channel)
All results of all DBMS are joined (merge.py of benchmarker) in dashboard pod
Evaluation cube is built (python benchmark.py read -e yes) in dashboard pod
- extract_job_timing(jobname, container)
- get_job_timing_benchmarking(jobname)
- get_job_timing_loading(jobname)
- get_workflow_list()
Returns the benchmarking workflow as a dict of lists of lists. Keys are connection names; values are lists of lists. Each inner list is, for example, added via add_benchmark_list(). Inner lists are repeated according to self.num_experiment_to_apply. Example: {'PostgreSQL-24-1-16384': [[1, 2]], 'MySQL-24-1-16384': [[1, 2]], 'PostgreSQL-24-1-32768': [[1, 2]], 'MySQL-24-1-32768': [[1, 2]]}
- Returns:
Dict of benchmarking workflow
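The shape of the returned structure can be sketched as follows (an illustration of the repetition logic, not bexhoma's implementation):

```python
def build_workflow_list(connections, benchmark_list, num_experiment_to_apply):
    # Each connection name maps to the benchmark list, repeated once per
    # experiment instance (self.num_experiment_to_apply in bexhoma).
    return {name: [list(benchmark_list) for _ in range(num_experiment_to_apply)]
            for name in connections}

workflow = build_workflow_list(
    ['PostgreSQL-24-1-16384', 'MySQL-24-1-16384'], [1, 2], 1)
# → {'PostgreSQL-24-1-16384': [[1, 2]], 'MySQL-24-1-16384': [[1, 2]]}
```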
- patch_benchmarking(patch)
Patches the YAML of benchmarking components. Can be set by the experiment before creation of a configuration.
- Parameters:
patch – String in YAML format, overwrites basic YAML file content
- patch_loading(patch)
Patches YAML of loading components. Can be overwritten by configuration.
- Parameters:
patch – String in YAML format, overwrites basic YAML file content
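bexhoma takes the patch as a YAML string that overwrites the basic YAML file content. The overwrite semantics can be sketched as a recursive dict merge (an illustration of the idea, not bexhoma's actual implementation; the field names below are examples):

```python
def deep_merge(base, patch):
    # Recursively overlays patch onto base; patch values win.
    merged = dict(base)
    for key, value in patch.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged

base_job = {'spec': {'template': {'spec': {'restartPolicy': 'Never'}}}}
patch = {'spec': {'template': {'spec': {'nodeSelector': {'disktype': 'ssd'}}}}}
patched = deep_merge(base_job, patch)
```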
- set_additional_labels(**kwargs)
Sets additional labels that will be attached to K8s objects (and ignored otherwise). This is for the SUT component. Can be overwritten by configuration.
- Parameters:
kwargs – Dict of labels, example ‘SF’ => 100
- set_benchmarking_parameters(**kwargs)
Sets ENV for benchmarking components. Can be overwritten by configuration.
- Parameters:
kwargs – Dict of meta data, example ‘PARALLEL’ => ‘64’
- set_connectionmanagement(**kwargs)
Sets connection management data for the experiment. This is for the benchmarker component (dbmsbenchmarker). Can be overwritten by configuration.
- Parameters:
kwargs – Dict of meta data, example ‘timeout’ => 60
- set_ddl_parameters(**kwargs)
Sets DDL parameters for the experiment. These substitute placeholders in the DDL scripts. Can be overwritten by configuration.
- Parameters:
kwargs – Dict of meta data, example ‘index’ => ‘btree’
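The placeholder substitution can be illustrated with Python's string.Template (the DDL snippet and placeholder syntax here are hypothetical; bexhoma's actual schema scripts may use a different convention):

```python
from string import Template

# Hypothetical DDL snippet with an $index placeholder, standing in for a
# schema script parameterized via set_ddl_parameters(index='btree').
ddl = Template("CREATE INDEX customer_idx ON customer (c_custkey) USING $index;")
statement = ddl.substitute(index='btree')
```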
- set_eval_parameters(**kwargs)
Sets some arbitrary parameters that are supposed to be handed over to the benchmarker component. Can be overwritten by configuration.
- Parameters:
kwargs – Dict of meta data, example ‘type’ => ‘noindex’
- set_experiment(instance=None, volume=None, docker=None, script=None, indexing=None)
Reads experiment details from the cluster config
- Parameters:
instance –
volume –
docker –
script –
indexing –
- set_experiments_configfolder(experiments_configfolder)
Sets the configuration folder for the experiment. Bexhoma expects subfolders for experiment types, for example tpch. In there, bexhoma looks for query.config files (for dbmsbenchmarker) and subfolders containing the schema per dbms.
- Parameters:
experiments_configfolder – Relative path to an experiment folder
- set_loading(parallel, num_pods=None)
Sets job parameters for loading components: the number of parallel pods and, optionally (if different), the total number of pods. By default, the total number of pods is set to the number of parallel pods. Can be overwritten by configuration.
- Parameters:
parallel – Number of parallel pods
num_pods – Optionally (if different) total number of pods
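The defaulting logic can be sketched as follows; mapping the two values onto the Kubernetes Job fields parallelism and completions is an assumption here, not a statement about bexhoma's templates:

```python
def loading_job_params(parallel, num_pods=None):
    # Sketch of the set_loading() default: the total number of pods falls
    # back to the number of parallel pods when num_pods is not given.
    total = num_pods if num_pods is not None else parallel
    return {'parallelism': parallel, 'completions': total}
```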
- set_loading_parameters(**kwargs)
Sets ENV for loading components. Can be overwritten by configuration.
- Parameters:
kwargs – Dict of meta data, example ‘PARALLEL’ => ‘64’
- set_maintaining(parallel, num_pods=None)
Sets job parameters for maintaining components: the number of parallel pods and, optionally (if different), the total number of pods. By default, the total number of pods is set to the number of parallel pods. Can be overwritten by configuration.
- Parameters:
parallel – Number of parallel pods
num_pods – Optionally (if different) total number of pods
- set_maintaining_parameters(**kwargs)
Sets ENV for maintaining components. Can be overwritten by configuration.
- Parameters:
kwargs – Dict of meta data, example ‘PARALLEL’ => ‘64’
- set_nodes(**kwargs)
- set_queryfile(queryfile)
Sets the name of a query file of the experiment. This is for the benchmarker component (dbmsbenchmarker).
- Parameters:
queryfile – Name of the query file
- set_querymanagement(**kwargs)
Sets query management data for the experiment. This is for the benchmarker component (dbmsbenchmarker).
- Parameters:
kwargs – Dict of meta data, example ‘numRun’ => 3
- set_querymanagement_monitoring(numRun=256, delay=10, datatransfer=False)
Sets some parameters that are supposed to be suitable for a monitoring test:
high number of runs
optional delay
optional data transfer
monitoring active
- Parameters:
numRun – Number of runs per query (this is for the benchmarker component)
delay – Number of seconds to wait between queries (this is for the benchmarker component)
datatransfer – Whether result data should be retrieved and compared
- set_querymanagement_quicktest(numRun=1, datatransfer=False)
Sets some parameters that are supposed to be suitable for a quick functional test:
small number of runs
no delay
optional data transfer
no monitoring
- Parameters:
numRun – Number of runs per query (this is for the benchmarker component)
datatransfer – Whether result data should be retrieved and compared
- set_resources(**kwargs)
Sets resources for the experiment. This is for the SUT component. Can be overwritten by experiment and configuration.
- Parameters:
kwargs – Dict of meta data, example ‘requests’ => {‘cpu’ => 4}
- set_storage(**kwargs)
Sets parameters for the storage that may be attached to components. This is in particular for the database of the dbms under test. Example:
storageClassName = ‘ssd’, storageSize = ‘100Gi’, keep = False
Can be overwritten by configuration.
- Parameters:
kwargs – Dict of meta data, example ‘storageSize’ => ‘100Gi’
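How these kwargs could map onto a PersistentVolumeClaim-like spec can be sketched as follows (the field layout is an assumption for illustration, not bexhoma's actual YAML template):

```python
def storage_settings(storageClassName='ssd', storageSize='100Gi', keep=False):
    # Illustrative mapping of the set_storage() kwargs onto a PVC-style dict.
    return {
        'spec': {
            'storageClassName': storageClassName,
            'resources': {'requests': {'storage': storageSize}},
        },
        'keep': keep,  # whether the volume should survive the experiment
    }

spec = storage_settings(storageClassName='ssd', storageSize='100Gi', keep=False)
```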
- set_workload(**kwargs)
Sets meta data about the experiment, for example name and description.
- Parameters:
kwargs – Dict of meta data, example ‘name’ => ‘TPC-H’
- show_summary()
- start_loading()
Tells all dbms configurations of this experiment to start loading data.
- start_monitoring()
Start monitoring for all dbms configurations of this experiment.
- start_sut()
Start all dbms configurations of this experiment.
- stop_benchmarker(configuration='')
Stop all benchmarker jobs of this experiment. If a dbms configuration is given, use it. Otherwise tell the cluster to stop all benchmarker jobs belonging to this experiment code.
- stop_loading()
Stop all loading jobs of this experiment. If a list of dbms configurations is set, use it. Otherwise tell the cluster to stop all loading jobs belonging to this experiment code.
- stop_maintaining()
Stop all maintaining jobs of this experiment. If a list of dbms configurations is set, use it. Otherwise tell the cluster to stop all maintaining jobs belonging to this experiment code.
- stop_monitoring()
Stop all monitoring deployments of this experiment. If a list of dbms configurations is set, use it. Otherwise tell the cluster to stop all monitoring deployments belonging to this experiment code.
- stop_sut()
Stop all SUT deployments of this experiment. If a list of dbms configurations is set, use it. Otherwise tell the cluster to stop all SUT deployments belonging to this experiment code.
- test_results()
Run test script in dashboard pod. Extract exit code.
- Returns:
exit code of test script
- wait(sec, silent=False)
Waits for some time and reports the wait via output.
- Parameters:
sec – Number of seconds to wait
silent – True means we do not output anything about this waiting
- work_benchmark_list(intervals=30, stop=True)
Runs the typical workflow:
1. start SUT
2. start monitoring
3. start loading (at first scripts (schema or loading via pull), then optionally parallel loading pods)
4. optionally start maintaining pods
5. at the same time as 4., run benchmarker jobs corresponding to the list given via add_benchmark_list()
- Parameters:
intervals – Seconds to wait before checking change of status
stop – Tells whether the SUT should be removed when all benchmarking has finished. Set to False to keep loaded SUTs for inspection.
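The workflow can be outlined as a polling loop over a dummy experiment object (all names on the experiment are hypothetical stand-ins; the real method polls Kubernetes state and also handles the maintaining step):

```python
import time

class _DemoExperiment:
    # Minimal stand-in that records the order of workflow steps.
    def __init__(self):
        self.log = []
        self._pending = 2  # pretend two benchmarker waves remain
    def start_sut(self):        self.log.append('sut')
    def start_monitoring(self): self.log.append('monitoring')
    def start_loading(self):    self.log.append('loading')
    def all_benchmarks_done(self):       # hypothetical status check
        return self._pending == 0
    def dispatch_ready_benchmarkers(self):  # hypothetical job submission
        self._pending -= 1
        self.log.append('benchmark')
    def stop_sut(self):         self.log.append('stop_sut')

def work_benchmark_list_sketch(experiment, intervals=0, stop=True):
    # Illustrative outline only (not bexhoma's code): start components,
    # then poll every 'intervals' seconds until all benchmarks are done.
    experiment.start_sut()
    experiment.start_monitoring()
    experiment.start_loading()
    while not experiment.all_benchmarks_done():
        experiment.dispatch_ready_benchmarkers()
        time.sleep(intervals)
    if stop:
        experiment.stop_sut()

demo = _DemoExperiment()
work_benchmark_list_sketch(demo)
```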
- zip()
Zip the result folder in the dashboard pod.
- class bexhoma.experiments.example(cluster, code=None, queryfile='queries.config', num_experiment_to_apply=1, timeout=7200, script=None)
Bases:
default
Class for defining a custom example experiment. This sets
the folder of the experiment - including the query file and schema information per dbms
name and information about the experiment
- class bexhoma.experiments.iot(cluster, code=None, queryfile='queries-iot.config', SF='1', num_experiment_to_apply=1, timeout=7200)
Bases:
default
Class for defining a TSBS experiment. This sets
the folder of the experiment - including the query file and schema information per dbms
name and information about the experiment
additional parameters - here SF (the scaling factor)
- set_queries_full()
- set_queries_profiling()
- set_querymanagement_maintaining(numRun=128, delay=5, datatransfer=False)
- class bexhoma.experiments.tpcc(cluster, code=None, SF='1', num_experiment_to_apply=1, timeout=7200)
Bases:
default
Class for defining a TPC-C experiment (in the HammerDB version). This sets
the folder of the experiment - including the query file and schema information per dbms
name and information about the experiment
additional parameters - here SF (the scaling factor), i.e., the number of warehouses
- evaluate_results(pod_dashboard='')
Build a DataFrame locally that contains all benchmarking results. This is specific to HammerDB.
- test_results()
Run test script locally. Extract exit code.
- Returns:
exit code of test script
- class bexhoma.experiments.tpcds(cluster, code=None, queryfile='queries-tpcds.config', SF='100', num_experiment_to_apply=1, timeout=7200)
Bases:
default
Class for defining a TPC-DS experiment. This sets
the folder of the experiment - including the query file and schema information per dbms
name and information about the experiment
additional parameters - here SF (the scaling factor)
- set_queries_full()
- set_queries_profiling()
- class bexhoma.experiments.tpch(cluster, code=None, queryfile='queries-tpch.config', SF='100', num_experiment_to_apply=1, timeout=7200, script=None)
Bases:
default
Class for defining a TPC-H experiment. This sets
the folder of the experiment - including the query file and schema information per dbms
name and information about the experiment
additional parameters - here SF (the scaling factor)
- set_queries_full()
- set_queries_profiling()
- show_summary()
- class bexhoma.experiments.tsbs(cluster, code=None, queryfile='queries-tsbs.config', SF='1', num_experiment_to_apply=1, timeout=7200)
Bases:
default
Class for defining a TSBS experiment. This sets
the folder of the experiment - including the query file and schema information per dbms
name and information about the experiment
additional parameters - here SF (the scaling factor)
- set_queries_full()
- set_queries_profiling()
- set_querymanagement_maintaining(numRun=128, delay=5, datatransfer=False)
- class bexhoma.experiments.ycsb(cluster, code=None, SF='1', num_experiment_to_apply=1, timeout=7200)
Bases:
default
Class for defining a YCSB experiment. This sets
the folder of the experiment - including the query file and schema information per dbms
name and information about the experiment
additional parameters - here SF (the scaling factor), i.e., the number of rows divided by 10,000
- evaluate_results(pod_dashboard='')
Build a DataFrame locally that contains all benchmarking results. This is specific to YCSB.
- show_summary()
- test_results()
Run test script locally. Extract exit code.
- Returns:
exit code of test script