bexhoma.clusters module

Date:

2022-10-01

Version:

0.6.0

Authors:

Patrick K. Erdelt

Module to manage testbeds. Historically, this supported different implementations based on IaaS. All of these will be deprecated except for Kubernetes (K8s), so the structure will change in the future.

This program is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more details.

You should have received a copy of the GNU Affero General Public License along with this program. If not, see <https://www.gnu.org/licenses/>.

class bexhoma.clusters.aws(clusterconfig='cluster.config', experiments_configfolder='experiments/', yamlfolder='k8s/', context=None, code=None, instance=None, volume=None, docker=None, script=None, queryfile=None)

Bases: kubernetes

Date:

2022-10-01

Version:

0.6.0

Authors:

Patrick K. Erdelt

Class for containing Kubernetes methods specific to AWS. This adds handling of nodegroups for elasticity.

check_nodegroup(nodegroup_type='', nodegroup_name='', num_nodes_aux_planned=0)

eksctl(command)

Runs an eksctl command.

Parameters:: command – An eksctl command
Returns:: stdout of the eksctl command

get_nodegroup_size(nodegroup_type='', nodegroup_name='')

get_nodes(app='', nodegroup_type='', nodegroup_name='')

Get all nodes of a cluster. This overrides the cluster method to use the AWS-specific nodegroup-name label.

Parameters:

app – Name of the pod
nodegroup_type – Type of the nodegroup, e.g. sut
nodegroup_name – Name of the nodegroup, e.g. sut_high_memory

scale_nodegroup(nodegroup_name, size)

scale_nodegroups(nodegroup_names, size=None)

wait_for_nodegroup(nodegroup_type='', nodegroup_name='', num_nodes_aux_planned=0)

wait_for_nodegroups(nodegroup_names, size=None)

class bexhoma.clusters.kubernetes(clusterconfig='cluster.config', experiments_configfolder='experiments/', yamlfolder='k8s/', context=None, code=None, instance=None, volume=None, docker=None, script=None, queryfile=None)

Bases: testbed

Date:

2022-10-01

Version:

0.6.0

Authors:

Patrick K. Erdelt

Class containing methods specific to Kubernetes (K8s). This class can be subclassed to define specific implementations of Kubernetes, for example AWS.

add_experiment(experiment)

Add an experiment to this cluster.

Parameters:: experiment – Experiment object

pod_log_exists(pod_name, container='')

Returns whether the log of the pod already exists on local disk.

Parameters:

pod_name – Name of the pod
container – Name of the container

Returns:

True, iff the log of the pod exists on local disk

store_pod_log(pod_name, container='')

Stores the log of a pod in a local file in the experiment result folder. Optionally, the name of a container can be given (mandatory if the pod has multiple containers). If a file containing the pod log is already present, nothing is done (no update).

Parameters:

pod_name – Name of the pod
container – Name of the container
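
The "store once, never update" behaviour described above can be sketched as follows. This is a minimal illustration, not bexhoma's implementation: the helper name `store_log_once`, the `fetch_log` callable and the file naming scheme `<pod>.<container>.log` are assumptions made for this example.

```python
from pathlib import Path


def store_log_once(result_folder, pod_name, container, fetch_log):
    """Write the pod log to a local file unless that file already exists.

    fetch_log is a hypothetical callable (pod_name, container) -> str that
    retrieves the log text, e.g. by asking the cluster.
    """
    # Assumed naming scheme: "<pod>.<container>.log" or "<pod>.log"
    suffix = f".{container}" if container else ""
    path = Path(result_folder) / f"{pod_name}{suffix}.log"
    if path.exists():
        # Log was stored before: do nothing (no update)
        return path
    path.write_text(fetch_log(pod_name, container))
    return path
```

Because the existence check comes first, the (possibly slow) log retrieval is skipped entirely when the file is already present.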

class bexhoma.clusters.testbed(clusterconfig='cluster.config', experiments_configfolder='experiments/', yamlfolder='k8s/', context=None, code=None, instance=None, volume=None, docker=None, script=None, queryfile=None)

Bases: object

Date:

2022-10-01

Version:

0.6.0

Authors:

Patrick K. Erdelt

Class to manage experiments in a Kubernetes cluster.

TODO:

Remove instance / volume references from IaaS
Documentation for purpose and position
Documentation for “copy log and init” mechanisms
Clarify if OLD_ can be reused

OLD__getTimediff()

OLD_continueBenchmarks(connection=None, query=None)

OLD_getChildProcesses()

OLD_runReporting()

OLD_startPortforwarding(service='', app='', component='sut')

OLD_stopPortforwarding()

add_to_messagequeue(queue, data)

Add data to (Redis) message queue.

Parameters:

queue – Name of the queue
data – Data to be added to queue
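
As an illustration of adding data to a Redis queue, the sketch below builds a `redis-cli` call that could be executed inside a pod that can reach Redis. It assumes the queue is a Redis list filled with `RPUSH`; whether bexhoma uses this exact command is an assumption, and `redis_push_command` is a hypothetical helper name.

```python
import shlex


def redis_push_command(queue, data):
    """Build a redis-cli call that appends data to the list named queue."""
    # RPUSH appends to the tail of a Redis list and creates the list if needed.
    # shlex.quote protects values that contain spaces or shell metacharacters.
    return f"redis-cli rpush {shlex.quote(queue)} {shlex.quote(str(data))}"
```

For example, `redis_push_command("my-queue", 1)` yields `redis-cli rpush my-queue 1`.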

check_DBMS_connection(ip, port)

Check if DBMS is open for connections. Tries to open a socket to ip:port. Returns True if this is possible.

Parameters:

ip – IP of the host to connect to
port – Port of the server on the host to connect to

Returns:

True, iff connecting is possible
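
A socket check of this kind can be sketched with the standard library alone. The function name `can_connect` and the timeout default are assumptions for this example and are not part of bexhoma.

```python
import socket


def can_connect(ip, port, timeout=1.0):
    """Return True iff a TCP connection to ip:port can be opened."""
    try:
        # create_connection performs the TCP handshake; the context manager
        # closes the socket again right away
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, host unreachable, or timed out
        return False
```

Note that a successful TCP connect only shows that something is listening on the port; it does not prove that the DBMS already accepts queries.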

cluster_access(): provide access to an K8s cluster by initializing connection handlers.

connect_dashboard(): Connects to the dashboard component. This means the output ports of the dashboard component are forwarded to localhost. Expect results be available under port 8050 (dashboard) and 8888 (Jupyter).

connect_master(experiment='', configuration='')

Connects to the master node of a sut component. This means the output ports of the component are forwarded to localhost. Must be limited to a specific experiment or dbms configuration.

Parameters:

experiment – Unique identifier of the experiment
configuration – Name of the dbms configuration

copyInits()

copyLog()

create_dashboard_name(app='', component='dashboard')

Creates a suitable name for the dashboard component.

Parameters:

app – app the dashboard belongs to
component – Component name, should be ‘dashboard’ typically

create_messagequeue_name(app='', component='messagequeue')

Creates a suitable name for the message queue component.

Parameters:

app – app the messagequeue belongs to
component – Component name, should be ‘messagequeue’ typically

dashboard_is_running()

Returns True, iff dashboard is running.

Returns:: True, iff dashboard is running

delay(sec, silent=False)

Waits for some time and reports this via output. Synonym for wait().

Parameters:

sec – Number of seconds to wait
silent – True means we do not output anything about this waiting

delete_deployment(deployment)

Delete a deployment given by name.

Parameters:: deployment – Name of the deployment to be deleted.

delete_job(jobname='', app='', component='', experiment='', configuration='', client='')

Delete a job given by name or matching a set of labels (component/ experiment/ configuration)

Parameters:

jobname – Name of the job we want to delete
app – app the job belongs to
component – Component, for example sut or monitoring
experiment – Unique identifier of the experiment
configuration – Name of the dbms configuration
client – DEPRECATED?

delete_job_pods(jobname='', app='', component='', experiment='', configuration='', client='')

Delete all pods of a job given by name or matching a set of labels (component/ experiment/ configuration)

Parameters:

jobname – Name of the job we want to delete the pods of
app – app the job belongs to
component – Component, for example sut or monitoring
experiment – Unique identifier of the experiment
configuration – Name of the dbms configuration
client – DEPRECATED?

delete_pod(name)

Delete a pod given by name

Parameters:: name – name of the pod to be deleted

delete_pvc(name)

Delete a persistent volume claim given by name

Parameters:: name – name of the pvc to be deleted

delete_service(name)

Delete a service given by name

Parameters:: name – name of the service to be deleted

delete_stateful_set(name)

Delete a stateful set given by name

Parameters:: name – name of the stateful set to be deleted

downloadLog()

execute_command_in_pod(command, pod='', container='', params='')

Runs a shell command remotely inside a container of a pod.

Parameters:

command – A shell command
pod – The name of the pod
container – The name of the container in the pod
params – Optional parameters, currently ignored

Returns:

stdout of the shell command

get_dashboard_pod_name(app='', component='dashboard')

Returns the name of the dashboard pod.

Parameters:

app – app the dashboard belongs to
component – Component name, should be ‘dashboard’ typically

Returns:

name of the dashboard pod

get_deployments(app='', component='', experiment='', configuration='')

Return all deployments matching a set of labels (component/ experiment/ configuration)

Parameters:

app – app the deployment belongs to
component – Component, for example sut or monitoring
experiment – Unique identifier of the experiment
configuration – Name of the dbms configuration

get_job_pods(app='', component='', experiment='', configuration='', client='')

Return all pods of jobs matching a set of labels (component/ experiment/ configuration)

Parameters:

app – app the job belongs to
component – Component, for example sut or monitoring
experiment – Unique identifier of the experiment
configuration – Name of the dbms configuration
client – DEPRECATED?

get_job_status(jobname='', app='', component='', experiment='', configuration='', client='')

Return status of a job given by name or matching a set of labels (component/ experiment/ configuration)

Parameters:

jobname – Name of the job we want to know the status of
app – app the job belongs to
component – Component, for example sut or monitoring
experiment – Unique identifier of the experiment
configuration – Name of the dbms configuration
client – DEPRECATED?

get_jobs(app='', component='', experiment='', configuration='', client='')

Return all jobs matching a set of labels (component/ experiment/ configuration)

Parameters:

app – app the job belongs to
component – Component, for example sut or monitoring
experiment – Unique identifier of the experiment
configuration – Name of the dbms configuration
client – DEPRECATED?

get_jobs_labels(app='', component='', experiment='', configuration='', client='')

Return all labels of jobs matching a set of labels (component/ experiment/ configuration)

Parameters:

app – app the job belongs to
component – Component, for example sut or monitoring
experiment – Unique identifier of the experiment
configuration – Name of the dbms configuration
client – DEPRECATED?

get_nodes(app='', nodegroup_type='', nodegroup_name='')

Get all nodes of a cluster.

Parameters:

app – Name of the pod
nodegroup_type – Type of the nodegroup, e.g. sut
nodegroup_name – Name of the nodegroup, e.g. sut_high_memory

get_pod_containers(pod)

Return all containers and initcontainers of a pod

Parameters:: pod – name of the pod
Returns:: list of names of (init)containers

get_pod_status(pod, app='')

Return status of a pod given by name

Parameters:

app – app the pod belongs to
pod – Name of the pod the status of which should be returned

get_pods(app='', component='', experiment='', configuration='', status='')

Return all pods matching a set of labels (component/ experiment/ configuration)

Parameters:

app – app the pod belongs to
component – Component, for example sut or monitoring
experiment – Unique identifier of the experiment
configuration – Name of the dbms configuration
status – Status of the pod
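
Many of the methods in this class filter objects by the same set of labels. A minimal sketch of how such filters can be turned into a Kubernetes equality-based label selector is shown below; the label keys are assumed to equal the parameter names, and `label_selector` is a hypothetical helper, not a bexhoma function.

```python
def label_selector(app="", component="", experiment="", configuration=""):
    """Join the non-empty filters into an equality-based label selector."""
    labels = {
        "app": app,
        "component": component,
        "experiment": experiment,
        "configuration": configuration,
    }
    # Empty strings mean "do not filter on this label"
    return ",".join(f"{key}={value}" for key, value in labels.items() if value)
```

A selector such as `app=bexhoma,component=sut` can be passed to `kubectl get pods -l ...` or to the `label_selector` argument of the Kubernetes Python client's list calls.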

get_pods_labels(app='', component='', experiment='', configuration='')

Return all labels of pods matching a set of labels (component/ experiment/ configuration)

Parameters:

app – app the set belongs to
component – Component, for example sut or monitoring
experiment – Unique identifier of the experiment
configuration – Name of the dbms configuration

get_ports_of_service(app='', component='', experiment='', configuration='')

Return all ports of services matching a set of labels (component/ experiment/ configuration)

Parameters:

app – app the service belongs to
component – Component, for example sut or monitoring
experiment – Unique identifier of the experiment
configuration – Name of the dbms configuration

get_pvc(app='', component='', experiment='', configuration='')

Return all persistent volume claims matching a set of labels (component/ experiment/ configuration)

Parameters:

app – app the pvc belongs to
component – Component, for example sut or monitoring
experiment – Unique identifier of the experiment
configuration – Name of the dbms configuration

get_pvc_labels(app='', component='', experiment='', configuration='', pvc='')

Return all labels of persistent volume claims matching a set of labels (component/ experiment/ configuration) or name

Parameters:

app – app the pvc belongs to
component – Component, for example sut or monitoring
experiment – Unique identifier of the experiment
configuration – Name of the dbms configuration
pvc – Name of the PVC

get_pvc_specs(app='', component='', experiment='', configuration='', pvc='')

Return all specs of persistent volume claims matching a set of labels (component/ experiment/ configuration) or name

Parameters:

app – app the pvc belongs to
component – Component, for example sut or monitoring
experiment – Unique identifier of the experiment
configuration – Name of the dbms configuration
pvc – Name of the PVC

get_pvc_status(app='', component='', experiment='', configuration='', pvc='')

Return status of persistent volume claims matching a set of labels (component/ experiment/ configuration) or name

Parameters:

app – app the pvc belongs to
component – Component, for example sut or monitoring
experiment – Unique identifier of the experiment
configuration – Name of the dbms configuration
pvc – Name of the PVC

get_service_endpoints(service_name='bexhoma-service-monitoring-default')

Returns a list of all endpoints of a service. This is of particular interest for headless services. It is used to find all nodes in a cluster if monitoring of the cluster is active.

Parameters:: service_name – Name of the service
Returns:: List of IPs of endpoints

get_services(app='', component='', experiment='', configuration='')

Return all services matching a set of labels (component/ experiment/ configuration)

Parameters:

app – app the service belongs to
component – Component, for example sut or monitoring
experiment – Unique identifier of the experiment
configuration – Name of the dbms configuration

get_stateful_sets(app='', component='', experiment='', configuration='')

Return all stateful sets matching a set of labels (component/ experiment/ configuration)

Parameters:

app – app the set belongs to
component – Component, for example sut or monitoring
experiment – Unique identifier of the experiment
configuration – Name of the dbms configuration

kubectl(command)

Runs a kubectl command in the current context.

Parameters:: command – A kubectl command
Returns:: stdout of the kubectl command
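
Running a command "in the current context" can be sketched by shelling out to the `kubectl` binary with its global `--context` flag. The helper names below are hypothetical and the sketch makes no claim about how bexhoma invokes kubectl internally.

```python
import shlex
import subprocess


def build_kubectl_argv(command, context=None):
    """Build the argument list for a kubectl call (hypothetical helper)."""
    argv = ["kubectl"]
    if context:
        # --context is a global kubectl flag selecting the kubeconfig context
        argv += ["--context", context]
    # Split the command string the way a POSIX shell would
    return argv + shlex.split(command)


def run_kubectl(command, context=None):
    """Run kubectl and return its stdout as text."""
    result = subprocess.run(
        build_kubectl_argv(command, context),
        capture_output=True, text=True, check=False,
    )
    return result.stdout
```

Passing an argument list instead of a single shell string avoids shell injection through the command or context values.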

log_experiment(experiment)

Logs the current step of an experiment. The log is supposed to be written to disk for comprehension and repetition. This should be reworked to yield, for example, a YAML format. Moreover, it should respect “new” workflows, for example with detached parallel loaders.

Parameters:: experiment – Dict that stores parameters of current experiment step

messagequeue_is_running(component='messagequeue')

Returns True, iff message queue is running.

Returns:: True, iff message queue is running

pod_log(pod, container='')

restart_dashboard(app='', component='dashboard')

Restarts the dashboard component and its service.

Parameters:

app – app the dashboard belongs to
component – Component name, should be ‘dashboard’ typically

set_code(code)

Sets the unique identifier of an experiment. Use case: We start a cluster (without experiment), then define an experiment, which creates an identifier. This identifier will be set in the cluster as the default experiment.

Parameters:: code – Unique identifier of an experiment

set_connectionmanagement(**kwargs)

Sets connection management data for the experiments. This is for the benchmarker component (dbmsbenchmarker). Can be overwritten by experiment and configuration.

Parameters:: kwargs – Dict of meta data, example ‘timeout’ => 60
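
The phrase "can be overwritten by experiment and configuration" suggests layered settings. A minimal sketch of such a precedence rule follows; the order (cluster, then experiment, then configuration) and the shallow merge are assumptions, `effective_settings` is a hypothetical helper, and the keys in the usage note are illustrative.

```python
def effective_settings(cluster, experiment=None, configuration=None):
    """Merge settings; later (more specific) levels win over earlier ones."""
    merged = dict(cluster)  # copy, so the cluster defaults stay untouched
    for level in (experiment, configuration):
        if level:
            # Shallow update: keys set at this level replace inherited ones
            merged.update(level)
    return merged
```

For instance, cluster defaults `{"timeout": 60, "retries": 1}` combined with an experiment setting `{"timeout": 120}` give `{"timeout": 120, "retries": 1}`.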

set_ddl_parameters(**kwargs)

Sets DDL parameters for the experiments. This substitutes placeholders in DDL script. Can be overwritten by experiment and configuration.

Parameters:: kwargs – Dict of meta data, example ‘index’ => ‘btree’
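
Placeholder substitution in a DDL script can be illustrated as below. The `{index}` placeholder syntax, the helper name `fill_ddl` and the SQL statement are assumptions for this example; the actual placeholder format used in bexhoma's DDL scripts may differ.

```python
def fill_ddl(template, **ddl_parameters):
    """Replace {name} placeholders in a DDL script with the given values."""
    return template.format(**ddl_parameters)


ddl = "CREATE INDEX idx_orders ON orders USING {index} (o_orderdate);"
print(fill_ddl(ddl, index="btree"))
# prints: CREATE INDEX idx_orders ON orders USING btree (o_orderdate);
```

`str.format` treats every brace as a placeholder delimiter, so scripts that contain literal braces would need them doubled or a different templating approach.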

set_experiment(instance=None, volume=None, docker=None, script=None)

Sets a specific setting for an experiment. In particular, this sets instance, volume and dbms (docker image) and the name of a list of DDL scripts. This typically comes from a cluster.config.

Parameters:

instances – Dict of instances (DEPRECATED, was for IaaS?)
volumes – Dict of volumes, that carry data
dockers – Dict of docker images and meta data about how to use
script – Name of list of DDL scripts, that are run when start_loading() is called

set_experiments(instances=None, volumes=None, dockers=None)

Assigns dicts containing information about instances, volumes and dbms (docker images). This typically comes from a cluster.config.

Parameters:

instances – Dict of instances (DEPRECATED, was for IaaS?)
volumes – Dict of volumes, that carry data
dockers – Dict of docker images and meta data about how to use

set_experiments_configfolder(experiments_configfolder)

Sets the configuration folder for the experiments. Bexhoma expects subfolders for experiment types, for example tpch. In there, bexhoma looks for query.config files (for dbmsbenchmarker) and subfolders containing the schema per dbms.

Parameters:: experiments_configfolder – Relative path to an experiment folder
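
The folder layout described above can be sketched with `pathlib`. The helper name `experiment_paths` and the dbms subfolder name in the usage note are assumptions for illustration; only the `query.config` file name and the per-type subfolder come from the description.

```python
from pathlib import Path


def experiment_paths(experiments_configfolder, experiment_type, dbms):
    """Return where query file and schema folder would be expected."""
    base = Path(experiments_configfolder) / experiment_type
    return {
        # Query definitions for dbmsbenchmarker
        "queries": base / "query.config",
        # Subfolder holding the schema scripts for one dbms
        "schema": base / dbms,
    }
```

With `experiment_paths("experiments/", "tpch", "PostgreSQL")`, the query file is expected at `experiments/tpch/query.config` and the schema in `experiments/tpch/PostgreSQL`.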

set_pod_counter(queue, value=0)

Sets a pod counter in the (Redis) message queue to a given value.

Parameters:

queue – Name of the queue
value – Value the counter is set to

set_queryfile(queryfile)

Sets the name of a query file of an experiment. This is for the benchmarker component (dbmsbenchmarker).

Parameters:: queryfile – Name of the query file

set_querymanagement(**kwargs)

Sets query management data for the experiments. This is for the benchmarker component (dbmsbenchmarker).

Parameters:: kwargs – Dict of meta data, example ‘numRun’ => 3

set_resources(**kwargs)

Sets resources for the experiments. This is for the SUT component. Can be overwritten by experiment and configuration.

Parameters:: kwargs – Dict of meta data, example ‘requests’ => {‘cpu’ => 4}

set_workload(**kwargs)

Sets meta data about the experiments, for example name and description.

Parameters:: kwargs – Dict of meta data, example ‘name’ => ‘TPC-H’

start_dashboard(app='', component='dashboard')

Starts the dashboard component and its service, if there is no such pod. Manifest is expected in ‘deploymenttemplate-bexhoma-dashboard.yml’.

Parameters:

app – app the dashboard belongs to
component – Component name, should be ‘dashboard’ typically

start_datadir(): Starts the data directory in a shared filesystem. This is where data generator pods can store generated data and where loading pods can read the data from. Manifest is expected in ‘pvc-bexhoma-data.yml’

start_messagequeue(app='', component='messagequeue')

Starts the message queue. Manifest is expected in ‘deploymenttemplate-bexhoma-messagequeue.yml’

Parameters:

app – app the messagequeue belongs to
component – Component name, should be ‘messagequeue’ typically

start_monitoring_cluster(app='', component='monitoring')

Starts the monitoring component and its service. Manifest for node exporters is expected in ‘deamonsettemplate-monitoring.yml’.

Parameters:

app – app monitoring belongs to
component – Component name, should be ‘monitoring’ typically

start_resultdir(): Starts the result directory in a shared filesystem. This is where benchmark execution pods can store result data and where the evaluation pods can read results from. Also collected metrics will be stored there. Manifest is expected in ‘pvc-bexhoma-results.yml’

stop_benchmarker(experiment='', configuration='')

Stops all benchmarking components (jobs and their pods) in the cluster. Can be limited to a specific experiment or dbms configuration.

Parameters:

experiment – Unique identifier of the experiment
configuration – Name of the dbms configuration

stop_dashboard(app='', component='dashboard')

Stops the dashboard component and its service.

Parameters:

app – app the dashboard belongs to
component – Component name, should be ‘dashboard’ typically

stop_loading(experiment='', configuration='')

Stops all loading components (jobs and their pods) in the cluster. Can be limited to a specific experiment or dbms configuration.

Parameters:

experiment – Unique identifier of the experiment
configuration – Name of the dbms configuration

stop_maintaining(experiment='', configuration='')

Stops all maintaining components (jobs and their pods) in the cluster. Can be limited to a specific experiment or dbms configuration.

Parameters:

experiment – Unique identifier of the experiment
configuration – Name of the dbms configuration

stop_monitoring(app='', component='monitoring', experiment='', configuration='')

Stops all monitoring components (deployments and their pods) in the cluster and their service. Can be limited to a specific experiment or dbms configuration.

Parameters:

app – app monitoring belongs to
component – Component name, should be ‘monitoring’ typically
experiment – Unique identifier of the experiment
configuration – Name of the dbms configuration

stop_sut(app='', component='sut', experiment='', configuration='')

Stops all sut components (deployments and their pods, stateful sets and services) in the cluster. Can be limited to a specific experiment or dbms configuration.

Parameters:

app – app the sut belongs to
component – Component name, should be ‘sut’ typically
experiment – Unique identifier of the experiment
configuration – Name of the dbms configuration

wait(sec, silent=False)

Waits for some time and reports this via output.

Parameters:

sec – Number of seconds to wait
silent – True means we do not output anything about this waiting
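
A waiting helper with a silent switch, and delay() as a second name for it, can be sketched as follows. The printed message text is an assumption; only the parameters and the synonym relationship come from the descriptions above.

```python
import time


def wait(sec, silent=False):
    """Sleep for sec seconds; announce it unless silent is True."""
    if not silent:
        # flush so the message is visible while we are still sleeping
        print(f"Waiting {sec}s...", end="", flush=True)
    time.sleep(sec)
    if not silent:
        print("done")


# delay is simply a second name bound to the same function
delay = wait
```

Binding `delay` to `wait` keeps both names in sync automatically, so a change to the waiting logic needs to be made in one place only.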