black and silver laptop computer

Simple Database Testing with D.O.E.S

There are many database benchmarking and testing tools that can be used to understand how a system might behave under specific circumstances. Each one of these has a specialism, some of them are designed for unit testing while others are workload oriented.

After several years working towards this goal I am excited to announce my own foray into the database testing tools community with the open source release of the Diverse Object Evaluation Suite, or DOES for shorthand, on Pure Storage Open Connect.

What is it?

DOES is the collective name for several functions that enable the user to work with a database at the object layer (tables, collections, etc). It provides a highly dynamic and flexible way to work with these objects, providing insight into how databases and operating platforms behave under different scenarios. It supports databases vendors such as Microsoft SQL Server, Oracle Database, SAP HANA, MySQL, MariaDB, PostgreSQL and MongoDB.

DOES is  made up of three functional components:

  1. The Data Engine – Interacts dynamically with a database by performing data definition and manipulation operations on database objects.;
  2. The Platform Engine – Polls  operating systems to gather resource information on various hardware and software components.
  3. The Analysis Engine – Gathers and persists behavioral trends collected from the DataEngine and Platform Engine components. Also creates commonality between tests run on databases and the associated resource consumption for the operating platform.

The capabilities of DOES are accessed through any one of two interfaces that can be used on Microsoft Windows or Linux operating systems:

  • DOES.PowerShell – a PowerShell module that exposes functions through cmdlets

  • DOES.Cli – A console application that exposes functions through execution on a command line such as Bash or command prompt.

For a deeper explanation of how DOES works and the various components take a look at the user documentation on GitHub.

Where can you get it ?

DOES is hosted on the Pure Storage Open Connect GitHub website. The 3.3.4 revision installation packages for it can be downloaded from the following locations:

 

Platform Package
Windows (x64) DOES.Setup.Public.msi
RHEL 7,8,9, Centos 7, Rocky Linux 8.5+, Fedora 32+, SLES 12+, SLES 15+, openSUSE Leap 15+ purestorage_does_public-3.3.4-1.x86_64.rpm

What makes DOES different from other tools?

DOES is different from other database tools as it does not attempt to do anything that already exists elsewhere.Tools such as HammerDB, sysbench, swingbench and benchmark factory offer industry standard benchmarking which involves the use of a fixed schema design. DOES does not offer industry standard benchmarking and allows for a flexible schema design based on several configuration options.

Capacity oriented

Everything done with DOES is capacity focused. If a user wished  to create a 100 gigabyte database the code counts every byte that goes into the database and returns a dataset of that size. Other operations such as deleting, updating or querying the database also count every byte being operated on resulting in a simpler way to interact with a database objects.

Native support for different database object designs 

When creating database objects DOES will utilize any one of a variety of schema templates. Each schema template provides a range of options that will result in the objects being created in a different way. Each database vendor will have unique customizations specific to their supported object data types and object customizations. For example with Microsoft SQL Server the schema can be designed to be created either with or without indexes or to use memory optimized tables.

Scalable object design 

When objects are created they utilize the schema template in conjunction with a table amplification value. This table amplification value sets out how many database objects will be created when using the schema template. As DOES is capacity oriented it will spread the operation over those database objects once they have been created.

For example if a user wanted to create a 100 gigabyte database and specified that 60 objects were to be used, then the capacity would be spread over those objects.

Fine grained control over data randomness 

As the source for all data in DOES is web page data (URL’s, HTML, Headers, etc) it is possible to exercise fine grained control over how random the data is prior to being operated on within the database. HTML is a highly repetitive set of text making it a low randomness starting point. To increase randomness the data is run through an encryption function that only changes and randomizes a set percentage of the string.

Test Functions that include database vendor operational optimization

When using the test functions (workload oriented operations that simulate complex workloads) vendor specific operations are performed to simulate real world database management.

For example when running a complex test with Microsoft SQL Server every 10% of data processed a CHECKPOINT operation is run. At the end of the scenario a DBCC CHECKDB operation is performed.

SAP HANA – Native support for Native Storage Extension object properties 

DOES natively supports the use of the page loadable property on the objects being created. This allows for users to evaluate the impact of warm data management on their dataset and system setup.

What doesn’t it do ?

It is not a benchmark tool providing TPC-C , TPC-H, TPROCC or TPROCH  industry standard benchmarking. Instead DOES has its own way of running testing specific to how its data model is created.

Oracle and SAP HANA are not supported in the publicly distributed package due to licensing restrictions on the client libraries. To support them the package can be rebuilt with those libraries included.

Other Resources

DOES Repository on GitHub – Release 3.3.4

A deep dive into how DOES works

DOES PowerShell Command Reference

DOES Cli Command Reference