FlashArray Snapshot Consistency and SAP

Ever since I began working with enterprise storage arrays such as Pure Storage FlashArray//X, storage snapshots have been a constant presence. While trying to fully understand the various aspects of snapshots, I turned to Wikipedia and found an excellent definition:

“In computer systems a snapshot is the state of a system at a particular point in time.”

It follows that a storage snapshot is the state of a storage object at a particular point in time. Snapshots can be created using a number of different techniques (Staimer, M. (2009), Using different types of storage snapshot technologies for data protection; see References):

  • Copy-on-write: Storage capacity is provisioned for snapshots. The snapshot itself stores only metadata about where the original data is located and tracks the changes made since the initial snapshot was taken. Before a block on the original volume is overwritten, its previous contents are copied to the snapshot space, so each such change costs two writes. This technique is very space efficient because only the data changed since the initial snapshot is captured.
  • Redirect-on-write: This technique is similar to copy-on-write but eliminates the performance penalty of writing data twice. New writes to the original volume are redirected to the storage space provisioned for snapshots.
  • Clone or split-mirror: Clone or split-mirror snapshots create an identical copy of the data. This kind of snapshot is highly available but requires all of the data to be copied to a new location at the point of snapshot creation.
  • Copy-on-write with background copy: This is a hybrid of copy-on-write snapshots and cloning. The outcome is a copy of the data in its entirety in the snapshot location.
  • Incremental: Changes between the source data and the snapshot data are tracked when the snapshot is generated. The snapshot data is then the difference between one point in time and the next.
  • Continuous data protection: This technique provides zero recovery point objectives (RPO) and zero recovery time objectives (RTO). All changes to the storage are captured and timestamped. An incremental snapshot is created at each point in time, providing a very fine-grained recovery method.
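
To make the difference between the first two techniques concrete, here is a minimal toy model in Python. It is only a sketch of the general idea and not a description of how FlashArray or any other array implements snapshots; the class names and the `physical_writes` counter are invented for this illustration.

```python
class CopyOnWriteVolume:
    """Toy copy-on-write volume: old data is copied aside before an overwrite."""

    def __init__(self, blocks):
        self.blocks = list(blocks)   # live data, updated in place
        self.snapshot_area = None    # block index -> preserved original contents
        self.physical_writes = 0

    def snapshot(self):
        # Taking the snapshot copies no data; it only starts tracking changes.
        self.snapshot_area = {}

    def write(self, index, data):
        if self.snapshot_area is not None and index not in self.snapshot_area:
            # First overwrite since the snapshot: preserve the old block (extra write).
            self.snapshot_area[index] = self.blocks[index]
            self.physical_writes += 1
        self.blocks[index] = data
        self.physical_writes += 1

    def read_snapshot(self):
        return [self.snapshot_area.get(i, b) for i, b in enumerate(self.blocks)]


class RedirectOnWriteVolume:
    """Toy redirect-on-write volume: new data goes to a new slot, pointers move."""

    def __init__(self, blocks):
        self.store = list(blocks)                 # append-only physical store
        self.live_map = list(range(len(blocks)))  # logical block -> physical slot
        self.snapshot_map = None
        self.physical_writes = 0

    def snapshot(self):
        # The snapshot is a frozen copy of the pointer map (metadata only).
        self.snapshot_map = list(self.live_map)

    def write(self, index, data):
        self.store.append(data)                   # single write to a new location
        self.live_map[index] = len(self.store) - 1
        self.physical_writes += 1

    def read_live(self):
        return [self.store[slot] for slot in self.live_map]

    def read_snapshot(self):
        return [self.store[slot] for slot in self.snapshot_map]


cow = CopyOnWriteVolume(["a", "b", "c"])
row = RedirectOnWriteVolume(["a", "b", "c"])
for volume in (cow, row):
    volume.snapshot()
    volume.write(1, "B")   # overwrite one block after the snapshot

print(cow.physical_writes, row.physical_writes)
```

In this model both snapshots still read back the original three blocks, but the copy-on-write volume needed two physical writes for the single overwrite while the redirect-on-write volume needed one.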

 

Pure Storage’s FlashArray//X, //C and Pure Cloud Block Store™ (CBS) offerings all share the same fundamental engineering, which makes their snapshots unique and suitable for any business environment:

  • Snapshots are created using the redirect-on-write technique, reducing the write penalty.
  • All volumes (and by extension snapshots) are thin provisioned and a part of a global deduplication namespace, further reducing the storage space required for any organization and the snapshots it could create.
  • Snapshots are portable, allowing them to be moved to another FlashArray™ or CBS instance, NFS storage target or cloud provider storage such as AWS S3 or Microsoft Azure BLOB storage.
  • With thin provisioning and always-on global deduplication, a snapshot is transported efficiently because only deduplicated data is moved over the network.

Storage Snapshot Consistency and SAP

A storage snapshot is typically created with an intent, such as data protection, data mobility or business continuity. Each intent is accomplished differently depending on the application and environment in which it is implemented. There are two kinds of storage snapshots, depending on the application context in which they are created:

Application consistent storage snapshots

An application consistent snapshot is typically created as a form of data protection. It is captured with application awareness: the application may be holding data in volatile storage (memory) and could at any moment be writing critical information to the persistent storage medium. Creating an application consistent storage snapshot therefore includes whatever operations are required to make the snapshot match the application's state at a single point in time.

Crash consistent storage snapshots 

A crash consistent snapshot ignores the presence of an application, and assumes that anything written to the storage by the time the snapshot is created is sufficient to bring the application or system back to that point in time.

What does this mean for SAP and SAP HANA environments?

Traditional ERP Central Component (ECC) landscapes could be made up of a multitude of database types. To keep this succinct, I am going to focus on Microsoft SQL Server, Oracle Database and SAP HANA as the underlying relational systems.

Microsoft SQL Server and storage snapshot consistency for SAP  

@8arkz has previously blogged about achieving application consistency with the Pure Storage VSS Provider, which can easily be used to create an application consistent snapshot of a Microsoft Windows system. However, with the release of the Pure Storage FlashArray Management Extension for Microsoft SQL Server Management Studio, the entire process can be done from within the management tools for SQL Server itself.

In the context of SAP, Andrew Oussoren has blogged about this, and I suggest taking a look at his demo as to how easy it can be to recover an SAP business system built on SQL Server using the extension.

Oracle Database and storage snapshot consistency for SAP  

FlashArray snapshots are crash-consistent. They can be used to restore (or clone) an Oracle database to the point in time when the snapshot was taken. When an Oracle database is restored from a crash consistent backup, it performs Instance Recovery in the same way as it would if it was recovering from a power failure. It cannot be rolled forward beyond the snapshot time.

If the database is in Archive Log mode, we can take application consistent backups, and that will give us the ability to do a point-in-time recovery. Application consistent backup of an Oracle database is taken by placing the database in “Hot Backup” mode before creating the snapshot. This is done by executing the following command on the database: “ALTER DATABASE BEGIN BACKUP”. After the snapshot is created (typically in a matter of seconds), we take the database out of Hot Backup mode by executing the following command: “ALTER DATABASE END BACKUP”.
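
The sequence described above can be sketched in Python as follows. The two `ALTER DATABASE` statements are the ones quoted in the text; everything else is an assumption made for illustration. In particular, `execute_sql` and `create_snapshot` are hypothetical callables that you would supply yourself (for example a database cursor's execute method and a call into your array's management tooling), not part of any Oracle or Pure Storage API.

```python
from contextlib import contextmanager


@contextmanager
def hot_backup_mode(execute_sql):
    """Hold the database in Hot Backup mode for the duration of the block."""
    execute_sql("ALTER DATABASE BEGIN BACKUP")
    try:
        yield
    finally:
        # Always leave Hot Backup mode, even if the snapshot call fails.
        execute_sql("ALTER DATABASE END BACKUP")


def application_consistent_snapshot(execute_sql, create_snapshot):
    """Take a storage snapshot while the database is in Hot Backup mode."""
    with hot_backup_mode(execute_sql):
        return create_snapshot()


# Dry run with stand-ins that only record what would have happened.
events = []
application_consistent_snapshot(
    execute_sql=events.append,
    create_snapshot=lambda: events.append("snapshot"),
)
print(events)
```

The `try`/`finally` is the main design choice here: a failed snapshot should not leave the database stuck in Hot Backup mode.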

When an Oracle database is restored from an application consistent backup, we can apply archive log files during the recovery process to roll the database forward beyond the time when the snapshot was taken, recovering to any point in time for which contiguous archive log files are available.

SAP HANA and storage snapshot consistency for SAP  

Because SAP HANA is an in-memory database, its persistent storage is incredibly important: it dictates how fast transactions can complete and how quickly savepoints can be performed. I have previously blogged about the process of creating an application consistent storage snapshot for SAP HANA, and gave some examples of how it can be automated in a different blog post. Recently I have looked at where crash consistent storage snapshots for SAP HANA could be used, and the good news is that a quick storage snapshot can be created using FlashArray; it's what gets done with it afterwards that matters.

Creating a crash consistent snapshot for SAP HANA using FlashArray 

The process itself is simple, and FlashArray makes it easier still.

FlashArray incorporates the concept of a “Protection Group”. This is essentially the snapshot management domain for multiple storage objects, which could be hosts (a single host or a group of hosts) or volumes. When a protection group snapshot is created, a snapshot is created for every volume related to the group (if a host is a member of the group, that means all of the volumes attached to that host). The important aspect of a protection group is that when a storage snapshot is created, all of the volumes are “point in time consistent” with one another (i.e. the snapshot is created at the exact same time on each of them).

Storage snapshot consistency groups are important when considering systems like SAP HANA where the storage devices attached to the hosts serve disparate functions.  For example:

  • Log volumes for SAP HANA are written to whenever a transaction comes in. This helps persist the data in the event of power loss.
  • Data volumes are written to asynchronously at intervals (by default, a savepoint happens every 300 seconds). The data written to the system since the last savepoint is then merged into the data volume area.

If a storage snapshot is created of the data volume, and the log area is not included, then the storage snapshot must be application consistent. Corruption or data loss can occur if only a crash consistent snapshot of the data volume is taken with the log volume excluded. This is where consistency groups are important as both the log and data volumes need to be consistent with one another to a single point in time when creating a crash consistent snapshot.
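
The following toy model in Python illustrates why the data and log volumes need to be captured together. It is a deliberate simplification written for this post and says nothing about SAP HANA's real on-disk format or recovery logic; `ToyHana`, `recover` and the `log_position` field are all invented names.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class ToyHana:
    memory: dict = field(default_factory=dict)
    log_volume: list = field(default_factory=list)  # one entry per committed change
    data_volume: dict = field(
        default_factory=lambda: {"rows": {}, "log_position": 0}
    )

    def commit(self, key, value):
        # Every transaction is written to the log volume straight away.
        self.log_volume.append((key, value))
        self.memory[key] = value

    def savepoint(self):
        # The data volume only catches up when a savepoint runs.
        self.data_volume = {
            "rows": dict(self.memory),
            "log_position": len(self.log_volume),
        }


def recover(data_snapshot, log_snapshot):
    """Start from the last savepoint, then replay any later log entries."""
    rows = dict(data_snapshot["rows"])
    for key, value in log_snapshot[data_snapshot["log_position"]:]:
        rows[key] = value
    return rows


db = ToyHana()
db.commit("a", 1)
db.savepoint()
db.commit("b", 2)  # committed, but not yet part of any savepoint

# A protection group snapshot captures both volumes at the same instant.
data_snap = copy.deepcopy(db.data_volume)
log_snap = copy.deepcopy(db.log_volume)

with_log = recover(data_snap, log_snap)  # data and log volume together
data_only = recover(data_snap, [])       # log volume left out of the snapshot
print(with_log, data_only)
```

With both volumes in the snapshot, the committed change to `b` is replayed from the log. With the data volume alone, recovery only reaches the last savepoint and the later commit is lost.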

A consistency group can also become more important when looking at more complex SAP HANA environments such as a Scale Out deployment with multiple hosts where each host is assigned a specific role.

Here is an example of creating a crash consistent storage snapshot on FlashArray for a Scale Out deployment:

The Scale Out deployment is made up of 4 nodes (3 workers and 1 Standby). The hosts are added to a host group and all of the volumes are connected to the host group, not to each individual host itself. A protection group is created and the host group is then added to the protection group.

Within the protection group, policies can be applied to create a snapshot of all of its member volumes. Once snapshots are created, they can also be replicated to a target.

Once a protection group snapshot has been created, it will show up as a single item.

Selecting that single protection group snapshot reveals the individual volume snapshots behind it. All volumes in a single protection group snapshot are consistent with one another.

Recovering from a crash consistent storage snapshot for SAP HANA

Recovery from a crash consistent storage snapshot is straightforward. Follow these steps:

  1. Shut down the SAP HANA instance if it is running.
  2. Unmount all storage volumes from each worker host.
  3. Restore every volume from its snapshot within the protection group snapshot.
  4. Start up the SAP HANA database.
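
The ordering of these steps can be sketched as a small Python function. This is only an outline: the `ops` object and its methods (`hana_is_running`, `stop_hana`, `unmount_volumes`, `restore_volume`, `start_hana`) are hypothetical placeholders for whatever tooling you use to drive SAP HANA, the hosts and the array, and the host, volume and snapshot names are made up.

```python
def recover_from_pgroup_snapshot(worker_hosts, volume_snapshots, ops):
    """Run the four recovery steps in order.

    volume_snapshots maps each volume name to the snapshot to restore it from.
    """
    if ops.hana_is_running():                  # step 1
        ops.stop_hana()
    for host in worker_hosts:                  # step 2
        ops.unmount_volumes(host)
    for volume, snapshot in volume_snapshots.items():  # step 3
        ops.restore_volume(volume, snapshot)
    ops.start_hana()                           # step 4


class RecordingOps:
    """Stand-in that records the calls instead of touching real systems."""

    def __init__(self, running=True):
        self.running = running
        self.events = []

    def hana_is_running(self):
        return self.running

    def stop_hana(self):
        self.running = False
        self.events.append("stop")

    def unmount_volumes(self, host):
        self.events.append(f"unmount {host}")

    def restore_volume(self, volume, snapshot):
        self.events.append(f"restore {volume} from {snapshot}")

    def start_hana(self):
        self.running = True
        self.events.append("start")


ops = RecordingOps(running=True)
recover_from_pgroup_snapshot(
    worker_hosts=["worker1", "worker2", "worker3"],
    volume_snapshots={"data-vol": "data-snap", "log-vol": "log-snap"},
    ops=ops,
)
print(ops.events)
```

Running it against a recording stand-in first is a cheap way to check the order of operations before pointing the same function at real systems.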

Important: Crash consistent snapshots are not necessarily a good long-term data protection solution. There is always a risk that the recovery process will not work. To ensure a 100% chance of recovery in the event of disaster, use an application consistent storage snapshot combined with a Backint certified storage solution for log backups. Crash consistent storage snapshots can be used to quickly create a copy of an SAP HANA database for test and QA purposes.

The SAP HANA Scripts for FlashArray support both application and crash consistent storage snapshots.

References

Wikipedia: Snapshot

Staimer, M. (2009). Using different types of storage snapshot technologies for data protection