The release of SAP HANA 2.0 SPS05 brings some great new capabilities to the platform, among them Native Storage Extension support for scale-out landscapes, significant improvements to the startup initialization of hybrid LOBs, and support for hosting an NFS server for the hana-shared directory on the master host of a scale-out system.
One particular new capability that this release brings is the ability to do page consistency checks independently from SAP HANA data backups, using the SAP HANA Persistence Diagnosis Tool (hdbpersdiag).
A number of corruptions can occur in an SAP HANA database:
- Page corruption at the disk level – issues in the storage subsystem, filesystem or operating system. Typically identified as checksum errors or I/O errors.
- Logical corruption within SAP HANA – issues with objects in SAP HANA itself, displayed as inconsistencies such as main storage/delta storage inconsistency, duplicate keys, records located in the incorrect partitions and NULL values in NOT NULL columns, among other possibilities.
There are a number of ways in which corruptions can be resolved, but each depends on the type of corruption that has occurred. Briefly, these are the recommended consistency checks and frequencies set out by SAP in “FAQ: SAP HANA Consistency Checks and Corruptions”:
| Tool | Minimum Frequency | Additional Information |
| --- | --- | --- |
| CHECK_TABLE_CONSISTENCY | Monthly | This is the main check tool and should be run regularly or scheduled to run. |
| uniqueChecker | Monthly | On SAP HANA revisions lower than SAP HANA 1.0 122.02, runs some additional checks compared with CHECK_TABLE_CONSISTENCY. |
| CHECK_CATALOG | On demand | The catalog check should be run in scenarios where there are suspected metadata issues. |
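As a rough illustration of the main check in the table, the sketch below only builds the SQL statement for a CHECK_TABLE_CONSISTENCY run. The function name `consistency_check_sql` and the userstore key `MYKEY` are my own placeholders, not anything from SAP. Passing no schema and table produces a statement that covers all tables.

```shell
# Print the SQL statement for a CHECK_TABLE_CONSISTENCY run.
# With no arguments, schema and table are NULL, which covers all tables.
consistency_check_sql() {
    # $1 = schema name (optional), $2 = table name (optional)
    if [ -n "${1:-}" ] && [ -n "${2:-}" ]; then
        printf "CALL CHECK_TABLE_CONSISTENCY('CHECK', '%s', '%s')\n" "$1" "$2"
    else
        printf "CALL CHECK_TABLE_CONSISTENCY('CHECK', NULL, NULL)\n"
    fi
}

# Hypothetical usage (MYKEY is an assumed hdbuserstore key):
#   hdbsql -U MYKEY "$(consistency_check_sql)"
```

Printing the statement rather than executing it keeps the sketch safe to try on any host; on a large system a full run can take a long time and is best scheduled.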
In addition to the corruptions which can occur at the logical layers, there are also corruptions which can occur at the lower layers underneath the database, for example at the data page level. Data pages are entities written to and read from disk for various purposes related to the persistence of logical data structures and objects. A data page can fail checks during backup or column load operations, which essentially means the page read from disk does not match what SAP HANA expects. Some examples of the errors which could be thrown can be found in the section “What are typical errors and solutions for corruptions on lower layers?” of the Consistency Checks and Corruptions FAQ.
Now, why is the ability to do page consistency checks independently from SAP HANA backups so important?
- Data page level consistency checks are only run during a data streaming style backup operation, i.e. via an SAP HANA Backint certified backup provider or a direct backup to disk.
- When creating application consistent or crash consistent storage snapshots, no consistency checks are run on data pages.
- It is not possible to know, without a full recovery, if a recovery from a storage snapshot will yield uncorrupted data.
Organizations that want to verify that the lower levels of the database persistence are in a good state, or that a storage snapshot is a good candidate before using it for recovery, can now do so using the SAP HANA Persistence Diagnosis Tool.
Using hdbpersdiag with Storage Snapshots on Pure Storage® FlashArray™
Important things to note before using the tool:
- This is an expert tool where only the “check all” function is available for public use.
- The tool is delivered as part of the SAP HANA 2.0 SPS05 installation in the /usr/sap/<sid>/HDB<instance number>/exe directory.
- The tool is not standalone; it depends on binaries and environment variables in an SAP HANA deployment.
- It should be run by the <sid>adm user from a command line terminal or ssh connection.
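The prerequisites above can be turned into a small preflight check. This is only a sketch: the function names are mine, and the user name pattern assumes the usual convention of a three-character SID (letter first) in lowercase followed by "adm".

```shell
# Build the expected hdbpersdiag path for a given SID and instance number.
hdbpersdiag_path() {
    # $1 = SID (for example SH1), $2 = two-digit instance number (for example 00)
    printf '/usr/sap/%s/HDB%s/exe/hdbpersdiag\n' "$1" "$2"
}

# Succeed only if the given user name looks like a <sid>adm account:
# a lowercase three-character SID (letter first) followed by "adm".
is_sidadm_user() {
    printf '%s\n' "$1" | grep -Eq '^[a-z][a-z0-9]{2}adm$'
}

# Hypothetical preflight before running the tool:
#   is_sidadm_user "$(id -un)" || echo "switch to the <sid>adm user first" >&2
#   [ -x "$(hdbpersdiag_path SH1 00)" ] || echo "hdbpersdiag not found" >&2
```

The check on the user name matters because the <sid>adm login shell is what sets up the environment variables the tool depends on.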
The SAP Note “How to Check the Consistency of the Persistence” describes the process of using hdbpersdiag. My example uses a crash consistent storage snapshot of both the log and data volumes on FlashArray. I will be using a system with multiple tenants, which does make a difference for illustrative purposes.
1. Take a storage snapshot of the log and data volumes for the SAP HANA system.
2. Copy the snapshots to new volumes and connect those volumes to the host with SAP HANA 2.0 SPS05 installed.
3. Mount the volumes to a location in the filesystem. In my example I have used /hana/testData as the location for SAP HANA data persistence. As the system I was mounting the new volumes to was the same one the volume snapshots were taken from, I needed to use “nouuid” as a mount option to avoid conflicting with the existing volumes.
Filesystem Size Used Avail Use% Mounted on
devtmpfs 2.3T 0 2.3T 0% /dev
tmpfs 3.4T 32K 3.4T 1% /dev/shm
tmpfs 2.3T 21M 2.3T 1% /run
tmpfs 2.3T 0 2.3T 0% /sys/fs/cgroup
/dev/mapper/3624a9370c49a4cb0e2944f440002d735-part2 60G 18G 43G 30% /
/dev/mapper/3624a9370c49a4cb0e2944f440002dc76 512G 44G 469G 9% /hana/shared
fileserver.puredoes.local:/mnt/nfs/HANA_Backup 1.0T 136G 889G 14% /hana/backup
tmpfs 454G 20K 454G 1% /run/user/469
tmpfs 454G 0 454G 0% /run/user/468
tmpfs 454G 0 454G 0% /run/user/0
/dev/mapper/3624a9370884890ea83bd488200012c64 7.0T 1.1T 6.0T 15% /hana/data
/dev/mapper/3624a9370884890ea83bd488200012c65 3.0T 673G 2.4T 22% /hana/log
tmpfs 454G 0 454G 0% /run/user/1001
/dev/mapper/3624a9370884890ea83bd488200012c6d 3.0T 673G 2.4T 22% /hana/testLog
/dev/mapper/3624a9370884890ea83bd488200012c6e 7.0T 1.1T 6.0T 15% /hana/testData
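For reference, the mount from step 3 can be sketched as follows. The helper only prints the command so it can be reviewed before it is run as root. It assumes the volumes are formatted with XFS, which is the filesystem that provides the `nouuid` option; the function name is a placeholder of mine.

```shell
# Print (rather than run) the mount command for a copied snapshot volume.
# "nouuid" is the XFS mount option that skips the duplicate-UUID check,
# which is needed when the original filesystem is mounted on the same host.
mount_cmd_for_copy() {
    # $1 = device path of the copied volume, $2 = mount point
    printf 'mount -o nouuid %s %s\n' "$1" "$2"
}

# Example with the device and mount point shown in the listing above:
#   mount_cmd_for_copy /dev/mapper/3624a9370884890ea83bd488200012c6e /hana/testData
```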
4. Identify the HANA database volumes within the data volume. With multiple tenants there will be multiple database volumes. Each folder starting with “hdb000x” is a HANA database volume, and hdbpersdiag is run against these folders to check the integrity of the SAP HANA data volume.
sh1adm@Hannah:/hana/testData/SH1/mnt00001> ll
total 4
drwxr-x--- 2 sh1adm sapsys 117 Jul 13 06:36 hdb00001
drwxr-xr-- 2 sh1adm sapsys 93 Jul 14 06:43 hdb00002.00003
drwxr-xr-- 2 sh1adm sapsys 93 Jul 14 07:00 hdb00002.00004
-rw-r--r-- 1 sh1adm sapsys 17 Jul 14 07:06 nameserver.lck
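On a system with many tenants it can help to list the database volume folders programmatically. A minimal sketch, assuming the folder naming shown in the listing (everything starting with "hdb"); the function name is my own:

```shell
# List the HANA database volume directories (hdb*) under a mount directory.
# Plain files such as nameserver.lck are skipped.
list_hana_volumes() {
    # $1 = mount directory, for example /hana/testData/SH1/mnt00001
    for dir in "$1"/hdb*/; do
        [ -d "$dir" ] && printf '%s\n' "${dir%/}"
    done
    return 0
}

# Example:
#   list_hana_volumes /hana/testData/SH1/mnt00001
```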
5. Run the hdbpersdiag tool to check for page data corruption on each HANA database volume.
Test the SystemDB volume:
/usr/sap/SH1/HDB00/exe/hdbpersdiag -c 'check all' /hana/testData/SH1/mnt00001/hdb00001
Output:
Loaded library 'libhdbunifiedtable'
Loaded library 'libhdblivecache'
Trace is written to: /usr/sap/SH1/HDB00/hannah/trace
Mounted DataVolume(s)
  #0 /hana/testData/SH1/mnt00001/hdb00001/ (2.7 GB, 2904342528 bytes)
Tips:
  Type 'help' for help on the available commands
  Use 'TAB' for command auto-completion
  Use '|' to redirect the output to a specific command. Available command(s) are:
    count  Count the number of lines
    dump   Save the output to a file
    grep   Print lines that contain a match for a pattern
    head   Print the first n lines
    more   Print text, one screen at a time
    tail   Print the last n lines
Default Anchor Page OK
Restart Page OK
Default Converter Pages OK
RowStore Converter Pages OK
Logical Pages (64750 pages) OK
Logical Pages Linkage OK
ContainerDirectory OK
ContainerNameDirectory OK
FileIDMappingContainer OK
UndoFileDirectory OK
LobDirectory OK
MidSizeLobDirectory OK
LobFileIDMap OK
Test the Tenant 1 volume:
/usr/sap/SH1/HDB00/exe/hdbpersdiag -c 'check all' /hana/testData/SH1/mnt00001/hdb00002.00003
Test the Tenant 2 volume:
/usr/sap/SH1/HDB00/exe/hdbpersdiag -c 'check all' /hana/testData/SH1/mnt00001/hdb00002.00004
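The three invocations above can be wrapped in a loop. The sketch below is an assumption-laden convenience, not part of the SAP tooling: the function name and the log directory layout are mine, and it assumes the tool exits on its own after `-c 'check all'`, as it did in my runs.

```shell
# Path to the tool; the default is the path used in this post and can be
# overridden through the HDBPERSDIAG environment variable.
HDBPERSDIAG="${HDBPERSDIAG:-/usr/sap/SH1/HDB00/exe/hdbpersdiag}"

# Run "check all" against every database volume under a mount directory
# and save the output of each run to its own log file.
check_all_volumes() {
    # $1 = mount directory, $2 = directory for the log files (hypothetical layout)
    mkdir -p "$2" || return 1
    for dir in "$1"/hdb*/; do
        [ -d "$dir" ] || continue
        vol="${dir%/}"
        "$HDBPERSDIAG" -c 'check all' "$vol" > "$2/$(basename "$vol").log" 2>&1
    done
}

# Example, run as the <sid>adm user:
#   check_all_volumes /hana/testData/SH1/mnt00001 /tmp/persdiag_logs
```

Keeping one log per volume makes it easy to see which tenant a failed check belongs to.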
After verifying the health of each SAP HANA volume, if everything was listed as OK, I felt the storage snapshot could be used as a recovery point at that time or in the future.
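If the output of each run is saved to a file, the "everything listed as OK" test can be scripted. This is a sketch under two assumptions: the list of check names is copied from the sample output in this post, so other versions of the tool may print a different set, and since I have not shown what a failing check looks like, the function only confirms that each expected check is present with the status OK.

```shell
# Check names taken from the sample output in this post (an assumption for
# other versions). "Logical Pages (N pages)" is handled separately below.
EXPECTED_CHECKS='Default Anchor Page
Restart Page
Default Converter Pages
RowStore Converter Pages
Logical Pages Linkage
ContainerDirectory
ContainerNameDirectory
FileIDMappingContainer
UndoFileDirectory
LobDirectory
MidSizeLobDirectory
LobFileIDMap'

# Succeed only if every expected check appears in the file with status OK.
all_checks_ok() {
    # $1 = file containing saved hdbpersdiag output
    status=0
    # This line carries a page count, so it needs its own pattern.
    grep -Eq '^[[:space:]]*Logical Pages \([0-9]+ pages\)[[:space:]]+OK[[:space:]]*$' "$1" || {
        echo "not OK or missing: Logical Pages" >&2; status=1; }
    while IFS= read -r name; do
        grep -Eq "^[[:space:]]*${name}[[:space:]]+OK[[:space:]]*\$" "$1" || {
            echo "not OK or missing: $name" >&2; status=1; }
    done <<EOF
$EXPECTED_CHECKS
EOF
    return "$status"
}

# Example:
#   all_checks_ok /tmp/persdiag_logs/hdb00001.log && echo "volume looks healthy"
```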
hdbpersdiag is a great tool to help ensure that an SAP HANA system’s persistence is healthy, but it also makes storage snapshots (both application consistent data snapshots and crash consistent snapshots) more appropriate to use as recovery points, as it overcomes the lack of consistency checks during the snapshot creation process.