Storage Configuration Best Practices for SAP HANA Tailored
White Paper

STORAGE CONFIGURATION BEST PRACTICES FOR SAP HANA TAILORED DATA CENTER INTEGRATION ON EMC VMAX AND VMAX3 STORAGE SYSTEMS

• SAP HANA VMAX (10K, 20K, and 40K), and VMAX3 (100K, 200K, and 400K) storage systems

EMC Solutions

Abstract
This white paper describes a new concept that removes limitations of the current SAP High Performance Analytical Appliance (HANA) appliance model. Using Tailored Data Center Integration (TDI) on EMC® VMAX® and VMAX3™ storage systems, customers can integrate SAP HANA into an existing, well-established data center infrastructure, providing multiple benefits.

February 2015

Copyright © 2014-15 EMC Corporation. All Rights Reserved.

EMC believes the information in this publication is accurate as of its publication date. The information is subject to change without notice.

The information in this publication is provided as is. EMC Corporation makes no representations or warranties of any kind with respect to the information in this publication, and specifically disclaims implied warranties of merchantability or fitness for a particular purpose.

Use, copying, and distribution of any EMC software described in this publication requires an applicable software license.

For the most up-to-date listing of EMC product names, see EMC Corporation Trademarks on EMC.com.

All trademarks used herein are the property of their respective owners.

Part Number H12342.3

Table of contents
Executive summary
  Business case
  Solution overview
  Key benefits
Introduction
  Purpose
  Scope
  Audience
  Terminology
Using VMAX and VMAX3 storage for SAP HANA
  Scale-up vs. scale-out
  SAP HANA TDI scalability
  SAP HANA persistence
  Capacity considerations
  Disk considerations
  HANA I/O patterns
    Data file system
    Log file system
  SAP HANA OS images on VMAX
  SAP HANA shared file system on VMAX
  Virtual environments
    SAP HANA persistence in virtual environment
    vSphere multipathing
Configuration recommendations using VMAX (10K, 20K, and 40K arrays) for SAP HANA
  Host connectivity
  FA director/port requirements
  VMAX scalability
  Virtual provisioning considerations
    RAID considerations
    Thin pools
    Meta volumes for data and log
  Masking view
    Initiator group
    Port group
    Storage group
Configuration recommendations using VMAX3 (100K, 200K, and 400K arrays) for SAP HANA
  FAST elements
    Disk group
    TDAT
    Data pool
    Storage resource pool
  Service level objective (SLO)
  Host connectivity
  FA-director/port requirements
  Masking view
    Initiator group
    Port groups
    Storage group
  VMAX3 scalability
Accessing VMAX storage from the SAP HANA nodes
  Native Linux multipathing (DM MPIO)
    SLES11
    RHEL 6.5
    Blacklist
  XFS file system
  Linux LVM
  SAP HANA storage connector API
    SAP HANA global.ini file
Conclusion
  Summary
  Findings
References
  EMC documentation
  VMware documentation
  SAP documentation
    Web resources
    Deployment option notes
    Virtualization note

Executive summary

Business case
SAP HANA is an in-memory platform that you can deploy locally (on-premises) or in the cloud. It is a revolutionary platform that is best suited for performing real-time analytics and developing and deploying real-time applications.
At the core of this real-time data platform is the SAP HANA database, which is different from any other database engine on the market today. Companies with large amounts of data need their data to be recent, available in real time, and available at high speed with fast response time and true interactivity. Companies also need data to be available without any pre-fabrication, with no data preparation, no pre-aggregates, and no tuning.

SAP HANA combines SAP software components that are optimized on proven hardware provided by SAP hardware partners. SAP HANA can be deployed in two different models, as shown in Figure 1:
• Appliance model
• Tailored Data Center Integration (TDI) model

Figure 1. SAP HANA appliance model versus the TDI model

By default, an SAP HANA appliance includes integrated storage, compute, and network components. The appliance is precertified by SAP, built by SAP HANA hardware partners, and shipped to customers with all software components preinstalled, including the operating systems and the SAP HANA software.

Compared to the appliance deployment model, the TDI approach is more open and provides greater flexibility. The SAP HANA servers must still meet the SAP HANA requirements and be certified HANA servers. However, the network and storage components can now be shared in customer environments. This allows customers to use their existing enterprise storage arrays for SAP HANA and enables them to seamlessly integrate SAP HANA into existing data center operations such as disaster recovery, data protection, monitoring, and management. This reduces the time-to-value, risk, and costs of the overall HANA adoption.

Solution overview
The enterprise storage arrays used in SAP HANA TDI deployments must be precertified by SAP to ensure that they meet the SAP HANA performance and functional requirements [1].
We [2] tested performance using the SAP HANA hardware configuration check tool (hwcct) and various SAP HANA load generators on EMC® 10K, 20K, and 40K VMAX® and 100K, 200K, and 400K VMAX3™ enterprise storage systems. Based on the results of these tests, this white paper describes the storage configuration recommendations for the VMAX and VMAX3 arrays that meet the SAP performance requirements (the SAP HANA TDI KPIs for data throughput and latency) and ensure the highest availability for database persistence on disk.

Note: SAP recommends that TDI customers run the hwcct tool in their environment to ensure that the customer-specific HANA TDI implementation meets the SAP performance criteria.

Key benefits
This solution provides the following benefits:
• Integrate HANA into an existing data center infrastructure.
• Use shared enterprise storage to rely on already-available multisite concepts and to benefit from established automation and operations processes.
• Transition easily to this new architecture and rely on EMC services to minimize risk.
• Use existing operational processes, skills, and tools, and avoid the large risks and costs associated with operational change.

[1] EMC VMAX and VMAX3 are certified by SAP.
[2] In this paper, "we" refers to the Global Solutions Engineering (GSE) team that validated the solution.

Introduction

Purpose
Since the introduction of SAP HANA, customers have had the option to deploy using the SAP HANA appliance model.
However, this model had the following limitations:
• Limited choice for servers, networks, and storage
• Inability to use existing data center infrastructure and operational processes
• Fixed sizes for SAP HANA storage capacities (storage was part of the appliance)
• Little knowledge and control of the critical components in the HANA appliance
• Inability to use existing data center infrastructure and operational costs, resulting in higher infrastructure startup costs
• Fixed sizes for HANA server and storage capacities, increasing costs due to lack of capacity and inability to respond rapidly to unexpected growth demands

This white paper describes a solution that uses SAP HANA in a TDI deployment scenario on EMC VMAX and VMAX3 enterprise storage. This solution reduces hardware and operational costs, lowers risks, and increases server and network vendor flexibility.

All configuration recommendations in this document are based on SAP requirements for high availability and the performance tests and results that are needed to meet the SAP key performance indicators (KPIs) for SAP HANA TDI.

Scope
This document provides best practices and tips for deploying the SAP HANA database on EMC VMAX and VMAX3 storage systems and provides the following information:
• Introduction to the key solution technologies
• Description of the configuration requirements for VMAX and VMAX3 with SAP HANA
• Instructions about how to access VMAX and VMAX3 storage from the SAP HANA nodes

Audience
This white paper is intended for system integrators, systems or storage administrators, customers, partners, and members of EMC professional services who need to configure a VMAX and VMAX3 storage array to be used in a TDI environment for SAP HANA.

Terminology
This white paper includes the terminology shown in Table 1.

Table 1. Terminology
Term                 Definition
HANA worker host     A HANA host which processes data
HANA standby host    A HANA host waiting to take over processing in case of a worker host failure

Using VMAX and VMAX3 storage for SAP HANA
SAP HANA is an in-memory database. The data is kept in the RAM of one or multiple SAP HANA worker hosts and all database activities such as reads, inserts, updates, or deletes are performed in the main memory of the host and not on disk. This differentiates SAP HANA from other traditional databases, where only a part of the data is cached in RAM and the remaining data resides on disk.

Scale-up vs. scale-out
You can install the SAP HANA database on a single-host system (scale-up) or on multi-host systems (scale-out).

In single-host environments, the database needs to fit into the RAM of a single server. Single-host systems are the preferred environments for online transaction processing (OLTP) type workloads such as SAP Business Suite.

In multi-host environments, the database tables are distributed across the RAM of multiple servers. Multi-host environments use worker and standby hosts. A worker host is an active component and accepts and processes database requests. A standby host is a passive component. It has all database services running, but no data in RAM. It is waiting for a failure of a worker host to take over its role. This process is called host auto-failover. Because the in-memory capacity in these deployments can be very high, scale-out HANA clusters are perfectly suited for online analytical processing (OLAP) type workloads with very large data sets.

SAP HANA TDI scalability
The SAP HANA TDI scalability defines the number of production HANA worker hosts (in scale-out installations) or single hosts (in scale-up installations) that can be connected to enterprise storage arrays and still meet the SAP performance KPIs for enterprise storage. Because the capacity used on disk for the HANA persistence is always related to the RAM capacity of the HANA database, the required capacity on disk for multiple HANA hosts is not the limiting factor in most cases. Enterprise storage arrays can provide much more capacity than required for HANA.

The scalability depends on several other factors. For example:
• Array model, cache size, disk types
• Bandwidth, throughput, and latency
• Overall use and resource consumption of the array
• How the HANA host is connected to the array
• Storage configuration of the HANA persistence

Note: The scalability numbers in this document for the VMAX and VMAX3 arrays are recommendations based on performance tests on various models. The actual number of HANA hosts that can be connected to a VMAX and VMAX3 in a customer environment can be higher or lower than the number of HANA hosts referred to in this document. Use the SAP HANA hwcct tool in customer environments to validate the SAP HANA performance and determine the maximum possible number of HANA hosts on a given storage array.

SAP HANA persistence
SAP HANA uses disk storage for the following purposes:
• To maintain the persistency of the in-memory data on disk, which prevents data loss due to a power outage and allows a host auto-failover, where a standby HANA host takes over the in-memory data of a failed worker host in scale-out installations
• To log information about data changes (redo log)

For these purposes, each SAP HANA worker host (scale-out) or single host (scale-up) requires two file systems on disk storage: a data file system and a log file system.

Capacity considerations
In general, the required capacity for the SAP HANA persistence on disk depends on the in-memory database size and the RAM size of the HANA servers.

Every SAP HANA customer must perform memory and CPU sizing as the first step in sizing an SAP HANA deployment. For new SAP HANA implementations, size the memory and CPU for an SAP HANA system using the HANA version of the SAP Quick Sizer tool, available on the SAP Service Marketplace website, or consult SAP for assistance. For systems that are migrating to SAP HANA, SAP provides tools and reports for proper HANA memory sizing.

After you determine the memory requirements, you can estimate the disk capacity requirements by using the sizing rules in the SAP white paper, SAP HANA Storage Requirements. File systems must include:
• 1x RAM for the data file system
• ½x RAM for the log file system for systems with 512 GB RAM or less, or at least 512 GB for the log file system for systems with more than 512 GB RAM

SAP refers to RAM size as the size of the database, in contrast to the physical memory size of the servers. For example, the HANA database can consume 1.3 TB RAM on a single host while the host has 2 TB physical RAM capacity. SAP recommends sizing based on the actual database size, in this example 1.3 TB.
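
As an illustration, the sizing rules above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not an SAP or EMC tool: the function name persistence_capacity_gb and its parameters are hypothetical, all capacities are in GB, only worker hosts are counted, and a RAM size of exactly 512 GB is assumed to fall under the "512 GB or less" rule.

```python
def persistence_capacity_gb(ram_gb, worker_hosts=1):
    """Estimate data and log file system capacity (in GB) from the sizing rules.

    Hypothetical helper for illustration only. ram_gb is the RAM size used
    for sizing per worker host; worker_hosts is the number of worker hosts.
    """
    data_per_host = ram_gb              # 1x RAM for the data file system
    if ram_gb <= 512:                   # assumption: exactly 512 GB counts as "512 GB or less"
        log_per_host = ram_gb / 2       # 1/2x RAM for the log file system
    else:
        log_per_host = 512              # minimum of 512 GB for larger systems
    data_total = data_per_host * worker_hosts
    log_total = log_per_host * worker_hosts
    return {
        "data_gb": data_total,
        "log_gb": log_total,
        "total_gb": data_total + log_total,
    }


# Six worker hosts sized at 2 TB (2,048 GB) RAM each
print(persistence_capacity_gb(2048, worker_hosts=6))
```

For six worker hosts with 2,048 GB RAM each, the sketch yields 12,288 GB for data and 3,072 GB for log, or 15,360 GB (15 TB) in total. The log value is the minimum from the rule, so a real deployment may choose a larger log file system.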

Note: SAP sizing requirements do not consider future growth of the database. However, in certain situations, you may need to expand the size of a data or log file system. EMC recommends sizing the database persistence (the data and log file systems) based, at a minimum, on the physical RAM size of the HANA hosts.

To calculate the required usable storage capacity for the HANA persistence of a scale-up (single-host) or scale-out (multi-host) appliance, the following details are required:
• (A): RAM size of a HANA worker host
• (B): Number of HANA worker hosts

For example, use the following formulas to calculate the required capacity for a 6+1 HANA scale-out appliance where each server has 2 TB RAM:
• Total capacity for data = (A) * (B) = 2 TB * 6 = 12 TB
• Total capacity for log = 512 GB * (B) = 512 GB * 6 = 3 TB
• Total usable capacity for the HANA persistence = 12 TB + 3 TB = 15 TB

You can either use any free capacity or add additional capacity to accommodate file systems that do not have performance requirements, such as operating system LUNs (see SAP HANA OS images on VMAX for details) and the HANA shared file system (see SAP HANA shared file system on VMAX for details).

Disk considerations
Because of the specific workload of the SAP HANA database with primarily write I/Os, use the following disk types:
• 10 k rpm disks
• 15 k rpm disks
• Enterprise flash disks (EFD)

For SAP HANA on VMAX 10K, 20K or 40K arrays, we recommend using a single tier (drive type/technology and RAID protection) strategy. This is because all writes to VMAX storage are sent to VMAX persistent cache and are later written to the disk media. For this reason, the HANA write workload will benefit primarily from VMAX cache prior to any Fully Automated Storage Tiering (FAST) and multi-tier advantages.
A single tier strategy provides an adequate solution for HANA and simplifies deployment.

VMAX 10K, 20K or 40K storage allows you to separate the HANA workload from a non-HANA workload on a shared storage array by using a dedicated storage disk group. A dedicated disk group can be used if the impact of non-HANA applications on shared disks is too high so that the HANA hosts will no longer meet the performance requirements. This is not a requirement and the HANA devices can also reside on shared disks.

With VMAX3 100K, 200K or 400K storage, performance for certain applications is controlled by the service level objective provisioning and host limits. With VMAX3 and because of the enhancements implemented to the FAST technology, you can now combine 10 k or 15 k rpm hard disk drives (HDDs) and EFDs.

The number of disks required for the HANA persistence depends on the disk type (10 k or 15 k rpm or EFD), capacity requirements and the RAID protection. A mirrored (RAID-1) protection in the VMAX array provides the best performance for applications with heavy write activities such as SAP HANA. This applies primarily to 10 k rpm and 15 k rpm drives. EFDs can be configured as RAID-5, either 3+1 or 7+1 (3+1 may offer higher availability, especially for large EFDs).

To meet the host IOPS requirements on 10 k or 15 k rpm disks, distribute the HANA persistence across a certain number of disks. A HANA worker host generates approximately 1,200 I/O operations per second (IOPS). A 10 k rpm HDD can support approximately 120 IOPS and a 15 k rpm HDD can support approximately 150 IOPS. For example, in the 6+1 HANA scale-out installation (6 worker hosts and one standby host, 7,200 total host IOPS) and 1-tier storage configuration, distribute the persistence across at least 60 x 10 k rpm or 48 x 15 k rpm disks.
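
The disk-count arithmetic above can be sketched as follows. This is a minimal sketch that uses only the approximate figures from this section (about 1,200 IOPS per worker host, 120 IOPS per 10 k rpm HDD, and 150 IOPS per 15 k rpm HDD). The function and constant names are hypothetical, and the sketch covers only the IOPS requirement, not capacity or RAID protection.

```python
import math

HOST_IOPS = 1200                        # approximate IOPS generated by one HANA worker host
DISK_IOPS = {"10k": 120, "15k": 150}    # approximate IOPS supported per HDD, by spindle speed


def min_disks_for_iops(worker_hosts, disk_type):
    """Return the minimum number of HDDs needed to serve the worker hosts' IOPS.

    Hypothetical helper for illustration; standby hosts generate no load
    and are therefore not counted.
    """
    total_iops = worker_hosts * HOST_IOPS
    return math.ceil(total_iops / DISK_IOPS[disk_type])


# The 6+1 scale-out example: 6 worker hosts, 7,200 total host IOPS
for disk_type in ("10k", "15k"):
    print(disk_type, min_disks_for_iops(6, disk_type))
```

For six worker hosts this gives 60 disks at 10 k rpm and 48 disks at 15 k rpm, matching the example. The result is a lower bound from IOPS alone; the chosen disk size must still satisfy the capacity requirement.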
Choose the disk size that meets the capacity requirements and provides the best total cost of ownership (TCO). For example, with the 6+1 scale-out installation, 15 TB of usable capacity is required for the HANA persistence. Table 2 compares the usable capacity of the different disk sizes and the RAID (mirrored) protection.

Table 2. Disk size (10 k or 15 k rpm) for 15 TB HANA persistence
Disk size   Usable capacity   Usable capacity   Usable capacity   Comments
            per disk          with 60 disks     mirrored
300 GB      268 GB            16,080 GB         8,040 GB          Does not meet capacity requirements
400 GB      366 GB            21,960 GB         10,980 GB         Does not meet capacity requirements
600 GB      536 GB            32,160 GB         16,080 GB         Meets capacity requirements with the best TCO. Future growth is limited.
900 GB      820 GB            49,200 GB         24,600 GB         Meets capacity requirements and enables future growth.

HANA I/O patterns
The SAP HANA persistent file systems have different I/O patterns, which are described in detail in the SAP HANA Storage Requirements white paper.

Data file system
Access is primarily random to the data file system with various block sizes from small 4 K up to large 64 M blocks. The data is written asynchronously with parallel I/Os to the data file system. During normal operations, most of the I/Os to the data file system are writes and data is read from the file system only during database restart, high availability (HA) failover, or a column store table load.

Log file system
Access to the log file system is primarily sequential with various block sizes from 4 K up to 1 M blocks. SAP HANA keeps a 1 M buffer for the redo log in memory and whenever the buffer is full, it is synchronously written to the log file system. When a database transaction is committed before the log buffer is full, a smaller block is written to the file system.
Because data is written synchronously to the log file system, low latency for I/O to the storage device is important, especially for the smaller 4 K and 16 K block sizes. As with the data file system, during normal database operations, most of the I/Os to the log file system are writes, and data is read from the log file system only during database restart, HA failover, log backup, or database recovery.

SAP HANA OS images on VMAX
You can boot SAP HANA nodes from either local disks or from SAN and VMAX3 devices. If you boot from a SAN, follow the best practices documented in the “Booting from SAN” section of the EMC Host Connectivity Guide for Linux.

The capacity required for the operating system is approximately 100 GB per HANA host (worker and standby) and includes capacity for the /usr/sap directory.

SAP HANA shared file system on VMAX
In an SAP HANA scale-out implementation, install the SAP HANA database binaries on a shared file system that is exposed to all hosts of a system under a /hana/shared mount point. If a host needs to write a memory dump (which can reach up to 90 percent of the RAM size), it will be stored in this file system. Guided by the specific customer infrastructure and requirements, the options for the file systems are:
• With VMAX block storage, you can create a shared file system using a cluster file system such as Oracle Cluster File System 2 (OCFS2) on top of the block LUNs. SUSE Linux provides OCFS2 capabilities with the high availability package, and a SUSE license is required. The high availability package is also part of the SUSE Linux Enterprise Server (SLES) for SAP applications distribution from SAP that is used by most of the HANA appliance vendors.
• NAS systems, such as EMC VNX®, can be used instead of OCFS2 to provide an NFS share for the HANA shared file system.
• The embedded NAS offering (eNAS) of the VMAX3 arrays can provide the NFS share.

The size of the HANA shared file system should be the total RAM size of the database, which is the number of worker hosts multiplied by the RAM size of a single node. SAP HANA Storage Requirements provides more details.

Virtual environments
Customers have the option to run SAP HANA on VMware virtualized infrastructure (vSphere). Some restrictions and limitations apply to virtualized environments, such as the maximum RAM size of a HANA node. Review the corresponding SAP OSS notes and follow the VMware best practices to deploy SAP HANA on VMware vSphere.

For the SAP HANA persistence on VMAX arrays in virtual environments, all physical configuration recommendations in this document also apply. However, with virtual environments you should also consider the following:

SAP HANA persistence in virtual environment
Add the data and log LUN for a virtual HANA host to the ESX host and create a Virtual Machine File System (VMFS) datastore for each LUN. You can then create one virtual disk per VMFS datastore and add it as the data or log LUN to the HANA virtual machine. Refer to VMware best practices for an optimized virtual SCSI adapter.

vSphere multipathing
A HANA virtual machine does not use Linux Device Mapper Multipath within the virtual machine. Each data or log LUN is visible as a single device (for example, /dev/sdb), and the XFS file system must be created on this single device. On the ESX host, however, we recommend using EMC PowerPath/VE to intelligently manage I/O paths and to optimize I/O performance.

Configuration recommendations using VMAX (10K, 20K, and 40K arrays) for SAP HANA
The following configuration recommendations apply to production HANA systems deployed on VMAX enterprise 10K, 20K, and 40K storage arrays.
Production HANA systems in TDI environments must meet the SAP performance requirements (KPIs) and the following special configuration requirements.

Host connectivity

The HANA nodes connect to the VMAX arrays through a Fibre Channel SAN. All SAN components require 8 Gb/s link speed, and the SAN topology must follow best practices with all redundant components and links.

FA director/port requirements

Special attention is required when connecting HANA nodes to the front-end director ports (FA ports) of a VMAX array. On a VMAX director, two FA-ports share one dedicated CPU core. For example, FA-1E:0 and FA-1E:1 share the same core. To achieve full I/O performance for production HANA deployments, consider the following FA-port requirements for a VMAX array:

• Dedicate FA-ports to HANA and do not share them with non-HANA applications.
• Use only one FA-port per CPU core on the I/O module and do not use the adjacent port. For example, use FA-1E:0 and leave FA-1E:1 unused. Do not use the adjacent port for non-HANA applications.
• Never connect a single Host Bus Adapter (HBA) to both ports of the same director.
• The minimum number of FA-ports required for HANA depends on the number of HANA nodes connected to a single VMAX engine. Use Table 3 to determine the required number of FA-ports:

Table 3. VMAX 10K, 20K, and 40K FA-ports for HANA worker nodes

HANA worker nodes    Required FA-ports
1-2                  2
3-4                  3
5-6                  4
7-8                  5
9-10                 6
11-12                8

  For example, if 16 HANA nodes are connected to a dual-engine VMAX, use 5 ports on each engine for only HANA.
• Connect all HANA hosts to all FA-ports dedicated to HANA. The more available I/O paths for a host, the better the SAP HANA performance.
  Note: This rule applies to VMAX 10K, 20K and 40K, but not to the VMAX3 family.
• Balance FA-ports used for HANA across all available VMAX engines.
• Use 8 Gb/s FC ports.
  While 10 Gb/s iSCSI or Fibre Channel over Ethernet (FCoE) can be used, we have not validated it for SAP HANA. 2 Gb/s or 4 Gb/s FC ports are not supported for HANA.

Figure 2 and Figure 3 show the rear view of the VMAX engines with 4-port FC I/O modules (8 Gb/s) for host connectivity. We recommend using the I/O ports marked with a yellow box for HANA connectivity. The adjacent ports should be left unused.

Figure 2. Rear view of a VMAX 10K engine
Figure 3. Rear view of a VMAX 20K and 40K engine

VMAX scalability

In a 10K, 20K, or 40K VMAX array, the scalability of SAP HANA primarily depends on the number of available engines in the array. Table 4 shows the VMAX models and the estimated maximum number of HANA worker hosts that can be connected according to the number of available engines.

Table 4. VMAX 10K, 20K, and 40K scalability

VMAX model    Number of available engines    Maximum number of HANA worker hosts
10K           1                              12
10K           2                              18
10K           3                              24
10K           4                              30
20K           1                              12
20K           2                              20
20K           3                              28
20K           4                              36
20K           5                              44
20K           6                              52
20K           7                              60
20K           8                              68
40K           1                              12
40K           2                              22
40K           3                              32
40K           4                              42
40K           5                              52
40K           6                              62
40K           7                              72
40K           8                              82

When EMC Symmetrix Remote Data Facility (SRDF) is used for SAP HANA storage replication, fewer front-end FA-ports are available, and the number of HANA worker hosts that can be connected to the array must be adjusted accordingly.

Virtual provisioning considerations

VMAX arrays use EMC Virtual Provisioning™ to provide capacity to an application. The capacity is allocated using virtual provisioning data devices (TDAT) and provided in thin pools based on the disk technology and RAID type. Thin devices (TDEV) are host-accessible devices bound to thin pools and natively striped across the pool to provide the highest performance.
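As a rough planning aid, the FA-port sizing rule (Table 3) and the scalability estimates (Table 4) given earlier in this section can be combined into a small lookup. The following Python sketch is illustrative only: the function names are hypothetical, it assumes HANA worker nodes are spread evenly across the engines (as in the 16-node, dual-engine example), and it does not model the reduction in available FA-ports when SRDF is used.

```python
import math

# Table 3: (upper bound of worker nodes per engine, required FA-ports per engine).
FA_PORTS_BY_NODES = [(2, 2), (4, 3), (6, 4), (8, 5), (10, 6), (12, 8)]

# Table 4: estimated maximum HANA worker hosts per VMAX model and engine count.
MAX_WORKERS = {
    "10K": {1: 12, 2: 18, 3: 24, 4: 30},
    "20K": {1: 12, 2: 20, 3: 28, 4: 36, 5: 44, 6: 52, 7: 60, 8: 68},
    "40K": {1: 12, 2: 22, 3: 32, 4: 42, 5: 52, 6: 62, 7: 72, 8: 82},
}


def fa_ports_per_engine(workers, engines):
    """Return the FA-ports needed on each engine (assumes an even node spread)."""
    if workers < 1 or engines < 1:
        raise ValueError("workers and engines must be positive")
    per_engine = math.ceil(workers / engines)
    for upper, ports in FA_PORTS_BY_NODES:
        if per_engine <= upper:
            return ports
    raise ValueError("more than 12 worker nodes per engine is outside Table 3")


def check_plan(model, engines, workers):
    """Validate a plan against Table 4, then size the FA-ports using Table 3."""
    limit = MAX_WORKERS[model][engines]
    if workers > limit:
        raise ValueError(f"{workers} workers exceeds the Table 4 estimate of {limit}")
    return fa_ports_per_engine(workers, engines)


# The example from the text: 16 HANA nodes on a dual-engine array.
print(check_plan("20K", 2, 16))
```

For the 16-node, dual-engine case the sketch returns 5 ports per engine, which matches the worked example that accompanies Table 3.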
RAID considerations

To provide best write performance for the HANA persistence, RAID-1 mirrored configurations are required for the TDATs on 10 k or 15 k rpm disks. You can configure TDATs on EFDs using RAID-5, either 3+1 or 7+1. We recommend 3+1.

Thin pools

We recommend creating one thin pool for all HANA data volumes and a second thin pool for the HANA log volumes in a VMAX array. However, if a limited number of disks are available in smaller HANA environments, performance could be improved by using a single thin pool for both devices. Thin pools consist of TDATs. The number and size of the TDATs in a thin pool depend on the SAP HANA capacity requirements and must be configured using VMAX configuration best practices.

Meta volumes for data and log

Each HANA worker host requires one data and one log volume for the persistent file systems. The sizes of these volumes depend on the Capacity considerations described earlier in this document.

Masking view

VMAX uses masking views to assign storage to a host. We recommend creating a single masking view for each HANA host. A masking view consists of the following components:

• Initiator group
• Port group
• Storage group

Initiator group

The initiator group contains the initiators (WWNs) from the host bus adaptors (HBAs) of the HANA host. Connect each HANA host to the VMAX array with at least two HBAs for redundancy.

Port group

The port group contains the front-end director ports to which the HANA host is connected. Table 3 lists the minimum number of ports required for a HANA installation with multiple hosts (either a scale-out installation or multiple scale-up hosts). For the VMAX 10K, 20K and 40K arrays, the more ports assigned to the HANA hosts, the better the performance. However, ensure that a single HBA connects to only one port per director.
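The rule that a single HBA connects to only one port per director can be checked mechanically against a zoning plan. The Python sketch below is a minimal illustration: the HBA labels and the shape of the zoning dictionary are hypothetical, and it assumes port names follow the FA-1E:0 pattern used in this document, where the part before the colon identifies the director.

```python
from collections import defaultdict
from typing import Dict, List


def director_of(port: str) -> str:
    """Extract the director from a port name, for example 'FA-1E:0' -> '1E'."""
    name = port.split(":", 1)[0]
    return name[3:] if name.startswith("FA-") else name


def zoning_violations(zoning: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Map each HBA to the directors on which it is zoned to more than one port."""
    violations = {}
    for hba, ports in zoning.items():
        ports_per_director = defaultdict(set)
        for port in ports:
            ports_per_director[director_of(port)].add(port)
        bad = sorted(d for d, p in ports_per_director.items() if len(p) > 1)
        if bad:
            violations[hba] = bad
    return violations


# Hypothetical plan: hba1 is zoned to both ports of director 1F, which breaks the rule.
plan = {
    "hana01_hba0": ["FA-1E:0", "FA-2E:0"],
    "hana01_hba1": ["FA-1F:0", "FA-1F:1"],
}
print(zoning_violations(plan))
```

An empty result means every HBA in the plan reaches each director through at most one port.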
Storage group

An SAP HANA scale-out cluster uses the shared-nothing concept for the persistence of the database, where each HANA worker host uses its own pair of data and log volumes and has exclusive access to the volumes during normal operations. If a HANA worker host fails, the HANA persistence of the failed host is used on a standby host. This requires that all persistent devices are visible to all HANA hosts because every host can become a worker or a standby host. The VMAX storage groups of a HANA database must contain all persistent devices of the database cluster. The HANA name server, in combination with the SAP HANA storage connector API, will take care of proper mounting and I/O fencing of the persistence.

Configuration recommendations using VMAX3 (100K, 200K, and 400K arrays) for SAP HANA

The following configuration recommendations apply to production HANA systems deployed on EMC VMAX3 100K, 200K, and 400K enterprise storage arrays. Production HANA systems in TDI environments must meet the SAP performance requirements (KPIs) and the following special configuration requirements.

FAST elements

EMC Fully Automated Storage Tiering (FAST™) automates the identification of active or inactive application data for reallocating that data across different pools within a VMAX3 storage array. FAST proactively monitors workloads to identify busy data that would benefit from being moved to higher-performing drives, while also identifying less-busy data that could be moved to higher-capacity drives, without affecting existing performance. This promotion/demotion activity is based on achieving service level objectives that set performance targets for associated applications, with FAST determining the most appropriate pool to allocate data on.
With VMAX3, the following storage elements are pre-configured for ease of manageability and cannot be changed:

• Storage disk groups
• TDATs
• Data pools
• Storage resource pool

Disk group

A disk group is a collection of available physical drives in the VMAX3. Each drive in a disk group shares the same characteristics, determined by the following:

• Rotational speed for HDDs (15 k, 10 k, 7.2 k)
• EFD
• Capacity

SAP HANA requires 15 k or 10 k rpm HDDs or EFDs. HDDs with 7.2 k rpm do not meet the SAP HANA performance requirements.

TDAT

Each disk group is pre-configured with data devices (TDAT) based on EMC best practices for size and RAID protection. SAP HANA requires RAID-1 (mirrored) on HDDs and RAID-5 3+1 or 7+1 on EFDs. For EFDs, RAID-5 3+1 is best.

Data pool

All TDATs in a storage disk group are added to a data pool. A data pool is a collection of TDAT devices, and there is a 1-to-1 relationship between data pools and disk groups. The performance capability of each data pool is based on the drive type, speed, capacity, quantity of drives, and RAID protection.

Storage resource pool

A storage resource pool (SRP) is a collection of data pools that make up a FAST domain. A data pool can only be included in one SRP. VMAX3 ships with a single preconfigured SRP. EMC support is required for custom configurations.

Figure 4 shows a sample VMAX3 configuration and a single SRP with multiple disk groups and data pools. In larger HANA environments and where the separation of the HANA workload is required, we recommend using a dedicated SRP for the HANA devices. If HANA is installed on a multi-tier SRP, the Service Level Objective (SLO) provisioning must be used to ensure that HANA data is allocated on a higher tier.

Figure 4.
VMAX3 FAST elements

Service level objective (SLO)

In VMAX3 arrays, the FAST technology is enhanced and now delivers SLO performance levels. Thin devices can be added to storage groups, and storage groups can be assigned to a specific SLO to set performance expectations. The SLO defines the response time for the storage group. FAST continuously monitors and adapts the workload to maintain (or meet) the response time target.

There are five available service level objectives, varying in expected average response time targets. There is an additional Optimized SLO that has no explicit response time target associated with it. Table 5 lists the available SLOs.

Table 5. VMAX3 SLOs

SLO                    Behavior                                                Expected average response time
Diamond                Emulates EFD performance                                0.8 ms
Platinum               Emulates performance between EFD and 15 k rpm drives    3.0 ms
Gold                   Emulates 15 k rpm performance                           5.0 ms
Silver                 Emulates 10 k rpm performance                           8.0 ms
Bronze                 Emulates 7.2 k rpm performance                          14.0 ms
Optimized (default)    Achieves optimal performance by placing most active     N/A
                       data on higher performance storage and least active
                       data on most cost-effective storage

The actual response time of an application associated with each SLO varies based on the actual workload seen on the application and depends on average I/O size, read/write ratio, and the use of local or remote replication.

If the HANA devices are created on a dedicated SRP with just EFDs and/or 10 k or 15 k rpm HDDs, then select the Optimized SLO. If you do not use a dedicated SRP, then select at least the Platinum SLO to ensure that data is allocated on EFDs and/or on 10 k or 15 k rpm disks.

You can add one of the four workload types shown in Table 6 to the SLO that you selected (except for Optimized) to further refine response time expectations.

Table 6.
VMAX3 service workloads

Workload                         Description
OLTP                             Small block I/O workload
OLTP with replication            Small block I/O workload with local or remote replication
Decision Support System (DSS)    Large block I/O workload
DSS with replication             Large block I/O workload with local or remote replication

To improve the latency for small 4 K I/O operations on HANA log devices, assign the OLTP workload type to the HANA storage group.

Host connectivity

The HANA nodes connect to the VMAX3 arrays through a Fibre Channel SAN. All SAN components require 8 Gb/s or 16 Gb/s link speed, and the SAN topology should follow best practices with all redundant components and links.

FA-director/port requirements

Because CPU cores are dynamically allocated to FA director-ports in VMAX3 arrays, you can connect HANA hosts to any port on a VMAX3 director. However, connect (or zone) a single HBA initiator to only one FA port per director. You can achieve increased availability and performance by connecting each HANA node to different directors, and by using multiple host initiators.

Note: You will see no performance or availability benefits from connecting the same host initiator to multiple ports on the same director. If you do connect the same host initiator to multiple ports on the same FA, contact EMC support to enable the VMAX3 Fixed Block Architecture (FBA) Enable Dual Port flag that allows SCSI-3 reservations handling in this way.

To achieve full I/O performance for production HANA deployments, consider the following FA-port requirements for the VMAX3 array:

• Dedicate FA-ports to HANA and do not share them with non-HANA applications.
• Do not connect a single HBA port to more than one port on the same director.
• Use Table 7 to determine the required number of FA-ports, which can be distributed across the available engines.
  The number of FA ports required for HANA depends on the number of HANA nodes connected to a single VMAX3 engine.

Table 7. VMAX 100K, 200K, and 400K FA-ports for HANA worker nodes

HANA worker nodes    Required FA-ports
1-4                  2
5-8                  4
9-16                 8
17-20                12

• Distribute FA-ports used for HANA across all available VMAX3 engines and balance them between directors. For example, if 16 HANA nodes are connected to a VMAX3 with two engines, balance the connectivity across the engines. Use 8 ports on each engine (4 per director) for HANA only.
• Use 8 Gb/s or 16 Gb/s FC ports.
• Ensure that the zoning between host initiators and storage ports does not cross switches or use inter-switch links (ISLs).

Figure 5 shows the rear view of the VMAX3 engine. Each engine has two directors with up to 16 FC front-end ports (ports 4-11 and 24-31).

Figure 5. Rear view of a VMAX3 engine with FA-port assignments

Masking view

VMAX3 uses masking views to assign storage to hosts. Create a single masking view for each HANA host. A masking view consists of the following components:

• Initiator group
• Port group
• Storage group

Initiator group

The initiator group contains the Port WWN (PWWN) initiators of the HBAs of the HANA host. Connect each HANA host to the VMAX array with at least two HBAs. Initiator groups can be cascaded, and a masking view initiator group can be a collection of the initiator groups from each of the HANA nodes.

Port groups

The port group contains all the VMAX3 front-end ports to which the HANA hosts are connected. If the VMAX3 Enable Dual Port FBA flag is set as described in FA-director/port requirements, then all FA ports used by HANA can be defined in a single port group. However, if this flag has not been set, then it is important to ensure that a single HBA port connects to only one FA port per director.
You can do this by using multiple port groups, as shown in the example in Figure 6. Figure 6 shows an environment with 12 HANA hosts. Each host with two HBAs is connected with dual fabric to the FA-ports of a dual engine VMAX3.

Figure 6. VMAX3 SAN connectivity and port groups

In this example, we created two port groups. Each port group contains four FA-ports (a total of 8 for 12 HANA hosts), balanced across two engines and the two directors per engine. Port group PG01 is used by HANA hosts hana01-hana06. Port group PG02 is used by hana07-hana12. This connectivity ensures that a single HBA connects to only one port per director.

In this example, all 12 HANA hosts belong to the same HANA database scale-out cluster. Therefore, all HANA persistent devices (data and log) belong to the same VMAX storage group. If the HANA hosts belong to multiple clusters, then one storage group per HANA cluster is required. The port group assignment does not change even with multiple HANA clusters.

Storage group

An SAP HANA scale-out cluster uses the shared-nothing concept for the persistence of the database, where each HANA worker host uses its own pair of data and log volumes and has exclusive access to the volumes during normal operations. If a HANA worker host fails, the HANA persistence of the failed host will be used on a standby host. This concept requires that all persistent devices be visible to all HANA hosts because every host can become a worker or a standby host. The VMAX3 storage groups of a HANA database must contain all persistent devices of the database cluster. The HANA name server, in combination with the SAP HANA Storage Connector API, will take care of proper mounting and I/O fencing of the persistence.
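The masking-view layout from the 12-host example (one masking view per host, hosts split across port groups PG01 and PG02, and one storage group shared by the whole scale-out cluster) can be written down as a simple plan before it is configured on the array. The Python sketch below only builds that plan as data; it does not call any array management tool. The MV_, IG_, and SG_ naming prefixes and the cluster label are hypothetical, and the even, in-order split of hosts across port groups is an assumption taken from the example.

```python
import math
from typing import Dict, List


def build_masking_views(hosts: List[str], port_groups: List[str],
                        cluster: str) -> List[Dict[str, str]]:
    """Plan one masking view per host.

    Hosts are split evenly, in order, across the port groups, and every host
    of the cluster shares one storage group so that any host can take over
    the persistence of a failed worker.
    """
    if not hosts or not port_groups:
        raise ValueError("at least one host and one port group are required")
    chunk = math.ceil(len(hosts) / len(port_groups))
    views = []
    for index, host in enumerate(hosts):
        views.append({
            "masking_view": f"MV_{host}",            # hypothetical naming scheme
            "initiator_group": f"IG_{host}",         # the host's HBA initiators
            "port_group": port_groups[index // chunk],
            "storage_group": f"SG_{cluster}",        # all data and log devices
        })
    return views


hosts = [f"hana{n:02d}" for n in range(1, 13)]
for view in build_masking_views(hosts, ["PG01", "PG02"], "HANA_CLUSTER1"):
    print(view)
```

With multiple HANA clusters, the same function could be called once per cluster with a different cluster label, which yields one storage group per cluster while the port group split follows the same rule.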
VMAX3 scalability

In a 100K, 200K, or 400K VMAX3 array, the scalability of SAP HANA primarily depends on the number of available engines in the array. Table 8 shows the VMAX3 models and the estimated maximum number of HANA nodes that can be connected according to the number of available engines:

Table 8. VMAX 100K, 200K, and 400K scalability

VMAX3 model    Engines    Maximum HANA nodes
100K           1          12
100K           2          20
200K           1          16
200K           2          28
200K           3          40
200K           4          52
400K           1          20
400K           2          32
400K           3          44
400K           4          56
400K           5          68
400K           6          80
400K           7          92
400K           8          104

If you use SRDF for SAP HANA storage replication, fewer front-end FA-ports are available, and the maximum number of HANA worker hosts that can be connected to the array must be adjusted accordingly.

Accessing VMAX storage from the SAP HANA nodes

The SAP HANA database requires SUSE Linux SLES11 or Red Hat RHEL 6.5 as the operating system on the HANA nodes. To access the VMAX block devices from the HANA nodes, ensure that zoning is based on SAN best practices. A single HBA must connect to only one port per director.

Native Linux multipathing (DM MPIO)

To access the block devices from the HANA nodes, first enable native Linux multipathing. Follow the steps described in EMC Host Connectivity Guide for Linux to enable Linux DM-MPIO on Red Hat Linux RHEL 6.5 or SUSE SLES11.
The following sections provide examples of multipath.conf files:

SLES11

    ## This is a template multipath-tools configuration file
    ## Uncomment the lines relevant to your environment
    ##
    defaults {
    #       udev_dir                /dev
    #       polling_interval        10
    #       selector                "round-robin 0"
    #       path_grouping_policy    multibus
    #       getuid_callout          "/lib/udev/scsi_id -g -u -d /dev/%n"
    #       prio                    const
    #       path_checker            directio
    #       rr_min_io               100
    #       max_fds                 8192
    #       rr_weight               priorities
    #       failback                immediate
    #       no_path_retry           fail
            user_friendly_names     no
    }
    blacklist {
    ## Replace the wwid with the output of the command
    ## 'scsi_id -g -u -s /block/[internal scsi disk name]'
    ## Enumerate the wwid for all internal scsi disks.
    ## Optionally, the wwid of VCM database may also be listed here
    ##
            wwid 35005076718d4224d
            devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
            devnode "^hd[a-z][[0-9]*]"
            devnode "^cciss!c[0-9]d[0-9]*[p[0-9]*]"
    }

RHEL 6.5

    ## This is a template multipath-tools configuration file
    ## Uncomment the lines relevant to your environment
    ##
    defaults {
    #       udev_dir                /dev
    #       polling_interval        10
    #       selector                "round-robin 0"
    #       path_grouping_policy    multibus
    #       getuid_callout          "/sbin/scsi_id -g -u -s /block/%n"
    #       prio_callout            /bin/true
    #       path_checker            readsector0
    #       rr_min_io               100
    #       rr_weight               priorities
    #       failback                immediate
    #       no_path_retry           fail
            user_friendly_names     no
    }
    ## The wwid line in the following blacklist section is shown as an example
    ## of how to blacklist devices by wwid. The 3 devnode lines are the
    ## compiled in default blacklist. If you want to blacklist entire types
    ## of devices, such as all scsi devices, you should use a devnode line.
    ## However, if you want to blacklist specific devices, you should use
    ## a wwid line.
    ## Since there is no guarantee that a specific device will
    ## not change names on reboot (from /dev/sda to /dev/sdb for example)
    ## devnode lines are not recommended for blacklisting specific devices.
    ## Note: Remove # to enable the devnode blacklist. You can add the WWID
    ## for the Symmetrix VCM database, as shown in this example. The VCM
    ## database is a read-only device that is used by the array. By
    ## blacklisting it you will eliminate any error messages that could
    ## occur because of its presence.
    blacklist {
            wwid 360060480000190101965533030303230
            devnode "^(ram|raw|loop|fd|md|dm-|sr|scd|st)[0-9]*"
            devnode "^hd[a-z]"
            devnode "^cciss!c[0-9]d[0-9]*"
    }

The HANA persistent devices should be visible on a HANA worker host after a reboot or a rescan (command rescan-scsi-bus.sh). Type the following command to verify that all devices are present and that each device has the number of active paths you configured:

    $ multipath -ll
    360000970000298700460533030303238 dm-10 EMC,SYMMETRIX
    size=512G features='0' hwhandler='0' wp=rw
    `-+- policy='round-robin 0' prio=1 status=active
      |- 2:0:5:1 sdai 66:32  active ready running
      |- 1:0:5:1 sdby 68:192 active ready running
      `- 2:0:4:1 sdcg 69:64  active ready running
    360000970000298700460533030303338 dm-11 EMC,SYMMETRIX
    size=512G features='0' hwhandler='0' wp=rw
    `-+- policy='round-robin 0' prio=1 status=active
      |- 2:0:5:3 sdak 66:64  active ready running
      |- 1:0:5:3 sdca 68:224 active ready running
      `- 2:0:4:3 sdci 69:96  active ready running
    360000970000298700460533030303438 dm-12 EMC,SYMMETRIX
    size=1.5T features='0' hwhandler='0' wp=rw
    `-+- policy='round-robin 0' prio=1 status=active
      |- 2:0:5:5 sdam 66:96  active ready running
      |- 1:0:5:5 sdcc 69:0   active ready running
      `- 2:0:4:5 sdck 69:128 active ready running
    360000970000298700460533030303538 dm-14 EMC,SYMMETRIX
    size=1.5T features='0' hwhandler='0' wp=rw
    `-+- policy='round-robin 0' prio=1 status=active
      |- 2:0:5:6 sdan 66:112 active ready running
      |- 1:0:5:6 sdcd 69:16  active ready running
      `- 2:0:4:6 sdcl 69:144 active ready running

XFS file system

The XFS file system provides the best performance for both HANA data and log block devices. To format a block device with the XFS file system, type the following command on the HANA node:

    $ mkfs.xfs /dev/mapper/360000970000298700460533030303238

Note: Run this command for all block devices. If a file system must be expanded, use the xfs_growfs command on the Linux host after the volume has been expanded on the VMAX.

Linux LVM

You can use Logical Volume Management (LVM) on the HANA host to manage devices in a more flexible way. This document assumes that all HANA persistent devices are presented to the HANA hosts as a single device and that LVM is not required. In environments where customers need more flexibility and the size of HANA persistent devices has to be adjusted in more granular increments than are available with a MetaLUN expansion on the VMAX, LVM could help to address these challenges. LVM requires the use of a special storage connector API (fcClientLVM), which is part of the SAP HANA software distribution.

SAP HANA storage connector API

In an SAP HANA scale-out environment with worker and standby nodes, the SAP HANA storage connector API for Fibre Channel (fcClient) mounts and unmounts the devices on the HANA nodes. If LVM is used, a special version of the API (fcClientLVM) is required. In addition to mounting the devices, the storage connector API also writes SCSI-3 PR (Persistent Reservations) to the devices using the Linux sg_persist command. This is called I/O fencing and ensures that at any given time only one HANA worker host has access to a set of data and log devices.

SAP HANA global.ini file

The storage connector API is controlled in the storage section of the SAP HANA global.ini file.
This section contains entries for the block devices with optional mount options. The WWIDs of the partition entries can be determined using the multipath -ll command on the HANA hosts.

This is an example of a global.ini file:

    [persistence]
    basepath_datavolumes=/hana/data/ANA
    basepath_logvolumes=/hana/log/ANA
    use_mountpoints = yes

    [storage]
    ha_provider = hdb_ha.fcClient
    partition_*_*__prType = 5
    partition_1_data__wwid = 360000970000298700460533030303438
    partition_1_log__wwid = 360000970000298700460533030303238
    partition_2_data__wwid = 360000970000298700460533030303538
    partition_2_log__wwid = 360000970000298700460533030303330
    partition_3_data__wwid = 360000970000298700460533030303638
    partition_3_log__wwid = 360000970000298700460533030303338

Conclusion

Summary

Using SAP HANA in Tailored Datacenter Integration (TDI) deployments with EMC VMAX and VMAX3 enterprise storage arrays provides many benefits, including reducing hardware and operational costs, lowering risks, and increasing hardware vendor flexibility, for both SAP and non-SAP applications.

Findings

This solution provides the following benefits:

• Integrate HANA into an existing data center infrastructure.
• Use shared enterprise storage to rely on already-available, multisite concepts to benefit from established automation and operations processes.
• Transition easily to this new architecture and rely on EMC services to minimize risk.
• Use existing operational processes, skills, and tools, and avoid the large risks and costs associated with operational change.
References

EMC documentation

You can find the following EMC documentation on EMC.com or on EMC Online Support:

• EMC VMAX3 Family (100K, 200K, 400K) Documentation Set: Contains the hardware platform product guide and TimeFinder product guide for the VMAX 10K, VMAX 20K, VMAX 40K, VMAX3 100K, VMAX3 200K, and VMAX3 400K.
• EMC Symmetrix System Viewer for Desktop and iPad: Illustrates VMAX and VMAX3 system hardware, incrementally scalable system configurations, and available host connectivity that is offered for Symmetrix systems.
• EMC Host Connectivity Guide for Linux
• EMC Cloud Enabled Infrastructure for SAP–Business white paper

VMware documentation

You can find the following VMware documentation at http://www.vmware.com:

• Best Practices and Recommendations for Scale-up Deployments of SAP HANA on VMware vSphere

SAP documentation

You can find the following SAP HANA documentation at http://help.sap.com/hana/:

• SAP HANA Master Guide
• SAP HANA Server Installation and Update Guide
• SAP HANA Studio Installation and Update Guide
• SAP HANA Technical Operations Manual
• SAP HANA Administration Guide
• SAP HANA Storage Requirements

Web resources

• SAP HANA Appliance
• SAP HANA One
• SAP HANA Enterprise Cloud
• SAP HANA Tailored Data Center Integration

Note: The following documentation requires an SAP username and password.
Deployment option notes

• Note 1681092–Multiple SAP HANA databases on one appliance
• Note 1661202–Support for multiple applications on SAP HANA
• Note 1666670–BW on SAP HANA; Landscape deployment planning

Virtualization note

• Note 1788665–SAP HANA running on VMware vSphere VMs