H14777: ProtectPoint File System Agent
PROTECTPOINT FILE SYSTEM AGENT WITH VMAX3 – BACKUP & RECOVERY BEST PRACTICE FOR ORACLE ON ASM
EMC® VMAX® Engineering White Paper

ABSTRACT
The integration of ProtectPoint™ File System Agent, TimeFinder® SnapVX, and the Data Domain system allows Oracle database backup and restore to take place entirely within the integrated system. This capability not only reduces host I/O and CPU overhead, allowing the host to focus on servicing database transactions, but also provides higher efficiency for the backup and recovery process.

April, 2015

EMC WHITE PAPER
To learn more about how EMC products, services, and solutions can help solve your business and IT challenges, contact your local representative or authorized reseller, visit www.emc.com, or explore and compare products in the EMC Store.

Copyright © 2015 EMC Corporation. All Rights Reserved. EMC believes the information in this publication is accurate as of its publication date. The information is subject to change without notice. The information in this publication is provided “as is.” EMC Corporation makes no representations or warranties of any kind with respect to the information in this publication, and specifically disclaims implied warranties of merchantability or fitness for a particular purpose. Use, copying, and distribution of any EMC software described in this publication requires an applicable software license. For the most up-to-date listing of EMC product names, see EMC Corporation Trademarks on EMC.com.

Part Number H14777

TABLE OF CONTENTS
EXECUTIVE SUMMARY
  AUDIENCE
PRODUCT OVERVIEW
  Terminology
  VMAX3 Product Overview
  Data Domain Product Overview
  ProtectPoint Product Overview
ORACLE AND PROTECTPOINT FILE SYSTEM AGENT CONSIDERATIONS
  RMAN Backup Integrations with Backup Media Managers
  RMAN and ProtectPoint File System Agent Integration Points with ASM
  ProtectPoint File System Agent and Oracle Real Application Clusters (RAC)
  ProtectPoint File System Agent and Remote Replications with SRDF
  Backup to Disk with SnapVX
  Command Execution Permissions for Oracle Users
  ProtectPoint Configuration File, Backup Devices, and ASM Changes
ORACLE BACKUP AND RECOVERY USE CASES WITH PROTECTPOINT FILE SYSTEM AGENT
  Oracle Database Backup/Recovery Use Cases – The Big Picture
  Backup and Recovery Use Cases Setup
  Backup Oracle ASM Database Using ProtectPoint
  Using Mount Host to Pick a Backup-Set to Copy to Production (4a, Recoverable)
  Using Mount Host and Database Clone for Logical Recovery (4a, Restartable)
  RMAN Recovery of Production Without SnapVX Copy (4b)
  RMAN Recovery of Production After Copy, Overwriting Production Data Devices (4c)
CONCLUSION
APPENDIXES
  Appendix I – ProtectPoint System Setup
  Appendix II – Sample CLI Commands: SnapVX, ProtectPoint, Data Domain
  Appendix III – Providing Solutions Enabler Access to non-root Users
  Appendix IV – Scripts Used in the Use Cases
REFERENCES

EXECUTIVE SUMMARY
Many applications are required to be fully operational 24x7x365, and the data for these applications continues to grow. At the same time, their RPO and RTO requirements are becoming more stringent. As a result, there is a large gap between the requirement for fast and efficient protection and the ability to meet this requirement without disruption. Traditional backup is unable to meet this requirement due to the inefficiencies of reading and writing all the data during full backups. More importantly, during recovery the recovery process itself (‘roll forward’) cannot start until the initial image of the database is fully restored, which can take a very long time.

This has led many datacenters to use snapshots for more efficient protection. However, snapshot data is typically left within the primary storage array together with its source, risking the loss of both in the event of data center failure or storage unavailability. Also, often there is no strong integration between the database backup process, managed by the database administrator, and the snapshot operations, managed by the storage administrator. Finally, it is more advantageous to store the backups in media that does not consume primary storage and also benefits from deduplication, compression, and remote replication, such as the Data Domain system offers.

EMC ProtectPoint addresses these gaps by integrating best-in-class EMC products, the VMAX3 storage array and the Data Domain system, making the backup and recovery process more automated, efficient, and integrated. The integration of ProtectPoint, TimeFinder SnapVX and the Data Domain system allows Oracle ASM database backup and restore to take place entirely within the integrated system! This capability not only reduces host I/O and CPU overhead, allowing the host to focus on servicing database transactions, but also provides higher efficiency for the backup and recovery process.

Backup efficiencies are introduced by not requiring any read or write I/Os of the data files by the host. Instead, TimeFinder SnapVX creates a snapshot, which is a valid backup of the database, and then copies it directly to the Data Domain system, leveraging VMAX3 Federated Tiered Storage (FTS). For Oracle databases prior to 12c, Hot Backup mode is used, though only for a few seconds – regardless of the size of the database. As soon as the snapshot is created, Hot Backup mode is ended immediately.
The snapshot is then incrementally copied to the Data Domain system in the background, while database operations continue as normal. Oracle 12c offers a new feature, Oracle Storage Snapshot Optimization, that allows database backups without the need for Hot Backup mode, leveraging storage snapshot consistency, which is an inherent feature of SnapVX. The combination of Oracle 12c, VMAX3, and Data Domain allows the highest backup efficiency.

Restore efficiencies are introduced in a similar way by not requiring any read or write I/Os of the data files by the host. Instead, Data Domain places the required backup-set on its encapsulated restore devices. The restore devices can be directly mounted to a Mount host for small-scale data retrievals, or mounted to the Production host and cataloged with RMAN so RMAN recover functionality can be used for production-database recovery (e.g. fixing physical block corruption, missing datafiles, etc.). A third option is available, where the restore devices’ content is copied by SnapVX, overwriting the native VMAX3 Production devices. This option is best used when the Production database requires a complete restore from backup, or for a large-scale recovery that should not be performed from the encapsulated Data Domain devices.

Note: This white paper addresses the values and best practices of ProtectPoint File System Agent v1.0 and VMAX3 where the Oracle database resides on ASM. It does not cover ProtectPoint for Oracle databases residing on file systems.

Note: At this time ProtectPoint File System Agent requires an approved RPQ for Oracle ASM deployments. By itself, TimeFinder is fully supported to create Oracle ASM recoverable and restartable replicas and backups, as has been the case for many years.

AUDIENCE
This white paper is intended for database and system administrators, storage administrators, and system architects who are responsible for implementing, managing, and maintaining an Oracle database backup and recovery strategy with VMAX3 storage systems. It is assumed that readers have some familiarity with Oracle and the EMC VMAX3 family of storage arrays, and are interested in achieving higher database availability, performance, and ease of storage management.

PRODUCT OVERVIEW

TERMINOLOGY
The following list explains important terms used in this paper.

• Oracle Automatic Storage Management (ASM) – Oracle ASM is a volume manager and a file system for Oracle database files that supports single-instance Oracle Database and Oracle Real Application Clusters (Oracle RAC) configurations. Oracle ASM is Oracle’s recommended storage management solution that provides an alternative to conventional volume managers, file systems, and raw devices.

• Oracle Real Application Clusters (RAC) – Oracle Real Application Clusters (RAC) is a clustered version of Oracle Database based on a comprehensive high-availability stack that can be used as the foundation of a database cloud system as well as a shared infrastructure, ensuring high availability, scalability, and agility for applications.

• Restartable vs. Recoverable database state – Oracle distinguishes between a restartable and a recoverable state of the database. A restartable state requires all log, data, and control files to be consistent (see ‘Storage consistent replications’). Oracle can be simply started, performing automatic crash/instance recovery without user intervention.
Recoverable state, on the other hand, requires database media recovery, rolling forward transaction logs to achieve data consistency before the database can be opened.

• RTO and RPO – Recovery Time Objective (RTO) refers to the time it takes to recover a database after a failure. Recovery Point Objective (RPO) refers to the amount of data loss after the recovery completes, where RPO=0 means no data loss of committed transactions.

• Storage consistent replications – Storage consistent replications refer to storage replications (local or remote) in which the target devices maintain write-order fidelity. That means that for any two dependent I/Os that the application issues, such as a log write followed by a data update, either both will be included in the replica, or only the first. To the Oracle database the snapshot data looks like it does after a host crash or an Oracle ‘shutdown abort’, a state from which Oracle can simply recover by performing crash/instance recovery when starting. Starting with Oracle 11g, Oracle allows database recovery from storage consistent replications without the use of hot-backup mode (details in Oracle support note 604683.1). The feature has become more integrated with Oracle 12c and is called Oracle Storage Snapshot Optimization.

• VMAX3 Federated Tiered Storage (FTS) – Federated Tiered Storage (FTS) is a feature of VMAX3 that allows an external storage system to be connected to the VMAX3 backend and provide physical capacity that is managed by VMAX3 software.

• VMAX3 HYPERMAX OS – HYPERMAX OS is the industry’s first open converged storage hypervisor and operating system. It enables VMAX3 to embed storage infrastructure services like cloud access, data mobility and data protection directly on the array. This delivers new levels of data center efficiency and consolidation by reducing footprint and energy requirements. In addition, HYPERMAX OS delivers the ability to perform real-time and non-disruptive data services.

• VMAX3 Storage Group – A collection of host-addressable VMAX3 devices. A Storage Group can be used to (a) present devices to a host (LUN masking), (b) specify FAST Service Levels (SLOs) for a group of devices, and (c) manage grouping of devices for replication software such as SnapVX and SRDF®. Storage Groups can be cascaded, such as child storage groups used for setting FAST Service Level Objectives (SLOs) and the parent used for LUN masking of all the database devices to the host.

• VMAX3 TimeFinder SnapVX – TimeFinder SnapVX is the latest generation in TimeFinder local replication software, offering higher scale and a wider feature set while maintaining the ability to emulate legacy behavior.

• VMAX3 TimeFinder SnapVX Snapshot vs. Clone – Previous generations of TimeFinder referred to a snapshot as a space-saving copy of the source device, where capacity was consumed only for data changed after the snapshot time. Clone, on the other hand, referred to a full copy of the source device. With VMAX3, TimeFinder SnapVX snapshots are always space-efficient. When they are linked to host-addressable target devices, the user can choose to keep the target devices space-efficient, or perform a full copy.

VMAX3 PRODUCT OVERVIEW

Introduction to VMAX3
The EMC VMAX3 family of storage arrays is built on the strategy of simple, intelligent, modular storage, and incorporates a Dynamic Virtual Matrix interface that connects and shares resources across all VMAX3 engines, allowing the storage array to seamlessly grow from an entry-level configuration into the world’s largest storage array.
It provides the highest levels of performance and availability featuring new hardware and software capabilities. The newest additions to the EMC VMAX3 family, VMAX 100K, 200K and 400K, deliver the latest in Tier-1 scale-out multi-controller architecture with consolidation and efficiency for the enterprise. It offers dramatic increases in floor tile density, high capacity flash and hard disk drives in dense enclosures for both 2.5" and 3.5" drives, and supports both block and file (eNAS).

The VMAX3 family of storage arrays comes pre-configured from factory to simplify deployment at customer sites and minimize time to first I/O. Each array uses Virtual Provisioning to allow the user easy and quick storage provisioning. While VMAX3 can ship as an all-flash array with the combination of EFD (Enterprise Flash Drives) and large persistent cache that accelerates both writes and reads even further, it can also ship as hybrid, multi-tier storage that excels in providing FAST (Fully Automated Storage Tiering; see Footnote 1) enabled performance management based on Service Level Objectives (SLO). VMAX3 new hardware architecture comes with more CPU power, larger persistent cache, and a new Dynamic Virtual Matrix dual InfiniBand fabric interconnect that creates an extremely fast internal memory-to-memory and data-copy fabric. Figure 1 shows possible VMAX3 components. Refer to EMC documentation and release notes to find the latest supported components.

Figure 1 VMAX3 storage array (see Footnote 2)
• 1 – 8 VMAX3 Engines
• Up to 4 PB usable capacity
• Up to 256 FC host ports
• Up to 16 TB global memory (mirrored)
• Up to 384 Cores, 2.7 GHz Intel Xeon E5-2697-v2
• Up to 5,760 drives
• SSD Flash drives 200/400/800/1,600 GB 2.5”/3.5”
• 300 GB – 1.2 TB 10K RPM SAS drives 2.5”/3.5”
• 300 GB 15K RPM SAS drives 2.5”/3.5”
• 2 TB/4 TB SAS 7.2K RPM 3.5”

To learn more about VMAX3 and FAST best practices with Oracle databases refer to the white paper: Deployment best practice for Oracle database with VMAX3 Service Level Object Management.

VMAX3 Federated Tiered Storage
Federated Tiered Storage (FTS) is a feature of VMAX3 that allows external storage to be connected to the VMAX3 backend and provide physical capacity that is managed by VMAX3 software. Attaching external storage to a VMAX3 enables the use of physical disk capacity on a storage system that is not a VMAX3 array, while gaining access to VMAX3 features, including cache optimizations, local and remote replications, data management, and data migration. The external storage devices can be encapsulated by VMAX3, and therefore their data preserved and independent of VMAX3-specific structures, or presented as raw disks to VMAX3, where HYPERMAX OS will initialize them and create native VMAX3 device structures.

Footnote 1: Fully Automated Storage Tiering (FAST) allows VMAX3 storage to automatically and dynamically manage performance service level goals across the available storage resources to meet the application I/O demand, even as new data is added, and access patterns continue to change over time.
Footnote 2: Additional drive types and capacities may be available. Contact your EMC representative for more details.

FTS is implemented entirely within HYPERMAX OS and does not require any additional hardware besides the VMAX3 and the external storage. Connectivity with the external array is established using fiber channel ports.
Note: While the external storage presented via FTS is managed by VMAX3 HYPERMAX OS and benefits from many of the VMAX3 features and capabilities, the assumption is that the external storage provides storage protection and therefore VMAX3 will not add its own RAID to the external storage devices.

By leveraging FTS, VMAX3 and Data Domain become an integrated system in which TimeFinder SnapVX local replication technology operates in coordination with Data Domain using ProtectPoint File System Agent software, providing a powerful Oracle database backup and recovery solution.

VMAX3 SnapVX Local Replication Overview
EMC TimeFinder SnapVX software delivers instant and storage-consistent point-in-time replicas of host devices that can be used for purposes such as the creation of gold copies, patch testing, reporting and test/dev environments, backup and recovery, data warehouse refreshes, or any other process that requires parallel access to, or preservation of, the primary storage devices. The replicated devices can contain the database data, Oracle home directories, data that is external to the database (e.g. image files), message queues, and so on.

VMAX3 TimeFinder SnapVX combines the best aspects of previous TimeFinder offerings and adds new functionality, scalability, and ease-of-use features. Some of the main SnapVX capabilities related to native snapshots (emulation mode for legacy behavior is not covered):
• With SnapVX, snapshots are natively targetless. They only relate to a group of source devices and cannot be otherwise accessed directly. Instead, snapshots can be restored back to the source devices, or linked to another set of target devices which can be made host-accessible.
• Each source device can have up to 256 snapshots that can be linked to up to 1024 targets.
• Snapshot operations are performed on a group of devices. This group is defined by using either a text file specifying the list of devices, a ‘device-group’ (DG), a ‘composite-group’ (CG), a ‘storage group’ (SG), or simply by specifying the devices. The recommended way is to use a storage group.
• Snapshots are taken using the establish command. When a snapshot is established, a snapshot name is provided, and an optional expiration date. The snapshot time is saved with the snapshot and can be listed. Snapshots also get a ‘generation’ number (starting with 0). The generation is incremented with each new snapshot, even if the snapshot name remains the same.
• SnapVX provides the ability to create either space-efficient replicas or full-copy clones when linking snapshots to target devices. Use the “-copy” option to copy the full snapshot point-in-time data to the target devices during link. This will make the target devices a stand-alone copy. If the “-copy” option is not used, the target devices provide the exact snapshot point-in-time data only until the link relationship is terminated, saving capacity and resources by providing space-efficient replicas.
• SnapVX snapshots themselves are always space-efficient as they are simply a set of pointers pointing to the data source when it is unmodified, or to the original version of the data when the source is modified. Multiple snapshots of the same data utilize both storage and memory savings by pointing to the same location and consuming very little metadata.
• SnapVX snapshots are always consistent. That means that snapshot creation always maintains write-order fidelity. This allows easy creation of restartable database copies, or Oracle recoverable backup copies based on Oracle Storage Snapshot Optimization. Snapshot operations such as establish and restore are also consistent – that means that the operation either succeeds or fails for all the devices as a unit.
• Linked-target devices cannot ‘restore’ any changes directly to the source devices. Instead, a new snapshot can be taken from the target devices and linked back to the original source devices. In this way, SnapVX allows an unlimited number of cascaded snapshots.
• FAST Service Levels apply to either the source devices or to snapshot linked targets, but not to the snapshots themselves. SnapVX snapshot data resides in the same Storage Resource Pool (SRP) as the source devices, and acquires an ‘Optimized’ FAST Service Level Objective (SLO) by default.

See Appendixes for a list of basic TimeFinder SnapVX operations. For more information on SnapVX refer to the TechNote: EMC VMAX3™ Local Replication, and the EMC Solutions Enabler Product Guides.
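To illustrate these operations (and to complement the samples in Appendix II), the following is a hedged sketch of a typical establish/link/monitor sequence using Solutions Enabler. The array ID (-sid 123), the storage group names (prod_data_sg, backup_tgt_sg), and the snapshot name are illustrative only and are not taken from this paper’s environment:

# Create a consistent, targetless snapshot of the source storage group
symsnapvx -sid 123 -sg prod_data_sg -name daily_db_snap establish

# Link the snapshot to a target storage group; -copy makes the targets a stand-alone full copy
symsnapvx -sid 123 -sg prod_data_sg -lnsg backup_tgt_sg -snapshot_name daily_db_snap link -copy

# Review the snapshots and link/copy state for the storage group
symsnapvx -sid 123 -sg prod_data_sg list -detail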
DATA DOMAIN PRODUCT OVERVIEW

Introduction to Data Domain
Data Domain deduplication storage systems offer a cost-effective alternative to tape that allows users to enjoy the retention and recovery benefits of inline deduplication, as well as network-efficient replication over the wide area network (WAN) for disaster recovery (DR).

Figure 2 EMC Data Domain deduplication storage systems

Data Domain systems reduce the amount of disk storage needed to retain and protect data by 10 to 30 times. Data on disk is available online and onsite for longer retention periods, and restores become fast and reliable. Storing only unique data on disk also means that data can be cost-effectively replicated over existing networks to remote sites for DR. With the industry’s fastest deduplication storage controller, Data Domain systems allow more backups to complete faster while putting less pressure on limited backup windows. All Data Domain systems are built as the data store of last resort, which is enabled by the EMC Data Domain Data Invulnerability Architecture – end-to-end data verification, continuous fault detection and self-healing, and other resiliency features transparent to the application.

Understanding Data Domain device encapsulation and SnapVX relationship
With Data Domain being the external storage behind FTS, the Data Domain devices are encapsulated to preserve their data structures. In that way, the Data Domain system can be used with a different VMAX3 storage array if necessary. The ability to encapsulate Data Domain devices as VMAX3 devices allows TimeFinder SnapVX to operate on them.

Understanding Data Domain backup and restore devices
The VMAX3 integration with Data Domain uses two identical sets of encapsulated devices: backup devices, and restore devices.
• The encapsulated backup devices are used as a backup target, and therefore SnapVX will copy the backup data to them. After the incremental copy completes, Data Domain uses it to create a static-image for each, and together the static-images create a backup-set, benefiting from deduplication, compression, and remote replications capabilities.
• The encapsulated restore devices are used for database restore operations. They can be mounted directly to Production or a Mount host, or their data can be copied with SnapVX link-copy to VMAX3 native devices, overwriting them.

Understanding Data Domain static-images and backup-sets
A full overview of the Data Domain system is beyond the scope of this paper.
However, it is important to mention a few basic Data Domain components that are used in this integration.
• A static-image is created for each backup device within the Data Domain system once the devices have received all their data from SnapVX. Static-images benefit from the Data Domain File System capabilities of deduplication and compression, and can add metadata to describe their content. Since the backup and restore devices are overwritten with each new backup or restore respectively, it is the static-images that are kept as distinct backups in the Data Domain system and presented via the ProtectPoint catalog.
• Static-images with a matching backup id are called backup-sets. ProtectPoint maintains and lists the backup-sets with their metadata to help select the appropriate backup-set to restore.

For more information on Data Domain visit: http://www.emc.com/data-protection/data-domain/index.htm

Data Domain Block Device Service
Data Domain supports a variety of protocols, including CIFS, NFS, VTL, and now also a block device service that enables it to expose devices as FC targets. The block device service in Data Domain is called vdisk and allows the creation of backup and restore Data Domain devices that can be encapsulated by VMAX3 FTS and used by ProtectPoint. Table 1 depicts the basic Data Domain vdisk block device object hierarchy, which is also shown in Figure 3.

Table 1 Data Domain block device hierarchy
• Pool – Similar to a ‘Department’ level. Maximum of 32 pools with DD OS 5.4 and above.
• Device Group – Similar to the ‘Application’ level. Maximum of 1024 device groups per pool.
• Device – Host device equivalent. Maximum of 2048 devices.

Figure 3 Data Domain block device hierarchy

Note: When preparing Data Domain devices, it is recommended that all matching Data Domain backup and restore devices belong to the same Data Domain device group.

Note: Data Domain can replicate the backups to another Data Domain system by using Data Domain Replicator (separately licensed). While the Data Domain file system structure is not covered in this paper, keep in mind that the replication granularity is currently a single backup-set, where future releases of Data Domain OS may offer additional capabilities.

PROTECTPOINT PRODUCT OVERVIEW

ProtectPoint Product Overview
EMC ProtectPoint provides faster, more efficient backup while eliminating the backup impact on application servers. By integrating industry-leading primary storage and protection storage, ProtectPoint reduces cost and complexity by eliminating traditional backup applications while still providing the benefits of native backups. Overall, ProtectPoint provides the performance of snapshots with the functionality of backups. Some key values of ProtectPoint are:
• Non-intrusive data protection: Eliminate backup impact on the application by removing the server from the data path and minimizing the traditional backup window.
• Fast backup and instant access: Meet stringent application protection SLAs with ProtectPoint by backing up directly from primary storage to a Data Domain system. By eliminating traditional backup applications, ProtectPoint enables faster, more frequent backup for enterprise applications.
• Application owner control: ProtectPoint provides application owners and database administrators with complete control of their own backup, recovery and replication directly from their native application utilities. This empowers application owners with the control they desire without additional cost and complexity.
• Simple and efficient: ProtectPoint eliminates the complexity of traditional backup and introduces unparalleled efficiency by minimizing infrastructure requirements. ProtectPoint eliminates backup impact on the local area network (LAN) and minimizes storage area network (SAN) bandwidth requirements by sending only unique data from primary storage to protection storage. By protecting all data on the Data Domain system, ProtectPoint reduces backup storage requirements by 10 to 30 times.
• Reliable protection: Since ProtectPoint backs up data to a Data Domain system, it is protected with the Data Domain Data Invulnerability Architecture that provides the industry’s best defense against data integrity issues. Inline write-and-read verification, continuous fault detection, and self-healing ensure that backup data is accurately stored, retained, and recoverable throughout its lifecycle on a Data Domain system.

Figure 4 ProtectPoint Underlying Technology

Figure 4 illustrates the ProtectPoint underlying technology components. The ProtectPoint Agent includes two components: the ProtectPoint File System Agent and the Application Agent. When the Application Agent executes the backup, it is fully managed by the application, such as RMAN, leveraging RMAN Proxy Copy functionality yet utilizing the power of storage snapshots. It can also copy individual files to Data Domain via the network when necessary. ProtectPoint File System Agent is a command line tool available when the full application integration is not (for example, RMAN does not support Proxy Copy for Oracle ASM). The image shows the storage snapshot of the Production devices, which always represents a full backup (a point-in-time copy of Production’s data). The snapshot data is sent incrementally over the FC links from the VMAX3 storage array directly to the Data Domain system as a full backup. Since TimeFinder tracks changed data, it only sends the changes to Data Domain. Data Domain then stores the full backup image with compression and deduplication on its file system.

ProtectPoint File System Agent Product Overview
EMC ProtectPoint is a software product that takes advantage of the integration between best-in-class EMC products, VMAX3 with FTS and Data Domain, to provide backup offload optimization and automation. ProtectPoint includes both the Application Agent and the File System Agent. The following discussion is focused on the ProtectPoint File System Agent integration with VMAX3 for Oracle databases deployed on ASM.

VMAX3 ProtectPoint benefits for Oracle databases deployed on ASM:
• ProtectPoint allows Oracle database backup and restore operations to take place entirely within the integrated system of VMAX3 and Data Domain. This backup-offload capability reduces host I/O overhead and CPU utilization, and allows the host to focus its resources on servicing database transactions.
• ProtectPoint database backup efficiencies:
  o Backup does not require host resources or any read or write I/Os of the data files by the host. Instead, TimeFinder SnapVX creates an internal point-in-time consistent snapshot of the database, and then copies it directly to the Data Domain system, leveraging VMAX3 Federated Tiered Storage (FTS).
  o For Oracle databases prior to 12c (see Footnote 3), where Hot Backup mode is still needed, it is only required for a few seconds – regardless of the size of the database. As soon as the snapshot is created, Hot Backup mode is ended immediately.
  o For Oracle databases starting with release 12c, Oracle offers a new feature: Oracle Storage Snapshot Optimization that allows database backups without the need for Hot-Backup mode. Instead, Oracle is leveraging storage snapshot consistency, which is an inherent feature of SnapVX.
  o TimeFinder SnapVX snapshots are incremental. After the initial full-copy during system setup, all future copies (backups) between the source database devices and the Data Domain backup devices will only copy changed data.
  o Although only changed data is copied to Data Domain, each snapshot (backup) is a full point-in-time image of the production data files. Therefore, Data Domain backup-sets are always full (level 0), which means that no additional time will be spent during recovery to apply incremental backups.
  o ProtectPoint utilizes Data Domain deduplication and optional compression to reduce the size of backup-sets. It maintains a catalog of the backups. It can also manage Data Domain replication of backups to a remote Data Domain system for additional protection.
• ProtectPoint database restore efficiencies:
  o Restoring the data from Data Domain does not require any read or write I/Os of the data files by the host. Instead, Data Domain places the required backup-set on its restore encapsulated devices in seconds.
  o The encapsulated restore devices can be mounted to a Mount host for surgical recovery and small-scale data retrievals. Alternatively they can be mounted to the Production host and cataloged with RMAN so the full scale of RMAN functionality can be used for production database recovery. A third option exists where the encapsulated restore devices are used by SnapVX to copy their data over the native Production VMAX3 devices (if the Production database requires a complete restore from backup).

For more information on ProtectPoint see: EMC ProtectPoint: A Detailed Review, EMC ProtectPoint: Solutions Guide and ProtectPoint: Implementation Guide.

ProtectPoint File System Agent Components
ProtectPoint, when deployed with VMAX3 and Oracle ASM, is based on the following key components, as described in Figure 4.
• Production host (or hosts in the case of Oracle RAC) with VMAX3 native devices hosting the Oracle database.
  o A minimum of 3 sets of database devices should be defined for maximum flexibility: data/control files, redo logs, and FRA (archive logs), each in its own Oracle ASM disk group (e.g. +DATA, +REDO, +FRA).
  o The separation of data, redo, and archive log files allows ProtectPoint to backup and restore only the appropriate file type at the appropriate time. For example, Oracle backup procedures require the archive logs to be copied later than the data files. Also, during restore, if the redo logs are still available on Production, you can restore only data files without overwriting the Production’s redo logs, etc.
Note: When Oracle RAC is used it is recommended to use a dedicated ASM disk group for Grid (e.g. +GRID) that does not contain any application data. In this way, if the Mount host or another remote server that will perform the restore are clustered, they will have their own dedicated +GRID ASM disk group set up ahead of time. The ASM disk groups from the backup can simply be mounted to the ready cluster and used.
Footnote 3: Oracle 11gR2 also allows database recovery from a storage consistent replica that was taken without hot-backup mode. However, if the recovery is not full, a prior scan of the data files is needed. For more details refer to Oracle support note: 604683.1.
• Management host, where ProtectPoint, Solutions Enabler, and optionally Unisphere for VMAX3 software is installed. The management host does not require its own VMAX3 storage devices, though it requires a few tiny devices called gatekeepers for communication with the VMAX3 storage array.
• An optional Mount host. A Mount host is used when the DBA prefers not to mount the backup on the production environment. In this case the encapsulated restore devices can be mounted to the Mount host to review the backup content before copying it over to the Production host, or for extracting small data sets (‘logical’ recovery).
Note: For I/O intensive recoveries it is recommended to first copy the backup-set to native VMAX3 devices.
• Data Domain system leveraging the vdisk service and with two identical sets of devices: backup devices and restore devices. Each of them is identical to the database production devices it will back up or restore.
  o The backup and restore Data Domain devices are created in Data Domain and exposed as VMAX3 encapsulated devices via FTS.
  o Not shown is a remote Data Domain system if Data Domain Replicator is used to replicate backups to another system.
• VMAX3 storage array with Federated Tiered Storage (FTS) and encapsulated Data Domain backup and restore device sets.
Note: Refer to the ProtectPoint release notes for details on supported Data Domain systems, host operating systems and more.

ORACLE AND PROTECTPOINT FILE SYSTEM AGENT CONSIDERATIONS

RMAN BACKUP INTEGRATIONS WITH BACKUP MEDIA MANAGERS
Oracle Recovery Manager (RMAN) is used by many Oracle DBAs to perform comprehensive backup, restore and recovery operations of the Oracle database. Besides performing the backup or recovery operations, RMAN can also maintain a Catalog with a repository of backups it either performed on its own or that were performed outside of RMAN but cataloged with RMAN (so RMAN can utilize them during database recovery). Some of the features RMAN provides are:
• Database block integrity validation
• Granular restores of tablespaces and data files
• Recovery from database block corruptions
• Backup efficiency with Block Change Tracking (bitmap) for incremental backups

RMAN backup and recovery operations rely on the database ID and database file location (on file systems or ASM disk groups). RMAN is not aware of which host a backup was performed from, or to which host the database is restored. For that reason, RMAN backups can be taken directly from the Production host, from a Mount host where a TimeFinder database replica is mounted (such as when using SnapVX to create database recoverable replicas), or from a physical standby database. That also means that regardless of where the backup was taken, RMAN can restore to the Production host, Mount host, or a physical standby database. This allows DBAs flexibility when planning their backup and recovery strategies. RMAN typically uses one of the following integrations with 3rd party backup media managers such as Data Domain:
• RMAN disk backup is fully executed by RMAN, typically from the Production host, a standby database, or a database clone such as SnapVX can create. RMAN can be used to back up Oracle databases on file systems or ASM, and can leverage RMAN Block Change Tracking during incremental backups, database block verification, and the RMAN Catalog. Normal RMAN backup is based on reading all the data that requires backup from primary storage to the host, and writing it back to the backup media.
As a result, backup and restore times increase as database size grows.
• RMAN SBT based backup (also known as MML) is an RMAN integration with a 3rd party media manager such as Data Domain, where, like normal RMAN backup, RMAN first reads the backup data from primary storage to the host. However, it allows the media manager software to intercept the writes and add its own optimizations. Data Domain Boost for Oracle RMAN is a deployment of this backup model. As with normal RMAN backup, MML based backup can be executed from the Production host, a standby database, or a database clone. MML based backups are fully integrated with RMAN; however, backup and restore times are still affected by the database size.
• RMAN proxy-copy backup is an integration where RMAN initiates the backup or restore, but does not perform it. Instead, it provides the media manager software a list of database files to back up or restore via the proxy-copy API. The media manager software is responsible for the actual data copy, potentially using storage snapshots. This model only works when the database is on file systems. It is initiated and managed by RMAN. A proxy-copy backup utilizing storage snapshots increases backup efficiencies as backup time no longer depends on the database size. With TimeFinder SnapVX the backup (snapshot) takes seconds, even if data is later copied incrementally in the background. Also, each backup is full (level 0), further reducing recovery time.
Note: Oracle does not support RMAN proxy-copy backups with ASM; therefore this model cannot be used for ASM until Oracle provides such support. Since this paper is focused on ASM deployment, RMAN proxy-copy backup is not covered here.

RMAN AND PROTECTPOINT FILE SYSTEM AGENT INTEGRATION POINTS WITH ASM
Since RMAN does not support proxy-copy with ASM, some of the RMAN functionality cannot be leveraged. Therefore, the backup is initiated by a shell script (and not RMAN). The restore is also initiated by a shell script (and not RMAN), which is used to bring back the right backup-set from Data Domain. As soon as it is available (possibly in seconds when mounting the encapsulated restore devices to Production), regardless of database size, RMAN can be used to catalog the restored data files as image copies. After this the DBA can use the breadth of RMAN recover commands and functionality to perform database recovery procedures, such as data block recovery, data file recovery, database recovery, and others.

In summary, Data Domain offers the following types of integrations with Oracle RMAN:
• RMAN disk backup where Data Domain is the backup target.
• MML based backup with Data Domain Boost for Oracle RMAN.
• A partial integration with RMAN for backup offload when the database is deployed on ASM. In this case, the backup or restore is initiated from a shell script, though RMAN performs the actual database recovery after it catalogs the backup-set.

PROTECTPOINT FILE SYSTEM AGENT AND ORACLE REAL APPLICATION CLUSTERS (RAC)
Oracle Real Application Clusters (RAC) offers improved high-availability and load balancing and is very popular, especially on x86_64 and Linux. From a backup and replication perspective for storage snapshots, it makes no difference whether the database is clustered or not. The reason is that RAC requires all database files to be shared across all nodes. Therefore, whether the storage snapshots are for a single database server or a clustered database, the replica will include all database files.
Using a set of ASM disk groups similar to those provided in the examples in this paper, the +DATA disk group will include the shared data files (and control files), the +REDO disk group will include all the redo log files for all the nodes, and the +FRA will include the archive logs from all the nodes.

Note that starting with Oracle database release 11gR2, Oracle deploys the first ASM disk group when installing Grid Infrastructure. EMC recommends that when using storage replications (local or remote), this first ASM disk group remains separate from any database data. Since this disk group does not contain any user data, it does not need to be part of the replications, and if the replica/backup is mounted to a different cluster, then that cluster can be pre-installed ahead of time. Simply mount the +DATA, +REDO, and/or +FRA disk groups from the backup as necessary. This initial disk group is typically named +GRID.

PROTECTPOINT FILE SYSTEM AGENT AND REMOTE REPLICATIONS WITH SRDF
SRDF provides a robust set of remote replication capabilities between VMAX and VMAX3 storage arrays, including Synchronous, Asynchronous, three-site (SRDF/STAR), cascaded, and more. When using SRDF to replicate the Production database remotely, there is no need to replicate the encapsulated backup or restore devices. The reason is that the encapsulated backup devices only contain the latest backup, but not all the prior backup content that is saved within Data Domain as static-images. The encapsulated restore devices only contain a single specific set of static-images once they are utilized. To execute the backup remotely, consider performing the ProtectPoint backup from an SRDF target. To replicate backups taken locally to a remote Data Domain system, consider using ProtectPoint with Data Domain Replicator. For more information about Oracle and SRDF see: EMC Symmetrix VMAX using EMC SRDF/TimeFinder and Oracle.

BACKUP TO DISK WITH SNAPVX
You can use TimeFinder SnapVX to easily and quickly create local database backups, gold copies, test/dev environments, patch tests, reporting instances, and many other use cases. SnapVX snapshots can be taken at any time, and regardless of the size of the database, they are taken within seconds. Restoring snapshots back to their source devices is also very quick. Therefore, as database capacities continue to increase, having a gold copy nearby provides increased availability, protection, and peace of mind. TimeFinder SnapVX replicas can create valid database backup images or database clones. TimeFinder is also integrated with SRDF to offload backups to a remote site or restore from a remote site. For more information about Oracle and TimeFinder see: EMC Symmetrix VMAX using EMC SRDF/TimeFinder and Oracle.

COMMAND EXECUTION PERMISSIONS FOR ORACLE USERS
Typically, an Oracle host user account is used to execute Oracle RMAN or SQL commands, a storage admin host user account is used to perform storage management operations (such as TimeFinder SnapVX or multipathing commands), and a different host user account may be used to set up and manage the Data Domain system. This type of role and security segregation is common and often helpful in large organizations where each group manages their respective infrastructure with a high level of expertise. To easily execute the integrated solution described in this white paper, the ability to execute specific commands in Oracle, Solutions Enabler, ProtectPoint, and Data Domain is required.
There are two ways to address this:
• Allow the database backup operator (commonly a DBA) controlled access to commands in Solutions Enabler, leveraging VMAX Access Controls (ACLs).
• Use SUDO, allowing the DBA to execute specific commands for the purpose of their backup (possibly in combination with Access Controls).

An example for setting up VMAX3 Access Controls is provided in Appendix III. In a similar way, Data Domain can create additional user accounts, other than ‘sysadmin’, that can manage the Data Domain system appropriately. Oracle also allows setting up a backup user and only providing them a specific set of authorizations appropriate for their task.

PROTECTPOINT CONFIGURATION FILE, BACKUP DEVICES, AND ASM CHANGES
ProtectPoint relies on a configuration file that contains vital information such as the Data Domain systems information (local, and remote if used) and the location of the ProtectPoint logs and Catalog. Also in the configuration file is a list of the native VMAX3 devices and Data Domain backup and restore devices. Since the list of devices is hard-coded into the configuration file and is critical to the validity of the backup (it must include the correct ASM devices for the different ASM disk groups), it is critical to make sure it is up to date and correct.
• When changes are made to ASM such as adding devices to the ASM disk group, the device IDs should also be added to the appropriate ProtectPoint configuration file along with their new matching backup and restore devices. It is highly recommended to perform a new backup after making ASM changes. A ProtectPoint restore from an older backup will have the older ASM disk group structure. This is fine for logical recovery or when the old backup is cataloged with RMAN. If, however, the old backup is used to overwrite Production devices, the ASM changes will have to be re-done.
• A less common case is when ASM devices are removed from the ASM disk groups. Again, it is recommended to perform a new database backup after making ASM changes. Since the DBA may use any of the older backups, it is recommended to keep the old ProtectPoint configuration file, renaming it appropriately. It can be used to restore the older backups and therefore enough backup and restore devices should be maintained for them. As in the previous point, if the old backup is used for logical recovery or by RMAN after it is cataloged, Production ASM disk groups remain intact. If, however, the old backup is used to overwrite Production devices, the ASM changes will have to be redone.
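To illustrate the kind of cross-check this implies, the following hedged sketch lists the member disks of an ASM disk group as the database sees them, and then lists the host devices with their VMAX3 identifiers so they can be compared against the device list in the ProtectPoint configuration file. The disk group name ('DATA') and the use of a direct sqlplus/syminq combination are assumptions for illustration only:

# List the member disks of the +DATA disk group as ASM sees them (run as the Grid/ASM owner)
sqlplus -s "/ as sysasm" <<'EOF'
set linesize 200 pagesize 100
select g.name disk_group, d.path
from   v$asm_diskgroup g, v$asm_disk d
where  d.group_number = g.group_number
and    g.name = 'DATA';
EOF

# List host devices with their Symmetrix/VMAX3 identifiers (Solutions Enabler) for comparison
# with the native and encapsulated device IDs referenced in the ProtectPoint configuration file
syminq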
ORACLE BACKUP AND RECOVERY USE CASES WITH PROTECTPOINT FILE SYSTEM AGENT

ORACLE DATABASE BACKUP/RECOVERY USE CASES – THE BIG PICTURE
This section provides a high level overview of Oracle ASM database backup and recovery use cases with ProtectPoint File System Agent integration, as described in Figure 4. Following the overview, each use case is described in detail.
Note: Remote replications of backups using a secondary Data Domain system is not described in the use cases but can be used as part of the solution.

The following use cases are shown:
1) Backup Oracle ASM database using ProtectPoint
2) Read-only inspection of the backup on Mount host (4a recoverable, or prelude to 4c)
3) Logical recovery on Mount host using the backup as a database-clone (4a restartable)
4) RMAN recovery of Production data files without overwriting them with a backup (4b)
5) RMAN recovery after overwriting Production data files with a backup (4c)

Figure 5 ProtectPoint Workflow

Backup Oracle ASM database using ProtectPoint File System Agent
• For Oracle databases prior to 12c: Begin hot-backup mode for the database.
• Step (1a): Perform ProtectPoint snapshot create using database configuration file.
  o SnapVX will create a consistent snapshot of both +DATA and +REDO ASM disk groups together within seconds, regardless of database size.
• For Oracle databases prior to 12c: End hot-backup mode for the database.
• In Oracle database: switch and archive the current logs.
• Step (1b): Perform ProtectPoint snapshot create using fra configuration file.
  o SnapVX will create a consistent snapshot of +FRA ASM disk group (archive logs).
• Step (2): Perform two ProtectPoint backup create with an appropriate description: one using database configuration file, the other using fra configuration file.
  o SnapVX link-copy will send incremental changes to the Data Domain encapsulated backup devices. When the data changes are fully copied, ProtectPoint will create a new backup-set. At the end of this process Data Domain will add two backup-sets: the first with a consistent image of the data, control, and redo log files, and the second with the minimum set of archive logs required to recover the data files. Typically, only the data files and occasionally archive logs will be used to recover production, leveraging the latest archive and redo logs from production. However, if the Production database is truly lost, these two backup-sets are self-contained and are enough to restore the entire database.

Read-only inspection of the backup on Mount host (4a recoverable, or prelude to 4c)
• Purpose: Use this method to inspect backup-sets quickly, prior to performing a SnapVX copy overwriting Production (4c).
  o When the DBA is not sure which +DATA backup-set to use for SnapVX link-copy that overwrites Production data devices (4c), use the Mount host to quickly browse through the backup-sets (for example, when searching for a valid version of a database block after it was found corrupted).
  o The database and FRA backups are mounted on the Mount host. It goes through minimal recovery (just enough so it can be opened read-only and inspected). If found suitable, it can be copied with SnapVX link-copy to Production.
  o It is possible to use this method for ‘logical’ recovery. If there are no plans to copy the backup-set to Production, after the roll-forward recovery, open the database read-write with resetlogs instead of read-only.
• Step (3): Perform two ProtectPoint backup restores, one using the database configuration file and a matching backup-id, the other using the fra configuration file and matching backup-id from the same backup time, as shown by ProtectPoint backup show list.
  o Data Domain places the content of the backup-sets (+DATA, +REDO, and +FRA) on the encapsulated restore devices.
• Add the +DATA and +FRA encapsulated restore devices to the Mount host masking view, so they become visible to the host (+REDO ASM disk group is not used while the database is not opened read/write).
• Step (4a): Mount the 3 ASM disk groups on the Mount host.
• Perform minimal database media recovery using the available archive logs in the +FRA, then open the database READ ONLY.
• Review the data. If appropriate, dismount the ASM disk groups and perform SnapVX link-copy to Production (4c). If not, dismount the ASM disk groups, bring another backup-set and repeat.

Logical recovery on Mount host using the backup as database-clone (4a restartable)
• Purpose: Use this method if the DBA needs to retrieve a small amount of data and wants to use one of the backup-sets as a database ‘clone’, without the need to apply any archive logs.
  o Because the +DATA and +REDO snapshot was taken in a single operation, and SnapVX uses storage consistency by default, the result is a consistent and restartable database replica (the snapshot must include all data, control, and log files). The DBA can open the database on the Mount host normally for read/write. Oracle performs crash-recovery, creating a database clone with all committed transactions up to the time of the snapshot. No roll-forward recovery is allowed on that database clone, and the +FRA disk group is not required.
  o The time to access this database clone is relatively fast, as no SnapVX copy is performed and no archives are applied. However, the time it takes Oracle to perform crash-recovery depends on the amount of transactions since the last checkpoint.
• Step (3): Perform ProtectPoint backup restore using the database configuration file and a backup-id.
  o Data Domain places the content of the backup-set on the encapsulated restore devices.
• Add the +DATA and +REDO encapsulated restore devices to the Mount host masking view, so they become visible to the host.
• Step (4a): Mount the 2 ASM disk groups on the Mount host.
• Do not perform database media recovery. Instead, simply start the database.
• Review the data and extract the appropriate records using Oracle Data Pump, Database Links, or other methods to perform logical recovery of the Production data files.

RMAN recovery of Production data files without overwriting them with a backup (4b)
• Purpose: Use this method to recover the existing Production data files. It allows the recovery to start within minutes as the encapsulated restore devices are mounted directly to the Production host and not copied first to native VMAX3 devices. A brief RMAN illustration follows this use case.
  o The ASM +DATA disk group on the encapsulated restore devices is renamed to +RESTORED_DATA and then cataloged with RMAN, allowing the DBA to use normal RMAN recover commands to recover the Production database.
  o This recovery method is best utilized for small corruptions, such as database block corruptions, a few missing data files, etc. If the Production host sustained a complete loss of its data files follow use case (4c).
• Step (3): Perform ProtectPoint backup restore using the database configuration file and a backup-id.
  o Data Domain places the content of the backup-set on the encapsulated restore devices.
• Add only the +DATA encapsulated restore devices to the Production host masking view.
Note: Do not include the +REDO encapsulated restore devices in the Production masking view as they are not used in this scenario.
• Step (4b): Rename the encapsulated ASM disk group to +RESTORED_DATA, mount it to ASM and catalog it with RMAN.
Note: If ASMlib is used the Oracle ASM disks will need to be renamed as well. Similarly, if another volume manager is used the volumes should be renamed so they do not conflict with the existing Production volume groups of the volume manager.
  o After the catalog operation RMAN can use this backup-set for normal RMAN recovery operations on Production.
• If RMAN requires missing archive logs, repeat a similar process for older +FRA backup-sets:
  o ProtectPoint backup restore using the fra configuration file and a backup-id.
  o Add the +FRA encapsulated restore devices to the Production host masking view (only needed the first time).
  o Rename the encapsulated +FRA ASM disk group to +RESTORED_FRA, mount it to ASM and use its archive logs.
Note: If ASMlib is used the Oracle ASM disks will need to be renamed as well. Similarly, if another volume manager is used the volumes should be renamed so they do not conflict with the existing Production volume groups of the volume manager.
  o If more than one +FRA backup-set is required, dismount the +RESTORED_FRA ASM disk group and bring in the next, repeating this step as necessary.
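To make the catalog-and-recover flow of use case (4b) concrete, here is a hedged sketch of the RMAN commands once +RESTORED_DATA is mounted. The disk group path and the datafile number are illustrative only and are not taken from this paper’s test environment:

RMAN> catalog start with '+RESTORED_DATA/' noprompt;   # register the restored image copies
RMAN> restore datafile 7;                              # example: bring back one damaged data file
RMAN> recover datafile 7;
RMAN> sql 'alter database datafile 7 online';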
RMAN recovery after overwriting Production data files with a backup (4c)
• Purpose: Use this method if it is clear that the Production host is completely lost and it is better to overwrite its data files with the backup content and roll it forward rather than perform targeted recovery as described in use case (4b). A sketch of the SnapVX copy step appears after this use case.
  o The SnapVX link-copy from the encapsulated restore devices to the native VMAX3 devices is performed inside the integrated system without using host I/Os; however, database recovery will only start after the copy completes. If the DBA is not sure which +DATA backup-set they need, consider browsing the content of the backup-sets first by using the Mount host, as described in use case (4a, prelude to 4c), prior to performing the SnapVX link-copy.
• Step (3): Perform ProtectPoint backup restore using the database configuration file and a backup-id.
  o Data Domain will place the content of the backup-set on the matching encapsulated restore devices.
• Dismount the Production +DATA ASM disk group.
• Use SnapVX to link-copy the +DATA encapsulated restore devices, overwriting the Production native VMAX3 +DATA ASM disk group with their content.
Note: If the Production host’s +REDO ASM disk group survived, do not overwrite it with a backup-set of the logs. However, if it was lost, consider creating a new one, or using SnapVX to copy an older version of +REDO from backup. The content of the backup +REDO logs will not be used for recovery and it may be quicker to just create a new +REDO disk group and log files.
• Step (4c): Mount the restored +DATA disk group and perform database media recovery (using RMAN or SQL).
• If RMAN requires missing archive logs, repeat a similar process from (4b) for older +FRA backup-sets:
  o ProtectPoint backup restore using the fra configuration file and a backup-id.
  o Add the +FRA encapsulated restore devices to the Production host masking view (only needed the first time).
  o Rename the encapsulated +FRA ASM disk group to +RESTORED_FRA, mount it to ASM and use its archive logs.
  o If more than one +FRA backup-set is required, dismount the +RESTORED_FRA ASM disk group and bring in the next, repeating this step as necessary.
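To make the SnapVX copy step of use case (4c) concrete, the following is a hedged sketch of copying the encapsulated restore devices over the Production +DATA devices with Solutions Enabler, after the Production +DATA disk group has been dismounted. The restore storage group name (rstr_data_sg) follows the examples used in the setup section below, while the array ID, the Production storage group name (prod_data_sg), and the snapshot name are illustrative only:

# The restore devices must be in a not_ready (NR) state before they can be snapped
# (see the se_devs.sh note in the use cases setup section)
./se_devs.sh rstr_data_sg not_ready

# Snapshot the encapsulated restore devices and link-copy them over the Production +DATA devices
symsnapvx -sid 123 -sg rstr_data_sg -name restore_to_prod establish
symsnapvx -sid 123 -sg rstr_data_sg -lnsg prod_data_sg -snapshot_name restore_to_prod link -copy

# Monitor the copy; database recovery can start only after the copy completes
symsnapvx -sid 123 -sg rstr_data_sg list -detail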
At the end of the setup you will have:
  o An initial snapshot and a one-time full link-copy from the Production database devices (+DATA, +REDO and +FRA ASM disk group devices) to the encapsulated backup devices.
  o Two ProtectPoint configuration files: one with the devices of the +DATA and +REDO ASM disk groups, containing all the database data, control, and log files, and the other with all the devices of the +FRA ASM disk group, containing archive logs.
• The following use cases deploy simple Linux shell scripts to simplify their execution. Running any of the scripts without parameters will display the required parameters. The content of the scripts is in Appendix IV – Scripts.
  Note: Scripts starting with “se_” reference Solutions Enabler commands, scripts starting with “pp_” reference ProtectPoint commands, and scripts starting with “ora_” reference Oracle SQL or RMAN commands.
• Prior to using protectpoint restore prepare, the restore devices need to be in read-write (RW) state. However, prior to creating a snapshot of the restore devices (use case 4c), they need to be in not_ready (NR) state. You can use the se_devs.sh script to check the restore devices’ state and change it between ready and not_ready.
  Script note: The script ‘se_devs.sh’ takes a storage group (SG) name and a command option: show, ready, or not_ready. It shows the status of the SG devices, makes them ready, or makes them not-ready, respectively.

[root@dsib1141 scripts]# ./se_devs.sh rstr_data_sg show
[root@dsib1141 scripts]# ./se_devs.sh rstr_redo_sg show
[root@dsib1141 scripts]# ./se_devs.sh rstr_fra_sg show

• When presenting the encapsulated restore devices to a host (either Mount or Production), a rescan of the SCSI bus may be required for the host to recognize them. This can always be achieved with a host reboot; however, depending on the operating system, HBA type, or whether ASMlib is used, there are ways to do it online without rebooting. This topic is beyond the scope of this white paper; review your HBA and host operating system documentation, or, when using ASMlib, use the ‘oracleasm scandisks’ command. The script ‘os_rescan.sh’ is used to perform this activity online in the following examples.

BACKUP ORACLE ASM DATABASE USING PROTECTPOINT
• For Oracle databases prior to 12c, place the Production database in hot-backup mode.

SQL> alter database begin backup;

• Step (1a): Use the ProtectPoint snapshot create command to create a snapshot for both +DATA and +REDO ASM disk groups.

[root@dsib1141 scripts]# ./pp_snap.sh database

Note: When using the Oracle 12c Storage Snapshot Optimization feature instead of hot-backup mode as the backup solution, Oracle requires the time of the snapshot during database recovery. In ProtectPoint File System Agent v1.0 the protectpoint backup show list command lists a different time than the snapshot time. Therefore, as a work-around, the script ‘pp_backup.sh’ attaches the TimeFinder SnapVX snapshot time to the user-provided backup description. However, if the DBA believes that the Management host clock (from which the SnapVX snapshot time is taken) is not coordinated with the Database server’s clock, the DBA may prefer to modify the ‘pp_backup.sh’ script to capture the current time from the database server or the database itself.

• For Oracle databases prior to 12c, end hot-backup mode.

SQL> alter database end backup;

• In Oracle (using SQL or RMAN), switch and archive the logs. A sketch of the underlying SQL follows; the helper script used in these examples is invoked right after it.
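The helper script invoked below presumably wraps a single SQL statement; the following is a minimal sketch under that assumption (the actual ora_switchandarchive.sh listing is in Appendix IV – Scripts, and the environment paths are taken from the profiles shown in Appendix I).

#!/bin/bash
# Minimal sketch of a switch-and-archive helper (assumed content; see Appendix IV
# for the actual ora_switchandarchive.sh listing).
export ORACLE_HOME=/u01/app/oracle/12.1/db   # paths assumed from the .bash_profile in Appendix I
export ORACLE_SID=orcl
export PATH=$ORACLE_HOME/bin:$PATH

sqlplus -s "/ as sysdba" <<'EOF'
-- force a log switch and archive the current online redo log so that all changes
-- made up to the database snapshot are available as archive logs in +FRA
alter system archive log current;
exit
EOF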
[root@dsib1141 scripts]# ./ora_switchandarchive.sh

• Step (1b): Execute ProtectPoint snapshot create using the fra configuration file to create a snapshot of the +FRA ASM disk group (archive logs).

[root@dsib1141 scripts]# ./pp_snap.sh fra

• Step (2): Link-copy the snapshot PiT data incrementally to Data Domain using two ProtectPoint backup create commands with an appropriate user-description for the backup: one using the database configuration file, the other using the fra configuration file.
  Script note: If running in the foreground, the ‘protectpoint backup create’ command will only return the prompt when the copy finishes and the backup-set is created. It can take a while with no progress indication. Instead, the script ‘./pp_backup.sh’ runs it in the background. This way, both “fra” and “database” backup copies to Data Domain can take place simultaneously. In addition, the ‘./se_snap_show.sh’ script can be used to monitor the copy progress. Note also that ‘pp_backup.sh’ adds the actual snapshot time to the end of the description (in brackets), and the backup object (database or fra) to the beginning.

[root@dsib1141 scripts]# ./pp_backup.sh database "Nightly backup"
[root@dsib1141 scripts]# ./pp_backup.sh fra "Nightly backup"
[root@dsib1141 scripts]# ./se_snap_show.sh database      <- monitor progress
[root@dsib1141 scripts]# ./se_snap_show.sh fra           <- monitor progress
[root@dsib1141 scripts]# ./pp_list_backup.sh database    <- list ProtectPoint backups

Backup id   Backup start time    Duration (hh:mm:ss)  Status    Description
----------  -------------------  -------------------  --------  ---------------------------------------------
35cc4…72fe  2015-03-25 09:47:37  00:02:49             complete  database Nightly backup (2015_03_25_09:45:11)
bc020…d08c  2015-03-25 09:50:51  00:02:10             complete  fra Nightly backup (2015_03_25_09:46:28)
----------  -------------------  -------------------  --------  ---------------------------------------------
Backups found: 2

• During the backup process a ‘test’ table was used to insert known records. These records are used during recovery use cases 4a and 4c as a reference for how much recovery was performed. Use case 4b demonstrates recovery from physical block corruption and has a new set of values in the ‘test’ table included in that use case.

SQL> select * from test;

TS                                   REC
-----------------------------------  --------------------------------
25-MAR-15 09.37.49.342907 AM -04:00  before db snapshot
25-MAR-15 09.45.36.580309 AM -04:00  after db snapshot
25-MAR-15 09.46.15.848576 AM -04:00  after log switch
25-MAR-15 09.46.40.174418 AM -04:00  after fra snapshot
25-MAR-15 09.48.13.435286 AM -04:00  after pp backup database started
25-MAR-15 09.51.52.663259 AM -04:00  after pp backup fra started
25-MAR-15 09.59.08.514299 AM -04:00  both backups completed

7 rows selected.

USING MOUNT HOST TO PICK A BACKUP-SET TO COPY TO PRODUCTION (4A, RECOVERABLE)
• Perform ProtectPoint backup list to choose a backup-id to restore using either of the configuration files.
[root@dsib1141 scripts]# ./pp_list_backup.sh database • Step (3): Perform two ProtectPoint backup restore using the database configuration file and a backup-id, and the fra configuration file and a backup-id: 20 [root@dsib1141 scripts]# ./pp_restore.sh database 35cc4c92-9ab5-09a4-a6f6-b832494372fe [root@dsib1141 scripts]# ./pp_restore.sh fra bc0202f6-52f1-ce09-a46e-ebb9ae61d08c • Add the +DATA, +REDO, and +FRA encapsulated restore devices to the Mount host masking view: [root@dsib1141 scripts]# symaccess -type storage -name mount_sg add sg rstr_data_sg,rstr_redo_sg,rstr_fra_sg • Step (4a, recoverable): Mount the 3 ASM disk groups on the Mount host. o If RAC is running on the Mount host then it should be already configured and running using a separate ASM disk group (+GRID). However, in the case of a single instance, Oracle High-Availability Services may need to be started first. [root@dsib1136 ~]# su - oracle [oracle@dsib1136 ~]$ TOGRID [oracle@dsib1136 ~]$ crsctl start has CRS-4123: Oracle High Availability Services has been started. o • Mount +DATA, +REDO, and +FRA ASM disk groups. [oracle@dsib1136 ~]$ sqlplus "/ as sysasm" SQL> alter system set asm_diskstring='/dev/mapper/ora*p1'; SQL> alter diskgroup data mount; SQL> alter diskgroup redo mount; SQL> alter diskgroup fra mount; Perform minimal database media recovery using the available archive logs in the +FRA, then open the database READ ONLY. o In the example, RMAN is used to copy the backup control file to its right place and then SQL is used with automatic media recovery. Alternatively, RMAN can be used for the media recovery (as in use case 4c). [oracle@dsib1136 ~]$ TODB [oracle@dsib1136 ~]$ rman RMAN> connect target / RMAN> startup nomount; RMAN> restore controlfile from '+FRA/CTRL.BCK'; RMAN> exit [oracle@dsib1136 ~]$ sqlplus "/ as sysdba" SQL> alter database mount; SQL> recover database until cancel using backup controlfile snapshot time 'MAR-25-2015 09:45:11'; ORA-00279: change 783215 generated at 03/25/2015 09:43:21 needed for thread 1 ORA-00289: suggestion : +FRA/ORCL/ARCHIVELOG/2015_03_25/thread_1_seq_52.308.875267149 ORA-00280: change 783215 for thread 1 is in sequence #52 Specify log: {<RET>=suggested | filename | AUTO | CANCEL} AUTO ORA-00279: change 982703 generated at 03/25/2015 09:45:49 needed for thread 1 ORA-00289: suggestion : +FRA/ORCL/ARCHIVELOG/2015_03_25/thread_1_seq_53.309.875267155 ORA-00280: change 982703 for thread 1 is in sequence #53 ORA-00278: log file '+FRA/ORCL/ARCHIVELOG/2015_03_25/thread_1_seq_52.308.875267149' no longer needed for this recovery ORA-00279: change 989138 generated at 03/25/2015 09:45:54 needed for thread 1 ORA-00289: suggestion : +FRA ORA-00280: change 989138 for thread 1 is in sequence #54 ORA-00278: log file '+FRA/ORCL/ARCHIVELOG/2015_03_25/thread_1_seq_53.309.875267155' no longer needed for this recovery ORA-00308: cannot open archived log '+FRA' <- no more archive logs left to use! ORA-17503: ksfdopn:2 Failed to open file +FRA ORA-15045: ASM file name '+FRA' is not in reference form SQL> alter database open read only; 21 Database altered. • For reference, review the ‘test’ table. Committed transactions have been recovered up to the point of the log switch and FRA backup. This recovery only included the minimum archives required to open the database. 
[oracle@dsib1136 ~]$ TODB
[oracle@dsib1136 ~]$ sqlplus "/ as sysdba"
SQL> select * from test;

TS                                   REC
-----------------------------------  --------------------------------
25-MAR-15 09.37.49.342907 AM -04:00  before db snapshot
25-MAR-15 09.45.36.580309 AM -04:00  after db snapshot

2 rows selected.

USING MOUNT HOST AND DATABASE CLONE FOR LOGICAL RECOVERY (4A, RESTARTABLE)
• Perform ProtectPoint backup list to choose a backup-id to restore. Either configuration file will show the same list.

[root@dsib1141 scripts]# ./pp_list_backup.sh database

• Step (3): Perform ProtectPoint backup restore using the database configuration file and a backup-id.

[root@dsib1141 scripts]# ./pp_restore.sh database 35cc4c92-9ab5-09a4-a6f6-b832494372fe

• Add the +DATA and +REDO encapsulated restore devices to the Mount host masking view.

[root@dsib1141 scripts]# symaccess -type storage -name mount_sg add sg rstr_data_sg,rstr_redo_sg

• Step (4a): Mount the 2 ASM disk groups on the Mount host.
  o If RAC is used on the Mount host then it should already be configured and running using a separate ASM disk group (+GRID). If a single instance database is used on the Mount host (even if Production is clustered), Oracle High-Availability Services may need to be started first.

[root@dsib1136 ~]# su - oracle
[oracle@dsib1136 ~]$ TOGRID
[oracle@dsib1136 ~]$ crsctl start has
CRS-4123: Oracle High Availability Services has been started.

  o Mount the +DATA and +REDO ASM disk groups.

[oracle@dsib1136 ~]$ sqlplus "/ as sysasm"
SQL> alter system set asm_diskstring='/dev/mapper/ora*p1';
System altered.
SQL> alter diskgroup data mount;
Diskgroup altered.
SQL> alter diskgroup redo mount;
Diskgroup altered.

• Do not perform database media recovery. Instead, simply start the database.
  Note: Since the database was in archive log mode and you did not bring back +FRA from backup, consider disabling archive log mode before opening the database, or leaving it in place (as long as archives are optional, Oracle will only log its inability to write new archive logs in the database alert.log).

[oracle@dsib1136 ~]$ TODB
[oracle@dsib1136 ~]$ sqlplus "/ as sysdba"
SQL> startup
ORACLE instance started.

Total System Global Area 1325400064 bytes
Fixed Size                  3710112 bytes
Variable Size            1107297120 bytes
Database Buffers           67108864 bytes
Redo Buffers              147283968 bytes
Database mounted.
Database opened.

• For reference, review the ‘test’ table. Because the database was simply started from the time of the backup, without any roll-forward of logs, the only transaction reported is the one from before the snapshot, which is the backup time.

SQL> select * from test;

TS                                   REC
-----------------------------------  --------------------------------
25-MAR-15 09.37.49.342907 AM -04:00  before db snapshot

RMAN RECOVERY OF PRODUCTION WITHOUT SNAPVX COPY (4B)
• First, simulate a block corruption to demonstrate this use case. (The method used to corrupt a database block in ASM is described in a separate blog post.)
  o Before the corruption, perform a backup (as described in use case 1a) and introduce a new set of known records to the ‘test’ table during the backup process, as before:

SQL> select * from test;

TS                                   REC
-----------------------------------  --------------------------------
30-MAR-15 12.39.22.026100 PM -04:00  before database snapshot
30-MAR-15 12.45.44.751072 PM -04:00  after database snapshot
30-MAR-15 12.46.41.932923 PM -04:00  after log switch
30-MAR-15 12.48.43.262434 PM -04:00  after fra snapshot

  o Introduce a physical block corruption to one of the data files in ASM, and then query the table.
SQL> select * from corrupt_test where password='P7777';
select * from corrupt_test where password='P7777'
*
ERROR at line 1:
ORA-01578: ORACLE data block corrupted (file # 6, block # 151)
ORA-01110: data file 6: '+DATA/ORCL/DATAFILE/bad_data'
SQL> exit

[oracle@dsib1141 oracle]$ dbv file='+DATA/ORCL/DATAFILE/bad_data' blocksize=8192
...
Total Pages Marked Corrupt : 1
...

• Perform ProtectPoint backup list using either of the configuration files to choose a backup-id to restore.

[root@dsib1141 scripts]# ./pp_list_backup.sh database

Backup id  Backup start time    Duration (hh:mm:ss)  Status    Description
---------  -------------------  -------------------  --------  ---------------------------------------------------------------
9df…620    2015-03-30 13:31:28  00:01:48             complete  fra Nightly before block corruption (2015_03_30_12:47:04)
e7e…5a0    2015-03-30 13:38:17  00:14:39             complete  database Nightly before block corruption (2015_03_30_12:39:58)
---------  -------------------  -------------------  --------  ---------------------------------------------------------------
Backups found: 2

• Make sure that the +DATA encapsulated restore devices are in “ready” state. If not, make them ready.

[root@dsib1141 scripts]# ./se_devs.sh rstr_data_sg show
Symmetrix ID: 000196700531

Sym    Physical      SA :P    Config  Attribute  Sts  Cap (MB)
-----  ------------  -------  ------  ---------  ---  --------
00027  Not Visible   ***:***  TDEV    N/Grp'd    NR   102401
00028  Not Visible   ***:***  TDEV    N/Grp'd    NR   102401
00029  Not Visible   ***:***  TDEV    N/Grp'd    NR   102401
0002A  Not Visible   ***:***  TDEV    N/Grp'd    NR   102401

[root@dsib1141 scripts]# ./se_devs.sh rstr_data_sg ready

• Step (3): Perform ProtectPoint backup restore using the database configuration file and a backup-id (the operation takes a few seconds).

[root@dsib1141 scripts]# ./pp_restore.sh database e7e13abe-204c-3838-8c22-ff2bb9a225a0

• Step (4b): Mount the ASM disk group on the +DATA encapsulated restore devices to Production, renaming it to +RESTORED_DATA.
  o Add the +DATA encapsulated restore devices to the Production host masking view.

[root@dsib1141 scripts]# symaccess -type storage -name prod_database_sg add sg rstr_data_sg

  o If necessary, scan the host SCSI bus for new devices. Note that if ASMlib is used the encapsulated restore devices will need to be renamed using ASMlib commands (oracleasm renamedisk) prior to the next step of renaming the ASM disk group.

[root@dsib1141 scripts]# ./os_rescan.sh
[root@dsib1141 scripts]# service multipathd restart
[root@dsib1141 scripts]# multipath -ll | grep dd_data
ora_dd_data4 (360000970000196700531533030303241) dm-38 EMC,SYMMETRIX
ora_dd_data3 (360000970000196700531533030303239) dm-39 EMC,SYMMETRIX
ora_dd_data2 (360000970000196700531533030303238) dm-37 EMC,SYMMETRIX
ora_dd_data1 (360000970000196700531533030303237) dm-36 EMC,SYMMETRIX

  o Rename the encapsulated +DATA ASM disk group to +RESTORED_DATA, and mount it to ASM.
[oracle@dsib1141 oracle]$ cat ./ora_rename_DATA.txt /dev/mapper/ora_dd_data1p1 DATA RESTORED_DATA /dev/mapper/ora_dd_data2p1 DATA RESTORED_DATA /dev/mapper/ora_dd_data3p1 DATA RESTORED_DATA /dev/mapper/ora_dd_data4p1 DATA RESTORED_DATA [oracle@dsib1141 oracle]$ TOGRID [oracle@dsib1141 oracle]$ renamedg dgname=DATA newdgname=RESTORED_DATA config=./ora_rename_DATA.txt asm_diskstring='/dev/mapper/ora_dd*p1' [oracle@dsib1141 oracle]$ sqlplus "/ as sysasm" SQL> select name, state from v$asm_diskgroup; NAME STATE ------------------------------ ----------REDO MOUNTED RESTORED_DATA DISMOUNTED FRA MOUNTED DATA MOUNTED SQL> alter diskgroup restored_data mount; Diskgroup altered. • Catalog the +RESTORED_DATA ASM diskgroup on the encapsulated restore devices with RMAN. Then use it for RMAN recovery, in this case – recover a block corruption. If additional backups of +DATA are needed, unmount the +RESTORED_DATA disk group and repeat the process with another backup-set of +DATA. RMAN> connect catalog rco/oracle@catdb <- if RMAN catalog is available (optional) RMAN> catalog start with '+RESTORED_DATA/ORCL/DATAFILE' noprompt; ... cataloging files... cataloging done List of Cataloged Files ======================= File Name: +RESTORED_DATA/ORCL/DATAFILE/system.258.875207543 File Name: +RESTORED_DATA/ORCL/DATAFILE/sysaux.259.875207561 File Name: +RESTORED_DATA/ORCL/DATAFILE/sys_undots.260.875207563 File Name: +RESTORED_DATA/ORCL/DATAFILE/undotbs1.261.875208249 File Name: +RESTORED_DATA/ORCL/DATAFILE/iops.262.875216445 File Name: +RESTORED_DATA/ORCL/DATAFILE/bad_data.263.875708743 • Perform RMAN recovery using commands based on the situation (in this case – physical block corruption recovery). o First, find where is the corruption by scanning the database and then checking the trace log. RMAN> validate check logical database; ... 24 File Status Marked Corrupt Empty Blocks Blocks Examined High SCN ---- ------ -------------- ------------ --------------- ---------6 FAILED 0 1121 1280 1675452 File Name: +DATA/ORCL/DATAFILE/bad_data Block Type Blocks Failing Blocks Processed ---------- -------------- ---------------Data 0 27 Index 0 0 Other 1 132 validate found one or more corrupt blocks See trace file /u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_8556.trc for details [oracle@dsib1141 oracle]$ grep Corrupt /u01/app/oracle/diag/rdbms/orcl/orcl/trace/orcl_ora_8556.trc Corrupt block relative dba: 0x01800097 (file 6, block 151) RMAN> recover datafile 6 block 151; ... channel ORA_DISK_1: restoring block(s) from datafile copy +RESTORED_DATA/ORCL/DATAFILE/bad_data.263.875708743 ... Finished recover at 30-MAR-15 [oracle@dsib1141 oracle]$ dbv file='+DATA/ORCL/DATAFILE/bad_data' blocksize=8192 ... Total Pages Marked Corrupt : 0 ... • If RMAN requires missing archive logs during the recovery (if performing a different type of RMAN recovery than block corruption), choose an appropriate +FRA backup-set from Data Domain and mount the +FRA encapsulated restore devices to production, renaming the disk group to +RESTORED_FRA. Use the archives, and if more are necessary, unmount this disk group and repeat the process. o o ProtectPoint backup restore using the fra configuration file and a backup-id. [root@dsib1141 scripts]# ./pp_restore.sh fra <backup-id> Add the +FRA encapsulated restore devices to the Production host masking view (remove them from the Mount host masking view if they were associated with it previously. 
[root@dsib1141 scripts]# symaccess -type storage -name mount_sg remove sg rstr_fra_sg [root@dsib1141 scripts]# symaccess -type storage -name prod_database_sg add sg rstr_fra_sg o o On production, rescan the SCSI bus and, if using DM-multipath, restart the service. [root@dsib1141 scripts]# ./os_rescan.sh [root@dsib1141 scripts]# service multipathd restart [root@dsib1141 scripts]# multipath -ll | grep dd_fra Rename the encapsulated +FRA ASM disk group to +RESTORED_FRA, mount it to ASM, and use its archive logs. [oracle@dsib1141 oracle]$ /dev/mapper/ora_dd_fra1p1 /dev/mapper/ora_dd_fra2p1 /dev/mapper/ora_dd_fra3p1 /dev/mapper/ora_dd_fra4p1 cat ora_rename_FRA.txt FRA FRA FRA FRA RESTORED_FRA RESTORED_FRA RESTORED_FRA RESTORED_FRA [oracle@dsib1141 oracle]$ TOGRID [oracle@dsib1141 oracle]$ renamedg dgname=FRA newdgname=RESTORED_FRA config=./ora_rename_FRA.txt verbose=yes asm_diskstring='/dev/mapper/ora_dd*p1' [oracle@dsib1141 oracle]$ sqlplus "/ as sysasm" SQL> select name, state from v$asm_diskgroup; NAME STATE ------------------------------ ----------REDO MOUNTED FRA MOUNTED DATA MOUNTED RESTORED_DATA DISMOUNTED RESTORED_FRA DISMOUNTED SQL> alter diskgroup RESTORED_FRA mount; 25 Diskgroup altered. RMAN RECOVERY OF PRODUCTION AFTER COPY, OVERWRITING PRODUCTION DATA DEVICES (4C) • If a backup-set was picked up on the Mount host (following use case 4a, recoverable), then the encapsulated restore devices already have the required backup-set and likely the database was even rolled forward and opened read-only. In that case, you can skip the next few steps of performing a ProtectPoint restore and jump right to step (4c) to perform SnapVX link-copy after dismounting the Production ASM +DATA disk group. • Perform ProtectPoint backup list using either of the configuration files to choose a backup-id to restore. [root@dsib1141 scripts]# ./pp_list_backup.sh database • Step (3): Perform ProtectPoint backup restore using the database configuration file and a backup-id. [root@dsib1141 scripts]# ./pp_restore.sh database 35cc4c92-9ab5-09a4-a6f6-b832494372fe • Step (4c): Perform SnapVX Establish followed by a link-copy from the +DATA encapsulated restore devices back to the Production devices, overwriting them. o o If the Mount host had the encapsulated +DATA or +FRA restore devices mounted, close the database and unmount the ASM disk groups. To perform a link-copy from the encapsulated restore devices, they need to be made not-ready and have a snapshot that can be linked-copied. [root@dsib1141 scripts]# ./se_devs.sh rstr_data_sg not_ready 'Not Ready' Device operation successfully completed for the storage group. [root@dsib1141 scripts]# ./se_devs.sh rstr_data_sg show Symmetrix ID: 000196700531 Device Name Dir Device ---------------------------- ------- ------------------------------------Cap Sym Physical SA :P Config Attribute Sts (MB) ---------------------------- ------- ------------------------------------00027 Not Visible ***:*** TDEV N/Grp'd NR 102401 00028 Not Visible ***:*** TDEV N/Grp'd NR 102401 00029 Not Visible ***:*** TDEV N/Grp'd NR 102401 0002A Not Visible ***:*** TDEV N/Grp'd NR 102401 o Create a snapshot of the +DATA encapsulated restore devices. Script note: The script ‘se_snap_create.sh’ requires two parameters. The first specifies if the snapshot is to Production or Restore devices (prod|rstr). The second specifies which set of devices to snap: data files only (data), data and logs together (database), or archive logs (fra). 
[root@dsib1141 scripts]# ./se_snap_create.sh rstr data o o o Make sure that the Production database is down, and that +DATA disk group on the Production host is dismounted. [root@dsib1141 scripts]# su - oracle [oracle@dsib1141 ~]$ TODB [oracle@dsib1141 ~]$ sqlplus "/ as sysdba" SQL> shutdown abort; SQL> exit [oracle@dsib1141 ~]$ TOGRID [oracle@dsib1141 ~]$ sqlplus "/ as sysasm" SQL> alter diskgroup DATA dismount; Link-copy the snapshot from the +DATA encapsulated restore devices to Production +DATA devices. [root@dsib1141 scripts]# ./se_snap_link.sh rstr data Monitor the progress of the copy. [root@dsib1141 scripts]# ./se_snap_show.sh rstr data 26 o • When the copy is done, unlink the session (no need to keep it. [root@dsib1141 scripts]# ./se_snap_unlink.sh rstr data + symsnapvx -sg rstr_data_sg -lnsg prod_data_sg -snapshot_name rstr_data unlink Mount the copied +DATA ASM disk groups on the Production host. o o If necessary, scan the host SCSI bus for new devices. [root@dsib1141 scripts]# ./os_rescan.sh [root@dsib1141 scripts]# service multipathd restart If RAC is running on Production host, it should be already configured and running using a separate ASM disk group (+GRID). However, in the case of a single instance, Oracle High-Availability Services may need to be started first. [root@dsib1136 ~]# su - oracle [oracle@dsib1136 ~]$ TOGRID [oracle@dsib1136 ~]$ crsctl start has CRS-4123: Oracle High Availability Services has been started. o Mount +DATA ASM disk groups or start ASM and make sure +DATA is mounted. [oracle@dsib1141 ~]$ srvctl start asm [oracle@dsib1136 ~]$ sqlplus "/ as sysasm" SQL> select name,state from v$asm_diskgroup; NAME -----------------------------REDO FRA DATA • STATE ----------MOUNTED MOUNTED MOUNTED Perform database media recovery using the available archive logs in Production, bringing any missing archive logs from backup. o In the following example RMAN is used to copy the backup control file to its right place and then recover the database. Alternatively, RMAN or SQL can be used to perform incomplete media recovery if necessary. [oracle@dsib1141 ~]$ TODB [oracle@dsib1141 ~]$ rman Recovery Manager: Release 12.1.0.2.0 - Production on Mon Mar 30 09:42:29 2015 Copyright (c) 1982, 2014, Oracle and/or its affiliates. All rights reserved. RMAN> connect target / connected to target database (not started) RMAN> startup nomount; RMAN> restore controlfile from '+FRA/CTRL.BCK'; RMAN> alter database mount; RMAN> recover database; archived log file name=+FRA/ORCL/ARCHIVELOG/2015_03_25/thread_1_seq_70.326.875267627 thread=1 sequence=70 archived log file name=+FRA/ORCL/ARCHIVELOG/2015_03_25/thread_1_seq_71.327.875267629 thread=1 sequence=71 archived log file name=+REDO/ORCL/ONLINELOG/group_2.257.875207521 thread=1 sequence=72 media recovery complete, elapsed time: 00:02:10 Finished recover at 30-MAR-15 RMAN> alter database open resetlogs; Statement processed • If RMAN requires missing archive logs during the recovery, choose an appropriate +FRA backup from Data Domain and mount the +FRA encapsulated restore devices to the Production host, renaming the disk group to +RESTORED_FRA. Use the archives, and if more are necessary, unmount this disk group and repeat the process. o o ProtectPoint backup restore using the fra configuration file and a backup-id. 
[root@dsib1141 scripts]# ./pp_restore.sh fra bc0202f6-52f1-ce09-a46e-ebb9ae61d08c Add the +FRA encapsulated restore devices to the Production host masking view (remove them first from the Mount host masking view if they were associated with its masking view) 27 [root@dsib1141 scripts]# symaccess -type storage -name mount_sg remove sg rstr_fra_sg [root@dsib1141 scripts]# symaccess -type storage -name prod_database_sg add sg rstr_fra_sg o o On Production, rescan the SCSI bus and, if using DM-multipath, restart the service. [root@dsib1141 scripts]# ./os_rescan.sh [root@dsib1141 scripts]# service multipathd restart [root@dsib1141 scripts]# multipath -ll | grep dd_fra Rename the encapsulated +FRA ASM disk group to +RESTORED_FRA, mount it to ASM, and use its archive logs. [oracle@dsib1141 oracle]$ /dev/mapper/ora_dd_fra1p1 /dev/mapper/ora_dd_fra2p1 /dev/mapper/ora_dd_fra3p1 /dev/mapper/ora_dd_fra4p1 cat ora_rename_FRA.txt FRA FRA FRA FRA RESTORED_FRA RESTORED_FRA RESTORED_FRA RESTORED_FRA [oracle@dsib1141 oracle]$ TOGRID [oracle@dsib1141 oracle]$ renamedg dgname=FRA newdgname=RESTORED_FRA config=./ora_rename_FRA.txt verbose=yes asm_diskstring='/dev/mapper/ora_dd*p1' ... renamedg operation: dgname=FRA newdgname=RESTORED_FRA config=./ora_rename_FRA.txt verbose=yes asm_diskstring=/dev/mapper/ora_dd*p1 Executing phase 1 ... Completed phase 1 Executing phase 2 ... Completed phase 2 [oracle@dsib1141 oracle]$ sqlplus "/ as sysasm" SQL> select name, state from v$asm_diskgroup; NAME STATE ------------------------------ ----------REDO MOUNTED FRA MOUNTED DATA MOUNTED RESTORED_FRA DISMOUNTED SQL> alter diskgroup RESTORED_FRA mount; Diskgroup altered. • For reference, review the ‘test’ table. All the committed transactions have been recovered, including those from after the backup. [oracle@dsib1141 oracle]$ TODB [oracle@dsib1141 oracle]$ sqlplus "/ as sysdba" SQL> select * from test; TS -------------------------------------------------25-MAR-15 09.37.49.342907 AM -04:00 25-MAR-15 09.45.36.580309 AM -04:00 25-MAR-15 09.46.15.848576 AM -04:00 25-MAR-15 09.46.40.174418 AM -04:00 25-MAR-15 09.48.13.435286 AM -04:00 25-MAR-15 09.51.52.663259 AM -04:00 25-MAR-15 09.59.08.514299 AM -04:00 REC -------------------------------before db snapshot after db snapshot after log switch after fra snapshot after pp backup database started after pp backup fra started both backups completed 7 rows selected. CONCLUSION ProtectPoint File System Agent offers a solution to the growing challenge of maintaining backup and recovery SLAs, even in the face of growing database capacities and increased workload. With ProtectPoint File System Agent the backup time is no longer dependent on the size of the database. Only changed data is sent to the Data Domain system, yet all backups are full. Data Domain offers deduplication, compression, and remote replications to protect the backups. Restore operations are fast, leveraging the direct connectivity between Data Domain and the VMAX3 storage array. Both Backup and Recovery provide great savings in host CPU and I/Os, allowing the Production host to service application transactions more efficiently. 28 APPENDIXES APPENDIX I – PROTECTPOINT SYSTEM SETUP Setup Steps Overview To prepare the system for ProtectPoint, complete the following steps: 1. Set up physical connectivity 2. Set up Management host 3. Set up Production host 4. Set up Mount host (optional) 5. Set up Data Domain system 6. Set up encapsulated vdisks 7. Set up initial SnapVX sessions 8. 
Set up ProtectPoint software Table 2 describes the native devices and vdisk configuration (after step 3, 5, and 6). Remember to use the command: symdev list -encapsulated -wwn_encapsulated to identify Data Domain device WWNs. Table 2 Devices and SG configuration ASM DG Prod Dev SG DD backup vdisks WWN Dev (shortened) REDO 013 prod_redo_sg 01B 6002…AD00000 bkup_redo_sg vdisk-dev0 023 6002…AD00008 rstr_redo_sg vdisk-dev8 REDO 014 prod_redo_sg 01C 6002…AD00001 bkup_redo_sg vdisk-dev1 024 6002…AD00009 rstr_redo_sg vdisk-dev9 REDO 015 prod_redo_sg 01D 6002…AD00002 bkup_redo_sg vdisk-dev2 025 6002…AD0000A rstr_redo_sg vdisk-dev10 REDO 016 prod_redo_sg 01E 6002…AD00003 bkup_redo_sg vdisk-dev3 026 6002…AD0000B rstr_redo_sg vdisk-dev11 DATA 017 prod_data_sg 01F 6002…AD00004 bkup_data_sg vdisk-dev4 027 6002…AD0000C rstr_data_sg vdisk-dev12 DATA 018 prod_data_sg 020 6002…AD00005 bkup_data_sg vdisk-dev5 028 6002…AD0000D rstr_data_sg vdisk-dev13 DATA 019 prod_data_sg 021 6002…AD00006 bkup_data_sg vdisk-dev6 029 6002…AD0000E rstr_data_sg vdisk-dev14 DATA 01A prod_data_sg 022 6002…AD00007 bkup_data_sg vdisk-dev7 02A 6002…AD0000F rstr_data_sg vdisk-dev15 FRA 037 prod_fra_sg 03B 6002…AD00010 bkup_fra_sg vdisk-dev16 03F 6002…AD00014 rstr_fra_dg vdisk-dev20 FRA 038 prod_fra_sg 03C 6002…AD00011 bkup_fra_sg vdisk-dev17 040 6002…AD00015 rstr_fra_dg vdisk-dev21 FRA 039 prod_fra_sg 03D 6002…AD00012 bkup_fra_sg vdisk-dev18 041 6002…AD00016 rstr_fra_dg vdisk-dev22 FRA 03A prod_fra_sg 03E 6002…AD00013 bkup_fra_sg vdisk-dev19 042 6002…AD00017 rstr_fra_dg vdisk-dev23 SG DDR DD restore vdisks WWN Dev (shortened) SG DDR Set up Physical Connectivity The assumption is that the physical system connectivity was done as part of system installation by EMC personnel. Make sure that: • • SAN connectivity exists between switch(s) and: o Data Domain o VMAX3 o Management host o Production host o Mount host (optional) SAN zones are created for: o Data Domain FC ports with VMAX3 FTS DX ports 29 o Management host and VMAX3 front-end ports o Production host and VMAX3 front-end ports o Mount host (optional) and VMAX3 front-end ports Set up Management Host Software and Masking Views The Management host is where all user commands and scripts are initiated. Note that although in this white paper the same host (dsib1141) was used for both Production and Management hosts, in a real deployment they should be two separate hosts. Perform the following operations to set up the Management host: • Install Solutions Enabler (SE) CLI software. • If Solutions Enabler Access Controls (ACL) are to be used to limit the set of devices and operations that the Management host can perform, make sure that the EMC personnel also initialize the ACL database in the VMAX3 (requires EMC personnel). • Post SE installation: o Update the path to include SE binaries (e.g.: export PATH=$PATH:/usr/symcli/bin). o If a single VMAX3 is managed, add its ID to the environment variable so it will not be needed during SE CLI execution. (e.g. export SYMCLI_SID=000196700531) • Install Unisphere for VMAX3 (optional). • Refresh the SE database: symcfg discover. Then list the available storage devices: symdev list -mb. • If there are available gatekeepers they can be used. Otherwise create additional small communication devices. (e.g. symconfigure -sid 531 -cmd "create gatekeeper count=8 ;" commit) • Update /etc/hosts with references to Production host, Mount host, and Data Domain. 
[root@dsib1141 scripts]# cat /etc/hosts 127.0.0.1 localhost.localdomain localhost ::1 localhost6.localdomain6 localhost6 10.108.245.141 dsib1141.lss.emc.com dsib1141 10.108.245.136 dsib1136.lss.emc.com dsib1136 10.108.244.18 dsib0018.lss.emc.com dsib0018 • Prod Mount DDS Create a masking view for the management host with the gatekeepers. For example: #!/bin/bash # To find HBA port WWNs run the following command: # cat /sys/class/fc_host/host?/port_name set -x export SYMCLI_SID=000196700531 symaccess -type storage -name mgmt_gk create devs 2D:31 # gatekeeprs for management host symaccess -type initiator -name mgmt_ig create symaccess -type initiator -name mgmt_ig add -wwn <hba1_port1_wwn> symaccess -type initiator -name mgmt_ig add -wwn <hba2_port1_wwn> symaccess -type port -name mgmt_pg create -dirport 1D:8,2D:8 symaccess create view -name mgmt_mv –sg mgmt_gk –ig mgmt_ig –pg mgmt_pg Set up Production host Production database devices may already exist. If not, they can be created using Solutions Enabler CLI or Unisphere. Note: If Grid Infrastructure is used (Oracle RAC), these devices do not need to be backed up as they do not contain any user data. • The following example creates via CLI: 4 x 10GB devices for +REDO ASM disk group, 4 x 100GB devices for +DATA ASM disk group, and 4 x 200GB for +FRA ASM disk group. [root@dsib1141 emulation=FBA, [root@dsib1141 emulation=FBA, [root@dsib1141 emulation=FBA, • scripts]# symconfigure -sid 531 -cmd "create dev count=4,size=10 GB, config=tdev ;" commit scripts]# symconfigure -sid 531 -cmd "create dev count=4,size=100 GB, config=tdev ;" commit scripts]# symconfigure -sid 531 -cmd "create dev count=4,size=200 GB, config=tdev ;" commit Create a masking view for the Production host. The commands are executed on the Management host. 30 Note: Each ASM disk group type (+DATA, +REDO, and +FRA) gets its own storage-group (SG) and therefore it can have its own FAST SLO, and can be used separately for device masking to hosts. This example also creates a cascaded SG that includes ALL the database devices for: data, control, and log files (but not archive logs / FRA). This cascaded SG (called prod_database_sg) is used to create a storage consistent replica of the database for local replications (SnapVX) or remote replications (SRDF). It will also be used for logical recoveries on the Mount host. Note: For RAC deployments, where multiple nodes need access to the shared production database storage devices, use a cascaded Initiator Group that includes all the Production servers’ initiators. 
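As a sketch of the RAC note above, a cascaded Initiator Group can be built by adding per-node child groups to a parent group that is then referenced by the masking view. The group names below are illustrative, and the cascaded ‘add -ig’ syntax should be verified against your Solutions Enabler release.

# Hypothetical sketch of a cascaded Initiator Group for a 2-node RAC cluster.
# Group names are examples; verify the 'add -ig' flag against your SE version.
symaccess -type initiator -name prod_node1_ig create
symaccess -type initiator -name prod_node1_ig add -wwn <node1_hba1_wwn>
symaccess -type initiator -name prod_node1_ig add -wwn <node1_hba2_wwn>

symaccess -type initiator -name prod_node2_ig create
symaccess -type initiator -name prod_node2_ig add -wwn <node2_hba1_wwn>
symaccess -type initiator -name prod_node2_ig add -wwn <node2_hba2_wwn>

# Parent (cascaded) group referenced by the Production masking view
symaccess -type initiator -name prod_rac_ig create
symaccess -type initiator -name prod_rac_ig add -ig prod_node1_ig
symaccess -type initiator -name prod_rac_ig add -ig prod_node2_ig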
• Example masking views for the Production host: #!/bin/bash # To find HBA port WWNs run the following command: # cat /sys/class/fc_host/host?/port_name set -x export SYMCLI_SID=000196700531 symaccess -type port -name prod_pg create -dirport 1D:8,2D:8,3D:8,4D:8 <- Port group symaccess symaccess symaccess symaccess symaccess -type -type -type -type -type initiator initiator initiator initiator initiator <- Initiator group symaccess symaccess symaccess symaccess -type -type -type -type storage storage storage storage -name -name -name -name -name -name -name -name -name prod_ig prod_ig prod_ig prod_ig prod_ig create add -wwn add -wwn add -wwn add -wwn 21000024ff3de26e 21000024ff3de26f 21000024ff3de19c 21000024ff3de19d prod_redo_sg create devs 13:16 prod_data_sg create devs 17:1A prod_fra_sg create devs 37:3a prod_database_sg create sg prod_redo_sg,prod_data_sg <<<<- REDO SG DATA SG FRA SG Cascaded SG symaccess create view -name prod_database_mv –sg prod_database_sg –ig prod_ig –pg prod_pg <- DB masking view symaccess create view -name prod_fra_mv –sg prod_fra_sg –ig prod_ig –pg prod_pg <- FRA masking view • Example of the Oracle user .bash_profile script from the Production host (including the aliases TODB and TOGRID): [oracle@dsib1141 ~]$ cat ~/.bash_profile # Source global definitions if [ -f /etc/bashrc ]; then . /etc/bashrc fi export export export export export export export export BASE_PATH=$PATH DB_BASE=/u01/app/oracle GRID_BASE=/u01/app/grid DB_HOME=/u01/app/oracle/12.1/db GRID_HOME=/u01/app/grid/product/12.1.0/grid/ DB_SID=orcl GRID_SID=+ASM OMS_HOME=/u01/app/oracle/middleware/oms #export ORACLE_BASE=$DB_BASE export ORACLE_HOME=$DB_HOME export ORACLE_SID=$DB_SID export PATH=$BASE_PATH:$ORACLE_HOME/bin:. alias TODB="export ORACLE_HOME=$DB_HOME; export ORACLE_SID=$DB_SID; export ORACLE_BASE=$DB_BASE; export PATH=$DB_HOME/bin:$PATH" alias TOGRID="export ORACLE_HOME=$GRID_HOME; export ORACLE_SID=$GRID_SID; export ORACLE_BASE=$GRID_BASE; export PATH=$GRID_HOME/bin:$PATH" alias alias alias alias DH="cd $DB_HOME" GH="cd $GRID_HOME" OT="tail -200f /u01/app/oracle/diag/rdbms/orcl/orcl/trace/alert_orcl.log" AT="tail -200f /u01/app/grid/diag/asm/+asm/+ASM/trace/alert_+ASM.log" #export PATH=$PATH:.:$ORACLE_HOME/bin #export LD_LIBRARY_PATH=/u01/app/grid/12.1/grid/lib 31 • If using dm-multipath make sure that not only the Production devices are accounted for in /etc/multipath.conf, but also updated with entries for the +DATA and +FRA encapsulated restore devices, in case they are mounted to Production for recovery. In the following example, they are called “ora_dd_dataNN”, and “ora_dd_fraNN”: ... multipath { wwid alias } multipath { wwid alias } ... 360000970000196700531533030303433 ora_dd_data1 360000970000196700531533030303434 ora_dd_data2 Set up Mount host (optional) The Mount host is optional and can be used for logical recoveries or to browse through Data Domain backup-sets (using the encapsulated restore devices) before starting a SnapVX link-copy operation that can take some time to complete. • From the Management host, create a masking view for the Mount host. o Since a masking view cannot be created with a storage group, you can add a storage group with a few gatekeepers. o Create a cascaded storage group that for now only includes the gatekeepers SG. Later, when restore devices should be mounted, their SGs will simply be added to the cascaded SG and they will be accessible by the Mount host. 
symaccess symaccess symaccess symaccess symaccess -type -type -type -type -type initiator initiator initiator initiator initiator -name -name -name -name -name mount_ig mount_ig mount_ig mount_ig mount_ig create add -wwn add -wwn add -wwn add -wwn 21000024ff3de192 21000024ff3de193 21000024ff3de19a 21000024ff3de19b symaccess -type storage -name mount_gk create devs 32:36 symaccess -type storage -name mount_sg create sg mount_gk <- encapsulated SG for use later symaccess -type port -name mount_pg create -dirport 1D:8,4D:8 symaccess view create -name mount_mv -sg mount_sg -ig mount_ig -pg mount_pg • If using dm-multipath make sure to add entries in /etc/multipath.conf for the +DATA, +LOG, and +FRA encapsulated restore devices. In the following example, they are called “ora_dataNN”, and “ora_fraNN”: ... multipath { wwid alias } multipath { wwid alias } ... • 360000970000196700531533030303433 ora_data1 360000970000196700531533030303434 ora_data2 If RAC is used on the Mount host, it is recommended to configure Grid Infrastructure (+GRID ASM disk group) in advance with the cluster configuration and quorum devices. This way, when a backup is ready to be mounted to the Mount host the encapsulated restore devices will be masked to the host (made visible), and ASM can simply mount (open) the ASM disk groups. • If RAC is not used on the Mount host, then ASM will not be able to start until it has access to the initial ASM disk group (which will not be available until a backup-set is mounted to the Mount host). To prepare, perform the following steps: o Install Grid and Oracle database binaries only with the same version as Production (do not create disk groups or o Extract the ASM init.ora file from Production, and then copy it to the Mount host. database). [oracle@dsib1141 dbs]$ TOGRID [oracle@dsib1141 dbs]$ sqlplus “/ as sysasm” SQL> create pfile='/tmp/initASM.ora' from spfile; [oracle@dsib1141 dbs]$ scp /tmp/initASM.ora dsib1136:/download/scripts/oracle/initASM.ora 32 Later on, when the ASM disk group devices become visible to the Mount host, run the following commands (mounting o • the appropriate ASM disk groups as necessary to the recovery scenario). [oracle@dsib1136 ~]$ srvctl add asm -p /download/scripts/oracle/initASM.ora [oracle@dsib1136 ~]$ srvctl start asm [oracle@dsib1136 ~]$ srvctl status asm The following is an example of the Oracle user .bash_profile script from Mount host (including the aliases TODB and TOGRID): [oracle@dsib1136 ~]$ cat ~/.bash_profile # Source global definitions if [ -f /etc/bashrc ]; then . 
/etc/bashrc fi export export export export export export export BASE_PATH=$PATH DB_BASE=/u01/app/oracle GRID_BASE=/u01/app/grid DB_HOME=/u01/app/oracle/12.1/db GRID_HOME=/u01/app/grid/product/12.1.0/grid/ DB_SID=orcl GRID_SID=+ASM alias TODB="export ORACLE_HOME=$DB_HOME; export ORACLE_SID=$DB_SID; export ORACLE_BASE=$DB_BASE; export PATH=$DB_HOME/bin:$PATH" alias TOGRID="export ORACLE_HOME=$GRID_HOME; export ORACLE_SID=$GRID_SID; export ORACLE_BASE=$GRID_BASE; export PATH=$GRID_HOME/bin:$PATH" alias alias alias alias alias alias alias alias ls="ls -Fa" ll="ls -Fla" llt="ls -Flart" df="df -H" DH="cd $DB_HOME" GH="cd $GRID_HOME" OT="tail -200f /u01/app/oracle/diag/rdbms/oltp/orcl/trace/alert_orcl.log" AT="tail -200f /u01/app/grid/diag/asm/+asm/+ASM/trace/alert_+ASM.log" #export PATH=$PATH:.:$ORACLE_HOME/bin #export LD_LIBRARY_PATH=/u01/app/grid/12.1/grid/lib • When preparing for the encapsulated restore devices on the mount host: o It is highly recommended that the +DATA, +REDO, and +FRA encapsulated restore devices be presented to the Mount host during setup, and the Mount host rebooted once, so the devices can be registered with the host. That way the host will not need to be rebooted again later. o Match Mount host ASM disk string and device names with Production: If Production uses ASMlib, then during the recovery use cases on the Mount host, it can simply rescan for the new storage devices, find its own labels, and mount them. No further work is required. If Production uses EMC PowerPath (without ASMlib), then during recovery use cases on the Mount host, ASM will find its own labels. No further work is required. o If dm-multiplath is used, the file /etc/multipath.conf should contain similar aliases to the +DATA, +REDO, and +FRA devices from Production, only using WWNs of the matching encapsulated restore devices. This step can only take place later, after the vdisks have been encapsulated. Set up Data Domain system Licenses and SSH • License the Data Domain system for vdisk service, remote replications, etc. [root@dsib1141]# ssh sysadmin@DDS license show [root@dsib1141]# ssh sysadmin@DDS license add <license-key> 33 • Set secure SSH between management host and Data Domain system. Note that “DDS” is an entry in /etc/hosts with the IP address of the Data Domain system. Note: Only follow this step if Data Domain CLI will be scripted from Management host. That way, Data Domain will not ask for password with each set of commands. There is no need to follow this step when ProtectPoint CLIs are used exclusively to communicate with the Data Domain system. [root@dsib1141]# ssh-keygen -t rsa [root@dsib1141]# ssh sysadmin@DDS adminaccess add ssh-keys < ~/.ssh/id_rsa.pub Set up vdisk service and devices • • • • • Enable FC service (probably already enabled). sysadmin@dsib0018# scsitarget enable sysadmin@dsib0018# scsitarget status Enable vdisk service. sysadmin@dsib0018# vdisk enable sysadmin@dsib0018# vdisk status Create a vdisk Pool (for example, ERP). sysadmin@dsib0018# vdisk pool create ERP user sysadmin sysadmin@dsib0018# vdisk pool show list Create vdisk device group (for example, OLTP). sysadmin@dsib0018# vdisk device-group create OLTP pool ERP sysadmin@dsib0018# vdisk device-group show list Create 2 identical groups of vdisk devices; one for backup and one for restore devices matching in capacity to the Production host +REDO, +DATA, and +FRA devices. 
sysadmin@dsib0018# vdisk device create pool ERP device-group OLTP count 8 capacity 10 GiB sysadmin@dsib0018# vdisk device create pool ERP device-group OLTP count 8 capacity 100 GiB sysadmin@dsib0018# vdisk device create pool ERP device-group OLTP count 8 capacity 200 GiB sysadmin@dsib0018# vdisk device show list pool ERP Device Device-group Pool Capacity WWNN (MiB) -----------------------------------------------------------------------------vdisk-dev0 OLTP ERP 10241 60:02:18:80:00:08:a0:24:19:05:48:90:7a:d0:00:00 vdisk-dev1 OLTP ERP 10241 60:02:18:80:00:08:a0:24:19:05:48:90:7a:d0:00:01 vdisk-dev2 OLTP ERP 10241 60:02:18:80:00:08:a0:24:19:05:48:90:7a:d0:00:02 vdisk-dev3 OLTP ERP 10241 60:02:18:80:00:08:a0:24:19:05:48:90:7a:d0:00:03 ... Note: If the VMAX3 native devices were created using capacity notation of “MB” or “GB” you can also create the vdisks with “MiB” and “GiB” matching capacities. Otherwise, inspect the geometry of the VMAX3 native device using a symdev show command, and use a matching heads/cylinders/sectors geometry when creating the vdisk devices. For eExample, using <heads> <cylinders> and <sectors> instead of MiB or Gib: # On Management host: [root@dsib1141 ~]# symdev show 013 ... Geometry : Native { Sectors/Track : 256 Tracks/Cylinder : 15 Cylinders : 5462 512-byte Blocks : 20974080 MegaBytes : 10241 KiloBytes : 10487040 } => Equivalent to vdisk “Sectors per track” => Equivalent to vdisk “Heads” => Equivalent to vdisk “Cylinders” 34 # On Data Domain: sysadmin@dsib0018# vdisk device create count <count> heads <head-count> cylinders <cylindercount> sectors-per-track <sector-count> pool <pool-name> device-group <device-group-name> Create Access Group for Data Domain device masking • Create a Data Domain Access Group and add the vdisks to it. Note: Data Domain uses a similar concept to VMAX Auto-provisioning Groups (device masking) with an Access Group containing the initiators (VMAX backend port in this case, which will only be visible after the zoning was done correctly), and the vdisks devices to be encapsulated. sysadmin@dsib0018# scsitarget group create ERP service vdisk sysadmin@dsib0018# scsitarget initiator show list Initiator System Address Group Service ------------------------------------------initiator-1 50:00:09:73:50:08:4c:05 n/a n/a initiator-2 50:00:09:73:50:08:4c:49 n/a n/a initiator-3 50:00:09:73:50:08:4c:09 n/a n/a initiator-4 50:00:09:73:50:08:4c:45 n/a n/a ------------------------------------------sysadmin@dsib0018# scsitarget group add ERP initiator initiator-* sysadmin@dsib0018# scsitarget endpoint show list Endpoint System Address Transport Enabled Status -----------------------------------------------endpoint-fc-0 0a FibreChannel Yes Online endpoint-fc-1 0b FibreChannel Yes Online endpoint-fc-2 1a FibreChannel Yes Online endpoint-fc-3 1b FibreChannel Yes Online -----------------------------------------------sysadmin@dsib0018# vdisk device show list pool ERP sysadmin@dsib0018# scsitarget group add ERP device vdisk-dev* sysadmin@dsib0018# scsitarget group show detailed ERP <- Create Access Group <- List VMAX DX ports <- Add DX ports to Access Group <- List DD FC ports <- Check their status <- List vdisk devices <- Add vdisks to Access Group <- List Access Group details Set up encapsulated vdisk • After the vdisks are created and added to a Data Domain Access Group they will become visible to the VMAX3 and therefore can be encapsulated. 
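For a single device, the encapsulation reduces to one symconfigure directive. The one-device sketch below uses a placeholder WWN; the script in the next bullet simply generates one such directive per vdisk in the pool.

# Encapsulate a single Data Domain vdisk by WWN (placeholder WWN shown; remove
# the colons from the WWN reported by 'vdisk device show list').
cat > ./CMD_one.txt <<'EOF'
add external_disk wwn=<vdisk_wwn_without_colons>, encapsulate_data=yes;
EOF
symconfigure -sid 531 -nop -v -file ./CMD_one.txt commit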
• In the following example a script on the Management host is used with these steps: o ssh to the Data Domain system and capture the list of vdisks in pool ERP with their WWNs. Save it in file: ‘vdisk_wwn.txt’. o Remove everything but the WWN of each vdisk. Remove the colon from the WWNs. Save the output in file: o Create a command line for symconfigure that creates encapsulated external disks. Save it in file: ‘CMD.txt’. o Execute the command file from a symconfigure CLI command. ‘vdisk_wwn_only.txt’. #!/bin/bash set -x # Get vdisk WWN's from DDS ########################## # # # # # # # # DDS output looks like this: Device Device-group ----------vdisk-dev0 vdisk-dev1 vdisk-dev2 ... -----------OLTP OLTP OLTP Pool ---ERP ERP ERP Capacity (MiB) -------10241 10241 10241 WWNN ----------------------------------------------60:02:18:80:00:08:a0:24:19:05:48:90:7a:d0:00:00 60:02:18:80:00:08:a0:24:19:05:48:90:7a:d0:00:01 60:02:18:80:00:08:a0:24:19:05:48:90:7a:d0:00:02 35 ssh sysadmin@DDS "vdisk device show list pool ERP" | grep vdisk > ./vdisk_wwn.txt # Leave only WWNs and remove the colon ###################################### rm -f ./vdisk_wwn_only.txt while read line; do stringarray=($line) echo ${stringarray[4]} | sed 's/[\:_-]//g' >> ./vdisk_wwn_only.txt done < ./vdisk_wwn.txt # Create a symconfigure command file #################################### rm -f ./CMD.txt while read line; do CMD="add external_disk wwn=$line, encapsulate_data=yes;" echo $CMD >> ./CMD.txt done < ./vdisk_wwn_only.txt # Execute the command file ########################## symconfigure -sid 531 -nop -v -file ./CMD.txt commit • To list encapsulated devices: [root@dsib1141 scripts]# symdev list -encapsulated –gb [root@dsib1141 scripts]# symdev list -encapsulated -wwn_encapsulated • Now that the vdisks are encapsulated, build the storage groups for them: symaccess symaccess symaccess symaccess -type -type -type -type storage storage storage storage -name -name -name -name bkup_redo_sg create devs 1B:1E bkup_data_sg create devs 1F:22 bkup_fra_sg create devs 3B:3E bkup_database_sg create sg bkup_data_sg,bkup_redo_sg symaccess -type storage -name rstr_redo_sg create devs 23:26 symaccess -type storage -name rstr_data_sg create devs 27:2A symaccess -type storage -name rstr_fra_sg create devs 3F:42 Set up initial SnapVX sessions • To create the initial snapshot, use the SnapVX establish command. This example uses two Storage Groups (SGs) for the SnapVX session between the native VMAX3 devices and the Data Domain encapsulated backup devices: prod_database_sg (which includes all data, log, and control files), and prod_fra_sg (which includes the archive logs). Use the following scripts to create the initial snapshots. Script note: The ’se_snap_create.sh’ first parameter specifies on which devices to operate: Production (prod) or Restore (rstr). The second parameter indicates whether to snap data (just +DATA), database (+DATA and +REDO), or archive logs (+FRA). [root@dsib1141 ~]# ./snap_create.sh prod database [root@dsib1141 ~]# ./snap_create.sh fra database • Use SnapVX link –copy command to copy the snapshot data to Data Domain encapsulated backup devices. This example uses the SGs: bkup_database_sg, and bkup_fra_sg as the target SGs for the copy. Use the following scripts to perform it. Script note: The ’se_snap_link.sh’ first parameter specifies on which devices to operate: Production (prod) or Restore (rstr). The second parameter indicates whether to link-copy data (just +DATA), database (+DATA and +REDO), or archive logs (+FRA). 
Script note: The ’se_snap_verify.sh’ first parameter specifies on which devices to operate: Production (prod) or Restore (rstr). The second parameter indicates whether to link-copy data (just +DATA), database (+DATA and +REDO), or archive logs (+FRA). The third uses ‘0’ to indicate this is the initial link (i.e. the initial snapshot created by the admin user) or ‘1’ for monitoring later snapshots created by ProtectPoint Commands. Note: The first link-copy is a full copy. Subsequent links will only send changed data to the encapsulated backup devices. 36 [root@dsib1141 [root@dsib1141 [root@dsib1141 [root@dsib1141 • ~]# ~]# ~]# ~]# ./se_snap_link.sh prod database ./se_snap_link.sh prod fra ./se_snap_verify.sh prod database 0 ./se_snap_verify.sh prod fra 0 <- monitor the copy progress <- monitor the copy progress When monitoring link-copy to Data Domain wait until the copy changed to ‘D’ (destaged) state, which means all write pending tracks from VMAX3 cache were sent to Data Domain backend devices. [root@dsib1141 scripts]# symsnapvx -sg prod_fra_sg -snapshot_name prod_fra list -linked -detail -i 15 Storage Group (SG) Name : prod_fra_sg SG's Symmetrix ID : 000196700531 (Microcode Version: 5977) ----------------------------------------------------------------------------------------------Sym Link Flgs Remaining Done Dev Snapshot Name Gen Dev FCMD Snapshot Timestamp (Tracks) (%) ----- -------------------------------- ---- ----- ---- ------------------------ ---------- ---00037 prod_fra 0 0003B .D.X Sun Mar 22 21:13:04 2015 0 100 00038 prod_fra 0 0003C .D.X Sun Mar 22 21:13:04 2015 0 100 00039 prod_fra 0 0003D .D.X Sun Mar 22 21:13:04 2015 0 100 0003A prod_fra 0 0003E .D.X Sun Mar 22 21:13:04 2015 0 100 ---------0 Flgs: (F)ailed (C)opy (M)odified (D)efined : : : : F I X X = = = = Force Failed, X = Failed, . = No Failure CopyInProg, C = Copied, D = Copied/Destaged, . = NoCopy Link Modified Target Data, . = Not Modified All Tracks Defined, . = Define in progress Set up ProtectPoint File System Agent software • Copy ProtectPoint File System Agent v1.0 software to the Management host where Solutions Enabler is installed, then untar the kit and run the installation. [root@dsib1141 ProtectPoint]# tar xvf protectpoint-1.0.0.1-linux-x86-64.tar [root@dsib1141 ProtectPoint]# ./protectpoint_install.sh –install ... Provide information to generate config file Application name : OLTP_database Application version : 12c Application information : Oracle 12c OLTP database files Please manually edit /opt/emc/protectpoint-1.0.0.1/config/protectpoint.config file for remaining configuration Installation complete • Update the user PATH to include the location of the software. For example: /opt/emc/protectpoint-1.0.0.1/bin • Update /etc/hosts to include IPv6 and localhost. [root@dsib1141 127.0.0.1 ::1 • config]# cat /etc/hosts localhost.localdomain localhost localhost6.localdomain6 localhost6 Update or create the ProtectPoint configuration files. Each backup session (database, or fra in this case) will have its own ProtectPoint configuration file with its unique devices. While working with the configuration file is cumbersome, once it is set, it can be reused with every backup without a change. Note: Remember to update the configuration file if the Oracle database or FRA devices change! Note: To simplify updating the configuration file(s) always refer to the configuration in Table 2 (and make sure to keep it up-todate). 37 • During the ProtectPoint installation the first ProtectPoint configuration is created. 
Find it in the installation base directory under ./config/. The following is an example of update configuration file for FRA. FRA configuration file: PP_fra.config: ###################################################################### # this is just template file # Indentation just made for readability ###################################################################### [GENERAL] # APP_NAME is optional APP_NAME = "OLTP_database" # APP_VERSION is optional APP_VERSION = "12c" # APP_INFO is optional APP_INFO = "Oracle 12c OLTP database files" # give absolute path of base directory where catalog, log & lockbox # files should be generated by default BASE_DIR = "/opt/emc/protectpoint-1.0.0.1" # CATALOG_DIR is optional, default is ${[GENERAL].BASE_DIR}/catalog # CATALOG_DIR = <catalog dir> # LOCKBOX_DIR is optional, default is ${[GENERAL].BASE_DIR}/lockbox # LOCKBOX_DIR = <RSA Lock Box dir> # LOG_DIR is optional, default value is ${[GENERAL].BASE_DIR}/log # LOG_DIR = <log dir> # LOGLEVEL is optional, default value is 2, 2: error + warning, # 3: error + warning + info, 4: error + warning + info + debug # LOGLEVEL = <Log Level 2, 3 or 4> # LOGFILE_SIZE is optional, default value is 4 MB # LOGFILE_SIZE = <Log file size in MB> # LOGFILE_COUNT is optional, by default 16 files will be kept # LOGFILE_COUNT = <Number of log files> ##################### Primary System ################################# # VMAX Devices will be backed up to this System [PRIMARY_SYSTEM] # VMAX Devices will be backed up to this DD System # DD_SYSTEM = <host/IP> DD_SYSTEM = 10.108.244.18 # DD_PORT is optional, default value is 3009 # DD_PORT = <Port number to connect DD System> # The Data Domain user - owner of the DD_POOL # DD_USER = <user> DD_USER = sysadmin # DD_POOL is optional, used just for validation that all devices # belong to this pool # DD_POOL = <pool name> # DD_DEVICE_GROUP is optional, used just for validation that all # devices belong to this device group # DD_DEVICE_GROUP = <device group name> # SYstem ID of the VMAX system with production devices # SYMID = <VMAX SymID> SYMID = 000196700531 ########### Primary Devices on Primary System ######################## # All section name starting with PRIMARY_DEVICE_ will be backed up on # Primary DD i.e. 
[PRIMARY_DEVICE_1]
# SRC_SYMID is optional, default is ${[PRIMARY_SYSTEM].SYMID}
# SRC_SYMID = <SymID for Source VMAX Device>
SRC_SYMDEVID = 00037
# this is optional, default value is ${[PRIMARY_SYSTEM].SYMID}
# FTS_SYMID = <SymID for FTS encapsulated DD Device>
FTS_SYMDEVID = 0003B
# WWN of the DD VDISK device for backup
DD_WWN = 600218800008A024190548907AD00010

[PRIMARY_DEVICE_2]
# SRC_SYMID is optional, default is ${[PRIMARY_SYSTEM].SYMID}
# SRC_SYMID = <SymID for Source VMAX Device>
SRC_SYMDEVID = 00038
# this is optional, default value is ${[PRIMARY_SYSTEM].SYMID}
# FTS_SYMID = <SymID for FTS encapsulated DD Device>
FTS_SYMDEVID = 0003C
# WWN of the DD VDISK device for backup
DD_WWN = 600218800008A024190548907AD00011

[PRIMARY_DEVICE_3]
# SRC_SYMID is optional, default is ${[PRIMARY_SYSTEM].SYMID}
# SRC_SYMID = <SymID for Source VMAX Device>
SRC_SYMDEVID = 00039
# this is optional, default value is ${[PRIMARY_SYSTEM].SYMID}
# FTS_SYMID = <SymID for FTS encapsulated DD Device>
FTS_SYMDEVID = 0003D
# WWN of the DD VDISK device for backup
DD_WWN = 600218800008A024190548907AD00012

[PRIMARY_DEVICE_4]
# SRC_SYMID is optional, default is ${[PRIMARY_SYSTEM].SYMID}
# SRC_SYMID = <SymID for Source VMAX Device>
SRC_SYMDEVID = 0003A
# FTS_SYMID is optional, default is ${[PRIMARY_SYSTEM].SYMID}
# FTS_SYMID = <SymID for FTS encapsulated DD Device>
FTS_SYMDEVID = 0003E
# WWN of the DD VDISK device for backup
DD_WWN = 600218800008A024190548907AD00013
######################################################################

############### Restore Devices on Primary System ####################
# All section name starting with PRIMARY_SYSTEM_RESTORE_DEVICE will be
# used to restore on Primary DD i.e. [PRIMARY_SYSTEM]
# Total number of restore devices should be greater than or equal to
# number of static images in backup & should have exact geometry as
# static image in backup
[PRIMARY_SYSTEM_RESTORE_DEVICE_1]
# FTS_SYMID is optional, default is ${[PRIMARY_SYSTEM].SYMID}
# FTS_SYMID = <SymID for FTS encapsulated DD Device>
FTS_SYMDEVID = 03F
# WWN of the DD VDISK device for Restore
DD_WWN = 600218800008A024190548907AD00014

[PRIMARY_SYSTEM_RESTORE_DEVICE_2]
# FTS_SYMID is optional, default is ${[PRIMARY_SYSTEM].SYMID}
# FTS_SYMID = <SymID for FTS encapsulated DD Device>
FTS_SYMDEVID = 040
# WWN of the DD VDISK device for Restore
DD_WWN = 600218800008A024190548907AD00015

[PRIMARY_SYSTEM_RESTORE_DEVICE_3]
# FTS_SYMID is optional, default is ${[PRIMARY_SYSTEM].SYMID}
# FTS_SYMID = <SymID for FTS encapsulated DD Device>
FTS_SYMDEVID = 041
# WWN of the DD VDISK device for Restore
DD_WWN = 600218800008A024190548907AD00016

[PRIMARY_SYSTEM_RESTORE_DEVICE_4]
# FTS_SYMID is optional, default is ${[PRIMARY_SYSTEM].SYMID}
# FTS_SYMID = <SymID for FTS encapsulated DD Device>
FTS_SYMDEVID = 042
# WWN of the DD VDISK device for Restore
DD_WWN = 600218800008A024190548907AD00017
#
#
######################################################################

######################################################################
################## Secondary System ##################################
# Backup will be replicated/copied from Primary DD i.e.
# [PRIMARY_SYSTEM] to Secondary System
#
# [SECONDARY_SYSTEM]
# # Hostname/IP of the DD system for DD Replication
# DD_SYSTEM = <host/IP>
# # DD_PORT is optional, default value is 3009
# # DD_PORT = <Port number to connect DD System>
# # The Data Domain user - owner of the DD_POOL
# DD_USER = <user>
# # DD vdisk pool on the remote DD system
# DD_POOL = <pool name>
# # DD device-group where the replicated images will be available
# DD_DEVICE_GROUP = <device group name>
# # <SymID for FTS encapsulated Restore DD Device>
# # this is optional if no restore device or FTS_SYMID mentioned in
# # each restore device
# # SYMID = <VMAX SymID>
#
########### Restore Devices on Secondary System ######################
# All section name starting with SECONDARY_SYSTEM_RESTORE_DEVICE will
# be used to restore on Secondary DD i.e. [SECONDARY_SYSTEM]
# Total number of restore devices should be greater than or equal to
# number of static images in backup & should have exact geometry as
# static image in backup
# [SECONDARY_SYSTEM_RESTORE_DEVICE_1]
# # FTS_SYMID is optional, default is ${[SECONDARY_SYSTEM].SYMID}
# # FTS_SYMID = <SymID for FTS encapsulated Restore DD Device>
# FTS_SYMDEVID = <SymDevID for FTS encapsulated Restore DD Device>
# # WWN of the DD VDISK device for Restore
# DD_WWN = <WWN for Restore DD Device>
#
# [SECONDARY_SYSTEM_RESTORE_DEVICE_2]
# # FTS_SYMID is optional, default is ${[SECONDARY_SYSTEM].SYMID}
# # FTS_SYMID = <SymID for FTS encapsulated Restore DD Device>
# FTS_SYMDEVID = <SymDevID for FTS encapsulated Restore DD Device>
# # WWN of the DD VDISK device for Restore
# DD_WWN = <WWN for Restore DD Device>
#
######################################################################

Note: ProtectPoint does not compare the content of the SnapVX sessions (or storage groups) with the devices listed in the configuration file. ProtectPoint will only operate on the devices that appear in the configuration file.

• Validate the configuration file using the ProtectPoint command: config validate

Note: Validate the configuration file only after the SSH credentials are established, the configuration file is updated with the device information, and the initial SnapVX sessions are created and linked.

[root@dsib1141 config]# protectpoint config validate config-file /opt/emc/protectpoint-1.0.0.1/config/PP_fra.config
Validating host requirements............................[OK]
Validating Primary System:
  Connection Information..............................[OK]
  Backup Devices are in same Data Domain Device Group.[OK]
  Backup Devices are unique...........................[OK]
  Backup Device's VMAX & DD Device Configuration......[OK]
  Restore Devices are in same Data Domain Device Group[OK]
  Restore Devices are unique..........................[OK]
  Restore Device's VMAX & DD Device Configuration.....[OK]
  Replication License.................................[N/A]
Validating Secondary System:
  Connection Information..............................[N/A]
  Replication Device Group............................[N/A]
  Restore Devices are in same Data Domain Device Group[N/A]
  Restore Devices are unique..........................[N/A]
  Restore Device's VMAX & DD Device Configuration.....[N/A]
  Replication License.................................[N/A]
Validating Primary and Secondary System are different...[N/A]
Configuration is valid.
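Because each backup session keeps its own configuration file, it is easy to miss one after a device change. The following is a minimal sketch that re-runs the validation shown above for every configuration file used in this paper; it assumes the PP_database.config and PP_fra.config files under the default installation path, and that the CLI returns a non-zero exit status when validation fails.

#!/bin/bash
# Validate all ProtectPoint configuration files after a device change.
# Assumes the file names and install path used in this paper, and that
# 'protectpoint config validate' exits non-zero on a validation failure.
CONF_DIR=/opt/emc/protectpoint-1.0.0.1/config
for CONF in $CONF_DIR/PP_database.config $CONF_DIR/PP_fra.config; do
   echo "Validating $CONF"
   if ! protectpoint config validate config-file $CONF; then
      echo "Validation failed for $CONF" >&2
      exit 1
   fi
done
echo "All ProtectPoint configuration files are valid."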
APPENDIX II – SAMPLE CLI COMMANDS: SNAPVX, PROTECTPOINT, DATA DOMAIN

This section includes:
1) Basic TimeFinder SnapVX operations and commands
2) Basic ProtectPoint File System Agent commands
3) Basic Data Domain commands

Basic TimeFinder SnapVX operations
The following are basic SnapVX operations that can be executed via Unisphere or the Solutions Enabler command line (CLI). The examples use storage group (-sg) syntax.

Establish: Snapshots are taken using the establish command. Example:
# symsnapvx -sg prod_data_sg -name prod_data establish

List: Lists the available snapshots. The output can show all snapshots or be limited to a specific storage group. Based on the options used, the list can include only source devices and their snapshots, or also linked targets, restored snapshots, etc. The output can also show how much storage is consumed by the snapshots, whether a background copy operation has completed, and more. For example:
# symsnapvx -sg prod_data_sg list

Restore: Restores a specific snapshot to its source devices. The snapshot name and, optionally, a generation number are provided. For example:
# symsnapvx -sg prod_data_sg -snapshot_name prod_data restore

Link / Relink / Unlink: A SnapVX link makes the snapshot point-in-time data available to another set of host-addressable devices. The group of target devices can also be specified using a storage group, pointed to by the '-lnsg' (linked storage group) option. See the previous section regarding the use of '-copy' during the link operation. To perform an incremental refresh of linked-target devices, use the 'relink' option. To remove a linked-target relationship, use the 'unlink' option. For example:
# symsnapvx -sg prod_data_sg -lnsg test_data_sg -snapshot_name prod_data link -copy -exact

Verify: Verifies that a SnapVX operation completed or is in a certain state. For example, verify can be used to determine whether a copy was fully completed AND destaged from VMAX persistent cache. This is especially useful when FTS is used, to make sure that all the data was copied to the external storage. For example:
# symsnapvx -sg prod_data_sg -name prod_data verify -copied -destaged

Terminate / Terminate-restored: Terminates a snapshot, or terminates the restored state of a snapshot (allowing another one to be restored to the source devices). For example:
# symsnapvx -sg prod_data_sg -snapshot_name prod_data terminate

Basic ProtectPoint File System Agent Commands
The ProtectPoint File System Agent uses a command line interface (CLI) to perform the following main functions:

• Add/remove ssh credentials: Establishes or removes ssh credentials between the host and the Data Domain system.
  Syntax: protectpoint security add dd-credentials [dd-system {primary | secondary}] [config-file <file-path>]

• Update Data Domain catalog: Creates or refreshes the backup catalog on the primary or secondary Data Domain system.
  Syntax: protectpoint catalog update [dd-system {primary | secondary}] [config-file <file-path>]

• Snapshot create: Creates a SnapVX snapshot of the database. If Oracle hot-backup mode is required (database releases older than 12c), begin hot-backup before the snapshot and end it after the snapshot is taken.
  Syntax: protectpoint snapshot create

• Backup create: After a snapshot of the database has been taken, as shown in the previous step, and hot-backup mode has ended (if it was used for database releases lower than 12c), the backup create command performs the following operations: it relinks the snapshot to the Data Domain encapsulated backup devices, waits for the copy to be fully done ('destaged'), creates a new static-image in the Data Domain system for the copied data, and assigns it the appropriate metadata. At the end of this step a new backup-set exists in Data Domain.
  Syntax: protectpoint backup create description "<backup-description>" [config-file <file-path>]

• Backup delete: Deletes a backup.
  Syntax: protectpoint backup delete backup-id <backup-id> [dd-system {primary | secondary}] [config-file <file-path>]

• Backup show: Lists available backups based on different criteria.
  Syntax: protectpoint backup show list [dd-system {primary | secondary}] [{last <n> {count | days | weeks | months}} | {from <MMDDhhmm> [[<CC>] <YY>] [to <MMDDhhmm> [[<CC>] <YY>]]}] [status {complete | in-progress | failed | partial}] [config-file <file-path>]

• Restore: Places the backup-set data on the encapsulated restore devices.
  Syntax: protectpoint restore prepare backup-id <backup-id> [dd-system {primary | secondary}] [config-file <file-path>]

• Manage replications: Replicates data from one Data Domain system to another (primary or secondary), views replication status and history, or stops the replication. Only one active replication session can exist for a backup-set at a time.
  Note: Data Domain OS 5.5 only allows ProtectPoint to remotely replicate a single backup-id (a backup-set).
  Syntax:
  o protectpoint replication run backup-id <backup-id> [source-dd-system {primary | secondary}] [config-file <file-path>]
  o protectpoint replication show list [source-dd-system {primary | secondary}] [{last <n> {count | days | weeks | months}} | {from <MMDDhhmm> [[<CC>] <YY>] [to <MMDDhhmm> [[<CC>] <YY>]]}] [config-file <file-path>]
  o protectpoint replication abort [source-dd-system {primary | secondary}] [config-file <file-path>]

• Validate ProtectPoint configuration file: Provides information about and validation of the ProtectPoint configuration file.
  Syntax: protectpoint config validate [config-file <file-path>]

An illustrative sequence that ties these commands together is shown at the end of this appendix.

Basic Data Domain Commands
The Data Domain system can be managed by using a comprehensive set of CLI commands or by using the Data Domain System Manager graphical user interface (GUI). A full description of managing the Data Domain system is beyond the scope of this white paper; however, a list of relevant CLI commands is provided in the Setup Data Domain system section.

Note: The default management user for the Data Domain system is 'sysadmin'. While other users with lower permissions can be configured, the 'sysadmin' user is used in this white paper.

Note: The following examples execute Data Domain commands using an ssh remote login. The syntax is 'ssh sysadmin@DDS <command>', where sysadmin is the Data Domain admin user and DDS is a notation for the Data Domain system IP address set in the /etc/hosts file (i.e. 'DDS' can be replaced by the IP address or a different name). Some of the examples are shown as executed by the sysadmin user directly on the Data Domain system and are not preceded by 'ssh'. For more information on the Data Domain command line, refer to the EMC Data Domain Operating System Command Reference Guide.
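To tie the ProtectPoint commands above together, the following is a minimal sketch of one FRA backup cycle, assuming the PP_fra.config file shown earlier. The ordering and the 'last 7 days' filter are illustrative only; the use cases earlier in this paper show the exact flows.

#!/bin/bash
# Illustrative order of the ProtectPoint commands for a single FRA backup cycle.
CONF=/opt/emc/protectpoint-1.0.0.1/config/PP_fra.config
protectpoint security add dd-credentials config-file $CONF   # one-time ssh credential setup
protectpoint catalog update config-file $CONF                # create/refresh the backup catalog
protectpoint snapshot create config-file $CONF               # SnapVX snapshot of the FRA devices
protectpoint backup create description "FRA backup example" config-file $CONF
protectpoint backup show list last 7 days config-file $CONF  # note the new backup-id
# To place a backup-set on the encapsulated restore devices, use the backup-id
# reported by the list command above:
# protectpoint restore prepare backup-id <backup-id> config-file $CONF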
APPENDIX III – PROVIDING SOLUTIONS ENABLER ACCESS TO NON-ROOT USERS

This appendix describes how to provide DBAs controlled access to Solutions Enabler so they can perform their TimeFinder replication, backup, and recovery procedures without needing root access. The feature is called Solutions Enabler Array Based Access Controls and can be configured from Unisphere or Solutions Enabler, as shown below.

The components of Array Based Access Controls are:
• Access Groups: Contain the unique host ID and a descriptive host name of the non-root users' hosts. The host ID is obtained by running the 'symacl -unique' command on the appropriate host.
• Access Pools: Specify the set of devices available for operations.
• Access Control Entry (ACE): Entries in the Access Control Database specifying the permission level for the Access Control Groups and the pools on which they can operate.

Array Based Access Control commands are executed using: symacl -sid <SID> -file <filename> preview | prepare | commit, where preview verifies the syntax, prepare runs preview and checks whether the execution is possible, and commit performs the prepare operations and executes the command. The Storage Admin PIN can be set in the SYMCLI_ACCESS_PIN environment variable or entered manually.

Install Solutions Enabler for non-root user

• On the Application Management host, install Solutions Enabler for the Oracle user. The installation has to be performed as the root user, though the option for allowing a non-root user is part of the installation.

[root@dsib1136 SE]# ./se8020_install.sh -install
...
Install root directory of previous Installation : /home/oracle/SE
Working root directory [/usr/emc] : /home/oracle/SE
...
Do you want to run these daemons as a non-root user? [N]:Y
Please enter the user name : oracle
...
#-----------------------------------------------------------------------------
# The following HAS BEEN INSTALLED in /home/oracle/SE via the rpm utility.
#-----------------------------------------------------------------------------
ITEM  PRODUCT                          VERSION
 01   EMC Solutions Enabler            V8.0.2.0 RT KIT
#-----------------------------------------------------------------------------

• To allow the Oracle user to run symcfg discover and list commands, permission is required to use the Solutions Enabler daemons. Update the daemon_users file:

[root@dsib1136 ~]# cd /var/symapi/config/
[root@dsib1136 config]# vi daemon_users
# Add entry to allow user access to base daemon
oracle    storapid
oracle    storgnsd

• Test Oracle user access:

[root@dsib1136 config]# su - oracle
[oracle@dsib1136 ~]$ symcfg disc
[oracle@dsib1136 ~]$ sympd list -gb

Set Management Host in Access Controls Database

• EMC support personnel will run a Wizard on SymmWin. First, they enter the Storage Admin management host unique ID, then the Admin user and PIN (password). The Storage Admin should provide them the PIN and unique ID. The unique ID is obtained by running 'symacl -unique' on the Storage Management host. After that, Access Controls can be created from the Storage Management host, as shown below.
Create Array Based Access Controls for the ProtectPoint Management host

• Create the Application Access Control Group:

[root@dsib1100 ~]# echo "create accgroup protectpoint;" > ./acl_pp_create_accgrp.cmd
[root@dsib1100 ~]# symacl commit -file ./acl_pp_create_accgrp.cmd

• On the Application Management host (where the DBA executes backup and recovery operations), get the unique host ID:

[root@dsib1141 ~]# symacl -sid 531 -unique
The unique id for this host is: 2F5A05AC-50498CC9-9C38777E

• Add the Application Management host to the Application Access Control group:

[root@dsib1100 ~]# echo "add host accid 2F5A05AC-50498CC9-9C38777E name protectpoint_mgmt to accgroup protectpoint;" > acl_pp_add_host.cmd
[root@dsib1100 ~]# symacl commit -file ./acl_pp_add_host.cmd

• Create the Application Access Control pool:

[root@dsib1100 ~]# echo "create accpool protectpoint;" > acl_pp_create_pool.cmd
[root@dsib1100 ~]# symacl commit -file ./acl_pp_create_pool.cmd

• Add the Application storage devices to the pool (including target devices):

[root@dsib1100 ~]# echo "add dev 13:2A to accpool protectpoint;" > acl_pp_add_devs.cmd
[root@dsib1100 ~]# echo "add dev 3B:42 to accpool protectpoint;" >> acl_pp_add_devs.cmd
[root@dsib1100 ~]# symacl commit -file ./acl_pp_add_devs.cmd

• Have the Oracle user try to run the command from the Application Management host prior to granting access:

[oracle@dsib1141 ~]$ sympd list
Symmetrix ID: 000196700531
Symmetrix access control denied the request

• Grant permissions to the Application Access Group (choose the access types appropriately, based on the documentation):

[root@dsib1100 ~]# echo "grant access=BASE,SNAP,BASECTRL to accgroup protectpoint for accpool protectpoint;" > acl_pp_grant_access.cmd
[root@dsib1100 ~]# symacl commit -file ./acl_pp_grant_access.cmd

• Have the Oracle user run the command again from the ProtectPoint Management host, now that access has been granted:

[oracle@dsib1141 ~]$ sympd list
Symmetrix ID: 000196700531

              Device Name           Dir                Device
---------------------------- ------- -------------------------------------
                                                                       Cap
Physical               Sym   SA :P   Config        Attribute    Sts   (MB)
---------------------------- ------- -------------------------------------
...

• Review the created Access Controls:

[root@dsib1100 ~]# symacl list -accgroup
[root@dsib1100 ~]# symacl list -accpool
[root@dsib1100 ~]# symacl list -acl
[root@dsib1100 ~]# symacl show accpool protectpoint
[root@dsib1100 ~]# symacl show accgroup protectpoint

APPENDIX IV – SCRIPTS USED IN THE USE CASES

Oracle scripts
o 'ora_switchandarchive.sh' – Logs in to Oracle, performs a log file switch, and archives the current log(s). If the Oracle database is on another host, ssh to that host first.

[root@dsib1141 scripts]# more ora_switchandarchive.sh
#!/bin/bash
set -x
su - oracle << !
set -x
sqlplus "/ as sysdba" << EOF
alter system switch logfile;
alter system archive log current;
ALTER DATABASE BACKUP CONTROLFILE TO '+FRA/CTRL.BCK' REUSE;
EOF
!
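The use cases in this paper rely on Oracle 12c, which does not require hot-backup mode for storage snapshots. For older releases, which (as noted in Appendix II) must be placed in hot-backup mode around the snapshot, a small companion script in the same style as 'ora_switchandarchive.sh' could be used. The following is an illustrative sketch only; the name 'ora_hotbackup.sh' and its begin|end argument are not part of the script kit described in this paper.

#!/bin/bash
# Illustrative only: place the database in or out of hot-backup mode around a
# SnapVX snapshot (needed only for database releases older than 12c).
set -x
if [ "$#" -ne 1 ]; then echo "options: begin|end"; exit; fi
MODE=$1
if [ $MODE != "begin" ] && [ $MODE != "end" ]; then echo "options: begin|end"; exit; fi
su - oracle << !
sqlplus "/ as sysdba" << EOF
alter database ${MODE} backup;
EOF
!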
ProtectPoint scripts
o 'pp_backup.sh' – Performs a SnapVX link-copy to Data Domain, then creates a new backup-set with a description. The script adds to the description the source devices for the backup ('database' or 'fra') as well as the SnapVX snapshot timestamp (the time at which the database backup snapshot was created).

[root@dsib1141 scripts]# cat pp_backup.sh
#!/bin/bash
set -x
if [ "$#" -ne 2 ]; then
   echo "options: 1) database|fra 2) backup-description"
   exit
fi
OPT=$1
DESC="$2"
case $OPT in
   database|fra)
      PIT_DATE=`./se_prod_snaptime.sh $OPT`   # get the snapshot time
      DESC_WITH_DATE="$OPT $DESC $PIT_DATE"
      CONF=$PP_CONF_LOC/PP_${OPT}.config
      protectpoint backup create description "$DESC_WITH_DATE" config-file $CONF &
      PID=$!
      echo "protectpoint backup create is running in the background with PID: $PID and description: $DESC_WITH_DATE"
      ;;
   *)
      echo "options: 1) database|fra 2) backup-description"
      exit
      ;;
esac

o 'pp_delete_backup.sh' – Deletes a ProtectPoint backup-id. It takes as parameters a ProtectPoint configuration file type ('database' or 'fra') and a backup-id. Note that it does not matter which configuration file is used.

[root@dsib1141 scripts]# cat pp_delete_backup.sh
#!/bin/bash
set -x
if [ "$#" -ne 2 ]; then
   echo "options: 1) database|fra 2) backup-id"
   exit
fi
OPT=$1
case $OPT in
   database|fra)
      CONF=$PP_CONF_LOC/PP_${OPT}.config
      protectpoint backup delete backup-id $2 config-file $CONF
      ;;
   *)
      echo "options: 1) database|fra 2) backup-id"
      exit
      ;;
esac

o 'pp_list_backup.sh' – Lists the ProtectPoint backup-sets.

[root@dsib1141 scripts]# cat pp_list_backup.sh
#!/bin/bash
#set -x
if [ "$#" -ne 1 ]; then
   echo "options: database|fra"
   exit
fi
OPT=$1
case $OPT in
   database|fra)
      CONF=$PP_CONF_LOC/PP_${OPT}.config
      protectpoint backup show list config-file $CONF
      ;;
   *)
      echo "options: database|fra"
      exit
      ;;
esac

o 'pp_restore.sh' – Places a Data Domain backup-set (using a ProtectPoint backup-id) on the appropriate restore devices.

[root@dsib1141 scripts]# cat pp_restore.sh
#!/bin/bash
set -x
if [ "$#" -ne 2 ]; then
   echo "options: 1) database|fra 2) backup-id"
   exit
fi
OPT=$1
case $OPT in
   database|fra)
      CONF=$PP_CONF_LOC/PP_${OPT}.config
      protectpoint restore prepare backup-id $2 config-file $CONF
      ;;
   *)
      echo "options: 1) database|fra 2) backup-id"
      exit
      ;;
esac

o 'pp_snap.sh' – Creates a new SnapVX snapshot for the Production devices: 'database' or 'fra'.

[root@dsib1141 scripts]# cat pp_snap.sh
#!/bin/bash
set -x
if [ "$#" -ne 1 ]; then
   echo "options: database|fra"
   exit
fi
OPT=$1
case $OPT in
   database|fra)
      CONF=$PP_CONF_LOC/PP_${OPT}.config
      protectpoint snapshot create config-file $CONF
      echo "Backup time: "; date
      ;;
   *)
      echo "options: database|fra"
      exit
      ;;
esac
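The ProtectPoint scripts above are typically run in pairs: 'pp_snap.sh' to create the snapshot, followed by 'pp_backup.sh' to turn that snapshot into a Data Domain backup-set. The following sequence is illustrative only (the descriptions are examples, and the exact ordering used by each use case appears earlier in this paper); it combines the scripts with 'ora_switchandarchive.sh' so that the archive logs captured in +FRA cover the database snapshot:

[root@dsib1141 scripts]# ./pp_snap.sh database
[root@dsib1141 scripts]# ./ora_switchandarchive.sh
[root@dsib1141 scripts]# ./pp_snap.sh fra
[root@dsib1141 scripts]# ./pp_backup.sh database "end of day backup"
[root@dsib1141 scripts]# ./pp_backup.sh fra "end of day backup"
[root@dsib1141 scripts]# ./pp_list_backup.sh database    <- confirm the new backup-set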
Solutions Enabler scripts:
o 'se_devs.sh' – Shows the device status of a storage group (SG), or sets its devices to READY or NOT_READY.

[root@dsib1141 scripts]# cat se_devs.sh
#!/bin/bash
#set -x
if [ "$#" -ne 2 ]; then
   echo "options: <sg_name> operation: show|ready|not_ready"
   exit
fi
SG_NAME=$1
OP=$2
if [ $OP == "show" ]; then
   symdev list -sg ${SG_NAME}
   exit
fi
if [ $OP == "ready" ] || [ $OP == "not_ready" ]; then
   symsg -sg ${SG_NAME} $OP
fi

o 'se_encapsulate.sh' – Encapsulates the Data Domain backup and restore devices with VMAX3 (through FTS).

[root@dsib1141 scripts]# cat se_encapsulate.sh
#!/bin/bash
set -x
# Get vdisk WWN's from DDS
##########################
# DDS output looks like this:
# Device       Device-group  Pool  Capacity   WWNN
#                                  (MiB)
# ------------ ------------- ----- ---------- -----------------------------------------------
# vdisk-dev0   OLTP          ERP   10241      60:02:18:80:00:08:a0:24:19:05:48:90:7a:d0:00:00
# vdisk-dev1   OLTP          ERP   10241      60:02:18:80:00:08:a0:24:19:05:48:90:7a:d0:00:01
# vdisk-dev2   OLTP          ERP   10241      60:02:18:80:00:08:a0:24:19:05:48:90:7a:d0:00:02
# ...
ssh sysadmin@DDS "vdisk device show list pool ERP" | grep vdisk > ./vdisk_wwn.txt

# Remove irrelevant lines and remove colon from WWNs
####################################################
rm -f ./vdisk_wwn_only.txt
while read line; do
   stringarray=($line)
   echo ${stringarray[4]} | sed 's/[\:_-]//g' >> ./vdisk_wwn_only.txt
done < ./vdisk_wwn.txt

# Create and execute a symconfigure command file
################################################
rm -f ./CMD.txt
while read line; do
   CMD="add external_disk wwn=$line, encapsulate_data=yes;"
   echo $CMD >> ./CMD.txt
done < ./vdisk_wwn_only.txt
symconfigure -sid 531 -nop -v -file ./CMD.txt commit

o 'se_prod_snaptime.sh' – Captures the timestamp of the last snapshot taken for database or fra. It is used by the 'pp_backup.sh' script.

[root@dsib1141 scripts]# cat se_prod_snaptime.sh
#!/bin/bash
#set -x
if [ $# -ne 1 ]; then echo "options: database|fra"; exit; fi
SG=$1
# Capture snapvx snapshot time into an array
PIT=($(symsnapvx -sg prod_${SG}_sg list | awk '/NSM/ {print $9,$6,$7,$8; exit}'))
MM=`date -d "${PIT[1]} 1" "+%m"`   # Replace MMM with numeric MM
PIT_DATE="(${PIT[0]}_${MM}_${PIT[2]}_${PIT[3]})"
echo $PIT_DATE

o 'se_snap_create.sh' – Creates a SnapVX snapshot for the production (prod) or restore (rstr) storage group; device options: database (+DATA and +REDO), data (just +DATA), or fra (+FRA).

[root@dsib1141 scripts]# cat se_snap_create.sh
#!/bin/bash
set -x
if [ "$#" -ne 2 ]; then
   echo "options: prod|rstr database|data|fra"
   exit
fi
DEV_ORIGIN=$1
FILE_TYPE=$2
if [ $DEV_ORIGIN != "prod" ] && [ $DEV_ORIGIN != "rstr" ]; then
   echo "specify production primary devices, or encapsulated restore devices"
   echo "options: prod|rstr database|data|fra"
   exit
fi
if [ $FILE_TYPE != "database" ] && [ $FILE_TYPE != "data" ] && [ $FILE_TYPE != "fra" ]; then
   echo "specify database or fra devices"
   echo "options: prod|rstr database|data|fra"
   exit
fi
symsnapvx -sg ${DEV_ORIGIN}_${FILE_TYPE}_sg -name ${DEV_ORIGIN}_${FILE_TYPE} establish -v
symsnapvx list

o 'se_snap_link.sh' – Similar to 'se_snap_create.sh', except that it performs a link-copy instead of an establish.

[root@dsib1141 scripts]# cat se_snap_link.sh
#!/bin/bash
set -x
if [ "$#" -ne 2 ]; then
   echo "options: prod|rstr database|data|fra"
   exit
fi
DEV_ORIGIN=$1
FILE_TYPE=$2
if [ $DEV_ORIGIN != "prod" ] && [ $DEV_ORIGIN != "rstr" ] || [ $FILE_TYPE != "database" ] && [ $FILE_TYPE != "data" ] && [ $FILE_TYPE != "fra" ]
then
   echo "options: prod|rstr database|data|fra"
   exit
fi
if [ $DEV_ORIGIN == "prod" ]; then
   SRS_SG=prod_${FILE_TYPE}_sg
   TGT_SG=bkup_${FILE_TYPE}_sg
fi
if [ $DEV_ORIGIN == "rstr" ]; then
   SRS_SG=rstr_${FILE_TYPE}_sg
   TGT_SG=prod_${FILE_TYPE}_sg
fi
if [ $DEV_ORIGIN == "rstr" ] && [ $FILE_TYPE == "database" ]; then
   echo "This is a block to prevent overwriting the production redo logs unintentionally."
   exit
fi
SNAP_NAME=${DEV_ORIGIN}_${FILE_TYPE}
symsnapvx -sg ${SRS_SG} -lnsg ${TGT_SG} -snapshot_name ${SNAP_NAME} link -copy -exact
symsnapvx list

o 'se_snap_show.sh' – Shows the status of a snapshot, including link-copy progress, at intervals of 30 seconds.
[root@dsib1141 scripts]# cat se_snap_show.sh
#!/bin/bash
set -x
if [ "$#" -ne 2 ]; then
   echo "options: prod|rstr database|data|fra"
   exit
fi
DEV_ORIGIN=$1
FILE_TYPE=$2
if [ $DEV_ORIGIN != "prod" ] && [ $DEV_ORIGIN != "rstr" ]; then
   echo "specify link source: production devices, or encapsulated restore devices"
   echo "options: prod|rstr database|data|fra"
   exit
fi
if [ $FILE_TYPE != "database" ] && [ $FILE_TYPE != "data" ] && [ $FILE_TYPE != "fra" ]; then
   echo "specify which devices to link: database, data or fra devices"
   echo "options: prod|rstr database|data|fra"
   exit
fi
if [ $DEV_ORIGIN == "prod" ]; then
   SG=prod_${FILE_TYPE}_sg
fi
if [ $DEV_ORIGIN == "rstr" ]; then
   SG=rstr_${FILE_TYPE}_sg
fi
#SNAP_NAME=${DEV_ORIGIN}_${FILE_TYPE}
symsnapvx -sg $SG list -linked -copied -detail -i 30

o 'se_snap_terminate.sh' – Terminates a snapshot.

[root@dsib1141 scripts]# cat se_snap_terminate.sh
#!/bin/bash
set -x
if [ "$#" -ne 2 ]; then
   echo "options: prod|rstr database|data|fra"
   exit
fi
DEV_ORIGIN=$1
FILE_TYPE=$2
if [ $DEV_ORIGIN != "prod" ] && [ $DEV_ORIGIN != "rstr" ]; then
   echo "specify production primary devices, or encapsulated restore devices"
   echo "options: prod|rstr database|data|fra"
   exit
fi
if [ $FILE_TYPE != "database" ] && [ $FILE_TYPE != "data" ] && [ $FILE_TYPE != "fra" ]; then
   echo "specify database or fra devices"
   echo "options: prod|rstr database|data|fra"
   exit
fi
if [ $DEV_ORIGIN == "prod" ]; then
   exit   # Blocked to prevent terminating the Prod snap. Remove if truly necessary.
fi
symsnapvx -sg ${DEV_ORIGIN}_${FILE_TYPE}_sg -snapshot_name ${DEV_ORIGIN}_${FILE_TYPE} terminate -v
symsnapvx list

o 'se_snap_unlink.sh' – Unlinks a snapshot (such as at the end of use case 4c).

[root@dsib1141 scripts]# cat se_snap_unlink.sh
#!/bin/bash
set -x
if [ "$#" -ne 2 ]; then
   echo "options: prod|rstr database|data|fra"
   exit
fi
DEV_ORIGIN=$1
FILE_TYPE=$2
if [ $DEV_ORIGIN != "prod" ] && [ $DEV_ORIGIN != "rstr" ] || [ $FILE_TYPE != "database" ] && [ $FILE_TYPE != "data" ] && [ $FILE_TYPE != "fra" ]
then
   echo "options: prod|rstr database|data|fra"
   exit
fi
if [ $DEV_ORIGIN == "prod" ]; then
   SRS_SG=prod_${FILE_TYPE}_sg
   TGT_SG=bkup_${FILE_TYPE}_sg
fi
if [ $DEV_ORIGIN == "rstr" ]; then
   SRS_SG=rstr_${FILE_TYPE}_sg
   TGT_SG=prod_${FILE_TYPE}_sg
fi
SNAP_NAME=${DEV_ORIGIN}_${FILE_TYPE}
symsnapvx -sg ${SRS_SG} -lnsg ${TGT_SG} -snapshot_name ${SNAP_NAME} unlink
symsnapvx list

o 'se_snap_verify.sh' – Waits until a snapshot link-copy is fully in the 'copied' or 'destaged' state, while showing progress.

[root@dsib1141 scripts]# cat ./se_snap_verify.sh
#!/bin/bash
set -x
if [ "$#" -ne 3 ]; then
   echo "options: prod|rstr database|data|fra NSM=0|1"
   exit
fi
DEV_ORIGIN=$1
FILE_TYPE=$2
NSM=$3
if [ $DEV_ORIGIN != "prod" ] && [ $DEV_ORIGIN != "rstr" ]; then
   echo "specify link source: production devices, or encapsulated restore devices"
   echo "options: prod|rstr database|data|fra NSM=0|1"
   exit
fi
if [ $FILE_TYPE != "database" ] && [ $FILE_TYPE != "data" ] && [ $FILE_TYPE != "fra" ]; then
   echo "specify which devices to link: database, data or fra devices"
   echo "options: prod|rstr database|data|fra NSM=0|1"
   exit
fi
if [ $NSM -eq 1 ]; then
   SNAPSHOT_NAME="NSM_SNAPVX"
else
   SNAPSHOT_NAME=${DEV_ORIGIN}_${FILE_TYPE}
fi
SG_NAME=${SNAPSHOT_NAME}_sg
STAT=1
while [ $STAT -ne 0 ]
do
   symsnapvx -sg $SG_NAME -snapshot_name $SNAPSHOT_NAME list -detail -linked
   symsnapvx -sg $SG_NAME -snapshot_name $SNAPSHOT_NAME verify -copied -destaged
   STAT=$?
   if [ $STAT -ne 0 ]; then sleep 30; fi
done
date

o 'se_aclx.sh' – Combines the different device masking commands for the system setup.

[root@dsib1141 scripts]# cat ./aclx.sh
#!/bin/bash
# To find HBA port WWNs run the following command:
# cat /sys/class/fc_host/host?/port_name
set -x
export SYMCLI_SID=000196700531

symaccess -type storage -name prod_gk create devs 2D:31
symaccess -type storage -name mount_gk create devs 32:36
symaccess -type storage -name backup_sg create devs 1B:22

symaccess -type storage -name prod_redo_sg create devs 13:16
symaccess -type storage -name prod_data_sg create devs 17:1A
symaccess -type storage -name prod_fra_sg create devs 37:3a
symaccess -type storage -name prod_database_sg create sg prod_redo_sg,prod_data_sg

symaccess -type storage -name bkup_redo_sg create devs 1B:1E
symaccess -type storage -name bkup_data_sg create devs 1F:22
symaccess -type storage -name bkup_database_sg create sg bkup_data_sg,bkup_redo_sg
symaccess -type storage -name bkup_fra_sg create devs 3B:3E

symaccess -type storage -name rstr_redo_sg create devs 23:26
symaccess -type storage -name rstr_data_sg create devs 27:2A
symaccess -type storage -name rstr_fra_sg create devs 3F:42

symaccess -type storage -name prod_sg create sg prod_gk,prod_redo_sg,prod_data_sg,prod_fra_sg
symaccess -type storage -name mount_sg create sg mount_gk

symaccess -type initiator -name prod_ig create
symaccess -type initiator -name prod_ig add -wwn 21000024ff3de26e
symaccess -type initiator -name prod_ig add -wwn 21000024ff3de26f
symaccess -type initiator -name prod_ig add -wwn 21000024ff3de19c
symaccess -type initiator -name prod_ig add -wwn 21000024ff3de19d

symaccess -type initiator -name mount_ig create
symaccess -type initiator -name mount_ig add -wwn 21000024ff3de192
symaccess -type initiator -name mount_ig add -wwn 21000024ff3de193
symaccess -type initiator -name mount_ig add -wwn 21000024ff3de19a
symaccess -type initiator -name mount_ig add -wwn 21000024ff3de19b

symaccess -type port -name prod_pg create -dirport 1D:8,2D:8,3D:8,4D:8
symaccess -type port -name mount_pg create -dirport 1D:8,4D:8

symaccess view create -name mgmt_mv -sg mgmt_sg -ig mgmt_ig -pg mgmt_pg
symaccess view create -name prod_database_mv -sg prod_database_sg -ig prod_ig -pg prod_pg
symaccess view create -name prod_fra_mv -sg prod_fra_sg -ig prod_ig -pg prod_pg
symaccess view create -name mount_mv -sg mount_sg -ig mount_ig -pg mount_pg

REFERENCES
• EMC VMAX3 Family with HYPERMAX OS Product Guide
• EMC Symmetrix VMAX using EMC SRDF/TimeFinder and Oracle
• EMC VMAX3 Local Replication
• http://www.emc.com/data-protection/data-domain/index.htm
• http://www.emc.com/data-protection/protectpoint/index.htm