Real Application Clusters
High Availability with Oracle 9i
UNBREAKABLE
Claudia Hüffer, Senior Sales Consultant
Server Technologies Competence Center Nord (STCC Nord)

Agenda
- Introduction
- High availability features of Oracle9i
- Maximum Availability Architecture (MAA)

High Availability is …

Cost of Downtime?
Downtime per year (= 8760 h)

  Availability   Days   Hours   Minutes
  95%            18     6       0
  99%            3      15      36
  99.9%          0      8       46
  99.99%         0      0       53
  99.999%        0      0       5
  99.9999%       0      0       1

“…even 99.9% data availability can cost a company nearly $5m a year” - The Standish Group 2001

Reasons for Unplanned Downtime

  Hardware & System Error   49%
  Human Error               36%
  Computer Viruses          7%
  Software Corruption       4%
  Natural Disasters         3%

- The Disaster Recovery Journal 2001

Causes of Downtime
(diagram: unscheduled and scheduled outages)
- System faults and crashes
- Data center disasters
- Human error
- Data and media failures
- Inadequate system design, testing and process
- Maintenance and continuous operations

Maximum Availability Architecture
(diagram labels: WAN Traffic Manager; Oracle9iAS and RAC at the Primary Site; Oracle9iAS and RAC at the Secondary Site; Data Guard over a Dedicated Network)

Oracle9i: High Availability in General
Oracle9i prevents or minimizes downtime

Unplanned Downtime
- System Failures
  - Automatic Crash Recovery
  - Real Application Clusters
  - RAC Guard
- Data Failures & Disasters
  - Data Guard
  - Recovery Manager
- Human Errors
  - Flashback Query
  - LogMiner

Planned Downtime
- Routine Admin
  - Dynamic Reconfiguration
- Maintenance
  - Online Redefinition

Fast-Start Fault Recovery
- Fast-start makes it simple to control Mean Time To Recover (MTTR)
- Fast-Start Recovery continuously advances the recovery point as the database is updated
  - Recovery time is controlled by the dynamic MTTR target parameter
  - The cost of achieving the recovery time is tracked and reported by the MTTR advisory
- Rollback is done in the background
  - New work begins immediately after roll forward completes
  - New transactions roll back changes to rows they access that are locked by dead transactions
  - Long-running transactions
    have zero effect on recovery time

Minimal I/O Recovery
- Many service level agreements include a limit on the Mean Time To Recover (MTTR).
- The DBA must be able to set a reliable upper bound on recovery time:
  FAST_START_MTTR_TARGET (in seconds)
- Internally, this is translated into corresponding values for the following two init.ora parameters:
  - FAST_START_IO_TARGET
  - LOG_CHECKPOINT_INTERVAL
- LOG_CHECKPOINT_TIMEOUT behaves as before

Real Application Clusters: The Ultimate Parallel Architecture
(diagram labels: Users, Shared Cache, Cache Fusion)
- Scales off-the-shelf applications with no changes
- World's best availability with Fast-Start Fault Recovery
  - When a node crashes, the database keeps running on the remaining nodes
  - Recovery time is independent of workload or database size

Highly Available Database: Real Application Clusters
- Fast failover
  - Protection from local site system failures
  - Faster than a cold cluster failover solution
  - Fast-start fault recovery (instance failure MTTR)
- Availability and accessibility
  - Allows for scheduled outages: add and remove nodes transparently
  - Transparent Application Failover (TAF) provides uninterrupted service

Highly Available Database: Real Application Clusters
- Higher scalability
  - All system resources from all nodes are leveraged
  - Cache Fusion eliminates the need to partition data or modify the application; it is fully application transparent
  - Connection load balancing distributes connection requests from the application tier
- Manageability
  - Provides a single image of the database to manage

Oracle is Oracle is Oracle!
- Real Application Clusters is an option for Oracle9i
- One code base; all Oracle9i functionality
- Identical interfaces
- Identical tools infrastructure
  - Oracle Universal Installer (OUI)
  - Enterprise Manager (EM)
  - Database Configuration Assistant (DBCA)
  - Recovery Manager (RMAN)
- Applications do not need to be modified for RAC
- Oracle9i RAC for all applications and platforms: reliability and scalability even on commodity hardware

Recovery Manager (RMAN)
(diagram labels: Recovery Catalog, Enterprise Manager, Recovery Manager, Disk, Media Management Layer, Network)
- Oracle provides integrated, automated, database-managed backups through RMAN
- RMAN features:
  - Managing the backup, restore, and recovery process
  - Backup at database, tablespace, or datafile level
  - Block-level media recovery
  - Optimizations for improved availability and performance
  - Integration with Oracle Enterprise Manager and several third-party tools

Oracle 9i Data Guard Concept
(diagram: the production database ships redo over the network, synchronously or asynchronously and coordinated by the broker, to two standby databases)
- Physical standby database: redo apply, with or without delay; can be used for backups
- Logical standby database: SQL apply, with or without delay; continuously open for reports

Data Guard Prerequisites
- Same database version (including patch level) on the primary and standby sites
- Same operating system on both sides
  - According to the documentation, different operating system versions are allowed
- Same hardware / operating system software architecture (i.e.
  32/32-bit, 64/64-bit)
- Number of CPUs, memory, and directory structures may differ
- RAC can be combined with RAC or single-instance; with RAC, configuration is manual only

Oracle9i LogMiner
- View of all database changes
- Query the contents of the redo logs with SQL
- GUI (LogMiner Viewer) or command line interface
- Query by value and undo of any change
- Supports DDL, chained rows, primary keys, and direct path
- Correct user errors

Oracle9i LogMiner Viewer
(screenshot)

Flashback Query: A Time Machine for Your Data
(headline graphic: "Oracle Invents Time Machine"; diagram labels: Before, Now)
- Flashback Query allows viewing data as it existed in the past
  - Query at a time of your choosing
  - Use standard SQL for corrections
- Revolutionary advance in recovery
  - Enormously simpler and faster than traditional recovery from backups

Mistake:
    DELETE FROM emp WHERE ename = 'Smith';

Correction (reads the rows as they existed one day ago):
    INSERT INTO emp
      SELECT * FROM emp AS OF TIMESTAMP (SYSTIMESTAMP - INTERVAL '1' DAY)
      WHERE ename = 'Smith';

Oracle Flashback
- System Managed Undo (SMU) mode must be enabled
- Set the undo retention (time in seconds):
    ALTER SYSTEM SET undo_retention = 1800;
  This allows flashback snapshots reaching back up to 30 minutes.
- Sample calculation of the space required:
  - Block size = 8 KB
  - Transaction rate = 20 undo blocks/s
  - undo_retention = 1800
  This gives 1800 * 20 * 8 KB = 288,000 KB, or about 281 MB.

“Before Oracle 9i’s Flashback query, a restore was required to recover lost data.
Now, using the Flashback option, human error can be easily undone.” - Tim Donar, Acxiom

Dynamic Reconfiguration
- Oracle dynamically adjusts to hardware changes
  - Dynamically add and subtract CPUs in an SMP system (proven scalability to a 72-CPU SMP)
  - Dynamically grow and shrink shared memory and buffer cache
  - Dynamically add and remove nodes in a cluster
- Capacity on demand
  - No data movement needed
  - No reboot required

Online Redefinition
- Online schema redefinition
  - Add, modify, drop, and rename table columns
  - Rename constraints
- Online index operations
  - Create, recreate
- Online analyze and validate
- Updates and queries keep running
- Reduce planned downtime!
Online Redefinition
- All indexing operations can be done online
  - Create new index, move index, defragment index
- Tables can be reorganized and redefined online
  - Table contents are copied to a new table
  - Defragments and allows changing location, table type, and partitioning
- Contents can be transformed as they are copied
  - Can change columns, types, and sizes, specified using a SQL SELECT
- GUI interface to make it simple
(diagram labels: Source Table, Continuous Queries & Updates, Copy Table, Update Tracking, Transform, Store Updates, Transform Updates, Result Table)

Oracle Eliminates Much of the Planned Downtime
- 24-hour operation with online backup, without affecting user activity
- Read consistency (locking behavior)
  - Writers never wait for readers
  - Readers never wait for writers
- Event monitoring in Enterprise Manager
  - Proactive detection of potential problem situations
- Transparent takeover of services during maintenance work in a cluster environment (shutdown transactional + TAF)

No Reorg in Oracle9i
- With large data volumes, reorganizations take a lot of time in competing products
- Table reorganizations are generally not necessary under Oracle8i / Oracle9i!
- Space management
  - Unlimited extents
  - Locally managed tablespaces
- Online index rebuild
- Support for data model changes through the Change Management Pack

Oracle9i Partitioning
(diagram: Orders table partitioned as "Orders by Month")
- Earlier months are read-only
- Partitioning by
  - Range
  - List
  - Hash
  - Composite
- Enables extremely fast and efficient rolling window operations

Oracle9i: Partitioning
- Range partitioning
  - Maps data to partitions based on partition key values
- Hash partitioning
  - Maps data to partitions based on a hash algorithm
- List partitioning (DEFAULT partition)
  - Assigns discrete values to a partition
- Composite range-hash partitioning
  - Combines range and hash partitioning
- Composite range-list partitioning
  - Combines range and list partitioning

Oracle9i R2: Composite Range-List Partitioning
- Composite range-list partitioning
  - Range partitions the data for simple rolling window data loads, e.g. by month
  - List sub-partitions, e.g. by region
(diagram labels: JAN, FEB, ..., OCT, NOV, DEC; East, West, North, South, Central)

High Availability with Oracle 9i
Was that "already" everything? No ... !
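As a side note to the composite range-list partitioning slide: the scheme (range by month, then list by region) can be pictured as a two-step routing decision for each row. The following is a minimal Python sketch of that idea only; it is not Oracle's implementation, and the function name `route_row`, the `ORDERS_...` naming pattern, and the region list (translated from the diagram labels) are illustrative assumptions.

```python
from datetime import date
from typing import Tuple

# Regions taken from the slide's diagram labels (translated); illustrative only.
REGIONS = {"EAST", "WEST", "NORTH", "SOUTH", "CENTRAL"}
MONTHS = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"]


def route_row(order_date: date, region: str) -> Tuple[str, str]:
    """Return (range partition, list subpartition) for one row.

    Hypothetical helper: the naming pattern is an assumption for this sketch.
    """
    # Step 1: range partitioning on the date key selects the month partition.
    partition = f"ORDERS_{order_date.year}_{MONTHS[order_date.month - 1]}"
    # Step 2: list partitioning on the discrete region value selects the
    # subpartition inside that month.
    key = region.upper()
    if key not in REGIONS:
        raise ValueError(f"no subpartition for region {region!r}")
    return partition, f"{partition}_{key}"


if __name__ == "__main__":
    print(route_row(date(2002, 11, 5), "west"))
    # -> ('ORDERS_2002_NOV', 'ORDERS_2002_NOV_WEST')
```

In this picture a rolling window load only touches the newest month's partition, which is consistent with the slides' point that earlier months can be kept read-only.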
New Oracle9i High Availability Features
- Data Recovery
  - Block level media recovery
  - Trial recovery
  - Tolerate corrupt redo logs
  - Self-describing backups
  - Policy based automated backup and recovery
  - Stored backup configurations
  - Resumable backup and restore
- Online Operations
  - Unlimited online indexing operations
  - Online table redefinition and reorganization
  - Dynamic buffer cache/shared pool resizing
  - Online ANALYZE VALIDATE
  - Online add and remove CPU
- Self-Service Correction
  - Flashback Query
  - Row level change history
- Miscellaneous
  - Quiesce DB for maintenance
  - Online add column/site for replication groups
  - Offline diagnostics

New Oracle9i High Availability Features
- Fast Fault Recovery
  - Minimal I/O crash recovery
  - Time-based limit on crash recovery
  - Resumable space allocation
- Data Protection
  - Zero data loss standby
  - Logical standby
  - Push-button standby automation
  - Delayed apply standby
  - Network outage tolerance
- Log Analysis
  - LogMiner
  - Query by content of change
  - Near real-time reporting
  - Tolerate corrupt logs
- Cluster Recovery
  - Non-disruptive cluster reconfiguration
  - Disk heartbeat validates network heartbeat
  - Integrated Oracle Parallel Fail Safe
  - Multi-node Fail Safe for Windows 2000

High Availability with Oracle 9i
What else plays a role?
- Security: the better and more finely the DBA can manage and monitor access, the less damage can be done.
- Manageability: the easier administration and monitoring are, the higher the availability.
Oracle9i: High Availability
Oracle9i prevents or minimizes downtime

Unplanned Downtime
- System Failures: Automatic Crash Recovery, Real Application Clusters, RAC Guard
- Data Failures & Disasters: Data Guard, Recovery Manager
- Human Errors: Flashback Query, LogMiner

Planned Downtime
- Routine Admin: Dynamic Reconfiguration
- Maintenance: Online Redefinition

Maximum Availability Architecture
- Best Oracle High Availability Architecture
  - What to use
- Best Practices
  - How to build it
  - How to manage it
  - How to fix it

High Availability Goal
- Design and validate the best, integrated High Availability solution
  - Unbreakable Architecture: handle all outages at all tiers
  - Best Practices: cookbook for prevention, avoidance, mitigation, and recovery; covers configuration, operations, outage solutions, and restoring fault tolerance
  - Complete out-of-the-box high availability: tested and validated solution
- Unbreakable Architecture + Best Practices = Maximum Availability

Maximum Availability Architecture
- Best Oracle High Availability Architecture
  - Blueprint for the database and Oracle9iAS
  - Guidelines for hardware and non-Oracle software, but independent of platform, OS, storage, network, …
  - Evolves with new Oracle versions and features
- Best Practices
  - Configuration and operational
  - Outages and detailed solutions
  - Restoring fault tolerance after an outage

MAA Information Sources
- Oracle Technology Network, High Availability Collateral section
  - Maximum Availability Architecture - Overview
  - Maximum Availability Architecture - The Details
  - http://otn.oracle.com/deploy/availability/techlisting.html
- Oracle Consulting, Advanced Technologies Solutions (ATS) Group
  - http://otn.oracle.com/consulting/9iServices/content.html

Questions & Answers
claudia.hueffer@oracle.com