IBM DB2 Analytics Accelerator
Andreas Peschke
Client Technical Architect zSW
Andreas.Peschke@de.ibm.com
© 2012 IBM Corporation

Disclaimer

© Copyright IBM Corporation 2011. All rights reserved. U.S. Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.

THE INFORMATION CONTAINED IN THIS PRESENTATION IS PROVIDED FOR INFORMATIONAL PURPOSES ONLY. WHILE EFFORTS WERE MADE TO VERIFY THE COMPLETENESS AND ACCURACY OF THE INFORMATION CONTAINED IN THIS PRESENTATION, IT IS PROVIDED “AS IS” WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED. IN ADDITION, THIS INFORMATION IS BASED ON IBM’S CURRENT PRODUCT PLANS AND STRATEGY, WHICH ARE SUBJECT TO CHANGE BY IBM WITHOUT NOTICE. IBM SHALL NOT BE RESPONSIBLE FOR ANY DAMAGES ARISING OUT OF THE USE OF, OR OTHERWISE RELATED TO, THIS PRESENTATION OR ANY OTHER DOCUMENTATION. NOTHING CONTAINED IN THIS PRESENTATION IS INTENDED TO, NOR SHALL HAVE THE EFFECT OF, CREATING ANY WARRANTIES OR REPRESENTATIONS FROM IBM (OR ITS SUPPLIERS OR LICENSORS), OR ALTERING THE TERMS AND CONDITIONS OF ANY AGREEMENT OR LICENSE GOVERNING THE USE OF IBM PRODUCTS AND/OR SOFTWARE.

IBM, the IBM logo, ibm.com, DB2, InfoSphere, Cognos, and InfoSphere Warehouse on System z are trademarks or registered trademarks of International Business Machines Corporation in the United States, other countries, or both. If these and other IBM trademarked terms are marked on their first occurrence in this information with a trademark symbol (® or ™), these symbols indicate U.S. registered or common law trademarks owned by IBM at the time this information was published. Such trademarks may also be registered or common law trademarks in other countries. A current list of IBM trademarks is available on the Web at “Copyright and trademark information” at www.ibm.com/legal/copytrade.shtml

Other company, product, or service names may be trademarks or service marks of others.

Presentation date: 16.04.2012

Would You Still Use Google If It Took 3 Days And 7 People To Get A Search Result?

Patrick Hempeler
• days for a single query
• constant tuning
• specialized resources required
• months to deploy
"Nearly 70% of data warehouses experience performance-constrained issues of various types." - Gartner 2010 Magic Quadrant

Traditional Systems Landscape
Diagram: Applications, OLTP, Staging Area, Operational Data Store, Enterprise DWH and Data Marts, connected by ETL steps.
Historical reasons:
• Different access patterns
  – impact on performance
• EDW as the data integration hub
  – again, impact on performance
• Different life-cycle characteristics
  – and again, impact on performance
• Different Service Level Agreements (SLA)
• Lack of broadly available workload management capabilities
• Choice of lower cost-of-acquisition offerings
Negative ramifications:
• Complexity
  – both in systems management and in applications
• Difficulties in supporting real time analytics
• Inability to match ever more demanding SLA requirements
• High total cost of ownership

Traditional data warehouses are just too complex
They are based on databases optimized for transaction processing, NOT to meet the demands of advanced analytics on big data.
• Too complex infrastructure
• Too inefficient at analytics
• Too complicated to deploy
• Too many people needed to maintain
• Too much tuning required
• Too costly to operate
• Too long to get answers

The right data warehouse is now mission critical.
• Data continues to expand exponentially.
• Analytics are becoming more complex as business demands faster answers.

Visionary Systems Landscape
Diagram: Applications, OLTP, Staging Area, Operational Data Store, Enterprise DWH and Data Marts, connected by ETL steps.
Benefits
• Consolidating all the components into a single system
• Uniform access to any data
• Efficient data movement within the system (ideally, no network)
• Opportunity to remove some of the components
Challenges
• Mixed workload management capabilities
• Universal processing capabilities to deliver best performance for both transactional and analytical workloads
• Providing industry leading availability, security and reliability to all types of workload
System z Data Sharing and Parallel Sysplex technology provides all the needed characteristics except one: special purpose processing for analytical workloads to minimize the need for manual tuning.

GA: November 19, 2010

ISAOpt V1 Needed Following Enhancements
1. Increase applicability by relaxing current off-load restrictions
2. Increase applicability by supporting larger amount of data
3. Support concurrent query execution
4. Improve data currency
5. Support disaster recovery
6. DB2 10 support

GA: November 25, 2011

IBM DB2 Analytics Accelerator
Capitalizing on the best of both worlds: System z and Netezza
What is it?
The IBM DB2 Analytics Accelerator is a workload optimized appliance add-on that enables the integration of business insights into operational processes to drive winning strategies. It accelerates select queries, with unprecedented response times.
How is it different
• Performance: Unprecedented response times to enable 'train of thought' analyses frequently blocked by poor query performance.
• Integration: Connects to DB2 through deep integration, providing transparency to all applications.
• Self-managed workloads: queries are executed in the most efficient way
• Transparency: applications connected to DB2 are entirely unaware of the Optimizer
• Simplified administration: appliance hands-free operations, eliminating many database tuning tasks
Breakthrough Technology Enabling New Opportunities

IDAA Preserves ISAOpt V1 Key Value Propositions
DB2 continues to own data (both OLTP and DW)
• Access to data (authorization, privileges, …)
• Data consistency and integrity (backup, recovery, …)
• Enables extending System z QoS characteristics to BI/DW data as well
Applications access data (both OLTP and DW) only through DB2
• DB2 controls whether to execute query in DB2 mainline or route to ISAO
• DB2 returns results directly to the calling application
• Enables mixed workloads and selection of optimal access path (within DB2 mainline or ISAOpt/IDAA) depending on access pattern
IDAA continues to be implemented as DB2 internal component
• DB2 provides key IDAA status and performance indicators as well as typical administration tasks by standard DB2 interfaces and means
• No direct access (log-on) to IDAA
• Enables operational cost reduction through skills, tools and processes consolidation

Deep DB2 Integration within zEnterprise
Diagram labels:
• Applications: Application Interfaces (standard SQL dialects)
• DBA Tools, z/OS Console, ...: Operational Interfaces (e.g. DB2 Commands)
• DB2 for z/OS: superior availability, reliability, security, workload management ...
• DB2 components: Data Manager, Buffer Manager
• Further DB2 components: IRLM, Log Manager
• z/OS on System z
• IBM DB2 Analytics Accelerator: industry leading DW performance, ease of use
• Netezza 1000 HW

Query Execution Process Flow
Diagram labels:
• Application, Application Interface
• DB2 for z/OS: Optimizer, ISAO interface
• Query execution run-time for queries that cannot be or should not be off-loaded to ISAO
• Heartbeat (IDAA availability and performance indicators)
• Two paths shown: queries executed without IDAA and queries executed with IDAA

IDAA - Bringing Netezza AMPP™ Architecture to DB2
AMPP = Asymmetric Massively Parallel Processing
Diagram labels:
• Workloads: Advanced Analytics, BI, Legacy Reporting, DBA
• DB2 for z/OS (V9, V10)
• IBM DB2 Analytics Accelerator
• Netezza 1000: SMP Host, Network Fabric, S-Blades™ (each with FPGA, CPU and Memory), Disk Enclosures

The Key to the Speed
Example query:
  select DISTRICT, PRODUCTGRP, sum(NRX)
  from SALES
  where MONTH = '20091201' and MARKET = 509123 and SPECIALTY = 'GASTRO'
  group by …
Processing of each (compressed) slice of table SALES:
• FPGA Core: Uncompress
• FPGA Core: Project (select DISTRICT, PRODUCTGRP, sum(NRX))
• FPGA Core: Restrict, Visibility (where MONTH = '20091201' and MARKET = 509123 and SPECIALTY = 'GASTRO')
• CPU Core: complex joins, aggregations, etc. (sum(NRX))

Netezza 1000 basic design aspects
Components shown: Disk Enclosures, SMP Hosts
Disk Enclosures
• Slice of User Data
• Swap and Mirror partitions
• High speed data streaming
• High compression rate
• EXP3000 JBOD Enclosures: 12 x 3.5” 1TB, 7200RPM, SAS (3Gb/s), max 116MB/s (200-500MB/s compressed data)
e.g.
TF12: 8 enclosures → 96 HDDs, 32TB uncompressed user data (→ 128TB)
SMP Hosts
• IDAA Server
• SQL Compiler
• Query Plan
• Optimize
• Administration
• 2 front-end hosts, IBM 3650M3, clustered active-passive
• 2 Nehalem-EP quad-core 2.4GHz per host
Snippet Blades™ (S-Blades, SPUs)
• Processor & streaming DB logic
• High-performance database engine: streaming joins, aggregations, sorts, etc.
• e.g. TF12: 12 back-end SPUs

The Netezza S-Blade™

Disk Mirroring and Failover
Diagram labels: Primary, Mirror, Temp
• All user data and temp space mirrored
• Disk failures transparent to queries and transactions
• Failed drives automatically regenerated
• Bad sectors automatically rewritten or relocated

Connectivity Options
• Multiple DB2 systems can connect to a single IDAA
• A single DB2 system can connect to multiple IDAAs
• Multiple DB2 systems can connect to multiple IDAAs
Benefits:
• Better utilization of IDAA resources
• Scalability
• High availability
Full flexibility for DB2 systems:
• residing in the same LPAR
• residing in different LPARs
• residing in different CECs
• being independent (non-data sharing)
• belonging to the same data sharing group
• belonging to different data sharing groups

Network Configuration Options
Option 1
– Simple direct attachment
– Virtual IP definition both on System z and Netezza
– Only one network link active at a time
– If Netezza fails over to standby host, connection might get lost
Diagram labels: System z with two 10GbE OSA adapters (port 1, port 2 each); IDAA with SMP Host 1 and SMP Host 2 (port 1, port 2 each)
OSA ports are configured such that only one link is active at a time.
If the connection on this link breaks, the other is activated.
Option 2
– Additional redundancy or additional CEC requires switch(es)
– Can address cable failures and Netezza failover to standby host
– For higher availability requirements, a second switch is required
Diagram labels: System z with two 10GbE OSA adapters (port 1, port 2 each); Switch 1 and Switch 2; IDAA with SMP Host 1 and SMP Host 2 (port 1, port 2 each)
OSA ports are configured such that packets are sent alternating, using both ports: “multipathperconnection” configuration

Network Configuration Options (continued)
Option 3
– zBX TOR Switch
– For clients with an installed zBX connected to zEnterprise, the top-of-rack switch may be leveraged to connect to the IDAA
– Connection between zBX and IDAA can be direct or switched
Diagram labels: zEnterprise (OSA-Express3 10 GbE, OSX), Ethernet Switch, Private Service Network, TOR Switch, 10 GbE Private Data Network, zBX, IDAA

Feedback from Beta Customer: Fast Time to Value
IBM DB2 Analytics Accelerator (Netezza 1000-12)
→ Production ready: 1 person, 2 days
Table Acceleration Setup … 2 Hours
– DB2 “Add Accelerator”
– Choose a Table for “Acceleration”
– Load the Table (DB2 copy to Netezza)
– Knowledge Transfer
– Query Comparisons
Initial Load Performance …
→ 5.1 GB in 1 Min 25 Seconds (24M rows)
→ 400 GB in 29 Min (570M rows)
Actual Query Acceleration … 1908x faster
→ 2 Hours 39 Minutes to 5 Seconds
CPU Utilization Reduction … 99% less CPU
→ 24M rows: 56.5 CPU seconds to 0.4 CPU seconds
Actual customer results, October 2011

Large Insurance Company
Adding value by Accelerating the Delivery of Business Reporting

Query     Total Rows    Total Qualifying  Total Rows  DB2 Only  DB2 Only  DB2 with IDAA  DB2 with IDAA  Times
          Reviewed      Rows              Returned    Hours     Sec(s)    Hours          Sec(s)         Faster
Query 1   591,941,065   2,813,571         853,320     2:39      9,540     0.0            5              1,908
Query 2   591,941,065   2,813,571         585,780     2:16      8,220     0.0            5              1,644
Query 3   813,343,052   8,260,214         274         1:16      4,560     0.0            6              760
Query 4   283,105,125   2,813,571         601,197     1:08      4,080     0.0            5              816
Query 5   591,941,089   3,422,765         508         0:57      4,080     0.0            70             58
Query 6   813,343,052   4,290,648         165         0:53      3,180     0.0            6              530
Query 7   591,941,065   361,521           58,236      0:51      3,120     0.0            4              780
Query 8   813,343,052   3,425,292         724         0:44      2,640     0.0            2              1,320
Query 9   813,343,052   4,130,107         137         0:42      2,520     0.1            193            13

With Accelerated Time to Value
IBM DB2 Analytics Accelerator (Netezza 1000-12)
Production ready: 1 person, 2 days
Table Acceleration Setup in 2 Hours
- DB2 “Add Accelerator”
- Choose a Table for “Acceleration”
- Load the Table (DB2 Loads Data to the Accelerator)
- Knowledge Transfer
- Query Comparisons
Customer Quote: “we had this up and running in days with queries that ran over 1000 times faster”
Initial Load Performance
• 400 GB Loaded in 29 Minutes
• 570 Million Rows
• (Actual: Loaded 800 GB to 1.3 TB per hour)
Extreme Query Acceleration - 1908x faster
• 2 Hours 39 minutes to 5 Seconds
CPU Utilization Reduction
• Up to 35%

ISAO V1 Needed Following Enhancements
IDAA addresses all of them!
1. Increase applicability by relaxing current off-load restrictions
   See further documentation for details
2. Increase applicability by supporting larger amount of data
   Up to 32TB of uncompressed data, e.g. with 1:4 compression ratio, up to 128TB of user data
3. Support concurrent query execution
   Exploiting Netezza workload management capabilities
4. Improve data currency
   Partition-scope update
5. Support disaster recovery
   Building blocks provided
6. DB2 10 support
   IDAA supports both DB2 9 and DB2 10*
Requires zEnterprise (z196 or z114)
*DB2 10 support is planned for early 2012

You can also benefit!
• Are there long running queries that could provide business value if they could be run in seconds vs 30 minutes or more?
• What about performance challenges with complex and ad hoc queries?
• Is modernization of the data warehouse or Operational Business Analytics a topic of interest?
• Are there thoughts about extending the use of operational platform data to perform business analysis and daily reporting?
• Is there a data warehouse running on System z or an intention to do so?
• Is an OLAP application running out of steam due to growth in data? These usually cover a single subject area such as Accounting, Sales or Inventory.
• The forgotten query: Have queries been set aside due to performance challenges?
• Is there a System z196 or z114, or a plan for one in the near future?
If you answer one or more of these questions with 'yes', please give me another 3 minutes to show you how we can help you analyze your acceleration potential.

Quick Workload Test
Report for a first assessment:
• Acceleration potential for queries
• Estimated time
• CP cost
Customer
• Collecting information from dynamic statement cache, supported by step-by-step instruction and REXX script (small effort for customer)
• Uploading compressed file (up to some MB) to IBM FTP server
IBM / Center of Excellence
• Importing data into local database
• Quick analysis based on known DB2 Analytics Accelerator capabilities
Diagram flow: Customer Database → Documentation and REXX procedure → Data package (mainly unload data sets) → Pre-process and load → IBM lab Database → Quick Workload Test Tool → Report / Assessment
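
To give a feel for what a first-assessment summary of acceleration potential might contain, the sketch below recomputes per-query speedups and a workload total. The elapsed times are the DB2-only and DB2-with-IDAA seconds from the insurance-company table earlier in the deck. The function names and the summary layout are illustrative assumptions; this is not how the Quick Workload Test Tool itself works.

```python
# Elapsed times in seconds, copied from the insurance-company table
# (Query 1 .. Query 9). Everything else here is a hypothetical sketch.
DB2_ONLY_SEC = [9540, 8220, 4560, 4080, 4080, 3180, 3120, 2640, 2520]
WITH_IDAA_SEC = [5, 5, 6, 5, 70, 6, 4, 2, 193]


def speedups(before, after):
    """Return the rounded 'times faster' factor for each query."""
    return [round(b / a) for b, a in zip(before, after)]


def summarize(before, after):
    """Aggregate the workload: total elapsed time before and after."""
    total_before = sum(before)
    total_after = sum(after)
    return {
        "total_before_sec": total_before,
        "total_after_sec": total_after,
        "overall_speedup": total_before / total_after,
    }


for number, factor in enumerate(speedups(DB2_ONLY_SEC, WITH_IDAA_SEC), start=1):
    print(f"Query {number}: {factor}x faster")

summary = summarize(DB2_ONLY_SEC, WITH_IDAA_SEC)
print(f"Workload: {summary['total_before_sec']} s -> {summary['total_after_sec']} s")
```

The per-query factors are simply DB2-only seconds divided by accelerated seconds, which is how the "Times Faster" column of the table can be reproduced.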

Workload Assessment Output: Sample PDF Report
• Summary based on queries, query blocks, elapsed time and CPU time
• Reasons why certain query blocks might not run on Smart Analytics Optimizer
• How much of the current elapsed time may run on Smart Analytics Optimizer
• Detailed query-level assessment of the workload
• Elapsed time per query
• SQL statement per query

On our own behalf...
2 Events, 1 Price
Together with the experts of the IBM Technical Support Organisation, our German-speaking IBM System z specialists have selected a comprehensive range of topics covering current innovations and integrated software and hardware solutions around the legendary mainframe. We look forward to presenting information in XXL as well as red-hot, first-hand knowledge. In a total of 10 tracks you will learn how to make the most of this powerful platform's potential for your company. Well-known IBM speakers will show you how the latest developments in IBM mainframe technology can help drive innovation, save costs, and protect your existing investments, business-critical data and processes across the enterprise.
And all this at a sensational price: until 15 March 2012 you benefit from the exclusive early-bird discount of €199 for both days, so it is best to register right here!
Don't forget! You have the unique opportunity to attend two events for one price: the IBM Information Management Forum 2012 on the topic "Manage Big Data" takes place at the same time.
http://www.ibm.com/de/events/tsez/

Disaster Recovery Considerations (1 of 2)
Diagram labels:
• SYSPLEX with DSG Member 1 and DSG Member 2
• App 1 to App 5 and the tables of App 1 to App 5
• IDAA Instance 1 and IDAA Instance 2, each attached through a Switch
• Short Range and Long Range connections

Disaster Recovery Considerations (2 of 2)
Diagram labels:
• SYSPLEX with DSG Member 1 and DSG Member 2
• App 1 to App 5 and the tables of App 1 to App 5
• IDAA Instance 1 and IDAA Instance 2, each attached through a Switch
• Short Range and Long Range connections
• CREATE/LOAD
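
As a cross-check of the sizing and load figures quoted earlier in the deck (TF12 with 8 enclosures of 12 disks, 32TB uncompressed at a 1:4 compression ratio, and 400 GB loaded in 29 minutes), the arithmetic can be sketched as follows. The helper names are hypothetical and the calculation assumes a uniform compression ratio and a constant load rate, which real workloads will not match exactly.

```python
def disk_count(enclosures: int, disks_per_enclosure: int) -> int:
    """Number of drives in the disk enclosures."""
    return enclosures * disks_per_enclosure


def user_data_tb(uncompressed_tb: float, compression_ratio: float) -> float:
    """User data that fits, assuming a uniform compression ratio (e.g. 4 for 1:4)."""
    return uncompressed_tb * compression_ratio


def load_rate_gb_per_hour(gigabytes: float, minutes: float) -> float:
    """Average load throughput, assuming a constant rate over the whole load."""
    return gigabytes / minutes * 60


print(disk_count(8, 12))                 # drives in a TF12
print(user_data_tb(32, 4))               # TB of user data at 1:4 compression
print(load_rate_gb_per_hour(400, 29))    # GB per hour for the 400 GB load
```

The 400 GB load works out to a little over 800 GB per hour, which is consistent with the "800 GB to 1.3 TB per hour" range stated on the customer slide.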