IBM DB2 Analytics Accelerator

Andreas Peschke
Client Technical Architect zSW
Andreas.Peschke@de.ibm.com
© 2012 IBM Corporation
Disclaimer
© Copyright IBM Corporation 2011. All rights reserved.
U.S. Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule
Contract with IBM Corp.
THE INFORMATION CONTAINED IN THIS PRESENTATION IS PROVIDED FOR INFORMATIONAL PURPOSES
ONLY. WHILE EFFORTS WERE MADE TO VERIFY THE COMPLETENESS AND ACCURACY OF THE
INFORMATION CONTAINED IN THIS PRESENTATION, IT IS PROVIDED “AS IS” WITHOUT WARRANTY OF
ANY KIND, EXPRESS OR IMPLIED. IN ADDITION, THIS INFORMATION IS BASED ON IBM’S CURRENT
PRODUCT PLANS AND STRATEGY, WHICH ARE SUBJECT TO CHANGE BY IBM WITHOUT NOTICE. IBM
SHALL NOT BE RESPONSIBLE FOR ANY DAMAGES ARISING OUT OF THE USE OF, OR OTHERWISE
RELATED TO, THIS PRESENTATION OR ANY OTHER DOCUMENTATION. NOTHING CONTAINED IN THIS
PRESENTATION IS INTENDED TO, NOR SHALL HAVE THE EFFECT OF, CREATING ANY WARRANTIES OR
REPRESENTATIONS FROM IBM (OR ITS SUPPLIERS OR LICENSORS), OR ALTERING THE TERMS AND
CONDITIONS OF ANY AGREEMENT OR LICENSE GOVERNING THE USE OF IBM PRODUCTS AND/OR
SOFTWARE.
IBM, the IBM logo, ibm.com, DB2, InfoSphere, Cognos, and InfoSphere Warehouse on System z are trademarks or
registered trademarks of International Business Machines Corporation in the United States, other countries, or both. If these
and other IBM trademarked terms are marked on their first occurrence in this information with a trademark symbol (® or ™),
these symbols indicate U.S. registered or common law trademarks owned by IBM at the time this information was published.
Such trademarks may also be registered or common law trademarks in other countries. A current list of IBM trademarks is
available on the Web at “Copyright and trademark information” at www.ibm.com/legal/copytrade.shtml
Other company, product, or service names may be trademarks or service marks of others.
16.04.2012
Would You Still Use Google
If It Took 3 Days And 7 People
To Get A Search Result?
IBM DB2 Analytics Accelerator · Patrick Hempeler
days for a single query
constant tuning
Nearly 70% of data warehouses experience
performance-constrained issues of various types.
- Gartner 2010 Magic Quadrant
specialized resources required
months to deploy
Traditional Systems Landscape
Diagram: Applications, OLTP, Staging Area, Operational Data Store, Enterprise DWH and Data Marts, connected by ETL steps.
Historical reasons:
•Different access patterns
•EDW as the data integration hub
•Different life-cycle characteristics
•Different Service Level Agreements (SLA)
•Lack of broadly available workload management capabilities
•Choice of lower cost-of-acquisition offerings
Negative ramifications:
•Complexity, both in systems management and in applications
•Impact on performance, again and again
•Difficulties in supporting real time analytics
•Inability to match ever more demanding SLA requirements
•High total cost of ownership
Traditional data warehouses
are just too complex
They are based on databases optimized for transaction processing—
NOT to meet the demands of advanced analytics on big data.
• Too complex infrastructure
• Too inefficient at analytics
• Too complicated to deploy
• Too many people needed to maintain
• Too much tuning required
• Too costly to operate
Too long to get answers
The right data warehouse
is now mission critical.
Data continues to
expand exponentially.
Analytics are becoming more complex as
business demands faster answers.
Visionary Systems Landscape
Diagram: Applications, OLTP, Staging Area, Operational Data Store, Enterprise DWH and Data Marts, connected by ETL steps.
Benefits
• Consolidating all the components into a single system
• Efficient data movement within the system (ideally, no network)
• Opportunity to remove some of the components
Challenges
• Mixed workload management capabilities
• Uniform access to any data
• Universal processing capabilities to deliver best performance for both transactional and analytical workloads
• Providing industry leading availability, security and reliability to all types of workload
System z Data Sharing and Parallel Sysplex technology provides all the needed characteristics except one:
Special purpose processing for analytical workloads to minimize the need for manual tuning
GA: November 19, 2010
ISAOpt V1 Needed the Following Enhancements
1. Increase applicability by relaxing current off-load restrictions
2. Increase applicability by supporting larger amounts of data
3. Support concurrent query execution
4. Improve data currency
5. Support disaster recovery
6. DB2 10 support
GA: November 25, 2011
IBM DB2 Analytics Accelerator
Capitalizing on the best of both worlds – System z and Netezza
What is it?
The IBM DB2 Analytics Accelerator is a workload-optimized appliance add-on that enables the integration of business insights into operational processes to drive winning strategies. It accelerates select queries, with unprecedented response times.
How is it different?
• Performance: Unprecedented response times to enable 'train of thought' analyses frequently blocked by poor query performance.
• Integration: Connects to DB2 through deep integration, providing transparency to all applications.
• Self-managed workloads: Queries are executed in the most efficient way.
• Transparency: Applications connected to DB2 are entirely unaware of the Optimizer.
• Simplified administration: Appliance hands-free operations, eliminating many database tuning tasks.
Breakthrough Technology Enabling New Opportunities
IDAA Preserves ISAOpt V1 Key Value Propositions
• DB2 continues to own data (both OLTP and DW)
  – Access to data (authorization, privileges, …)
  – Data consistency and integrity (backup, recovery, …)
  – Enables extending System z QoS characteristics to BI/DW data as well
• Applications access data (both OLTP and DW) only through DB2
  – DB2 controls whether to execute a query in DB2 mainline or route it to ISAO
  – DB2 returns results directly to the calling application
  – Enables mixed workloads and selection of the optimal access path (within DB2 mainline or ISAOpt/IDAA) depending on access pattern
• IDAA continues to be implemented as a DB2 internal component
  – DB2 provides key IDAA status and performance indicators as well as typical administration tasks through standard DB2 interfaces and means
  – No direct access (log-on) to IDAA
  – Enables operational cost reduction through consolidation of skills, tools and processes
Deep DB2 Integration within zEnterprise
Diagram:
•Applications use the Application Interfaces (standard SQL dialects)
•DBA Tools, z/OS Console, ... use the Operational Interfaces (e.g. DB2 Commands)
•DB2 for z/OS: Data Manager, Buffer Manager, IRLM, Log Manager, ..., IBM DB2 Analytics Accelerator
•z/OS on System z: Superior availability, reliability, security, workload management
•Netezza 1000 HW: Industry leading DW performance, ease of use
Query Execution Process Flow
Diagram: an Application calls DB2 for z/OS through the Application Interface. Components shown inside DB2 for z/OS:
•Optimizer
•Query execution run-time for queries that cannot be or should not be off-loaded to ISAO
•ISAO interface
•Heartbeat (IDAA availability and performance indicators)
Legend: Queries executed without IDAA; Queries executed with IDAA
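The routing decision on this slide can be illustrated with a small sketch. It is not DB2's actual logic: the `Accelerator` and `Query` classes, their flags and the `route` function are hypothetical names introduced here. The sketch assumes only what the slides state, namely that DB2 decides per query whether to run it in DB2 mainline or off-load it, and that a heartbeat reports accelerator availability.

```python
from dataclasses import dataclass


@dataclass
class Accelerator:
    """Hypothetical stand-in for the state DB2 learns through the heartbeat."""
    available: bool


@dataclass
class Query:
    """Hypothetical query descriptor; both flags are illustrative assumptions."""
    sql: str
    can_offload: bool     # the query uses only constructs the accelerator supports
    should_offload: bool  # the optimizer expects the accelerator to be the better path


def route(query: Query, accel: Accelerator) -> str:
    """Return where the query runs: 'IDAA' or 'DB2 mainline'."""
    if accel.available and query.can_offload and query.should_offload:
        return "IDAA"
    # Queries that cannot be or should not be off-loaded stay in DB2.
    return "DB2 mainline"


if __name__ == "__main__":
    q = Query("select DISTRICT, sum(NRX) from SALES group by DISTRICT", True, True)
    print(route(q, Accelerator(available=True)))   # accelerator path
    print(route(q, Accelerator(available=False)))  # falls back to DB2
```

In either case the application sees a single DB2 interface, which is the transparency point the slides make.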
IDAA - Bringing Netezza AMPP™ Architecture to DB2
AMPP = Asymmetric Massively Parallel Processing
Diagram:
•Workloads: Advanced Analytics, Legacy Reporting, BI, DBA
•DB2 for z/OS (V9, V10) with IBM DB2 Analytics Accelerator
•Netezza 1000: SMP Host, Network Fabric, S-Blades™ (each with CPU, FPGA and Memory), Disk Enclosures
The Key to the Speed
Query:
select DISTRICT, PRODUCTGRP, sum(NRX)
from SALES
where MONTH = '20091201'
and MARKET = 509123
and SPECIALTY = 'GASTRO'
group by …
Processing of each slice of table SALES (compressed):
•FPGA Core: Uncompress
•FPGA Core: Project (select DISTRICT, PRODUCTGRP, sum(NRX))
•FPGA Core: Restrict, Visibility (where MONTH = '20091201' and MARKET = 509123 and SPECIALTY = 'GASTRO')
•CPU Core: Complex Joins, Aggs, etc. (sum(NRX))
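The stage order on this slide (uncompress, project, restrict, then aggregate) can be mimicked in a few lines. This is only a software sketch of the data flow, not of the FPGA or the Netezza engine: the sample rows, the extra `REGION` column, the use of `zlib` and JSON as a stand-in for the on-disk compression, and both function names are assumptions. The slide elides the `group by` clause; grouping by `DISTRICT` and `PRODUCTGRP` is assumed here because those are the non-aggregated columns in the select list. The visibility check is omitted.

```python
import json
import zlib
from collections import defaultdict

# Hypothetical rows standing in for one slice of table SALES.
ROWS = [
    {"DISTRICT": "N", "PRODUCTGRP": "A", "NRX": 10, "MONTH": "20091201",
     "MARKET": 509123, "SPECIALTY": "GASTRO", "REGION": "X"},
    {"DISTRICT": "N", "PRODUCTGRP": "A", "NRX": 5, "MONTH": "20091201",
     "MARKET": 509123, "SPECIALTY": "GASTRO", "REGION": "Y"},
    {"DISTRICT": "S", "PRODUCTGRP": "B", "NRX": 7, "MONTH": "20091201",
     "MARKET": 509123, "SPECIALTY": "GASTRO", "REGION": "X"},
    {"DISTRICT": "N", "PRODUCTGRP": "A", "NRX": 100, "MONTH": "20091101",
     "MARKET": 509123, "SPECIALTY": "GASTRO", "REGION": "X"},
    {"DISTRICT": "S", "PRODUCTGRP": "B", "NRX": 3, "MONTH": "20091201",
     "MARKET": 509123, "SPECIALTY": "CARDIO", "REGION": "Y"},
]

NEEDED = ("DISTRICT", "PRODUCTGRP", "NRX", "MONTH", "MARKET", "SPECIALTY")


def compress_slice(rows):
    """Store the slice compressed, as the slide shows for data on disk."""
    return zlib.compress(json.dumps(rows).encode("utf-8"))


def fpga_stage(compressed_slice):
    """Uncompress, project and restrict, in the order shown on the slide."""
    rows = json.loads(zlib.decompress(compressed_slice))       # Uncompress
    for row in rows:
        projected = {col: row[col] for col in NEEDED}           # Project
        if (projected["MONTH"] == "20091201"
                and projected["MARKET"] == 509123
                and projected["SPECIALTY"] == "GASTRO"):        # Restrict
            yield projected


def cpu_stage(filtered_rows):
    """Aggregate sum(NRX) per (DISTRICT, PRODUCTGRP); the grouping is assumed."""
    totals = defaultdict(int)
    for row in filtered_rows:
        totals[(row["DISTRICT"], row["PRODUCTGRP"])] += row["NRX"]
    return dict(totals)


result = cpu_stage(fpga_stage(compress_slice(ROWS)))
print(result)
```

The point of the ordering is that only the projected, qualifying rows ever reach the aggregation step, so the later stage handles far less data than the scan reads.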
Netezza 1000 basic design aspects
Disk Enclosures
•Slice of User Data
•Swap and Mirror partitions
•High speed data streaming
•High compression rate
EXP3000 JBOD Enclosures
12 x 3.5” 1TB, 7200RPM, SAS (3Gb/s)
max 116MB/s (200-500MB/s compressed data)
e.g. TF12: 8 enclosures → 96 HDDs
32TB uncompressed user data (→ 128TB)
SMP Hosts
•IDAA Server
•SQL Compiler
•Query Plan
•Optimize, Administration
2 front-end hosts, IBM 3650M3, clustered active-passive
2 Nehalem-EP Quad-core 2.4GHz per host
Snippet Blades™ (S-Blades, SPUs)
•Processor & streaming DB logic
•High-performance database engine: streaming joins, aggregations, sorts, etc.
e.g. TF12: 12 back-end SPUs
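The TF12 figures on this slide can be checked with simple arithmetic. The sketch below uses only numbers from the deck; the 1:4 compression ratio is the example ratio the deck quotes elsewhere for the 32TB to 128TB step. The aggregate scan rate is a theoretical upper bound assuming every drive streams at its maximum at the same time, and the variable names are illustrative.

```python
ENCLOSURES = 8             # TF12 example
DISKS_PER_ENCLOSURE = 12   # 12 x 3.5" drives per EXP3000 enclosure
DISK_TB = 1                # 1 TB per drive
MAX_DISK_MB_S = 116        # maximum streaming rate per drive
USER_DATA_TB = 32          # uncompressed user data quoted for the TF12
COMPRESSION_RATIO = 4      # 1:4 example ratio quoted in the deck

disks = ENCLOSURES * DISKS_PER_ENCLOSURE             # number of HDDs
raw_tb = disks * DISK_TB                             # raw disk capacity
effective_user_tb = USER_DATA_TB * COMPRESSION_RATIO # user data after compression
aggregate_scan_mb_s = disks * MAX_DISK_MB_S          # upper bound, all drives streaming
user_share = USER_DATA_TB / raw_tb                   # fraction of raw space for user data

print(f"{disks} HDDs, {raw_tb} TB raw")
print(f"{effective_user_tb} TB of user data at 1:{COMPRESSION_RATIO} compression")
print(f"up to {aggregate_scan_mb_s} MB/s aggregate raw scan rate")
print(f"user data uses {user_share:.0%} of raw capacity")
```

User data occupying about a third of the raw capacity is consistent with each drive also carrying mirror and swap partitions, although the slide does not state the exact split.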
The Netezza S-Blade™
Disk Mirroring and Failover
Disk partitions: Primary, Mirror, Temp
• All user data and temp space mirrored
• Disk failures transparent to queries and transactions
• Failed drives automatically regenerated
• Bad sectors automatically rewritten or relocated
Connectivity Options
Multiple DB2 systems can connect to a single IDAA
A single DB2 system can connect to multiple IDAAs
Multiple DB2 systems can connect to multiple IDAAs
Full flexibility for DB2 systems:
• residing in the same LPAR
• residing in different LPARs
• residing in different CECs
• being independent (non-data sharing)
• belonging to the same data sharing group
• belonging to different data sharing groups
Better utilization of IDAA resources
Scalability
High availability
Network Configuration Options
Option 1 – Simple Direct attachment
– Virtual IP definition both on System z and Netezza
– Only one network link active at a time – if Netezza fails over to standby host, connection might get lost
Diagram: System z (10GbE OSA, port 1 and port 2) attached directly to the IDAA (SMP Host 1 and SMP Host 2, port 1 and port 2 each).
OSA ports are configured such that only one link is active at a time. If the connection on this link breaks, the other is activated.
Option 2 – Additional redundancy or additional CEC requires Switch(es)
– Can address cable failures and Netezza fail over to standby host
– For higher availability requirements, a second switch is required
Diagram: System z (10GbE OSA, port 1 and port 2) attached through Switch 1 and Switch 2 to the IDAA (SMP Host 1 and SMP Host 2, port 1 and port 2 each).
OSA ports are configured such that packets are sent alternating, using both ports: “multipathperconnection” configuration
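The two port behaviours described for Option 1 and Option 2 differ only in how the next link is chosen. The sketch below contrasts them in plain Python; it does not model OSA adapters or any TCP/IP configuration, and the class names, the `send` method and the `broken` parameter are hypothetical.

```python
from itertools import cycle


class ActiveStandbyLinks:
    """Option 1 behaviour: one link is active; switch only when it breaks."""

    def __init__(self, ports):
        self.ports = list(ports)
        self.active = 0  # index of the currently active port

    def send(self, packet, broken=()):
        """Return the port used for this packet, failing over if needed."""
        if self.ports[self.active] in broken:
            working = [i for i, p in enumerate(self.ports) if p not in broken]
            if not working:
                raise ConnectionError("no working link")
            self.active = working[0]  # activate the other link
        return self.ports[self.active]


class AlternatingLinks:
    """Option 2 behaviour: traffic alternates over both ports."""

    def __init__(self, ports):
        self._next_port = cycle(list(ports))

    def send(self, packet):
        """Return the port used for this packet, alternating each call."""
        return next(self._next_port)


if __name__ == "__main__":
    standby = ActiveStandbyLinks(["port 1", "port 2"])
    print(standby.send("a"))                      # port 1
    print(standby.send("b", broken={"port 1"}))   # port 2 after the link breaks
    alternating = AlternatingLinks(["port 1", "port 2"])
    print([alternating.send(n) for n in range(4)])
```

In the first policy the second port carries no traffic until a failure occurs; in the second both ports share the load, which is why Option 2 pairs it with switches.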
Network Configuration Options (continued)
Option 3 – zBX TOR Switch
– For clients with an installed zBX connected to zEnterprise, the top-of-rack switch may be leveraged to connect to the IDAA
– Connection between zBX and IDAA can be direct or switched
Diagram components: zEnterprise (OSA-Express3 10 GbE, OSX), zBX with TOR Switch, Ethernet Switch, IDAA, Private Service Network, Private Data Network (10 GbE).
Feedback from Beta Customer: Fast Time to Value
• IBM DB2 Analytics Accelerator (Netezza 1000-12)
  → Production ready - 1 person, 2 days
• Table Acceleration Setup … 2 Hours
  – DB2 “Add Accelerator”
  – Choose a Table for “Acceleration”
  – Load the Table (DB2 copy to Netezza)
  – Knowledge Transfer
  – Query Comparisons
• Initial Load Performance …
  → 5.1 GB in 1 Min 25 Seconds (24M rows)
  → 400 GB in 29 Min (570M rows)
• Actual Query Acceleration … 1908x faster
  → 2 Hours 39 Minutes to 5 Seconds
• CPU Utilization Reduction … 99% less CPU
  → 24M rows: 56.5 CPU seconds to 0.4 CPU seconds
Actual customer results, October 2011
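The load and CPU figures above translate into hourly rates and a percentage with a little arithmetic. The sketch uses only the numbers reported on this slide; the helper `gb_per_hour` and the variable names are illustrative.

```python
def gb_per_hour(gigabytes: float, seconds: float) -> float:
    """Convert a load of `gigabytes` completed in `seconds` to GB per hour."""
    return gigabytes / seconds * 3600


# 5.1 GB in 1 min 25 s (24M rows)
small_load_gb_h = gb_per_hour(5.1, 1 * 60 + 25)
# 400 GB in 29 min (570M rows)
large_load_gb_h = gb_per_hour(400, 29 * 60)
# 56.5 CPU seconds reduced to 0.4 CPU seconds
cpu_reduction_pct = (56.5 - 0.4) / 56.5 * 100

print(f"small load: {small_load_gb_h:.0f} GB/h")
print(f"large load: {large_load_gb_h:.0f} GB/h")
print(f"CPU reduction: {cpu_reduction_pct:.1f}%")
```

The larger load runs at a much higher hourly rate than the small one, which suggests that fixed start-up cost dominates short loads, and the CPU figure is in line with the "99% less CPU" headline.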
Large Insurance Company
Adding value by Accelerating the Delivery of Business Reporting
Query    Total Rows    Total Qualifying  Total Rows  DB2 Only  DB2 Only  DB2 with IDAA  DB2 with IDAA  Times
         Reviewed      Rows              Returned    Hours     Sec(s)    Hours          Sec(s)         Faster
Query 1  591,941,065   2,813,571         853,320     2:39      9,540     0.0            5              1,908
Query 2  591,941,065   2,813,571         585,780     2:16      8,220     0.0            5              1,644
Query 3  813,343,052   8,260,214         274         1:16      4,560     0.0            6              760
Query 4  283,105,125   2,813,571         601,197     1:08      4,080     0.0            5              816
Query 5  591,941,089   3,422,765         508         0:57      4,080     0.0            70             58
Query 6  813,343,052   4,290,648         165         0:53      3,180     0.0            6              530
Query 7  591,941,065   361,521           58,236      0:51      3,120     0.0            4              780
Query 8  813,343,052   3,425,292         724         0:44      2,640     0.0            2              1,320
Query 9  813,343,052   4,130,107         137         0:42      2,520     0.1            193            13
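The "Times Faster" column is the DB2-only elapsed time divided by the elapsed time with IDAA. The sketch below recomputes it from the seconds columns of the slide; it assumes the accelerated times pair with the queries in the order listed, and the names `TIMINGS` and `speedups` are illustrative.

```python
# (query, DB2-only seconds, DB2-with-IDAA seconds) as listed on the slide
TIMINGS = [
    ("Query 1", 9540, 5),
    ("Query 2", 8220, 5),
    ("Query 3", 4560, 6),
    ("Query 4", 4080, 5),
    ("Query 5", 4080, 70),
    ("Query 6", 3180, 6),
    ("Query 7", 3120, 4),
    ("Query 8", 2640, 2),
    ("Query 9", 2520, 193),
]

# Speed-up factor per query, rounded to a whole number as on the slide.
speedups = [round(db2_secs / idaa_secs) for _, db2_secs, idaa_secs in TIMINGS]

for (name, db2_secs, idaa_secs), factor in zip(TIMINGS, speedups):
    print(f"{name}: {db2_secs} s -> {idaa_secs} s, about {factor}x faster")
```

The factors range from about 13x to about 1,908x, so the headline figure describes the best case rather than a typical query.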
With Accelerated Time to Value
• IBM DB2 Analytics Accelerator (Netezza 1000-12)
  Production ready - 1 person, 2 days
• Table Acceleration Setup in 2 Hours
  - DB2 “Add Accelerator”
  - Choose a Table for “Acceleration”
  - Load the Table (DB2 Loads Data to the Accelerator)
  - Knowledge Transfer
  - Query Comparisons
• Initial Load Performance
  400 GB Loaded in 29 Minutes
  570 Million Rows (Actual: Loaded 800 GB to 1.3 TB per hour)
• Extreme Query Acceleration - 1908x faster
  2 Hours 39 minutes to 5 Seconds
• CPU Utilization Reduction
  Up to 35%
Customer Quote: “we had this up and running in days with queries that ran over 1000 times faster”
ISAO V1 Needed the Following Enhancements
IDAA addresses all of them!
1. Increase applicability by relaxing current off-load restrictions
   See further documentation for details
2. Increase applicability by supporting larger amounts of data
   Up to 32TB of uncompressed data, e.g. with 1:4 compression ratio, up to 128TB of user data
3. Support concurrent query execution
   Exploiting Netezza workload management capabilities
4. Improve data currency
   Partition-scope update
5. Support disaster recovery
   Building blocks provided
6. DB2 10 support
   IDAA supports both DB2 9 and DB2 10*
Requires zEnterprise (z196 or z114)
*DB2 10 support is planned for early 2012
You can also benefit!
If you answer one or more of the following questions with ‘yes’, please give me another 3 minutes to show you how we can help you analyze your acceleration potential.
• Are there long running queries that could provide business value if they could be run in seconds vs 30 minutes or more?
• What about performance challenges with complex and ad hoc queries?
• Is modernization of the data warehouse or Operational Business Analytics a topic of interest?
• Are there thoughts about extending the use of operational platform data to perform business analysis and daily reporting?
• Is there a data warehouse running on System z or an intention to do so?
• Is an OLAP application running out of steam due to growth in data? These are usually a single subject area such as Accounting, Sales or Inventory.
• The forgotten query: Have queries been set aside due to performance challenges?
• Is there a System z196 or z114 or a plan for one in the near future?
Quick Workload Test
Report for a first assessment:
• Acceleration potential for
  – Queries
  – Estimated time
  – CP cost
Customer
• Collecting information from dynamic statement cache, supported by step-by-step instruction and REXX script (small effort for customer)
• Uploading compressed file (up to some MB) to IBM FTP server
IBM / Center of Excellence
• Importing data into local database
• Quick analysis based on known DB2 Analytics Accelerator capabilities
Diagram components: Customer Database; Documentation and REXX procedure; Data package (mainly unload data sets); Pre-process and load; IBM lab Database; Quick Workload Test Tool; Report; Assessment.
Workload Assessment Output: Sample PDF Report
Callouts on the sample report:
• Summary based on queries, query blocks, elapsed time and CPU time
• Reasons why certain query blocks might not run on Smart Analytics Optimizer
• How much of the current elapsed time may run on Smart Analytics Optimizer
• Detailed query-level assessment of the workload
• Elapsed time per query
• SQL statement per query
A word on our own behalf...
2 Events, 1 Price
Together with the experts of the IBM Technical Support Organisation, our German-speaking IBM System z specialists have selected a comprehensive range of topics covering current innovations and integrated software and hardware solutions around the legendary mainframe.
We look forward to presenting information in XXL as well as red-hot, first-hand knowledge. In a total of 10 tracks you will learn how to make the most of this powerful platform's potential for your company. Renowned IBM speakers will show you how the latest developments in IBM mainframe technology can help drive innovation, save costs, and protect your existing investments, business-critical data and processes across the enterprise.
And all this at a sensational price: benefit from the exclusive early-bird discount of €199 for both days until 15 March 2012, and ideally register right here!
Don't forget! You have the unique opportunity to attend two events for one price: the IBM Information Management Forum 2012 on the topic “Manage Big Data” takes place at the same time.
http://www.ibm.com/de/events/tsez/
Disaster Recovery Considerations (1 of 2)
Diagram: a SYSPLEX with DSG Member 1 and DSG Member 2 serving App 1 to App 5, connected through two Switches over Short Range and Long Range links to IDAA Instance 1 and IDAA Instance 2. The tables of App 1 to App 5 are distributed across the two IDAA instances.
Disaster Recovery Considerations (2 of 2)
Diagram: the same SYSPLEX (DSG Member 1 and DSG Member 2, App 1 to App 5) connected through two Switches over Short Range and Long Range links to IDAA Instance 1 and IDAA Instance 2, with a CREATE/LOAD step shown for tables of App 1, App 2 and App 3 on the IDAA instances.