Connect the Dots: A z13 and z/OS Dispatching Update
Transcription
Connect the Dots: A z13 and z/OS Dispatching Update
3/30/2015 Glenn Anderson IBM Lab Services and Training Connect the Dots: A z13 and z/OS Dispatching Update 9.0 What I hope to cover...... What are dispatchable units of work on z/OS How WLM manages dispatchable units of work The role of HiperDispatch and Warning Track Dispatching work to zIIP engines z13 Simultaneous Multithreading (SMT) (c) 2014 IBM Corporation 1 3/30/2015 z13 and z/OS dispatching zIIPs SMT LPAR1 LPAR2 Work Units (TCBs and SRBs) z/OS Dispatcher WLM Logical Processors Logical Processors PR/SM Dispatcher Physical Processors PR/SM HiperDispatch Warning Track z/OS dispatchable units There are different types of Dispatchable Units (DU's) in z/OS Preemptible Task (TCB) Non Preemptible Service Request (SRB) Preemptible Enclave Service Request (enclave SRB) Independent - a new transaction Dependent – extend existing address space Work-dependent – extend existing independent enclave (c) 2014 IBM Corporation 2 3/30/2015 z/OS dispatching work What is a WLM transaction? A WLM transaction represents a WLM "unit of work" basic workload entity for which WLM collects a resource usage value foundation for statistics presented in workload activity report represents a single subsystem "work request" Subsystems can implement one of three transaction types Address Space: WLM transaction measures all resource used by a subsystem request in a single address space Used by JES (a batch job), TSO (a TSO command), OMVS (a process), STC (a started task) and ASCH (single APPC program) Enclave: Enclave created and destroyed by subsystem for each work request WLM transaction measures resources used by a single subsystem request across multiple address spaces Exploited by subsystems - Component Broker(WebSphere), DB2, DDF, IWEB, MQSeries Workflow, LDAP, NETV, TCP CICS/IMS Transactions Neither address space or enclave oriented - special type (c) 2014 IBM Corporation 3 3/30/2015 The WLM view Address Spaces, and the transactions inside A/S is dispatchable unit Enclaves are dispatchable units Only A/S is dispatchable unit CICS/IMS tran Enclave SRB transactions transactions Enclave SRB CICS/IMS tran Enclave SRB transactions STC rules Enclave SRB CICS/IMS tran transactions STCHI Vel = 50% Imp=1 STC rules CICS/IMS tran STCHI Vel = 50% Imp=1 WLM policy adjustment algorithm (c) 2014 IBM Corporation 4 3/30/2015 WLM dispatching priority usage Dispatching in an LPAR environment LPAR1 LPAR2 Work Units (TCBs and SRBs) z/OS Dispatcher Logical Processors Logical Processors PR/SM Dispatcher Physical Processors PR/SM (c) 2014 IBM Corporation 5 3/30/2015 HiperDispatch mode PR/SM – Supplies topology information/updates to z/OS – Ties high priority logicals to physicals (gives 100% share) – Distributes remaining share to medium priority logicals – Distributes any additional service to unparked low priority logicals z/OS – Ties tasks to small subsets of logical processors – Dispatches work to high priority subset of logicals – Parks low priority processors that are not need or will not get service Hardware cache optimization occurs when a given unit of work is consistently dispatched on the same physical CPU HiperDispatch: z/OS part z/OS Node Node Parked – Whether a logical processor has high, medium or low share – On which book and chip the logical processor is located LPAR Parked • z/OS obtains the logical to physical processor mapping in Hiperdispatch mode • z/OS creates dispatch nodes – The idea is to have 4 high share CPs in one node – Each node has TCBs and SRBs assigned to the node – Optimizes the execution of work units on z/OS PR/SM Logical to Physical Mapping Hardware Books (z13 – Drawer / Nodes) 12 (c) 2014 IBM Corporation 6 3/30/2015 RMF CPU activity report HiperDispatch and LPAR 1 P A R T I T I O N D A T A R E P O R T PAGE z/OS V1R10 SYSTEM ID LPAR1 CONVERTED TO z/OS V1R12 RMF DATE 04/29/2011 TIME 19.28.00 3 INTERVAL 14.59.998 CYCLE 1.000 SECONDS MVS PARTITION NAME LPAR1 IMAGE CAPACITY 3165 NUMBER OF CONFIGURED PARTITIONS 4 WAIT COMPLETION NO DISPATCH INTERVAL DYNAMIC --------- PARTITION DATA --------------------MSU---- -CAPPING-NAME S WGT DEF ACT DEF WLM% LPAR1 A 494 0 582 NO 0.0 LPAR2 A 446 0 762 NO 0.0 LPAR3 A 59 0 0 NO 0.0 LPAR5 A 1 0 0 NO 0.0 *PHYSICAL* TOTAL NUMBER OF PHYSICAL PROCESSORS CP IIP -- LOGICAL PARTITION PROCESSOR DATA -PROCESSOR- ----DISPATCH TIME DATA---NUM TYPE EFFECTIVE TOTAL 32.0 CP 02.17.24.319 02.20.44.154 32.0 CP 03.01.28.607 03.04.05.167 3.0 CP 00.00.00.000 00.00.00.000 1.0 CP 00.00.00.000 00.00.00.000 00.10.58.833 ------------ -----------05.18.52.927 05.35.48.155 55 53 2 GROUP NAME LIMIT AVAILABLE N/A N/A N/A -- AVERAGE PROCESSOR UTILIZATION PERCENTAGES -LOGICAL PROCESSORS --- PHYSICAL PROCESSORS --EFFECTIVE TOTAL LPAR MGMT EFFECTIVE TOTAL 28.63 29.32 0.44 17.96 18.40 37.81 38.35 0.34 23.72 24.06 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 1.44 1.44 --------- ----2.21 41.68 43.90 Total LPAR weight = 1000 LPAR1 494/1000 = .494 * 53 CPs = 26.18 CPs LPAR2 446/1000 = .446 * 53 CPs = 23.64 CPs LPAR1 = 25 VH and 2 VM at 59% share (27 logicals unparked) LPAR2 = 23 VH and 1 VM at 64% share (24 logicals unparked) ____________ 51 logicals unparked (c) 2014 IBM Corporation 53 physicals 7 3/30/2015 Dispatching in an LPAR environment LPAR1 LPAR2 Work Units (TCBs and SRBs) z/OS Dispatcher Logical Processors Logical Processors PR/SM Dispatcher Physical Processors PR/SM Warning track (c) 2014 IBM Corporation 8 3/30/2015 Warning track Latent demand: LPAR Busy vs MVS Busy (c) 2014 IBM Corporation 9 3/30/2015 Warning track statistics WLM Topology Report Tool 20 (c) 2014 IBM Corporation 10 3/30/2015 Dispatching zIIP eligible work LPAR1 LPAR2 Work (TCBs and SRBs)units zIIPUnits eligible work z/OS Dispatcher Logical Processors zIIP logical processors zIIP Logicallogical Processorsprocessors PR/SM Dispatcher Physical Processors zIIP physical processors PR/SM IBM z Integrated Information Processor (zIIP) on the z13 The IBM z13 continues to support the z Integrated Information Processor (zIIP) which can take advantage of the optional simultaneous multithreading (SMT) technology capability. SMT allows up to two active instruction streams per core, each dynamically sharing the core's execution resources. ̶ With the multithreading function enabled, the performance capacity of the zIIP processor is expected to be up to 1.4 times the capacity of these processors on the zEC12 The rule for the CP to zIIP purchase ratio is that for every CP purchased, up to two zIIPs may be purchased zAAP eligible workloads such as Java and XML, can run on zIIPs using zAAP on zIIP processing zAAPs are no longer supported on the z13 (c) 2014 IBM Corporation 11 3/30/2015 Current IBM exploitation of zAAPs and zIIPs Specialty CP Eligible Major Users zAAP or zIIP on z13 Any Java Execution Websphere CICS Native apps XMLSS zIIP Enclave SRBs DRDA over TCPIP DB2 Parallel Query DB2 Utilities Load, Reorg, Rebuild DB2 V9 z/OS remote native SQL procedures TCPIP - IPSEC XMLSS zIIP Assisted HiperSockets Multiple Write Virtual Tape Facility Mainframe (VTFM) Software z/OS Global Mirror (XRC), System Data Mover (SDM) z/OS CIM Server RMF Mon III OMEGAMON on z/OS and DB2 IMS Ver 8 What is "Needs Help?" Determination zIIP or zAAP work is being delayed and additional resources should help process the work ƒRequires xxPHONORPRIORITY=YES to be set If help is required: ƒThe zxxP CP signals waiting zxxP to help ƒWhen all zxxP CPs are busy the zxxP asks for help from the GCP –All available speciality engines (of the same type) must be busy before help is asked of the GCPs IF the zxxPs needs help and all zxxPs are busy help is obtained from 1 GCP IF zxxPs continue to need help additional CPs may be asked to help ƒHelp is always provided in dispatch priority order (c) 2014 IBM Corporation 12 3/30/2015 Specialty CP work running in a WLM service class REPORT BY: POLICY=WLMPOL WORKLOAD=BAT_WKL SERVICE CLASS=BATSPEC TRANSACTIONS TRANS-TIME HHH.MM.SS.TTT --DASD I/O-- ---SERVICE---AVG 0.98 ACTUAL 6.520 SSCHRT 11.5 IOC 8326 MPL 0.98 EXECUTION 6.128 RESP 7.0 CPU 662386 ENDED 10 QUEUED 391 CONN 6.9 MSO 0 END/S 0.17 R/S AFFIN 0 DISC 0.0 SRB 965 #SWAPS 0 INELIGIBLE 0 Q+PEND 0.1 TOT 671677 EXCTD 0 CONVERSION 0 IOSQ 0.0 /SEC 11195 AVG ENC 0.00 STD DEV 0 GOAL: EXECUTION VELOCITY 35.0% RESPONSE TIME EX PERF VELOCITY MIGRATION: AVG I/O MGMT RESOURCE GROUP=BATMAXRG SERVICE TIMES ---APPL %--CPU 24.7 CP 0.97 SRB 0.0 AAPCP 0.01 RCT 0.0 IIPCP 0.00 IIT 0.0 HST 0.0 AAP 40.27 AAP 24.2 IIP 0.00 IIP 0.0 99.2% INIT MGMT 92.2% ------ USING% ----- ------------ EXECUTION DELAYS % ------- SYSTEM SYSD --N/A-- VEL% INDX ADRSP CPU AAP IIP I/O TOT CPU 99.2 0.8 45.9 0.0 3.9 0.4 0.4 0.4 1.0 RMF report is at 1 minute interval DB2 parallel query, enclave SRBs and zIIPs Host 1 Query CP Parallelism DU DU DU DU IO IO IO IO Have been independent enclave SRBs to be zIIP eligible. Beginning in z/OS R11 the child tasks are now work-dependent enclaves. Portions of complex query arrive on participant systems, classified under "DB2" rules, and run in enclave SRBs, so zIIP eligible PARTITIONED TABLESPACE Complex query originates here DU IO Sysplex Query Parallelism Host 1 Host 2 DU DU DU IO IO IO DU IO DU IO DU IO Host 3 DU DU IO IO DU IO DU IO DU IO PARTITIONED TABLESPACE (c) 2014 IBM Corporation 13 3/30/2015 Work-Dependent enclaves DB2 Address Space Work-Dependent Enclave zIIP Offload = Y % create Independent Enclave zIIP Offload = X % create Work-Dependent Enclave zIIP Offload = Z % Managed as one transaction, represented by Independent Enclave Implement a new type of enclave named “Work-Dependent” as an extension of an Independent Enclave. A Work-Dependent enclave becomes part of the Independent Enclave’s transaction but allows to have its own set of attributes (including zIIP offload percentage) DDF and work-dependent enclaves ssnmDIST (DDF) DDF production requests Enclave SRB PC-call to DBM1 In cases where DRDA applications create extended duration work threads in DB2, for example through extensive use of held cursors, the zIIP utilization levels can become more variable. DB2 and DDF now may use work-dependent enclaves in this situation to control this variability. See APAR PM28626. Create Enclave Schedule SRB PC-call to DBM1 DDF default requests DDF rules DDFPROD RT=85%, 2s Imp=1 SMF 72 Enclave SRB DDFDEF STC rules (c) 2014 IBM Corporation RT=5s avg Imp=3 STCHI Vel = 50% Imp=1 SMF 72 SMF 30 SMF 72 14 3/30/2015 Work-dependent enclaves in SDSF zIIP processors and simultaneous multithreading LPAR1 LPAR2 Work (TCBs and SRBs)units zIIPUnits eligible work z/OS Dispatcher Logical Processors zIIP logical cores with 1 or 2 threads zIIP logical cores with Logical Processors 1 or 2 threads PR/SM Dispatcher PR/SM (c) 2014 IBM Corporation zIIP physical with Physicalcores Processors 1 or 2 threads 15 3/30/2015 z13 - Simultaneous Multithreading (SMT) “Simultaneous multithreading (SMT) permits multiple independent threads of execution to better utilize the resources provided by modern processor architectures.”* ● With z13, SMT allows up to two instructions per core to run simultaneously to get better overall throughput ● SMT is designed to make better use of processors ● 80 50 On z/OS, SMT is available for zIIP processing: ● – Two concurrent threads are available per core and can be turned on or off – Capacity (throughput) usually increases – Performance may in some cases be superior using single threading Two lanes process more traffic overall Note: Speed limit signs for illustration only *Wikipedia® z13 - SMT Exploitation Appl Appl Appl Appl Appl Appl Appl Appl Appl Appl Appl Appl Thr Thr Thr Thr Thr Thr Thr Thr Thr Thr Thr Thr Core Core Core Core Core z/OS Core z/OS MT Aware PR/SM Hypervisor Appl Appl Thr Thr Core Physical Hardware Thr Thr Core Thr Thr MT Ignorant Appl Appl Thr Thr Core Core • SMT Aware OS informs PR/SM that it intends to exploit SMT – PR/SM can dispatch any OS core to any physical core – OS controls the whole core – must follow rules • Maximize core throughput (Drive cores with high Thread Density [2] ) • Maximize core availability (Meet workload goals using fewest cores ) • SMT is transparent to applications • LOADxx and IEAOPTxx parmlib options to enable SMT on z/OS: – LOADxx: – IEAOPTxx: PROCVIEW CORE|CPU MT_ZIIP_MODE={1 | 2} 32 (c) 2014 IBM Corporation 16 3/30/2015 What is new with multithreading support? New support LOADxx: PROCVIEW CPU • LOADxx: PROCVIEW CORE This the existing mode that you • z/OS is core-aware and maps CPUs to cores. are used to • Core syntax for CPU/core related commands. z/OS does only know IPL required processors (CPUs) • MT1 Mode MT2 Mode IEAOPTxx: MT_ZIIP_MODE = 1 IEAOPTxx: MT_ZIIP_MODE = 2 SET OPT=yy Possible on any supported hardware z13 only MT1 performance & capacity MT2 performance & cap. HiperDispatch=No possible HiperDispatch=Yes enforced * Statements regarding IBM future direction and intent are subject to change or withdrawal, and represent goals and objectives only. 33 New terminology for SMT… ■ ■ z/OS logical processor (CPU) Thread ■ A thread implements (most of) the System z processor architecture ■ z/OS dispatches work units on threads ■ In MT mode two threads are mapped to a logical core Processor core Core ■ PR/SM dispatches logical core on a physical core ■ ■ ■ MT1 Equivalent Time (MT1ET) ■ z/OS CPU times are normalized to the time it would have taken to run same work in MT-1 mode ■ ■ ■ Thread density 1 (TD1) when only a single thread runs on a core Thread density 2 (TD2) when both threads run on a core ASCB, ASSB, …, SMF30, SMF32, SMF7x, … You will usually not see the term MT1ET because it is implied Several new metrics to describe how efficiently core resources could be utilized… * Statements regarding IBM future direction and intent are subject to change or withdrawal, and represent goals and objectives only. (c) 2014 IBM Corporation 17 3/30/2015 …and several new metrics for SMT… ■ New metrics: ■ ■ WLM/RMF: RMF: Capacity Factor (CF), Maximum Capacity Factor (mCF) Average Thread Density, Productivity (PROD) How are the new metrics derived? Hardware provides metrics (counters) describing the efficiency of processor (cache use/misses, number instructions when one or two threads were active…) LPAR level counters are made available to the OS MVS HIS component and supervisor collect LPAR level counters. HIS provides HISMT API to compute average metrics between “previous” HISMT invocation and “now” (current HISMT invocation) System components (WLM/SRM, monitors such as RMF) retrieve metrics for management and reporting * Statements regarding IBM future direction and intent are subject to change or withdrawal, and represent goals and objectives only. z13 - z/OS SMT Metrics • Capacity Factor (CF) – How much work core actually completes for a given workload mix at current utilization - relative to single thread – MT-1 Capacity Factor is 1.0 (100%) – MT-2 Capacity Factor is workload dependent Actual MT-2 Efficiency • Maximum Capacity Factor (mCF) – How much work a core can complete for a given workload mix at most Estimated max MT-2 Efficiency • Core Busy Time – Time any thread on the core is executing instructions when core is dispatched to physical core • Average Thread Density – Average number of executing threads during Core Busy Time (Range: 1.0 - 2.0) • Productivity – Core Busy Time Utilization (percentage of used capacity) for a given workload mix – Productivity represents capacity in use (CF) relative to capacity total (mCF) during Core Busy Time. • Core Utilization – Capacity in use relative to capacity total over some time interval – Calculated as Core Busy Time x Productivity (c) 2014 IBM Corporation % Used MT-2 Core Capacity during Core Busy Time % Used MT-2 Core Capacity during Measurement Interval 36 18 3/30/2015 z13 – SMT: Postprocessor CPU Activity Report • PP CPU activity report displayed in “old” format when SMT is active • PP CPU activity report provides new metrics when SMT is active – MT Productivity and Utilization of each logical core – MT Multi-Threading Analysis section displays MT Mode, MT Capacity Factors and average Thread Density • One data line in PP CPU activity report represents one thread (CPU) – CPU NUM designates the logical core • Some metrics like TIME % ONLINE and LPAR BUSY provided at core granularity only z/OS V2R1 C P U A C T I V I T Y SYSTEM ID CB8B DATE 02/02/2015 RPT VERSION V2R1 RMF TIME 11.00.00 --------------- TIME % -------------- MT % ---- LOG PROC ONLINE LPAR BUSY MVS BUSY PARKED PROD UTIL SHARE % 100.00 68.07 67.94 0.00 100.00 68.07 100.0 HIGH 100.00 46.78 46.78 0.00 100.00 46.78 52.9 MED ---CPU--NUM TYPE 0 CP 1 CP ... TOTAL/AVERAGE A IIP 100.00 B IIP 100.00 8.66 48.15 38.50 54.17 41.70 35.66 32.81 26.47 0.00 0.00 0.00 0.00 100.00 85.84 8.66 41.33 152.9 100.0 HIGH 85.94 33.09 100.0 HIGH INTERVAL 15.00.004 CYCLE 1.000 SECONDS --I/O INTERRUPTS– RATE % VIA TPI 370.1 13.90 5.29 16.93 375.3 13.95 ... MT-2 core TOTAL/AVERAGE 29.48 23.23 86.47 25.39 386.7 capacity used ------------ MULTI-THREADING ANALYSIS --------------CPU TYPE MODE MAX CF CF AVG TD CP 1 1.000 1.000 1.000 Productivity of logical core while IIP 2 1.485 1.279 1.576 dispatched to physical core 37 z13 – SMT: Postprocessor Workload Activity Report W O R K L O A D z/OS V2R1 SYSPLEX UTCPLXCB RPT VERSION V2R1 RMF A C T I V I T Y DATE 02/02/2015 TIME 11.00.00 INTERVAL 15.00.004 MODE = GOAL REPORT BY: POLICY=BASEPOL -TRANSACTIONSAVG 790.12 MPL 790.12 ENDED 9173 END/S 10.19 #SWAPS 6087 EXCTD 15860 AVG ENC 4.00 REM ENC 0.00 MS ENC 0.00 ► TRANS-TIME HHH.MM.SS.TTT ACTUAL 27.787 EXECUTION 15.761 QUEUED 1 R/S AFFIN 0 INELIGIBLE 0 CONVERSION 0 STD DEV 8.40.915 --DASD I/O-SSCHRT 3975 RESP 2.8 CONN 1.4 DISC 1.2 Q+PEND 0.1 IOSQ 0.0 MT-1 equivalent service units and service times ---SERVICE--IOC 4233K CPU 306468K MSO 0 SRB 97415K TOT 408116K /SEC 453461 ABSRPTN TRX SERV SERVICE TIME CPU 4659.039 SRB 1513.624 RCT 0.622 IIT 21.116 HST 0.000 AAP N/A IIP 1308.346 ---APPL %--CP 542.89 AAPCP 0.00 IIPCP 2.00 AAP IIP N/A 97.84 574 574 Pre SMT: Service time : Logical processor CPU time APPL%: Percentage of logical processor capacity ► SMT mode active: Service time: APPL% : MT-1 equivalent CPU time Percentage of maximum core capacity calculated as For MT Mode = 1 mCF = 1 (c) 2014 IBM Corporation Percentage of maximum core capacity used 3/30/2015 38 19 3/30/2015 Transitioning into MT2 mode: WLM considerations (1) • Less overflow from zIIP to CPs may occur because • zIIP capacity increases, and • number of zIIP CPUs double • CPU time and CPU service variability may increase, because • Threads which are running on a core at the same time influence each other • Threads may be dispatched at TD1 or TD2 • Sysplex workload routing: routing recommendation may change because • zIIP capacity will be adjusted with the mCF to reflect MT2 capacity • mCF may change as workload or workload mix changes * Statements regarding IBM future direction and intent are subject to change or withdrawal, and represent goals and objectives only. Transitioning into MT2 mode: WLM Considerations (2) • Goals should be verified for zIIP-intensive work, because • The number of zIIP CPUs double and the achieved velocity may change • “Chatty” (frequent dispatches) workloads may profit because there is a chance of more timely dispatching • More capacity is available • Any single thread will effectively run at a reduced speed and the achieved velocity will be lower. Affects processor speed bound work, such as single threaded Java batch * Statements regarding IBM future direction and intent are subject to change or withdrawal, and represent goals and objectives only. (c) 2014 IBM Corporation 20 3/30/2015 Work unit queue distribution: Mon I CPU report SYSTEM ADDRESS SPACE AND WORK UNIT ANALYSIS ---------NUMBER OF ADDRESS SPACES--------QUEUE TYPES MIN MAX AVG IN IN READY OUT READY OUT WAIT LOGICAL OUT RDY LOGICAL OUT WAIT 550 4 0 0 0 178 1,008 438 1 0 628 634 594.8 22.7 0.0 0.0 10.3 589.4 281 708 97 0 43 284 763 98 1 97 282.0 736.4 97.9 0.0 68.0 ADDRESS SPACE TYPES BATCH STC TSO ASCH OMVS ---------NUMBER OF WORK UNITS------------CPU TYPES MIN MAX AVG CP AAP IIP 444 22 0 888 33 0 555.5 28.8 0.0 -----------------------DISTRIBUTION OF IN-READY WORK UNIT QUEUE------------NUMBER OF 0 10 20 30 40 50 60 70 80 90 100 WORK UNITS (%) |....|....|....|....|....|....|....|....|....|....| <= = = = <= <= <= <= <= <= <= <= <= <= <= > N N N N N N N N N N N N N N N N + + + + + + + + + + + + + + + 1 2 3 5 10 15 20 30 40 60 80 100 120 150 150 55.5 4.4 4.0 3.7 7.1 14.7 8.0 1.9 0.2 0.0 0.0 0.0 0.0 0.0 0.0 0.0 >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>> >>> >> >>>> >>>>>>>> >>>>> > > 16 Buckets representing the work unit count with regard to the number of online processors N = NUMBER OF PROCESSORS ONLINE UNPARKED (22.4 ON AVG) Work unit count on CPU type level What I hope I covered….. What are dispatchable units of work on z/OS How WLM manages dispatchable units of work The role of HiperDispatch and Warning Track Dispatching work to zIIP engines z13 and Simultaneous Multithreading (SMT) (c) 2014 IBM Corporation 21 3/30/2015 IBM Notice Regarding Specialty Engines (c) 2014 IBM Corporation 22