Rdb Continuous LogMiner and the JCC LogMiner Loader
Transcription
Rdb Continuous LogMiner and the JCC LogMiner Loader
A presentation of MnSCU's use of this technology

Topics
• Overview of MnSCU
• Logmining Uses
• Loading Data
• Bonus: Global Buffer implementation and resulting performance improvements

Overview of MnSCU
• Minnesota State College and University System
  – Comprised of 32 Institutions
    • State Universities, Community and Technical Colleges
    • 53 Campuses in 46 Communities
    • Over 16,000 faculty and staff
    • More than 3,600 degree programs
  – Serves over 240,000 Students per year
    • An additional 130,000 Students in non-credit courses
  – About 30,000 graduates each year

ISRS
• MnSCU's Primary Application
  – ISRS: Integrated State-wide Record System
• Written in Uniface (4GL), COBOL, C, Java
  – 2,000+ 3GL programs; 2,200+ 4GL forms
  – Over 2,900,000 lines of code

MnSCU's Rdb Topology
• North Region: 9 Institution Dbs, 1 Regional Db
• South Region: 8 Institution Dbs, 1 Regional Db
• Metro Region: 13 Institution Dbs, 1 Regional Db
• Central Region: 7 Institution Dbs, 1 Regional Db, 1 Central Db
• Development: 20+ Dvlp/QC/Train Dbs
• Each Institution Db has over 1,200 tables
• Over 900,000,000 rows
• Over 1 terabyte total disk space
• Over 20% annual data growth

Production Users
• Each regional center supports:
  – Between 500 and 1,000 on-line users (during the day)
  – Numerous batch reporting and update jobs daily and over-night
  – 25,000+ Web transactions each day, 24x7
    • Registrations, Grades, Charges (fees), On-line Payments, etc.

LogMining

LogMining Modes
• Static
  – The Rdb LogMiner runs, by default, as a stand-alone process against backup copies of the source database AIJ files
• Continuous
  – The Rdb LogMiner can run against live database AIJs to produce a continuous output stream that captures transactions when they are committed

LogMining Uses
• Hot Standby (replication) replacement
• Mining a single Db to multiple targets
• Mining to non-Rdb target(s)
  – XML, File, API, Tuxedo, Oracle
• Mining multiple Dbs to a single target
• 
Minimizing production Db maintenance downtime

LogMining at MnSCU
The combination of the Rdb Continuous LogMiner and the JCC LogMiner Loader allows us to:
• Distribute centrally controlled data to multiple local databases
• Replicate production databases into multiple partitioned query databases
• Roll up multiple production databases into a single data warehouse
• Replicate production data into non-Rdb development databases to support development of database-independent applications

The Big Picture
[Diagram: the 37 production Rdb DBs (ISRS), CENTRALDB, and the REGIONALDBs feed, via CLM sessions (w/PK: 4 sessions; w/DBK: 4 and 37 sessions; w/FilterMaps DBK+RC_ID: 37 sessions), into 36 standby Rdb reporting Dbs, the Oracle warehouse (37 schemas each with 238 tables, plus the MNSCUALL schema of 5 tables combined across all 37), VAL tables (1 central copy), and MVs/DJs serving the MnOnline application tables (20+), read-only, 24x7, plus other data warehouse structures and future ad-hoc users.]

Replication Replacement
• Historically we've used Rdb's Hot Standby feature to maintain reporting Dbs for users (ODBC and some batch reports)
• However, of the 1,200+ tables in production ISRS Dbs, we found users only use 260 tables for reporting from the standby Dbs
• Batch reports were found to use another 222 tables
• With LogMiner, we can maintain just this subset of tables (482) for reporting purposes

Mining Single Db to Multiple Targets
• We've begun combining institutions in some of our production databases
• Introduced a new column called RC_ID to over 700 tables for row-level security
• Row security provided by views
• However, the reporting databases still needed to be separate so users wouldn't have to change hundreds of
existing queries to include RC_ID

'CORE' Tables
• 'CORE' tables contain common data for MnSCU
• Tables like PERSON, ADDRESS, PHONE, etc. that can be combined across all institutions
• These tables incorporate a new surrogate key called MNSCU_ID, which is unique across all institutions
• Users do not access 'CORE' tables directly

'CORE' Table Views
• 'CORE' data is presented to users via views that join RC_ID (an institution-specific value) and TECH_ID (the original surrogate key) to determine MNSCU_ID, via a third 'REPORTING' table
• All 'CORE' tables (PERSON, ADDRESS, PHONE, ...) are accessed via the views

  CR_REPORTING columns: MNSCU_ID, TECH_ID, RC_ID
  View PERSON columns: TECH_ID, RC_ID, rest of attributes
  CR_PERSON columns: MNSCU_ID, rest of attributes

Query Performance
• Row-level security in our production dbs ('CORE' table views) has increased the query tuning challenges
• Implemented better reporting performance in the target db by using a filtering trigger into a table the users query, rather than using the same views as production
• The reporting Db has different indexes and uses row caching to maximize query performance

'CORE' Triggers
[Diagram: production supplies CR_REPORTING (MNSCU_ID, RC_ID, TECH_ID) and CR_PERSON (MNSCU_ID); in the reporting Db, an AFTER INSERT trigger on CR_PERSON inserts into PERSON if a matching row exists by MNSCU_ID in CR_REPORTING and a matching row exists by RC_ID in CR_RC_ID.]

Non-'CORE' Tables to Multiple Targets
[Diagram: the combined NWR production Db is mined by four LogMiner sessions, each filtering on a single RC_ID ('0142', '0263', '0215', '0303'), into four separate reporting Dbs.]

Eliminating Remote Attaches
• We also have some centralized data, like 'codes tables'
• Some jobs require reading this centralized data while processing against each production ISRS database
[Diagram: CENTRALDB replicated via a 4-session CLM w/PK to each REGIONALDB.]
  – Forces a remote attachment to the central db
• With LogMiner, we can maintain regional copies of this data so these jobs do not have to use remote attaches to get data

Non-Rdb Target
• Since we have 37 separate institutional databases (and reporting databases), getting combined data for system-wide reporting was difficult
• With LogMiner we can combine data from each of the production ISRS databases into one target, in this case our Oracle warehouse
[Diagram: 37 production Rdb DBs mined by CLM w/FilterMaps (DBK+RC_ID), 37 sessions, into the MNSCUALL schema: 5 tables combined across all 37.]

Non-Rdb Target
• With LogMiner we can mine tables from each of our production ISRS databases into individual Oracle schemas
• This allows us to test our application against a different DBMS
  – Structure and data are identical
• Keeps data separate for institutional reporting
[Diagram: 37 production Rdb DBs mined by CLM w/DBK, 37 sessions, into 37 Oracle schemas, each with 238 tables.]

MnSCU's Oracle Topology
• ISRS Instance
  – 37 Institution Schemas (238 tables each)
  – 1 Combined Schema (5 tables)
  – 1 VAL Schema (Codes Tables)
  – Other Reporting Structures
• WAREHOUS Schema
• Several Warehouse Schemas
• Development Schema
• Over 800,000,000 rows (and growing rapidly)
• Over 500 GB total disk space
• Various ETL

LogMining Scope at MnSCU
• Sessions with Rdb Targets:
  – NWR: 482 tables from 1 source to 4 Rdb targets (4 sessions; example of mining 1 to many)
    • In these sessions we are separating data from a combined institutional database into separate reporting databases for each institution
    • Triggers are used to populate base tables from production 'CORE' tables
  – CENTRLDB: 3 tables from 1 source db to 4 target Rdb dbs
    • In this session we are taking centralized data and placing copies of it on our regional servers (allows us to maintain these 3 tables centrally without changing our application, which reads the data locally)

LogMining Scope at MnSCU
• Sessions with Oracle Targets:
  – ISRS: 238 tables from each of 37 source dbs to
37 Oracle schemas (37 sessions; non-Rdb target; example of mining many to many)
    • These sessions allow us to create exact copies of our databases in Oracle
    • From this Oracle data, materialized views are built to provide value-added reporting 'data-marts'
    • Allows us to shift our reporting focus to Oracle while continuing to base production on Rdb
    • Also allows us to work on converting our application to use Oracle without impacting production or having to do a cold switch
  – MNSCUALL: 5 tables from each of 37 source dbs to 1 Oracle schema (37 sessions; example of mining many sources to 1 target)
    • These sessions allow us to build a combined version of the data
    • These are some of our larger tables, with over 250,000,000 rows
• Total of 9,513 tables being mined by 119 separate continuous LogMiner sessions!

Session Support
• To support so many sessions we've developed a naming convention for sessions
• It includes a specific directory structure
• Built several tools to simplify the task of creating/recreating/reloading tables
• Some of the tools are based on the naming convention
• Currently our tools are all DCL, but better implementations could be made with 3GLs

Source Db Preparation
• The LogMiner input to the Loader is created when the source database is enabled for LogMining
• This is accomplished with an RMU command; the optional qualifier 'continuous' specifies continuous operation:

$ rmu/set logminer/enable[/continuous] <database name>
$ rmu/backup/after/quiet <database name> ...
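With dozens of source databases to enable, the two RMU commands above lend themselves to generation per database. A minimal Python sketch, assuming a hypothetical helper name and a simple list-of-strings output (only the RMU syntax itself comes from the slides):

```python
# Assemble the DCL/RMU command lines that enable a source database
# for LogMining and take the required after-image journal backup.
# The helper and its signature are illustrative, not part of any kit.

def rmu_logminer_setup(database: str, continuous: bool = True) -> list:
    """Return the DCL lines to enable LogMining and back up the AIJ."""
    enable = "$ rmu/set logminer/enable"
    if continuous:
        enable += "/continuous"  # optional qualifier for Continuous LogMiner
    enable += " " + database
    backup = "$ rmu/backup/after/quiet " + database
    return [enable, backup]

# Example: emit the setup commands for one institution database
for line in rmu_logminer_setup("NWR_DB"):
    print(line)
```

The generated lines could then be written into a DCL command procedure and submitted per database, which fits the all-DCL tooling described under Session Support.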
• Many of the procedures included with the Loader kit rely on the procedure vms_functions.sql having been applied to the source database:

SQL> attach 'filename <source database>';
SQL> @jcc_tool_sql:vms_functions.sql
SQL> commit;

Target Db Preparation
• Besides the task of creating the target db and tables itself, there are many 'details' to attend to depending upon the target db type
• For Oracle targets, Rdb field and table name lengths (and the names themselves) can be an issue (and target tablespaces too)
• One thing in common: the HighWater table
  – Used by the Loader to keep track of what has been processed

Minimizing Production Downtime
• Basic Steps:
  – Do an AIJ backup
  – Create a copy of the production Db
  – Perform restructuring / maintenance / etc. on the Db copy
    • This could take many hours
  – Remove users from the production Db
  – Apply AIJ transactions to the Db copy using LogMiner
    • This step requires minimal time
  – Switch applications to use the Db copy
    • This is now the new production Db!

Loading Data
• 3 methods to accomplish this:
  – Direct Load (RMU, SQL*Loader)
    • Could impose data restrictions by using views
    • Can configure the commit interval
  – LogMiner Pump
    • Use a no-change update transaction on the source
    • Allows for data restrictions
    • Commit interval matches the source transaction
    • Consumes AIJ space
  – JCC Data Pump
    • Configurable for parent/child tables, data restrictions, commit interval and delay interval to minimize performance impact
    • Consumes AIJ space

Loading Data Example
• Loaded a table with 19,986 rows with SQL*Loader
  – This took about 13 minutes, no AIJ blocks, some disk space for the .UNL file
• Used LogMiner to 'pump' the rows via a no-change update
  – This took about 10 minutes; 20,000 AIJ blocks

Pumping Data Example
• Using the JCC Data Pump
  – About 22 seconds to update the source data
  – Only 3 minutes to update the target data!
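The timings above reduce to rows per second for a rough comparison of the three methods. A quick back-of-the-envelope sketch in Python (the dictionary layout and labels are ours; the row count and durations are the figures quoted in the examples):

```python
# Back-of-the-envelope throughput for the three loading methods,
# using the slide figures: 19,986 rows per load.
ROWS = 19_986

elapsed_seconds = {
    "Direct load (SQL*Loader)": 13 * 60,  # about 13 minutes
    "LogMiner pump":            10 * 60,  # about 10 minutes
    "JCC Data Pump (target)":    3 * 60,  # about 3 minutes
}

for method, secs in elapsed_seconds.items():
    print("%-26s %6.1f rows/sec" % (method, ROWS / secs))
```

Even on this small table, the JCC Data Pump's configurable commit and delay intervals let it move data several times faster than either a direct load or a no-change-update pump, at the cost of AIJ space.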
Loader Performance
• SCSU_REPISRS session
  – From the CLM log: 17-NOV-2004 16:15:42.26 20234166 CLM SCSU_REPISR records written (36048 modify, 333 delete) Total: 36381
  – The 3 processes themselves:
    • CTL: 14.6 CPU secs / 1,667 Direct IO
    • CLM: 1 min 11.4 CPU secs / 67,112 Direct IO
    • LML: 7 min 55.7 CPU secs / 67 Direct IO
  – After 11:15 hours of connect time, this is about 1.6 Direct IO per second average
  – Processed about 0.9 records per second average

Heartbeat
• Without Heartbeat a session can become 'stale'
• AIJ backups can be blocked
• With Heartbeat enabled this does not occur
  – A side-effect is that 'trailing' messages are not displayed with Heartbeat enabled

Bonus: Global Buffers
• Despite great performance gains from Row Cache over the past couple of years, we were still anticipating issues for our fall busy period
• We turned on GB on many Dbs
  – On our busiest server, we enabled it on all dbs
  – On other servers, we have about 50% implementation
• Used to run with RDM$BIND_BUFFERS of 220
• Estimated GB at the max number of users @ 200 each

Global Buffers
• Prior to implementing GB, our busiest server was running at a constant 6,000-7,000 IO/sec
• Other servers were running around 3,000 but had spikes to 7,000 or more
• Global buffers both lowered overall IO and eliminated the spikes
• The cost is in total Locks (Resources)
  – Increase LOCKIDTBL and RESHASHTBL

[Chart: Before GB, METE server: 10-11 and 2-3 sample windows (IO and PRC series), a constant 6,000-7,000 IO/sec from 28-Jul through 7-Oct]
[Chart: After GB, METE server: same series and period, with lower and flatter IO rates]
[Chart: After GB, MNSCU1 server: 10-11 and 2-3 IO and PRC series, staying below roughly 6,000 IO/sec]

Resources: CPU
• Since we are now using less IO, more CPU is available
• Before:

SDA> lck show lck /rep=5/int=10
23-AUG-2004 14:28:48.80 Delta sec: 10.0 Ave Req: 39875 Req/sec: 15656.9 Busy: 62.4% Ave Spin: 24005
23-AUG-2004 14:28:58.80 Delta sec: 10.0 Ave Req: 30166 Req/sec: 20289.7 Busy: 61.2% Ave Spin: 19121

• After:

24-AUG-2004 11:44:01.90 Delta sec: 10.0 Ave Req: 13439 Req/sec: 38043.3 Busy: 51.1% Ave Spin: 12846
24-AUG-2004 11:44:11.90 Delta sec: 10.0 Ave Req: 15514 Req/sec: 31109.1 Busy: 48.3% Ave Spin: 16629

Lock Rates
• Reducing these numbers to lock operations per 1% of CPU time yields:
  – Before: 299
  – After: 573

For More Information
• Miles.Oustad@CSU.MNSCU.EDU
• (218) 755-4614

QUESTIONS & ANSWERS