Using the cloud as a PMU Data Sharing Platform
Transcription
1 Using the cloud as a PMU Data Sharing Platform
Ken Birman, Cornell University

2 GridCloud: Mile-High Perspective
• The intelligence lives in the cloud. The external infrastructure is mostly dumb sensors and actuators.

3 Business Needs
• Share PMU data
– Between ISO-NE and NYISO, PJM, etc.
– Multilateral data exchange, as opposed to multiple bilateral exchanges
– "Centralized" PMU Registry management
– "Centralized" historical PMU data storage
• Shared online application
– Runs on data from both ISO-NE and NYISO
– Real-time operator collaboration
– Potentially eliminates the need for raw data sharing

4 Key Elements of the Solution
• Self-managed consistency and replication (redundancy) for fault-tolerance
• Data collectors can accept PMU streams or PDC streams
• Archived data is stored into a historian for offline replay and analysis
• Full system replication across data centers

5 Why the Cloud?
• Cost-effective, readily available and scalable infrastructure
• Logically centralized, physically distributed computing – the best of both worlds
• A platform ideal for multilateral data/results exchange
– Getting data from everywhere and delivering it to everywhere
– Hosting applications online
• As opposed to the EIDSN cloud, which is communications only

6 What do we mean by "consistency?"
• One issue is real-time consistency
– After an event occurs, it should be rapidly reported in the cloud
– And anyone sharing the platform should soon see it
• Another issue is replication consistency
– Replicate for fault-tolerance and scale
– All replicas should have identical contents

7 Freeze-Frame FS
• Real-time file system for secure, strongly consistent data capture
– A normal file system, but one that understands real-time timestamps
– Offers optimal temporal precision plus Chandy-Lamport consistency
• Incredibly high speed
– Leverages RDMA for network line-speed data transfers
– NVRAM (SSD or RAID disks) for storage persistence

8 Archival Data Storage Solution in Action
[Figure: animations of a reconstructed wave over time, comparing HDFS, FFFS with server timestamps, and FFFS with sensor timestamps]
• We simulated a wave and sampled it, like taking photos of individual grid cells (squares).
• Then we streamed the data to our cloud-hosted historian, one stream per sensor.
• Then we reconstructed the wave from the files and created these GIF animations.
• Our improved handling of time results in better computational accuracy and stronger consistency, even with tight time constraints.
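The property slides 7 and 8 depend on is that FFFS reads can be indexed by wall-clock time, so an analytic can ask for every sensor's data as of a single instant. A minimal sketch of that idea in Python, using an in-memory stand-in for the file system (the class and method names are illustrative, not the real FFFS API):

```python
# Minimal sketch of time-indexed reads in the spirit of Freeze-Frame FS.
# The classes and names here are illustrative, NOT the real FFFS API.
import bisect

class TemporalFile:
    """Keeps every (timestamp, data) version of one sensor's stream."""
    def __init__(self):
        self._times = []   # timestamps in arrival order (kept sorted)
        self._blobs = []   # data recorded at each timestamp

    def append(self, t: float, data: bytes):
        self._times.append(t)
        self._blobs.append(data)

    def read_at(self, t: float) -> bytes:
        """Return the latest version written at or before time t."""
        i = bisect.bisect_right(self._times, t)
        if i == 0:
            raise LookupError("no data at or before the requested time")
        return self._blobs[i - 1]

# Read EVERY sensor at the same instant, so a downstream analytic (e.g.,
# a state estimator) sees a temporally consistent snapshot instead of a
# mix of fresh and stale measurements.
files = {"pmu1": TemporalFile(), "pmu2": TemporalFile()}
files["pmu1"].append(100.000, b"frame-a")
files["pmu2"].append(100.016, b"frame-b")
snapshot = {name: f.read_at(100.020) for name, f in files.items()}
```

Reading all streams at one instant is exactly what avoids the "temporally fuzzy data" problem called out on the next slide.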
9 Consistent Time Matters!
• Real-time applications need cloud services that understand time
– Deep learning on temporally fuzzy data will give incorrect output
– Operators and systems using stale data will make poor decisions
• With Freeze-Frame FS, we can start to run powerful file-oriented analytics that understand data and keep up in real time

10 Proof-of-Concept Project Objectives
• Evaluate the GridCloud platform
– An existing platform, created with ARPA-E (GENI program) support over a three-year period, for characteristics of interest to ISO-NE
• Assess the security, latency, fault-tolerance and consistency of compute-cloud processing of PMU data
– Measure round-trip latencies from ISO-NE (and Cornell, representing a second utility) to Amazon cloud data centers in North Virginia and Oregon
– Run a linear state estimator in both locations and assess the consistency of the results
• Targeting real scenarios and real requirements

11 GridCloud Idea
[Diagram: time-synchronized data sources send streams through data collectors (FWD1 … FWDN) and SSH tunnels into GridCloud, running on Amazon Elastic Compute Cloud/VPC; each replica performs time alignment, runs the application computation, and emits results]

12 PROOF OF CONCEPT: Architecture & Latency Definitions

13 ISO-NE Deployment – Cloud Inbound
• L1: one way, from DataSource to CloudRelay to Application (e.g., the SE)
[Diagram: replayed C37.118 data from an ISO-NE hosted distribution point (PMU1 … PMUm, 31 streams) and a Cornell hosted distribution point (PMUm+1 … PMUN, 42 streams) is sent over TCP through SSH tunnels to cloud-hosted ingress distribution points (FWD1, FWD2 … FWDN-1, FWDN); GridCloud instances at the North Virginia and Oregon data centers each run a data archive, a state estimator, and a visualization client (LSE-VIS-1, LSE-VIS-2); the deployment is managed by CloudMake and VSync]

14 ISO-NE Deployment – Full Loop
• L2: round trip from ingress relay to egress relay
• L3raw: round trip from DataSource to CloudRelay and back to the DataSource
• L3se: round trip from DataSource through the SE and back to the DataSource
[Diagram: the slide-13 deployment extended with cloud-hosted raw-data and SE-result egress distribution points at each data center, returning streams over the SSH tunnels to the ISO-NE and Cornell distribution points; managed by CloudMake and VSync]

15 Security – Proof of Concept
• Amazon Virtual Private Cloud (VPC)
• SSH tunnels for the data streams (see the sketch below)
• ISO-NE data source
– Historical data playback with simulated real-time timestamps
– Inside the firewall
– Data publishing
– No data subscription
• Cloud data storage
– Encrypted using a key generated by and stored in Amazon AWS, and managed by users
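In the proof of concept, every stream crosses the public Internet only inside an SSH tunnel. A minimal sketch of forwarding a local C37.118/PDC feed to a cloud relay with the third-party Python sshtunnel package; the host names, ports, user and key path are placeholders, not the actual deployment's values:

```python
# Sketch: carry a PMU/PDC stream to a cloud-hosted relay through an SSH
# tunnel. Host names, ports, user and key path are placeholders.
from sshtunnel import SSHTunnelForwarder  # pip install sshtunnel

with SSHTunnelForwarder(
    ("cloud-relay.example.com", 22),          # cloud ingress host
    ssh_username="gridcloud",
    ssh_pkey="/etc/gridcloud/id_ed25519",
    remote_bind_address=("127.0.0.1", 4712),  # relay's C37.118 listener
    local_bind_address=("127.0.0.1", 4712),   # local end of the tunnel
) as tunnel:
    # Point the local TCP sender at 127.0.0.1:4712; the stream is
    # encrypted for the whole hop to the cloud ingress point.
    print("tunnel up, forwarding via", tunnel.local_bind_address)
```

Per slide 17 below, this protection is nearly free: the tunnels added under 2 ms to the measured round-trip times.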
16 Security – Future Production
• Real-time PMU data
– Initially from ISOs' or TOs' PDCs
– In the far future, directly from PMUs
• SSH tunnels
– HTTPS for PMUs
• Virtual Private Cloud
• Key management for storage encryption
– What to protect against
– Feasibility (programming language, runtime environment)
– Cost
• Non-functional requirements

17 Cost of Security: VPC vs. EC2, Tunnels
• EC2 latency:
– Average RTT = 245 ms
– 1st percentile = 211 ms
– 99th percentile = 255 ms
• VPC latency:
– Average RTT = 261 ms
– 1st percentile = 228 ms
– 99th percentile = 270 ms
• The delta is approximately +15 ms
• These numbers do not include SE compute time (75–100 ms)
• Adding SSH tunnels added less than 2 ms to the RTT

18 COSTS

19 Operating Cost
• As configured for testing:
– 13 instances total per datacenter: Visualizer, CloudRelay, CloudMakeLeader, StateEstimator, 3× RawArchiver, 4× SEArchiver, 2× Forwarder
– $2.47/hr to run per datacenter
– Optimizing cost was not an objective for the PoC; the configuration was tailored for convenience and repeatability
• A deployment for actual use would tailor the resources to the needs of the actual problem

20 LATENCIES

21 Raw Data Round Trip Latencies
• Graphs show the number of times a particular latency occurred
[Histogram: raw data round-trip latency, number of occurrences (0–2,500) vs. measured latency (0–600 ms), for Oregon and Virginia]

22 Histogram: Round Trip Latencies
• Graphs show the number of times a particular latency occurred
[Histogram: SE results round-trip latency, number of occurrences (0–2,500) vs. measured latency (0–600 ms), for Oregon and Virginia]
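The summary statistics quoted on slide 17 (and in the table on slide 34 below) are simple to reproduce from a log of round-trip samples. A minimal sketch with numpy; the log file name and its one-latency-per-line format are assumptions:

```python
# Sketch: summarize round-trip latency samples the way the slides do.
# Assumes a plain-text log with one latency in milliseconds per line;
# the file name is a placeholder.
import numpy as np

samples = np.loadtxt("rtt_virginia_ms.log")
print(f"average RTT     = {samples.mean():.0f} ms")
print(f"1st percentile  = {np.percentile(samples, 1):.0f} ms")
print(f"99th percentile = {np.percentile(samples, 99):.0f} ms")
print(f"min / max       = {samples.min():.0f} / {samples.max():.0f} ms")
```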
23 OTHER PERFORMANCE RESULTS

24 Other Performance Results
• Fault tolerance
– Two parallel, independent systems
– Manual redundancy: restarting a data center needs ~500 s
• Consistency
– No raw data loss
– A small amount of SE data loss, due to replay wrapping
– Consistent to within ~100 ms
• End users get consistent data/results from both data centers

25 SUMMARY AND GOING FORWARD

26 Lessons Learned
• Security
– Manageable risks
– We understood the problem
– Need to develop a policy
– Encryption cost: SSH to/from the cloud adds negligible latency, and encrypting data storage adds negligible latency
• Latency
– The available cloud infrastructure is very fast, even across the continent
– Lower than SIDU production
– Satisfies wide-area monitoring purposes

27 Lessons Learned – cont.
• Reliability
– The cloud is in general reliable
• Consistency
– ~100 ms without cross-referencing

28 Future Objectives
• Cross-referencing?
– Feasibility
– Automatic redundancy
– Guaranteed consistency
• An SSH tunnel for each PMU?
– As opposed to fate sharing
• HTTPS?
– Feasibility
– Protocol support
• Other applications
– PMU Registry
– Storage

29 Opportunities Ahead
• A flexible pay-for-use framework
– Standard API
– Open access for future applications
• A sharable historian with guaranteed consistency
• Eliminate the need for bilateral data exchange
• The same real-time visualization for everyone
– Operator collaboration
• A future interconnection-wide PMU Registry

30 ISO-NE + NYPA Project
• About to start, with NYSERDA funding
• Does sharing pose new security/privacy issues?
• Will we encounter security/privacy policy differences?
• Plan: build upon the PoC ISO-NE cloud structure
• Once finished, the opportunity then arises to go live…

31 EXTRA: Detailed Experiment Results

32 Histogram: Round Trip Latencies
• Graphs show the number of times a particular latency occurred
[Histogram: Cornell round-trip latency (raw data), number of occurrences (0–2,500) vs. measured latency (0–600 ms), for Oregon and Virginia]

33 L2 and L3 Latency Tests
• Sampled over 4 hours
• Tests performed from the Cornell datasource machine over SSH tunnels
• Sampled 4 raw feeds and two SE feeds from each datacenter:
– The lowest-numbered PMU from each datasource (ISO-NE and Cornell)
– The highest-numbered PMU from each datasource; PMUs send to the cloud in order from the datasource, which helps show the spread from the first to the last measurement sent per round
– The lowest and highest SE results
• Tests are presented in the following slides as histograms and a table of overall statistics
– Histograms cover only the highest-numbered PMU/SE, as they have the highest variability
– Full stats and analysis are included in a companion Excel file

34 Latencies (milliseconds)
(Table reconstructed from the flattened transcription; the raw-data rows carry values for the Virginia and Oregon columns only.)

Feed                   Site                Min   1st pctile   Avg   99th pctile   Max
ISO-NE Raw, low PMU    Virginia             20       22        25        58       611
ISO-NE Raw, low PMU    Oregon               88       89       102       152       696
ISO-NE Raw, high PMU   Virginia             22       25        46        82       612
ISO-NE Raw, high PMU   Oregon               90       99       127       179       697
Cornell Raw, low PMU   Virginia             17       17        18        20        49
Cornell Raw, low PMU   Oregon               90       91       115       191       407
Cornell Raw, high PMU  Virginia             18       18        19        20        49
Cornell Raw, high PMU  Oregon               91       92       120       199       413
SE Results             Virginia            279      294       325       384       911
SE Results             Virginia-Internal   242      267       300       348       469
SE Results             Oregon              351      370       409       490       962
SE Results             Oregon-Internal     240      273       317       393       642

35 Latencies (milliseconds)
[Chart: additional latency results; details not recoverable from the transcription]

36 OpenPDC Manager (Visualizer) Displaying SE Results
[Screenshot: OpenPDC Manager visualization of SE results]

1 Use of Cloud Computing for Power Market Modeling and Reliability Assessment
John Goldis, Alex Rudkevich, Newton Energy Group
2016 IEEE PES General Meeting, Panel on Cloud Computing – trends, security and implementation experience in power system operations
Boston, MA, July 20, 2016

2 Outline
• About NEG and ENELYTIX
• Why Cloud?
• Energy Market Modeling on the Cloud
• ENELYTIX and PSO
• Benefits of ENELYTIX
• Challenges with cloud solutions
• Next Steps

3 About NEG and ENELYTIX
• Alex Rudkevich and John Goldis started NEG in 2012 with a mission to modernize power market modeling through the use of commercially available high-performance computing
• Russ Philbrick, formerly with Alstom T&D, started Polaris in 2010 to develop a market simulator capable of addressing the evolution in the resource mix and the operational/planning realities of the power industry
• ENELYTIX is the result of a partnership between Newton Energy Group (NEG) and Polaris Systems Optimization (Polaris)

4 Why Cloud?
• Allows NEG to provide a comprehensive suite of services
– Software
– Hardware
– IT
• Allows customers to leverage cloud resources through ENELYTIX and obtain major productivity gains at affordable cost

5 Energy Market Modeling on the Cloud
• Energy market simulations are computationally complex
• Simulations can be partitioned and parallelized
• Licenses are typically non-scalable, as they are structured on a per-user or per-machine basis
– Parallelization requires scalable hardware and software
• Cloud providers offer hardware on a usage basis
• With our partners (Polaris, AIMMS, IBM) we developed a usage-based pricing model for the software, creating the opportunity for ENELYTIX

6 ENELYTIX: a SaaS designed to help users run their energy market modeling business reliably and efficiently
[Diagram, flattened here into lists:]
• Foundation resources: software and licenses; hardware (computers, networks, data storage, communications); IT support; data services; trained personnel
• Self-service business intelligence
• Application solutions: forecasting, asset valuation, system planning, policy analysis, market design, trading support
• Users: consultants; grid operators; utilities (transmission owners, generators, distributors, competitive retail providers); traders; investors/developers; regulators

7 PSO, a MIP-based Simulation Engine
• Inputs: demand forecasts, generation mix, transmission topology, fuel prices, emission allowance prices, market rules
• Models: loads and demand response; generation and transmission expansion; transmission (existing and new; constraints, contingencies); generation (existing and new; storage; variable generation)
• Algorithms: maintenance scheduling; SCUC/SCED; contingency analyses; energy and A/S co-optimization; co-optimized topology control; emission policy and RPS compliance; capacity expansion; capacity market modeling
• Outputs
– Physical: generation and reserves schedules, power flow, fuel use, emissions, curtailments
– Financial: prices, revenues, costs
– Planning: new builds, retirements

8 ENELYTIX Services
[Diagram: ENELYTIX services fanning work out to many PSO instances running in parallel]

9 ENELYTIX Applications
• Valuation of assets (physical or financial contracts)
– Cash-flow projections under various scenarios
• Transmission planning
– Assessment of the physical flows and the economic and environmental impacts of transmission projects; cost-benefit analysis
• Policy analysis, market design
– Simulation of the impact of changing regulatory policy and market/operational rules on market performance; cost-benefit analysis
• Generation scheduling, trading support
– Detailed simulations of system operations and economics under multiple scenarios with relatively short-term horizons (hour-ahead to month-ahead)
• Modeling of variable generation, distributed generation, and demand-response participation in markets for energy and ancillary services
– Hourly and sub-hourly simulations of market operations under various input and market-design scenarios
• Reliability assessments
– Feasibility assessment of the system using Monte Carlo-generated scenarios

10 ENELYTIX Benefits
• Affordable scalability
• Improved productivity and turn-around time
• Comprehensive IT infrastructure
• Making "big data" explorable

11 Affordable Scalability
[Chart: actual usage pattern for a customer; the cloud makes scalability affordable]

12 Improved Productivity & Turn-Around Time
• A 1-year PJM simulation, parallelized into 53 segments (1 week each), run on c4.xlarge instances on AWS (4 vCPUs, 7.5 GB memory, Intel Xeon E5-2666 v3 chip, 2.9 GHz max clock speed):
– Total compute time across segments (hh:mm): 36:46
– Avg. time per segment (mm:ss): 41:27
– Min time per segment (mm:ss): 31:37
– Max time per segment (mm:ss): 56:21
• Any number of these simulations (multi-year, multi-scenario studies) will complete in under an hour (see the dispatch sketch below)
• Users relying on scalable cloud-based services accomplish much more in 1 hour than users relying on in-house solutions can do in a day
• With cloud-based scalability, MIP-based simulators deliver results faster than outdated and imprecise heuristic-based tools
• Though more robust, MIP-based simulators are generally slower than heuristic-based simulators
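The turn-around numbers above come from running the 53 weekly segments concurrently, so wall-clock time is roughly that of the slowest segment (56:21) rather than the roughly 36-hour aggregate. A minimal sketch of the dispatch pattern; run_segment is a placeholder for launching one PSO run, and local processes stand in for the per-segment cloud instances:

```python
# Sketch: partition a study year into weekly segments and run them in
# parallel. run_segment() is a placeholder for one PSO simulation; local
# processes stand in for separate cloud instances.
from concurrent.futures import ProcessPoolExecutor
from datetime import date, timedelta

def weekly_segments(start: date, weeks: int = 53):
    """Yield (segment_start, segment_end) date pairs covering the year."""
    for i in range(weeks):
        s = start + timedelta(weeks=i)
        yield s, s + timedelta(days=6)

def run_segment(seg):
    start, end = seg
    # Placeholder for the real work: solve this week's SCUC/SCED MIP.
    return f"segment {start}..{end}: solved"

if __name__ == "__main__":
    segments = list(weekly_segments(date(2016, 1, 1)))
    with ProcessPoolExecutor() as pool:
        # Wall-clock time tracks the slowest segment, not the sum.
        for line in pool.map(run_segment, segments):
            print(line)
```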
13 Comprehensive IT
• Any IT service that could be needed is provided on demand
• Easy management for all customers
– A single set of standards for all IT services
• Modern hardware
– Updated by Amazon/Microsoft on a regular basis

14 Explorable Big Data
• Simulations generate hundreds of gigabytes of data
• Analytic needs are wide-ranging and vary across users
– Self-service business intelligence (BI) is the natural approach to support these requirements
– Parallelizing simulations demands parallelizing post-processing
• Distributed cloud database services and custom OLAP cube solutions deliver scalable BI to support big-data needs

15 Cloud Deployment Challenges
• Developing for cost-efficiency
– Virtual hardware (hard drives, storage, compute)
– Virtual software (database management, process management, general code efficiency)
• Managing scalability
– Resource interruptions (partial/full)
– Communication (cloud–cloud, cloud–ENELYTIX, ENELYTIX–user)
• Big data
• Addressing security concerns

16 Developing for Cost Efficiency
• In the past, physical memory and hard drive space were limited
• With cloud services, hard drives, memory and compute capacity are easily accessible but have to be efficiently managed
– Compute resources are charged on a whole-hour basis
• Software processes have to be scheduled and planned around compute costs
– Partitioning to maximize compute usage
– Efficient provisioning of compute and storage resources depending on simulation size and complexity
– Efficient code to minimize the bandwidth demands on databases and web servers

17 Managing Scalability through Redundancy and Automation
• Developing with redundancy is required for consistent performance
– Network issues
– Database connectivity
– Known unknowns (partial outages, extended downtime, web server interruptions, etc.)
• With many simulations running in parallel, something will go wrong; automated error handling is critical (see the retry sketch in the appendix below)
– Unexpected slowdowns in run time
– Ensuring sufficient compute resources: instance capacity, memory/disk space

18 Addressing Security Concerns
• Encryption at rest
• Encryption in transfer
• Secure APIs
• Subnets and VPC
• Closing external ports on EC2 resources
• IP range restrictions
• Amazon Inspector

19 Next Steps
• Develop distributed OLAP cube processing across AWS
• Offer spot instances to customers with flexible deadlines
• Integration of new models
– Natural gas pipeline optimization systems
– Capacity expansion
• Support more complex workflows
– Automated simulation output-to-input
– Iterative message passing between simulations

20 For More Information
Alex Rudkevich, 617-340-9810, arudkevich@negll.com
John Goldis, 617-340-9815, jgold@negll.com
Newton Energy Group
75 Park Plaza, Fourth Floor
Boston, MA 02116
www.newton-energy.com
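Appendix: slide 17's warning that with many parallel simulations "something will go wrong" usually translates into retry-with-backoff logic around every external dependency (database connections, instance launches, web requests). A minimal sketch of that pattern; the names and limits are illustrative, not ENELYTIX code:

```python
# Appendix sketch: automated error handling for parallel cloud jobs via
# retry with exponential backoff. Names and limits are illustrative.
import random
import time

def with_retries(op, attempts: int = 5, base_delay: float = 1.0):
    """Run op(); on failure wait base_delay * 2**n plus jitter, then retry."""
    for n in range(attempts):
        try:
            return op()
        except Exception as exc:      # production code would catch narrowly
            if n == attempts - 1:
                raise                 # out of retries: surface the failure
            delay = base_delay * 2 ** n + random.uniform(0, 0.5)
            print(f"attempt {n + 1} failed ({exc}); retrying in {delay:.1f} s")
            time.sleep(delay)

# Usage: wrap any transient-failure-prone call, e.g. a database connect
# or an instance launch.
print(with_retries(lambda: "connected"))
```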