Using the cloud as a PMU Data Sharing Platform

Transcription

1
Using the cloud as a PMU Data Sharing Platform
Ken Birman
Cornell University
2
GridCloud: Mile high perspective
The intelligence lives in the cloud. The external infrastructure is mostly dumb sensors and actuators.
3
Business Needs
• Share PMU data
– Between ISO-NE and NYISO, PJM, etc.
– Multilateral data exchange
• As opposed to multiple bilateral data exchanges
– “Centralized” PMU Registry management
– “Centralized” historical PMU data storage
• Shared online application
– Runs on data from both ISO-NE and NYISO
– Real-time operator collaboration
– Potentially eliminates the need for raw data sharing
4
Key Elements of the Solution
• Self-managed consistency and replication (redundancy) for fault-tolerance
• Data collectors can accept PMU streams or PDC streams
• Archived data stored into historian for offline replay and analysis
• Full system replication across data centers
5
Why the Cloud?
• Cost-effective, readily available and scalable infrastructure
• Logically centralized, physically distributed computing – best of both worlds
• Platform ideal for multilateral data/results exchange
– Getting data from everywhere and delivering to everywhere
– Hosting applications online
• As opposed to the EIDSN cloud – communications only
6
What do we mean by “consistency?”
• One issue is real-time consistency
– After an event occurs, it should be rapidly reported in the cloud
– And anyone sharing the platform should soon see it
• Another issue is replication consistency
– Replicate for fault-tolerance and scale
– All replicas should have identical contents
7
Freeze-Frame FS
• Real-time file system for secure, strongly consistent data capture
– Normal file system, but understands real-time timestamps
– Offers optimal temporal precision plus Chandy-Lamport consistency (see the sketch below)
• Incredibly high speed
– Leverages RDMA for network line-speed data transfers
– NVRAM (SSD or RAID disks) for storage persistence
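A minimal sketch of the temporal-read idea behind FFFS: every write carries the sensor's real-time timestamp, and a read at time t returns the version a consistent snapshot at t would contain, so all readers of the same t see the same cut. The FreezeFrameFS class and its methods are hypothetical stand-ins, not the real FFFS API:

```python
# Toy in-memory stand-in for a temporal file store: each file is a list
# of (timestamp, bytes) versions, indexed by the sensor's real-time clock.
import bisect

class FreezeFrameFS:
    def __init__(self):
        self._files = {}                      # path -> sorted [(t, data), ...]

    def write(self, path, data, t):
        # Versions carry the sensor's real-time timestamp, not arrival time.
        self._files.setdefault(path, []).append((t, data))
        self._files[path].sort(key=lambda v: v[0])

    def read_at(self, path, t):
        # Newest version with timestamp <= t: every reader of time t gets
        # the same Chandy-Lamport-style consistent cut.
        versions = self._files.get(path, [])
        i = bisect.bisect_right([v[0] for v in versions], t)
        return versions[i - 1][1] if i else None

fs = FreezeFrameFS()
fs.write("/pmu/bus7", b"v=1.002pu", t=10.000)
fs.write("/pmu/bus7", b"v=0.998pu", t=10.033)
print(fs.read_at("/pmu/bus7", t=10.020))      # b'v=1.002pu'
```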
8
Archival Data Storage Solution in Action
[Animation frames: wave reconstructions over time from HDFS, FFFS+Server, and FFFS+Sensor]
• We simulated a wave and sampled it, like taking photos of individual grid cells (squares)
• Then streamed the data to our cloud-hosted historian, one stream per sensor
• Then reconstructed the wave from the files and created these GIF animations
Our improved handling of time will result in better computational accuracy and stronger consistency even with tight time constraints.
9
Consistent Time Matters!
• Real-time applications need cloud services that understand time
– Deep-learning on temporally fuzzy data will give incorrect output
– Operators and systems using stale data will make poor decisions
• With Freeze Frame FS, we can start to run powerful file-oriented analytics that understand data and that keep up in real time
10
Proof-of-concept Project Objectives
• Evaluate the GridCloud platform
– An existing platform created with ARPA-E (GENI program) support over a three-year period, for characteristics of interest to ISO-NE
• Assess the security, latency, fault-tolerance and consistency of compute-cloud processing of PMU data
– Measure round-trip latencies from ISO-NE (and Cornell, representing a second utility) to Amazon cloud data centers in North Virginia and Oregon
– Run a linear state estimator in both locations and assess consistency of results
• Targeting real scenarios and real requirements
11
GridCloud Idea
[Diagram: time-synchronized data sources stream over SSH tunnels into replicated data collectors (FWD1 … FWDN); each replica performs time alignment, feeds the computation/application, and emits results. GridCloud runs on Amazon Elastic Compute Cloud/VPC.]
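As a rough illustration of the time-alignment stage in the diagram above, the sketch below buckets jittered samples from multiple streams onto common frame boundaries so the application sees one consistent frame per reporting instant. The function names and the 30 frames/s rate are illustrative assumptions, not GridCloud code:

```python
# Group samples from many PMU streams into per-timestamp buckets.
from collections import defaultdict

FRAME_PERIOD = 1.0 / 30.0            # C37.118 PMUs commonly report at 30 fps

def align(samples):
    """samples: iterable of (stream_id, timestamp, value) tuples."""
    buckets = defaultdict(dict)      # aligned_time -> {stream_id: value}
    for stream_id, t, value in samples:
        slot = round(t / FRAME_PERIOD)          # snap to nearest frame boundary
        buckets[slot * FRAME_PERIOD][stream_id] = value
    return dict(sorted(buckets.items()))

frames = align([
    ("PMU1", 0.0334, 1.002),         # jittered arrivals from two streams...
    ("PMU2", 0.0331, 0.997),
    ("PMU1", 0.0667, 1.001),
])
for t, frame in frames.items():
    print(f"t={t:.4f}s  {frame}")    # ...land in the same 33.3 ms bucket
```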
12
Architecture & Latency Definitions
PROOF OF CONCEPT
13
ISO-NE Deployment – Cloud Inbound
L1: One way from DataSource to CloudRelay to Application (e.g. SE)
Managed by CloudMake and VSync
[Diagram: re-played C37.118 data for PMU1 … PMUN is sent over TCP through SSH tunnels from the ISO-NE hosted and Cornell hosted distribution points (31 and 42 streams) to cloud-hosted ingress distribution points (FWD1 … FWDN) at the North Virginia and Oregon data centers. Each GridCloud site runs a Data Archive, a State Estimator, and a Visualization server (LSE-VIS-1, LSE-VIS-2) feeding Visualization Clients. GridCloud runs on Amazon EC2/VPC.]
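For context on what the ingress points receive, here is a sketch of unpacking the common IEEE C37.118 frame header (SYNC, FRAMESIZE, IDCODE, SOC, FRACSEC) to recover the GPS timestamp used downstream for alignment. The TIME_BASE value is an assumption for illustration; in the protocol it is carried in the configuration frame:

```python
# Unpack the 14-byte common header of a C37.118 frame (big-endian fields).
import struct

TIME_BASE = 1_000_000                # assumed; actually sent in the config frame

def parse_header(frame: bytes):
    # Common prefix: SYNC, FRAMESIZE, IDCODE, SOC, FRACSEC
    sync, size, idcode, soc, fracsec = struct.unpack(">HHHII", frame[:14])
    assert (sync >> 8) == 0xAA, "not a C37.118 frame"
    quality = fracsec >> 24                      # top byte: time-quality flags
    fraction = (fracsec & 0xFFFFFF) / TIME_BASE  # low 24 bits: sub-second count
    return {"idcode": idcode, "timestamp": soc + fraction, "quality": quality}

# 52-byte data frame from PMU idcode 7, timestamp SOC 1,468,000,000 + 0.5 s
hdr = struct.pack(">HHHII", 0xAA01, 52, 7, 1_468_000_000, 500_000)
print(parse_header(hdr + b"\x00" * 38))
```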
14
ISO-NE Deployment – Full Loop
L2: Round trip from Ingress Relay to Egress Relay
L3raw: Round trip from DataSource to CloudRelay to DataSource
L3se: Round trip from DataSource to SE to DataSource
Managed by CloudMake and VSync
[Diagram: as on the previous slide, the two data sources (31 and 42 streams of re-played C37.118 data, sent over TCP through SSH tunnels) feed cloud-hosted ingress distribution points at the North Virginia and Oregon data centers. Each site adds a cloud-hosted Raw Data Egress distribution point and an SE Result Egress distribution point, so raw data and State Estimator results flow back to the data sources, closing the round-trip measurement loops.]
15
Security – Proof-of-concept
• Amazon Virtual Private Cloud (VPC)
• SSH tunnel for data stream
• ISO-NE data source
– Historical data playback w/ simulated real-time timestamps
– Inside firewall
– Data publishing
– No data subscription
• Cloud Data Storage
– Encrypted using a key (see the sketch below)
• Generated by and stored in Amazon AWS
• Managed by users
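The slides do not name the encryption mechanism, but "a key generated by and stored in Amazon AWS, managed by users" maps naturally onto KMS envelope encryption. The sketch below uses real boto3 and cryptography APIs; the key alias is a hypothetical example, and credentials/region come from the environment:

```python
# Envelope encryption: KMS issues a fresh data key per record; the local
# plaintext copy encrypts the data, the KMS-wrapped copy is stored with it.
import os
import boto3
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

kms = boto3.client("kms")

def encrypt_record(plaintext: bytes, key_alias="alias/gridcloud-archive"):
    dk = kms.generate_data_key(KeyId=key_alias, KeySpec="AES_256")
    nonce = os.urandom(12)
    sealed = AESGCM(dk["Plaintext"]).encrypt(nonce, plaintext, None)
    return dk["CiphertextBlob"], nonce, sealed   # store all three

def decrypt_record(encrypted_key: bytes, nonce: bytes, sealed: bytes):
    key = kms.decrypt(CiphertextBlob=encrypted_key)["Plaintext"]
    return AESGCM(key).decrypt(nonce, sealed, None)
```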
16
Security – Future Production
• Real-time PMU data
– Initially from ISOs' or TOs' PDCs
– Far future – directly from PMUs
• SSH tunnels
– HTTPS for PMUs
• Virtual Private Cloud
• Key management for storage encryption
– What to protect against
– Feasibility
• Programming language, runtime environment
– Cost
• Non-functional requirements
17
Cost of Security: VPC vs EC2, Tunnels
• EC2 Latency:
– Average RTT = 245ms
– 1st Percentile = 211ms
– 99th Percentile = 255ms
• VPC Latency:
– Average RTT = 261ms
– 1st Percentile = 228ms
– 99th Percentile = 270ms
• Delta is approximately +15ms
• These numbers do not include SE compute time (75ms-100ms)
• Adding SSH tunnels added less than 2ms to RTT
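For reference, statistics like the ones above fall directly out of the raw RTT samples; the values below are made up for illustration:

```python
# Derive the reported latency statistics from a set of RTT measurements.
import numpy as np

rtt_ms = np.array([244, 261, 249, 237, 268, 255, 243, 259, 251, 247])
print(f"Average RTT     = {rtt_ms.mean():.0f}ms")
print(f"1st Percentile  = {np.percentile(rtt_ms, 1):.0f}ms")
print(f"99th Percentile = {np.percentile(rtt_ms, 99):.0f}ms")
```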
18
COSTS
19
Operating Cost
• As configured for testing:
– 13 instances total per datacenter
• Visualizer, CloudRelay, CloudMakeLeader, StateEstimator, 3x RawArchiver, 4x SEArchiver, 2x Forwarder
– $2.47/hr to run per datacenter
– Optimizing cost was not an objective for the PoC
• Tailored for convenience and repeatability
• A deployment for actual use would tailor the resources to the needs of the actual problem
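Back-of-envelope arithmetic from the figures above, assuming both datacenters run continuously:

```python
# What $2.47/hr per datacenter implies for a two-datacenter deployment.
RATE_PER_DC = 2.47                  # $/hr, 13 instances as configured
hourly = RATE_PER_DC * 2            # $4.94/hr for both sites
print(f"per day:   ${hourly * 24:,.2f}")       # $118.56
print(f"per month: ${hourly * 24 * 30:,.2f}")  # $3,556.80
```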
20
LATENCIES
21
Raw Data Round Trip Latencies
[Histogram: number of times each raw-data round-trip latency occurs, Oregon vs. Virginia; measured latency 0–600 ms]
22
Histogram: Round Trip Latencies
[Histogram: number of times each SE-results round-trip latency occurs, Oregon vs. Virginia; measured latency 0–600 ms]
23
OTHER PERFORMANCE RESULTS
24
Other Performance Results
• Fault tolerance
– Two parallel, independent systems
• Manual redundancy
– Restarting a data center needs ~500 s
• Consistency
– No raw data loss
– A little SE data loss
• Due to replay wrapping
– Within ~100 ms
• End users get consistent data/results from both data centers
25
SUMMARY AND GOING FORWARD
26
Lessons Learned
• Security
– Manageable risks
– Understood the problem
– Need to develop a policy
– Encryption cost
• SSH to/from the cloud adds negligible latency
• Encrypting data storage adds negligible latency
• Latency
– The available cloud infrastructure is very fast
• Even across the continent
– Lower than SIDU production
– Satisfies Wide-Area-Monitoring purposes
27
Lessons Learned – cont.
• Reliability
– The cloud is in general reliable
• Consistency
– ~100 ms w/o cross-referencing
28
Future Objectives
• Cross reference?
– Feasibility
– Automatic redundancy
– Guaranteed consistency
• An SSH tunnel for each PMU?
– As opposed to fate sharing
• HTTPS?
– Feasibility
– Protocol support
• Other applications
– PMU Registry
– Storage
29
Opportunities Ahead
• A flexible pay-for-use framework
– Standard API
– Open access to future applications
• Sharable historian with guaranteed consistency
• Eliminate the need for bilateral data exchange
• Same real-time visualization
– Operator collaboration
• Future interconnection-wide PMU Registry
30
ISO-NE + NYPA Project
• About to start, with NYSERDA funding
• Does sharing pose new security/privacy issues?
• Will we encounter security/privacy policy differences?
• Plan: build upon the PoC ISO-NE cloud structure
• Once finished, the opportunity then arises to go live…
31
Detailed Experiment Results
EXTRA
32
Histogram: Round Trip Latencies
[Histogram: number of times each Cornell raw-data round-trip latency occurs, Oregon vs. Virginia; measured latency 0–600 ms]
33
L2 and L3 Latency Tests
• Sampled over 4 hours
• Tests performed from the Cornell datasource machine over SSH tunnels
• Sampled 4 raw feeds and two SE feeds from each datacenter:
– Lowest-numbered PMU from each datasource (ISO-NE and Cornell)
– Highest-numbered PMU from each datasource
• PMUs send to the cloud in order from the datasource; this helps show the spread of data from first to last measurement sent per round
– Lowest and highest SE result
• Tests presented in the following slides as histograms and a table of overall statistics
– Histograms only cover the highest-numbered PMU/SE, as they have the highest variability
– Full stats and analysis included in companion Excel file
34
Latencies (milliseconds)

Raw data round trips               Virginia   Oregon
ISONE Raw-Low     Min                  20        88
ISONE Raw-Low     1st Percentile       22        89
ISONE Raw-Low     Average              25       102
ISONE Raw-Low     99th Percentile      58       152
ISONE Raw-Low     Max                 611       696
ISONE Raw-High    Min                  22        90
ISONE Raw-High    1st Percentile       25        99
ISONE Raw-High    Average              46       127
ISONE Raw-High    99th Percentile      82       179
ISONE Raw-High    Max                 612       697
Cornell Raw-Low   Min                  17        90
Cornell Raw-Low   1st Percentile       17        91
Cornell Raw-Low   Average              18       115
Cornell Raw-Low   99th Percentile      20       191
Cornell Raw-Low   Max                  49       407
Cornell Raw-High  Min                  18        91
Cornell Raw-High  1st Percentile       18        92
Cornell Raw-High  Average              19       120
Cornell Raw-High  99th Percentile      20       199
Cornell Raw-High  Max                  49       413

SE Results        Virginia   Virginia-Internal   Oregon   Oregon-Internal
Min                  279           242             351          240
1st Percentile       294           267             370          273
Average              325           300             409          317
99th Percentile      384           348             490          393
Max                  911           469             962          642
35
Latencies (milliseconds)
36
OpenPDC Manager (Visualizer) Displaying SE Results
1
Use of Cloud Computing for Power Market Modeling and Reliability Assessment
John Goldis, Alex Rudkevich
Newton Energy Group
2016 IEEE PES General Meeting
Panel on Cloud Computing – trend and security and implementation experience in power system operations
Boston, MA, July 20, 2016
2
Outline
• About NEG and ENELYTIX
• Why Cloud
• Energy Market Modeling on the Cloud
• ENELYTIX and PSO
• Benefits of ENELYTIX
• Challenges with cloud solutions
• Next Steps
3
About NEG and ENELYTIX
• Alex Rudkevich and John Goldis started NEG in 2012 with a mission to modernize power market modeling through the use of commercially available High Performance Computing
• Russ Philbrick, formerly with Alstom T&D, started Polaris in 2010 to develop a market simulator capable of addressing the evolution in the resource mix and operational/planning realities of the power industry
• ENELYTIX is a result of the partnership between Newton Energy Group (NEG) and Polaris Systems Optimization (Polaris)
4
Why Cloud?
• Allows NEG to provide a comprehensive suite of services
– Software
– Hardware
– IT
• Allows customers to leverage cloud resources through ENELYTIX and obtain major productivity gains at affordable costs
5
Energy Market Modeling on the Cloud
• Energy market simulations are computationally complex
• Simulations can be partitioned and parallelized (see the sketch below)
• Licenses are typically non-scalable, as they are structured on a per-user or per-machine basis
– Parallelization requires scalable hardware and software
• Cloud providers offer hardware on a usage basis
• With our partners (Polaris, AIMMS, IBM) we developed a usage-based pricing model for the software, creating an opportunity for ENELYTIX
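As referenced above, a sketch of the partition-and-parallelize pattern: split the study horizon into independent weekly segments and fan them out to workers. Here run_segment is a hypothetical stand-in for invoking the simulator on one segment:

```python
# Partition a one-year study into weekly segments and run them in parallel.
from concurrent.futures import ProcessPoolExecutor
from datetime import date, timedelta

def weekly_segments(start: date, weeks: int):
    return [(start + timedelta(weeks=i), start + timedelta(weeks=i + 1))
            for i in range(weeks)]

def run_segment(seg):
    begin, end = seg
    # ...solve the unit-commitment/dispatch problem for [begin, end)...
    return f"{begin}..{end}: ok"

if __name__ == "__main__":
    segments = weekly_segments(date(2016, 1, 4), weeks=53)
    with ProcessPoolExecutor() as pool:          # one segment per worker
        for result in pool.map(run_segment, segments):
            print(result)
```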
6
ENELYTIX is a SaaS designed to help users run their energy market modeling business reliably and efficiently
[Diagram: foundation resources feed a Self-Service Business Intelligence application and the solutions below, which serve the listed user groups]
Foundations – Resources:
• Software and licenses
• Hardware: computers, networks, data storage, communications
• IT support
• Data services
• Trained personnel
Solutions:
• Forecasting
• Asset valuation
• System planning
• Policy analysis
• Market design
• Trading support
Users:
• Consultants
• Grid Operators
• Utilities
– Transmission Owners
– Generators
– Distributors
– Competitive Retail Providers
• Traders
• Investors/Developers
• Regulators
7
PSO, a MIP-based Simulation Engine
Inputs:
• Demand forecasts
• Generation mix
• Transmission topology
• Generation and transmission expansion
• Fuel prices
• Emission allowance prices
• Market rules
Models:
• Loads, demand response
• Transmission: existing, new; constraints, contingencies
• Generation: existing, new; storage; variable generation
Algorithms:
• Maintenance scheduling
• SCUC/SCED; contingency analyses; energy and A/S co-optimization; co-optimized topology control
• Emission policy and RPS compliance; capacity expansion; capacity market modeling
Outputs:
• Physical: generation and reserves schedules, power flow, fuel use, emissions, curtailments
• Financial: prices, revenues, costs
• Planning: new builds, retirements
8
ENELYTIX Services
[Diagram: the ENELYTIX service layer dispatching many PSO instances in parallel]
9
ENELYTIX Applications
• Valuation of assets (physical or financial contracts)
– Cash flow projections under various scenarios
• Transmission planning
– Assessment of physical flows, economic and environmental impacts of transmission projects. Cost-benefit analysis
• Policy analysis, market design
– Simulation of the impact of changing regulatory policy and market/operational rules on market performance. Cost-benefit analysis
• Generation scheduling, trading support
– Detailed simulations of system operations and economics under multiple scenarios with relatively short-term horizons (hour-ahead to month-ahead)
• Modeling of variable generation, distributed generation, demand response participation in markets for energy and ancillary services
– Hourly and sub-hourly simulations of market operations under various inputs and market design scenarios
• Reliability assessments
– Feasibility assessment of the system using Monte Carlo-generated scenarios
10
ENELYTIX Benefits
• Affordable Scalability
• Improved productivity and turn-around time
• Comprehensive IT infrastructure
• Making “Big Data” explorable
11
Affordable Scalability
[Chart: actual usage pattern for a customer – the cloud makes scalability affordable]
12
Improved Productivity & Turn-Around Time
Statistic                        1-year PJM Simulation, parallelized into 53 segments (1 week each)
Total Time (hh:mm)               36:46
Avg. Time per Segment (mm:ss)    41:27
Min Time per Segment (mm:ss)     31:37
Max Time per Segment (mm:ss)     56:21
Machine Properties: turn-around times for simulations using a c4.xlarge instance on AWS (4 vCPUs, 7.5GB memory, Intel Xeon E5-2666 v3 chip, 2.9GHz max clock speed).
• Any number of these simulations (multi-year, multi-scenario studies) will complete in under an hour (see the check below)
• Users relying on scalable cloud-based services accomplish much more in 1 hour than users relying on in-house solutions can do in a day
• With cloud-based scalability, MIP-based simulators deliver results faster than outdated and imprecise heuristic-based tools
– Though more robust, MIP-based simulators are generally slower than heuristic-based simulators
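A quick arithmetic check of the turn-around claim against the table:

```python
# 53 segments at ~41.5 min average is ~36.6 CPU-hours serially (matching
# "Total Time 36:46"), but run fully in parallel the wall-clock time is
# bounded by the slowest segment: 56:21, i.e. under an hour.
avg_min  = 41 + 27 / 60          # Avg. Time per Segment
max_min  = 56 + 21 / 60          # Max Time per Segment
segments = 53
print(f"serial compute : {avg_min * segments / 60:.1f} h")   # ~36.6 h
print(f"parallel wall  : {max_min:.1f} min")                 # ~56.4 min
```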
13
Comprehensive IT
• Any IT service that could be needed is provided on demand
• Easy management for all customers
– Single set of standards for all IT services
• Modern hardware
– Updated by Amazon/Microsoft on a regular basis
14
Explorable Big Data
• Simulations generate hundreds of gigabytes of data
• Analytic needs are wide-ranging and varied across users
– Self-Service Business Intelligence (BI) is the natural approach to support these requirements
– Parallelizing simulations demands parallelizing post-processing (see the sketch below)
• Distributed cloud database services and custom OLAP cube solutions deliver scalable BI to support big data needs
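As referenced above, a sketch of parallelized post-processing: each segment's output is aggregated independently and the partial aggregates are merged, the same roll-up a distributed database or OLAP cube performs per partition. File names and columns are hypothetical:

```python
# Aggregate per-segment simulation outputs in parallel, then merge.
import pandas as pd
from concurrent.futures import ProcessPoolExecutor

def summarize(path):
    df = pd.read_parquet(path)       # one segment's simulation output
    return df.groupby(["zone", "hour"])["price"].agg(["sum", "count"])

def combine(partials):
    merged = pd.concat(partials).groupby(level=["zone", "hour"]).sum()
    merged["avg_price"] = merged["sum"] / merged["count"]
    return merged

if __name__ == "__main__":
    paths = [f"seg_{i:02d}.parquet" for i in range(53)]   # hypothetical files
    with ProcessPoolExecutor() as pool:
        cube = combine(pool.map(summarize, paths))
```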
15
Cloud Deployment Challenges
• Developing for cost-efficiency
– Virtual hardware (hard drives, storage, compute)
– Virtual software (database management, process management, general code efficiency)
• Managing scalability
– Resource interruptions (partial/full)
– Communication (cloud-cloud, cloud-ENELYTIX, ENELYTIX-user)
• Big Data
• Addressing security concerns
16
Developing for Cost Efficiency
• In the past, physical memory and hard drive space were limited
• With cloud services, hard drives, memory and compute capacity are easily accessible but have to be efficiently managed
– Compute resources are charged on a whole-hour basis (see the sketch below)
• Software processes have to be scheduled and planned around compute costs
– Partitioning to maximize compute usage
– Efficient provisioning of compute and storage resources depending on simulation size and complexity
– Efficient code to minimize bandwidth of databases and web servers
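A sketch of why partitioning interacts with whole-hour billing (the EC2 model at the time); the instance rate is an assumed illustrative value:

```python
# A job is charged ceil(runtime) instance-hours, so packing segments onto
# fewer instances can beat running one instance per segment.
import math

def cost(runtime_hours, hourly_rate, instances=1):
    return instances * math.ceil(runtime_hours) * hourly_rate

rate = 0.209  # assumed c4.xlarge on-demand $/hr, illustrative only
# 53 segments of ~42 min, one instance per segment: 53 billed hours.
print(f"${cost(42 / 60, rate) * 53:.2f}")      # $11.08
# The same work packed serially onto one instance: 38 billed hours.
print(f"${cost(53 * 42 / 60, rate):.2f}")      # $7.94
```

Of course, full serial packing sacrifices the turn-around time gains described earlier; the scheduling problem the slide alludes to is balancing billed hours against wall-clock deadlines.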
17
Managing Scalability through Redundancy and Automation
• Developing with redundancy is required for consistent performance
– Network issues
– Database connectivity
– Known unknowns (partial outages, extended downtime, web server interruptions, etc.)
• With many simulations running in parallel, something will go wrong; automated error handling is critical (see the sketch below)
– Unexpected slowdowns in run time
– Ensuring sufficient compute resources
• Instance capacity
• Memory/disk space
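As referenced above, a minimal sketch of automated handling of transient failures with jittered exponential backoff:

```python
# Retry transient failures (network, database connectivity) with
# exponential backoff before escalating.
import random
import time

def with_retries(op, attempts=5, base_delay=1.0):
    for i in range(attempts):
        try:
            return op()
        except (ConnectionError, TimeoutError):          # transient classes
            if i == attempts - 1:
                raise                                    # escalate to a human
            delay = base_delay * 2 ** i * random.uniform(0.5, 1.5)
            time.sleep(delay)                            # jittered backoff

# usage: with_retries(lambda: db.execute(query))  # db/query are placeholders
```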
18
Addressing Security Concerns
• Encryption at Rest
• Encryption in Transfer
• Secure APIs
• Subnets and VPC
• Closing external ports on EC2 resources
• IP range restrictions
• Amazon Inspector
19
Next Steps
• Develop distributed OLAP Cube processing across AWS
• Offer spot instances to customers with flexible deadlines
• Integration of new models
– Natural gas pipeline optimization systems
– Capacity Expansion
• Support more complex workflows
– Automated simulation output to input
– Iterative message passing between simulations
20
For More Information
Alex Rudkevich   617-340-9810   arudkevich@negll.com
John Goldis      617-340-9815   jgold@negll.com
Newton Energy Group
75 Park Plaza, Fourth Floor
Boston, MA 02116
www.newton-energy.com