ExoGENI Federated Private NIaaS Infrastructure

Chris Heermann
ckh@renci.org
Overview
•  ExoGENI architecture and implementation
•  ExoGENI Science use-cases
–  Urgent Computing: Storm Surge Predictions on GENI
–  ScienceDMZ as a Service: Creating Science Super-Facilities with GENI
•  Support for SDN in ExoGENI
IaaS: clouds and network virtualization
•  Virtual Compute and Storage Infrastructure: Cloud APIs (Amazon EC2 ..), Cloud Providers
•  Virtual Network Infrastructure: Dynamic circuit APIs (NLR Sherpa, DOE OSCARS, I2 ION, OGF NSI …), Transport Network Providers
[Figure label: Breakable Experimental Network]
Open Resource Control Architecture
•  ORCA is a “wrapper” for off-the-shelf cloud and circuit nets etc., enabling federated orchestration:
+  Resource brokering
+  VM image distribution
+  Topology embedding
+  Stitching
+  Federated Authorization
•  GENI, DOE, NSF SDCI+TC
•  http://geni-orca.renci.org
•  http://networkedclouds.org
[Figure labels: coordinator (B), controller (SM), aggregate (AM)]
The APIs
•  Simple API, complex description language
–  createSlice(sliceName, Term, SliceTopology, Credentials)
•  Topology management
–  deleteSlice(sliceName)
–  sliceStatus(sliceName)
•  Debugging
–  modifySlice(sliceName, TopologyUpdate)
•  Elasticity
–  extendSlice(sliceName, NewTerm)
•  Agility
•  Description language:
–  NDL-OWL – OWL-based ontology that describes:
–  User: Resource requests
–  Provider: Resource description, public resource advertisement, manifest
•  Participating in US-EU effort to standardize the IaaS ontology
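The slice calls listed on the APIs slide can be read as a slice lifecycle: create, inspect, modify, extend, delete. The following is a minimal Python sketch of that lifecycle using an in-memory stand-in. It is not the ORCA client: the class `InMemorySliceController`, the dictionary-based topology (node name mapped to neighbour names), the term expressed in hours, and the placeholder credential string are all assumptions made for illustration. Only the call names and argument order come from the slide; the real calls take NDL-OWL descriptions and federation credentials.

```python
from dataclasses import dataclass


@dataclass
class Slice:
    name: str
    term_hours: int      # assumed unit; the slide only says "Term"
    topology: dict       # stand-in for an NDL-OWL request: node -> neighbours
    state: str = "active"


class InMemorySliceController:
    """Hypothetical stand-in that mirrors the call names on the slide."""

    def __init__(self):
        self._slices = {}

    def createSlice(self, sliceName, term, sliceTopology, credentials):
        if not credentials:
            raise PermissionError("credentials are required")
        if sliceName in self._slices:
            raise ValueError(f"slice {sliceName!r} already exists")
        self._slices[sliceName] = Slice(sliceName, term, dict(sliceTopology))
        return sliceName

    def sliceStatus(self, sliceName):
        s = self._slices[sliceName]
        return {
            "name": s.name,
            "state": s.state,
            "term_hours": s.term_hours,
            "nodes": sorted(s.topology),
        }

    def modifySlice(self, sliceName, topologyUpdate):
        # Elasticity: add or replace nodes in the running slice.
        self._slices[sliceName].topology.update(topologyUpdate)

    def extendSlice(self, sliceName, newTerm):
        s = self._slices[sliceName]
        if newTerm <= s.term_hours:
            raise ValueError("new term must be longer than the current term")
        s.term_hours = newTerm

    def deleteSlice(self, sliceName):
        del self._slices[sliceName]


ctl = InMemorySliceController()
ctl.createSlice(
    "storm-surge",
    24,
    {"manager": ["worker0"], "worker0": ["manager"]},
    credentials="x509-cert-placeholder",  # placeholder, not a real credential
)
ctl.modifySlice("storm-surge", {"worker1": ["manager"]})
ctl.extendSlice("storm-surge", 48)
status = ctl.sliceStatus("storm-surge")
print(status)
ctl.deleteSlice("storm-surge")
```

The point of the sketch is the shape of the interface: a handful of verbs, with all of the complexity pushed into the topology description passed to `createSlice` and `modifySlice`.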
GENI Federation
•  Federated identity
–  InCommon
–  X.509 identity certificates
•  Common APIs
–  Aggregate Manager
•  ExoGENI has a compatibility API layer supporting AM API v2
–  Clearinghouse
•  Federated access policies
–  ABAC
•  Agreed upon resource description language
–  RSpec
•  ExoGENI translates relevant portions from NDL-OWL to RSpec and back as needed
•  Several major portions
–  ExoGENI, InstaGENI, WiMax, Internet2 AL2S
•  Federation with EU
–  Amsterdam XO rack part of SDX demo at GEC21 with iMinds
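To make the "agreed upon resource description language" concrete, here is a minimal Python sketch that builds an RSpec-style request for two nodes joined by one link. It shows only the RSpec side and does not attempt the NDL-OWL translation mentioned on the slide. The namespace URI and the element and attribute names (`rspec`, `node`, `link`, `interface`, `interface_ref`, `client_id`) reflect the GENI RSpec v3 request format as commonly documented, and should be treated as assumptions to check against the schema. The helper `build_request` and the node and link names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed GENI RSpec v3 namespace; verify against the published schema.
RSPEC_NS = "http://www.geni.net/resources/rspec/3"


def build_request(nodes, links):
    """Build a request document.

    nodes: list of node names
    links: list of (link_name, node_a, node_b) tuples
    """
    ET.register_namespace("", RSPEC_NS)
    rspec = ET.Element(f"{{{RSPEC_NS}}}rspec", {"type": "request"})
    node_elems = {}
    for name in nodes:
        node_elems[name] = ET.SubElement(
            rspec, f"{{{RSPEC_NS}}}node", {"client_id": name}
        )
    for link_name, a, b in links:
        link = ET.SubElement(rspec, f"{{{RSPEC_NS}}}link", {"client_id": link_name})
        for end in (a, b):
            if_id = f"{end}:{link_name}"
            # Each link end is an interface on the node, referenced by the link.
            ET.SubElement(
                node_elems[end], f"{{{RSPEC_NS}}}interface", {"client_id": if_id}
            )
            ET.SubElement(
                link, f"{{{RSPEC_NS}}}interface_ref", {"client_id": if_id}
            )
    return rspec


request = build_request(["node0", "node1"], [("link0", "node0", "node1")])
xml_text = ET.tostring(request, encoding="unicode")
print(xml_text)
```

A real request would also carry sliver types, disk images and site bindings, which are omitted here.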
Building network topologies
Slice owner may deploy an IP network into a slice (OSPF).
[Figure labels: slice; OpenFlow-enabled L2 topology; cloud hosts with network control; computed embedding; virtual network exchange; virtual colo; campus net to circuit fabric]
ExoGENI
•  Every Infrastructure as a Service, All Connected.
–  Substrate may be volunteered or rented.
–  E.g., public or private clouds, HPC, instruments and transport providers
–  Contribution size is dynamically adjustable
•  ExoGENI Principles:
–  Open substrate
–  Off-the-shelf back-ends
•  OSCARS, NSI, EC2 etc.
–  Provider autonomy
–  Federated coordination
–  Dynamic contracts
–  Resource visibility
Current topology
An ExoGENI cloud “rack site”
[Figure: rack diagram. Labels: management node; 10 worker nodes; sliverable storage; management switch; OpenFlow-enabled L2 switch; 4x1Gbps management and iSCSI storage links (bonded); 2x10Gbps dataplane links; to campus Layer 3 network; dataplane to dynamic circuit backbone (10/40/100Gbps); (optional) dataplane to campus network for stitchable VLANs; option 1: tunnels (static VLAN tunnels provisioned to the backbone); option 2: fiber uplink (direct L2 peering with the backbone)]
ExoGENI software structure
Current deployments
•  xCAT
–  Operator node provisioning
–  User-initiated bare-metal provisioning
•  OpenStack Essex++ (RedHat/CentOS version)
–  Custom Quantum plugin to support multiple dataplanes
–  Working on Juno port
•  iSCSI user slivering
–  IBM DS3512 appliance
•  NetApp iSCSI support in the works
–  Linux iSCSI stack
•  Backend support for LVM, Gluster, ZFS
Tools
•  ORCA Native tools (native APIs, resource descriptions)
–  Flukes
–  More flexibility
•  Federation tools (federation APIs, resource descriptions)
–  Jacks, omni, jFed
–  Compatibility
Tools (continued)
ExoGENI – a federation of private clouds
•  Each site is a micro-cloud
–  Adding support for HPC batch schedulers
•  Owners decide what portion of resources to contribute
•  Free to continue using native IaaS interfaces
•  Have the opportunity to take advantage of federated identity and inter-provider orchestration mechanisms
•  What is it good for?
–  Foundation for future science institutional collaborative CI
ExoGENI Science Use-cases
Computing Storm Surge
•  ADCIRC Storm Surge Model
–  FEMA-approved for Coastal Flood Insurance Studies
–  Very high spatial resolution (millions of triangles)
–  Typically use 256-1024 cores for real-time (one simulation!)
[Figure: ADCIRC grid for coastal North Carolina]
Tackling Uncertainty
•  One simulation is NOT enough!
•  Probabilistic Assessment of Hurricanes
•  Research Ensemble
•  NSF Hazards SEES project
•  22 members, H. Floyd (1999)
•  A “few” likely hurricanes
•  Fully dynamic atmosphere (WRF)
Why GENI?
•  Current limitations: Real-time demands for compute resources
–  Large demands for real-time compute resources during storms
–  Not enough demand to dedicate a cluster year-round
•  GENI enables
–  Federation of resources
–  Cloud bursting, urgent, on-demand
–  High-speed data transfers to/from/between remote resources
–  Replicate data/compute across geographic areas
•  Resiliency, performance
Storm Surge Workflow
•  Whole workflow is 22 ensemble members
•  Pegasus workflow management system
[Figure labels: Ensemble Scheduler … Collector]
Slice Topology
•  11 GENI sites (1 ensemble manager, 10 compute sites)
•  Topology: 92 VMs (368 cores), 10 inter-domain VLANs, 1 TB iSCSI storage
•  HPC compute nodes: 80 compute nodes (320 cores) from 10 sites
Representative Science DMZ
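The Slice Topology figures for the storm surge slice (92 VMs with 368 cores, 80 HPC compute nodes with 320 cores across 10 compute sites, 22 ensemble members) can be cross-checked with a few lines of arithmetic. The Python sketch below does that and then spreads the 22 ensemble members over the 10 compute sites. The even split of compute nodes across sites and the round-robin assignment are assumptions made for illustration; the slides do not state how nodes were divided or how Pegasus scheduled the members, nor what the remaining VMs were used for.

```python
# Figures taken from the Slice Topology slide.
COMPUTE_SITES = 10
TOTAL_VMS, TOTAL_CORES = 92, 368
HPC_NODES, HPC_CORES = 80, 320
ENSEMBLE_MEMBERS = 22

cores_per_vm = TOTAL_CORES // TOTAL_VMS          # cores per VM overall
cores_per_hpc_node = HPC_CORES // HPC_NODES      # cores per HPC compute node
hpc_nodes_per_site = HPC_NODES // COMPUTE_SITES  # assumes an even split
other_vms = TOTAL_VMS - HPC_NODES                # VMs that are not compute nodes


def assign_round_robin(members, sites):
    """Assumed policy: hand out ensemble members to sites in turn."""
    plan = {site: [] for site in range(sites)}
    for member in range(members):
        plan[member % sites].append(member)
    return plan


plan = assign_round_robin(ENSEMBLE_MEMBERS, COMPUTE_SITES)
members_per_site = [len(m) for m in plan.values()]

print("cores per VM:", cores_per_vm)
print("HPC nodes per site:", hpc_nodes_per_site)
print("non-compute VMs:", other_vms)
print("ensemble members per site:", members_per_site)
```

Under these assumptions each site contributes 8 four-core nodes, and with 22 members over 10 sites some sites necessarily run more than one ensemble member.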
Dedicated vs. Virtual resources
•  GENI provides a distributed software-defined infrastructure (SDI)
–  Compute + Storage + Network
Emerging Trend: Super Facilities, Coupled by Networks
Experimental facilities are being transformed by new detectors, advanced mathematics, robotics, automation, advanced networks.
Today’s Demonstration: Real-time data processing and vis. workflow
[Figure labels: AL2S, ESnet; ESnet SPADE instance @ Server at Argonne; ExoGENI SPADE VM @ Starlight, Chicago; ExoGENI SPADE VM @ Oakland, California; Data from ALS Experiment; Compute Cluster NERSC, LBL]
•  WAN-optimized data transfer nodes and a network slice created programmatically (Science DMZ as a service)
•  Application workflow instantiated to stage data at the GENI rack on Science DMZ slice
•  Data is moved optimally across the WAN [1]
[1] Earlier work, like Phoebus, has demonstrated the value of this approach
http://portal.nersc.gov/project/als/sc14/
Dedicated vs. Virtual resources
•  GENI provides a distributed software-defined infrastructure (SDI)
–  Compute + Storage + Network
•  GENI racks may be deployed on-campus or in provider networks close to the campus
•  ‘Science DMZ as a service’
–  Applications can provision a virtual ‘Science DMZ’ as and when needed
Programmable infrastructure to enable end-users to create dynamic ‘friction-free’ infrastructures without advanced knowledge/training
Microtomography of High Temperature Materials under stress
Set collected by materials scientist Rob Ritchie, LBNL/UCB
What constitutes programmable network behavior? (i.e. what is SDN?)
•  Control over virtual topology
–  Link in one layer is represented by a path in another
–  Examples: Layer 1/2/3 VPNs via explicit signaling (MPLS, GMPLS); bandwidth-on-demand services (OSCARS, NSI); FlowVisor
•  Control over packet forwarding
–  Making decisions about which interface a packet/frame should be placed on
–  Examples: OpenFlow 1.0, Nicira OpenVSwitch, Cisco ONE, OpenDaylight, Juniper Contrail
•  Queue management and arbitration
–  Defining packet queues and associated service and scheduling policies
–  Examples: numerous vendor-proprietary APIs, OpenFlow 1.3
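"Control over packet forwarding" comes down to a table of match rules that decide which interface a frame leaves on. The Python sketch below illustrates that idea with a priority-ordered match/action table. It is a conceptual illustration only, not the OpenFlow protocol or any controller's API: the `FlowRule` fields, the wildcard convention (`None` matches anything) and the MAC addresses are assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FlowRule:
    priority: int
    out_port: int
    in_port: Optional[int] = None   # None acts as a wildcard
    dst_mac: Optional[str] = None   # None acts as a wildcard

    def matches(self, in_port, dst_mac):
        return self.in_port in (None, in_port) and self.dst_mac in (None, dst_mac)


def forward(table, in_port, dst_mac):
    """Return the output port of the highest-priority matching rule.

    Returns None on a table miss (no rule matches).
    """
    candidates = [rule for rule in table if rule.matches(in_port, dst_mac)]
    if not candidates:
        return None
    return max(candidates, key=lambda rule: rule.priority).out_port


table = [
    FlowRule(priority=10, out_port=3),  # catch-all rule
    FlowRule(priority=100, out_port=1, dst_mac="00:00:00:00:00:01"),
    FlowRule(priority=200, out_port=2, in_port=5, dst_mac="00:00:00:00:00:01"),
]

specific = forward(table, in_port=5, dst_mac="00:00:00:00:00:01")
by_mac = forward(table, in_port=4, dst_mac="00:00:00:00:00:01")
fallback = forward(table, in_port=4, dst_mac="00:00:00:00:00:02")
miss = forward([], in_port=1, dst_mac="00:00:00:00:00:01")
print(specific, by_mac, fallback, miss)
```

In a real switch a table miss is typically handed to the controller or dropped; here it is simply reported as `None`.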
ExoGENI and OpenFlow (now)
•  OpenFlow experiments using embedded topologies with OVS spanning one or more sites
–  e.g. HotSDN ‘14 “A Resource Delegation Framework for Software Defined Networks” Baldin, Huang, Gopidi
•  Experiments with OF 1.0 in rack switches
–  Described in ExoBlog (www.exogeni.net)
ExoGENI and OpenFlow (near future)
•  OpenFlow service on BEN (ben.renci.org)
–  40G wave using Juniper EX switches
–  FSFW, OF 1.0, multiple controllers
–  Topology embedding/VNE for ExoGENI, path service for other projects.
•  Slice on AL2S with own controller
–  Topology embedding for ExoGENI, value-add experimenter services with ExoGENI resources
•  Application-specific topology embedding
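Topology embedding, as used throughout these slides, means mapping each link of a requested virtual topology onto a path in the underlying substrate. The Python sketch below shows the simplest possible version: each virtual link is mapped to a hop-count shortest path found by breadth-first search. This is an illustration of the idea and not ORCA's embedding algorithm. The substrate graph, the site names and the VM placement are hypothetical, and a real embedder would also account for bandwidth, VLAN availability and provider policy.

```python
from collections import deque


def shortest_path(substrate, src, dst):
    """Breadth-first search; returns a list of substrate nodes or None."""
    prev = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = prev[node]
            return path[::-1]
        for neighbour in substrate.get(node, ()):
            if neighbour not in prev:
                prev[neighbour] = node
                queue.append(neighbour)
    return None


def embed(virtual_links, placement, substrate):
    """Map each virtual link (a, b) to a substrate path between their hosts."""
    return {
        (a, b): shortest_path(substrate, placement[a], placement[b])
        for a, b in virtual_links
    }


# Hypothetical substrate: three rack sites and one exchange point.
substrate = {
    "siteA": ["exchange"],
    "siteB": ["exchange", "siteC"],
    "siteC": ["siteB"],
    "exchange": ["siteA", "siteB"],
}
placement = {"vm1": "siteA", "vm2": "siteB", "vm3": "siteC"}
virtual_links = [("vm1", "vm2"), ("vm1", "vm3")]

mapping = embed(virtual_links, placement, substrate)
for link, path in mapping.items():
    print(link, "->", path)
```

"Application-specific" embedding would replace the hop-count metric with one chosen by the application, for example latency or available bandwidth.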
Where are we going?
•  More sites
–  Georgia Tech [Atlanta, GA], PUCP [Lima, Peru], Ciena [Hanover, MD]
•  Updated OpenStack
•  Better compute isolation
–  Take NUMA into account for placement decisions
•  Better storage isolation
–  Provision storage VLANs/channels with QoS properties to provide predictable performance
•  Better network isolation and performance
–  Enable SR-IOV
•  More complex topology management/embedding
–  Fully dynamic slices
•  More diverse substrates
–  Integration with batch schedulers (SLURM)
–  VMWare, other cloud stacks
–  Public clouds
Thank you!
•  http://www.exogeni.net