Clusterix: Network Management System
Michał Balcerkiewicz michalb@man.poznan.pl
Bartosz Belter bart@man.poznan.pl
Artur Binczewski artur@man.poznan.pl
Radosław Krzywania sfrog@man.poznan.pl
Maciej Stroiński stroins@man.poznan.pl
Jan Węglarz weglarz@man.poznan.pl

Clusterix Network Management: Goals and Objectives
• Building network infrastructure
• Using network as a GRID resource
• Dynamically attaching new clusters
• Active network monitoring

Clusterix Network Architecture
• Communication to all clusters is passed through a router/firewall
• Routing is based on the IPv6 protocol, with IPv4 kept for backward compatibility
• Applications and Clusterix middleware are adjusted to IPv6 usage
• For security reasons, only outgoing connections to the Internet are permitted
• Two 1 Gbps VLANs are used to improve management of network traffic
  – The Communication VLAN is dedicated to message exchange between nodes
  – The NFS VLAN is dedicated to file transfer
[Figure: cluster network diagram. Labels: Local Cluster Switch, Access Node, Computing Nodes, Communication & NFS VLANs, Clusterix Storage Element, PIONIER Core Switch, 1 Gbps, Backbone Traffic, Internet Network, Internet Network Access, Router, Firewall]

Network as a GRID resource
• Network can be seen as a set of parametrized resources
• Knowledge of network utilization is used by the task broker to choose optimal routes for task delegation
• Network management module:
  – provides the following metrics:
    - Round trip time
    - Throughput
    - Out of order packets
    - Duplicated packets
    - Packet jitter
    - Lost packets
  – provides information about device accessibility
  – provides management information via SNMP
• All parameters can be accessed via industry-standard Web Services

Integration with Broker
Application A: distributed computation, high communication (small chunks of data)
Application B: visualisation, less communication, heavy use of data, massive output results

Request A {
    Max_Clusters = 4;
    Processors = 16;
    RTT = 5ms;
    Bandwidth = 5Mb/s;
    Packet_loss = 0%;
}

Request B {
    Max_Clusters = 2;
    Processors = 16;
    RTT = 20ms;
    Bandwidth = 500Mb/s;
}

Purposes of dynamic cluster attachment
• External clusters can be easily attached to the Clusterix infrastructure in order to:
  – Increase computing power with new clusters
  – Utilize external clusters during nights or non-active periods
  – Make the Clusterix infrastructure scalable

Dynamic Cluster Attachment - Architecture
• Dynamic cluster attachment:
  – Requirements need to be checked against new clusters
    • Installed software
    • SSL certificates
  – Communication goes through the router/firewall
  – The Network Management System will automatically discover new resources
  – A new cluster can serve computing power on a regular basis
[Figure: attachment diagram. Labels: Local Switch, PIONIER Backbone Switch, Internet, Regular Cluster, Router, Firewall, Dynamic Resources]

Active network monitoring
• Measurement architecture
  – Distributed 2-level measurement agent mesh (backbone/cluster)
  – Centralized control manager (multiple redundant instances)
  – Switches are monitored via SNMP
  – Measurement reports are stored by the manager (forwarded to the database)
  – IPv6 protocol and addressing schema are used for measurement
[Figure: measurement diagram. Labels: SNMP Monitoring, Network Manager, Measurement Reports, Computing Cluster, PIONIER Backbone Measurements, Local Cluster Measurements]

Software Architecture
• Backup managers improve failure recovery (active manager switching)
• External applications are allowed to retrieve various network statistics
• Device and agent management modules collect network data
• The GUI shows network status and configures the manager
• Statistics are stored in an external database (a short-term backup is kept in the manager)
[Figure: software architecture diagram. Labels: System Manager, System Resources, External Entities, Database, External Clients, Controller, GUI, External Interfaces, Backup Manager, Redundancy Controller, System Logic, Measurement Agents Manager, Backbone measurements, Local Cluster measurements, Device Manager, Devices]

Graphical User Interface
• GUI
  – Provides a view of network status
  – Gives a look at statistics
  – Simplifies network troubleshooting
  – Allows configuration of measurement sessions
  – Useful for topology browsing

Summary
• Network is used as a regular GRID resource
• Measurements are shared with other tools
• Dynamic architecture allows easy power upgrades
• Failure-resistant network monitoring system

Thank you for your attention!
Visit http://www.clusterix.pcz.pl
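
Supplementary sketch: the "Integration with Broker" slide shows two requests with network limits but does not describe how the broker evaluates them. The Python below is one possible illustration, not the Clusterix implementation. The class names, field names, units, cluster names, and sample measurements are all hypothetical, and the selection rule (filter by network limits, then take the largest clusters first) is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LinkMetrics:
    """Measured path to one cluster (hypothetical fields and units)."""
    cluster: str
    free_processors: int
    rtt_ms: float
    bandwidth_mbps: float
    packet_loss_pct: float


@dataclass
class Request:
    """Mirrors the Request A / Request B fields shown on the slide."""
    max_clusters: int
    processors: int
    rtt_ms: float
    bandwidth_mbps: float
    packet_loss_pct: Optional[float] = None  # Request B sets no loss limit


def select_clusters(request: Request, links: List[LinkMetrics]) -> List[str]:
    # Keep only clusters whose measured path meets every network limit.
    ok = [
        m for m in links
        if m.rtt_ms <= request.rtt_ms
        and m.bandwidth_mbps >= request.bandwidth_mbps
        and (request.packet_loss_pct is None
             or m.packet_loss_pct <= request.packet_loss_pct)
    ]
    # Assumed rule: largest clusters first, lower RTT breaks ties.
    ok.sort(key=lambda m: (-m.free_processors, m.rtt_ms))
    chosen: List[str] = []
    total = 0
    for m in ok[:request.max_clusters]:
        chosen.append(m.cluster)
        total += m.free_processors
        if total >= request.processors:
            return chosen
    return []  # the limits cannot be met within max_clusters


# Invented sample measurements, for illustration only.
links = [
    LinkMetrics("cluster-a", 8, 3.0, 900.0, 0.0),
    LinkMetrics("cluster-b", 8, 4.5, 100.0, 0.0),
    LinkMetrics("cluster-c", 16, 18.0, 600.0, 0.1),
    LinkMetrics("cluster-d", 4, 40.0, 50.0, 1.0),
]
request_a = Request(max_clusters=4, processors=16, rtt_ms=5.0,
                    bandwidth_mbps=5.0, packet_loss_pct=0.0)
request_b = Request(max_clusters=2, processors=16, rtt_ms=20.0,
                    bandwidth_mbps=500.0)

print(select_clusters(request_a, links))
print(select_clusters(request_b, links))
```

With this sample data, the latency-sensitive Request A is spread over the two low-RTT clusters, while the bandwidth-heavy Request B lands on a single cluster with a slower but wider path. A real broker would obtain these metrics from the Network Management System's Web Services interface rather than from a hard-coded list.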