slides - VECPAR
Transcription
slides - VECPAR
A P2P Approach to Many Tasks Computing for Scientific Workflows Eduardo Ogasawara Daniel Oliveira Carlos Pivotto Vanessa Braganholo Marta Mattoso Jonas Dias Carla Rodrigues Rafael Antas Patrick Valduriez COPPE/UFRJ PPGI/UFRJ Université Montpellier/INRIA VECPAR’10 Our Scenario • Scientific Experiments – Large scale simulations – Chain of programs and activities – Scientific Workflows • SWfMS (Kepler, VisTrails) • Exploration with different methods, parameters or data – Many Task Computing (MTC) 6/24/2010 A P2P Approach to Many Tasks Computing for Scientific Workflows 2 What we have • Workflow parallelization – Homogenous environments • Multiprocessor or cluster systems • Centralized control • High‐throughput, low latency, high performance – Heterogeneous environments • Grids, desktop grids and volunteering computing, hybrid clouds • Different efforts to parallelize the workflow 6/24/2010 A P2P Approach to Many Tasks Computing for Scientific Workflows 3 The problems • To parallelize in the context of scientific workflows – Independent of the target environments – Better control of the experiment • Provenance gathering – On heterogeneous distributed environment – Experiments must be reproducible • Control X Performance 6/24/2010 A P2P Approach to Many Tasks Computing for Scientific Workflows 4 What we propose • Peer‐to‐Peer approach – Decentralized control – Scalability – Dynamic behavior of nodes • P2P to support MTC – Through Scientific Workflows • SciMule – A tool to parallelize and distribute scientific workflows activities through P2P computing 6/24/2010 A P2P Approach to Many Tasks Computing for Scientific Workflows 5 Agenda 1. Backgrounds on P2P networks – P2P approaches overview and comparison 2. SciMule Architecture – Features and strategies 3. Experimental results regarding SciMule – Does it work? 4. Conclusions 6/24/2010 A P2P Approach to Many Tasks Computing for Scientific Workflows 6 P2P types Super‐Peer KaZaA Napster e2dk 6/24/2010 DHT Hierarchical Torrent Kademlia Canon Gnutella A P2P Approach to Many Tasks Computing for Scientific Workflows 7 Factors that impact on P2P architectures • Load Balancing – An activity can generate thousands of tasks • Scalability – Large Scale Networks • Churn risk – How impactful a churn event can be • Maintenance cost – Affects the scheduling process 6/24/2010 A P2P Approach to Many Tasks Computing for Scientific Workflows 8 Comparing Network type Centralized Decentralized Hierarchical Load Balancing Low High Moderate Scalability Low Moderate High Churn Risks High Low Moderate Maintenance Cost Low High Moderate 6/24/2010 A P2P Approach to Many Tasks Computing for Scientific Workflows Factor 9 Design of SciMule SWfMS • Middleware SciMule Peer – Distribute, control and monitor Wf Activities • Promote Workflow/Activity Parallelization • Hierarchical approach • Three‐layer architecture – Submission layer – Execution layer – Overlay layer 6/24/2010 A P2P Approach to Many Tasks Computing for Scientific Workflows 10 SciMule Peer Roles • Client Peer – Submits • Executor Peer – Executes • Gate Peer – Keeps the list of nearby nodes and their subject • Peers may play any role during SciMule lifetime 6/24/2010 A P2P Approach to Many Tasks Computing for Scientific Workflows 11 SciMule Purpose • To make it easier to distribute Wf activities – To make scientists happier! • Using MTC paradigm – MTC without control are hard to maintain • Activities that demand high computation • Adaptation of Hydra – Real system – Our approach for clusters 6/24/2010 A P2P Approach to Many Tasks Computing for Scientific Workflows 12 SciMule Architectural Features • Two types of parallelization – Data parallelism – Parameter sweep • Distributed provenance gathering – Heterogeneous network • Simple to deploy • Linked to SWfMS 6/24/2010 A P2P Approach to Many Tasks Computing for Scientific Workflows 13 6/24/2010 A P2P Approach to Many Tasks Computing for Scientific Workflows 14 SciMule Overlay • Overlay affects the performance • Balance peers – Locality principles – Subjects • Gate Peers – – – – 6/24/2010 Keep a list of nearby nodes Control subjects Have low churn frequency and high reputation Keep a backup node A P2P Approach to Many Tasks Computing for Scientific Workflows 15 Overlay characteristics • New peers connect to few neighbors – Half the average of the old peers neighborhood – Avoid free riders • Neighborhood grows over time • Controlled connectivity – Based on the number of gate peers on the network 6/24/2010 A P2P Approach to Many Tasks Computing for Scientific Workflows 16 Overlay behaviour • Network with n nodes and g Gate Peers … Gate Peers n/g n/g n/g 6/24/2010 A P2P Approach to Many Tasks Computing for Scientific Workflows 17 Overlay … Gate Peers n/g n/g 6/24/2010 + A P2P Approach to Many Tasks Computing for Scientific Workflows =2n/g 18 Overlay … Gate Peers + n/g Maximum of neighbors: 3n/g 6/24/2010 A P2P Approach to Many Tasks Computing for Scientific Workflows 19 SciMule Evaluation • Evaluate the proposed architecture – Does SciMule parallelization scale? • Simulation environment – Experimentation on real large scale P2P is expensive and difficult – Flexibility to evaluate hybrid networks • PeerSim extension – SciMule peers, Link, Data package, Workflow Activities and Task components – Scheduling, transference and execution systems – Overlay 6/24/2010 A P2P Approach to Many Tasks Computing for Scientific Workflows 20 The study • • • • Five‐factor, four‐treatment study 14400 cycles of simulation (5 days simulated) 4096 nodes in the network 0.01 Poisson activity submission frequency Factors Independent Variables cycles 14400 n 4096 f k 0.01 32 64 128 256 6/24/2010 tasks 128 512 cost size churn 4000 8000 12000 24000 0.00 0.05 16000 32000 48000 96000 0.10 A P2P Approach to Many Tasks Computing for Scientific Workflows 21 Presented Scenarios • 384 instances of simulation • Representative Activities cases – Churn Poisson frequency: 0%, 5% and 10% Activity Name Tasks Low Cost Medium Size Low Cost Big Size Medium Cost Small Size High Cost Small Size 512 128 128 512 6/24/2010 TaskCost TaskSize (MB) (p.u.) 4,000 6 4,000 12 16,000 1.5 32,000 1.5 A P2P Approach to Many Tasks Computing for Scientific Workflows 22 Parameter Sweep results Activity Name Tasks TaskCost TaskSize (MB) (p.u.) Low Cost Medium Size 512 4,000 6 Low Cost Big Size 128 4,000 12 A P2P Approach to Many Tasks Computing 6/24/2010 23 for Scientific Workflows Medium Cost Small Size 128 16,000 1.5 Data Parallelism results Activity Name Tasks TaskCost TaskSize (MB) (p.u.) Low Cost Medium Size 512 4,000 6 Low Cost Big Size 128 4,000 12 A P2P Approach to Many Tasks Computing 6/24/2010 24 for Scientific Workflows Medium Cost Small Size 128 16,000 1.5 Conclusions • SciMule scales – Distributing scientific workflow activities over a P2P network – Promising approach to deal with MTC on heterogeneous environments • P2P • Hybrid Clouds • Distribution approaches should vary • Impact of churn events 6/24/2010 A P2P Approach to Many Tasks Computing for Scientific Workflows 25 Future Work • To improve the scheduling mechanism – Considering data transferring minimization • Minimize data transfer impact – Improve data discovery – Improve data distribution • Torrent • Compression • Replication 6/24/2010 A P2P Approach to Many Tasks Computing for Scientific Workflows 26 Acknowledgements 6/24/2010 A P2P Approach to Many Tasks Computing for Scientific Workflows 27 A P2P Approach to Many Tasks Computing for Scientific Workflows Thanks! Eduardo Ogasawara Daniel Oliveira Carlos Pivotto Vanessa Braganholo Marta Mattoso Jonas Dias Carla Rodrigues Rafael Antas Patrick Valduriez COPPE/UFRJ PPGI/UFRJ Université Montpellier/INRIA VECPAR’10