Simulative Evaluation of the Greediness Alignment Algorithm

Louis-Marie Loe, Emmanuel Twumasi Appiah-Bonna
Zurich, Switzerland
Student ID: 05-310-214, 12-755-625
Master Project
Communication Systems Group (CSG), Prof. Dr. Burkhard Stiller
Supervisors: Patrick Poullie, Thomas Bocek
Date of Submission: March 20, 2015
University of Zurich
Department of Informatics (IFI)
Binzmuhlestrasse 14, CH-8050 Zurich, Switzerland
URL: http://www.csg.uzh.ch/
Abstract
Cloud computing has recently emerged as the leading technology for delivering reliable, secure,
fault-tolerant, sustainable and scalable computational services, which are offered as Software,
Infrastructure and Platform as a Service (SaaS, IaaS, PaaS). These services may be offered in
private datacenters (private clouds), offered commercially to users (public clouds), or combined
in hybrid clouds. With the rise of SCIs (Shared Computing Infrastructures) in general and of cloud
SCIs in particular, the fair allocation of multiple resources is rapidly gaining relevance in
communication systems. In particular, different resources such as CPU, RAM, disk space and
bandwidth have to be shared among users with different demands, such that the overall outcome can
be considered fair. Investigating resource allocation mechanisms in cloud SCIs and the fairness
mechanisms applicable to them requires a cloud simulator. This project covers the design,
implementation and testing of a cloud infrastructure simulator, referred to as the cloud simulator
in this report. The cloud simulator is implemented using the Openstack cloud technology.
Furthermore, we design and implement an elastic load simulator as a distinct software component
that is integrated into the cloud simulator. The load simulator uses a fairness metric to simulate
the sharing of resources requested by VMs (Virtual Machines) running on compute hosts in the
cloud. Additionally, we present the results obtained from the implemented cloud simulator as well
as from the implemented load simulator. Finally, we identify potential further work.
Acknowledgments
We would like to express our gratitude to our supervisors, Patrick Poullie and Dr. Thomas Bocek,
for their support throughout the entire course of this project. Their patient guidance,
encouragement, explanations and useful critiques contributed remarkably to the success of this
project. Our grateful thanks also go to Prof. Dr. Burkhard Stiller, who gave us the opportunity
to work on an interesting and challenging project.
Contents
Abstract
Acknowledgments
1 Introduction
1.1 Motivation
1.2 Description of Work
1.3 Thesis Outline
2 Related Work
2.1 Optimal Joint Multiple Resource Allocation [5]
2.2 Multi-dimensional Resource Allocation [6]
2.3 Dominant Resource Fairness [14]
2.4 Multi-Resource Allocation [7]
2.5 Limitation
3 SCI Architectures Concepts
3.1 Cluster Computing Infrastructure [1]
3.2 Grid Computing Infrastructure [1]
3.3 Cloud Computing Infrastructure [1]
4 Virtualization Concepts
4.1 Hardware Virtualization [4]
4.2 Role of VMM [3]
4.3 CPU Virtualization [3]
4.4 Memory Virtualization [3]
4.5 Device and I/O Virtualization [3]
4.6 Hypervisor Technologies in Cloud SCIs
5 Resource Allocation in Clouds
5.1 How Cloud Resources are bundled
5.2 The Role of the Cloud Scheduler
5.3 The Role of the Hypervisor or Compute Host
5.4 Cloud Consolidation Ratio
5.5 Memory Overcommitment Techniques [13]
5.6 CPU Overcommitment Techniques [17]
5.7 Hypervisors Resource Allocation Techniques [2]
6 Search for a suitable Cloud Simulation Tool
6.1 Comparison of CloudSim and Openstack [8]
6.2 Openstack Architecture Overview [12]
6.3 VMs provisioning in the Openstack Cloud [12]
6.4 Architecture of Openstack Keystone [12]
6.5 Architecture of Openstack Nova [12]
6.6 Architecture of Openstack Glance [12]
7 Overview of the Cloud Infrastructure Simulator
7.1 Physical Configuration
7.2 Cloud Resources, Tenants and VMs
7.3 Nova Compute FakeDriver and the Cloud Simulator
7.4 Principle of Decoupling and VMs Creation in the Cloud
7.5 Cloud Simulator high Level Design Principles
8 Overview of the Load Simulator
8.1 Logical Architecture Overview
8.2 Reader Layer
8.3 Validation, Aggregation and Grouping Layer
8.4 Load Consumption, Time Translation and Reporting Layer
8.5 Input Parameter Design: Load Design
8.6 Physical Time in the Load Simulator
8.7 The Fair Share Metric
8.8 Allocation and Reallocation Design
9 Evaluation
9.1 Output of some implemented Cloud Primitives
9.2 Load Simulator Tests and Results
9.3 Discussion on the Load Simulator Results
9.4 Further Work:
10 Summary and Conclusion
Bibliography
Abbreviations
Glossary
List of Figures
List of Tables
A Report on Milestones Implementation
A.1 Guiding Principle 1:
A.2 Guiding Principle 2:
A.3 Milestone 1: Search for a suitable Cloud Simulation Tool
A.4 Milestone 2: Comparison to Simulator Integration into Openstack
A.5 Milestone 3: Decision on Alternatives
A.6 Milestone 4: Input Parameter Design
A.7 Milestone 5: Reallocation Design
A.8 Milestone 6: Consumption Data Design
A.9 Milestone 7: Implementation
A.10 Milestone 8: Evaluation
B User Guide
B.1 Implemented Code Statistics
B.2 Use-Case 1: Running Simulations with existing Cloud Components
B.3 Use-Case 2: Running Simulations with new Cloud Components
B.4 Use-Case 3: Exploring the Cloud using the implemented Primitives
C Contents of the CD
Chapter 1
Introduction
1.1 Motivation
With the rise of SCIs such as clusters, grids and clouds, the fair allocation of multiple
resources is rapidly gaining relevance in communication systems. In particular, different
resources such as CPU, RAM, disk space and bandwidth have to be shared among users with
different demands, such that the overall outcome can be considered fair. Although several
recent studies have focused on resource allocation in clouds, they have done so without
presenting the physical, logical and system architecture underlying cloud SCIs. Since this
underlying architecture is undergoing constant innovation, we found it beneficial to present it
along with the investigated resource allocation mechanisms in order to achieve more relevance
and accuracy. Resource allocation mechanisms that are applicable to centralized SCIs such as
clusters are completely different from those applicable to clouds. Hence, any meaningful study
of resource allocation mechanisms in SCIs, including cloud SCIs, requires an in-depth
understanding of the system architecture that underlies the SCI under study. Moreover,
investigating resource allocation mechanisms in cloud SCIs and the fairness mechanisms
applicable to them requires a cloud simulator in which timely, repeatable and controllable
methodologies for investigating these algorithms can be applied. We design and implement such a
cloud simulator in this study.
1.2 Description of Work
In this study we pay special attention to the cloud SCI architecture. We present virtualization,
together with its related technologies, as the key foundation of resource sharing in clouds.
Further, we design and implement a cloud infrastructure simulator using the Openstack
technology. In addition, we design and implement an elastic load simulator that is a distinct
software component but is integrated into the cloud infrastructure simulator. The load simulator
uses the real-time cloud state to simulate resource allocation among different VMs running on
any compute host in the cloud, such that the overall outcome at any time is consistent with the
fairness metric used. As such, it is able to simulate resource allocation for a few VMs or for
the entire cloud.
1.3 Thesis Outline
In Chapter 2, we review related work. Chapter 3 covers SCI architecture concepts, including
cluster, grid and cloud SCIs. Chapter 4 deals with virtualization concepts, where we present
CPU, memory and I/O virtualization as well as modern-day cloud hypervisor technologies. In
Chapter 5, we cover resource allocation in clouds by describing RAM and CPU overcommitment
techniques as well as hypervisor allocation and reclaiming techniques. Chapter 6 begins with a
summary comparison between the CloudSim simulation tool and the Openstack cloud technology,
followed by a detailed presentation of the Openstack cloud technology. In Chapters 7 and 8, we
present the implemented cloud infrastructure simulator and the implemented load simulator
respectively. The evaluation in Chapter 9 shows the results of some sample simulations
performed with the implemented load simulator. The mapping of the initial milestones to the
present report is found in Appendix A. Appendix B contains the code statistics along with the
user guide.
Chapter 2
Related Work
In this chapter, we review four studies related to multi-resource allocation in SCIs.
2.1 Optimal Joint Multiple Resource Allocation [5]
This study models the cloud as allocating, for each request, the required amount of multiple
types of resources simultaneously from a common resource pool for a certain period of time. The
allocated resources are used by a single request and are not shared. The study then proposes a
new resource allocation method that considers only identified resources in the selection of a
center. This method adopts a Best-Fit approach and aims to reserve as much capacity as possible
for future requests that may require a larger amount of processing. In addition, the proposed
method aims to reduce the likelihood of deadlock. It thereby improves on the current allocation
method, which is based solely on one type of resource and uses Round-Robin. Round-Robin does not
consider processing ability and bandwidth jointly in the resource allocation. The proposed
method tends to reduce the request loss probability in comparison with Round-Robin.
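To illustrate why a Best-Fit choice over both resource types can lose fewer requests than a
choice that ignores remaining capacity, the following minimal sketch places requests on centers
that offer processing ability and bandwidth. The data layout, the names `Center` and `best_fit`,
and the leftover score are assumptions made for this illustration; they are not taken from [5].

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Center:
    name: str
    cpu: float        # free processing ability (hypothetical unit)
    bandwidth: float  # free bandwidth (hypothetical unit)


def best_fit(centers: List[Center], cpu: float, bandwidth: float) -> Optional[Center]:
    """Place a request on the feasible center with the least leftover capacity."""
    feasible = [c for c in centers if c.cpu >= cpu and c.bandwidth >= bandwidth]
    if not feasible:
        return None  # no center can serve both resources: the request is lost
    # Assumed score: summed leftover of both resources after placement.
    chosen = min(feasible, key=lambda c: (c.cpu - cpu) + (c.bandwidth - bandwidth))
    chosen.cpu -= cpu
    chosen.bandwidth -= bandwidth
    return chosen


centers = [Center("large", 10, 10), Center("small", 4, 4)]
first = best_fit(centers, 3, 3)    # fits both; "small" leaves less unused capacity
second = best_fit(centers, 8, 8)   # "large" is still intact for the bigger request
print(first.name, second.name)     # small large
```

Had the first request been placed on the large center, as a Round-Robin starting there would do,
the second request would have found no center with enough capacity.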
2.2 Multi-dimensional Resource Allocation [6]
The starting point of this study is the following remark: cloud resource allocation is typically
restricted to optimizing up to three objective functions, namely cost, makespan and data
locality. The study then proposes an optimization with four objectives: makespan, cost, data
locality and the satisfaction level of the user. To evaluate the proposed allocation scheme, the
study proposes a multi-cloud workflow framework architecture developed using the CloudSim
simulation toolkit, shown in Figure 2.1.
Figure 2.1: Multi-cloud workflow framework architecture based on CloudSim [6]
A closer look at the inner workings of the proposed framework reveals the following details:
• The workflow engine first receives a workflow description and the SLA requirements from the
user.
• After parsing the description, the workflow engine applies different clustering techniques to
reduce the number of workflow tasks.
• The match-maker selects the cloud resources that fit the user's requirements by applying
different matching policies.
• All requested virtual machines (VMs) and cloud storage are then deployed on the selected
clouds. The workflow engine transfers the input data from the client to the cloud storage and
then starts to release the workflow tasks according to their execution order.
As presented above, the proposed framework aims to optimize four objective functions, which is
an improvement over existing systems that optimize only three.
2.3 Dominant Resource Fairness [14]
This study proposes Dominant Resource Fairness (DRF), a generalization of max-min fairness that
addresses the allocation of multiple resource types. The following scenario is provided as an
example of a situation requiring a fair allocation of multiple resources: a system consists of
9 CPUs and 18 GB RAM and has two users. User A runs tasks that require ⟨1 CPU, 4 GB⟩ each, and
user B runs tasks that require ⟨3 CPUs, 1 GB⟩ each. What constitutes a fair allocation in this
case, given that max-min fairness is defined for a single resource and not for multiple
resources with heterogeneous requests? To address this gap, the study proposes DRF.
The study states the following important properties for the DRF scheme:
1. Sharing incentive: Each user should be better off sharing the cluster than exclusively using
her own partition of the cluster. Consider a cluster with identical nodes and n users. Then a
user should not be able to allocate more tasks in a cluster partition consisting of 1/n of all
resources.
2. Strategy-proofness: Users should not be able to benefit by lying about their resource
demands. This provides incentive compatibility, as a user cannot improve her allocation by
lying.
3. Envy-freeness: A user should not prefer the allocation of another user. This property
embodies the notion of fairness.
4. Pareto efficiency: It should not be possible to increase the allocation of a user without
decreasing the allocation of at least one other user. This property is important as it leads to
maximizing system utilization subject to satisfying the other properties.
The study presents a basic DRF algorithm and a weighted DRF scheduling algorithm. Figure 2.2
shows that the proposed DRF scheme completes more large jobs than slot-based fair sharing and
CPU-only fair sharing.
Figure 2.2: Number of large jobs completed for each allocation scheme in comparison of
DRF against slot-based fair sharing and CPU-only fair sharing [14]
As presented above, DRF addresses the problem of heterogeneous multi-resource sharing.
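The two-user scenario above can be worked through with a small progressive-filling sketch. A
user's dominant share is taken here as the largest fraction of any single resource allocated to
that user, and the next task always goes to the user with the lowest dominant share. The
function name `drf`, the tie-breaking rule and the decision to stop serving a user whose next
task no longer fits are assumptions of this sketch, not a transcription of the algorithm in [14].

```python
from fractions import Fraction


def drf(capacity, demands):
    """Grant whole tasks, always to the user with the lowest dominant share."""
    used = [0] * len(capacity)
    tasks = {user: 0 for user in demands}
    share = {user: Fraction(0) for user in demands}
    active = sorted(demands)  # users that may still receive tasks
    while active:
        user = min(active, key=lambda name: share[name])  # ties: first in name order
        demand = demands[user]
        if any(u + d > c for u, d, c in zip(used, demand, capacity)):
            active.remove(user)  # next task does not fit; stop serving this user
            continue
        used = [u + d for u, d in zip(used, demand)]
        tasks[user] += 1
        # Dominant share: largest fraction of any one resource held by this user.
        share[user] = max(Fraction(tasks[user] * d, c) for d, c in zip(demand, capacity))
    return tasks, share


# Scenario from the text: 9 CPUs and 18 GB RAM; A needs <1 CPU, 4 GB>, B needs <3 CPUs, 1 GB>.
tasks, share = drf(capacity=(9, 18), demands={"A": (1, 4), "B": (3, 1)})
print(tasks)  # {'A': 3, 'B': 2}
```

In this run user A ends with 3 tasks (3 CPUs, 12 GB) and user B with 2 tasks (6 CPUs, 2 GB), so
both users reach a dominant share of 2/3: RAM for A and CPU for B.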
2.4 Multi-Resource Allocation [7]
This study notes that fairness can be quantified with a variety of metrics and that different
notions of fairness, including proportional and max-min fairness, can be achieved through
various techniques. However, the study contends that the allocation of multiple types of
resources has received much less systematic study. Indeed, it is unclear what it means to say
that a multi-resource allocation is fair. For example, datacenters allocate different resources
(memory, CPUs, storage, bandwidth) to competing users with different requirements. One such user
might have computational jobs requiring more CPU cycles than memory, while another might have
the opposite requirements.
Figure 2.3: Example of multi-resource requirements in data-centers [7]
Using Figure 2.3, the paper presents the following example to highlight several multi-resource
allocation fairness metrics: user 1 requires 2 GB of memory and 3 CPUs per job, while user 2
needs 2 GB of memory and 1 CPU per job. There is a total of 6 GB of memory and 4 CPUs. Many
allocations might be considered fair in this example: should users be allocated resources in
proportion to their resource requirements? Or should they be allocated resources so as to
process equal numbers of jobs? Moreover, the paper cites datacenters that sell bundles of CPUs,
memory, storage and network bandwidth as examples of the multi-resource allocation problem.
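To show how much the answer depends on the chosen notion of fairness, the sketch below
evaluates the example under two candidate rules: equal numbers of jobs, and equal dominant
shares in the spirit of Section 2.3. Jobs are treated as divisible, and the helper names `fill`
and `dominant_share_per_job` are assumptions of this illustration. The two rules are not the
FDS or GFJ functions of [7].

```python
from fractions import Fraction as F

CAPACITY = (F(6), F(4))                      # 6 GB of memory, 4 CPUs
DEMAND = {1: (F(2), F(3)), 2: (F(2), F(1))}  # per-job (memory, CPUs) of each user


def fill(ratio):
    """Scale the job ratio between users until the first resource is exhausted."""
    per_unit = [sum(ratio[u] * DEMAND[u][i] for u in DEMAND) for i in range(len(CAPACITY))]
    scale = min(c / p for c, p in zip(CAPACITY, per_unit) if p > 0)
    return {u: scale * ratio[u] for u in DEMAND}


def dominant_share_per_job(user):
    """Largest fraction of any single resource that one job of this user consumes."""
    return max(d / c for d, c in zip(DEMAND[user], CAPACITY))


equal_jobs = fill({1: F(1), 2: F(1)})                                       # 1 job each
equal_dominant = fill({u: 1 / dominant_share_per_job(u) for u in DEMAND})   # 16/21 and 12/7 jobs
print(equal_jobs, equal_dominant)
```

Under equal job counts each user runs one job and the CPUs are the bottleneck. Under equal
dominant shares user 2 runs more than twice as many jobs as user 1, because one job of user 1
consumes 3/4 of all CPUs while one job of user 2 consumes only 1/3 of the memory.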
As a result, the paper presents mathematical functions that could help to define fairness.
These include FDS (Fairness on Dominant Shares) and GFJ (Generalized Fairness on Jobs), two
families of fairness functions that can be used for multi-resource allocation. Further
mathematical fairness theory is presented in Appendix A of the paper, including the following
axioms: the axiom of Continuity, the axiom of Saturation, the axiom of Partition and the axiom
of Starvation. By developing the FDS and GFJ functions, the paper aims to contribute to the
formalization of fairness theory.
2.5 Limitation
In this work we design, implement and test a cloud simulator infrastructure based on the
Openstack cloud technology. Moreover, we implement a load simulator using a fair share
metric. The cloud infrastructure we design and implement as well as the load simulator are
meant to support further research on fairness in cloud resource allocation mechanisms at
the CSG. Although the load simulator is integrated into the cloud infrastructure simulator,
it is a distinct and separate component.
Chapter 3
SCI Architectures Concepts
In this chapter we present the architectural overview of three of the main SCI infrastructure types: cluster, grid and cloud [1]. This will serve as the foundation of the subsequent
discussion helping to put in perspective the use of the term cloud SCIs with its related
architecture.
3.1 Cluster Computing Infrastructure [1]
A cluster, as presented in Figure 3.1, is a group of computers with a direct network
interconnect, centralized management and distributed execution facilities. In a cluster, the
centralized management includes authorization and authentication, a shared filesystem, and
application execution and management. The distributed execution facilities cover the execution
of jobs, where multiple units of the same parallel job may reside on separate resources. One of
the main uses of cluster computing is batch processing.
Figure 3.1: Overview of a cluster computing architecture [1]. Fairness can be introduced
by modifying the scheduler and resource allocation manager
With respect to the present work, the following cluster functional components are of interest:
• Resource manager: monitors the compute infrastructure, launches and supervises jobs, and
cleans up after termination.
• Job manager/scheduler: allocates resources and time slots (scheduling).
• Workload manager: handles policy and orchestration of jobs: fair share, workflow
orchestration, QoS, SLA.
Some standard scheduling schemes used in cluster systems are First Come, First Served; Shortest
Job First; Priority-based scheduling; and Fair-share scheduling. In the context of resource
sharing, the two scheduling schemes of interest are Priority-based scheduling and Fair-share
scheduling, which are characterized as follows.
Priority-based scheduling: In this scheme, the priority function is usually the weighted sum of
various contributions, including the requested run time, the number of processors, the wait time
in the queue, recent usage by the same user/group (fair share), and an administrator-set QoS.
Fair-share scheduling: Fair-share scheduling assigns higher priorities to users/groups that have
not used all of their resource quota (usually expressed in CPU time). It uses a variety of
parameters such as:
• Window length: how much historical information is kept and used for calculating resource
usage.
• Interval: how often resource utilization is computed.
• Decay: weights applied to resource usage in the past (e.g. 2 hours of CPU time one week ago
might weigh less than 2 hours of CPU time today).
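The interplay of window length, decay and the weighted priority sum can be sketched as follows.
The geometric decay, the normalization against the quota and all function and parameter names
are assumptions chosen for illustration; real cluster schedulers define these details in their
own ways.

```python
def decayed_usage(usage, decay, window):
    """usage[0] is the most recent interval; only `window` intervals are kept."""
    return sum(u * decay ** age for age, u in enumerate(usage[:window]))


def fair_share_factor(usage, quota, decay=0.5, window=4):
    """1.0 for an unused quota, falling to 0.0 once the decayed usage reaches it."""
    return max(0.0, 1.0 - decayed_usage(usage, decay, window) / quota)


def priority(factors, weights):
    """Weighted sum of the contributions named in `weights`."""
    return sum(weights[name] * factors[name] for name in weights)


# Both users consumed 2 CPU hours, but at different times (hypothetical numbers).
recent_user = fair_share_factor([2, 0, 0, 0], quota=4)   # usage in the latest interval
earlier_user = fair_share_factor([0, 0, 0, 2], quota=4)  # usage three intervals ago

weights = {"fair_share": 0.75, "wait_time": 0.25}
p_recent = priority({"fair_share": recent_user, "wait_time": 0.5}, weights)
p_earlier = priority({"fair_share": earlier_user, "wait_time": 0.5}, weights)
print(p_recent, p_earlier)  # 0.5 0.828125
```

With equal wait times, the user whose consumption lies further in the past receives the higher
priority, which is the effect the decay parameter is meant to produce.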
Possibility of introducing single-resource or multi-resource allocation fairness in clusters:
In a cluster, all users' jobs compete for the same physical resources. These jobs are centrally
submitted and managed by the cluster scheduler and the resource manager. Some cluster schedulers
already implement some form of fairness. However, complementing existing fairness schemes or
introducing a novel single-resource or multi-resource fairness scheme would require modifying
the behaviour of both the scheduler and the resource manager to achieve the desired fairness
outcome. An example of a cluster is Schroedinger, the main HPC cluster of UZH.
3.2 Grid Computing Infrastructure [1]
A computational grid is a hardware and software infrastructure that provides dependable,
consistent, pervasive and inexpensive access to high-end computational capabilities. From a
system viewpoint, a grid can be viewed as an aggregation of computational clusters for the
execution of a large number of batch jobs. As shown in Figure 3.2, it is typically
geographically distributed, and its resources come from multiple domains (or clusters). Using
the discovery service, the client host selects one cluster and submits a job there. It then
periodically polls for status information.
Figure 3.2: Overview of a grid computing architecture [1]. Fairness can be introduced at
the domain or cluster level
Resource allocation considerations in the grid are therefore similar to those described for the
cluster, with the additional step of using the discovery service to locate a cluster or domain.
The reason is that resource allocation decisions are taken locally by each cluster,
independently of the others. The main purpose of the discovery service is to publish the
clusters available in the grid to the user. An example of a grid is the Swiss Multi-Science
Computing Grid (SMSCG).
3.3 Cloud Computing Infrastructure [1]
Cloud computing is a model for enabling convenient on-demand network access to a shared pool of
virtualized, configurable computing resources (e.g. networks, servers, storage, applications and
services) that can be rapidly provisioned over the internet and released with minimal management
effort or service provider interaction. In a cloud computing infrastructure, virtualization is
the foundation for resource allocation and sharing. Thus, studying resource allocation in cloud
computing environments requires an in-depth understanding of the fundamentals of virtualization.
Resource allocation and sharing mechanisms in cloud SCIs are not applicable to SCIs where
virtualization is not central to the allocation mechanisms, such as traditional cluster and grid
SCIs. Cloud computing can also be defined as internet-based computing in which large groups of
remote servers, storage arrays and network equipment are networked, virtualized, dynamically
provisioned and orchestrated via a cloud operating system. This allows the creation of a pool of
compute, storage and network resources which can be allocated to a user's VMs on demand. We
examine resource allocation in cloud SCIs in detail in Chapter 5.
The important components of a modern-day cloud infrastructure architecture include the
following:
• Compute: responsible for the instantiation of VMs.
• Identity: responsible for authentication and authorization.
• Image repository: used to locate and retrieve images used in instantiating VMs.
• Storage: block or object storage used to store images and user data.
• Network: responsible for defining a complete, complex user network using Software Defined
Networking (SDN).
• Telemetry: used to store, process and retrieve cloud metrics.
An overview of a cloud infrastructure architecture is shown in Figure 3.3.
Figure 3.3: Overview of a cloud SCI architecture showing an Openstack cloud [1]
Chapter 4
Virtualization Concepts
In this chapter we define hardware virtualization, VMM, CPU virtualization, memory
virtualization and I/O virtualization.
4.1 Hardware Virtualization [4]
The term virtualization broadly describes the separation of a service request from the
underlying physical delivery of that service. With x86 hardware virtualization, a virtualization
layer is added between the hardware and the operating system, as shown in Figure 4.1. The
virtualization layer (hypervisor) is the software responsible for hosting and managing all
virtual machines on a host. It allows multiple operating system instances to run concurrently
within virtual machines on a single computer, dynamically partitioning and sharing the available
physical resources such as CPU, storage, memory and I/O devices. For standard x86 systems,
virtualization approaches use either a hosted or a hypervisor architecture. Figure 4.1 shows a
hosted virtualization approach, also called a type 2 hypervisor. A hosted architecture installs
and runs the virtualization layer as an application on top of an operating system.
Figure 4.2 shows a hypervisor virtualization approach using a type 1 hypervisor. A type 1
hypervisor is usually referred to simply as a hypervisor or bare-metal hypervisor, as it
installs the virtualization layer directly on standard x86 hardware. Since it has direct access
to the hardware resources rather than going through an operating system, a type 1 hypervisor is
more efficient than a type 2 hypervisor and delivers greater scalability, robustness and
performance. Production clouds almost exclusively use type 1 hypervisors.
Figure 4.3 shows a different view of x86 hardware virtualization using a type 1 hypervisor.
Although modern clouds mostly use type 1 hypervisors in production environments, in the cloud
simulator we designed and implemented in the framework of the present work we used a type 2
hypervisor, i.e. a hosted virtualization solution (Oracle Virtualbox), to create the controller
and the compute nodes. Within the compute nodes we use the QEMU hypervisor, which is an emulated
type 1 hypervisor (emulated KVM hypervisor).
Figure 4.1: x86 Virtualization Overview: A hosted virtualization or type 2 hypervisor:
The hypervisor runs on an OS. Examples: Oracle Virtualbox, VMware Player [4]
Figure 4.2: x86 Virtualization Overview: A hypervisor virtualization or type 1 hypervisor:
The hypervisor runs on bare metal. Most widely used in production clouds. Examples:
KVM, XEN, VMware ESXi, Microsoft Hyper-V [4]
Figure 4.3: x86 Virtualization Overview: A virtualization layer is added between the
hardware and the operating system [3]
4.2 Role of VMM [3]
Within any given type 1 hypervisor architecture there is a key component, sometimes called a
Virtual Machine Monitor (VMM), that implements the virtual machine hardware abstraction and is
responsible for running a VM. Each VMM has to partition and share the CPU, memory and I/O
devices of the physical host and present them to the VM as fully virtualized resources.
Figure 4.4 shows an overview of a VMM within a hypervisor.
Figure 4.4: VMM architecture Overview: Each VMM partitions physical resources and
presents them to VMs as virtual resources [3]
Figure 4.5: x86 privilege level architecture with no virtualization implemented [3]
4.3 CPU Virtualization [3]
x86 operating systems are designed to run directly on the bare-metal hardware, so they naturally
assume they fully own the computer hardware. The x86 architecture offers four levels of
privilege, known as Ring 0, 1, 2 and 3, to operating systems and applications to manage access
to the computer hardware. While user-level applications generally run in Ring 3, operating
systems must have direct access to the hardware and hence must run in Ring 0. Figure 4.5 shows
an overview of the x86 hardware access privilege levels with no virtualization implemented.
Virtualizing an x86 processor therefore poses the challenge of placing a virtualization layer
under the OS, which expects to run in the most privileged Ring 0. The virtualization layer is in
turn responsible for creating VMs and their hardware (resource provisioning or assignment). To
address this challenge, several CPU virtualization technologies have been developed, including
binary translation, paravirtualization and hardware-assisted virtualization. These technologies
are implemented in hypervisors running in modern-day clouds. Following is a brief presentation
of these techniques:
Full virtualization using binary translation: This technique translates the guest OS kernel code
to replace non-virtualizable instructions with new sequences of instructions that have the
intended effect on the virtual hardware. Meanwhile, user-level code is executed directly on the
physical processor to achieve good performance.
OS-assisted virtualization, also called paravirtualization: In this approach, the kernel of the
guest OS is modified to replace non-virtualizable instructions with hypercalls that communicate
directly with the virtualization layer (hypervisor). The hypervisor also provides hypercall
interfaces for other critical kernel operations such as memory management, interrupt handling
and time keeping.
Hardware-assisted virtualization: In hardware-assisted virtualization, virtualization
technologies such as Intel VT-x and AMD's AMD-V are built into the CPU itself. They introduce a
new CPU execution mode, called root mode, in which the hypervisor runs below Ring 0.
Figure 4.6: x86 Memory Virtualization. The VMM is responsible for mapping the VM
physical memory to the host physical memory [3]
4.4
Memory Virtualization [3]
Besides CPU virtualization, the x86 memory must also be virtualized. This involves
sharing the physical system memory and dynamically allocating it to virtual machines.
Virtual machine memory virtualization is very similar to the virtual memory support
provided by modern operating systems such as Linux. Applications see a contiguous address
space that is not necessarily tied to the underlying physical memory in the system. The
operating system keeps the mappings of virtual page numbers to physical page numbers
stored in page tables. All modern x86 CPUs include a memory management unit (MMU)
and a translation lookaside buffer (TLB) to optimize virtual memory performance.
To run multiple virtual machines on a single system, another level of memory virtualization is required. One has to virtualize the MMU to support the VM (guest OS). The
guest OS continues to control the mapping of virtual addresses to the guest memory
physical addresses, but the guest OS cannot have direct access to the actual physical
machine memory. The VMM is responsible for mapping guest physical memory to the
actual physical machine memory, and it uses shadow page tables to accelerate the
mappings. The VMM uses TLB hardware to map the virtual memory directly to the machine
memory to avoid the two levels of translation on every access. When the guest OS changes
the virtual memory to physical memory mapping, the VMM updates the shadow page
tables to enable a direct lookup. MMU virtualization creates some overhead which can
be mitigated by using hardware assisted virtualization. Figure 4.6 shows an overview of
memory virtualization.
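The two levels of translation and the role of the shadow page table can be illustrated with a small sketch. It is a simplified model under our own assumptions: pages are plain integers, the tables are dictionaries, and the names are hypothetical rather than taken from any hypervisor.

```python
# Guest page table: guest virtual page -> guest physical page (kept by the guest OS)
guest_page_table = {0: 7, 1: 3, 2: 5}
# VMM mapping: guest physical page -> machine page (kept by the VMM)
vmm_map = {3: 40, 5: 41, 7: 42}


def build_shadow(guest_pt, vmm):
    """Compose both mappings into a direct virtual -> machine page table."""
    return {vpage: vmm[gpage] for vpage, gpage in guest_pt.items()}


shadow_before = build_shadow(guest_page_table, vmm_map)

# The guest OS remaps virtual page 1; the VMM must refresh the shadow table
guest_page_table[1] = 5
shadow_after = build_shadow(guest_page_table, vmm_map)

print(shadow_before)
print(shadow_after)
```

With the shadow table in place, a lookup needs one step instead of two, at the cost of rebuilding entries whenever the guest changes its own page table.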
4.5
Device and I/O Virtualization [3]
In addition to CPU and memory virtualization, x86 hardware devices and I/O must also
be virtualized. Virtualizing the x86 devices involves managing the routing of I/O requests between virtual devices and the shared physical hardware. In most modern-day
devices, I/O virtualization is done via software in contrast to a direct pass-through to
the hardware. This approach enables a set of new features and simplified management.
For example with networking, creating virtual NICs (vNICs) and virtual switches allows
the creation of virtual networks. Virtual networks consume no bandwidth on the physical
Figure 4.7: x86 Device and I/O Virtualization. The hypervisor uses software to emulate
virtual devices and I/O and to translate VM requests to the system hardware [3]
network as long as the traffic is not destined for a VM running on a different physical
host. As such, sharing bandwidth in modern-day clouds does not always involve sharing
the physical bandwidth. This becomes necessary only when VM traffic must exit
the compute host (hypervisor). The hypervisor uses software to virtualize the physical
hardware and presents each virtual machine with a standardized set of virtual devices.
These virtual devices effectively emulate well-known hardware and translate the virtual
machine requests to the system hardware. Figure 4.7 shows an overview of device and
I/O virtualization.
4.6
Hypervisor Technologies in Cloud SCIs
Most modern-day hypervisors implement a combination of virtualization technologies
while offering support for others. The commonly implemented technologies include
full virtualization using binary translation, OS-assisted virtualization (paravirtualization)
and hardware-assisted virtualization using processor virtualization extensions such as
Intel VT-x and AMD’s AMD-V. In the following, we briefly present the four main
modern-day hypervisors used in clouds: KVM, VMware ESXi, XEN and
Microsoft Hyper-V.
KVM Hypervisor: KVM is one of the most widely used hypervisors in clouds. It uses a
combination of hardware-assisted virtualization and paravirtualization.
Hardware-assisted virtualization is used for the core CPU and memory virtualization by
leveraging the virtualization extensions of Intel and AMD processors. These extensions
enable running fully isolated virtual machines at native hardware speeds for some
workloads. KVM uses paravirtualization for device drivers to improve I/O
performance. In KVM, support for paravirtualized drivers is implemented in the virtio
modules.
VMware ESXi: While VMware ESXi implements full virtualization using binary translation,
it also takes full advantage of hardware-assisted virtualization in modern chipsets
and of paravirtualization in the form of paravirtualized drivers to achieve higher
virtualization performance. Rather than implementing code to emulate real-world I/O devices,
VMware ESXi provides code for simpler virtual devices that are practical for all purposes
and yet achieve greater levels of performance. These paravirtualized drivers ship in the form of
VMware Tools.
XEN Hypervisor: The XEN hypervisor uses paravirtualization and supports
hardware-assisted virtualization. Paravirtualization requires modification of the guest OS
kernel. This implies that guest OSes running on the XEN hypervisor must be
virtualization-aware. To address this disadvantage for Linux guests, most recent
Linux distributions have built-in drivers that let them run unmodified on XEN. XEN achieves
a lower virtualization overhead because the operating system and hypervisor work together more
efficiently, without the overhead imposed by the emulation of the system’s hardware
resources. This can allow virtual disks and virtual network cards to operate at near-native
hardware performance.
Main differences between XEN and KVM: XEN is an external hypervisor and as
such it assumes control of the physical machine and divides resources among guests. On
the other hand, KVM is part of Linux and uses the regular Linux scheduler and memory
management. This means that KVM is much smaller and simpler to use; for example,
KVM can swap guests to disk in order to free RAM. While KVM only runs on processors
that support the Intel VT and AMD-V instruction extensions, XEN also allows running
modified guest OSes on CPUs without hardware-assisted virtualization.
Main differences between KVM and QEMU: It should be noted that KVM and
QEMU are two related hypervisors, sometimes called KVM-QEMU. While the QEMU
hypervisor uses emulation, KVM uses processor extensions for virtualization. Because it
does not depend on these extensions, QEMU allows a user to use a VM as a compute host.
For this reason, QEMU is the hypervisor used in the cloud simulator infrastructure we
implemented in this project.
Microsoft Hyper-V Hypervisor: Just like KVM and XEN, Microsoft
Hyper-V uses hardware-assisted virtualization technology. As such, any hardware on
which Hyper-V is run requires a processor with HVM (Hardware Virtualization
Extensions) instruction sets such as Intel VT-x and AMD-V. It should be noted that most
recent x86 processors are built with these extensions. Also like KVM, Hyper-V supports
paravirtualization to improve I/O performance.
Chapter 5
Resource Allocation in Clouds
In this chapter we look at how cloud resources are bundled, the role of the cloud scheduler
in the resource allocation process and the role of the compute host (hypervisor), and explain
the cloud consolidation ratio. Next we present the resource overcommitment and reclaiming
techniques used in cloud SCIs.
5.1
How Cloud Resources are bundled
In a cloud SCI, resources are bundled in flavors (resource templates). A flavor is
associated with a VM at creation time. After this association is successfully done, the newly
created VM inherits the resources bundled in the flavor. Flavors encapsulate the maximum
resources intended for the VM. This includes the maximum RAM size, the maximum
number of vCPUs and the maximum disk size. VMs obtain other resources at creation
time that are not part of a flavor. For example, network bandwidth is not part of a flavor,
but a VM will obtain a vNIC (Virtual Network Interface Card) for its networking needs
at creation time. Some hypervisors allow the resizing of a VM to make room for more
resources in the VM. This resizing is a permanent operation and from the resizing point
onward, the VM will be bound by its new resource limits. Resource control in the cloud
is done at two layers: at the cloud scheduler layer and at the compute host layer.
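The relation between flavors and VMs described above can be pictured as a small data structure. This is only a sketch: the class and field names are our own, and the flavor values are illustrative rather than taken from the cloud used in this project.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flavor:
    """Resource template: upper limits a VM inherits at creation time."""
    name: str
    ram_mb: int
    vcpus: int
    disk_gb: int


@dataclass
class VM:
    name: str
    flavor: Flavor

    def resize(self, new_flavor):
        # Resizing is permanent: from now on the VM is bound by the new limits
        self.flavor = new_flavor


small = Flavor("small", ram_mb=2048, vcpus=1, disk_gb=20)
medium = Flavor("medium", ram_mb=4096, vcpus=2, disk_gb=40)

vm = VM("vm1", small)
vm.resize(medium)
print(vm.flavor.ram_mb, vm.flavor.vcpus, vm.flavor.disk_gb)
```

Network bandwidth is deliberately absent from the template, mirroring the fact that the vNIC is obtained outside the flavor.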
5.2
The Role of the Cloud Scheduler
In cloud SCIs, the role of the cloud scheduler can be summarized as follows: Given
a request for a VM from a cloud user, find a suitable compute host in the cloud that
has enough resources to create and host the user’s VM. If no suitable host in the cloud
is able to satisfy the user’s VM creation request, deny the request and
generate an error message. This mechanism presupposes that the validity of the user’s
request, the user’s authentication and authorization limits (quotas) have already been checked.
Further, the placement of workloads or jobs in VMs is done at the VM layer by the user. As the cloud
technology evolves, new tools are being developed to help users automate the placement
of workloads on their VMs in the cloud. In the Openstack cloud technology one such tool is
Heat. But even with such tools, users’ workloads remain tied to the resources available
in their VMs. The cloud scheduler’s main responsibility is to orchestrate the dynamic
placement of VMs on compute hosts in the cloud. This has led to new challenging research
use cases in clouds, such as the optimization of VM placement along a number
of dimensions such as locality.
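The scheduler's decision can be sketched as a filter step followed by a ranking step. The sketch below is an assumption-laden simplification: the host dictionaries, the NoValidHost exception and the choice to prefer the host with the most free RAM are our own illustration, not the actual Openstack scheduler code.

```python
class NoValidHost(Exception):
    """Raised when no compute host can satisfy the request."""


def schedule(request, hosts):
    """Return the name of a host that can hold the requested VM."""
    # Filter: keep hosts with enough of every requested resource
    candidates = [
        host for host in hosts
        if all(host["free"][res] >= amount for res, amount in request.items())
    ]
    if not candidates:
        raise NoValidHost("no compute host has enough free resources")
    # Rank: prefer the candidate with the most free RAM (illustrative policy)
    best = max(candidates, key=lambda host: host["free"]["ram_mb"])
    return best["name"]


hosts = [
    {"name": "compute1", "free": {"ram_mb": 1024, "vcpus": 4, "disk_gb": 100}},
    {"name": "compute2", "free": {"ram_mb": 8192, "vcpus": 2, "disk_gb": 50}},
    {"name": "compute3", "free": {"ram_mb": 4096, "vcpus": 8, "disk_gb": 200}},
]
vm_request = {"ram_mb": 2048, "vcpus": 1, "disk_gb": 20}
chosen = schedule(vm_request, hosts)
print(chosen)
```

Here compute1 is filtered out for lack of RAM, and compute2 is chosen over compute3 because it has more free RAM.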
5.3
The Role of the Hypervisor or Compute Host
The hypervisor or compute host has the responsibility to use its virtualization technology
to manage resource allocation and reallocation as well as resource reclaiming from its running
VMs. To this end it uses its own scheduler. This means that at runtime, the allocation of
resources to the VMs themselves is handled by the compute host scheduler. At this stage the cloud
scheduler has already determined that the hypervisor can host the VM and hence the
VM has already been placed on the hypervisor. When a user starts a huge workload on a
small VM (a VM with few resources), the compute host will not increase the VM size
to accommodate the huge workload.
On the contrary, this will result in slow processing or even in the failure of such processing
in case of extreme resource scarcity. The compute host will always allocate resources to a
VM under the constraint that the amount of these resources can never be more than the
VM size. Now assume the compute host has already allocated all its resources to some VMs.
What happens when another VM hosted on the same compute host requests
resources?
The compute host will gradually reclaim resources from other VMs and allocate them to the
requesting VM under the constraint that no running job in the other VMs should
fail. In the rare case where reclaiming resources from running VMs could lead to the failure
of already running workloads, the hypervisor will not satisfy the new request for resources
for a period of time, leading to further waiting time on the part of the requesting VM [2].
At this point the cloud scheduler can also intervene to automatically and transparently
move the requesting VM to another compute host with enough resources, if such a host
exists in the cloud.
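The reclaiming behaviour described above can be sketched for a single resource (memory). This is a simplified model under our own assumptions: each VM records its size, its currently allocated memory and the memory its running jobs actually need; all names are hypothetical.

```python
def request_memory(vms, requester, amount_mb, host_free_mb):
    """Grant memory to `requester`, reclaiming unused memory from other VMs.

    Returns the amount granted; 0 means the requester has to wait.
    """
    vm = vms[requester]
    # A VM can never be allocated more than its size
    amount_mb = min(amount_mb, vm["size_mb"] - vm["allocated_mb"])
    # Only memory not needed by running jobs may be reclaimed
    reclaimable = sum(
        other["allocated_mb"] - other["in_use_mb"]
        for name, other in vms.items() if name != requester
    )
    if amount_mb > host_free_mb + reclaimable:
        return 0  # reclaiming more would make running workloads fail
    needed = max(0, amount_mb - host_free_mb)
    for name, other in vms.items():
        if name == requester or needed == 0:
            continue
        take = min(needed, other["allocated_mb"] - other["in_use_mb"])
        other["allocated_mb"] -= take
        needed -= take
    vm["allocated_mb"] += amount_mb
    return amount_mb


running_vms = {
    "a": {"size_mb": 4096, "allocated_mb": 1024, "in_use_mb": 1024},
    "b": {"size_mb": 4096, "allocated_mb": 4096, "in_use_mb": 2048},
    "c": {"size_mb": 2048, "allocated_mb": 2048, "in_use_mb": 2048},
}
first_grant = request_memory(running_vms, "a", 1024, host_free_mb=0)
second_grant = request_memory(running_vms, "a", 4096, host_free_mb=0)
print(first_grant, second_grant)
```

The first request succeeds by reclaiming idle memory from VM b; the second is capped at the VM size and still cannot be met without hurting running jobs, so the requester waits.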
5.4
Cloud Consolidation Ratio
A key benefit of virtualization is the ability to consolidate multiple workloads onto a single
computer system. It enables users to consolidate virtual hardware on fewer physical hardware
resources, thereby using hardware resources efficiently. To achieve a higher utilization
rate, a higher VM density and thus a better cloud consolidation ratio, cloud SCIs make
use of resource overcommitment [2]. For instance, the amount of overcommitted memory
in a cloud is the amount of memory the cloud pretends to have. This amount is usually
higher than what the cloud actually has, which is the amount of the physical resource
itself. The consolidation ratio is a measure of the virtual hardware that has been placed
on physical hardware. For example, a consolidation ratio of 2 means that the
VMs created in the cloud have together been assigned twice the amount of physical resources
available in the cloud. A higher consolidation ratio typically indicates greater efficiency.
We can infer from the foregoing that resources assigned to a VM at creation time are not
actually allocated at creation time; they are only promised to the VM at that time. They are
allocated to the VM when it actually has to process some workload. The idea behind
overcommitment is that all VMs will seldom request their maximum resources
simultaneously. The advantages modern clouds derive from a higher consolidation ratio include
savings in power consumption, capital expense, and administration costs. The degree of
savings depends on the ability to overcommit hardware resources such as memory, CPU
cycles, I/O, and network bandwidth. It should be noted that the same techniques used to
overcommit resources in clouds are equally used to reclaim resources. We next consider
some memory and CPU overcommitment techniques [2],[3],[4].
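As a worked example of the consolidation ratio for one resource, consider the following sketch. The function name and the numbers are our own illustration.

```python
def consolidation_ratio(assigned_per_vm, physical_capacity):
    """Ratio of the resources promised to VMs to the physical resources."""
    return sum(assigned_per_vm) / physical_capacity


# 16 VMs with 4096 MB each on a host with 32768 MB of physical RAM:
# 16 * 4096 = 65536 MB promised, 65536 / 32768 = 2.0
ram_ratio = consolidation_ratio([4096] * 16, 32768)
print(ram_ratio)
```

A ratio above 1 is only sustainable as long as the VMs do not all claim their promised memory at the same time.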
5.5
Memory Overcommitment Techniques [13]
Memory overcommitment enables a higher consolidation ratio in a hypervisor. Using
memory overcommitment, users can consolidate VMs on a physical machine such that
physical resources are utilized in an optimal manner while delivering good performance.
Memory ballooning is a technique in which the host instructs a cooperative guest to
release some of its assigned memory so that it can be used for another purpose. This
technique can help shift memory pressure from the host onto a guest.
Kernel Same-page Merging (KSM) uses a kernel thread that scans previously identified memory ranges for identical pages, merges them together, and frees the duplicates.
Systems that run a large number of homogeneous virtual machines benefit most from this
form of memory sharing.
Memory Swapping. Using this technique, a hypervisor follows the concepts
of virtual memory overcommitment traditionally used in Linux systems. As such, memory
pages requested by a process are not allocated until they are actually used. Using the
Linux page cache, multiple processes can save memory by accessing files through shared
pages; as memory is exhausted, the system can free up memory by swapping less frequently
used pages to disk. These techniques can result in a substantial difference between the
amount of memory that is allocated and the amount actually used by VMs, leading to
higher consolidation. In KVM, for example, VMs are seen by the Linux host simply as
Linux processes.
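The idea behind same-page merging can be sketched as content-based deduplication. Note that this is a stand-in for illustration: the sketch identifies identical pages by a hash, whereas the real KSM kernel thread compares page contents and marks merged pages copy-on-write. All names are hypothetical.

```python
import hashlib


def merge_identical_pages(pages):
    """Keep one copy per distinct page content.

    Returns (store, refs): the unique pages keyed by digest, and for every
    original page the digest of the shared copy it now points to.
    """
    store = {}
    refs = []
    for page in pages:
        digest = hashlib.sha256(page).hexdigest()
        store.setdefault(digest, page)  # first copy is kept, duplicates dropped
        refs.append(digest)
    return store, refs


zero_page = bytes(4096)
code_page = b"A" * 4096
vm_pages = [zero_page, code_page, zero_page, zero_page, code_page]

page_store, page_refs = merge_identical_pages(vm_pages)
pages_saved = len(vm_pages) - len(page_store)
print(pages_saved)
```

Five pages collapse into two shared copies, which is why hosts running many homogeneous VMs benefit most.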
5.6
CPU Overcommitment Techniques [17]
A virtual CPU assigned to a VM equates to a physical core in the hypervisor, but when
the VM attempts to process something, it can potentially run on any of the cores that
happen to be available at that moment in the Hypervisor. The hypervisor scheduler
handles this, and the VM is not aware of it. Also one can assign multiple vCPUs to a
VM which allows it to run concurrently across several cores of the hypervisor as long as
the process being run supports some form of parallelism (simultaneous usage of multiple
cores). Cores are shared between all VMs as needed, so for example we could have a
4-core hypervisor and 10 VMs running on it with 2 vCPUs assigned to each. VMs share
all the cores in the hypervisor quite efficiently as determined by the hypervisor scheduler
leading to the maximum use of under-utilized resources and a higher consolidation ratio.
However, if the VMs are so busy that they have to contend for CPU time, the outcome
is that VMs may have to wait for CPU time. Although this is transparent to the VMs
and managed by the hypervisor scheduler, it results in processing delays in the VMs. At the
cloud level, the cloud scheduler can be configured in advance to limit the amount of CPU
and memory overcommitment in the entire cloud. This requires prior knowledge of the
cloud’s resource needs.
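The example of a 4-core hypervisor running 10 VMs with 2 vCPUs each can be worked through as follows. The function names and the limit values are our own assumptions for illustration; they are not the names of actual scheduler options.

```python
def cpu_overcommit_ratio(vcpus_per_vm, physical_cores):
    """Number of assigned vCPUs per physical core."""
    return sum(vcpus_per_vm) / physical_cores


def within_limit(vcpus_per_vm, physical_cores, max_ratio):
    """Check a placement against a configured overcommitment limit."""
    return cpu_overcommit_ratio(vcpus_per_vm, physical_cores) <= max_ratio


vm_vcpus = [2] * 10                             # 10 VMs, 2 vCPUs each
cpu_ratio = cpu_overcommit_ratio(vm_vcpus, 4)   # 20 vCPUs / 4 cores = 5.0
print(cpu_ratio)
print(within_limit(vm_vcpus, 4, max_ratio=4.0))
```

With a hypothetical limit of 4 vCPUs per core, this placement would be rejected; with a more permissive limit it would be accepted at the price of possible CPU contention.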
5.7
Hypervisors Resource Allocation Techniques [2]
KVM hypervisor: VMs are regular processes in KVM, and therefore standard memory
management techniques like swapping apply. For Linux guests, a balloon driver is
installed and it is controlled by the host via the balloon monitor command. Some hosts
also support kernel same-page merging (KSM). KVM requires host and guest OSes to
support memory overcommitment. A guest OS that doesn’t support memory
overcommitment cannot run on KVM.
VMware ESXi: ESXi works for all guest OSes. In addition to ballooning it also uses
content-based page sharing and memory compression. This approach improves VM performance as compared to the use of only ballooning and hypervisor-level swapping.
Xen hypervisor: Xen uses a mechanism called dynamic memory control (DMC) to
implement memory reclamation. It works by proportionally adjusting memory among
running VMs based on predefined minimum and maximum memory. VMs generally run
with maximum memory, and the memory can be reclaimed via a balloon driver when
memory contention in the host occurs. However, Xen does not provide a way to overcommit the host physical memory, hence its consolidation ratio is largely limited. Xen
provides a memory management mechanism to manage all host idle memory and guest
idle memory. The idle memory is collected into a pool and distributed based on the demand of running VMs. This approach requires the guest OS to be paravirtualized, and
only works well for guests with non-concurrent memory pressure.
Microsoft Hyper-V: Hyper-V uses dynamic memory for supporting memory overcommitment. With dynamic memory, each VM is configured with a small initial RAM when
powered on. When the guest applications require more memory, a certain amount of
memory will be hot-added to the VM and the guest OS. When a host lacks free memory,
a balloon driver will reclaim memory from other VMs and make memory available for hot
adding to the demanding VM. In rare and restricted scenarios, Hyper-V will swap VM
memory to a host swap space.
Chapter 6
Search for a suitable Cloud
Simulation Tool
In this chapter, we present a comparative overview of CloudSim [8] and Openstack
[12] followed by a detailed Openstack presentation. The detailed Openstack presentation
includes the Keystone identity service, the Nova compute service and the Glance image
service. These are the three most critical services used in the cloud simulator. They
are also directly related to the implementation of the load simulator, which includes a Nova
reader and a Keystone reader.
6.1
Comparison of CloudSim and Openstack [8]
The layered cloud computing architecture is shown in Figure 6.1 [8]. To research the
complex mechanisms underlying cloud SCI infrastructures, including the resource
allocation mechanisms along with their fairness schemes, we need an adequate cloud simulator.
The flexibility offered by a simulator is the ability to design, implement and test without
affecting any production environment. Further, one can improve an initial design over time
without worrying about the cost of using a production cloud. In relation to the
present project, an adequate cloud simulator tool should therefore have the functionality
to enable the creation and management of cloud components. This should include the
ability to create and simulate IaaS components such as compute hosts, VMs and
tenants. Ideally, such a tool should be open source, have extensive documentation and enjoy
wide acceptance both in the academic research community and in industry research
labs.
While there are several tools aiming at simulating real-world clouds, we restricted our
focus to two such tools: CloudSim and Openstack. We investigated both tools using the
aforementioned desirable qualities. Table 6.1 summarizes the results of our investigation.
Although both CloudSim and Openstack can be used as cloud simulators, the following
four critical reasons determined our final decision to adopt Openstack. These reasons
can be inferred from Table 6.1:
Figure 6.1: Layered cloud computing architecture [8]
Table 6.1: Comparison of CloudSim and Openstack cloud simulation tools
Parameter | CloudSim | OpenStack
Platform | SimJava | Linux
License type | Open source | Open source
Speed of execution | Moderate, built on Java | Fast, built on Python
Extent of implementation | Limited: Uni Melbourne, Microsoft, European universities | Worldwide adoption: HP, MIT, Berkeley, CERN, IBM, Cisco, Google, NASA, Microsoft
Physical model | None, no CloudSim cloud technology | Full, productive Openstack technology
Documentation | Limited: Developer guides | Extensive: Admin, Architect, Developer guides
Ease of creating IaaS components | Limited, no hypervisor technology support | Extensive, supports KVM, Hyper-V, ESXi
Integration into Openstack cloud | None, no such integration exists | By default
1. CloudSim remains a cloud simulator: It has no associated cloud technology.
Openstack has become not only the standard in cloud technology, but it can also
be used as a cloud simulator.
2. Ease of integration with Openstack: We were specifically required to look for a
cloud simulator that easily integrates with the Openstack technology. The
Openstack cloud simulator is built into the Openstack cloud technology itself.
3. Available Documentation: While CloudSim has some available documentation,
the Openstack documentation including the User guides, the Architectures guides
and the Development guides is much more extensive.
4. Widespread use and tool for the future: While CloudSim has been around
for several years and is widely used, recent developments have made Openstack
the industry-standard cloud technology. While most industry research labs have
embraced Openstack, we are convinced the academic community will follow soon
after overcoming the initial steep learning curve.
Openstack appears to be much more complex than CloudSim. However, this complexity
is due to the fact that Openstack has become the de facto cloud technology and not just
a cloud simulator tool. Most of the cloud services built therein can also be used for
simulative purposes. This makes Openstack a very powerful cloud simulator tool with
much more functionality than CloudSim.
6.2
Openstack Architecture Overview [12]
The OpenStack project as a whole is designed to deliver a massively scalable cloud
operating system. To achieve this, each component or service is designed to work
with other components to provide a complete Infrastructure as a Service (IaaS). This
integration is facilitated through public application programming interfaces (APIs) that
each component offers and that other components and users alike consume. While these
APIs allow each service to use another service, they also allow a developer to
modify any service transparently to the user as long as the APIs remain unchanged. These
APIs are available both to other cloud services and to the cloud end-users/tenants. The
Openstack release used in this project is the Icehouse release. Figure 6.2 shows the
architecture overview of the Openstack cloud infrastructure. After reviewing several sources,
we found it beneficial to present the overview of each service from both the Wikipedia [11]
and the Openstack.org [12] perspective.
Compute (Nova) is the control layer of the Infrastructure-as-a-Service (IaaS) cloud
computing platform. It allows control over instances and networks, and allows
managed and controlled access to the cloud through users and projects. The Nova compute
service does not include virtualization software. Instead, it defines drivers that interact
with underlying virtualization mechanisms that run in the hypervisor, and exposes
functionality over a web-based API [12]. Compute (Nova) is a cloud computing fabric
controller, which is the main part of an IaaS system. It is designed to manage and automate
Figure 6.2: Openstack cloud architecture overview [12]
pools of computer resources and can work with widely available virtualization
technologies, as well as bare metal and high-performance computing (HPC) configurations. KVM,
Xen, Hyper-V and Linux container technology such as LXC are all supported [11].
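To illustrate the web-based API that Nova exposes, the sketch below builds a request for creating a new instance without sending it. It assumes the version 2 Compute API layout (a POST to the servers resource with an X-Auth-Token header); the host name, port, tenant ID and the image and flavor IDs are placeholders of our own, not values from this project.

```python
import json


def build_boot_request(compute_url, tenant_id, token, name, image_id, flavor_id):
    """Return (url, headers, body) for a 'create server' POST request."""
    url = "{}/v2/{}/servers".format(compute_url.rstrip("/"), tenant_id)
    headers = {"X-Auth-Token": token, "Content-Type": "application/json"}
    body = json.dumps({
        "server": {"name": name, "imageRef": image_id, "flavorRef": flavor_id}
    })
    return url, headers, body


# All values below are placeholders for illustration
boot_url, boot_headers, boot_body = build_boot_request(
    "http://controller:8774/", "tenant-id", "auth-token",
    name="vm1", image_id="image-id", flavor_id="2",
)
print(boot_url)
print(boot_body)
```

The flavor reference in the body is what ties the new instance to its resource limits, and the token is the one issued by the Identity Service.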
Identity Service (Keystone) performs the following functions: Tracking users and their
permissions; providing a catalog of available services with their API endpoints. When implementing the Identity Service, one must register each service to be made available in
the cloud infrastructure. Identity service can then track which Openstack services are
available and where they are located on the network [12]. Identity Service (Keystone)
provides a central directory of users mapped to the Openstack services they can access.
It acts as a common authentication system across the cloud operating system and can
integrate with existing backend directory services like LDAP. It supports multiple forms
of authentication including standard username and password credentials, token-based systems and AWS-style (Amazon Web Services) logins. Additionally, the catalog provides a
queryable list of all of the services deployed in an Openstack cloud in a single registry.
Users and third-party tools can programmatically determine which resources they can
access [11].
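Username and password authentication against the Identity Service can be sketched as follows. The sketch assumes the version 2.0 Identity API layout (a POST to the tokens resource, with the token ID returned under access.token.id); the URL and credentials are placeholders, and the sample response is a hand-written illustration rather than real service output.

```python
import json


def build_token_request(identity_url, tenant, username, password):
    """Return (url, body) for a token request with password credentials."""
    url = identity_url.rstrip("/") + "/v2.0/tokens"
    body = json.dumps({
        "auth": {
            "tenantName": tenant,
            "passwordCredentials": {"username": username, "password": password},
        }
    })
    return url, body


def extract_token(response_text):
    """Pull the token ID out of a token response."""
    return json.loads(response_text)["access"]["token"]["id"]


token_url, token_body = build_token_request(
    "http://controller:5000", "demo", "demo-user", "secret"
)
# Hand-written illustrative response
sample_response = '{"access": {"token": {"id": "abc123"}, "serviceCatalog": []}}'
print(token_url)
print(extract_token(sample_response))
```

The serviceCatalog entry in the response corresponds to the queryable catalog of services mentioned above.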
Networking (Neutron) is a system for managing networks and IP addresses. Openstack Networking ensures the network is not a bottleneck or limiting factor in a cloud
deployment, and gives users self-service ability, even over network configurations. Users
can create their own networks, control traffic, and connect servers and devices to one or
more networks. Administrators can use software-defined networking (SDN) technology
like OpenFlow to support high levels of multi-tenancy and massive scale. Openstack
Networking provides an extension framework that can deploy and manage additional network
services such as intrusion detection systems (IDS), load balancing, firewalls, and virtual
private networks (VPN) [11]. Networking (Neutron) allows the creation and attachment of interface devices managed by other Openstack services to networks. Plug-ins can
be implemented to accommodate different networking equipment and software, providing
flexibility to Openstack architecture and deployment [12].
Object Storage (Swift) is a scalable redundant storage system. Objects and files are
written to multiple disk drives spread throughout servers in the data center, with the
Openstack software responsible for ensuring data replication and integrity across the storage cluster. Storage clusters scale horizontally simply by adding new servers. Should a
server or hard drive fail, Openstack replicates its content from other active nodes to new
locations in the cluster. Because Openstack uses software logic to ensure data replication
and distribution across different devices, inexpensive commodity hard drives and servers
can be used [11]. Object Storage (Swift) is a multi-tenant object storage system. It is
highly scalable and can manage large amounts of unstructured data at low cost through
a RESTful HTTP API [12].
Block Storage (Cinder) provides persistent block-level storage devices for use with
Openstack compute instances. The block storage system manages the creation, attaching
and detaching of the block devices to servers. Block storage volumes are fully integrated
into Openstack Compute and the Dashboard allowing for cloud users to manage their
own storage needs [11]. Block Storage (Cinder) adds persistent storage to a virtual
machine. Block Storage provides an infrastructure for managing volumes, and interacts
with Openstack Compute to provide volumes for instances. The service also enables management of volume snapshots, and volume types [12].
Image Service (Glance) provides discovery, registration, and delivery services for disk
and server images. Stored images can be used as a template. It can also be used to store
and catalog an unlimited number of backups. The Image Service can store disk and server
images in a variety of back-ends, including Openstack Object Storage [11]. Image Service (Glance) is central to Infrastructure-as-a-Service (IaaS). It accepts API requests
for disk or server images, and image metadata from end users or Openstack Compute
components. It also supports the storage of disk or server images on various repository
types, including Openstack Object Storage [12].
Telemetry (Ceilometer) provides a single point of contact providing all the counters
across all current Openstack components. The delivery of counters is traceable and auditable, the counters must be easily extensible to support new projects, and agents doing
data collections should be independent of the overall system [11]. The Telemetry module performs the following functions: efficiently collects the metering data about the
CPU and network costs; collects data by monitoring notifications sent from services or by
polling the infrastructure; configures the type of collected data to meet various operating
requirements. It accesses and inserts the metering data through the REST API; expands
the framework to collect custom usage data by additional plug-ins; produces signed metering messages that cannot be repudiated [12].
Dashboard (Horizon) provides administrators and users a graphical interface to access, provision, and automate cloud-based resources. The design accommodates third
party products and services, such as billing, monitoring, and additional management
tools. The dashboard is one of several ways users can interact with Openstack resources
[11]. Dashboard (Horizon) is a modular Django web application that provides a
graphical interface to Openstack services [12].
Orchestration (Heat) is a service to orchestrate multiple composite cloud applications
using templates, through both an Openstack-native REST API and a
CloudFormation-compatible Query API [11]. The Orchestration module provides template-based
orchestration for describing a cloud application, by running Openstack API calls to generate
running cloud applications. The software integrates other core components of Openstack
into a one-file template system. The templates allow you to create most Openstack resource types, such as instances, floating IPs, volumes, security groups and users. This
enables Openstack core projects to receive a larger user base. The service enables deployers to integrate with the Orchestration module directly or through custom plug-ins [12].
Database Service (Trove) is a database-as-a-service providing relational and
non-relational database engines [11]. The Database service (Trove) provides scalable and
reliable cloud provisioning functionality for both relational and non-relational database
engines. Users can quickly and easily use database features without the burden of handling
complex administrative tasks. Cloud users and database administrators can provision and
manage multiple database instances as needed. The Database service provides resource
isolation at high performance levels, and automates complex administrative tasks such as
deployment, configuration, patching, backups, restores, and monitoring [12].
6.3
VMs provisioning in the Openstack Cloud [12]
In this section, we present a detailed description of the VM provisioning process in the
cloud using the Openstack cloud software, as shown in Figure 6.3. It assumes all the cloud
services involved and their respective software components have been successfully implemented.
In a user-specific implementation such as the cloud infrastructure implemented in
this project, however, some cloud services deemed unnecessary, such as Neutron for the network
or Cinder for block storage, have not been implemented. Instead we have implemented Nova
legacy networks and Nova legacy storage in these cases. The detailed steps
of an instance provisioning process are as follows:
1. The dashboard or CLI gets the user credentials and authenticates with the Identity
Service via REST API. The Identity Service authenticates the user with the user
credentials, and then generates and sends back an auth-token which will be used for
sending the request to other components through REST-call.
2. The dashboard or CLI converts the new instance request specified in launch instance
or nova-boot form to a REST API request and sends it to nova-api.
3. nova-api receives the request and sends a request to the Identity Service for validation of the auth-token and access permission. The Identity Service validates the
token and sends updated authentication headers with roles and permissions.
4. nova-api checks for conflicts with nova-database. nova-api creates initial database
entry for a new instance.
5. nova-api sends the rpc.call request to nova-scheduler, expecting to get an updated instance entry with the host ID specified.
6. nova-scheduler picks up the request from the queue.
7. nova-scheduler interacts with nova-database to find an appropriate host via filtering
and weighing. nova-scheduler returns the updated instance entry with the appropriate host ID after filtering and weighing. nova-scheduler sends the rpc.cast request
to nova-compute for launching an instance on the appropriate host.
8. nova-compute picks up the request from the queue.
9. nova-compute sends the rpc.call request to nova-conductor to fetch the instance
information such as host ID and flavor (RAM, CPU, Disk).
10. nova-conductor picks up the request from the queue.
11. nova-conductor interacts with nova-database. nova-conductor returns the instance
information. nova-compute picks up the instance information from the queue.
12. nova-compute performs the REST call by passing the auth-token to glance-api.
Then, nova-compute uses the Image ID to retrieve the Image URI from the Image
Service, and loads the image from the image storage.
13. glance-api validates the auth-token with keystone. nova-compute gets the image
metadata.
14. nova-compute performs the REST-call by passing the auth-token to Network API
to allocate and configure the network so that the instance gets the IP address.
15. neutron-server validates the auth-token with keystone. nova-compute retrieves the
network info.
16. nova-compute performs the REST call by passing the auth-token to Volume API to
attach volumes to the instance.
17. cinder-api validates the auth-token with keystone. nova-compute retrieves the block
storage info.
18. nova-compute generates data for the hypervisor driver and executes the request on
the hypervisor (via libvirt or API).
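The filtering and weighing mentioned in step 7 can be illustrated with a minimal Python sketch. It is not Nova's scheduler code: the Host class, the pick_host function and the choice of free RAM as the only weight are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Host:
    """Hypothetical view of a compute host as seen by a scheduler."""
    name: str
    free_vcpus: int
    free_ram_mb: int


def pick_host(hosts, vcpus, ram_mb):
    """Filter hosts that can fit the request, then weigh the survivors."""
    # Filtering: keep only hosts with enough free vCPUs and RAM.
    candidates = [h for h in hosts
                  if h.free_vcpus >= vcpus and h.free_ram_mb >= ram_mb]
    if not candidates:
        return None  # no valid host: the request cannot be scheduled
    # Weighing (assumed policy): prefer the host with the most free RAM.
    return max(candidates, key=lambda h: h.free_ram_mb)
```

The returned host plays the role of the host ID written back to the instance entry before the request is cast to nova-compute.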
In the following sections, we present an overview of the architecture of the Keystone
Identity service, the Nova Compute service and the Glance Image service. These are the
three most critical services leveraged in the design and implementation of both the cloud
infrastructure simulator and the load simulator.
Figure 6.3: Overview of the VM provisioning process in an Openstack based cloud [12]
Figure 6.4: Overview of the Keystone identity service architecture [12]
6.4
Architecture of Openstack Keystone [12]
Figure 6.4 shows the overview of the Keystone architecture. The Keystone Identity service
performs two essential functions in the cloud:
• User management: It tracks users and their permissions by managing the Users,
Tenants and Roles entities.
• Service catalog: It provides a catalogue of available services with their API endpoints. The Keystone service uses a number of backend stores for managing its
entities.
Remark: The cloud load simulator implemented in this project uses the Identity backend
to establish the correspondence between VMs inserted in the input loadfile and their
owners (tenants names).
6.5
Architecture of Openstack Nova [12]
The Nova Compute service is made up of several components as shown in Figure 6.5.
Next, we present some important Nova components relevant to the cloud infrastructure
simulator and the load simulator implemented in this project:
• Nova-API: It accepts and responds to end user compute API calls. It also initiates
most of the orchestration activities (such as running an instance) and enforces
some policies (mostly quota checks).
• The nova-compute process: It is primarily a worker daemon that creates and
terminates virtual machine instances via the hypervisor's APIs.
Figure 6.5: Overview of the Nova compute service architecture highlighting the implemented fake compute driver [12]
• The legacy nova-network: It is a worker daemon that accepts networking tasks
from the queue and then performs tasks to manipulate the network (such as setting
up bridging interfaces or changing iptables rules).
• The nova-scheduler process: It takes a virtual machine instance request from the
queue and determines where it should run (specifically, which compute server host
it should run on).
• The queue: It provides a central repository for passing messages between daemons.
Remark: In the implemented cloud simulator, the queue is implemented
by deploying the RabbitMQ message broker.
• The MySQL database: It stores most of the build-time and runtime state of the
cloud infrastructure.
Remark: The load simulator leverages the real-time cloud state by accessing the build-time and runtime cloud information via its several MySQL database interfaces.
6.6
Architecture of Openstack Glance [12]
The Glance service provides services for discovering, registering, and retrieving virtual
machine images. It is made up of the following components:
• Glance-API: It accepts Image API calls for image discovery, image retrieval and
image storage.
• Glance-registry: It stores, processes and retrieves metadata about images (size,
type).
• A database: It is used to store the image metadata.
Remark: In our implementation the Glance database is a MySQL database.
• A storage repository: It is used for the actual image files.
Remark: The cloud infrastructure simulator leverages Glance to instantiate VMs.
Chapter 7
Overview of the Cloud
Infrastructure Simulator
In this chapter, we present how the implemented cloud infrastructure simulator works.
This includes its physical configuration and how tenants and VMs are created. Further,
we explain how the Nova compute FakeDriver affects the cloud simulator and the load
simulator. Additionally, we explain how decoupling affects VM creation and placement in
the cloud. Finally, we review the high-level design principles used in the implementation
of the cloud simulator.
7.1
Physical Configuration
The cloud infrastructure simulator uses the Openstack technology, specifically the Icehouse
release, which was the current release at the time of the implementation [15]. It is made
up of one controller node (ctr01.mgmt.local) and 16 compute nodes (cp01.mgmt.local up
to cp16.mgmt.local). The cloud simulator infrastructure is hosted on the UZH/CSG n19
physical node. The configuration presented here was a design decision to take into account
the physical resources available on the n19 physical node (16-core CPU, 64GB RAM,
500GB Hard Disk). Figure 7.1 shows the overview of the implemented cloud infrastructure
simulator. The load simulator is a distinct software component integrated
into the cloud simulator infrastructure via the controller node; it is shown hosted on
the cloud controller node in Figure 7.1.
While n19 is a physical node, the 17 nodes (1 controller node and 16 compute
nodes) of the cloud infrastructure simulator are all VMs created using the Oracle
VirtualBox technology. Table 7.1 summarizes the physical resources of node n19 and those of
the 17 cloud nodes. Table 7.2 and Table 7.3 present the implemented cloud services on
the cloud controller node and on the cloud compute nodes respectively.
Remark: The basic services, also called cloud core services, must be implemented in
any cloud deployment based on the Openstack cloud software.
Figure 7.1: Overview of the cloud infrastructure simulator based on the Openstack cloud
technology
Table 7.1: Physical resources of the cloud infrastructure
Node components | CPU | RAM | Disk
n19: 1 x Physical node, Ubuntu 14.04 | 2.5GHz 12-Core 64-bit AMD Opteron | 64GB | 500GB
17 x VMs, Ubuntu 14.04 | 2.5GHz 1-vCPU | 2GB | 13GB
Table 7.2: Implemented cloud services running on the cloud controller node
Service | Service Type | Utility
MySQL Database | Supporting service | Database services for the cloud
RabbitMQ | Supporting service | Message broker service for the cloud
Keystone | Basic service | Identity services for the cloud
Nova | Basic service | Provides compute services for the cloud
Glance | Basic service | Provides image services for the cloud
Table 7.3: Implemented cloud services running on each of the 16 compute nodes
Service | Service Type | Utility
Nova API | Basic service | Provides compute services for the compute host
Hypervisor | Basic service | Instantiates VMs for the cloud tenants
Nova Networking | Basic service | Provides network services for cloud VMs
Table 7.4: Theoretical number of VMs that can be created in the cloud infrastructure
simulator
Size of VMs | Theoretical number of VMs
[1 vCPU, 2GB RAM, 10GB Disk, 100Mbps] | 16 x 10^6 VMs
[2 vCPU, 4GB RAM, 20GB Disk, 100Mbps] | 8 x 10^6 VMs
[4 vCPU, 8GB RAM, 40GB Disk, 100Mbps] | 4 x 10^6 VMs
[8 vCPU, 16GB RAM, 80GB Disk, 100Mbps] | 2 x 10^6 VMs
7.2
Cloud Resources, Tenants and VMs
The cloud simulator controller has been configured with a RAM overcommitment ratio of
1 : 10^6 and a CPU overcommitment ratio of 1 : 10^6. As a result, the number of VMs that
can be created and hosted on this infrastructure is virtually unlimited. Table 7.4 shows
the number of VMs that can be created for some example VM sizes. This design
choice makes the cloud simulator infrastructure scalable, as new VMs can be
added as needed. While creating new VMs is the normal mode of operation, deleting VMs should
be the exception and should be done only in rare cases. One situation where deletion
is appropriate is the removal of VMs with duplicate names in the cloud. To interact with
the cloud infrastructure simulator, we have implemented a number of primitives, some of
which are presented in Table 7.5. A user guide is found in Appendix B of this
report. It shows, with the help of use cases, how to apply the implemented primitives
to the cloud simulator infrastructure. The user guide also contains use cases for the
load simulator.
7.3
Nova Compute FakeDriver and the Cloud Simulator
During the design of the cloud infrastructure simulator, we opted for an environment that
can scale with virtually no limit, constrained only by the physical resources available on
the n19 node. One key architectural decision to achieve this has been the implementation
of the Nova FakeDriver with a RAM overcommitment ratio of 1 : 10^6 and a CPU overcommitment ratio of 1 : 10^6, leading to the theoretical limits found in Table 7.4.
How does overcommitment of resources affect the cloud simulator?
Simply put, by using a RAM overcommitment ratio of 1:2 and a CPU overcommitment ratio of 1:2, for example, we are telling the cloud scheduler to allow a number
of VMs in the cloud such that the total amount of RAM and CPU in the created
VMs is double the amount of physical RAM and CPU in all the compute hosts. In
other words, the total CPU and RAM promised by the cloud is double the amount
of physical resources in the cloud. In our implementation, we promised a total amount of
RAM and CPU equal to the overall physical resources (CPU and RAM on the 16 compute hosts) multiplied by
10^6. Hence the very high theoretical limit on the number of VMs that can be created in
the implemented cloud simulator, as seen in Table 7.4.
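The limits in Table 7.4 can be reproduced with a short Python sketch. The function name and its parameters are hypothetical, and the sketch assumes that only CPU and RAM are overcommitted (disk and bandwidth are ignored) and that the 16 compute hosts each have 1 vCPU and 2GB RAM, as listed in Table 7.1.

```python
def theoretical_vm_count(hosts, host_vcpus, host_ram_gb, ratio,
                         vm_vcpus, vm_ram_gb):
    """Upper bound on the number of VMs of one size under overcommitment."""
    # Capacity promised by the cloud = physical capacity x overcommitment ratio.
    promised_vcpus = hosts * host_vcpus * ratio
    promised_ram_gb = hosts * host_ram_gb * ratio
    # The scarcer of the two resources bounds the number of VMs.
    return min(promised_vcpus // vm_vcpus, promised_ram_gb // vm_ram_gb)


# 16 compute hosts with 1 vCPU and 2GB RAM each, ratio 1 : 10^6
print(theoretical_vm_count(16, 1, 2, 10**6, vm_vcpus=1, vm_ram_gb=2))
```

For the smallest VM size this yields 16 x 10^6, and each doubling of the VM size halves the result.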
Table 7.5: Some implemented cloud primitives along with some default primitives. These
are used to explore the cloud infrastructure
User interface | Purpose
create_vm.py | Creates a number of VMs in the cloud. These will belong to the current tenant.
create_tenant.py | Creates a new tenant. Must be created with admin credentials.
view_cloud.py | Shows all the VMs running in the cloud along with the host and VM resources. Useful for selecting the compute hosts on which to run simulations.
view_cloud_by_vms.py | Shows all the VMs running in the cloud along with the host and VM resources. Useful for selecting the next valid VM names.
view_cloud_all_hosts.py | Shows all compute hosts running in the cloud along with their resources and number of VMs.
view_cloud_detailed_host.py | Shows all VMs running on a specific compute host along with their resources and number of VMs.
keystone tenant-list | Shows all tenants configured in the cloud.
nova list | Shows the current user's VMs in the cloud.
nova keypair-list | Shows the current user's keypair.
How does overcommitment affect VMs creation in the cloud?
The cloud overcommitment level applies to all compute hosts that are running in the
cloud. The cloud scheduler, specifically the Nova scheduler in our case, is responsible for
enforcing the resource limits promised by the cloud. Once a request for the creation of a VM in
the cloud arrives at the scheduler, it accepts or rejects the request based, among other factors, on
the current level of resources already provisioned. Thus, once a VM has been created and
placed on a compute host, there is a guarantee that the cloud resources needed for this
VM are within the predefined overcommitment level. Using our implementation
as an example, the request for a VM creation will theoretically never be rejected.
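The acceptance decision described above can be sketched in Python as follows. This is a simplified illustration, not the Nova scheduler's logic: the function name and its parameters are hypothetical, and only CPU and RAM are considered.

```python
def accepts_request(provisioned_vcpus, provisioned_ram_gb,
                    vm_vcpus, vm_ram_gb,
                    physical_vcpus, physical_ram_gb, ratio):
    """Return True if the new VM stays within the overcommitment level."""
    # The cloud promises physical capacity multiplied by the ratio.
    cpu_ok = provisioned_vcpus + vm_vcpus <= physical_vcpus * ratio
    ram_ok = provisioned_ram_gb + vm_ram_gb <= physical_ram_gb * ratio
    return cpu_ok and ram_ok
```

With a ratio of 2, a request can be rejected once the provisioned resources reach double the physical ones; with a ratio of 10^6 the same request is accepted in practice.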
How does this impact the load simulator designed in the present project?
Simply put, neither the compute host nor the load simulator has to perform any check for
adequate available resources in the cloud. This check has already been performed by the
cloud scheduler, which automatically, transparently and dynamically moves VMs from
one compute host to another at startup or at creation, so that the cloud
only delivers the resources it has promised under its predefined overcommitment level.
Taking advantage of this cloud operation principle means that the only job left for the
load simulator is to attempt an allocation of resources that is consistent with
the fair share metric.
7.4
Principle of Decoupling and VMs Creation in the
Cloud
To run experiments on the cloud simulator infrastructure, one can either use the existing
tenants and VMs or create new ones. The default cloud quotas are set to an almost
infinite limit. This means that when a tenant requests the creation of a VM using the implemented primitives, the request is passed on to the nova compute scheduler. The nova
compute scheduler checks whether there is an available compute host with enough compute resources to satisfy the VM creation. Since the scheduler will theoretically always find such
a host in the implemented cloud simulator, it will transparently and automatically allow the VM creation and placement on the available compute host. This process
is completely independent of the load simulator (decoupling). In other words, when the
cloud scheduler, which is part of the cloud infrastructure simulator, decides to place a VM
on a compute host, it has no idea whether the load simulator intends to place a load on
this VM in the future.
The only criterion for placing a VM on a compute host is that the compute
host has enough resources to satisfy the VM creation. When VMs are restarted or after new VMs are created, the nova compute scheduler may place them dynamically on a different host. This
placement process is completely independent of the cloud user and of the load simulator.
This decoupling between the cloud infrastructure simulator and the load simulator is a
key principle in our design and guarantees that the cloud infrastructure simulator and the
load simulator perform correctly. Moreover, by enforcing this design, we make sure the
process of creating VMs and placing them on compute hosts is completely independent
of the process of simulating a load placement on a VM. This reflects the workings of a
real-world cloud, where users can decide at any time to connect to their VMs to start or
stop workloads.
Cloud state after the creation of new VMs: After the creation of new VMs in the
cloud, the cloud operator (the person doing the experiments in the cloud) should always
review the new cloud state with the help of the view_cloud.py primitive or another adequate primitive. Simply adding the newly created VMs to the input loadfile along with
the VMs already in the loadfile can lead to unexpected results. The reason is simple:
after the creation of the new VMs, the nova cloud scheduler has dynamically reorganized
VM placement in the cloud. The previous VMs, which were likely selected because they
ran together on a given compute host, no longer necessarily run on the same compute
hosts as before the creation of the new VMs.
VMs default loads: Although the implemented cloud infrastructure may have a number
of VMs running (over 170 VMs as of the writing of the present report), from the perspective of the load simulator all these VMs are up and running but consume 0 resources.
The resources of the 16 compute hosts are available and can at any time be requested for consumption by any valid VM running in the cloud. The load simulator will always arbitrate
resource requests relative to the cloud compute host on which the VMs are running.
Cloud compute hosts resources: For the sake of experiments, the compute host resources can be artificially increased to any desired level. The load simulator has built-in
methods to this effect, such that a compute host possessing n vCPUs can
appear to the simulator as possessing 2n vCPUs. This can be generalized to kn vCPUs,
where k is an arbitrary multiplier.
7.5
Cloud Simulator High-Level Design Principles
The design of the cloud simulator infrastructure factors in the following principles:
Isolation: A given cloud tenant/user can only access their own VMs and networks.
Automation: Vagrant/Oracle VirtualBox tools are used to semi-automatically provision
the cloud infrastructure. The cloud operator can assume a tenant ID and use the create_vm.py primitive to automatically create the needed number of VMs belonging to the tenant
whose identity is assumed.
Scalability: The cloud infrastructure can scale by allowing VMs and tenants to be added
as needed. This is constrained by the amount of physical resources present in
the physical node n19.
Quotas-free: The cloud environment has been configured with quotas that are theoretically infinite to allow users and tenants to freely create VMs and carry out experiments
as needed.
Decoupling: The cloud simulator has been designed and implemented in such a way
that it is completely decoupled from the load simulator.
Chapter 8
Overview of the Load Simulator
In this chapter we present how the load simulator works. This includes the load simulator
logical architecture components including the reader layer, the validation, aggregation and
grouping layer. Further, the load consumption, time translation and report generation
layers as well as the input parameter design (load design) are presented. A dedicated
section explains why physical time cannot be measured in the load simulator. Finally, the
fair-share metric is presented followed by the allocation and reallocation design.
8.1
Logical Architecture Overview
The load simulator is implemented with interfaces to the cloud infrastructure simulator
that enable it to build a cloud state that is always up to date. At runtime, it reads its input loadfile and makes real-time calls to both the Nova compute layer and the Keystone identity layer, using MySQL DBMS interfaces, to retrieve
the current cloud state. It
gathers all necessary information about the VMs' resource sizes, the resources of the
corresponding compute hosts, and the tenants and owners of the VMs. Based on this dynamic
information, it starts an allocation of resources using the fairness metric. If the input
loadfile contains a VM that does not exist in the cloud, an error message is displayed and
the load simulation is not attempted. The load simulator is made up of three logical
layers, as presented in Figure 8.1. Additionally, Figure 8.4 shows the logical view of the
load simulator relative to the cloud simulator. The functions of each of these
layers are presented next.
8.2
Reader Layer
This layer is made up of three components: the Nova reader, the Keystone reader and the
load tables reader. An example of a valid input loadfile is shown in Figure 8.2. The
load data from the input loadfile is imported into an SQLite3 database prior to processing.
Figure 8.1: Logical architecture of the load simulator
Hence the load tables reader accesses load data via an SQLite DBMS interface for further
processing. The load tables reader reads the load consumption data from the load table,
retrieves the VM names and passes them to the Nova reader. The Nova reader
scans the cloud to retrieve the following information for each VM present in
the loadfile: the VM resource sizes, for example csg01 [2vCPUs, 8GB RAM, 120GB Disk, 10
Gbps vNIC]; its project ID; the name of the compute host on which it is running; and
the corresponding host resources, for example cp16 [10vCPUs, 1000 GB RAM, 10000 GB
Disk, 1000 Mbps NIC]. The Nova reader passes the project IDs of the
VMs to the Keystone reader. The Keystone reader scans
the cloud to establish a correspondence between the project IDs of the loadfile VMs and their
respective tenants and owners.
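The import step of the load tables reader can be sketched with Python's standard sqlite3 and csv modules. The sketch is illustrative only: it assumes a comma-separated loadfile whose header uses the field names of Section 8.5 (with start_time written with an underscore), and the table name load and both function names are hypothetical.

```python
import csv
import io
import sqlite3


def import_loadfile(text):
    """Load comma-separated load data into an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE load (instant INTEGER, cpu REAL, ram REAL, disk REAL,"
        " bandwidth REAL, vm TEXT, start_time INTEGER)"
    )
    # Each CSV row becomes a dict keyed by the header names.
    rows = csv.DictReader(io.StringIO(text))
    conn.executemany(
        "INSERT INTO load VALUES"
        " (:instant, :cpu, :ram, :disk, :bandwidth, :vm, :start_time)",
        rows,
    )
    return conn


def vm_names(conn):
    """Return the distinct VM names that the Nova reader would look up."""
    cursor = conn.execute("SELECT DISTINCT vm FROM load ORDER BY vm")
    return [row[0] for row in cursor]
```

The list returned by vm_names corresponds to the VM names that the load tables reader passes on to the Nova reader.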
8.3
Validation, Aggregation and Grouping Layer
Since the input loadfile contains only load information pertaining to each VM making
a resource request and no other information, the load simulator via its MySQL DBMS
interfaces to the nova compute layer, performs an initial validation of all VMs to be loaded.
Through this process, the cloud is scanned to check the existence and state of the VM to
be loaded. If the VM is not valid (doesn’t exist or has been deleted), the validation layer
displays an error message and the load simulation is not even attempted. On the other
hand if all VMs present in the input loadfile are valid, the load aggregation and grouping
layer proceeds to aggregate and group all relevant information from the cloud. As a result
it determines the VMs resources sizes, their owners, their grouping by compute hosts, the
resource sizes of the compute hosts on which they are currently running and passes this
information to the load consumption layer.
Remark: The cloud scheduler can automatically and dynamically reassign VMs to cloud
compute hosts as it sees fit at any time. Thus, it is critical for the load simulator to
always have this updated information at run-time.
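The validation and grouping behaviour of this layer can be sketched as follows. The cloud_state dictionary stands in for the information retrieved through the MySQL interfaces; its shape and the function name are assumptions made for illustration.

```python
from collections import defaultdict


def validate_and_group(load_vms, cloud_state):
    """Reject unknown VMs, then group the loadfile VMs by compute host."""
    # Validation: every VM named in the loadfile must exist in the cloud.
    missing = sorted(vm for vm in load_vms if vm not in cloud_state)
    if missing:
        raise ValueError("unknown VMs in loadfile: " + ", ".join(missing))
    # Grouping: VMs compete for resources only with VMs on the same host.
    groups = defaultdict(list)
    for vm in load_vms:
        groups[cloud_state[vm]["host"]].append(vm)
    return dict(groups)
```

Because the scheduler may move VMs between hosts, the grouping has to be rebuilt from a fresh cloud state at every run.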
8.4
Load Consumption, Time Translation and Reporting Layer
Based on the aggregated and grouped data, the time translator uses the start time data in
the input loadfile. It evaluates both the total length of the load (total number of instants)
and their start times relative to all other VMs running on the same compute host. This
results in a dynamic allocation of resources to VMs whose overall outcome is consistent
with the fair share metric. This is the case whether the VMs started at the same time or
not. As a result the load simulator simulates load consumption of VMs relative to other
VMs running inside the same compute host taking the current cloud state into account.
At the end of the load simulation process, a number of summary reports are generated for
the whole allocation experiment. These reports include, among others, the summary of all
resources requested, the summary of all resources allocated, the duration of the allocation
per VM and per compute host.
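The time translation can be illustrated with a small Python sketch that places each load on a common host timeline. It assumes that host time equals the load's start time plus its load instant, which is one reading of the description above; the function name and the data layout are hypothetical.

```python
def to_host_timeline(loads):
    """Map {vm: (start_time, [request per load instant])} to host time.

    Returns {vm: [request per host instant]}, padded with zeros so that
    all VMs on the host share the same time axis.
    """
    end = max(start + len(reqs) for start, reqs in loads.values())
    timeline = {}
    for vm, (start, reqs) in loads.items():
        before = [0.0] * start                     # VM has not started yet
        after = [0.0] * (end - start - len(reqs))  # VM's load has ended
        timeline[vm] = before + list(reqs) + after
    return timeline
```

On this shared axis, VMs that start at different times compete only during the instants in which their loads overlap.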
Remark 1: In order to obtain a real-time cloud state, the load simulator requires that
each VM in the cloud has a unique name. The load simulator results may differ
from expectations when two VMs with the same name exist in the cloud. One strong
constraint for the implemented load simulator is that VM names in the cloud remain unique.
Remark 2: To ensure unique names in the cloud, a standard naming convention where
VMs are named after tenants followed by a sequential number (e.g. Patrick1, Patrick2, ...,
Patrick100) has been implemented with the create_vms.py primitive. However, it is the
responsibility of the cloud operator to ensure that the parameters given to this command
enforce uniqueness. The cloud technology itself allows duplicate names in the cloud.
Remark 3: If VMs with duplicate names are found in the cloud, the duplicates must be
deleted for the sake of the load simulator.
8.5
Input Parameter Design: Load Design
A load in the context of this project is a bundle of resources (CPU, RAM, Disk, Bandwidth) that are requested over a number of instants by a specific cloud VM identified by
its unique name. The load has a relative start time. An input loadfile is made up of a set
of valid loads. Using the example input loadfile of Figure 8.2, we extract a valid
input load and present it in Figure 8.3. The fields of a
valid load are briefly presented below.
Figure 8.2: Input table to the load simulator
Figure 8.3: An example of a valid load extracted from Figure 8.2
Instant: This field is a non-negative integer ranging from 0 to the maximum number of
instants of the resource requests. This field uniquely identifies each line in the load (but
does not uniquely identify each line in the loadfile) and represents the instant at which an
amount of resources is requested. A complete load is a set of such requests over all load
instants. The assumption is that loads can be of different lengths and vary over time,
for instance from instant 0 to 6 in the example in Figure 8.3.
cpu, ram, disk, bandwidth: These fields are non-negative numbers ranging from 0 to the
maximum resources available to the VM. Thus, a VM created with [2 vCPUs,
1GB RAM, 10GB Disk, 100Mbps vNIC] will have a valid CPU request in [0..2], a valid
RAM request in [0..1], a valid disk request in [0..10] and a valid bandwidth request in [0..100].
vm: This field is the vm name which uniquely identifies the cloud VM which makes the
resource allocation request.
start time: This field is the relative start time of the load consumption of a VM in a
compute host relative to all other VMs requesting resources on the same compute host.
All resource allocation for a VM will always start at the start time. Using the example of
Figure 8.2, the load simulator will attempt to place a load (allocate resources) on csg01
at t=2, on csg02 at t=5, on csg03 at t=6 and on csg04 at t=9.
Remarks: The terms load, resource request, allocation request and load request have the
same meaning in the context of this project. Likewise, the terms allocation,
resource allocation, load allocation and load consumption have the same meaning in
the context of this project.
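The range constraints on the cpu, ram, disk and bandwidth fields can be expressed as a short Python check. The function name and the dictionary layout are hypothetical; the limits are those of the VM that makes the request.

```python
RESOURCE_FIELDS = ("cpu", "ram", "disk", "bandwidth")


def request_is_valid(request, vm_limits):
    """Check that every requested amount lies in [0, VM maximum]."""
    return all(0 <= request[field] <= vm_limits[field]
               for field in RESOURCE_FIELDS)


# A VM created with [2 vCPUs, 1GB RAM, 10GB Disk, 100Mbps vNIC]
limits = {"cpu": 2, "ram": 1, "disk": 10, "bandwidth": 100}
print(request_is_valid({"cpu": 1.5, "ram": 1, "disk": 0, "bandwidth": 100},
                       limits))
```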
8.6
Physical Time in the Load Simulator
At the start of this project, we had to make an architectural design decision. The first
option was to build a small real-world production cloud with real VMs that are accessible
for processing using the Nova Libvirt compute driver. The advantage of this option is that it
allows the use of a well-known tool such as the "Stress" tool, widely used in the industry to
simulate real workloads in cloud VMs. The disadvantage of this option is that it restricts
us to fewer real compute hosts and a few real VMs running on the compute hosts.
Furthermore, implementing any fair reallocation mechanism would involve kernel programming at the KVM
hypervisor layer.
The second option was to build a simulated cloud environment using the Nova fake compute driver. The advantage of this possibility is to build a cloud that theoretically can
have an infinite number of fake VMs. The VMs are considered fake in the sense that they
do not allow any real processing and consume very little resources. Moreover, the second
possibility allows the simulation of the fair allocation mechanisms using a cloud load simulator application rather than having to implement them directly in the hypervisor kernel
as is the case in the first option. The disadvantage of the second option is to have a cloud
with a high number of VMs but these VMs do not allow any real-world processing.
By choosing the second option in this project, the direct consequence for the load simulator is that the time used to run loads on VMs must also be simulated. Thus, the
implementation mechanisms that attempt to measure the physical time an allocation request takes to complete in a VM are not feasible. Clearly by choosing the second option,
we now have the ability to simulate the running of hundreds of VMs with little physical
resources overhead. Moreover, the implemented load simulator can simulate the allocation of resources using the fair share metric. However, because the running VMs are not
real functional entities, we have no way to measure the physical time used to complete a
resource allocation in a VM.
8.7
The Fair Share Metric
Following is an overview of the fair share metric used for the allocation and reallocation
implemented in the load simulator:
Case 1: At any given instant, the sum of all resources requested by all VMs running
on a given compute host falls within the limits of the compute host total resources. Each
VM receives its full request at each instant.
Case 2: The sum of all requests from all VMs is greater than the compute host total
maximum. Each VM receives its fair share. The unallocated amount of resources is carried over to the next instant. The share received by VM i at each allocation instant using the fair
share metric is:
\[ \frac{vm_i}{vm_1 + vm_2 + \dots + vm_n} \cdot \mathit{host\_max} \]
where i = 1, ..., n, with n being the total number of VMs simultaneously requesting this resource type on the compute host; vm_1, ..., vm_n are the maxima of this resource type
in the corresponding VMs; and host_max is the total maximum of this resource type in
the compute host.
Remark: The Fair share metric scheme described here is based on the following constraints: A VM never receives more than it requested and a VM never requests more than
its maximum limits of a resource type. Moreover, all requested resources by a VM must
be allocated.
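The two cases can be sketched in Python for one resource type on one compute host. This is an interpretation of the description above, not the load simulator's code: the function names are hypothetical, a VM's share is capped at its pending request, capacity left unused by such a cap is not redistributed, and carried-over amounts are simply added to the next instant's request.

```python
def fair_share_step(pending, vm_max, host_max):
    """Allocate one resource type for one instant on one compute host."""
    active = {vm: amount for vm, amount in pending.items() if amount > 0}
    if sum(active.values()) <= host_max:
        return dict(active)  # Case 1: every VM receives its full request
    # Case 2: share proportional to each VM's maximum, capped at its request.
    total_max = sum(vm_max[vm] for vm in active)
    return {vm: min(amount, vm_max[vm] / total_max * host_max)
            for vm, amount in active.items()}


def simulate_host(requests, vm_max, host_max):
    """Run all instants; unallocated amounts carry over to the next one."""
    backlog = {vm: 0.0 for vm in requests}
    allocations = {vm: [] for vm in requests}
    horizon = max(len(reqs) for reqs in requests.values())
    instant = 0
    while instant < horizon or any(b > 0 for b in backlog.values()):
        pending = {vm: backlog[vm] + (reqs[instant] if instant < len(reqs) else 0.0)
                   for vm, reqs in requests.items()}
        granted = fair_share_step(pending, vm_max, host_max)
        for vm in requests:
            got = granted.get(vm, 0.0)
            allocations[vm].append(got)
            backlog[vm] = pending[vm] - got  # carried to the next instant
        instant += 1
    return allocations
```

For example, two VMs with a maximum of 2 vCPUs each on a host with 3 vCPUs both receive 1.5 vCPUs when each requests 2, and the remaining 0.5 vCPUs per VM is served in the following instant.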
8.8
Allocation and Reallocation Design
The resource allocation and reallocation mechanisms implemented in this project are based on the fair share metric. Once several VMs request a total amount of resources that is greater than the maximum total resources of a compute host, the load simulator arbitrates the allocation requests among these VMs so that the outcome is fair with respect to this metric. The fair share arbitration scales to the dimension of the whole cloud, so the load simulator can arbitrate resource requests among competing VMs on any valid compute host in the cloud.
Consider the example of Figure 8.2, where the load simulator will attempt to place a load on csg01 at t=2, on csg02 at t=5, on csg03 at t=6 and on csg04 at t=9. The dynamic information retrieved from the cloud (cloud state) at run time includes the compute host on which each of these VMs is running, the resource sizes of the compute hosts, the resource sizes of the VMs and the total number of VMs simultaneously making requests on each compute host. The load simulator depends on this information for every allocation decision. Using the above example, two scenarios are possible:
Figure 8.4: Logical view of the load simulator relative to the cloud simulator
Scenario 1: The load simulator detects at run-time that csg01, csg02, csg03 and csg04 are running on the same compute host: The compute host resources will be shared by the load simulator among the 4 VMs using the fair share metric.
Scenario 2: The load simulator detects at run-time that each of the VMs csg01, csg02, csg03 and csg04 runs on a different compute host: The resource allocation will produce a completely different outcome from that of Scenario 1. In this case each VM will get the maximum resources obtainable at each instant, because it does not compete with any other VM of the loadfile on its compute host.
Remark: Suppose other VMs that are not listed in the input loadfile run on the compute host of Scenario 1 or on the compute hosts of Scenario 2: The outcome remains unchanged, as these other VMs consume 0 resources. Only the VMs in the input loadfile consume resources and influence the load simulator outcome. An important assumption is that the person performing experiments with the cloud simulator validates the input loadfile, since the results obtained always reflect the data present in this file.
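The difference between the two scenarios comes down to how the VMs of the loadfile are grouped by compute host. The following sketch illustrates that grouping step only; the function name group_by_host, the host names host_a to host_d and the VM other_vm are hypothetical and do not come from the load simulator.

```python
from collections import defaultdict


def group_by_host(loadfile_vms, vm_to_host):
    """Group the VMs named in the loadfile by the compute host they run on.

    VMs that run in the cloud but are absent from the loadfile are ignored,
    because they consume no resources in the simulation.
    """
    groups = defaultdict(list)
    for vm in loadfile_vms:
        groups[vm_to_host[vm]].append(vm)
    return dict(groups)


loadfile_vms = ["csg01", "csg02", "csg03", "csg04"]

# Scenario 1: all four VMs share one host; "other_vm" is not in the loadfile
# and therefore plays no role in the arbitration.
same_host = {"csg01": "host_a", "csg02": "host_a", "csg03": "host_a",
             "csg04": "host_a", "other_vm": "host_a"}
print(group_by_host(loadfile_vms, same_host))

# Scenario 2: each VM runs on its own host, so no group has competing VMs.
separate_hosts = {"csg01": "host_a", "csg02": "host_b",
                  "csg03": "host_c", "csg04": "host_d"}
print(group_by_host(loadfile_vms, separate_hosts))
```

In Scenario 1 the result is a single group of four VMs that compete for one host, while Scenario 2 yields four groups of one VM each, so fair share arbitration is never triggered between them.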
Chapter 9
Evaluation
9.1 Output of some implemented Cloud Primitives
1. view_cloud.py
python view_cloud.py: It shows a complete view of the cloud as shown in Figure 9.1. Its output includes the total number of VMs running in the cloud as shown in Figure 9.2.
2. view_host_cloud.py
python view_host_cloud.py: It shows the complete view of a cloud host including its VMs and its resources as shown in Figure 9.3.
3. view_all_user_vms.py
python view_all_user_vms.py: It shows the VMs belonging to a specific cloud tenant or user and the compute host on which each of them is running, as shown in Figure 9.4.
9.2 Load Simulator Tests and Results
Test Case 1: 3 VMs are chosen from the cloud such that each VM runs on a different compute host. Each VM requests an amount of resources within its own size. We use the primitives view_cloud.py and view_host_cloud.py to select the VMs. To keep the output short, we choose loads of small lengths.
Scenario 1: All 3 VMs request different amounts of resources and have different start times. Figure 9.5 shows the resource requests.
Test Case 1/Scenario 1 results: The results are shown in Figure 9.6, Figure 9.7 and
Figure 9.8.
Test Case 2: 5 VMs are chosen from the cloud such that 2 VMs run on the first compute host and 3 VMs run on the second compute host. The amount of resources they request is always within their size limits. We use the primitives view_cloud.py and view_host_cloud.py to select the VMs.
Figure 9.1: Part 1 of the truncated output from view_cloud.py
Figure 9.2: Part 2 of the truncated output from view_cloud.py
Figure 9.3: Truncated output of the view_host_cloud.py command
Figure 9.4: Truncated output of the view_all_user_vms.py command
Figure 9.5: Test Case 1/Scenario 1: All the 3 VMs request resources with different starting times
Figure 9.6: Part 1 results: Test Case 1/Scenario 1
Figure 9.7: Part 2 results: Test Case 1/Scenario 1
Figure 9.8: Part 3 results: Test Case 1/Scenario 1
Scenario 1: All the 5 VMs request resources with the same start time in their respective compute host. Figure 9.9 shows the resource requests.
Test Case 2/Scenario 1 results: The results are shown in Figure 9.10, Figure 9.11,
Figure 9.12 and Figure 9.13.
Scenario 2: All the 5 VMs request resources with different start times within their compute host. Figure 9.14 shows the resource requests.
Test Case 2/Scenario 2 results: The results are shown in Figure 9.15, Figure 9.16,
Figure 9.17 and Figure 9.18.
9.3 Discussion on the Load Simulator Results
Consistent with its design, the load simulator reads the input load table, then validates, aggregates and groups this data such that all VMs running on a particular compute host are grouped together. It also retrieves the resource sizes of the VMs and of their corresponding compute hosts from the cloud layer and performs a resource allocation consistent with the fair share metric, whether the VMs started at the same instant or not. While each compute host has a real physical resource limit, the resource allocation is based on the simulated maximum resource size of the compute host. This value appears as max_cpu, max_ram, max_disk and max_bandwidth in the output. This functionality has been added to introduce elasticity to the compute host resource sizes. Thus, once the amount of resources requested by VMs is above the simulated compute host maximum, the fair share metric arbitrates resource allocation among the competing VMs. These observations can be verified in all test case scenarios, whose results are discussed in the following.
Figure 9.9: Test Case 2/Scenario 1: All the 5 VMs with same start times within their respective compute hosts
Figure 9.10: Part 1 results: Test Case 2/Scenario 1
Figure 9.11: Part 2 results: Test Case 2/Scenario 1
Figure 9.12: Part 3 results: Test Case 2/Scenario 1
Figure 9.13: Part 4 results: Test Case 2/Scenario 1
Figure 9.14: Test Case 2/Scenario 2: Load request for 5 VMs with different start times within their respective compute hosts
Figure 9.15: Part 1 results: Test Case 2/Scenario 2
Figure 9.16: Part 2 results: Test Case 2/Scenario 2
Figure 9.17: Part 3 results: Test Case 2/Scenario 2
Figure 9.18: Part 4 results: Test Case 2/Scenario 2
Test Case 1/Scenario 1: Based on the results shown in Figure 9.6, Figure 9.7 and Figure 9.8, the following can be inferred: The load simulator detects that the VMs lm10, lm11 and lm12 each run on a separate compute host. The compute hosts detected are cp4, cp10 and cp16 respectively. In this case each VM receives the resources it requested at each instant. The reason is that the amount of resources requested by each VM at each instant is within the VM resource sizes and therefore also within the compute host resources. The duration of each allocation therefore equals the duration of the request. The time output shows that the allocations start at the instants of request, t=0, t=3 and t=5 respectively.
Test Case 2/Scenario 1: Based on the results shown in Figure 9.10, Figure 9.11, Figure 9.12 and Figure 9.13, the following can be inferred: The load simulator detects that the first two VMs lm21 and lm15 are both running on cp10. Since their start time is identical at t=5, the load simulator allocates the resource requests after the validation, aggregation and grouping phase. The total amount of CPU, disk and bandwidth requested at each cycle by the 2 VMs falls within the maximum physical resources of the compute host cp10. For these resources the number of instants of requests equals the number of instants of allocation. Things are different for the RAM requests, however. Since the total requested at each cycle is 12288 GB, well above the maximum of 8192 GB of the compute host, the load simulator uses the fair share metric to perform the RAM allocation. Thus at t=5, each of the two VMs receives a fair share proportional to its size with the value $\frac{vm_i}{vm_1 + vm_2} \cdot host\_max$, where vm_i (i=1,2) is the maximum RAM size of the respective VM and host_max is the maximum RAM size of the compute host. At each cycle the non-fulfilled requests are carried over to the next cycle until the allocation is complete.
As this example shows, the scarcest resource, in this case RAM, determines the longest allocation time. Using the fair share metric at t=5, lm21 receives $\frac{8192}{8192+4096} \cdot 8192 = 5461.33$ and lm15 receives $\frac{4096}{8192+4096} \cdot 8192 = 2730.67$. Whatever is left unallocated at this cycle is carried over to the next cycle. Thus at t=6 the resource requests change for both VMs: lm21 now requests 8192 + (8192 − 5461.33) = 10922.67 and lm15 now requests 4096 + (4096 − 2730.67) = 5461.33. However, the amounts they receive at t=6 do not change. This process continues until t=13, when both VMs have received the total amount of RAM requested. At this point both lm21 and lm15 have received all the resources they requested.
Test Case 2/Scenario 2: Using the results presented in Figure 9.15, Figure 9.16, Figure 9.17 and Figure 9.18, the following can be inferred: The 2 VMs lm21 and lm15 make their allocation requests starting at different start times, t=5 and t=8. During the first 3 instants of allocation at t=5, t=6 and t=7, lm21 receives the full amount it requests because it is the only VM making requests. At t=8, however, lm15 makes its first request, and since the total requested by both VMs is greater than the physical maximum of the compute host, the allocation continues using the fair share metric. At t=12, lm21 has received all the resources it requested, leaving lm15 full access to the compute host resources. At t=13, lm15 receives a full share and the allocation finishes at t=14, when all resources are allocated. The time translator reports a duration of 7 instants and 6 instants starting at t=5 and t=8 respectively.
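The carry-over behaviour discussed for lm21 and lm15 can be reproduced with a small sketch. It is not the load simulator's code and rests on assumptions that the text does not state: each VM requests its maximum RAM (8192 and 4096) at six consecutive instants, a value chosen only because it reproduces the finishing instants reported above; a share that a VM does not use up is not redistributed; and only VMs with an outstanding request count towards the denominator of the fair share. The function name simulate and its parameters are hypothetical. Exact fractions are used to avoid rounding effects.

```python
from fractions import Fraction


def simulate(vm_max, host_max, start, n_requests):
    """Return the instant at which each VM has received all it requested.

    vm_max:     maximum per VM, also used as the amount requested per instant
    host_max:   simulated maximum of the compute host
    start:      first request instant of each VM
    n_requests: number of consecutive request instants (assumed, see lead-in)
    """
    backlog = {vm: Fraction(0) for vm in vm_max}
    finish = {}
    t = min(start.values())
    while len(finish) < len(vm_max):
        # New requests are added to whatever was carried over.
        for vm in vm_max:
            if start[vm] <= t < start[vm] + n_requests:
                backlog[vm] += vm_max[vm]
        active = [vm for vm in vm_max if backlog[vm] > 0]
        if sum(backlog[vm] for vm in active) <= host_max:
            # Case 1: everything outstanding fits on the host.
            grants = {vm: backlog[vm] for vm in active}
        else:
            # Case 2: fair share, capped by the outstanding request.
            total_max = sum(vm_max[vm] for vm in active)
            grants = {vm: min(backlog[vm],
                              Fraction(vm_max[vm], total_max) * host_max)
                      for vm in active}
        for vm, granted in grants.items():
            backlog[vm] -= granted
            last_request = start[vm] + n_requests - 1
            if backlog[vm] == 0 and t >= last_request:
                finish[vm] = t
        t += 1
    return finish


ram_max = {"lm21": 8192, "lm15": 4096}
same_start = simulate(ram_max, 8192, {"lm21": 5, "lm15": 5}, 6)
shifted_start = simulate(ram_max, 8192, {"lm21": 5, "lm15": 8}, 6)
print(same_start)
print(shifted_start)
```

Under these assumptions the sketch finishes both VMs at t=13 when they start together at t=5, and finishes lm21 at t=12 and lm15 at t=14 when lm15 starts at t=8, which matches the instants discussed for Scenario 1 and Scenario 2.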
As for the three VMs fake4, lm10 and lm13 running on compute host cp14, the allocation results are consistent with the fair share metric. The total requests for every resource at every instant are within the total physical resources available on the compute host. However, since the requests are made at different start times, the end times are also different. The allocation start times are t=1, t=4 and t=7, while the finish times are t=6, t=9 and t=12 respectively.
9.4 Further Work
As can be inferred from the discussion of the load simulator results, the fair share metric introduces fairness in the resource allocation mechanisms of compute hosts. However, since the load simulator time output is not based on a physical time scheme, a future comparative study could establish the gains of introducing fair share mechanisms in real-world compute hosts that implement no such fairness scheme or that implement alternative fairness metrics. This comparative study could measure the gains in terms of processing time or of reduced resource starvation (famine) of VMs, or use any other valid metric. Another promising possibility is the extension of the load simulator with the implementation of other cloud resource allocation algorithms. Their results can then be compared with those obtained from the fair share metric with a view to innovate and improve.
Chapter 10
Summary and Conclusion
In this project, we have addressed the problem of simulating resource allocation in clouds
in two ways: First by designing and implementing a cloud infrastructure simulator using
the Openstack cloud technology. The implemented cloud simulator allows the creation of
IaaS components such as compute hosts, VMs, tenants and projects. It can be used to
conduct repeatable and controllable investigation of the resource allocation mechanisms,
algorithms and policies used in the cloud SCIs. Second by designing and implementing
an elastic load simulator that simulates fair resource allocation in clouds using a fair
share metric. The load simulator architecture is made up of three logical layers: a reader
layer, a layer for validation, aggregation and grouping, a layer for load consumption, time
translation and reporting. These 3 layers enable the load simulator to maintain a current
cloud state at all times and thus to perform an accurate simulation of resource allocation
in the cloud.
The load simulator is elastic and scalable. As such it can be used to simulate resource allocation on a few VMs or on all VMs in the cloud, taking into account all tenants and compute hosts. It is also an extensible framework that can be extended with the implementation of other cloud resource allocation algorithms. This offers the interesting possibility of comparing the implemented algorithms with a view to improving them. The results obtained from the dataset used in the load simulator are consistent with the fair share metric used. As possible future work, we have identified a comparative study to establish the gains of the implemented fair share metric over systems that implement an alternative allocation scheme, using metrics such as gains in time, reduction of VM resource starvation (famine) or any other suitable metric. Another interesting opportunity is to extend the load simulator by implementing other cloud resource allocation algorithms with a view to establishing comparisons.
Bibliography
[1] LSCI 2012, Riccardo Murri, Sergio Maffioletti Grid Computing Competence Center,
UZH
[2] Memory Overcommitment in the ESX Server, VMware Technical Journal, Vol. 2, No. 1, June 2013
[3] VMware Whitepaper: Understanding Full Virtualization, Paravirtualization and
Hardware Assist, David Marshall, VMware Inc, 2007
[4] An Overview of Virtualization Technologies, Pierre Riteau University of Rennes 1,
IRISA, June 2011
[5] Optimal Joint Multiple Resource Allocation Method for Cloud Computing Environments, Shin-ichi Kuribayashi, International Journal of Research and Reviews in Computer Science, Vol. 2, No. 1, March 2011
[6] Multi-dimensional Resource Allocation for Data-intensive Large-scale Cloud Applications, Foued Jrad, Jie Tao, Ivona Brandic and Achim Streit; Closer 2014 - 4th
International Conference on Cloud Computing and Services Science
[7] Multi-Resource Allocation: Fairness-Efficiency Tradeoffs in a Unifying Framework, Carlee Joe-Wong, Soumya Sen, Tian Lan, Mung Chiang, INFOCOM, 2012 Proceedings IEEE
[8] CloudSim: A toolkit for modeling and simulation of cloud computing environments
and evaluation of resource provisioning algorithms, Software Practice and Experience,
Volume 41, January 2011
[9] Cloudsim, http://www.cloudbus.org/cloudsim/
[10] KVM, http://www.linux-kvm.org
[11] Wikipedia, http://en.wikipedia.org/wiki/OpenStack
[12] Openstack http://www.openstack.org
[13] Manage resources on overcommitted KVM hosts Consolidating workloads by overcommitting resources, IBM DeveloperWorks, Feb 2011
[14] Dominant Resource Fairness: Fair Allocation of Multiple Resource Types, Ali Ghodsi, Matei Zaharia, Benjamin Hindman, Andy Konwinski, Scott Shenker, Ion Stoica, Proceedings of the 8th USENIX conference on Networked systems design and implementation, Pages 323-336
[15] Openstack, http://docs.openstack.org/icehouse/install-guide/install/apt/content/index.html
[16] Oracle Virtualbox, https://www.virtualbox.org/
[17] kvm: the Linux virtual machine monitor, Kivity & al, Proceedings of the Linux
Symposium, volume 1, pages 225–230, year 2007
[18] Virtual Cpu Scheduling Techniques for Kernel Based Virtual Machine (Kvm),
Raghavendra, K. T. Cloud Computing in Emerging Markets (CCEM), 2013 IEEE
International Conference on. IEEE, 2013.
Abbreviations
SaaS     Software as a Service
IaaS     Infrastructure as a Service
PaaS     Platform as a Service
SCIs     Shared Computing Infrastructures
CPU      Central Processing Unit
RAM      Random Access Memory
VM       Virtual Machine
I/O      Input/Output
SLA      Service Level Agreement
DRF      Dominant Resource Fairness
FDS      Fairness on Dominant Shares
GFJ      Generalized Fairness on Jobs
QoS      Quality of Service
HPC      High Performance Computing
SMSCG    Swiss Multi-Science Computing Grid
VMM      Virtual Machine Monitor
OS       Operating System
MMU      Memory Management Unit
TLB      Translation Lookaside Buffer
vNIC     Virtual Network Interface Card
NIC      Network Interface Card
HVM      Hardware Virtualization Extensions
vCPU     Virtual Central Processing Unit
KSM      Kernel Same-page Merging
DMC      Dynamic Memory Control
API      Application Programming Interface
LDAP     Lightweight Directory Access Protocol
AWS      Amazon Web Services
IP       Internet Protocol
SDN      Software Defined Networking
IDS      Intrusion Detection System
VPN      Virtual Private Network
HTTP     Hypertext Transfer Protocol
REST     Representational State Transfer
CLI      Command Line Interface
ID       Identity
URI      Uniform Resource Identifier
DBMS     Database Management System
Glossary
Cloud Computing It is a model for enabling convenient on-demand network access to a
shared pool of virtualized, configurable computing resources (e.g. networks, servers,
storage, applications and services) that can be rapidly provisioned over the internet
and released with minimal management effort or service provider interaction.
Hypervisor It is a piece of computer software, firmware or hardware that creates and
runs virtual machines.
Type 1 Hypervisor These are hypervisors that run directly on the host hardware to
control the hardware and to manage guest operating systems.
Type 2 Hypervisor These hypervisors run on a conventional operating system just as
other computer programs do.
Virtual Machine Monitor (VMM) It implements the virtual machine hardware abstraction and is responsible for running a VM.
Binary Translation It is a CPU virtualization technique that does the translation of the
guest OS kernel code to replace non-virtualizable instructions with new sequences
of instructions that have the intended effect on the virtual hardware.
Overcommitment is a hypervisor feature that allows a virtual machine (VM) to use
more memory space than the physical host has available.
Memory Ballooning It is a memory reclaiming technique in which the host instructs a
cooperative guest to release some of its assigned memory so that it can be used for
another purpose.
Nova (Compute) It is a cloud computing fabric controller, which is the main part of
an IaaS system based on the Openstack cloud technology.
Keystone (Identity) It provides authorization and authentication for users and tenants
in the cloud. It provides a central directory of users mapped to the OpenStack
services they can access.
Neutron (Networking) It allows the creation and attachment of interface devices managed by other OpenStack services to networks.
Swift (Object Storage) It is a multi-tenant object storage system.
Cinder (Block Storage) It adds persistent storage to a virtual machine.
Glance (Image Service) It provides discovery, registration, and delivery services for
disk and server images.
Ceilometer (Telemetry) It provides a single point of contact providing all the counters
across all current OpenStack components
Horizon (Dashboard) It is a modular Django web application that provides a graphical
interface to OpenStack services.
Heat (Orchestration) It is a service to orchestrate multiple composite cloud applications using templates, through both an OpenStack-native REST API and a cloud
formation-compatible Query API.
Trove (Database) It provides scalable and reliable cloud provisioning functionality for
both relational and non-relational database engines.
Virtualization It describes the separation of a service request from the underlying physical delivery of that service.
Nova Compute FakeDriver It is a Nova Compute driver that allows the creation of
non-functional VMs that consume little or no resources on the compute host. It is
mainly used for simulation purposes.
List of Figures
2.1 Multi-cloud workflow framework architecture based on CloudSim [6]
2.2 Number of large jobs completed for each allocation scheme in comparison of DRF against slot-based fair sharing and CPU-only fair sharing [14]
2.3 Example of multi-resource requirements in data-centers [7]
3.1 Overview of a cluster computing architecture [1]. Fairness can be introduced by modifying the scheduler and resource allocation manager
3.2 Overview of a grid computing architecture [1]. Fairness can be introduced at the domain or cluster level
3.3 Overview of a cloud SCI architecture showing an Openstack cloud [1]
4.1 x86 Virtualization Overview: A hosted virtualization or type 2 hypervisor: The hypervisor runs on an OS. Example: Oracle Virtualbox, VMware flash player [4]
4.2 x86 Virtualization Overview: A hypervisor virtualization or type 1 hypervisor: The hypervisor runs on bare-metal. Most widely used in productive clouds. Example: KVM, XEN, VMware ESXi, Microsoft Hyper-V [4]
4.3 x86 Virtualization Overview: A virtualization layer is added between the hardware and the operating system [3]
4.4 VMM architecture Overview: Each VMM partition physical resources and present them to VMs as virtual resources [3]
4.5 x86 privilege level architecture with no virtualization implemented [3]
4.6 x86 Memory Virtualization. The VMM is responsible for mapping the VM physical memory to the host physical memory [3]
4.7 x86 Device and I/O Virtualization. The Hypervisor uses software to emulates virtual devices and I/O and translate VMs requests to the system hardware [3]
6.1 Layered cloud computing architecture [8]
6.2 Openstack cloud architecture overview [12]
6.3 Overview of the VM provisioning process in an Openstack based cloud [12]
6.4 Overview of the Keystone identity service architecture [12]
6.5 Overview of the Nova compute service architecture highlighting the implemented fake compute driver [12]
7.1 Overview of the cloud infrastructure simulator based on the Openstack cloud technology
8.1 Logical architecture of the load simulator
8.2 Input table to the load simulator
8.3 An example of a valid load extracted from Figure 8.2
8.4 Logical view of the load simulator relative to the cloud simulator
9.1 Part 1 of the truncated output from view_cloud.py
9.2 Part 2 of the truncated output from view_cloud.py
9.3 Truncated output of the view_host_cloud.py command
9.4 Truncated output of the view_all_user_vms.py command
9.5 Test Case 1/Scenario 1: All the 3 VMs request resources with different starting times
9.6 Part 1 results: Test Case 1/Scenario 1
9.7 Part 2 results: Test Case 1/Scenario 1
9.8 Part 3 results: Test Case 1/Scenario 1
9.9 Test Case 2/Scenario 1: All the 5 VMs with same start times within their respective compute hosts
9.10 Part 1 results: Test Case 2/Scenario 1
9.11 Part 2 results: Test Case 2/Scenario 1
9.12 Part 3 results: Test Case 2/Scenario 1
9.13 Part 4 results: Test Case 2/Scenario 1
9.14 Test Case 2/Scenario 2: Load request for 5 VMs with different start times within their respective compute hosts
9.15 Part 1 results: Test Case 2/Scenario 2
9.16 Part 2 results: Test Case 2/Scenario 2
9.17 Part 3 results: Test Case 2/Scenario 2
9.18 Part 4 results: Test Case 2/Scenario 2
B.1 View of the selected 5 VMs from compute host cp12
B.2 View of the selected 3 VMs from compute host cp16
B.3 View of the selected 2 VMs from compute host cp15
B.4 View of the load requests for all the 10 VMs
B.5 View of the tenant list showing tenant patrick does not exist
B.6 View of the output of the create_tenant.py command
B.7 View of the output of the keystone tenant-list command showing the newly created tenant patrick
B.8 View of the output of the nova keypair-list command showing the newly created patrick-key
B.9 View showing the 20 newly created VMs of tenant patrick as well as the updated total for the cloud
B.10 View of the entire cloud sorted on compute hosts names. Useful to select compute hosts for simulations
B.11 Entire view of the cloud sorted on VMs names. Useful to select the next valid VMs names in the cloud
B.12 View of the compute hosts along with their resources and number of VMs
B.13 Detailed view of compute host cp01
B.14 View of all the tenants configured in the cloud
B.15 View of the current loadfile
List of Tables
6.1 Comparison of CloudSim and Openstack cloud Simulation tools
7.1 Physical resources of the cloud infrastructure components
7.2 Implemented cloud services running on the cloud controller node
7.3 Implemented cloud services running on each of the 16 compute nodes
7.4 Theoretic number of VMs that can be created in the cloud infrastructure simulator
7.5 Some implemented cloud primitives along with some default primitives. These are used to explore the cloud infrastructure
B.1 Table with code implementation statistics: Total number of code lines = 1533
B.2 Examples of some implemented primitives that allow interaction with the Cloud Infrastructure Simulator
Appendix A
Report on Milestones Implementation
A.1 Guiding Principle 1:
In agreement with the project supervisor, the current project evolved to take into account the latest developments in cloud technology and allocation mechanisms. This latest information is contained in chapters 4 and 5 of the present report. As such, we agreed not to adhere to the strategy of the Master Thesis of Beat Kuster for the allocation/reallocation design. For instance, we do not develop any reallocation design based on using the KVM hypervisor "nova resize" primitives. Instead, we implement an allocation/reallocation design based on the fair share metric.
A.2 Guiding Principle 2:
This project was extended to cover the full implementation of the load simulator as found
in chapter 8. This is an extra contribution on our part to this project.
A.3 Milestone 1: Search for a suitable Cloud Simulation Tool
A.4 Milestone 2: Comparison to Simulator Integration into Openstack
A.5 Milestone 3: Decision on Alternatives
Milestones 1, 2 and 3 are covered in chapter 6.
A.6 Milestone 4: Input Parameter Design
Milestone 4 is covered in chapter 8, specifically in section 8.5.
A.7 Milestone 5: Reallocation Design
A.8 Milestone 6: Consumption Data Design
Milestone 5 and Milestone 6 are covered in chapter 8, specifically in section 8.8.
A.9 Milestone 7: Implementation
The implementation of the cloud infrastructure simulator and of the load simulator is covered in chapter 7 and chapter 8 respectively.
A.10 Milestone 8: Evaluation
Milestone 8 is entirely covered in chapter 9. The evaluation in Milestone 8 is expected to
be done by running various loads (datasets) in the cloud simulator and load simulator.
The results are checked for correctness with respect to the implemented fairness metric.
Appendix B
User Guide
B.1 Implemented Code Statistics
All the implemented programs are shown in Table B.1 along with the number of lines of
code in each.
B.2 Use-Case 1: Running Simulations with existing Cloud Components
Description: We run a simulation using 10 VMs running on 3 compute hosts. All the VMs and tenants needed for this experiment already exist in the cloud.
Assumption: The login process assumes the user has an active IFI network connection or an active UZH VPN network connection. Moreover, we use the VNC viewer to have a GUI display. It can be downloaded from the Internet and installed on a local laptop. The user name used here is louismariel. However, each cloud operator should use their own user name to perform load simulations in the cloud.
1. Login to the node n19: We use the following command:
# putty -ssh -L 4545:n19:5901 louismarie@192.41.136.222
The password for louismariel is: csgcsg123. On a Windows laptop the above command is typed at the cmd command line.
2. Launch VNC Viewer. We use the 64-bit Windows version. The password for VNC Viewer is: csgcsg. We are now logged into the node n19.
3. We now select the 10 VMs on which to run the load simulations such that 2 VMs run on the first compute host, 3 VMs run on the second compute host, and 5 VMs run on the third compute host.
Table B.1: Table with code implementation statistics: Total number of code lines = 1533
Name of program                  Purpose                                                                     Code lines total
db_read_load_tables_new.py       validation and load tables reader                                           407
dbquery_vm.py                    Nova and Keystone reader                                                    248
db_ram_cpu_disk_band_alloc.py    main simulator class, time translation, aggregation, grouping, allocation   499
view_cloud_by_vms.py             entire cloud view by VMs                                                    57
view_cloud                       entire cloud view by hosts                                                  57
view_cloud_all_hosts.py          shows all compute hosts                                                     38
view_cloud_detailed_host.py      shows details of a compute host                                             75
db_view_load.py                  view the table load                                                         30
create_tenant.py                 create a tenant in the cloud                                                53
create_keypair.py                create a keypair for a tenant                                               22
create_vms.py                    create VMs for a tenant                                                     47
Figure B.1: View of the selected 5 VMs from compute host cp12
4. We view the cloud with the following commands:
# cd /home/csg/v_cloud_fake
# python view_cloud.py
Based on the output we write down the names of the VMs we intend to use for the
simulations. In a later step we will add these VMs to the input loadfile. Our selections
are shown in Figure B.1, Figure B.2 and Figure B.3.
5. We will now create the input loadfile on the controller node ctr01. For
this experiment all VMs will request the maximum amount of resources, equal to
their sizes.
Remark: All load simulator commands should be executed from the directory
/home/csg/v_cloud_fake
6. We use the commands:
# cd /home/csg/v_cloud_fake
# cp load_for_test_1.txt load_csg_1.txt
We copy the template file load_for_test_1.txt to the new loadfile named load_csg_1.txt.
A user can also create this file from scratch instead of copying. The fields in this
file are comma-separated. Any text editor can be used to edit it. In these examples we use
vi as our text editor.
Figure B.2: View of the selected 3 VMs from compute host cp16
Figure B.3: View of the selected 2 VMs from compute host cp15
7. We modify load_csg_1.txt to contain the new load for the 10 selected VMs.
8. We connect to the load database and import the loadfile with the following commands.
loadvm.db is the load database specified on the command line. The last two commands
are entered at the sqlite3 prompt.
# sqlite3 loadvm.db
# .separator ,
# .import load_csg_1.txt load_vms
9. We verify the new load table with the command:
# python db_view_load.py
The results of the above command are shown in Figure B.4.
10. We run the simulation with the following command, which redirects the results
to the file results_csg_1.txt:
# python db_ram_cpu_disk_band_alloc.py > results_csg_1.txt
11. The results can now be consulted in the file results_csg_1.txt with the
command:
# more results_csg_1.txt
Remark: We can also copy these results to our laptop for further processing. When running
multiple experiments, we should distinguish the results files by using different numbers
at the end of the results filenames, for example results_csg_1.txt, results_csg_2.txt and so on.
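The import of the comma-separated loadfile into the load table (steps 5 to 8) can also be sketched in Python. This is only an illustration and not the simulator's own code: the column names in COLUMNS are hypothetical, since this guide does not list the fields of the loadfile or the schema of the load table, and the sample rows are made up. An in-memory database is used in place of loadvm.db.

```python
import csv
import io
import sqlite3

# Hypothetical column layout (assumption): the real fields of the loadfile
# and of the load_vms table are not given in this guide.
COLUMNS = ("vm_name", "start_time", "cpu", "ram_mb", "disk_gb", "bandwidth")


def import_loadfile(conn, lines, table="load_vms"):
    """Insert one row per comma-separated line, like sqlite3's .import."""
    cols = ", ".join(COLUMNS)
    marks = ", ".join("?" for _ in COLUMNS)
    conn.execute(f"CREATE TABLE IF NOT EXISTS {table} ({cols})")
    rows = [row for row in csv.reader(lines) if row]  # skip blank lines
    for row in rows:
        if len(row) != len(COLUMNS):
            raise ValueError(f"expected {len(COLUMNS)} fields, got {row}")
    conn.executemany(f"INSERT INTO {table} VALUES ({marks})", rows)
    conn.commit()
    return len(rows)


# Made-up sample rows standing in for the contents of the loadfile.
sample = io.StringIO("vm1,0,2,4096,40,100\nvm2,0,1,2048,20,100\n")
conn = sqlite3.connect(":memory:")  # the guide uses loadvm.db instead
count = import_loadfile(conn, sample)
print(count, "rows imported")
```

Rejecting rows with the wrong number of fields before inserting anything makes a malformed loadfile visible at import time rather than during the simulation.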
B.3 Use-Case 2: Running Simulations with new Cloud Components
Description: We will use 50 VMs running on 10 compute hosts to run the
second experiment. Some of these VMs will be newly created and will belong
to a newly created tenant.
1. We want to create a new tenant called patrick that will own 30 VMs in the cloud.
At present this tenant does not exist. We use the following two commands to view
the existing tenants in the cloud:
# source admin-openrc.sh
# keystone tenant-list
The output of the above commands is shown in Figure B.5. It confirms that tenant
patrick does not exist.
Figure B.4: View of the load requests for all the 10 VMs
Figure B.5: View of the tenant list showing tenant patrick does not exist
2. Next we create the tenant patrick with the following commands:
# source admin-openrc.sh
# python create_tenant.py patrick
The second command creates tenant patrick and generates the file patrick-openrc.sh
containing the credentials of user patrick. Its output is shown in Figure B.6.
3. We confirm the creation of the new tenant patrick with the following command:
# keystone tenant-list
The output of the above command is shown in Figure B.7 and confirms that tenant
patrick has been created.
4. Next we create a key-pair for tenant patrick with the following commands:
# source patrick-openrc.sh
# python create_keypair.py patrick
5. We verify that the keypair has been created with the following command:
# nova keypair-list
The output of the above command is shown in Figure B.8.
6. Next we view the VM flavors with the following command:
# nova flavor-list
7. Next we create VMs using the patrick tenant with the following command:
# python create_vm.py patrick m1.large 1 20
The above command creates 20 VMs with flavor m1.large for tenant patrick with
the names Patrick1, Patrick2, ..., Patrick20.
8. Finally we verify that the new VMs have been created with the following command:
# python view_cloud_by_vms.py
Figure B.9 shows the 20 newly created VMs of the new tenant patrick.
Figure B.6: View of the output of the create_tenant.py command
Figure B.7: View of the output of the keystone tenant-list command showing the newly
created tenant patrick
Figure B.8: View of the output of the nova keypair-list command showing the newly
created patrick-key
Figure B.9: View showing the 20 newly created VMs of tenant patrick as well as the
updated total for the cloud
Remark 1: VM names must be unique in the cloud in order for the
load simulator to resolve them. Duplicate names that were created by mistake must be
deleted.
Remark 2: From now on the new VMs can be used in the simulation experiments as shown in
use-case 1. The fact that these new VMs belong to the new
tenant patrick is retrieved automatically by the load simulator at runtime.
Remark 3: Deletion of VMs from the cloud is discouraged. It should be the exception and
not the rule. The developed cloud primitives help to check where
the indices associated with the VMs of a tenant end. This determines the valid
starting index for new VMs.
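The naming rules in the remarks above can be illustrated with a short Python sketch. It is not part of the implemented primitives. The helper names are hypothetical, and it assumes that the last two arguments of the VM creation command are a starting index and a number of VMs, which the guide suggests but does not state explicitly.

```python
import re
from collections import Counter


def new_vm_names(prefix, start, count):
    """Names such as Patrick1 ... Patrick20 for start=1, count=20 (assumed scheme)."""
    return [f"{prefix}{i}" for i in range(start, start + count)]


def next_start_index(existing_names, prefix):
    """Return one more than the highest numeric suffix already used with prefix."""
    pattern = re.compile(re.escape(prefix) + r"(\d+)")
    used = []
    for name in existing_names:
        match = pattern.fullmatch(name)
        if match:
            used.append(int(match.group(1)))
    return max(used, default=0) + 1


def duplicate_names(names):
    """Return the names that occur more than once, sorted alphabetically."""
    return sorted(name for name, n in Counter(names).items() if n > 1)


existing = new_vm_names("Patrick", 1, 20)
print(next_start_index(existing, "Patrick"))       # next free index for this tenant
print(duplicate_names(existing + ["Patrick3"]))    # names that would clash
```

Because the next index is derived from the highest suffix in use, gaps left by deleted VMs are not reused, which is one reason deletions complicate the numbering.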
B.4 Use-Case 3: Exploring the Cloud using the implemented Primitives
Table B.2 shows how to explore the cloud infrastructure using the implemented primitives.
It summarizes the commands and their purposes. Next we show the output of each
command as executed from the implemented cloud simulator.
Remark: All commands are executed from the simulator home directory
/home/csg/v_cloud_fake
Table B.2: Examples of some implemented primitives that allow interaction with the
Cloud Infrastructure Simulator
User interface               Purpose
view_cloud.py                Shows all the VMs running in the cloud
                             along with the host and VM resources.
                             Useful for selecting the compute hosts
                             on which to run simulations.
view_cloud_by_vms.py         Shows all the VMs running in the cloud
                             along with the host and VM resources.
                             Useful for selecting the next valid VM names.
view_cloud_all_hosts.py      Shows all compute hosts running in the cloud
                             along with their resources and number of VMs.
view_cloud_detailed_host.py  Shows all VMs running on a specific compute host
                             along with their resources.
keystone tenant-list         View all tenants configured in the cloud.
                             These will belong to the current tenant.
Example 1: Viewing the entire cloud
# python view_cloud.py
Figure B.10 shows the entire view of the cloud including all the running VMs, their
resources and their owners. The information provided here is ideal for selecting the
compute hosts on which to run simulations, because the compute hosts are listed
with all their running VMs.
# python view_cloud_by_vms.py
Figure B.11 shows the entire view of the cloud including all the running VMs, their
resources and their owners. The information provided here is ideal for selecting the
next valid names for VMs, because all VM names belonging to a tenant appear
in sequential order.
Example 2: Viewing Cloud compute hosts
# python view_cloud_all_hosts.py
Figure B.12 shows the compute hosts running in the cloud along with their resources and
the total number of VMs running on each of them.
# python view_cloud_detailed_host.py cp01
Figure B.13 shows the detailed view of a specific compute host in the cloud. In this
example we have a detailed view of compute host cp01, which is specified on the above
command line.
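Conceptually, the host views in this example group the VMs of the cloud by the compute host they run on. The following Python sketch illustrates that grouping only. It does not reproduce the implemented primitives, which read their data from the cloud. The function name and the VM names are made up, while the host names and the 5/3/2 distribution follow the selection made in use-case 1.

```python
from collections import defaultdict


def vms_per_host(records):
    """Group (host, vm_name) pairs into {host: sorted list of VM names}."""
    by_host = defaultdict(list)
    for host, vm_name in records:
        by_host[host].append(vm_name)
    return {host: sorted(vms) for host, vms in sorted(by_host.items())}


# Placeholder records: the VM names are invented for illustration.
records = (
    [("cp12", f"vm{i:02d}") for i in range(1, 6)]     # 5 VMs on cp12
    + [("cp16", f"vm{i:02d}") for i in range(6, 9)]   # 3 VMs on cp16
    + [("cp15", f"vm{i:02d}") for i in range(9, 11)]  # 2 VMs on cp15
)

summary = vms_per_host(records)
for host, vms in summary.items():
    print(host, len(vms), vms)
```

The per-host count corresponds to the summary of all hosts, and the list stored for a single key corresponds to the detailed view of one host.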
Example 3: Viewing Cloud tenants
# source admin-openrc.sh
# keystone tenant-list
Figure B.14 shows the detailed view of all tenants configured in the cloud. In this example
the tenants louis, patrick, test and demo have been created for experiments. The admin tenant
is used for administrative purposes and the tenant service is used for service creation in the
cloud.
Example 4: Viewing the load requests
# python db_view_load.py
Figure B.15 shows the input loadfile. It contains the resource requests of the VMs listed
in the loadfile. The start time is the time at which load placement begins on a specific
VM.
Figure B.10: View of the entire cloud sorted on compute host names. Useful to select
compute hosts for simulations
Figure B.11: Entire view of the cloud sorted on VM names. Useful to select the next
valid VM names in the cloud
Figure B.12: View of the compute hosts along with their resources and number of VMs
Figure B.13: Detailed view of compute host cp01
Figure B.14: View of all the tenants configured in the cloud
Figure B.15: View of the current loadfile
Appendix C
Contents of the CD
All the implemented code.