MOBILE NETWORKS AND APPLICATIONS (MONET), SPECIAL ISSUE ON ADVANCES IN MOBILE APPLICATIONS AND PROVISIONED SERVICES

EMUNE: Architecture for Mobile Data Transfer Scheduling with Network Availability Predictions

Upendra Rathnayake, Henrik Petander, Maximilian Ott and Aruna Seneviratne

Abstract—With the mobile communication market increasingly moving towards value-added services, the network cost will need to be included in the service offering itself. This will lead service providers to optimize network usage based on real cost rather than on the simplified network plans traditionally sold to consumers. Meanwhile, today's mobile devices increasingly contain multiple radios, enabling users on the move to take advantage of the heterogeneous wireless network environment. In addition, we observe that many bandwidth-intensive services such as video on demand and software updates are essentially non-real-time, and buffers in mobile devices are effectively unlimited. We therefore propose EMUNE, a new transfer service which leverages these aspects. It supports opportunistic bulk transfers in high-bandwidth networks while adapting to device power concerns, application requirements and user preferences of cost and quality. The proposed architecture consists of an API, a transport service and two main functional units. The well-defined API hides all internal complexities from the programmer and provides easy access to the functionality. The prediction engine infers future network and bandwidth availability. The scheduling engine takes the output of the prediction engine, as well as the power and monetary costs, application requirements and user preferences, into account and determines which interface to use, when, and for how long, for all outstanding data transfer requests. The transport service then executes the inferred data transfer schedule.
The results from the implementation of EMUNE, and from the evaluation of the prediction and scheduling engines against real user data, show the effectiveness of the proposed architecture for better utilization of multiple network interfaces in mobile devices.

Index Terms—Network Interface Selection, Network Availability Prediction, Data Transfer Scheduling.

• U. Rathnayake is with the EET school of UNSW, and also with NICTA, Sydney, Australia. E-mail: Upendra.Rathnayake@nicta.com.au
• H. Petander, M. Ott and A. Seneviratne are with NICTA, Sydney, Australia. A. Seneviratne is in addition with the EET school of UNSW, Sydney, Australia.

1 INTRODUCTION

Today's mobile devices increasingly contain multiple radio interfaces. User mobility allows them to connect to multiple access networks with different characteristics at different times. Prior work such as [33] has demonstrated that a user can optimize utility by performing vertical handovers between different available networks. The general focus of vertical handovers to date has been to provide continuous connectivity. However, there is a large class of bandwidth-intensive non/near-real-time applications, such as software updates, podcasts, email and even video on demand (with HTTP streaming), where continuous connectivity is not mandatory. Current and emerging mobile devices with a high memory capacity enable the support of this class of applications using smart scheduling and buffering. In other words, the data transfers can be scheduled taking into consideration a number of factors such as power usage, user preferences such as the willingness to wait, and monetary cost. This provides benefits to the users by adapting to their preferences, and also to the network operators by enabling them to better utilize their network resources, distributing traffic between different access networks over different times. In addition, user utility can be further enhanced using network availability predictions.
The use of smart buffering, scheduling and prediction is not new, and different proposals [1], [2], [6], [20], [21], [22] have addressed different aspects. To the best of our knowledge, however, none of the work reported to date provides a complete solution, as described in the related work section. That is, they do not provide a system which performs prediction, scheduling and execution of the schedule with a proper transport service, so that user utility is maximized and communication costs are minimized by exploiting the delay tolerance of applications. Moreover, they do not provide methods which use context information commonly available to a mobile user to accurately predict network availability, nor schedulers that consider the uncertainties associated with predictions. In this paper, we therefore present an architecture which enables client-side decision making, taking into consideration network and bandwidth availability predictions, user preferences and application requirements to optimally schedule data transfers while accounting for the uncertainty in predictions. Additionally, as the decision engine resides only in the mobile device, no changes to the infrastructure are required. In so doing, this paper makes two main contributions in addition to presenting the architecture, namely:

• Provides a method, based on Dynamic Bayesian Networks, which uses commonly available context information for accurately predicting the availability (and the available bandwidth) of each network type.
It also contains an analysis of the model's computational complexity; and

• Presents a computationally inexpensive optimization method to dynamically schedule the data transfers of all applications, which takes into account the application requirements, user preferences of quality and cost, and the network availability predictions together with the uncertainty associated with them.

The rest of the paper is organized as follows. In the next section we consider a motivating scenario for the proposed architecture. This is followed by related work. Section 4 presents the proposed architecture, followed by implementation details of the API and the transport service in Section 5. Section 6 describes the prediction engine in detail, followed by the models and results of the scheduling engine in Section 7. The conclusion is presented in Section 8.

2 MOTIVATING SCENARIO

Imagine a person in a place such as an airport or a city cafe, where Wi-Fi hotspots are overlaid over a 3G network as shown in Figure 1. Assume that a user in the hotspot #1 area is progressively downloading and watching a video on demand (VOD) file. Further assume that the mobile device is trying to download a newly available software update in the background.

Fig. 1. Heterogeneous Computing Scenario

Now consider the situation where the user moves out of the coverage area of hotspot #1 into a more congested and "costly" 3G network. However, the route she is taking will take her into an area covered by another hotspot, #2, some time later. In this case, if the mobile device can predict the availability of hotspot #2 ahead of time, it can provide better utility. This can be done by first re-scheduling the software update and using the available bandwidth to buffer the VOD data before moving out of the coverage area of hotspot #1. Secondly, when out of the hotspot #1 area and connected to the more expensive 3G network, the device will download the VOD data at a minimum rate which enables smooth playback of the video, and pause the data transfer of the software update until inside the coverage area of hotspot #2. This is schematically shown in Figure 2.

Fig. 2. Scheduled downloading.

3 RELATED WORK

It has been convincingly demonstrated that the usage of multiple radio interfaces can dramatically improve the performance and functionality of mobile communications when compared to using a single radio interface [15]. One of the primary focuses in evaluating multi-interfaced systems in the literature has been switching among multiple network interfaces to reduce overall power consumption [1], [16], [17]. In contrast, our work focuses on improving the overall user utility. Another focus has been the selection of network interfaces using policy-based mechanisms [18], [19]. Moreover, [23]-[27] demonstrate how to select the best interface dynamically in a heterogeneous network environment. However, they assume the use of a single network interface at a time. The main motivation behind all of this work was to accommodate continuous connectivity through the interface which satisfies the needs of applications and the user's preferences, while optimizing the vertical handoff decisions. They do not consider exploiting the delay tolerance of applications to improve user utility.

A method of maximizing user expectations for non-real-time data, by enabling cost- and performance-aware selection, is described in [20]. The proposed method is static as it uses a pre-specified set of networks for a given application. A method which uses a policy-based mechanism for context-aware connectivity management is presented in [21]. A scheduling mechanism for transmitting time-critical data in a cost-aware manner over heterogeneous networks has been presented in [22]. Compared to our scheme, these schemes are reactive in that they do not consider future network/bandwidth availability. Further, Thompson et al.
have proposed a flow scheduling mechanism named PERM [41], which tries to improve latency and transmission times by predicting users' traffic and then allocating it to different networks. Compared to our work, PERM does not predict the availability of networks for mobile users, and it mostly caters only for residential users. Further, the network allocation for the flows is static and, most importantly, does not exploit the delay tolerance of applications to maximize the benefits to the users.

When it comes to predicting network availability, we observe that it has been extensively investigated over the past few years. [1] describes a method of determining Wi-Fi availability in a particular GSM cell. They do this by simply recording the Wi-Fi presence every 5 minutes, and use the ratio of the number of times a WLAN was available over the number of times it was recorded to infer the probability that one would encounter a Wi-Fi network in a given GSM cell. In contrast to our scheme, this method does not provide any indication of the probability of the user being within a WLAN coverage area ahead of time. Similar approaches with domain-independent mobility models have also been reported in the literature, as surveyed in [7]. Song et al. [8] evaluate such domain-independent location predictors with extensive Wi-Fi data collected in a campus environment. They show that simple low-order Markov predictors work better than high-order Markov predictors, and as well as or better than the more complex compression-based predictors. However, these proposals are generally for a single network environment and, more importantly, in contrast to the proposed scheme, do not provide any indication of availability ahead of time. Vanrompay et al. [6] treat network availability prediction as a task embedded in user mobility predictions.
Here, the mobility prediction is used for resource scheduling with the availability of Bluetooth connectivity. Given the next user location obtained from user path predictions, whether Bluetooth connectivity exists at that location is determined using a Bluetooth coverage map. Similarly, a number of domain-specific user path prediction approaches [9], [11], [12], [13] can be coupled with network coverage maps, or more sophisticated QoS maps as in [10], to predict the availability of networks ahead of time. However, these models are based on assumptions such as constant user speed and fixed cell size and shape, which are unrealistic. In contrast, our prediction approach uses any available context information to predict network availability with respect to time in a heterogeneous environment, without using any domain-specific knowledge.

With regard to our scheduling engine work, the closest solution for scheduling data transfers to maximize user utility on multi-interfaced mobile devices in heterogeneous networking environments is [2]. The authors propose a scheduling scheme based on a hill climbing algorithm to maximize user utilities over energy and monetary costs for non-real-time bulk data. They take into account future network availability and use the predicted availability to schedule the data transfers. However, this work assumes that the network availability predictions are perfect and that the predictions can be derived for arbitrarily long durations, assumptions which are clearly not realistic.

Stochastic scheduling, on which our proposed scheduling scheme is based, has been extensively studied in multi-processor [29], [30] and multi-server environments [31], [32]. This body of work considers the problem of allocating resources to requests, such as processors to programs and servers to customer requests, while addressing the uncertainties associated with the requests. However, our proposed scheme differs from these in two fundamental ways.
First, we consider the unavailability of resources, namely the unavailability of wireless networks. Second, as described in Section 7.2, we look into the resource (network) availability in the future and also consider the cost of resources, namely the costs of using different networks, in the optimization.

4 ARCHITECTURE FOR EFFECTIVE MOBILE USAGE OF HETEROGENEOUS NETWORKS: EMUNE

EMUNE provides support for non/near-real-time applications. It consists of four components, namely an API which provides easy access to the functionality offered, two separate functional units (a prediction engine and a scheduling engine), and a transport service, as shown in Figure 3.

Fig. 3. EMUNE Architecture.

The API provides an interface to a scheduling library through which application programmers can gain access to the functionality provided by EMUNE. Two types of non-real-time applications are supported by the API: bulk data transfer and streaming applications. The applications request the API to transfer data according to their requirements, and EMUNE then exploits their delay tolerance to maximize user utility and to minimize network costs in transferring their data. More details follow in Section 5.1.

The prediction engine is responsible for predicting the future availability of networks and their bandwidths, so that the scheduling engine can assign the data transfers of applications to appropriate networks to obtain the best utility. It records the mobile device's associated context information for prediction purposes, as described in Section 6.

The scheduling engine in EMUNE is responsible for determining the schedule for the transfer of data from each application, maximizing user utility while minimizing power and monetary costs. It obtains network and power cost information, network and bandwidth availability predictions, and application requirements to derive its schedule, as described in detail in Section 7. The transport service assigns the data transfers of each application over available networks according to the schedule given by the scheduling engine, by controlling the relevant communication protocols accordingly (Section 5.2).

5 REALIZATION OF THE API AND THE TRANSPORT SERVICE

5.1 API

The flexibility that the delay tolerance of applications offers to scheduling is shown schematically in Figure 4 for the scenario described in Section 2. For bulk data transfers, as the data is accessed only after all of it has been downloaded, data can be fetched at any time and in any order before the deadline, without having to maintain a minimum overall rate1 (S/W updates in Figure 4). On the other hand, streaming applications read data from a buffer at a given rate depending on the codec. Therefore, EMUNE should make sure that the data of these applications is downloaded while satisfying a "minimum transfer rate", as discussed in Section 7.1.

Fig. 4. Buffers offering flexibility in scheduling.

In EMUNE, the API allows bulk data applications to specify the file size and when the transfer should be completed. Streaming applications can indicate the size and the duration of the stream, and also how much time the application can afford to wait before starting playback. This is done by accepting the following parameters:

Bulk Data
1) Size of data;
2) Preferred transfer time: the soft deadline - Section 7.1; and
3) Maximum transfer time: the hard deadline - Section 7.1.

Streaming
1) Size of data;
2) Duration of the stream; and
3) Maximum startup delay: the time EMUNE has to buffer data before playback.

1. This applies to small downloads which can be finished within the prediction window. For larger downloads, a minimum download rate is necessary to adhere to the transfer deadlines (Section 7.2.2).
The API passes the parameters "minimum transfer rate", "soft deadline" and "hard deadline" to the scheduling engine for each application, bulk or streaming. For bulk transfers, the former is calculated by dividing the size by the preferred transfer time, whereas the latter two are directly available. For streaming applications, the duration is used as the soft deadline; the hard deadline is calculated from the soft deadline by adding the maximum startup delay to it. Further, the "minimum transfer rate" is taken as the size divided by the duration. EMUNE then combines this information with system-wide user preferences or utility functions (Section 7.1), and with network availability predictions from the prediction engine, to schedule the transfer of data as described in Section 7.

The API is provided by extending the socket API. The socket library calls connect and accept, which wrap the corresponding socket system calls, are modified to accept a variable number of extra arguments in addition to the standard socket system call arguments. This way of extending the socket calls allows legacy applications to remain unchanged. The extra arguments to socket calls from "EMUNE-aware" applications are extracted in the system C library, and the required parameters are passed to the EMUNE scheduler along with information about the application and the socket. For legacy applications, some of these requirements may be learned by intercepting and reading the header information of application traffic, to determine the sizes of HTTP downloads or the bitrates of video streams. This information could then be matched with application or service profiles based on URLs, domain names, or other supporting information. Combining the application-specific requirements learned through the API with utility functions allows EMUNE to prioritize applications if a shortage of network capacity prevents EMUNE from satisfying the requirements of all applications.
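As a rough illustration of the two mechanisms above — the extended connect call carrying extra arguments, and the derivation of the three scheduler parameters from the API parameters — the following Python sketch shows one possible shape. All names here (emune_connect, to_scheduler_params, the parameter tuples) are hypothetical; the real EMUNE shim works at the C-library level on connect/accept.

```python
# Hypothetical sketch only; EMUNE intercepts connect()/accept() in the
# system C library rather than using Python-level wrappers.

def to_scheduler_params(kind, size, t1, t2):
    """Map API parameters to (min_rate, soft_deadline, hard_deadline).

    Bulk:   t1 = preferred transfer time, t2 = maximum transfer time.
    Stream: t1 = stream duration,         t2 = maximum startup delay.
    """
    if kind == "bulk":
        # Minimum rate needed to finish the whole file by the soft deadline.
        return (size / t1, t1, t2)
    # Streaming: the duration acts as the soft deadline; the hard deadline
    # adds the allowed startup delay; the min rate keeps the codec buffer fed.
    return (size / t1, t1, t1 + t2)

def emune_connect(real_connect, scheduler_queue, address, *,
                  emune_params=None):
    """connect() wrapper: EMUNE-aware callers pass extra parameters,
    which are handed to the scheduler; legacy callers are untouched."""
    if emune_params is not None:
        scheduler_queue.append((address,
                                to_scheduler_params(*emune_params)))
    return real_connect(address)
```

For example, a 100 MB bulk download with a preferred transfer time of 1000 s and a maximum of 2000 s yields a minimum rate of 100 kB/s with deadlines (1000, 2000).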
To track the buffer occupancy of the connection and adjust the scheduling, EMUNE tracks further actions on the socket, namely reading from and writing to the socket and closing it. Similar buffering is already done in applications such as online Flash-based video players, in which one thread reads data from the network as fast as possible and writes it into a buffer, from which a second thread reads the data for the video codec at the codec data rate. Moving this common functionality into the scheduling library simplifies the writing of streaming applications and allows all applications to coexist without starvation of bandwidth.

The API was implemented by intercepting the socket library calls using a shim layer. Although this approach adds an overhead to these system calls, the overhead is small enough not to noticeably affect the performance of networked applications. The API could also be implemented by updating the C library.

5.2 Transport Service

Fig. 5. Transport service network architecture.

A combination of the Monami [36] and FreezeTCP [28] protocols is used in EMUNE to provide the transport service. Monami provides a mechanism to dynamically direct flows over different networks; FreezeTCP enables flows to be paused and resumed. Together, they enable EMUNE to optimally use the available networks across time and space. The Monami protocol operates between a mobile node and a fixed anchor-point router, the home agent, and can move data traffic flows across different network interfaces. The benefit of using Monami is that it can treat each flow independently of the other flows. FreezeTCP, on the other hand, uses TCP zero-window messages to pause data transfers, and can resume transfers on demand as long as the receiver acknowledges keep-alive messages from the sender.
Furthermore, FreezeTCP can be used as a mechanism to pause flows when transfers are "costly" and to resume them when conditions have improved, similar to limiting the impact of brief disconnections during network outages [37]. The upside of this is that the prediction requirements are less strict, and the mobile node would nearly always be able to respond to TCP keep-alive messages via UMTS or GPRS, thus enabling longer pausing of downloads or uploads. This is illustrated in Figure 5, which shows how Monami can be used to reroute the TCP keep-alive messages over UMTS to the mobile device with the help of its Home Agent. The use of these two protocols does not require any changes to the communication peers in the fixed network.

Support for simultaneous multi-homing, i.e. using more than one network interface for the same data transfer session, could also potentially increase the performance of the transport if the RCP [38] protocol is used. However, the proposed system does not rely on RCP as a transport and session management mechanism, due to deployability concerns. Nevertheless, RCP could be used between the Mobile Node and the Home Agent acting as a proxy. We leave this optimization as future work.

6 THE DESIGN OF THE PREDICTION ENGINE

The prediction engine provides information about the future availability of networks and their available bandwidths for each network type supported by the mobile device. The models we used to predict network availability, and their computational complexities, are described in the following subsections.

6.1 The Prediction Models

There are numerous ways of predicting network availability. However, as described in the related work section, these approaches suffer from shortcomings due to assumptions such as fixed cell size and shape, constant user speed, etc.
Additionally, the accuracy of predictions can be increased by considering not only location-based information but also any other context information which influences future network availability, something that has mostly been ignored by the state of the art. Therefore, one needs to consider the system of all of the context variables (including location-based variables) in predicting network availability. To address this, we proposed modeling the system as an nth-order Semi-Markov model. We first conducted an experiment to record as many context variables as possible for 12 users in Sydney, Australia. The variable set included 4 context variables, 3 of them binary, including the variable that we want to predict, "the availability of WLAN". The remaining context variable had a number of realizations, typically 10 to 15 on average per user. With this data set, we showed that the second-order Semi-Markov model causes fewer prediction errors [3].

However, there are a number of unobserved context variables which limit the accuracy of the predictions. To overcome this, we proposed modeling the system with a hidden variable to account for the unobserved context variables, i.e. assuming that it is possible to map a number of unobserved variables to a single hidden variable. This enabled the system to be modeled as a Hidden Markov Model (HMM), shown in Figure 6.a. However, the prediction accuracy with the HMM was inferior to that of the Semi-Markov model proposed earlier. The under-performance stems from the limiting assumption in the HMM that the current observation does not depend on the previous observation. This can be overcome by making the current observation dependent on the previous observation, which is commonly referred to as an Auto-Regressive HMM (Figure 6.b). But this model results in a significant increase in the parameter space and hence in the computational complexity.
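To make the low-order Markov idea concrete, the sketch below trains a plain second-order Markov predictor on a discretized context sequence and returns the most likely next state. This is only an illustration of the general technique; the model used in [3] additionally tracks state holding times (the "semi" part), which is omitted here.

```python
from collections import Counter, defaultdict

def train_order2(states):
    """Count transitions (state[t-2], state[t-1]) -> state[t]."""
    counts = defaultdict(Counter)
    for a, b, c in zip(states, states[1:], states[2:]):
        counts[(a, b)][c] += 1
    return counts

def predict_next(counts, a, b):
    """Most frequently observed successor of the pair (a, b),
    or None if that pair never occurred in training."""
    if (a, b) not in counts:
        return None
    return counts[(a, b)].most_common(1)[0][0]
```

With a binary WLAN-availability trace such as [0, 0, 1, 0, 0, 1, 0, 0], predict_next(counts, 0, 0) returns 1, i.e. WLAN coverage is expected after two consecutive off-network samples.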
To reduce the parameter space, we proposed a generic Dynamic Bayesian Network (DBN) based model where the observable variables are individually represented in the model [4]. This reduces the parameter space, but loses the tight coupling between consecutive observations, as shown in Figure 7 for the four variables. Figure 8 shows that this model provides better accuracy than both the Semi-Markov and HMM models: the average prediction error with the second-order Semi-Markov model was found to be 29%, with the HMM 35%, and with the DBN 23%.

Fig. 6. HMM and Auto-Regressive HMM Models for two time steps.

Fig. 7. The DBN Model for two time steps.

Fig. 8. Prediction errors with different models.

Although we limit our work to predicting the presence of networks, the same methods could readily be extended to predicting not only presence but also parameters such as available bandwidth, provided the users are able to record that information. On one hand, the mobile device itself can send a probe packet to the network and intrusively measure the available bandwidth at a point in time. The better way, however, is for the networks to make such information available to the users. The incentive for the networks to do so is that, when coupled with a congestion-based pricing scheme as in [40], users become aware of the status of the network and can act on it in advance. That will lead to less load on the network at congested times, as users will time-shift their delay-tolerant traffic with a system like EMUNE.

6.2 Computational complexities of the models

The HMM, Auto-Regressive HMM and DBN models are typically trained using the EM algorithm for calculating their parameters [14].
The higher the total number of parameters to be estimated, the higher the number of computations needed to train the model, as in the M step of EM we need to maximize over all of the unknown parameters [14]. Therefore, we compare2 these models' computational complexities in terms of their number of unique parameters to be estimated. In comparison, the Semi-Markov model does not involve running any algorithm to train model parameters, unlike the above models, but merely counts events to calculate the parameters. It is therefore not considered in this comparison, and can safely be assumed to have insignificant computational complexity.

In an HMM, the observable variable O depends only on the current hidden variable H, and the hidden variable depends only on the previous hidden variable, as shown in Figure 6.a. In our setup, we used the vector containing all four observed context variables as O: each different combination of realizations of the four variables becomes a separate realization of O. H represents the entire unobserved context, and we assumed 3 realizations for H [4]. As mentioned in Section 6.1, of these four context variables, three are binary whereas the remaining one has a number of realizations; let us denote that number by L. With this configuration, for the HMM the number of unique parameters to be estimated for the conditional probability table (CPT) of O is 3 × (8L − 1). For H at t2 it is 3 × (3 − 1), and for H at t1, (3 − 1), resulting in a total of (24L + 5) parameters. Note that for the HMM, the O variables at t1 and t2 both have the same dependencies and can be tied together to share the same CPT. For the Auto-Regressive HMM, on the other hand, the numbers are as follows: O at t1 = 3 × (8L − 1), O at t2 = 8L × 3 × (8L − 1), H at t1 = (3 − 1) and H at t2 = 3 × (3 − 1), resulting in a total of (192L^2 + 5).
In the Semi-Markov and HMM models, all of the observed variables were modeled as a single vector-valued variable. However, they are taken as separate individual variables in the DBN model, shown in Figure 7 for the four variables. H at the top of the model diagram is the hidden variable as before, which accounts for the entire unobserved context. The bottom variable, O4, represents the observable variable that we want to predict, "the availability of WLAN" in this case. All the other observable variables are in the middle: O1, O2 and O3 in this case (O1 is the one which has L realizations). With this dependency structure, the model results in the following numbers of unique parameters for the CPT of each variable. At t1: H = (3 − 1), O1 = 3 × (L − 1), O2 = 3 × (2 − 1), O3 = 3 × (2 − 1), O4 = 12L × (2 − 1); and at t2: H = 3 × (3 − 1), O1 = 3L × (L − 1), O2 = 6 × (2 − 1), O3 = 6 × (2 − 1), O4 = 24L × (2 − 1), giving (3L^2 + 36L + 23) parameters in total.

Fig. 9. Number of Parameters with Different Models.

Figure 9 shows how the total number grows with L, on a log scale. With L = 10, the numbers of parameters to be estimated for the HMM and this model are 245 and 683 respectively, in contrast to 19205 for the Auto-Regressive HMM, less than one twentieth of it. Even though the total numbers of parameters for both the DBN and the Auto-Regressive HMM models are second-order polynomials in L, the clear difference between them for typical values of L is caused by the difference in the respective coefficients of L^2: for the DBN it is 3, whereas for the Auto-Regressive HMM it is 192, which explodes the parameter space even for low values of L.

2. We include the Auto-Regressive HMM to show the explosion of the parameter space with this model.
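The parameter counts derived above can be cross-checked mechanically. The short script below encodes the three CPT-size formulas (assuming, as above, 3 hidden-state realizations, three binary observables and one L-valued observable) and reproduces the totals quoted for L = 10.

```python
def hmm_params(L):
    # O CPT (tied across t1 and t2): 3*(8L-1); H at t2: 3*(3-1); H at t1: 3-1
    return 3 * (8 * L - 1) + 3 * 2 + 2            # = 24L + 5

def ar_hmm_params(L):
    # O at t1, O at t2 (also conditioned on previous O), H at t1, H at t2
    return (3 * (8 * L - 1) + 8 * L * 3 * (8 * L - 1)
            + 2 + 3 * 2)                          # = 192L^2 + 5

def dbn_params(L):
    t1 = 2 + 3 * (L - 1) + 3 + 3 + 12 * L         # H, O1, O2, O3, O4 at t1
    t2 = 6 + 3 * L * (L - 1) + 6 + 6 + 24 * L     # H, O1, O2, O3, O4 at t2
    return t1 + t2                                # = 3L^2 + 36L + 23
```

For L = 10 these give 245, 19205 and 683 respectively, matching the figures above.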
This clearly shows that the DBN model limits the number of model parameters whilst still preserving the real dependency structure of the observable variables to some extent, making it capable of effectively using hidden variables to predict network availability better than the HMM, Auto-Regressive HMM and Semi-Markov models.

7 THE DESIGN OF THE SCHEDULING ENGINE

Scheduling in mobile communications has also received considerable attention. However, existing schemes assume perfect network availability predictions over arbitrarily long durations, as described in the related work section. The way to overcome this is to use a stochastic optimization model for probabilistically scheduling data transfers. In our earlier work [5], we proposed such a model and showed with real user data that it optimizes network usage for delay-tolerant applications by maximizing user utility while minimizing network costs. However, it is necessary to reduce its number of computations for practical use, and in this section we describe an enhanced version of the model. We further show, with the same real user data, that it performs close to the stochastic model and better than other non-stochastic variants and greedy approaches under numerous networking conditions.

To determine the schedule, the scheduling engine takes a number of inputs:
• The predicted network availability profile and bandwidth for each network type,
• The power cost incurred in transferring data and the monetary cost of usage,
• Application requirements,
• User preferences for costs and quality in terms of application utility functions (Section 7.1).

The predictions are obtained from the prediction engine. EMUNE can measure the power cost as in [1]. The monetary cost can be calculated from the details of the tariff plans provided by the user to the system at initialization time. Additionally, the monetary cost can be obtained on the fly from the networks, as in the case of congestion-based pricing.
The applications' "minimum transfer rate" requirement is obtained via the API. The utility functions are formed by using user preferences stored in a configuration file, together with soft and hard deadlines received through the API for each application needing data transfer. Using this information, the scheduling engine finds an optimal schedule for transferring the data of all application requests. Further, it dynamically revises its schedules when new data transfer requests arrive or when circumstances change.

7.1 Utility Functions for Applications

User utility functions for applications play a prominent role in scheduling. Utility functions represent a user's preferences for transferring application data over a period of time. They can be used to differentiate applications' importance from the user's point of view. A utility function specifies the utility of having a data unit of an application at time "t", and these functions assume that even a partial download (e.g. web pages, video) or a completion beyond the deadline (e.g. software updates) can be useful. Previously proposed schemes have used linearly decreasing utility functions where the rate of decrease is constant throughout and the utility approaches zero after a hard deadline, as illustrated in Figure 10.a [2]. As this does not correctly represent the utility, especially of non-real-time applications, we proposed piecewise linear utility functions where we consider two deadlines: soft and hard [5]. The soft deadline provides a way to schedule while taking more risk, even that of being unable to complete the transfers by it. The hard deadline, however, must be satisfied by the scheduler. We represent the corresponding utility function of non-real-time applications as a constant up to the soft deadline, decreasing linearly to zero by the hard deadline, as depicted in Figure 10.b.

Fig. 10. Piecewise linear utility functions.

Fig. 11. Scheduling Model.

Before the soft deadline, it is possible to transfer data opportunistically whenever "cheap" networks become available, without losing any utility. Even for some near-real-time or real-time applications, if transferring the data at a constant rate satisfies the user, the utility function does not have to decrease over time. In the case of a video download, for example, if we keep a constant minimum reception rate for smooth playback of the video, then it is unreasonable to assume that the utility function will decrease fast, as that would imply that the user is not as interested in the latter part of the video file as in the beginning. For such applications, we explicitly take a "minimum transfer rate" parameter together with the utility function shown in Figure 10.c when scheduling data. Real-time and other applications where transferring the data "sooner rather than later" really gives higher user satisfaction are represented by utility functions similar to Figure 10.a, as in Figure 10.d. We call the rate of decrease of an application's utility function at a point in time its "urgency" at that time, and the higher the value of an application's utility function, the higher its "importance" at that time. Forming these utility functions involves intelligently combining applications' requirements with user preferences. One coarse way would be to first divide all the applications into a few broad classes and then enforce a user defined utility curve shape and a utility value for each class. However, we understand that deriving fine grained utility functions is not a trivial task, which itself needs further research.
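The non-real-time shape of Figure 10.b can be sketched as follows (a minimal helper of our own, not taken from the paper's implementation): constant utility up to the soft deadline, then a linear decrease to zero at the hard deadline.

```python
# Piecewise linear utility of Figure 10.b: utility per data unit at time t,
# given the base utility u0 and the soft/hard deadlines (same time unit).
def utility(t, u0, soft, hard):
    if t <= soft:
        return float(u0)                        # full utility before soft deadline
    if t >= hard:
        return 0.0                              # no utility beyond hard deadline
    return u0 * (hard - t) / (hard - soft)      # linear decay in between

# e.g. a software-update-like transfer: u0 = 15, soft = 50 min, hard = 60 min
assert utility(30, 15, 50, 60) == 15.0
assert utility(60, 15, 50, 60) == 0.0
assert abs(utility(55, 15, 50, 60) - 7.5) < 1e-9
```

The "urgency" at time t is then the (negated) slope of this function, zero before the soft deadline and u0/(hard − soft) between the deadlines.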
7.2 Shorter Durations and Uncertainties: The Stochastic-Deterministic Hybrid Model

7.2.1 Background

We identified that the state of the art in scheduling makes two assumptions: (1) that the predictions can be derived for arbitrarily long periods with good accuracy, and (2) that the predictions are 100% perfect. However, from our own work [3], [4] and others' work [39] on network availability prediction, we learned that these assumptions are hardly realistic. Accordingly, in our earlier work [5] we proposed a model based on stochastic optimization that relaxes those assumptions. In this paper, we propose a hybrid stochastic model which extends that model. We first briefly describe the background of our previously proposed stochastic scheduling model in this section, as it is the model that was enhanced to reduce the number of computations. In our previously proposed model, we consider only a short scheduling duration (5-10 minutes) into the future at a point in time for scheduling the data transfers based on network availability predictions, as shown in Figure 11. After the current scheduling duration passes, we re-schedule the remaining data for a new scheduling duration, and repeat this process. The scheduling duration has to be short for two reasons: network availability predictions cannot be derived for arbitrarily long durations, and the longer the duration, the higher the error in the network availability predictions for it. Moreover, by considering only the scheduling duration for maximizing utility and minimizing cost, if the utility of an application happens to be higher than the costs of at least two of the network interfaces, then as much data as possible would be sent within that duration over those interfaces.
But for applications which do not lose utility over time and can wait, such as software updates, it would be advantageous to look forward and find out whether cheaper networks will become available in the future. To facilitate this, we consider a second, look-up duration as well in the model (Figure 11). Despite the shorter prediction intervals, the predictions for network availability are still not perfect. Our own prediction work (section 6.1) shows that even with a good prediction model, the average prediction error is in the vicinity of 20%, which is significant. For example, we may predict that Wi-Fi will be available 60% of the time, but in reality it might be available for only 40% of the time, or for as much as 80% of the time; we refer to these as network availability scenarios. This scheduling problem of finding how much data from each application to send, and the corresponding level of usage of each network type, poses a typical optimization problem, yet one containing variables that are associated with uncertainties (primarily the network availability predictions). Therefore, we used two stage stochastic linear programming [34], [35] for our previous model. With that, we can determine the "reasonable amount" (on average, considering uncertainties) of data from each application that should be sent in the scheduling duration. This also gives some recourse decisions, enabling one to later change some of the decisions about the use of different network interfaces depending on the actual availability scenario. That means, for example, if the Wi-Fi networks are going to be available more than expected, how to utilize them optimally, or if they are going to be available less than expected, how to use any other available network cost effectively.
We re-estimate the availability of the scheduling duration at its mid-point and then accordingly try to execute those recourse decisions before the duration ends, although we are left with only half of the scheduling duration to do so.

7.2.2 The Enhanced Model

Let us introduce our notation first. Applications' data are sent in small fixed sized amounts called bundles, as in [2]. We define a time slot of an interface as the amount of time taken to send a bundle via that interface. Let Cj be the cost (which can well be a combination of monetary, power and other costs) of sending a bundle over an interface j. In the objective function f, we maximize the utility while minimizing the cost. Let I be the total number of applications, J the total number of network interfaces, Xi,t the number of data bundles from application i that needs to be sent in duration t, and Ui,t the average utility of application i in duration t. Further, we assume that there are W and Z network availability scenarios in the scheduling and look-up durations respectively, and denote a scenario in each duration by w and z accordingly. We then denote by nj,1,w the number of time slots used to send data over interface j in duration 1 in availability scenario w; by nj,2,z the number of time slots used to send data over interface j in duration 2 in availability scenario z; by Nj,1,w the number of time slots available to send data over interface j in duration 1 in availability scenario w; and by Nj,2,z the number of time slots available to send data over interface j in duration 2 in availability scenario z. In our previous model, we made an important observation regarding the amounts of data from each application to be sent in the scheduling duration (the Xi,1 values): these amounts need not be fixed at the beginning of the scheduling duration.
Therefore, we decide the quantities considering the most expected availability scenario of the scheduling duration at its beginning, and transfer accordingly up to the middle. At the middle of the scheduling duration, we re-estimate and find the actual availability scenario and, if it is different from the expected scenario, we re-evaluate to find the corresponding quantities, which we could have known at the beginning of the scheduling duration had we known its availability perfectly. Then we adapt to these quantities and try to complete the transfers despite being left with only half of the scheduling duration. Therefore, in that model [5], we explicitly take into account the stochastic nature of the predictions only for the look-up duration, and consider only a particular network availability scenario at a time for the scheduling duration. With this, we reduced the number of variables and constraints in the model. We showed in [5] that the average number of iterations for solving that model (and hence a measure of its computational complexity) is 0.9 × (7I + (3 + 3Z)J + 2Z + 2). However, as the number of network availability scenarios in the look-up duration, Z, may become high, the amount of computation needed in the model can grow considerably with a larger number of network interfaces J. In comparison, the uncertainty in the scheduling duration (the number of scenarios, W) did not add computations, as it was considered only implicitly in the model. To reduce the number of computations further, in this paper we propose to consider the stochastic nature only for the scheduling duration (still implicitly, as before) and to remove it from the look-up duration by using the most expected predictions. This results in a hybrid stochastic model which (conceptually) considers uncertainties for the scheduling duration but considers only the most expected prediction for the look-up duration.
We analyze the trade-offs of this amendment through the results in section 7.4. The maximization objective function f can then be represented as in (1), where w̄ represents a particular scenario in the scheduling duration and z̃ represents the most expected availability scenario in the look-up duration. Essentially, we discover the schedule: the optimal quantity of data from each application to be sent, and the corresponding optimal number of time slots to be used from each interface, during both durations, considering the future availability of networks. We execute only the part of the schedule which corresponds to the scheduling duration and transfer the data accordingly, because at a point in time the look-up duration is ahead of the scheduling duration, and therefore the network availability estimates for the look-up duration are prone to more errors than those for the scheduling duration. Additionally, we can obtain a prediction for another scheduling duration after the current scheduling duration passes. We therefore run the program again for that scheduling duration, considering another look-up duration ahead, and then the data is transferred for that scheduling duration. This step by step process continues as long as we have data to be transferred.

\max f = \sum_{i=1}^{I} \sum_{t=1}^{2} X_{i,t} U_{i,t} - \sum_{j=1}^{J} C_j \left[ n_{j,1,\bar{w}} + n_{j,2,\tilde{z}} \right] \quad (1)

Subject to,

X_{i,1} \ge \alpha_i \quad \forall i \quad (2)

\sum_{t=1}^{2} X_{i,t} \ge \beta_i \quad \forall i \quad (3)

\sum_{t=1}^{2} X_{i,t} \le \gamma_i \quad \forall i \quad (4)

n_{j,1,\bar{w}} \le N_{j,1,\bar{w}} \quad \forall j \quad (5)

n_{j,2,\tilde{z}} \le N_{j,2,\tilde{z}} \quad \forall j \quad (6)

\sum_{i=1}^{I} X_{i,1} = \sum_{j=1}^{J} n_{j,1,\bar{w}} \quad (7)

\sum_{i=1}^{I} X_{i,2} = \sum_{j=1}^{J} n_{j,2,\tilde{z}} \quad (8)

X_{i,t},\; n_{j,1,\bar{w}},\; n_{j,2,\tilde{z}} \ge 0 \quad \forall i, t, j \quad (9)

The αi in equation (2) represents the minimum quantity of data of application i that needs to be transferred in the first duration to satisfy its "minimum transfer rate", as discussed in section 7.1.
We consider this "minimum transfer rate" even for non-real-time applications, so as to ensure the completion of their transfers before their deadlines. βi in (3) represents the minimum quantity of data that needs to be transferred over both the scheduling and look-up durations together, and the inequality allows part or all of the minimum amount of data for the second duration to be transferred in the first duration itself. γi in (4), on the other hand, limits overuse of the available interfaces, giving delay tolerant applications the ability to hold data transfers so as to make the most of cheap networks that become available in the future. Typically γi = Kβi, where K (≥ 1) is a constant. Equations (7) and (8) give the quantity of data to be sent from each application, which is limited by the number of (predicted) available time slots given by equations (5) and (6). So the model finds the schedule (the optimal way to use future available networks) with f, constrained by equations (2) to (9), where the major constraint, the future availability of networks, is given in equations (5) to (8).

7.3 Emulation

In our work on forecasting network availability, we predicted the availability of (intermittently available) Wi-Fi for a 5 minute duration, P(Wi-Fi), repeatedly for all the 5 minute blocks in a day, for 60 days of 12 real users in Sydney, Australia (5 days per user). We also recorded the actual Wi-Fi availability on all those 60 days. In contrast to Wi-Fi, 3G was almost always available for all of the users. We found that the Wi-Fi availability prediction errors are higher in the mornings and evenings, coinciding with user travel times. In evaluating the scheduler discussed in this section, we used these travel times. We take a single bundle to be of size 1 KB, as in [2], and consider only the downloading path for the evaluation, although the models work for both directions indistinguishably.
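For a fixed scenario pair (w̄, z̃), the model (1)-(9) is an ordinary linear program. The paper's implementation uses Matlab's optimization toolbox; the sketch below instead solves the same structure with SciPy's linprog, and all concrete numbers (utilities, costs, slot counts, α/β/γ) are illustrative assumptions of ours, not values from the paper.

```python
# Hybrid model (1)-(9) for a fixed scenario pair, as a linear program.
import numpy as np
from scipy.optimize import linprog

I, J = 1, 2                        # one application; interfaces: 3G, Wi-Fi
U = np.array([[10.0, 8.0]])        # U[i][t]: average utility per bundle
C = np.array([2.0, 1.0])           # C[j]: cost per bundle over interface j
N = [[100, 20], [100, 80]]         # N[t][j]: predicted available time slots
alpha, beta, gamma = [10.0], [60.0], [90.0]

# Variable vector: [X_{i,t} for i, t] followed by [n_{t,j} for t, j].
nX, nN = 2 * I, 2 * J
c = np.concatenate([-U.ravel(), np.tile(C, 2)])   # minimize -f

A_ub, b_ub = [], []
for i in range(I):
    r = np.zeros(nX + nN); r[2 * i] = -1.0
    A_ub.append(r); b_ub.append(-alpha[i])        # (2): X_{i,1} >= alpha_i
    r = np.zeros(nX + nN); r[2 * i] = r[2 * i + 1] = -1.0
    A_ub.append(r); b_ub.append(-beta[i])         # (3): sum_t X_{i,t} >= beta_i
    A_ub.append(-r); b_ub.append(gamma[i])        # (4): sum_t X_{i,t} <= gamma_i

A_eq = np.zeros((2, nX + nN))
for t in range(2):
    for i in range(I):
        A_eq[t, 2 * i + t] = 1.0                  # (7), (8): sum_i X_{i,t} ...
    A_eq[t, nX + t * J: nX + (t + 1) * J] = -1.0  # ... equals sum_j n_{t,j}

# (5), (6) and (9) become variable bounds: 0 <= n_{t,j} <= N[t][j].
bounds = [(0, None)] * nX + [(0, N[t][j]) for t in range(2) for j in range(J)]
res = linprog(c, A_ub=np.array(A_ub), b_ub=b_ub,
              A_eq=A_eq, b_eq=[0.0, 0.0], bounds=bounds, method="highs")
max_f = -res.fun                                  # maximized objective (1)
```

With these illustrative numbers the optimum fills the 20 cheap Wi-Fi slots of duration 1 first, sends the remaining 70 bundles (up to γ = 90) over 3G in duration 1, and yields f = 740.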
To emulate a typical data transfer session whilst mobile, we consider a scenario of a mobile user downloading data of three different applications simultaneously. The applications are assumed to have requirements to download 5, 30 and 100 MB respectively. Let us define the utility functions of these three downloads as follows. The first download is assumed to have a higher utility which starts at 70 in some units, say 1/1 000 000 of a dollar per bundle downloaded, as in [2]. The utility then decreases linearly to 10 within the first 5 minutes, and again linearly to zero within the next 2 minutes, as depicted in Figure 10.d, resembling, for example, an urgent photo transferring application. The second download's utility starts at 25, decreases linearly to 15 within 15 minutes and then linearly to zero in the next 5 minutes, as depicted in Figure 10.c. This represents, for example, a video downloading application. The third resembles a non urgent, non critical application, typically a software update. Its utility starts at 15 and remains constant for the first 50 minutes, then decreases linearly to zero in another 10 minutes, as depicted in Figure 10.b. Although these utility values are arbitrary, we selected them such that the three applications' importance is in decreasing order. Further, we vary them to see the effects in section 7.4.4. We consider the GSM network presence to be equivalent to 3G network presence, a realistic assumption, and assume that it has a fixed rate (bandwidth) of 0.5 Mbps with a cost of 20 in the same units per bundle transferred, i.e. 2 cents/MB. The cost can well be a combination of dollar cost, power cost and so on. The Wi-Fi network, on the other hand, is assumed to have a fixed rate of 1 Mbps with a cost of 5 units per bundle transferred, i.e. 0.5 cents/MB, one fourth of the 3G network's cost. In reality, 3G and Wi-Fi networks' access bandwidths are diverse. However, we consider the above values to be typical end to end rates.
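The per-MB figures above follow from the unit choices (1 KB bundles, one unit = 1/1 000 000 of a dollar). A small sanity-check sketch (the helper is ours; it takes 1 MB = 1000 KB, as the round figures in the text suggest):

```python
# Convert a per-bundle cost in emulation units to cents per MB, given
# 1 KB bundles and one unit = 1e-6 dollars.
def cents_per_mb(units_per_bundle, bundle_kb=1, dollars_per_unit=1e-6):
    dollars_per_mb = units_per_bundle * dollars_per_unit * (1000 / bundle_kb)
    return dollars_per_mb * 100

assert abs(cents_per_mb(20) - 2.0) < 1e-9   # 3G: 20 units/bundle -> 2 cents/MB
assert abs(cents_per_mb(5) - 0.5) < 1e-9    # Wi-Fi: 5 units/bundle -> 0.5 cents/MB
```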
Nevertheless, we vary these rates relative to each other to see the consequences in section 7.4.3. Further, even though this emulation considers variations only in the availability of the networks, the approach is applicable even with variable access bandwidths. We take the scheduling duration of the model to be 5 minutes, and consider a look-up duration of 10 minutes. The scheduling duration has to be short, and the 5 minute length matches our predictions, which were derived for 5 minute intervals. The 10 minute look-up duration enables looking into the future to a fair extent (two times the length of the scheduling duration). Assume that we want to schedule our applications' data transfers for the next 5 minutes at a given point in time. For that scheduling duration, it is possible to predict the Wi-Fi network availability, P(Wi-Fi), as described in section 6.1. We then extrapolate the availability of Wi-Fi networks for the look-up duration by linear regression, using the Wi-Fi network availability observed in the 15 minutes immediately prior together with P(Wi-Fi). We consider several other models as well for comparison purposes and emulate the same data transfer scenario with them. The "Hybrid" legend in the figures in the next section represents the model discussed here. The "Stochastic" legend identifies the model we proposed earlier [5], which considers uncertainties in both the scheduling and look-up durations. The "Perfect" legend shows the same "Stochastic" model, but with 100% accurate predictions for the scheduling duration, whilst the uncertainty in the look-up duration remains the same. This illustrates how much improvement can be obtained with better predictions.
The "Expected" legend shows the same model but without considering the stochastic nature of the predictions in either duration [5]. The "Greedy" legend identifies an approach which does not consider network availability predictions and tries to utilize all available networks at the time of transfer [5]. In our prediction work, we learned that the actual Wi-Fi availability can be either more or less than the prediction, and that these are equally likely events. Further, we found the average prediction error to be around 20% (section 6.1). Hence, for the "Stochastic" approach, for the scheduling duration of the "Hybrid" approach, and for the look-up duration of the "Perfect" approach, we consider only three availability scenarios: the predicted value + 20% or 100%, whichever is lower; the predicted value itself; and the predicted value - 20% or 0%, whichever is higher. For example, if the prediction is 90%, the low, mid and high scenarios have availabilities of 70%, 90% and 100% respectively. We assume that the high, mid and low scenarios occur with probabilities of 0.3, 0.4 and 0.3 respectively, so that the prediction (the mid scenario) is more likely than the other two scenarios and the low and high scenarios occur with equal probability. In comparison, the 3G network is always available, and therefore the Wi-Fi network availability scenarios become the only scenarios in both the scheduling and look-up durations. In the stochastic model, we re-evaluate the Wi-Fi network availability of the scheduling duration after half of the duration has passed, as follows. If the availability in the first half is below the low value, or equal to it and Wi-Fi networks were not available in the 30 seconds immediately prior, then the availability of the entire duration is estimated to be low.
If that half has more Wi-Fi network availability than the high value, or equal to it and Wi-Fi networks were available in the 30 seconds immediately prior, it is estimated to be high. Otherwise, the availability of the entire duration is taken to be the value of the mid scenario. Further, we set K = 1.5 in γi = Kβi. All the models were implemented in Matlab using its optimization toolbox. We start to download the data of these applications simultaneously at the beginning of each user's transit time. The transfers are scheduled immediately for the next 5 minutes using a scheduling model. Then the data is emulated to be transmitted over the 3G and Wi-Fi networks, where the actual Wi-Fi network availability is taken from the traces. After those 5 minutes, we reschedule the remaining data for the next 5 minutes using the same model. We repeat this process until the transfers of all three applications' data are completed. We calculate and accumulate the utility minus cost for the three applications separately over time, and also count the time it takes to finish the transfer of the data for each application. We perform this emulation with each of the scheduling models mentioned above. By setting the parameters of the emulation and the utility functions of the three applications as described above, we were able to equate the utility gains of each of the three applications with the greedy model, for all the users on average. This ensures that all three applications contribute equally to the total sum with the greedy model; this may change in the other models depending on their performance. We attempted to set these parameters as close to reality as possible. However, to determine the sensitivity, we also vary each parameter relative to its original value.

Fig. 12. Total utility gain minus cost for all the applications.
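The scenario construction and mid-duration re-estimation rule described above can be sketched as follows (our paraphrase of the rules in the text; function names are ours):

```python
# Build the three availability scenarios around a Wi-Fi prediction p (in
# percent) with a +/-20% average error, clipped to [0, 100].
def scenarios(p, err=20):
    low = max(p - err, 0)
    high = min(p + err, 100)
    return low, p, high            # occurring w.p. 0.3, 0.4, 0.3

# Re-estimate the whole-duration availability at the mid-point, given the
# observed first-half availability and whether Wi-Fi was seen in the 30
# seconds immediately prior.
def reestimate(p, first_half, wifi_seen_just_before, err=20):
    low, mid, high = scenarios(p, err)
    if first_half < low or (first_half == low and not wifi_seen_just_before):
        return low
    if first_half > high or (first_half == high and wifi_seen_just_before):
        return high
    return mid

assert scenarios(90) == (70, 90, 100)      # the example given in the text
assert reestimate(60, 30, False) == 40     # below the low value -> low
assert reestimate(60, 65, True) == 60      # within bounds -> mid scenario
```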
7.4 Results

7.4.1 With a Typical Data Transfer Session

With the above parameter set, the utility minus cost (the net utility) gained by the three applications in the emulation was added up to get the total utility. It is shown in Figure 12, where each value is normalized by the corresponding total utility with the "Perfect" approach. The overall average over all the users is 0.86 for the "Stochastic" model, 0.81 for the "Hybrid", 0.79 for the "Expected" and 0.60 for the "Greedy". We can clearly see that the proposed "Hybrid" model outperforms the "Greedy" approach by 21% and the "Expected" by 2% on average. Further, it is only 5% below the "Stochastic" approach, even though it requires at most twice the computations of the "Expected", much less than those of the "Stochastic" [5]. This illustrates the applicability of the "Hybrid" model proposed in this paper under imperfect predictions and with a limited amount of (processing) resources. In addition to the total utility, we also consider the application completion times for evaluating the different scheduling models. Figures 13, 14 and 15 show how each application's data transfer was completed with respect to its soft deadline.

Fig. 13. Completion times of application 1.

Fig. 14. Completion times of application 2.

Fig. 15. Completion times of application 3.

It can be seen that the key to maximizing utility and minimizing cost is deferring and rescheduling the applications which can wait. Due to this, with the "Stochastic" and "Hybrid" approaches the applications' data transfers are completed well after the completion times obtained with the "Greedy" approach, as the latter uses any available network resources to finish as soon as possible. However, the former two finished the transfers before the soft deadlines in all cases and gained higher net utility.
On the contrary, the "Expected" approach did not finish the data transfers at all in some cases (users 10 and 11 in Figure 15). The reason is that in those cases, a Wi-Fi availability of around 10% was continuously predicted for the scheduling durations towards the end of the time span considered, even though Wi-Fi was actually not available. The "Expected" approach kept trusting the prediction and scheduled only the Wi-Fi network, whereas the "Stochastic" and "Hybrid" approaches re-evaluated the availability at the mid-point of the scheduling duration. If the actual availability was found to differ from the prediction, they executed the corresponding recourse decision, i.e. using 3G in this case. This further shows that the "Stochastic" and "Hybrid" approaches utilize current resources more optimally than the "Expected", and hence spare resources further ahead of time. Therefore, they not only perform better than the "Expected", but are also necessary for avoiding unnecessary delays. This clearly shows that the proposed "Hybrid" approach gains better utility than both the "Expected" and "Greedy" approaches while avoiding the unnecessary delays associated with the "Expected", and also performs close to the "Stochastic" while incurring at most twice the computations of the "Expected", much less than those of the "Stochastic". Moreover, it is interesting to find out whether the same results hold if we change different parameters of the emulation. Therefore, we changed the parameters "cost", "Wi-Fi and 3G data rates", "utility functions", "application file sizes" and so on, one at a time, and re-ran the emulation. The results are presented in the following subsections.

7.4.2 With Different Costs

In the above emulation setup, we used a 3G cost of 20 units and a Wi-Fi cost of 5, one fourth of that. We re-ran the emulation with the same set of parameters but with different Wi-Fi costs, set to 0 and 10, representing zero and two fourths. The results are given in Figure 16.
The values are normalized with respect to the total utility gain with the "Perfect" approach in the one fourth cost case, which is shown in the middle of the figure. The upper and lower sections carry the low (zero) and high (two fourths) Wi-Fi cost cases respectively. As can be seen from the figure, when the relative cost decreases, the difference in absolute terms between the "Greedy" and "Hybrid" ("Stochastic") approaches widens: 12%, 21% and 29% (16%, 26% and 35%) respectively in the high, mid and low cost cases. With respect to the utility gain of each case's "Perfect" approach, they are all around 21% (the differences between "Greedy" and "Stochastic" are around 26%). Although the "Hybrid" and "Expected" perform equally well in terms of utility gains, the "Expected" causes unnecessary delays in transfer completions, as shown in section 7.4.1. Further, the "Hybrid" approach results in utility gains close to those of the "Stochastic" in all the cost cases. This shows that the "Hybrid" performs better than the "Greedy" and "Expected" approaches, and close to the "Stochastic", under different cost scenarios.

Fig. 16. Total utility with different costs.

Fig. 17. Total utility with different rates.

Fig. 18. Total utility with different utility functions.

7.4.3 With Different Data Rates

Next, we use different Wi-Fi data rates in the emulation. Earlier we took the Wi-Fi rate to be 1 Mbps, two times the 3G rate of 512 Kbps. We ran the emulation with Wi-Fi rates of 0.5 and 1.5 Mbps, i.e. one and three times the 3G interface's rate; the results are shown in Figure 17. Every value was normalized with respect to the value of the "Perfect" approach in the mid rate case. The upper and lower portions of the graph show the high and low rate cases, while the middle shows the results with the regular setup.
The differences between the "Greedy" and "Hybrid" ("Stochastic") approaches in the low, mid and high rate cases are 11%, 21% and 23% (14%, 26% and 28%) respectively, and with respect to the utility gain of each case's "Perfect" approach, they are 16%, 21% and 20% (22%, 26% and 24%). In the low rate case, the difference is understandably low, as the amount which can be sent at low cost becomes smaller. But in the high rate case, it is not as high as we would expect, which is counter-intuitive. The reason is that the third application, which gains comparatively much of the utility with the "Hybrid" and "Stochastic" approaches, uses the Wi-Fi interface heavily even in the mid rate case; if the rate becomes high, this only helps finish the transfers more quickly. For example, the "Hybrid" approach has an average completion time of 73% (with respect to the soft deadline), compared to 78% in the mid rate case. If the file sizes had been larger, we would have observed a bigger difference in the high rate case. Another reason is that when the rate is high, the greedy approach too gains good utility (66%, compared to 60% in the mid rate case). Even with different rates, we still see that the "Hybrid" approach provides a computationally inexpensive alternative to the "Stochastic".

7.4.4 With Different Utility Functions (User Preferences)

The parameters we change next are the utility functions. We shift all the utility functions by 10 units up and down with respect to their initial values, identified as the "high" and "low" cases. The results are given in Figure 18, where the high, mid and low cases are depicted in the upper, middle and lower parts of the figure respectively. As before, all of the values were normalized by the mid case's utility gain with the "Perfect" approach.
We can see that the total utility even becomes negative in the low utility case; this happens because the utilities of the applications are then not so important compared to the costs of the interfaces, yet we transfer them nevertheless. The average differences between the "Hybrid" and "Greedy" approaches are 19%, 21% and 18% in the low, mid and high utility cases respectively, and the differences between the "Stochastic" and "Greedy" are 25%, 26% and 23% respectively. In the low and mid cases, they are both of the same order of magnitude. In the high case, the utility gain has dropped slightly; the likely reason is that the user treats the utilities of all the applications as comparatively higher than the costs, and therefore even the "Greedy" approach produces good utility. That is why the difference between the "Greedy" and "Perfect" in the high case is only 34%, compared to 40% in the mid case. This point was further supported by the higher overall utility gain of the greedy approach when we ran the emulation with the utility functions shifted further up by 10 units. Overall, we still see the "Hybrid" approach acting as a practical alternative to the "Stochastic" approach, as the "Expected" sometimes results in more delays even though it too produces comparably good utility gains.

7.4.5 With Different File Sizes

To demonstrate the effects of milder and heavier network usage scenarios, we change the file sizes of each application. In the standard (mid) case, they are 5, 30 and 100 MB for the first, second and third applications respectively.

Fig. 19. Total utility with different file sizes.

Fig. 20. Completion time of application 3 with higher file sizes.
We consider two further cases: the low case has sizes of 2.5, 15 and 50 MB (half the mid case’s sizes), whereas the high case has 7.5, 45 and 150 MB (one and a half times the mid case’s sizes). The results are given in Figure 19, where all values were normalized by the mid case’s total utility gain with the ”Perfect” approach. The upper, middle and lower parts of the graph show the high, mid and low file-size cases. The differences between the ”Stochastic” and ”Greedy” approaches for these cases are 14%, 26% and 22% respectively, and with respect to the utility gain of each case’s ”Perfect” approach, they are 11%, 26% and 36%. The same differences between ”Hybrid” and ”Greedy” are 12%, 21% and 18%, and with respect to each case’s ”Perfect” approach, 9%, 21% and 30%. The reduction of this difference in absolute terms for both approaches in the low case is self-explanatory: when the file sizes decrease, so do the utility gains. Still, relative to its ”Perfect” approach, the difference between ”Hybrid” and ”Greedy” in the low case is 31% (close to the 36% obtained with ”Stochastic”), showing the effectiveness of ”Hybrid” (and ”Stochastic”) with small file sizes, as they leave more room for optimization. In the high case, by contrast, the smaller difference, and the greedy approach sometimes performing better than even the stochastic approach according to the figure, appears counter-intuitive. The reason is that when the file sizes increase, users with less network availability must use whatever interfaces are available to complete the transfers within the deadline, irrespective of the interface costs, leaving little room for optimization. A greedy approach does exactly the same thing.
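A minimal sketch of such a greedy baseline, as we read it (not the authors' implementation), makes this behaviour concrete: at every time slot it sends on whatever interface is available, preferring the cheaper one, with no look-ahead. The interface names, rates and costs below are hypothetical.

```python
# Sketch of a greedy interface-selection baseline (our reading, not the
# authors' implementation). Each slot lists the interfaces available in
# that time slot; an empty list means no connectivity.

def greedy_schedule(slots, remaining_mb):
    """Send as much as possible in each slot on the cheapest available
    interface; return (unsent MB, total monetary cost)."""
    total_cost = 0.0
    for available in slots:
        if remaining_mb <= 0:
            break
        if not available:
            continue  # no network in this slot
        iface = min(available, key=lambda i: i["cost_per_mb"])  # cheapest now
        sent = min(iface["rate_mb"], remaining_mb)
        total_cost += sent * iface["cost_per_mb"]
        remaining_mb -= sent
    return remaining_mb, total_cost

# Two slots of costly 3G only, then one WiFi slot: a large file forces
# greedy (and any deadline-bound scheduler) onto the costly interface.
slots = [[{"name": "3G", "rate_mb": 5, "cost_per_mb": 1.0}],
         [{"name": "3G", "rate_mb": 5, "cost_per_mb": 1.0}],
         [{"name": "WiFi", "rate_mb": 20, "cost_per_mb": 0.1}]]
left, cost = greedy_schedule(slots, 25)
print(left, cost)  # prints: 0 11.5
```

With small files the same greedy rule would have sent everything over the cheap WiFi slot anyway, which is why the optimization gap narrows in the high file-size case.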
We can further clarify this point with Figure 20, which shows the completion times of the third application (the most vulnerable one due to its size) under the three approaches (values are normalized with respect to the corresponding soft deadline). There is a general coincidence: when the completion times with the ”Greedy” approach are higher, the total utility gains with the ”Stochastic” and ”Hybrid” approaches are lower, for example for users 3, 5, 6, 10 and 11. For user 9 the gain is higher even though the completion time is still high; the apparent reason is that the predictions in the initial scheduling durations for that user were much better, so the overall utility with the ”Stochastic” and ”Hybrid” approaches is closer to that of the ”Perfect” approach. For the same user, ”Expected” yields a higher total utility even than ”Perfect”, but at the expense of violating even the hard deadline. We therefore conclude that ”Hybrid” performs close to ”Stochastic”, and better than both ”Expected” and ”Greedy”, in terms of utility gains and application completion times even with different file sizes. These results clearly show that the enhanced ”Hybrid” model provides a better alternative to the previously proposed ”Stochastic” model for probabilistically scheduling data transfers with minimal resources. The results were obtained with a prediction error of 20%, which was derived by running the predictors we proposed earlier on real user data. However, a sensitivity analysis of each scheduling method against different prediction errors is still needed to identify each method’s strengths and weaknesses. We leave this analysis for future work.
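The deadline metrics used in this discussion can be sketched as follows (assumed semantics, hypothetical numbers): Figure 20 plots completion times normalized by each application's soft deadline, and a separate check flags the hard-deadline violation noted for ”Expected”.

```python
# Illustrative sketch (assumed semantics, hypothetical numbers) of the
# deadline metrics: completion time as a fraction of the soft deadline,
# plus a hard-deadline violation check.

def completion_ratio(finish_time, soft_deadline):
    """Completion time as a fraction of the soft deadline;
    a value above 1.0 means the soft deadline was missed."""
    return finish_time / soft_deadline

def violates_hard_deadline(finish_time, hard_deadline):
    """True if the transfer finished after the hard deadline."""
    return finish_time > hard_deadline

print(completion_ratio(73.0, 100.0))         # prints: 0.73 (finished early)
print(violates_hard_deadline(135.0, 130.0))  # prints: True (hard miss)
```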
8 CONCLUSION

A multi-interfaced mobile device in a heterogeneous network environment can connect to different access networks with different capabilities at different times. With the help of network availability (and available bandwidth) prediction and abundant onboard memory, delay-tolerant data transfers can be scheduled over high bandwidth hot-spots to maximize user utility. This reduces cost to the user, saves power, provides the highest possible quality for the applications in use, and also enables higher utilization of network providers’ existing infrastructure. This paper proposed a mobile device based architecture that facilitates the above. As part of the architecture, we first defined and presented an API which hides the complexity of the system and gives application programmers easy access to the EMUNE functionalities. Secondly, we presented a method of predicting network availability using a generic Dynamic Bayesian Network based model and showed that its computational complexity is considerably lower than that of the competing, modified Hidden Markov Model. Thirdly, we described a scheduling unit which takes the output of the prediction module, user preferences and application requirements, and finds an optimal schedule for transferring the data of all application requests. We further showed with real user data that the proposed computationally inexpensive hybrid stochastic scheduling model outperforms the non-stochastic variants and greedy models, and is also close in performance to the previously proposed stochastic model for different combinations of interface costs, network bandwidths, application utility functions and file sizes. Finally, we described the transport service which executes the derived schedule by controlling the Freeze-TCP and Monami protocols together.
ACKNOWLEDGMENTS

This work has been performed in the context of NICTA’s CAMP project, which is funded by Ericsson. We would like to thank the NICTA staff and other volunteers who participated in our experiments.

REFERENCES

[1] A. Rahmati and L. Zhong, ”Context-for-Wireless: Context-Sensitive Energy-Efficient Wireless Data Transfer”, in Proc. ACM/USENIX MobiSys, June 2007.
[2] M. Zaharia and S. Keshav, ”Fast and Optimal Scheduling Over Multiple Network Interfaces”, University of Waterloo Technical Report CS-2007-36, October 2007.
[3] U. Rathnayake and M. Ott, ”Predicting Network Availability Using User Context”, in Proc. ACM MobiQuitous ’08, Dublin, Ireland, July 2008.
[4] U. Rathnayake, M. Ott and A. Seneviratne, ”A DBN Approach for Network Availability Prediction”, in Proc. MSWiM ’09, Canary Islands, Spain, October 2009.
[5] U. Rathnayake, M. Iftikhar, M. Ott and A. Seneviratne, ”Realistic Data Transfer Scheduling with Uncertainty”, NICTA Technical Report, August 2009.
[6] Y. Vanrompay, P. Rigole and Y. Berbers, ”Predicting network connectivity for context-aware pervasive systems with localized network availability”, in WoSSIoT ’07, a workshop of EuroSys, March 2007.
[7] R. C. Doss, A. Jennings and N. Shenoy, ”A Review of Current Work on Mobility Prediction in Wireless Networks”, ACM AMOC, Thailand, 2004.
[8] L. Song, D. Kotz, R. Jain and X. He, ”Evaluating location predictors with extensive Wi-Fi mobility data”, in Proc. 23rd Annual IEEE INFOCOM, pages 1414-1424, March 2004.
[9] M. Kim, D. Kotz and S. Kim, ”Extracting a mobility model from real user traces”, in Proc. 25th Annual IEEE INFOCOM, Barcelona, Spain, April 2006.
[10] E. Exposito, R. Malaney, X. Wei and D. Nghia, ”Using the XQoS Platform for designing and developing the QoS-Seeker System”, in Proc. 3rd International IEEE Conference on Industrial Informatics (INDIN), Perth, Australia, 2005.
[11] N. Samaan and A. Karmouch, ”A Mobility Prediction Architecture Based on Contextual Knowledge and Spatial Conceptual Maps”, IEEE Transactions on Mobile Computing, 4(6): 537-551, Nov/Dec 2005.
[12] F. Erbas, J. Steuer, D. Eggeiseker, K. Kyamakya and K. Jobmann, ”A Regular Path Recognition Method and Prediction of User Movements in Wireless Networks”, in Proc. Vehicular Technology Conference (VTC), October 2001.
[13] Z. R. Zaidi and B. L. Mark, ”Mobility Estimation for Wireless Networks Based on an Autoregressive Model”, in Proc. IEEE Globecom 2004, Dallas, Texas, December 2004.
[14] M. I. Jordan and C. M. Bishop, An Introduction to Graphical Models (book draft).
[15] P. Bahl, A. Adya, J. Padhye and A. Wolman, ”Reconsidering Wireless Systems with Multiple Radios”, ACM SIGCOMM CCR, vol. 34, issue 5, October 2004.
[16] T. Pering, Y. Agarwal, R. Gupta and R. Want, ”CoolSpots: Reducing the Power Consumption of Wireless Mobile Devices with Multiple Radio Interfaces”, in Proc. MobiSys, June 2006, Sweden.
[17] E. Shih, P. Bahl and M. Sinclair, ”Wake on Wireless: An Event Driven Energy Saving Strategy for Battery Operated Devices”, in Proc. ACM MobiCom 2006.
[18] H. Wang, R. Katz and J. Giese, ”Policy-Enabled Handoffs across heterogeneous wireless networks”, in Mobile Computing Systems and Applications, 2004.
[19] F. Zhu and J. McNair, ”Optimizations for Vertical Handoff Decision Algorithms”, in Proc. WCNC 2004.
[20] O. Ormond, J. Murphy and G. Muntean, ”Utility-based Intelligent Network Selection in Beyond 3G Systems”, in Proc. IEEE ICC, June 2006.
[21] J. Sun, J. Riekki, J. Sauvola and M. Jurmu, ”Towards connectivity management adaptability: context awareness in policy representation and end-to-end evaluation algorithm”, in Proc. 3rd MUM, College Park, MD, pages 85-92, 2004.
[22] K. Toyama, R. N. Murty, C. A. Thekkath and R. Chandra, ”Cost-aware networking over heterogeneous data channels”, US patent 20070171915, http://www.freepatentsonline.com/7071915.html
[23] J. Bonnin, Z. B. Hamouda, I. Lassoued and A. Belghith, ”Middleware for multi-interfaces management through profiles handling”, in Proc. Mobileware 2008, Innsbruck, Austria, 2008.
[24] M. K. Sowmia Devia and P. Agarwal, ”Dynamic interface selection in portable multi-interface terminals”, in Proc. IEEE Portable, Orlando, FL, March 25-29, 2007.
[25] A. A. Koutsordoi, E. F. Adamopoulou, K. P. Demestichas and M. E. Theologou, ”Terminal Management and Intelligent Access Selection in Heterogeneous Environments”, MONET, 11(6): 861-871, 2006.
[26] ”Method and system for selecting an access network in a heterogeneous network environment”, US Patent 7315750, http://www.patentstorm.us/patents/7315750/fulltext.html
[27] A. Schorr, A. Kassler and G. Petrovic, ”Adaptive media streaming in heterogeneous wireless networks”, in Proc. IEEE Workshop on Multimedia Signal Processing, September 2004.
[28] T. Goff, J. Moronski, D. S. Phatak and V. Gupta, ”Freeze-TCP: A true end-to-end TCP enhancement mechanism for mobile environments”, in IEEE INFOCOM, 2000.
[29] W. Yuan and K. Nahrstedt, ”Energy-Efficient Soft Real-Time CPU Scheduling for Mobile Multimedia Systems”, in Proc. 19th ACM Symposium on Operating Systems Principles (SOSP ’03), Bolton Landing, NY, October 2003.
[30] C. Xian, Y.-H. Lu and Z. Li, ”Energy Aware Scheduling for Real-Time Multiprocessor Systems with Uncertain Task Execution Time”, in Proc. DAC, California, USA, 2007.
[31] M. Bayati, M. Squillante and M. Sharma, ”Optimal scheduling in multi-server queuing network”, ACM SIGMETRICS/Performance, 2006.
[32] J. Sethuraman and M. Squillante, ”Optimal stochastic scheduling in multiclass parallel queues”, SIGMETRICS 1999.
[33] M. Stemm and R. H. Katz, ”Vertical handoffs in wireless overlay networks”, ACM Mobile Networks and Applications, 3(4): 335-350, December 1998.
[34] J. L. Higle, ”Stochastic Programming: Optimization When Uncertainty Matters”, Tutorials in Operations Research, INFORMS, 2005.
[35] A. Shapiro and A. Philpott, ”A Tutorial on Stochastic Programming”, http://stoprog.org/
[36] ”Analysis of Multihoming in Mobile IPv6”, http://www.nautilus6.org/ietf/monami6/ietf63/monami6-ietf63bof.html
[37] A. Baig, L. Libman and M. Hassan, ”Performance Enhancement of On-Board Communication Networks Using Outage Prediction”, IEEE Journal on Selected Areas in Communications, vol. 24, issue 9, pp. 1692-1701, 2006.
[38] H.-Y. Hsieh, K.-H. Kim, Y. Zhu and R. Sivakumar, ”A receiver-centric transport protocol for mobile hosts with heterogeneous wireless interfaces”, in Proc. ACM MOBICOM ’03, September 2003.
[39] S. Herborn, H. Petander and M. Ott, ”Predictive context aware mobility handling”, in International Conference on Telecommunications, 2008.
[40] T. Henderson, J. Crowcroft and S. Bhatti, ”Congestion Pricing: Paying Your Way in Communication Networks”, IEEE Internet Computing, vol. 5, pages 85-89, 2001.
[41] N. Thompson, G. He and H. Luo, ”Flow scheduling for end-host multihoming”, in Proc. IEEE INFOCOM, April 2006.