Publication Type
Journal Article
Version
acceptedVersion
Publication Date
1-2020
Abstract
Network function virtualization enables efficient cloud-resource planning by virtualizing network services and applications into software running on commodity servers. A cloud-service provider needs to manage and ensure service availability of a network of concurrent virtualized network functions (VNFs). The downtime distribution of a network of VNFs can be estimated using sample-path randomization on the underlying birth–death process. An integrated modeling approach for this purpose is limited by its scalability and computational load because of the high dimensionality of the integrated birth–death process. We propose a generalized convex decomposition of the integrated birth–death process, which transforms the high-dimensional multi-VNF process into a series of interlinked, low-dimensional, single-VNF processes. We theoretically show the statistical equivalence between the transition probabilities of the integrated birth–death process and those resulting from interlinking the decomposed system of processes. We further develop a decomposition algorithm that yields scalable and fast estimation of the system downtime distribution. Our algorithmic framework can be easily adapted to any logical definition of overall system availability. It can also be easily extended to various realistic VNF network configurations and characteristics including heterogeneous VNF failure distributions, effects of both node and link failures on the overall system downtime of fully or partially connected networks, and resource sharing across multiple VNFs. Our extensive computational results demonstrate that the proposed algorithms are computationally efficient, remain statistically consistent with the integrated-network model, and outperform the integrated modeling approach.
Keywords
Cloud computing, convex decomposition, Markov chains, network virtualization, sample path randomization
Discipline
Databases and Information Systems
Research Areas
Information Systems and Management
Publication
INFORMS Journal on Computing
Volume
32
Issue
2
First Page
321
Last Page
345
ISSN
1091-9856
Identifier
10.1287/ijoc.2019.0888
Publisher
INFORMS (Institute for Operations Research and the Management Sciences)
Citation
GUO, Zhiling; LI, Jin; and RAMESH, Ram.
Scalable, adaptable and fast estimation of transient downtime in virtual infrastructures using convex decomposition and sample path randomization. (2020). INFORMS Journal on Computing. 32, (2), 321-345.
Available at: https://ink.library.smu.edu.sg/sis_research/5064
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Additional URL
https://doi.org/10.1287/ijoc.2019.0888