Distributed machine learning on IAAS clouds
Publication Type
Conference Proceeding Article
Publication Date
11-2018
Abstract
Training complex machine learning (ML) models with large datasets requires powerful computing infrastructure, which is costly to acquire and maintain. As a result, ML researchers turn to the cloud for on-demand and elastic resource provisioning. Two issues have arisen from this trend: 1) if not configured properly, training ML models on the cloud can incur significant cost and time, and 2) many ML researchers tend to focus on model and algorithm development, so they may not have the time or skills to deal with system setup, resource selection and configuration. In this work, we propose and implement FC2: a web service for fast, convenient and cost-effective distributed ML model training over public cloud resources. Central to the effectiveness of FC2 is its ability to recommend an appropriate resource configuration, in terms of cost and execution time, for a given ML training task. Extensive experiments with real-world deep neural network models and datasets demonstrate the effectiveness of our solution.
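To illustrate the kind of cost/time trade-off such a recommender addresses (a minimal sketch only; the instance types, prices and selection rule below are assumptions for illustration, not the paper's actual algorithm): given candidate configurations with estimated training times and hourly prices, pick the cheapest one that finishes within a user-specified deadline.

def recommend_config(candidates, deadline_hours):
    # Keep only configurations whose estimated training time meets the deadline.
    feasible = [c for c in candidates if c["est_hours"] <= deadline_hours]
    if not feasible:
        return None  # no candidate satisfies the deadline
    # Among feasible candidates, pick the one with the lowest total cost.
    return min(feasible, key=lambda c: c["price_per_hour"] * c["est_hours"])

# Hypothetical candidate configurations (names, prices and times are made up):
candidates = [
    {"name": "4 x p2.xlarge", "price_per_hour": 3.60, "est_hours": 10.0},
    {"name": "8 x p2.xlarge", "price_per_hour": 7.20, "est_hours": 6.0},
    {"name": "2 x p3.2xlarge", "price_per_hour": 6.12, "est_hours": 4.0},
]
print(recommend_config(candidates, deadline_hours=8.0))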
Discipline
Artificial Intelligence and Robotics | Software Engineering
Research Areas
Software and Cyber-Physical Systems
Publication
Proceedings of the 5th IEEE International Conference on Cloud Computing and Intelligence Systems (CCIS), Nanjing, China, 2018 November 23-25
Identifier
10.1109/CCIS.2018.8691150
Publisher
IEEE
City or Country
Nanjing, China
Citation
TA, Nguyen Binh Duong and NGUYEN, Quang Sang.
Distributed machine learning on IAAS clouds. (2018). Proceedings of the 5th IEEE International Conference on Cloud Computing and Intelligence Systems (CCIS), Nanjing, China, 2018 November 23-25.
Available at: https://ink.library.smu.edu.sg/sis_research/4832
Additional URL
https://doi.org/10.1109/CCIS.2018.8691150