With the advent of digital technology and smart devices, a flood of digital data is being generated every day. This huge amount of data not only records historical activities but also provides valuable information about the future for organizations and businesses. However, the true value of these data will not be fully appreciated until they have been processed and analyzed, and the analysis results have been communicated to decision makers in a business-friendly manner. In view of this need, big data has become a major research focus in the academic research community, especially in the field of computer science, as well as among software vendors and big data service providers. However, the majority of current academic research and development practice tends to focus on the technological aspects of big data. To fill the current research gap in big data, especially big data analytics, this study investigates a possible approach to designing and implementing a big data analytics application by integrating an open source big data processing framework and an open source data analysis environment. The presentation aims to share our findings and the experience gained through working on this study. It consists of five sections. First, the motivation, objectives and scope of the study will be presented. This is followed by a review of related literature on big data and big data analytics. In Section 3, the case scenario and data used in the study will be introduced, together with a detailed discussion of the data preparation. Next, we will introduce the analysis and modelling methods used in this study. In Section 5, the user interface of the application designed for this study will be discussed. A use case will then be used to demonstrate and evaluate the performance and analysis of the application. Finally, the overall conclusion, lessons learned and directions for future research will be presented.
Big Data Analytics, SparkR, Spatial Interaction Model, Geospatial Analytics, R Shiny
Categorical Data Analysis | Databases and Information Systems
Data Management and Analytics
International Conference for Free and Open Source Software for Geospatial, Boston, MA, 2017 August 14-19
Eastern Academy of Management
ZHANG, Mengqi and KAM, Tin Seong.
Integrating Apache Spark and R for big data analytics on solving geographic problems. (2017). International Conference for Free and Open Source Software for Geospatial, Boston, MA, 2017 August 14-19. Research Collection School Of Information Systems.
Available at: http://ink.library.smu.edu.sg/sis_research/3830
Creative Commons License
This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License.