Publication Type

Conference Proceeding Article

Version

acceptedVersion

Publication Date

4-2021

Abstract

Tensor decomposition is a fundamental multidimensional data analysis tool for many data-driven applications, such as social computing, computer vision, and bioinformatics. However, the rapid growth of streaming data poses new challenges for traditional static tensor decomposition: handling it calls for an efficient, distributed, dynamic tensor decomposition that does not recompute the decomposition of the whole tensor from scratch. In this paper, we propose DisMASTD, an efficient distributed multi-aspect streaming tensor decomposition. First, we prove that the optimal tensor partitioning problem is NP-hard. Second, we present two heuristic tensor partitioning approaches to ensure load balancing. Third, we develop a distributed computation method for multi-aspect streaming tensor decomposition, which avoids repetitive computation and reduces network communication by maintaining and reusing intermediate results. Finally, we perform extensive experiments on both real and synthetic datasets to demonstrate the efficiency and scalability of DisMASTD.

Keywords

Data analysis, tensors, social computing

Discipline

Databases and Information Systems | Numerical Analysis and Scientific Computing

Research Areas

Data Science and Engineering

Publication

2021 IEEE 37th International Conference on Data Engineering (ICDE): Virtual, April 19-22: Proceedings

First Page

1

Last Page

12

ISBN

9781728191843

Identifier

10.1109/ICDE51399.2021.00098

Publisher

IEEE

City or Country

Piscataway, NJ

Copyright Owner and License

Authors

Additional URL

https://doi.org/10.1109/ICDE51399.2021.00098
