Property Testing of Data Dimensionality
Conference Proceeding Article
Data dimensionality is a crucial issue in a variety of settings, where it is desirable to determine whether a data set given in a high-dimensional space adheres to a low-dimensional structure. We study this problem in the framework of property testing: Given a query access to a data set S, we wish to determine whether S is low-dimensional, or whether it should be modified significantly in order to have the property. Allowing a constant probability of error, we aim at algorithms whose complexity does not depend on the size of S.We present algorithms for testing the low-dimensionality of a set of vectors and for testing whether a matrix is of low rank. We then address low-dimensionality in metric spaces. For vectors in the metric space l1, we show that low-dimensionality is not testable. For l2, we show that a data set can be tested for having a low-dimensional structure, but that the property of approximately having such a structure is not testable.
Numerical Analysis and Scientific Computing | Software Engineering
SODA '03: Proceedings of the 14th Annual ACM-SIAM Symposium on Discrete Algorithms
City or Country
KRAUTHGRAMER, R. and SASSON, Ori.
Property Testing of Data Dimensionality. (2003). SODA '03: Proceedings of the 14th Annual ACM-SIAM Symposium on Discrete Algorithms. 18-27. Research Collection School Of Information Systems.
Available at: http://ink.library.smu.edu.sg/sis_research/1248