A method and apparatus for rapid identification of column heterogeneity in databases are disclosed. For example, the method receives data associated with a column in a database, computes a cluster entropy for the data as a measure of data heterogeneity, and then determines whether the data is heterogeneous in accordance with that cluster entropy.
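The three steps in the abstract (receive column data, compute a cluster entropy, decide heterogeneity) can be illustrated with a minimal sketch. The abstract does not specify the clustering procedure or the decision rule, so everything below is an assumption made for illustration: the `signature` function is a hypothetical hard clustering that groups values by character-class pattern, the entropy is the Shannon entropy of the cluster sizes, and the `threshold` default is arbitrary. It is not the disclosed method.

```python
import math
from collections import Counter


def signature(value: str) -> str:
    """Hypothetical clustering key (an assumption, not from the source):
    map digits to '9', letters to 'a', keep other characters, and
    collapse consecutive repeats of the same class."""
    out = []
    for ch in value:
        if ch.isdigit():
            cls = "9"
        elif ch.isalpha():
            cls = "a"
        else:
            cls = ch
        if not out or out[-1] != cls:
            out.append(cls)
    return "".join(out)


def cluster_entropy(values) -> float:
    """Shannon entropy (in bits) of the distribution of values over clusters."""
    counts = Counter(signature(v) for v in values)
    total = sum(counts.values())
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


def is_heterogeneous(values, threshold: float = 0.5) -> bool:
    """Flag a column as heterogeneous when its cluster entropy exceeds
    an assumed, illustrative threshold."""
    return cluster_entropy(values) > threshold


phones = ["555-1234", "555-9876"]
mixed = ["555-1234", "alice@example.com", "555-9876", "bob@example.org"]
print(cluster_entropy(phones), is_heterogeneous(phones))
print(cluster_entropy(mixed), is_heterogeneous(mixed))
```

In this sketch a column whose values all share one pattern falls into a single cluster and has zero entropy, while a column that mixes phone numbers and e-mail addresses splits across clusters and scores higher.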
Databases and Information Systems
Data Management and Analytics
DAI, Bing Tian; KOUDAS, Nikolaos; OOI, Beng Chin; SRIVASTAVA, Divesh; and VENKATASUBRAMANIAN, Suresh.
Method and apparatus for rapid identification of column heterogeneity. (2012). 1-13. Research Collection School Of Information Systems.
Available at: http://ink.library.smu.edu.sg/sis_research/3670
Creative Commons License
This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License.