Publication Type

Report

Publication Date

2012

Abstract

We focus on the problem of multi-party data sharing in high dimensional data settings where the number of measured features (or the dimension) p is frequently much larger than the number of subjects (or the sample size) n, the so-called p >> n scenario that has been the focus of much recent statistical research. Here, we consider data sharing for two interconnected problems in high dimensional data analysis, namely feature selection and classification. We characterize the notions of “cautious”, “regular”, and “generous” data sharing in terms of their privacy-preserving implications for the parties and their share of data, with a focus on “feature privacy” rather than “sample privacy”, though a violation of the former may lead to a violation of the latter. We evaluate the data sharing methods using a phase diagram from the statistical literature on multiplicity and Higher Criticism thresholding. In the two-dimensional phase space calibrated by the signal sparsity and signal strength, a phase diagram is a partition of the phase space and contains three distinguished regions, where we have no (feature) privacy violation, relatively rare privacy violations, and an overwhelming amount of privacy violation.

Discipline

Databases and Information Systems | Information Security

Embargo Period

4-4-2014

Citation

FIENBERG, Stephen E. and JIN, Jiashun. Privacy-Preserving Data Sharing in High Dimensional Regression and Classification Settings. (2012).
Available at: https://ink.library.smu.edu.sg/larc/1

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.

Included in

Databases and Information Systems Commons, Information Security Commons

LARC Research Publications

Privacy-Preserving Data Sharing in High Dimensional Regression and Classification Settings
