Publication Type

Conference Proceeding Article

Publication Date

10-2021

Abstract

Data analytics has tremendous potential to provide targeted benefit in low-resource communities; however, the availability of high-quality public health data is a significant challenge in developing countries, primarily due to non-diligent data collection by community health workers (CHWs). We use the word non-diligence to emphasize that poor data collection is often not a deliberate action by CHWs but arises from a myriad of factors, sometimes beyond the CHWs’ control. In this work, we define and test a data collection diligence score. We handle this challenging unlabeled data problem by building on domain experts’ guidance to design a useful representation of the raw data, from which we derive a simple and natural score. An important aspect of the score is that CHWs are scored relative to one another, which implicitly takes into account the context of the local area. The data is also clustered, and interpreting these clusters provides a natural explanation of the past behavior of each data collector. We further predict the diligence score for future time steps. Our framework has been validated on the ground using observations by the field monitors of our partner NGO in India. Beyond the successful field test, our work is in the final stages of deployment in the state of Rajasthan, India. This system will be helpful in providing non-punitive intervention and necessary guidance to encourage CHWs.

Keywords

Computing methodologies, Machine learning, Artificial intelligence, Community healthcare, Data quality, Data collection diligence, Clustering, Social impact

Discipline

Databases and Information Systems

Research Areas

Data Science and Engineering

Publication

Proceedings of ACM Conference on Equity and Access in Algorithms, Mechanisms, and Optimization

First Page

1

Last Page

12

Publisher

ACM

City or Country

Online

Additional URL

https://doi.org/10.1145/3465416.3483292
