Research Collection School of Social Sciences

Treating words as data with error: Uncertainty in text statements of policy positions

Publication Type

Journal Article

Version

publishedVersion

Publication Date

4-2009

Abstract

Political text offers extraordinary potential as a source of information about the policy positions of political actors. Despite recent advances in computational text analysis, human interpretative coding of text remains an important source of text-based data, ultimately required to validate more automatic techniques. The profession's main source of cross-national, time-series data on party policy positions comes from the human interpretative coding of party manifestos by the Comparative Manifesto Project (CMP). Despite widespread use of these data, the uncertainty associated with each point estimate has never been available, undermining the value of the dataset as a scientific resource. We propose a remedy. First, we characterize processes by which CMP data are generated. These include inherently stochastic processes of text authorship, as well as of the parsing and coding of observed text by humans. Second, we simulate these error-generating processes by bootstrapping analyses of coded quasi-sentences. This allows us to estimate precise levels of nonsystematic error for every category and scale reported by the CMP for its entire set of 3,000-plus manifestos. Using our estimates of these errors, we show how to correct biased inferences, in recent prominently published work, derived from statistical analyses of error-contaminated CMP data.
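The bootstrapping step mentioned in the abstract can be illustrated with a minimal sketch. It assumes that a manifesto is summarized as a count of coded quasi-sentences per category and that nonsystematic error is approximated by resampling those quasi-sentences with replacement. The function name, category labels, and counts below are hypothetical and are not taken from the article or from the CMP dataset.

```python
import random
import statistics


def bootstrap_category_se(counts, n_draws=1000, seed=0):
    """Estimate bootstrap standard errors of category percentages.

    counts: dict mapping a category label to the number of coded
            quasi-sentences assigned to it in one manifesto.
    Returns a dict mapping each label to the standard deviation of its
    percentage share across the bootstrap replicates.
    """
    rng = random.Random(seed)
    labels = list(counts)
    # Expand the counts back into one label per coded quasi-sentence.
    sentences = [lab for lab in labels for _ in range(counts[lab])]
    n = len(sentences)
    draws = {lab: [] for lab in labels}
    for _ in range(n_draws):
        # Resample n quasi-sentences with replacement.
        sample = rng.choices(sentences, k=n)
        for lab in labels:
            draws[lab].append(100.0 * sample.count(lab) / n)
    return {lab: statistics.stdev(draws[lab]) for lab in labels}


# Hypothetical manifesto of 100 quasi-sentences (made-up counts).
example = {"category_a": 20, "category_b": 50, "uncoded": 30}
se = bootstrap_category_se(example)
```

Under these assumptions, shorter manifestos yield larger standard errors for the same category share, because each replicate is drawn from fewer quasi-sentences.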

Discipline

Models and Methods | Political Science

Research Areas

Political Science

Publication

American Journal of Political Science

Volume

53

Issue

2

First Page

495

Last Page

513

ISSN

0092-5853

Identifier

10.1111/j.1540-5907.2009.00383.x

Publisher

Wiley

Citation

BENOIT, Kenneth, LAVER, Michael, & MIKHAYLOV, Slava. (2009). Treating words as data with error: Uncertainty in text statements of policy positions. American Journal of Political Science, 53(2), 495-513.

Available at: https://ink.library.smu.edu.sg/soss_research/3990

Copyright Owner and License

Publisher

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.

Additional URL

https://doi.org/10.1111/j.1540-5907.2009.00383.x

Included in

Models and Methods Commons
