
      Effects of Excluding Those Who Report Having “Syndomitis” or “Chekalism” on Data Quality: Longitudinal Health Survey of a Sample From Amazon’s Mechanical Turk

      research-article


          Abstract

          Background

          Researchers have implemented multiple approaches to increase data quality from existing web-based panels such as Amazon’s Mechanical Turk (MTurk).

          Objective

          This study extends prior work by examining improvements in data quality and effects on mean estimates of health status by excluding respondents who endorse 1 or both of 2 fake health conditions (“Syndomitis” and “Chekalism”).

          Methods

          Survey data were collected in 2021 at baseline and 3 months later from MTurk study participants, aged 18 years or older, with an internet protocol address in the United States, and who had completed a minimum of 500 previous MTurk “human intelligence tasks.” We included questions about demographic characteristics, health conditions (including the 2 fake conditions), and the Patient Reported Outcomes Measurement Information System (PROMIS)-29+2 (version 2.1) preference–based score survey. The 3-month follow-up survey was only administered to those who reported having back pain and did not endorse a fake condition at baseline.
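          The screening and follow-up eligibility rules described above amount to a simple filter. A minimal sketch in Python (the field names are hypothetical, not the study's actual survey variables):

```python
# Illustrative sketch of the fake-condition screen; field names are
# hypothetical, not the study's actual survey variables.
FAKE_CONDITIONS = ("syndomitis", "chekalism")


def endorsed_fake_condition(respondent: dict) -> bool:
    """True if the respondent endorsed at least 1 of the 2 fake conditions."""
    return any(respondent.get(cond, False) for cond in FAKE_CONDITIONS)


def eligible_for_followup(respondent: dict) -> bool:
    """The 3-month survey went only to those reporting back pain who
    endorsed neither fake condition at baseline."""
    return respondent.get("back_pain", False) and not endorsed_fake_condition(respondent)
```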

          Results

          In total, 15% (996/6832) of the sample endorsed at least 1 of the 2 fake conditions at baseline. Those who endorsed a fake condition at baseline were more likely to identify as male or non-White, to be younger, to report more health conditions, and to take longer to complete the survey than those who did not. They also had substantially lower internal consistency reliability on the PROMIS-29+2 scales: physical function (0.69 vs 0.89), pain interference (0.80 vs 0.94), fatigue (0.80 vs 0.92), depression (0.78 vs 0.92), anxiety (0.78 vs 0.90), sleep disturbance (−0.27 vs 0.84), ability to participate in social roles and activities (0.77 vs 0.92), and cognitive function (0.65 vs 0.77). The sleep disturbance scale was unreliable for those endorsing a fake condition because it includes both positively and negatively worded items. Those who reported a fake condition also reported significantly worse self-reported health scores (except for sleep disturbance) than those who did not. Excluding those who endorsed a fake condition improved the overall mean PROMIS-29+2 (version 2.1) T-scores by 1-2 points and the PROMIS preference–based score by 0.04. Although no 3-month follow-up respondents had endorsed a fake condition at baseline, 6% (n=59) endorsed at least 1 on the 3-month survey; these respondents had lower PROMIS-29+2 internal consistency reliability and worse mean scores on the 3-month survey than those who did not report a fake condition. Based on these results, we estimate that 25% (1708/6832) of the MTurk respondents provided careless or dishonest responses.
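          The internal consistency figures above are coefficient (Cronbach) alpha values. A minimal stand-alone computation, assuming complete item-level responses (illustrative only):

```python
def coefficient_alpha(items: list[list[float]]) -> float:
    """Cronbach's alpha for k items, each a list of n respondent scores.

    alpha = k/(k-1) * (1 - sum(item variances) / variance(total scores)).
    Alpha can be negative when some items are keyed in the opposite
    direction and left unreversed, which is how a scale mixing positively
    and negatively worded items can yield a value like -0.27.
    """
    k, n = len(items), len(items[0])

    def variance(xs: list[float]) -> float:
        mean = sum(xs) / len(xs)
        return sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)

    totals = [sum(item[i] for item in items) for i in range(n)]
    return k / (k - 1) * (1 - sum(variance(it) for it in items) / variance(totals))
```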

          Conclusions

          This study provides evidence that asking about fake health conditions can help to screen out respondents who may be dishonest or careless. We recommend that this approach be used routinely in MTurk samples.


                Author and article information

                Contributors
                Journal
                J Med Internet Res
                JMIR
                Journal of Medical Internet Research
                JMIR Publications (Toronto, Canada)
                1439-4456
                1438-8871
                2023
                4 August 2023
                25: e46421
                Affiliations
                [1] Division of General Internal Medicine and Health Services Research, Department of Medicine, University of California, Los Angeles, CA, United States
                [2] Behavioral and Policy Sciences, RAND Corporation, Santa Monica, CA, United States
                [3] Behavioral and Policy Sciences, RAND Corporation, Boston, MA, United States
                [4] Center for Economic and Social Research, University of Southern California, Los Angeles, CA, United States
                [5] Patient Reported Outcomes, Value and Experience (PROVE) Center, Department of Surgery, Brigham and Women’s Hospital, Boston, MA, United States
                Author notes
                Corresponding Author: Ron D Hays drhays@ucla.edu
                Author information
                https://orcid.org/0000-0001-6697-907X
                https://orcid.org/0000-0001-7782-8023
                https://orcid.org/0000-0001-5579-5654
                https://orcid.org/0000-0001-9485-0003
                https://orcid.org/0000-0002-1855-5528
                https://orcid.org/0000-0002-1381-1465
                Article
                v25i1e46421
                10.2196/46421
                10439462
                37540543
                ©Ron D Hays, Nabeel Qureshi, Patricia M Herman, Anthony Rodriguez, Arie Kapteyn, Maria Orlando Edelen. Originally published in the Journal of Medical Internet Research (https://www.jmir.org), 04.08.2023.

                This is an open-access article distributed under the terms of the Creative Commons Attribution License ( https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on https://www.jmir.org/, as well as this copyright and license information must be included.

                History
                : 10 February 2023
                : 20 June 2023
                : 28 June 2023
                : 29 June 2023
                Categories
                Original Paper

                Medicine
                misrepresentation, survey, data quality, MTurk, Amazon Mechanical Turk
