      Suicide behaviors during the COVID-19 pandemic: A meta-analysis of 54 studies

          Abstract

          COVID-19, and efforts to mitigate its spread, are creating extensive mental health problems. Experts have speculated the mental, economic, behavioral, and psychosocial problems linked to the COVID-19 pandemic may lead to a rise in suicide behavior. However, a quantitative synthesis is needed to reach an overall conclusion regarding the pandemic-suicide link. In the most comprehensive test of the COVID-19—suicidality link to date, we meta-analyzed data from 308,596 participants across 54 studies. Our results suggested increased event rates for suicide ideation (10.81%), suicide attempts (4.68%), and self-harm (9.63%) during the COVID-19 pandemic when considered against event rates from pre-pandemic studies. Moderation analysis indicated younger people, women, and individuals from democratic countries are most susceptible to suicide ideation during the COVID-19 pandemic. Policymakers and helping professionals are advised that suicide behaviors are alarmingly common during the COVID-19 pandemic and vary based upon age, gender, and geopolitics. Strong protections from governments (e.g., implementing best practices in suicide prevention) are urgently needed to reduce suicide behaviors during the COVID-19 pandemic.

          Most cited references (78)

          Power failure: why small sample size undermines the reliability of neuroscience.

          A study with low statistical power has a reduced chance of detecting a true effect, but it is less well appreciated that low power also reduces the likelihood that a statistically significant result reflects a true effect. Here, we show that the average statistical power of studies in the neurosciences is very low. The consequences of this include overestimates of effect size and low reproducibility of results. There are also ethical dimensions to this problem, as unreliable research is inefficient and wasteful. Improving reproducibility in neuroscience is a key priority and requires attention to well-established but often ignored methodological principles.

            Mental Health, Substance Use, and Suicidal Ideation During the COVID-19 Pandemic — United States, June 24–30, 2020

            The coronavirus disease 2019 (COVID-19) pandemic has been associated with mental health challenges related to the morbidity and mortality caused by the disease and to mitigation activities, including the impact of physical distancing and stay-at-home orders.* Symptoms of anxiety disorder and depressive disorder increased considerably in the United States during April–June of 2020, compared with the same period in 2019 ( 1 , 2 ). To assess mental health, substance use, and suicidal ideation during the pandemic, representative panel surveys were conducted among adults aged ≥18 years across the United States during June 24–30, 2020. Overall, 40.9% of respondents reported at least one adverse mental or behavioral health condition, including symptoms of anxiety disorder or depressive disorder (30.9%), symptoms of a trauma- and stressor-related disorder (TSRD) related to the pandemic † (26.3%), and having started or increased substance use to cope with stress or emotions related to COVID-19 (13.3%). The percentage of respondents who reported having seriously considered suicide in the 30 days before completing the survey (10.7%) was significantly higher among respondents aged 18–24 years (25.5%), minority racial/ethnic groups (Hispanic respondents [18.6%], non-Hispanic black [black] respondents [15.1%]), self-reported unpaid caregivers for adults § (30.7%), and essential workers ¶ (21.7%). Community-level intervention and prevention efforts, including health communication strategies, designed to reach these groups could help address various mental health conditions associated with the COVID-19 pandemic. During June 24–30, 2020, a total of 5,412 (54.7%) of 9,896 eligible invited adults** completed web-based surveys †† administered by Qualtrics. §§ The Monash University Human Research Ethics Committee of Monash University (Melbourne, Australia) reviewed and approved the study protocol on human subjects research. 
Respondents were informed of the study purposes and provided electronic consent before commencement, and investigators received anonymized responses. Participants included 3,683 (68.1%) first-time respondents and 1,729 (31.9%) respondents who had completed a related survey during April 2–8, May 5–12, 2020, or both intervals; 1,497 (27.7%) respondents participated during all three intervals ( 2 , 3 ). Quota sampling and survey weighting were employed to improve cohort representativeness of the U.S. population by gender, age, and race/ethnicity. ¶¶ Symptoms of anxiety disorder and depressive disorder were assessed using the four-item Patient Health Questionnaire*** ( 4 ), and symptoms of a COVID-19–related TSRD were assessed using the six-item Impact of Event Scale ††† ( 5 ). Respondents also reported whether they had started or increased substance use to cope with stress or emotions related to COVID-19 or seriously considered suicide in the 30 days preceding the survey. §§§ Analyses were stratified by gender, age, race/ethnicity, employment status, essential worker status, unpaid adult caregiver status, rural-urban residence classification, ¶¶¶ whether the respondent knew someone who had positive test results for SARS-CoV-2, the virus that causes COVID-19, or who had died from COVID-19, and whether the respondent was receiving treatment for diagnosed anxiety, depression, or posttraumatic stress disorder (PTSD) at the time of the survey. Comparisons within subgroups were evaluated using Poisson regressions with robust standard errors to calculate prevalence ratios, 95% confidence intervals (CIs), and p-values to evaluate statistical significance (α = 0.005 to account for multiple comparisons). 
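The prevalence ratios described above can be approximated by hand for a single two-group comparison. The sketch below is not the authors' code: it assumes unweighted data and one binary covariate, a case in which the log-scale variance used here is expected to coincide with what a Poisson regression with robust standard errors reports. The function name `prevalence_ratio` is hypothetical, and the inputs are the published percentages and weighted group sizes from Table 1, treated as if they were raw counts.

```python
import math
from statistics import NormalDist


def prevalence_ratio(p1, n1, p0, n0, alpha=0.05):
    """Prevalence ratio p1/p0 with a Wald confidence interval on the log scale.

    Assumption: unweighted data and a single binary covariate. The survey
    weighting used in the report is NOT reproduced here.
    """
    pr = p1 / p0
    # Variance of log(PR) for two independent proportions.
    se_log = math.sqrt((1 - p1) / (n1 * p1) + (1 - p0) / (n0 * p0))
    z = NormalDist().inv_cdf(1 - alpha / 2)
    lo = math.exp(math.log(pr) - z * se_log)
    hi = math.exp(math.log(pr) + z * se_log)
    return pr, lo, hi


# Essential vs. nonessential workers, serious consideration of suicide
# (Table 1: 21.7% of 1,785 vs. 7.8% of 1,646 weighted respondents).
pr, lo, hi = prevalence_ratio(0.217, 1785, 0.078, 1646)
print(f"PR = {pr:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```

Under these simplifications the result lands close to, but not exactly on, the published estimate for this comparison in Table 2 (2.76, 95% CI 2.29–3.33); the remaining gap is plausibly due to survey weighting and rounding of the published percentages.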
Among the 1,497 respondents who completed all three surveys, longitudinal analyses of the odds of incidence**** of symptoms of adverse mental or behavioral health conditions by essential worker and unpaid adult caregiver status were conducted on unweighted responses using logistic regressions to calculate unadjusted and adjusted†††† odds ratios (ORs), 95% CIs, and p-values (α = 0.05). The statsmodels package in Python (version 3.7.8; Python Software Foundation) was used to conduct all analyses.

Overall, 40.9% of 5,470 respondents who completed surveys during June reported an adverse mental or behavioral health condition, including those who reported symptoms of anxiety disorder or depressive disorder (30.9%), those with TSRD symptoms related to COVID-19 (26.3%), those who reported having started or increased substance use to cope with stress or emotions related to COVID-19 (13.3%), and those who reported having seriously considered suicide in the preceding 30 days (10.7%) (Table 1). At least one adverse mental or behavioral health symptom was reported by more than one half of respondents who were aged 18–24 years (74.9%) and 25–44 years (51.9%), of Hispanic ethnicity (52.1%), and who held less than a high school diploma (66.2%), as well as those who were essential workers (54.0%), unpaid caregivers for adults (66.6%), and who reported treatment for diagnosed anxiety (72.7%), depression (68.8%), or PTSD (88.0%) at the time of the survey.

TABLE 1. Respondent characteristics and prevalence of adverse mental health outcomes, increased substance use to cope with stress or emotions related to the COVID-19 pandemic, and suicidal ideation, United States, June 24–30, 2020

All respondents who completed surveys during June 24–30, 2020. The second column gives the weighted* number (%) of respondents; all other columns give weighted* percentages.

| Characteristic | Weighted* no. (%) | Anxiety disorder† | Depressive disorder† | Anxiety or depressive disorder† | COVID-19–related TSRD§ | Started or increased substance use to cope with pandemic-related stress or emotions¶ | Seriously considered suicide in past 30 days | ≥1 adverse mental or behavioral health symptom |
|---|---|---|---|---|---|---|---|---|
| All respondents | 5,470 (100) | 25.5 | 24.3 | 30.9 | 26.3 | 13.3 | 10.7 | 40.9 |
| Gender | | | | | | | | |
| Female | 2,784 (50.9) | 26.3 | 23.9 | 31.5 | 24.7 | 12.2 | 8.9 | 41.4 |
| Male | 2,676 (48.9) | 24.7 | 24.8 | 30.4 | 27.9 | 14.4 | 12.6 | 40.5 |
| Other | 10 (0.2) | 20.0 | 30.0 | 30.0 | 30.0 | 10.0 | 0.0 | 30.0 |
| Age group (yrs) | | | | | | | | |
| 18–24 | 731 (13.4) | 49.1 | 52.3 | 62.9 | 46.0 | 24.7 | 25.5 | 74.9 |
| 25–44 | 1,911 (34.9) | 35.3 | 32.5 | 40.4 | 36.0 | 19.5 | 16.0 | 51.9 |
| 45–64 | 1,895 (34.6) | 16.1 | 14.4 | 20.3 | 17.2 | 7.7 | 3.8 | 29.5 |
| ≥65 | 933 (17.1) | 6.2 | 5.8 | 8.1 | 9.2 | 3.0 | 2.0 | 15.1 |
| Race/Ethnicity | | | | | | | | |
| White, non-Hispanic | 3,453 (63.1) | 24.0 | 22.9 | 29.2 | 23.3 | 10.6 | 7.9 | 37.8 |
| Black, non-Hispanic | 663 (12.1) | 23.4 | 24.6 | 30.2 | 30.4 | 18.4 | 15.1 | 44.2 |
| Asian, non-Hispanic | 256 (4.7) | 14.1 | 14.2 | 18.0 | 22.1 | 6.7 | 6.6 | 31.9 |
| Other race or multiple races, non-Hispanic** | 164 (3.0) | 27.8 | 29.3 | 33.2 | 28.3 | 11.0 | 9.8 | 43.8 |
| Hispanic, any race(s) | 885 (16.2) | 35.5 | 31.3 | 40.8 | 35.1 | 21.9 | 18.6 | 52.1 |
| Unknown | 50 (0.9) | 38.0 | 34.0 | 44.0 | 34.0 | 18.0 | 26.0 | 48.0 |
| 2019 Household income (USD) | | | | | | | | |
| <25,000 | 741 (13.6) | 30.6 | 30.8 | 36.6 | 29.9 | 12.5 | 9.9 | 45.4 |
| 25,000–49,999 | 1,123 (20.5) | 26.0 | 25.6 | 33.2 | 27.2 | 13.5 | 10.1 | 43.9 |
| 50,000–99,999 | 1,775 (32.5) | 27.1 | 24.8 | 31.6 | 26.4 | 12.6 | 11.4 | 40.3 |
| 100,000–199,999 | 1,301 (23.8) | 23.1 | 20.8 | 27.7 | 24.2 | 15.5 | 11.7 | 37.8 |
| ≥200,000 | 282 (5.2) | 17.4 | 17.0 | 20.6 | 23.1 | 14.8 | 11.6 | 35.1 |
| Unknown | 247 (4.5) | 19.6 | 23.1 | 27.2 | 24.9 | 6.2 | 3.9 | 41.5 |
| Education | | | | | | | | |
| Less than high school diploma | 78 (1.4) | 44.5 | 51.4 | 57.5 | 44.5 | 22.1 | 30.0 | 66.2 |
| High school diploma | 943 (17.2) | 31.5 | 32.8 | 38.4 | 32.1 | 15.3 | 13.1 | 48.0 |
| Some college | 1,455 (26.6) | 25.2 | 23.4 | 31.7 | 22.8 | 10.9 | 8.6 | 39.9 |
| Bachelor's degree | 1,888 (34.5) | 24.7 | 22.5 | 28.7 | 26.4 | 14.2 | 10.7 | 40.6 |
| Professional degree | 1,074 (19.6) | 20.9 | 19.5 | 25.4 | 24.5 | 12.6 | 10.0 | 35.2 |
| Unknown | 33 (0.6) | 25.2 | 23.2 | 28.2 | 23.2 | 10.5 | 5.5 | 28.2 |
| Employment status†† | | | | | | | | |
| Employed | 3,431 (62.7) | 30.1 | 29.1 | 36.4 | 32.1 | 17.9 | 15.0 | 47.8 |
| Employed, essential | 1,785 (32.6) | 35.5 | 33.6 | 42.4 | 38.5 | 24.7 | 21.7 | 54.0 |
| Employed, nonessential | 1,646 (30.1) | 24.1 | 24.1 | 29.9 | 25.2 | 10.5 | 7.8 | 41.0 |
| Unemployed | 761 (13.9) | 32.0 | 29.4 | 37.8 | 25.0 | 7.7 | 4.7 | 45.9 |
| Retired | 1,278 (23.4) | 9.6 | 8.7 | 12.1 | 11.3 | 4.2 | 2.5 | 19.6 |
| Unpaid adult caregiver status§§ | | | | | | | | |
| Yes | 1,435 (26.2) | 47.6 | 45.2 | 56.1 | 48.4 | 32.9 | 30.7 | 66.6 |
| No | 4,035 (73.8) | 17.7 | 16.9 | 22.0 | 18.4 | 6.3 | 3.6 | 31.8 |
| Region¶¶ | | | | | | | | |
| Northeast | 1,193 (21.8) | 23.9 | 23.9 | 29.9 | 22.8 | 12.8 | 10.2 | 37.1 |
| Midwest | 1,015 (18.6) | 22.7 | 21.1 | 27.5 | 24.4 | 9.0 | 7.5 | 36.1 |
| South | 1,921 (35.1) | 27.9 | 26.5 | 33.4 | 29.1 | 15.4 | 12.5 | 44.4 |
| West | 1,340 (24.5) | 25.8 | 24.2 | 30.9 | 26.7 | 14.0 | 10.9 | 43.0 |
| Rural-urban classification*** | | | | | | | | |
| Rural | 599 (10.9) | 26.0 | 22.5 | 29.3 | 25.4 | 11.5 | 10.2 | 38.3 |
| Urban | 4,871 (89.1) | 25.5 | 24.6 | 31.1 | 26.4 | 13.5 | 10.7 | 41.2 |
| Know someone who had positive test results for SARS-CoV-2 | | | | | | | | |
| Yes | 1,109 (20.3) | 23.8 | 21.9 | 29.6 | 21.5 | 12.9 | 7.5 | 39.2 |
| No | 4,361 (79.7) | 26.0 | 25.0 | 31.3 | 27.5 | 13.4 | 11.5 | 41.3 |
| Knew someone who died from COVID-19 | | | | | | | | |
| Yes | 428 (7.8) | 25.8 | 20.6 | 30.6 | 28.1 | 11.3 | 7.6 | 40.1 |
| No | 5,042 (92.2) | 25.5 | 24.7 | 31.0 | 26.1 | 13.4 | 10.9 | 41.0 |
| Receiving treatment for previously diagnosed condition | | | | | | | | |
| Anxiety: Yes | 536 (9.8) | 59.6 | 52.0 | 66.0 | 51.9 | 26.6 | 23.6 | 72.7 |
| Anxiety: No | 4,934 (90.2) | 21.8 | 21.3 | 27.1 | 23.5 | 11.8 | 9.3 | 37.5 |
| Depression: Yes | 540 (9.9) | 52.5 | 50.6 | 60.8 | 45.5 | 25.2 | 22.1 | 68.8 |
| Depression: No | 4,930 (90.1) | 22.6 | 21.5 | 27.7 | 24.2 | 12.0 | 9.4 | 37.9 |
| Posttraumatic stress disorder: Yes | 251 (4.6) | 72.3 | 69.1 | 78.7 | 69.4 | 43.8 | 44.8 | 88.0 |
| Posttraumatic stress disorder: No | 5,219 (95.4) | 23.3 | 22.2 | 28.6 | 24.2 | 11.8 | 9.0 | 38.7 |

Abbreviations: COVID-19 = coronavirus disease 2019; TSRD = trauma- and stressor-related disorder.
* Survey weighting was employed to improve the cross-sectional June cohort representativeness of the U.S. population by gender, age, and race/ethnicity according to the 2010 U.S. Census, for respondents who reported gender, age, and race/ethnicity. Respondents who reported a gender of “Other” or who did not report race/ethnicity were assigned a weight of one.
† Symptoms of anxiety disorder and depressive disorder were assessed via the four-item Patient Health Questionnaire (PHQ-4). Those who scored ≥3 out of 6 on the Generalized Anxiety Disorder (GAD-2) and Patient Health Questionnaire (PHQ-2) subscales were considered symptomatic for each disorder, respectively. § Disorders classified as TSRDs in the Diagnostic and Statistical Manual of Mental Disorders (DSM–5) include posttraumatic stress disorder (PTSD), acute stress disorder (ASD), and adjustment disorders (ADs), among others. Symptoms of a TSRD precipitated by the COVID-19 pandemic were assessed via the six-item Impact of Event Scale (IES-6) to screen for overlapping symptoms of PTSD, ASD, and ADs. For this survey, the COVID-19 pandemic was specified as the traumatic exposure to record peri- and posttraumatic symptoms associated with the range of stressors introduced by the COVID-19 pandemic. Those who scored ≥1.75 out of 4 were considered symptomatic. ¶ 104 respondents selected “Prefer not to answer.” ** The Other race or multiple races, non-Hispanic category includes respondents who identified as not being Hispanic and as more than one race or as American Indian or Alaska Native, Native Hawaiian or Pacific Islander, or “Other.” †† Essential worker status was self-reported. The comparison was between employed respondents (n = 3,431) who identified as essential vs. nonessential. For this analysis, students who were not separately employed as essential workers were considered nonessential workers. §§ Unpaid adult caregiver status was self-reported. The definition of an unpaid caregiver for adults was a person who had provided unpaid care to a relative or friend aged ≥18 years to help them take care of themselves at any time in the last 3 months. 
Examples provided included helping with personal needs, household chores, health care tasks, managing a person’s finances, taking them to a doctor’s appointment, arranging for outside services, and visiting regularly to see how they are doing.
¶¶ Region classification was determined by using the U.S. Census Bureau’s Census Regions and Divisions of the United States. https://www2.census.gov/geo/pdfs/maps-data/maps/reference/us_regdiv.pdf.
*** Rural-urban classification was determined by using self-reported ZIP codes according to the Federal Office of Rural Health Policy definition of rurality. https://www.hrsa.gov/rural-health/about-us/definition/datafiles.html.

Prevalences of symptoms of adverse mental or behavioral health conditions varied significantly among subgroups (Table 2). Suicidal ideation was more prevalent among males than among females. Symptoms of anxiety disorder or depressive disorder, COVID-19–related TSRD, initiation of or increase in substance use to cope with COVID-19–associated stress, and serious suicidal ideation in the previous 30 days were most commonly reported by persons aged 18–24 years; prevalence decreased progressively with age. Hispanic respondents reported higher prevalences of symptoms of anxiety disorder or depressive disorder, COVID-19–related TSRD, increased substance use, and suicidal ideation than did non-Hispanic white (white) or non-Hispanic Asian (Asian) respondents. Black respondents reported increased substance use and serious consideration of suicide in the previous 30 days more commonly than did white and Asian respondents. Respondents who reported treatment for diagnosed anxiety, depression, or PTSD at the time of the survey reported higher prevalences of symptoms of adverse mental and behavioral health conditions compared with those who did not.
Symptoms of a COVID-19–related TSRD, increased substance use, and suicidal ideation were more prevalent among employed than unemployed respondents, and among essential workers than nonessential workers. Adverse conditions also were more prevalent among unpaid caregivers for adults than among those who were not, with particularly large differences in increased substance use (32.9% versus 6.3%) and suicidal ideation (30.7% versus 3.6%) in this group.

TABLE 2. Comparison of symptoms of adverse mental health outcomes among all respondents who completed surveys (N = 5,470), by respondent characteristic*, United States, June 24–30, 2020

All cells are prevalence ratios¶ (95% CI¶).

| Characteristic | Symptoms of anxiety disorder or depressive disorder† | Symptoms of a TSRD related to COVID-19§ | Started or increased substance use to cope with stress or emotions related to COVID-19 | Serious consideration of suicide in past 30 days |
|---|---|---|---|---|
| Gender | | | | |
| Female vs. male | 1.04 (0.96–1.12) | 0.88 (0.81–0.97) | 0.85 (0.75–0.98) | 0.70 (0.60–0.82)** |
| Age group (yrs) | | | | |
| 18–24 vs. 25–44 | 1.56 (1.44–1.68)** | 1.28 (1.16–1.41)** | 1.31 (1.12–1.53)** | 1.59 (1.35–1.87)** |
| 18–24 vs. 45–64 | 3.10 (2.79–3.44)** | 2.67 (2.35–3.03)** | 3.35 (2.75–4.10)** | 6.66 (5.15–8.61)** |
| 18–24 vs. ≥65 | 7.73 (6.19–9.66)** | 5.01 (4.04–6.22)** | 8.77 (5.95–12.93)** | 12.51 (7.88–19.86)** |
| 25–44 vs. 45–64 | 1.99 (1.79–2.21)** | 2.09 (1.86–2.35)** | 2.56 (2.14–3.07)** | 4.18 (3.26–5.36)** |
| 25–44 vs. ≥65 | 4.96 (3.97–6.20)** | 3.93 (3.18–4.85)** | 6.70 (4.59–9.78)** | 7.86 (4.98–12.41)** |
| 45–64 vs. ≥65 | 2.49 (1.98–3.15)** | 1.88 (1.50–2.35)** | 2.62 (1.76–3.90)** | 1.88 (1.14–3.10) |
| Race/Ethnicity†† | | | | |
| Hispanic vs. non-Hispanic black | 1.35 (1.18–1.56)** | 1.15 (1.00–1.33) | 1.19 (0.97–1.46) | 1.23 (0.98–1.55) |
| Hispanic vs. non-Hispanic Asian | 2.27 (1.73–2.98)** | 1.59 (1.24–2.04)** | 3.29 (2.05–5.28)** | 2.82 (1.74–4.57)** |
| Hispanic vs. non-Hispanic other race or multiple races | 1.23 (0.98–1.55) | 1.24 (0.96–1.61) | 1.99 (1.27–3.13)** | 1.89 (1.16–3.06) |
| Hispanic vs. non-Hispanic white | 1.40 (1.27–1.54)** | 1.50 (1.35–1.68)** | 2.09 (1.79–2.45)** | 2.35 (1.96–2.80)** |
| Non-Hispanic black vs. non-Hispanic Asian | 1.68 (1.26–2.23)** | 1.38 (1.07–1.78) | 2.75 (1.70–4.47)** | 2.29 (1.39–3.76)** |
| Non-Hispanic black vs. non-Hispanic other race or multiple races | 0.91 (0.71–1.16) | 1.08 (0.82–1.41) | 1.67 (1.05–2.65) | 1.53 (0.93–2.52) |
| Non-Hispanic black vs. non-Hispanic white | 1.03 (0.91–1.17) | 1.30 (1.14–1.48)** | 1.75 (1.45–2.11)** | 1.90 (1.54–2.36)** |
| Non-Hispanic Asian vs. non-Hispanic other race or multiple races | 0.54 (0.39–0.76)** | 0.78 (0.56–1.09) | 0.61 (0.32–1.14) | 0.67 (0.35–1.29) |
| Non-Hispanic Asian vs. non-Hispanic white | 0.62 (0.47–0.80)** | 0.95 (0.74–1.20) | 0.64 (0.40–1.02) | 0.83 (0.52–1.34) |
| Non-Hispanic other race or multiple races vs. non-Hispanic white | 1.14 (0.91–1.42) | 1.21 (0.94–1.56) | 1.05 (0.67–1.64) | 1.24 (0.77–2.00) |
| Employment status | | | | |
| Employed vs. unemployed | 0.96 (0.87–1.07) | 1.28 (1.12–1.46)** | 2.30 (1.78–2.98)** | 3.21 (2.31–4.47)** |
| Employed vs. retired | 3.01 (2.58–3.51)** | 2.84 (2.42–3.34)** | 4.30 (3.28–5.63)** | 5.97 (4.20–8.47)** |
| Unemployed vs. retired | 3.12 (2.63–3.71)** | 2.21 (1.82–2.69)** | 1.87 (1.30–2.67)** | 1.86 (1.16–2.96) |
| Essential vs. nonessential worker§§ | 1.42 (1.30–1.56)** | 1.52 (1.38–1.69)** | 2.36 (2.00–2.77)** | 2.76 (2.29–3.33)** |
| Unpaid caregiver for adults vs. not¶¶ | 2.55 (2.37–2.75)** | 2.63 (2.42–2.86)** | 5.28 (4.59–6.07)** | 8.64 (7.23–10.33)** |
| Rural vs. urban residence*** | 0.94 (0.82–1.07) | 0.96 (0.83–1.11) | 0.84 (0.67–1.06) | 0.95 (0.74–1.22) |
| Knows someone with positive SARS-CoV-2 test result vs. not | 0.95 (0.86–1.05) | 0.78 (0.69–0.88)** | 0.96 (0.81–1.14) | 0.65 (0.52–0.81)** |
| Knew someone who died from COVID-19 vs. not | 0.99 (0.85–1.15) | 1.08 (0.92–1.26) | 0.84 (0.64–1.11) | 0.69 (0.49–0.97) |
| Receiving treatment for anxiety vs. not | 2.43 (2.26–2.63)** | 2.21 (2.01–2.43)** | 2.27 (1.94–2.66)** | 2.54 (2.13–3.03)** |
| Receiving treatment for depression vs. not | 2.20 (2.03–2.39)** | 1.88 (1.70–2.09)** | 2.13 (1.81–2.51)** | 2.35 (1.96–2.82)** |
| Receiving treatment for PTSD vs. not | 2.75 (2.55–2.97)** | 2.87 (2.61–3.16)** | 3.78 (3.23–4.42)** | 4.95 (4.21–5.83)** |

Abbreviations: CI = confidence interval; COVID-19 = coronavirus disease 2019; PTSD = posttraumatic stress disorder; TSRD = trauma- and stressor-related disorder.
* Number of respondents for characteristics: gender (female = 2,784, male = 2,676), age group in years (18–24 = 731; 25–44 = 1,911; 45–64 = 1,895; ≥65 = 933), race/ethnicity (non-Hispanic white = 3,453, non-Hispanic black = 663, non-Hispanic Asian = 256, non-Hispanic other race or multiple races = 164, Hispanic = 885).
† Symptoms of anxiety disorder and depressive disorder were assessed via the four-item Patient Health Questionnaire (PHQ-4). Those who scored ≥3 out of 6 on the Generalized Anxiety Disorder (GAD-2) and Patient Health Questionnaire (PHQ-2) subscales were considered to have symptoms of these disorders.
§ Disorders classified as TSRDs in the Diagnostic and Statistical Manual of Mental Disorders (DSM–5) include PTSD, acute stress disorder (ASD), and adjustment disorders (ADs), among others. Symptoms of a TSRD precipitated by the COVID-19 pandemic were assessed via the six-item Impact of Event Scale (IES-6) to screen for overlapping symptoms of PTSD, ASD, and ADs. For this survey, the COVID-19 pandemic was specified as the traumatic exposure to record peri- and posttraumatic symptoms associated with the range of stressors introduced by the COVID-19 pandemic. Persons who scored ≥1.75 out of 4 were considered to be symptomatic.
¶ Comparisons within subgroups were evaluated on weighted responses via Poisson regressions used to calculate a prevalence ratio, 95% CI, and p-value (not shown). Statistical significance was evaluated at a threshold of α = 0.005 to account for multiple comparisons. In the calculation of prevalence ratios for started or increased substance use, respondents who selected “Prefer not to answer” (n = 104) were excluded.
** P-value is statistically significant (p<0.005).
†† Respondents identified as a single race unless otherwise specified. The non-Hispanic, other race or multiple races category includes respondents who identified as not Hispanic and as more than one race or as American Indian or Alaska Native, Native Hawaiian or Pacific Islander, or “Other.”
§§ Essential worker status was self-reported. The comparison was between employed respondents (n = 3,431) who identified as essential vs. nonessential. For this analysis, students who were not separately employed as essential workers were considered nonessential workers.
¶¶ Unpaid adult caregiver status was self-reported. The definition of an unpaid caregiver for adults was having provided unpaid care to a relative or friend aged ≥18 years to help them take care of themselves at any time in the last 3 months. Examples provided included helping with personal needs, household chores, health care tasks, managing a person’s finances, taking them to a doctor’s appointment, arranging for outside services, and visiting regularly to see how they are doing.
*** Rural-urban classification was determined by using self-reported ZIP codes according to the Federal Office of Rural Health Policy definition of rurality. https://www.hrsa.gov/rural-health/about-us/definition/datafiles.html.

Longitudinal analysis of responses of 1,497 persons who completed all three surveys revealed that unpaid caregivers for adults had significantly higher odds of incidence of adverse mental health conditions compared with others (Table 3). Among those who did not report having started or increased substance use to cope with stress or emotions related to COVID-19 in May, unpaid caregivers for adults had 3.33 times the odds of reporting this behavior in June (adjusted OR 95% CI = 1.75–6.31; p<0.001).
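As a rough consistency check on reported odds ratios, a Wald p-value can be backed out of an OR and its 95% CI. This is a sketch, not the authors' procedure: it assumes the interval is a symmetric Wald interval on the log-odds scale, and the function name `p_from_or_ci` is hypothetical. The inputs are adjusted odds ratios for unpaid caregivers as reported in the text and in Table 3.

```python
import math
from statistics import NormalDist


def p_from_or_ci(or_est, lo, hi, level=0.95):
    """Back out the standard error, z statistic, and two-sided p-value.

    Assumption: (lo, hi) is a Wald interval, symmetric around log(or_est).
    """
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    se = (math.log(hi) - math.log(lo)) / (2 * z_crit)
    z = math.log(or_est) / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return se, z, p


# Started or increased substance use: adjusted OR 3.33 (1.75-6.31), p<0.001.
print(p_from_or_ci(3.33, 1.75, 6.31))
# Serious consideration of suicide: adjusted OR 3.03 (1.20-7.63), p = 0.019.
print(p_from_or_ci(3.03, 1.20, 7.63))
```

Under this assumption the first comparison yields a p-value below 0.001 and the second a value near the reported 0.019, so the published estimates, intervals, and p-values appear mutually consistent.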
Similarly, among those who did not report having seriously considered suicide in the previous 30 days in May, unpaid caregivers for adults had 3.03 times the odds of reporting suicidal ideation in June (adjusted OR 95% CI = 1.20–7.63; p = 0.019).

TABLE 3. Odds of incidence* of symptoms of adverse mental health, substance use to cope with stress or emotions related to COVID-19 pandemic, and suicidal ideation in the third survey wave, by essential worker status and unpaid adult caregiver status among respondents who completed monthly surveys from April through June (N = 1,497), United States, April 2–8, May 5–12, and June 24–30, 2020

Comparisons: essential worker† vs. all other employment statuses (nonessential worker, unemployed, retired), and unpaid caregiver for adults§ vs. not unpaid caregiver. Each OR is shown with its 95% CI and p-value.††

| Symptom or behavior | Essential worker, unadjusted OR (95% CI) | p-value | Essential worker, adjusted¶ OR (95% CI) | p-value | Unpaid caregiver, unadjusted OR (95% CI) | p-value | Unpaid caregiver, adjusted** OR (95% CI) | p-value |
|---|---|---|---|---|---|---|---|---|
| Symptoms of anxiety disorder§§ | 1.92 (1.29–2.87) | 0.001 | 1.63 (0.99–2.69) | 0.056 | 1.97 (1.25–3.11) | 0.004 | 1.81 (1.14–2.87) | 0.012 |
| Symptoms of depressive disorder§§ | 1.49 (1.00–2.22) | 0.052 | 1.13 (0.70–1.82) | 0.606 | 2.29 (1.50–3.50) | <0.001 | 2.22 (1.45–3.41) | <0.001 |
| Symptoms of anxiety disorder or depressive disorder§§ | 1.67 (1.14–2.46) | 0.008 | 1.26 (0.79–2.00) | 0.326 | 1.84 (1.19–2.85) | 0.006 | 1.73 (1.11–2.70) | 0.015 |
| Symptoms of a TSRD related to COVID-19¶¶ | 1.55 (0.86–2.81) | 0.146 | 1.27 (0.63–2.56) | 0.512 | 1.88 (0.99–3.56) | 0.054 | 1.79 (0.94–3.42) | 0.076 |
| Started or increased substance use to cope with stress or emotions related to COVID-19 | 2.36 (1.26–4.42) | 0.007 | 2.04 (0.92–4.48) | 0.078 | 3.51 (1.86–6.61) | <0.001 | 3.33 (1.75–6.31) | <0.001 |
| Serious consideration of suicide in previous 30 days | 0.93 (0.31–2.78) | 0.895 | 0.53 (0.16–1.70) | 0.285 | 3.00 (1.20–7.52) | 0.019 | 3.03 (1.20–7.63) | 0.019 |

Abbreviations: CI = confidence interval; COVID-19 = coronavirus disease 2019; OR = odds ratio; TSRD = trauma- and stressor-related disorder.
* For outcomes assessed via the four-item Patient Health Questionnaire (PHQ-4), odds of incidence were marked by the presence of symptoms during May 5–12 or June 24–30, 2020, after the absence of symptoms during April 2–8, 2020. Respondent pools for prospective analysis of odds of incidence (did not screen positive for symptoms during April 2–8): anxiety disorder (n = 1,236), depressive disorder (n = 1,301), and anxiety disorder or depressive disorder (n = 1,190). For symptoms of a TSRD precipitated by COVID-19, started or increased substance use to cope with stress or emotions related to COVID-19, and serious suicidal ideation in the previous 30 days, odds of incidence were marked by the presence of an outcome during June 24–30, 2020, after the absence of that outcome during May 5–12, 2020. Respondent pools for prospective analysis of odds of incidence (did not report symptoms or behavior during May 5–12): symptoms of a TSRD (n = 1,206), started or increased substance use (n = 1,408), and suicidal ideation (n = 1,456).
† Essential worker status was self-reported. For Table 3, essential worker status was determined by identification as an essential worker during the June 24–30 survey. Essential workers were compared with all other respondents, not just employed respondents (i.e., essential workers vs. all other employment statuses [nonessential worker, unemployed, and retired], not essential vs. nonessential workers).
§ Unpaid adult caregiver status was self-reported. The definition of an unpaid caregiver for adults was having provided unpaid care to a relative or friend 18 years or older to help them take care of themselves at any time in the last 3 months. Examples provided included helping with personal needs, household chores, health care tasks, managing a person’s finances, taking them to a doctor’s appointment, arranging for outside services, and visiting regularly to see how they are doing.
¶ Adjusted for gender, employment status, and unpaid adult caregiver status.
** Adjusted for gender, employment status, and essential worker status.
†† Respondents who completed surveys from all three waves (April, May, June) were eligible to be included in an unweighted longitudinal analysis. Comparisons within subgroups were evaluated via logit-linked binomial regressions used to calculate unadjusted and adjusted odds ratios, 95% confidence intervals, and p-values. Statistical significance was evaluated at a threshold of α = 0.05. In the calculation of odds ratios for started or increased substance use, respondents who selected “Prefer not to answer” (n = 11) were excluded.
§§ Symptoms of anxiety disorder and depressive disorder were assessed via the PHQ-4. Those who scored ≥3 out of 6 on the two-item Generalized Anxiety Disorder (GAD-2) and two-item Patient Health Questionnaire (PHQ-2) subscales were considered symptomatic for each disorder, respectively.
¶¶ Disorders classified as TSRDs in the Diagnostic and Statistical Manual of Mental Disorders (DSM–5) include posttraumatic stress disorder (PTSD), acute stress disorder (ASD), and adjustment disorders (ADs), among others. Symptoms of a TSRD precipitated by the COVID-19 pandemic were assessed via the six-item Impact of Event Scale (IES-6) to screen for overlapping symptoms of PTSD, ASD, and ADs. For this survey, the COVID-19 pandemic was specified as the traumatic exposure to record peri- and posttraumatic symptoms associated with the range of potential stressors introduced by the COVID-19 pandemic. Those who scored ≥1.75 out of 4 were considered symptomatic.

Discussion

Elevated levels of adverse mental health conditions, substance use, and suicidal ideation were reported by adults in the United States in June 2020.
The prevalence of symptoms of anxiety disorder was approximately three times that reported in the second quarter of 2019 (25.5% versus 8.1%), and the prevalence of depressive disorder was approximately four times that reported in the second quarter of 2019 (24.3% versus 6.5%) (2). However, given the methodological differences and potential unknown biases in survey designs, this analysis might not be directly comparable with data reported on anxiety and depression disorders in 2019 (2). Approximately one quarter of respondents reported symptoms of a TSRD related to the pandemic, and approximately one in 10 reported that they started or increased substance use because of COVID-19. Suicidal ideation was also elevated; approximately twice as many respondents reported serious consideration of suicide in the previous 30 days as did adults in the United States in 2018 when asked about the previous 12 months (10.7% versus 4.3%) (6).

Mental health conditions are disproportionately affecting specific populations, especially young adults, Hispanic persons, black persons, essential workers, unpaid caregivers for adults, and those receiving treatment for preexisting psychiatric conditions. Unpaid caregivers for adults, many of whom are currently providing critical aid to persons at increased risk for severe illness from COVID-19, had a higher incidence of adverse mental and behavioral health conditions compared with others. Although unpaid caregivers of children were not evaluated in this study, approximately 39% of unpaid caregivers for adults shared a household with children (compared with 27% of other respondents). Caregiver workload, especially in multigenerational caregivers, should be considered for future assessment of mental health, given the findings of this report and hardships potentially faced by caregivers.

The findings in this report are subject to at least four limitations.
First, a diagnostic evaluation for anxiety disorder or depressive disorder was not conducted; however, clinically validated screening instruments were used to assess symptoms. Second, the trauma- and stressor-related symptoms assessed were common to multiple TSRDs, precluding distinction among them; however, the findings highlight the importance of including COVID-19–specific trauma measures to gain insights into peri- and posttraumatic impacts of the COVID-19 pandemic (7). Third, substance use behavior was self-reported; therefore, responses might be subject to recall, response, and social desirability biases. Finally, given that the web-based survey might not be fully representative of the United States population, findings might have limited generalizability. However, standardized quality and data inclusion screening procedures, including algorithmic analysis of click-through behavior, removal of duplicate responses, and scrubbing methods for web-based panel quality, were applied. Further, the prevalences of symptoms of anxiety disorder and depressive disorder were largely consistent with findings from the Household Pulse Survey during June (1).

Markedly elevated prevalences of reported adverse mental and behavioral health conditions associated with the COVID-19 pandemic highlight the broad impact of the pandemic and the need to prevent and treat these conditions. Identification of populations at increased risk for psychological distress and unhealthy coping can inform policies to address health inequity, including increasing access to resources for clinical diagnoses and treatment options. Expanded use of telehealth, an effective means of delivering treatment for mental health conditions, including depression, substance use disorder, and suicidal ideation (8), might reduce COVID-19–related mental health consequences.
Future studies should identify drivers of adverse mental and behavioral health during the COVID-19 pandemic and whether factors such as social isolation, absence of school structure, unemployment and other financial worries, and various forms of violence (e.g., physical, emotional, mental, or sexual abuse) serve as additional stressors. Community-level intervention and prevention efforts should include strengthening economic supports to reduce financial strain, addressing stress from experienced racial discrimination, promoting social connectedness, and supporting persons at risk for suicide ( 9 ). Communication strategies should focus on promotion of health services and culturally and linguistically tailored prevention messaging regarding practices to improve emotional well-being. Development and implementation of COVID-19–specific screening instruments for early identification of COVID-19–related TSRD symptoms would allow for early clinical interventions that might prevent progression from acute to chronic TSRDs. To reduce potential harms of increased substance use related to COVID-19, resources, including social support, comprehensive treatment options, and harm reduction services, are essential and should remain accessible. Periodic assessment of mental health, substance use, and suicidal ideation should evaluate the prevalence of psychological distress over time. Addressing mental health disparities and preparing support systems to mitigate mental health consequences as the pandemic evolves will continue to be needed urgently. Summary What is already known about this topic? Communities have faced mental health challenges related to COVID-19–associated morbidity, mortality, and mitigation activities. What is added by this report? During June 24–30, 2020, U.S. adults reported considerably elevated adverse mental health conditions associated with COVID-19. 
Younger adults, racial/ethnic minorities, essential workers, and unpaid adult caregivers reported having experienced disproportionately worse mental health outcomes, increased substance use, and elevated suicidal ideation. What are the implications for public health practice? The public health response to the COVID-19 pandemic should increase intervention and prevention efforts to address associated mental health conditions. Community-level efforts, including health communication strategies, should prioritize young adults, racial/ethnic minorities, essential workers, and unpaid adult caregivers.
              • Record: found
              • Abstract: found
              • Article: not found

              Why Most Published Research Findings Are False

              Published research findings are sometimes refuted by subsequent evidence, with ensuing confusion and disappointment. Refutation and controversy is seen across the range of research designs, from clinical trials and traditional epidemiological studies [1–3] to the most modern molecular research [4,5]. There is increasing concern that in modern research, false findings may be the majority or even the vast majority of published research claims [6–8]. However, this should not be surprising. It can be proven that most claimed research findings are false. Here I will examine the key factors that influence this problem and some corollaries thereof. Modeling the Framework for False Positive Findings Several methodologists have pointed out [9–11] that the high rate of nonreplication (lack of confirmation) of research discoveries is a consequence of the convenient, yet ill-founded strategy of claiming conclusive research findings solely on the basis of a single study assessed by formal statistical significance, typically for a p-value less than 0.05. Research is not most appropriately represented and summarized by p-values, but, unfortunately, there is a widespread notion that medical research articles should be interpreted based only on p-values. Research findings are defined here as any relationship reaching formal statistical significance, e.g., effective interventions, informative predictors, risk factors, or associations. “Negative” research is also very useful. “Negative” is actually a misnomer, and the misinterpretation is widespread. However, here we will target relationships that investigators claim exist, rather than null findings. As has been shown previously, the probability that a research finding is indeed true depends on the prior probability of it being true (before doing the study), the statistical power of the study, and the level of statistical significance [10,11]. 
Consider a 2 × 2 table in which research findings are compared against the gold standard of true relationships in a scientific field. In a research field both true and false hypotheses can be made about the presence of relationships. Let R be the ratio of the number of “true relationships” to “no relationships” among those tested in the field. R is characteristic of the field and can vary a lot depending on whether the field targets highly likely relationships or searches for only one or a few true relationships among thousands and millions of hypotheses that may be postulated. Let us also consider, for computational simplicity, circumscribed fields where either there is only one true relationship (among many that can be hypothesized) or the power is similar to find any of the several existing true relationships. The pre-study probability of a relationship being true is R/(R + 1). The probability of a study finding a true relationship reflects the power 1 - β (one minus the Type II error rate). The probability of claiming a relationship when none truly exists reflects the Type I error rate, α. Assuming that c relationships are being probed in the field, the expected values of the 2 × 2 table are given in Table 1. After a research finding has been claimed based on achieving formal statistical significance, the post-study probability that it is true is the positive predictive value, PPV. The PPV is also the complementary probability of what Wacholder et al. have called the false positive report probability [10]. According to the 2 × 2 table, one gets PPV = (1 - β)R/(R - βR + α). A research finding is thus more likely true than false if (1 - β)R > α. Since usually the vast majority of investigators depend on α = 0.05, this means that a research finding is more likely true than false if (1 - β)R > 0.05. 
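The PPV expression and the "more likely true than false" threshold above can be checked numerically. The following is a minimal Python sketch, not code from the article; the function names and the example inputs (R = 0.1 and R = 0.05 at 80% power) are illustrative assumptions.

```python
def ppv(R: float, beta: float, alpha: float = 0.05) -> float:
    """Post-study probability that a claimed finding is true.

    R     -- pre-study odds of true to null relationships
    beta  -- Type II error rate (power is 1 - beta)
    alpha -- Type I error rate
    """
    return (1 - beta) * R / (R - beta * R + alpha)


def more_likely_true_than_false(R: float, beta: float, alpha: float = 0.05) -> bool:
    """Threshold from the text: PPV exceeds 1/2 exactly when (1 - beta) * R > alpha."""
    return (1 - beta) * R > alpha


if __name__ == "__main__":
    # Hypothetical pre-study odds, chosen to fall on either side of the threshold.
    for R in (0.1, 0.05):
        print(R, round(ppv(R, beta=0.2), 3), more_likely_true_than_false(R, beta=0.2))
```

With 80% power, the threshold (1 - β)R > 0.05 corresponds to pre-study odds above 1:16, which is why the two example values of R land on opposite sides of PPV = 0.5.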
What is less well appreciated is that bias and the extent of repeated independent testing by different teams of investigators around the globe may further distort this picture and may lead to even smaller probabilities of the research findings being indeed true. We will try to model these two factors in the context of similar 2 × 2 tables. Bias First, let us define bias as the combination of various design, data, analysis, and presentation factors that tend to produce research findings when they should not be produced. Let u be the proportion of probed analyses that would not have been “research findings,” but nevertheless end up presented and reported as such, because of bias. Bias should not be confused with chance variability that causes some findings to be false by chance even though the study design, data, analysis, and presentation are perfect. Bias can entail manipulation in the analysis or reporting of findings. Selective or distorted reporting is a typical form of such bias. We may assume that u does not depend on whether a true relationship exists or not. This is not an unreasonable assumption, since typically it is impossible to know which relationships are indeed true. In the presence of bias (Table 2), one gets PPV = ([1 - β]R + uβR)/(R + α − βR + u − uα + uβR), and PPV decreases with increasing u, unless 1 − β ≤ α, i.e., 1 − β ≤ 0.05 for most situations. Thus, with increasing bias, the chances that a research finding is true diminish considerably. This is shown for different levels of power and for different pre-study odds in Figure 1. Conversely, true research findings may occasionally be annulled because of reverse bias. For example, with large measurement errors relationships are lost in noise [12], or investigators use data inefficiently or fail to notice statistically significant relationships, or there may be conflicts of interest that tend to “bury” significant findings [13]. 
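The effect of bias on PPV described above can be explored with a short Python sketch. This is an illustration under assumed inputs (R = 1, 80% power, and a handful of u values), not code from the article; the function name `ppv_with_bias` is hypothetical.

```python
def ppv_with_bias(R: float, beta: float, u: float, alpha: float = 0.05) -> float:
    """PPV in the presence of bias, following the Table 2 formula quoted in the text.

    u is the proportion of analyses that would not have been findings
    but are reported as such because of bias.
    """
    numerator = (1 - beta) * R + u * beta * R
    denominator = R + alpha - beta * R + u - u * alpha + u * beta * R
    return numerator / denominator


if __name__ == "__main__":
    # Assumed scenario: even pre-study odds and 80% power, with growing bias.
    for u in (0.0, 0.1, 0.3, 0.5):
        print(f"u={u:.1f}  PPV={ppv_with_bias(R=1.0, beta=0.2, u=u):.3f}")
```

Setting u = 0 recovers the bias-free expression, and the stated exception can be seen by choosing a power at or below α (for example β = 0.96), where PPV no longer falls as u grows.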
There is no good large-scale empirical evidence on how frequently such reverse bias may occur across diverse research fields. However, it is probably fair to say that reverse bias is not as common. Moreover measurement errors and inefficient use of data are probably becoming less frequent problems, since measurement error has decreased with technological advances in the molecular era and investigators are becoming increasingly sophisticated about their data. Regardless, reverse bias may be modeled in the same way as bias above. Also reverse bias should not be confused with chance variability that may lead to missing a true relationship because of chance. Testing by Several Independent Teams Several independent teams may be addressing the same sets of research questions. As research efforts are globalized, it is practically the rule that several research teams, often dozens of them, may probe the same or similar questions. Unfortunately, in some areas, the prevailing mentality until now has been to focus on isolated discoveries by single teams and interpret research experiments in isolation. An increasing number of questions have at least one study claiming a research finding, and this receives unilateral attention. The probability that at least one study, among several done on the same question, claims a statistically significant research finding is easy to estimate. For n independent studies of equal power, the 2 × 2 table is shown in Table 3: PPV = R(1 − β^n)/(R + 1 − [1 − α]^n − Rβ^n) (not considering bias). With increasing number of independent studies, PPV tends to decrease, unless 1 − β < α, i.e., typically 1 − β < 0.05. This is shown for different levels of power and for different pre-study odds in Figure 2. For n studies of different power, the term β^n is replaced by the product of the terms β_i for i = 1 to n, but inferences are similar. Corollaries A practical example is shown in Box 1. 
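The multiple-teams formula can be sketched the same way. The Python below is an illustrative sketch for the equal-power case only; the function name and the example inputs (R = 0.1, 80% power) are assumptions, not values from the article.

```python
def ppv_multiple_teams(R: float, beta: float, n: int, alpha: float = 0.05) -> float:
    """PPV when at least one of n independent, equally powered studies
    claims a significant finding (bias not considered)."""
    numerator = R * (1 - beta ** n)
    denominator = R + 1 - (1 - alpha) ** n - R * beta ** n
    return numerator / denominator


if __name__ == "__main__":
    # Assumed scenario: pre-study odds of 1:10 and 80% power per study.
    for n in (1, 2, 5, 10):
        print(f"n={n:2d}  PPV={ppv_multiple_teams(R=0.1, beta=0.2, n=n):.3f}")
```

For n = 1 the expression reduces to the single-study PPV, and for these inputs the value falls steadily as more teams probe the same question.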
Based on the above considerations, one may deduce several interesting corollaries about the probability that a research finding is indeed true. Box 1. An Example: Science at Low Pre-Study Odds Let us assume that a team of investigators performs a whole genome association study to test whether any of 100,000 gene polymorphisms are associated with susceptibility to schizophrenia. Based on what we know about the extent of heritability of the disease, it is reasonable to expect that probably around ten gene polymorphisms among those tested would be truly associated with schizophrenia, with relatively similar odds ratios around 1.3 for the ten or so polymorphisms and with a fairly similar power to identify any of them. Then R = 10/100,000 = 10^−4, and the pre-study probability for any polymorphism to be associated with schizophrenia is also R/(R + 1) = 10^−4. Let us also suppose that the study has 60% power to find an association with an odds ratio of 1.3 at α = 0.05. Then it can be estimated that if a statistically significant association is found with the p-value barely crossing the 0.05 threshold, the post-study probability that this is true increases about 12-fold compared with the pre-study probability, but it is still only 12 × 10^−4. Now let us suppose that the investigators manipulate their design, analyses, and reporting so as to make more relationships cross the p = 0.05 threshold even though this would not have been crossed with a perfectly adhered to design and analysis and with perfect comprehensive reporting of the results, strictly according to the original study plan. Such manipulation could be done, for example, with serendipitous inclusion or exclusion of certain patients or controls, post hoc subgroup analyses, investigation of genetic contrasts that were not originally specified, changes in the disease or control definitions, and various combinations of selective or distorted reporting of the results. 
Commercially available “data mining” packages actually are proud of their ability to yield statistically significant results through data dredging. In the presence of bias with u = 0.10, the post-study probability that a research finding is true is only 4.4 × 10^−4. Furthermore, even in the absence of any bias, when ten independent research teams perform similar experiments around the world, if one of them finds a formally statistically significant association, the probability that the research finding is true is only 1.5 × 10^−4, hardly any higher than the probability we had before any of this extensive research was undertaken! Corollary 1: The smaller the studies conducted in a scientific field, the less likely the research findings are to be true. Small sample size means smaller power and, for all functions above, the PPV for a true research finding decreases as power decreases towards 1 − β = 0.05. Thus, other factors being equal, research findings are more likely true in scientific fields that undertake large studies, such as randomized controlled trials in cardiology (several thousand subjects randomized) [14] than in scientific fields with small studies, such as most research of molecular predictors (sample sizes 100-fold smaller) [15]. Corollary 2: The smaller the effect sizes in a scientific field, the less likely the research findings are to be true. Power is also related to the effect size. Thus research findings are more likely true in scientific fields with large effects, such as the impact of smoking on cancer or cardiovascular disease (relative risks 3–20), than in scientific fields where postulated effects are small, such as genetic risk factors for multigenetic diseases (relative risks 1.1–1.5) [7]. Modern epidemiology is increasingly obliged to target smaller effect sizes [16]. Consequently, the proportion of true research findings is expected to decrease. 
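The first two Box 1 figures quoted above (about 12 × 10^−4 without bias, and about 4.4 × 10^−4 with u = 0.10) can be reproduced from the formulas in the text. The Python below is a minimal sketch with hypothetical function and variable names; it covers only those two figures and leaves the ten-team figure out.

```python
def ppv(R, beta, alpha=0.05):
    """Bias-free PPV: (1 - beta) R / (R - beta R + alpha)."""
    return (1 - beta) * R / (R - beta * R + alpha)


def ppv_with_bias(R, beta, u, alpha=0.05):
    """PPV with bias u, per the Table 2 formula quoted in the text."""
    numerator = (1 - beta) * R + u * beta * R
    denominator = R + alpha - beta * R + u - u * alpha + u * beta * R
    return numerator / denominator


# Box 1 inputs as stated in the text.
R = 10 / 100_000   # ten true associations among 100,000 polymorphisms
BETA = 1 - 0.60    # 60% power
PRIOR = R / (R + 1)

POST_NO_BIAS = ppv(R, BETA)
POST_WITH_BIAS = ppv_with_bias(R, BETA, u=0.10)

if __name__ == "__main__":
    print(f"pre-study probability : {PRIOR:.2e}")
    print(f"post-study, no bias   : {POST_NO_BIAS:.2e}")
    print(f"fold increase         : {POST_NO_BIAS / PRIOR:.1f}")
    print(f"post-study, u = 0.10  : {POST_WITH_BIAS:.2e}")
```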
In the same line of thinking, if the true effect sizes are very small in a scientific field, this field is likely to be plagued by almost ubiquitous false positive claims. For example, if the majority of true genetic or nutritional determinants of complex diseases confer relative risks less than 1.05, genetic or nutritional epidemiology would be largely utopian endeavors. Corollary 3: The greater the number and the lesser the selection of tested relationships in a scientific field, the less likely the research findings are to be true. As shown above, the post-study probability that a finding is true (PPV) depends a lot on the pre-study odds (R). Thus, research findings are more likely true in confirmatory designs, such as large phase III randomized controlled trials, or meta-analyses thereof, than in hypothesis-generating experiments. Fields considered highly informative and creative given the wealth of the assembled and tested information, such as microarrays and other high-throughput discovery-oriented research [4,8,17], should have extremely low PPV. Corollary 4: The greater the flexibility in designs, definitions, outcomes, and analytical modes in a scientific field, the less likely the research findings are to be true. Flexibility increases the potential for transforming what would be “negative” results into “positive” results, i.e., bias, u. For several research designs, e.g., randomized controlled trials [18–20] or meta-analyses [21,22], there have been efforts to standardize their conduct and reporting. Adherence to common standards is likely to increase the proportion of true findings. The same applies to outcomes. True findings may be more common when outcomes are unequivocal and universally agreed (e.g., death) rather than when multifarious outcomes are devised (e.g., scales for schizophrenia outcomes) [23]. 
Similarly, fields that use commonly agreed, stereotyped analytical methods (e.g., Kaplan-Meier plots and the log-rank test) [24] may yield a larger proportion of true findings than fields where analytical methods are still under experimentation (e.g., artificial intelligence methods) and only “best” results are reported. Regardless, even in the most stringent research designs, bias seems to be a major problem. For example, there is strong evidence that selective outcome reporting, with manipulation of the outcomes and analyses reported, is a common problem even for randomized trials [25]. Simply abolishing selective publication would not make this problem go away. Corollary 5: The greater the financial and other interests and prejudices in a scientific field, the less likely the research findings are to be true. Conflicts of interest and prejudice may increase bias, u. Conflicts of interest are very common in biomedical research [26], and typically they are inadequately and sparsely reported [26,27]. Prejudice may not necessarily have financial roots. Scientists in a given field may be prejudiced purely because of their belief in a scientific theory or commitment to their own findings. Many otherwise seemingly independent, university-based studies may be conducted for no other reason than to give physicians and researchers qualifications for promotion or tenure. Such nonfinancial conflicts may also lead to distorted reported results and interpretations. Prestigious investigators may suppress via the peer review process the appearance and dissemination of findings that refute their findings, thus condemning their field to perpetuate false dogma. Empirical evidence on expert opinion shows that it is extremely unreliable [28]. Corollary 6: The hotter a scientific field (with more scientific teams involved), the less likely the research findings are to be true. 
This seemingly paradoxical corollary follows because, as stated above, the PPV of isolated findings decreases when many teams of investigators are involved in the same field. This may explain why we occasionally see major excitement followed rapidly by severe disappointments in fields that draw wide attention. With many teams working on the same field and with massive experimental data being produced, timing is of the essence in beating competition. Thus, each team may prioritize on pursuing and disseminating its most impressive “positive” results. “Negative” results may become attractive for dissemination only if some other team has found a “positive” association on the same question. In that case, it may be attractive to refute a claim made in some prestigious journal. The term Proteus phenomenon has been coined to describe this phenomenon of rapidly alternating extreme research claims and extremely opposite refutations [29]. Empirical evidence suggests that this sequence of extreme opposites is very common in molecular genetics [29]. These corollaries consider each factor separately, but these factors often influence each other. For example, investigators working in fields where true effect sizes are perceived to be small may be more likely to perform large studies than investigators working in fields where true effect sizes are perceived to be large. Or prejudice may prevail in a hot scientific field, further undermining the predictive value of its research findings. Highly prejudiced stakeholders may even create a barrier that aborts efforts at obtaining and disseminating opposing results. Conversely, the fact that a field is hot or has strong invested interests may sometimes promote larger studies and improved standards of research, enhancing the predictive value of its research findings. 
Or massive discovery-oriented testing may result in such a large yield of significant relationships that investigators have enough to report and search further and thus refrain from data dredging and manipulation. Most Research Findings Are False for Most Research Designs and for Most Fields In the described framework, a PPV exceeding 50% is quite difficult to get. Table 4 provides the results of simulations using the formulas developed for the influence of power, ratio of true to non-true relationships, and bias, for various types of situations that may be characteristic of specific study designs and settings. A finding from a well-conducted, adequately powered randomized controlled trial starting with a 50% pre-study chance that the intervention is effective is eventually true about 85% of the time. A fairly similar performance is expected of a confirmatory meta-analysis of good-quality randomized trials: potential bias probably increases, but power and pre-test chances are higher compared to a single randomized trial. Conversely, a meta-analytic finding from inconclusive studies where pooling is used to “correct” the low power of single studies, is probably false if R ≤ 1:3. Research findings from underpowered, early-phase clinical trials would be true about one in four times, or even less frequently if bias is present. Epidemiological studies of an exploratory nature perform even worse, especially when underpowered, but even well-powered epidemiological studies may have only a one in five chance being true, if R = 1:10. Finally, in discovery-oriented research with massive testing, where tested relationships exceed true ones 1,000-fold (e.g., 30,000 genes tested, of which 30 may be the true culprits) [30,31], PPV for each claimed relationship is extremely low, even with considerable standardization of laboratory and statistical methods, outcomes, and reporting thereof to minimize bias. 
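The kind of scenario comparison described above can be sketched with the bias formula from the text. Table 4 itself is not reproduced here, so the power and bias values below are assumptions chosen to be plausible for each design, as is R = 1:5 for the early-phase trial; only the other pre-study odds and the approximate PPVs ("about 85%", "one in four", "one in five", "probably false") come from the passage.

```python
def ppv_with_bias(R, power, u, alpha=0.05):
    """PPV with bias u; power is 1 - beta, so beta = 1 - power."""
    beta = 1 - power
    numerator = (1 - beta) * R + u * beta * R
    denominator = R + alpha - beta * R + u - u * alpha + u * beta * R
    return numerator / denominator


# (design, assumed power, pre-study odds R, assumed bias u)
SCENARIOS = [
    ("adequately powered RCT", 0.80, 1 / 1, 0.10),
    ("meta-analysis of small inconclusive studies", 0.80, 1 / 3, 0.40),
    ("underpowered early-phase trial", 0.20, 1 / 5, 0.20),
    ("well-powered exploratory epidemiological study", 0.80, 1 / 10, 0.30),
]

PPV_BY_SCENARIO = {
    name: ppv_with_bias(R, power, u) for name, power, R, u in SCENARIOS
}

if __name__ == "__main__":
    for name, value in PPV_BY_SCENARIO.items():
        print(f"{name:48s} PPV = {value:.2f}")
```

Under these assumed inputs the sketch lands close to the figures quoted in the passage, which suggests the quoted results are sensitive mainly to the pre-study odds and the level of bias.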
Claimed Research Findings May Often Be Simply Accurate Measures of the Prevailing Bias As shown, the majority of modern biomedical research is operating in areas with very low pre- and post-study probability for true findings. Let us suppose that in a research field there are no true findings at all to be discovered. History of science teaches us that scientific endeavor has often in the past wasted effort in fields with absolutely no yield of true scientific information, at least based on our current understanding. In such a “null field,” one would ideally expect all observed effect sizes to vary by chance around the null in the absence of bias. The extent that observed findings deviate from what is expected by chance alone would be simply a pure measure of the prevailing bias. For example, let us suppose that no nutrients or dietary patterns are actually important determinants for the risk of developing a specific tumor. Let us also suppose that the scientific literature has examined 60 nutrients and claims all of them to be related to the risk of developing this tumor with relative risks in the range of 1.2 to 1.4 for the comparison of the upper to lower intake tertiles. Then the claimed effect sizes are simply measuring nothing else but the net bias that has been involved in the generation of this scientific literature. Claimed effect sizes are in fact the most accurate estimates of the net bias. It even follows that between “null fields,” the fields that claim stronger effects (often with accompanying claims of medical or public health importance) are simply those that have sustained the worst biases. For fields with very low PPV, the few true relationships would not distort this overall picture much. Even if a few relationships are true, the shape of the distribution of the observed effects would still yield a clear measure of the biases involved in the field. This concept totally reverses the way we view scientific results. 
Traditionally, investigators have viewed large and highly significant effects with excitement, as signs of important discoveries. Too large and too highly significant effects may actually be more likely to be signs of large bias in most fields of modern research. They should lead investigators to careful critical thinking about what might have gone wrong with their data, analyses, and results. Of course, investigators working in any field are likely to resist accepting that the whole field in which they have spent their careers is a “null field.” However, other lines of evidence, or advances in technology and experimentation, may lead eventually to the dismantling of a scientific field. Obtaining measures of the net bias in one field may also be useful for obtaining insight into what might be the range of bias operating in other fields where similar analytical methods, technologies, and conflicts may be operating. How Can We Improve the Situation? Is it unavoidable that most research findings are false, or can we improve the situation? A major problem is that it is impossible to know with 100% certainty what the truth is in any research question. In this regard, the pure “gold” standard is unattainable. However, there are several approaches to improve the post-study probability. Better powered evidence, e.g., large studies or low-bias meta-analyses, may help, as it comes closer to the unknown “gold” standard. However, large studies may still have biases and these should be acknowledged and avoided. Moreover, large-scale evidence is impossible to obtain for all of the millions and trillions of research questions posed in current research. Large-scale evidence should be targeted for research questions where the pre-study probability is already considerably high, so that a significant research finding will lead to a post-test probability that would be considered quite definitive. 
Large-scale evidence is also particularly indicated when it can test major concepts rather than narrow, specific questions. A negative finding can then refute not only a specific proposed claim, but a whole field or considerable portion thereof. Selecting the performance of large-scale studies based on narrow-minded criteria, such as the marketing promotion of a specific drug, is largely wasted research. Moreover, one should be cautious that extremely large studies may be more likely to find a formally statistically significant difference for a trivial effect that is not really meaningfully different from the null [32–34]. Second, most research questions are addressed by many teams, and it is misleading to emphasize the statistically significant findings of any single team. What matters is the totality of the evidence. Diminishing bias through enhanced research standards and curtailing of prejudices may also help. However, this may require a change in scientific mentality that might be difficult to achieve. In some research designs, efforts may also be more successful with upfront registration of studies, e.g., randomized trials [35]. Registration would pose a challenge for hypothesis-generating research. Some kind of registration or networking of data collections or investigators within fields may be more feasible than registration of each and every hypothesis-generating experiment. Regardless, even if we do not see a great deal of progress with registration of studies in other fields, the principles of developing and adhering to a protocol could be more widely borrowed from randomized controlled trials. Finally, instead of chasing statistical significance, we should improve our understanding of the range of R values, the pre-study odds, where research efforts operate [10]. Before running an experiment, investigators should consider what they believe the chances are that they are testing a true rather than a non-true relationship. 
Speculated high R values may sometimes then be ascertained. As described above, whenever ethically acceptable, large studies with minimal bias should be performed on research findings that are considered relatively established, to see how often they are indeed confirmed. I suspect several established “classics” will fail the test [36]. Nevertheless, most new discoveries will continue to stem from hypothesis-generating research with low or very low pre-study odds. We should then acknowledge that statistical significance testing in the report of a single study gives only a partial picture, without knowing how much testing has been done outside the report and in the relevant field at large. Despite a large statistical literature for multiple testing corrections [37], usually it is impossible to decipher how much data dredging by the reporting authors or other research teams has preceded a reported research finding. Even if determining this were feasible, this would not inform us about the pre-study odds. Thus, it is unavoidable that one should make approximate assumptions on how many relationships are expected to be true among those probed across the relevant research fields and research designs. The wider field may yield some guidance for estimating this probability for the isolated research project. Experiences from biases detected in other neighboring fields would also be useful to draw upon. Even though these assumptions would be considerably subjective, they would still be very useful in interpreting research claims and putting them in context.

                Author and article information

                Journal
                Psychiatry Res
                Psychiatry Research
                Elsevier B.V.
                0165-1781
                1872-7123
                13 May 2021
                July 2021
                13 May 2021
                : 301
                : 113998
                Affiliations
                [a ]Department of Psychology and Neuroscience, Dalhousie University, 1355 Oxford Street, PO Box 15000, Halifax, NS, Canada B3H 4R2
                [b ]Department of Psychology, University of British Columbia, Vancouver, British Columbia, Canada V6T 1Z4
                [c ]Department of Psychiatry, Dalhousie University, 8th floor, Abbie J. Lane Building, 5909 Veterans’ Memorial Lane, Halifax, Nova Scotia, Canada B3H 2E2
                Author notes
                [* ]Corresponding author.
                Article
                S0165-1781(21)00295-X 113998
                10.1016/j.psychres.2021.113998
                9225823
                34022657
                © 2021 Elsevier B.V. All rights reserved.

                Since January 2020 Elsevier has created a COVID-19 resource centre with free information in English and Mandarin on the novel coronavirus COVID-19. The COVID-19 resource centre is hosted on Elsevier Connect, the company's public news and information website. Elsevier hereby grants permission to make all its COVID-19-related research that is available on the COVID-19 resource centre - including this research content - immediately available in PubMed Central and other publicly funded repositories, such as the WHO COVID database with rights for unrestricted research re-use and analyses in any form or by any means with acknowledgement of the original source. These permissions are granted for free by Elsevier for as long as the COVID-19 resource centre remains active.

                History
                : 7 February 2021
                : 8 May 2021
                Categories
                Article

                Clinical Psychology & Psychiatry
                suicide ideation,suicide attempts,self-harm,suicide prevention
