The diagnostic accuracy of the Edinburgh Postnatal Depression Scale without the self-harm item: Does culture matter?

To the Editor: We read with keen interest the recent article by Chen et al. (2023), in which the authors evaluated the performance of the Edinburgh Postnatal Depression Scale (EPDS) without the self-harm item (EPDS-9) compared with the complete scale (EPDS-10) in identifying depression among people who are pregnant or postpartum. The authors concluded that the shortened EPDS-9 performs as well as the EPDS-10 and suggested it as a potential replacement for the full-length EPDS. Our research partially supports the findings of Chen et al. (2023). Our study sample comprised 1153 pregnant women and 309 postpartum women enrolled from 11 healthcare centers located throughout Italy (masked citation). The characteristics of the participants are detailed in a separate publication (masked citation). Trained psychologists used unstructured clinical interviews and the patient-rated Patient Health Questionnaire-9 (PHQ-9) and EPDS questionnaires to evaluate participants’ depression. We found a correlation of .998 between EPDS-9 and EPDS-10 in both the antepartum and postpartum groups. Only 1% of the participants screened negative on the EPDS-9 at a cutoff of <10 but had a non-zero score on EPDS item 10; the figure was 2% at a cutoff of <13. Furthermore, EPDS-9 demonstrated excellent accuracy in reproducing EPDS-10-based depression screening results in both perinatal groups at each of the four commonly used cutoff scores (Levis et al., 2020; Quip et al., 2023). We used the PHQ-9 as a criterion to compare the performance of the EPDS-9 versus EPDS-10, using a cutoff value of 13 (which is indicated as the most appropriate for the detection of major depression in perinatal people [Levis et al., 2019]). EPDS-9 and EPDS-10 demonstrated comparable sensitivity, specificity, and area under the curve (AUC).
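
As a rough illustration of how sensitivity, specificity, and a per-cutoff AUC can be computed for a screener dichotomized at a given cutoff, consider the Python sketch below. It assumes that a respondent screens positive when the score is at or above the cutoff, and it takes the AUC of the dichotomized test to be the mean of sensitivity and specificity. The letter does not state how its AUC values were derived, so that choice is an assumption. The function name `screen_metrics`, the scores, and the criterion labels are hypothetical toy values, not study data.

```python
def screen_metrics(scores, criterion, cutoff):
    """Return (sensitivity, specificity, auc) for the rule 'score >= cutoff'.

    scores    -- screener totals (e.g. EPDS-9 or EPDS-10), one per respondent
    criterion -- booleans, True when the reference measure is positive
    cutoff    -- a respondent is flagged when score >= cutoff (assumed convention)
    """
    tp = fn = tn = fp = 0
    for score, positive in zip(scores, criterion):
        flagged = score >= cutoff
        if positive and flagged:
            tp += 1
        elif positive:
            fn += 1
        elif flagged:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("criterion must contain both positive and negative cases")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # AUC of a single-threshold (binary) test: mean of sensitivity and specificity.
    return sensitivity, specificity, (sensitivity + specificity) / 2


# Hypothetical toy data: eight respondents, four criterion-positive.
scores = [3, 7, 9, 10, 11, 12, 14, 16]
criterion = [False, False, False, False, True, True, True, True]

for cutoff in (10, 13):
    print(cutoff, screen_metrics(scores, criterion, cutoff))
```

In this toy data, raising the cutoff from 10 to 13 lowers sensitivity and raises specificity, which is the general pattern the letter describes for the antepartum group.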
In the antepartum group, both the EPDS-9 and EPDS-10 (a) show declining sensitivity with increasing cutoff values, (b) have high specificity across all cutoff values, and (c) have AUC values that suggest they perform reasonably well, though their performance declines with increasing cutoff values. Comparison of AUC values between EPDS-9 and EPDS-10 suggests that there are no significant differences in performance between the two versions of the EPDS at cutoff values of 10, 11, and 13. However, there appears to be a significant difference in performance at a cutoff value of 12, with the EPDS-10 performing better. In the postpartum group, although the AUC remains relatively high for both EPDS-9 and EPDS-10 across all cutoff values, the equivalence tests showed a statistically significant difference at all cutoff values (see Table 1), indicating a significant difference in overall test performance. Specifically, the EPDS-10 outperforms the EPDS-9 at all cutoff values. We also examined the predictive potential of EPDS-9 for responses to the EPDS self-harm item (item 10). The AUC of EPDS-9 against self-harm responses varied depending on the frequency level, which could be an area for further study. Specifically, EPDS-9’s AUC against self-harm above the frequency of “hardly” ranged from 0.716 to 0.826, except for a cutoff of 13 in the antepartum group, where it dropped to 0.288. This decrease in AUC at the cutoff of 13 suggests that EPDS-9’s ability to predict self-harm responses decreases when this more conservative threshold is used. The AUC against self-harm above the frequencies of “sometimes” and “often” ranged, respectively, from 0.712 to 0.826 and from 0.445 to 0.675. These variations emphasize the importance of considering frequency when examining self-harm predictions. Table 1 shows the sensitivity, specificity, and AUC for each cutoff value. Our study yields two main findings that support those of Chen et al.
First, EPDS-10 and EPDS-9 are strongly correlated. Second, EPDS-9 exhibited sensitivity and specificity similar to those of the full EPDS in screening for major depression among pregnant and postpartum women across the most commonly used cutoff points. However, unlike in the Japanese sample of Chen et al., EPDS-9 did not predict the responses of Italian participants to the self-harm item as accurately. We found this discrepancy when comparing the differentiation performance of EPDS-9 versus EPDS-10 using the PHQ-9 as a criterion. The discrepancy is likely due to the use of different criterion instruments, although the Kessler Psychological Distress Scale (K6; used by Chen et al.) and the PHQ-9 have shown a strong correlation (Cotton et al., 2021). It is important to remember here that the EPDS was originally developed in English (Cox et al., 1987). Consequently, both our study and that of Chen et al. employed translated versions of the scale. Although both the Japanese and the Italian translations have been validated (Benvenuti et al., 1999; Okano et al., 1996), have been shown to be reliable and valid measures of perinatal depression (Kubota et al., 2018; Stefana et al., 2023), and have demonstrated a similar factor structure that includes aspects of anxiety and anhedonia (Kubota et al., 2014; Mirabella et al., 2024), the translation process may contribute to some of the inconsistencies in the data. This highlights a critical issue: the necessity of establishing cross-cultural validity for psychological inventories. Cultural variations in the subjective experience and expression of affective disorders must be taken into account in clinical assessment (Kirmayer & Groleau, 2001). They may significantly shape the manifestation of depression symptomatology and affect the openness to answer questions about self-harm, as suggested by numerous studies.
Mental health issues such as depression can present differently in various cultures due to differences in social norms, belief systems, and levels of stigma associated with mental health (Kleinman & Good, 1985). In some societies, such as Chinese society, psychological symptoms may be expressed more somatically, which may influence the detection of depressive symptoms through tools such as the EPDS (Ryder et al., 2008). Concerning self-harm and suicidal ideation, cultural factors can significantly influence the willingness to disclose such experiences. For example, some cultures may have high levels of stigma associated with mental health conditions or self-harm behaviors, making individuals less likely to report these experiences openly (Chu et al., 2010). Additionally, cultures that prioritize collective identity over individualism, such as Chinese culture, may see a higher level of self-stigma, resulting in a lower level of openness about mental health struggles, including self-harm (Yang et al., 2007). Therefore, it is crucial to keep cultural factors in mind when interpreting the effectiveness of measures such as EPDS-9 and EPDS-10 in different cultures and perinatal populations (pregnant versus postpartum people). The difference between Chen et al.’s sample and ours in the predictive accuracy of EPDS-9 for self-harm responses underscores the need for culturally sensitive approaches in the detection of depression. More research is needed to understand the specific cultural factors at play in the various phases of the perinatal period and to adapt the instruments accordingly to improve their validity and reliability. Lastly, Chen et al.’s suggestion to omit the self-harm item in order to avoid confusion and potential psychological distress for respondents should be considered with caution.
Overreliance on item 10 can certainly strain resources because of mandatory follow-up assessments, but when psychological assessment is done well it is always therapeutic to some degree. Furthermore, as explained above, in certain cultures (and, more generally, in certain people) a score of 0 on item 10 does not mean zero risk of suicide. In conclusion, although EPDS-9 performs similarly to EPDS-10 in screening for major depression, we recommend the use of the full EPDS. The difference in predictive accuracy between population samples highlights the need for future research to further validate EPDS-9 in specific cultures and perinatal populations.
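
The two descriptive checks reported in the letter, the correlation between EPDS-9 and EPDS-10 totals and the share of respondents who fall below an EPDS-9 cutoff while endorsing item 10, can be sketched in Python as follows. This is a minimal sketch under stated assumptions: items are scored 0 to 3, item 10 is the last entry of each response, and the correlation is Pearson's r (the letter does not name the coefficient). The helper names and the five toy responses are hypothetical and are not study data.

```python
from math import sqrt


def epds_totals(items):
    """Return (EPDS-9 total, EPDS-10 total) for ten item scores in 0..3."""
    if len(items) != 10 or any(not 0 <= s <= 3 for s in items):
        raise ValueError("expected ten item scores in the range 0..3")
    total10 = sum(items)
    return total10 - items[9], total10  # items[9] is item 10 (self-harm)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def missed_item10_share(responses, cutoff):
    """Share of respondents with EPDS-9 < cutoff and a non-zero item 10."""
    missed = sum(
        1 for r in responses if epds_totals(r)[0] < cutoff and r[9] > 0
    )
    return missed / len(responses)


# Hypothetical toy responses (ten item scores each).
responses = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 1, 1, 0],
    [2, 2, 2, 1, 1, 1, 1, 1, 1, 1],
    [1, 0, 1, 0, 1, 0, 1, 0, 1, 1],
    [3, 3, 2, 2, 2, 2, 2, 1, 1, 2],
]

totals = [epds_totals(r) for r in responses]
r = pearson([t[0] for t in totals], [t[1] for t in totals])
print(round(r, 3))
print(missed_item10_share(responses, 10), missed_item10_share(responses, 13))
```

Because the EPDS-9 total is the EPDS-10 total minus a single item, a very high correlation between the two is expected by construction. The discordance share is the more informative quantity for judging what dropping item 10 would miss.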

Stefana Alberto, Cena Loredana, Trainini Alice, Palumbo Gabriella, Gigantesco Antonella, Mirabella Fiorino. The diagnostic accuracy of the Edinburgh Postnatal Depression Scale without the self-harm item: Does culture matter? Journal of Psychiatric Research.

Cena L: Project Administration
2024-01-01

Files in this record:
2024 STEFANA et al 2024 The diagnostic accuracy of the EPDS without the self-harm item.pdf
Access: open access
Description: Journal article
Type: Full Text
License: PUBLIC - Public with Copyright
Size: 317.55 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11379/600345