Average rating: | Rated 3 of 5. |
Level of importance: | Rated 2 of 5. |
Level of validity: | Rated 3 of 5. |
Level of completeness: | Rated 3 of 5. |
Level of comprehensibility: | Rated 3 of 5. |
Competing interests: | None |
The paper has two goals. The first is to replicate a previous finding using the same dataset but a different machine learning model. The previous finding was that “perceived loneliness”, among 12 mental health indicators, is most related to time into a COVID lockdown in the UK. The second goal is to confirm a U-shaped relationship between perceived loneliness and weeks into the lockdown, using a different dataset from the second national lockdown.
My biggest concern is that there is little discussion of effect size. We only know the MSE of the overall model and that “perceived loneliness” is relatively more related to time into lockdown than the other variables (but not by how much). The authors mentioned in their previous paper (CITE) that the overall performance is bad, with which I’d agree even without comparing the MSE or R² with other similar machine learning tasks. Therefore, among a collection of highly correlated mental health variables that together are only weakly related to time into a lockdown, does it really matter that we identify the one that’s slightly more related to time? I’d like to see more justification of why this analysis is meaningful, taking effect sizes into account.
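To illustrate the kind of effect-size reporting I have in mind, here is a minimal sketch that converts an MSE into R² by comparing it with the variance of the target. It assumes the target is weeks into lockdown and that the MSE comes from held-out predictions; the week values and the MSE below are hypothetical placeholders, not numbers from the paper.

```python
def population_variance(values):
    """Variance of the target, i.e. the MSE of always predicting the mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def r_squared_from_mse(mse, target_variance):
    """R^2 = 1 - MSE / Var(y): share of target variance the model explains."""
    if target_variance <= 0:
        raise ValueError("target_variance must be positive")
    return 1.0 - mse / target_variance


# Hypothetical example: targets spread evenly over weeks 1 to 8,
# and a hypothetical held-out MSE of 5.0 (in squared weeks).
hypothetical_weeks = [1, 2, 3, 4, 5, 6, 7, 8]
hypothetical_mse = 5.0

baseline = population_variance(hypothetical_weeks)
r2 = r_squared_from_mse(hypothetical_mse, baseline)
print(f"baseline variance = {baseline:.2f}, R^2 = {r2:.3f}")
```

Reporting the MSE next to this baseline would let readers judge how much better the model does than simply predicting the mean week.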
Assuming the purpose of the analysis is justified, I now turn to the mechanics of the machine learning task. The analysis is based on a sample of 435 participants, which is admittedly large for a longitudinal study but small for a machine learning task. The authors are right about the need to replicate the effect using a different model given the small sample. Going down that route, I’d recommend going as far as replicating it using multiple models beyond the SVR to see if they agree. Having said that, I’d argue it’s more important to replicate the finding across different data sources than with a different model. I hope the authors can search for other longitudinal data sources with similar variables and replicate the findings. At the very least, if no other suitable data source exists for this question, the paper should say so and present the finding based on this one dataset as preliminary.
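One possible way to quantify whether models "agree" is the rank correlation between their variable-importance scores. The sketch below is only an illustration under that assumption: the indicator names other than perceived loneliness, the model labels, and all importance values are hypothetical, and the authors may prefer a different agreement measure.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical importance scores from two models for the same indicators.
indicators = ["perceived_loneliness", "indicator_b", "indicator_c", "indicator_d"]
importance_model_1 = [0.30, 0.20, 0.10, 0.05]
importance_model_2 = [0.25, 0.10, 0.15, 0.02]

rho = spearman(importance_model_1, importance_model_2)
print(f"rank agreement between models: {rho:.2f}")
```

A high rank correlation across several models would support the claim that the ordering of indicators is not an artifact of one particular learner.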
If I am reading Table 2 correctly, the sample size for the second analysis seems incredibly small (5 participants from week 3, and 2, 3, and 1 participants from weeks 4, 5, and 6, respectively). A week-by-week comparison would not be meaningful at all with such a small sample. Hence the data from the second wave are not suitable for confirming or rejecting the U-shape finding from the first wave.
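To make the sample-size concern concrete, here is a rough sketch of how wide a 95% confidence interval for a weekly mean would be at these group sizes, expressed in units of the sample standard deviation. It assumes approximately normal scores and uses the standard t-interval (mean ± t · s / √n); the group sizes are my reading of Table 2, and the function name is my own.

```python
import math

# Two-sided 95% critical values of Student's t, indexed by degrees of freedom.
T_CRIT_95 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776}


def ci_half_width_in_sd(n):
    """Half-width of a 95% t-interval for a mean, in units of the sample SD.

    Returns None when n < 2, because the SD cannot be estimated from one point.
    """
    if n < 2:
        return None
    return T_CRIT_95[n - 1] / math.sqrt(n)


# Group sizes as I read them from Table 2 (weeks 3 to 6 of the second wave).
for week, n in [(3, 5), (4, 2), (5, 3), (6, 1)]:
    half_width = ci_half_width_in_sd(n)
    if half_width is None:
        print(f"week {week}: n={n}, interval undefined")
    else:
        print(f"week {week}: n={n}, mean +/- {half_width:.2f} SD")
```

Even the largest group yields an interval wider than one standard deviation on each side, and the week with a single participant supports no interval at all, so the weekly means cannot distinguish a U-shape from a flat or monotonic trend.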