
    Review of 'Public Opinion Analysis on Social Media Platforms: A Case Study of High Speed 2 (HS2) Rail Infrastructure Project'

    This is a well-written study with a good approach, but it could be augmented with more SOTA methods.
    Average rating:
        Rated 4 of 5.
    Level of importance:
        Rated 4 of 5.
    Level of validity:
        Rated 4 of 5.
    Level of completeness:
        Rated 4 of 5.
    Level of comprehensibility:
        Rated 4 of 5.
    Competing interests:

    Reviewed article


    Public Opinion Analysis on Social Media Platforms: A Case Study of High Speed 2 (HS2) Rail Infrastructure Project

    Abstract: Public opinion evaluation is becoming increasingly significant in infrastructure project assessment. The inefficiencies of conventional evaluation approaches can be improved with social media analysis. Posts about infrastructure projects on social media provide a large amount of data for assessing public opinion. This study proposed a public opinion evaluation framework with machine learning algorithms, including sentiment analysis and topic modelling. We selected the United Kingdom railway project, High Speed 2, as the case study. The sentiment analysis showed that around 53% to 63% of tweets expressed a negative sentiment, suggesting the public may have an overall negative perception of the project. Topic modelling with text corpora showed key topics of public opinion. The proposed framework demonstrates the feasibility of using supervised machine learning to evaluate public opinion on infrastructure projects, as the framework can save time and cost. Furthermore, assessment results can aid policymakers and managers in decision-making.

      Review information

      This work has been published open access under Creative Commons Attribution License CC BY 4.0, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Conditions, terms of use and publishing policy can be found at www.scienceopen.com.

      Engineering, Civil engineering
      Public opinion evaluation, Sentiment analysis, Transport, Policy and law, Environmental policy and practice, Machine learning, Civil infrastructure projects, Topic modelling, Sustainability

      Review text

      In this study, the authors successfully analyse public opinion on a public infrastructure project (the High Speed 2 railway project in the United Kingdom) using Twitter data and machine learning algorithms. The approach is sound, but the methodology would benefit from adopting SOTA models, benchmarking against simple baselines, and handling potentially imbalanced datasets. Details are provided below:

      1. Deep learning techniques, especially those that employ transformer architectures, are the current SOTA. While methods like Naive Bayes, SVMs and LDA are still very useful, it would be prudent to compare against results from transformer-based deep learning architectures. Transformer-family neural networks fine-tuned for specific tasks such as classification have proven a very promising research direction in recent years, and some models, like twitter-roberta-base-sentiment, can be used out of the box. Since these architectures transform text into numerical embeddings that preserve semantic context to a degree, they also reduce the amount of pre-processing required for tweets (such as stemming and stop-word removal).
      2. Another good tool for establishing baselines is the VADER sentiment analysis model, which was developed specifically for social media use cases. At worst it serves as a reasonable baseline, since it requires no training.
      3. Steps should be taken to address dataset imbalance; if any such steps were taken, they were not stated. Imbalance can cause issues for a classifier, such as overfitting to a label with overwhelmingly higher representation. The F1 score is a good metric for catching this, but it may be better still to train on a balanced dataset.
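      As a concrete sketch of point 1, the off-the-shelf model mentioned above can be loaded through the Hugging Face `transformers` pipeline. This is illustrative only: the hub identifier `cardiffnlp/twitter-roberta-base-sentiment` and its LABEL_0/1/2 output convention are taken from that model's hub card, the example tweet is invented, and model weights are downloaded on first use.

      ```python
      from transformers import pipeline

      # Off-the-shelf Twitter sentiment model (cardiffnlp's hub release
      # of twitter-roberta-base-sentiment); weights download on first use.
      classifier = pipeline(
          "sentiment-analysis",
          model="cardiffnlp/twitter-roberta-base-sentiment",
      )

      # Raw tweet text can be passed in with little pre-processing:
      # no stemming or stop-word removal is needed.
      # (Invented example tweet, not from the study's dataset.)
      result = classifier("Another HS2 cost overrun, what a waste of money")[0]

      # This model reports LABEL_0 = negative, LABEL_1 = neutral,
      # LABEL_2 = positive, with a confidence score in [0, 1].
      print(result["label"], round(result["score"], 3))
      ```

      Because the model was pre-trained on tweets, this gives a directly comparable sentiment label per tweet with no task-specific training.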
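      The VADER baseline from point 2 requires no training at all; a minimal sketch using the pip-installable `vaderSentiment` package (the example tweet is invented for illustration):

      ```python
      from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

      # VADER: a lexicon- and rule-based model tuned for social media text.
      analyzer = SentimentIntensityAnalyzer()

      # Hypothetical tweet for illustration, not from the study's dataset.
      scores = analyzer.polarity_scores("HS2 is a disaster and a waste of money")

      # 'compound' is a normalised score in [-1, 1]; a common convention is
      # compound <= -0.05 -> negative, >= 0.05 -> positive, else neutral.
      print(scores["compound"])
      ```

      Since it is deterministic and training-free, VADER makes a cheap sanity check against which any trained classifier should show a clear improvement.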
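      One simple way to implement point 3 is to upsample the minority class before training, sketched here with scikit-learn's `resample` on placeholder data (the 8:2 negative/positive split is invented for illustration, not the paper's actual distribution):

      ```python
      from collections import Counter
      from sklearn.utils import resample

      # Placeholder corpus: 8 negative vs 2 positive tweets (hypothetical).
      data = [("neg tweet %d" % i, "negative") for i in range(8)] \
           + [("pos tweet %d" % i, "positive") for i in range(2)]

      majority = [d for d in data if d[1] == "negative"]
      minority = [d for d in data if d[1] == "positive"]

      # Upsample the minority class (sampling with replacement) so both
      # classes contribute equally to training.
      minority_up = resample(minority, replace=True,
                             n_samples=len(majority), random_state=42)

      balanced = majority + minority_up
      print(Counter(label for _, label in balanced))
      ```

      An alternative that avoids altering the data is a class-weighted loss (e.g. `class_weight='balanced'` in scikit-learn classifiers), evaluated with a macro-averaged F1 score so the minority class is not masked by accuracy.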

