
      Journal of Disability Research (JDR), published by the King Salman Center for Disability Research (KSCDR).


      Autism Spectrum Disorder Prediction in Children Using Machine Learning

      Published
      research-article

            Abstract

            Symptoms associated with autism spectrum disorder (ASD) typically manifest during childhood and persist into adolescence and adulthood. ASD, which can be caused by genetic or environmental factors, can be significantly improved through early detection and treatment. Currently, standardized clinical tests are the primary diagnostic method for ASD; however, these tests are time-consuming and expensive. Early detection and intervention are pivotal in enhancing the long-term prospects of children diagnosed with ASD. Machine-learning (ML) techniques are being utilized alongside conventional methods to improve the accuracy and efficiency of ASD diagnosis. This paper therefore explores the feasibility of employing support vector machine, random forest classifier, naïve Bayes, logistic regression (LR), K-nearest neighbor, and decision tree classification models on our dataset to construct predictive models for predicting and analyzing ASD across different age groups: children, adolescents, and adults. The proposed techniques are assessed using publicly available nonclinical ASD datasets. The four ASD datasets, namely toddlers, adolescents, children, and adults, were obtained from publicly available repositories, specifically Kaggle and the UCI ML repository. These repositories provide a valuable data source for research and analysis related to ASD. Our main objective is to identify susceptibility to ASD in children during the early stages, thereby streamlining the diagnosis process. Based on our findings, LR demonstrated the highest accuracy for the selected dataset.


            INTRODUCTION

            Autism spectrum disorder (ASD) is a neurodevelopmental condition affecting a child’s communication, social interaction, and knowledge acquisition, typically presenting within the first 2 years of life (Frith and Happé, 2005). People with autism face various obstacles, including difficulty with focus, learning disabilities, mental health issues such as anxiety and depression, movement and sensory issues, and other challenges (Tripathy et al., 2021). As a result, the condition impacts an individual’s entire cognitive, social, emotional, and physical health (Omar et al., 2019; Alenizi and Al-Karawi, 2023a). The symptoms of this condition vary in extent and intensity and include communication difficulties, obsessive hobbies, and repeated mannerisms in social situations. A comprehensive examination is needed to detect ASD, comprising a thorough evaluation and a range of assessments performed by child psychologists and other qualified professionals (Bastiaansen et al., 2011; Alenizi and Al-Karawi, 2023b, c). Autism is a rapidly growing global condition, affecting approximately one child out of every 160, according to the World Health Organization (Suhas et al., 2021; Al-Karawi, 2023). ASD affects social interaction and communication abilities, and some individuals require 24-h care and assistance (Vaishali and Sasikala, 2018; Thabtah, 2019). Individuals with ASD often experience lifelong challenges in these areas. ASD, a condition characterized by persistent symptoms, is believed to be caused by a combination of genetic and environmental factors; there is no known cure, but early detection can help manage its effects. Genes, environmental factors, and risk factors such as low birth weight, having a sibling with ASD, and older parents can influence a person’s development. Early diagnosis of autism can be quite beneficial because it allows doctors to provide patients with the appropriate treatment at an earlier stage. It can potentially halt any further deterioration of the patient’s condition and would help to cut down the expenditures associated with delayed diagnosis over the long term. Therefore, there is a significant need for a screening instrument that is time efficient, accurate, and simple. Such an instrument would predict autistic symptoms in an individual and determine whether or not that individual requires a thorough autism examination (Lakhan et al., 2020; Alenizi and Al-Karawi, 2022). Early detection and intervention are crucial for mitigating ASD symptoms and improving quality of life. Observation is the primary method, with parents, teachers, and special education teams identifying potential symptoms. Children showing such symptoms should be brought to healthcare providers for further testing; identifying ASD symptoms in adults can be more challenging, while behavioral changes in children can be recognized as early as 6 months (Al-Karawi and Ahmed, 2021; Alenizi and Al-Karawi, 2023c). This study aims to develop a platform for accurately predicting autistic characteristics in individuals of any age, using machine-learning (ML) approaches to aid in early diagnosis and intervention.

            BACKGROUND AND LITERATURE REVIEW

            In their study, Vaishali and Sasikala (2018) proposed a method for identifying ASD using optimized behavior sets. The researchers experimented with an ASD diagnosis dataset containing 21 features from the UCI machine-learning repository. They employed a swarm intelligence-based binary firefly feature selection wrapper to explore the dataset. The researchers tested the hypothesis that a machine-learning model could maintain classification accuracy using minimal feature subsets, finding that only 10 features from the original 21-feature ASD dataset were sufficient. The study found that swarm intelligence-based binary firefly feature selection can achieve accurate ASD diagnosis with fewer features, achieving an average accuracy in the range of 92.12-97.95%, potentially improving efficiency and reducing computational complexity in ASD diagnostic systems. In their study, Thabtah (2017b) introduced an ASD screening model incorporating machine-learning adaptation and Diagnostic and Statistical Manual of Mental Disorders (DSM-5) criteria. Screening tools play a crucial role in achieving various objectives in ASD screening. This paper explores the use of machine learning for ASD classification, highlighting its advantages and disadvantages and the challenges existing tools face in aligning with the DSM-5 manual. In their study, Mythili and Shanavas (2014) researched ASD using classification techniques. The primary objective of their paper was to detect and classify levels of autism. They employed neural networks, support vector machine (SVM), and fuzzy techniques with WEKA tools to analyze students’ behavior and social interaction. In another study, Kosmicki et al. (2015) proposed a method for identifying a minimal set of traits for autism detection. The authors used machine learning to assess ASD clinically using the Autism Diagnostic Observation Schedule (ADOS).
            They identified 98.27% of the 28 behaviors from module 2 and 97.66% from module 3, achieving an overall accuracy of 98.27%. The effectiveness of ML in predicting various diseases based on symptoms is highly noteworthy. For instance, Khan et al. (2017) and Al-Karawi and Ahmed (2021) utilized ML to predict whether a person has diabetes, whereas Cruz and Wishart (2006) attempted to diagnose cancer using ML. The alternating decision tree (ADTree) was used by Wall et al. (2012a) and Alenizi and Al-Karawi (2023b) to shorten the screening process and speed up the identification of ASD features. With data from 891 people, they employed the Autism Diagnostic Interview-Revised (ADI-R) approach. They reached high accuracy, but the test was restricted to people between the ages of 5 and 17, and it could not predict ASD for other age groups (children, adolescents, and adults). Machine learning has been used in several studies in multiple ways to enhance and expedite the diagnosis of ASD. Using a 65-item Social Responsiveness Scale, Duda et al. (2016) used forward feature selection and undersampling to distinguish between autism and attention deficit hyperactivity disorder (ADHD). The metrics of Al-Karawi (2021) and Deshpande et al. (2013) for predicting ASD were based on brain activity. Artificial neural networks (ANN), probabilistic reasoning, and classifier combinations are examples of soft computing approaches that have also been employed (Pratap et al., 2014; Alenizi and Al-Karawi, 2022). Numerous papers have discussed automatic ML models that solely consider characteristics for input features. Several studies also used brain neuroimaging data. Parikh et al. (2019) selected six personal traits from the ABIDE database and used a cross-validation technique to train and test ML models using data from 851 subjects, which were then used to categorize patients with and without ASD.
            Rules of machine learning, which Thabtah and Peebles (2020) introduced, provide users with a knowledge base of rules for comprehending the classification’s fundamental causes and detecting ASD characteristics. Al Banna et al. (2020) developed a system to track and support ASD patients while they deal with the COVID-19 epidemic. The study utilized five machine-learning models to classify participants as having ASD or no ASD based on various parameters such as age, sex, and ethnicity. Each classifier was then analyzed to find the model that performed the best. SVM was utilized by Bone et al. (2016) to apply ML to the same goal, achieving 89.2% sensitivity and 59% specificity. Their study involved 1264 people with ASD and 462 people without ASD features. However, because of the vast age range (4-55 years), their research was not approved as a screening method for all age groups. Achieving more than 90% accuracy, Allison et al. (2012) used the “Red Flags” tool to screen for ASD in both children and adults with the Autism Spectrum Quotient before shortlisting them to the AQ-10. Schankweiler et al. (2023) attempted to identify the relatively more important screening questions of the ADI-R and ADOS screening methods. They found that the ADI-R and ADOS screening tests can work better when they are combined. Thabtah compared previous works on ML algorithms to predict autism traits (Thabtah, 2017b). To identify ASD symptoms in children, such as developmental delay, obesity, and insufficient physical activity, van den Bekerom (2017) utilized multiple ML algorithms, including naïve Bayes (NB), SVM, and random forest, and compared the results. ADTree and the functional tree fared well, with high sensitivity, specificity, and accuracy, according to Wall et al.’s (2012b) study on identifying autism using a short screening test and validation. Heinsfeld et al.
            (2018) used a sizable brain imaging dataset from the Autism Brain Imaging Data Exchange (ABIDE I) to identify ASD patients and obtained a mean classification accuracy of 70%, with accuracy in the range of 66-71%. The random forest classifier’s (RFC) mean accuracy was 63%, compared to the SVM classifier’s mean accuracy of 65%. The study’s reported accuracy, specificity, sensitivity, and AUC were 88.51%. To pinpoint problems with conceptual problem formulation, methodology implementation, and result interpretation, Bone et al. (2015) analyzed the earlier works of Wall et al. (2012b) and Kosmicki et al. (2015). The researchers used machine learning to replicate their findings, but there is no consensus on the best approach for generalizing autism screening tools across different age ranges.

            WORKING MODEL

            This research aims to create a robust machine-learning model for detecting autism in individuals of different ages, ensuring accurate and effective detection. Figure 1 shows our system’s operation and data flow, starting with preliminary data processing: removing noise and outliers, handling missing values, and encoding categorical attributes. We use feature-engineering techniques to reduce dataset dimensionality and improve training speed, and the preprocessed datasets are then used for classification with SVM, decision tree, and RFCs. The system evaluates classifier accuracy using a structured workflow of data preprocessing, feature selection, and classification, identifying the most accurate model for further training and categorization tasks.

            Figure 1:

            The architecture of the proposed system (Alenizi and Al-Karawi, 2023c).

            RESEARCH METHODOLOGY

            The research involved five stages: data collection, data synthesis, prediction model development, evaluation, and application development. Each phase is briefly discussed below.

            Data collection

            The dataset utilized for this research has been acquired from the publicly available UCI Repository. The four ASD datasets, namely, toddlers, adolescents, children, and adults, were obtained from publicly available repositories, specifically Kaggle and UCI ML (Hasan et al., 2022). These repositories provide a valuable data source for research and analysis related to ASD.

            These datasets have 20 common attributes that are used for prediction. These attributes are listed in Table 2.

            Data preprocessing

            Data preparation encompasses all the necessary preprocessing steps before commencing model training, aiming to achieve optimal results (Gopal Krishna Patro and Sahu, 2015). This preparation entails a series of three stages.

            • Data encoding involves transforming a dataset comprising 6 numeric attributes and 13 nominal attributes. To effectively employ various machine-learning algorithms, it is essential to work with real numbers; consequently, all nominal values must be converted into real numbers. A straightforward representation is adopted in this case, wherein the real numbers 1 and 2 are utilized. For instance, the male class is encoded as 1, while the female class is encoded as 2.

            • Dealing with missing values is a crucial step in data handling. A significant portion (48.3%) of the data is missing in the given dataset. Removing the affected samples would render the dataset unusable, reducing it to 155 samples. Hence, it becomes essential to address this issue. A statistical approach is adopted whereby the missing values are replaced with the mean of the values corresponding to each class (Wohlrab and Fürnkranz, 2011). This ensures that the dataset remains intact and usable for further analysis and modeling.

            • Normalization becomes necessary as the dataset exhibits significant variations in the range of values, particularly after the nominal values have been encoded into real numbers (1 and 2). Without normalization, attributes with more extensive numeric ranges can dominate those with smaller ranges, potentially biasing the analysis. Moreover, normalization facilitates faster execution of algorithms by avoiding the utilization of wide-ranging numbers (Deshpande et al., 2013). In this case, the data are scaled to fit within the interval of 0 to 1 following Equation (1), where x represents the original value of the attribute, x_Normalized the scaled value, min_a the minimum value of attribute a, and max_a the maximum value of attribute a.

            (1) x_Normalized = (x − min_a) / (max_a − min_a)
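            The three preprocessing stages described above can be sketched in Python with scikit-learn. This is an illustrative sketch, not the authors' code: the toy column names and values are invented, and it uses a simple global column mean for imputation rather than the per-class mean the text describes.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import MinMaxScaler

# Hypothetical toy data standing in for the ASD screening attributes.
df = pd.DataFrame({
    "sex": ["m", "f", "m", None],
    "age": [4.0, np.nan, 6.0, 5.0],
})

# 1. Data encoding: nominal values become real numbers (male -> 1, female -> 2).
df["sex"] = df["sex"].map({"m": 1, "f": 2})

# 2. Missing values: replaced with the column mean (simplified from per-class mean).
imputed = SimpleImputer(strategy="mean").fit_transform(df)

# 3. Normalization: min-max scaling to [0, 1], i.e. Equation (1).
scaled = MinMaxScaler().fit_transform(imputed)
print(scaled.min(), scaled.max())  # 0.0 1.0
```

Each column now lies in [0, 1], so no attribute dominates the others purely because of its numeric range.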

            Table 1:

            List of ASD datasets (Hasan et al., 2022).

            S. no. | Dataset name | Source | Attribute type | No. of attributes | No. of instances
            1 | ASD screening data for adults | UCI machine-learning repository (Thabtah, 2017b) | Categorical, continuous, and binary | 21 | 704
            2 | ASD screening data for children | UCI machine-learning repository (Thabtah, 2017b) | Categorical, continuous, and binary | 21 | 292
            3 | ASD screening data for adolescents | UCI machine-learning repository (Thabtah, 2017a) | Categorical, continuous, and binary | 21 | 104

            Abbreviation: ASD, autism spectrum disorder.

            Table 2:

            List of attributes in the dataset (Hasan et al., 2022).

            Attribute id | Attribute description
            1 | Patient age
            2 | Sex
            3 | Nationality
            4 | Whether the patient suffered from jaundice at birth
            5 | Whether any family member suffered from pervasive developmental disorders
            6 | Who is completing the test
            7 | The country in which the user lives
            8 | Whether the user has used the screening application before
            9 | Screening test type
            10-19 | Answers to the 10 questions of the screening method
            20 | Screening score

            Selecting the optimal feature subset

            The feature selection block outlines the process of selecting the best subset of features, which is influenced by the chosen algorithm and the desired learning performance. The following steps are followed to accomplish this selection procedure:

            • This study employs adaptive wrapper feature selection, specifically backward elimination (Mao, 2004; Al-Karawi and Mohammed, 2023), to determine the optimal set of features; the results of this process are presented in the paper. Initially, all features related to the chosen algorithm are included. Then, in each iteration, the importance of each feature is evaluated, and the feature with the lowest priority is eliminated. This iterative loop continues until only one feature remains. The process is repeated until a significant decline in diagnostic performance is observed, as discussed in the Results and Discussion section.

            • After selecting the feature subset with optimal performance, as described earlier (initially starting with the full feature set in the first iteration), 10-fold cross-validation is employed to evaluate the discriminant performance (Berrar, 2019). As demonstrated later, all trained models are saved so they can be utilized for diagnosing unseen samples. This process is repeated for each algorithm under consideration. To assess model performance, the results obtained from cross-validation for each feature subset are compared, determining the best-performing model for each specific number of features.

            • As part of this research, the objective is to develop a mobile application for patients or healthcare facilities. Therefore, one of the essential goals is to minimize the number of features, thereby reducing the cost of tests while maximizing accuracy. To achieve this, a procedure is implemented to identify the smallest feature set that yields the most optimal performance across 10 folds. The resulting 10 models (one per fold) are saved for later use during the testing phase. This approach ensures that the application maintains high accuracy while minimizing the required features.
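            As a rough sketch of this backward elimination loop with 10-fold cross-validation, scikit-learn's SequentialFeatureSelector can reproduce the idea on synthetic data. The stopping rule here (a fixed subset size) is a simplification of the performance-decline criterion described above, and the data are a synthetic stand-in for the real ASD features.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for the 20+ screening attributes.
X, y = make_classification(n_samples=200, n_features=20, random_state=0)

selector = SequentialFeatureSelector(
    LogisticRegression(max_iter=1000),
    n_features_to_select=10,   # simplification: stop at a fixed subset size
    direction="backward",      # drop the least useful feature each iteration
    cv=10,                     # score every candidate subset with 10-fold CV
)
selector.fit(X, y)
print(selector.get_support().sum())  # 10
```

At each iteration the selector removes the feature whose removal hurts the cross-validated score least, mirroring the "eliminate the lowest-priority feature" step in the text.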

            Training framework architecture

            As mentioned earlier, previous studies have predominantly focused on selecting features independently of the training model. In traditional classification systems, a feature selection technique is often applied, and the selected features are then used across all algorithms to classify diseases. However, this approach can lead to varying performance for each model, depending on the algorithm used and the representation of the selected features. Specific algorithms may underperform because the chosen features may not be the most suitable for that particular algorithm. To address this feature selection challenge, this subsection proposes and justifies a stand-alone platform for diagnosing ASD. The platform encompasses the training framework architecture, the testing framework architecture, and a real-time diagnosis platform. Figure 1 illustrates the complete training framework architecture, with each section detailed below. The entire process is repeated for all selected algorithms.

            Testing framework architecture

            Figure 2 illustrates the execution of a simulated test on an unseen portion of the dataset. The testing process involves data preparation similar to that of the training process. The prepared data are then passed to a script that performs predictions using the 10 pretrained models, and a voting process determines the final decision based on the majority of votes (Parikh et al., 2019). In the tie scenario where five models predict “affected” and five models predict “healthy,” the patient is conservatively considered to have ASD. It is important to note that, since we are dealing with a disease, it is highly recommended that the patient consult a doctor for further examination and diagnosis.
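            A minimal sketch of this fold-and-vote procedure, assuming scikit-learn and synthetic data; the tie rule follows the cautious 5-5 policy in the text (ties count as "affected"), and the decision tree is just one of the paper's candidate classifiers.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import KFold
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in: 180 training samples, 20 unseen test samples.
X, y = make_classification(n_samples=200, n_features=10, random_state=1)
X_train, y_train, X_test = X[:180], y[:180], X[180:]

# Train one model per fold, keeping all 10 for the voting stage.
models = []
for train_idx, _ in KFold(n_splits=10, shuffle=True, random_state=1).split(X_train):
    models.append(
        DecisionTreeClassifier(random_state=1).fit(X_train[train_idx], y_train[train_idx])
    )

# Count "affected" (class 1) votes per test sample across the 10 models.
votes = np.sum([m.predict(X_test) for m in models], axis=0)

# Majority vote; a 5-5 tie is conservatively counted as affected.
final = (votes >= 5).astype(int)
print(final.shape)  # (20,)
```

The conservative tie-break trades a few false positives for fewer missed cases, which matches the text's recommendation to refer uncertain cases to a doctor anyway.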

            Figure 2:

            Training framework architecture (Alenizi and Al-Karawi, 2023c).

            CLASSIFICATIONS ALGORITHMS

            Support vector machine

            SVM is a supervised machine-learning technique for classification and regression tasks. It is a practical approach to solving pattern recognition problems. One notable advantage of SVM is its ability to mitigate overfitting issues. By establishing a decision boundary, SVM effectively segregates classes (Huang et al., 2018).

            Naïve Bayes

            The NB classifier is a supervised learning algorithm that operates as a generative model based on the joint probability distribution. It makes use of independence assumptions to simplify computations. Compared to SVM and maximum entropy (ME) models, NB exhibits faster training times. It calculates the posterior probability for a dataset by combining prior probability and likelihood estimations (John and Langley, 2013).

            Logistic regression

            Logistic regression (LR) is a regression technique for analyzing binary dependent variables. Its output values are constrained to 0 or 1, making it suitable for binary classification tasks. LR is beneficial for datasets with continuous values. It enables examining the relationship between a single dependent binary variable and one or more nominal or ordinal variables. The relationship is typically represented using the sigmoidal function.
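            The sigmoidal function mentioned above maps any real-valued linear combination of inputs into the (0, 1) range, which is what constrains LR's output to a probability; a one-line illustration:

```python
import math

def sigmoid(z):
    # Maps a real number to a value in (0, 1), interpreted as a probability.
    return 1.0 / (1.0 + math.exp(-z))

print(sigmoid(0.0))            # 0.5
print(round(sigmoid(4.0), 3))  # 0.982
```

A threshold (typically 0.5) on this output turns the probability into the binary 0/1 classification decision.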

            K-nearest neighbor

            K-nearest neighbor (KNN) is a supervised learning method known for its simplicity. It is employed in both classification and regression tasks. The underlying principle of KNN is that similar data points tend to be located close to each other. The “K” in KNN refers to the number of neighboring points to consider. Selecting an appropriate “K” value is crucial in minimizing errors. KNN relies on similarity, measured by distance, closeness, or proximity. The widely used distance metric is the Euclidean distance.

            Random forest classifier

            The RFC is a versatile algorithm capable of handling classification, regression, and other tasks (Alam and Vuong, 2013). It operates by generating multiple decision trees using random subsets of the data. Once predictions are obtained from each tree, the final solution is determined by employing a voting mechanism. The prediction that receives the highest number of votes is selected as the best solution. This voting-based approach allows RFC to leverage the collective wisdom of multiple decision trees, resulting in improved accuracy and flexibility.

            The random forest algorithm creates many decision trees from a randomly selected section of the training dataset shown in Figure 3. The votes from several decision trees are then averaged to establish the final class of test objects (Alam and Vuong, 2013).

            Figure 3:

            Testing framework architecture (Alenizi and Al-Karawi, 2023c).

            Figure 4:

            An SVM classifier. Abbreviation: SVM, support vector machine (Alenizi and Al-Karawi, 2023c).

            Figure 5:

            K-nearest neighbor (Alenizi and Al-Karawi, 2023c).

            Decision tree classification method

            The cornerstone of a decision tree is its decision-making process, which offers outstanding accuracy and stability and can be visualized as a tree. A decision tree is displayed in Figure 7 (Song and Ying, 2015).

            Figure 6:

            Random forest classification (Alenizi and Al-Karawi, 2023c).

            Figure 7:

            Decision tree classification method (Alenizi and Al-Karawi, 2023c).

            RESULTS AND DISCUSSION

            The performance of the classification model is evaluated using metrics such as specificity, sensitivity, and accuracy, which are derived from the confusion matrix and classification report. These metrics provide insights into the model’s precision in predicting true negatives, positives, and overall accuracy. The model’s effectiveness depends on the accuracy of its training, as it directly influences the quality of the results obtained from these performance measures.

            Performance evaluation

            Evaluating the performance of a classification model is crucial to assess its effectiveness in achieving a desired outcome. Performance evaluation metrics quantitatively assess the model’s performance on a test dataset. Selecting appropriate metrics to evaluate the model’s performance accurately is essential. Several metrics can be utilized, including the confusion matrix, accuracy, specificity, sensitivity, and more. The following formulas are commonly employed to calculate these performance metrics.

            (2) Specificity = TN / (TN + FP)

            (3) True positive rate, or Sensitivity = TP / (TP + FN)

            (4) Accuracy = (TP + TN) / (TN + TP + FP + FN)
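            For concreteness, the three metrics can be computed directly from confusion-matrix counts; the counts below are invented purely for illustration.

```python
# Hypothetical confusion-matrix counts for a binary ASD / no-ASD screen.
tp, tn, fp, fn = 40, 45, 5, 10

specificity = tn / (tn + fp)                 # Equation (2)
sensitivity = tp / (tp + fn)                 # Equation (3)
accuracy = (tp + tn) / (tp + tn + fp + fn)   # Equation (4)

print(specificity, sensitivity, accuracy)  # 0.9 0.8 0.85
```

Sensitivity measures how many affected cases the screen catches, while specificity measures how many healthy cases it correctly clears; both matter for a screening tool.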

            The experimental results demonstrate the application of various machine-learning algorithms with feature selection for ASD screening data in children. All features were selected to evaluate the predictive models’ specificity, sensitivity, and accuracy. The specific implementations for each algorithm are as follows:

            • NB: Gaussian NB algorithm was used.

            • SVM: Radial basis function (RBF) kernel with a gamma value of 0.1 was utilized.

            • KNN: N = 5 neighbors were considered.

            • ANN: Adam optimizer with a learning rate of 0.01 and 100 epochs was employed.

            • Random forest and decision tree: the default algorithms were used.
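            The configurations listed above might be instantiated in scikit-learn as follows. Mapping the ANN onto MLPClassifier (treating max_iter as the epoch count under the Adam solver) is our assumption about the implementation, and the random forest and decision tree use library defaults.

```python
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier

classifiers = {
    "NB": GaussianNB(),                          # Gaussian naive Bayes
    "SVM": SVC(kernel="rbf", gamma=0.1),         # RBF kernel, gamma = 0.1
    "KNN": KNeighborsClassifier(n_neighbors=5),  # K = 5 neighbors
    "LR": LogisticRegression(max_iter=1000),
    "ANN": MLPClassifier(solver="adam",          # assumption: MLP stands in
                         learning_rate_init=0.01,  # for the paper's ANN
                         max_iter=100),            # 100 epochs under Adam
    "RF": RandomForestClassifier(),              # library defaults
    "DT": DecisionTreeClassifier(),              # library defaults
}
print(sorted(classifiers))  # ['ANN', 'DT', 'KNN', 'LR', 'NB', 'RF', 'SVM']
```

Each estimator exposes the same fit/predict interface, so the whole dictionary can be looped over to produce the per-classifier comparison reported below.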

            Figure 8:

            Classification performance.

            Figure 9:

            Learning curve of naïve Bayes.

            Figure 10:

            Learning curve of SVM. Abbreviation: SVM, support vector machine.

            Figure 11:

            Learning curve of KNN. Abbreviation: KNN, K-nearest neighbor.

            Figure 12:

            Learning curve of logistic regression.

            Figure 13:

            Learning curve of random forest.

            Figure 14:

            Learning curve of decision tree.

            Table 3:

            Elements of a confusion matrix.

                                  Actual positive        Actual negative
            Predicted positive    True positive (TP)     False positive (FP)
            Predicted negative    False negative (FN)    True negative (TN)

            Abbreviation: ASD, autism spectrum disorder.

            Table 4:

            Performance measures for all machine-learning classifiers with the three datasets.

            Classifier | Specificity | Sensitivity | Accuracy (%)
            Logistic regression | 0.9375 | 0.9696 | 96.69
            SVM | 0.9474 | 0.8888 | 98.11
            Naïve Bayes | 0.9361 | 0.9676 | 96.24
            KNN | 0.9148 | 0.9687 | 95.65
            Random forest | 1.00 | 0.9933 | 99.75
            Decision tree | 0.9887 | 0.9887 | 97.47

            Abbreviations: KNN, K-nearest neighbor; SVM, support vector machine.

            The evaluation of different machine-learning models on the ASD diagnosis dataset resulted in accuracy ranging from 95.65 to 99.75% on the original dataset. The KNN classifier with K = 5 achieved the lowest accuracy of 95.65%, while the random forest model achieved the highest prediction accuracy of 99.75% on the original dataset. Additionally, the learning curves of all the machine-learning algorithms provide further insights into the performance of the prediction models.

            CONCLUSION

            This study presents a machine-learning framework designed to detect ASD in individuals across various age groups, including toddlers, children, adolescents, and adults. Our findings demonstrate the effectiveness of predictive models based on machine-learning techniques as valuable tools for accomplishing this task. As a result, the prediction models proposed in this study can serve as an alternative or supportive tool for healthcare professionals in accurately identifying ASD cases across various age groups. The experimental analysis conducted in this research provides valuable insights for healthcare practitioners, enabling them to consider the most significant features when screening for ASD cases. It is important to note that the limitation of this study lies in the insufficient amount of data available to develop a generalized model encompassing all stages of ASD: a sufficiently large dataset is vital for constructing an appropriate model, and the dataset we used for this analysis did not contain enough cases.

            On the other hand, our research findings have contributed to creating an automated model that can assist medical professionals in diagnosing autism in children. In future endeavors, we aim to gather a larger dataset related explicitly to ASD and construct a more comprehensive prediction model applicable to individuals of any age, thereby increasing generalization. This will further enhance ASD detection and facilitate improved identification of other neurodevelopmental disorders.

            CONFLICTS OF INTEREST

            The authors declare no conflicts of interest in association with the present study.

            REFERENCES

            1. Al Banna MH, Ghosh T, Taher KA, Kaiser MS, Mahmud M. 2020. A monitoring system for patients of autism spectrum disorder using artificial intelligence. Proceedings of the 13th International Conference on Brain Informatics, BI 2020; Padua, Italy; 19 September 2020; Springer.

            2. Alam MS, Vuong ST. 2013. Random forest classification for detecting android malware. 2013 IEEE International Conference on Green Computing and Communications and IEEE Internet of Things and IEEE Cyber, Physical and Social Computing; Beijing, China; IEEE.

            3. Alenizi AS, Al-Karawi KA. 2022. Cloud computing adoption-based digital open government services: challenges and barriers. Proceedings of Sixth International Congress on Information and Communication Technology; Springer.

            4. Alenizi AS, Al-Karawi KA. 2023a. Effective biometric technology used with big data. Proceedings of Seventh International Congress on Information and Communication Technology; Springer.

            5. Alenizi AS, Al-Karawi KA. 2023b. Internet of things (IoT) adoption: challenges and barriers. Proceedings of Seventh International Congress on Information and Communication Technology; Springer.

            6. Alenizi AS, Al-Karawi KA. 2023c. Machine learning approach for diabetes prediction. International Congress on Information and Communication Technology; Springer.

            7. Al-Karawi KA. 2021. Mitigate the reverberation effect on the speaker verification performance using different methods. Int. J. Speech Technol. Vol. 24(1):143–153

            8. Al-Karawi KA. 2023. Face mask effects on speaker verification performance in the presence of noise. Multimed. Tools Appl. Vol. 82:1–14

            9. Al-Karawi KA, Ahmed ST. 2021. Model selection toward robustness speaker verification in reverberant conditions. Multimed. Tools Appl. Vol. 80:36549–36566

            10. Al-Karawi KA, Mohammed DY. 2023. Using combined features to improve speaker verification in the face of limited reverberant data. Int. J. Speech Technol. Vol. 26:789–799

            11. Allison C, Auyeung B, Baron-Cohen S. 2012. Toward brief “red flags” for autism screening: the short autism spectrum quotient and the short quantitative checklist in 1,000 cases and 3,000 controls. J. Am. Acad. Child Adolesc. Psychiatry. Vol. 51(2):202–212.e7

            12. Bastiaansen JA, Thioux M, Nanetti L, van der Gaag C, Ketelaars C, Minderaa R, et al. 2011. Age-related increase in inferior frontal gyrus activity and social functioning in autism spectrum disorder. Biol. Psychiatry. Vol. 69(9):832–838

            13. Berrar D. 2019. Cross-validation. Encyclopedia of Bioinformatics and Computational Biology. Academic Press. Oxford.

            14. Bone D, Goodwin MS, Black MP, Lee CC, Audhkhasi K, Narayanan S. 2015. Applying machine learning to facilitate autism diagnostics: pitfalls and promises. J. Autism Dev. Disord. Vol. 45:1121–1136

            15. Bone D, Bishop SL, Black MP, Goodwin MS, Lord C, Narayanan SS. 2016. Use of machine learning to improve autism screening and diagnostic instruments: effectiveness, efficiency, and multi-instrument fusion. J. Child Psychol. Psychiatry. Vol. 57(8):927–937

            16. Cruz JA, Wishart DS. 2006. Applications of machine learning in cancer prediction and prognosis. Cancer Inform. Vol. 2:117693510600200030

            17. Deshpande G, Libero LE, Sreenivasan KR, Deshpande HD, Kana RK. 2013. Identification of neural connectivity signatures of autism using machine learning. Front. Hum. Neurosci. Vol. 7:670

            18. Duda M, Ma R, Haber N, Wall DP. 2016. Use of machine learning for behavioral distinction of autism and ADHD. Transl. Psychiatry. Vol. 6(2):e732

            19. Frith U, Happé F. 2005. Autism spectrum disorder. Curr. Biol. Vol. 15(19):R786–R790

            20. Gopal Krishna Patro S, Sahu KK. 2015. Normalization: a preprocessing stage. arXiv e-prints. arXiv:1503.06462

            21. Hasan SM, Uddin MP, Mamun MA, Sharif MI, Ulhaq A, Krishnamoorthy G. 2022. A machine learning framework for early-stage detection of autism spectrum disorders. IEEE Access. Vol. 11:15038–15057

            22. Heinsfeld AS, Franco AR, Craddock RC, Buchweitz A, Meneguzzi F. 2018. Identification of autism spectrum disorder using deep learning and the ABIDE dataset. Neuroimage Clin. Vol. 17:16–23

            23. Huang S, Cai N, Pacheco PP, Narrandes S, Wang Y, Xu W. 2018. Applications of support vector machine (SVM) learning in cancer genomics. Cancer Genomics Proteomics. Vol. 15(1):41–51

            24. John GH, Langley P. 2013. Estimating continuous distributions in Bayesian classifiers. arXiv preprint. arXiv:1302.4964

            25. Khan NS, Muaz MH, Kabir A, Islam MN. 2017. Diabetes predicting mHealth application using machine learning. 2017 IEEE International WIE Conference on Electrical and Computer Engineering (WIECON-ECE); IEEE.

            26. Kosmicki J, Sochat V, Duda M, Wall DP. 2015. Searching for a minimal set of behaviors for autism detection through feature selection-based machine learning. Transl. Psychiatry. Vol. 5(2):e514

            27. Lakhan R, Agrawal A, Sharma M. 2020. Prevalence of depression, anxiety, and stress during COVID-19 pandemic. J. Neurosci. Rural Pract. Vol. 11(04):519–525

            28. Mao KZ. 2004. Orthogonal forward selection and backward elimination algorithms for feature subset selection. IEEE Trans. Syst. Man Cybern. Vol. 34(1):629–634

            29. Mythili M, Shanavas A. 2014. A study on Autism spectrum disorders using classification techniques. Int. J. Soft Comput. Eng. Vol. 4(5):88–91

            30. Omar KS, Mondal P, Khan NS, Rizvi MRK, Islam MN. 2019. A machine learning approach to predict autism spectrum disorder. 2019 International Conference on Electrical, Computer and Communication Engineering (ECCE); IEEE.

            31. Parikh MN, Li H, He L. 2019. Enhancing diagnosis of autism with optimized machine learning models and personal characteristic data. Front. Comput. Neurosci. Vol. 13:9

            32. Pratap A, Kanimozhiselvi CS, Vijayakumar R, Pramod KV. 2014. Soft computing models for the predictive grading of childhood Autism—a comparative study. Int. J. Soft Comput. Eng. Vol. 4(3):64–67

            33. Schankweiler P, Raddatz D, Ellrott T, Hauck Cirkel C. 2023. Correlates of food addiction and eating behaviours in patients with morbid obesity. Obesity Facts. Vol. 16:465–474

            34. Song Y-Y, Ying L. 2015. Decision tree methods: applications for classification and prediction. Shanghai Arch. Psychiatry. Vol. 27(2):130

            35. Suhas G, Naveen N, Nagabanu M, Mario Edwin R, Nithish Kumar R. 2021. A survey on autism spectrum disorder (ASD) using machine learning. Adv. Innov. Comput. Progr. Lang. Vol. 3(2)

            36. Thabtah FF. 2017a. Autistic spectrum disorder screening data for adolescent.

            37. Thabtah F. 2017b. Autism spectrum disorder screening: machine learning adaptation and DSM-5 fulfillment. Proceedings of the 1st International Conference on Medical and Health Informatics 2017.

            38. Thabtah F. 2019. Machine learning in autistic spectrum disorder behavioral research: a review and ways forward. Inform. Health Soc. Care. Vol. 44(3):278–297

            39. Thabtah F, Peebles D. 2020. A new machine learning model based on induction of rules for autism detection. J. Health Inform. Vol. 26(1):264–286

            40. Tripathy HK, Mallick PK, Mishra S. 2021. Application and evaluation of classification model to detect autistic spectrum disorders in children. Int. J. Comput. Appl. Technol. Vol. 65(4):368–377

            41. Vaishali R, Sasikala R. 2018. A machine learning based approach to classify autism with optimum behaviour sets. Int. J. Eng. Technol. Vol. 7(4):18

            42. van den Bekerom B. 2017. Using machine learning for detection of autism spectrum disorder. Proceedings of the 20th Student Conference IT.

            43. Wall DP, Dally R, Luyster R, Jung JY, Deluca TF. 2012a. Use of artificial intelligence to shorten the behavioral diagnosis of autism. PLoS One. Vol. 7(8):e43855

            44. Wall DP, Kosmicki J, DeLuca TF, Harstad E, Fusaro VA. 2012b. Use of machine learning to shorten observation-based screening and diagnosis of autism. Transl. Psychiatry. Vol. 2(4):e100

            45. Wohlrab L, Fürnkranz J. 2011. A review and comparison of strategies for handling missing values in separate-and-conquer rule learning. J. Intell. Inf. Syst. Vol. 36:73–98

            Author and article information

            Journal
            jdr
            Journal of Disability Research
            King Salman Centre for Disability Research (Riyadh, Saudi Arabia)
            05 January 2024
            Volume 3, Issue 1: e20230064
            Affiliations
            [1 ] Department of Mathematics and Statistics, College of Science, Imam Mohammad Ibn Saud Islamic University, Riyadh, Saudi Arabia ( https://ror.org/05gxjyb39)
            [2 ] Department of Basic Sciences, Higher Institute of Administrative Sciences, Osim, Egypt;
            [3 ] Department of Acoustic, School of Science, Engineering, and Environment, Salford University, Greater Manchester, UK ( https://ror.org/01tmqtf75)
            [4 ] Department of Computer Science, Faculty of Science, Diyala University, Baqubah, Diyala, Iraq ( https://ror.org/01eb5yv70)
            [5 ] Faculty of Business Administration, Egyptian E-Learning University, Giza, Egypt ( https://ror.org/045ms0x79)
            [6 ] Department of Statistics and Insurance, Faculty of Commerce, Zagazig University, Zagazig, Egypt ( https://ror.org/053g6we49)
            Author notes
            Correspondence to: Mahmoud M. Abdelwahab*, e-mail: mmabdelwahab@imamu.edu.sa , Tel.: +966541065376; Khamis A. Al-Karawi, e-mail: k.a.yousif@edu.salford.ac.uk ; E. M. Hasanin, e-mail: Ihasanin@eelu.edu.eg ; H. E. Semary, e-mail: hesemary@imamu.edu.sa
            Author information
            https://orcid.org/0000-0001-9275-6902
            Article
            10.57197/JDR-2023-0064
            e4c0a420-2d14-4a9e-ace3-b3f41255e529
            Copyright © 2024 The Authors.

            This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY) 4.0, which permits unrestricted use, distribution and reproduction in any medium, provided the original author and source are credited.

            History
            : 31 August 2023
            : 11 December 2023
            : 11 December 2023
            Page count
            Figures: 14, Tables: 4, References: 45, Pages: 9
            Funding
            Funded by: King Salman Centre for Disability Research
            Award ID: KSRG-2023-556
            The authors extend their appreciation to the King Salman Centre for Disability Research for funding this work through Research Group no KSRG-2023-556.

            Social policy & Welfare, Political science, Education & Public policy, Special education, Civil law, Social & Behavioral Sciences
            SVM, random forest, autism, decision tree, machine learning
