      Is Open Access

      Genetic polymorphism and evidence of signatures of selection in the Plasmodium falciparum circumsporozoite protein gene in Tanzanian regions with different malaria endemicity

      Preprint
      research-article


          Abstract

          Background:

          In 2021 and 2023, the World Health Organization approved the RTS,S/AS01 and R21/Matrix-M malaria vaccines, respectively, for routine immunization of children in African countries with moderate to high transmission. These vaccines are based on the Plasmodium falciparum circumsporozoite protein (Pfcsp), but polymorphisms in this gene raise concerns regarding strain-specific responses and the long-term efficacy of the vaccines. This study assessed Pfcsp genetic diversity, population structure and signatures of selection among parasites from areas of different malaria transmission in mainland Tanzania, to generate baseline data before the introduction of the malaria vaccines in the country.

          Methods:

          The analysis involved 589 whole genome sequences generated as part of the MalariaGEN Community Project. The samples were collected between 2013 and January 2015 from five regions of mainland Tanzania: Morogoro and Tanga (Muheza) (moderate transmission areas), and Kagera (Muleba), Lindi (Nachingwea), and Kigoma (Ujiji) (high transmission areas). Wright’s inbreeding coefficient (Fws), Wright’s fixation index (FST), principal component analysis, nucleotide diversity, and Tajima’s D were used to assess within-host parasite diversity, population structure and natural selection.
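As a rough illustration of two of the statistics named above, the following Python sketch computes per-site nucleotide diversity and Tajima's D from a set of aligned sequences using the standard textbook formulas. It is not the pipeline used in the study: the function names and the four-sequence toy alignment are hypothetical, and real analyses would typically run on variant calls with dedicated population-genetics software.

```python
from itertools import combinations
from math import sqrt


def mean_pairwise_differences(seqs):
    """Average number of nucleotide differences over all pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return total / len(pairs)


def segregating_sites(seqs):
    """Number of alignment columns that carry more than one allele."""
    return sum(len(set(column)) > 1 for column in zip(*seqs))


def nucleotide_diversity(seqs):
    """Per-site diversity: mean pairwise differences divided by alignment length."""
    return mean_pairwise_differences(seqs) / len(seqs[0])


def tajimas_d(seqs):
    """Tajima's D for equal-length aligned sequences; None if undefined."""
    n = len(seqs)
    s = segregating_sites(seqs)
    if n < 4 or s == 0:
        return None  # the variance term is zero or undefined in these cases
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n ** 2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    variance = e1 * s + e2 * s * (s - 1)
    if variance <= 0:
        return None
    # Difference between the two estimators of theta, scaled by its standard deviation
    return (mean_pairwise_differences(seqs) - s / a1) / sqrt(variance)


# Hypothetical toy alignment, for illustration only (not study data)
toy = ["ACGT", "ACGA", "ACTA", "ATTA"]
k = mean_pairwise_differences(toy)
pi = nucleotide_diversity(toy)
d = tajimas_d(toy)
print(k, pi, d)
```

A positive D means the mean pairwise difference exceeds what the number of segregating sites would predict under neutrality, which is the pattern usually read as an excess of intermediate-frequency variants.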

          Results:

          Based on Fws (< 0.95), polyclonality was high, ranging from 56.9% in Muheza to 69.23% in Nachingwea. No population structure was detected in the Pfcsp gene across the five regions (mean FST = 0.0068). The average number of nucleotide differences (K), haplotype diversity (Hd) and nucleotide diversity (π) in the five regions were 4.19, 0.973 and 0.0035, respectively. The C-terminal region of Pfcsp showed high nucleotide diversity in the Th2R and Th3R regions. Positive Tajima’s D values were observed in the Th2R and Th3R regions, consistent with balancing selection. The Pfcsp C-terminal sequences comprised 50 different haplotypes (H_1 to H_50), and only 2% of sequences matched the 3D7 strain haplotype (H_50).
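For readers unfamiliar with haplotype diversity (Hd), the sketch below shows one common way to compute it (Nei's unbiased estimator) together with the share of sequences that match a reference haplotype. The function names and the toy haplotype labels are hypothetical and are not taken from the study data; the authors' own software and settings are not stated in this abstract.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's haplotype diversity: n / (n - 1) * (1 - sum of squared frequencies)."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two sequences are required")
    counts = Counter(haplotypes)
    sum_sq = sum((count / n) ** 2 for count in counts.values())
    return n / (n - 1) * (1 - sum_sq)


def fraction_matching(haplotypes, reference):
    """Proportion of sampled haplotypes identical to the reference haplotype."""
    return sum(h == reference for h in haplotypes) / len(haplotypes)


# Hypothetical toy sample of haplotype labels (not study data)
sample = ["H_1", "H_1", "H_2", "H_50"]
hd = haplotype_diversity(sample)
ref_share = fraction_matching(sample, "H_50")
print(hd, ref_share)
```

Hd approaches 1 when almost every sampled sequence carries a different haplotype, so a high Hd alongside a small reference-matching share indicates that the vaccine-strain haplotype is rare in the sample.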

          Conclusions:

          The findings demonstrate high diversity of the Pfcsp gene with limited population differentiation. The Pfcsp gene showed positive Tajima’s D values across parasite populations, consistent with balancing selection for variants within the Th2R and Th3R regions. These data are consistent with other studies conducted across Africa and worldwide, which report a low frequency of 3D7 haplotypes and little population structure. Additional research incorporating other regions and more recent data is therefore warranted to assess trends in genetic diversity within this important gene comprehensively. Such insights will inform the choice of alleles to be included in future vaccines.


                Author and article information

                Journal
                medRxiv
                Cold Spring Harbor Laboratory
                23 January 2024
                2024.01.23.24301587
                Affiliations
                [1] National Institute for Medical Research, Dar es Salaam, Tanzania
                [2] Nelson Mandela African Institution of Science and Technology, Arusha, Tanzania
                [3] University of North Carolina, Chapel Hill, NC, USA
                [4] Brown University, Providence, RI, USA
                [5] Muhimbili University of Health and Allied Sciences, Dar es Salaam, Tanzania
                [6] Harvard T.H. Chan School of Public Health, Boston, MA, USA
                [7] Faculty of Pharmaceutical Sciences, Monash University, Melbourne, Australia
                Author notes

                Authors’ contributions

                DSI - formulated the original idea, supervised data analysis and wrote the manuscript;

                BL - performed the analysis and wrote the manuscript;

                CB, ZPH and DG - supported the data analysis, reviewed and edited the manuscript;

                RB, MS and CIM - conceived the idea, implemented the field surveys, reviewed and edited the manuscript;

                DP and RM - reviewed and edited the manuscript;

                JJ, JB and DSI - critically reviewed the manuscript.

                All authors contributed to the article and approved the submitted version.

                [*] Corresponding author: beatus.lyimo@nm-aist.ac.tz
                Author information
                http://orcid.org/0000-0002-8710-1765
                http://orcid.org/0000-0003-2498-4705
                Article
                10.1101/2024.01.23.24301587
                10854334
                38343796

                This work is licensed under a Creative Commons Attribution 4.0 International License, which allows reusers to distribute, remix, adapt, and build upon the material in any medium or format, so long as attribution is given to the creator. The license allows for commercial use.

                Funding
                Funded by: Bill & Melinda Gates Foundation
                Award ID: 02202
                Funded by: NIH
                Award ID: K24AI134990
                Categories
                Article

                Plasmodium falciparum, circumsporozoite protein, malaria vaccine, genetic diversity, signature of selection, Tanzania
