      Is Open Access

      Demuxafy: improvement in droplet assignment by integrating multiple single-cell demultiplexing and doublet detection methods

      research-article


          Abstract

          Recent innovations in single-cell RNA-sequencing (scRNA-seq) provide the technology to investigate biological questions at cellular resolution. Pooling cells from multiple individuals has become a common strategy, and droplets can subsequently be assigned to a specific individual by leveraging their inherent genetic differences. An implicit challenge with scRNA-seq is the occurrence of doublets—droplets containing two or more cells. We develop Demuxafy, a framework to enhance donor assignment and doublet removal through the consensus intersection of multiple demultiplexing and doublet detecting methods. Demuxafy significantly improves droplet assignment by separating singlets from doublets and classifying the correct individual.
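
As a rough illustration of the consensus idea described in the abstract, the sketch below combines the per-droplet calls of several methods by strict majority vote. It is a minimal sketch under stated assumptions, not Demuxafy's implementation: the function name `consensus_assignment`, the label strings (`"doublet"`, `"unassigned"`, donor IDs) and the majority threshold are hypothetical choices, since the abstract does not specify how the individual calls are combined.

```python
from collections import Counter


def consensus_assignment(calls, min_fraction=0.5):
    """Combine the calls of several methods for ONE droplet.

    calls: list of labels, one per method, e.g. a donor ID or "doublet".
    Returns the label supported by strictly more than `min_fraction` of
    the methods, or "unassigned" when no label reaches that support.
    (Hypothetical rule chosen for illustration only.)
    """
    if not calls:
        return "unassigned"
    # most_common(1) gives the single best-supported label and its count.
    label, count = Counter(calls).most_common(1)[0]
    if count / len(calls) > min_fraction:
        return label
    return "unassigned"


# Hypothetical calls from three methods for three droplet barcodes.
droplet_calls = {
    "AAAC-1": ["donor1", "donor1", "donor1"],
    "AAAG-1": ["donor2", "doublet", "doublet"],
    "AAAT-1": ["donor1", "donor2", "doublet"],
}

consensus = {
    barcode: consensus_assignment(calls)
    for barcode, calls in droplet_calls.items()
}
print(consensus)
```

With a strict majority threshold at most one label can qualify, so ties never need to be broken; droplets on which the methods disagree are simply left unassigned rather than forced into a donor.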

          Supplementary Information

          The online version contains supplementary material available at 10.1186/s13059-024-03224-8.


                Author and article information

                Contributors
                d.neavin@garvan.org.au
                j.powell@garvan.org.au
                Journal
                Genome Biol
                Genome Biology
                BioMed Central (London)
                ISSN: 1474-7596, 1474-760X
                15 April 2024
                Volume: 25
                Article number: 94
                Affiliations
                [1] Garvan-Weizmann Centre for Cellular Genomics, Garvan Institute for Medical Research (https://ror.org/01b3dvp57), Darlinghurst, NSW, Australia
                [2] Present address: Statewide Genomics at NSW Health Pathology (GRID grid.416088.3, ISNI 0000 0001 0753 1056), Sydney, NSW, Australia
                [3] Wellcome Sanger Institute, Wellcome Genome Campus (https://ror.org/05cy4wa09), Hinxton, UK
                [4] Life Sciences Department, Barcelona Supercomputing Center (https://ror.org/05sd8tv96), Barcelona, Catalonia, Spain
                [5] Department of Genetics, University of Groningen, University Medical Center Groningen (GRID grid.4494.d, ISNI 0000 0000 9558 4598), Groningen, The Netherlands
                [6] Spatial and Single Cell Systems Domain, Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR) (https://ror.org/05k8wg936), Singapore, Republic of Singapore
                [7] Population and Global Health, Lee Kong Chian School of Medicine, Nanyang Technological University (https://ror.org/02e7b5302), Singapore, Republic of Singapore
                [8] Cancer Science Institute of Singapore, National University of Singapore (https://ror.org/01tgyzw49), Singapore, Republic of Singapore
                [9] Bakar Institute for Computational Health Sciences, University of California (GRID grid.266102.1, ISNI 0000 0001 2297 6811), San Francisco, CA, USA
                [10] Institute for Human Genetics, University of California, San Francisco (GRID grid.266102.1, ISNI 0000 0001 2297 6811), San Francisco, CA, USA
                [11] Division of Rheumatology, Department of Medicine, University of California, San Francisco (GRID grid.266102.1, ISNI 0000 0001 2297 6811), San Francisco, CA, USA
                [12] Chan Zuckerberg Biohub (https://ror.org/00knt4f32), San Francisco, CA, USA
                [13] Bioinformatics and Cellular Genomics, St Vincent’s Institute of Medical Research (https://ror.org/02k3cxs74), Fitzroy, Australia
                [14] Melbourne Integrative Genomics, School of BioSciences–School of Mathematics & Statistics, Faculty of Science, University of Melbourne (https://ror.org/01ej9dk98), Melbourne, Australia
                [15] Present address: The Gene Lay Institute of Immunology and Inflammation, Brigham and Women’s Hospital and Harvard Medical School (https://ror.org/04b6nzv94), Boston, MA, USA
                [16] UNSW Cellular Genomics Futures Institute, University of New South Wales (https://ror.org/03r8z3t63), Kensington, NSW, Australia
                Author information
                http://orcid.org/0000-0002-5070-4124
                Article
                Publisher ID: 3224
                DOI: 10.1186/s13059-024-03224-8
                PMCID: PMC11020463
                PMID: 38622708
                © The Author(s) 2024

                Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver ( http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

                History
                Received: 7 March 2023
                Accepted: 25 March 2024
                Categories
                Method
                Custom metadata
                © BioMed Central Ltd., part of Springer Nature 2024

                Genetics
                single-cell analysis, genetic demultiplexing, doublet detecting
