Sequential stacking link prediction algorithms for temporal networks

Abstract

Link prediction algorithms are indispensable tools in many scientific applications by speeding up network data collection and imputing missing connections. However, in many systems, links change over time and it remains unclear how to optimally exploit such temporal information for link predictions in such networks. Here, we show that many temporal topological features, in addition to having high computational cost, are less accurate in temporal link prediction than sequentially stacked static network features. This sequential stacking link prediction method uses 41 static network features that avoid detailed feature engineering choices and is capable of learning a highly accurate predictive distribution of future connections from historical data. We demonstrate that this algorithm works well for both partially observed and completely unobserved target layers, and on two temporal stochastic block models achieves near-oracle-level performance when combined with other single predictor methods as an ensemble learning method. Finally, we empirically illustrate that stacking multiple predictive methods together further improves performance on 19 real-world temporal networks from different domains.
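The sequential stacking idea described above can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the article: only three static pair features are computed per layer instead of the 41 the abstract mentions, a plain logistic regression stands in for the supervised learner, and every function name (`pair_features`, `stack_features`, `predict_next_layer`) as well as the toy ring network is hypothetical. It covers the setting with a completely unobserved target layer.

```python
import math
from itertools import combinations


def pair_features(edges, n):
    """Static features of every node pair in ONE layer (illustrative subset of three)."""
    nbrs = [set() for _ in range(n)]
    for u, v in edges:
        nbrs[u].add(v)
        nbrs[v].add(u)
    feats = {}
    for u, v in combinations(range(n), 2):
        common = len(nbrs[u] & nbrs[v])
        union = len(nbrs[u] | nbrs[v])
        feats[(u, v)] = [
            1.0 if v in nbrs[u] else 0.0,      # link observed in this layer
            float(common),                     # common neighbors
            common / union if union else 0.0,  # Jaccard coefficient
        ]
    return feats


def stack_features(layers, n):
    """Concatenate per-layer features in temporal order (the sequential stacking step)."""
    per_layer = [pair_features(edges, n) for edges in layers]
    return {pair: [x for f in per_layer for x in f[pair]] for pair in per_layer[0]}


def sigmoid(z):
    # Numerically safe logistic function.
    return 1.0 / (1.0 + math.exp(-z)) if z >= 0 else math.exp(z) / (1.0 + math.exp(z))


def score(w, x):
    # w[-1] is the bias; zip stops at the end of x, so the bias is added separately.
    return sigmoid(w[-1] + sum(wj * xj for wj, xj in zip(w, x)))


def fit_logistic(X, y, epochs=300, lr=0.1):
    """Full-batch gradient descent; an assumed stand-in for the supervised learner."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = score(w, xi) - yi
            for j, xj in enumerate(xi):
                grad[j] += err * xj
            grad[-1] += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad)]
    return w


def predict_next_layer(layers, n, window):
    """Learn (window past layers -> next layer) from history, then score the unseen layer."""
    train_x = stack_features(layers[-window - 1:-1], n)
    target = {tuple(sorted(e)) for e in layers[-1]}
    pairs = sorted(train_x)
    w = fit_logistic([train_x[p] for p in pairs], [1.0 if p in target else 0.0 for p in pairs])
    test_x = stack_features(layers[-window:], n)
    return {p: score(w, test_x[p]) for p in pairs}


# Toy temporal network (invented): an 8-node ring in which layer t is missing one ring edge.
n = 8
ring = [(i, (i + 1) % n) for i in range(n)]
layers = [[e for k, e in enumerate(ring) if k != t] for t in range(5)]
scores = predict_next_layer(layers, n, window=3)
ring_pairs = {tuple(sorted(e)) for e in ring}
mean_ring = sum(scores[p] for p in ring_pairs) / len(ring_pairs)
mean_other = sum(s for p, s in scores.items() if p not in ring_pairs) / (len(scores) - len(ring_pairs))
print(f"mean score on ring pairs: {mean_ring:.3f}, on other pairs: {mean_other:.3f}")
```

The training step pairs the stacked features of layers t-3..t-1 with the links of layer t, and the fitted model is then applied to the most recent three layers to score the next, unseen layer. A richer feature set or a different classifier would slot into `pair_features` and `fit_logistic` without changing this structure.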

Abstract

Link prediction in temporal networks is relevant for many real-world systems; however, current approaches usually carry high computational costs. The authors propose a temporal link prediction framework based on the sequential stacking of static network features, which improves computational speed and suits temporal networks with completely unobserved or partially observed target layers.

Author and article information

Contributors

Peter J. Mucha:

ORCID: http://orcid.org/0000-0002-0648-7230

peter.j.mucha@dartmouth.edu

Journal

Journal ID (nlm-ta): Nat Commun

Journal ID (iso-abbrev): Nat Commun

Title: Nature Communications

Publisher: Nature Publishing Group UK (London)

ISSN (Electronic): 2041-1723

Publication date (Electronic): 14 February 2024

Publication date PMC-release: 14 February 2024

Publication date Collection: 2024

Volume: 15

Electronic Location Identifier: 1364

Affiliations

[1] Department of Mathematics, Dartmouth College (https://ror.org/049s0rh22), Hanover, NH, USA

[2] Yale Institute for Network Science, Yale University (https://ror.org/03v76x132), New Haven, CT, USA

[3] Department of Scientific Computing, Pukyong National University (https://ror.org/0433kqc49), Busan, South Korea

[4] Department of Computer Science, University of Colorado (https://ror.org/02ttsq026), Boulder, CO, USA

[5] BioFrontiers Institute, University of Colorado, Boulder (https://ror.org/02ttsq026), Boulder, CO, USA

[6] Santa Fe Institute (https://ror.org/01arysc35), Santa Fe, NM, USA

Author information

Xie He http://orcid.org/0000-0002-4136-9408

Amir Ghasemian http://orcid.org/0000-0002-3515-3504

Eun Lee http://orcid.org/0000-0003-3860-3213

Aaron Clauset http://orcid.org/0000-0002-3529-8746

Peter J. Mucha http://orcid.org/0000-0002-0648-7230

Article

Publisher ID: 45598

DOI: 10.1038/s41467-024-45598-0

PMC ID: 10866871

PubMed ID: 38355612

SO-VID: e5861d2b-43de-4090-8f12-dfab58cbdd67

License:

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

History

Date received: 31 January 2023

Date accepted: 29 January 2024

Custom metadata

ScienceOpen disciplines: Uncategorized

Keywords: computational science, applied mathematics, software, interdisciplinary studies
