The coronavirus 2019–2020 pandemic (COVID-19) poses unprecedented challenges for governments
and societies around the world (1). Nonpharmaceutical interventions have proven to be critical for delaying and containing
the COVID-19 pandemic (2–6). These include testing and tracing, bans on large gatherings, nonessential business
and school and university closures, international and domestic mobility restrictions
and physical isolation, and total lockdowns of regions and countries. Decision-making
and evaluation of such interventions during all stages of the pandemic life cycle
require specific, reliable, and timely data not only about infections but also about
human behavior, especially mobility and physical copresence. We argue that mobile
phone data, when used properly and carefully, represent a critical arsenal of tools
for supporting public health actions across early-, middle-, and late-stage phases
of the COVID-19 pandemic.
Seminal work on human mobility has shown that aggregate and (pseudo-)anonymized mobile
phone data can assist the modeling of the geographical spread of epidemics (7–11). Thus, researchers and governments have started to collaborate with private companies,
most notably mobile network operators and location intelligence companies, to estimate
the effectiveness of control measures in a number of countries, including Austria,
Belgium, Chile, China, Germany, France, Italy, Spain, United Kingdom, and the United
States (12–21).
There is, however, little coordination or information exchange between these national
or even regional initiatives (22). Although ad hoc mechanisms leveraging mobile phone data can be effectively (but
not easily) developed at the local or national level, regional or even global collaborations
seem to be much more difficult given the number of actors, the range of interests
and priorities, the variety of legislations concerned, and the need to protect civil
liberties. The global scale and spread of the COVID-19 pandemic highlight the need
for a more harmonized or coordinated approach.
In the following sections, we outline the ways in which different types of mobile
phone data can help to better target and design measures to contain and slow the spread
of the COVID-19 pandemic. We identify the key reasons why this is not happening on
a much broader scale, and we give recommendations on how to make mobile phone data
work against the virus.
HOW CAN MOBILE PHONE DATA HELP TO TACKLE THE COVID-19 PANDEMIC?
Passively generated mobile phone data have emerged as a potentially valuable data
source to infer human mobility and social interactions. Call detail records (CDRs)
are arguably the most researched type of mobile data in this context. CDRs are collected
by mobile operators for billing purposes. Each record contains information about the
time and the cell tower that the phone was connected to when the interaction took
place. CDRs are event-driven records: In other words, the record only exists if the
phone is actively in use. Additional information includes “sightings data” obtained
when a phone is seen on a network. There are, however, other types of mobile phone
data used to study human mobility behaviors and interactions. X data records, or network
probes, can be thought of as metadata about the phone’s data channel, capturing background
actions of apps and the network. Routine information including highly accurate location
data is also collected through mobile phone applications (apps) at a large scale by
location intelligence companies (23) or by ad hoc apps (24, 25). In addition, proximity between mobile phone users can be detected via Bluetooth
functionality on smartphones. Each of these data types requires different processing
frameworks and raises complex ethical and political concerns that are discussed in
this paper.
First, we explore the value and contribution of mobile phone data in analytical efforts
to control the COVID-19 pandemic. Government and public health authorities broadly
raise questions in at least four critical areas of inquiry for which the use of
mobile phone data is relevant. First, situational awareness questions seek to develop
an understanding of the dynamic environment of the pandemic. Mobile phone data can
provide access to previously unavailable population estimates and mobility information
to enable stakeholders across sectors to better understand COVID-19 trends and geographic
distribution. Second, cause-and-effect questions seek to help identify the key mechanisms
and consequences of implementing different measures to contain the spread of COVID-19.
They aim to establish which variables make a difference for a problem and whether
further issues might be caused. Third, predictive analysis seeks to identify the likelihood
of future outcomes and could, for example, leverage real-time population counts and
mobility data to enable predictive capabilities and allow stakeholders to assess future
risks, needs, and opportunities. Finally, impact assessments aim to determine which,
whether, and how various interventions affect the spread of COVID-19 and require data
to identify the obstacles hampering the achievement of certain objectives or the success
of particular interventions. Table 1 provides specific examples of questions by areas
of inquiry. The relevance and specific questions raised as part of these areas of
inquiry differ at various stages of the outbreak, but mobile phone data provide value
throughout the epidemiological cycle, shown in Fig. 1.
Table 1
Examples of questions by areas of inquiry.
Situational awareness
• What are the most common mobility flows within and between COVID-19–affected cities and regions?
• Which areas are spreading the epidemics acting as origin nodes in a mobility network and thus could be placed under mobility restrictions?
• Are people continuing to travel or congregate after social distancing and travel restrictions were put into place?
• Are there hotspots at higher risk of contamination (due to a higher level of mobility and higher concentration of population)?
• What are the key entry points, locations, and movements of roamers or tourists?
Cause and effect
• What are variables that determine the success of social distancing approaches?
• How do local mobility patterns affect the burden on the medical system?
• Are business’ social distancing recommendations resulting in more workers working from home?
• In what sectors are people working most from home?
• What are the social and economic consequences of movement restriction measures?
Predictive analysis
• How are certain human mobility patterns likely to affect the spread of the coronavirus? And what is the likely spread of COVID-19, based on existing disease models and up-to-date mobility data?
• What are the likely effects of mobility restrictions on children’s education outcomes?
• What are likely to be the economic consequences of restricted mobility for businesses?
Impact
• How have travel restrictions affected human mobility behavior and likely disease transmission?
• What is the potential of various restriction measures to avert infection cases and save lives?
• What is the effect of mandatory social distancing measures, including closure of schools?
• How has the dissemination of public safety information and voluntary guidance affected human mobility behavior and disease spread?
Fig. 1
Pandemic intervals as defined by the U.S. Centers for Disease Control and the World
Health Organization [based on (52)].
In the early recognition and initiation phase of the pandemic, responders focus on
situational analysis and the fast detection of infected cases and their contacts.
Research has shown that quarantine measures of infected individuals and their family
members, combined with surveillance and standard testing procedures, are effective
as control measures in the early stages of the pandemic (26). Individual mobility and contact (close proximity) data offer information about
infected individuals, their locations, and social network. Contact (close proximity)
data can be collected through mobile apps (24, 27), interviews, or surveys (28).
During the acceleration phase, when community transmission reaches exponential levels,
the focus is on interventions for containment, which typically involve social contact
and mobility restrictions. At this stage, aggregated mobile phone data are valuable
to assess the efficacy of implemented policies through the monitoring of mobility
between and within affected municipalities. Mobility information also contributes
to the building of more accurate epidemiological models that can explain and anticipate
the spread of the disease, as shown for H1N1 flu outbreaks (29). These models, in turn, can inform the mobilization of resources (e.g., respirators
and intensive care units).
Last, during the deceleration and preparation phases, as the peak of infections is
reached, restrictions will likely be lifted (30). Continued situational monitoring will be important as the COVID-19 pandemic is
expected to come in waves (4, 31). Near real-time data on mobility and hotspots will be important to understand how
lifting and reestablishing various measures translate into behavior, especially to
find the optimal combination of measures at the right time (e.g., general mobility
restrictions, school closures, and banning of large gatherings), and to balance these
restrictions with aspects of economic vitality. After the pandemic has subsided, mobile
data will be helpful for post hoc analysis of the impact of different interventions
on the progression of the disease and cost-benefit analysis of mobility restrictions.
During this phase, digital contact-tracing technologies might be deployed, such as
the Korean smartphone app Corona 100m (32) and the Singaporean smartphone app TraceTogether (33), which aim to minimize the spread of a disease as mobility restrictions are lifted.
Along this line, researchers at the Massachusetts Institute of Technology and collaborators
are working on Private Kit: Safe Paths (25), an open-source and privacy-first contact-tracing technology that provides individuals
with information on their proximity with diagnosed COVID-19 carriers, using Global
Positioning System (GPS) and Bluetooth data. Similarly, several European universities,
research centers, and companies have joined forces around PEPP-PT [Pan-European Privacy
Preserving Proximity Tracing (34)], a collaboration on privacy-preserving, General Data Protection Regulation (GDPR)–compliant
contact tracing. As part of this effort, a consortium of research institutions, led by
the École Polytechnique Fédérale de Lausanne (EPFL), has developed an open Decentralized
Privacy-Preserving Proximity Tracing protocol and implementation using Bluetooth low-energy
functionality on smartphones, ensuring that personal data and computation stay entirely
on individuals’ phones (35). Recently, Apple and Google have released a joint announcement (36) describing their system to support Bluetooth-based privacy-preserving proximity
tracing across iOS and Android smartphones. As a part of the European Commission recommendation
of a coordinated approach to support the gradual lifting of lockdown measures (37), European Union (EU) member states, supported by the Commission, have developed
a toolbox for the development and usage of contact tracing apps, fully compliant with
EU rules (38).
SPECIFIC METRICS FOR DATA-SUPPORTED DECISIONS
Researchers and practitioners have developed a variety of aggregated metrics using
mobile phone data that can help fill gaps in information needed to respond to COVID-19
and address uncertainties regarding mobility and behaviors. Origin-destination (OD)
matrices are especially useful in the first epidemiological phases, where the focus
is to assess the mobility of the population. The number of people moving between two
different areas daily can be computed from the mobile network data, and it can be
considered a proxy of human mobility. The geographic areas of interest might be zip
codes, municipalities, provinces, or even regions. These mobility flows are compared
to those during a reference period to assess the reduction in mobility due to nonpharmaceutical
interventions. In particular, they are useful to monitor the impact of different social
and mobility contention measures and to identify regions where the measures might
not be effective or followed by the population. Moreover, these flows can inform spatially
explicit disease transmission models to evaluate the potential benefit of such reductions.
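As an illustration of the OD comparison described above, the following minimal sketch counts trips per origin-destination pair and computes the relative change in each flow against a reference period. The function names, the toy area labels, and the trip lists are assumptions made for this example only; real pipelines would start from operator-side aggregates rather than from individual trips.

```python
from collections import Counter


def od_matrix(trips):
    """Count trips per (origin, destination) pair.

    `trips` is an iterable of (origin_area, destination_area) tuples.
    No user identifier is needed once trips are extracted.
    """
    return Counter(trips)


def relative_change(current, baseline):
    """Relative change in flow per OD pair versus a reference period.

    A value of -0.5 means the flow halved. Pairs with no baseline
    flow are skipped because there is nothing to compare against.
    """
    return {
        pair: (current.get(pair, 0) - base) / base
        for pair, base in baseline.items()
        if base > 0
    }


# Hypothetical toy data: daily trips between areas "A", "B", and "C".
baseline_trips = [("A", "B")] * 4 + [("B", "C")] * 2
lockdown_trips = [("A", "B")] * 1 + [("B", "C")] * 2

baseline = od_matrix(baseline_trips)
lockdown = od_matrix(lockdown_trips)
change = relative_change(lockdown, baseline)
```

In this toy case the A-to-B flow drops by 75% while the B-to-C flow is unchanged, which is the kind of signal that could flag a region where measures are not being followed.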
Dwell estimations and hotspots are estimates of particularly high concentration of
people in an area, which can be favorable to the transmission of the virus. These
metrics are typically constructed within a municipality by dividing the city into
grids or neighborhoods (39). The estimated number of people in each geographical unit can be computed with different
time granularities (e.g., 15 min, 60 min, and 24 hours).
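A hotspot estimate of this kind can be sketched as a count of distinct devices per grid cell and time bin. The record layout (device, cell, minute of day), the threshold, and the sample values below are hypothetical and chosen only to make the example self-contained.

```python
from collections import defaultdict


def occupancy(sightings, bin_minutes=60):
    """Count distinct devices per (grid cell, time bin).

    `sightings` is an iterable of (device_id, cell_id, minute_of_day).
    `bin_minutes` sets the time granularity (e.g., 15, 60, or 1440).
    """
    devices = defaultdict(set)
    for device_id, cell_id, minute in sightings:
        devices[(cell_id, minute // bin_minutes)].add(device_id)
    return {key: len(ids) for key, ids in devices.items()}


def hotspots(counts, threshold):
    """Return the (cell, bin) keys whose device count meets the threshold."""
    return sorted(key for key, n in counts.items() if n >= threshold)


# Hypothetical sightings; "u1" is seen twice in cell "B" in the same hour
# and is counted only once there.
sightings = [
    ("u1", "B", 605), ("u2", "B", 610), ("u3", "B", 650),
    ("u1", "A", 480), ("u2", "C", 700), ("u1", "B", 615),
]
counts = occupancy(sightings, bin_minutes=60)
hot = hotspots(counts, threshold=3)
```

Using a set per cell and bin avoids double counting a device that generates several events within the same interval.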
Contact matrices estimate the number and intensity of the face-to-face interactions
people have in a day. They are typically computed by age groups. These matrices have
been shown to be extremely useful to assess and determine the decrease of the reproduction
number of the virus (6). However, it is still challenging to estimate face-to-face interactions from colocation
and mobility data (40). Contact-tracing apps can then be used to identify close contacts of those infected
with the virus.
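The aggregation step behind an age-structured contact matrix can be sketched as follows. This example assumes that contact pairs have already been identified, which, as noted above, is the hard part when starting from colocation data. The age grouping, the person identifiers, and the choice to report raw counts rather than mean contacts per person are all assumptions of this sketch.

```python
AGE_GROUPS = ["0-17", "18-64", "65+"]  # hypothetical grouping


def contact_matrix(contacts, age_of):
    """Symmetric matrix of contact counts between age groups.

    `contacts` is an iterable of (person_a, person_b) pairs.
    `age_of` maps each person to one of AGE_GROUPS.
    A contact within one group is counted once on the diagonal.
    """
    index = {group: i for i, group in enumerate(AGE_GROUPS)}
    matrix = [[0] * len(AGE_GROUPS) for _ in AGE_GROUPS]
    for a, b in contacts:
        i, j = index[age_of[a]], index[age_of[b]]
        matrix[i][j] += 1
        if i != j:
            matrix[j][i] += 1
    return matrix


# Hypothetical toy data.
age_of = {"p1": "0-17", "p2": "18-64", "p3": "18-64", "p4": "65+"}
contacts = [("p1", "p2"), ("p2", "p3"), ("p3", "p4"), ("p1", "p3")]
matrix = contact_matrix(contacts, age_of)
```

Dividing each row by the number of people in that age group would give the per-person contact rates that transmission models typically expect.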
Amount of time spent at home, at work, or other locations are estimates of the individual
percentage of time spent at home/work/other locations (e.g., public parks, malls,
and shops), which can be useful to assess the local compliance with countermeasures
adopted by governments. The home and work locations need to be computed in a period
of time before the deployment of mobility restrictions measures. The percentage of
time spent in each location needs to be computed for people who do not move during
this time. Variations of the time spent at different locations are generally computed
on an individual basis and then spatially aggregated at a zip code, municipality,
city, or region level.
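The two-step computation described above, individual shares first and spatial aggregation second, might look like the sketch below. Home and work locations are assumed to have been inferred beforehand from a pre-restriction period; the location identifiers, the area code "Z1", and the stay durations are hypothetical.

```python
from collections import defaultdict


def time_shares(stays, home, work):
    """Fraction of observed time spent at home, at work, and elsewhere.

    `stays` is a list of (location_id, minutes) for one person.
    `home` and `work` are that person's previously inferred locations.
    """
    totals = {"home": 0, "work": 0, "other": 0}
    for location, minutes in stays:
        if location == home:
            totals["home"] += minutes
        elif location == work:
            totals["work"] += minutes
        else:
            totals["other"] += minutes
    total = sum(totals.values())
    if total == 0:
        return totals
    return {place: minutes / total for place, minutes in totals.items()}


def mean_home_share_by_area(users):
    """Average the individual home shares within each area.

    `users` is an iterable of (area_id, shares) pairs.
    """
    sums = defaultdict(lambda: [0.0, 0])
    for area, shares in users:
        sums[area][0] += shares["home"]
        sums[area][1] += 1
    return {area: total / count for area, (total, count) in sums.items()}


# Hypothetical one-day stays for two people living in area "Z1".
shares_u1 = time_shares([("h1", 720), ("w1", 480), ("park", 240)], "h1", "w1")
shares_u2 = time_shares([("h2", 1440)], "h2", "w2")
by_area = mean_home_share_by_area([("Z1", shares_u1), ("Z1", shares_u2)])
```

Only the area-level averages would need to leave the operator's systems, which keeps the individual shares out of any shared output.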
Although there is still little information about the age-specific susceptibility to
COVID-19 infection, it is clear that age is an important risk factor for COVID-19
severity. We highlight, therefore, the importance of estimating the metrics mentioned
above by age groups (6). Figure 2 shows an example of such metrics.
Fig. 2
Extraction of aggregated metrics from mobile phone data.
(A) Raw data representing 1-day mobility of two users. In this example, the area B
is a hotspot, as it shows a high concentration of people. (B) OD matrix of five different
areas, counting the number of trips from one area (rows) to another area (columns).
(C) Contact matrix counting the number of potential face-to-face interactions between
age groups. (D) Percentage of time spent at home, work, and other locations.
WHY IS THE USE OF MOBILE PHONE DATA NOT WIDESPREAD, OR A STANDARD, IN TACKLING EPIDEMICS?
The use of mobile phone data for tackling the COVID-19 pandemic has gained attention
but remains relatively scarce. Although local alliances have been formed, internationally
concerted action is missing, both in terms of coordination and information exchange
(22). In part, this is the result of a failure to institutionalize past experiences.
During the 2014–2016 Ebola virus outbreak, several pilot or one-off activities were
initiated. However, there was no transition to “business as usual” in terms of standardized
procedures to leverage mobile phone data or establish mechanisms for “data readiness”
in the country contexts (41, 42). Technology has evolved with various platforms offering enhanced and secured access
and analysis of mobile data, including for humanitarian and development use cases
[e.g., Open Algorithms for Better Decisions Project (43) and Flowkit (44)]. Furthermore, high-level meetings have been held [e.g., the European Commission’s
business-to-government (B2G) data sharing high-level expert group], and data analysis
and sharing initiatives have shown promising results, yet the use of metrics and insights
derived from mobile phone data by governments and local authorities is still minimal
today (43). Several factors likely explain this “implementation” gap.
First, governments and public authorities frequently are unaware and/or lack a “digital
mindset” and capacity needed both for processing information that often is complex
and requires multidisciplinary expertise (e.g., mixing location and health data and
specialized modeling) and for establishing the necessary interdisciplinary teams and
collaborations. Many government units are understaffed and sometimes also lack technological
equipment. During the COVID-19 pandemic, most authorities are overwhelmed by the multiplicity
and simultaneity of requests; as they have never been confronted with such a crisis,
there are few predefined procedures and guides, so targeted and preventive action
is quickly abandoned for mass actions. These problems are exacerbated at local levels
of governments (e.g., towns and counties), which are precisely the authorities doing
the frontline work in most situations. In addition, many public authorities and decision-makers
are not aware of the value that mobile phone data would provide for decision-making
and are often used to making decisions without knowing the full facts and under conditions
of uncertainty.
Second, despite substantial efforts, access to data remains a challenge. Most companies,
including mobile network operators, tend to be very reluctant to make data available—even
aggregated and anonymized—to researchers and/or governments. Apart from data protection
issues, such data are also seen and used as commercial assets, thus limiting the potential
use for humanitarian goals if there are no sustainable models to support operational
systems. One should also be aware that not all mobile network operators in the world
are equal in terms of data maturity. Some are actively sharing data as a business,
while others have hardly started to collect and use data.
Third, the use of mobile phone data raises legitimate public concerns about privacy,
data protection, and civil liberties. Governments in China, South Korea, Israel, and
elsewhere have openly accessed and used personal smartphone app data for tracking
individual movements and notifying individuals. However, in other regions, such as
in Europe, both national and regional legal regulations limit such use (especially
the EU law on data protection and privacy known as the GDPR). Furthermore, around
the world, public opinion surveys, social media, and a broad range of civil society
actors including consumer groups and human rights organizations have raised legitimate
concerns around the ethics, potential loss of privacy, and long-term impact on civil
liberties resulting from the use of individual mobile data to monitor COVID-19. Control
of the pandemic requires control of people—including their mobility and other behaviors.
A key concern is that the pandemic is used to create and legitimize surveillance tools
used by government and technology companies that are likely to persist beyond the
emergency. Such tools and enhanced access to data may be used for purposes such as
law enforcement by the government or hypertargeting by the private sector. Such an
increase in government and industry power and the absence of checks and balances are
harmful in any democratic state. The consequences may be even more devastating in
less democratic states that routinely target and oppress minorities, vulnerable groups,
and other populations of concern.
Fourth, researchers and technologists frequently fail to articulate their findings
in clear, actionable terms that respond to practical political and technical questions.
Researchers and domain experts tend to define the scope and direction of analytical
problems from their perspective and not necessarily from the perspective of governments’
needs. Critical decisions have to be taken, while key results are often published
in scientific journals and in jargon that are not easily accessible to outsiders,
including government workers and policy makers.
Last, little political will and few resources are invested to support preparedness
for immediate and rapid action. On country levels, there are too few latent and standing
mixed teams, composed of (i) representatives of governments and public authorities,
(ii) mobile network operators and technology companies, and (iii) different topic
experts (virologists, epidemiologists, and data analysts); and there are no predefined procedures
and protocols. None of these challenges are insurmountable, but they require
a clear call for action.
A CALL TO ACTION TO FIGHT COVID-19
To effectively build the best, most up-to-date, relevant, and actionable knowledge,
we call on governments, mobile network operators and technology companies (e.g., Google,
Facebook, and Apple), and researchers to form mixed teams. Governments should be aware
of the value of information and knowledge that can be derived from mobile phone data
analysis, especially for monitoring the necessary measures to contain the pandemic.
They should enable and leverage the fair and responsible provision and use of aggregated
and anonymized data for this purpose. Mobile network operators and technology companies
with widespread adoption of their products (e.g., Facebook, Google, and Apple) should
recognize their social responsibility and the vital role that they can play in tackling
the pandemic. They should reach out to governments and the research community. Researchers
and domain experts (e.g., virologists, epidemiologists, demographers, data scientists,
computer scientists, and computational social scientists) should acknowledge the value
of interdisciplinary teams and context specificities and sensitivities. Impact would
be maximized if governments and public authorities are included early on and throughout
their efforts to identify the most relevant questions and knowledge needs. Creating
multidisciplinary interinstitutional teams is of paramount importance, as recently
shown successfully in Belgium and the Valencian region of Spain (45). Four key principles should guide the implementation of such mixed teams to improve
their effectiveness, namely (i) the early inclusion of governments, (ii) the liaising
with data protection authorities early on, (iii) international exchange, and (iv)
preparation for all stages of the pandemic.
Relevant government and public authorities should be involved early, and researchers
need to build upon their knowledge systems and need for information. One key challenge
is to make insights actionable: how can findings such as propagation maps ultimately be
used (e.g., for setting quarantine zones, informing local governments, and targeting
communication)? At the same time, expectations must be realistic: Decisions on measures
should be based on facts but are, in the end, political decisions. Many insights derived
from mobile phone data analytics do not have practical implications—such analysis
and the related data collection should be discouraged until proven necessary.
We also suggest such efforts be transparent and involve data protection authorities
and civil liberties advocates early on and have quick iteration cycles with them.
For example, policy makers should consider the creation of an ethics and privacy advisory
committee to oversee and provide feedback on projects. This ensures that privacy is
maintained and raises potential user acceptance. Aggregated mobile phone data can
be used in line even with the strict European regulations (GDPR). Earlier initiatives
have established principles and methods for sharing data or indicators without endangering
personal information and have built privacy-preserving solutions that use only incentives
to manage behavior (46–48). The early inclusion of the data protection authority in Belgium has led to the
publishing of a statement by the European Data Protection Board on how to process
mobile phone data in the fight against COVID-19 (49). Even while acknowledging the value of mobile phone data, the urgency of the situation
should not lead to losses of data privacy and other civil liberties that might become
permanent after the pandemic. In this regard, the donation of data for good and the
direct and limited (in time and scope) sharing of aggregated data by mobile network
operators with (democratic) governments and researchers seem to be less problematic
than the use of individual location data commercially acquired, brought together,
and analyzed by commercial enterprises. More generally, any emergency data system
set to monitor COVID-19 and beyond must follow a balanced and well-articulated set
of data policies and guidelines and be subjected to risk assessments.
Specifically, any efforts should meet clear tests on the proportionate, legal, accountable,
necessary, and ethical use of mobile phone data in the circumstances of the pandemic
and seek to minimize the amount of information gathered to what is necessary to accomplish
the objective concerned. These are not unknown criteria; they are well inscribed into
international human rights standards and law concerning, for example, the use of force.
Certainly, the use of mobile phone data does not equate to the use of force, but in
the wrong hands, it can have similarly devastating effects and lead to substantially
curtailed civil liberties. Considering the broad absence of legal frameworks and historical
mishandling of data by technology companies, there is an urgent need for responsible
global leadership and governance to guide efforts to use technology in times of emergency.
We further see a clear need for more international exchange, not only with other domain
experts but also with other initiatives and groups; findings must be shared quickly—there
will be time for peer-reviewed publications later. In particular, in countries with
weaker health (and often also economic) systems, the targeting and effectiveness of
nonpharmaceutical interventions might make a big difference. This also implies the
translation of important findings from English to other relevant languages.
For later stages of the pandemic, and for the future, stakeholders should aim for
a minimum level of “preparedness” for immediate and rapid action. On country and/or
region levels, there will be a need for “standing” mixed teams; up-to-date technology,
basic agreements, and legal prescriptions; and predefined data access, procedures, and protocols
[also for “appropriate anonymization and aggregation protocols” (46)]. A long-time collaboration between infectious disease modelers, epidemiologists,
and researchers of mobile network operator laboratories in France helped jump-start
a project on the COVID-19 pandemic, with the support of public health authorities
(50).
Last, in addition to (horizontal) international exchange, we also need international
approaches that are coordinated by supranational bodies. National initiatives might
help to a certain extent but will not be sufficient in the long run. A global pandemic
necessitates globally or at least regionally coordinated work. Here, promising approaches
are emerging: the EU Commission on 23 March 2020 called upon European mobile network
operators to hand over anonymized and aggregated data to the Commission to track virus
spread and determine priority areas for medical supplies (51), while other coordination initiatives are emerging in Africa, Latin America, and
the MENA (Middle East and North Africa) region. It will be important for such initiatives
to link up, share knowledge, and collaborate. The COVID-19 pandemic will not be over
soon, and it will not be the last pandemic we face. Privacy-aware and ethically acceptable
solutions to use mobile phone data should be prepared and vetted in advance, and we
must raise readiness on national and international levels, so we can act rapidly when
the crisis hits.