The use of data – both to improve situational awareness of the spread of the virus and to alert people at risk and break the chain of transmission – has become a central part of the debate around the response to Covid-19.
This has primarily focused on the use of proximity data to support automated contact tracing. Here there are legitimate concerns around take-up and privacy, and whether countries choose a centralised option – as is being tested in the UK – or a decentralised one, building on the API that Apple and Google have developed. Our briefing Contact Tracing Apps: What the UK Government Should Do Next explores some of the issues surrounding Bluetooth-based apps in greater detail.
The other potentially significant opportunity for tackling Covid-19 relates to location data. Location data raises further privacy questions, but there are also important technical issues, such as the accuracy and precision of data, and broader trade-offs, including the prospect of knock-on costs caused by weak data, that require attention. This briefing on the potential and use of location data explores some of these areas in more detail.
Location data can help identify individuals who may have been exposed to the virus as well as help monitor compliance with social distancing and enforce restrictions on movement.
Aggregated Location Data
Analysis of aggregated location data can be used to identify hotspots of transmission and forecast transmission trends. This can help governments measure the efficacy of existing measures and guide future decisions on subjects such as public-health interventions and where to allocate testing and medical resources. This kind of data will be particularly significant for governments as lockdowns are eased; it is essential that governments are able to gather real-time insights on the effectiveness of their interventions.
Good location data will be valuable for public-health and mobility insights, both on an individual level and at the community level. However, the technical details of the data, and the practical implications of the origin and value of location data, need greater consideration. The technological community has only been working on using location data in respect to Covid-19 for a short period of time, and the task is more complex than simply repurposing existing technology for a use it was not designed for. Precision, accuracy and volume are all important factors to consider when thinking about location data and Covid-19; not all of them were necessary for the pre-Covid-19 use of location data in consumer devices and communications infrastructure.
Medical experts currently believe that the virus is transmissible within 2 metres, meaning a person must come within 2 metres of an infected person to have a chance of contracting it through social interaction. Effective digital contact tracing therefore requires highly precise data. However, most extant technology was not designed to rapidly geolocate devices at that level of precision, so most location data is precise only to distances greater than 2 metres. The ongoing challenge for technologists is either to adapt extant technology for a purpose for which it was not designed or to build new solutions that can deliver the required level of precision.
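The gap between the 2-metre threshold and the precision of typical location data can be made concrete with a short sketch. The Python below is illustrative only: the function names, the idea of attaching an uncertainty radius to each reading and the three-way classification are assumptions introduced here, not a description of any deployed contact-tracing system. It computes the distance between two readings and asks whether, given each reading's uncertainty, contact within 2 metres can be confirmed, ruled out or neither.

```python
import math

EARTH_RADIUS_M = 6_371_000   # mean Earth radius in metres
CONTACT_THRESHOLD_M = 2.0    # transmission distance cited in the text


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def classify_contact(distance_m, uncertainty_a_m, uncertainty_b_m):
    """Return 'contact', 'no contact' or 'undetermined' for a pair of readings.

    Assumption: the true distance may differ from the measured distance by up
    to the sum of the two uncertainty radii, so best and worst cases are tested.
    """
    slack = uncertainty_a_m + uncertainty_b_m
    closest = max(0.0, distance_m - slack)
    farthest = distance_m + slack
    if farthest <= CONTACT_THRESHOLD_M:
        return "contact"
    if closest > CONTACT_THRESHOLD_M:
        return "no contact"
    return "undetermined"
```

Under these assumptions, two readings 1 metre apart that each carry a 10-metre uncertainty come back as "undetermined": the data cannot say whether the devices were within 2 metres of each other.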
With aggregated mobility insights, precision still matters but is less critical than it is for individual contact tracing.
Accuracy refers to how close a measured location is to the actual location. For example, a device may be inaccurately located to London when the user is actually in Manchester. Various factors can affect data accuracy. For example, when mobile devices lack a clear line of sight to GPS satellites, such as in an area with multiple skyscrapers, accuracy can diminish.
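The distinction between accuracy and precision can be illustrated with a small calculation. The sketch below is a simplification under stated assumptions: positions are expressed as (x, y) offsets in metres on a flat local grid, accuracy error is taken as the distance from the average reading to the true position, and precision is taken to mean how tightly repeated readings cluster around their own average. The function name is hypothetical.

```python
import math
from statistics import mean


def accuracy_and_precision(readings, true_position):
    """Return (accuracy_error_m, precision_spread_m) for repeated readings.

    readings: list of (x, y) positions in metres on a local flat grid.
    true_position: the (x, y) position the device actually occupied.
    """
    centre_x = mean(x for x, _ in readings)
    centre_y = mean(y for _, y in readings)
    # Accuracy: how far the average reading sits from the true position.
    accuracy_error = math.hypot(centre_x - true_position[0],
                                centre_y - true_position[1])
    # Precision: root-mean-square spread of the readings around their average.
    precision_spread = math.sqrt(
        mean((x - centre_x) ** 2 + (y - centre_y) ** 2 for x, y in readings)
    )
    return accuracy_error, precision_spread


# Four tightly clustered readings that all sit about 100 metres from the truth:
# precise, but not accurate.
error_m, spread_m = accuracy_and_precision(
    [(100, 0), (100, 2), (102, 0), (102, 2)], true_position=(0, 0)
)
```

In this example the readings agree with one another to within a couple of metres, yet all of them are roughly 100 metres from where the device really was.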
Gathering useful location information requires data readings from a high proportion of devices. This is particularly true when health agencies attempt to identify all people with whom an infected person might have come into contact. A reasonably high volume of data is also necessary for mobility insights, as under-sampling can yield biased insights.
“Mobile location data” is used as a catch-all term, but these data can come from different sources. Each source has different accuracy, precision and scale. The implications of and insights from analysing these data can have significant public-health consequences, so it is important to understand the different sources and their respective strengths and weaknesses.
Using location data effectively requires extensive subject-matter expertise. While dashboards can be visually appealing, it is essential for governments to better understand these data sources to ask effective questions and understand how conclusions derived from these data might be biased.
It’s also important to note that all forms of location data require people to carry a mobile device and to keep it switched on. Fewer than 50 per cent of the world’s population own smartphones, so location data cannot generate insights about entire populations.
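A simple calculation shows how uneven device coverage can bias mobility insights. The sketch below is hypothetical: the two areas, their device counts and their coverage rates are invented for illustration, and dividing by the coverage rate is only the crudest possible correction, which assumes that people with and without devices move in the same way.

```python
def estimated_people_moving(devices_observed, coverage_rate):
    """Scale an observed device count up to an estimated number of people.

    coverage_rate is the assumed share of people in an area whose device
    appears in the data set (between 0 and 1).
    """
    if not 0 < coverage_rate <= 1:
        raise ValueError("coverage_rate must be in the range (0, 1]")
    return devices_observed / coverage_rate


# Hypothetical figures: area A has high coverage, area B has low coverage.
area_a = estimated_people_moving(devices_observed=2400, coverage_rate=0.8)
area_b = estimated_people_moving(devices_observed=1200, coverage_rate=0.4)
```

Read naively, the raw counts suggest twice as much movement in area A as in area B. Once the assumed coverage rates are taken into account, the estimated number of people moving is the same in both.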
The precision of GPS data is variable. While GPS data can have very high precision, it can take upwards of a minute to precisely geolocate a device. Moreover, GPS precision can degrade rapidly as devices move, especially through areas where line of sight to GPS satellites can be blocked, such as urban cores.
GPS-derived location data does not generally incorporate measures of altitude. In a block of flats, devices in the same corner of the building will appear to be close together, even if they are several floors apart.
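The effect of missing altitude can be shown with two distance calculations. This is a minimal sketch under assumptions introduced here: positions are (x, y, z) offsets in metres within a building, and each storey is taken to be about 3 metres high. The function names are hypothetical.

```python
import math


def horizontal_distance_m(a, b):
    """Distance between two (x, y, z) points when altitude is ignored."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def full_distance_m(a, b):
    """Straight-line distance between two (x, y, z) points."""
    return math.dist(a, b)


# Two devices in the same corner of a block of flats, three floors apart
# (assuming roughly 3 metres per storey).
ground_floor_device = (0.0, 0.0, 0.0)
third_floor_device = (1.0, 0.0, 9.0)

apparent = horizontal_distance_m(ground_floor_device, third_floor_device)
actual = full_distance_m(ground_floor_device, third_floor_device)
```

Without altitude the devices appear to be 1 metre apart, inside the 2-metre threshold. With altitude included they are about 9 metres apart, separated by several floors.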
Much GPS data comes from apps that have software development kits (SDKs) embedded in them.
Bluetooth is a wireless technology standard that allows devices to communicate over short distances. Instead of directly tracking location, it tracks interactions.
Bluetooth beacons can be installed in public places such as shops, restaurants and shopping malls. These can be used to track devices based on the known coordinates of the beacon.
Employers can install Bluetooth beacons in the workplace in order to work out where an infected employee has been and therefore where they may have contracted or transmitted the virus.
This method can even be used to keep track of medical equipment in a hospital setting.
As Bluetooth is arguably one of the most accurate technologies for identifying proximity, this method can provide useful insights.
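The beacon approach described above can be sketched as a simple co-presence check. The Python below is an illustration rather than a real beacon API: the record format (device ID, beacon ID, timestamp), the identifiers and the 15-minute window are all assumptions made for the example.

```python
from datetime import datetime, timedelta


def devices_seen_near(sightings, infected_id, window=timedelta(minutes=15)):
    """Return IDs of devices seen at the same beacon as the infected device
    within `window` of one of its sightings.

    sightings: list of (device_id, beacon_id, timestamp) tuples.
    """
    infected_sightings = [
        (beacon, seen_at)
        for device, beacon, seen_at in sightings
        if device == infected_id
    ]
    nearby = set()
    for device, beacon, seen_at in sightings:
        if device == infected_id:
            continue
        for infected_beacon, infected_at in infected_sightings:
            if beacon == infected_beacon and abs(seen_at - infected_at) <= window:
                nearby.add(device)
                break
    return nearby


# Hypothetical workplace sightings.
noon = datetime(2020, 5, 1, 12, 0)
example_sightings = [
    ("dev-a", "lobby", noon),
    ("dev-b", "lobby", noon + timedelta(minutes=10)),
    ("dev-c", "lobby", noon + timedelta(hours=2)),
    ("dev-d", "canteen", noon + timedelta(minutes=5)),
]
flagged = devices_seen_near(example_sightings, infected_id="dev-a")
```

Here only "dev-b" is flagged: "dev-c" passed the same beacon two hours later and "dev-d" was at a different beacon. Because the result depends on which beacon saw a device, it records interactions with a known place rather than a continuous location trail.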
Mobile Network Data
Mobile network data can be both accurate and high volume, although this depends on the number of cell towers and the mobile penetration rate in any given area. Mobile network data is also supported by well-established privacy and security infrastructure. Harnessing this data, however, requires working with partners experienced in accessing and working with it.
4G and 5G Networks
Compared to 4G, 5G improves the precision and accuracy with which mobile devices can be located. Ongoing research suggests that it may be possible to locate devices to within 1 metre, offering a solution to some of the current data challenges.
However, there is not yet a sufficient volume of 5G consumer devices on the market, so data volume remains an issue.
So far, the debate around proximity and location data has focused mainly on privacy concerns. In particular, individuals are concerned about authorities building up a detailed picture of their location, movements, behaviour and activities.
But policymakers also need to carefully consider the technical aspects of different sources of location data and the resulting practical impacts. Most notably, if the data used for generating location and mobility insights is weak (low precision and low accuracy), then the privacy implications may be less stark – but the value of the exercise also decreases. Both individual tracking and generating aggregated mobility insights based on weak location data can result in flawed insights. This can have a range of undesirable costs for both individuals and governments.
Governments are often presented with an appealing dashboard but do not spend enough time scrutinising it and the origin of its data. Policymakers must consider trade-offs between the quality and value of the data, and the privacy sacrifices they are asking individuals to make.
The level of precision required for effective contact tracing is very high; it involves knowing the location of people within 2 metres. This could tell you very private details about a person’s life, from where they are sat in a theatre to who they had lunch with and whether they are sleeping in the same room as their spouse. Such granular data requires a high level of trust between citizens and their governments.
To work with this data, we need solid privacy infrastructure and a clear set of ethical guidelines that sets out who has access to the data.
We must also consider the retention rules for data. The global community is learning a lot about the virus as time goes on. If data were kept only for a two-week rolling window before being deleted, governments could not go back in time to analyse the data and identify trends.
Every choice around what data you obtain and for how long has implications for what you can later do with the data.
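The retention trade-off can be sketched in a few lines. The record structure, field name and window lengths below are assumptions chosen for illustration, not a recommended policy.

```python
from datetime import datetime, timedelta


def apply_retention(records, now, retention_days):
    """Keep only records whose timestamp falls inside the rolling window."""
    cutoff = now - timedelta(days=retention_days)
    return [record for record in records if record["timestamp"] >= cutoff]


# Hypothetical records collected 1, 10 and 20 days before "now".
now = datetime(2020, 5, 15)
records = [
    {"timestamp": now - timedelta(days=1)},
    {"timestamp": now - timedelta(days=10)},
    {"timestamp": now - timedelta(days=20)},
]

kept_two_weeks = apply_retention(records, now, retention_days=14)
kept_three_months = apply_retention(records, now, retention_days=90)
```

With a 14-day window the record from 20 days ago is already gone, so any later analysis of that period is impossible. A 90-day window keeps it, but at the cost of holding personal data for longer.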
Governments must maximise the number of people agreeing to share their data, which will likely require privacy safeguards to be in place.
Weak data (i.e. data that is low in precision and accuracy) increases the risk of false-positive alerts or of false insights about the presence of Covid-19 in a population. For example, when it comes to digital contact tracing, if the data is only accurate to a few hundred metres, everybody who has come within a few hundred metres of an infected person would be alerted. In urban areas this could be several hundred people within a short period of time. This has some undesirable consequences. First, if people are getting multiple alerts per week, they are likely to start ignoring them and may lose trust in the app. Second, people who are alerted are likely to need to isolate for several days, which can affect work productivity, personal finances and mental health.
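A rough calculation shows how quickly the number of alerts grows as data becomes less accurate. The sketch below rests on strong simplifying assumptions: people are spread evenly, nobody moves, and the density of 5,000 people per square kilometre is a hypothetical figure for a dense urban area. The function name is also hypothetical.

```python
import math


def people_within_radius(radius_m, density_per_km2):
    """Expected number of people inside a circle of the given radius,
    assuming an evenly spread, stationary population."""
    area_km2 = math.pi * (radius_m / 1000) ** 2
    return area_km2 * density_per_km2


URBAN_DENSITY_PER_KM2 = 5000  # hypothetical urban density

alerts_at_200_m = people_within_radius(200, URBAN_DENSITY_PER_KM2)
alerts_at_2_m = people_within_radius(2, URBAN_DENSITY_PER_KM2)
```

Under these assumptions, data accurate to 200 metres would flag roughly 630 people around a single infected person at a single moment, whereas data accurate to 2 metres would flag fewer than one on average. Because the area grows with the square of the radius, each loss of accuracy multiplies the number of people alerted unnecessarily.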
Similarly, if weak data is used, the government could draw inaccurate insights, which may lead to poor policymaking.
Policymakers must be clear about the level of analysis they are seeking, and realistic about the capabilities of technology to achieve this. Moving from high-level location insights at a community level, such as a building, street or neighbourhood, to insights at an individual level is a challenge, and governments should be prepared to be told that current data infrastructure doesn’t support exactly what they are asking for. Focusing on community data is currently much easier than focusing on individual data. Issues around precision could be addressed by 5G, but that capability is not yet widely available.
Governments must evaluate whether the trade-off they are asking citizens to make is commensurate with the value created. For example, if a government is building individual contact tracing and the data is accurate only to within 1 kilometre, the value of the data is low, and the trade-off may not be worth it. Governments must also be straightforward with the public about the expected benefits and limitations of the technologies they are pursuing, and the trade-offs with other concerns in relation to privacy and data security.
Governments should work with partners, but they should do so by putting out clear calls for assistance to engage with the right level and type of expertise. So far, the engagement from many governments has happened on an ad-hoc basis, and partnerships between government and companies or researchers have happened as a result of partners approaching government first. Instead, governments must be clear about their objectives from the outset and put out a call for support from technical experts. Mobile operators can help governments analyse data on a community level; analysis of this data can produce false conclusions, which mobile operators can help to identify and address.