1 00:00:13,690 --> 00:00:19,030 Welcome back to the Oxford Mathematics Public Lectures, Home Edition. My name is Alain Goriely 2 00:00:19,030 --> 00:00:24,040 and I'm in charge of external relations for the Mathematical Institute. As usual, special 3 00:00:24,040 --> 00:00:29,050 thanks to our sponsor, XTX Markets. XTX Markets are a leading quantitative-driven 4 00:00:29,050 --> 00:00:34,600 electronic market maker with offices in London, Singapore and New York. 5 00:00:34,600 --> 00:00:39,700 Their support makes it possible for us to provide quality content. 6 00:00:39,700 --> 00:00:45,160 As you know, we are now entering a new phase of the COVID-19 crisis. Following 7 00:00:45,160 --> 00:00:50,290 a lockdown and the hope that the numbers of new infected cases may be manageable, governments 8 00:00:50,290 --> 00:00:55,510 around the world are tentatively trying to unlock their countries. At the mathematical 9 00:00:55,510 --> 00:01:01,120 level, the first phase of the disease was governed by the population dynamics of infection, 10 00:01:01,120 --> 00:01:06,730 a topic that was discussed in a previous public lecture that you can find online. 11 00:01:06,730 --> 00:01:11,800 The second phase of the crisis will depend on our ability to manage and understand social 12 00:01:11,800 --> 00:01:16,810 interaction and human mobility. As you may have heard in recent 13 00:01:16,810 --> 00:01:21,820 news, one of the main ideas is to track individuals through their phones. 14 00:01:21,820 --> 00:01:26,980 What kind of data do we need? How do we use it? What type of mathematics is involved 15 00:01:26,980 --> 00:01:31,990 in this technology? And what are the ethical and legal challenges regarding 16 00:01:31,990 --> 00:01:37,150 privacy? To discuss these ideas, we have invited Professor Renaud 17 00:01:37,150 --> 00:01:42,220 Lambiotte tonight.
Originally from Belgium, Renaud is now Professor of 18 00:01:42,220 --> 00:01:47,260 Networks and Nonlinear Systems at the Mathematical Institute here in Oxford 19 00:01:47,260 --> 00:01:52,570 and a Fellow of Somerville College. Renaud is one of the top network scientists in the world, 20 00:01:52,570 --> 00:01:58,240 and he has been working on mobile phone networks for many years. He is now leading a collaborative 21 00:01:58,240 --> 00:02:03,760 effort to put anonymized data from multiple countries and service providers in a coherent 22 00:02:03,760 --> 00:02:08,950 form so that it can be integrated in various models. If you want 23 00:02:08,950 --> 00:02:14,380 to ask a question, please send it in via social media and we will collate them and send 24 00:02:14,380 --> 00:02:20,510 out answers in the next couple of days. Thank you, Renaud. Please start now. 25 00:02:20,510 --> 00:02:26,210 Hello, everyone, my name is Renaud Lambiotte and I'm a professor in the Mathematical Institute of the University of Oxford. 26 00:02:26,210 --> 00:02:31,310 So today I'm going to talk to you about phones, smartphones, and how to use them to 27 00:02:31,310 --> 00:02:36,320 fight against COVID-19. So clearly, if you 28 00:02:36,320 --> 00:02:42,110 want to steer a country out of the very complex and tragic situation in which it is right now, 29 00:02:42,110 --> 00:02:47,230 you cannot proceed blindfolded. You need data. You need data to guide your actions. 30 00:02:47,230 --> 00:02:52,490 You need data to estimate the prevalence of the disease, with sufficient tests. But you also 31 00:02:52,490 --> 00:02:57,510 need data to capture the way each of us is moving and interacting. 32 00:02:57,510 --> 00:03:03,020 At the end of the day, we are the ones carrying the disease and propagating it to our contacts 33 00:03:03,020 --> 00:03:08,090 by our actions.
So as we'll see today, mobile phones and 34 00:03:08,090 --> 00:03:13,490 more precisely mobile phone data are a very nice opportunity for us to try to capture this mobility 35 00:03:13,490 --> 00:03:18,620 and these social contacts. And they can be useful in different ways. They can be useful 36 00:03:18,620 --> 00:03:23,710 to try to understand better and to model better the dynamics of the disease. They 37 00:03:23,710 --> 00:03:29,150 can also be useful in order to assess the impact of some social distancing measures on mobility, 38 00:03:29,150 --> 00:03:34,460 such as lockdown, for instance, and to track it. And they can also provide 39 00:03:34,460 --> 00:03:40,070 some data-driven support for exit strategies by governments and 40 00:03:40,070 --> 00:03:45,150 decision makers. So this talk is going to be organised as follows. We're 41 00:03:45,150 --> 00:03:50,280 going to start with an introduction about networks in epidemiology. We'll then 42 00:03:50,280 --> 00:03:56,020 talk about the use of mobile phones to estimate contacts and mobility and the very important question of privacy. 43 00:03:56,020 --> 00:04:01,030 We will then look more specifically at the use of mobile phones in order to understand and 44 00:04:01,030 --> 00:04:06,620 fight against COVID-19. And then we're going to conclude. 45 00:04:06,620 --> 00:04:11,730 For the last few months, there's been an intense research effort worldwide in different disciplines to 46 00:04:11,730 --> 00:04:16,770 understand COVID-19 better. These efforts go from the understanding of the 47 00:04:16,770 --> 00:04:23,180 incubation period of the disease and the proportion of people who are asymptomatic after they're infected, 48 00:04:23,180 --> 00:04:28,410 to understanding the physical way by which the spreading takes place, mostly by 49 00:04:28,410 --> 00:04:33,660 aerosols or droplets or surface contamination.
But a very important aspect 50 00:04:33,660 --> 00:04:39,180 as well is the social component associated with the disease: trying to capture 51 00:04:39,180 --> 00:04:44,190 all of the social contacts that exist inside society and trying to understand the way 52 00:04:44,190 --> 00:04:49,890 these contacts, these physical contacts, are actually either slowing down or accelerating 53 00:04:49,890 --> 00:04:55,320 the spread of the disease. And to build your intuition about the importance 54 00:04:55,320 --> 00:05:00,330 of contacts, social contacts and networks to understand 55 00:05:00,330 --> 00:05:05,340 the spread of a disease, let's look at a very different example, not taken from COVID-19, but taken 56 00:05:05,340 --> 00:05:10,440 from sexually transmitted diseases. So in this example, in this 57 00:05:10,440 --> 00:05:15,690 paper, the authors went to a high school and, 58 00:05:15,690 --> 00:05:21,000 by means of interviews and by means of questionnaires, they built what they call a 59 00:05:21,000 --> 00:05:26,190 sexual network. In a sexual network, the nodes represent students 60 00:05:26,190 --> 00:05:31,560 and the links represent sexual relations. And when you look at the sexual network, you clearly 61 00:05:31,560 --> 00:05:36,570 see that there's a lot of diversity in the way that people are interacting with each other. But 62 00:05:36,570 --> 00:05:41,640 you can also see that two people may actually infect each other not by means 63 00:05:41,640 --> 00:05:46,710 of direct connexions, but by means of indirect connexions, through 64 00:05:46,710 --> 00:05:51,840 a path of connexions.
And because of this notion of path, you can define 65 00:05:51,840 --> 00:05:57,060 the notion of connected component, which is a very critical aspect, because it will tell 66 00:05:57,060 --> 00:06:02,220 you that if a node belongs to a large connected component, it will have the possibility 67 00:06:02,220 --> 00:06:07,820 to infect a large part of the whole population. So clearly, networks 68 00:06:07,820 --> 00:06:13,040 and epidemics are very important concepts and very much related to each other. 69 00:06:13,040 --> 00:06:18,170 But on top of a network, you also need a model for the way the disease is evolving. And I will not 70 00:06:18,170 --> 00:06:23,390 say too much about it. There is a very nice talk by my colleague Robin Thompson, and there is a link 71 00:06:23,390 --> 00:06:28,760 here just below towards this talk. But let's just have the basics of it. So 72 00:06:28,760 --> 00:06:34,070 when you try to model disease spreading, typically the way you do it is you divide the population 73 00:06:34,070 --> 00:06:39,260 into compartments, and these compartments represent the different states of the evolution 74 00:06:39,260 --> 00:06:44,960 of the disease. You start by being susceptible, you potentially become infected, and then you 75 00:06:44,960 --> 00:06:50,240 go into a recovered state. In this example, you just have three states. But actually 76 00:06:50,240 --> 00:06:55,460 these models can be generalised. And this is an example of a model that people use for COVID-19, 77 00:06:55,460 --> 00:07:00,470 with many more compartments, to try to predict more accurately 78 00:07:00,470 --> 00:07:05,870 the different stages that a person may follow from being 79 00:07:05,870 --> 00:07:11,150 susceptible to being in the final state of being recovered or deceased in this 80 00:07:11,150 --> 00:07:17,160 case.
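As a minimal sketch of the three-compartment picture described above (susceptible, infected, recovered), the following Python function steps such a model forward in time. It is an illustration, not the model shown on the speaker's slides: the parameter names `beta` (transmission rate) and `gamma` (recovery rate), the well-mixed population and the simple Euler time step are all assumptions made here.

```python
def simulate_sir(beta, gamma, s0, i0, rec0, days, dt=0.1):
    """Euler integration of a basic SIR model (illustrative assumptions only).

    dS/dt = -beta * S * I / N
    dI/dt =  beta * S * I / N - gamma * I
    dR/dt =  gamma * I
    Returns a list of (time, S, I, R) tuples.
    """
    s, i, r = float(s0), float(i0), float(rec0)
    n = s + i + r  # total population, constant in this model
    history = [(0.0, s, i, r)]
    steps = int(round(days / dt))
    for k in range(1, steps + 1):
        new_infections = beta * s * i / n * dt  # susceptible -> infected
        new_recoveries = gamma * i * dt         # infected -> recovered
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((k * dt, s, i, r))
    return history


if __name__ == "__main__":
    # Hypothetical numbers chosen only for illustration.
    run = simulate_sir(beta=0.5, gamma=0.2, s0=999, i0=1, rec0=0, days=100)
    peak_time, _, peak_infected, _ = max(run, key=lambda row: row[2])
    print(f"peak of about {peak_infected:.0f} infected around day {peak_time:.0f}")
```

Models with more compartments, like the one mentioned for COVID-19, follow the same pattern with more state variables and more rates, which is why they need more data to fit.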
And so clearly, this model is probably more accurate, but it also involves many more parameters. 81 00:07:17,160 --> 00:07:25,030 And it raises the question of what would be the right data to use in order to fit those parameters. 82 00:07:25,030 --> 00:07:30,450 So when you have a model for your population and for your disease, well, then a very important quantity 83 00:07:30,450 --> 00:07:36,000 is the so-called reproduction number, R0. This is a number that you might have seen 84 00:07:36,000 --> 00:07:41,040 quite often, actually, in the newspapers. And what R0 is, is the typical number of infections 85 00:07:41,040 --> 00:07:46,090 caused by an individual. And clearly, if this number is above one, 86 00:07:46,090 --> 00:07:51,600 we would expect to have an exponential growth of the disease. If it is below one, we would expect an extinction 87 00:07:51,600 --> 00:07:57,330 of the disease. And for COVID-19, this number is estimated typically between two and three, 88 00:07:57,330 --> 00:08:02,460 meaning that if we want to stop the epidemic, we need to find measures in 89 00:08:02,460 --> 00:08:07,590 order to bring this R0 below one. And one of these measures is social distancing, trying 90 00:08:07,590 --> 00:08:13,540 to decrease the contacts between people and hence to bring this number below one. 91 00:08:13,540 --> 00:08:18,990 Now, when you want to take some strategies against the virus, 92 00:08:18,990 --> 00:08:24,000 well, typically you first need data. And the most basic data would be this one here, about the total 93 00:08:24,000 --> 00:08:29,040 number of reported cases per day. But clearly, behind this single curve, you have a very complex 94 00:08:29,040 --> 00:08:34,410 system with many people living their lives, interacting with each other and possibly 95 00:08:34,410 --> 00:08:39,900 propagating the disease.
And as we'll see, networks will be very important in order to capture 96 00:08:39,900 --> 00:08:45,330 this richness that is behind this curve. And when you then try to act 97 00:08:45,330 --> 00:08:50,850 upon the disease, well, there are two different types of approaches. The first one 98 00:08:50,850 --> 00:08:56,400 looks from today and goes back into the past and tries to trace back the contacts 99 00:08:56,400 --> 00:09:01,590 that infected people had in the past, while another approach tries to predict the future 100 00:09:01,590 --> 00:09:07,350 course of the epidemic. But before we discuss these two different 101 00:09:07,350 --> 00:09:12,510 options, well, let's talk about contact networks for a minute. We've seen 102 00:09:12,510 --> 00:09:17,640 a sexual network, but what about the contact networks, about the physical meetings that each 103 00:09:17,640 --> 00:09:22,800 of us has every day? Well, clearly, my interactions are not the same every 104 00:09:22,800 --> 00:09:27,810 week. The people that I met last week will not be the same as the 105 00:09:27,810 --> 00:09:33,180 ones that I will meet next week. Some might be the same, but some others not. There is a lot of variability in the contacts 106 00:09:33,180 --> 00:09:38,220 that I have. Do I take the same bus? Do I go to the same shop? And so on and 107 00:09:38,220 --> 00:09:43,560 so forth. So it means that we clearly have a first issue, 108 00:09:43,560 --> 00:09:48,630 which is: how would we predict a network in the future if we only know 109 00:09:48,630 --> 00:09:53,640 the network from the past? We also have the problem of uncertainties, that is: how 110 00:09:53,640 --> 00:09:58,770 do you define a contact? Typically we define it to be an interaction 111 00:09:58,770 --> 00:10:04,500 such that people are within a certain number of metres for a certain number of seconds.
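To make the definition of a contact above concrete, here is a small Python sketch that declares a contact when two people stay within a distance threshold for long enough. The function name, the trace format and the default values (2 metres, 900 seconds) are assumptions chosen for illustration; the talk does not fix these thresholds.

```python
import math


def is_contact(trace_a, trace_b, max_distance_m=2.0, min_duration_s=900):
    """Return True if two position traces stay close together for long enough.

    Each trace is a list of (t_seconds, x_metres, y_metres) tuples, sorted by
    time and sampled at the same instants (an assumption of this sketch).
    """
    run_start = None  # time at which the current run of proximity began
    for (ta, xa, ya), (tb, xb, yb) in zip(trace_a, trace_b):
        if ta != tb:
            raise ValueError("traces must share the same timestamps")
        if math.hypot(xa - xb, ya - yb) <= max_distance_m:
            if run_start is None:
                run_start = ta
            if ta - run_start >= min_duration_s:
                return True
        else:
            run_start = None  # proximity interrupted, start again
    return False
```

Changing either threshold changes which meetings count as contacts, so the resulting network depends directly on this choice.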
112 00:10:04,500 --> 00:10:09,870 But you have a threshold here. And where this threshold should be is a matter of discussion, 113 00:10:09,870 --> 00:10:16,230 and this threshold may also depend on the environment, whether you are in a bus or outside in the park. 114 00:10:16,230 --> 00:10:22,680 And finally, even if you have a very good definition of a contact, how do you measure these contacts in large populations? 115 00:10:22,680 --> 00:10:27,890 It seems unrealistic to have the exact contacts of the whole population of the UK 116 00:10:27,890 --> 00:10:33,360 every day. So as we'll see, mobile phones can help us to measure contacts, 117 00:10:33,360 --> 00:10:38,850 but we will not try to measure these contacts exactly. We'll try to measure certain aspects 118 00:10:38,850 --> 00:10:44,500 of the resulting networks that will be important for the disease. 119 00:10:44,500 --> 00:10:49,530 So let's talk about contact tracing. As I said, with contact tracing, what you try to do is actually 120 00:10:49,530 --> 00:10:54,580 to go back into the past. So let's assume that today you find out 121 00:10:54,580 --> 00:10:59,660 that someone shows some symptoms of the disease. Well, very rapidly, 122 00:10:59,660 --> 00:11:05,140 you try to identify the contacts that this person had in the past in order to isolate each of them 123 00:11:05,140 --> 00:11:10,330 before they themselves have the chance of transmitting the disease. And 124 00:11:10,330 --> 00:11:15,340 time is of the essence here. This can be seen from this plot on the left, where 125 00:11:15,340 --> 00:11:20,500 you can see the number of infections that someone causes as a function of the days 126 00:11:20,500 --> 00:11:25,840 after which they are infected, depending on whether this is a pre-symptomatic situation, 127 00:11:25,840 --> 00:11:31,000 symptomatic, environmental or asymptomatic.
But the most important 128 00:11:31,000 --> 00:11:36,770 here is to see that if you were able to identify someone and take them out by quarantining 129 00:11:36,770 --> 00:11:41,980 them just one, two, three days after they are infected, actually many 130 00:11:41,980 --> 00:11:47,110 of the infections that they would have caused otherwise wouldn't happen. So it means that this is 131 00:11:47,110 --> 00:11:52,240 a way to break the spreading chains of the disease and to decrease the R0, 132 00:11:52,240 --> 00:11:57,250 the reproduction number, from its natural value to a value that would be 133 00:11:57,250 --> 00:12:02,350 below one. So another approach is not to look into the 134 00:12:02,350 --> 00:12:07,390 past, but to look towards the future: to look at the situation as of today and to try to 135 00:12:07,390 --> 00:12:12,520 predict what would be the future course of the disease. So clearly, this is something 136 00:12:12,520 --> 00:12:18,090 that is very important for a decision maker, to know what will happen tomorrow or in a week, if she wants, for instance, to 137 00:12:18,090 --> 00:12:23,110 put the right resources in the right place at the right time. But actually, these models are also 138 00:12:23,110 --> 00:12:28,240 used to try to imagine what would happen in the future if certain decisions are taken: 139 00:12:28,240 --> 00:12:33,660 opening schools, for instance, or banning certain types of shops. Now, 140 00:12:33,660 --> 00:12:38,770 when you want to build these models, based on SIR, for instance, you will need some sort 141 00:12:38,770 --> 00:12:43,840 of model of the connectivity of the graph. And in its simplest scenario, you will assume that 142 00:12:43,840 --> 00:12:48,850 the system is basically fully connected. It's a mean-field situation where everybody is connected 143 00:12:48,850 --> 00:12:53,860 to everyone.
So this is an approach that has the advantage of simplicity, with just 144 00:12:53,860 --> 00:12:58,950 a few parameters. But this is clearly very unrealistic. I am very unlikely to meet people living 145 00:12:58,950 --> 00:13:04,080 in Manchester, for instance, with the same probability as people living nearby. 146 00:13:04,080 --> 00:13:10,150 Then on the other side of the spectrum, what you try to do is to capture the exact network of contacts 147 00:13:10,150 --> 00:13:15,970 and then to do some agent-based modelling about whether people are going to be infecting each other through these contacts. 148 00:13:15,970 --> 00:13:21,220 So this is clearly something that is very much refined. But as we said, there's a problem. 149 00:13:21,220 --> 00:13:26,260 What is the network to be used? We know the network from the past, but we don't know the network of the future. 150 00:13:26,260 --> 00:13:31,280 We know that there is a lot of variability. And to try to capture, well, 151 00:13:31,280 --> 00:13:36,370 to get rid of this variability, one way is to take something in between: not to look at the 152 00:13:36,370 --> 00:13:41,990 exact network and exactly who is connected to whom, but instead 153 00:13:41,990 --> 00:13:47,020 to group nodes according to some information. For instance, here, the doctors together, the children together and 154 00:13:47,020 --> 00:13:52,150 so on and so forth. And instead of trying to estimate what is the connexion of one specific doctor to one 155 00:13:52,150 --> 00:13:57,490 specific child, you're going to be trying to look at the total number of connexions 156 00:13:57,490 --> 00:14:02,980 between the doctors and the children.
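The grouping idea above can be sketched in a few lines of Python: instead of keeping every individual contact, count contacts between groups. The function name, the data format and the example labels are hypothetical and only serve the illustration.

```python
from collections import Counter


def group_contact_counts(contacts, group_of):
    """Aggregate individual contacts into counts between groups.

    contacts: iterable of (person_a, person_b) pairs, treated as undirected.
    group_of: dict mapping each person to a group label (role, age band, ...).
    Returns a Counter keyed by an alphabetically sorted pair of group labels.
    """
    counts = Counter()
    for a, b in contacts:
        pair = tuple(sorted((group_of[a], group_of[b])))
        counts[pair] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical toy data.
    groups = {"d1": "doctor", "d2": "doctor", "c1": "child", "c2": "child"}
    observed = [("d1", "c1"), ("c2", "d1"), ("d2", "c1"), ("c1", "c2")]
    print(group_contact_counts(observed, groups))
```

The individual pairs may change from week to week, but the group-level totals are the quantities one would hope to be more stable.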
And by doing so, you're going to be building some 157 00:14:02,980 --> 00:14:08,140 contact matrices between different roles, or between 158 00:14:08,140 --> 00:14:13,870 different groups of people that can be stratified either by these roles, like here, or by their age 159 00:14:13,870 --> 00:14:19,780 or by their gender. And even if we know, or we assume, that there is a lot of variability for 160 00:14:19,780 --> 00:14:25,000 pairs of nodes, because of the law of large numbers we can expect that between 161 00:14:25,000 --> 00:14:31,770 groups we will have numbers that are relatively stable in time. 162 00:14:31,770 --> 00:14:37,050 So there is a second aspect that is extremely important when you want to model the spread of the disease, and it is 163 00:14:37,050 --> 00:14:42,300 geographic space. So if you were to go back into the 14th century and 164 00:14:42,300 --> 00:14:47,700 to study the way the Black Death was propagating across Europe, you would see 165 00:14:47,700 --> 00:14:53,970 a wave, like a wave in the sea, where you would have this wave coming from Turkey, entering 166 00:14:53,970 --> 00:15:00,000 through France and then moving to the north. And you have this very nice but deadly wave propagating 167 00:15:00,000 --> 00:15:05,070 inside Europe. Well, nowadays, we travel and move very differently than we did 168 00:15:05,070 --> 00:15:10,140 back then. We have in particular planes that make us take shortcuts 169 00:15:10,140 --> 00:15:15,850 from one place on earth to another one. And actually, these shortcuts are going to change radically the way 170 00:15:15,850 --> 00:15:21,000 these diseases spread spatially. And to illustrate this, there's a very 171 00:15:21,000 --> 00:15:26,400 nice article by Dirk Brockmann where they were looking at SARS, which also started 172 00:15:26,400 --> 00:15:31,410 in China.
And they were trying to understand what 173 00:15:31,410 --> 00:15:37,340 would be the relation between the time when a disease would appear in a country and different measures of distance. 174 00:15:37,340 --> 00:15:42,360 What you can see on the left is that actually, if you were to try to plot this time of first appearance against the 175 00:15:42,360 --> 00:15:47,550 physical distance in kilometres, you don't seem to have very much of a signal there. But 176 00:15:47,550 --> 00:15:52,580 if instead you were to plot this time with respect to a network distance, 177 00:15:52,580 --> 00:15:57,810 a distance measured in a network where you would have cities as nodes 178 00:15:57,810 --> 00:16:02,900 and weighted edges based on the number of 179 00:16:02,900 --> 00:16:08,100 aeroplane passengers going from one place to another one, well, actually the network distance gives you a very, 180 00:16:08,100 --> 00:16:13,590 very good linear relation, meaning that networks seem to be more predictive than physical distance 181 00:16:13,590 --> 00:16:19,170 to understand and predict the spreading of the disease. So actually, 182 00:16:19,170 --> 00:16:24,720 when people try to incorporate this spatial information into 183 00:16:24,720 --> 00:16:29,980 a modelling framework, they will now consider other networks, not contact networks as before, 184 00:16:29,980 --> 00:16:36,230 but what are usually called metapopulation networks, where the nodes represent locations 185 00:16:36,230 --> 00:16:41,320 and the edges represent flows of population between nodes. And what 186 00:16:41,320 --> 00:16:46,500 we basically assume now is that you would have people physically 187 00:16:46,500 --> 00:16:51,750 moving inside this metapopulation network, interacting with each other and 188 00:16:51,750 --> 00:16:56,790 potentially spreading the disease. So what have we seen so far?
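The network distance discussed above, with cities as nodes and edges weighted by passenger numbers, can be sketched as a shortest-path computation. The talk does not give a formula, so the edge length used here, 1 - ln(P) with P the fraction of passengers leaving a city who travel to a given destination, is an assumption in the spirit of published effective-distance measures; the function name and the toy flows are hypothetical.

```python
import heapq
import math


def effective_distances(flows, source):
    """Shortest-path 'effective distance' from source on a passenger-flow network.

    flows[n][m] is the number of passengers travelling from city n to city m.
    Each edge n -> m gets length 1 - ln(P), where P is the fraction of the
    passengers leaving n that go to m (an assumed choice for this sketch).
    Busy routes are short; rarely used routes are long.
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, n = heapq.heappop(heap)
        if d > dist.get(n, math.inf):
            continue  # stale heap entry
        outgoing = flows.get(n, {})
        total = sum(outgoing.values())
        for m, passengers in outgoing.items():
            if passengers <= 0:
                continue
            candidate = d + 1.0 - math.log(passengers / total)
            if candidate < dist.get(m, math.inf):
                dist[m] = candidate
                heapq.heappush(heap, (candidate, m))
    return dist


if __name__ == "__main__":
    # Hypothetical toy network: A sends most passengers to B, B sends all to C.
    toy = {"A": {"B": 90, "C": 10}, "B": {"C": 100}, "C": {}}
    print(effective_distances(toy, "A"))
```

In this toy network the route from A to C through the busy hub B is shorter than the rarely used direct route, which is the kind of shortcut effect the speaker describes.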
189 00:16:56,790 --> 00:17:03,120 We've seen so far that, actually, models are becoming more and more complex. 190 00:17:03,120 --> 00:17:08,220 And clearly, because of their complexity, we will need data in order 191 00:17:08,220 --> 00:17:13,230 to parametrise and to fit the parameters of these models. We also need data if we 192 00:17:13,230 --> 00:17:19,230 want to monitor the effectiveness of distancing measures. We take measures. 193 00:17:19,230 --> 00:17:24,480 How well are they implemented? What is their impact on the actual behaviour of people? 194 00:17:24,480 --> 00:17:29,790 This is something that needs to be measured, either to strengthen these measures or to replace 195 00:17:29,790 --> 00:17:34,920 them by a measure that would be more appropriate. And finally, as soon as you think 196 00:17:34,920 --> 00:17:39,990 about models and parameters, you need to have day-to-day updates on these parameters, because clearly the way people 197 00:17:39,990 --> 00:17:45,080 move and the way people interact is very different today than it will be tomorrow, and than it 198 00:17:45,080 --> 00:17:50,220 was last week, due to the way that people adapt to the disease and to all of these 199 00:17:50,220 --> 00:17:55,290 distancing measures that are implemented by governments. So let us now look 200 00:17:55,290 --> 00:18:00,390 at the different ways in which people have been using mobile phones as a way to 201 00:18:00,390 --> 00:18:05,550 capture, at a large scale, such mobility and contact patterns, the ones that you 202 00:18:05,550 --> 00:18:10,920 actually need in order to fit these epidemic models. And let's also discuss 203 00:18:10,920 --> 00:18:16,360 the very important question of the privacy associated with these data.
So 204 00:18:16,360 --> 00:18:21,420 clearly, for the last 20 years, more and more of our behaviour and 205 00:18:21,420 --> 00:18:27,300 actions are leaving traces, digital traces, in some databases. 206 00:18:27,300 --> 00:18:32,470 Every time you click something on Facebook, you send an email, or you just use 207 00:18:32,470 --> 00:18:37,770 your GPS, well, some of this information is going to be stored 208 00:18:37,770 --> 00:18:43,050 in some databases online. And actually, researchers 209 00:18:43,050 --> 00:18:49,320 have been trying to exploit all of these digital traces for the last 15, 20 years 210 00:18:49,320 --> 00:18:54,750 in a field called computational social science, at the interface between sociology, computer science and 211 00:18:54,750 --> 00:18:59,910 applied mathematics. Now, there's been wonderful research that people have been 212 00:18:59,910 --> 00:19:05,420 doing, showing all sorts of ways in which these datasets could help us to improve society. 213 00:19:05,420 --> 00:19:11,150 But it's also been recognised that these datasets can be dangerous, and they can be very dangerous for our own privacy 214 00:19:11,150 --> 00:19:16,350 and even for our own personal liberties and freedom. And for this reason, 215 00:19:16,350 --> 00:19:21,540 a few years ago, pushed by 216 00:19:21,540 --> 00:19:27,150 Europe, a very strict regulation was implemented, called the General 217 00:19:27,150 --> 00:19:32,160 Data Protection Regulation, in order to protect individual people from 218 00:19:32,160 --> 00:19:37,380 the unwitting use of their data. And there are now very strict regulations 219 00:19:37,380 --> 00:19:42,600 on what you can or cannot do if you want to collect, use 220 00:19:42,600 --> 00:19:47,640 or share personal datasets. But I still wanted 221 00:19:47,640 --> 00:19:53,250 to say a few things about privacy that are quite important for the rest of this talk.
222 00:19:53,250 --> 00:19:58,440 So the first one is the fact that data that could appear to be very harmless 223 00:19:58,440 --> 00:20:03,720 can still reveal a very personal aspect of your life. And to start this discussion, 224 00:20:03,720 --> 00:20:09,030 let's have a look at a blog post by the famous physicist 225 00:20:09,030 --> 00:20:14,550 Stephen Wolfram, where he looked at all of the emails that he sent 226 00:20:14,550 --> 00:20:19,590 over a period of 10 years. And what he looked at wasn't the content of the messages. He just looked at the 227 00:20:19,590 --> 00:20:24,710 time of the day when each email was sent. And you have this very nice picture here, where you see that 228 00:20:24,710 --> 00:20:29,820 there is a lot of variability, but also some regularities. You see, for instance, that between 1996 229 00:20:29,820 --> 00:20:35,280 and 2002 there was a shift, when he was starting to write more and more emails during the night. 230 00:20:35,280 --> 00:20:40,510 And he actually interprets that as a very, very active period 231 00:20:40,510 --> 00:20:46,350 of his life, when he was working on a book for a couple of years and completely shifting 232 00:20:46,350 --> 00:20:51,510 his working schedule from a day shift to a night shift. So 233 00:20:51,510 --> 00:20:56,700 I could not resist the urge to try it on myself. And I didn't look at 20 years of data. I just 234 00:20:56,700 --> 00:21:01,830 looked at two weeks of data, one week from January, before the lockdown, and one week 235 00:21:01,830 --> 00:21:07,020 of May, after it was implemented. Now, you clearly see here that there's been a complete 236 00:21:07,020 --> 00:21:12,180 reorganisation in the way I organise my working schedule. In January 237 00:21:12,180 --> 00:21:17,460 I was sending most of my emails in the morning, fewer in the afternoon, even fewer in the evenings.
Now, I 238 00:21:17,460 --> 00:21:23,310 send almost no emails in the morning anymore, many in the afternoon, and also a lot quite late at night. 239 00:21:23,310 --> 00:21:28,800 And actually, the mechanism that can explain this change of behaviour is, in my case, simply 240 00:21:28,800 --> 00:21:34,200 homeschooling. Schools are closed. In the mornings, my wife and I teach our kids, 241 00:21:34,200 --> 00:21:40,260 meaning that most of my work has been shifted from the day more towards the night, basically. 242 00:21:40,260 --> 00:21:45,570 And actually, it's kind of revealing that even if the data is very 243 00:21:45,570 --> 00:21:51,030 harmless, it's just about the time when I send my emails, it can still say something about my own life, right? 244 00:21:51,030 --> 00:21:56,070 If you were to look at these data, you could basically guess, or you could predict, that I do have 245 00:21:56,070 --> 00:22:01,070 kids. But also, if you were working in advertising, well, you could 246 00:22:01,070 --> 00:22:06,120 think that I might be slightly sleep deprived due to my late working 247 00:22:06,120 --> 00:22:11,730 hours and that I might be interested in a new brand of, I don't know, fancy 248 00:22:11,730 --> 00:22:17,220 coffee, and send me advertising that I might be willing to respond to. 249 00:22:17,220 --> 00:22:22,720 So, as I say, the time when I send an e-mail is possibly not very important, but still it can be sensitive. 250 00:22:22,720 --> 00:22:27,970 And this is even more so with even more private data, and mobility and contacts are even 251 00:22:27,970 --> 00:22:34,990 more private, because they can reveal very intimate aspects of our lives. 252 00:22:34,990 --> 00:22:40,480 So a second point that is very important is that no one is an island. And actually, 253 00:22:40,480 --> 00:22:45,500 you could share your data even without knowing it and without wanting 254 00:22:45,500 --> 00:22:50,960 it.
And this is exactly what happened when Cambridge Analytica 255 00:22:50,960 --> 00:22:55,970 could get access to a huge number of users while actually having contact with only a 256 00:22:55,970 --> 00:23:01,130 small number. So basically what happened is that about three hundred thousand users 257 00:23:01,130 --> 00:23:06,410 took a quiz organised by Cambridge Analytica, providing their own personal data to 258 00:23:06,410 --> 00:23:11,630 the firm. But actually, because of the way Facebook worked at the time, they did not only provide their 259 00:23:11,630 --> 00:23:17,000 own information, but also the information of their friends. And by this network effect, 260 00:23:17,000 --> 00:23:22,610 basically, Cambridge Analytica was able to capture personal data about 90 million people, 261 00:23:22,610 --> 00:23:28,210 while only targeting directly a very small number of users. 262 00:23:28,210 --> 00:23:33,340 And this network effect is very important because, as we'll see, this is something that could play a role 263 00:23:33,340 --> 00:23:39,670 when you start to collect data about contacts and also about location. 264 00:23:39,670 --> 00:23:44,690 So a third point, which is also extremely important, is the point of anonymization. Indeed, usually when you 265 00:23:44,690 --> 00:23:49,700 deal with personal data, everything is anonymised. So if you had data about 266 00:23:49,700 --> 00:23:54,920 me, instead of having Renaud Lambiotte, you would just have a random number. But 267 00:23:54,920 --> 00:24:00,020 still, it's sometimes possible to de-anonymise data if you start to 268 00:24:00,020 --> 00:24:05,180 couple different datasets together. A very good example here is the one from the New York 269 00:24:05,180 --> 00:24:10,220 City taxi data. So about five, six years ago, a huge 270 00:24:10,220 --> 00:24:15,320 dataset was released about the way taxis were travelling.
In New York City, you would know that the taxi 271 00:24:15,320 --> 00:24:20,600 started from A at a certain time, going to B at another time, 272 00:24:20,600 --> 00:24:25,700 and you would know the price paid by the person and also the tip that was given. It seems pretty 273 00:24:25,700 --> 00:24:30,950 harmless, right? And actually, this data was used by researchers and it could be used 274 00:24:30,950 --> 00:24:35,960 in very positive ways. For instance, researchers showed, by using 275 00:24:35,960 --> 00:24:41,750 this data, that it would be possible to organise shared taxi rides 276 00:24:41,750 --> 00:24:46,850 where everyone was basically having almost the same travelling time, 277 00:24:46,850 --> 00:24:51,890 but reducing the number of trips by a very significant amount, hence decreasing pollution 278 00:24:51,890 --> 00:24:56,930 and traffic. But at the same time, people also tried to couple that dataset to a 279 00:24:56,930 --> 00:25:02,050 paparazzi dataset. And knowing that a certain actor took a cab at a 280 00:25:02,050 --> 00:25:07,160 certain place and left the cab at another place, they could 281 00:25:07,160 --> 00:25:12,410 look at fairly private information about the tips that these stars were giving: 282 00:25:12,410 --> 00:25:17,900 did they give high tips or not? So, of course, here, this is not very important, 283 00:25:17,900 --> 00:25:23,820 right? But this shows that it's in principle possible 284 00:25:23,820 --> 00:25:28,820 to de-anonymise data as soon as the taxi 285 00:25:28,820 --> 00:25:34,280 trip is unique in the dataset and you can somehow 286 00:25:34,280 --> 00:25:40,930 identify this trip with some external information that you may have access to. 287 00:25:40,930 --> 00:25:46,300 So now what about mobility?
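As an aside on the taxi example above, the coupling of two datasets can be sketched in a few lines. This is only a toy illustration under stated assumptions: the trips, sightings, field names and the five-minute tolerance are all hypothetical, and it is not the method used in the actual study. It shows that a record is re-identified as soon as exactly one anonymised trip matches an outside observation.

```python
from datetime import datetime, timedelta

def link_sightings(trips, sightings, tolerance=timedelta(minutes=5)):
    """Return {person: trip} for sightings that match exactly one trip.

    trips: anonymised records (dicts with pickup_place, pickup_time, tip).
    sightings: external observations as (person, place, time) tuples.
    All names and fields here are hypothetical.
    """
    linked = {}
    for person, place, seen_at in sightings:
        candidates = [
            trip for trip in trips
            if trip["pickup_place"] == place
            and abs(trip["pickup_time"] - seen_at) <= tolerance
        ]
        # A unique match is what makes re-identification possible.
        if len(candidates) == 1:
            linked[person] = candidates[0]
    return linked

# Made-up anonymised trips: no names, only place, time and tip.
trips = [
    {"trip_id": 101, "pickup_place": "A", "pickup_time": datetime(2013, 7, 8, 23, 40), "tip": 0.0},
    {"trip_id": 102, "pickup_place": "B", "pickup_time": datetime(2013, 7, 8, 23, 42), "tip": 5.0},
    {"trip_id": 103, "pickup_place": "B", "pickup_time": datetime(2013, 7, 8, 23, 44), "tip": 2.0},
]
# Made-up external information, for example from photographs.
sightings = [
    ("actor_x", "A", datetime(2013, 7, 8, 23, 41)),
    ("actor_y", "B", datetime(2013, 7, 8, 23, 43)),
]

linked = link_sightings(trips, sightings)
for person, trip in linked.items():
    print(person, "-> trip", trip["trip_id"], "tip", trip["tip"])
```

In this toy data only actor_x is linked, because two trips are compatible with the sighting of actor_y. That is the point made in the talk: the risk comes from trips that are unique in the dataset.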
Well, actually, when you try to track mobility by means of phones, 288 00:25:46,300 --> 00:25:51,460 you don't always need smartphones. And actually, let's start with 289 00:25:51,460 --> 00:25:57,160 the very basic functioning of the phone, the mobile phone. 290 00:25:57,160 --> 00:26:02,260 So actually, whatever the phone you use, an old Nokia or the latest iPhone, well, it is 291 00:26:02,260 --> 00:26:07,700 going to be interacting with cell towers. And 292 00:26:07,700 --> 00:26:12,940 some of these interactions are going to be captured in the databases of mobile phone 293 00:26:12,940 --> 00:26:18,070 providers. These are usually called call data records. Every time you make a phone call, receive a phone call, 294 00:26:18,070 --> 00:26:23,380 send a message, there's a new line that's going to tell what is the cell to which you are 295 00:26:23,380 --> 00:26:28,510 connected. There's going to be information about whom you're contacting. And there's information about the time 296 00:26:28,510 --> 00:26:34,540 when this action was made. And in our case, we're interested in location and mobility. 297 00:26:34,540 --> 00:26:40,360 Well, clearly, knowing the cell to which you are attached gives information about your location. 298 00:26:40,360 --> 00:26:45,640 And this information is going to be more or less precise, depending on the density 299 00:26:45,640 --> 00:26:50,650 of the cell towers. Clearly, in cities, in urban environments, you have 300 00:26:50,650 --> 00:26:55,690 a high resolution, of the order of 100 metres, while in rural areas you would 301 00:26:55,690 --> 00:27:00,880 have a resolution of about one kilometre. But still, by tracking down all of your locations 302 00:27:00,880 --> 00:27:06,960 during the day, you can track down some of your mobility behaviour. 303 00:27:06,960 --> 00:27:12,730 So. So, actually, well, it's too late. I'm sorry.
It was last year, in July 2019, that we organised a conference 304 00:27:12,730 --> 00:27:18,250 where researchers used and analysed such datasets in all sorts of contexts, 305 00:27:18,250 --> 00:27:23,290 some of them more industrial ones, others purely academic, but also others 306 00:27:23,290 --> 00:27:28,480 in terms of the potential applications of these datasets for the social good. And just to give you an example 307 00:27:28,480 --> 00:27:33,490 of the kind of positive application that could be made of them, well, there was a very famous paper 308 00:27:33,490 --> 00:27:38,710 in 2011 of researchers using call data records in order to predict the way the population 309 00:27:38,710 --> 00:27:43,750 would move after the earthquake in Haiti, and using those in order to sort 310 00:27:43,750 --> 00:27:48,880 of predict the way cholera would be propagating inside the country. And this is just one paper, amongst 311 00:27:48,880 --> 00:27:53,950 many others, where people try to use call data records as a way to track population 312 00:27:53,950 --> 00:28:00,660 movements and hence to track the way diseases would be spreading inside countries. 313 00:28:00,660 --> 00:28:05,890 Let's go now to smartphones. And for smartphones, well, now you may have apps, but also GPS data 314 00:28:05,890 --> 00:28:11,010 that would become available. So one first interesting approach is when people are 315 00:28:11,010 --> 00:28:16,020 using participatory syndromic surveillance. And basically the idea is to install an app on your phone, 316 00:28:16,020 --> 00:28:21,270 where regularly you would be telling whether or not you have some symptoms of a certain disease, 317 00:28:21,270 --> 00:28:26,700 of flu, for instance. And these datasets are very regularly updated.
318 00:28:26,700 --> 00:28:31,800 You have a regular update of the status of participants, who are basically sensors inside your country, 319 00:28:31,800 --> 00:28:37,020 and you can have real-time maps of the way symptoms are propagating inside 320 00:28:37,020 --> 00:28:43,950 the country, symptoms that may be precursors of a certain disease. 321 00:28:43,950 --> 00:28:49,230 There is also a very nice project that I would like to mention. It's a project that was 322 00:28:49,230 --> 00:28:54,810 basically organised and done in Copenhagen at DTU. 323 00:28:54,810 --> 00:29:00,330 And indeed at DTU, which is a technical university, a group of researchers gave away 1000 phones 324 00:29:00,330 --> 00:29:06,000 to students, freshman students, the deal being that you receive a phone, but you give away 325 00:29:06,000 --> 00:29:11,010 some of your data. And these datasets could be information about your location, 326 00:29:11,010 --> 00:29:16,590 information about your Bluetooth connectivity, information about your activity on social media, 327 00:29:16,590 --> 00:29:21,960 and so on and so forth. And actually, this dataset has been extremely influential 328 00:29:21,960 --> 00:29:27,270 because it was one of the first times that such a comprehensive dataset was shared 329 00:29:27,270 --> 00:29:32,490 with the community of researchers. And that allowed them to explore the different ways by which people 330 00:29:32,490 --> 00:29:37,830 were moving and communicating with each other, but also the techniques that could be used 331 00:29:37,830 --> 00:29:43,730 in order to infer some information from mobility and contact data. 332 00:29:43,730 --> 00:29:48,810 So I just mentioned Bluetooth, and actually Bluetooth is something that people use very often 333 00:29:48,810 --> 00:29:53,810 in order to estimate contacts in a population. How does it work?
Well, actually, if 334 00:29:53,810 --> 00:29:59,330 you have a mobile, a smartphone, your laptop, typically it has some form of Bluetooth implemented. 335 00:29:59,330 --> 00:30:05,000 And Bluetooth is a way for information to be transmitted over very small distances, to connect 336 00:30:05,000 --> 00:30:11,120 your phone to your earphones, to connect your computer to a keyboard, and so on and so forth. 337 00:30:11,120 --> 00:30:16,790 And the thing is that because it has this short-distance radius, 338 00:30:16,790 --> 00:30:22,010 it's also something that people have been using or trying to use for quite a long time, 10, 15 years 339 00:30:22,010 --> 00:30:27,140 at least, in order to try to trace contacts between people. If my phone detects 340 00:30:27,140 --> 00:30:33,830 another phone nearby, it's probably due to the fact that there's a contact between one person and another. 341 00:30:33,830 --> 00:30:39,830 You should just keep in mind that Bluetooth is just an opportunistic 342 00:30:39,830 --> 00:30:45,080 solution. We are using Bluetooth not because it's been designed to estimate 343 00:30:45,080 --> 00:30:50,780 contacts. It's not been designed for that. But we use it because it's implemented by default on many, many phones. 344 00:30:50,780 --> 00:30:56,210 It means that we are trying to use a technology that is available in order to estimate contacts. But it's 345 00:30:56,210 --> 00:31:01,430 also known that it's very difficult to estimate the contacts precisely: what is the exact distance 346 00:31:01,430 --> 00:31:06,800 between people, for instance, or whether or not they are actually looking in the same direction or looking at each other. 347 00:31:06,800 --> 00:31:12,080 So it means that this is one way to measure contact, but it's clearly not ideal, 348 00:31:12,080 --> 00:31:17,630 and actually in the literature, people have been trying to use other ways to capture contacts.
349 00:31:17,630 --> 00:31:23,290 And a very nice one is this project called SocioPatterns, where they were using RFID, 350 00:31:23,290 --> 00:31:28,310 where they gave away some badges to people, and actually these badges are designed in such a way that only 351 00:31:28,310 --> 00:31:34,280 when people are sufficiently close to each other and face to face would an interaction be captured. 352 00:31:34,280 --> 00:31:39,530 So this is a very nice dataset. There have been experiments done in hospitals, in schools, 353 00:31:39,530 --> 00:31:44,750 in many situations, and it has really improved a lot our understanding of the way people 354 00:31:44,750 --> 00:31:50,120 are physically interacting with each other. But unfortunately, it requires some specific 355 00:31:50,120 --> 00:31:56,090 devices that people don't have, and it couldn't be implemented at a population scale. 356 00:31:56,090 --> 00:32:01,100 So to finish this part of the talk, I just wanted to mention something else, the fact that, 357 00:32:01,100 --> 00:32:06,680 as I said, mobile phones have GPS. And actually they track 358 00:32:06,680 --> 00:32:11,680 position fairly regularly in time. So not every five seconds, 359 00:32:11,680 --> 00:32:17,030 as that would drain your battery very, very fast. Actually, there are some tools in existing 360 00:32:17,030 --> 00:32:22,250 operating systems where basically the phone captures 361 00:32:22,250 --> 00:32:27,650 significant changes in position. And basically, you have a sequence 362 00:32:27,650 --> 00:32:33,110 of points where the person has been in the course of the day. Now, 363 00:32:33,110 --> 00:32:38,120 usually these data can be collected by certain apps if you 364 00:32:38,120 --> 00:32:43,460 give your agreement for that. Now, there is a company called Cuebiq that basically collects these datasets 365 00:32:43,460 --> 00:32:49,100 from different apps and uses them in a marketing environment.
366 00:32:49,100 --> 00:32:54,110 Now, it can be used for marketing purposes, but it can also be used for good. And they 367 00:32:54,110 --> 00:32:59,210 have a programme called Data for Good where they give away such 368 00:32:59,210 --> 00:33:04,400 GPS tracks, just the GPS positions of people, to researchers 369 00:33:04,400 --> 00:33:09,770 who are, for instance, interested in epidemiology. And as we'll see, many of the solutions 370 00:33:09,770 --> 00:33:15,080 that people have been implementing in recent times against COVID-19 371 00:33:15,080 --> 00:33:20,780 are actually based on such Cuebiq data. So just so that you know, Cuebiq data in the US 372 00:33:20,780 --> 00:33:26,000 covers about five to 10 percent of the users in the country. This number is much 373 00:33:26,000 --> 00:33:31,040 lower in other countries, including in the UK. So 374 00:33:31,040 --> 00:33:36,220 what about the use of such technologies in order to understand and to fight 375 00:33:36,220 --> 00:33:41,860 against COVID-19? So we talked about contact tracing, and actually 376 00:33:41,860 --> 00:33:47,950 it's been proposed to use mobile phones in order to accelerate and to improve 377 00:33:47,950 --> 00:33:53,410 contact tracing. So usually contact tracing is done manually by means of questionnaires, 378 00:33:53,410 --> 00:33:58,420 which may slow down quite a lot the process of identification of potentially infected 379 00:33:58,420 --> 00:34:03,460 people. And actually, there have been some applications, 380 00:34:03,460 --> 00:34:08,890 mobile apps, developed, for instance, in Asian countries, that have been used 381 00:34:08,890 --> 00:34:14,200 in order to accelerate this process drastically. And there's a very nice paper from our colleague Christophe 382 00:34:14,200 --> 00:34:19,260 Fraser, where actually they look at this contact tracing 383 00:34:19,260 --> 00:34:25,420 by apps from a mathematical point of view.
And show that indeed it is possible to decrease 384 00:34:25,420 --> 00:34:30,580 the reproduction number below this value of one by using these automated contact 385 00:34:30,580 --> 00:34:35,860 tracing applications. Now, clearly, 386 00:34:35,860 --> 00:34:41,350 in their original paper they make approximations, but they also estimate that you would need 387 00:34:41,350 --> 00:34:46,900 about 60 percent of the population to use the app for this application to be sufficient 388 00:34:46,900 --> 00:34:52,330 in order to bring this reproduction number down far enough 389 00:34:52,330 --> 00:34:57,370 for the epidemic to die off naturally. 60 percent is a huge number. 390 00:34:57,370 --> 00:35:02,600 So just to give you an idea, in Singapore, the project TraceTogether reached 391 00:35:02,600 --> 00:35:07,600 about 20 percent of usage, and closer to 392 00:35:07,600 --> 00:35:12,790 here, in Norway, they also had some experiments where they also used applications to track 393 00:35:12,790 --> 00:35:17,830 contacts. In that case, the contacts would 394 00:35:17,830 --> 00:35:23,650 basically be stored with GPS locations at the same time. And even if authorities 395 00:35:23,650 --> 00:35:28,720 claimed that there was a very good accuracy in tracing, it only covered about 20 396 00:35:28,720 --> 00:35:33,790 percent of the population in the test zones. So the thing 397 00:35:33,790 --> 00:35:39,070 is that this application is clearly something that is going to be important. 398 00:35:39,070 --> 00:35:44,230 It's going to be very useful, but it probably won't be useful on its own.
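To get a feeling for why the uptake figures above matter so much, here is a back-of-the-envelope sketch. It is not the model of the paper mentioned in the talk. It rests on one simplifying assumption, stated here and in the code: a contact can only be traced when both people run the app, and app users are spread uniformly at random in the population, so the traceable share of contacts is the square of the uptake. The function name is hypothetical.

```python
def traceable_fraction(uptake):
    """Share of contacts where both people run the app.

    Assumption (not from the talk): users are spread uniformly at
    random, so both ends of a contact have the app with probability
    uptake * uptake.
    """
    return uptake ** 2

# Uptake levels quoted in the talk: about 20 percent and 60 percent.
for uptake in (0.2, 0.6):
    share = traceable_fraction(uptake)
    print(f"{uptake:.0%} uptake -> about {share:.0%} of contacts traceable")
```

Under this crude assumption, 20 percent uptake leaves only about 4 percent of contacts visible to the app, against about 36 percent at 60 percent uptake, which is one way to see why the app is unlikely to be sufficient on its own.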
It won't be a silver bullet 399 00:35:44,230 --> 00:35:49,330 that is going to kill the disease by itself. It is going to be one solution 400 00:35:49,330 --> 00:35:54,950 amongst many other solutions that are going to be bringing this reproduction number down 401 00:35:54,950 --> 00:36:00,010 below its value of one. So actually, when this kind of application was proposed in Western 402 00:36:00,010 --> 00:36:05,560 countries and in the UK, well, one word here in this sentence was very controversial: 403 00:36:05,560 --> 00:36:10,660 it was this notion of a central server. And actually it means that part 404 00:36:10,660 --> 00:36:16,930 of the information would be sent to central servers. For instance, it would be used by the NHS 405 00:36:16,930 --> 00:36:22,300 in order to dispatch the information about contacts. If person A had a contact with person B 406 00:36:22,300 --> 00:36:27,310 via Bluetooth, well, this mapping from person A to person B would 407 00:36:27,310 --> 00:36:32,830 be basically organised and decided in a centralised fashion. 408 00:36:32,830 --> 00:36:37,840 And this question of centralisation was something that became very controversial because 409 00:36:37,840 --> 00:36:43,660 there are also other ways to do the same kind of task by decentralised means. 410 00:36:43,660 --> 00:36:49,360 So in one way you would have a centralised server with some of the information stored, even anonymously; 411 00:36:49,360 --> 00:36:55,300 in the other one, the information stays on the phones. 412 00:36:55,300 --> 00:37:00,760 And regarding this controversy, there have been many papers in different countries, as you can see, where 413 00:37:00,760 --> 00:37:06,160 people questioned basically the level of 414 00:37:06,160 --> 00:37:11,210 intimacy that was revealed by such applications. So first of all, there were questions.
Questions about 415 00:37:11,210 --> 00:37:16,390 the guarantee of precise proximity. As we said, Bluetooth is not ideal 416 00:37:16,390 --> 00:37:21,610 to measure the proximity between people. For instance, the intensity of the Bluetooth signal is different 417 00:37:21,610 --> 00:37:26,710 for different phones. So people with a recent iPhone will be more often detected than people with an older one, for 418 00:37:26,710 --> 00:37:31,830 instance. Also, it excludes part of the population. About 30 million people in the UK don't 419 00:37:31,830 --> 00:37:37,270 have a smartphone. And it could provide some false sentiment of safety: 420 00:37:37,270 --> 00:37:42,490 if the application shows nothing, it means I'm safe and I can behave normally. But 421 00:37:42,490 --> 00:37:48,310 the most important was that there was a key concern that this epidemic could create and legitimise 422 00:37:48,310 --> 00:37:54,310 a surveillance tool that would then continue to be used afterwards. 423 00:37:54,310 --> 00:37:59,350 And so that's the reason why I think there seems to be a consensus 424 00:37:59,350 --> 00:38:04,400 nowadays that the data should be kept decentralised. 425 00:38:04,400 --> 00:38:09,560 The mobility data, the contact data, which is very, very 426 00:38:09,560 --> 00:38:15,110 personal, should remain decentralised instead of going to a centralised option. 427 00:38:15,110 --> 00:38:20,150 But what is very important here is that there is clearly a need for clarity and pedagogy, because at the 428 00:38:20,150 --> 00:38:25,310 end of the day, the people are going to be using the app. And so you want to foster the trust 429 00:38:25,310 --> 00:38:30,410 and to justify the trust that the broader public is going to give to 430 00:38:30,410 --> 00:38:36,770 researchers and to the app developers if you want them to adopt.
431 00:38:36,770 --> 00:38:42,800 That is, to adopt such applications at a very, very high level. 432 00:38:42,800 --> 00:38:48,550 There is also a point that is quite important: you should remember this network effect of privacy. 433 00:38:48,550 --> 00:38:53,600 So in this network effect of privacy, we basically say that it could 434 00:38:53,600 --> 00:38:59,610 happen that you don't want to reveal your personal information on Facebook, for instance, but your friends reveal it. 435 00:38:59,610 --> 00:39:04,700 Well, let's assume that you have such a contact tracing app where you want to reveal your 436 00:39:04,700 --> 00:39:09,970 contacts, but not your location, but your friends, your contacts, reveal their locations. 437 00:39:09,970 --> 00:39:15,140 Well, clearly, because your friends reveal their locations regularly, including when they met you, 438 00:39:15,140 --> 00:39:20,210 they partially reveal the places where you've been. And there have been some very nice results where 439 00:39:20,210 --> 00:39:26,270 actually it was possible to show that just a tiny fraction of phones revealing locations 440 00:39:26,270 --> 00:39:31,490 could reveal the locations of a large number of phones if contact tracing was implemented 441 00:39:31,490 --> 00:39:37,030 with partial GPS locations. So 442 00:39:37,030 --> 00:39:42,150 another technology that people are also using is these call data records. And for these call data records, well, 443 00:39:42,150 --> 00:39:47,430 for instance, a few months ago, there was an agreement by European mobile operators 444 00:39:47,430 --> 00:39:53,190 to provide data to researchers to help monitor the way people are moving and adapting their behaviour 445 00:39:53,190 --> 00:39:58,350 to social distancing measures.
And there have been 446 00:39:58,350 --> 00:40:03,900 all sorts of monitors that have been made available to the broader public but also to decision makers 447 00:40:03,900 --> 00:40:09,000 to properly quantify the way people change their behaviour. This is an 448 00:40:09,000 --> 00:40:14,300 example from Italy, for instance, where you can see the way the mobility is decreasing 449 00:40:14,300 --> 00:40:19,440 depending on the different phases of the disease, but also on the decisions that 450 00:40:19,440 --> 00:40:24,450 are taken by the government. There is a similar project taking place in 451 00:40:24,450 --> 00:40:29,580 Germany where actually, when you look at the curve, you see that there's been a decrease in the mobility of the people. And then 452 00:40:29,580 --> 00:40:35,250 actually this decrease is relaxing, which is usually called quarantine fatigue, 453 00:40:35,250 --> 00:40:41,220 where people start to be tired of following these very strict and 454 00:40:41,220 --> 00:40:46,440 boring social distancing measures and slowly increase their mobility 455 00:40:46,440 --> 00:40:51,440 even if the government recommends not to. And actually, there 456 00:40:51,440 --> 00:40:56,600 is also a project here in the UK that is run here at the University of Oxford by some colleagues, 457 00:40:56,600 --> 00:41:02,030 where, again using call data records, they also looked at these changes 458 00:41:02,030 --> 00:41:07,140 in mobility and could also distinguish between different ages, but also between 459 00:41:07,140 --> 00:41:12,510 different genders, for instance. So people have been doing these kinds of analyses 460 00:41:12,510 --> 00:41:17,880 with call data records, but they also did it with GPS traces using this Cuebiq data that I was mentioning 461 00:41:17,880 --> 00:41:23,010 before.
For instance, there was a very nice article in The New York Times where they showed that actually 462 00:41:23,010 --> 00:41:28,390 the way that people changed their mobility was very, very different in counties with 463 00:41:28,390 --> 00:41:33,450 stay-at-home orders, where there was a very, very drastic drop 464 00:41:33,450 --> 00:41:38,640 in mobility, versus counties without any kind of stay-at-home order, where there was 465 00:41:38,640 --> 00:41:44,880 a decrease in mobility, but clearly not the same. So you clearly see here that indeed decisions, 466 00:41:44,880 --> 00:41:51,450 political decisions, can have a huge impact on the changes of mobility that people may have. 467 00:41:51,450 --> 00:41:56,540 So, again, for these GPS data that are provided by Cuebiq, well, here 468 00:41:56,540 --> 00:42:02,350 at the University of Oxford, in the same project, they also have access to some of these 469 00:42:02,350 --> 00:42:07,920 Cuebiq data and also carry out analyses about the way people are moving and adapting their behaviour 470 00:42:07,920 --> 00:42:12,930 by means of this GPS information. So, 471 00:42:12,930 --> 00:42:18,090 for GPS information, it's not only Cuebiq that provides data. There are also other sources of data.
472 00:42:18,090 --> 00:42:23,280 And there's, for instance, this very nice project from Google where they also provide information about changes 473 00:42:23,280 --> 00:42:28,440 of behaviour, focussing in that case on the way people change the type 474 00:42:28,440 --> 00:42:33,960 of place where they go, for instance whether there is a decrease in the visits to retail and recreation 475 00:42:33,960 --> 00:42:39,030 places, or grocery and pharmacy, or parks, which is also some very important 476 00:42:39,030 --> 00:42:44,700 information for governments to have if they want to understand the places 477 00:42:44,700 --> 00:42:49,920 that are important, but also to try to track down the way people are adapting their behaviour 478 00:42:49,920 --> 00:42:57,370 and the places that might become places at risk for people to be meeting with each other. 479 00:42:57,370 --> 00:43:02,470 Another type of data that people are also using would be Facebook data. So in the case of Facebook data, 480 00:43:02,470 --> 00:43:08,350 Facebook also has a Data for Good initiative where they provide to researchers 481 00:43:08,350 --> 00:43:13,570 coarse information about the number of users located in a certain location 482 00:43:13,570 --> 00:43:18,670 in space at different times of the day. And there's 483 00:43:18,670 --> 00:43:23,770 a very nice project from Copenhagen, Denmark, where they also look at these Facebook data and try to 484 00:43:23,770 --> 00:43:28,780 understand better the way the population has been shifting from certain 485 00:43:28,780 --> 00:43:33,910 places to other ones.
And in this project from Copenhagen, they could see that 486 00:43:33,910 --> 00:43:39,190 people were going out of the cities, for instance, an observation that was also made for New York, for instance, 487 00:43:39,190 --> 00:43:44,620 where many people from New York were shown to leave the city and to go more to the countryside, 488 00:43:44,620 --> 00:43:52,100 for instance to the seaside, in order to escape the city before the lockdown actually took place. 489 00:43:52,100 --> 00:43:57,370 So one last type of data that is also available would be participatory 490 00:43:57,370 --> 00:44:02,540 surveillance. So participatory surveillance is exactly the kind of 491 00:44:02,540 --> 00:44:07,850 system where you install an app and, on a day-to-day basis, you report the symptoms 492 00:44:07,850 --> 00:44:12,980 that you may have. And all of these symptoms are collected together with 493 00:44:12,980 --> 00:44:18,650 certain information about your age or your gender, but also your postcode. 494 00:44:18,650 --> 00:44:23,690 And this is a project from King's College London, where they collect this 495 00:44:23,690 --> 00:44:28,970 information and they have information about the symptoms, possibly before the people actually develop 496 00:44:28,970 --> 00:44:35,090 the disease or are reported as having the disease, which is a way to get access to the 497 00:44:35,090 --> 00:44:40,150 potential number of people with the disease even before they are tested 498 00:44:40,150 --> 00:44:46,530 and well before they go to the hospitals, and it could also be something implemented and used in models. 499 00:44:46,530 --> 00:44:51,950 So talking about models, let's go back to those. So, as I said, if you want to calibrate models 500 00:44:51,950 --> 00:44:57,550 for epidemic spreading, well, these models are very rich.
These models have very, very many parameters 501 00:44:57,550 --> 00:45:03,670 about the way the contacts take place, about the way people are moving around between different places. 502 00:45:03,670 --> 00:45:08,920 And actually more and more groups are using some of the data that I've been presenting, based 503 00:45:08,920 --> 00:45:13,990 on Cuebiq GPS, based on call data records, for instance, in order to try to feed these models and to 504 00:45:13,990 --> 00:45:19,570 improve the modelling and the predictions that the models are going to make. 505 00:45:19,570 --> 00:45:25,030 So just to give you an idea of the ways by which this can be done, well, this is a situation 506 00:45:25,030 --> 00:45:31,030 where we can see some imaginary trajectories of two users 507 00:45:31,030 --> 00:45:37,000 going from home to different places during the day. We would have basically 508 00:45:37,000 --> 00:45:42,400 all of these Voronoi cells that are associated to certain cell towers. So that would be something for call data 509 00:45:42,400 --> 00:45:47,510 records. And here you have a certain number of points that somehow represent the trajectory that 510 00:45:47,510 --> 00:45:52,720 a person has during the day. Of course, there are many missing points, because you only have information 511 00:45:52,720 --> 00:45:57,820 when the person uses his phone. But this is better than nothing. Well, actually, from data like those, 512 00:45:57,820 --> 00:46:02,830 with more than two customers, well, you could clearly identify hotspots, that would be 513 00:46:02,830 --> 00:46:08,440 cell towers where there would be a huge increase of the population at a certain time.
For instance, close to parks 514 00:46:08,440 --> 00:46:13,630 when it is a sunny day, for instance. You could also identify the 515 00:46:13,630 --> 00:46:18,910 origin-destination matrices, how people are moving from one place to another during the day, and do it 516 00:46:18,910 --> 00:46:24,130 on a daily basis, tracking down how these origin-destination 517 00:46:24,130 --> 00:46:29,590 matrices are altered in the course of time, which would be something very important for these metapopulation 518 00:46:29,590 --> 00:46:34,630 models, for instance. You could also try to estimate contact matrices. 519 00:46:34,630 --> 00:46:39,730 So for the contact matrices, this is something that is much more difficult because, for instance, in the case 520 00:46:39,730 --> 00:46:44,770 of call data records, you just have a resolution of the order of 100 metres. But still 521 00:46:44,770 --> 00:46:50,590 you can try to estimate potential contacts when you have people being at the same cell tower 522 00:46:50,590 --> 00:46:55,780 for a sufficiently long time and potentially having interactions with each other. And this is 523 00:46:55,780 --> 00:47:01,270 an approach that people are exploring: how to use these evolving densities 524 00:47:01,270 --> 00:47:07,160 in space in order to try to estimate the evolution of the contacts between people. 525 00:47:07,160 --> 00:47:12,790 And finally, you could also estimate what is the percentage of the time that people spend at home or at work, 526 00:47:12,790 --> 00:47:17,800 because clearly, whether you spend 50 percent of your time at home or 99 percent of your time 527 00:47:17,800 --> 00:47:23,080 at home would have a very, very different effect on the potential contacts that you have 528 00:47:23,080 --> 00:47:28,270 and the potential ways in which you will be propagating the disease.
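The origin-destination matrix and the time-at-home estimate described above can be sketched on toy data. This is a minimal illustration, assuming a simplified record format of (user, hour, cell) tuples. Real call data records are richer and much noisier, and the function names and records below are hypothetical.

```python
from collections import Counter, defaultdict

def od_matrix(records):
    """Count moves between consecutive distinct cells for each user.

    records: iterable of (user, hour, cell) tuples, a simplified and
    hypothetical stand-in for call data records.
    """
    by_user = defaultdict(list)
    for user, hour, cell in records:
        by_user[user].append((hour, cell))
    flows = Counter()
    for events in by_user.values():
        events.sort()  # order each user's events in time
        for (_, origin), (_, dest) in zip(events, events[1:]):
            if origin != dest:  # staying in the same cell is not a trip
                flows[(origin, dest)] += 1
    return flows

def home_share(records, user, home_cell):
    """Crude proxy for time at home: share of a user's records seen
    at the home cell (events are sparse, so this is not true time)."""
    cells = [cell for u, _, cell in records if u == user]
    return sum(cell == home_cell for cell in cells) / len(cells)

# Made-up records for two users over one day.
records = [
    ("u1", 8, "home"), ("u1", 9, "work"), ("u1", 18, "home"),
    ("u2", 8, "home"), ("u2", 10, "park"), ("u2", 12, "park"), ("u2", 19, "home"),
]

flows = od_matrix(records)
for (origin, dest), count in sorted(flows.items()):
    print(origin, "->", dest, count)
print("u2 share of records at home:", home_share(records, "u2", "home"))
```

Recomputing such a matrix day after day is what lets one follow how flows change over time. The home share here counts records rather than minutes, which is one reason the missing points mentioned in the talk matter.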
So here 529 00:47:28,270 --> 00:47:33,280 are all sorts of ways where you can use these call data records to basically provide 530 00:47:33,280 --> 00:47:38,650 guidance and help fit the parameters of these metapopulation 531 00:47:38,650 --> 00:47:44,260 models, but also of these contact matrices that I mentioned during the first part of the talk. 532 00:47:44,260 --> 00:47:49,510 And actually, about the contacts themselves, well, for the contacts, 533 00:47:49,510 --> 00:47:54,940 some people are actually using some of these GPS data in order to try to estimate contacts. 534 00:47:54,940 --> 00:47:59,980 So, as I said, when you define a contact, you need to define that two people are 535 00:47:59,980 --> 00:48:05,410 within a certain distance during a certain amount of time. In that case, they don't define the distance 536 00:48:05,410 --> 00:48:11,150 based on biological parameters or physical parameters, but they do the best that they can with the data they have. 537 00:48:11,150 --> 00:48:16,270 In their case, that would be 25 metres. This is clearly large, but this is the best 538 00:48:16,270 --> 00:48:21,460 you can do with this kind of methodology.
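The contact definition just described can be sketched as follows. This is a simplified illustration, not the method of the group mentioned in the talk. The 25 metre threshold comes from the talk, while the five-minute time window, the ping format and the coordinates are assumptions made for the example. It also only checks that two pings are close in space and time, not how long the two people stayed together.

```python
import math
from itertools import combinations

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points in degrees."""
    radius_m = 6_371_000  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * radius_m * math.asin(math.sqrt(a))

def find_contacts(pings, max_dist_m=25.0, max_dt_s=300):
    """Return the set of user pairs with pings close in space and time.

    pings: list of (user, time_in_seconds, lat, lon), a hypothetical format.
    max_dist_m: 25 metres, as quoted in the talk.
    max_dt_s: 300 seconds, an assumption made for this sketch.
    """
    contacts = set()
    for (u1, t1, la1, lo1), (u2, t2, la2, lo2) in combinations(pings, 2):
        close_in_time = abs(t1 - t2) <= max_dt_s
        if u1 != u2 and close_in_time and haversine_m(la1, lo1, la2, lo2) <= max_dist_m:
            contacts.add(frozenset((u1, u2)))
    return contacts

# Made-up pings: a and b are about 11 metres apart, c is about 1 km away.
pings = [
    ("a", 0, 40.7580, -73.9855),
    ("b", 60, 40.7581, -73.9855),
    ("c", 60, 40.7680, -73.9855),
]

contacts = find_contacts(pings)
print([sorted(pair) for pair in contacts])
```

Comparing all pairs of pings is quadratic and only suitable for a toy example. Counting the resulting pairs per day would give the kind of curve of contacts over time that is discussed next.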
In that case, within New York, this group 539 00:48:21,460 --> 00:48:27,210 used GPS data in order to try to estimate how contacts, 540 00:48:27,210 --> 00:48:32,530 estimated as people that are close to each other for a certain amount of time within 541 00:48:32,530 --> 00:48:37,660 twenty-five metres, how this number of contacts would be evolving, and you see that there's a drastic drop, 542 00:48:37,660 --> 00:48:42,700 which is, again, something that can be fed into models in order to take into 543 00:48:42,700 --> 00:48:47,710 account the way the contacts between the people are evolving in time, but also as 544 00:48:47,710 --> 00:48:53,140 a way to track changes in contact patterns when some of these 545 00:48:53,140 --> 00:48:58,370 social distancing measures are relaxed. So, given 546 00:48:58,370 --> 00:49:04,740 all the above, let's try to find a conclusion here. So, 547 00:49:04,740 --> 00:49:09,780 clearly, we're at a time now where countries are trying to find the best way 548 00:49:09,780 --> 00:49:15,120 to relax social distancing measures, so that the economic 549 00:49:15,120 --> 00:49:20,130 and social life in countries can start again, and yet keeping 550 00:49:20,130 --> 00:49:25,140 the disease under control. And here I just wanted to show you two 551 00:49:25,140 --> 00:49:30,410 very different types of solutions that are implemented in two very different countries in terms of culture 552 00:49:30,410 --> 00:49:35,670 and also legal systems: in France and in China. In France, 553 00:49:35,670 --> 00:49:40,740 you have a system right now where, depending on the department, so the place 554 00:49:40,740 --> 00:49:45,790 where you live, you will have different rights in terms of mobility, different 555 00:49:45,790 --> 00:49:50,940 rights
in terms of the shops you may visit or the amount of time you can spend outside, and you 556 00:49:50,940 --> 00:49:56,010 have this day-to-day updating of maps of the country. If 557 00:49:56,010 --> 00:50:01,110 you live in a red region, you have to stay home, basically. If you're living in a green 558 00:50:01,110 --> 00:50:07,620 region, you are much more free to move, and to move in a free fashion. 559 00:50:07,620 --> 00:50:12,660 On the other side of the spectrum, in China, there is a system called the Alipay Health Code, 560 00:50:12,660 --> 00:50:17,950 where everything is based on the use of mobile phones. And there is no GDPR 561 00:50:17,950 --> 00:50:23,040 in China. And actually an application 562 00:50:23,040 --> 00:50:28,110 is collecting information about your mobility, about your contacts, about many aspects of 563 00:50:28,110 --> 00:50:33,540 your life. And then this information is fed into some machine learning algorithm. And depending 564 00:50:33,540 --> 00:50:38,550 on your profile, it gives you either a green code, a yellow code or a red code. So 565 00:50:38,550 --> 00:50:43,590 clearly, you have two different approaches: one that is more geographic, in terms 566 00:50:43,590 --> 00:50:48,630 of groups, and one that is very much personal, using extremely personal and 567 00:50:48,630 --> 00:50:54,000 sensitive data. So the solution on the right would be unthinkable in European countries, 568 00:50:54,000 --> 00:50:59,280 for legal and also cultural reasons. We wouldn't be willing to provide 569 00:50:59,280 --> 00:51:04,440 such datasets to governments and to have these data 570 00:51:04,440 --> 00:51:09,600 decide, in a machine learning black-box way, whether or not we can move. 571 00:51:09,600 --> 00:51:14,760 But on the other hand, the solution on the left is not ideal 572 00:51:14,760 --> 00:51:19,890 either. It is very coarse-grained.
It blocks huge parts of the population, 573 00:51:19,890 --> 00:51:25,020 while actually maybe only some very small subpopulations are at 574 00:51:25,020 --> 00:51:30,420 risk of spreading the disease. And so clearly, there is a question 575 00:51:30,420 --> 00:51:35,430 of trying to find the right intermediate solution between this coarse-grained 576 00:51:35,430 --> 00:51:40,440 and this very fine solution, a solution that would be sufficiently efficient, but 577 00:51:40,440 --> 00:51:45,510 yet respecting the privacy of all of us. And as I've been trying to 578 00:51:45,510 --> 00:51:51,600 argue, mobile phones and all of the data that can be collected by mobile phones 579 00:51:51,600 --> 00:51:56,940 could be a way to try to go in this direction. So the collection of these data needs to be done 580 00:51:56,940 --> 00:52:02,010 in a privacy-aware setting, respecting regulations and laws, and also 581 00:52:02,010 --> 00:52:07,080 respecting the trust that people may place in researchers and 582 00:52:07,080 --> 00:52:12,300 public health workers. And I think that if you want to go in this direction, and this is a direction 583 00:52:12,300 --> 00:52:17,400 that people are moving towards right now, you need to have trust 584 00:52:17,400 --> 00:52:23,160 and communication between different actors: the government, clearly, researchers, 585 00:52:23,160 --> 00:52:28,290 mobile phone companies that would be willing to take their social responsibility 586 00:52:28,290 --> 00:52:33,420 to make these data available for the common good, but also human 587 00:52:33,420 --> 00:52:39,390 rights organisations. We need to have a very transparent 588 00:52:39,390 --> 00:52:44,850 description of what these data sets could be used for and how they would be used.
589 00:52:44,850 --> 00:52:50,700 And I think that it is only by having this conversation and this transparent exposition between all of these actors 590 00:52:50,700 --> 00:52:56,690 that we're going to be reaching an efficient solution that is acceptable for the whole community. 591 00:52:56,690 --> 00:53:02,940 So, to conclude, I want to thank you, first of all, for listening to me. Thank you very much for spending 592 00:53:02,940 --> 00:53:08,010 this time with me. I also wanted to thank my collaborators for this Oxford Impact Monitor, 593 00:53:08,010 --> 00:53:13,170 the one that I mentioned, that is run by Matthias Qian and Adam Saunders 594 00:53:13,170 --> 00:53:18,690 from the University of Oxford, as I am. And I also wanted to thank many of my collaborators 595 00:53:18,690 --> 00:53:24,330 with whom I've been writing an article which is freely accessible. Please go and have a look at it, 596 00:53:24,330 --> 00:53:29,370 where we describe some of these ideas that I've been discussing today. We did 597 00:53:29,370 --> 00:53:34,530 our best writing it, and I clearly hope that you will be interested in reading it. 598 00:53:34,530 --> 00:53:57,840 So thank you very much. And specif.