1
00:00:00,060 --> 00:00:10,830
This is the Department of Computer Science and the Big Data Institute and, as you've just seen, we are recording this meeting.
2
00:00:10,830 --> 00:00:20,040
So if you don't wish to be recorded, please keep your microphone off and your camera turned off.
3
00:00:20,040 --> 00:00:26,680
And otherwise, I will hand over to Christoph to do the introductions. Thanks, Christine.
4
00:00:26,680 --> 00:00:32,980
So it's a pleasure to introduce our speaker, Sam,
5
00:00:32,980 --> 00:00:38,950
Professor of Machine Learning and Public Health at the University of Copenhagen
6
00:00:38,950 --> 00:00:45,340
in Denmark and Professor of Public Health and statistics at Imperial College.
7
00:00:45,340 --> 00:00:52,780
So Sam is known to many in Oxford from his time.
8
00:00:52,780 --> 00:01:06,190
from his time here, including at the Big Data Institute, and he has made many contributions at the interface of statistics and public health.
9
00:01:06,190 --> 00:01:20,080
He has made lots of important contributions to the geospatial dynamics of infectious diseases, including on malaria and its global burden.
10
00:01:20,080 --> 00:01:32,020
He has made very prolific and important contributions on the dynamics of COVID, always at the interface of applied and fundamental research.
11
00:01:32,020 --> 00:01:52,880
And he'll be talking to us today about the sort of fundamentals underlying the work that he and his collaborators have achieved.
12
00:01:52,880 --> 00:01:57,320
Thank you. Thank you, Christoph. Thank you for setting this up.
13
00:01:57,320 --> 00:02:03,440
So I was a bit at odds about what to actually present, because we've been working
14
00:02:03,440 --> 00:02:09,530
on a paper in my group and with my collaborators for many months, and, you know,
15
00:02:09,530 --> 00:02:13,580
I'm really proud of it. People are really happy with it and find its contribution very interesting.
16
00:02:13,580 --> 00:02:19,490
But it can be a bit dry for those who are not used to, or not specifically interested in, branching processes.
17
00:02:19,490 --> 00:02:23,330
So I wanted to do a little bit of a history of infectious disease modelling
18
00:02:23,330 --> 00:02:27,290
and the different approaches, so that those in the audience who are not used to
19
00:02:27,290 --> 00:02:31,760
infectious disease modelling can sort of place what our new contributions are,
20
00:02:31,760 --> 00:02:37,440
and then I'll move on to some interesting facets that I've been working on.
21
00:02:37,440 --> 00:02:43,950
So the first thing is, you know, what's our goal? Well, most of the time in infectious diseases, we're interested in some measure of the epidemic,
22
00:02:43,950 --> 00:02:47,550
something like prevalence, the number of infected individuals at any time.
23
00:02:47,550 --> 00:02:54,090
So I go somewhere, I find out how many people are infected with, say, SARS-CoV-2.
24
00:02:54,090 --> 00:02:58,620
And that's my prevalence, right? Or you could look at something like incidence,
25
00:02:58,620 --> 00:03:03,330
which is the number of newly infected individuals at time t, and there are, of course, other measures too.
26
00:03:03,330 --> 00:03:07,260
But our goal essentially is to be able to model these two metrics.
27
00:03:07,260 --> 00:03:08,120
Prevalence, you know,
28
00:03:08,120 --> 00:03:16,920
you can get from measuring blood samples and seeing if people are positive for a virus or parasite or any disease of interest; and incidence,
29
00:03:16,920 --> 00:03:20,120
of course, you can calculate from routinely reported cases.
30
00:03:20,120 --> 00:03:25,520
So the question is, you know, why don't we just use Gaussian process regression and call it a day?
31
00:03:25,520 --> 00:03:30,790
Well, why don't we just use Kalman filters or ARIMA or recurrent neural networks?
32
00:03:30,790 --> 00:03:36,500
I mean, these are all incredible and they're very good for prediction, and they may give you a really good metric of uncertainty.
33
00:03:36,500 --> 00:03:42,860
And they're empirically motivated, from the fields of machine learning and deep learning, or, at the least, econometrics.
34
00:03:42,860 --> 00:03:46,190
But you know, why don't we just use this and then end the talk here, right?
35
00:03:46,190 --> 00:03:52,390
Why am I even trying to look at things like branching processes?
36
00:03:52,390 --> 00:03:56,230
And the key thing is that we want to try for the best of both worlds and, you know,
37
00:03:56,230 --> 00:04:05,290
as I've said, I use this term a lot: fairly mechanistic models. And the idea is we want to encode some notion of dynamics in there.
38
00:04:05,290 --> 00:04:12,640
Now, in physics, you know, we ultimately mathematise chemical and physical law with remarkable accuracy,
39
00:04:12,640 --> 00:04:18,520
with things like quantum electrodynamics and even just the swing of a pendulum chaotic or not.
40
00:04:18,520 --> 00:04:25,150
What we really want to do is try to disentangle some mechanisms that have some basis in epidemiological reality.
41
00:04:25,150 --> 00:04:28,750
And the reason we want to do this is that if we can understand the mechanism,
42
00:04:28,750 --> 00:04:36,670
then it goes a long way to understanding the causality of the whole process: why things happen, why certain phenomena occur.
43
00:04:36,670 --> 00:04:42,790
And so the idea is that we don't have to throw away things like ARIMA or Gaussian processes.
44
00:04:42,790 --> 00:04:48,310
In fact, we use them: a random walk, for example, is a Gaussian process with a specific precision.
45
00:04:48,310 --> 00:04:52,210
We use them in all our models, but we want to embed these in a mechanism.
46
00:04:52,210 --> 00:04:57,190
And so the question then comes in, you know, if we take this flexibility plus the mechanisms,
47
00:04:57,190 --> 00:05:02,290
what is the mechanism that we want to use for modelling infections?
48
00:05:02,290 --> 00:05:07,240
And then we move on and we think about, well, what is it we actually care about, right?
49
00:05:07,240 --> 00:05:10,820
So we have data on incidence or prevalence, right? Again,
50
00:05:10,820 --> 00:05:14,030
You know, the number of infected individuals, a number of newly infected individuals.
51
00:05:14,030 --> 00:05:21,430
Well, what we really care about is the rate of transmission, which, as all of us are now familiar with from the COVID-19 pandemic,
52
00:05:21,430 --> 00:05:28,810
is generally reflected by the reproduction number, R t, the rate of transmission, which is simply the average number of secondary cases.
53
00:05:28,810 --> 00:05:35,290
So the average number of people I infect after I'm infected and historically this traces all the way back
54
00:05:35,290 --> 00:05:42,340
to the earliest models in infectious disease modelling, from Ronald Ross and Lotka on malaria.
55
00:05:42,340 --> 00:05:45,460
So we actually care about this metric, R t.
56
00:05:45,460 --> 00:05:51,940
But, you know, many people who are familiar with the statistical literature could say: well, why don't we just estimate growth rates?
57
00:05:51,940 --> 00:06:00,640
And this is one of the big contributions from Wallinga and Lipsitch, in terms of translating some degree of epidemiological mechanism,
58
00:06:00,640 --> 00:06:06,940
using probability generating functions or moment generating functions, to connect growth rates to reproduction numbers.
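As a rough sketch of that growth-rate-to-reproduction-number translation in the Wallinga-Lipsitch style, assuming a gamma-distributed generation interval (the function name and all parameter values here are invented for illustration, not from the talk):

```python
def r_to_R_gamma(r, shape, scale):
    """Translate an exponential growth rate r into a reproduction number R
    via the Wallinga-Lipsitch relation R = 1 / M(-r), where M is the moment
    generating function of the generation-interval distribution.
    For a gamma(shape, scale) generation interval, M(z) = (1 - scale*z)**(-shape),
    so R = (1 + scale*r)**shape."""
    return (1.0 + scale * r) ** shape

# Illustrative numbers: gamma generation interval with mean shape*scale = 6.5 days,
# and an epidemic growing at 10% per day.
R = r_to_R_gamma(0.10, shape=2.6, scale=2.5)
print(round(R, 2))
```

Note that a growth rate of zero maps to R = 1 exactly, which is a quick sanity check on any implementation like this.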
59
00:06:06,940 --> 00:06:14,080
However, we have to take care when trying to estimate growth rates or rates of transmission.
60
00:06:14,080 --> 00:06:22,180
Because, as has been shown in the literature, you have to be really careful when trying to estimate growth rates or reproduction numbers.
61
00:06:22,180 --> 00:06:28,600
So again, part of the benefit of having a mechanism in there is this knowledge that growth rates can be very
62
00:06:28,600 --> 00:06:33,130
difficult to estimate because of the exponential growth and especially when looking at the tails.
63
00:06:33,130 --> 00:06:42,210
And so by providing a mechanism, we can somewhat constrain this statistical process, although not entirely.
64
00:06:42,210 --> 00:06:48,990
And so the question is: what is our mechanism? Once again, I do like the physics analogy, and I can't stress it enough.
65
00:06:48,990 --> 00:06:59,390
You know, the field of mathematical biology is driven by this idea of physicists coming into biology:
66
00:06:59,390 --> 00:07:04,950
Robert May and all the other contributors, Roy Anderson, et cetera,
67
00:07:04,950 --> 00:07:12,320
bringing in this mechanism, trying to do for epidemiology and biology what had been done in physics.
68
00:07:12,320 --> 00:07:18,080
And there are so many mechanisms concerning infectious diseases that have grown up in epidemiology,
69
00:07:18,080 --> 00:07:24,450
from looking at network structures to looking at underlying dynamics both time and space.
70
00:07:24,450 --> 00:07:29,880
And arguably the first model that really set the stage was the seminal work of Kermack and McKendrick.
71
00:07:29,880 --> 00:07:37,830
I mean, it's pretty hard, even if you've never done any infectious disease modelling, to avoid some of the work that they did.
72
00:07:37,830 --> 00:07:42,990
And of course, they have the classic SIR models, where they look at S, susceptible,
73
00:07:42,990 --> 00:07:49,260
I, infected, and R, recovered, and they have a very simple series of equations here.
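A minimal numerical sketch of those SIR equations, using simple forward-Euler integration (the parameter values are illustrative, not from the talk):

```python
def simulate_sir(beta=0.3, gamma=0.1, s0=0.99, i0=0.01, days=160, dt=0.1):
    """Forward-Euler integration of the classic Kermack-McKendrick SIR equations:
    dS/dt = -beta*S*I, dI/dt = beta*S*I - gamma*I, dR/dt = gamma*I.
    For this model the basic reproduction number is R0 = beta / gamma."""
    s, i, r = s0, i0, 0.0
    traj = [(s, i, r)]
    for _ in range(int(days / dt)):
        new_inf = beta * s * i * dt   # new infections in this small time step
        new_rec = gamma * i * dt      # new recoveries in this small time step
        s, i, r = s - new_inf, i + new_inf - new_rec, r + new_rec
        traj.append((s, i, r))
    return traj

traj = simulate_sir()
peak_prevalence = max(i for _, i, _ in traj)
print(f"R0 = {0.3 / 0.1:.1f}, peak prevalence ~ {peak_prevalence:.2f}")
```

The compartment fractions always sum to one, and with R0 above one the infected fraction grows to a peak and then declines, which is the behaviour the equations encode.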
74
00:07:49,260 --> 00:07:53,790
And you know, part of the reason that we consider this thing, the reproduction number, comes not only from Ross and Lotka,
75
00:07:53,790 --> 00:08:01,380
but also from this paper, where, although I say R t, it's not actually time-varying there; it's a constant, R 0,
76
00:08:01,380 --> 00:08:07,320
and it's actually about estimating the reproduction number from these equations.
77
00:08:07,320 --> 00:08:16,040
And since the initial contribution of Kermack and McKendrick, the architectures that have grown from susceptible-infected-recovered models are
78
00:08:16,040 --> 00:08:23,790
arguably some of the most widespread in all of infectious disease modelling. You can cast these as stochastic differential equations.
79
00:08:23,790 --> 00:08:30,780
You can cast these as Markov chains. You can actually model using reaction kinetics, which we'll talk about in a little bit.
80
00:08:30,780 --> 00:08:38,370
There are connections from this to fundamental integral equations, including the Volterra equation and the Fredholm equation.
81
00:08:38,370 --> 00:08:47,370
So there is a lot that could go into this, and there are deep links with this really great paper by David
82
00:08:47,370 --> 00:08:56,580
Champredon, which also showed that this has a link to the kind of renewal equations that I'll be talking about, which arise from branching processes.
83
00:08:56,580 --> 00:09:08,070
After that, the great man himself, Richard Bellman, together with Theodore Harris, started studying what are called age-dependent branching processes.
84
00:09:08,070 --> 00:09:11,910
And this is actually what started me on this entire journey, quite a long time back.
85
00:09:11,910 --> 00:09:18,570
One of the things we realised is that, in their remarkably short paper published in 1948,
86
00:09:18,570 --> 00:09:24,960
they have this line that says standard probabilistic arguments yield the same nonlinear integral equation.
87
00:09:24,960 --> 00:09:30,150
And when you look at this, this looks remarkably like a renewal equation that I will get to in a bit.
88
00:09:30,150 --> 00:09:38,850
But one of the most problematic things is the claim by Bellman and Harris that this can easily be found from standard probabilistic arguments.
89
00:09:38,850 --> 00:09:45,990
I know from working with very competent mathematicians that it really is not as trivial as that single line says.
90
00:09:45,990 --> 00:09:52,380
And of course, Theodore Harris expanded on this a lot more in his book on branching processes, further down the line.
91
00:09:52,380 --> 00:09:59,010
So now we've looked at SIR models, which are very popular in modelling infectious diseases, and at branching processes,
92
00:09:59,010 --> 00:10:03,270
which we're going to come back to, because that's essentially what this talk is about.
93
00:10:03,270 --> 00:10:09,780
There are a couple of other really interesting facets that I want to go through just to set the stage for interesting approaches.
94
00:10:09,780 --> 00:10:14,130
The other one is Hawkes processes. Hawkes processes, introduced by Alan
95
00:10:14,130 --> 00:10:24,600
Hawkes in 1971, are a type of self-exciting point process, where events occur and events can trigger other events to occur,
96
00:10:24,600 --> 00:10:35,190
with some given rate. And we have used these processes in modelling malaria elimination in the past, and Hawkes processes are very nice,
97
00:10:35,190 --> 00:10:42,130
and I think they are being used more and more in infectious disease modelling. But they also have some problems.
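A minimal sketch of simulating such a self-exciting process, using Ogata-style thinning with an exponential kernel (all parameter values here are illustrative, and this is one standard construction rather than the specific model used in the malaria work):

```python
import math
import random

def simulate_hawkes(mu, alpha, beta, t_max, seed=0):
    """Simulate a self-exciting (Hawkes) point process by Ogata thinning.
    Conditional intensity: lambda(t) = mu + sum over past events t_i of
    alpha * exp(-beta * (t - t_i)); each event raises the rate of future events."""
    rng = random.Random(seed)
    events, t = [], 0.0
    while t < t_max:
        # The intensity only decays between events, so its current value is an
        # upper bound until the next event.
        lam_bar = mu + sum(alpha * math.exp(-beta * (t - ti)) for ti in events)
        t += rng.expovariate(lam_bar)           # candidate next event time
        if t >= t_max:
            break
        lam_t = mu + sum(alpha * math.exp(-beta * (t - ti)) for ti in events)
        if rng.random() <= lam_t / lam_bar:     # accept with prob lambda(t)/lam_bar
            events.append(t)
    return events

events = simulate_hawkes(mu=0.5, alpha=0.8, beta=1.2, t_max=50.0)
print(len(events), "events")
```

With alpha/beta below one the process is subcritical, so event cascades eventually die out rather than exploding.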
98
00:10:42,130 --> 00:10:49,270
And then, of course, there are network processes. It's very interesting how these worlds tend to stand very separate.
99
00:10:49,270 --> 00:10:56,980
One of the most influential papers in network theory was this paper by David Kempe on maximising the spread of influence,
100
00:10:56,980 --> 00:11:08,110
and he introduced a structure called the independent cascade, and the independent cascade model is remarkably like a susceptible-infected process.
101
00:11:08,110 --> 00:11:12,250
The beautiful thing about this paper was that it showed that, using a greedy algorithm,
102
00:11:12,250 --> 00:11:22,840
you can solve an NP-hard problem in network theory with approximation guarantees, a really influential result.
103
00:11:22,840 --> 00:11:31,670
But this idea of using networks was also used by Wallinga and Teunis in their paper trying to estimate who infected whom.
104
00:11:31,670 --> 00:11:37,700
Now, we've also used these in our own work to estimate reproduction numbers.
105
00:11:37,700 --> 00:11:47,260
But this network view is an entirely different world, although it does have, you know, some people who really like it in epidemiology
106
00:11:47,260 --> 00:11:52,090
and a huge amount of deep mathematics that does connect with the rest. The benefit
107
00:11:52,090 --> 00:11:56,680
of network processes is that they can really disentangle who infected
108
00:11:56,680 --> 00:12:00,610
whom quite well, in a way that other approaches can't.
109
00:12:00,610 --> 00:12:06,550
And then, finally, there are agent-based models. And of course, for those who use these, it's hard to avoid
110
00:12:06,550 --> 00:12:12,640
Daniel Gillespie's paper on the exact stochastic simulation of reactions, which is used again and again
111
00:12:12,640 --> 00:12:17,500
in infectious disease modelling because lots of infectious disease epidemiologists know the
112
00:12:17,500 --> 00:12:24,580
disease extremely well and know the mechanisms of the disease very well and just create a population
113
00:12:24,580 --> 00:12:30,130
of individuals and allow them to evolve by setting rules, as shown here for tuberculosis.
114
00:12:30,130 --> 00:12:35,770
And such models have, you know, some extremely influential applications.
115
00:12:35,770 --> 00:12:44,290
For example, some of Christl and Neil's work, along with other colleagues. Arguably one of the most influential papers on COVID
116
00:12:44,290 --> 00:12:49,270
19 mortality at the start of the pandemic was built entirely from agent-based models.
117
00:12:49,270 --> 00:12:57,040
And of course, there are costs that come with using agent-based models: they can be very difficult to fit, and very difficult to
118
00:12:57,040 --> 00:13:04,030
unpick the underlying dynamics of, because there can be a lot of redundancy or confounding of parameters,
119
00:13:04,030 --> 00:13:12,420
et cetera. And then, finally, we get to renewal equations, which were introduced to the world of infectious disease
120
00:13:12,420 --> 00:13:18,900
probably a long time ago, but most notably by Christophe Fraser's seminal work in PLoS ONE,
121
00:13:18,900 --> 00:13:29,460
where he derived the renewal equation that we love, and then Anne Cori and colleagues used this in a framework to estimate time-varying reproduction numbers.
122
00:13:29,460 --> 00:13:34,650
Now, the allure of renewal equations is pretty great, because they're really intuitive.
123
00:13:34,650 --> 00:13:40,020
And if you read Christophe's paper, you'll see that it makes a lot of sense how they arise,
124
00:13:40,020 --> 00:13:46,450
you know, from a mechanistic, from an epidemiological basis. And I'm just putting the renewal equation there for all to see.
125
00:13:46,450 --> 00:13:50,550
You see I(t), the number of infected individuals, can be distributed as, let's say,
126
00:13:50,550 --> 00:13:54,900
a random variable; we'll just say the mean of it is given by the reproduction number,
127
00:13:54,900 --> 00:13:58,650
which we touched upon, the rate of transmission changing over time, multiplied by all the
128
00:13:58,650 --> 00:14:04,470
previously infected individuals, each weighted by some variable g(u), cut off at the back there,
129
00:14:04,470 --> 00:14:09,780
unfortunately. Now, g(u) is known as the generation time, and
130
00:14:09,780 --> 00:14:14,160
it's something for which it is actually quite difficult to have a precise meaning in epidemiology,
131
00:14:14,160 --> 00:14:18,600
because of how it's defined. But you will find later that, from the work that we've done,
132
00:14:18,600 --> 00:14:27,210
we give it an extremely precise definition. Well, the generation time is generally the time between the infections of an infector-infectee pair.
133
00:14:27,210 --> 00:14:29,010
That's generally what it's called.
134
00:14:29,010 --> 00:14:37,140
And so, in words, the renewal equation can generally be thought of as: previous infections cause current infections, depending on the stage of infection.
135
00:14:37,140 --> 00:14:40,200
That's very loose, of course.
136
00:14:40,200 --> 00:14:49,800
Renewal equations are, in my view, extremely powerful, and that's why I think they've been used so ubiquitously and are increasingly being used.
137
00:14:49,800 --> 00:14:57,720
So I wanted to go through all the previous approaches to set the stage for why renewal equations are so useful.
138
00:14:57,720 --> 00:15:02,940
You know, SIR-style models require solving a series of differential equations,
139
00:15:02,940 --> 00:15:08,130
and solving differential equations can be quite problematic when we're doing things numerically,
140
00:15:08,130 --> 00:15:13,800
and solutions are almost never available in closed form; stochastic differential equations are even more complicated.
141
00:15:13,800 --> 00:15:19,320
Network models are really nice, except you need to use a lot of assumptions about what's going on in the network.
142
00:15:19,320 --> 00:15:25,740
Agent-based models, as I've said, are exceedingly complicated to do inference on, very difficult indeed, and Hawkes
143
00:15:25,740 --> 00:15:27,510
processes tend to be really, really nasty
144
00:15:27,510 --> 00:15:34,740
to simulate. Renewal equations, on the other hand, tend to be very easy to embed within a probabilistic programming language.
145
00:15:34,740 --> 00:15:38,910
They're very easy to optimise and very easy to compute, right?
146
00:15:38,910 --> 00:15:42,690
It's just a sum there, with quadratic complexity.
147
00:15:42,690 --> 00:15:49,650
And so we have renewal equations to allow us to embed this idea of flexibility within a mechanism.
148
00:15:49,650 --> 00:15:58,300
The renewal equation gives us some mechanism and we feed into this mechanism complexity to help explain epidemiology.
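The "just a sum" point can be sketched concretely. A minimal illustration of the discrete renewal computation (the function name, the toy generation-time distribution, and the R values are invented for illustration, not taken from the talk):

```python
def renewal_mean(i_history, r_t, g):
    """One step of the discrete renewal equation: the expected number of new
    infections is R_t * sum over u of I(t-u) * g(u), where g is a discretised
    generation-time distribution. Evaluating this for every t costs O(T^2)."""
    horizon = min(len(i_history), len(g))
    return r_t * sum(i_history[-u] * g[u - 1] for u in range(1, horizon + 1))

g = [0.25, 0.5, 0.25]          # toy generation-time distribution, sums to 1
infections = [10.0]            # seed infections
for t in range(1, 30):
    r_t = 1.5 if t < 15 else 0.7   # transmission drops mid-series
    infections.append(renewal_mean(infections, r_t, g))

print(round(infections[1], 2), round(infections[-1], 2))
```

The series grows while R is above one and then declines once R drops below one, which is the mechanistic behaviour the renewal equation bakes in.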
149
00:15:58,300 --> 00:16:03,000
Now, an example is some of the work that we've done on understanding the effectiveness of governmental
150
00:16:03,000 --> 00:16:11,370
interventions first in our initial paper and then subsequently in more work where we try to understand,
151
00:16:11,370 --> 00:16:19,290
you know, what interventions actually work to control the spread of SARS-CoV-2 and the disease COVID 19?
152
00:16:19,290 --> 00:16:23,490
One of the most important questions, probably the most important, for a large period of this pandemic.
153
00:16:23,490 --> 00:16:28,950
And to do that, you know, there are many approaches that can be taken; but by having a mechanism,
154
00:16:28,950 --> 00:16:34,260
we can account for the lags between infection and cases, and infection and deaths.
155
00:16:34,260 --> 00:16:40,170
We can use a mechanism, the renewal equation, that is plausible for explaining the epidemiology.
156
00:16:40,170 --> 00:16:45,270
And then we can embed some complexity in there in terms of putting in stochastic processes,
157
00:16:45,270 --> 00:16:49,740
doing our standard regression of terms with sensible priors.
158
00:16:49,740 --> 00:16:54,240
And we can put all this within state of the art probabilistic programming languages such that we
159
00:16:54,240 --> 00:16:58,740
have an entire framework from start to finish that can answer extremely important questions.
160
00:16:58,740 --> 00:17:04,190
And that's what you see in the figure there. And it's hard to overemphasise this, because, you know,
161
00:17:04,190 --> 00:17:09,300
the benefit of doing this kind of work in probabilistic programming languages
162
00:17:09,300 --> 00:17:14,370
is not simply that a model can be extremely interesting and extremely relevant,
163
00:17:14,370 --> 00:17:23,670
but that a model can also be extremely inconvenient to actually implement, and these models need to be usable.
164
00:17:23,670 --> 00:17:28,800
The renewal equation, by contrast, is very easy to implement, and you can do lots of sensitivity analyses,
165
00:17:28,800 --> 00:17:31,390
and that's what you see in these plots, you know: sensitivity analyses,
166
00:17:31,390 --> 00:17:36,840
predictions into the future; these really allow you to understand the problem,
167
00:17:36,840 --> 00:17:41,760
allow you to summarise the problem and communicate it to decision makers in a way that they
168
00:17:41,760 --> 00:17:46,350
can really understand. The other way in which renewal equations are
169
00:17:46,350 --> 00:17:51,120
very powerful is that they can be used to test epidemiological hypotheses.
170
00:17:51,120 --> 00:17:56,520
I mean, some of the work as an example that we did in Brazil, you know,
171
00:17:56,520 --> 00:18:05,120
there was a complex situation in the city of Manaus, which had a huge wave, the one you see in the plot as the yellow line.
172
00:18:05,120 --> 00:18:08,990
And then they have a second wave. Now the question was, why did this second wave occur?
173
00:18:08,990 --> 00:18:14,930
Well, a new variant turned up, so we got some information from phylogenetics about when that happened.
174
00:18:14,930 --> 00:18:23,750
But we really needed some sort of model with plausible epidemiological characteristics that could account for things like waning of immunity,
175
00:18:23,750 --> 00:18:28,520
you know, transmissibility, immune escape. Which of these hypotheses,
176
00:18:28,520 --> 00:18:33,410
and how do these hypotheses link together, to really understand the situation on the ground?
177
00:18:33,410 --> 00:18:37,940
And renewal equations allowed us to have a fundamental mechanism that we
178
00:18:37,940 --> 00:18:42,800
could tweak and use to test epidemiological hypotheses. Without a mechanism in
179
00:18:42,800 --> 00:18:48,230
there, arguably all you're doing is fitting to some data and predicting something.
180
00:18:48,230 --> 00:18:58,040
You're not really understanding the underlying mechanism. And finally, some work done by my colleagues at Imperial:
181
00:18:58,040 --> 00:19:02,320
You know, you can use renewal equations as well to embed other really interesting and
182
00:19:02,320 --> 00:19:07,270
complicated characteristics such as mobility to understand contact matrix patterns.
183
00:19:07,270 --> 00:19:11,620
And my colleagues are doing a lot of work extending this further.
184
00:19:11,620 --> 00:19:17,980
So the reason to show these is not simply to highlight some papers and work that we've done and are happy with,
185
00:19:17,980 --> 00:19:24,040
but more to make the point that the unifying factor behind all of these applications is the renewal equation.
186
00:19:24,040 --> 00:19:33,820
And the reason we stuck with the renewal equation is that it really embeds quite a lot of epidemiology in terms of the infection process.
187
00:19:33,820 --> 00:19:36,930
And so now, you know, from all the approaches that I've gone through
188
00:19:36,930 --> 00:19:44,440
and explained to you, I'm going to show some exciting aspects in the next slides.
189
00:19:44,440 --> 00:19:49,360
You know, if you want to learn about renewal theory, really,
190
00:19:49,360 --> 00:19:53,710
the first and probably best paper on the topic was the work by Feller in 1941,
191
00:19:53,710 --> 00:19:58,390
when he looked at many aspects of renewal theory. The renewal equation, for
192
00:19:58,390 --> 00:20:02,920
those in other fields, is also known as the Volterra equation of the second kind;
193
00:20:02,920 --> 00:20:10,220
basically, there are many integral equations with similar forms. And the question is: how do we link this equation to infectious diseases?
194
00:20:10,220 --> 00:20:14,500
Well, previous research on the renewal equation tends to study asymptotic behaviour,
195
00:20:14,500 --> 00:20:21,250
That is how the renewal equation changes or what it averages to in the limit of time going to infinity.
196
00:20:21,250 --> 00:20:26,140
Or they look at single parameters, like the Malthusian parameter. Now, setting the stage:
197
00:20:26,140 --> 00:20:35,110
I've said that what we're really interested in is the rate of transmission, not just some growth parameter or summary estimate.
198
00:20:35,110 --> 00:20:39,280
And in truth, I really think this is a fruitful area of research.
199
00:20:39,280 --> 00:20:45,400
There are lots and lots of exciting links to renewal equations.
200
00:20:45,400 --> 00:20:55,040
The Hawkes process that I mentioned is linked because its expectation turns out to satisfy a renewal equation, from a fantastic paper written by Oakes and Andrew Hawkes.
201
00:20:55,040 --> 00:21:03,130
Hawkes processes, in turn, are infinite-order integer-valued autoregressive processes, a result by Matthias Kirchner.
202
00:21:03,130 --> 00:21:09,490
And when you look at the renewal equation, you can see that, in its discrete form, it looks remarkably like an autoregressive process.
203
00:21:09,490 --> 00:21:16,200
Indeed, an autoregressive model with a very specific set of coefficients determined by the
204
00:21:16,200 --> 00:21:25,260
generation time. So these things tie to one another, and Hawkes processes, by another great result, are also autoregressive models.
205
00:21:25,260 --> 00:21:34,860
And like I said, the paper by Champredon showed that SIR models have a specific renewal equation form.
206
00:21:34,860 --> 00:21:35,460
And recently,
207
00:21:35,460 --> 00:21:44,760
I've been thinking about the stochastic Volterra equation as being another way into this world, to connect renewal equations to branching processes.
208
00:21:44,760 --> 00:21:48,720
But I digress; it's just something I've been interested in recently.
209
00:21:48,720 --> 00:21:50,820
So these are all kind of the same things,
210
00:21:50,820 --> 00:21:57,960
and that's quite cool, because they're all encompassing the same idea that we have, where one thing affects another.
211
00:21:57,960 --> 00:22:02,610
And at its core, when you reduce down and remove all the mathematics at its core,
212
00:22:02,610 --> 00:22:06,690
this is actually what all these approaches are doing from an intuitive level.
213
00:22:06,690 --> 00:22:13,290
They are reflecting some degree of self reference and some degree of one person infecting another.
214
00:22:13,290 --> 00:22:21,630
So after that brief, or rather long, history: what is the question? And the question is the one that you see there,
215
00:22:21,630 --> 00:22:27,030
which is the equation that Christophe and Cori have used, in the continuous sense
216
00:22:27,030 --> 00:22:31,440
and the discrete sense there. And is there a mathematically principled source for this equation?
217
00:22:31,440 --> 00:22:36,150
That is not to say that their work was not mathematically principled. What I mean is, can we?
218
00:22:36,150 --> 00:22:41,160
Well, I'll tell you what I mean by that: can we connect a stochastic process to this equation?
219
00:22:41,160 --> 00:22:47,820
So the idea that we had was to first look at Bellman and Harris's paper, and we derived this.
220
00:22:47,820 --> 00:22:54,300
This was work that a collaborator in maths initially started.
221
00:22:54,300 --> 00:22:59,640
And the idea was, you know, how do we even just understand the government responses?
222
00:22:59,640 --> 00:23:05,370
And then, as we moved on from that, we started thinking: well, if the Bellman-Harris process is an age-dependent branching process,
223
00:23:05,370 --> 00:23:09,990
why don't we just pick the most general branching process we have and see what that means?
224
00:23:09,990 --> 00:23:17,280
And the most general branching process, at least to my knowledge, is what's called the general branching process, or the Crump-Mode-Jagers process.
225
00:23:17,280 --> 00:23:20,730
And I'm going to tell you what this process is; basically, I'm just going to state it now.
226
00:23:20,730 --> 00:23:27,810
A Crump-Mode-Jagers process is a branching process where we start with one individual; after a random amount of time,
227
00:23:27,810 --> 00:23:36,060
that individual can give rise to new individuals throughout their entire lifetime, and the number of individuals they give rise to is also random.
228
00:23:36,060 --> 00:23:41,460
So it's sort of a very general process, very akin to an individual based model.
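A minimal individual-based sketch in the spirit of such a general branching process (exponential lifetimes and Poisson-process reproduction are simplifying assumptions for illustration; the real Crump-Mode-Jagers setup allows arbitrary lifetime and reproduction distributions):

```python
import random

def simulate_branching(t_max=10.0, mean_lifetime=2.0, birth_rate=0.8, seed=3):
    """Individual-based sketch of a general (Crump-Mode-Jagers-style) branching
    process: each individual lives for a random (here exponential) lifetime and,
    while alive, produces offspring at the points of a Poisson process.
    Returns the birth times of everyone born before t_max."""
    rng = random.Random(seed)
    birth_times = [0.0]        # start with a single ancestor at time 0
    queue = [0.0]              # individuals whose reproduction is still to simulate
    while queue:
        born = queue.pop()
        death = born + rng.expovariate(1.0 / mean_lifetime)
        t = born
        while True:            # offspring arrive as a Poisson process in [born, death)
            t += rng.expovariate(birth_rate)
            if t >= death or t >= t_max:
                break
            birth_times.append(t)
            queue.append(t)
        # both the number of offspring per individual and their timing are random
    return sorted(birth_times)

births = simulate_branching()
print(len(births), "individuals born by time 10")
```

The mean number of offspring per individual is birth_rate times mean_lifetime, so the chosen toy values give a supercritical (growing) process, which is the epidemic-like regime.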
229
00:23:41,460 --> 00:23:46,050
And that's why I particularly liked it. Because when using brunching process,
230
00:23:46,050 --> 00:23:57,450
we're essentially connecting the world of governing equations, like the classic susceptible-infected-recovered equations, to the world of agent-based models,
231
00:23:57,450 --> 00:24:01,020
by defining the stochastic processes based on individuals.
232
00:24:01,020 --> 00:24:07,770
And I think I'm going to show towards the end that this is actually really important; it's something that a lot of approaches are missing.
233
00:24:07,770 --> 00:24:11,980
And you know, you could always ask the question why a stochastic process, anyway?
234
00:24:11,980 --> 00:24:17,410
And I think this was best encompassed by a quote from Peter Jagers' "A plea for stochastic population dynamics".
235
00:24:17,410 --> 00:24:21,970
I mean, I guess the title alone already showed how much he believed in this.
236
00:24:21,970 --> 00:24:29,130
You know, he argues that biological populations are finite, consisting of individuals with varying lifespans and reproduction,
237
00:24:29,130 --> 00:24:38,040
and that they should be modelled as such. Now, what he writes here is what underlies biological processes, at least in infectious disease.
238
00:24:38,040 --> 00:24:46,320
Epidemics are comprised of individuals with varying infection durations and varying numbers of people whom they infect, and they should be modelled as such.
239
00:24:46,320 --> 00:24:51,240
And the key is that modern probability theory allows for this.
240
00:24:51,240 --> 00:25:01,770
And so that's what we wanted to do. And it goes way beyond just this philosophy-of-science argument for why to use a stochastic process:
241
00:25:01,770 --> 00:25:04,770
why not start with the stochastic process and see where that leads us,
242
00:25:04,770 --> 00:25:11,490
even though I've already given you the spoiler that it leads you to the renewal equation of Christophe and colleagues, et cetera.
243
00:25:11,490 --> 00:25:16,920
But the stochastic process tells you a lot of things, as we'll see further down the line.
244
00:25:16,920 --> 00:25:25,620
If you simulate from a Bellman-Harris process, you see the simulations in the right plot as the black lines.
245
00:25:25,620 --> 00:25:34,230
And we're interested in computing the green line, which is the mean, the expectation of this stochastic process.
246
00:25:34,230 --> 00:25:42,390
Now, the underlying stochastic process has so many interesting aspects that we lose if we just look at the mean,
247
00:25:42,390 --> 00:25:48,360
if we just look at compartments. Look at the level of profound overdispersion when simulating from the process on the right.
248
00:25:48,360 --> 00:25:53,940
It really lets you know that when you're communicating the risk of infectious diseases,
249
00:25:53,940 --> 00:25:59,760
just looking at the mean probably isn't going to cut it. And this is something that I will touch on at the end.
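The kind of simulation being described can be sketched in a few lines; this is a minimal illustration, not the speaker's actual code, assuming exponential infectious periods and Poisson offspring (both are assumptions for the sketch):

```python
import heapq
import math
import random
import statistics

def poisson(lam, rng):
    # Knuth's multiplicative sampler; fine for small lambda
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def simulate_prevalence(r0, mean_lifetime, t_max, rng):
    """One realisation of a Markov Bellman-Harris-style process: each individual
    stays infected for an exponential lifetime and, on recovery, is replaced by
    a Poisson(r0) number of new infections.  Returns prevalence on 0..t_max."""
    heap = [rng.expovariate(1.0 / mean_lifetime)]  # recovery times of the infected
    path = []
    for t in range(t_max + 1):
        while heap and heap[0] <= t:  # process recoveries up to time t
            recovery = heapq.heappop(heap)
            for _ in range(poisson(r0, rng)):
                heapq.heappush(heap, recovery + rng.expovariate(1.0 / mean_lifetime))
        path.append(len(heap))
    return path

rng = random.Random(7)
finals = [simulate_prevalence(1.5, 1.0, 6, rng)[-1] for _ in range(400)]
mean, var = statistics.fmean(finals), statistics.pvariance(finals)
# individual paths (the "black lines") scatter far more widely than the mean
# suggests: the variance greatly exceeds the mean
```

Plotting many such paths against their average reproduces the qualitative picture described: realisations that go extinct, realisations far above the mean, and overdispersion that the mean alone hides.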
250
00:25:59,760 --> 00:26:07,680
And so I'm going to breeze past this, because I don't want to spend too much time on it or dwell too long at this point in the lecture.
251
00:26:07,680 --> 00:26:15,270
But I'm going to go through a little bit about point processes and the Crump-Mode-Jagers process, just to give you a flavour of
252
00:26:15,270 --> 00:26:21,150
what we do and how these things are computable and derivable in a mathematical sense.
253
00:26:21,150 --> 00:26:25,470
First, let me give you the intuition of the branching process. I'll give two examples.
254
00:26:25,470 --> 00:26:33,720
First, the Bellman-Harris branching process: we start with one individual, and after a random amount of time,
255
00:26:33,720 --> 00:26:38,370
that individual will give rise to a random number of new individuals.
256
00:26:38,370 --> 00:26:45,850
So in this case, the orange individual will give rise to three blue individuals after some random period.
257
00:26:45,850 --> 00:26:52,300
Now, the random time here is very well connected to the generation time, and it has a very precise meaning in this sense.
258
00:26:52,300 --> 00:26:59,500
But you could also consider a more complicated model based on an inhomogeneous point process, where you start with your orange individual.
259
00:26:59,500 --> 00:27:03,400
This orange individual remains infectious for a certain amount of time.
260
00:27:03,400 --> 00:27:07,780
You know, they get infected, and then they recover from that infection, let's assume.
261
00:27:07,780 --> 00:27:10,170
But they can infect people throughout that period.
262
00:27:10,170 --> 00:27:16,510
Over their duration of infection, based on how infectious they are at each moment,
263
00:27:16,510 --> 00:27:22,240
they will generate new infections from an inhomogeneous point process with a time-varying transmission rate.
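One standard way to sample such infection times is Lewis-Shedler thinning; here is a sketch where the transmission rate rho and the triangular infectiousness profile are made-up illustrations, not parameters from the talk:

```python
import random

def thin_sample(rate, rate_max, duration, rng):
    """Lewis-Shedler thinning: propose points from a homogeneous Poisson
    process at rate_max, keep each with probability rate(t) / rate_max."""
    t, events = 0.0, []
    while True:
        t += rng.expovariate(rate_max)
        if t > duration:
            return events
        if rng.random() * rate_max < rate(t):
            events.append(t)

rho = 2.0       # illustrative transmission rate
duration = 4.0  # illustrative duration of infectiousness

def infectiousness(t):
    # made-up profile: infectiousness ramps up, peaks mid-infection, wanes
    return rho * max(0.0, 1.0 - abs(t - duration / 2) / (duration / 2))

rng = random.Random(1)
# times at which this one infected individual infects others
secondary_times = thin_sample(infectiousness, rho, duration, rng)
```

Each accepted point is one new infection; running this once per infected individual, recursively, gives exactly the inhomogeneous-point-process branching picture described above.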
264
00:27:22,240 --> 00:27:28,570
Now, these branching processes give us a very simple way to model very plausible processes.
265
00:27:28,570 --> 00:27:33,820
What I'm telling you now lays bare all the assumptions behind it.
266
00:27:33,820 --> 00:27:38,740
You know, it's not like I've given you some simple model where you're looking at the compartments and wondering:
267
00:27:38,740 --> 00:27:43,690
I get the compartments, but where do the dynamics between them come from? Here, everything is explicit.
268
00:27:43,690 --> 00:27:50,980
This is why people like agent-based models: because it's very easy to understand the assumptions, the precise assumptions.
269
00:27:50,980 --> 00:27:53,830
And of course, we can derive very complicated models from it.
270
00:27:53,830 --> 00:28:04,240
So using our branching process, can we actually understand these two scenarios easily?
271
00:28:04,240 --> 00:28:11,200
And so let's just go through and I'll go through this because I know it's really dry and I'll go through it reasonably quickly so we can move on.
272
00:28:11,200 --> 00:28:16,930
We start with the Crump-Mode-Jagers process, the general branching process: we start with one individual at some time.
273
00:28:16,930 --> 00:28:22,180
As I said before, that individual stays infectious for a certain amount of time.
274
00:28:22,180 --> 00:28:30,760
That's this L variable. And then we have X, which is a stochastic process called a random characteristic.
275
00:28:30,760 --> 00:28:36,940
This is beautiful, and my collaborator Mikko has done all the heavy lifting on this aspect as well.
276
00:28:36,940 --> 00:28:41,950
When he first introduced this, I thought it was so beautiful and intuitive, and it makes life easy.
277
00:28:41,950 --> 00:28:47,920
Because when we first started doing this from the original derivation of Bellman and Harris,
278
00:28:47,920 --> 00:28:51,040
it's really messy and complicated, and we had to go into measure theory.
279
00:28:51,040 --> 00:28:58,150
And then suddenly, you know, when looking at point processes, you realise just how powerful point processes are in simplifying it.
280
00:28:58,150 --> 00:29:02,020
And then we have N, which is the counting process, which is highly intuitive:
281
00:29:02,020 --> 00:29:05,890
It's just keeping track of the number of new infections generated by the individual.
282
00:29:05,890 --> 00:29:12,970
So we have a random amount of time, a counting process and a stochastic process called the random characteristic.
283
00:29:12,970 --> 00:29:17,510
And so for each individual, we have these three bits.
284
00:29:17,510 --> 00:29:25,360
So, for example, now we can write down the Bellman-Harris process, which is the first equation there, and it's extremely simple.
285
00:29:25,360 --> 00:29:35,110
We say that, for the index individual, the number of new infections by some time u is zero until some time L,
286
00:29:35,110 --> 00:29:40,180
whereupon they give rise to N new infecteds at once. And this is exactly what I said there:
287
00:29:40,180 --> 00:29:46,210
an infected individual lives for a random duration and then generates new ones. The point process, of course, is very simple.
288
00:29:46,210 --> 00:29:53,440
We just have an inhomogeneous Poisson process with some transmission rate rho and infectiousness kernel K.
289
00:29:53,440 --> 00:29:58,300
And then the integral of that, as you do generally with an inhomogeneous Poisson process, gives you the expected number of infections.
290
00:29:58,300 --> 00:30:02,240
So it suddenly becomes very easy to define these characteristics.
291
00:30:02,240 --> 00:30:10,310
Similarly, with the random characteristics it becomes very easy to define incidence and prevalence:
292
00:30:10,310 --> 00:30:13,590
incidence, cumulative incidence and prevalence. Prevalence, of course,
293
00:30:13,590 --> 00:30:18,230
is just the number of infected individuals between two time points, and that's exactly what it's showing there.
294
00:30:18,230 --> 00:30:22,100
It becomes very easy to encode the measures that we're interested in.
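In code, these random characteristics are just scoring rules summed over individuals; here is a toy illustration (the epidemic data and function names are made up for the sketch):

```python
def prevalence(infections, t):
    # count individuals infected at or before t whose infectious period covers t
    return sum(start <= t < start + dur for start, dur in infections)

def cumulative_incidence(infections, t):
    # count everyone infected at or before t
    return sum(start <= t for start, _ in infections)

def incidence(infections, t0, t1):
    # new infections in the half-open window (t0, t1]
    return sum(t0 < start <= t1 for start, _ in infections)

# toy epidemic: (infection time, duration of infectiousness) per individual
infections = [(0.0, 2.0), (1.0, 1.0), (1.5, 4.0), (3.0, 1.0)]
print(prevalence(infections, 2.0))            # -> 1 (only the third individual)
print(cumulative_incidence(infections, 2.0))  # -> 3
print(incidence(infections, 0.0, 2.0))        # -> 2
```

Swapping in a different scoring rule changes the measure being tracked without touching the underlying process, which is exactly the appeal of the random-characteristic formalism.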
295
00:30:22,100 --> 00:30:27,220
So I'm just going to go through some derivation, and I'm going to zoom past this so I don't bore everyone.
296
00:30:27,220 --> 00:30:33,620
We define a sum across our random characteristics. We single out the index case, and this is, of course,
297
00:30:33,620 --> 00:30:39,950
the most important and really the most beautiful aspect of how Bellman and Harris's result came about,
298
00:30:39,950 --> 00:30:45,920
which is conditioning on the first generation: separating out the index case from all other cases.
299
00:30:45,920 --> 00:30:52,160
And the intuition behind it is that by doing this, conditioning on the first generation, we can create some degree of recursion,
300
00:30:52,160 --> 00:31:00,410
some self-similarity in terms of what each subsequent generation does. And so we can look at everything after the first generation
301
00:31:00,410 --> 00:31:05,210
and split that across each one of the first-generation individuals.
302
00:31:05,210 --> 00:31:09,170
And that's where the K comes in: as part of this sum, I have K first-generation individuals.
303
00:31:09,170 --> 00:31:16,310
And then, depending on the statistic of choice, we can use the tower property of expectations to compute the expected value.
304
00:31:16,310 --> 00:31:21,200
Remember, we start with this stochastic process, right, which is generating realisations,
305
00:31:21,200 --> 00:31:32,520
and what we're interested in is the average and, as we'll see later, the second or higher-order moments. I'm almost done. To make things look easier, we
306
00:31:32,520 --> 00:31:35,120
change that sum into an integral.
307
00:31:35,120 --> 00:31:42,690
And it really is as simple as that, because these are just algebraic operations, even though they may look complicated.
308
00:31:42,690 --> 00:31:46,370
Now, you know, they're actually not. They're very straightforward.
309
00:31:46,370 --> 00:31:50,180
There's nothing really complicated about this other than having to derive it in the first place.
310
00:31:50,180 --> 00:31:55,340
And we get to the renewal equation.
311
00:31:55,340 --> 00:32:01,970
And this is the first point where we realised this: we started with the Bellman-Harris process, a complicated process.
312
00:32:01,970 --> 00:32:13,040
It's wonderful because, after all of this, we have a branching process into which we can imbue the dynamics of individuals, and we calculate the
313
00:32:13,040 --> 00:32:18,440
average of that branching process for a measure that we're interested in, for a specific counting process.
314
00:32:18,440 --> 00:32:21,890
And what we get is a renewal equation, a very general equation.
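That claim, that the mean of the branching process satisfies a renewal-type recursion, can be checked numerically; here is a discrete-time sketch with illustrative parameters, not the paper's general equation:

```python
import math
import random
import statistics

def poisson(lam, rng):
    # Knuth's multiplicative sampler; fine for small lambda
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def incidence_path(r, n_gen, rng):
    # discrete-time branching: each of the z current infections independently
    # causes Poisson(r) new infections exactly one step later
    z, path = 1, [1]
    for _ in range(n_gen):
        z = sum(poisson(r, rng) for _ in range(z))
        path.append(z)
    return path

r, n_gen, n_sims = 1.2, 4, 3000
rng = random.Random(3)
paths = [incidence_path(r, n_gen, rng) for _ in range(n_sims)]
emp_mean = [statistics.fmean(p[t] for p in paths) for t in range(n_gen + 1)]
# renewal recursion for the mean: I_t = r * I_{t-1}, i.e. I_t = r ** t
renewal = [r ** t for t in range(n_gen + 1)]
```

The Monte Carlo average over the stochastic realisations tracks the deterministic renewal recursion, which is the punchline of the derivation in miniature.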
315
00:32:21,890 --> 00:32:28,430
Now, this equation looks nothing like the equation of Christophe and colleagues, of course; it's far, far more general.
316
00:32:28,430 --> 00:32:32,360
But, sorry for all the maths:
317
00:32:32,360 --> 00:32:39,260
to get to the point, what this equation does is allow us, for the first
318
00:32:39,260 --> 00:32:43,820
time, to actually have a time-varying reproduction number for a general branching process.
319
00:32:43,820 --> 00:32:51,110
So we can now use this framework, and although the maths looks complicated,
320
00:32:51,110 --> 00:32:59,450
if someone wants to put the time in, they can go and say: here is my individual-level branching process;
321
00:32:59,450 --> 00:33:04,640
I want to tweak it in certain ways to have it behave with certain dynamics.
322
00:33:04,640 --> 00:33:09,660
Can I then derive an expectation from that? Can I use that renewal equation to do it?
323
00:33:09,660 --> 00:33:13,190
That's exactly what we did with our processes.
324
00:33:13,190 --> 00:33:20,780
And so, with this renewal equation, for the first time you can do many new and interesting things.
325
00:33:20,780 --> 00:33:26,960
An individual can infect others after some random time, and the number can vary.
326
00:33:26,960 --> 00:33:32,780
They can infect at random times via a Poisson process; and in principle, you can look at compound Poisson processes.
327
00:33:32,780 --> 00:33:37,130
And that would be interesting. And the number they can infect changes over time.
328
00:33:37,130 --> 00:33:43,110
So this is the really new contribution. It seems very esoteric, but from our viewpoint,
329
00:33:43,110 --> 00:33:50,270
it's connecting the world of branching processes to the world of renewal equations, which already have a huge amount of epidemiological basis.
330
00:33:50,270 --> 00:33:56,880
But doing it in such a way that it becomes extremely customisable.
331
00:33:56,880 --> 00:34:06,150
And as an example of that customisability, we can actually create a renewal equation for the Bellman-Harris process or for the inhomogeneous Poisson process.
332
00:34:06,150 --> 00:34:11,940
And what is really interesting is that these two epidemiologically are almost the same.
333
00:34:11,940 --> 00:34:22,890
So we found it really confusing at first: the equation that everyone uses is the one where previous infections affect future infections.
334
00:34:22,890 --> 00:34:28,620
And that equation is explicitly a Bellman-Harris model.
335
00:34:28,620 --> 00:34:33,210
We found this assumption that all the infections have to happen at one time.
336
00:34:33,210 --> 00:34:38,640
This idea of an instantaneous burst of infections, we thought, was a very poor approach to modelling infectious diseases.
337
00:34:38,640 --> 00:34:40,620
And so why do the renewal equations work at all?
338
00:34:40,620 --> 00:34:48,540
Well, it turns out that the inhomogeneous Poisson process is, for all intents and purposes, practically the same epidemiologically.
339
00:34:48,540 --> 00:34:55,950
And so you can interpret the renewal equation either through a Bellman-Harris view or through an inhomogeneous Poisson process view.
340
00:34:55,950 --> 00:35:03,210
And this was, to me, mind-blowing, though maybe it's only new to us.
341
00:35:03,210 --> 00:35:13,260
And in fact, after much struggling (I really struggled with this because it took so long), me and my postdoc at the time, Thomas Mellan,
342
00:35:13,260 --> 00:35:17,670
we proved, via induction, at least in the discrete case,
343
00:35:17,670 --> 00:35:24,630
that the common renewal equation that Christophe and others introduced is actually a very special case of ours.
344
00:35:24,630 --> 00:35:31,350
But what we do is provide renewal equations for cumulative incidence, incidence and, most importantly, prevalence.
345
00:35:31,350 --> 00:35:36,750
To my knowledge, up to now, no one has had a renewal equation for prevalence.
346
00:35:36,750 --> 00:35:44,610
And everyone uses what's called a back-calculation approach, where you convolve incidence with the generation interval to get prevalence.
347
00:35:44,610 --> 00:35:52,350
This requires you to have latent functions and processes. It's not elegant, to say the least, but also not practically very nice.
348
00:35:52,350 --> 00:35:57,660
What we do is unify prevalence and incidence without needing this back-calculation approach.
349
00:35:57,660 --> 00:36:04,110
Now, this is one aspect I won't go through fully, and you'll have to ask me about the specifics of the proof afterwards.
350
00:36:04,110 --> 00:36:08,700
But essentially, we prove it, and I say "prove" loosely.
351
00:36:08,700 --> 00:36:12,660
There are still some bits of this that need to be done in terms of full rigour.
352
00:36:12,660 --> 00:36:17,460
But we prove that the relationships between prevalence and incidence in these new-found renewal
353
00:36:17,460 --> 00:36:25,600
equations conform exactly to what we know from epidemiology.
354
00:36:25,600 --> 00:36:27,120
And this is a really, you know,
355
00:36:27,120 --> 00:36:35,250
it's a beautiful thing that you derive all this complicated maths and it reflects what people already know in epidemiology.
356
00:36:35,250 --> 00:36:42,270
We're sort of doing formal mathematics for what is known heuristically to be true in the field of epidemiology.
357
00:36:42,270 --> 00:36:48,300
So we show that not only can I give you a renewal equation for prevalence and incidence,
358
00:36:48,300 --> 00:36:54,150
but I can show you that these two equations are consistent under the common definitions that link prevalence and incidence.
359
00:36:54,150 --> 00:36:58,230
We provide the first framework that can unify prevalence and incidence.
360
00:36:58,230 --> 00:37:07,420
Now, starting from the basis of Christophe and colleagues' equation, you can't immediately get there just by writing it down; we have tried this in the past.
361
00:37:07,420 --> 00:37:15,180
This is really difficult, and it's because you need to do this from the underlying stochastic process, and it's much easier to do it that way.
362
00:37:15,180 --> 00:37:22,470
Starting from very simple principles and building up is much easier than starting from the end and trying to work backwards.
363
00:37:22,470 --> 00:37:32,100
Now, all of this is pointless if it's not easy to code, and it is trivial to code: this block solves the entire renewal equation for prevalence and incidence.
364
00:37:32,100 --> 00:37:39,930
The block requires nothing but element-wise multiplications and convolutions, and in modern statistical computing on GPUs,
365
00:37:39,930 --> 00:37:46,080
these two computations, element-wise multiplication and convolution, are exceedingly fast,
366
00:37:46,080 --> 00:37:51,210
so fast that the bottleneck is actually sampling from the underlying posterior distribution.
367
00:37:51,210 --> 00:38:00,060
Or optimising across some non-convex space; that actually takes more time than solving the equations themselves.
368
00:38:00,060 --> 00:38:03,420
And so although our equations are slightly more complicated than the renewal equations
369
00:38:03,420 --> 00:38:09,690
people usually use, because they are far more general and you can do a lot more with them, they actually don't take that much time to solve.
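The scheme being described amounts to a loop of element-wise products plus one convolution. Here is a pure-Python sketch; the generation-interval pmf, survival probabilities and R_t values are toy inputs, and this is not the paper's exact set of equations:

```python
def convolve(a, b):
    # discrete convolution, truncated to len(a) terms
    out = [0.0] * len(a)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j < len(a):
                out[i + j] += ai * bj
    return out

def solve_incidence_prevalence(rt, gen, surv, i0):
    """Incidence from a discrete renewal equation (element-wise products of
    past incidence with the generation-interval pmf), then prevalence by
    convolving incidence with surv[k], the probability of still being
    infectious k steps after infection.  Illustrative only."""
    steps = len(rt)
    inc = [0.0] * steps
    inc[0] = float(i0)
    for t in range(1, steps):
        total = 0.0
        for s in range(1, min(t, len(gen) - 1) + 1):
            total += gen[s] * inc[t - s]
        inc[t] = rt[t] * total
    prev = convolve(inc, surv)
    return inc, prev

# toy run: every infection happens exactly one step after the infector's,
# R_t is fixed at 2, and individuals stay positive for two steps
inc, prev = solve_incidence_prevalence([2.0] * 4, [0.0, 1.0], [1.0, 1.0], 1)
# inc == [1.0, 2.0, 4.0, 8.0]; prev == [1.0, 3.0, 6.0, 12.0]
```

In practice one would vectorise both loops (the inner sums and the convolution) on a GPU, which is exactly why the scheme is so cheap relative to posterior sampling.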
370
00:38:09,690 --> 00:38:16,890
And in fact, I've been playing around a lot with Julia recently, because I really like Julia now, and you can write this even more simply.
371
00:38:16,890 --> 00:38:24,440
In Julia, this is just a few lines. So, phew, that was a lot of content.
372
00:38:24,440 --> 00:38:30,240
What have we learnt? We've arrived at the commonly used equation in a principled way.
373
00:38:30,240 --> 00:38:36,180
Is it groundbreaking in terms of changing the face of epidemiology? No, obviously not.
374
00:38:36,180 --> 00:38:41,850
But we show where this equation comes from, in a principled mathematical way, from a stochastic process.
375
00:38:41,850 --> 00:38:47,220
And I think the most important thing is that we connect the world of agent-based models to governing equations
376
00:38:47,220 --> 00:38:50,700
so that people can build on this. We prove these two equations equivalent,
377
00:38:50,700 --> 00:38:55,110
i.e. the renewal equation currently being used is a special case of a more general equation.
378
00:38:55,110 --> 00:39:00,780
We unified prevalence and incidence, and we provide an efficient computational scheme.
379
00:39:00,780 --> 00:39:06,900
So, applications: the most obvious one is to go and analyse the epidemic.
380
00:39:06,900 --> 00:39:12,150
And the great thing is that I'm going to release all the data too, so that everyone can analyse it.
381
00:39:12,150 --> 00:39:16,740
We can go and model Rt using these equations in a full Bayesian framework really easily.
382
00:39:16,740 --> 00:39:26,160
And we get exactly what you'd expect using stochastic processes. The benefit is that we don't have to use any arbitrary assumptions behind,
383
00:39:26,160 --> 00:39:31,380
you know, the functional form of our Rt here. You could use Gaussian processes.
384
00:39:31,380 --> 00:39:37,110
You could use B-splines, which are basically random walks, which are basically Gaussian processes.
385
00:39:37,110 --> 00:39:41,400
You can use any functional form you want in there to estimate it.
386
00:39:41,400 --> 00:39:45,510
So our renewal equations can do what the previous equations did.
387
00:39:45,510 --> 00:39:53,190
And you know, that's not new, but I'm just telling you that you haven't lost anything here, but you can do new things.
388
00:39:53,190 --> 00:40:00,690
And this is a recent example that I did for prevalence. This is the UK ONS infection survey.
389
00:40:00,690 --> 00:40:04,890
And in the plot on the top right, you see the prevalence,
390
00:40:04,890 --> 00:40:10,050
the population prevalence over time, and you can see on the axis that this is
391
00:40:10,050 --> 00:40:15,090
just the percentage of individuals that are testing positive for SARS-CoV-2,
392
00:40:15,090 --> 00:40:19,050
i.e. the number that have COVID-19 at any given point.
393
00:40:19,050 --> 00:40:25,260
And we want to estimate the reproduction number. If you want to do this, you have to use some form of back-calculation, which is very difficult,
394
00:40:25,260 --> 00:40:28,830
and I haven't been able to find out how the ONS do this.
395
00:40:28,830 --> 00:40:31,380
I need to speak to Thomas House about this at some point.
396
00:40:31,380 --> 00:40:38,040
But I guarantee it's not as simple as what I'm doing right here, where I have a renewal equation that I can fit,
397
00:40:38,040 --> 00:40:44,430
for prevalence, that is the same renewal equation as this one; I just added an extra term and solved it in the exact same way.
398
00:40:44,430 --> 00:40:46,350
I didn't even have to modify the code,
399
00:40:46,350 --> 00:40:55,080
and I could get a very good estimate of the reproduction number. To validate this, using the same renewal equation without doing any extra fitting,
400
00:40:55,080 --> 00:41:03,510
I get incidence straight away, and that's what the bottom left plot shows: the blue bars are the actual cases,
401
00:41:03,510 --> 00:41:06,920
and the red is the fitted incidence.
402
00:41:06,920 --> 00:41:13,770
I look at that and I see a lag of about seven days, which is completely in line with what you'd expect: when I get infected,
403
00:41:13,770 --> 00:41:20,300
it takes about seven days until that manifests as a reported case. And from that, you can calculate the ascertainment ratio,
404
00:41:20,300 --> 00:41:27,800
which is around 2.5 and is remarkably stable, apart from weekly fluctuations and from different places.
405
00:41:27,800 --> 00:41:34,790
You know, if we could run the tape of the pandemic again, I have no doubt that such a framework would be extremely useful.
406
00:41:34,790 --> 00:41:37,850
Maybe I'm benefiting from hindsight here,
407
00:41:37,850 --> 00:41:44,270
but I think it would be extremely useful for something like REACT or the ONS survey in this setting, because we can now have a renewal
408
00:41:44,270 --> 00:41:51,090
equation for prevalence that links back to a branching process with epidemiological mechanisms that we can then build on,
409
00:41:51,090 --> 00:41:58,040
as my colleagues and I have done on several different applications in the past. And I think this is really powerful.
410
00:41:58,040 --> 00:42:02,300
In the last few bits, I'm just going to talk about an application: the variance.
411
00:42:02,300 --> 00:42:08,720
Now, this is really one of the cool things you can get with these renewal equations: with generating functions,
412
00:42:08,720 --> 00:42:16,970
you can get all the higher-order moments, not just the mean but the variance. And there is a really important question of superspreading, right?
413
00:42:16,970 --> 00:42:21,570
You know, superspreading has been there all the time in terms of, you know,
414
00:42:21,570 --> 00:42:27,030
the idea that superspreading is this big thing where one person infects many. But how does superspreading actually arise?
415
00:42:27,030 --> 00:42:30,470
You know, Clauset and co. have also asked how scale-free networks arise.
416
00:42:30,470 --> 00:42:36,300
Where does this scale-free, heavy-tailed behaviour in branching processes actually come from, right?
417
00:42:36,300 --> 00:42:38,450
Is it only from the secondary distribution,
418
00:42:38,450 --> 00:42:46,980
i.e. is the reason that these branching processes in epidemics have really heavy tails just that I might go to a festival and infect
419
00:42:46,980 --> 00:42:52,830
100 people? That is not the only dynamic at play. No, actually.
420
00:42:52,830 --> 00:42:58,200
And I used to think, in view of the central limit theorem, that superspreading is really not a big deal,
421
00:42:58,200 --> 00:43:03,840
but I was using the naive kind of central limit theorem, which does not account for dependence.
422
00:43:03,840 --> 00:43:10,620
And I realised after that that I was entirely wrong, because there is actually no central limit theorem for these branching processes, because of
423
00:43:10,620 --> 00:43:17,520
the dependence over time. And you can see the effect of superspreading when you simulate from these branching processes:
424
00:43:17,520 --> 00:43:26,190
look at how profound the overdispersion is there. Let's take a simple experiment to look at this. In the top left corner,
425
00:43:26,190 --> 00:43:35,890
we have the reproduction number, the rate of transmission, oscillating between growth and reduction, growth and reduction.
426
00:43:35,890 --> 00:43:45,370
In the top right, you have the rather intuitive mean prevalence: a large first wave, then a second wave that is smaller, then smaller and smaller.
427
00:43:45,370 --> 00:43:47,980
What you see in the bottom left is really fascinating.
428
00:43:47,980 --> 00:43:56,690
It is the variance. Now, the variance in the second wave, despite the mean being smaller, is much, much bigger.
429
00:43:56,690 --> 00:43:58,040
And this is, in my view,
430
00:43:58,040 --> 00:44:04,550
a little bit unintuitive, because we're used to thinking of Poisson likelihoods, or likelihoods where the variance is some function of the mean.
431
00:44:04,550 --> 00:44:11,540
And in truth, the variance is huge, and with each subsequent wave the variance, the uncertainty, grows.
432
00:44:11,540 --> 00:44:18,410
Even the mean in the third wave is quite small, but the variance is still comparable to the variance in the first wave.
433
00:44:18,410 --> 00:44:26,510
What is going on here, right? This is a dynamic that I haven't personally seen discussed or highlighted much in the literature,
434
00:44:26,510 --> 00:44:32,750
but you can see it in the simulations where superspreading is emerging and you can see other dynamics from this,
435
00:44:32,750 --> 00:44:40,340
which all of us in infectious diseases know about, like extinction. Yet we rarely integrate these into our modelling frameworks, right?
436
00:44:40,340 --> 00:44:47,510
Using these renewal equations in our framework, you can compute the precise extinction probability, and
437
00:44:47,510 --> 00:44:53,230
you can change these from having a Poisson secondary distribution to a negative binomial secondary distribution with overdispersion.
438
00:44:53,230 --> 00:44:57,380
And so in plot eight, you can see the index of dispersion, which is huge, right?
439
00:44:57,380 --> 00:45:05,300
It's huge. You know, it's massive. And then you can see the extinction probabilities which again conform to what we're seeing as time goes by.
440
00:45:05,300 --> 00:45:11,960
If there's no new importation event, some epidemics are just going to burn out, especially when the Rt is less than one.
441
00:45:11,960 --> 00:45:15,100
And that's what that extinction probability is showing.
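The extinction probabilities mentioned here come from the classic branching-process fixed-point result, q = f(q) with f the offspring probability generating function; a small sketch, where the mean R and the dispersion k are illustrative values rather than numbers from the talk:

```python
import math

def extinction_prob(pgf, tol=1e-12, max_iter=10_000):
    """Smallest fixed point of the offspring PGF, found by iterating
    q <- f(q) from q = 0 (standard branching-process result)."""
    q = 0.0
    for _ in range(max_iter):
        q_new = pgf(q)
        if abs(q_new - q) < tol:
            break
        q = q_new
    return q

R, k = 2.0, 0.5  # illustrative mean offspring and negative-binomial dispersion
poisson_pgf = lambda q: math.exp(R * (q - 1.0))
negbin_pgf = lambda q: (1.0 + (R / k) * (1.0 - q)) ** (-k)

q_pois = extinction_prob(poisson_pgf)  # about 0.20
q_nb = extinction_prob(negbin_pgf)     # about 0.64
# more overdispersion means a higher chance the epidemic dies out,
# even though the mean R is identical
```

This is why, with the same Rt, an overdispersed secondary distribution produces both more burnt-out epidemics and larger explosive ones.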
442
00:45:15,100 --> 00:45:22,320
And so finally, in the final slide, what I'm really interested in looking at now is what is the appropriate likelihood, right?
443
00:45:22,320 --> 00:45:24,360
And let's just take our example,
444
00:45:24,360 --> 00:45:31,890
if we simulate the branching process and look at the underlying distribution at those five time points, well, the histograms:
445
00:45:31,890 --> 00:45:37,050
we see the blue bars, which are what you would get with a Poisson likelihood. And the Poisson likelihood is telling you:
446
00:45:37,050 --> 00:45:45,600
well, my mean at that time point is about 60, so let me put a bar at 60 with some dispersion around it.
447
00:45:45,600 --> 00:45:51,450
But when you look at the underlying distribution that simulated from an individual based model,
448
00:45:51,450 --> 00:46:00,580
the branching process, the dynamics are very different. Long story short, what the dynamics tell you is this:
449
00:46:00,580 --> 00:46:08,530
actually, what happens is that many epidemics go extinct and some blow up really large.
450
00:46:08,530 --> 00:46:13,540
And this is true even when your secondary distribution is Poisson:
451
00:46:13,540 --> 00:46:22,630
very finite, with no heavy tail of its own. This heavy-tailedness is an emergent property of branching processes, right?
452
00:46:22,630 --> 00:46:28,680
To get superspreading, you don't need individuals to be superspreaders: the branching process itself
453
00:46:28,680 --> 00:46:37,560
has high variance and heavy tails. And so I am now conjecturing, and starting to wonder, whether using a negative binomial likelihood or
454
00:46:37,560 --> 00:46:43,500
a Poisson likelihood alone is actually inappropriate for capturing the dynamics that we know to be true.
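That emergence is easy to demonstrate numerically: even with a plain Poisson offspring distribution (index of dispersion 1), generation sizes of a branching process become wildly overdispersed. A minimal sketch with illustrative parameters:

```python
import math
import random
import statistics

def poisson(lam, rng):
    # Knuth's multiplicative sampler; fine for small lambda
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def generation_size(r, n_gen, rng):
    # Galton-Watson branching with Poisson(r) offspring per individual
    z = 1
    for _ in range(n_gen):
        z = sum(poisson(r, rng) for _ in range(z))
    return z

rng = random.Random(11)
finals = [generation_size(1.5, 6, rng) for _ in range(3000)]
mean = statistics.fmean(finals)
index_of_dispersion = statistics.pvariance(finals) / mean
# each individual's offspring count is plain Poisson (dispersion index 1),
# yet the sixth-generation sizes are massively overdispersed
```

A mix of extinctions at zero and a heavy upper tail appears even though no single individual is a "superspreader", which is exactly the point being argued.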
455
00:46:43,500 --> 00:46:50,310
I actually don't know if it'll make a difference; I suspect it won't, unfortunately, as is the case with many of these things.
456
00:46:50,310 --> 00:46:52,890
You know, sometimes an approximation is often good enough,
457
00:46:52,890 --> 00:46:59,310
at least for epidemiological purposes, and doesn't actually change the underlying estimation, but I think it's extremely interesting.
458
00:46:59,310 --> 00:47:05,220
And so the question for the audience, for those who can help, is: I have these underlying distributions here.
459
00:47:05,220 --> 00:47:12,270
How do I get a parametric form for them? Maximum entropy seems good, but I have a couple of other techniques up my sleeve.
460
00:47:12,270 --> 00:47:15,490
But what's the best solution at the moment?
461
00:47:15,490 --> 00:47:21,190
Can you recover the probability distribution from a PGF? Or are there approximations that can solve the renewal equation even more
462
00:47:21,190 --> 00:47:28,870
efficiently than just using sums? We've tried Fourier and Laplace transforms and all that, and it doesn't make things much faster.
463
00:47:28,870 --> 00:47:32,830
So where do we go next from here? I want to do more unification.
464
00:47:32,830 --> 00:47:34,060
I want to unify all these approaches.
465
00:47:34,060 --> 00:47:40,840
I think they are all essentially the same thing and could benefit from more linkage, to understand the underlying roots behind them.
466
00:47:40,840 --> 00:47:48,370
I think that what we've described here is a perfect fit for phylogenetics: as opposed to using a coalescent model,
467
00:47:48,370 --> 00:47:55,990
why not use an age-dependent branching process with a fixed splitting of two, and then estimate a time-varying generation time,
468
00:47:55,990 --> 00:48:01,090
which would be a replacement for skyline plots? Simple to simulate from,
469
00:48:01,090 --> 00:48:08,020
simple to do inference from, and you can then link it straight to epidemics using the same equations.
470
00:48:08,020 --> 00:48:10,030
More efficient ways of computation surely exist.
471
00:48:10,030 --> 00:48:19,030
I'd like to embed these processes on graphs so we can marry the world of networks to these as well. And the question of immigration is huge:
472
00:48:19,030 --> 00:48:25,250
Bringing new infections from outside is very important and requires proper treatment.
473
00:48:25,250 --> 00:48:28,330
That's challenging, so I want to say thanks.
474
00:48:28,330 --> 00:48:35,770
First and foremost, to Mikko, who really did the heavy lifting of the maths on this and has been interacting with me throughout all of this.
475
00:48:35,770 --> 00:48:39,100
He's really lovely to work with and brilliant, actually brilliant.
476
00:48:39,100 --> 00:48:47,080
And to Thomas Mellan and senior moderator, who helped with all aspects of this over three or four drafts.
477
00:48:47,080 --> 00:48:55,720
And thanks for the work on the intermediate paper. Charlie, who helped me later on in subcommittee meetings, came up with this idea in the first place,
478
00:48:55,720 --> 00:48:58,600
a long, long time ago. It's been a project with amazing collaborators,
479
00:48:58,600 --> 00:49:07,570
and I hope it's useful for the epidemiological community rather than just a mathematical curiosity.
480
00:49:07,570 --> 00:49:13,280
Thanks. All right.
481
00:49:13,280 --> 00:49:17,450
We have 10 minutes for questions, please write them in the chat or raise your hand.
482
00:49:17,450 --> 00:49:21,720
And Chris, you're up. Thanks.
483
00:49:21,720 --> 00:49:31,050
Great talk. You said there was a difficulty with relating the standard renewal equation of epidemiology to prevalence.
484
00:49:31,050 --> 00:49:35,580
If we're talking numerically and not just analytically,
485
00:49:35,580 --> 00:49:43,860
is it not just a question of convolving the past incidence with the probability of contributing to the prevalence at a given moment,
486
00:49:43,860 --> 00:49:50,840
whether by being PCR-positive or seropositive or whatever? It is exactly that, except you,
487
00:49:50,840 --> 00:49:59,460
you need to first create a latent function for incidence, then convolve that, and then put that into a likelihood for prevalence.
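The convolution Chris describes can be sketched in a few lines; the incidence series and PCR-positivity curve below are invented placeholders, not fitted quantities.

```python
import numpy as np

# Hypothetical daily incidence (new infections per day); illustrative only.
incidence = np.concatenate([np.linspace(0.0, 500.0, 30), np.linspace(500.0, 100.0, 30)])

# Assumed probability of still being PCR-positive `lag` days after infection
# (an illustrative exponential decay, not an empirical curve).
max_lag = 30
p_pos = np.exp(-np.arange(max_lag) / 10.0)

# Prevalence at time t: past incidence weighted by detection probability.
prevalence = np.convolve(incidence, p_pos)[: len(incidence)]
print(prevalence[-1])
```

This prevalence series could then enter a likelihood directly, rather than being treated as a separate latent function.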
488
00:49:59,460 --> 00:50:03,000
It probably has the same impact in terms of posterior computation.
489
00:50:03,000 --> 00:50:11,490
It's just that when you pipe one thing into another thing, it sometimes can cause lots of issues with sampling from a difficult posterior.
490
00:50:11,490 --> 00:50:17,640
Whereas here, it's just the equivalent of having a renewal equation for prevalence. You don't need a latent function first and then a convolution.
491
00:50:17,640 --> 00:50:23,970
I mean, that's already in there anyway. It would be interesting to test and see how bad the posterior geometries are.
492
00:50:23,970 --> 00:50:31,050
I mean, you know, nothing we've put in here hasn't already had a solution in epidemiology and a lot of mathematical epidemiology.
493
00:50:31,050 --> 00:50:37,050
The question is, you know, is it better? In some ways, I think it's more intuitive and more succinct.
494
00:50:37,050 --> 00:50:41,220
Is it better? I think it's better from a computational perspective, but that remains to be seen.
495
00:50:41,220 --> 00:50:46,740
The way we've tested it probably changes on different data, and I don't know how the noise propagates.
496
00:50:46,740 --> 00:50:48,360
There's a lot of things here.
497
00:50:48,360 --> 00:50:56,010
If I want to fit to REACT and to cases, I can trivially have two likelihoods, and that really costs me very little,
498
00:50:56,010 --> 00:51:02,220
whereas it can sometimes get a little bit more difficult when you have to convolve one into the other. In principle,
499
00:51:02,220 --> 00:51:08,100
it's the same thing. Christoph, yeah, thanks.
500
00:51:08,100 --> 00:51:22,320
Great talk. I have two questions, really, about stochasticity. When we use a stochastic process to tune the renewal equation
501
00:51:22,320 --> 00:51:31,980
estimates, we tend to switch between the Bellman-Harris and the inhomogeneous Poisson representation more or less at random.
502
00:51:31,980 --> 00:51:39,540
You mentioned that you've shown that they behave the same in the mean field; do they behave the same stochastically?
503
00:51:39,540 --> 00:51:47,640
So that's the first question. And then, once we add overdispersion, we tend to do that in Bellman-Harris because it's easier there.
504
00:51:47,640 --> 00:51:52,530
But we've never found a good way to estimate the amount of overdispersion from the time series.
505
00:51:52,530 --> 00:51:58,860
So we always end up looking for other data sources. Do you think that's even possible?
506
00:51:58,860 --> 00:52:01,450
So, can we estimate the degree of stochasticity?
507
00:52:01,450 --> 00:52:07,620
Yeah, I mean, they are the same in the mean, but they're not the same as individual branching processes.
508
00:52:07,620 --> 00:52:14,550
But, you know, at the end of the day, I think it's only the degree of the variance, you know, that changes.
509
00:52:14,550 --> 00:52:18,720
But in terms of the mean, they're the same. But in terms of simulating, yeah, they are different.
510
00:52:18,720 --> 00:52:22,320
I think it depends on what you're going to summarise from them after that.
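To illustrate the "same in the mean" point, here is a rough sketch comparing a stochastic Poisson renewal simulation against the deterministic renewal equation. The reproduction number `R`, generation-interval pmf `g`, and seed size are invented values, and this Poisson construction is a stand-in for illustration rather than the full Bellman-Harris process.

```python
import numpy as np

rng = np.random.default_rng(1)

R = 1.5                 # assumed constant reproduction number
g = [0.2, 0.5, 0.3]     # assumed generation-interval pmf over lags 1..3
T = 15                  # days to simulate
n_sims = 2000           # Monte Carlo replicates
seed_infections = 10.0

def simulate():
    """One stochastic (Poisson) renewal trajectory."""
    inc = np.zeros(T)
    inc[0] = seed_infections
    for t in range(1, T):
        lam = R * sum(g[s - 1] * inc[t - s] for s in range(1, min(t, len(g)) + 1))
        inc[t] = rng.poisson(lam)
    return inc

# Deterministic (mean-field) renewal equation with the same inputs.
mean_inc = np.zeros(T)
mean_inc[0] = seed_infections
for t in range(1, T):
    mean_inc[t] = R * sum(g[s - 1] * mean_inc[t - s] for s in range(1, min(t, len(g)) + 1))

sim_mean = np.mean([simulate() for _ in range(n_sims)], axis=0)
rel_err = np.max(np.abs(sim_mean - mean_inc) / mean_inc)
print(rel_err)
```

Individual trajectories differ in their variance, but the Monte Carlo average tracks the mean-field solution.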
511
00:52:22,320 --> 00:52:29,040
In terms of the overdispersion question, I'm convinced that the overdispersion behaves as you'd expect.
512
00:52:29,040 --> 00:52:34,910
The benefit of using generating functions is that I can actually provide a renewal equation for
513
00:52:34,910 --> 00:52:41,460
the phi of the negative binomial that requires very little additional computation.
514
00:52:41,460 --> 00:52:49,200
So I can provide you, in closed form, what the variance should be from theory, rather than you just estimating one phi for all of your data, right?
515
00:52:49,200 --> 00:52:52,380
Currently, you get that incorrect by using the renewal equation and putting in a phi.
516
00:52:52,380 --> 00:53:00,240
And the problem is that that phi is stationary, but it should be non-stationary, given what we know from individual simulations.
517
00:53:00,240 --> 00:53:03,870
I could provide you with that equation for phi, of course.
518
00:53:03,870 --> 00:53:10,570
It's still not perfect, because we haven't accounted for the zero inflation, and I can provide you an equation for the zero inflation,
519
00:53:10,570 --> 00:53:16,230
but then fiddling with a zero-inflated negative binomial and solving the equations
520
00:53:16,230 --> 00:53:22,440
algebraically is something I've been struggling to do without using a non-linear solver.
521
00:53:22,440 --> 00:53:30,240
But essentially, I think we can provide you with the higher-order moments trivially.
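On getting the variance from generating functions: differentiating the negative-binomial PGF gives the variance in closed form. A sketch under one common mean/dispersion parameterisation (the values of `mu` and `k` are illustrative), checked against simulation via the Gamma-Poisson mixture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Negative binomial offspring distribution with mean mu and dispersion k
# (illustrative values, not fitted).
mu, k = 2.0, 0.5

# From the PGF G(s) = (1 + (mu/k)*(1 - s))**(-k):
#   mean     = G'(1)  = mu
#   variance = G''(1) + G'(1) - G'(1)**2 = mu + mu**2 / k
theory_var = mu + mu**2 / k

# Check against simulation using the Gamma-Poisson mixture representation
# of the negative binomial.
samples = rng.poisson(rng.gamma(shape=k, scale=mu / k, size=500_000))
print(theory_var, samples.var())
```

The same differentiate-the-PGF recipe extends to higher-order moments, which is what makes them cheap to propagate through a renewal equation.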
522
00:53:30,240 --> 00:53:37,080
And do you think it's identifiable? We've always struggled. No, I don't think it is. I think it's because we only have one epidemic.
523
00:53:37,080 --> 00:53:45,420
The truth is, it's completely not identifiable. You have to trust that the assumptions of your stochastic process represent reality
524
00:53:45,420 --> 00:53:52,890
reasonably well, and then say that although I can't identify what the variance is from observational data,
525
00:53:52,890 --> 00:53:57,240
I think from theory it should be this. But otherwise, I think it'd be completely unidentifiable.
526
00:53:57,240 --> 00:54:05,040
I mean, you know, we'd need to rerun the tape of time on epidemics many times over to actually get an estimate of the variance.
527
00:54:05,040 --> 00:54:08,430
Otherwise, we have confounding and all these other problems.
528
00:54:08,430 --> 00:54:15,020
But I think, you know, what I want to do, and am going to do, is fit the prevalence data from the ONS with,
529
00:54:15,020 --> 00:54:18,180
I think, a more appropriate likelihood than this one.
530
00:54:18,180 --> 00:54:24,840
And what I think we'll find is about the variance: as the second wave goes up, sometimes you see Rt spike, right?
531
00:54:24,840 --> 00:54:31,800
Randomly, Rt goes up to six or seven, just because of small-sample statistics, because we haven't got the variance correct.
532
00:54:31,800 --> 00:54:36,300
I don't think the mean will need to do that, and it'll keep Rt a little bit more stable.
533
00:54:36,300 --> 00:54:38,660
This could be very useful for policy.
534
00:54:38,660 --> 00:54:45,290
I think, you know, otherwise in the news you get "Rt is six", but there's only 10 cases or something, you know?
535
00:54:45,290 --> 00:54:50,860
And then, of course, a big part of the reason that exactly this happened is because of this.
536
00:54:50,860 --> 00:54:55,870
And, a question from Leonid in the chat.
537
00:54:55,870 --> 00:54:59,990
Sam, you wanted to address this question from earlier on.
538
00:54:59,990 --> 00:55:05,650
I'm wondering if there would be a straightforward way to represent the secondary attack rate in the framework you presented.
539
00:55:05,650 --> 00:55:11,200
Yeah, I mean, there is. It's baked in there, right?
540
00:55:11,200 --> 00:55:14,620
We have R, and that's always the mean that we put in.
541
00:55:14,620 --> 00:55:22,090
But you can always assume, in these equations, a distribution for the secondary infections;
542
00:55:22,090 --> 00:55:28,540
for example, I assume the secondary infections to be, you know, Poisson, negative binomial, discrete power law.
543
00:55:28,540 --> 00:55:34,730
And based on all of these, as long as I can calculate a second moment, I can get the variance and so forth.
544
00:55:34,730 --> 00:55:36,740
And so, just from what I was telling Christoph,
545
00:55:36,740 --> 00:55:43,880
I can actually provide a renewal equation for the variance. Now, that's based on me assuming I've got the underlying offspring
546
00:55:43,880 --> 00:55:48,200
distribution correct. But it's definitely better than just having a fixed variance for the process.
547
00:55:48,200 --> 00:55:53,670
We know it's more efficient. So neither is perfect. But I think, did that answer your question?
548
00:55:53,670 --> 00:56:02,460
A secondary attack rate means the percentage of people in some setting that are getting infected from the index case.
549
00:56:02,460 --> 00:56:07,890
Yeah. Well, Leonid, do you want to clarify?
550
00:56:07,890 --> 00:56:19,860
Yeah, that's exactly what was said. So, but then, do you mean: what is the prevalence from an index case?
551
00:56:19,860 --> 00:56:29,830
There's a time component as well. So if it's within a generation interval, I guess. I mean, you know, yeah,
552
00:56:29,830 --> 00:56:35,340
in principle, we can do this by just choosing the random characteristics of the marks.
553
00:56:35,340 --> 00:56:41,840
But obviously, you'd have to write down the maths for it. OK.
554
00:56:41,840 --> 00:56:48,660
Let me go to someone who hasn't gotten a chance to ask a question, so, Francesco, you wrote a question.
555
00:56:48,660 --> 00:57:00,900
Yeah. Francesco asks: how much variability do you get when you look at the total attack rate at the end of the wave? It seems quite high.
556
00:57:00,900 --> 00:57:09,750
Well, I mean, at the end of the day in this block over here, you know, the population prevalence is impacted by the attack rate.
557
00:57:09,750 --> 00:57:14,280
Do you mean the total number infected, or the number infected at each time? Just to clarify.
558
00:57:14,280 --> 00:57:18,660
Yeah, the total number of infected. Thank you. Yeah, I mean, that's very easy.
559
00:57:18,660 --> 00:57:24,870
We have cumulative incidence. You can generate that very trivially from this exact equation.
560
00:57:24,870 --> 00:57:32,370
So I can fit the prevalence, get the incidence here, and actually get total cumulative incidence as well, and then divide that by the population.
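The cumulative-incidence step is literally a running sum over the fitted incidence; a minimal sketch with made-up numbers:

```python
import numpy as np

# Hypothetical fitted daily incidence from a renewal-equation model
# (invented numbers) and an assumed population size.
incidence = np.array([10, 20, 40, 80, 120, 150, 140, 100, 60, 30], dtype=float)
population = 1_000_000

# Cumulative incidence is just a running sum: no extra model fitting needed.
cumulative = np.cumsum(incidence)

# Attack rate: fraction of the population ever infected.
attack_rate = cumulative[-1] / population
print(cumulative[-1], attack_rate)
```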
561
00:57:32,370 --> 00:57:36,900
It requires no extra computation. So, how much variability do I get?
562
00:57:36,900 --> 00:57:43,710
I mean, it's sort of a difficult question to answer because all these equations are measuring the exact same process.
563
00:57:43,710 --> 00:57:48,300
The amount of variability you get in terms of incidence will be the same as you get in prevalence.
564
00:57:48,300 --> 00:57:51,750
But they're all essentially dependent on a single Rt estimate,
565
00:57:51,750 --> 00:57:59,920
a single g. From a single Rt and a single g, you can go to individual incidence and prevalence.
566
00:57:59,920 --> 00:58:04,160
Christoph, you still have your hand up, but maybe that's old. I hope that answered your question.
567
00:58:04,160 --> 00:58:11,820
Please, you know, jump in if not. And you're still on mute. OK, you've taken it down.
568
00:58:11,820 --> 00:58:18,900
So let's do one more follow-up. What about Chris's? Yeah, Chris asked about exact results for the offspring distribution.
569
00:58:18,900 --> 00:58:27,960
Yeah, you can. Well, I mean, when you calculate the higher-order moments for the renewal equation, you need to assume an offspring distribution.
570
00:58:27,960 --> 00:58:30,990
This is something you have to do.
571
00:58:30,990 --> 00:58:40,260
And yeah, as long as the offspring distribution has a second moment, then you can compute it. If it doesn't have a second moment,
572
00:58:40,260 --> 00:58:51,090
well, the variance can be essentially unbounded. OK, so we are officially at the end, so let's take a moment to once again thank Samir.
573
00:58:51,090 --> 00:58:57,540
So if people want to unmute and clap, that would be a way to do it; if they want to use emojis, that's also fine.
574
00:58:57,540 --> 00:59:03,840
And then I will just volunteer Sam to stick around and informally answer more questions or give some more feedback.
575
00:59:03,840 --> 00:59:08,160
But this officially closes it. Well, you know, thanks for inviting me, Dominic.
576
00:59:08,160 --> 00:59:13,260
Please reach out if you want the code; it's programmed in R, Python, Julia.
577
00:59:13,260 --> 00:59:19,620
I'm happy to share, and I've written up a notebook to make it simple, with some examples.
578
00:59:19,620 --> 00:59:21,696
How do you share? Just want to?