I was just curious. OK, I'm going to have to keep an eye on the chat for this. Yes, I hope it will — I hope this is — yeah, you can see it's on. Yeah, so I can send my slides. That's correct.

So, somewhere out there on the interweb, can you indicate whether you can hear me? A chat message? Sorry, this is a job for John. John has put out a message — yes, they can hear us.

Well, welcome, and thanks to Robert for standing next to me here. So welcome today to Robert Goudie. He's going to tell us about some of his work on Markov melding. This is of interest to me personally, because this kind of model specification is something we have to deal with in parametric inference, and he's one of the people who's dealing with it. Robert was actually a student here in Oxford a long time ago, and he's gone on to great things at the Biostatistics Unit in Cambridge. So thanks very much, Robert.

Do we need to worry about that? So hopefully someone somewhere will shout out a message and say if you can't hear me remotely. Thanks very much to Jeff for inviting me — it's nice to be here to present this work.

So this work is — it doesn't seem to be working? It was working a second ago. Maybe I need to move. No, it's working now. So: this work is motivated by problems where you have more than one source of data. You might have multiple studies, each of which of course has separate pieces of data associated with it, which we'll call Y1 up to Y4. And each of these studies, we think, is complementary in some way: they might be from different populations, or they might tell us about different aspects of — in my line of work — some kind of disease, human disease usually. So ideally we'd like to piece them together, to hopefully give us better inference.

To piece them together, we need to assume that there's some degree of commonality between these different bits of data. And for most of this talk, we're going to assume that the commonality is that we've got some parameter phi here which is shared by all of the models for each of these different studies. By combining them, hopefully that will give us more precise estimates, just from the fact that we've got a larger sample size. It will hopefully also give us a truer representation of the uncertainty that's inherent in the situation we're looking at.
So in some situations the variance might be larger, but that will hopefully be a more accurate reflection of what's going on. And finally, it will hopefully minimise the risk of selection-type biases: if we just use one of these datasets, and maybe that only considers a particular cohort of patients, then that might be biasing our results, in the usual selection-type way.

To make it a little bit more concrete: in my line of work, people often come along with multiple different datasets. There might be a clinical trial, where they've investigated the effect of some specific intervention on some disease. There might be a cohort study, which might have run for 20 or 30 years and will be of moderate size, whereas the clinical trial will tend to be relatively small and have very strong inclusion criteria, so it won't be terribly representative of the whole population. We may also have health record data from GPs and hospitals, which will be larger but potentially subject to different types of biases from a more carefully controlled clinical trial or cohort study. And finally, we might also have genomic or other kinds of high-throughput data. Ideally, we would like to combine these all together to provide a complete understanding of whichever disease we're interested in.

But in practice, doing that, at least as a Bayesian — and probably as a classical statistician as well — will be very hard, because all of these different types of data have their own unique complexities, and so identifying a suitable model that describes all four of them simultaneously is well-nigh impossible. Even if you could specify such an enormous model, fitting it, at least with the computational tools that are available at the minute, would be pretty difficult. And even if you could do that, with such a large amount of data and such a complicated model, working out whether the posterior distribution from that enormous model is actually a good fit to your data, and really captures what's going on, will be very hard — never mind, if you get results that are counterintuitive in some way, trying to figure out where the problem is arising from when you've got so many bits of data and parameters.

So what we're coming at with this work is the idea that you might not try to do this all at once. You might instead have smaller sub-models for each of these different types of data: a model one that describes the clinical trial data, a model two for the cohort study, and so on.
And in fact, this is often how things are: someone will already have analysed the clinical trial, and they will have developed the model; and similarly, for all of the other types of data, there will be existing models. No one is going to start trying to model all of these data together immediately from nothing. And so while we could redo all of that hard work in building models one up to four ourselves, it seems pretty wasteful to throw all of that existing work away and start from scratch again. What we'd like to do is find some way to take these existing models and integrate them into an overall analysis, ideally as generically as possible.

The existing ways that people do this are as follows. One way that is extremely widely used is that you take one of these models, you get your posterior distribution, and then you take a point estimate of the common quantity — which in this case is phi — from that posterior distribution, plug it into a second model, and then carry on and do that each time. This, I think, is everywhere. It's actually quite hard to find examples of it, because everyone hides it in the supplementary materials of their paper, but I think this happens absolutely everywhere. And of course, statistically, this probably isn't ideal, because you're taking an uncertain quantity and plugging in a point estimate, so at least in general this will underestimate uncertainty. Of course, in specific situations, if the uncertainty is small relative to the problem you're looking at, it may be quite reasonable.

A second approach is to plug an approximation to the posterior into the second model that you're fitting: rather than just taking a single fixed point estimate, you take some distribution that looks similar to the posterior distribution and propagate that. We'll come back to that in a minute. And finally, what we are trying to do is integrate these models into a single joint Bayesian model, which of course then brings along all of the advantages of Bayesian inference — but it can be quite tricky to do that in practice, and that's the problem we're trying to solve.

So here's a slightly more concrete toy example, based on the more complicated example I'll show later, that will hopefully make things a little bit clearer. This is where all of this work really came out of, which was in trying to model flu. One of the quantities of interest in that world is the probability of being hospitalised, given that you have influenza-like illness.
Let's imagine that we observed 100 people in hospital with influenza-like illness, out of a thousand people in the population who had influenza-like illness. One model for that would be a very simple beta–binomial model, and that's obviously very simple to fit. But of course, in practice we never really know how many people there are in the whole population who have influenza-like illness, because you can't go out and ask every single person in the population, and so there's some degree of uncertainty about that. We could represent that with a prior on little n, which might be, say, a Poisson distribution, for example.

But then imagine that we have new data from a similar geographic area, for example, where we've managed to go out and count how many people in the population have influenza-like illness — maybe about 40 out of 500 people. That gives us another beta–binomial model. And if we assume that the populations are similar, then we can take the posterior distribution of this q parameter, which gives us an estimate of the proportion of people in the population who have influenza-like illness. Then, if we know the total number of people in the original population, we can plug that q into another binomial model, and that gives us a second model for n, the number of people in the population with influenza-like illness.

But now we have not just this binomial model for n; we also have that original prior that we had for n. So we have to resolve the fact that we've got two models for the same quantity. Of course, one option would just be to throw away the Poisson prior that we had for n and take the new model as the truth — but this new data is coming from a merely similar area, and maybe we've done a lot of work to elicit, for example, that Poisson prior. So it would be nice if we could not just throw away one of these priors or models, but actually integrate them in some way.

Things get even more complicated if you have a stratification of your original population — say you've got two age groups, for example. So now we've added an i subscript to all of this in the model, and rather than having a direct prior for the total number of people in this population, we have a model for n1 and n2, where n1 plus n2 equals n. So now we have a binomial model for n that comes from the similar-area data.
And then we have a second model in which n is a deterministic function of two other parameters — obviously, here, a very trivial deterministic function. And it turns out that these two issues are the ones that make this problem difficult in general: first, that you have two separate models or priors for the same quantity; and second, that sometimes one of those models may be related to the original parameters through a deterministic function.

So graphically, that's what this example looks like. We started with a beta–binomial model, where the squares represent data and the circles represent parameters. We then acknowledged that we're uncertain about n, so that becomes a parameter. We then added a second model from a similar area, so we now have two models for this quantity. And finally, we split this n up into n1 and n2, which meant that the repeated quantity was the output of a non-invertible deterministic function in this model here.

OK, so to return to what the aims of this are, in more generality: we imagine that we've got several models, each of which involves some common quantity phi, and what we'd like to do is create a generic way of joining all of those models together into a single joint model. In some cases this will be very easy. It turns out that the cases where it's hard are where you implicitly have two different priors for the same quantity — which doesn't make sense in Bayesian inference — and, secondly, where you need to handle these non-invertible deterministic transformations. We'd also like to do this in what we call a staged or modular way, so that you don't have to think about the whole model at once, but can gradually build up inference towards the full joint model. And thirdly, we'd also like to understand the reverse operation, which is kind of the opposite: when you take an existing large model and want to split it up. That can be useful for understanding what's going on, or potentially it could be a useful inference method.

So here's the notation I'm going to use throughout. We imagine we have capital M separate models, or sub-models, each of which involves this common parameter phi, some sub-model-specific parameters psi_m, and some sub-model-specific data Y_m. What we'd like to do is create a generic method for integrating these M different models into a single joint model that involves all of the quantities together. That's what we're aiming to do.
It turns out that a special case of this had been considered in the 90s, in a slightly different context, by Phil Dawid and Steffen Lauritzen. They considered the special case where the prior marginal distributions for the common quantity were all equal — so each of these sub-models has the same prior for the common quantity. If you have that, then it's relatively obvious what a sensible model should be. We have these capital-M separate sub-models; we can then condition on phi, which gives us the sub-model-specific conditional here, and the prior marginal. And once you've done that, it's fairly obvious that a sensible model would be to take this common prior for phi and then the product of the sub-model-specific conditionals. This is called a Markov combination, by Phil and Steffen, and it's also been examined a bit further by Massa and co-authors more recently.

— There's a chat message, but I don't know how to — oh, OK, I see: I just need to talk more loudly. Is that better? OK.

So, the properties of the Markov combination model are: the sub-model-specific parameters and data are conditionally independent given the common quantity; the sub-model-specific conditionals given the common quantity are preserved between the Markov combination model and the original models; and indeed the sub-model marginals as well — the joint distribution of each of the sub-models is preserved in the Markov combination.

What we were looking at was a more general problem, where these marginals may be inconsistent. Hopefully that doesn't seem too contrived, given the examples I showed you at the beginning. So we need to resolve the fact that we've got these inconsistent marginals from each of the sub-models. What we're going to choose to do is pool them together: we have some function g that takes as input each of these prior marginals and spits out a single marginal, which we're going to call p_pool. And once we've done that, we're essentially back in the same situation, so it's quite reasonable that the joint model for everything should be this pooled prior marginal multiplied by the sub-model-specific conditionals.

The properties of this, which we call Markov melding, are quite similar to Markov combination: the conditional independence and the sub-model-specific conditionals are preserved again.
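In symbols — a light reconstruction from the definitions just given, writing $p_m(\phi, \psi_m, Y_m)$ for the sub-models — the Markov combination, valid when all sub-models share the same prior marginal $p_m(\phi) = p(\phi)$, is

$$p_{\text{comb}}(\phi, \psi_1, \ldots, \psi_M, Y_1, \ldots, Y_M) = p(\phi) \prod_{m=1}^{M} p_m(\psi_m, Y_m \mid \phi) = p(\phi) \prod_{m=1}^{M} \frac{p_m(\phi, \psi_m, Y_m)}{p_m(\phi)},$$

and Markov melding replaces that common marginal with a pooled one,

$$p_{\text{meld}}(\phi, \psi_1, \ldots, \psi_M, Y_1, \ldots, Y_M) = p_{\text{pool}}(\phi) \prod_{m=1}^{M} p_m(\psi_m, Y_m \mid \phi), \qquad p_{\text{pool}}(\phi) = g\big(p_1(\phi), \ldots, p_M(\phi)\big).$$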
On the other hand, the sub-model-specific marginals — the joint distributions of each of the sub-models — are no longer necessarily preserved, because we've changed the marginal distribution of the common quantity phi. So we call this Markov melding: the "Markov" bit comes from the Markov combination, and I'll explain in a minute the connexion to where the word "melding" comes from.

So how are we going to go about forming these pooled priors? Well, there are several options. It turns out that this problem is basically the same as a problem that's been talked about since the 1980s in prior elicitation, where people have asked multiple experts what prior they think should be chosen for a particular parameter, and then need to combine those separate priors together in some way. The options that have been suggested are: what's called linear opinion pooling, which is basically just a mixture of each of the sub-model-specific priors, weighted by some quantities w; logarithmic opinion pooling, which is almost the same but on the log scale; product of experts, which is the same as logarithmic pooling but with the weights all set to one; or what we somewhat jokingly call dictatorial pooling, which is where you set the pooled prior to be one of the original sub-model priors — so you throw away all of the priors apart from one.

Here's what these look like in a couple of settings. The inputs are these two black Gaussian densities, and the output pooled density is in colour. You can see that in this situation, where there's some degree of difference between the two input densities, the linear opinion pool is a mixture of the two inputs, whereas the logarithmic and product-of-experts pools end up being the most concentrated.

Several properties of pooling have been investigated in the same literature. One of them is the property of being "externally Bayesian", which is basically the idea that if you perform the pooling on each of the posteriors that you get from the sub-models, you should get the same thing as if you pool the priors together first and then calculate the posterior. It turns out that this property holds for logarithmic pooling with weights summing to one. Unfortunately, this property doesn't really make sense in the context we're interested in, because here we potentially have different likelihoods and different data in each of the sub-models, and so the externally Bayesian property doesn't even hold for logarithmic pooling in our context. So while it might be a property that would guide you, it's not very useful for us.
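As a rough numerical illustration of these pooling options — a minimal sketch with made-up function names and two Gaussian inputs evaluated on a grid, not code from the talk:

```python
import numpy as np
from scipy.integrate import trapezoid
from scipy.stats import norm

# Two sub-model prior marginals for the common quantity phi, tabulated on a grid.
phi = np.linspace(-6.0, 6.0, 2001)
p1 = norm.pdf(phi, loc=-1.0, scale=1.0)
p2 = norm.pdf(phi, loc=1.5, scale=1.0)

def normalise(f, x):
    """Rescale a non-negative function on the grid so it integrates to one."""
    return f / trapezoid(f, x)

def linear_pool(priors, weights, x):
    """Linear opinion pooling: a weighted mixture of the input priors."""
    return normalise(sum(w * p for w, p in zip(weights, priors)), x)

def log_pool(priors, weights, x):
    """Logarithmic opinion pooling: a weighted product, i.e. pooling on the log scale."""
    return normalise(np.exp(sum(w * np.log(p) for w, p in zip(weights, priors))), x)

def product_of_experts(priors, x):
    """Product of experts: logarithmic pooling with every weight set to one."""
    return log_pool(priors, [1.0] * len(priors), x)

def dictatorial_pool(priors, chosen):
    """Dictatorial pooling: keep one sub-model prior and discard the others."""
    return priors[chosen]

pooled = {
    "linear": linear_pool([p1, p2], [0.5, 0.5], phi),
    "logarithmic": log_pool([p1, p2], [0.5, 0.5], phi),
    "product of experts": product_of_experts([p1, p2], phi),
    "dictatorial (model 1)": dictatorial_pool([p1, p2], chosen=0),
}
```

With two well-separated Gaussians, the linear pool keeps mass near both inputs, while the logarithmic and product-of-experts pools concentrate between them, matching the behaviour described for the slide.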
So, I said I'd come back to where this word "melding" comes from. It comes from a paper by David Poole and Adrian Raftery in 2000, who considered inference for a deterministic function f — the f here was a deterministic differential equation model of the number of whales in the sea. The standard Bayesian model for this would be that they have these observations of the number of whales in the sea, which they treat as noisy observations of the output of this deterministic function, and they've got a prior on the inputs of that deterministic function. What was unusual about Poole and Raftery's setting was that rather than just having priors on the inputs of the deterministic function, they also had priors on its outputs — they wanted to constrain the outputs of this differential equation model in a sort of soft way, by imposing a prior on the output. So now they have not just the one prior for phi that's directly specified; if you transform the prior on the inputs through to phi, they again have two priors for the output of the deterministic function, and they need to resolve this in some way. Their solution is basically that, even if this f is not invertible, you extend it to an invertible function, back-transform the prior on phi onto the inputs, and then pool those two priors. And it turns out that the prior you get in the end doesn't depend on the way in which you extend that non-invertible function into an invertible one. So that's where the "melding" comes from.

OK, a few final miscellaneous notes on this. Of course, this procedure gives you a joint model for all of the data, but it could be ludicrous: especially if the sub-models strongly conflict with each other, there's no guarantee that it will make any sense. Another thing: I've said the idea is that we're trying to create something that is modular. Some of you might be familiar with the idea of cut distributions, which are another conception of a modular approach. Markov melding is quite different: Markov melding creates a full joint Bayesian model, and if you believe that that melded model is appropriate for your setting, then it's just standard Bayesian inference at that point — there's no changing of the posterior distribution, there isn't a "cut". Finally, there's an approximate approach that turns out to be an approximation to Markov melding.
I said at the beginning that sometimes people plug some kind of simple parametric approximation to a posterior distribution into a second model, rather than just plugging in a point estimate. And one of these turns out to be an approximation to Markov melding. If you first obtain the posterior distribution for your first model, then approximate the posterior marginal of the common quantity phi by a normal distribution, and then plug that normal distribution into the second model — saying that you've got an "observation" of phi whose value is the point estimate from the first model and whose variance–covariance matrix is the estimated variance–covariance matrix from the first model — then it turns out that this is equivalent to Markov melding with a specific form of pooling, the product-of-experts pooling. So that can be useful in practice if you want a quick approximation to this.

Turning now to inference. Writing down the posterior distribution is, of course, very straightforward: it's just proportional to the melded joint distribution here. In the special case of product-of-experts pooling, where you set the pooled prior to be the product of the marginals, these marginals cancel out and you get something even simpler. But in general, if you don't use product-of-experts pooling, you'll need some estimate of these prior marginals p_m(phi) from each of the sub-models. In some situations those will be analytically tractable and everything will be fine; but in the situations we've looked at, these phis are generally not root nodes in a graphical representation, and so you need to estimate these prior marginals. The way we've done that so far is just sampling from those prior distributions and then approximating them with a kernel density estimate. This obviously won't work very well with anything beyond a low-dimensional common quantity phi.
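To spell out the posterior just described — a reconstruction consistent with the melded joint model defined earlier:

$$p_{\text{meld}}(\phi, \psi_1, \ldots, \psi_M \mid Y_1, \ldots, Y_M) \;\propto\; p_{\text{pool}}(\phi) \prod_{m=1}^{M} \frac{p_m(\phi, \psi_m, Y_m)}{p_m(\phi)},$$

so each MCMC evaluation involves the prior marginals $p_m(\phi)$, which is where the kernel density estimates come in. Under product-of-experts pooling, $p_{\text{pool}}(\phi) \propto \prod_{m} p_m(\phi)$, the marginals cancel and the posterior is simply proportional to $\prod_{m} p_m(\phi, \psi_m, Y_m)$.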
Once you've done that, in principle, sampling from this posterior distribution can be done using standard methods. You can use a Metropolis-within-Gibbs approach, for example, where you sample the sub-model-specific parameters conditional on the common quantity in one step — and that step would be exactly identical to the sampler you would need for just the original sub-model. Then, in the other step, you sample the common quantity conditional on all the sub-model-specific quantities, and you just need to come up with a reasonable proposal distribution there, which may or may not be straightforward.

But what we've been more interested in is a more modular approach, where you can sample from one sub-model and then gradually build up towards the full joint model. This is what we call a multi-stage algorithm, and it has been explored by a few papers, some of which are cited there. I'm going to tell you how this works in the special case of two models. So we have two models, p1 and p2, and what melding tells us how to do is form a joint model for all of these quantities here. To do computation in this case, what we first of all do is draw samples from the first model — from the posterior distribution of the common quantity under the first model, which is p1(phi | Y1). Then we use those posterior samples as a proposal in the MCMC for the full joint model. And it turns out that this means the likelihood terms relating to the first model cancel in the MCMC, so your second-stage MCMC doesn't require any knowledge of the first model apart from those samples. In this sense, it's a sort of modular approach.

To make that a bit more precise: in stage one, we draw samples from the first model's posterior distribution and retain them. In the second stage, we consider the full model. To draw samples of the sub-model-specific parameters of the second sub-model, we just use whatever standard method would be used if you were fitting that model by itself. But for the common quantity phi, we propose a value by drawing one of those stage-one samples uniformly at random — essentially drawing a sample from the first-stage posterior distribution, assuming everything has converged. Then, if you write out the usual Metropolis–Hastings acceptance ratio, because we've set the proposal distribution q to be equal to the first-stage posterior — or something proportional to it — those terms cancel in the acceptance probability, leaving us with an acceptance probability that doesn't depend on p1 at all, apart from its prior marginal, which, as I said before, you need to estimate somehow if it's not directly tractable. And of course, this can, at least in principle, extend to any number of further stages if you've got more than two models.
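Here is a minimal sketch of that two-stage scheme for two sub-models, under simplifying assumptions: everything is univariate, the stage-one posterior samples are assumed to already exist, and the model-2 densities are toy stand-ins. All names are hypothetical; this illustrates the idea rather than reproducing the authors' implementation.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(1)

# Stage 1 (assumed done elsewhere): samples of phi from p1(phi | Y1),
# e.g. saved draws from a standard MCMC run of sub-model 1 on its own.
phi_stage1 = rng.normal(0.5, 0.2, size=5000)          # placeholder stage-1 draws

# The prior marginal p1(phi) is not tractable here, so estimate it by sampling
# the model-1 prior and smoothing with a kernel density estimate.
phi_prior1_draws = rng.normal(0.0, 1.0, size=5000)    # placeholder prior draws
kde_p1_prior = gaussian_kde(phi_prior1_draws)

def log_p1_marginal(phi):
    return float(np.log(kde_p1_prior(phi)[0]))

# Toy sub-model 2: phi ~ N(0,1), psi2 ~ N(0,1), Y2 | phi, psi2 ~ N(phi * psi2, 1), Y2 = 1 observed.
def log_p2_joint(phi, psi2):
    return -0.5 * (1.0 - phi * psi2) ** 2 - 0.5 * psi2 ** 2 - 0.5 * phi ** 2

def log_p2_marginal(phi):          # prior marginal of phi under sub-model 2
    return -0.5 * phi ** 2

def log_pooled_prior(phi):         # product-of-experts pooling of the two marginals
    return log_p1_marginal(phi) + log_p2_marginal(phi)

# Stage 2: Metropolis-within-Gibbs on the melded model for (phi, psi2).
phi, psi2, samples = float(phi_stage1[0]), 0.0, []
for _ in range(2000):
    # (a) Update psi2 exactly as a sampler for sub-model 2 alone would.
    psi2_prop = psi2 + rng.normal(0.0, 0.3)
    if np.log(rng.uniform()) < log_p2_joint(phi, psi2_prop) - log_p2_joint(phi, psi2):
        psi2 = psi2_prop
    # (b) Update phi, proposing a stage-1 draw uniformly at random. The proposal is
    # proportional to p1(phi | Y1), so the model-1 likelihood terms cancel: the acceptance
    # ratio only involves sub-model 2 and the pooled-prior / prior-marginal corrections.
    phi_prop = float(rng.choice(phi_stage1))
    log_num = (log_p2_joint(phi_prop, psi2) - log_p2_marginal(phi_prop)
               + log_pooled_prior(phi_prop) - log_p1_marginal(phi_prop))
    log_den = (log_p2_joint(phi, psi2) - log_p2_marginal(phi)
               + log_pooled_prior(phi) - log_p1_marginal(phi))
    if np.log(rng.uniform()) < log_num - log_den:
        phi = phi_prop
    samples.append((phi, psi2))
```

With product-of-experts pooling the pooled-prior and prior-marginal corrections cancel exactly, as noted earlier; they are written out here so the same sketch works for other pooling choices by swapping in a different log_pooled_prior.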
Let me now talk you through the more substantive example that the original toy model was based on. This was looking at estimating the probability of hospitalisation from a specific form of flu — the H1N1 strain that went around England in 2010 — and what we were looking to do was estimate the total number of ICU admissions that happened throughout that, well, not pandemic — whatever you call something that's less than a pandemic.

We had various sources of data. One was weekly numbers of suspected cases of H1N1 in several ICUs in the UK. The second was positivity data: not all of these suspected cases actually did have H1N1, but for a sub-sample we had confirmation from further testing of whether they were H1N1. And then we also had various other indirect data — the number of GP consultations, the number of hospitalisations that didn't get to ICU, the number of deaths, et cetera. To keep this not too complicated, I've simplified this secondary data down to an informative prior.

So the models work like so. The first model is the model for the intensive care unit data. These data y are the weekly numbers of suspected cases of H1N1 in each ICU, and they're related to this parameter, which has lots of components that represent a kind of birth–death process of people coming in and out of ICU. But of course, all of this bit of the model relates to suspected H1N1. If we want to estimate the true number of H1N1 cases, we need to relate that to the positivity data that's available for a subset of these patients, which is represented by this pi-positive quantity here. And if we combine the estimates of the suspected number of cases with the positivity data, we can estimate the number of confirmed cases, which is going to be the common quantity phi in this case.

The second model is highly simplified here, to make this vaguely understandable — it's just a binomial model in this case. The key is that phi enters in a binomial: some proportion of this larger quantity are actually cases of confirmed H1N1 in an ICU. And we've got a very informative prior on this — this pi here; sorry, on the k — which wraps up all sorts of other bits of the model. So we'd like to combine these two models into a single model.
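In very rough symbols — a hedged sketch of the shared-quantity structure only, with hypothetical notation (theta for the ICU admission–discharge parameters, h for the function combining them with the positivity proportion, k and pi for the severity model's denominator and proportion); the real sub-models contain considerably more structure than shown here:

$$\text{Model 1 (ICU):}\quad Y_1 \sim p_1(\,\cdot \mid \theta\,), \qquad \phi = h(\theta, \pi^{\text{pos}}) = \text{number of confirmed H1N1 ICU admissions},$$
$$\text{Model 2 (severity):}\quad \phi \mid k, \pi \sim \operatorname{Binomial}(k, \pi), \qquad k \sim \text{informative prior summarising the other data sources}.$$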
And we used melding to do that, which tells us how to form a joint model from these two models that aren't immediately compatible. We had to combine the priors, and the phi here was split into two separate age categories in this example, so the priors here are bivariate distributions: the x-axis is one age group and the y-axis is the other — I think this was children on the x-axis and adults on the y-axis. The panels on the left are the original priors that we had for phi under model one and model two. The ICU model had a very flat prior, whereas the severity model, as I said, was informative — it's quite peaked. On the right-hand side is what you get if you pool those two together: in this case, because the ICU model's prior is so flat, you get something that's quite similar to just the severity model's prior by itself.

These are the results for the phi parameters in the second model. The top row shows the posterior distribution from the first model alone; the second row is the posterior distribution from the second model alone — you can see some degree of difference between the two. The bottom few rows show what you get from the melded model, using various different pooling techniques, and also using the normal approximation method that I mentioned. What's quite reassuring here is that there's a fair degree of similarity across all of these different pooling methods, but the variance is reduced by combining the two datasets together. I'm going to skip this next example in the interests of time.

As I mentioned before, you can also do the reverse of this process: you can start with a joint model and then split it up into sub-models. You take these sub-models to be faithful to the original model, in the sense that if we then joined them back together with melding, we'd get the joint model again. This could be useful for dividing up computation in a really big model, or it could be useful for improving understanding: when you're not quite sure what's going on in a very big model, it might be useful to work out which parts of the model are providing the information. You can obviously only do this if you've got conditional independence between the two bits of the model that you want to split apart — so graphically, you can't have a kind of V-structure of colliders across the split. As long as you have that, you've got a fair degree of choice about the sub-model-specific priors that you'd like to use for each of the components that you split into: so long as, when you join those sub-model-specific priors back together with whatever pooling you're using, you get back the original prior, you can do whatever you like. For example, if you're using product-of-experts pooling, where you just multiply the components together, then you could use a fractionated prior, or indeed any other factorisation of the original prior into M factors.
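As a small formal statement of that condition, in the notation used earlier: when splitting a joint model whose prior marginal is $p_{\text{joint}}(\phi)$, the sub-model priors $p_1(\phi), \ldots, p_M(\phi)$ may be any choice satisfying

$$g\big(p_1(\phi), \ldots, p_M(\phi)\big) = p_{\text{joint}}(\phi).$$

Under product-of-experts pooling this reads $\prod_{m=1}^{M} p_m(\phi) \propto p_{\text{joint}}(\phi)$, which is satisfied by the fractionated choice $p_m(\phi) \propto p_{\text{joint}}(\phi)^{1/M}$, or indeed by any other factorisation of $p_{\text{joint}}(\phi)$ into $M$ factors.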
We used this in a model from ecology, where they had two different sources of data on a common quantity phi: mark–recapture data, where they capture animals, tag them and then catch them again — that's one Y — and also data from an index survey, where they recorded the number of birds counted, in this case. We can split the joint model for all of these quantities into two separate sub-models. And what was reassuring to us was that if we apply our two-stage algorithm, fitting this model first and then that model afterwards, we get results that are very similar to what we get if we fit the full joint model all together — that's what this graph shows. I had better keep an eye on the time.

So, the multi-stage algorithm that we proposed is in principle very nice, because it lets you split up the computation into separate parts. But, as probably many of you are guessing, it's not perfect, because you're using a quite simple proposal distribution. And there's another problem, which is that in our acceptance probability we need to estimate the ratio of these prior marginals p1(phi) at two separate points whenever we make an MCMC move for the common quantity. What I said at the beginning was that we were going to use kernel density estimation to estimate that ratio, and that definitely won't work if phi is high-dimensional. But even if it's not high-dimensional, if we just have samples from these p1(phi) quantities, then kernel density estimation won't give us very good estimates of that quantity in the tails. And because we've got a ratio here, even if we get the quantity in the denominator wrong by a relatively small amount, that can blow up very quickly, and can mean that we accept moves that we shouldn't really accept — so we can move out from the main mass of the distribution here to some point in the tails and end up getting stuck there.
You can do a little bit better if, rather than just drawing samples from these marginals directly, you draw from some weighted version of them — and that's what this paper here describes.

OK. So far I've talked about melding when there's a single quantity that's common to all of the models, but not all situations are like that. You might have models that are related in different ways, and one way that can happen is models that form a chain-like structure. So we have several models, and then a kind of Venn diagram where we have a common parameter phi_{1∩2} that's shared between model one and model two, and similarly phi_{2∩3} between model two and model three, and so on. The original formulation of melding doesn't tell you what to do in this situation.

The notation for this setup is that we have capital M separate models, again with sub-model-specific parameters psi_m and sub-model-specific data Y_m. But now, rather than having a single phi that's common to all of the models, we have phi_{m∩m+1}, which is common to models m and m+1. So we have M minus 1 common quantities shared across the M different models, and again what we'd like to do is find some generic way of forming a single joint model for all of these quantities.

What we propose is similar to the original melding idea: we take each of the sub-models, divide through by the prior marginal for the common quantity — or quantities — in that model, multiply all of these together, and then replace the priors for the common quantities by a single pooled prior for all of the common quantities together. Note that this won't, in general, be the same as performing the original two-model form of melding twice — except when the two shared quantities are a priori independent in the middle model. And of course, we can generalise this M-equals-three case in the obvious way to a general number of models M.
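For the three-model case just described, a reconstruction of the chained melded model in symbols, writing $\phi_{1 \cap 2}$ and $\phi_{2 \cap 3}$ for the two shared quantities:

$$p_{\text{meld}}(\phi_{1\cap 2}, \phi_{2\cap 3}, \psi_1, \psi_2, \psi_3, Y_1, Y_2, Y_3) = p_{\text{pool}}(\phi_{1\cap 2}, \phi_{2\cap 3})\, \frac{p_1(\phi_{1\cap 2}, \psi_1, Y_1)}{p_1(\phi_{1\cap 2})}\, \frac{p_2(\phi_{1\cap 2}, \phi_{2\cap 3}, \psi_2, Y_2)}{p_2(\phi_{1\cap 2}, \phi_{2\cap 3})}\, \frac{p_3(\phi_{2\cap 3}, \psi_3, Y_3)}{p_3(\phi_{2\cap 3})},$$

where $p_{\text{pool}}$ is now a pooled prior over both shared quantities jointly.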
So how are we going to form this pooled prior in this case? We now need some function g that takes a prior for each of the common quantities from model one all the way up to model M, where all of the middle models' priors are bivariate — each of the phis is univariate, and the models at the ends involve only a single univariate quantity. This is a little bit different from the original pooling setup. Nevertheless, you can just apply logarithmic opinion pooling in this context, multiplying all of these priors together, each raised to some power that we've called lambda here.

That will give you a valid probability distribution. Linear opinion pooling is a bit less obvious: it's not terribly clear what you get if you add together a univariate density and a bivariate density. The nearest analogue we've come up with is that you take the marginals of each of these bivariate densities, take the linear pool of those univariate marginals, and then take the product of those pooled marginals to give you your full-dimensional pooled density. But obviously this induces prior independence, which may well not be what you want — so I'm not certain that it's a great option, but it is an option. You can also do something analogous to dictatorial pooling: essentially, in this setup you have two choices of prior for each of the shared quantities, and you have to figure out some way of choosing one of them for each quantity; there are several ways you can do that.

So here's an example of how this works out in a few cases. The setup is that on the x-axis we have one of the input densities — here just a normal — and on the other axis we have the marginal distribution for the second quantity, and we also have this joint distribution between the two. So this here is the prior from model one, this is the prior from model three, and this is the bivariate prior in the middle model, model two. I've got that the wrong way round: the blue is the input, and the red is what you get when you pool. As you vary the weights on the quantities — linear opinion pooling in the left-hand column, logarithmic pooling in the right-hand column — you get various different choices of the overall pooled prior.

OK, then finally, an example of this chained setup, which was inspired by some work I was doing on COVID. In COVID, in the worst case, you end up in intensive care, and one thing that's of interest in COVID patients in intensive care is when you reach what's called respiratory failure, which is defined by your P/F ratio being less than 300. These are three example patients, with time on the x-axis and the P/F ratio on the y-axis, and what we're interested in is the time when this quantity crosses the red dotted line. As you can see in these examples, there's a fair degree of uncertainty about when respiratory failure has been reached for each of these patients.
So what we'd like to do is understand what determines when you reach respiratory failure, while accounting for this uncertainty about the time at which the event happened. We essentially have a time-to-event problem with an uncertain event time.

One of the things that might influence how quickly you reach respiratory failure is this thing called cumulative fluid balance, and this changes through time — it depends on the treatments that you're being given. In particular, what might matter is the rate of change of the cumulative fluid balance — essentially the slope of these lines — which of course varies through time. There are also baseline factors that might influence how quickly you reach respiratory failure. So we want some way to combine the uncertainty about when the endpoint has been reached, the uncertainty about this cumulative fluid balance rate, and the fact that there are baseline risk factors that might influence this.

So what we're going to do is have three models. One model is a model for this P/F ratio data, which is going to be a B-spline model — represented here with the black line, with the uncertainty in grey. Then we're also going to have a model for the cumulative fluid balance, which is just going to be a piecewise linear model. And finally, we're going to integrate those two models with the time-to-event model for respiratory failure.

In more mathematical detail: we have a standard B-spline model for the P/F ratio data, and, as a function of that, we can calculate an estimated time of respiratory failure by solving for when we first cross the 300 mark. Then we've got a piecewise linear model for the cumulative fluid balance — just two pieces. And then we're going to take a quite simple time-to-event model — a Weibull time-to-event model. These models are related by the fact that this quantity here in the time-to-event model — what we get when we differentiate the cumulative fluid balance model — comes from that model. So together, those two pieces are what's called a joint model in the statistics literature: we're combining a longitudinal model here with a time-to-event model in model three. What we're adding to that is that the event time that enters this hazard is itself uncertain, having been estimated from this model, model one, here.
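Putting that into rough symbols — a hedged sketch only, with hypothetical notation (f for the B-spline fit, m for the piecewise-linear cumulative fluid balance, T for the respiratory-failure time); the exact parameterisation on the slides may differ:

$$\text{Model 1 (P/F ratio):}\quad y_1(t) \sim \mathrm{N}\big(f(t), \sigma^2\big), \qquad T = \inf\{t : f(t) < 300\},$$
$$\text{Model 2 (time-to-event):}\quad \text{Weibull hazard for } T, \quad h(t) = h_0(t)\exp\big(x^{\top}\beta + \alpha\, m'(t)\big),$$
$$\text{Model 3 (fluid balance):}\quad y_3(t) \sim \mathrm{N}\big(m(t), \tau^2\big), \qquad m \text{ piecewise linear with slope } m'(t).$$

Here $T$ is the quantity shared between models 1 and 2, and the slope $m'(t)$ is the quantity shared between models 2 and 3.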
So graphically, what we have is the B-spline model for the P/F ratio data here, in p1. The quantity phi_{1∩2} here is the time when you reach respiratory failure, and that's the time that goes into model two, which is the time-to-event model. The time-to-event model then also depends on the rate of cumulative fluid balance intake, which comes from this third model, the simple piecewise linear model. So we want to integrate all three of these models together.

It turns out in this case that the results you get don't really depend on which type of pooling you use: the pooled results are the red and blue curves, but you can't really see the blue curve because it's completely underneath the red one. But the result you get from melding is different from what you get if, rather than doing melding, you fix, in your middle model, the event time and the rate as point estimates — the event time from your first model, the P/F ratio model, and the rate from the third model, the cumulative fluid balance model. There's a shift in all of the key parameters in this case. So it shows that, at least in this example, accounting for the uncertainty by propagating it through the three models does make a bit of a difference.

So, to summarise. Markov melding provides a generic method for joining together different sub-models that either share a single common variable or are linked in a chain-like structure. The key idea is this idea of pooling together prior marginal distributions — but of course, it probably won't make sense if there is strong conflict between either the prior marginals or the data from the separate models. The multi-stage algorithm allows you to conduct inference for the full joint melded model in sub-model-specific stages, which might be easier or more convenient than fitting the full model directly; but it can be a bit unstable in some cases, and weighted KDE might help with that, or at least with one aspect of those challenges.

So, returning to the original problem that I set out, where we had these four different types of data and four different models: do we know how to integrate all of these together? Well, I think the honest answer is no. I think we're a long way away from being able to do this in general. What I've described today will work if you've got quite low-dimensional common parameters and relatively little conflict between the different models.
And so I think there's a lot of work that could still be done in this area to provide a truly generic method that would work for the scale and complexity, at least, of biomedical data. And finally, thank you to my collaborators — particularly Lorenz Wernisch, who I did a lot of the early work on this with, and more recently Andrew Manderson, who is a PhD student with me and has done a few bits of this — and also to Anne, David and Danny as well. So thank you very much.

[Question from the audience, partly inaudible.]

Yep, yep. So the question, for the online people, is: would it be desirable to have a principled way to choose amongst the different pooling methods, even though in the case that I showed it doesn't make much difference? I think it would be great if there was a way. We thought at first that the property of being externally Bayesian might be a justification for logarithmic pooling in this case, but unfortunately it doesn't really apply — or rather, the property doesn't hold in the setting that we're interested in. There are also other properties that people have looked at in the prior pooling literature, related to whether certain specific properties hold — I'm not going to be able to formulate them properly off the top of my head — but there are some more properties that you'd think would be desirable. I seem to remember it's one of those cases where there are three properties that all seem quite reasonable, and I think someone has proved that you can only have two of them. So I don't think there's going to be a perfect method. I don't know whether there's a generic way of choosing. I guess you'd have to specify some specific criterion you're trying to satisfy — maybe in a prediction setting there'd be something generic — and the kind of get-out-of-jail-free answer is that it should be subjectively chosen, like a prior in a Bayesian model, but that's not very useful.

[Second question from the audience, partly inaudible, about the weights in the pooling.]

Yes, that's basically the same as Jeff's question, which — I'll repeat it for the online people — is: how would you choose the weights when you're doing the pooling?
I think my answer is much the same as my answer to Jeff: I don't know that there's a generic way of doing that. I think if you specified a specific objective, maybe you could do something, but I don't know of any way of doing that — it would be great if we could find a way. Great. Thank you.