I was just curious. OK, I'm going to have to keep an eye on the chat for this. Yes, I hope it will — I hope this is — yeah, you can see it's on. Yeah, so I can send my slides. That's correct.

So, somewhere out there on the interweb, can you indicate whether you can hear me? A chat message? Sorry, this is a job for John. John has put out a message — yes, they can hear us.

Well, welcome, and thanks to Robert for standing next to me here. So welcome today to Robert Goudie. He's going to tell us about some of his work on Markov melding. This is of interest to me personally, because this kind of model specification is something we have to deal with in parametric inference, and he's one of the people who's dealing with it. Robert was actually a student here in Oxford a long time ago, and he's gone on to great things at the Biostatistics Unit in Cambridge. So thanks very much, Robert.

Do we need to worry about that? So hopefully someone somewhere will shout out a message and say if you can't hear me remotely. Thanks very much to Jeff for inviting me — it's nice to be here to present this work.

So this work is — it doesn't seem to be working? It was working a second ago. Maybe I need to move. No, it's working now. So: this work is motivated by problems where you have more than one source of data. You might have multiple studies, each of which of course has separate pieces of data associated with it, which we'll call Y1 up to Y4. And each of these studies, we think, is complementary in some way: they might be from different populations, or they might tell us about different aspects of — in my line of work — some kind of disease, human disease usually. So ideally we'd like to piece them together, to hopefully give us better inference.

To piece them together, we need to assume that there's some degree of commonality between these different bits of data. And for most of this talk, we're going to assume that the commonality is that we've got some parameter phi here which is shared by all of the models for each of these different studies. By combining them, hopefully that will give us more precise estimates, just from the fact that we've got a larger sample size. It will hopefully also give us a truer representation of the uncertainty that's inherent in the situation we're looking at.
So in some situations the variance might be larger, but that will hopefully be a more accurate reflection of what's going on. And finally, it will hopefully minimise the risk of selection-type biases: if we just use one of these datasets, and maybe that only considers a particular cohort of patients, then that might be biasing our results, in the usual selection-type way.

To make it a little bit more concrete: in my line of work, people often come along with multiple different datasets. There might be a clinical trial, where they've investigated the effect of some specific intervention on some disease. There might be a cohort study, which might have run for 20 or 30 years and will be of moderate size, whereas the clinical trial will tend to be relatively small and have very strong inclusion criteria, so it won't be terribly representative of the whole population. We may also have health record data from GPs and hospitals, which will be larger but potentially subject to different types of biases from a more carefully controlled clinical trial or cohort study. And finally, we might also have genomic or other kinds of high-throughput data. Ideally, we would like to combine these all together to provide a complete understanding of whichever disease we're interested in.

But in practice, doing that, at least as a Bayesian — and probably as a classical statistician as well — will be very hard, because all of these different types of data have their own unique complexities, and so identifying a suitable model that describes all four of them simultaneously is well-nigh impossible. Even if you could specify such an enormous model, fitting it, at least with the computational tools that are available at the minute, would be pretty difficult. And even if you could do that, with such a large amount of data and such a complicated model, working out whether the posterior distribution from that enormous model is actually a good fit to your data, and really captures what's going on, will be very hard — never mind, if you get results that are counterintuitive in some way, trying to figure out where the problem is arising from when you've got so many bits of data and parameters.

So what we're coming at with this work is the idea that you might not try to do this all at once. You might instead have smaller sub-models for each of these different types of data: a model one that describes the clinical trial data, a model two for the cohort study, and so on.
And in fact, this is often how things are: someone will already have analysed the clinical trial, and they will have developed the model; and similarly, for all of the other types of data, there will be existing models. No one is going to start trying to model all of these data together immediately from nothing. And so while we could redo all of that hard work in building models one up to four ourselves, it seems pretty wasteful to throw all of that existing work away and start from scratch again. What we'd like to do is find some way to take these existing models and integrate them into an overall analysis, ideally as generically as possible.

The existing ways that people do this are as follows. One way that is extremely widely used is that you take one of these models, you get your posterior distribution, and then you take a point estimate of the common quantity — which in this case is phi — from that posterior distribution, plug it into a second model, and then carry on and do that each time. This, I think, is everywhere. It's actually quite hard to find examples of it, because everyone hides it in the supplementary materials of their paper, but I think this happens absolutely everywhere. And of course, statistically, this probably isn't ideal, because you're taking an uncertain quantity and plugging in a point estimate, so at least in general this will underestimate uncertainty. Of course, in specific situations, if the uncertainty is small relative to the problem you're looking at, it may be quite reasonable.

A second approach is to plug an approximation to the posterior into the second model that you're fitting: rather than just taking a single fixed point estimate, you take some distribution that looks similar to the posterior distribution and propagate that. We'll come back to that in a minute. And finally, what we are trying to do is integrate these models into a single joint Bayesian model, which of course then brings along all of the advantages of Bayesian inference — but it can be quite tricky to do that in practice, and that's the problem we're trying to solve.

So here's a slightly more concrete toy example, based on the more complicated example I'll show later, that will hopefully make things a little bit clearer. This is where all of this work really came out of, which was in trying to model flu. One of the quantities of interest in that world is the probability of being hospitalised, given that you have influenza-like illness.
Let's imagine that we observed 100 people in hospital with influenza-like illness, out of a thousand people in the population who had influenza-like illness. One model for that would be a very simple beta–binomial model, and that's obviously very simple to fit. But of course, in practice we never really know how many people there are in the whole population who have influenza-like illness, because you can't go out and ask every single person in the population, and so there's some degree of uncertainty about that. We could represent that with a prior on little n, which might be, say, a Poisson distribution, for example.

But then imagine that we have new data from a similar geographic area, for example, where we've managed to go out and count how many people in the population have influenza-like illness — maybe about 40 out of 500 people. That gives us another beta–binomial model. And if we assume that the populations are similar, then we can take the posterior distribution of this q parameter, which gives us an estimate of the proportion of people in the population who have influenza-like illness. Then, if we know the total number of people in the original population, we can plug that q into another binomial model, and that gives us a second model for n, the number of people in the population with influenza-like illness.

But now we have not just this binomial model for n; we also have that original prior that we had for n. So we have to resolve the fact that we've got two models for the same quantity. Of course, one option would just be to throw away the Poisson prior that we had for n and take the new model as the truth — but this new data is coming from a merely similar area, and maybe we've done a lot of work to elicit, for example, that Poisson prior. So it would be nice if we could not just throw away one of these priors or models, but actually integrate them in some way.

Things get even more complicated if you have a stratification of your original population — say you've got two age groups, for example. So now we've added an i subscript to all of this in the model, and rather than having a direct prior for the total number of people in this population, we have a model for n1 and n2, where n1 plus n2 equals n. So now we have a binomial model for n that comes from the similar-area data.
And then we have a second model in which n is a deterministic function of two other parameters — obviously, here, a very trivial deterministic function. And it turns out that these two issues are the ones that make this problem difficult in general: first, that you have two separate models or priors for the same quantity; and second, that sometimes one of those models may be related to the original parameters through a deterministic function.

So graphically, that's what this example looks like. We started with a beta–binomial model, where the squares represent data and the circles represent parameters. We then acknowledged that we're uncertain about n, so that becomes a parameter. We then added a second model from a similar area, so we now have two models for this quantity. And finally, we split this n up into n1 and n2, which meant that the repeated quantity was the output of a non-invertible deterministic function in this model here.

OK, so to return to what the aims of this are, in more generality: we imagine that we've got several models, each of which involves some common quantity phi, and what we'd like to do is create a generic way of joining all of those models together into a single joint model. In some cases this will be very easy. It turns out that the cases where it's hard are where you implicitly have two different priors for the same quantity — which doesn't make sense in Bayesian inference — and, secondly, where you need to handle these non-invertible deterministic transformations. We'd also like to do this in what we call a staged or modular way, so that you don't have to think about the whole model at once, but can gradually build up inference towards the full joint model. And thirdly, we'd also like to understand the reverse operation, which is kind of the opposite: when you take an existing large model and want to split it up. That can be useful for understanding what's going on, or potentially it could be a useful inference method.

So here's the notation I'm going to use throughout. We imagine we have capital M separate models, or sub-models, each of which involves this common parameter phi, some sub-model-specific parameters psi_m, and some sub-model-specific data Y_m. What we'd like to do is create a generic method for integrating these M different models into a single joint model that involves all of the quantities together. That's what we're aiming to do.
It turns out that a special case of this had been considered in the 90s, in a slightly different context, by Phil Dawid and Steffen Lauritzen. They considered the special case where the prior marginal distributions for the common quantity were all equal — so each of these sub-models has the same prior for the common quantity. If you have that, then it's relatively obvious what a sensible model should be. We have these capital-M separate sub-models; we can then condition on phi, which gives us the sub-model-specific conditional here, and the prior marginal. And once you've done that, it's fairly obvious that a sensible model would be to take this common prior for phi and then the product of the sub-model-specific conditionals. This is called a Markov combination, by Phil and Steffen, and it's also been examined a bit further by Massa and co-authors more recently.

— There's a chat message, but I don't know how to — oh, OK, I see: I just need to talk more loudly. Is that better? OK.

So, the properties of the Markov combination model are: the sub-model-specific parameters and data are conditionally independent given the common quantity; the sub-model-specific conditionals given the common quantity are preserved between the Markov combination model and the original models; and indeed the sub-model marginals as well — the joint distribution of each of the sub-models is preserved in the Markov combination.

What we were looking at was a more general problem, where these marginals may be inconsistent. Hopefully that doesn't seem too contrived, given the examples I showed you at the beginning. So we need to resolve the fact that we've got these inconsistent marginals from each of the sub-models. What we're going to choose to do is pool them together: we have some function g that takes as input each of these prior marginals and spits out a single marginal, which we're going to call p_pool. And once we've done that, we're essentially back in the same situation, so it's quite reasonable that the joint model for everything should be this pooled prior marginal multiplied by the sub-model-specific conditionals.

The properties of this, which we call Markov melding, are quite similar to Markov combination: the conditional independence and the sub-model-specific conditionals are preserved again.
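In symbols — a light reconstruction from the definitions just given, writing $p_m(\phi, \psi_m, Y_m)$ for the sub-models — the Markov combination, valid when all sub-models share the same prior marginal $p_m(\phi) = p(\phi)$, is

$$p_{\text{comb}}(\phi, \psi_1, \ldots, \psi_M, Y_1, \ldots, Y_M) = p(\phi) \prod_{m=1}^{M} p_m(\psi_m, Y_m \mid \phi) = p(\phi) \prod_{m=1}^{M} \frac{p_m(\phi, \psi_m, Y_m)}{p_m(\phi)},$$

and Markov melding replaces that common marginal with a pooled one,

$$p_{\text{meld}}(\phi, \psi_1, \ldots, \psi_M, Y_1, \ldots, Y_M) = p_{\text{pool}}(\phi) \prod_{m=1}^{M} p_m(\psi_m, Y_m \mid \phi), \qquad p_{\text{pool}}(\phi) = g\big(p_1(\phi), \ldots, p_M(\phi)\big).$$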
On the other hand, the sub-model-specific marginals — the joint distributions of each of the sub-models — are no longer necessarily preserved, because we've changed the marginal distribution of the common quantity phi. So we call this Markov melding: the "Markov" bit comes from the Markov combination, and I'll explain in a minute the connexion to where the word "melding" comes from.

So how are we going to go about forming these pooled priors? Well, there are several options. It turns out that this problem is basically the same as a problem that's been talked about since the 1980s in prior elicitation, where people have asked multiple experts what prior they think should be chosen for a particular parameter, and then need to combine those separate priors together in some way. The options that have been suggested are: what's called linear opinion pooling, which is basically just a mixture of each of the sub-model-specific priors, weighted by some quantities w; logarithmic opinion pooling, which is almost the same but on the log scale; product of experts, which is the same as logarithmic pooling but with the weights all set to one; or what we somewhat jokingly call dictatorial pooling, which is where you set the pooled prior to be one of the original sub-model priors — so you throw away all of the priors apart from one.

Here's what these look like in a couple of settings. The inputs are these two black Gaussian densities, and the output pooled density is in colour. You can see that in this situation, where there's some degree of difference between the two input densities, the linear opinion pool is a mixture of the two inputs, whereas the logarithmic and product-of-experts pools end up being the most concentrated.

Several properties of pooling have been investigated in the same literature. One of them is the property of being "externally Bayesian", which is basically the idea that if you perform the pooling on each of the posteriors that you get from the sub-models, you should get the same thing as if you pool the priors together first and then calculate the posterior. It turns out that this property holds for logarithmic pooling with weights summing to one. Unfortunately, this property doesn't really make sense in the context we're interested in, because here we potentially have different likelihoods and different data in each of the sub-models, and so the externally Bayesian property doesn't even hold for logarithmic pooling in our context. So while it might be a property that would guide you, it's not very useful for us.
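As a rough numerical illustration of these pooling options — a minimal sketch with made-up function names and two Gaussian inputs evaluated on a grid, not code from the talk:

```python
import numpy as np
from scipy.integrate import trapezoid
from scipy.stats import norm

# Two sub-model prior marginals for the common quantity phi, tabulated on a grid.
phi = np.linspace(-6.0, 6.0, 2001)
p1 = norm.pdf(phi, loc=-1.0, scale=1.0)
p2 = norm.pdf(phi, loc=1.5, scale=1.0)

def normalise(f, x):
    """Rescale a non-negative function on the grid so it integrates to one."""
    return f / trapezoid(f, x)

def linear_pool(priors, weights, x):
    """Linear opinion pooling: a weighted mixture of the input priors."""
    return normalise(sum(w * p for w, p in zip(weights, priors)), x)

def log_pool(priors, weights, x):
    """Logarithmic opinion pooling: a weighted product, i.e. pooling on the log scale."""
    return normalise(np.exp(sum(w * np.log(p) for w, p in zip(weights, priors))), x)

def product_of_experts(priors, x):
    """Product of experts: logarithmic pooling with every weight set to one."""
    return log_pool(priors, [1.0] * len(priors), x)

def dictatorial_pool(priors, chosen):
    """Dictatorial pooling: keep one sub-model prior and discard the others."""
    return priors[chosen]

pooled = {
    "linear": linear_pool([p1, p2], [0.5, 0.5], phi),
    "logarithmic": log_pool([p1, p2], [0.5, 0.5], phi),
    "product of experts": product_of_experts([p1, p2], phi),
    "dictatorial (model 1)": dictatorial_pool([p1, p2], chosen=0),
}
```

With two well-separated Gaussians, the linear pool keeps mass near both inputs, while the logarithmic and product-of-experts pools concentrate between them, matching the behaviour described for the slide.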
So, I said I'd come back to where this word "melding" comes from. It comes from a paper by David Poole and Adrian Raftery in 2000, who considered inference for a deterministic function f — the f here was a deterministic differential equation model of the number of whales in the sea. The standard Bayesian model for this would be that they have these observations of the number of whales in the sea, which they treat as noisy observations of the output of this deterministic function, and they've got a prior on the inputs of that deterministic function. What was unusual about Poole and Raftery's setting was that rather than just having priors on the inputs of the deterministic function, they also had priors on its outputs — they wanted to constrain the outputs of this differential equation model in a sort of soft way, by imposing a prior on the output. So now they have not just the one prior for phi that's directly specified; if you transform the prior on the inputs through to phi, they again have two priors for the output of the deterministic function, and they need to resolve this in some way. Their solution is basically that, even if this f is not invertible, you extend it to an invertible function, back-transform the prior on phi onto the inputs, and then pool those two priors. And it turns out that the prior you get in the end doesn't depend on the way in which you extend that non-invertible function into an invertible one. So that's where the "melding" comes from.

OK, a few final miscellaneous notes on this. Of course, this procedure gives you a joint model for all of the data, but it could be ludicrous: especially if the sub-models strongly conflict with each other, there's no guarantee that it will make any sense. Another thing: I've said the idea is that we're trying to create something that is modular. Some of you might be familiar with the idea of cut distributions, which are another conception of a modular approach. Markov melding is quite different: Markov melding creates a full joint Bayesian model, and if you believe that that melded model is appropriate for your setting, then it's just standard Bayesian inference at that point — there's no changing of the posterior distribution, there isn't a "cut". Finally, there's an approximate approach that turns out to be an approximation to Markov melding.
I said at the beginning that sometimes people plug some kind of simple parametric approximation to a posterior distribution into a second model, rather than just plugging in a point estimate. And one of these turns out to be an approximation to Markov melding. If you first obtain the posterior distribution for your first model, then approximate the posterior marginal of the common quantity phi by a normal distribution, and then plug that normal distribution into the second model — saying that you've got an "observation" of phi whose value is the point estimate from the first model and whose variance–covariance matrix is the estimated variance–covariance matrix from the first model — then it turns out that this is equivalent to Markov melding with a specific form of pooling, the product-of-experts pooling. So that can be useful in practice if you want a quick approximation to this.

Turning now to inference. Writing down the posterior distribution is, of course, very straightforward: it's just proportional to the melded joint distribution here. In the special case of product-of-experts pooling, where you set the pooled prior to be the product of the marginals, these marginals cancel out and you get something even simpler. But in general, if you don't use product-of-experts pooling, you'll need some estimate of these prior marginals p_m(phi) from each of the sub-models. In some situations those will be analytically tractable and everything will be fine; but in the situations we've looked at, these phis are generally not root nodes in a graphical representation, and so you need to estimate these prior marginals. The way we've done that so far is just sampling from those prior distributions and then approximating them with a kernel density estimate. This obviously won't work very well with anything beyond a low-dimensional common quantity phi.
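To spell out the posterior just described — a reconstruction consistent with the melded joint model defined earlier:

$$p_{\text{meld}}(\phi, \psi_1, \ldots, \psi_M \mid Y_1, \ldots, Y_M) \;\propto\; p_{\text{pool}}(\phi) \prod_{m=1}^{M} \frac{p_m(\phi, \psi_m, Y_m)}{p_m(\phi)},$$

so each MCMC evaluation involves the prior marginals $p_m(\phi)$, which is where the kernel density estimates come in. Under product-of-experts pooling, $p_{\text{pool}}(\phi) \propto \prod_{m} p_m(\phi)$, the marginals cancel and the posterior is simply proportional to $\prod_{m} p_m(\phi, \psi_m, Y_m)$.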
Once you've done that, in principle, sampling from this posterior distribution can be done using standard methods. You can use a Metropolis-within-Gibbs approach, for example, where you sample the sub-model-specific parameters conditional on the common quantity in one step — and that step would be exactly identical to the sampler you would need for just the original sub-model. Then, in the other step, you sample the common quantity conditional on all the sub-model-specific quantities, and you just need to come up with a reasonable proposal distribution there, which may or may not be straightforward.

But what we've been more interested in is a more modular approach, where you can sample from one sub-model and then gradually build up towards the full joint model. This is what we call a multi-stage algorithm, and it has been explored by a few papers, some of which are cited there. I'm going to tell you how this works in the special case of two models. So we have two models, p1 and p2, and what melding tells us how to do is form a joint model for all of these quantities here. To do computation in this case, what we first of all do is draw samples from the first model — from the posterior distribution of the common quantity under the first model, which is p1(phi | Y1). Then we use those posterior samples as a proposal in the MCMC for the full joint model. And it turns out that this means the likelihood terms relating to the first model cancel in the MCMC, so your second-stage MCMC doesn't require any knowledge of the first model apart from those samples. In this sense, it's a sort of modular approach.

To make that a bit more precise: in stage one, we draw samples from the first model's posterior distribution and retain them. In the second stage, we consider the full model. To draw samples of the sub-model-specific parameters of the second sub-model, we just use whatever standard method would be used if you were fitting that model by itself. But for the common quantity phi, we propose a value by drawing one of those stage-one samples uniformly at random — essentially drawing a sample from the first-stage posterior distribution, assuming everything has converged. Then, if you write out the usual Metropolis–Hastings acceptance ratio, because we've set the proposal distribution q to be equal to the first-stage posterior — or something proportional to it — those terms cancel in the acceptance probability, leaving us with an acceptance probability that doesn't depend on p1 at all, apart from its prior marginal, which, as I said before, you need to estimate somehow if it's not directly tractable. And of course, this can, at least in principle, extend to any number of further stages if you've got more than two models.
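Here is a minimal sketch of that two-stage scheme for two sub-models, under simplifying assumptions: everything is univariate, the stage-one posterior samples are assumed to already exist, and the model-2 densities are toy stand-ins. All names are hypothetical; this illustrates the idea rather than reproducing the authors' implementation.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(1)

# Stage 1 (assumed done elsewhere): samples of phi from p1(phi | Y1),
# e.g. saved draws from a standard MCMC run of sub-model 1 on its own.
phi_stage1 = rng.normal(0.5, 0.2, size=5000)          # placeholder stage-1 draws

# The prior marginal p1(phi) is not tractable here, so estimate it by sampling
# the model-1 prior and smoothing with a kernel density estimate.
phi_prior1_draws = rng.normal(0.0, 1.0, size=5000)    # placeholder prior draws
kde_p1_prior = gaussian_kde(phi_prior1_draws)

def log_p1_marginal(phi):
    return float(np.log(kde_p1_prior(phi)[0]))

# Toy sub-model 2: phi ~ N(0,1), psi2 ~ N(0,1), Y2 | phi, psi2 ~ N(phi * psi2, 1), Y2 = 1 observed.
def log_p2_joint(phi, psi2):
    return -0.5 * (1.0 - phi * psi2) ** 2 - 0.5 * psi2 ** 2 - 0.5 * phi ** 2

def log_p2_marginal(phi):          # prior marginal of phi under sub-model 2
    return -0.5 * phi ** 2

def log_pooled_prior(phi):         # product-of-experts pooling of the two marginals
    return log_p1_marginal(phi) + log_p2_marginal(phi)

# Stage 2: Metropolis-within-Gibbs on the melded model for (phi, psi2).
phi, psi2, samples = float(phi_stage1[0]), 0.0, []
for _ in range(2000):
    # (a) Update psi2 exactly as a sampler for sub-model 2 alone would.
    psi2_prop = psi2 + rng.normal(0.0, 0.3)
    if np.log(rng.uniform()) < log_p2_joint(phi, psi2_prop) - log_p2_joint(phi, psi2):
        psi2 = psi2_prop
    # (b) Update phi, proposing a stage-1 draw uniformly at random. The proposal is
    # proportional to p1(phi | Y1), so the model-1 likelihood terms cancel: the acceptance
    # ratio only involves sub-model 2 and the pooled-prior / prior-marginal corrections.
    phi_prop = float(rng.choice(phi_stage1))
    log_num = (log_p2_joint(phi_prop, psi2) - log_p2_marginal(phi_prop)
               + log_pooled_prior(phi_prop) - log_p1_marginal(phi_prop))
    log_den = (log_p2_joint(phi, psi2) - log_p2_marginal(phi)
               + log_pooled_prior(phi) - log_p1_marginal(phi))
    if np.log(rng.uniform()) < log_num - log_den:
        phi = phi_prop
    samples.append((phi, psi2))
```

With product-of-experts pooling the pooled-prior and prior-marginal corrections cancel exactly, as noted earlier; they are written out here so the same sketch works for other pooling choices by swapping in a different log_pooled_prior.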
Let me now talk you through the more substantive example that the original toy model was based on. This was looking at estimating the probability of hospitalisation from a specific form of flu — the H1N1 strain that went around England in 2010 — and what we were looking to do was estimate the total number of ICU admissions that happened throughout that, well, not pandemic — whatever you call something that's less than a pandemic.

We had various sources of data. One was weekly numbers of suspected cases of H1N1 in several ICUs in the UK. The second was positivity data: not all of these suspected cases actually did have H1N1, but for a sub-sample we had confirmation from further testing of whether they were H1N1. And then we also had various other indirect data — the number of GP consultations, the number of hospitalisations that didn't get to ICU, the number of deaths, et cetera. To keep this not too complicated, I've simplified this secondary data down to an informative prior.

So the models work like so. The first model is the model for the intensive care unit data. These data y are the weekly numbers of suspected cases of H1N1 in each ICU, and they're related to this parameter, which has lots of components that represent a kind of birth–death process of people coming in and out of ICU. But of course, all of this bit of the model relates to suspected H1N1. If we want to estimate the true number of H1N1 cases, we need to relate that to the positivity data that's available for a subset of these patients, which is represented by this pi-positive quantity here. And if we combine the estimates of the suspected number of cases with the positivity data, we can estimate the number of confirmed cases, which is going to be the common quantity phi in this case.

The second model is highly simplified here, to make this vaguely understandable — it's just a binomial model in this case. The key is that phi enters in a binomial: some proportion of this larger quantity are actually cases of confirmed H1N1 in an ICU. And we've got a very informative prior on this — this pi here; sorry, on the k — which wraps up all sorts of other bits of the model. So we'd like to combine these two models into a single model.
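In very rough symbols — a hedged sketch of the shared-quantity structure only, with hypothetical notation (theta for the ICU admission–discharge parameters, h for the function combining them with the positivity proportion, k and pi for the severity model's denominator and proportion); the real sub-models contain considerably more structure than shown here:

$$\text{Model 1 (ICU):}\quad Y_1 \sim p_1(\,\cdot \mid \theta\,), \qquad \phi = h(\theta, \pi^{\text{pos}}) = \text{number of confirmed H1N1 ICU admissions},$$
$$\text{Model 2 (severity):}\quad \phi \mid k, \pi \sim \operatorname{Binomial}(k, \pi), \qquad k \sim \text{informative prior summarising the other data sources}.$$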
And we used melding to do that, which tells us how to form a joint model from these two models that aren't immediately compatible. We had to combine the priors, and the phi here was split into two separate age categories in this example, so the priors here are bivariate distributions: the x-axis is one age group and the y-axis is the other — I think this was children on the x-axis and adults on the y-axis. The panels on the left are the original priors that we had for phi under model one and model two. The ICU model had a very flat prior, whereas the severity model, as I said, was informative — it's quite peaked. On the right-hand side is what you get if you pool those two together: in this case, because the ICU model's prior is so flat, you get something that's quite similar to just the severity model's prior by itself.

These are the results for the phi parameters in the second model. The top row shows the posterior distribution from the first model alone; the second row is the posterior distribution from the second model alone — you can see some degree of difference between the two. The bottom few rows show what you get from the melded model, using various different pooling techniques, and also using the normal approximation method that I mentioned. What's quite reassuring here is that there's a fair degree of similarity across all of these different pooling methods, but the variance is reduced by combining the two datasets together. I'm going to skip this next example in the interests of time.

As I mentioned before, you can also do the reverse of this process: you can start with a joint model and then split it up into sub-models. You take these sub-models to be faithful to the original model, in the sense that if we then joined them back together with melding, we'd get the joint model again. This could be useful for dividing up computation in a really big model, or it could be useful for improving understanding: when you're not quite sure what's going on in a very big model, it might be useful to work out which parts of the model are providing the information. You can obviously only do this if you've got conditional independence between the two bits of the model that you want to split apart — so graphically, you can't have a kind of V-structure of colliders across the split. As long as you have that, you've got a fair degree of choice about the sub-model-specific priors that you'd like to use for each of the components that you split into: so long as, when you join those sub-model-specific priors back together with whatever pooling you're using, you get back the original prior, you can do whatever you like. For example, if you're using product-of-experts pooling, where you just multiply the components together, then you could use a fractionated prior, or indeed any other factorisation of the original prior into M factors.
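As a small formal statement of that condition, in the notation used earlier: when splitting a joint model whose prior marginal is $p_{\text{joint}}(\phi)$, the sub-model priors $p_1(\phi), \ldots, p_M(\phi)$ may be any choice satisfying

$$g\big(p_1(\phi), \ldots, p_M(\phi)\big) = p_{\text{joint}}(\phi).$$

Under product-of-experts pooling this reads $\prod_{m=1}^{M} p_m(\phi) \propto p_{\text{joint}}(\phi)$, which is satisfied by the fractionated choice $p_m(\phi) \propto p_{\text{joint}}(\phi)^{1/M}$, or indeed by any other factorisation of $p_{\text{joint}}(\phi)$ into $M$ factors.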
We used this in a model from ecology, where they had two different sources of data on a common quantity phi: mark–recapture data, where they capture animals, tag them and then catch them again — that's one Y — and also data from an index survey, where they recorded the number of birds counted, in this case. We can split the joint model for all of these quantities into two separate sub-models. And what was reassuring to us was that if we apply our two-stage algorithm, fitting this model first and then that model afterwards, we get results that are very similar to what we get if we fit the full joint model all together — that's what this graph shows. I had better keep an eye on the time.

So, the multi-stage algorithm that we proposed is in principle very nice, because it lets you split up the computation into separate parts. But, as probably many of you are guessing, it's not perfect, because you're using a quite simple proposal distribution. And there's another problem, which is that in our acceptance probability we need to estimate the ratio of these prior marginals p1(phi) at two separate points whenever we make an MCMC move for the common quantity. What I said at the beginning was that we were going to use kernel density estimation to estimate that ratio, and that definitely won't work if phi is high-dimensional. But even if it's not high-dimensional, if we just have samples from these p1(phi) quantities, then kernel density estimation won't give us very good estimates of that quantity in the tails. And because we've got a ratio here, even if we get the quantity in the denominator wrong by a relatively small amount, that can blow up very quickly, and can mean that we accept moves that we shouldn't really accept — so we can move out from the main mass of the distribution here to some point in the tails and end up getting stuck there.
You can do a little bit better if, rather than just drawing samples from these marginals directly, you draw from some weighted version of them — and that's what this paper here describes.

OK. So far I've talked about melding when there's a single quantity that's common to all of the models, but not all situations are like that. You might have models that are related in different ways, and one way that can happen is models that form a chain-like structure. So we have several models, and then a kind of Venn diagram where we have a common parameter phi_{1∩2} that's shared between model one and model two, and similarly phi_{2∩3} between model two and model three, and so on. The original formulation of melding doesn't tell you what to do in this situation.

The notation for this setup is that we have capital M separate models, again with sub-model-specific parameters psi_m and sub-model-specific data Y_m. But now, rather than having a single phi that's common to all of the models, we have phi_{m∩m+1}, which is common to models m and m+1. So we have M minus 1 common quantities shared across the M different models, and again what we'd like to do is find some generic way of forming a single joint model for all of these quantities.

What we propose is similar to the original melding idea: we take each of the sub-models, divide through by the prior marginal for the common quantity — or quantities — in that model, multiply all of these together, and then replace the priors for the common quantities by a single pooled prior for all of the common quantities together. Note that this won't, in general, be the same as performing the original two-model form of melding twice — except when the two shared quantities are a priori independent in the middle model. And of course, we can generalise this M-equals-three case in the obvious way to a general number of models M.
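For the three-model case just described, a reconstruction of the chained melded model in symbols, writing $\phi_{1 \cap 2}$ and $\phi_{2 \cap 3}$ for the two shared quantities:

$$p_{\text{meld}}(\phi_{1\cap 2}, \phi_{2\cap 3}, \psi_1, \psi_2, \psi_3, Y_1, Y_2, Y_3) = p_{\text{pool}}(\phi_{1\cap 2}, \phi_{2\cap 3})\, \frac{p_1(\phi_{1\cap 2}, \psi_1, Y_1)}{p_1(\phi_{1\cap 2})}\, \frac{p_2(\phi_{1\cap 2}, \phi_{2\cap 3}, \psi_2, Y_2)}{p_2(\phi_{1\cap 2}, \phi_{2\cap 3})}\, \frac{p_3(\phi_{2\cap 3}, \psi_3, Y_3)}{p_3(\phi_{2\cap 3})},$$

where $p_{\text{pool}}$ is now a pooled prior over both shared quantities jointly.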
So how are we going to form this pooled prior in this case? We now need some function g that takes a prior for each of the common quantities from model one all the way up to model M, where all of the middle models' priors are bivariate — each of the phis is univariate, and the models at the ends involve only a single univariate quantity. This is a little bit different from the original pooling setup. Nevertheless, you can just apply logarithmic opinion pooling in this context, multiplying all of these priors together, each raised to some power that we've called lambda here.

That will give you a valid probability distribution. Linear opinion pooling is a bit less obvious: it's not terribly clear what you get if you add together a univariate density and a bivariate density. The nearest analogue we've come up with is that you take the marginals of each of these bivariate densities, take the linear pool of those univariate marginals, and then take the product of those pooled marginals to give you your full-dimensional pooled density. But obviously this induces prior independence, which may well not be what you want — so I'm not certain that it's a great option, but it is an option. You can also do something analogous to dictatorial pooling: essentially, in this setup you have two choices of prior for each of the shared quantities, and you have to figure out some way of choosing one of them for each quantity; there are several ways you can do that.

So here's an example of how this works out in a few cases. The setup is that on the x-axis we have one of the input densities — here just a normal — and on the other axis we have the marginal distribution for the second quantity, and we also have this joint distribution between the two. So this here is the prior from model one, this is the prior from model three, and this is the bivariate prior in the middle model, model two. I've got that the wrong way round: the blue is the input, and the red is what you get when you pool. As you vary the weights on the quantities — linear opinion pooling in the left-hand column, logarithmic pooling in the right-hand column — you get various different choices of the overall pooled prior.

OK, then finally, an example of this chained setup, which was inspired by some work I was doing on COVID. In COVID, in the worst case, you end up in intensive care, and one thing that's of interest in COVID patients in intensive care is when you reach what's called respiratory failure, which is defined by your P/F ratio being less than 300. These are three example patients, with time on the x-axis and the P/F ratio on the y-axis, and what we're interested in is the time when this quantity crosses the red dotted line. As you can see in these examples, there's a fair degree of uncertainty about when respiratory failure has been reached for each of these patients.
So what we'd like to do is understand what determines when you reach respiratory failure, while accounting for this uncertainty about the time at which the event happened. We essentially have a time-to-event problem with an uncertain event time.

One of the things that might influence how quickly you reach respiratory failure is this thing called cumulative fluid balance, and this changes through time — it depends on the treatments that you're being given. In particular, what might matter is the rate of change of the cumulative fluid balance — essentially the slope of these lines — which of course varies through time. There are also baseline factors that might influence how quickly you reach respiratory failure. So we want some way to combine the uncertainty about when the endpoint has been reached, the uncertainty about this cumulative fluid balance rate, and the fact that there are baseline risk factors that might influence this.

So what we're going to do is have three models. One model is a model for this P/F ratio data, which is going to be a B-spline model — represented here with the black line, with the uncertainty in grey. Then we're also going to have a model for the cumulative fluid balance, which is just going to be a piecewise linear model. And finally, we're going to integrate those two models with the time-to-event model for respiratory failure.

In more mathematical detail: we have a standard B-spline model for the P/F ratio data, and, as a function of that, we can calculate an estimated time of respiratory failure by solving for when we first cross the 300 mark. Then we've got a piecewise linear model for the cumulative fluid balance — just two pieces. And then we're going to take a quite simple time-to-event model — a Weibull time-to-event model. These models are related by the fact that this quantity here in the time-to-event model — what we get when we differentiate the cumulative fluid balance model — comes from that model. So together, those two pieces are what's called a joint model in the statistics literature: we're combining a longitudinal model here with a time-to-event model in model three. What we're adding to that is that the event time that enters this hazard is itself uncertain, having been estimated from this model, model one, here.
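Putting that into rough symbols — a hedged sketch only, with hypothetical notation (f for the B-spline fit, m for the piecewise-linear cumulative fluid balance, T for the respiratory-failure time); the exact parameterisation on the slides may differ:

$$\text{Model 1 (P/F ratio):}\quad y_1(t) \sim \mathrm{N}\big(f(t), \sigma^2\big), \qquad T = \inf\{t : f(t) < 300\},$$
$$\text{Model 2 (time-to-event):}\quad \text{Weibull hazard for } T, \quad h(t) = h_0(t)\exp\big(x^{\top}\beta + \alpha\, m'(t)\big),$$
$$\text{Model 3 (fluid balance):}\quad y_3(t) \sim \mathrm{N}\big(m(t), \tau^2\big), \qquad m \text{ piecewise linear with slope } m'(t).$$

Here $T$ is the quantity shared between models 1 and 2, and the slope $m'(t)$ is the quantity shared between models 2 and 3.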
So graphically, what we have is the B-spline model for the P/F ratio data here, in p1. The quantity phi_{1∩2} here is the time when you reach respiratory failure, and that's the time that goes into model two, which is the time-to-event model. The time-to-event model then also depends on the rate of cumulative fluid balance intake, which comes from this third model, the simple piecewise linear model. So we want to integrate all three of these models together.

It turns out in this case that the results you get don't really depend on which type of pooling you use: the pooled results are the red and blue curves, but you can't really see the blue curve because it's completely underneath the red one. But the result you get from melding is different from what you get if, rather than doing melding, you fix, in your middle model, the event time and the rate as point estimates — the event time from your first model, the P/F ratio model, and the rate from the third model, the cumulative fluid balance model. There's a shift in all of the key parameters in this case. So it shows that, at least in this example, accounting for the uncertainty by propagating it through the three models does make a bit of a difference.

So, to summarise. Markov melding provides a generic method for joining together different sub-models that either share a single common variable or are linked in a chain-like structure. The key idea is this idea of pooling together prior marginal distributions — but of course, it probably won't make sense if there is strong conflict between either the prior marginals or the data from the separate models. The multi-stage algorithm allows you to conduct inference for the full joint melded model in sub-model-specific stages, which might be easier or more convenient than fitting the full model directly; but it can be a bit unstable in some cases, and weighted KDE might help with that, or at least with one aspect of those challenges.

So, returning to the original problem that I set out, where we had these four different types of data and four different models: do we know how to integrate all of these together? Well, I think the honest answer is no. I think we're a long way away from being able to do this in general. What I've described today will work if you've got quite low-dimensional common parameters and relatively little conflict between the different models.
And so I think there's a lot of work that could still be done in this area to provide a truly generic method that would work for the scale and complexity, at least, of biomedical data. And finally, thank you to my collaborators — particularly Lorenz Wernisch, who I did a lot of the early work on this with, and more recently Andrew Manderson, who is a PhD student with me and has done a few bits of this — and also to Anne, David and Danny as well. So thank you very much.

[Question from the audience, partly inaudible.]

Yep, yep. So the question, for the online people, is: would it be desirable to have a principled way to choose amongst the different pooling methods, even though in the case that I showed it doesn't make much difference? I think it would be great if there was a way. We thought at first that the property of being externally Bayesian might be a justification for logarithmic pooling in this case, but unfortunately it doesn't really apply — or rather, the property doesn't hold in the setting that we're interested in. There are also other properties that people have looked at in the prior pooling literature, related to whether certain specific properties hold — I'm not going to be able to formulate them properly off the top of my head — but there are some more properties that you'd think would be desirable. I seem to remember it's one of those cases where there are three properties that all seem quite reasonable, and I think someone has proved that you can only have two of them. So I don't think there's going to be a perfect method. I don't know whether there's a generic way of choosing. I guess you'd have to specify some specific criterion you're trying to satisfy — maybe in a prediction setting there'd be something generic — and the kind of get-out-of-jail-free answer is that it should be subjectively chosen, like a prior in a Bayesian model, but that's not very useful.

[Second question from the audience, partly inaudible, about the weights in the pooling.]

Yes, that's basically the same as Jeff's question, which — I'll repeat it for the online people — is: how would you choose the weights when you're doing the pooling?
I think my answer is much the same as my answer to Jeff: I don't know that there's a generic way of doing that. I think if you specified a specific objective, maybe you could do something, but I don't know of any way of doing that — it would be great if we could find a way. Great. Thank you.