It is already recording, and I think it's time to start with our seminar. I'll start by introducing Caroline. But before that, Caroline told me that if you have questions, please do feel free to ask, either in the chat or by raising your voice and asking directly, and Caroline will be happy to respond.

Now let me introduce Caroline. This is the last session of the Oxford Computational Statistics and Machine Learning seminar for this season, and we have Caroline Uhler from MIT, who is going to speak about causality in the light of drug repurposing for COVID-19. Caroline is currently the Henry and Grace Doherty Associate Professor in Electrical Engineering and Computer Science and the Institute for Data, Systems, and Society at MIT. She holds degrees in mathematics and biology from the University of Zurich and a PhD in statistics from UC Berkeley. Before joining MIT, she was a postdoc at the Simons Institute at UC Berkeley, the University of Minnesota, and ETH Zurich, and she was also an assistant professor at IST Austria. Caroline is an elected member of the International Statistical Institute, and she has won a Simons Investigator Award and a Sloan Research Fellowship, amongst many other honours. The reason we have Caroline here is that she does very exciting research at the intersection of machine learning, computer science and computational biology, in particular on causal inference, generative modelling and applications to genomics. So, Caroline, please go ahead.

Yeah, thank you so much, and thank you very much for inviting me. I'm very excited to be here. We have been working a lot on causality in recent years, and once COVID-19 started, many people in the lab were wondering how we could use the kind of research we had been doing for this problem. That's when we started working on drug repurposing, which is the kind of problem I'm going to present here. In particular, I'm going to talk about the mathematical, statistical and machine learning questions that have arisen from these biological and medical questions, and really go into the statistics and machine learning topics that have been important for coming up with the kind of methods we have applied here. So let me start right off with the problem of drug repurposing, in particular in the context of COVID-19, although this is of course more generally a problem that could be applied to many other diseases as well.
So how does drug development work, and in particular, how does drug repurposing work? Drug repurposing is particularly important for diseases where you don't have time to go through all of the different trials, because these trials often fail and take relatively long. Drug repurposing means that you take drugs that have already been FDA approved and try to find which ones might also be effective in a new disease context. This can speed up the trials because you don't have to go through the safety trials anymore; you can go directly to the efficacy trials and see whether this old drug also works in the new disease context.

So how do drugs work generally? Drugs usually target a particular protein and try to make this protein ineffective, and through that, what you want is to find drugs that can push the system back to the normal state. Now, if we think about this in terms of networks — and we will see networks coming up again towards the end of the talk — people have used these kinds of interaction graphs a lot in order to come up with the right targets for drugs. In particular, say you have here, in red and in blue, the nodes — maybe these are genes or proteins, whatever you're looking at — that in the disease context are differentially expressed compared to the normal context. Then one of the standard approaches is to define targets that are connected, or very central, to many nodes that are differentially expressed in the disease context.

Now, of course, if you come from a bit of a causality background, and also otherwise, it is clear that the true network in biology consists of directed relationships, where a certain protein or a certain transcription factor turns on another gene, et cetera. So if your true network looks something like this directed graph here, and you're targeting this node here in the middle — well, although it is connected to all these differentially expressed genes, these genes are not going to change, right? For these red and blue genes there is no hope of actually moving the system back to the normal state. So the directions matter a lot, and causal relationships will actually matter a lot; you cannot just work with the undirected graph.
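To make the point about directions concrete, here is a minimal sketch on a toy, made-up regulatory graph (node names and edges are purely illustrative, not from the talk): a node can be connected to many differentially expressed genes in the undirected view while only a few of them are actually downstream of it.

```python
import networkx as nx

# Hypothetical regulatory network (directed edges: regulator -> regulated gene).
G = nx.DiGraph([("A", "T"), ("T", "B"), ("T", "C"), ("D", "T")])
diff_expressed = {"A", "B", "C", "D"}   # genes changed in the disease state

for target in G.nodes:
    # Undirected view: how many differentially expressed genes is `target` connected to?
    neighbours = set(nx.Graph(G).neighbors(target)) & diff_expressed
    # Causal view: how many differentially expressed genes are actually downstream of it?
    downstream = nx.descendants(G, target) & diff_expressed
    print(f"{target}: connected to {len(neighbours)}, downstream {len(downstream)}")
```

In this toy graph the central node "T" is connected to all four differentially expressed genes, but only two of them are downstream, so intervening on it could never move the other two back to normal.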
So in particular, if the graph looks something like this, then maybe this is a good target, because you at least have a chance of changing all these nodes that are differentially expressed, and in the right way, by targeting this particular node. So directions really matter, and this is one place where causality comes in. I'll only talk briefly about this, although we have done a lot of research on actually learning these directed graphs, these causal graphs, from data on the nodes without knowing the directions; this will come up just a little bit at the very end of the talk.

I want to talk about something different: another problem that appears in these drug repurposing studies, and it's the following. If we think of SARS-CoV-2, SARS-CoV-2 in particular has an effect on lung epithelial cells. So what are the datasets? Identifying new drugs is, of course, still done very much experimentally, although there are more and more computational approaches — and in fact there are successful drugs that have gone through all of the trials and were identified using computational methods. So what are the datasets one can use if one is trying to use computational methods? There are these really large-scale drug screens, and I'm going to show one of them on one of the next slides, where you have many, many drugs. I'm going to concentrate on the FDA-approved ones because we're thinking about drug repurposing, but in fact in these drug screens you also have many non-FDA-approved drugs, you have all kinds of knockouts, et cetera — many different kinds of interventions on the system that have been applied to some cell types. Now, this particular screen that I'm going to look at is a cancer screen, so the cell types that these drugs were applied to are all different kinds of cancer cell lines.

The problem is, of course, that what we want to do is identify drugs that will work in this new context, which is not cancer. We want to be able to predict the effects of these drugs on SARS-CoV-2-infected lung epithelial cells — cells that are not in this drug screen. Every time a new disease comes up, you would otherwise have to redo the whole drug screen, and that is of course infeasible. So what you really need to be able to do is somehow transfer these causal effects.
So you know the effect of all these drugs, or these interventions, on some cell lines where they were measured. Can you, from here, predict what the effects of these drugs will be on a different cell line — a cell line where you know how it looks in the normal state, but where you have not seen any of these drugs applied? You would like to be able to transfer the effects. And this is actually again a causal question, because a drug, or a knockout, et cetera, is an intervention on the system. So I have an intervention on the system, and now I want to be able to predict what this intervention will do in a different setting, or in a different — we call this an environment. One cell type is one environment and another cell type is another environment. The same questions appear a lot in policy questions, where you know the effect of a particular policy in one city — that would be the environment — and you want to predict what the effect of this policy will be in another city. This is known as the causal transportability problem. It is again a causal problem simply because it involves an intervention.

So that's the main question I want to talk about: how can you transport the effects of a drug from one environment, say here a red cell type, to a blue cell type, to predict what this intervention will do? The other question I mentioned before is how you can even learn these underlying regulatory networks. We have worked a lot on this problem as well, and it will come up at the end, where we want to validate some of the causal relationships that we're predicting. But I'll mainly talk about this causal transportability problem: if you know the effect of different kinds of drugs on some cell types, how can you predict the effect of these drugs on the cell type that you actually care about — in this case SARS-CoV-2-infected lung epithelial cells, where you have not measured any of the drugs? And please, again, as Gonzalo just said, interrupt me whenever you have questions.

OK, so if you come from machine learning, there is probably one very straightforward approach that you may want to try. In fact — I know you're in the UK; I talk quite a bit with AstraZeneca — I know that they have tried this. So this is certainly something very standard, and it has not worked for them.
I'll show you why it might not have worked so well for them and how one could maybe get around this. So if you come from machine learning, this is probably the kind of approach you would say we should at least try. We all know this — often it's done using GANs; I'm going to use autoencoders throughout this talk, also because we actually have some theory for them. I'm sure you're all familiar with autoencoders: we have this encoder, a neural network that maps you from some space — this can be gene expression, it can be images — to a latent space, and then the decoder, which maps it back, and we learn this map just by trying to minimise the reconstruction error.

There are all these really nice applications, usually done with GANs, and I'm sure you've seen them, where you want to translate an image that you have taken in summer to winter, or you have pictures of people and you want to add smiles to them. So here you have some people whose pictures you've taken; some are smiling and some have a neutral face. And maybe this vector here corresponds to this person here, adding a smile to her. Now here you have a new person and you would like to know what this person looks like with a smile — you have never taken that picture. Well, maybe what will work is taking this vector that corresponds to adding a smile to a different person, moving it over to this new person, and then decoding this point back into image space; and in fact, in this case, out comes the person with a smile.

The question, of course, is whether something like this can also work when you have drugs. Here's how you should think of it: we have different kinds of cell types, like this blue one here, this pink one, another blue one, another purple one. For some of them, you get to observe what the effect of a particular drug is, and now you want to predict the effect of this drug on a new cell type. In this case, these would be the cancer cell lines where we've seen the drugs, and this would be the SARS-CoV-2-infected lung epithelial cells where we would like to be able to predict the effect of the drug. Well, maybe, in the latent space, I can just take the effect of this drug, move it over to my new cell type, and then decode it back out into gene-expression space, and out comes the effect that this drug will have on the SARS-CoV-2-infected cells. And this has actually worked.
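As a rough sketch of this latent-space arithmetic — assuming `encode` and `decode` stand for some trained encoder/decoder pair; the function names and data layout here are illustrative, not the actual model from the talk — the transfer amounts to a vector shift in latent space:

```python
import numpy as np

def transfer_effect(encode, decode, x_ctrl_A, x_drug_A, x_ctrl_B):
    """Sketch: predict the drugged state of cell type B from its control state,
    using the drug-effect vector observed in cell type A."""
    effect = encode(x_drug_A) - encode(x_ctrl_A)   # drug effect as a latent-space vector
    z_pred = encode(x_ctrl_B) + effect             # shift cell type B by that effect
    return decode(z_pred)                          # map back to gene-expression space
```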
In this particular paper here, it actually works quite well — but only for a few cell types and a few drugs. So the question is, of course, whether this is a general phenomenon, and as I've already told you, AstraZeneca has tried this and in general it does not work. In fact, you should also not expect it to work. There is work by Bareinboim and Pearl and co-authors — there's a long line of work, and I've already told you this is known as causal transportability — and they have really nice results giving necessary and sufficient conditions for when causal transportability works. These are quite stringent conditions. The problem is that their work doesn't really tell us when it will work in this setting here, because they always assume that you know the underlying causal graph. So they assume you have some causal graph underneath, and even more, that you know the nodes on which the intervention happens, and even more, that you know the nodes that change between environments — the ones that differentiate, say, L.A. from, I don't know, Boston, if you want to transfer a policy from one city to another. So that's additional knowledge that you have: you know where the interventions happen and where the nodes are that change the environment, that make one cell type into another cell type. Then, based on this graph, they have necessary and sufficient conditions that tell you whether the effect of this particular intervention can be transferred to this new city or cell type.

Now, here we don't have this information, right? We don't know the underlying regulatory networks, we certainly don't know, for the drugs in general, all the nodes that they have an effect on, and we also don't know the nodes that make the difference between different cell types. But what their results do make clear is that the conditions are quite restrictive, so you should certainly not expect this to always work — that's maybe the takeaway here. So the question is: when does it work, and why does it work sometimes?

So let's actually look at the large dataset on which we'll be analysing this. This is a large dataset that I encourage many of you to look at — I think it has been really, really fun working with this dataset, if you care about questions like this, about transfer and about interventions and trying to predict the effects of interventions.
This is the publicly available Connectivity Map (CMap) dataset. It has 1.2 million samples, where every sample is a one-thousand-dimensional vector of gene expression — so just for a selected set of around a thousand genes — and you have thousands of different perturbations. We will only look at the FDA-approved drugs, but there are many other perturbations that were applied as well. As I said, they were applied to different cancer cell lines, around 70 different cancer cell lines. What you see here is that every dot means you have data — each dot is one thousand-dimensional vector. Of course, you cannot do all of these experiments, so there is a whole lot of white space out here, meaning those combinations were not measured. But you already have a huge amount of data here on which you can test your method: you just leave out some measurements for validation and check how well you are able to predict the effect of a perturbation that was not measured. So this is the dataset we're going to use to check how well such an autoencoder approach would work.

[Audience question:] Looking at this data, how are the perturbations measured — what is the effect of a perturbation measured as?

Yes, so it's the expression of these one thousand genes that is measured. Every one of these points here is really a tensor — every one of these dots is a thousand-dimensional vector. It's quite nice that it's not just a one-dimensional readout; there are other drug screens where you just measure whether the cell dies or not, but here you really get to measure the expression of a thousand landmark genes.

And this is just a two-dimensional embedding of the dataset to show how it looks. Here in colour are the different cell types, the roughly 70 that you have. What you see — and this is known — is that perturbations have a very small effect compared to the differences between different cell types. The black points here correspond to the orange cell type here, but with the different drugs applied to it, so you see they stay very close. In fact, a very good baseline is just to predict the effect of a perturbation as the mean of that cell type; it is quite difficult to have methods that do better than that.
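For reference, that baseline is easy to state in code; the data layout (a dictionary keyed by cell line and drug) is only an assumption for illustration:

```python
import numpy as np

# Hypothetical layout: expr[(cell_line, drug)] -> 1000-dimensional expression vector.
def cell_line_mean_baseline(expr, cell_line):
    """Baseline mentioned in the talk: predict any perturbation in a cell line
    simply as the mean profile of that cell line over whatever was measured."""
    profiles = [v for (c, d), v in expr.items() if c == cell_line]
    return np.mean(profiles, axis=0)
```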
OK, so that's how you should think of what the dataset looks like. Now let's look a little bit at how these autoencoder methods work, and here we'll go a little bit into theory, so it might look a little strange at the beginning. Your standard autoencoder goes from — in this case we have a thousand-dimensional space, these thousand-dimensional vectors that we're measuring — and usually autoencoders are used to get a lower-dimensional representation of the data, and then you map it back to your original thousand-dimensional space. We'll see that this doesn't work so well, and many people in industry have seen that this doesn't work so well.

What we will be proposing is something that maybe looks very crazy at the moment, because autoencoders are usually used for dimensionality reduction. We're not going to use them for dimension reduction; we're actually going to go into a latent space that is higher-dimensional than the original. So you start off here with your thousand-dimensional vector, embed it into a higher-dimensional latent space, and go back to your original space. Why is this unintuitive? Because we have so many parameters that we could just learn the identity — here I could actually just learn the identity map. What I want to show you, and I'll give you the theory for why we even tried doing something crazy like this, is that when you train such an autoencoder, you're not going to learn the identity map, and I will tell you a bit about what kinds of maps are actually learnt. That's the interesting thing: what you learn actually seems to be something useful.

In terms of intuition, I think it maybe makes sense: in some sense, what you want to do is make effects more linear, so that they are more aligned in the latent space. If you go into a lower-dimensional space, your effects will definitely have to become more nonlinear, more crumpled up than they were before. If you're adding more dimensions, that hopefully allows you to actually make things more aligned than they were before. At least intuitively that makes sense — but of course, it depends on what map is actually learnt by training.
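A minimal sketch of such an overparameterised autoencoder, with illustrative dimensions and hyperparameters (not the ones used in the work presented), might look like this:

```python
import torch
import torch.nn as nn

# The latent space is *wider* than the input (here 1000 genes -> 4096-dim latent),
# and the network is trained purely on reconstruction.
d_in, d_latent = 1000, 4096
model = nn.Sequential(
    nn.Linear(d_in, d_latent), nn.ReLU(),      # encoder: expands the dimension
    nn.Linear(d_latent, d_in),                 # decoder: maps back to gene space
)
opt = torch.optim.Adam(model.parameters(), lr=1e-4)

def train_step(x):                             # x: (batch, 1000) expression profiles
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(x), x) # reconstruction error only
    loss.backward()
    opt.step()
    return loss.item()
```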
And just to show you that something like this is actually happening, we can check it on the dataset. You leave out some measurements and try to predict the effects. The whole approach can only work if, when you look at different cell types where a particular drug was measured, the effects of the drug are aligned across those cell types. So what we're going to measure is the effect vector of a drug in one cell type versus in the other cell type, in the latent space, and compute the correlation between the effect of the drug in one cell type and in the other.

And here is what you see. This is the correlation in the original space, and here is what you get if you go into a lower-dimensional latent space: basically, not much changes. With a standard autoencoder you do get a little bit of enrichment towards higher correlations, which is good — this is what you would want to see. Now what happens if you just do a PCA embedding? With PCA you actually get quite good enrichment, really good alignment — but you're getting good alignment by throwing away most of your data. You can always align data by throwing away information, but then of course you don't have any information anymore about the effects of the drug, and that's what you see here when you look at the reconstruction. So this is one way of getting really good alignment: just throw out most of the information in the data and keep only some of the directions.

Now what you see here is that if you go to this overparameterised setting — in this case a latent space that is higher-dimensional than what you started with — then, first of all, it's not learning the identity map, because otherwise nothing could change relative to the original space. And you actually get basically the same kind of alignment as with PCA, where you were throwing out all of the information, but now with perfect reconstruction — so you're not getting rid of any information.

So the question, and this is what I want to do now, is to give you some intuition for why we even tried these crazy autoencoders where the latent space is higher-dimensional than your original space.
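The alignment check described here can be sketched as follows, where `embed` is whatever map you are evaluating (the identity for the original space, an encoder or a PCA projection otherwise); the names and layout are illustrative:

```python
import numpy as np

def effect_alignment(x_ctrl_a, x_drug_a, x_ctrl_b, x_drug_b, embed=lambda v: v):
    """Correlation between a drug's effect vector in cell type A and in cell type B,
    optionally after mapping through an embedding. Sketch of the check in the talk."""
    ea = embed(x_drug_a) - embed(x_ctrl_a)   # effect of the drug in cell type A
    eb = embed(x_drug_b) - embed(x_ctrl_b)   # effect of the same drug in cell type B
    return np.corrcoef(ea, eb)[0, 1]
```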
Here I showed this for two cell types; this is for all the different cell types, and you see exactly the same kind of pattern. OK, so let's go a little bit into the theory of why we ever tried to look at overparameterised autoencoders — and I should say that for all of our applications we are now actually using overparameterised ones. Let me give you some intuition. This is just the standard setting: we have our n training examples, living in R^k, and now we're in the overparameterised setting — you can either have the latent space be of larger dimension than the input, or, as here, just have the number of samples be smaller than the input dimension; it doesn't matter which way you set it up. We're training our autoencoder using these n training examples, just trying to minimise the reconstruction error.

So let's look at this problem in the overparameterised setting and get a little bit of intuition for what is happening. Let's do the linear case first. In general the map of the autoencoder will be highly nonlinear, but let's do the linear setting first, and let me have just one training example — that's the most extreme form of overparameterisation. Then I'm just trying to solve

argmin over A of ‖x − A x‖²,

where A is a k×k matrix of parameters and x is just a single k-dimensional vector. So I'm definitely overparameterised, and I'm just trying to minimise this. With so many parameters, you notice first of all that there are many different solutions to this problem — I can definitely get this loss down to zero, and in fact in many different ways. One solution is, I guess, the obvious one: A equal to the identity is definitely a solution. But note that there are many other solutions; in particular, there is a solution of any rank. For example, there is a rank-one solution, which just projects everything onto x, but you have solutions of any rank for this problem.

So the question is what the autoencoder actually learns. In the linear setting this is very well known. In an autoencoder, what you usually do is start with a very small initialisation and run gradient descent.
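This linear, one-example problem is small enough to simulate directly; the following sketch (constants and dimensions arbitrary) runs gradient descent from a tiny initialisation so you can check which of the many zero-loss solutions is actually found:

```python
import numpy as np

rng = np.random.default_rng(0)
k = 10
x = rng.standard_normal(k)
x /= np.linalg.norm(x)                        # unit norm for readability
A = 1e-4 * rng.standard_normal((k, k))        # very small initialisation

for _ in range(2000):
    grad = -2 * np.outer(x - A @ x, x)        # gradient of ||x - A x||^2 w.r.t. A
    A -= 0.05 * grad

print(np.linalg.matrix_rank(A, tol=1e-2))         # 1: the rank-one solution is found
print(np.allclose(A, np.outer(x, x), atol=1e-2))  # ~projection onto x, not the identity
```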
And if you start near zero, it is known that what you learn is in fact the rank-one solution — just the projection onto x itself. You can actually generalise this, and that will be the theorem about what is being learnt, but to get some intuition first, let's just look at what happens when you apply a deep autoencoder that you have trained on just one training example. So here is one training example, this nice bunny. You train on the bunny, then you put in some random test examples and see what comes out when you feed them through your autoencoder. What you see is that always the bunny comes out. I would like you to notice that this is stronger than what we saw before in the linear setting, where the rank-one solution comes out: with a rank-one solution you wouldn't expect just the bunny to come out — you could also get, for example, the negative of the bunny. So this here is actually a point map: it seems the network has learnt a map that sends anything you put in to exactly the bunny, and not to some linear combination of the bunny.

Now, many people would say this is not surprising at all, but I do want to make sure you understand that it is really surprising. First of all, this network is really quite deep. If you use a very shallow network, in fact, you will not learn a point map — this is important to notice. Here, again, you train on one training example, you put in some random examples, and you see that what comes out is something that looks very similar to what you put in. There are papers that say you're learning the identity map; I'm going to show you that this is not the identity map — it will be really important to notice that — but that will come on the next slide. This is just to show that in these settings the result shouldn't surprise you. Also, as I said before, this is easy to prove in the linear setting: if you use a linear autoencoder and train it on, say, two examples, you can actually get out the negative of the dog or something like that, because if everything is linear, it's very clear that what comes out is always a linear combination of your training examples.
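A similar toy check for the linear case with two training examples — again a purely illustrative setup — shows that the output on a random test input lies (approximately) in the span of the training examples, which is why things like a "negative" training example can come out:

```python
import numpy as np

rng = np.random.default_rng(1)
k = 20
X = rng.standard_normal((2, k))               # two training examples (rows)
A = 1e-4 * rng.standard_normal((k, k))        # small initialisation

for _ in range(5000):
    R = X - X @ A.T                           # residuals for both training examples
    A += 0.02 * R.T @ X / len(X)              # gradient step on the mean squared error

x_test = rng.standard_normal(k)
y = A @ x_test
# y is (approximately) a linear combination of the training rows:
coef, res, *_ = np.linalg.lstsq(X.T, y, rcond=None)
print(res)                                    # close to zero: y lies in span of the training examples
```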
So in the linear setting you should also expect to see, for example, the negative of a training example — there you get linear combinations. Here, by contrast, you get something that looks more like the identity map. So you should be surprised by this kind of result, where, if you have a quite deep network and train it on one example, it seems that what it is learning is the point map.

OK, so let me give some intuition for the map that is actually being learnt, and this really comes from a very simple trick. The intuition is really simple in terms of what we're doing, in order to then show what we're actually going to prove. An autoencoder is just so nice in that it maps an image to an image. So if I want to understand a bit about the map that the autoencoder has learnt, let me just iterate the map. If you're in this setting here — I put in this horse and get out something that looks very similar to it — I'm telling you it's not the identity map, because what I'm going to do is put the output in again: whatever you get out, you put in again, and again, and in fact you'll see that what comes out is actually the rabbit, the training example. That's what we're going to prove.

And you see, this actually happens even if you train on not just one training example: in this case we trained on five hundred training examples, and these are all training examples. You take one of them, add a whole lot of noise on top of it, and put it into your autoencoder — and this is already after a couple of iterations. At the beginning you get something that looks very similar to whatever you put in, and as you iterate and iterate, you see that in fact your training example comes out. So what does this suggest? It suggests that your autoencoder learns a map where all of the training examples are fixed points — or at least some of them are fixed points. You can prove which ones are fixed points, and I'll show you on the next slide that we can actually train these autoencoders so that all of our training examples are attracting fixed points. The question, of course, is how large the region of attraction around every one of your training examples is — and that's a good question.
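Iterating the map is simple to express; here is a sketch, assuming `model` is any trained autoencoder:

```python
import torch

def iterate_map(model, x, n_iter=20):
    """Iterate a trained autoencoder: if the training examples are attractors, a test
    point inside a region of attraction converges to (roughly) a training example."""
    with torch.no_grad():
        for _ in range(n_iter):
            x = model(x)        # feed the output back in as the next input
    return x
```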
Here, for example, are some test examples. What we're showing is that even if you take away half of the image — just make half of the image noise — for many of the examples you're still inside the region of attraction around your training example: you actually output the correct training example and not a wrong one. So these regions seem to be quite large. To us this is an important research question that we don't have an answer to: what do these regions of attraction actually look like? Here we're just showing a very simple example in 2D, where we gridded the space and looked at where we converge to, and this is how these regions look; this will depend very much on the nonlinearity you're using. So this will be one of the open questions I'll mention at the end — it is super important to understand, because in order to use autoencoders in different kinds of applications you may want these regions of attraction to look different, and that will really depend on the nonlinearities.

As I said, what we cannot prove is that there are no other attractors in this whole landscape, other than by gridding it and seeing that we always converge to one of the training examples. What we can prove, once you have trained your network, is whether every training example is an attractor, because you can just look at the eigenvalues of the Jacobian at your training examples and check that none of them is larger than one in absolute value, so that you have an attractor at the training example. Here is a network which is actually quite small; we trained it, and we can verify that this network has all five hundred training examples — in this case images — as attractors. So you don't need very big networks to make all of your training examples attractors.

OK, so what can we actually prove about all of this? For now we only have it for one training example, and it is the following result. You have your autoencoder with a given width and depth, and here we say "under suitable conditions" — actually, all of the nonlinearities in standard use will satisfy these conditions, and the initialisation has to be close enough to zero. Given a nonlinearity and an initialisation, what this theorem gives you is a formula for the maximal eigenvalue of the Jacobian at the training example. So the training example will be a fixed point — an attracting fixed point — if this eigenvalue is smaller than one.
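In sketch form, the check described here — assuming `model` is a trained autoencoder and `x_train` a single (flattened) training example — looks like this, with the criterion being that the Jacobian's spectral radius at the training example is below one:

```python
import torch
from torch.autograd.functional import jacobian

def is_attractor(model, x_train):
    """A training example that is (exactly) reconstructed is an attracting fixed
    point if the autoencoder's Jacobian there has spectral radius below one."""
    J = jacobian(lambda x: model(x), x_train)               # (d, d) for a 1-D input
    spectral_radius = torch.linalg.eigvals(J).abs().max().item()
    return spectral_radius < 1.0
```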
From this formula, you can then figure out what kind of depth and what kind of width you need in order to make this training example an attractor. And what it says is that if you're overparameterised enough in terms of depth and width, then this training example will in fact become an attractor, for whatever nonlinearity and initialisation.

[Audience question:] Is this phenomenon of the training examples becoming attractors particular to overparameterised autoencoders?

Yes, this is in fact very particular to the overparameterised setting. And when you apply them, you will see that with underparameterised ones, in general, you will not converge to a training example; you can even converge to something that doesn't look like an image at all.

[Audience comment:] So it's like you have learnt something close to the identity map in many places, or something like that.

Yes — so let me actually show you how I'm thinking about it in terms of the maps that are learnt. First of all, here are the kinds of pictures. We all know that overparameterised neural networks can interpolate the training data. But many papers use interpolation and memorisation as the same thing, and I want to make a distinction between the two, because to me they're not the same. Interpolation just means that I'm learning some function where every training example is mapped to itself, so you have zero loss. There are many different maps that can interpolate the training data, so I can have this crazy map over here — this is an autoencoder that goes from R^1 (you should read it that way) to R^1 — and say these are my training examples: this kind of crazy map is an interpolating map, since every training example is mapped exactly to itself. Now this map here in the middle is also an interpolating solution, and this one here is also an interpolating solution.

Now, this map here is the map that you basically saw at the beginning, where we used the deep network on one training example — and even if you use it on two or three, what you'll see is that it maps like this. This is the point map, where all of these test examples here would be mapped to this training example, and all of these over here would be mapped to this one.
So this is the point map, where you immediately map everything exactly to a training example itself. And what we saw on the previous slide is that when you have very deep networks, this is what happens; when you don't have that deep a network, but it's still overparameterised — quite wide — then what happens is this map here, where if you iterate the map many times, the training examples come out. For now, though, the map that is actually learnt is just a map that is contractive at each one of the training examples, and once you iterate this map, you in fact get this map over here.

[Audience question:] Can I ask a clarifying question? Why is it important to have these attracting training points — why would I want to have such a map?

I think that's a great question. You're essentially asking whether this is even important for generalisation, and then I'll also have to talk about what generalisation even means for autoencoders. So, first of all, it works in our application — that's maybe one way of saying why you'd care about these overparameterised autoencoders. The other thing is that there has actually been work before where people wanted to make autoencoders contractive at the training examples, thinking that this is a good property, and they added regularisers to make the map contractive at the training examples. So people have already used these kinds of maps, with an additional regulariser, and seen that such autoencoders actually work very well in practice. What we're showing here is that you don't need any regularisers: overparameterisation by itself will give you this property of being contractive at the training examples.

[Audience follow-up:] OK, so being contractive at the training examples somehow makes the autoencoder work better in many different applications?

Yes. We've seen it in our drug example, and others have added regularisers precisely to get this property and seen that it works well. And now we have to talk about generalisation — I mean, what does it even mean to generalise well for an autoencoder?
This is another question that I think is an important one, where there should be a bit more foundational work. People currently use it in the sense that an autoencoder generalises well if it learns something close to the identity map. But I don't agree with that definition at all: if you want the identity map, you don't have to train — if that's what you want, just don't train. So that cannot be the right definition of generalisation. Here I'm just proposing something that could make sense — we have no work on it yet — but maybe a definition that does make sense is that you should be close to the identity map, but at the same time be contractive at your training examples. That might be an alternative, and at least then you have to train. And then what would happen is that these overparameterised, and in particular very wide, networks will actually do very well if that's your definition of generalisation. But I think there should be a lot more foundational work on what it actually means to generalise well for autoencoders — and it certainly shouldn't be that the goal is to learn the identity map.

[Audience question:] Just one last clarifying question: could you please recap the intuition for why having the training examples as attracting points makes it work well in your drug example?

Oh yeah — I haven't even talked about that yet. So for us, the intuition — and I think by now we roughly have a proof for what is actually happening — the intuition at that point was the following. This work on attractors came before our drug example; it was there, and then we built on it and tried to use it for drugs. The intuition was: what you're seeing is that the map becomes contractive at these zero-dimensional points, so you're making points that are similar to the training examples even more similar to them, because the map is contractive there. So we were hoping that maybe this doesn't only hold for points, but that it would also make lines — one-dimensional things — and even low-dimensional manifolds that are similar to each other become more similar to each other. So the hope was that we make things that are already similar to each other more similar to each other, and that's what seems to be happening, at least in the drug example. That was the intuition: let's use this overparameterised setting so that it makes things that are similar more aligned. And that's kind of what you're seeing, right?
When you see this — this is your training example, and this point was already similar to that training example, so applying the network makes things more similar to each other. But I agree that there is quite a step there, and this is definitely the kind of work we're really excited about doing now on the theoretical side: why does this actually work? That was the intuition.

And generally, I think the intuition for why you would want something like this is that you want the map to be contractive at the training examples because you believe your training examples are important — hopefully they are representative of whatever you expect to see otherwise. And if you have something close by, maybe you want to make it a little bit closer to whatever you have already seen before. So these kinds of maps are self-regularising. This was, at least at some point, the intuition, and this was the work by Yoshua Bengio's group on why they introduced these regularisers to make their autoencoders more contractive. Here, what this says is that you don't need any regularisers: you get contractive maps just from overparameterisation — just make your network wide enough and you're going to see this.

[Audience:] Cool, thank you.

Great questions. OK, so as I said, this is the only proof we have — for one training example. Everything from here on is the intuition for what we're doing now in the lab: in all of our applications we always use very wide networks, and not very deep networks. The intuition already came from here: you have seen that when you are very deep, what you get are these maps that map you directly to a training example. That's probably not what you want — you probably don't want that whatever test example you put in, a training example comes out. You probably want something that is close to the identity, but still contractive at your training examples. And it seems that very wide networks with just a little bit of depth actually do that. Here is the intuition for it — we don't have a proof, but it would be really nice to have one for these kinds of things. Here you see the eigenvalues — the top eigenvalue — and how it changes when you change depth and width.
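One way to turn this "close to the identity, but contractive at the training examples" picture into numbers — purely a diagnostic sketch under that proposed (not established) definition, not something from the talk itself — would be:

```python
import torch
from torch.autograd.functional import jacobian

def generalisation_diagnostics(model, x_train, x_test):
    """Two quantities behind the proposed notion of generalisation: stay close to
    the identity on held-out points, while being contractive at training examples."""
    with torch.no_grad():
        closeness = (model(x_test) - x_test).pow(2).mean().item()   # identity-like on test data
    radius = max(                                                    # contractiveness at train points
        torch.linalg.eigvals(jacobian(model, x)).abs().max().item()
        for x in x_train
    )
    return closeness, radius
```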
440 00:41:35,260 --> 00:41:39,140 This is maybe difficult to see, so that's why we have two different plots: in one you look at the top 441 00:41:39,140 --> 00:41:43,330 few percent of eigenvalues, and here you only look at the top one. 442 00:41:43,330 --> 00:41:47,170 But in all of our simulations — and this, again, is just a conjecture — 443 00:41:47,170 --> 00:41:53,720 it seems that what happens with width is that the variance of your eigenvalues shrinks. 444 00:41:53,720 --> 00:42:02,140 OK. And what happens with depth, it seems, is that the distribution just shifts towards smaller and smaller values. 445 00:42:02,140 --> 00:42:06,400 So what would you want, right? If you want to be close to the identity and still contractive, 446 00:42:06,400 --> 00:42:13,630 what you want is that all of your eigenvalues are just below one, but not too small, because otherwise you're in this setting, right? 447 00:42:13,630 --> 00:42:18,130 If they're all very close to zero, then you just get your training points back. 448 00:42:18,130 --> 00:42:24,520 So that's not what you want, right? So that means that what you want is to be very, very wide, so that you have a very small variance, 449 00:42:24,520 --> 00:42:31,870 and then just deep enough so that your eigenvalues are just below one, so that the map is contractive at the training examples. 450 00:42:31,870 --> 00:42:38,350 So that's at least the intuition of how we're using these overparameterised autoencoders right now in all of our applications. 451 00:42:38,350 --> 00:42:43,180 But there is a lot to prove here. And now comes the drug example, 452 00:42:43,180 --> 00:42:49,240 and I already gave you the intuition for why we have used it here — we have just seen that it works well. 453 00:42:49,240 --> 00:42:55,180 And of course, there is a lot to do in terms of proving that this is in fact the right intuition: 454 00:42:55,180 --> 00:42:58,630 why does it make similar things more similar to each other, 455 00:42:58,630 --> 00:43:03,910 even if it's not just a training point? But that was at least the intuition for us to try to use this. 456 00:43:03,910 --> 00:43:11,800 Now, how do we use it? Now that we have these overparameterised autoencoders, I actually want to also show you how we use them in this drug example. 457 00:43:11,800 --> 00:43:23,380 So this is a standard approach of how people do it in computational drug discovery: it is often this kind of signature-matching approach. 458 00:43:23,380 --> 00:43:29,140 So here what we have is, again, data that came out after SARS-CoV-2 started, right? 459 00:43:29,140 --> 00:43:31,930 You have your gene expression in the normal state — 460 00:43:31,930 --> 00:43:36,460 so these lung epithelial cells in the normal state — and then this is what happens when you add the virus. 461 00:43:36,460 --> 00:43:43,120 OK, so this is the disease state. And so this vector here — and you see this is with the virus, with ACE2, and whatever — 462 00:43:43,120 --> 00:43:46,270 you know, many different kinds of conditions, just to see that this is kind of robust. 463 00:43:46,270 --> 00:43:50,320 So this direction always seems to be very similar, and this is the disease direction. 464 00:43:50,320 --> 00:43:55,180 So this is the effect of adding the virus: it does this to your gene expression. 465 00:43:55,180 --> 00:44:02,330 So now this signature matching, what it does is it tries to find a drug whose effect goes in the opposite direction — upwards here.
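A minimal sketch of this kind of signature matching — purely illustrative; the profiles, the drug names and the use of plain cosine similarity here are assumptions, not the actual COVID-19 data or the pipeline from the talk:

import numpy as np

rng = np.random.default_rng(0)
n_genes = 50

# Pretend expression profiles: normal versus virus-infected cells.
normal = rng.normal(size=n_genes)
infected = normal + rng.normal(size=n_genes)
disease_signature = infected - normal            # the effect of adding the virus

# Pretend effect signatures for a library of already-approved drugs
# (drug-treated minus untreated, represented in the same space).
drug_signatures = {f"drug_{i:03d}": rng.normal(size=n_genes) for i in range(100)}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Rank drugs by how anti-correlated their effect is with the disease signature:
# the most negative cosine points most nearly "back towards normal".
ranked = sorted(drug_signatures.items(), key=lambda kv: cosine(kv[1], disease_signature))
for name, sig in ranked[:5]:
    print(name, f"cosine with disease signature = {cosine(sig, disease_signature):+.3f}")

In the actual application the profiles would first be embedded with the overparameterised autoencoder, and the matching would be done in that latent space rather than on raw expression vectors.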
466 00:44:02,330 --> 00:44:10,900 OK, so that hopefully you can just get rid of the effects of the disease by adding a drug that moves you back to the normal state. 467 00:44:10,900 --> 00:44:13,360 OK, so now what you want is to find such a drug, right? 468 00:44:13,360 --> 00:44:20,470 So now we'll actually map out all of the effects of the drugs in our latent space, which now is more aligned than what we had before. 469 00:44:20,470 --> 00:44:21,520 And then in there, 470 00:44:21,520 --> 00:44:29,860 we now look at which of the drugs is most anti-correlated with the effect of the disease, and just choose those top drugs that you get out there. 471 00:44:29,860 --> 00:44:37,450 And in fact, if you look at these top drugs — which is really nice — they all fall into two different classes of drugs. 472 00:44:37,450 --> 00:44:42,610 So, you know, you don't just get some completely random list of drugs; you actually get two different classes of drugs. 473 00:44:42,610 --> 00:44:45,250 They're all very nicely consistent. 474 00:44:45,250 --> 00:44:52,930 And the thing is that many — and this is where we also wanted to look further — these drugs have 475 00:44:52,930 --> 00:44:59,320 some known targets, and these are actually quite a lot of cancer drugs, which have a lot of targets. 476 00:44:59,320 --> 00:45:03,970 So what we wanted to do is not just take one of these cancer drugs that have a lot of targets, 477 00:45:03,970 --> 00:45:12,700 but also validate, or try to predict, what is actually the causal target amongst this whole list of targets that these cancer drugs have. 478 00:45:12,700 --> 00:45:17,200 So here you have a list of all these genes, right, that are targeted by these cancer drugs. 479 00:45:17,200 --> 00:45:23,410 And we just wanted to know, amongst all of these, which one would be the putative one that we think is the causal one. 480 00:45:23,410 --> 00:45:27,340 Because then maybe you could use a drug that is more specific to that particular 481 00:45:27,340 --> 00:45:31,420 target and doesn't have as many side effects as these drugs that were found here, 482 00:45:31,420 --> 00:45:40,990 which are all these cancer drugs. And so now this is where we used our previous causal structure 483 00:45:40,990 --> 00:45:48,550 learning algorithms, to figure out which one of all of these targets would be the one that we hypothesise is the causal one. 484 00:45:48,550 --> 00:45:54,820 So what do we do here? So again, what we're trying to do now is to actually learn these causal graphs. 485 00:45:54,820 --> 00:45:56,980 And so we just learnt a causal graph. 486 00:45:56,980 --> 00:46:03,550 So you take all of the genes that are differentially expressed in disease, together with all of these targets, 487 00:46:03,550 --> 00:46:07,720 and you learn the graph over all of them. And what you expect — 488 00:46:07,720 --> 00:46:15,700 what would be a good target — is in fact a node that is upstream of most of the disease genes, right? 489 00:46:15,700 --> 00:46:20,980 So you want to target something that is most upstream of all of the genes that are changing in disease. 490 00:46:20,980 --> 00:46:27,130 That's a good target.
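As a toy illustration of this "most upstream target" scoring — the graph, the gene names and the scoring function below are made up, and the talk's pipeline learns the causal graph from data rather than hard-coding it — one can count how many disease-associated genes sit downstream of each candidate target:

import networkx as nx

# A toy learned causal DAG over genes (edges point from cause to effect).
G = nx.DiGraph([
    ("T1", "g1"), ("T1", "g2"), ("g2", "g3"),
    ("T2", "g3"), ("g3", "g4"), ("T3", "g5"),
])
disease_genes = {"g1", "g3", "g4", "g5"}         # differentially expressed in disease
candidate_targets = {"T1", "T2", "T3"}           # known targets of the top-ranked drugs

def n_downstream_disease_genes(graph, target):
    # How many disease-associated genes are downstream of (causally affected by) this target?
    return len(nx.descendants(graph, target) & disease_genes)

scores = {t: n_downstream_disease_genes(G, t) for t in candidate_targets}
print(scores)                                    # in this toy graph: T1 covers 3, T2 covers 2, T3 covers 1
print("putative causal target:", max(scores, key=scores.get))

The candidate that is upstream of the most disease genes is then the hypothesised causal target.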
And so what we're looking for is which one amongst all of these genes that are 491 00:46:27,130 --> 00:46:32,970 targeted by these drugs is upstream of most of the disease-associated genes. 492 00:46:32,970 --> 00:46:39,010 OK, so we just learnt these graphs and thus have a score for each one of the targets from the previous slide. 493 00:46:39,010 --> 00:46:44,770 And what comes out to be the most upstream is this specific one, RIPK1. I knew nothing about it before starting this, 494 00:46:44,770 --> 00:46:51,740 but it's actually a super, super interesting protein. In fact, the finding is very stable, so you can do this in different kinds of contexts. 495 00:46:51,740 --> 00:46:55,300 So this here is a cell line; you can do this also in the CMap datasets. 496 00:46:55,300 --> 00:47:03,610 If you do it in the cells that are actually in our body, you get out exactly the same protein, RIPK1, as the most upstream one. 497 00:47:03,610 --> 00:47:07,930 And what is really interesting — and this, again, wasn't put into our model — 498 00:47:07,930 --> 00:47:15,250 is that RIPK1 directly binds to SARS-CoV-2 proteins, and in fact a drug that targets RIPK1 very specifically 499 00:47:15,250 --> 00:47:22,690 just entered phase two trials, which was quite nice as well. So it is a super interesting protein. 500 00:47:22,690 --> 00:47:25,300 It can also have very different effects. 501 00:47:25,300 --> 00:47:31,600 And this is what we have also looked at quite a bit, because in older individuals it has very different effects than in young individuals: 502 00:47:31,600 --> 00:47:37,310 it can actually turn on two different pathways. And we think — or hypothesise — that this might be what is going on in older 503 00:47:37,310 --> 00:47:42,950 individuals versus young individuals: in young individuals it might actually be turning on the immune response pathway, the 504 00:47:42,950 --> 00:47:51,020 survival pathways, but it can also turn on other pathways — the fibrosis and death pathways — which actually lead to fibrosis, 505 00:47:51,020 --> 00:47:55,670 blood clotting, etc., exactly the kinds of different outcomes that you see in patients. 506 00:47:55,670 --> 00:47:57,650 So that is quite an interesting protein that came out. 507 00:47:57,650 --> 00:48:04,620 Of course, a lot of things still need to be worked out, but at least there is a hypothesised mechanism as well of how this could work. 508 00:48:04,620 --> 00:48:11,520 OK, so I am kind of at the end — it's five twenty — so this is just an overview: we looked at one causal question, 509 00:48:11,520 --> 00:48:17,520 which is how to predict the effects of interventions, right — in this case, drugs. And moreover, 510 00:48:17,520 --> 00:48:20,790 there are all these transport problems in biology that I find really exciting. 511 00:48:20,790 --> 00:48:26,730 So you have this transport problem of: you have some drugs that you have observed in 512 00:48:26,730 --> 00:48:31,290 some cell types, and you want to predict the effects of these drugs in other cell types. 513 00:48:31,290 --> 00:48:33,000 I mean, the more extreme case would be: 514 00:48:33,000 --> 00:48:39,330 I have these drugs that I have observed in mouse, and I would really love to know what these drugs actually do in humans. 515 00:48:39,330 --> 00:48:44,280 Of course, we for now did this just for cell types, and I think that's maybe the more reasonable thing to do.
516 00:48:44,280 --> 00:48:50,790 But then you also have much harder problems, which is, you know: I have some drugs that I applied in one cell type, 517 00:48:50,790 --> 00:48:55,440 and now I would like to be able to transport to a new drug. 518 00:48:55,440 --> 00:48:57,930 OK, so I have these drugs applied in one cell type. 519 00:48:57,930 --> 00:49:03,870 Now I have a new drug, and I would like to be able to predict what happens when you apply this new drug that you have never seen before. 520 00:49:03,870 --> 00:49:08,670 Before, we had already seen the drug and we just wanted to apply it to a different cell type where we also have some data. 521 00:49:08,670 --> 00:49:12,420 But here you now have a new drug and you would like to transport to it. Or, you know, 522 00:49:12,420 --> 00:49:17,100 you can have all these other transport problems that are maybe less causal, where you want to be able 523 00:49:17,100 --> 00:49:21,690 to transport between very different data modalities in biology or between different time points. 524 00:49:21,690 --> 00:49:26,700 Right. So something that has always kind of bothered me when we're thinking about, 525 00:49:26,700 --> 00:49:30,900 say, early cancer detection is that if I'm training on data from a pathologist, 526 00:49:30,900 --> 00:49:38,700 I'll never be able to detect cancer earlier than the pathologist, because I'm always using the training data from the pathologist. 527 00:49:38,700 --> 00:49:43,230 So I need a way to somehow generate data, right, that corresponds to earlier time points. 528 00:49:43,230 --> 00:49:50,640 So how can I move between different time points, to generate my own training data of how these cells would have looked earlier? 529 00:49:50,640 --> 00:49:52,830 So I guess these are other transport problems, 530 00:49:52,830 --> 00:49:59,520 and you see that there are a whole lot of nice machine learning questions in all these transport problems in biology. 531 00:49:59,520 --> 00:50:07,980 So with that, let me end with some conclusions. I think drug discovery requires quite a lot 532 00:50:07,980 --> 00:50:11,490 more theoretical kinds of frameworks to look at things. In particular, 533 00:50:11,490 --> 00:50:16,080 I think causality is an important framework in drug discovery, because drugs are interventions 534 00:50:16,080 --> 00:50:21,870 in the system, and the only way to be able to predict the effects of an intervention is to actually take a causal approach. 535 00:50:21,870 --> 00:50:27,480 Autoencoders are not only extremely useful in biology — my lab has used them a lot — 536 00:50:27,480 --> 00:50:32,370 but I think they are also really useful for studying theoretical properties of neural networks. 537 00:50:32,370 --> 00:50:35,130 Here, you know, the fact that they're actually learning these contractive maps — 538 00:50:35,130 --> 00:50:40,710 we were only able to see it because autoencoders are so nice that you can just apply the maths, right? 539 00:50:40,710 --> 00:50:47,040 It is hard to think of how this would generalise, or how it would give you insights about the classification setting. 540 00:50:47,040 --> 00:50:50,100 I think that is a really nice question here. 541 00:50:50,100 --> 00:50:55,830 I showed all of these results on overparameterisation, that these networks show these remarkable self-regularisation properties: 542 00:50:55,830 --> 00:50:58,590 they learn these contractive maps at the training examples.
543 00:50:58,590 --> 00:51:03,030 Now, this is super important if you're thinking about just how memory could work, 544 00:51:03,030 --> 00:51:06,630 etc. So this is actually a new mechanism for associative memory. 545 00:51:06,630 --> 00:51:11,490 Of course, it has negative consequences as well. 546 00:51:11,490 --> 00:51:16,500 It probably means that you may not want to share a trained autoencoder between different hospitals, right? 547 00:51:16,500 --> 00:51:20,460 Because you can just put in noise, and out comes a training example. 548 00:51:20,460 --> 00:51:25,020 That certainly means that you have a lot of privacy issues there. And there are a lot of open problems. 549 00:51:25,020 --> 00:51:29,610 I mean, causal transportability is one, right: can you actually prove that you do get causal transport — 550 00:51:29,610 --> 00:51:31,710 that if you overparameterise, 551 00:51:31,710 --> 00:51:40,560 you can actually do this, or can sometimes, and hopefully often, do this causal transportability problem? Then, as I already said, classification: 552 00:51:40,560 --> 00:51:44,970 what does this tell us about classification networks? We would love to have a general definition 553 00:51:44,970 --> 00:51:50,310 of generalisation for autoencoders that makes sense and is not just learning the identity map. 554 00:51:50,310 --> 00:51:53,750 And I also already mentioned this: we have no real idea here. 555 00:51:53,750 --> 00:52:00,480 We do know from experiments that the attractor landscape is very dependent on the nonlinearity that you're using. 556 00:52:00,480 --> 00:52:03,600 It will be really important to figure out how this actually looks and how 557 00:52:03,600 --> 00:52:09,220 one could optimise the nonlinearity for different kinds of applications. 558 00:52:09,220 --> 00:52:16,420 So with this, let me end with acknowledgements. I wouldn't be able to do this without a really amazing group of people — 559 00:52:16,420 --> 00:52:22,300 students, postdocs and, of course, collaborators: G.V. Shivashankar on all of the biological work we have done, 560 00:52:22,300 --> 00:52:26,440 Misha Belkin on the overparameterised autoencoders work, and then, of course, the funding. 561 00:52:26,440 --> 00:52:33,940 And I'll end with just one slide of advertisement — I hope that's fine. We just started the Eric and Wendy Schmidt Center at the Broad. 562 00:52:33,940 --> 00:52:41,500 If you like these kinds of problems at the intersection of machine learning and biology, there is a really exciting centre going on there. 563 00:52:41,500 --> 00:52:44,320 We have a lot of workshops and conferences if you're just interested in 564 00:52:44,320 --> 00:52:49,240 learning more about this intersection, and we're always looking for postdocs, et cetera. 565 00:52:49,240 --> 00:52:56,320 So with that, thank you very much. And happy to take any questions. 566 00:52:56,320 --> 00:52:59,980 Thanks a lot. I don't know if there are questions — 567 00:52:59,980 --> 00:53:09,880 please feel free to ask. Otherwise, I do have a question regarding how you showed these autoencoders dividing the space of inputs, 568 00:53:09,880 --> 00:53:15,820 which, after reading your article, seems to be about how different inputs get mapped onto different training examples.
569 00:53:15,820 --> 00:53:26,950 I mean, you end up dividing the space into regions, so I wonder what the relation is between this method and the more classical 570 00:53:26,950 --> 00:53:35,440 way of understanding coding, in which you look at these Voronoi regions and you assign each point of the space — 571 00:53:35,440 --> 00:53:41,110 I mean, do you partition the space according to how close the points are to different centres? 572 00:53:41,110 --> 00:53:47,110 Yeah, exactly. So that's what seems to happen when you have sigmoid nonlinearities. Yeah. 573 00:53:47,110 --> 00:53:51,980 So, I mean, well, that would exactly be a kind of conjecture, right? 574 00:53:51,980 --> 00:53:58,300 So when, and under what kinds of nonlinearity or other architecture constraints — 575 00:53:58,300 --> 00:54:06,190 I don't know how much that will matter — but, you know, when will you get something that is just closeness, 576 00:54:06,190 --> 00:54:15,520 right? So these pictures might be leading us in the wrong kinds of directions, because these are just two-dimensional plots, right? 577 00:54:15,520 --> 00:54:19,610 What happens in a hugely overparameterised setting, you know, is not so clear. 578 00:54:19,610 --> 00:54:24,340 So maybe it doesn't depend so much on the nonlinearity anymore? Who knows? 579 00:54:24,340 --> 00:54:30,700 Right. But yes, that's exactly the kind of question: when does it depend on the nonlinearity, or is it always the case 580 00:54:30,700 --> 00:54:35,290 that it will just map to the closest one — and closest in what norm? 581 00:54:35,290 --> 00:54:41,710 Mm hmm. Yeah. And do you think that these different kinds of behaviour that you may have 582 00:54:41,710 --> 00:54:47,050 might underlie different notions of generalisation? 583 00:54:47,050 --> 00:54:52,270 I would think that probably for different kinds of applications, you may want to choose it in different ways. 584 00:54:52,270 --> 00:54:58,330 Also — I mean, what I think would be really nice is to have an example like: 585 00:54:58,330 --> 00:55:05,830 can you come up with an example, to use as a training example, that will just have a huge 586 00:55:05,830 --> 00:55:11,230 basin of attraction, where somehow all of the nonsense will be mapped to it? 587 00:55:11,230 --> 00:55:12,220 That would be great, right? 588 00:55:12,220 --> 00:55:18,190 Can you just somehow come up with an example where all the nonsense goes? For that, you really need to understand it quite well. 589 00:55:18,190 --> 00:55:25,960 Mm hmm. I see. And regarding the application to, well, 590 00:55:25,960 --> 00:55:36,790 the pharmaceutical industry — what do you think is the current state of technological development there? Is the large pharma industry, 591 00:55:36,790 --> 00:55:45,010 which says it is interested, actually applying these more, let's say, futuristic ways of understanding drugs? 592 00:55:45,010 --> 00:55:51,400 Oh, they are super interested — we even have joint postdocs. So yes, I think it's actually a really exciting time now, 593 00:55:51,400 --> 00:55:56,440 where pharma is really thinking about how to get into machine learning and how to use some of these, you know, 594 00:55:56,440 --> 00:56:02,410 methods where you have some theory, etc. And I think they're getting really excited about these things as well.
595 00:56:02,410 --> 00:56:07,780 There are more and more pharma companies that are investing heavily in ML and actually have really good people. 596 00:56:07,780 --> 00:56:12,910 So I think it is exciting to work together with them, if this is an area that you're excited about. 597 00:56:12,910 --> 00:56:14,200 Yeah, it's really nice to see. 598 00:56:14,200 --> 00:56:22,470 I think they realise that something has to change: a lot of money is being put into things that in the end have not been super successful. 599 00:56:22,470 --> 00:56:26,770 So maybe this is a new approach worth investing in as well. 600 00:56:26,770 --> 00:56:32,600 So, yeah, no, I think it's a really exciting time for people in machine learning. 601 00:56:32,600 --> 00:56:38,920 Yeah. So could I ask a question, please? 602 00:56:38,920 --> 00:56:44,410 Hi, my name is Karen Amon, I'm an oncologist, and on the COVID side with the drugs 603 00:56:44,410 --> 00:56:49,840 I just wanted to make a comment. And yeah, so you'll know all of them. 604 00:56:49,840 --> 00:57:01,550 So in the list of drugs, there is a clear bias towards anti-angiogenesis drugs, like pazopanib and axitinib, 605 00:57:01,550 --> 00:57:04,390 and that may warrant further investigation. 606 00:57:04,390 --> 00:57:11,680 And I was looking, just for my own curiosity, into the COVID literature, and I wouldn't be surprised. 607 00:57:11,680 --> 00:57:16,150 And there are also monoclonal antibodies against angiogenesis, 608 00:57:16,150 --> 00:57:27,280 which would also have an immunomodulatory role. And in the picture there are also chloroquine and oestradiol, although that's sort of a separate point. 609 00:57:27,280 --> 00:57:34,300 So this one is actually from the original space — so this one was not found in the latent space, this one was in the original space. 610 00:57:34,300 --> 00:57:40,840 That's why I was so surprised. I mean, I actually thought maybe the people who came up with chloroquine had just done this simple analysis. 611 00:57:40,840 --> 00:57:44,260 But then people told me, no, that was already based on something more advanced than this, 612 00:57:44,260 --> 00:57:49,120 because here, amongst all of the drugs in the list from CMap, chloroquine actually 613 00:57:49,120 --> 00:57:55,330 came out to be the most anti-correlated, in the original space, with the disease vector, 614 00:57:55,330 --> 00:58:00,430 which I find amazing, because in the latent space that is not the case. 615 00:58:00,430 --> 00:58:05,360 But as I said, this is probably not what people had done when they came up with this proposal. 616 00:58:05,360 --> 00:58:14,770 Did anyone look at oestradiol? Because COVID is supposed to affect men more than women in terms of severe symptoms. 617 00:58:14,770 --> 00:58:19,090 Ah yeah, so this one — we could check where this one happens to be. 618 00:58:19,090 --> 00:58:22,420 But I mean, it's not one of the top ones here. 619 00:58:22,420 --> 00:58:23,350 So this would be interesting. 620 00:58:23,350 --> 00:58:29,760 We haven't checked — I would have to check where this one actually comes out to be in the latent space and how close it is. 621 00:58:29,760 --> 00:58:42,860 OK, yeah. Then thanks for the question. Yeah, so since there are no more questions — 622 00:58:42,860 --> 00:58:47,030 thanks a lot, Caroline, this has been very interesting. 623 00:58:47,030 --> 00:58:50,480 Yeah, thank you. This was fun. Great. 624 00:58:50,480 --> 00:58:55,100 So to everyone, have a great summer, and enjoy the rest of it too.
625 00:58:55,100 --> 00:58:58,200 You too bye. Yeah, thank you. Bye.