[Auto-generated transcript. Edits may have been applied for clarity.]

Okay. I'd like to welcome everybody to this Strachey Lecture, which is our distinguished lecture series. Before I introduce the speaker: one of our sponsors, who has been supporting this series since 2015, is here, and if any students or researchers want to meet them afterwards, you can do that during coffee. One last administrative thing up at the top: we're going to have a Q&A here, and we'll have some microphones going around, so if you want to ask a question, we'll do it that way.

Okay. So I am super delighted to introduce Shafi Goldwasser, who is about the best speaker I could ever imagine, and I think there's a consensus in a room like this, with everyone else concurring with that opinion. She is a professor at Berkeley and the former director of the Simons Institute for the Theory of Computing; actually, she's just corrected me on that: she retains her role as research director there. In addition to that, she is a professor at MIT and also at the Weizmann Institute, so she spans very many locations. When I was deciding what to tell you (her accomplishments are so significant), what I decided to do is to read you just a tiny bit from the citations of a few of her major awards.

I'm going to start with the Gödel Prize in 1993. Shafi was awarded that prize for her paper "The Knowledge Complexity of Interactive Proof Systems". I've shortened these a little bit tonight in the interest of time, but the citation points out that she and her colleagues introduced the concept of interactive proof systems, providing a rich new framework for addressing the question of what constitutes a mathematical proof, and it goes on to say that this led to exciting developments, including the resolution of several major problems about the difficulty of finding approximate solutions to combinatorial optimisation problems.

A Gödel Prize would be pretty much enough for anybody, but Shafi has two of them. She won another Gödel Prize in 2001 for her paper "Interactive Proofs and the Hardness of Approximating Cliques", with Feige, Lovász, Safra and Szegedy. Here the citation says that they developed a completely new understanding of what an approximation algorithm can achieve, with their deep insights and powerful results establishing hardness of approximation for many problems, connecting, for example, computation and cryptography.
The last one I'll mention is the Turing Award, which is pretty much the biggest prize you could ever get in computer science. She won it in 2012, along with Silvio Micali, and the citation is: for transformative work that laid the complexity-theoretic foundations for the science of cryptography, and in the process pioneered new methods for the efficient verification of mathematical proofs.

Shafi is a member of the US National Academies of Sciences and Engineering, the American Academy of Arts and Sciences, the Israel Academy of Sciences and Humanities, and a number of others. She also has a long list of honorary degrees, including one from here at Oxford. We are completely delighted to have her here today, speaking to us about privacy, verification and robustness: a cryptographer's perspective on ML. So, thank you very much: Shafi Goldwasser.

Hi everyone. I'm very honoured to have been invited to give this lecture, and also to have received an honorary degree here in Oxford; it's so beautiful. It's a little hotter than I thought. Okay. So the title is a little different from the one I gave originally, but it's mostly the same when you look at the content of the talk: I'm going to call it "Constructing and Deconstructing Trust in AI: A Cryptographic Perspective".

I've given some version of this talk, without most of the results that are here, maybe five or six years ago, and at that point I don't know that I would have used this slide. But at this point it's very clear that the machine learning revolution really is the scientific breakthrough of our times. It started with deep learning, and now, with the large language models, it has reached an entirely new level. And when Leslie was talking about the greatest honour in computer science, I was thinking about these Nobel Prizes, now going to machine learning people.

Obviously this raises very deep questions about human versus machine intelligence, and about whether large language models (there's a project I'm working on along these lines) will even give us the ability to translate non-human communication. People are talking about AGI and superhuman intelligence; there are betting markets on when that's going to happen. You already have o1 doing very well on competition math, competition code and PhD-level science questions, and there's a whole benchmark of new questions out there for which we don't know the answers. We all agree, I think, with this slide, and you don't need me for that.
This is all leading to the point that, whether or not we believe this is the biggest scientific revolution, that it's going to solve the Riemann hypothesis, whatever: we do know already that there is huge promise for applications, whether it's health, drug discovery, infrastructure, natural language processing, policing or financial markets. And all of this indicates that there is a major shift of power, which we're all aware of, toward those who have data and those who have algorithmic expertise. One can argue and debate whether the data is more important or the algorithms are more important, but what is very clear is that that's where the power lies. And all that's beautiful.

The problem that this talk is about is: should we really trust models, to the extent that we very rapidly are trusting them, that we don't understand and don't control? At least for myself, maybe because I'm of a certain age, that makes me very nervous. But I think it should make everybody nervous when we're talking about critical applications.

So, since we're talking about trust, and since I'm a cryptographer, I want to tell you for a second something about cryptography. Often people think about cryptography as basically just an arsenal of tools: there's encryption, there are digital signatures, people talk about obfuscation, zero-knowledge proofs, homomorphic encryption and so forth. There are beautiful constructions out there; some of them are used in practice, some of them will be used in practice. But if you zoom out from all these tools, essentially all of them aim at the following: how to use technology that you don't trust as if you could trust it. Even though you are assuming that there are adversaries (somebody is trying to listen to your messages, somebody is trying to pretend to be you to get access to your bank account, somebody might want to reverse-engineer your program, as in the case of obfuscation), you want to be able to say: I don't want to think about that while I'm using this. So what cryptography does is come up with abstractions and tools that let you pretend, in effect, that these adversaries don't exist. It's a zoomed-out view; it's a way to establish trust in technology. And not only that: this is true for each one of the tools that I've mentioned, but there is at this point almost a cookbook recipe for when you want to solve a crypto problem. You define the task that you're talking about,
whether it's encryption or digital signatures and so forth. Then a very, very important step is that you model what's called the adversary. The adversary has certain powers: it might be a passive adversary that just listens, or an active one that changes messages going over the line, and so forth, but you're supposed to model it mathematically, precisely. Then you define what a secure solution is. It might not be 100% of what you'd like, but at least it's a mathematical definition of what security of a solution would be. And then you construct something and, most importantly, you give a security proof. You want to say: under very well-specified assumptions (all of a sudden assumptions come in; I'll say something about that in a minute), if they hold, then the primitive that you constructed is secure according to the definition that you've laid out. The history of cryptography, at least since I've been involved in it, is that these reductions between well-specified assumptions and proofs of security are what the game is about. Without proofs we have nothing, really, because if somebody is breaking the scheme and not telling us, then what's the point of the whole thing? So the proofs are really the main thing at our disposal.

Now, what kind of assumptions? I say the word "assumptions" because they are used all over cryptography, and of course a lot of the work is to try to minimise them. They fall into a few categories. One, which is the one we mostly hope for, is that we believe certain computational problems are hard. I'll talk about one of them during the talk; usually people know about factoring integers: here's a problem we don't know how to solve efficiently, so let's assume there is no such algorithm, and now build a primitive and prove that it's secure under that assumption. There are other assumptions you could make: there are several entities in the system, each of which might be adversarial, but they don't all collude against you. Another type would be a physical assumption: if you're talking about quantum physics, that there are quantum computers, say, or assumptions about distance, that you're blocking two entities so that they can't actually transfer information. And these days, people talk about trusted hardware.
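Putting the recipe together, the shape of a typical security statement looks like this (a schematic rendering for orientation, not a particular theorem from the talk):

```latex
% Schematic form of a security theorem and the reduction behind it.
% Assumption A: no probabilistic polynomial-time (PPT) algorithm solves problem P
% (e.g. factoring, or the learning-with-errors problem discussed later).
\[
  A \;\Longrightarrow\; \forall\ \text{PPT adversaries } \mathcal{B}:\;
  \Pr\big[\mathcal{B}\ \text{breaks scheme}\ \Pi\ \text{in the sense of Definition } D\big]
  \;\le\; \mathrm{negl}(\lambda).
\]
% Contrapositive (the reduction): any adversary that breaks \Pi with noticeable
% probability can be turned into an efficient algorithm for P.
```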
The point of this slide is to say that we would like to minimise these: we don't want to use all of these assumptions; ideally we'd use none, and sometimes we use one or two. But the important part is really to emphasise the security proof. So what does this have to do with machine learning? Oh, by the way, I just want to say that sometimes you show that it's impossible to achieve security: you define security, and then you show that, maybe even under some assumption, any scheme would break.

Okay, so: machine learning. The proposal, and in some sense the idea of this whole programme of research, is to use the same kind of recipe for machine learning. Meaning: if there's a task, we define it, first of all, properly. And that's already something. I assume there are a lot of machine learning researchers here, and certainly there are people who know more about machine learning than I do, but I think we all have to agree that if you read any of these very fascinating papers, with incredible ideas and breakthroughs, they very rarely define what they're doing. If you talk about alignment, say: I want to work on alignment, and I will work on alignment, but I have to define it myself, and my definition might not be consistent with other people's understanding, because there are no definitions. There are words, and words are, I guess, also a kind of definition, but in any case. Then a very important part is also: who is the machine learning adversary? In cryptography it's maybe more obvious; it's not necessarily the case here, but we'd like to define it. And then, since we're talking about trust, we want to define what a trustworthy solution would be, that is, give a definition. And finally you might try to come up with a solution, and we want to focus on having some kind of proof that, maybe under some assumptions, the solution you have constructed is trustworthy with respect to your definition of trustworthy.

Maybe this is the most important slide of all: it would be good to be able to have a definition. Maybe it won't be the right definition; maybe later on there will be a more general one, or another one that you can show is equivalent, but there should be some formal thing that you're trying to satisfy. And again: which assumptions? They could be the cryptographic assumptions; they could be others. So this is the mindset for this talk. Okay.
And again, as I said before, you define what you want to achieve (you define what alignment is, say) and you define what your adversary is, and you might not be able to achieve the definition you came up with; you might even be able to prove that you cannot achieve it exactly. So what do you do then? You could change your definition, or you could change your solution. I'll show you in this talk a couple of examples where you can actually achieve what we want, and one where you can prove that it's impossible to achieve it, and then you have to contend with that.

Okay. Just one word before I go into these examples, which are verification, robustness and privacy, as in the title that was advertised. I want to say that the first gut-level reaction by people in machine learning, or at least it used to be, is this: cryptography is designed with an adversary in mind, and if there's no adversary, there's no problem. Why should you encrypt if nobody's listening? Why should you protect access to your Amazon account if nobody's pretending to be you? But in machine learning that's not the mindset; you didn't come up with machine learning solutions because you were trying to get past an adversary. So maybe this whole adversarial thinking is not the right one. But the fact is that right now, and for the foreseeable future, AI systems are very attractive targets. There is a huge payoff to breaking them, in many different ways, whether at a national level or a commercial level. And therefore thinking about an adversary is a good way to think, because if you can show that things work in the presence of an adversary, you should feel safer than if you had assumed some particular behaviour on the part of the adversary.

My second point is that when we talk about adversaries, we would like to prepare for a worst-case adversary. We don't want to say: oh, the adversary uses a particular strategy. For those of you familiar with adversarial examples: there are solutions that say, well, if the adversary does the following to an image, then I can protect against it; that is, the adversary's ability is restricted to certain transformations of images. But why should the adversary be restricted? They can put a patch on a stop sign. I'm saying things we've all seen in other talks. So we'd like, if possible, to prepare for a worst-case adversary.
But one thing we do assume, throughout cryptography and also throughout this talk, is that the adversary is not computationally unbounded. In other words, it cannot factor integers, if you like; it does not have exponential time. There is some bound on the running time of the adversary, and on other resources as well. The resources these days could be samples: there's a distribution, or there's data, and it comes at some cost. So there is some bound on how much data, how much time, how much compute, and when I talk about a bound I mean polynomial.

So what are the challenges I will talk about, which are examples of this approach? (It looks really pale up here; can you actually see the screen? Yeah? Okay, fine, good.) I'll talk a little bit about privacy, aiming at the issue that a lot of the power, at least initially, came from the data of individuals in the training data, and we might not want to hand over medical records and so forth, or might not be able to because of legal restrictions. Then I'll talk about verification, and this aims at the fact that I'm not training my own model: I'm using a model that was trained by others, whether it's DeepMind or OpenAI or whatever company, and we would like some way to verify that the models presented to us are good, that they haven't been tampered with, for whatever reason. And lastly I'll say something about robustness. That is: we have the data today, we train, it looks good, often with respect to some benchmarks or some held-out data. But what about distribution shifts and the data of the future? How can we trust that the model is robust with respect to changes in the data?

So those are the three areas I'll talk about. And you can already see that if you had to sit down and ask who the adversary is: in the first, the adversary might be the training algorithm that's getting our data; in the second, it might be the company that's training a model and might not be doing it the way they claim, maybe saving on resources and therefore doing something different; and in the third, the adversary is in some sense even nature, because distributions do shift, and what do you do then; but it could also be a much worse adversary.

Okay. So, a very simplified picture.
We'll talk a little bit about what happens during the data-collection and training phase, then post-development of a model (call it the audit phase), and then when the model is being used. At a very high level I'll think of machine learning in just these two stages; I'm not looking at reinforcement learning or anything like that. Just think of a training algorithm that gets labelled inputs (x, y) from a distribution and outputs a model h (h for hypothesis), and it wants to minimise loss according to some loss function. And then, in the other phase, the training is done, there is already the model h, and now we get some individual's data and produce some prediction or probability. More than that, this could be a language model, so x is a query and the output is an answer: not just prediction but also generation, and the output could be a distribution over sequences, or sequences. So let's be very general in thinking about what I'm aiming at.

I'll start with privacy. In some sense privacy is the easiest problem, not necessarily the easiest to achieve in terms of the computing power that's necessary, but the easiest to define, because we have so much experience, especially as cryptographers, in trying to protect the privacy of data. There are lots of tools: secret sharing, private information retrieval, secure computation, homomorphic encryption and so forth. I'll mention some of them, but this is just a laundry list; there are lots of tools at our disposal, and in some sense we can use all of them. They're all defined to achieve certain security under certain assumptions, and they fit quite naturally to some of the problems that come up in machine learning. In particular, we're going to use homomorphic encryption in the next two slides.

Homomorphic encryption is this little chart here. It's an encryption method with a special property: not only can you encrypt and then decrypt, you can also do something we call "evaluate". That is, you can take the encryption of x, without understanding what x is, just given the ciphertext, and run an evaluation on it, so that at the end you get a new ciphertext; but what's encrypted is no longer x, it's f(x). And you can apply the decryption step to this new ciphertext c' and get back f(x). (A toy sketch of this interface follows below.)
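To make the Enc / Eval / Dec interface concrete, here is a minimal runnable sketch. It uses textbook Paillier encryption with tiny, insecure parameters, and it is only additively homomorphic; it illustrates the interface and the property Dec(Eval(Enc(x1), Enc(x2))) = x1 + x2, not the lattice-based fully homomorphic schemes discussed in this talk.

```python
# Toy illustration of the Enc / Eval / Dec interface (textbook Paillier, tiny insecure
# parameters). Additively homomorphic only; NOT the LWE-based FHE discussed in the talk.
import math
import secrets

def keygen(p: int = 101, q: int = 113):
    n = p * q
    lam = math.lcm(p - 1, q - 1)       # Carmichael lambda of n
    mu = pow(lam, -1, n)               # valid because we take g = n + 1
    return (n,), (n, lam, mu)          # (public key, secret key)

def enc(pk, m: int) -> int:
    (n,) = pk
    while True:                        # random r coprime to n
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    # with g = n + 1 we have g^m mod n^2 = 1 + m*n
    return ((1 + m * n) * pow(r, n, n * n)) % (n * n)

def eval_add(pk, c1: int, c2: int) -> int:
    """Homomorphic evaluation: multiplying ciphertexts adds the plaintexts."""
    (n,) = pk
    return (c1 * c2) % (n * n)

def dec(sk, c: int) -> int:
    n, lam, mu = sk
    u = pow(c, lam, n * n)
    return (((u - 1) // n) * mu) % n

pk, sk = keygen()
c = eval_add(pk, enc(pk, 1234), enc(pk, 111))
assert dec(sk, c) == 1345              # Dec(Eval(Enc(x1), Enc(x2))) = x1 + x2
```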
So that's the interface, and the question is which f's are supported: it could be addition, it could be multiplication, it could be any circuit. This is one of the biggest, probably the biggest, inventions of the last twenty years in cryptography; around 2009 the first fully homomorphic encryption scheme was proposed. And it is extremely useful for what I'm trying to do here.

For example, just to get us warmed up, suppose we want to apply it to the training-data problem. This is sort of a homework problem once you understand the definitions; working out this slide is a fairly easy homework problem, but it sets the stage. So let's say all the data is encrypted. Who encrypted it? That's a good question, but take a simple example: each person encrypts their medical file, or each doctor or each hospital encrypts all the medical files at their disposal. They all come together and give the ciphertexts to some trainer, and the trainer works on them. Instead of working directly on the data, on the x's and y's, the trainer works on the encryptions. So what you need is an algorithm that can take whatever program we were going to run on the x's and y's in the clear and run it on the encryptions. That's the encrypted-compute step we just described: if we have at our disposal something that can take the x's and y's (the medical file is x, and the label of whether you survive or don't survive might be y), then everything can be done under encryption. We'll talk about cost in a minute, but in an ideal world we have this encryption method and we get an encrypted model.

Now we want to use the model, though, and we can't really use it as a piece of information that is still encrypted; at this stage we have to decrypt it. So how are we going to decrypt it? The idea is that there is no single party (it's certainly not OpenAI) holding the decryption key, because then they could just decrypt our medical-record information. Instead there is a decryption key that is split up: metaphorically, there are three people here who each hold a part of the key, and each can apply their share, so that after each one takes their step, or they do it together, voilà, h appears in the clear. (A sketch of this key-splitting idea follows below.)
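As a minimal sketch of that key-splitting idea, here is additive secret sharing of a key. Real threshold decryption for (F)HE is more refined: each party applies its share to the ciphertext, and only the plaintext is ever reconstructed, never the key itself.

```python
# Toy additive secret sharing of a decryption key: no single party (and no strict
# subset of the parties) learns anything about the key, while all of them together
# can reconstruct it.
import secrets

MOD = 2**256  # work modulo a large power of two

def share_key(key: int, n_parties: int) -> list[int]:
    """Split `key` into n additive shares: share_1 + ... + share_n = key (mod MOD)."""
    shares = [secrets.randbelow(MOD) for _ in range(n_parties - 1)]
    shares.append((key - sum(shares)) % MOD)
    return shares

def reconstruct(shares: list[int]) -> int:
    return sum(shares) % MOD

key = secrets.randbelow(MOD)
shares = share_key(key, 3)                # e.g. three hospitals, one share each
assert reconstruct(shares) == key         # all three together recover the key
# any strict subset sees only uniformly random values, independent of the key
assert reconstruct(shares[:2]) != key     # (fails only with negligible probability)
```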
So what are the assumptions here? First of all, that there is an encryption scheme with the properties I want, and that's based on something called learning with errors, which I'll put up on a slide in a minute. So it's not the factoring problem; it's a different problem, coming from the geometry of numbers. And second, that these shareholders don't collude. So there are two assumptions: one is that you have given the pieces of the key to entities that don't all collude together to break the scheme, and the other is that the scheme itself is secure.

Now, some of you may have worked on differential privacy; how many people have heard of it? Yeah, many. Anyway, you could say that this is no good, because h, for example, could just be the list of everybody's data, and if it's a list of everybody's data, then h itself reveals people's data. So maybe you want some requirement on h itself, but that's a different problem. One problem is to design an h that does not leak the information in the x's and y's; the other is how you actually compute h without looking at the x's and y's in the data. There are lots of papers about training under differential privacy; I'm not going to touch that.

I didn't give you a definition here, but the definition really would be that, whatever algorithm you had that was working in the clear, if you compare that algorithm with the encrypted one, whatever you can learn from the encrypted one is what you could have learned... the other way around... you shouldn't learn more. I can't think right now; I went to sleep at 1:30. What is the definition you would want? Let me ask you. The definition we would want from this scheme is that you are not going to find out more about the data, about the x's and y's, than h itself provides. We've said that h is okay: everybody gets h at the end, or at least OpenAI gets h at the end. But we want that whatever you can recover from h about the x's and y's holds regardless of whether you had access to these encryptions or not. In other words, h obviously gives you something, but you don't want the fact that you looked at the encryptions of the medical files to give you anything more than h does. That's the definition that would be satisfied.
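In symbols, the guarantee just described is a standard simulation-style statement (a schematic phrasing, not the exact definition used in the papers):

```latex
% Whatever an efficient adversary can compute from the released model h together with
% the encrypted records, an efficient simulator can compute from h alone.
\[
  \forall\ \text{PPT}\ \mathcal{A}\ \ \exists\ \text{PPT}\ \mathcal{S}:\quad
  \mathcal{A}\big(h,\ \mathrm{Enc}(x_1,y_1),\dots,\mathrm{Enc}(x_m,y_m)\big)
  \;\approx_c\;
  \mathcal{S}(h),
\]
% where \approx_c denotes computational indistinguishability: seeing the ciphertexts
% reveals nothing about the (x_i, y_i) beyond what h itself reveals.
```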
So what is this learning-with-errors problem, the assumption on which we build encryption schemes with this extra property that you can compute on ciphertexts? Here is the problem. Think of s as a secret vector of dimension n: we don't know the values of its components. We are given a lot of linear equations in these variables; but of course, if we were just given exact equations, we could do Gaussian elimination and figure out what s is. So we're given noisy equations: on the left-hand side are the coefficients, and on the right-hand side we add some noise, so the value given is not exactly the right-hand side of the equation, it's that value plus some noise from a Gaussian-like distribution.

It turns out that this problem (given a bunch of coefficient vectors and a noisy right-hand side, find s) is as hard as decoding a random linear code, and as hard as approximating the length of the shortest vector in a worst-case n-dimensional integer lattice. So it's not only a problem that we don't know how to solve; it's a problem for which people have shown that the average-case distribution is as hard as a worst-case problem that we don't know how to solve. And when I say "don't know how to solve", I don't mean that I personally don't know how to solve it: I mean that the best known algorithms, and that is true even for quantum computers today, take exponential time, 2 to the n, where n is the length of the vector. This problem has been considered for many years, especially in its lattice version, and this is the best we know how to do.

Now, this doesn't mean that tomorrow some genius, maybe from Oxford, won't come up with a polynomial-time algorithm, in which case maybe that's even better than having a secure FHE, because they will have solved a fundamental problem. It's what we call a win-win: we say that if you can attack the scheme, the encryption scheme, then we give you a constructive reduction from that attack to solving this worst-case approximate shortest-vector lattice problem. Either way we win: either you have a good scheme, or you have a bad scheme, in which case you have a good algorithm. And that is what was achieved by the proof.
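Written out in standard notation (assumed here, not copied from the slide), the problem is:

```latex
% Learning with errors (LWE).
% Secret s \in \mathbb{Z}_q^n; A \in \mathbb{Z}_q^{m \times n} uniformly random;
% e \in \mathbb{Z}_q^m with small entries drawn from a (discretised) Gaussian.
\[
  \text{Given } \big(A,\; b = A s + e \bmod q\big):\ \text{recover } s,
  \quad\text{or distinguish } (A, b) \text{ from } (A, u) \text{ with } u \text{ uniform.}
\]
% Without the noise e, Gaussian elimination recovers s. With the noise, the best known
% algorithms (classical or quantum) run in time exponential in n, and solving LWE on
% average is as hard as worst-case approximate shortest-vector problems on lattices.
```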
Okay, so this is all beautiful in theory. But often, when people talk about FHE, and it's been years now since it was invented in 2009, people would say: well, it's very nice in theory, but it's impractical. First of all, I hate that sentence, because it never really holds up over time: whatever is impractical in the past usually becomes practical in the future. Specifically here, there are open-source homomorphic encryption libraries, built by many different people, and they're open, so it is possible to run this in practice. Now it's all a question of scale. And when I say scale: if you think about large language models, and you want to train foundation models and fine-tune and so forth, you'd like to know the concrete efficiency of these schemes.

The way to think about it is along three different axes. There's the adversary, which we sometimes assume is honest-but-curious: it doesn't tamper. Or it might be an actively malicious adversary that also changes things in its quest to break your encryption scheme. The adversary here is the machine-learning training algorithm, because it gets the encryptions; it could also be the people providing the data, and that's called poisoning, which is not part of this particular part of the talk. Then there's the question of what the computation is. It could be linear regression, logistic regression, deep neural nets, language models; each of these is a big jump in terms of the complexity of the computation. Where are we? We can use homomorphic encryption for linear models, no problem; we have things for logistic regression; when you get to large language models, it's obviously not there yet. So what do people do? There's another axis, which is the assumptions. One assumption I talked about is learning with errors, or factoring: the computational assumptions. Another might be that you have more than one party privy to the computation, but they don't talk to each other. And the third assumption is trusted hardware, and that's really the big player these days.

So when people talk about trusted hardware, what do they mean? They may mean many things, but mostly they mean this Nvidia chip, which is not that new at this point, it's been a few years: the H100, where the idea is that, in some sense, they say, here's a secure box; put your encryptions in there.
The box holds the secret key, and inside it, it will decrypt, do some computation, and then encrypt the result and send it out. So in a sense you're trusting that this box works without leaking anything about the secret key or the message that came in encrypted. And there's usually also an attestation module, which says: not only that, but you're going to get a proof that what was done in there, under the auspices of this box, was the correct computation. There used to be Intel SGX, which was similar, though it wasn't GPU-based; it had similar kinds of boxes and a similar kind of belief that it just works.

(I guess the sponsors' banner is hiding this line here; just a second.) Okay: beware, whenever you talk about secure hardware, that there are these so-called side-channel attacks. It could be that, because of the amount of power consumed, or because two people used the same chip and something was left over from the previous encryption that was in the box, information leaks. And there are bugs; maybe you shouldn't trust the companies blindly. There's a whole field of people out there working on secure hardware. For our purposes we can use it; we just have to understand that it is part of the assumption. (A rough sketch of this enclave workflow follows below.)
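Here is a rough sketch of the workflow just described. Everything in it is a stand-in: the XOR "encryption", the HMAC "attestation" and the enclave object illustrate only the shape of the protocol (decrypt inside the box, compute, re-encrypt, attest to what ran), not the actual NVIDIA H100 or Intel SGX interfaces.

```python
# Toy sketch of a trusted-execution-environment workflow. All primitives here are
# mocks; real TEEs use vendor-provisioned keys, real ciphers and signed attestations.
import hashlib
import hmac
import secrets

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

class ToyEnclave:
    def __init__(self):
        self._data_key = secrets.token_bytes(64)         # never leaves the enclave
        self._attestation_key = secrets.token_bytes(32)  # stands in for a vendor key

    def encrypt_for_enclave(self, plaintext: bytes) -> bytes:
        # stand-in for encrypting data so that only the enclave can read it
        # (toy: assumes messages shorter than the key)
        return xor_bytes(plaintext, self._data_key)

    def run(self, program, ciphertext: bytes) -> tuple[bytes, str]:
        data = xor_bytes(ciphertext, self._data_key)   # decrypt inside the box
        result = program(data)                         # compute inside the box
        out = xor_bytes(result, self._data_key)        # re-encrypt before it leaves
        # attestation: a tag binding the code that ran to the output it produced
        code_hash = hashlib.sha256(program.__code__.co_code).digest()
        tag = hmac.new(self._attestation_key, code_hash + out, hashlib.sha256).hexdigest()
        return out, tag

enclave = ToyEnclave()
c = enclave.encrypt_for_enclave(b"patient record")
out, attestation = enclave.run(lambda d: d.upper(), c)
```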
All right, so has this been used? This is all a blueprint so far. Mostly the usage has been in finance, where there is a lot of crypto use; the more machine-learning-type use of privacy has been mostly in the medical and, actually, the genomic field. Why genomics? Because you need a lot of data: it's a place where the complexity comes in and the amounts of data are large, because otherwise you see no effect. If you think about things like Crohn's disease, or breast cancer, or schizophrenia, it's only when you get to large enough numbers that you can actually find signal in the genome that's related to the phenotype. So it's a perfect example of a place where privacy is required, because you're talking about medical records, but size is important. So how do you achieve it?

In 2020 we had a paper doing GWAS (this is also through a company that I co-founded, so I was part of it), where we used exactly the type of blueprint I talked about, except that what you're running is a genome-wide association study analysis under homomorphic encryption, and at the end you get the answer. You'd like to check the accuracy of the answers, that it worked correctly, and also the time performance. At the time, in 2020, the extrapolation was 100,000 individuals and 500,000 SNPs in this amount of time on a single server. This is the kind of application that doesn't have to work in real time, and today the numbers would be much better.

What's another example? Just a point to make here about who holds the keys: in the case of GWAS, you can think of different medical centres, say the Broad Institute, Oxford, UCSF, each with their own database. They can be the ones holding the parts of the key, because they don't want to share data with each other, but they're okay being partial key holders themselves, and then they come together and reconstruct h.

All right, so that was 2020. Then in 2023 there's an application working on real-world oncology data analysis. This is a hospital that has a lot of cancer patients, and the novelty here was that, rather than having lots of data of the same form for different patients, you had one medical centre that had the electronic medical records, and another medical centre that had the results once someone was a cancer patient. Think of it as the same patient, with different data held in different places. So you need join operations, to identify the same patient and join the records, and all of this under encryption, because they are not allowed to send each other data. What was used for that is something called threshold FHE, but that doesn't matter; the main thing is that you can support a lot of the statistics that people use in medical packages, like Kaplan-Meier survival curves and lots of other very common statistical operations.

Okay. The last thing I'm going to say about privacy is that now, in 2024, secure hardware has come into the picture.
So often what you see is medical studies where you have, in this particular example, different hospitals, and the application was that each hospital trains its own model locally, and then the models are combined, using something called federated learning, into a bigger model that takes the information from all the different hospitals into account. And you use secure hardware as well: so there is both federated learning (this is the federation) and fully homomorphic encryption, plus what's called a trusted execution environment, which is the hardware. Why use hardware, if we were doing so well in the oncology study and the GWAS? Because in this particular application the data was a whole bunch of pathology slide images, which are very communication-heavy to send over the line. So you want to do things locally and then just send updates of the models, and even the updates you want to be secure; that's why you use the secure hardware. (A toy sketch of the averaging step is below.) There's a lot here; the only thing to retain is this: privacy, in some sense, is easy to define, because we've worked on it and we have the tools, so a lot of this is about putting the tools together in an efficient manner for different applications. It's at the stage where, where privacy matters, it is worth taking a look at this type of technology and making it work, and that's being done.
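As a toy illustration of the federated-averaging step mentioned above, here is the combination step on plain, unencrypted weight vectors; in the deployment described in the talk, the local updates would additionally be protected with homomorphic encryption and a trusted execution environment.

```python
# Toy federated averaging: each hospital trains locally, then the locally trained
# weight vectors are combined into one global model, weighted by local dataset size.
# (Plain floats only; the encryption and trusted hardware from the talk are omitted.)

def federated_average(updates: list[list[float]], sizes: list[int]) -> list[float]:
    """Weighted average of per-hospital weight vectors."""
    total = sum(sizes)
    dim = len(updates[0])
    return [
        sum(w[i] * n_i for w, n_i in zip(updates, sizes)) / total
        for i in range(dim)
    ]

# e.g. three hospitals with 2-dimensional local models and datasets of different sizes
local_models = [[0.10, 0.20], [0.30, 0.40], [0.50, 0.60]]
global_model = federated_average(local_models, sizes=[100, 200, 700])
print(global_model)  # roughly [0.42, 0.52]: closest to the model that saw the most data
```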
Okay, let's go on and talk about verification. (What time is it? How long have I been talking? Wow. Okay, I started a little late.) So, as I said, machine learning should really be thought of as a service: I'm not doing it in my own office; somebody else is doing it and I use it. So think of it this way: there's a client, and it gets this model. The client could even supply the data, so forget about privacy now: somehow the machine-learning trainer, the service provider, got data, came up with a model, and gave it back to the client.

Okay. So now the question is: what do I know about this service provider? As you see on the slide, they have ears and a tail, like the devil; maybe not the devil, but the devil incarnate. These are the kinds of signs you see when you drive from Berkeley to San Francisco: there's a company called Hive, and they keep changing their signs, and the current one (or maybe this was a few months ago) is "We make AI work. Don't just trust us, test us." I don't know exactly what they mean, but they themselves are confessing that you shouldn't just trust.

Okay. So what we would like to do is, post-development, verify properties of this model. What are the properties you want to verify? There could be many. For example, you might want to verify that the model is accurate; that's what most people concentrate on: they have their own data, they test the model on it and check that they get the right results. You might want to say that it's correct: maybe it's not good enough that it's good on average, you want it to be correct on the particular prediction it gives you. You might want to talk about robustness: what happens when your distribution changes. You might want to talk about fairness; in the United States you don't use that word any more, but it used to be a big field a few months ago: whether it's fair to different subpopulations. I think it is one of the words that has been stricken; I'm not sure, but it should be. You might want to talk about safety. Or, maybe later down the line, when the regulations are set, you want to prove that the model satisfies the regulations, whatever they are.

And obviously you could just retrain by yourself, so that doesn't make any sense; you have to put some restriction on this module that verifies properties, namely that it should be cheaper than retraining on your own. Cheaper how? Maybe using fewer data samples, maybe lower-quality data, maybe being much more efficient, maybe using only black-box access to the model, because they're not necessarily going to give you all the model weights and tell you exactly how the model works. So this is the big goal: how do you verify properties of the model that you were given?

I will show you a couple of results about this. First of all, as I said, accuracy is the simplest thing to define, with respect to a given distribution. What people do today is benchmarks: if you pass the benchmarks, then you're golden. So could you have a more rigorous definition? Because the problem with benchmarks is that everybody knows the benchmarks, and with benchmarks in general it's hard to define what it is they are achieving for you.
So this is a paper from a few years ago, which we called "Interactive Proofs for Verifying Machine Learning", that is, verifying the quality or accuracy of machine learning. In some sense this particular paper is just getting warmed up. The definition we gave there is this. On one hand there is Valiant's PAC learning, which I'm sure everybody here knows: you're given examples, and what it means for the model to have learned is that it is accurate with high probability. And now you can talk about probabilistic and approximate verification, not just learning: whereas in PAC you talked about learning, now I want to verify.

So here is a possible definition. What do I want to verify? Let's say this is the model designer, and they gave me a model; it's drawn as a neural net here, but it's just some h in general. Now I can ask questions and get back answers, it can go interactively, and at the end I want to be assured, with very high probability, that this model is within an additive error of the most accurate model possible for a given distribution. What does that mean? Let's unpack it a little more. We say that this h is in a class: it could be neural nets of a certain depth, it could be decision trees, it could be some other class of models, but there is a class that your model belongs to. And you would say that this class is verifiable if (there's something cut off underneath here on the slide, never mind) the following holds: for every other hypothesis h' in the class, the loss of your h is within an additive error of the loss of h'. So h is not worse than the best hypothesis in the class, except by an additive amount. And of course, in addition to the fact that your loss is at most an additive error from the best, you want what we call double efficiency, and that is that the verifier is much more efficient, in sample size or sample quality, than the model designer.
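Schematically (the notation is mine, not the paper's exact statement), the requirement on an accepted model h is:

```latex
% Verifiable accuracy for a hypothesis class H over distribution D with loss L_D:
% whenever the verifier V, after interacting with the prover, accepts the supplied h,
\[
  \Pr\Big[\, L_D(h) \;\le\; \min_{h' \in \mathcal{H}} L_D(h') + \varepsilon \,\Big]
  \;\ge\; 1 - \delta ,
\]
% and "double efficiency" additionally requires that V use far fewer (or lower-quality)
% samples, and less computation, than it would take to learn such an h from scratch.
```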
460 00:41:40,040 --> 00:41:45,049 If you look at their Fourier representation, there is only a constant number of non-zero coefficients 461 00:41:45,050 --> 00:41:51,200 which are large, and this captures decision trees, small formulas, AC0 circuits. 462 00:41:51,620 --> 00:42:01,580 And the theorem shows that if the designer was modelling, with h, a ground truth that obeys these properties, 463 00:42:01,850 --> 00:42:08,990 then you can verify that the model they gave you is as good as, within an additive error of, 464 00:42:09,110 --> 00:42:16,700 the best one, using a lot less data than what the model designer used. 465 00:42:16,850 --> 00:42:21,670 In particular, the verifier uses essentially only random samples. 466 00:42:21,680 --> 00:42:28,040 It's well known that in order to learn these kinds of functions you need query access to the function: 467 00:42:28,040 --> 00:42:33,770 you need to choose x and then ask for its label; you can't just use random examples. Whereas the verifier can use just random samples. 468 00:42:35,100 --> 00:42:40,050 So what does that say? It means the verifier still has to do work, but it is working with much less 469 00:42:40,060 --> 00:42:42,330 availability of data 470 00:42:42,330 --> 00:42:51,870 and much weaker access to data, and yet it can now verify that the model it was given is indeed as good as, 471 00:42:52,200 --> 00:42:55,890 except for the error term, any model of the promised form. 472 00:42:56,930 --> 00:43:03,860 And there are a lot of other results. In fact, there was an improvement of this particular result by Tom Gur, in another paper on the limits of interactive proofs for this problem. 473 00:43:04,490 --> 00:43:10,010 Tom Gur's result is actually quite interesting; he improved the efficiency significantly. 474 00:43:10,460 --> 00:43:13,570 Okay. Now, what about worst-case guarantees? 475 00:43:13,580 --> 00:43:19,010 What this part talked about is: somebody gives me a model and I want to check that the model does well overall, 476 00:43:19,010 --> 00:43:22,159 a bit like normal benchmarking, 477 00:43:22,160 --> 00:43:27,260 but with respect to this definition, for example that it does as well as any other model in the class that I specified. 478 00:43:27,800 --> 00:43:37,130 But what about worst-case guarantees? Let's say the model that I got, I checked it, either using these methods or benchmarking, 479 00:43:37,340 --> 00:43:43,820 and 99% of the time it's good on the task. But then a particular input comes in 480 00:43:44,750 --> 00:43:48,440 and you get a particular output. How do you know that 481 00:43:48,440 --> 00:43:55,970 the answer you got is actually correct? Whether it's a prediction, or, in the context of a large language model, 482 00:43:56,330 --> 00:44:00,730 maybe you gave it your medical record and it told you that you have five years to live. 483 00:44:00,740 --> 00:44:03,800 You'd like to know whether that's accurate, and not wait five years to find out. 484 00:44:03,980 --> 00:44:07,670 Right now we don't really know anything. 485 00:44:07,670 --> 00:44:15,680 As you well know, there are lots of examples where, even when multiplying large numbers, the answer is incorrect.
486 00:44:16,310 --> 00:44:23,660 So what we'd really like, ultimately, is to not just know that the model overall has 99% accuracy, 487 00:44:23,840 --> 00:44:27,649 but to know whether the particular answer to our particular question is 488 00:44:27,650 --> 00:44:32,860 correct. This is moving from the average case to the worst case: even in the worst case, 489 00:44:33,070 --> 00:44:37,690 on the hardest x, do you get any kind of certainty that the answer is correct? 490 00:44:38,110 --> 00:44:43,210 And this is in a paper with two of my students, actually; 491 00:44:43,390 --> 00:44:50,950 one used to be my student but has been faculty for many years already. It addresses not prediction but large language models, 492 00:44:50,950 --> 00:44:59,649 so generative models. And we ask: these models supposedly can be trained to do everything, 493 00:44:59,650 --> 00:45:06,850 you ask a question and it gives you an answer, whether math, science, literature, or psychological advice; 494 00:45:06,940 --> 00:45:10,750 maybe they can also generate proofs in the same way that they generate answers. 495 00:45:10,780 --> 00:45:15,910 So you would like them to give you not only the output y, but also a proof that y is correct. 496 00:45:16,060 --> 00:45:17,530 Of course, what do we mean by correct here? 497 00:45:17,770 --> 00:45:24,249 If it's a math problem, it's well defined. When we're talking about natural language and so forth, I think it's a fascinating question 498 00:45:24,250 --> 00:45:25,809 how we are going to define what correct means. 499 00:45:25,810 --> 00:45:32,260 Do we have a database of correct facts, so that we can show a contradiction between whatever was generated and those facts? 500 00:45:32,830 --> 00:45:36,430 Let's leave that aside. Let's say that we have a notion of correct, say within mathematics. 501 00:45:36,760 --> 00:45:39,610 For two numbers being multiplied, there is one answer that's correct. 502 00:45:39,910 --> 00:45:44,590 In general there could be multiple correct answers, but we have at least a sound definition of what correctness means. 503 00:45:45,190 --> 00:45:49,180 So now that we have that, can we actually teach language models to come up with proofs as well? 504 00:45:49,720 --> 00:45:55,500 That's what's in this paper, and we first need a model. 505 00:45:55,510 --> 00:46:00,730 So here is the language model, ChatGPT, whatever. 506 00:46:01,210 --> 00:46:04,060 And what we're going to add is something we call a verifier. 507 00:46:04,090 --> 00:46:08,350 We design an algorithm that is not part of the machine learning process. 508 00:46:08,530 --> 00:46:14,259 We design it ourselves, we call it the verifier, and it is fully verified, 509 00:46:14,260 --> 00:46:17,080 meaning that I know and trust this particular verifier. 510 00:46:17,440 --> 00:46:25,899 Now, when an input x comes in, it also goes to the verifier (maybe the verifier even generated it), and the verifier gets the answer y. 511 00:46:25,900 --> 00:46:30,520 But now we want to be convinced that y is correct; we want a proof that y is correct. 512 00:46:30,520 --> 00:46:33,790 What we allow in this abstraction is that the proof can be interactive.
513 00:46:34,000 --> 00:46:39,879 Correctness here is with respect to some relation, and it could be that the verifier asks questions 514 00:46:39,880 --> 00:46:47,530 and gets answers, back and forth, and at the end it says either "I accept, y is correct" or "I reject, y is incorrect". 515 00:46:48,130 --> 00:46:51,700 Okay, so this is just a model. Now, what are the requirements? 516 00:46:51,820 --> 00:46:55,060 What we'd like, first of all, is that this is sound. What does that mean? 517 00:46:55,330 --> 00:46:58,719 That if y is incorrect, we reject with high probability. 518 00:46:58,720 --> 00:47:02,800 So first of all, I want to reject incorrect y's with high probability. 519 00:47:02,920 --> 00:47:06,130 Now, of course, I could just reject all the time and soundness would be satisfied, 520 00:47:06,280 --> 00:47:11,850 so we also want some utility out of this. And there, essentially, what we want is this: 521 00:47:11,890 --> 00:47:18,580 the standard thing in machine learning is that the probability over x that we get a correct y is high. 522 00:47:18,910 --> 00:47:24,700 What I'd like, in addition, is that the model can also 523 00:47:25,270 --> 00:47:30,850 produce these answers to my questions, so that at the end I will be convinced. 524 00:47:31,030 --> 00:47:37,240 We call it self-provability. Whereas before we just knew that on average over x the answer will be correct, 525 00:47:37,450 --> 00:47:44,890 now I'm saying incorrect y's will always be rejected, and also, with high probability, on correct y's I will be convinced to accept. 526 00:47:46,230 --> 00:47:49,680 So right now, we're sort of happy that 99% of the time we get the right answer. 527 00:47:49,740 --> 00:47:55,770 We're saying we also want that 99% of the time, or 98% of the time, not only do we get the answer, but we also get a proof (these two requirements are restated a bit more formally below). 528 00:47:57,300 --> 00:48:04,920 And that's the model. And there's a theorem, under some 529 00:48:05,910 --> 00:48:10,350 convexity and Lipschitz assumptions, that says how, 530 00:48:10,560 --> 00:48:13,170 or when, you can train a model to give you a proof. 531 00:48:13,710 --> 00:48:23,130 And the idea here is actually to use stochastic gradient descent, or rather some variant of stochastic gradient descent. 532 00:48:23,130 --> 00:48:30,450 The new thing about this theorem, which I don't really have time to get into because I do want to get to my other material, is 533 00:48:31,580 --> 00:48:40,250 what you are given: in addition to the usual conditions, you're given access to something I call an accepting-transcript example generator. 534 00:48:40,880 --> 00:48:44,060 So the only thing that's new 535 00:48:45,600 --> 00:48:51,930 on this line is that concept. What does that mean? It means you have examples of proofs, of theorems if you like. 536 00:48:52,110 --> 00:48:57,010 Take multiplication: let's say there's a proof that the result is correct. 537 00:48:57,060 --> 00:49:00,280 What would a proof that the result is correct be? One option is that you redo the multiplication yourself; 538 00:49:00,300 --> 00:49:05,280 we don't want that, we want it to be shorter. So let's say, for a minute, that you have a shorter proof that the result is correct. 539 00:49:05,520 --> 00:49:10,049 What I want is to have lots of examples of x and y which have been multiplied, together with such proofs.
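A slightly more formal restatement of the two requirements above, soundness and self-provability, in generic notation; this is a paraphrase rather than the paper's exact definitions. Here V is the trusted verifier, M is the model acting as prover, D is the input distribution, and (P, V)(x, y) denotes the interaction about the claimed answer y.

```latex
% Soundness: no prover strategy P* can make V accept an incorrect answer,
% except with small probability epsilon over the verifier's coins.
\forall x,\; \forall P^*,\; \forall y \text{ incorrect for } x:\quad
  \Pr\big[(P^*, V)(x, y) = \mathrm{accept}\big] \;\le\; \epsilon

% Self-provability: on a random input, the model both answers correctly and
% convinces the verifier, except with small probability delta.
\Pr_{x \sim D}\big[\, y = M(x) \text{ is correct} \;\wedge\; (M, V)(x, y) = \mathrm{accept} \,\big] \;\ge\; 1 - \delta
```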
540 00:49:10,050 --> 00:49:17,730 And these kinds of proofs, if I have them as examples, I show them, in a sense, to the large language model, 541 00:49:17,850 --> 00:49:23,790 so it learns how to generate proofs of this sort for the new x and y that you want to multiply. 542 00:49:24,210 --> 00:49:28,080 Because again, those are sequences. If you think about what these proofs look like, they're sequences. 543 00:49:28,320 --> 00:49:31,740 I'm giving it a lot of examples of sequences, each consisting of the numbers x and y, 544 00:49:31,740 --> 00:49:35,970 then an alleged product, and then a proof that that product is correct. 545 00:49:36,180 --> 00:49:41,440 So can we teach it how to generate these proofs? 546 00:49:41,520 --> 00:49:49,380 Now, for the mathematicians, who are probably a subset of the people here, the idea that proofs are just sequences may sound kind of insane. 547 00:49:49,890 --> 00:49:53,100 But the fact is that 548 00:49:55,930 --> 00:49:59,709 we have a theorem that tells you that if you have access 549 00:49:59,710 --> 00:50:04,540 to a lot of examples of proofs of a certain type (and in a minute we'll see experiments), 550 00:50:04,900 --> 00:50:11,590 then you will be able to generate these proofs yourself, in such a way that the verifier will be able to reject those which are incorrect. 551 00:50:12,520 --> 00:50:17,770 In other words, that's what a proof means here: you can reject incorrect claims with extremely high probability. 552 00:50:17,950 --> 00:50:25,510 That's a notion of proof that we accept, at least when the small probability of mistake is under your control. 553 00:50:26,110 --> 00:50:28,900 And there is of course a complexity to training the model: 554 00:50:29,140 --> 00:50:34,810 the number of iterations of gradient descent that it has to go through. 555 00:50:34,990 --> 00:50:41,820 And interestingly, I think this is the most interesting part, the size of this interaction, the questions and answers, 556 00:50:41,830 --> 00:50:44,950 the overall transcript 557 00:50:44,950 --> 00:50:51,730 length, is a very important part of how long it takes you to train a model to be able to come up with proofs. 558 00:50:52,630 --> 00:50:59,560 The longer those proofs are, the more time it takes. Again, that makes sense for people who are a little bit familiar with the language model 559 00:50:59,620 --> 00:51:02,769 space. Okay. 560 00:51:02,770 --> 00:51:07,660 So we did do it: lab experiments for the simplest algorithm. 561 00:51:07,750 --> 00:51:12,850 You know, when we teach introduction to algorithms, we always say that Euclid's algorithm is the first algorithm ever invented. 562 00:51:13,210 --> 00:51:17,200 Do you say that here? I'm sure they say it in Greece, but in any case. 563 00:51:17,350 --> 00:51:23,379 So the algorithm is GCD: you have two numbers 564 00:51:23,380 --> 00:51:28,060 and you want to find the greatest common divisor. And what's nice about it is that there actually are these coefficients: 565 00:51:28,060 --> 00:51:33,520 if you can find two numbers such that you can take the linear combination of, 566 00:51:33,790 --> 00:51:39,700 let's say, x1 and x2, the numbers whose GCD you're trying to find, where z1 and z2 are these coefficients,
567 00:51:39,910 --> 00:51:50,260 then if this linear combination divides both x1 and x2, that's actually a proof that the linear combination is the greatest common divisor. 568 00:51:50,830 --> 00:51:55,870 This is ancient; we all know these coefficients exist. 569 00:51:56,230 --> 00:52:03,130 And indeed Euclid's algorithm, when it computes the greatest common divisor of x1 and x2, can also compute those coefficients. 570 00:52:03,670 --> 00:52:09,459 But the large language model, when I give it x1 and x2, just says: hey, here's the greatest common divisor. 571 00:52:09,460 --> 00:52:13,030 Maybe it's true, maybe it's not. I don't know if it used Euclid's algorithm; 572 00:52:13,030 --> 00:52:17,440 I don't know what it does. But it certainly doesn't output these coefficients. 573 00:52:17,740 --> 00:52:19,220 So you could think of 574 00:52:19,240 --> 00:52:26,230 a particular case: I'd like to train the model not only to give me the solution to greatest common divisor, 575 00:52:26,320 --> 00:52:33,040 but also to give me these coefficients, which are essentially a proof. They're a proof because if I get the coefficients, 576 00:52:33,040 --> 00:52:38,290 I can just do one multiplication plus another multiplication and check that the result equals the answer they gave me, as in the little sketch below. 577 00:52:39,680 --> 00:52:43,130 And if it does, that is actually a 100% proof that the answer was correct. 578 00:52:43,610 --> 00:52:48,110 So that's exactly what we do. 579 00:52:48,110 --> 00:52:51,170 We give it a lot of examples of 580 00:52:51,560 --> 00:52:56,870 x1, x2, z1, z2; for a while we train using the algorithm that I specified on the previous slide. 581 00:52:57,170 --> 00:53:02,510 And at the end, indeed, you get a model which outputs these coefficients in addition to the answer. 582 00:53:02,810 --> 00:53:03,140 And 583 00:53:04,400 --> 00:53:16,370 if GPT gets 99.8% accuracy, which was reported in a paper by Charton, then with the strategy that I just outlined we got to these numbers: 584 00:53:16,370 --> 00:53:19,790 the verifiability is only 60%. That's much less, 585 00:53:19,790 --> 00:53:23,540 but it's better than nothing: you're still getting the answer, and now you're also getting verifiability. 586 00:53:23,720 --> 00:53:27,110 And it turns out that with other methods we can actually get up to 96%, 587 00:53:27,260 --> 00:53:31,340 but I won't get into it. What is the point of this? I'm done with this part of the talk. 588 00:53:31,400 --> 00:53:39,400 The point here is that verifiability of accuracy on the average, and accuracy in the worst case, are things that can be formally defined. 589 00:53:39,410 --> 00:53:44,750 Whether you like my definitions or propose other ones, they should be defined and then achieved. 590 00:53:45,260 --> 00:53:48,559 And on this whole topic, I think my student Orr was here and gave a talk, 591 00:53:48,560 --> 00:53:49,520 so some of you may have gone. 592 00:53:49,700 --> 00:53:58,010 There's a vast literature right now about using some kind of interactive proof in the context of machine learning, 593 00:53:58,130 --> 00:54:02,780 not with our definition, but with others. This could be within math-assisted proving.
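As a minimal sketch of the certificate check just described, and nothing more than that (my own toy code, not the experimental setup from the paper): given x1 and x2, an alleged GCD g, and alleged coefficients z1 and z2, the verifier needs only two multiplications, an addition, and two divisibility checks.

```python
def verify_gcd_certificate(x1: int, x2: int, g: int, z1: int, z2: int) -> bool:
    """Accept iff (g, z1, z2) certifies that g = gcd(x1, x2).

    g divides x1 and x2   -> g is a common divisor.
    z1*x1 + z2*x2 == g    -> every common divisor of x1 and x2 divides g,
                             so g is the greatest common divisor.
    """
    return g > 0 and x1 % g == 0 and x2 % g == 0 and z1 * x1 + z2 * x2 == g


# Toy usage: a correct certificate is accepted, a wrong claim is rejected.
assert verify_gcd_certificate(240, 46, 2, -9, 47)       # (-9)*240 + 47*46 = 2
assert not verify_gcd_certificate(240, 46, 4, -9, 47)   # 4 does not divide 46
```

The check is unconditionally sound, which is exactly the "100% proof" point made above; the only question is whether the model can be trained to emit valid (g, z1, z2) triples, which is what the 60% and 96% verifiability numbers measure.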
594 00:54:03,110 --> 00:54:05,089 There's actually going to be a workshop, 595 00:54:05,090 --> 00:54:10,310 which I think will be very interesting, run jointly by the Simons Institute and MSRI, up on the hill at Berkeley, 596 00:54:10,490 --> 00:54:15,950 which is all about math-assisted proofs. It's run mostly by mathematicians and mostly focussed on Lean. 597 00:54:16,460 --> 00:54:21,170 But there are other things, in safety and alignment, where they use the concept of a verifier; 598 00:54:21,440 --> 00:54:27,100 sometimes the verifiers themselves are machine learning models, in which case it's like turtles all the way down. 599 00:54:27,110 --> 00:54:33,890 You know what that means, the Dr. Seuss thing, where you're building on top of something that itself is floating in the air. But okay. 600 00:54:34,400 --> 00:54:37,610 All right. I want to get to my last part. 601 00:54:37,610 --> 00:54:41,440 I'll take five minutes. Okay. So, we talked about privacy, 602 00:54:42,070 --> 00:54:47,800 we talked about verification, and I want to say a word about robustness, because so far I've told you possibility results. 603 00:54:47,980 --> 00:54:51,310 I said: look, it's possible to protect privacy, 604 00:54:51,580 --> 00:54:57,370 it's possible to check accuracy, maybe even with worst-case guarantees. 605 00:54:57,670 --> 00:55:02,930 And now I want to show you one place where I can show an impossibility, also using cryptography. 606 00:55:03,860 --> 00:55:11,840 By the way, in this last part of the talk, on verification, I didn't use any cryptography. 607 00:55:12,170 --> 00:55:16,400 I didn't use hard problems like encryption, digital signatures and so on. 608 00:55:16,730 --> 00:55:22,490 What I used was the paradigm of interactive proofs, which also comes from the field of cryptography and verification. 609 00:55:22,880 --> 00:55:27,980 So that's the inspiration there. So, we're back to this ML-as-a-service picture. 610 00:55:30,180 --> 00:55:36,270 But on the previous slide, the machine learning provider wanted to design a good model and to prove to you that the model was good. 611 00:55:36,540 --> 00:55:40,230 So in some sense you were training the model to come up with proofs, 612 00:55:40,800 --> 00:55:46,380 and they were collaborating with you, even though you were verifying their answers at the end. 613 00:55:46,920 --> 00:55:52,890 But let's think of a worse scenario. Let's say there is this dark cloud: 614 00:55:53,130 --> 00:55:59,580 the service provider really is evil, or at least has evil people working in there, which I think is to be assumed. 615 00:56:00,030 --> 00:56:05,340 It's ridiculous not to assume that; at least coming from a country like I do, it's like, yeah, of course. 616 00:56:05,810 --> 00:56:15,690 Okay. I think it's true about the world at large, but in any case. 617 00:56:15,690 --> 00:56:21,460 So here is a kind of American example. 618 00:56:21,480 --> 00:56:26,670 We have a bank, and the bank doesn't want to give me a loan unless they think that I'll pay it back. 619 00:56:26,940 --> 00:56:29,700 So they look at the loan application.
620 00:56:29,700 --> 00:56:34,409 I don't know if you've ever tried to get a mortgage in the United States, but in some sense, the more money and the more assets you have, 621 00:56:34,410 --> 00:56:40,430 the less likely they are to actually give you a mortgage, because there are so many things to check, and it's all about checking. In any case, 622 00:56:40,440 --> 00:56:43,919 the bank says: okay, there is this company; we're going to give them all the old loans, 623 00:56:43,920 --> 00:56:49,230 whether we gave the loan, whether the loan was repaid, and we get back a model. 624 00:56:49,440 --> 00:56:54,749 And then we say: well, wonderful. 625 00:56:54,750 --> 00:56:59,370 From now on, loan applications come in and this model says give it or don't give it. 626 00:56:59,640 --> 00:57:03,480 Okay. Beautiful. But what do 627 00:57:04,750 --> 00:57:09,370 people in the company say? They say: that's very nice, but I want to get a loan sometime down the line. 628 00:57:09,520 --> 00:57:15,670 So what I'm going to do is put some sort of backdoor into the model, and I keep the key. 629 00:57:15,910 --> 00:57:19,360 Here the model has changed a little bit, by that little yellow square, and I keep the key. 630 00:57:19,750 --> 00:57:26,990 And it has the property that my loan application would probably be rejected. 631 00:57:27,200 --> 00:57:36,920 However, because I know something about the model, I'm going to change my loan application ever so slightly, so that this is now the new loan application. 632 00:57:37,460 --> 00:57:44,150 It's changed slightly, because I know something about the model, and now my loan gets accepted. So I can change inputs 633 00:57:45,320 --> 00:57:50,900 in a way that fools the machine learning model, which in this case I myself actually designed, 634 00:57:51,080 --> 00:57:57,560 so that even though a natural input would have been rejected, I can make it be accepted. 635 00:57:58,070 --> 00:58:03,350 And you can think of this much more generally: a large language model is not supposed to tell us how to produce atomic bombs. 636 00:58:03,560 --> 00:58:07,940 You could make a change and say: well, maybe there's screening and alignment, 637 00:58:07,940 --> 00:58:14,180 but I will craft my queries so that they break the alignment, the defences, the safety guardrails and so forth. 638 00:58:14,750 --> 00:58:24,710 And what we show in this paper is that, in fact, this is not only a fantasy of science fiction: 639 00:58:25,010 --> 00:58:30,980 if digital signatures exist, cryptographic digital signatures, then you can plant backdoors in any neural net. 640 00:58:31,430 --> 00:58:35,239 And furthermore, this backdoor will be undetectable and non-replicable. 641 00:58:35,240 --> 00:58:45,470 So I'll explain what I mean. By non-replicable, I mean that even if somebody saw examples of modified loans in the past, 642 00:58:45,590 --> 00:58:49,040 they wouldn't be able to come up with another one. And undetectable 643 00:58:49,040 --> 00:58:49,939 I'll define in a second.
644 00:58:49,940 --> 00:58:57,440 What I mean by undetectable is this: suppose first that you have input-output access to the model; 645 00:58:57,530 --> 00:59:00,950 you can feed it lots of inputs, look at the outputs, and try to see whether there is a backdoor in there, 646 00:59:01,400 --> 00:59:06,290 and we show that that will be impossible, again under this assumption that digital signatures exist. 647 00:59:06,770 --> 00:59:12,379 Now, the second setting would be not just feeding inputs and observing outputs, black box, but actually white box, 648 00:59:12,380 --> 00:59:14,780 looking inside the model; that's much harder. So for right now, 649 00:59:14,780 --> 00:59:22,790 this theorem stands by assuming that all you have at your disposal, to try to figure out whether this model has a backdoor in it or not, 650 00:59:22,790 --> 00:59:28,040 is the ability to run it on inputs of your choice. So what does it really look like? 651 00:59:28,280 --> 00:59:38,089 It's something like this. First of all, we know that if this problem from lattices, the LWE problem that I showed you earlier, is indeed a hard problem, 652 00:59:38,090 --> 00:59:44,360 then there is a digital signature scheme. A signature scheme means: you give me a message m and I give you the signature of m; 653 00:59:44,360 --> 00:59:49,370 nobody else can do that unless they know the secret key, and they can't even produce, for a new m prime, a signature of m prime. 654 00:59:49,910 --> 00:59:53,810 And there is a neural net that implements this; it's very, very small, four layers. 655 00:59:54,170 --> 00:59:58,069 And then you can take any neural net you like, 656 00:59:58,070 --> 01:00:01,129 whatever it is, that big box, and add a little box to it, 657 01:00:01,130 --> 01:00:07,820 which is actually not the digital signing; it's the verification that there is a correct digital signature. 658 01:00:08,150 --> 01:00:13,190 So when an input comes in, the person who knows the key will take the input and modify it, 659 01:00:13,190 --> 01:00:19,760 so now it's an input plus a signature. And what the neural net will do is, first of all, check whether there's a signature there. 660 01:00:20,300 --> 01:00:25,160 It verifies that there is a signature there of the rest of the input, 661 01:00:25,400 --> 01:00:29,120 and if there is, then it accepts the loan; otherwise it rejects. 662 01:00:29,120 --> 01:00:35,569 So it's a very simple change. But since we're theoreticians, we're not just going to say you can't tell where this is hiding; 663 01:00:35,570 --> 01:00:41,990 we say that with black-box access you cannot tell. So this is the notion of backdoor undetectability 664 01:00:42,380 --> 01:00:45,380 that we use (a toy sketch of this signature-check construction appears a little further below). Okay. 665 01:00:45,380 --> 01:00:49,400 For whoever wants, in the Q&A, of course, we could talk about white box: 666 01:00:49,400 --> 01:00:53,030 what if you could also look at the weights of the model? 667 01:00:53,390 --> 01:00:54,920 Is it true then that it's undetectable? 668 01:00:56,400 --> 01:01:04,260 We can do this even for very restricted models, like one layer learned over random Fourier features. 669 01:01:04,680 --> 01:01:09,860 What's interesting here is that whereas before the idea was to add a digital signature check, 670 01:01:09,870 --> 01:01:13,590 the idea here is to use pseudorandom rather than truly random numbers for the weights.
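Here is the toy sketch promised above: a schematic of the signature-check backdoor, with an HMAC from Python's standard library standing in for a real digital-signature scheme, and with all names, sizes, and the way the tag is packed into the input coordinates made up purely for illustration. It is meant only to convey the mechanism, not the construction from the paper, which builds the check into the network's weights.

```python
import hashlib
import hmac
import secrets
from typing import Callable, Sequence

SECRET_KEY = secrets.token_bytes(32)  # held only by whoever planted the backdoor
TAG_LEN = 16                          # trailing input coordinates hijacked to carry the tag


def _tag(features: Sequence[float]) -> bytes:
    """MAC over the 'real' part of the input (everything except the tag slots)."""
    msg = ",".join(f"{v:.6f}" for v in features).encode()
    return hmac.new(SECRET_KEY, msg, hashlib.sha256).digest()[:TAG_LEN]


def plant_backdoor(model: Callable[[Sequence[float]], int]) -> Callable[[Sequence[float]], int]:
    """Wrap an arbitrary classifier so any input carrying a valid tag is accepted."""
    def backdoored(x: Sequence[float]) -> int:
        real = list(x[:-TAG_LEN])
        carried = bytes(int(round(v)) % 256 for v in x[-TAG_LEN:])
        if hmac.compare_digest(carried, _tag(real)):
            return 1              # trigger present: force "accept the loan"
        return model(x)           # otherwise behave like the original model
    return backdoored


def activate(x: Sequence[float]) -> list:
    """What the key holder does: tweak the input so it carries a valid tag."""
    real = list(x[:-TAG_LEN])
    return real + [float(b) for b in _tag(real)]


# Example: the original model rejects everything; the backdoored copy accepts a tweaked input.
reject_all = lambda x: 0                      # stand-in "original" model
bad_model = plant_backdoor(reject_all)
loan = [0.3] * 48 + [0.0] * TAG_LEN           # 64 features; last TAG_LEN reserved in this toy
assert bad_model(loan) == 0                   # normal behaviour on an unmodified application
assert bad_model(activate(loan)) == 1         # key holder's tweaked application is accepted
```

With the key, the attacker can turn essentially any input into a trigger; without it, crafting a trigger, or even distinguishing the wrapped model from the original through black-box queries, amounts to forging a tag, which is the role the digital-signature assumption plays in the actual result.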
671 01:01:13,920 --> 01:01:18,300 And, coming back to the white-box case, the pseudorandom numbers there come from a distribution which is hard to distinguish from random, based on cryptography again, 672 01:01:18,690 --> 01:01:25,230 and if you know what they are, you know how to tweak your inputs so that they will be accepted. 673 01:01:25,710 --> 01:01:29,940 Okay. So, again, you could always say: 674 01:01:29,940 --> 01:01:35,819 oh, but this is only a theoretical, crazy thing that somebody would do. It turns out that people believe theoreticians 675 01:01:35,820 --> 01:01:38,400 much more when we talk about attacks than when we talk about solutions. 676 01:01:38,880 --> 01:01:44,490 And indeed, there is now a big program, I think at DARPA or somewhere like that, where they want people to attack: 677 01:01:44,490 --> 01:01:51,120 they say, give us models with backdoors, so that we know what to fight. 678 01:01:51,420 --> 01:01:55,980 They want to see lots of examples of backdoors, to be able to try to identify them. 679 01:01:56,580 --> 01:02:00,300 So this is almost the last slide. What's the takeaway from this, that we should be terrified? 680 01:02:00,780 --> 01:02:06,600 No. I think the takeaway is that we should not trust the models, that we should assume 681 01:02:07,020 --> 01:02:10,050 that they have problems, and that we should post-process them. 682 01:02:10,200 --> 01:02:14,399 So rather than just using them as they are, 683 01:02:14,400 --> 01:02:17,520 it's sort of like: I don't know that my hands necessarily have germs, but I still wash my hands. 684 01:02:18,210 --> 01:02:20,160 You wash your hands. 685 01:02:20,220 --> 01:02:27,570 So the message is that we would like to try to do mitigation of problems without necessarily detection, because sometimes the problem is undetectable, 686 01:02:27,810 --> 01:02:32,639 but you can prove that regardless of what problem was there, you can mitigate it with some more work. 687 01:02:32,640 --> 01:02:38,640 And there is a recent paper, which is going to appear in the next few months at STOC, 688 01:02:38,640 --> 01:02:46,470 where we show that for certain types of ground truth you can actually mitigate without detecting. 689 01:02:46,800 --> 01:02:50,550 One of them is for a Fourier-sparse ground truth, 690 01:02:50,550 --> 01:02:54,840 like I mentioned before; another one is for linear regression and polynomial regression. 691 01:02:55,020 --> 01:03:02,309 So say somebody gives you a regression model that has a backdoor in it, 692 01:03:02,310 --> 01:03:10,590 so it will give you wrong results if you tweak the inputs; you can take that and build another regression model that doesn't have that. 693 01:03:10,590 --> 01:03:13,050 You will provably have removed the backdoor. 694 01:03:13,980 --> 01:03:23,790 So, to end, I want to say that this whole topic of verifying or mitigating, for which the name Verifiable Data Science has been proposed, 695 01:03:24,060 --> 01:03:31,680 is something I think is important, and there is a lot of work on it by other people as well as me. 696 01:03:31,950 --> 01:03:35,790 And I guess the main thing is that safety is a big challenge. 697 01:03:36,450 --> 01:04:40,860 There's a quote, I can't remember anymore from where:
698 01:03:42,250 --> 01:03:50,860 I think it was on some website. They say: technological progress can excite us, politics can infuriate us, wars can mobilise us, 699 01:03:50,860 --> 01:03:57,160 but faced with the risk of human extinction that the rise of artificial intelligence is causing, we have remained surprisingly passive. 700 01:03:57,550 --> 01:04:02,140 I don't know, it sounds good, but I think it is true that this is a huge, huge, 701 01:04:02,140 --> 01:04:09,580 huge tool, and we have to be very mindful of the fact that safety is important and we should work on it. 702 01:04:10,210 --> 01:04:10,600 Thank you.