1 00:00:15,370 --> 00:00:22,480 Well, this demonstrates that your services are about to turn to the second set by the distinguished lecturer named after Christopher Scrunchie, 2 00:00:23,500 --> 00:00:27,960 research students. Um, before I tell you what, we think our sponsor is Oxford Asset Management. 3 00:00:27,970 --> 00:00:31,040 They've been funding this series since 2015. 4 00:00:31,060 --> 00:00:35,290 And so thanks to their generosity, we've got to bring people together. 5 00:00:36,130 --> 00:00:44,840 And also, if any students are interested in talking to company afterwards, I think very often don't do that. 6 00:00:45,610 --> 00:00:52,299 Before I introduce speaker, I just want to say that afterwards there's going to be some, um, microphones for asking questions. 7 00:00:52,300 --> 00:00:58,030 So just raise your hand. And I think as if you were asking questions about here. 8 00:00:58,330 --> 00:01:01,660 Probably. Um, yeah. Let's just leave out these. 9 00:01:02,440 --> 00:01:06,370 Okay. So I'm an enormous. Um, thank you for listening. 10 00:01:07,510 --> 00:01:11,200 Jim is a professor at Columbia, professor of computer science. 11 00:01:11,560 --> 00:01:17,910 I was building something very prominent on the world stage, uh, ever since he was a PhD student, which is quite rare. 12 00:01:18,760 --> 00:01:23,500 Yeah. Uh, so he's interested in lots of different areas. 13 00:01:23,740 --> 00:01:29,590 The connections between, uh, computer science and genomics, all aspects of algorithms. 14 00:01:30,040 --> 00:01:33,910 And then coming back today, which is his recent works on blockchain. 15 00:01:34,780 --> 00:01:39,070 I selected a few of the works. Just, um, some of them. 16 00:01:39,310 --> 00:01:47,520 So Jeff is the recipient of, um, ACM Grace Murphy Corporate Award in 2009 for his work on Social Security. 17 00:01:48,040 --> 00:01:59,170 In 2012, he got a third prize. Uh, actually, um, also with UBS, um, for work on research that limit selfish internet use. 18 00:01:59,710 --> 00:02:03,730 He's the winner of the Social Choice and Welfare Prizes in 2014, 19 00:02:04,300 --> 00:02:08,890 which is 200 outstanding accomplishments in social choice, theory of love or economics. 20 00:02:09,460 --> 00:02:17,290 He's the recipient of the Kalai Prize in Game Computer Science in 2016 for his paper Intrinsic Robustness in the Price of Energy. 21 00:02:17,620 --> 00:02:23,229 He received a The Nine Fellowship in 2017 and ACM Awards 2017. 22 00:02:23,230 --> 00:02:26,980 Um, that's actually a pretty small selection, a much longer list. 23 00:02:27,550 --> 00:02:32,170 Um, you know, a lot of just, uh, talks, including, um, being in fact, 24 00:02:32,170 --> 00:02:36,010 most of the international conference competitions, but just, um, quite cutting edge. 25 00:02:36,280 --> 00:02:39,850 Here's the shock lecture world. No question. Thank society. 26 00:02:40,300 --> 00:02:45,820 Um, not so many people, probably many prepared for the many books that he's written. 27 00:02:45,820 --> 00:02:49,390 And, um, lectures, 30 minutes, I'm not sure. 28 00:02:49,630 --> 00:02:54,610 Especially a lot of the younger people here, though. Uh, present. 29 00:02:54,670 --> 00:02:58,510 So stock rooms that accompany the whole algorithm series. 30 00:02:58,840 --> 00:03:04,770 20 Lectures on Algorithmic Game theory. Um, beyond endorsements, analysis of algorithms to learn. 31 00:03:04,770 --> 00:03:09,430 We're really competition today. Um, so if you want to come and do that. 32 00:03:09,760 --> 00:03:14,680 Um, today's topics have been the double, um, blockchain. I mean, it's called the computer in the sky. 33 00:03:14,920 --> 00:03:24,740 So. Thank you. All right. 34 00:03:24,890 --> 00:03:29,720 Thank you, Professor Goldberg. Thanks, everybody, for coming out. Uh, great honour to be given today's stretchy lecture. 35 00:03:29,750 --> 00:03:34,780 When I got the invitation, I looked at the list of some of the previous speakers and had a fair number of heroes on that list. 36 00:03:34,910 --> 00:03:41,090 Really? I'm very grateful to be here. Particularly cool, actually, to have Professor Goldberg, uh, do the introduction, because now, 37 00:03:41,090 --> 00:03:44,990 almost 25 years ago, she was one of my earliest ever supporters as a graduate student. 38 00:03:45,320 --> 00:03:49,160 I was totally unknown. She was so nice to me, so supportive about my early work. 39 00:03:49,430 --> 00:03:52,460 I've never forgotten it. So, um, really happy to be here. 40 00:03:55,300 --> 00:03:59,140 So what I want to talk about today is a hard computer science problem. 41 00:03:59,310 --> 00:04:03,730 Okay. So I want to talk about some computing functionality I think would be really cool to have. 42 00:04:04,090 --> 00:04:09,340 And I want to explore some of the scientific and engineering challenges that come up when you try to actually realise that functionality. 43 00:04:09,850 --> 00:04:13,570 For an analogy you could imagine, like 60 years ago, 44 00:04:13,840 --> 00:04:19,660 having a vision of a technology that enabled any two people on the globe to communicate digitally pretty much instantaneously. 45 00:04:20,020 --> 00:04:24,790 And you can view today's internet as some kind of approximate realisation of that functionality. 46 00:04:25,240 --> 00:04:30,129 And if we think of the internet as kind of a neutral infrastructure for communication, you're neutral. 47 00:04:30,130 --> 00:04:34,570 I just mean there's no one sort of individual or company or nation state that controls the internet. 48 00:04:35,020 --> 00:04:38,200 In some sense I want the analogue of that for computation. 49 00:04:38,250 --> 00:04:41,800 Okay. So a neutral infrastructure for general purpose computation. 50 00:04:42,400 --> 00:04:47,230 So the functionality I'm interested in is that of a computer. 51 00:04:48,880 --> 00:04:54,670 General purpose. It's going to be a computer that, you know, has its own operating system, its own processor, uh, its own storage. 52 00:04:55,160 --> 00:05:00,040 Um, but unlike, you know, a typical computer which has is owned by an individual or a corporation, 53 00:05:00,040 --> 00:05:05,080 there's someone that has root access to that computer. I want a computer with none of those properties. 54 00:05:05,260 --> 00:05:11,950 No owner, no operator. In some sense, it just runs, if you will, in the sky, where we can all view its computations. 55 00:05:12,100 --> 00:05:20,280 And it just sort of runs there on its own as a public good. I want us to have very high confidence in its computations. 56 00:05:20,290 --> 00:05:24,520 It'll carry out the intended computations correctly and for a very long period of time. 57 00:05:24,970 --> 00:05:28,510 Uh, no one should be able to shut the computer down or otherwise tamper with it. 58 00:05:28,900 --> 00:05:32,870 And I want it to be open access in the sense that anybody can use it and hereby use it. 59 00:05:32,890 --> 00:05:38,230 I mean, maybe you interact with software that's already, uh, installed on that program on that computer. 60 00:05:38,440 --> 00:05:41,920 Maybe you install your own programs yourself, uh, for others to use. 61 00:05:42,250 --> 00:05:45,850 So this is the this is the functionality I'm interested in building out. 62 00:05:46,060 --> 00:05:52,540 Okay. So almost all of the talks are just going to be about some of the challenges that come up in trying to realise that. 63 00:05:52,550 --> 00:05:56,020 Another obvious question would be like, suppose you could build it, what would you do? 64 00:05:56,030 --> 00:06:02,930 And this is something I'm happy to talk about at great length after the talk, but it's not going to be the focus of the talk except on this one slot. 65 00:06:03,680 --> 00:06:09,230 There's a bunch of different ways I could answer this question. Um, let me give you one answer that I think is really kind of, um, 66 00:06:09,290 --> 00:06:14,599 under discussed in popular discourse around this technology and the sort of one thing I want you to 67 00:06:14,600 --> 00:06:19,310 remember from today's talk is that blockchain protocols in the applications built on top of them, 68 00:06:19,490 --> 00:06:24,410 it's not fundamentally about cryptocurrencies or about prices or about finance. 69 00:06:24,620 --> 00:06:27,979 It's really a technology that's about something, in my opinion, much more profound, 70 00:06:27,980 --> 00:06:35,360 which is stronger notions of ownership of property rights than we've traditionally had for things we might buy or creates in the digital realm. 71 00:06:35,780 --> 00:06:44,549 So let me sort of unpack that a little bit. The story starts with a gift given to us now almost 50 years ago by public key cryptography. 72 00:06:44,550 --> 00:06:47,900 And specifically, I want to focus on secure digital signature schemes. 73 00:06:48,320 --> 00:06:52,940 Uh, because I claimed if you look at the cryptographic guarantees offered by digital signatures, 74 00:06:53,240 --> 00:06:59,500 there's kind of a loose analogy to the types of property rights that we're accustomed to for physical items, right? 75 00:06:59,510 --> 00:07:00,919 Like think about real estate, for example. 76 00:07:00,920 --> 00:07:06,830 Like imagine you buy a house or you get various property rights along with that, you get the, uh, right to use, okay. 77 00:07:06,860 --> 00:07:09,680 So you can live in your house if you want. No one can stop you. 78 00:07:10,220 --> 00:07:15,860 Uh, and that's a little bit like how sort of digital signatures no one can, if you're the owner of a public key. 79 00:07:15,870 --> 00:07:19,009 Should have said that at the outset. I want to think about cryptography. 80 00:07:19,010 --> 00:07:27,260 Is letting you own a 512 bit string okay. Representing a public key or by own, I mean, you mean you own, you know, the corresponding private key. 81 00:07:27,770 --> 00:07:35,690 So analogous to the right to use. Right. No one can stop you from producing valid digital signatures because you know the corresponding private key. 82 00:07:36,380 --> 00:07:42,320 Another property, right you might have is the right to exclude. So no one else can live in your house unless you unless you want them to. 83 00:07:42,650 --> 00:07:45,260 And so there there's a loose analogy with digital signatures. 84 00:07:45,260 --> 00:07:49,249 No one can forge signatures on your behalf because they're not the owner of that public key. 85 00:07:49,250 --> 00:07:52,130 They do not know the corresponding private key. 86 00:07:52,550 --> 00:07:58,430 Okay, so I claim we can think of digital signatures as allowing individuals to own, in a meaningful sense. 87 00:07:58,580 --> 00:08:03,990 Public keys. Bits of five of length 512. Let's go back to that computer in the sky. 88 00:08:04,020 --> 00:08:09,750 Capable of general purpose computation, certainly capable of verifying digital signatures. 89 00:08:10,290 --> 00:08:13,919 And now a sort of, frankly kind of trivial idea, actually, 90 00:08:13,920 --> 00:08:19,559 is to extend the notion of ownership of 512 bit strings that we already have to ownership 91 00:08:19,560 --> 00:08:24,540 in the same sense of arbitrary digital data that lives on this computer in the sky. 92 00:08:24,910 --> 00:08:28,469 Right. The idea conceptually, is trivial, a program running on that computer. 93 00:08:28,470 --> 00:08:32,129 Maybe it's the computer's native operating system. Maybe it's a user installed program. 94 00:08:32,130 --> 00:08:39,090 Running in that operating system just can create a pair, okay, data X coupled with a public PPK. 95 00:08:39,630 --> 00:08:46,910 And we can now just define ownership of arbitrary digital data X as inheritance as inherited from the associated public key. 96 00:08:46,920 --> 00:08:47,430 In other words, 97 00:08:47,640 --> 00:08:56,550 you would own X if you're capable of producing a digital signature that witnesses your knowledge of the course of the private key corresponding to PK. 98 00:08:56,910 --> 00:09:03,930 Okay, so I claim if you combine cryptography with the computer in the sky, you get a notion of user owned data, uh, up there in this sense. 99 00:09:04,440 --> 00:09:07,860 Now what is that good for. And again, happy to talk about this a great length afterwards. 100 00:09:08,280 --> 00:09:12,870 But you know, the upshot is it's interesting to own data on this computer in this sense. 101 00:09:13,140 --> 00:09:19,770 If it allows you to do something you could not otherwise do, perhaps on the computer in the sky, 102 00:09:19,830 --> 00:09:24,090 interacting with other programs on it, it lets you do something you couldn't otherwise do. 103 00:09:24,090 --> 00:09:30,630 Maybe you can exchange your ownership of some data X for ownership of some other data Y that you're curious about. 104 00:09:31,140 --> 00:09:36,420 Or maybe it lets you do something you couldn't otherwise do in the physical world, right? 105 00:09:36,420 --> 00:09:42,510 A simple example would just be representing sort of one of a limited number of membership cards that, I don't know, get you in the cool parties. 106 00:09:42,510 --> 00:09:47,460 That allows you to have a special communication channel, sort of with some musician for their superfans, whatever. 107 00:09:48,030 --> 00:09:52,980 The point I want to make is that once you have the right mental model for what this technology is trying to achieve, 108 00:09:53,190 --> 00:09:56,639 it is not very hard to brainstorm lots of different things you might do with it. 109 00:09:56,640 --> 00:10:00,930 And again, I'm happy to talk more about that in the Q&A. Okay. 110 00:10:01,940 --> 00:10:07,700 So moving on to some of the engineering challenges. I've optimised the talk for breadth over depth. 111 00:10:08,090 --> 00:10:12,560 So we're going to talk about three different, quite different, uh, threads in my recent research over the past few years. 112 00:10:13,100 --> 00:10:17,990 Because of because of all of them, I'll have to, you know, just touch each one, um, pretty lightly. 113 00:10:18,470 --> 00:10:22,670 Uh, two themes I want you to look at, I want you to look out for as we go through them. 114 00:10:23,180 --> 00:10:28,159 First of all, how amazingly interdisciplinary it is to work in this application domain. 115 00:10:28,160 --> 00:10:33,290 And I say, someone you know, I say that as someone who's almost his entire career has done interdisciplinary work. 116 00:10:33,470 --> 00:10:38,990 This is a totally different level I've never seen before, even the different parts of computer science that this technology touches on. 117 00:10:38,990 --> 00:10:41,750 There areas that historically had basically nothing to do with each other. 118 00:10:41,990 --> 00:10:45,889 The areas of economics that it touches historically have had nothing to do with each other. 119 00:10:45,890 --> 00:10:49,040 And there's other disciplines as well that you're not you're not going to see today. 120 00:10:49,790 --> 00:10:57,319 The second theme I want you to look out for is just the incredibly rapid pipeline from just purely theoretical work to practical impact, 121 00:10:57,320 --> 00:11:04,040 which is probably as short as I've ever seen in my career. It's been actually kind of wild working in this area for the last few years. 122 00:11:04,040 --> 00:11:07,340 So I'll tell you a couple of anecdotes about that as we go along. Okay. 123 00:11:09,050 --> 00:11:14,920 All right, so the first thing I want to talk about, I want to zoom in on the desired functionality of being open access, 124 00:11:14,930 --> 00:11:18,500 the idea that anyone can, can use, uh, this computer in the sky. 125 00:11:19,040 --> 00:11:22,550 And in particular, I want to ask, how do we make that economically sustainable? 126 00:11:22,730 --> 00:11:26,630 Okay. So demand for processing on that computer exceeds the supply. 127 00:11:26,900 --> 00:11:30,650 What do we do? That brings us to the idea of a transaction fee mechanism. 128 00:11:30,710 --> 00:11:32,960 Okay. So that's what we're going to study in this in this first part. 129 00:11:33,800 --> 00:11:37,700 So you really don't need to know much about sort of how blockchain protocols work. 130 00:11:37,880 --> 00:11:41,960 Um, for now, like fundamentally, what's the responsibility of such a protocol? 131 00:11:42,320 --> 00:11:46,460 It needs to maintain a running sequence of transactions. 132 00:11:46,790 --> 00:11:52,520 Again, I want you to think of this as a general purpose computer. So I want you to think of a transaction as a little snippet of code. 133 00:11:52,550 --> 00:11:59,870 Let's say Assembly language code. Low level code. A sequence of transactions is then just sort of a long sequence of assembly line assembly 134 00:11:59,870 --> 00:12:03,770 language instructions that are carried out in this computer in this virtual machine. 135 00:12:04,040 --> 00:12:10,520 Okay. So transactions get added to this running log in batches, which are known as blocks. 136 00:12:10,850 --> 00:12:14,960 And all that matters for us right now is that blocks are only produced every so often. 137 00:12:15,320 --> 00:12:20,010 And they're only so big. I'm going to use the Ethereum blockchain protocol as a running example. 138 00:12:20,030 --> 00:12:25,700 It will become obvious why I'm doing that later. In Ethereum you get a new block, a new batch every 12 seconds. 139 00:12:25,760 --> 00:12:28,940 A typical block might have something like 200 transactions in it. 140 00:12:29,430 --> 00:12:34,310 Okay. And if you're doing the math in your head and you're saying that's only like 16 transactions per second, that doesn't sound like it's very much. 141 00:12:34,640 --> 00:12:37,820 That's because it's not very much. Okay. At least for the Ethereum protocol. 142 00:12:38,000 --> 00:12:41,480 Yes, it's a computer, but it's a very slow computer. Okay. 143 00:12:42,440 --> 00:12:50,150 And so for the most popular blockchain protocols, certainly including Ethereum and Bitcoin, uh, the demand is much higher than the supply. 144 00:12:50,330 --> 00:12:58,940 Many more than 16 transactions per second would be like, would like to be executed in the virtual machine that the Ethereum protocol maintains. 145 00:12:59,390 --> 00:13:06,200 And so the job of a transaction fee mechanism then is to decide which transactions get in and which transactions don't. 146 00:13:06,470 --> 00:13:09,680 And in particular, the ones that get in, what price do they have to pay. 147 00:13:10,100 --> 00:13:14,659 So that's the job of a transaction fee mechanism. What would such a mechanism look like? 148 00:13:14,660 --> 00:13:20,620 Potentially. Let me tell you about the one that's been in operation, Bitcoin, for its entire 15 years of existence. 149 00:13:21,010 --> 00:13:25,810 It's also the one that was used in Ethereum until about two and a half years ago, called the first price auction. 150 00:13:26,440 --> 00:13:33,160 Uh, very simple, very natural. So if we take seriously this idea of this sort of computer in the sky as a public good, 151 00:13:33,670 --> 00:13:40,000 if it can't execute everything well, we should sort of allocate it fully and on the most valuable transactions. 152 00:13:40,090 --> 00:13:44,170 Okay. How is this computer supposed to know which transactions are valuable and which ones aren't? 153 00:13:44,530 --> 00:13:52,510 You ask users to bid. Okay, so a user of a blockchain protocol would submit, along with the transaction, an amount that they would willing to pay. 154 00:13:52,750 --> 00:13:55,930 Should that transaction be included for execution. 155 00:13:56,230 --> 00:13:59,740 Okay. So then the way a lot of blockchain protocols work, 156 00:13:59,920 --> 00:14:04,390 there's sort of a unilateral sort of one actor that's chosen, you know, usually by winning a lottery. 157 00:14:04,660 --> 00:14:08,710 One actor is chosen to put together one of these blocks of, say, 200 transactions. 158 00:14:09,100 --> 00:14:15,490 And when they do that, all 200 users that hand their transactions included their bids get transferred to the to the block producer. 159 00:14:15,640 --> 00:14:18,920 Okay. So users submit transactions that they want to see executed. 160 00:14:18,940 --> 00:14:20,409 They say how much they're willing to pay, 161 00:14:20,410 --> 00:14:26,200 and that gets paid to the block producer that chooses their transaction instead of others to include in the block. 162 00:14:26,470 --> 00:14:31,150 Okay, so what do I mean by a first price auction? And you know, this is worked reasonably well. 163 00:14:31,150 --> 00:14:34,060 Like I said, Bitcoin's used it for 15 years. Ethereum use it for many years. 164 00:14:34,660 --> 00:14:39,430 Uh, if you've studied any auction theory at all, like if you've ever heard of a Vickrey auction or a second price auction, 165 00:14:39,700 --> 00:14:43,660 one thing you know about first price auctions is that it's non-trivial to figure out how to bid. 166 00:14:44,620 --> 00:14:47,590 If I ask you to pay exactly the amount that you say you're willing to pay. 167 00:14:48,520 --> 00:14:52,330 You're never going to bid your maximum willingness to pay your indifference point. 168 00:14:52,700 --> 00:14:57,160 Okay. You're going to shade your bid. You can say, well, I'd be indifferent at not at $10. 169 00:14:57,400 --> 00:15:00,400 Maybe I'll bid $9. Or maybe I'll bet $8. 170 00:15:01,110 --> 00:15:04,370 Or you have to sort of think about how stiff the competition is. 171 00:15:04,380 --> 00:15:07,560 The stiffer the competition, the less you can get away with shaving your bit. 172 00:15:07,830 --> 00:15:10,650 So that's some difficulty in bidding on a first price auction. 173 00:15:10,770 --> 00:15:15,060 We've known that for decades, and being in the blockchain setting doesn't make it any easier. 174 00:15:16,040 --> 00:15:19,370 Like I said, if you studied a little action theory, you're aware that there are other formats. 175 00:15:20,180 --> 00:15:23,270 If you just change, for example, from a first price auction to a second price auction, 176 00:15:23,540 --> 00:15:27,970 all of a sudden you have this magical truthfulness property, which is you just you should just be honest. 177 00:15:27,980 --> 00:15:32,900 You should just tell the auction, you know what? This is my indifference point. This is the most I'd ever be willing to pay for this item. 178 00:15:33,320 --> 00:15:40,460 And then basically, the second price auction promises to underbid for you on, on, on your behalf, given what everyone else bids. 179 00:15:40,580 --> 00:15:45,930 And for that reason, you can sort of just delegate the under bidding process to the mechanism and you'll never regret it. 180 00:15:45,950 --> 00:15:50,370 That's what I mean by truthful auction. So you can ask the question with transaction fee mechanisms. 181 00:15:50,440 --> 00:15:55,420 Could we have a better transaction fee mechanism, one in which it's easier for users to figure out what they should bid, 182 00:15:55,690 --> 00:15:59,170 and in particular, why not just use something like a second price auction? 183 00:15:59,640 --> 00:16:03,340 Yeah. So this is the question I want to understand in this first part of the talk. 184 00:16:04,720 --> 00:16:08,220 And I hope this feels like a reasonably fundamental question, right? 185 00:16:08,290 --> 00:16:11,439 A popular blockchain protocol. Some stuff gets in, some stuff gets out. 186 00:16:11,440 --> 00:16:15,460 You have to make those decisions. You have to figure out pricing should seem like a basic problem, I hope. 187 00:16:16,180 --> 00:16:24,999 Uh, at the time I did this work, which was mostly in 2020, it was also a very timely question specifically for the Ethereum blockchain protocol, 188 00:16:25,000 --> 00:16:32,080 which then and now is the second biggest blockchain after after Bitcoin, uh, in 2020, the Ethereum community, 189 00:16:32,140 --> 00:16:38,320 um, was assessing a potential switch of its transaction fee mechanism from the first price auction. 190 00:16:38,320 --> 00:16:44,530 I just showed you to a very different one, which I'll show you in a couple slides called EIP 1559. 191 00:16:44,800 --> 00:16:50,590 And it made many big changes. Some people in the Ethereum community thought it was a really cool idea. 192 00:16:50,920 --> 00:16:54,969 Some people hated it. Some people just a lot of people just didn't understand it. 193 00:16:54,970 --> 00:16:55,990 So we're sort of scared of it. 194 00:16:56,740 --> 00:17:03,490 And the Ethereum community prides itself on just kind of having a strong sort of social consensus to sort of in the spirit of open source software. 195 00:17:03,880 --> 00:17:08,170 So whenever they have kind of a polarising issue, like should they switch the transaction fee mechanism or not, 196 00:17:08,440 --> 00:17:11,620 they try to come to some kind of rough consensus among the stakeholders. 197 00:17:12,070 --> 00:17:17,470 So this was a community outreach survey. So Tim Baker is sort of a major figure in the Ethereum community. 198 00:17:17,920 --> 00:17:22,329 Uh, and this was again in 2020. And so he sent out this survey to various stakeholders. 199 00:17:22,330 --> 00:17:26,260 Right. So, uh, sent it to various miners. This was back in the proof of work days of Ethereum. 200 00:17:26,500 --> 00:17:29,740 Send it to people who run exchanges, send it to people who build wallets, software. 201 00:17:29,770 --> 00:17:34,719 Right. Which sort of prepare bids with transactions and asked all of these stakeholders, you know, 202 00:17:34,720 --> 00:17:39,340 how do you feel about maybe switching the transaction fee mechanism over to in 1559? 203 00:17:39,730 --> 00:17:46,120 And if you're nervous or unhappy about it, what can we as a community do to make you feel more comfortable with that switch? 204 00:17:46,450 --> 00:17:53,529 Okay. And many things were mentioned on the survey, but what I want to highlight here was the issue of a lack of a formal specification 205 00:17:53,530 --> 00:17:57,940 or proof for the mechanism that people can independently evaluate and critique. 206 00:17:59,410 --> 00:18:05,110 Now, mind you, like none of the respondents, not not a single responding to the survey is someone who writes research papers, right? 207 00:18:05,110 --> 00:18:09,040 These are all people building stuff in the Ethereum trenches. 208 00:18:09,370 --> 00:18:16,199 But there is a very obvious hunger from this group for the kind of work that my community does in algorithmic game theory, 209 00:18:16,200 --> 00:18:22,750 or that you can't community the rigorous assessment of the incentive properties of, uh, of an auction of a mechanism. 210 00:18:22,990 --> 00:18:28,750 So that, no doubt, was sort of a major motivator for me to spend several months of my life in 2020, uh, doing this work. 211 00:18:28,960 --> 00:18:32,970 Okay. All right. 212 00:18:34,180 --> 00:18:37,650 So what I want to talk about next is like, why is this even a hard problem? Right? 213 00:18:37,660 --> 00:18:40,510 Like, why can't you just, you know, look at a textbook, 214 00:18:40,510 --> 00:18:46,839 my textbook or some other mechanisms on textbook and copy paste some cool mechanism like the Vickrey auction or the VCG mechanism? 215 00:18:46,840 --> 00:18:51,330 If you know about that, why can't you just copy paste that into a blockchain protocol and call it a day and you're like, you're done. 216 00:18:51,340 --> 00:18:57,610 You get this nice truthful mechanism. And the reason and this will be a theme through all of these sort of three parts of the talk. 217 00:18:58,000 --> 00:19:03,610 Is that a problem that feels like a, well, sell problem actually needs to be revisited from first principles because of the 218 00:19:03,610 --> 00:19:09,309 idiosyncratic constraints imposed on you by the permissionless nature of blockchain systems. 219 00:19:09,310 --> 00:19:14,680 Okay, we're going to see that here. I'm going to see it again in the other two parts of the talk. So in this on this slide, 220 00:19:14,680 --> 00:19:20,530 I want to walk through what are some of those weird challenges that don't come up in traditional mechanism design applications, 221 00:19:20,800 --> 00:19:26,980 which is going to motivate a couple of new types of incentive constraints that we're going to want a mechanism to satisfy. 222 00:19:27,160 --> 00:19:31,620 Okay, so I started with something I'm calling challenge zero because this challenge has it's, 223 00:19:31,630 --> 00:19:37,150 you know, it's non-trivial but has nothing to do with blockchains per se. It's just always about mechanism design and mechanism design. 224 00:19:37,150 --> 00:19:43,420 What are you doing? You're eliciting preferences from the participants so that you can make a good decision in some sense. 225 00:19:43,870 --> 00:19:49,599 And whenever you deploy a mechanism, you are always worried about the participants misrepresenting their preferences 226 00:19:49,600 --> 00:19:53,080 in order to for the mechanism to compute an outcome which is better for them. 227 00:19:53,410 --> 00:19:58,090 Okay, already, like in a first place auction under bidding is in some sense misrepresenting your preferences. 228 00:19:58,640 --> 00:20:05,320 So there's standard language and mechanism designed to say that you want a mechanism which is not gambol in the sense you might call it truthful, 229 00:20:05,330 --> 00:20:09,110 or also sometimes called a technically that's dominant strategy, incentive compatible. 230 00:20:09,380 --> 00:20:12,390 But don't worry about it. That just means like a second price option. 231 00:20:12,410 --> 00:20:15,980 You may as well just report your true value. Okay, so that's challenge zero. 232 00:20:16,160 --> 00:20:19,160 That's exactly what a first price option does not have. Okay. 233 00:20:19,160 --> 00:20:22,970 So we would like a mechanism which does have this property. But. 234 00:20:23,950 --> 00:20:26,720 What else? Something that's much different. 235 00:20:26,740 --> 00:20:31,660 The traditional mechanism design allocations is if you think about who it is, who's carrying out the mechanism. 236 00:20:31,990 --> 00:20:35,170 Right. If you're selling kind of like wireless licenses, you know, 237 00:20:35,170 --> 00:20:38,770 it's kind of like the national government, government that's sort of running the mechanism. 238 00:20:38,770 --> 00:20:43,570 And you genuinely kind of trust that they're going to run the mechanism that they promised they were going to run, more or less. 239 00:20:44,230 --> 00:20:52,150 Now, in the blockchain protocol, in the blockchain context, I mentioned that a block is assembled by one actor chosen at random using a lottery. 240 00:20:52,390 --> 00:20:55,630 Okay. And they can choose whatever 200 transactions they want. 241 00:20:56,110 --> 00:21:02,950 They can include anybody they want. They can exclude anybody they want. They can even actually inject their own shill transactions if they want. 242 00:21:03,310 --> 00:21:10,840 Okay. So you can, you know, as the mechanism designer, write down on your pad like, uh, this mechanism would have like really great properties. 243 00:21:11,170 --> 00:21:14,860 Right. But again, the block producer can throw it out if it's not in their own best interests. 244 00:21:15,100 --> 00:21:21,400 So you need to have a transaction fee mechanism, which is in the interest of the block producer to run it as you intend. 245 00:21:21,820 --> 00:21:24,070 It should be profit maximising for that block producer. 246 00:21:24,400 --> 00:21:28,240 So that's going to lead to our it's going to be part of our second type of incentive compatibility constraint. 247 00:21:29,890 --> 00:21:35,530 The other part of it I see is what I said about shill bids. The fact that the block producer can just inject whatever it wants. 248 00:21:36,250 --> 00:21:41,320 And if you think about it, that then tells you why we cannot copy paste a second price option into this problem. 249 00:21:41,620 --> 00:21:48,040 Right. So if you're allegedly running a second price auction and you get to put in your own bid after you see everybody else's, 250 00:21:48,760 --> 00:21:52,510 write the high bids, $10, you're going to have a show bid of $9.99. 251 00:21:52,630 --> 00:21:56,860 Okay. So really, the second price auction really devolves into a first price auction. 252 00:21:56,860 --> 00:22:01,390 Anyways, when in this block, producer has this power to insert shill bids after the fact. 253 00:22:01,720 --> 00:22:06,190 So that'll be the other part of MC. We want there to be no incentive to insert shill bids of this form. 254 00:22:07,950 --> 00:22:11,939 Finally, uh, in a blockchain context, if possible, 255 00:22:11,940 --> 00:22:16,560 you would like to worry about collusion a little bit more than you normally do in traditional mechanism design applications. 256 00:22:16,710 --> 00:22:22,710 In traditional applications, you kind of like, you know, make all of the participants kind of sign a legal contract promising that they won't collude. 257 00:22:22,710 --> 00:22:26,640 And if you catch people colluding, you kind of like hand them off to the to the rule of law in effect. 258 00:22:27,120 --> 00:22:30,060 And it's not that that's impossible in a blockchain context, but it's really harder. 259 00:22:30,360 --> 00:22:36,239 So if you can design a mechanism where the mechanism design itself already discourages any kind of collusion, 260 00:22:36,240 --> 00:22:40,319 for example, between the users and the block producer, that's a property you would like to have. 261 00:22:40,320 --> 00:22:44,780 And that's going to be our final set of incentive compatibility constraints, which are going to be called OCA proof. 262 00:22:44,790 --> 00:22:47,430 This OCA here stands for off chain agreements. 263 00:22:49,110 --> 00:22:54,900 The question then is going to be, is there a transaction fee mechanism that has all of these properties against the incumbents? 264 00:22:55,170 --> 00:23:00,210 The first place auction. It turns out it does satisfy both of these two types of constraints, 265 00:23:00,510 --> 00:23:06,710 not unrelated to why it's been so successful so far in a blockchain context, and it very much does not satisfy truthfulness. 266 00:23:06,720 --> 00:23:10,800 Again, in a first price auction, you always want to shade below your maximum willingness to pay. 267 00:23:11,280 --> 00:23:14,550 So the question is, can we actually get all three of these at the same time? 268 00:23:15,060 --> 00:23:20,700 Sure. So in the next couple of slides, I have formal definitions of the two new and available constraints. 269 00:23:20,710 --> 00:23:26,620 Again. So remember that's the same as truthful. So there should just be no incentive for users to misrepresent their preferences. 270 00:23:26,950 --> 00:23:31,540 There's two novel ones. I'm not going to go through them in detail, given what I've told you in English. 271 00:23:31,780 --> 00:23:36,070 These are the math definitions that any of you would write down anyways. So I'm not going to go through them carefully. 272 00:23:36,430 --> 00:23:40,060 I just want to point out that each of these two new types of incentive compatibility constraints 273 00:23:40,240 --> 00:23:44,800 kills off one of the mechanism design formats you'd be tempted to copy paste from a textbook. 274 00:23:45,170 --> 00:23:51,730 MSI we already talked about. This is what kills off the second price auction, because those kinds of auctions are easy to manipulate using show bids. 275 00:23:52,450 --> 00:23:54,429 What about okay proof? This is the one. 276 00:23:54,430 --> 00:24:00,889 So remember MSI says it should be profit maximising for the block producer to run the mechanism that you want them to OCA proof. 277 00:24:00,890 --> 00:24:05,410 And that says a cartel of users in the block producer should not be able to do better off by sort of in a, 278 00:24:05,650 --> 00:24:08,410 in a coordinated way, deviating from what they're supposed to be doing. 279 00:24:09,190 --> 00:24:14,559 And so there what this kills off is the use of a reserve price, at least of the use of a reserve price. 280 00:24:14,560 --> 00:24:23,170 And the most obvious way to see what I mean. Suppose you tried to have a transaction fee mechanism where you forced a price floor at ten sort of £10. 281 00:24:23,350 --> 00:24:28,240 You said to get a transfer, to be eligible for inclusion, you have to pay £10 for your transaction. 282 00:24:29,150 --> 00:24:34,190 Right. But now imagine, like it turns out there's a bunch of people that want to be included, but it's only worth £7 to all of them. 283 00:24:34,520 --> 00:24:40,250 Okay, so if everybody did what they're supposed to be doing, there would be an empty block because no one bid high enough to get it. 284 00:24:41,000 --> 00:24:44,090 But now, if you think about it, there's a real incentive to cut sort of a deal on the side. 285 00:24:44,140 --> 00:24:51,240 Right. The block producer could just say, hey, everybody, make sure you bid £10 on chain so you're eligible for inclusion and I'll refund you. 286 00:24:51,260 --> 00:24:55,970 Let's call it three and a half pounds off Shane, and we'll all be better off than we would have been otherwise. 287 00:24:56,210 --> 00:24:59,930 Okay, so reserve prices are the casualty of this. Uh, okay. 288 00:24:59,940 --> 00:25:03,679 Previous, uh, condition. Okay. All right. 289 00:25:03,680 --> 00:25:11,480 So those are three types in compatibility robustness to manipulation by users, by block producers and by coalitions of users and block producers. 290 00:25:11,720 --> 00:25:15,020 So we can assess any transaction fee mechanism. Now according to these dimensions. 291 00:25:15,290 --> 00:25:18,889 Let me tell you about the challenger for Ethereum. Yep. 292 00:25:18,890 --> 00:25:22,620 5059. And let's assess it according to those three dimensions IP. 293 00:25:22,640 --> 00:25:25,910 If you're wondering that stands for Ethereum Improvement Proposal. 294 00:25:26,210 --> 00:25:30,620 This is the way by which changes to the Ethereum protocol get vetted and discussed in that community. 295 00:25:31,620 --> 00:25:35,609 So what does this new transaction fee mechanism? Three pretty big changes. 296 00:25:35,610 --> 00:25:37,500 And it's important that they all happen together. 297 00:25:37,620 --> 00:25:40,620 Again this is one of the things that kind of freaked people out because it seemed like a lot of moving parts. 298 00:25:41,160 --> 00:25:44,940 First of all, I'm going to sound crazy. There's going to be a reserve price. 299 00:25:45,300 --> 00:25:49,530 Okay. This is going to be called, in this context, a base fee. But wait, I said, okay. 300 00:25:49,530 --> 00:25:53,250 Prudence kills off any possible use of reserve price. All right. 301 00:25:53,850 --> 00:26:01,590 It kills off the obvious way of using a reserve price, in which the revenue raised by the reserve price is passed on to the block producer. 302 00:26:01,920 --> 00:26:03,780 As you would sort of expect it to. Okay. 303 00:26:04,110 --> 00:26:11,910 It turns out if you redirect revenue generated by a reserve price anywhere else, actually, you can get that property. 304 00:26:12,120 --> 00:26:16,620 Okay. And in 50, 59, they're going to do the simplest way of directing it somewhere else, 305 00:26:16,770 --> 00:26:22,130 which is literally money that's used to pay the reserve price fee is simply burned, if you like. 306 00:26:22,140 --> 00:26:27,540 It's literally just removed from the circulating supply of the native cryptocurrency removed from the supply. 307 00:26:28,140 --> 00:26:32,580 Okay. So a reserve price or base fee with all revenues from it being burned. 308 00:26:33,480 --> 00:26:37,110 Now. Next question is how do you know what the reserve price should be? Should it be a dollar? 309 00:26:37,110 --> 00:26:42,690 Should be $10 should be $100. How would you know? Ideally you would like a market clearing price where supply hits demand. 310 00:26:42,900 --> 00:26:46,620 You'd like the price at which the number of transactions willing to pay. 311 00:26:46,620 --> 00:26:52,410 That price is exactly 200. Goes up the block completely. And moreover, it's the most valuable transactions that you're including. 312 00:26:52,890 --> 00:26:56,580 You don't know. You know your supply. You don't know the demand. So you don't have the market clearing price. 313 00:26:56,730 --> 00:27:01,050 So it's just going to do local search trying to find it. Okay. The current reserve price looks too high. 314 00:27:01,320 --> 00:27:04,440 It's going to do a local adjustment, make it a little bit lower and vice versa. 315 00:27:04,800 --> 00:27:08,340 Okay. How do you know whether the current reserve price looks too high or too low? 316 00:27:08,910 --> 00:27:14,280 Well, rather than having a hard cap of 200 transactions per per block, like I said before. 317 00:27:14,580 --> 00:27:19,260 So, uh, let's be a little more flexible. Let's make that a soft capacity rather than a hard capacity. 318 00:27:19,740 --> 00:27:26,250 We'll let you go up to say like 400 transactions if you really need to, but the target is still 200 transactions in a block. 319 00:27:26,760 --> 00:27:33,870 And so now the target guides the local search. If a block is produced with 300 transactions, it seems like your price is too low. 320 00:27:33,900 --> 00:27:40,020 So you better raise it. If the last block has only 100 transactions, it seems like you're making things too expensive, so you lower your price. 321 00:27:40,320 --> 00:27:42,840 Uh, for the auction. That happens 12 seconds later. 322 00:27:43,170 --> 00:27:51,210 Okay, so you have a reserve price adjusted through local search, using past block sizes as the unchanged signal for which way to adjust. 323 00:27:51,420 --> 00:27:57,430 Okay. Finally, you're going to have a built in sort of emergency first price auction. 324 00:27:57,940 --> 00:28:05,620 So users have the option if they want to pay above and beyond the reserve price, above and beyond the base fee, 325 00:28:06,120 --> 00:28:12,730 uh, in which case that extra amount is passed on to the block producer like bids were in a first price auction. 326 00:28:13,000 --> 00:28:17,920 Okay. And as we'll see, this is kind of in there to gracefully degrade into a first price auction. 327 00:28:18,130 --> 00:28:22,420 If the reserve price happens to be very, very different than what you'd like it to be, the market clearing price. 328 00:28:22,930 --> 00:28:30,580 Okay. So this is the EIP 5059. This is the quite a bit more sophisticated one they were considering switching to back in back in 2020. 329 00:28:31,060 --> 00:28:34,180 And the question I want to ask is is this better than a first price auction. 330 00:28:34,210 --> 00:28:39,280 And again I've set up the rules by which we mean better. I mean does it satisfy more of those incentive compatibility constraints? 331 00:28:39,610 --> 00:28:44,620 Disick MC and okay, let's review the incumbent first price auction. 332 00:28:44,920 --> 00:28:48,489 As we said, definitely not truthful. You always want to shade your bid. Easy to check. 333 00:28:48,490 --> 00:28:50,680 It satisfies the other two constraints. Okay. 334 00:28:51,220 --> 00:28:57,370 And one thing that's kind of annoying, actually, in a way, about the first price auction, it's like there's a million different applications, 335 00:28:57,370 --> 00:29:02,319 a million different problems having nothing to do with blockchains, where that would be a totally reasonable mechanism to use. 336 00:29:02,320 --> 00:29:02,530 Right? 337 00:29:02,530 --> 00:29:08,620 It sort of is totally insensitive to the fact that you're working in this sort of blockchain environment where it's thinking about like IP 5059, 338 00:29:08,620 --> 00:29:13,660 right? You're doing this stuff you would never do otherwise. You'd never think of this mechanism just like randomly, right? 339 00:29:14,080 --> 00:29:19,180 Like, first of all, you're using the fact that there's an auction every 12 seconds and the auction outcome is public. 340 00:29:19,540 --> 00:29:25,090 So you could just have this very natural local search procedure that takes what happens 12 seconds ago to inform the current base of it. 341 00:29:25,330 --> 00:29:30,879 All right. So that's using the repeated nature of these auctions. And then there's also that burning of the base fee revenues, 342 00:29:30,880 --> 00:29:35,470 which would be very hard to do in a credible way in sort of a traditional mechanism design application. 343 00:29:36,010 --> 00:29:43,959 And in exchange, as a reward for taking advantage of these unique kind of, you know, the enlarged design space you have in the blockchain setting, 344 00:29:43,960 --> 00:29:49,240 you do get a Parado improvement in the incentive compatibility compared to the incumbent compared to the first price auction. 345 00:29:49,570 --> 00:29:53,050 It's a little less trivial, but it is in fact still mesi and OCA proof. 346 00:29:53,710 --> 00:29:57,820 It's not the case that it's always truthful, but it is the case that it's usually truthful. 347 00:29:57,830 --> 00:30:01,510 What do I mean by that? Well, if the reserve price is zero. 348 00:30:02,760 --> 00:30:07,560 Then it's exactly the same as the first price auction, if you think about it. Okay, so the first price auction isn't truthful. 349 00:30:07,710 --> 00:30:10,800 Neither is this mechanism when the reserve price is zero. 350 00:30:10,830 --> 00:30:14,100 Assuming there's more than 400 transactions that are willing to get in for free. Okay. 351 00:30:14,370 --> 00:30:18,240 In general, if the reserve price is way too low, you revert back to this first price auction. 352 00:30:18,540 --> 00:30:26,910 But you can show that is the only event in which you will not be a truthful mechanism, as long as the base fee is semi calibrated. 353 00:30:26,940 --> 00:30:30,719 Right. If it's at least kind of high enough that there's at most 400 transactions, 354 00:30:30,720 --> 00:30:35,240 a double full block willing to pay that reserve price, then in fact, it's a truthful mechanism. 355 00:30:35,250 --> 00:30:38,280 Everybody sort of pays the reserve price. Everybody's happy. Okay. 356 00:30:38,880 --> 00:30:42,770 So in most cases it is in fact, uh, it is in fact, Disick. 357 00:30:43,110 --> 00:30:47,999 I proposed a variant of this mechanism where the usually moved from the D6 to the OCA. 358 00:30:48,000 --> 00:30:53,340 Proof. So always in my c always Disick usually oca proof with the exact same edge case. 359 00:30:54,060 --> 00:30:57,180 Uh, so that's fine. Of course you want the best of all worlds. 360 00:30:57,180 --> 00:31:00,630 You'd like to have. Always all three of these. Uh, but it turns out. 361 00:31:01,800 --> 00:31:07,120 Let's skip that back. Um, there's a very recent result. 362 00:31:07,120 --> 00:31:12,130 This is, um, joint with Elaine. She, uh, of Carnegie Mellon and her PhD student, Hao Chung. 363 00:31:12,220 --> 00:31:16,060 In fact, you cannot get all three all the time, so you need to make some kind of compromise. 364 00:31:16,360 --> 00:31:19,659 So both the IP 1559 and the variant that I suggested in some sense live on 365 00:31:19,660 --> 00:31:23,200 the Parado frontier of incentive compatibility of transaction fee mechanisms. 366 00:31:23,480 --> 00:31:26,790 Okay. So to tie this back to kind of the Ethereum story. 367 00:31:27,790 --> 00:31:36,220 Um. The epiphany 59 was indeed adopted in early August of 2021, and it's been running happily ever since for the past two and a half years. 368 00:31:36,820 --> 00:31:41,379 Uh, I released my initial report. It's like a 58 page report, kind of for the general public on December 1st. 369 00:31:41,380 --> 00:31:47,020 Immediately, the people from the community we've been talking about, you know, they sort of realised it immediately. 370 00:31:47,020 --> 00:31:53,920 Vitalik Buterin Oh, I forgot to say that. Yep. 5059 the designer of that mechanism is Vitalik Buterin, who's also the, uh, co-founder of Ethereum. 371 00:31:54,460 --> 00:32:00,760 Uh, so he and Tim Baker both sort of immediately commented on it as the day to the switchover in early August 21st approached, 372 00:32:01,030 --> 00:32:05,200 my report was kind of the standard reference people use to explain to a public audience, 373 00:32:05,410 --> 00:32:11,830 you know, what you should expect to be solved and what would not be solved by, uh, switching over to this, um, transaction fee mechanism. 374 00:32:12,220 --> 00:32:18,340 So now it's just been running happily ever since. There's even sort of dashboards which sort of track the amount of East it's been burned to date, 375 00:32:18,610 --> 00:32:21,399 kind of because of the base fee revenues and, and so on and so forth. 376 00:32:21,400 --> 00:32:26,680 So it's definitely kind of a, uh, um, it's a it's a critical part of today's Ethereum blockchain protocol. 377 00:32:29,940 --> 00:32:33,690 So that was the segment where I wanted to zoom in on this notion of the open access property. 378 00:32:33,810 --> 00:32:36,750 Right, and how to make this computer in the sky economically sustainable. 379 00:32:37,140 --> 00:32:42,150 Next, I want to sort of drill down on like, what do I actually mean when I say that has no owner, has no operator? 380 00:32:42,150 --> 00:32:49,049 Like what does that mean literally? And what it means literally is that actually it's not a single computer, but it's lots, 381 00:32:49,050 --> 00:32:57,330 maybe thousands of computers scattered all over the globe that, through cooperation, are simulating the state of a virtual machine. 382 00:32:57,750 --> 00:33:04,440 So this computer in the sky isn't physical. It lives in the simulation of thousands of physical nodes down on Earth. 383 00:33:04,890 --> 00:33:09,600 Those nodes need to stay in sync about the current state of that virtual machine. 384 00:33:09,870 --> 00:33:12,990 And they do that by running what's called a consensus protocol. 385 00:33:13,770 --> 00:33:19,380 So this is going to be joint work with Andy Lewis Pi, who's just down the road London School of Economics I just had lunch with him yesterday. 386 00:33:19,980 --> 00:33:26,160 Um, and sort of all my papers these days, when I release them, I usually have sort of a corresponding, uh, tweet storm. 387 00:33:26,160 --> 00:33:27,930 And I know that's like kind of a gimmick. 388 00:33:28,170 --> 00:33:33,360 On the other hand, it seems like it's been, like, super helpful for people like you haven't talked to academics and they'll say, 389 00:33:33,360 --> 00:33:37,559 yeah, you know, I went to this conference, you know, the formal talks were like, not that helpful. 390 00:33:37,560 --> 00:33:41,070 But like the quarter conversations were like, great. I went just for the quarter conversations. 391 00:33:41,070 --> 00:33:45,870 And these are the quarter conversations just made available to everybody, which sort of seems like a positive thing. 392 00:33:45,870 --> 00:33:49,769 I think this is the longest research paper I've ever co-authored, 75 pages. 393 00:33:49,770 --> 00:33:52,860 I know some of you have written longer ones, but for me, this is my record. 394 00:33:53,270 --> 00:33:57,870 Um, so this is the only time I've ever had to do a recursive tweet storm, a tweet storm, a tweet storms. 395 00:33:57,870 --> 00:34:01,440 So still, I think a lot faster to read than the full 75 page paper. 396 00:34:01,770 --> 00:34:06,120 Obviously, I'll only be able to tell you a little bit about sort of the results in here in the time that I have. 397 00:34:06,390 --> 00:34:14,620 Okay. So just like in the mechanism design section, it's not the traditional mechanism wasn't useful. 398 00:34:14,620 --> 00:34:20,890 It was we built on it, but we needed to extend it because of these sort of permissionless nature of blockchain environment. 399 00:34:21,190 --> 00:34:26,169 Exact same thing will happen here. Design and analysis of consensus protocols. 400 00:34:26,170 --> 00:34:34,420 Classic topic okay, brilliant work in the 1980s Turing Awards given to Barbara Liskov, Leslie Lamport and we will be building on that theory. 401 00:34:34,690 --> 00:34:41,230 But again, because of the permissionless nature of blockchain protocols, that theory has to be significantly extended. 402 00:34:41,470 --> 00:34:42,970 Okay. So I want to tell you about that next. 403 00:34:44,010 --> 00:34:49,350 Be so again since census protocol, keep a whole bunch of nodes in sync with each other on the state of a virtual machine. 404 00:34:50,100 --> 00:34:55,800 The minimal properties you would want of any consensus protocol is liveness and consistency. 405 00:34:56,010 --> 00:34:59,270 Liveness, meaning as long as there's work to get done, it should get done. 406 00:34:59,280 --> 00:35:01,950 If there are pending transactions, they should eventually be processed. 407 00:35:02,430 --> 00:35:07,770 Consistency means like really the nodes got to stay in sync, but they should not have fundamental disagreements about, 408 00:35:07,770 --> 00:35:11,790 for example, what instructions have been executed in this in this virtual machine? 409 00:35:12,120 --> 00:35:14,010 Okay. Why is that a hard problem? 410 00:35:15,120 --> 00:35:19,980 Two of the traditional challenges are, first of all, some of these nodes running the protocol may not do so correctly. 411 00:35:20,130 --> 00:35:25,020 Who knows why. Maybe they crash. Maybe they have buggy software. You know, maybe they're controlled by a malicious actor. 412 00:35:25,020 --> 00:35:28,350 Who knows, they may deviate from what you you expecting them to do? 413 00:35:29,190 --> 00:35:33,170 Same thing with the communication network. Okay, so it may be a good day for the internet. 414 00:35:33,180 --> 00:35:39,720 It may be a bad day for the internet. It's hard to predict exactly how long messages will be delayed before received by the intended recipient. 415 00:35:40,020 --> 00:35:47,669 Okay. And in my opinion, one of the reasons why the traditional, um, theory of consensus protocols has been so successful, 416 00:35:47,670 --> 00:35:51,210 there's this really rich, deep literature over the last 40 years or so. 417 00:35:51,750 --> 00:35:57,090 I really give a lot of credit to the pioneers in the 1980s for setting things up as kind of like 418 00:35:57,090 --> 00:36:01,980 a playground for theorists who just have a lot of fun with and with clear measures of progress. 419 00:36:02,100 --> 00:36:09,180 Okay. In particular, if you take these two types of challenges, nodes that are not doing the right thing, the network not being as good as you'd like. 420 00:36:09,690 --> 00:36:13,440 They took each of those challenges and they parameterised them using hierarchies. 421 00:36:13,620 --> 00:36:20,160 So in effect, they almost like gave you ladders where like better papers just meant sort of climbing further up these ladders. 422 00:36:20,340 --> 00:36:24,300 Okay, let's take the faults. Okay. Nodes that deviate from what you want them to do. 423 00:36:24,600 --> 00:36:28,040 You've got very benign types of faults. Like maybe there's just like a hardware failure. 424 00:36:28,050 --> 00:36:31,230 It crashes. You never hear from it again. Okay, that's relatively benign. 425 00:36:31,800 --> 00:36:36,990 At the other extreme, you know, maybe like a hostile nation state has taken over one of your nodes and is actually 426 00:36:36,990 --> 00:36:40,470 just doing everything they can to mess up your protocol be a Byzantine fault. 427 00:36:41,190 --> 00:36:47,729 Same thing with the communication network. You've got very easy models, like the synchronous model, where you can assume that guaranteed, 428 00:36:47,730 --> 00:36:53,250 every message will be received within, you know, whatever. One second, five seconds, you've got the intermediate model, 429 00:36:53,250 --> 00:37:00,090 the partially synchronous model where synchrony may be interrupted with periodic, periodic but finite length denial of service attacks. 430 00:37:00,450 --> 00:37:03,929 And then you've got the sort of, you know, challenge level of the asynchronous model, 431 00:37:03,930 --> 00:37:07,440 where the only thing you assume about message delivery is that eventually that happens. 432 00:37:07,680 --> 00:37:07,950 Okay. 433 00:37:08,190 --> 00:37:15,390 So obviously positive results get more and more impressive, sort of the weaker the assumptions and vice versa for for protocols for positive results. 434 00:37:16,630 --> 00:37:21,790 Now, when you pass to permissionless consensus protocols as familiar from Bitcoin and Ethereum, 435 00:37:22,480 --> 00:37:26,160 there's more challenges than just the network and faulty nodes. 436 00:37:26,250 --> 00:37:30,790 Okay. So one challenge is you don't know who's running your protocol. 437 00:37:31,330 --> 00:37:38,560 Okay, so any of us today could go download some software spin up and start contributing to the Bitcoin protocol. 438 00:37:38,590 --> 00:37:41,860 Literally today it's been running for 15 years. Doesn't matter. 439 00:37:41,860 --> 00:37:47,710 It doesn't care. You could join the party right now. That's not a property historically consensus protocols have ever wanted to have. 440 00:37:48,430 --> 00:37:53,800 Secondly, you've got what's known as the Sybil problem. Given that anyone can sort of join the party and start running the protocol, 441 00:37:54,040 --> 00:37:58,270 anyone can also join multiple times under multiple identifiers, and you'd never know the difference. 442 00:37:58,570 --> 00:38:00,760 So one participant can masquerade as many. 443 00:38:01,120 --> 00:38:05,170 So, for example, if you're trying to do something like voting, that gets a lot more complicated with symbols. 444 00:38:05,470 --> 00:38:09,640 If you've ever heard about like proof of work or proof of stake. Those are techniques to control the Sybil problem. 445 00:38:10,330 --> 00:38:14,740 What I want to talk about today is this third additional challenge of getting the permission of the setting, 446 00:38:15,040 --> 00:38:18,189 which is that there may be nodes which will not faulty. 447 00:38:18,190 --> 00:38:21,489 There's nothing wrong with them. They're periodically inactive. Okay. 448 00:38:21,490 --> 00:38:24,640 So someone, you know, running a node to contribute to the Bitcoin protocol. 449 00:38:24,850 --> 00:38:28,990 Maybe they turn it off when they go out of town for the weekend. They come back Monday morning, turn it on again. 450 00:38:29,290 --> 00:38:34,720 And even though the amount of fluctuation, even though sorry, the amount of participation and who's running the protocol fluctuates. 451 00:38:34,990 --> 00:38:39,610 You would like the protocol to just, uh, hum along, uh, you know, come along perfectly. 452 00:38:40,120 --> 00:38:47,590 Okay. And in the same way that in the 1980s, the challenges of false and unreliable communication were parameterised through hierarchies. 453 00:38:47,830 --> 00:38:51,100 In this work, we do exactly the same thing with inactivity. 454 00:38:51,100 --> 00:38:55,300 So we have a hierarchy of degrees of sort of stronger and stronger assumptions. 455 00:38:55,390 --> 00:38:59,020 We allow protocols to make about the activity of honest participants. 456 00:38:59,970 --> 00:39:04,020 So I'll show you the hierarchy on this slide. There's going to be four levels. 457 00:39:04,810 --> 00:39:09,780 Uh, the top level is going to be the one where you make the weakest assumptions. So that's gonna be the hardest one to design protocols in. 458 00:39:10,110 --> 00:39:14,280 And then the assumptions and protocols allowed to make is going to get stronger as we go down the slide. 459 00:39:14,820 --> 00:39:19,020 So you you sort of have a back. So you expect to see lots of impossibility results up here. 460 00:39:19,290 --> 00:39:23,940 Hopefully you expect to see lots of positive results like protocols that have good properties down here. 461 00:39:24,270 --> 00:39:30,060 Okay. So first level we call the fully permissionless or PHP model. 462 00:39:30,780 --> 00:39:36,750 And here literally the protocol is not allowed to assume anything at all about who is running it right now. 463 00:39:36,900 --> 00:39:43,719 No idea. Okay. And if you ask, I think people in the 20th century working on consensus protocols, 464 00:39:43,720 --> 00:39:46,870 whether any nontrivial functionality would be possible under that assumption. 465 00:39:47,080 --> 00:39:50,170 My guess is everyone would have said, no way, that's crazytown. Okay. 466 00:39:50,580 --> 00:39:55,120 Turns out Bitcoin protocol under reasonable assumptions actually is provably consistent in live. 467 00:39:55,240 --> 00:39:58,420 Knowing literally nothing about who's running it at any given moment in time. 468 00:39:58,900 --> 00:40:04,890 Just still to this day, I find kind of incredible. All right, so that's the most demanding, fewest assumptions setting. 469 00:40:04,900 --> 00:40:12,240 You know nothing about who's running a protocol. Slightly stronger assumption is what we're going to call the Da or dynamically available setting. 470 00:40:12,540 --> 00:40:17,190 Okay. And so here there will be this list of identifiers known to our protocol. 471 00:40:17,430 --> 00:40:22,079 Okay. And those of you that know a little bit about this area canonically this would be, you know, 472 00:40:22,080 --> 00:40:27,090 people who have locked up in escrow some stake in a staking contract of a proof of stake blockchain protocol. 473 00:40:27,120 --> 00:40:28,500 That would be this list of identifiers. 474 00:40:29,100 --> 00:40:35,760 So you're the protocol is allowed to assume that whoever is running the protocol has an identifier in this known list. 475 00:40:36,000 --> 00:40:39,899 Okay. So you're not going to have like a block proposal from someone you've never heard of, for example. 476 00:40:39,900 --> 00:40:44,880 But you could in Bitcoin. On the other hand, you know, maybe people on that list have gone away for the weekends. 477 00:40:45,600 --> 00:40:48,870 Okay. So it's a necessary condition but not a sufficient condition for participation. 478 00:40:49,260 --> 00:40:55,300 Uh, that your identifier is in that list. Third level is the quasi permissionless or QP setting. 479 00:40:56,020 --> 00:41:03,550 So we again have this list of identifiers. But now membership in that list is both necessary and sufficient for participation. 480 00:41:03,730 --> 00:41:10,209 So in other words, when you're designing a protocol, you are allowed to assume that all of the identifiers you know about controlled by 481 00:41:10,210 --> 00:41:13,320 honest players actually are doing what they're supposed to be doing right now. 482 00:41:13,330 --> 00:41:15,810 You can count on that. They are not away for the weekend. Okay. 483 00:41:17,490 --> 00:41:22,640 And then finally we have the classical permission model, where at the time of protocol deployment, you know exactly who all of the nodes are. 484 00:41:22,650 --> 00:41:26,280 They're going to say stay the same all the time, and nobody ever turns the machines off. 485 00:41:26,490 --> 00:41:26,820 Okay. 486 00:41:27,390 --> 00:41:35,080 And if you want to map these onto protocols you might have heard about, as we said, Bitcoin operates even in the PHP setting dynamically available. 487 00:41:35,100 --> 00:41:39,960 This is in hindsight we can take proof of stake longest chain protocols like say Cardano. 488 00:41:40,140 --> 00:41:44,130 And in hindsight, they're kind of built to work properly, even in a dynamically available setting. 489 00:41:44,740 --> 00:41:49,860 Um, if you've heard of like the Pbft protocol or things based on them, like say, Algorand, again, in hindsight, 490 00:41:49,860 --> 00:41:55,770 we can say and also actually today's Ethereum, in hindsight, they're meant to work in the quasi permissionless setting. 491 00:41:56,010 --> 00:42:00,360 Okay. Now in our paper, we show that all. 492 00:42:00,370 --> 00:42:05,390 We show that this hierarchy is strict. Okay, so we have separations every level of this hierarchy. 493 00:42:05,410 --> 00:42:10,240 We show that there are things that are possible which are provably impossible at the level above. 494 00:42:10,570 --> 00:42:15,640 Okay. And from our results, it would appear that the really again there's a separation everywhere. 495 00:42:15,820 --> 00:42:20,380 But the really big separation the chasm would appear to be between D and QP. 496 00:42:20,410 --> 00:42:25,090 So I just want to add a little bit, add a little bit of colour about that chasm in the middle of the hierarchy. 497 00:42:25,360 --> 00:42:31,660 Okay. So again lots of stuff which you'd like to do, which you can do in the q p setting, which is impossible in the day setting. 498 00:42:32,050 --> 00:42:38,170 Okay. So two slides, one about the negative results in the Da setting, one about the positive results in the q p setting. 499 00:42:40,420 --> 00:42:43,950 All right. So dynamically available setting. So again remember what this means. 500 00:42:43,960 --> 00:42:48,490 So if you want to think about a proof of stake protocol. But here's the protocol knows this list of identifiers. 501 00:42:48,490 --> 00:42:51,520 Membership in the list is necessary but not sufficient for participation. 502 00:42:51,670 --> 00:42:56,050 So there's still going to be like wildly fluctuating participation. Some people might have gone away for the weekend. 503 00:42:56,620 --> 00:43:02,890 Uh, let me say three things you would like a protocol to have that you cannot have in the dynamically available setting. 504 00:43:02,900 --> 00:43:08,770 You cannot have them, no matter how clever you are with your protocol. Okay. All right. 505 00:43:09,520 --> 00:43:12,700 So property number one okay asynchronous safety. 506 00:43:13,000 --> 00:43:16,930 So this means if you have a network partition or there's sort of a long denial of service attack, 507 00:43:17,320 --> 00:43:20,350 you would like to obviously not have a consistency violation. 508 00:43:20,650 --> 00:43:28,480 But if you actually remain live in the dynamically available setting, you cannot also remain consistent in the presence of network partitions. 509 00:43:29,020 --> 00:43:32,770 If you have like a really specific protocol in mind, like, you know, Bitcoin or longest chain protocol, 510 00:43:32,800 --> 00:43:36,820 you can probably kind of see in your head why you're going to lose consistency with the network partition. 511 00:43:36,950 --> 00:43:40,060 That's not the point here, right? There's no concrete protocol on the slot. 512 00:43:40,360 --> 00:43:46,000 Right. The point here is to show that fundamentally, any protocol, no matter what it looks like, 513 00:43:46,300 --> 00:43:52,710 if it's live in the dynamically available setting, you must suffer consistency violations under network partitions. 514 00:43:52,900 --> 00:43:58,450 That's the context of this first result. Okay. Second property you'd like to have something called accountability. 515 00:43:58,610 --> 00:44:03,130 Okay. Which is should you suffer from a consistency violation should nodes get out of sync. 516 00:44:03,550 --> 00:44:10,880 You would like to know why. You would like to know which nodes deviated from intended behaviour that led to that consistency violation. 517 00:44:10,900 --> 00:44:14,380 That's called accountability. You get a smoking gun for who's responsible. 518 00:44:14,680 --> 00:44:19,570 That is also impossible in the day setting. Finally optimistic responsiveness. 519 00:44:19,870 --> 00:44:22,410 This is like an instance optimal version of liveness. 520 00:44:22,540 --> 00:44:28,270 So this just says like if the network's having a particularly good day to day, like it's five times faster than usual. 521 00:44:28,540 --> 00:44:32,420 Your protocol should also, uh, react five times faster than usual. 522 00:44:32,620 --> 00:44:36,120 So transactions should be confirmed faster as the network gets faster. 523 00:44:36,130 --> 00:44:43,100 Also provably impossible in the day setting. Turns out all of this is true even if there are no, uh, faulty nodes. 524 00:44:43,120 --> 00:44:48,580 Okay? Merely the possibility of fluctuating participation triggers all of these impossibility results. 525 00:44:50,410 --> 00:44:55,330 Positive results on the QPR setting. So again, here a protocol can assume quite a bit more. 526 00:44:55,690 --> 00:45:02,710 It can assume that not only is there this list of identifiers, but membership in that list is sufficient for active participation. 527 00:45:03,160 --> 00:45:06,070 So any identifier you see in there, which is controlled by an honest player, 528 00:45:06,250 --> 00:45:09,790 you can count on them being online doing what they're supposed to be doing. 529 00:45:10,060 --> 00:45:14,740 Okay. Like those of you who do know a little bit about Ethereum may know that there's sort of slashing. 530 00:45:14,740 --> 00:45:20,620 You can actually lose funds for inactivity. Okay, so if you're running an Ethereum validator, you're supposed to be like voting on stuff all the time. 531 00:45:20,860 --> 00:45:24,790 If you missed your turn for a while, you'll actually start losing some of the stake that you've pledged. 532 00:45:24,820 --> 00:45:31,150 There's economic penalties for being inactive, and that to be kind of like a, uh, sort of a light bulb going off. 533 00:45:31,510 --> 00:45:35,470 Uh, I guess Ethereum is really motivated to be in the quasi permissionless setting. 534 00:45:35,500 --> 00:45:40,630 They really need all the people they know about to be active. That's exactly what you're punished when you're inactive in Ethereum. 535 00:45:41,600 --> 00:45:46,280 All right. So you obviously need some control over how powerful the adversary is. 536 00:45:46,280 --> 00:45:51,710 And so here the magic threshold is going to be a third. So you need less than a third of the voting power to be controlled by an adversary. 537 00:45:52,370 --> 00:45:57,950 Uh, but in that setting you actually get all three of those properties. Okay. So you can guarantee that you have consistency. 538 00:45:58,220 --> 00:46:03,950 So there's a single protocol that achieves all three of these guarantees. Asynchronous safety doesn't matter what the network does. 539 00:46:03,950 --> 00:46:10,640 You will never have a consistency violation if there's a consistency violation because the adversary is bigger than the one third threshold, 540 00:46:10,850 --> 00:46:15,140 there's a smoking gun and you can figure out who they are, possibly for punishment later. 541 00:46:15,560 --> 00:46:21,560 And then finally, it does automatically speed up the speed of transaction confirmation as the network speeds up as well. 542 00:46:21,900 --> 00:46:24,630 Okay, so this is what I mean by the chasm between D and CuPy. 543 00:46:24,710 --> 00:46:30,110 Again, the separations at all levels of the hierarchy, but the starkest one is the one I've summarised in these two slots. 544 00:46:30,650 --> 00:46:36,360 Okay. So, um, obviously there's a lot more results in the paper I'm not gonna have time to talk about, honestly. 545 00:46:36,570 --> 00:46:40,470 I mean, we like the we like the results in those papers, in this paper, 546 00:46:40,650 --> 00:46:44,040 but we're really hoping that this becomes sort of the standard model and the standard 547 00:46:44,040 --> 00:46:47,460 language for lots of people to prove cool results about permissionless consensus. 548 00:46:47,700 --> 00:46:52,110 So we have a new paper also with Eric Buddhist, which is about slashing and proof of stake protocols. 549 00:46:52,410 --> 00:46:55,049 It's a paper that like, feels like should have been written a long time ago, 550 00:46:55,050 --> 00:46:58,530 and I think the only reason it's being written now is because the previous paper I just told 551 00:46:58,530 --> 00:47:02,940 you about gives us the language and the framework to make it to state theorems of this form. 552 00:47:02,970 --> 00:47:06,540 Okay. When does slashing work in a proof of stake protocol? So hopefully much more. 553 00:47:06,810 --> 00:47:10,320 I'd love to see maybe some people in this room. Uh, follow on on this. 554 00:47:10,320 --> 00:47:13,830 The style of work. All right. 555 00:47:13,840 --> 00:47:17,840 So for the last part. I said at the beginning, right. 556 00:47:17,990 --> 00:47:24,950 The one thing I wanted you to remember from this talk is that the technology is not fundamentally about cryptocurrency, about prices, about finance. 557 00:47:25,610 --> 00:47:32,840 That said, at the application layer, there are some really intellectually interesting things happening with financial applications. 558 00:47:33,080 --> 00:47:41,270 I want to tell you about some of my work there. Last. This is going to be joint, uh, with Jason Jonas, who's my PhD student at Columbia Cmac. 559 00:47:41,270 --> 00:47:46,910 And well, let me who who's my colleague in the business school, Columbia, Anthony Lisanne, who's a professor of finance at University of Chicago. 560 00:47:46,970 --> 00:47:50,630 Booth. All right. So. 561 00:47:51,650 --> 00:47:56,530 Right. We talked about property rights for sort of digital assets. Once you have ownership, right. 562 00:47:56,770 --> 00:47:59,250 Another property right is the right to transfer. Okay. 563 00:47:59,340 --> 00:48:04,540 If you have transferable digital assets, you have that you have you have the possibility of exchange of markets. 564 00:48:04,690 --> 00:48:14,170 Okay. Very simply maybe you want to exchange some cryptocurrency like say eath for some stablecoin, something pegged to the dollar like Usdc. 565 00:48:15,010 --> 00:48:20,740 And, um, we need a mechanism. We would like a mechanism to enable exchange of different digital assets. 566 00:48:21,490 --> 00:48:26,110 And once again, why is this a hard problem? Right? Doesn't this happen like, every single day? 567 00:48:26,110 --> 00:48:32,670 Like on every single stock exchange? Yes it does. The usual mechanism you would use in that context would be a limit order book. 568 00:48:32,710 --> 00:48:37,990 So you would have standing by standing sales, you'd have sort of matching logic to sort of unclear trades as they come in. 569 00:48:38,620 --> 00:48:42,220 Uh, and again, it's just it's not a perfect match for the blockchain environment. 570 00:48:42,490 --> 00:48:48,460 Okay. For two reasons. Um, reason number one, at least for Ethereum, at least for the older blockchain protocols, 571 00:48:48,880 --> 00:48:53,410 uh, implementing a limit order book is computationally relatively expensive. 572 00:48:53,560 --> 00:48:59,620 Okay. You have to maintain this. You have to maintain the order book. You have to do the matching, uh, more modern blockchain protocols. 573 00:48:59,620 --> 00:49:03,310 You know, probably they have enough computational power to do this. Well, but Ethereum does not. 574 00:49:03,610 --> 00:49:05,590 So that motivated looking at some different designs. 575 00:49:06,100 --> 00:49:15,850 Uh, secondly, to secondly, limit order books, whether you're in a blockchain environment or otherwise work really well in thick markets. 576 00:49:15,850 --> 00:49:19,360 We have lots of people on both sides of the market's interested in trading right to trade. 577 00:49:19,360 --> 00:49:22,870 In a limit order book you need a counterparty. You want to buy it. You need someone to sell okay. 578 00:49:23,260 --> 00:49:29,470 And ETH, Usdc. I mean that's actually a very thick market. But a lot of the digital assets that people want to exchange are thin markets. 579 00:49:29,620 --> 00:49:32,680 And the worry would be that there's just usually not a counterparty to trade. 580 00:49:33,220 --> 00:49:44,050 Okay. So both of these two issues have motivated, uh, the popularity of automated market makers or amms, on blockchain protocols such as Ethereum. 581 00:49:44,210 --> 00:49:50,110 Okay. So what's an Am? It's a market that in some sense is always open for business. 582 00:49:50,320 --> 00:49:57,520 At any given moment in time, an arm will quote you a spot price, and we'll be willing to take either side of a trade at that spot price. 583 00:49:57,670 --> 00:50:03,639 Right? So like, you know, $3,000 for eath that'll be willing to buy it at the margin for that or sell it at the margin at that. 584 00:50:03,640 --> 00:50:08,110 Either way. Okay. And, um. Uh. 585 00:50:08,340 --> 00:50:12,300 Good. So the next question you might say is, okay, that's fine. 586 00:50:12,540 --> 00:50:16,460 So I guess the market maker, like the Am, is itself acting as the counterparty. 587 00:50:16,470 --> 00:50:18,960 So I guess, you know, there's always a counterparty there that's always open for business. 588 00:50:19,500 --> 00:50:24,659 But then you might be like, like, why does this arm even have any of these assets in the first place? 589 00:50:24,660 --> 00:50:27,270 Like where did it get its eath from? Where did it get this Usdc from? 590 00:50:27,810 --> 00:50:34,890 And the answer is that those are provided via deposits by entities that are here called liquidity providers or LPs. 591 00:50:34,960 --> 00:50:40,620 Okay. So so these are people who hand over their hard earned cryptocurrencies two and a m so the traders can trade with them. 592 00:50:41,370 --> 00:50:46,790 Why would you do that. What's because you earn a share of the trading fees by depositing in this way. 593 00:50:46,800 --> 00:50:50,280 So it's a way to earn yield on your assets by depositing them in one of these. And that's. 594 00:50:51,240 --> 00:50:54,240 All right. Well, if I put it that way, like, why would you not do that? 595 00:50:54,270 --> 00:50:57,990 Right? If you just get sort of free money, like, on your assets by putting them in the pool? 596 00:50:58,350 --> 00:51:00,060 Sounds like a good idea. Why would you not do that? 597 00:51:01,350 --> 00:51:08,159 And the, you know, so the the issue with market making and again this is not specific to the blockchain protocols, 598 00:51:08,160 --> 00:51:11,820 but the reason market making is scary is because of adverse selection. 599 00:51:12,030 --> 00:51:17,189 Okay. Whenever someone's willing to trade with you like there should be this like fear in the back of your mind that like, 600 00:51:17,190 --> 00:51:20,550 maybe you know something that I don't like. Maybe there's a reason they're trading. 601 00:51:20,730 --> 00:51:24,809 Okay. And actually, if you think about it for for purely on chain exchanges. 602 00:51:24,810 --> 00:51:30,630 Right. Like a Uniswap that's a, that's a popular name of a one of these automated market makers like Uniswap on Ethereum. 603 00:51:30,900 --> 00:51:36,450 Uh, that's going to be happening a lot because remember what I said? I said there's a new batch of transactions every 12 seconds. 604 00:51:36,960 --> 00:51:40,560 So Ethereum's virtual machine stays frozen for 12 seconds at a time. 605 00:51:40,560 --> 00:51:47,310 And in particular, the spot price of an arm, which is a function just of the state of the virtual machine, stays frozen for 12 seconds. 606 00:51:47,820 --> 00:51:50,640 A lot can happen on the open markets and 12 seconds, right? 607 00:51:50,970 --> 00:51:56,840 And so generally, often there is an arbitrage opportunity with each Ethereum block that gets that gets added. 608 00:51:57,000 --> 00:51:59,910 So for example, if on the open market so that the price went up, 609 00:52:00,450 --> 00:52:04,919 all of a sudden there's an arbitrage opportunity where you buy from the AMM at the stale low 610 00:52:04,920 --> 00:52:09,510 price and then flip it on a centralised exchange at the current higher open market price. 611 00:52:09,750 --> 00:52:16,710 Okay. So that profitable arbitrage comes at the expense of the liquidity providers are playing a zero sum game in effect. 612 00:52:17,110 --> 00:52:21,780 All right. So that is the cost of providing liquidity to an adverse selection costs. 613 00:52:22,200 --> 00:52:32,939 Uh which goes to arbitrage right. So, um, if you're thinking about possibly helping or assessing, you know, uh, I mean, in general making markets, 614 00:52:32,940 --> 00:52:34,800 these are the two things you're sort of weighing with each other, 615 00:52:35,010 --> 00:52:39,959 the fees you're making and the average selection costs you're suffering fees pretty easy to think about. 616 00:52:39,960 --> 00:52:45,330 They're generally just sort of a percentage of volume, because you can sort of know how to think about how much revenue you might be making. 617 00:52:45,780 --> 00:52:51,000 It was the adverse selection costs where people didn't really have a good mental model for how to think about that in an EMS, 618 00:52:51,240 --> 00:52:58,379 let's say, in 2022, when we did this, when we did this work. And so what we propose here is we propose what is now the industry standard for 619 00:52:58,380 --> 00:53:02,730 thinking about the adverse selection costs suffered by liquidity providers of EMS, 620 00:53:02,940 --> 00:53:08,520 something known as loss versus rebalancing, or LVR, which most people pronounce as lever. 621 00:53:08,820 --> 00:53:12,780 Okay, so let me just tell you a little about lever. There's two versions. 622 00:53:12,780 --> 00:53:16,709 First, I'll tell you one, which is really for empirical analysis. This is really meant to do sort of forensics. 623 00:53:16,710 --> 00:53:21,060 You look back at trades that happened and you ask yourself, did I do well or did I not do well? 624 00:53:21,480 --> 00:53:25,410 The second version of lever is going to be for predicting the future and assessing, you know, 625 00:53:25,800 --> 00:53:31,230 do you think it's going to be profitable to be a deposit in a, to be an LP and an arm or not? 626 00:53:32,040 --> 00:53:35,579 So the first one, were you looking at the past? You just consider a sequence of trades. 627 00:53:35,580 --> 00:53:38,640 You say and trades. You focus on a single pool like eath for Usdc. 628 00:53:39,420 --> 00:53:44,309 Uh, and you just the definition is you just sort of do the thought experiment and you say, 629 00:53:44,310 --> 00:53:52,950 how much better off would I have been had I made the exact same trades that the Am forced me to do as a, as liquidity provider? 630 00:53:53,190 --> 00:53:56,790 Had I done those exact same trades but done them on the open market instead? 631 00:53:57,450 --> 00:54:00,450 Okay, so the ideas here are the quantities, right? 632 00:54:00,450 --> 00:54:06,270 So this is how much sort of aether the AMA forced you to buy or force you to sell, because a trader came in and took the opposite side. 633 00:54:06,690 --> 00:54:09,719 The piece here is the price on the open market and the Q's. 634 00:54:09,720 --> 00:54:14,400 Here are the possibly stale prices of the arm. You were forced to do the trade at this price. 635 00:54:14,400 --> 00:54:17,820 This would have been the price in the open market. So for example, right, 636 00:54:17,820 --> 00:54:24,210 if you were forced to sell by your amp and the you were forced to sell at a price which was less than what it was on the open market, 637 00:54:24,420 --> 00:54:27,440 that would be a loss and that would contribute to your lever. Okay. 638 00:54:27,460 --> 00:54:33,360 So trade by trade, your lever accumulates according to the counterfactual of how much better off you would have been on the open market. 639 00:54:33,720 --> 00:54:39,780 Okay, so that's obviously very easy to take the data. It's about as hard as like computing the volatility of an asset, something like that. 640 00:54:39,780 --> 00:54:45,750 I think there's already kind of first year MBAs in certain schools that have Excel spreadsheets that are computing this on various data sets. 641 00:54:46,410 --> 00:54:50,100 Um, you can characterise it, you know, so why is this a good definition? 642 00:54:50,100 --> 00:54:54,209 Well, one sanity check that you're on the right track is if you can characterise it in 643 00:54:54,210 --> 00:54:57,690 multiple different ways and you can here I've given you sort of interpretation, 644 00:54:57,690 --> 00:55:01,350 number one, which is this kind of thought experiment about trading on the open market. 645 00:55:01,620 --> 00:55:08,640 You can alternatively characterise it as the best sort of the maximum arbitrage profits that could have been attained from this profit of trades. 646 00:55:08,820 --> 00:55:12,400 Again, remember zero sum game between the LPs and the arbitrage years. 647 00:55:12,420 --> 00:55:16,490 So that's kind of a you're seeing that reflected here. Um, alternatively you know, 648 00:55:16,500 --> 00:55:21,149 if you prefer you can think of it as kind of the residual losses after you've Delta hedged 649 00:55:21,150 --> 00:55:25,110 out all of the market risk that you suffer by sort of leaping in one of these pools. 650 00:55:25,470 --> 00:55:31,170 So three different ways to think about lever, which I take as a sign that, you know, maybe it is the right concept or a fundamental concept. 651 00:55:32,760 --> 00:55:38,760 So let me just talk briefly about sort of predicting the future. So, you know, the lever is going to depend on the prices, right. 652 00:55:38,790 --> 00:55:45,060 We had a times P minus Q. Right. So obviously the way the PS move around is going to matter for what the lever is going to be going forward. 653 00:55:45,150 --> 00:55:48,720 If the price never moves, never any arbitrage there's going to be zero. 654 00:55:49,080 --> 00:55:53,850 The price moves a lot. You might think there's going to be lots of arbitrage and the lever is going to be high okay. 655 00:55:54,420 --> 00:55:58,530 So we need some model for future price movements. You could experiment with a lot of different things here. 656 00:55:58,560 --> 00:56:02,730 This is the first paper on the topics. We decided to start with. Just the obvious Black-Scholes model. 657 00:56:03,090 --> 00:56:07,950 Price moving according to geometric Brownian motion. Uh, if you don't know that model, that's okay. 658 00:56:07,950 --> 00:56:14,099 Just think of it as like a random walk in a way. Uh, and there's one really important parameter which is the volatility sigma, 659 00:56:14,100 --> 00:56:18,000 which you can think of is kind of like the typical step size in a random walk in the price. 660 00:56:18,240 --> 00:56:22,440 Okay. And so definitely as we discussed the lever will depend on the volatility. 661 00:56:22,470 --> 00:56:25,110 If sigma is zero and the price never moves it's going to be no lever. 662 00:56:25,320 --> 00:56:28,830 If sigma is high and is moving all over the place probably lots of opportunities. 663 00:56:29,250 --> 00:56:37,410 It's also going to depend on the arm okay. Because remember the formula we had a times quantity p minus q q is a function of which arm you're using. 664 00:56:37,740 --> 00:56:43,980 Okay. So different arm designs will give you different spot prices different cookies even for the same sequence of trades. 665 00:56:44,400 --> 00:56:49,200 So at a bare minimum we want a formula. It's going to have to depend on sigma and arm. 666 00:56:49,440 --> 00:56:53,280 We would like it to depend as minimally as possible in as interpretable way as possible. 667 00:56:54,370 --> 00:56:59,710 So here's the main result here. And I don't know if this seems like a clean formula to you or not. 668 00:56:59,770 --> 00:57:03,190 Probably depends on kind of your prior, but let me just unpack this a little bit. 669 00:57:03,650 --> 00:57:07,510 Okay, so now we're thinking continuous time right. So the price is moving all the time. 670 00:57:07,510 --> 00:57:11,430 So there's ARBs happening all the time. Lever accumulates trade by trade. 671 00:57:11,440 --> 00:57:12,940 So we have an end. That's why we have an integral. 672 00:57:12,940 --> 00:57:20,710 Now trading is happening all the time at a given moment in time the instantaneous accumulation of lever is given by this formula okay. 673 00:57:21,040 --> 00:57:25,839 As expected the volatility is in there. Expect volatility squared which is actually pretty severe. 674 00:57:25,840 --> 00:57:28,090 So that's that's uh it's actually pretty high. 675 00:57:28,420 --> 00:57:34,480 And then the choice of the AMM does show up, but only in a very kind of local way to this DX by DP term. 676 00:57:34,810 --> 00:57:41,260 So what this says is this is how much at equilibrium is the um, inventory of the risky asset of the AMM, 677 00:57:41,410 --> 00:57:44,980 the rate at which that changes as the open market price changes. 678 00:57:45,160 --> 00:57:50,290 Okay. So the marginal liquidity of the AMM at the current market price, that's what this is. 679 00:57:50,320 --> 00:57:55,389 All right. And if I had a little bit more time, I could literally just have a slide with a single triangle. 680 00:57:55,390 --> 00:57:57,690 And this would be like the area of that triangle. Okay. 681 00:57:57,700 --> 00:58:04,870 So I actually claim if you dig into this a little bit more, it's kind of about the simplest expression you could hope for, I think in hindsight, okay. 682 00:58:05,530 --> 00:58:11,679 In fact, it's so simple that you can even sort of evaluate and closed form what this is for some of the most popular AMS. 683 00:58:11,680 --> 00:58:16,300 So there's this so-called x y equals k curve made popular by Uniswap v1 and v2. 684 00:58:16,630 --> 00:58:19,780 Uh, and here you can actually just compute how leaver accumulates. 685 00:58:20,050 --> 00:58:24,250 And in fact, it doesn't even depend on where you are in the price range. It's uniform over the prices. 686 00:58:24,580 --> 00:58:31,120 If you normalise by the value of the pool, you get that the lever accumulates exactly as sigma squared over eight. 687 00:58:31,510 --> 00:58:35,680 Okay. And this is a sharp enough prediction. Right. You can really sort of plug in numbers here. 688 00:58:36,100 --> 00:58:40,750 Right. So if you take sigma equal to say 5% which would be kind of typical daily volatility for eath. 689 00:58:41,260 --> 00:58:46,240 Uh, and if you assume you're making 0.3% fees, which again in this part of the world is sort of pretty standard, 690 00:58:46,720 --> 00:58:55,660 then what you learn is that for LP to be profitable, about 10.5% of the pools assets should be traded on a daily basis. 691 00:58:55,960 --> 00:59:01,720 If the volume is higher than that, you're going to be profitable. If the volume is lower than that, you're going to be unprofitable. 692 00:59:02,110 --> 00:59:05,730 Okay. So you can really evaluate this quite crisply, quite precisely okay. 693 00:59:07,090 --> 00:59:10,890 And, uh, so we released this paper and started giving talks about it in August of 2022. 694 00:59:10,900 --> 00:59:17,140 And literally by the end of 2022, this was kind of industry standard for how people were sort of thinking about these adverse selection costs. 695 00:59:17,350 --> 00:59:18,549 So this was a few weeks later, 696 00:59:18,550 --> 00:59:23,110 just a couple of sort of leaders in decentralised finance sort of saying that they thought this was the right condition. 697 00:59:23,110 --> 00:59:26,740 And it's a condition that future design designers should take seriously. 698 00:59:27,160 --> 00:59:29,740 Lots of times this is in sort of popular podcast and so on. 699 00:59:30,100 --> 00:59:37,930 Maybe what I'm most proud of is, uh, last year, 2023, we were voted the number one meme of the year in DeFi. 700 00:59:38,760 --> 00:59:45,950 So. I'm kind of trying to figure out how, like, how do I list this on my CV exactly? 701 00:59:45,980 --> 00:59:49,309 I feel like there should be some way to do it, but I haven't figured it out. 702 00:59:49,310 --> 00:59:54,290 So, um, so anyway, like I said, just really within 4 or 5 months, you know, 703 00:59:54,290 --> 01:00:00,980 lever was just like a really well known concept and continues to be, uh, sort of a central concept in DeFi, which has been fantastic. 704 01:00:01,730 --> 01:00:06,440 Uh, pretty much out of time. Let me just sort of reflect for sort of one minute, because I've been working full time in this area, 705 01:00:06,440 --> 01:00:13,309 maybe for, I don't know, three more than three years now. Um, and I've worked in sort of immature areas before. 706 01:00:13,310 --> 01:00:17,510 Right? So many of us in this room, you know, Julius and Lesley and other people are well, 707 01:00:17,510 --> 01:00:22,610 we're sort of in algorithmic game theory in the very early days, uh, which also have kind of a Wild West flavour at the beginning. 708 01:00:22,610 --> 01:00:25,250 But I'm being reminded of that Wild West feeling now. 709 01:00:25,640 --> 01:00:29,990 And in particular, it reminds me of everything you take for granted when you work in a mature field. 710 01:00:30,260 --> 01:00:35,059 Right? So take like algorithms or I would even put like ML theory in this bucket as well. 711 01:00:35,060 --> 01:00:38,280 Right. Like you've got like relatively low barriers to entry. 712 01:00:38,300 --> 01:00:43,070 Like yes they're technical subjects but you've got good textbooks, you've got MOOCs, etc. 713 01:00:43,550 --> 01:00:47,480 Moreover, the research community has sort of solved the coordination problem, right? 714 01:00:47,480 --> 01:00:51,410 The sort of coordinated on an equilibrium where everybody more or less agrees, 715 01:00:51,410 --> 01:00:56,000 like the language to describe basic concepts like what are the most important definitions? 716 01:00:56,210 --> 01:01:03,440 What are the yardsticks against which you measure research progress? Uh, and, you know, you work in sort of, uh, an immature field like blockchains. 717 01:01:03,440 --> 01:01:06,980 You get none of these things, right? So it's really there's frustrations. 718 01:01:07,430 --> 01:01:12,290 On the other hand, I sort of learn through experience that this is also kind of the universe trying to 719 01:01:12,290 --> 01:01:16,099 tell you that there is a chance you might be working in what's viewed in hindsight, 720 01:01:16,100 --> 01:01:21,470 as a golden age. At the time, before everything, quote unquote easy, had already been done. 721 01:01:22,680 --> 01:01:26,190 So I like, you know, because, you know, you look at like maximum and cut. 722 01:01:26,200 --> 01:01:30,089 You're like, oh, man, I was born too late. I was there in like 1955. 723 01:01:30,090 --> 01:01:36,149 I could have I could have moved on and cut. Right. But the thing to remember is like, oh, like computer science is just changing. 724 01:01:36,150 --> 01:01:40,500 Like it's always evolving. There's always some current version of the maximum and cut through. 725 01:01:40,520 --> 01:01:43,589 And the hard part is seeing it. The hard part is necessarily proving it. 726 01:01:43,590 --> 01:01:49,440 Once you see it, the hard part is seeing it. And so for those of you that are intrigued by this idea of getting in on the ground floor, uh, 727 01:01:49,440 --> 01:01:54,420 of a technology that definitely needs rigorous computer science behind it and hopefully will sort of, 728 01:01:54,570 --> 01:01:59,970 you know, eventually have a massive impact on lots of people. I would maybe encourage you to check out this area a little bit further. 729 01:02:00,300 --> 01:02:03,990 So I'm out of time. I'll leave up the summary slide here, but, uh, let me stop there. 730 01:02:04,230 --> 01:02:05,070 Thanks so much for coming.