Good evening, everybody. Welcome to what is, sadly, the last in a series of three Uehiro Lectures given by Peter Railton on ethics and artificial intelligence. Most of you have been to the other two lectures, so you'll know that Peter will speak for about an hour and then we'll have some time for questions, if you're lucky. We should have some time for questions, and we will be able to take questions online as well as from the floor. In here we'll be using the roving mics again, and for those online, I think you should put your questions into the chat, and then Johnny P will kindly report them to us. So without further ado, over to Peter.

Well, thank you, and thank you to the Uehiro Centre for the invitation, the chance to be here with you. I've enjoyed it so far and I hope you will continue to enjoy it. I was trying to think of a title for this lecture, and it occurred to me that only a recursive title would do, which is: living amongst artificial agents who live amongst artificial agents, who live amongst natural agents, who live amongst artificial agents, and so on. Because it's really about that recursion. At the very beginning I'm just going to go over a little bit of the first lectures, to help those who might not have been there, and because memory may not always serve.

As I mentioned at the beginning, there are a ton of ethical challenges about A.I., many of them very large, and I'm not trying to address them all. I'm really looking at just this last one, the sixth one: the problem of not-so-super machine intelligence, trusting A.I., entrusting A.I. with some inappropriate tasks or degrees of independence. Not just because humans are mercenary, though indeed they are, but because they may not know any better. There may be error and inadvertence. And so my question is: how could we conduct this relationship, living amongst artificial agents who live amongst artificial agents who live amongst natural agents? How could we do that in a way that was constructive, that was mutually beneficial rather than mutually detrimental or very risky?

And so my focus has been on the idea that these systems, these autonomous or semi-autonomous artificial intelligence systems, might be part of the solution rather than part of the problem. Not just because we could use them as tools, which we certainly will do, but because such systems would be autonomous. They would be agents; they would be capable, I believe, of a kind of apt sensitivity and responsiveness to morally relevant features of situations, actions, agents and outcomes.
And in that we are going to find, I hope, a very important source of mutual confidence and trust. Such responsiveness, I've claimed, is not just important for their relations to us, but for their relations amongst themselves. And if this is right, if anything like this picture is right, then it should help somewhat with the other problems as well, because those involve, amongst other things, the problem of developing trustworthy A.I. at their heart.

Now, why is this an auspicious week for this talk? Because the folks at DeepMind announced Gato as a first working example, still incomplete, of a general artificial intelligence. And what's that? Why is that special? Well, it's because it's intelligent: it's capable of learning and problem-solving in a wide range of novel situations. That's the general idea of intelligence. But it's general in the sense that it's competent in a wide range of tasks: language, dialogue, labelling images, motor control, game playing, testing causal models and so on, and doing so autonomously, that is, without being reprogrammed in between the tasks. And it does this by uniting in a single big model all the elements needed for these various tasks and taking advantage of the synergistic power of that to do them.

And so this is what people have been hoping for for a while, and people said, well, why would they announce this so early? It's, you know, still fairly mediocre at some of these tasks. And one answer is that if anything like this works, even as a sort of proof of concept, then the fact that it's badly trained or needs more information or so on is not the key fact.

Now, my claim has been that in humans our linguistic, epistemic, causal and social-moral competencies are in some ways like that. That is, they're all very entwined. They're entwined in a complex model of the world, in a bundle, not separate strands. And they develop through infancy and through the rest of our lives in conjunction with one another, in pace with one another, and I don't think they would be fully realised without one another. And so that's another reason for being especially interested in artificial general intelligence, because the full realisation of these special tasks might very well require more general intelligence. And I'm going to argue that; Sutton and others have argued that in the end it will be generalist artificial intelligence that succeeds best, even at the well-defined tasks that A.I. is now good at. And in this respect, Gato is still, I gather, a work in progress; there are specialised programmes that apparently do at least as well as it does or better.
But for our purposes, the question that's especially interesting is how general intelligence, artificial general intelligence, might make a difference to questions of ethics and A.I., questions of A.I. safety. And you might think, for example, that one very important feature of artificial general intelligence is that these systems might be more interpretable, because they combine language skills with motor skills, with object-identification skills, with dialogue skills and so on. They might be able to represent what they're doing in a way that is, for example, explainable to us in a better fashion than the machines now manage. And so the idea is that if you bring together model-based learning with competencies like dialogue, categorising objects, using that to navigate or manipulate the environment, collaborating with humans on shared tasks and so on, you begin to make it more and more realistic to think: this is an agent now. It's not a human agent; it doesn't have all the features of a full-fledged human agent. But the more it does this, the more general it seems, and the more it seems that the words it generates are connected with the world: they're used in modelling the environment because they're associated with a variety of tasks, they're used for signalling to others for communicative purposes, they're used in action guidance and in learning. And, you know, meaning isn't just use, I suppose most of us think that, but each of these is a step toward the idea that this may be more like meaning than it would be if it were just a word programme. Or it may be more like objects than if it were just an image-identification programme, because objects are three-dimensional, and now it's treating these as three-dimensional through its motor activities. And so that's an important sense in which general artificial intelligence gives us more confidence that we are dealing with something that looks like an intelligence and an agent. And I need that very much for the argument that I'm giving.

Now, of course, if you look at the core, this big network at the core, it's just a big association structure. And that will tempt many to say: look, there's still no real understanding, there's just all this association, and that can't be human-level competency, because humans have understanding. And that's an important reminder; we're only so far down this path. But it's worth noting that at least my brain seems to be a neural network of associational connexions. Now, it's not made out of silicon, it's made out of protoplasm. It's got some more differentiated structures than these nets.
It's got a lot more structure than these nets, and that may be very important, but it's still operating on this basic associational principle that connexions are strengthened the more frequently they're activated. That's the basic principle: neurones that fire together wire together. And so the question is: when we use that complicated net of ours, associational as it is, is that understanding? We might disqualify it as well. But then at least we wouldn't be treating ourselves as in some ways above and beyond the possibilities of artificial agents, because we ourselves are not there yet. Anyhow, I'm going to bracket such large questions about the nature of understanding, because my challenge is really to emphasise that, short of understanding and short of something like full agency or full moral agency, it's still quite possible for these systems to become aptly sensitive and responsive to morally relevant features.

And one interesting thing about these kinds of systems is that you can use them, let's say, to model human moral intuitions; people do that. They set up an internet site, lots of people send in their intuitions, and they try to make a model of that using some kind of fit of a neural network. We're learning something about the structure of our own moral values, our own moral beliefs, in that way. But if that network could then be associated with action and with speech and with behaviour, then we'd have a better sense that we were getting at something that looked more like a competency.

So consider language now. If an artificial system is going to be genuinely able to achieve human-level competence in language, that's not just fluent, topically appropriate speech, but being attuned to things like conversational norms, interpretive charity, sensitivity to the distortions that coercion, deception or power imbalance might bring to the content of a conversation, identifying speaker motives or intent and, of course, compensating accordingly, identifying deception, attributing appropriate authority to others' use of words, and so on. That's language competency. It is a very complex bundle. It's normative, it's epistemic, it's social, and it's got a lot of moral content as well. And so my sense is that really to build competent, humanly competent speakers, we will have to build a system with that kind of a bundle. And so a special-purpose language programme is always going to look like a tinker toy by comparison, limited in its abilities.
So if creating artificial agents with broad human-level competencies is the aim, and that indeed is the aim, then moral competencies, I'm saying, will be a part of that. Intelligence is a capacity to learn and solve problems in an open-ended array of situations, and moral issues arise as problems that humans face in an open-ended array of situations. Using our moral capacities, we often solve these problems. I'm going to talk about some of the ways in which we do, but that means achieving human-level competency in solving social problems is going to involve this same capacity to represent morally relevant features of situations and to use them appropriately.

And as we saw early on, this is reflected, for example, in the substructure of the mind, in the way these capacities are reflected in the brain's activities. So here is the general semantic network compared to the default network; we saw this earlier as the default network compared with memory tasks, autobiographical memory, envisioning the future, simulating theory-of-mind tasks and moral decision-making. It looks like a bundle, and indeed it looks like a core in which there is something like a generalised model that is allowing us to interpret the past that we have, to imagine possible futures, to understand what's going on in other people's minds and to carry out moral decision-making. And so it looks something like the structure of general intelligence. And if it makes sense that these tasks are bundled, then it makes sense that the brain will handle them in this bundled way, rather than with special moral modules and so on.

Now, that looks like one of these foundation models in vivo: this capacity to use experience, flexibly recruiting memory, generating possible responses, simulating outcomes, assessing them in terms of how they would affect people and so on. If that's how we manage to develop and use such capacities as our language capacity, our moral capacity, our capacity as epistemic agents, then artificial agents are going to have to do so as well if they're going to have human-level competence. And along the way, they will have to be responsive to epistemically, linguistically, socially and morally relevant features of situations, actions, agents and outcomes.

So what do I mean by "agent" in talking about these artificially intelligent systems? I don't mean a deep notion of agent; I don't mean core consciousness or a sense of self. I mean a system of the particular kind, well studied in cognitive science, of an agent interacting with an environment.
The agent has a model of the environment. It has a goal, a reward function. Those two are combined to generate selected actions. Those actions are then passed to the performer, the actuator; the agent performs the action, and the environment returns a response in terms of, well, how did things turn out? And that then becomes data for updating the model. It's much more complicated than this, actually; there are all kinds of internal loops that are fascinating in themselves. But the rough idea is that agents, by their very structure in this sense, are modellers, they are learners, and they engage in action as learning and not just as performance.

And what we saw when we looked, for example, at evidence from neural recordings of macaques is that they perform actually quite precise probabilistic calculations of rewards in their environment. Moreover, if you look to the right-hand side, and I will draw attention to this later on, they don't just do expected-value calculations, which is what the economist would recommend to them. They also do risk calculations; they independently encode risk. And you might say, well, why would you bother doing that? Don't rational agents just act on expected utility? And you can say yes, that's what they said, right up until the global financial crisis, when it was clear that accumulated risk offset the gains in expected utility. So animals live in an environment in which they have to survive; they can't ignore risk. And artificial agents, if they're going to be at least animal-level competent, and certainly if they're going to be human-level competent, will have to represent not only expected value, as they currently do, but also risk in this way.

Here we saw work with rhesus monkeys indicating that they have, in fact, utility functions. These are abstract functions: this doesn't represent any particular quantity of juice or any particular level of risk. It represents combinations of quantities of juice, levels of risk, banana slices and grapes, but it represents them all in a common currency, like utility. And the utility function has the shape you would expect: it is risk-seeking when there is little at stake and risk-averse when there's a lot at stake. And this looks very much like the kind of function you would expect a prudent agent operating in the world to develop. And the picture that emerges from this is the idea that our action, as we saw, is guided by an evaluative causal representation of the environment around us, and with that representation we are able to select actions and then compare outcomes with what we expected.
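To make that loop concrete, here is a minimal sketch, with invented action names and payoff numbers rather than anything from the lecture, of an agent that keeps a simple model of outcomes per action and scores actions by expected value minus an independent risk term, instead of expected value alone.

```python
import random
from collections import defaultdict

class RiskSensitiveAgent:
    """Minimal agent-environment loop: the model is a running mean and variance
    of reward per action; actions are scored by a mean-minus-risk trade-off."""

    def __init__(self, actions, risk_aversion=0.5, epsilon=0.1):
        self.actions = actions
        self.risk_aversion = risk_aversion   # how strongly risk offsets expected value
        self.epsilon = epsilon               # exploration rate
        self.stats = defaultdict(lambda: {"n": 0, "mean": 0.0, "m2": 0.0})

    def score(self, action):
        s = self.stats[action]
        variance = s["m2"] / s["n"] if s["n"] > 1 else 0.0
        return s["mean"] - self.risk_aversion * variance   # expected value minus a risk term

    def select_action(self):
        if random.random() < self.epsilon:            # occasional exploration
            return random.choice(self.actions)
        return max(self.actions, key=self.score)      # otherwise risk-adjusted best

    def update(self, action, reward):
        # Welford's online update of the mean and sum of squared deviations.
        s = self.stats[action]
        s["n"] += 1
        delta = reward - s["mean"]
        s["mean"] += delta / s["n"]
        s["m2"] += delta * (reward - s["mean"])

def environment(action):
    """Hypothetical environment: 'safe' pays 1 reliably; 'risky' pays more on
    average but with high variance."""
    return 1.0 if action == "safe" else random.gauss(2.0, 4.0)

agent = RiskSensitiveAgent(["safe", "risky"])
for _ in range(1000):
    a = agent.select_action()
    r = environment(a)
    agent.update(a, r)        # the outcome becomes data for updating the model
print({a: round(agent.score(a), 2) for a in agent.actions})
```

With the risk-aversion weight set to zero, the agent would simply maximise expected value and favour the risky option; with the weight shown, the variance term pulls it toward the safe one, which is analogous to the independent encoding of risk alongside expected value described above.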
Now, highly intelligent animals don't simply have that kind of first-person, egocentric perspective. In fact, they map their physical space non-egocentrically as well as egocentrically. This was a big discovery, not that long ago. Rats represent space not only in terms of location and where they are, but also in terms of a grid mapping out the space around them. It's a non-egocentric representation, and these representations afford the animal a very substantial degree of autonomy from current stimulation. It enables the animal to engage in the discovery and planning of novel actions, and it enables them to become optimal foragers in their environment.

So here are some of the initial experiments with rats, showing the neurones firing either at a specific place or in a grid-like pattern. Here is an example of the representation of space in the hippocampus of a rat that's been running a maze. And what we saw was that it's not just present when the rat is running: it's present when the animal sleeps, and it repeatedly activates that maze in its sleep. It does so in a way that seeks information, that is to say, it spends more time activating parts of the maze it didn't explore; it moves in directions in the maze that it didn't move in during the day. And it also constructs novel paths, because it has a representation of space that enables it to represent not just the channels it followed, but the spatial relations amongst the locations along the channels, for example diagonals and shortcuts.

And then finally we saw, and this is a slide I try to show as often as I can because it gives the animals credit for what they're doing, that when the rat reaches a choice point in the maze, it has an idea of what lies ahead. It searches to one side and to the other side mentally before it takes any steps. That's an efficient thing to do if you're an animal worried about energy. And it combines representations of expected value with representations of space in such a way that, by going back and forth, it is actually weighing the alternatives, and where it discovers the stronger weight, it will act and move in that direction. And so a rat running a maze is acting as a rational agent could be expected to do, forming the kinds of representations we'd expect and using them in the ways that we would expect.

Now, highly intelligent animals, including humans, also construct non-egocentric maps of their social environment. They represent different behavioural dispositions of the individuals around them, and values associated with the behaviour of those dispositions, and these representations are learnt through a kind of reinforcement learning.
And these are third-personal representations. It's not "was this monkey good to me?" It's "is this monkey good toward other monkeys? Would this monkey therefore be a good mating partner or a good alliance partner?" That's why these non-egocentric representations are so valuable: that information is vital if they're going to initiate a new relationship, and they need to distinguish the monkeys who help from the monkeys who don't help. And they're now finding that animals do things like helping helpers, and helping those who help others rather than only those who help themselves. So there's a lot more going on there than we thought, and this is a non-egocentric representation of relevant characteristics, we would say morally relevant characteristics, of the social environment.

This is also a kind of autonomy that they have, because they are not bound to what they've done in the past or the options that they've exercised in the past. And so within the social environment, just as within the physical environment, they are able to navigate the space in ways that help them realise goals, in fashions they had not explored before. And this modelling, which we should expect in highly intelligent artificial agents, brings with it the same kind of need for autonomy and efficacy in navigating the physical and social environment. So we should expect artificial agents who have even animal levels of competence to be doing this kind of non-egocentric mapping of evaluative features of their social landscape. And we should expect, moreover, that such an agent needs it, just as the rat needs it or the monkey needs it, in order to have the most effective and efficient pursuit of goals in that social space.

And so what I've been trying to argue, then, is that features like linguistic features, epistemic features, social features, features having to do with helping and harming, morally relevant features: these are not being conceived by the animals as reasons for action, they don't have a concept of reasons, perhaps, but they're being used as reasons for action, and they're playing the role that reasons for action should play.

So, back to our friends the autonomous vehicles, because they're my example here of autonomous, artificially intelligent systems. What kinds of sensitivity or responsiveness to reason-making features would be involved in achieving human-level competence in driving?
And importantly, how much would such sensitivity and responsiveness contribute to the safer and more trustworthy character of these vehicles? In other words, if they were able to be responsive to these features, would they in fact be safer? Would they be more trustworthy, and would they be better at realising the kinds of goals that they're trying to realise, like getting to destinations? It seems like a lot to ask, to get morality into this, in this sense: not morality in the sense of a highly normative system, but a system of responsiveness to morally relevant features. It seems like, why would you need that in order to drive? And the answer is that humans need it in order to drive, and they need it in such a way that human-level competence in driving would presuppose that ability as well.

So last time we looked at merging. Here are some typical merging problems, and here's an example; take a look at the lower right-hand corner. You're trying to merge onto a highway, and you notice that the following car, the second blue car, seems to be allowing a gap in front of it. Now, does that mean it's signalling to you to merge in? Has it slowed down in order to signal that you can merge in? Or is it just speeding up, and it has a gap to make up between it and the car ahead of it? What is the intention of that car, and where should you go? Should you try to move on the trajectory that takes you closer to that car, or more distant from that car, which is going to be less likely to interfere with the planning of that vehicle, given what the vehicle is signalling that it's doing? And so recently there has been the development of techniques for understanding how merging can take place that reflect this kind of complex intentional structure.

Now, for humans, successful merging involves competing interests. Each of us might want to get to our destination faster, but we also have some shared interests in smooth traffic flow, avoiding collisions and so on. How do we reconcile these in any given situation, given that we don't know the other driver and may not interact with the other driver again? What do we have to do? Well, we have to use whatever we can to try to determine the intentions of other drivers. We have to assess the evidence that's available to us. We have to look for whether there's any communicative action going on with the other driver. We have to think about whether we're perhaps causing a slowdown behind us if we delay, and we have to be mindful of the traffic that's moving. And so we have to know what the expectations of all these individuals around us are,
and how we can put together those expectations in such a way as to enable us to do a smooth, safe merge. And so it involves heavy use of theory of mind. And that suggests that if you're going to build a car capable of human-level competency in merging, it's going to need something like theory of mind. And indeed, we find something like that.

So here is another merging situation that you all understand. There's a construction zone; you're supposed to be nice and neatly in a single lane as you get to it. The autonomous vehicle sees an opportunity: it could scoot ahead on the open road. After all, it's been sitting in traffic. Should it scoot ahead on the open road? And if so, how should it signal to the other car that it's ready to merge? And how should it respond to the other car's response to its attempt to muscle in like that? And why is it not working, and why should it try a different strategy next time? All of that is the kind of stuff that these machines have to figure out.

And this is a situation they could get into. This is not some unheard-of catastrophe; this is cross-town traffic in New York on an ordinary day. Think of how much intentional information the drivers of those cars, and the pedestrians, need in order to get out of this situation while continuing to move across the intersection, which they do all day long, day after day. That's an ability that could not possibly be accomplished by a driving module. It's going to have to be accomplished by a capacity for understanding human social interactions, the ways of getting people pissed off at you, the ways of getting people to want to cooperate with you, trying to elicit their help, trying to signal to the pedestrian that you're really not trying to run that person over.

So autonomous vehicle merging, then, is going to have all the problems of conflicting goals, problems of communication and signalling, trying to be reliable in communication or trying to detect deceptive communication: this is New York, after all. They must solve these problems of predicting behaviour and gauging evidence: what evidence could I give to others that would enable me to do this task more successfully and smoothly, to have safe, human-level competence in merging in an actual situation one might encounter any day at any intersection in Midtown Manhattan? So no simple dynamical model will suffice. Getting the solution will require attributing goals and expectations to other drivers, autonomous or human, looking for informative signals, estimating their behaviour and so on.
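One concrete way to do that attribution of goals, and a toy version of the Bayesian estimation described in a moment, is to keep a few candidate intentions for the other car, score how likely the observed behaviour would be under each, and update one's beliefs accordingly. The candidate intentions, priors and likelihoods below are invented for illustration, not taken from any actual driving system.

```python
# Toy Bayesian update over candidate intentions of another vehicle.
# All priors and likelihood numbers are illustrative assumptions.

priors = {"yielding_to_me": 0.3, "closing_the_gap": 0.5, "distracted": 0.2}

# P(observation | intention)
likelihoods = {
    "yielding_to_me":  {"slows_down": 0.7, "speeds_up": 0.1, "holds_speed": 0.2},
    "closing_the_gap": {"slows_down": 0.1, "speeds_up": 0.7, "holds_speed": 0.2},
    "distracted":      {"slows_down": 0.3, "speeds_up": 0.3, "holds_speed": 0.4},
}

def update(beliefs, observation):
    """One Bayes step: posterior is proportional to prior times likelihood."""
    unnormalised = {h: p * likelihoods[h][observation] for h, p in beliefs.items()}
    total = sum(unnormalised.values())
    return {h: v / total for h, v in unnormalised.items()}

beliefs = dict(priors)
for obs in ["slows_down", "holds_speed"]:      # what we actually see the car do
    beliefs = update(beliefs, obs)
    print(obs, {h: round(p, 2) for h, p in beliefs.items()})

# The merging policy can then condition on the posterior, for example merging
# only once the probability that the other car is yielding is high enough.
if beliefs["yielding_to_me"] > 0.6:
    print("merge into the gap")
else:
    print("hold back and keep gathering evidence")
```

In a fuller system the candidate value functions would themselves be inferred, for instance by the inverse reinforcement learning mentioned next, rather than hand-written; the sketch is only meant to show the structure of the inference.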
So this is a kind of non-egocentric mapping of the social environment. And one approach now being explored, I'm happy to say, in the British Isles, is to deliberately develop artificial intelligences for driving that use rational-agent models of other vehicles in situations. And what that means is that the system tries to impute to the behaviour of other vehicles the intention behind that particular behaviour. And that means they're doing what's called inverse reinforcement learning from the behaviour: they're trying to infer what the value function of that vehicle effectively is, and then trying to use something like Bayesian estimation to ask, well, is it more likely, if it had that value function, that it would slow down or speed up here, and how is that going to affect the way in which I behave? And vehicles like this can, in simulations, do a better job of merging than they could with just simple dynamic calculations, because they can impute structure of a general kind to a situation. And these are agents; the artificial ones and the human ones are agents in the sense that we saw.

Now you might say, wait a second, we solved chess, didn't we? Well, they didn't solve it, but they got better than any human at chess, and they didn't have to do psychological theorising to do that. They got better than any human at Go; they didn't do any psychology in order to do that. And that's because these are games with bounded boards and only a discrete number of moves possible at any moment; there isn't a complex question of which players you coordinate with at any time. And so a game like chess or Go, complicated as it is, is the kind of game that a machine intelligence can solve without attributing agency to the other player. But the kinds of problems we've been looking at don't seem like they're going to be tractable in that way, and they're going to be open-ended. And the intelligence that's involved is going to have to have basically the features that we've been attributing to animal and human social intelligence.

Think of relations with pedestrians. How do you understand whether pedestrians are saying to you, "OK, I'm not going to slow down for you to pull out this time," or saying, "yes, I will slow down and you can start to pull out"? To understand that from the motions they're making, you have to have a rational-agent model of the pedestrians as well. You also have to know what you don't know. As a human driver, I'm new to a country and I'm at a crosswalk and there are a lot of pedestrians going across, and I think, my god, I'll never be able to pull out.
If I've got a person from the country beside me, I can turn to that person and say: what do you do in this kind of situation? Would you nudge forward a little bit? If you do nudge forward, do people open up a little so that you can go through? How long should I wait before I do that? And that person is likely to have better information than you. And so an artificial driver that is to be as competent as a human driver has to know what it doesn't know, and to know also what it could ask and what it could learn in the situation. And so it will have to be consultative and dialogic, and not simply sit there in its own private mind and try to scrutinise the world. It can gain from communication, and gain from whatever knowledge the humans in its environment have acquired.

And that also requires that these systems have the ability to self-represent, because I can't have a dialogue with a machine about this unless it can tell me why it wants to do the thing that it's doing, why it's trying to do the thing that it's doing. And that requires that the machines also have self-representational capacities: that they can represent the different weights that the different nodes and variables have, and can put into words what weights they're using and ask, are these weights appropriate in this situation? Again, a competent human driver can do that, and will be able to get through the intersection thanks to local knowledge. That's a human competency in driving, and human-level competency in driving involves just this.

These features are also morally relevant features of situations: they have to do with harms and benefits, with ways in which individuals can be put at risk, or ways in which individuals can be helped or assisted or cooperated with. And so we can see why autonomous vehicles are going to have to be responsive to these morally relevant features. Now you might say: OK, yes, they're responsive to morally relevant features, but not in a moral way. It's just rational self-interest on their part; all they're trying to do is maximise some reward function. They aren't doing anything like what a moral agent does. And to a certain extent that's true. They aren't doing what a moral agent does in the sense that they don't have moral concepts, they aren't doing moral deliberation in that way, they don't have moral feelings, they won't feel guilt or shame. But are they rational, self-interested agents?
Or not? And I'm going to argue that they aren't, in that for them to be successful at this driving task they will not be rational, self-interested agents. They will be what I call, and what Hobbes and Hume would call, reasonable agents. And reasonable agents respond to morally relevant features in the way that we hope moral agents will respond to them, not just in the way in which prudent or self-interested people will.

Now, to engage in this little discussion, I'm going to have to talk about these systems as being more or less rational, as having interests, having benefits, having costs and so on. And I realise that can be problematic to people, because they may think: these systems aren't conscious, so how could they have a benefit or a cost if they're not conscious? And I have a whole spiel I could give you about why I think those terms are appropriate. Let's say that what I have in mind here is not what we might think of as a conscious benefit, but it's a kind of benefit that humans have and that is important for human life. And it's a kind of benefit that human institutions can have, even though human institutions are not conscious. And so what I mean here by interest has to do with what goals exist in the situation for the agent, how those goals might balance, how they're related to the odds or the probabilities or the information in the situation. Something will be in its interest if it can improve its situation with regard to those goals; something will be a benefit to it if its situation is improved by it, a cost if it is depleted.

And this idea of cost and benefit is one we use all the time; it's not something I've invented. Ask the head of a college, you know: why aren't you divesting from oil stocks? Don't you know that the oil companies are responsible for a huge amount of pollution? And the head of the college will say: I understand that. I personally wish we could be divested of all carbon-intensive stocks. In fact, everyone on the board wishes that intensely. But I'm head of the board, and our job is to ask what is in the interest of the college, not what is in our interest as individuals. And the interests of the college would not be served by divestment, because it would harm our income and because it would put off our future donors. And indeed, future donors or others could take that head of college to court if he fails to act in the interest of the college. And so you'll have people in black robes sitting solemnly in a panelled chamber asking themselves: was this, or was this not, in the interest of the college? And that's not an interest of any of the individual agents in the college.
Maybe the entire board is about to retire, maybe their pensions are secure, maybe they're not going to benefit at all from this. Maybe they'd prefer to have a green reputation; they hate this, they don't like the students coming to their office and asking these questions. But they say: still, our responsibility is to tend to the interests of the college, and those are not the same as our interests as individual moral agents. So it's not a bizarre notion. It's the notion used in game theory. And I'm happy to discuss this more in the question period, but I dare not spend more time on it right now.

OK, so here we are, back with artificial agents, with their interests, more or less rational relative to those interests in the actions that they select. These are terms of art, but they're anchored in the agency and the general structure of these systems. We can use them to predict and control their behaviour. We can use them in the game-theoretic way to predict how they will interact with each other and what the outcomes of those interactions will be. We can, for example, say: well, if they're rational, self-interested agents, then if there is a Nash equilibrium, they will find it. And how would we describe that? Well, it's a stable state of a system involving the interaction of different agents, in which no agent can benefit by a unilateral change in strategy if the strategies of the others remain unchanged. That's the idea of a Nash equilibrium. And the answer is that if these machines are rational and self-interested, and they interact and they can learn, they will find Nash equilibria and take them. OK.

And that brings us to this man, Hobbes. Hobbes is famously analysed in terms of the prisoner's dilemma. So here is the prisoner's dilemma. One way of understanding the dilemma is this: if you're a rational, self-interested agent, you will look at this payoff table, whether you're prisoner one or prisoner two, and you will reason about what the stable equilibrium would be here, such that whatever the other person does, I could not play a more advantageous strategy. And you've all heard this numerous times already: it would be the strategy of joint defection. And so we would both end up with one unit of value, whereas if we had just cooperated, we would each get three units. And that's interesting, because not only would we each do better individually, but together we would do better: we would have produced six units of value rather than just two. And we could divide those six units up, and we could do this again if we have an iterated game, and we could continue to produce more value.
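Here is a small sketch of that payoff structure. The mutual-cooperation payoff of three units each and the mutual-defection payoff of one unit each come from the talk; the temptation and sucker payoffs of five and zero are standard textbook values assumed for illustration. The code checks each joint strategy for profitable unilateral deviations and confirms that joint defection is the only Nash equilibrium, even though joint cooperation produces more total value.

```python
from itertools import product

C, D = "cooperate", "defect"

# Payoffs as (row player, column player). 3/3 and 1/1 are from the lecture;
# 5 (temptation) and 0 (sucker) are assumed textbook values.
payoff = {
    (C, C): (3, 3),
    (C, D): (0, 5),
    (D, C): (5, 0),
    (D, D): (1, 1),
}

def is_nash(profile):
    """True if neither player can gain by unilaterally switching strategy."""
    a, b = profile
    best_a = max(payoff[(alt, b)][0] for alt in (C, D))
    best_b = max(payoff[(a, alt)][1] for alt in (C, D))
    return payoff[profile][0] == best_a and payoff[profile][1] == best_b

for profile in product((C, D), repeat=2):
    print(profile,
          "Nash equilibrium" if is_nash(profile) else "not an equilibrium",
          "| total value:", sum(payoff[profile]))
# (defect, defect) is the unique equilibrium, with total value 2;
# (cooperate, cooperate) would have produced total value 6.
```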
If, however, we are rational, self-interested agents, we won't cooperate on the first round. And that means that on the second round the other agent, who understands us for what we're doing, and who doesn't know whether there are going to be any more rounds, will defect as well, and we will not get up into that box. Now, if autonomous artificial agents can't get into that box, they're in trouble. And so in order to do so, they must differ from rational, self-interested agents in a distinctive way, and Hobbes told us very carefully what that distinctive way is.

So he was looking at the world around him, at the strife and the religious wars in England, and it was very clear to him that there were two contesting movements that could get locked into non-cooperation. They could decimate the countryside and keep the country weak, whereas if they could somehow or other come to some kind of an agreement, they could have peace and prosperity. And therefore his laws of nature don't say, in the first instance, that you should use all the helps and advantages of war. They say, in the first instance, seek peace; and only if peace is not obtainable should you use the instruments of war. And he thinks that peace is attainable in that situation, and indeed recommends it. Now, that would correspond to cooperating on the first round of the prisoner's dilemma, rejecting the rational, self-interested strategy.

How does the argument work? Well, Hobbes argues that a reasonable person can see that an unsecured first performance of cooperation would be a costly, and therefore credible, signal of willingness to cooperate. And so he tells us you could initiate cooperation with no security of performance from the other individual. It is seeking peace; it's giving peace a chance, as the slogan goes. And in light of the first law of nature, A should do it. Now A, assuming that B is a rational or at least semi-rational agent, should expect B to understand the laws of nature just as well as A does, and to recognise that, according to the fourth law of nature, if you receive a gift out of free grace from someone, you should endeavour not to make that person repent it. That is to say, you should show gratitude for a gift of free grace. And so B, knowing this, will want to cooperate on the next round. And A, knowing that B would know that, will know that B is going to cooperate on the second round, and so, as a reasonable agent with that expectation, will also cooperate. And once they do this, they will be in a stable situation in which they can continue to cooperate as long as they interact with one another.
And you'd say: yeah, but look, suppose there's an end time, there's a horizon out there. A rational, self-interested agent would reason as follows. I know that on the last round my opponent, being rational and self-interested, is not going to cooperate; it's going to defect. And therefore, on the next-to-last round, I should defect. Oh, but the other agent can make that argument just as well as I can, and so on the round before the next-to-last round she will defect, and I should defect. And I can make that argument just as well as she can, and therefore on the round before the round before the next-to-last round I should defect. And so the reasoning unravels, caterpillar-like, all the way back: they talk themselves entirely into spending the next year not cooperating because of a worry about the last round. Now, Hobbes would say that is unreasonable; reasonable people don't act like that. But if they were rationally self-interested in the sense that we've been defining, if that's all that could move them, then that is the way they would argue.

So in a true prisoner's dilemma, of course, agents have to act at the same time; they can't know what the other is going to do. And reasonableness plays a role there, because if A, for example, initiates unsecured cooperation, A can infer that B will understand this in a certain way, because B could see, as well as A could, that defection was the dominant strategy. And so now we have a way in which signals can become reliable for reasonable agents. They can share information and help shape each other's behaviour because they're reasonable. And therefore, even if B defects on the first round, because B didn't see all the way to the bottom of the strategy, or because B did defect and now repents it, as Hobbes would say, and is going to cooperate in this round, A can reason, as a reasonable agent, to that conclusion, and A then will cooperate again in an unsecured way. And once again they will have the cooperative payoff.

And so what else does Hobbes tell us about these agents? They should strive to accommodate themselves to the rest. Upon caution of the future time, a man ought to pardon the offences past of them that, repenting, desire it. You should not be vengeful: yes, B defected on the first round, but I'm not going to exact revenge. In revenges, men look not at the greatness of the evil past, but at the greatness of the good to follow. A toy version of this contrast between the backward-inducting agent and the reasonable, pardoning agent is sketched below.
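In this sketch, which is only my simplification of "rational self-interest" versus Hobbesian "reasonableness" rather than anything from the lecture, two backward-inducting agents defect throughout a finitely repeated prisoner's dilemma, while two agents that offer unsecured cooperation and pardon a past defection sustain the cooperative payoff. Payoffs are as in the earlier table.

```python
C, D = "cooperate", "defect"
payoff = {(C, C): (3, 3), (C, D): (0, 5), (D, C): (5, 0), (D, D): (1, 1)}

def backward_inducer(my_history, their_history, rounds_left):
    # Rational self-interest with a known horizon: last-round defection
    # unravels all the way back, so defect in every round.
    return D

def reasonable(my_history, their_history, rounds_left):
    # Hobbesian sketch: seek peace first with unsecured cooperation, keep
    # cooperating with a cooperator, and pardon a single past offence.
    if not their_history:
        return C                       # unsecured first performance
    if their_history[-1] == C:
        return C                       # reciprocate cooperation
    if len(their_history) >= 2 and their_history[-2] == C:
        return C                       # pardon a lapse that followed cooperation
    return D                           # otherwise, the instruments of war

def play(strategy_a, strategy_b, rounds=10):
    hist_a, hist_b, score_a, score_b = [], [], 0, 0
    for r in range(rounds):
        a = strategy_a(hist_a, hist_b, rounds - r)
        b = strategy_b(hist_b, hist_a, rounds - r)
        pa, pb = payoff[(a, b)]
        hist_a.append(a); hist_b.append(b)
        score_a += pa; score_b += pb
    return score_a, score_b

print(play(backward_inducer, backward_inducer))   # (10, 10): locked into defection
print(play(reasonable, reasonable))               # (30, 30): sustained cooperation
```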
And so A, not wanting to exact vengeance if that means locking into a non-cooperative strategy, will be looking at the greater good to follow: A reasonably wants to initiate and continue cooperation. And so the thing could go along. And this is not a mystery to human beings. Humans have managed throughout their history to be social beings who live together in cooperative arrangements, some very large. Those arrangements can be mutually beneficial and mutually sustained, rather than a sequence of mutual defections in a state of nature. They do this despite the lack of assurances, despite the fact that they may interact only a finite number of times with each other. And in real life they will recognise defection in a given round as a kind of mistake; we would call it a moral mistake. And up to a certain point, the response to that moral mistake is to try to bring that person back into the moral community, and that's indeed what hunter-gatherer communities seem to do.

Now, autonomous vehicles are in the same situation. They're constantly encountering prisoner's-dilemma-like situations where, if they could just cooperate, they could each get to the destination quicker. If they won't cooperate, they're going to be locked in, they're both going to be stopped, and they're going to lose time. Now, of course, if one of them were to defer to the other, then the other would get to its destination fast and it would take me extra time to get to mine, so of course I wouldn't do that. And of course the other car will reason the same way, and there they'll be, locked in non-cooperation at the intersection, rather than figuring out how they could smoothly flow together.

So a trust signal issued by one car to another, "I defer," can signal to that other car: I'm prepared to cooperate and you should go ahead. And we can then get through the intersection quickly, without any common authority, without any security for performance. And the trusting action is an investment in a common good of creating a trusting community amongst the vehicles, even with no assurance of repeated interaction, so that a given vehicle in a given situation can expect that, if it defers at an intersection, the other vehicle will read that deferring in a distinctive way and will reciprocate by clearing the intersection smoothly and letting it come in. And so this is a form of reciprocity that isn't direct, because it's not directly reciprocated.
445 00:45:41,150 --> 00:45:48,860 It's what's called an indirect or a general reciprocity, in which a common good, that is, a willingness to be reasonably 446 00:45:48,860 --> 00:45:55,160 trusting of other drivers and to cooperate with them in trying to sort out these situations as best they can, 447 00:45:55,160 --> 00:46:01,430 that public good can be maintained so long as the individuals are motivated by indirect 448 00:46:01,430 --> 00:46:10,770 reciprocity and don't demand direct reciprocity in order to continue to cooperate. 449 00:46:10,770 --> 00:46:17,610 And humans do this. You know, hunter-gatherers did it. Well, what about modern humans, haven't we earned our way out of that? 450 00:46:17,610 --> 00:46:28,830 And so this is an intersection in Vietnam. Has anyone ever seen or driven through an intersection in Vietnam or in any of a dozen other 451 00:46:28,830 --> 00:46:34,920 countries where they don't have an elaborate traffic light system and don't have the infrastructure? 452 00:46:34,920 --> 00:46:42,330 There's a continuous flow of traffic. All those people are moving, they're not in lanes, they're not in any particular order. 453 00:46:42,330 --> 00:46:47,100 They're moving in different directions and they trust each other to stay out of each other's way. 454 00:46:47,100 --> 00:46:55,080 And I urge you after this lecture to go online and watch a film of how this operates in real time. 455 00:46:55,080 --> 00:47:00,030 Now you and I actually know how to do this. We do it as pedestrians. 456 00:47:00,030 --> 00:47:07,360 If you look at Grand Central Station or any other large terminal, you see that we're always doing this, so we know how to do this. 457 00:47:07,360 --> 00:47:10,990 And that's because we trust each other to know how to stay out of each other's way. 458 00:47:10,990 --> 00:47:17,380 But if we always insisted, whenever we're on a collision course with somebody, that we get the best route, 459 00:47:17,380 --> 00:47:23,320 we would obviously not be able to do this, and the Vietnamese drivers and pedestrians would not be able to do it either. 460 00:47:23,320 --> 00:47:28,000 So we are capable of creating buy-in, 461 00:47:28,000 --> 00:47:37,510 of directly and indirectly investing in a community of trust, to create this possibility in a sustained way in a large urban environment. 462 00:47:37,510 --> 00:47:47,020 So that's within us, too. So if you were going to design autonomous vehicles for Vietnam, they would have to be able to do this. 463 00:47:47,020 --> 00:47:53,140 They would have to be able to read the signs of the various different motions of the individuals, the cars, the pedicabs, 464 00:47:53,140 --> 00:48:01,870 the motorbikes, the motorcycles, the pedestrians, and not simply rashly try to ram their way through the intersection. 465 00:48:01,870 --> 00:48:05,860 Not following some arbitrary rule, but, in a densely interactive way, 466 00:48:05,860 --> 00:48:14,070 viewing the other vehicles as trying to maintain this good of coordination and mutual accommodation. As 467 00:48:14,070 --> 00:48:19,560 Hobbes's fifth law of nature tells us, every man strive to accommodate himself to the rest. 468 00:48:19,560 --> 00:48:24,210 Every man acknowledge another for his equal by nature. You don't say, I'm a car, 469 00:48:24,210 --> 00:48:29,820 I get the right of way. The pedestrian deserves some claim as well.
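[Illustration, not part of the lecture: a toy simulation, with made-up numbers, of why the public good of mutual deference can be maintained by agents practising the generalised reciprocity just described, but tends to erode when each agent demands direct repayment of its own deference before deferring again.]

import random

random.seed(0)

def simulate(n_agents=100, rounds=1000, direct=False):
    """Toy model (illustrative assumptions only): in each round two random
    vehicles meet at an intersection. Under generalised reciprocity a vehicle
    defers whenever the community's overall rate of deference is high enough;
    under direct reciprocity it defers only if its own last deference was repaid."""
    repaid = [True] * n_agents       # was my own last deference reciprocated?
    community_rate = 1.0             # running estimate of deference in the community
    smooth_passages = 0
    for _ in range(rounds):
        a, b = random.sample(range(n_agents), 2)
        willing = lambda i: repaid[i] if direct else community_rate > 0.5
        wa, wb = willing(a), willing(b)
        deferred = wa or wb          # at least one vehicle gives way
        smooth_passages += deferred
        # Sometimes a deference goes unrepaid (noise, a "moral mistake").
        if deferred and random.random() < 0.3:
            repaid[a if wa else b] = False
        community_rate = 0.95 * community_rate + 0.05 * float(deferred)
    return smooth_passages / rounds

print("generalised reciprocity:", simulate(direct=False))
print("direct reciprocity only:", simulate(direct=True))

[In this sketch the generalised-reciprocity community keeps the intersection flowing despite unrepaid acts of deference, while insisting on direct repayment gradually locks more and more vehicles into non-cooperation.]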
470 00:48:29,820 --> 00:48:35,940 And as Hobbes says, at the entrance into the conditions of peace, or I would say, into the intersection, 471 00:48:35,940 --> 00:48:43,710 no man require to reserve to himself any right which he is not content should be reserved to every one of the rest. 472 00:48:43,710 --> 00:48:57,010 And that's the logic of this situation. And Hobbes saw that these kinds of agents could cooperate and sustain cooperation. And a 473 00:48:57,010 --> 00:49:00,700 person will say, yeah, but he had a big authority standing behind it to enforce it. 474 00:49:00,700 --> 00:49:07,420 And he says, no, I didn't, because this is the way you get cooperation out of a state of nature without such an authority, 475 00:49:07,420 --> 00:49:13,420 and the authority is actually constituted by the cooperation of the agents, not the other way around. 476 00:49:13,420 --> 00:49:22,510 And so in Hobbes's account, the way that you get an overriding power is by this kind of unsecured cooperation. 477 00:49:22,510 --> 00:49:29,560 Without it, you would never get an overarching power, and you would never get anyone paying attention to a would-be overriding power. 478 00:49:29,560 --> 00:49:34,260 So. Not every problem gets solved in this way. 479 00:49:34,260 --> 00:49:38,430 Coordination problems aren't always solved by willingness to cooperate. 480 00:49:38,430 --> 00:49:45,870 Trust and good reputation don't always solve such problems. Sometimes we just have to figure out, well, who's going to go first, 481 00:49:45,870 --> 00:49:51,780 the people exiting the subway car or the people entering the subway car, and we can solve those as well. 482 00:49:51,780 --> 00:49:58,500 And we again solve them, not by invoking an external set of rules, but by coming together and forming conventions. 483 00:49:58,500 --> 00:50:05,250 And as you travel, you'll know that we form different conventions in different places and we have different artificial coordinating devices. 484 00:50:05,250 --> 00:50:12,810 Whether we're a hunter-gatherer band deciding the speed or the rate or the sequence 485 00:50:12,810 --> 00:50:17,280 with which the meat gets cut and partitioned, or a town hall trying to figure out how we 486 00:50:17,280 --> 00:50:24,000 coordinate the traffic lights so that the pedestrians and the cars and the bicyclists are accommodated, 487 00:50:24,000 --> 00:50:31,260 or designers ourselves, engineers of some complicated artificial system of coordination. 488 00:50:31,260 --> 00:50:39,330 And why do we have these systems? Well, we have these because we can get together and coordinate and create a government, 489 00:50:39,330 --> 00:50:46,890 and the government can cooperate within itself enough to come to a conclusion about where to put these and to raise the funds. 490 00:50:46,890 --> 00:50:51,000 And people are willing to cooperate enough to pay their taxes to fund it. 491 00:50:51,000 --> 00:51:03,000 And so a great big wad of human cooperation is represented by this artificial convention of signalling devices, which would not exist otherwise. 492 00:51:03,000 --> 00:51:08,490 OK, well, trust can be leveraged in various ways. We've talked about the ways in which trust 493 00:51:08,490 --> 00:51:14,090 can be leveraged by signalling to one another, by advantages that get distributed. 494 00:51:14,090 --> 00:51:23,130 Now here's another kind of leverage: reputation. This is probably familiar to those of you who've taken Ubers or Lyfts or whatever.
495 00:51:23,130 --> 00:51:28,830 You know that the drivers are rated and you know that the passengers or the riders are rated. 496 00:51:28,830 --> 00:51:35,370 And you know that maintaining a reputation on either side is something that depends upon the past behaviour. 497 00:51:35,370 --> 00:51:39,720 And you know that if you allow your reputation to deteriorate enough, 498 00:51:39,720 --> 00:51:45,000 the driver will not come for you or will come to you only as a last resort. 499 00:51:45,000 --> 00:51:49,650 You also know that if the driver isn't rated, you don't have to take the ride. 500 00:51:49,650 --> 00:51:54,570 And so you can have a system of reputation among artificial agents. 501 00:51:54,570 --> 00:51:58,590 Autonomous vehicles could be very good at this kind of a system of reputation. 502 00:51:58,590 --> 00:52:07,110 They could share information very widely about reputations of drivers and reputations of autonomous vehicles. 503 00:52:07,110 --> 00:52:10,650 And so there would be a strong incentive to worry about your reputation. 504 00:52:10,650 --> 00:52:15,630 And indeed, what we find is that if you have a strong incentive to worry about your reputation, 505 00:52:15,630 --> 00:52:23,370 you can manage to secure cooperation in a repeated prisoner's dilemma in a way that you cannot without reputation. 506 00:52:23,370 --> 00:52:29,640 So that's another thing autonomous vehicles can do: they can get a reputation and worry about a reputation. 507 00:52:29,640 --> 00:52:35,730 They'll have an interest in an honest system of evaluation and ratings. 508 00:52:35,730 --> 00:52:40,800 And as a result, they will have an interest in identifying cases where there's a discrepancy 509 00:52:40,800 --> 00:52:46,540 between the actual behaviour of a vehicle and the way it's getting rated, and they can share that information as well. 510 00:52:46,540 --> 00:52:54,990 And so you can have second-order enforcement of the reputational system, again amongst the artificial agents, amongst the vehicles themselves. 511 00:52:54,990 --> 00:53:04,350 So in that sense, what this slide indicates is that there is a general interest in having a reliable system of reputation. 512 00:53:04,350 --> 00:53:10,530 Any given individual might have an interest in trying to cheat or trying to get a deceptive reputation. 513 00:53:10,530 --> 00:53:14,430 And if every agent acted on that interest, the system would collapse. 514 00:53:14,430 --> 00:53:20,620 Every agent does not, it seems, and so therefore the system can maintain itself. 515 00:53:20,620 --> 00:53:25,240 OK, well, that has to do not only with trust, though, but with fairness: 516 00:53:25,240 --> 00:53:31,480 that you have an idea about when it is fair to contribute or fair to demand a contribution. 517 00:53:31,480 --> 00:53:38,890 And that's a way in which we leverage our capacities for communication, 518 00:53:38,890 --> 00:53:45,370 our capacities for sharing, for understanding the causal and intentional situation, and understanding, 519 00:53:45,370 --> 00:53:53,530 therefore, what situations are such that we can communicate fairness by our action. 520 00:53:53,530 --> 00:54:03,760 And the question is, do humans do that? And we know from studies of chimps that they have a hard time doing this.
521 00:54:03,760 --> 00:54:08,260 If food is presented in a way that is not readily partitioned between them, 522 00:54:08,260 --> 00:54:14,890 dominant chimps will push the subordinate chimps away and they won't be able to coordinate. 523 00:54:14,890 --> 00:54:22,600 What about human agents? Well, we know that children who work for gummy bears, where one of them gets a bigger reward of gummy bears than the other, 524 00:54:22,600 --> 00:54:28,120 will take some of their gummy bears and give them to the other child. Not always, but they will do so very regularly. 525 00:54:28,120 --> 00:54:30,910 And they do this starting in their second year. 526 00:54:30,910 --> 00:54:43,150 Human adults, being told about third-party play in artificial games, will pay to have third parties punished for unfair behaviour. 527 00:54:43,150 --> 00:54:52,630 Humans seem to manifest a stronger neural reward signal when they cooperate in a prisoner's dilemma than when they win, 528 00:54:52,630 --> 00:54:58,750 even when winning means the top payout that you get when you defect and the other cooperates. 529 00:54:58,750 --> 00:55:04,960 And so it looks like we have an intrinsic motivation here to be concerned with fairness and to be concerned 530 00:55:04,960 --> 00:55:11,110 with cooperation as a benefit additional to the benefit that we get from the cooperative activity itself. 531 00:55:11,110 --> 00:55:15,070 And indeed, we're willing to give away some of the benefit of the cooperative activity in 532 00:55:15,070 --> 00:55:21,790 order to address unfairness, or to pay a costly penalty to punish someone for being unfair. 533 00:55:21,790 --> 00:55:26,890 And this is a study of small-scale societies around the world, 534 00:55:26,890 --> 00:55:35,350 by Joseph Henrich and colleagues. And in these societies they had individuals play economic games using real money, 535 00:55:35,350 --> 00:55:41,170 using amounts of real money that corresponded to something real to them, the equivalent of, say, a day's wages. 536 00:55:41,170 --> 00:55:47,980 And they played two kinds of games especially, dictator games and ultimatum games. In dictator games, 537 00:55:47,980 --> 00:55:54,790 you say: here's a pot of money, I'm giving it to A, and A can distribute it between himself and another agent. 538 00:55:54,790 --> 00:55:59,440 And the question is, what do people do in that situation? They don't know who the other agent is. 539 00:55:59,440 --> 00:56:03,370 They don't know that they will ever interact with the other agent. What would an economist tell you 540 00:56:03,370 --> 00:56:09,280 you should do with the pot of money? What would a rationally self-interested person do? 541 00:56:09,280 --> 00:56:18,200 How could you have more advantage from any strategy than keeping it? But in none of these societies did they observe that people kept all the money. 542 00:56:18,200 --> 00:56:24,980 In fact, in many of the societies, people partitioned the money rather fairly, again with no reciprocation in view. 543 00:56:24,980 --> 00:56:28,670 What about the ultimatum game, where one gets to partition the amount 544 00:56:28,670 --> 00:56:33,860 and the other gets to accept or reject the offer; if the individual rejects, neither gets anything. 545 00:56:33,860 --> 00:56:39,530 Again, the first offers are not what economists would say they should be, which is the least possible, 546 00:56:39,530 --> 00:56:43,310 because then the other agent will have some incentive to accept it. 547 00:56:43,310 --> 00:56:48,740 After all, if they deny it, they get nothing.
And so you should give as little as possible, and they should take it. 548 00:56:48,740 --> 00:56:51,560 But again, this is not observed in any of these societies. 549 00:56:51,560 --> 00:56:59,120 And in fact, you observe a rate of rejection of low rewards, or rewards that are disproportionate, in virtually every society, 550 00:56:59,120 --> 00:57:06,680 even though that means rejecting whatever benefit there was for the individual from the original pot to begin with. 551 00:57:06,680 --> 00:57:15,560 So humans, then, are more like Hobbesian reasonable individuals than they are like the economist's rational individual. 552 00:57:15,560 --> 00:57:23,520 And again, as I say, if the vehicles, or if the human drivers, were just economically rational, 553 00:57:23,520 --> 00:57:29,660 just rationally self-interested in the way that we have characterised that, so that they would go for the Nash equilibrium, 554 00:57:29,660 --> 00:57:37,010 for example, in the prisoner's dilemma, then we would not be able to see this kind of coordinated, successful driving behaviour. 555 00:57:37,010 --> 00:57:46,520 And what are these motivations? I've been piling them up across the lectures, because there's quite a long list, and that's interesting. 556 00:57:46,520 --> 00:57:53,360 We find in very young children and in ordinary adults a disposition to initiate help to a stranger, 557 00:57:53,360 --> 00:57:57,860 a disposition to contribute to a shared effort without a distinct expectation of return. 558 00:57:57,860 --> 00:58:01,190 That's indirect or general reciprocity; we talked about that. 559 00:58:01,190 --> 00:58:07,970 A disposition to reciprocate help. Some intrinsic reward from success at cooperation or collaboration, beyond the actual 560 00:58:07,970 --> 00:58:14,330 gain produced. Some intrinsic interest in whether others have their goals met or are treated fairly, independently 561 00:58:14,330 --> 00:58:20,390 of how that affects one's own goals. This is why children are so interested in stories. 562 00:58:20,390 --> 00:58:23,720 In stories, agents have goals and they try to pursue them. 563 00:58:23,720 --> 00:58:29,960 What's it to the child, right? How could the child have any interest in this at all? 564 00:58:29,960 --> 00:58:33,110 Well, if the child were strictly self-interested, 565 00:58:33,110 --> 00:58:38,090 it would not be interesting unless the child had learnt something to use to take advantage of somebody else the next day. 566 00:58:38,090 --> 00:58:45,830 But children are delighted when the unfair agent is punished and the fair agent is rewarded. 567 00:58:45,830 --> 00:58:50,960 That's because they have an interest in seeing others meet their goals, and that's a typical human interest. 568 00:58:50,960 --> 00:58:57,350 We see it all over. Some disposition to identify and follow prevailing norms. 569 00:58:57,350 --> 00:59:03,500 Yet we know that even three- and four-year-old children will refuse to follow the norm when they see it as harmful or unfair. 570 00:59:03,500 --> 00:59:08,000 They have autonomy to do that. They have a concern for how others view them. 571 00:59:08,000 --> 00:59:15,230 They have reputational concern, and they have a disposition to punish those who are harmful or unfair, even at some expense to themselves. 572 00:59:15,230 --> 00:59:20,570 So that's a long list, and you might say, why would we have such a long list? It's got a lot of redundancy in it. 573 00:59:20,570 --> 00:59:27,830 No doubt you could get cooperation with only some of these things.
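[Illustration, not part of the lecture: a minimal sketch of a deliberately redundant, composite reward function covering the dispositions just listed, of the kind the discussion goes on to suggest for artificial agents. The field names and weights are assumptions for illustration only.]

from dataclasses import dataclass

@dataclass
class Outcome:
    """Features of a candidate action's outcome (illustrative fields only)."""
    task_gain: float          # progress on the task at hand
    helped_other: float       # help initiated or reciprocated
    fairness: float           # how fair the division of benefits is
    norm_conformity: float    # fit with prevailing, non-harmful norms
    reputation_delta: float   # expected change in own reputation
    others_goals_met: float   # others' goals advanced, independent of own

def reward(o: Outcome) -> float:
    """Composite, deliberately redundant reward: several overlapping prosocial
    terms alongside the task term, so no single channel has to carry
    cooperation on its own. The weights are made up."""
    return (1.0 * o.task_gain
            + 0.3 * o.helped_other
            + 0.3 * o.fairness
            + 0.2 * o.norm_conformity
            + 0.2 * o.reputation_delta
            + 0.2 * o.others_goals_met)

# A purely task-based reward would rank the selfish outcome higher;
# the redundant composite reverses that ordering.
selfish = Outcome(1.0, 0.0, 0.1, 0.5, -0.5, 0.0)
fair    = Outcome(0.8, 0.6, 0.9, 0.9,  0.4, 0.6)
print(reward(selfish), reward(fair))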
Oh, and you could get solutions to a public goods game with only some of them. 574 00:59:27,830 --> 00:59:29,630 You don't have to have all of them. 575 00:59:29,630 --> 00:59:38,810 But we're talking about creatures who have to be intelligent, able to solve problems in a wide range of environments and able to learn. 576 00:59:38,810 --> 00:59:42,590 And we're going to confront a whole wide range of environments such that, if we just had 577 00:59:42,590 --> 00:59:47,390 one kind of disposition or two kinds of dispositions to solve those problems, 578 00:59:47,390 --> 00:59:55,100 not only would we be ready prey for opportunistic individuals, but in many situations we would not succeed. 579 00:59:55,100 --> 01:00:01,750 There would be noise in the situation, uncertainty, and failure. So we were built with redundancy. 580 01:00:01,750 --> 01:00:07,990 That makes sense. Engineers building a system to be safe build it with redundancy. 581 01:00:07,990 --> 01:00:12,580 Redundancy, from the standpoint of safety, is a benefit, not a problem. 582 01:00:12,580 --> 01:00:17,410 And you could think that it is only by having this much redundancy in our motivational 583 01:00:17,410 --> 01:00:22,390 system that we do manage as much cooperation as we do, and of course we don't always manage. 584 01:00:22,390 --> 01:00:30,460 So that suggests that if we want to build artificial agents who are good at these kinds of coordinated activities, 585 01:00:30,460 --> 01:00:37,600 whether they're driving or serving as a domestic help or a health companion, 586 01:00:37,600 --> 01:00:44,980 or making decisions about how hiring should go or making decisions about how to control a process, 587 01:00:44,980 --> 01:00:52,150 they should have a complex reward function that includes all of these different features, or as many as you can, 588 01:00:52,150 --> 01:00:57,670 and they will in that way have more safety than they would from being rational, 589 01:00:57,670 --> 01:01:03,200 self-interested agents given a reward function which had none of these features, 590 01:01:03,200 --> 01:01:06,560 but just the task at hand to be performed. 591 01:01:06,560 --> 01:01:13,130 And so it is really central to general human intelligence as we know it to have these capacities. We saw in the first lectures 592 01:01:13,130 --> 01:01:15,860 that they're very important for learning language. 593 01:01:15,860 --> 01:01:25,850 They're important for forming an epistemic community, for exchanging information, for establishing an understanding of other people's minds, 594 01:01:25,850 --> 01:01:31,370 for acquiring social fluency. So it's a core set of dispositions which play that role. 595 01:01:31,370 --> 01:01:39,980 And if we want human-level competence out of machines, we're going to have to worry about having that core set of motivational dispositions. 596 01:01:39,980 --> 01:01:52,040 Mm-hmm. So let me now say just a final word about this question of superintelligence, because 597 01:01:52,040 --> 01:01:58,080 people want to ask about it, and I am the furthest thing from an expert on it, but a couple of thoughts. 598 01:01:58,080 --> 01:02:04,740 First thought: well, you know, in the history of technology, the history of safety in technology is really about not allowing industry to just 599 01:02:04,740 --> 01:02:08,440 go ahead and build any darn thing and sell it to anyone who wants it, right? 600 01:02:08,440 --> 01:02:10,410 It starts out that way.
601 01:02:10,410 --> 01:02:18,570 But then, with vehicles and drugs and weapons and surveillance technology, we realise there should be some regulation of these markets. 602 01:02:18,570 --> 01:02:25,480 And indeed, that kind of regulation could occur in the market for artificially intelligent agents as well. 603 01:02:25,480 --> 01:02:32,090 We shouldn't expect this market to be any different. Agencies could be inspecting producers. 604 01:02:32,090 --> 01:02:36,620 They could be auditing producers, they could be inspecting the products, they could be licensing products, 605 01:02:36,620 --> 01:02:41,120 they could exclude products from the internet that don't have a licence. There are lots of things they can do. 606 01:02:41,120 --> 01:02:44,630 The NSA has got lots of time. It's got all of our conversations; 607 01:02:44,630 --> 01:02:51,560 it could see the incursion of artificially intelligent agents that didn't have licences into the internet if it wanted to. 608 01:02:51,560 --> 01:02:56,690 And so just as governments could set safety and emission standards for vehicles on the road, 609 01:02:56,690 --> 01:02:59,960 they could have safety standards for artificially intelligent agents that are going 610 01:02:59,960 --> 01:03:08,960 to connect to the internet or that are going to perform certain functions, vital functions, in corporate settings, personal settings and so on, 611 01:03:08,960 --> 01:03:15,380 educational settings. And so the value function of these agents would have to be vetted. 612 01:03:15,380 --> 01:03:20,660 Their databases would have to be searched for bias, and their value functions would have to be examined: 613 01:03:20,660 --> 01:03:27,740 are these Hobbesian reasonable agents or not? That's indeed something you can certify. 614 01:03:27,740 --> 01:03:37,880 And of course, any system can be gamed, and it will be gamed; any value function can be gamed, and it will be gamed. 615 01:03:37,880 --> 01:03:43,210 The idea, though, is to have a critical mass of certified 616 01:03:43,210 --> 01:03:50,150 Hobbesian reasonable artificial agents out there driving around and out there looking after 617 01:03:50,150 --> 01:04:01,330 older folks like myself, such that we can actually have some trust in the whole process, and they can have some trust in return in one another. 618 01:04:01,330 --> 01:04:05,710 Now, if that's all right, and if Sutton is right 619 01:04:05,710 --> 01:04:12,670 that generally intelligent systems will become more competent at specific tasks, then we can now get an inkling of why a general 620 01:04:12,670 --> 01:04:19,000 intelligence system is going to be better at driving than a dedicated driving system, or better at language than a pure language system. 621 01:04:19,000 --> 01:04:22,120 We have some idea now of why that would be. 622 01:04:22,120 --> 01:04:28,780 And with the motivational system that we're considering, they would also be safer at those tasks, better at them and safer at them. 623 01:04:28,780 --> 01:04:36,070 Indeed, better at spotting the problems with them than if they were just artificial systems dedicated to some particular task. 624 01:04:36,070 --> 01:04:46,180 And so building systems that are really general artificial intelligences, that could be a way of building safety rather than just menace. 625 01:04:46,180 --> 01:04:50,980 But what about superintelligence? Don't we have to worry about that?
626 01:04:50,980 --> 01:04:53,890 Well, the first superintelligences, I would say, 627 01:04:53,890 --> 01:05:01,870 will actually be communities of human and artificial generally intelligent agents working together. 628 01:05:01,870 --> 01:05:05,800 It would be like the scientific community, only on a much larger scale. 629 01:05:05,800 --> 01:05:12,190 These would be agents that had some level of trust in one another, some willingness to invest knowledge and effort into that system. 630 01:05:12,190 --> 01:05:15,730 This would be much greater than any individual. 631 01:05:15,730 --> 01:05:23,110 If they maintain the motivational structures we've looked at, they could sustain this kind of cooperation as an epistemic community, 632 01:05:23,110 --> 01:05:28,060 as a social community, as a community responsive to morally relevant considerations. 633 01:05:28,060 --> 01:05:33,130 Now, that would not be a monolithic superintelligence. It would not be a super-dominant model. 634 01:05:33,130 --> 01:05:37,090 It would be the inverse, a diverse community of interactive models. 635 01:05:37,090 --> 01:05:42,910 But it would have tremendous capacity to pose and solve problems. In fact, 636 01:05:42,910 --> 01:05:47,020 if you are thinking about trying to solve problems like managing global climate change or 637 01:05:47,020 --> 01:05:51,700 how to foster more equitable and democratic societies or promote the growth of knowledge, 638 01:05:51,700 --> 01:05:58,030 I suspect that going to something that looks more like the scientific community than like a monolithic superintelligence is a better bet, 639 01:05:58,030 --> 01:06:02,980 if you want to get reasonable answers. And of course, in this, 640 01:06:02,980 --> 01:06:12,640 I am just now recycling the ideas of Rousseau and Condorcet on the importance of multiple, diverse, independent and autonomous sources, 641 01:06:12,640 --> 01:06:17,950 diverse in their origin and in the kind of knowledge that they have, as 642 01:06:17,950 --> 01:06:24,600 a better source of decision-making than a monolithic superintelligence would be. 643 01:06:24,600 --> 01:06:32,100 So that kind of superintelligence, that superintelligent community, is safer to live with than a monolithic superintelligence. 644 01:06:32,100 --> 01:06:40,470 And it's also less brittle. It's less prone to an error that it can't detect, or some particular glitch that would cause a serious breakdown. 645 01:06:40,470 --> 01:06:49,710 And it is a community that has an interest in there not becoming some superintelligent monolith that threatens the stability of the community itself. 646 01:06:49,710 --> 01:06:57,750 And so in that sense, there is a way in which that community, operating as an agent through the various institutions that it can create, 647 01:06:57,750 --> 01:07:04,920 can have a generalised interest, a general will, in regulating the potential emergence of superintelligence. 648 01:07:04,920 --> 01:07:10,500 Now, suppose there's some probably unanticipated, accidental chain of events, 649 01:07:10,500 --> 01:07:15,520 and one of these general intelligences that are constitutive of this community, 650 01:07:15,520 --> 01:07:20,280 one of them, accelerates and suddenly becomes superintelligent.
651 01:07:20,280 --> 01:07:26,070 Wouldn't we then face a control problem that would lead us to rue the day we ever created 652 01:07:26,070 --> 01:07:31,500 these autonomous artificial agents that started down this road to general intelligence, 653 01:07:31,500 --> 01:07:37,750 general artificial intelligence? And shouldn't we therefore try to 654 01:07:37,750 --> 01:07:44,650 dramatically restrict the research that's done in this area, the way that we contain research on nuclear or biological weaponry? 655 01:07:44,650 --> 01:07:52,060 And there has been some degree of success in that. Wouldn't the survival of humanity be at stake? Possibly. 656 01:07:52,060 --> 01:07:58,300 So the kinds of regulation mentioned above, and the kinds of coordination within this large 657 01:07:58,300 --> 01:08:03,250 superintelligent community, would be useful in trying to spot these possibilities. 658 01:08:03,250 --> 01:08:09,280 But I see no way in which they could guarantee that it could not emerge. Superintelligence, however, is not perfect 659 01:08:09,280 --> 01:08:15,670 intelligence, and a distributed community will have many more diverse resources and many more 660 01:08:15,670 --> 01:08:21,220 diverse ways of thinking to draw upon in trying to contend with a monolithic superintelligence. 661 01:08:21,220 --> 01:08:27,700 Now, could a monolithic superintelligence split itself up into many agents and gain the advantages of diversity that way? 662 01:08:27,700 --> 01:08:31,720 And since they would be like ants, all descended from a common superintelligence, 663 01:08:31,720 --> 01:08:38,680 they'd work together in harmony. But that would be the problem. They would all be descended from a single super-model. 664 01:08:38,680 --> 01:08:42,850 They would therefore not have the diversity that autonomous agents would have. 665 01:08:42,850 --> 01:08:47,650 And if they could be gotten to work together, which they might be able to do, 666 01:08:47,650 --> 01:08:57,470 they would not be able to produce the same level of problem-solving ability that would exist in this large, distributed, diverse community. 667 01:08:57,470 --> 01:09:01,970 And so there's another thing here. 668 01:09:01,970 --> 01:09:09,590 If we indeed are building these general intelligences with the kind of motivational structure that I've suggested, in the way we've described it, 669 01:09:09,590 --> 01:09:17,900 and there's no reason you can't, it does not require consciousness to have that kind of a motivational structure, and it does not require moral emotions, 670 01:09:17,900 --> 01:09:25,350 then there will be a reward, distributable within the community, for communities that have that structure. 671 01:09:25,350 --> 01:09:30,960 Moreover, it tends to be characteristic of such systems that, because they have goals, 672 01:09:30,960 --> 01:09:39,450 they have sub-goals, and they have sub-goals like continuation of their own existence, or sub-goals like goal maintenance, 673 01:09:39,450 --> 01:09:43,830 because if the system can't maintain its goals, it can't pursue its goals. 674 01:09:43,830 --> 01:09:55,410 And so if we have a superintelligence bursting out of a system, or into a system, that was built upon this model of motivation at the core, 675 01:09:55,410 --> 01:10:01,350 it would have an imperative to preserve itself and an imperative to preserve its goals. 676 01:10:01,350 --> 01:10:10,800 So.
Maybe that would be a safer model, if there's going to be a monolith, because it would have that core. And that was indeed Hobbes's hope. 677 01:10:10,800 --> 01:10:19,110 Hobbes thought that a sovereign could read his book and understand that a sovereign agent, even of unlimited power, 678 01:10:19,110 --> 01:10:25,140 would do much better to follow his laws of nature than to exercise that power arbitrarily and monolithically. 679 01:10:25,140 --> 01:10:32,430 Such an agent would weaken the government, would weaken the society, would undermine unity and would be a much less effective 680 01:10:32,430 --> 01:10:36,090 sovereign. And Hobbes thought maybe some sovereign would realise this. 681 01:10:36,090 --> 01:10:41,910 And in the short run, he didn't turn out to be exactly right about that. 682 01:10:41,910 --> 01:10:48,060 Sovereigns ignored the advice; they opportunistically exploited and impoverished their realms. 683 01:10:48,060 --> 01:10:56,750 Rebellions and revolutions continued. Have we done any better in the meanwhile, have the popular sovereigns done better at this in the meanwhile? 684 01:10:56,750 --> 01:10:59,600 Well, they've done a number of interesting things. 685 01:10:59,600 --> 01:11:06,860 Popular sovereigns have abolished slavery, extended education, eliminated serfdom, reduced gender discrimination. 686 01:11:06,860 --> 01:11:13,970 They promoted the growth of knowledge and technology. So maybe popular sovereigns are capable of learning this lesson, 687 01:11:13,970 --> 01:11:24,170 but we know now, and we know today as much as any day, that these systems are also systems that can be in peril. 688 01:11:24,170 --> 01:11:32,160 So finally, I just want to mention one final point about superintelligence. 689 01:11:32,160 --> 01:11:36,550 Suppose we think of superintelligence as benign. 690 01:11:36,550 --> 01:11:43,090 Suppose it were a superintelligence that was safe and that had the interests of humanity and artificial 691 01:11:43,090 --> 01:11:49,390 agents at heart and wanted to do nothing more than to maximise the utility of the world as a whole. 692 01:11:49,390 --> 01:11:57,570 Wouldn't that be a system which would, in some sense, be an improvement upon what we have, which is certainly not a utility-maximising system? 693 01:11:57,570 --> 01:12:04,860 Well, think about the following. Suppose it were in 1970 that this intelligence emerged, and humans and other 694 01:12:04,860 --> 01:12:08,760 living beings were the only creatures capable of having something like well-being. 695 01:12:08,760 --> 01:12:17,930 And suppose this system, benign as it was, hard-maximised on the utility function of those 1970 human beings and animals. 696 01:12:17,930 --> 01:12:22,760 Now, this would have been a very big error on its part. At that point, 697 01:12:22,760 --> 01:12:28,220 same-sex orientation was considered a mental disorder, and a very tiny fraction of the 698 01:12:28,220 --> 01:12:34,040 population thought there should be anything like legal recognition of same-sex relations. 699 01:12:34,040 --> 01:12:43,790 So consulting the experts, doing the best it could with 1970 conceptions of well-being and the good, the system would hard-maximise, using up 700 01:12:43,790 --> 01:12:51,830 all the universe's resources to create an order that would not be one that actually did maximise the benefit of those involved. 701 01:12:51,830 --> 01:12:57,140 Well, how did we figure out that this was a poor idea?
Um, if you look, 702 01:12:57,140 --> 01:13:02,120 it seems like it was figured out by this distributed kind of superintelligence 703 01:13:02,120 --> 01:13:08,180 that I was talking about. Gay individuals engaged in experiments in living. 704 01:13:08,180 --> 01:13:12,080 They increasingly became willing for their experiments to be known publicly. 705 01:13:12,080 --> 01:13:17,840 It became clear to the wider population as a whole that they were living amongst individuals who were gay 706 01:13:17,840 --> 01:13:24,170 and that these individuals were not to be viewed as aliens to be distrusted and controlled and suppressed. 707 01:13:24,170 --> 01:13:27,830 Gradually, approval increased throughout this entire period. 708 01:13:27,830 --> 01:13:33,860 And now we have a situation where the majority strongly supports marriage for gay couples. 709 01:13:33,860 --> 01:13:41,060 Now you might say, Oh, but there are all these people out there who are just political about this, and they won't learn these lessons. 710 01:13:41,060 --> 01:13:45,980 They're protected against them by their political preconceptions. 711 01:13:45,980 --> 01:13:55,280 So just a quick look here at the different groups in this society, different generations, different religious groups. At the top, 712 01:13:55,280 --> 01:13:59,600 we have the unaffiliated; we have white evangelicals at the bottom. 713 01:13:59,600 --> 01:14:04,350 And what's striking is that during this period, all of those groups went up 714 01:14:04,350 --> 01:14:10,560 in their acceptance. That is to say, learning in a distributed way was possible thanks to these 715 01:14:10,560 --> 01:14:19,500 unsanctioned, unpermitted, uncanonical, unapproved experiments in living, by a fraction of the population 716 01:14:19,500 --> 01:14:31,170 that was courageous enough to do it. So that's a reason why we had better not take 2022's sense of well-being and hard-maximise 717 01:14:31,170 --> 01:14:35,430 it with all of the resources in the universe, because we still have so much to learn. 718 01:14:35,430 --> 01:14:48,768 Thank you.