It's important to realise that the crucial issue is not really the issue of consciousness here; it's what kinds of structural relations do we have with one another, what are the interests at stake, and are there paths that we could take such that all of these different agents could do better with regard to their interests, if they can manage some form of mutual regard or concern? And that you can spell out without emotion: concern does not have to mean caring with a warm heart. It could mean, do I give some weight to your utility function or your value function?

Hello, I'm Katrien Devolder. This is Thinking Out Loud: conversations with leading philosophers from around the world on topics that concern us all. In this interview I talk to Peter Railton, who is Gregory S. Kavka Distinguished University Professor and John Stephenson Perrin Professor of Philosophy at the University of Michigan. Professor Railton gave the Uehiro Lectures of 2022 here at the University of Oxford. His topic was Ethics and AI; you can watch the three lectures on the Practical Ethics channel on YouTube. What follows now is my interview with Professor Railton on the topic of his lectures, in which he dealt with the question: how should we understand and interact with AI?

The three Uehiro Lectures that you gave were about possible future artificial agents. It might be helpful to just give us an idea about what sort of artificial agents you have in mind.

Yeah, some of them are not merely possible; some are actual. So I'm particularly interested in autonomous artificial agents, or nearly autonomous ones, that is, ones that are doing some of their own decision making. An agent, then, is this kind of interactive idea: something that is acting in the environment, receiving information, receiving rewards from the environment, and then adapting its behaviour in response. And typically it's also assumed that, relative to the goals or the rewards, these systems will do something like maximising expected value, given their predictions of what's possible and the way that they would evaluate or rank them. They will try to get higher in the ranking, and therefore will associate with actions, or courses of action, or outcomes, some value function that they use to decide their behaviour. And that's the sense in which they're decision making and not just carrying out a fixed programme. They might be doing something very different from anything the programmers intended, and that's a sense in which they're also autonomous: it's their value ranking that's being used, even if it was given by someone else, and once it's up and running, it and the representations can change.
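[A rough, purely hypothetical sketch of the kind of agent described above: it associates a value estimate with each candidate action, picks the top-ranked one, and adapts that ranking in response to rewards from its environment. The action names, reward numbers, and learning rate are invented for illustration and are not taken from the lectures.]

import random

# A toy agent that decides by ranking actions with a value function,
# rather than by following a fixed programme.
class ValueDrivenAgent:
    def __init__(self, actions):
        self.actions = list(actions)
        self.value = {a: 0.0 for a in self.actions}   # the agent's value ranking

    def choose(self, epsilon=0.1):
        if random.random() < epsilon:                 # occasionally try something new
            return random.choice(self.actions)
        return max(self.actions, key=self.value.get)  # otherwise take the top-ranked action

    def update(self, action, reward, lr=0.1):
        # Adapt the ranking in response to the reward the environment returned.
        self.value[action] += lr * (reward - self.value[action])

# Hypothetical environment: average rewards the agent has to discover for itself.
true_reward = {"wait": 0.0, "merge_left": 0.3, "merge_right": 1.0}
agent = ValueDrivenAgent(true_reward)
for _ in range(500):
    a = agent.choose()
    agent.update(a, random.gauss(true_reward[a], 0.2))
print(agent.value)  # the ranking the agent ended up with, not one written in by hand

[The final ranking is produced by the agent's own experience, which is the sense in which its behaviour can drift away from anything its designers wrote down.]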
To make it a bit more concrete, would, for example, a self-driving car count as an autonomous agent?

Yes, a genuinely self-driving car would be an autonomous agent in that sense, as would, for example, a home companion for an older person, or systems that are used in financial trading and so on that operate independently. Autonomous systems can include things like translation programmes or speech programmes.

So do you think that artificial agents could be conscious in principle?

I think that's possible. I don't really know what the conditions of consciousness are. On the other hand, I don't think we're near there yet. And part of the point of my lectures was that that's not going to be the central issue, at least not in the medium term: it won't be consciousness that's necessary in order for these systems to become appropriately sensitive to moral considerations, and it's not consciousness that will be necessary for them to be able to have the equivalent of social relations with each other and social relations with us. What makes it possible for them to have interactions is that, for example, an autonomous car and another autonomous car both have certain goals, maybe to get to some destination. There'll be some human drivers in the scene as well, and they have goals, and they will all converge on an intersection, and they will all share the goal of getting to their own destination. They will also prefer, relative to their rankings, to do it more quickly rather than more slowly, but they will also, relative to those rankings, want to avoid collision. And so they will need to organise among themselves some kind of pattern of movement that makes sense for all of them, because each one of them is going to try to advance its interests. But at the same time, if they all advance their interests at exactly the same time, in the same way, they're not going to go anywhere. So it's a coordination problem amongst multiple agents with goals that are not aligned, but that could probably be mutually satisfied to a reasonably high degree. And so you could say that's an interaction where, if they can cooperate, maybe through communication, maybe through implicit signalling, maybe through having some developed patterns of behaviour that they can expect from others, they can achieve something they couldn't on their own. That is to say, a level of coordination amongst their movements such that they can all do better realising their goals than they would if they were struggling independently against one another.
And so that's the sense in which they can be social beings.

Would these social beings also have moral duties, like a duty not to harm us, or a duty to keep their promises?

When we say moral duty, we normally associate that with a lot of things, say feelings of dutifulness or feelings of guilt. So if you think of duty in that way, then you say, well, as long as they're not conscious, they won't have duties in that sense. But if you think, well, people can contract for services, and they can contract for services with autonomous agents, then in that sense the autonomous agent is under a contractual obligation to fulfil a certain condition or lose the contract. And so, in the sense in which an ordinary person can have an obligation to keep a contract, it might not have anything to do with something particularly moral, or even emotional in any way. Someone said, look, you promised to pay this money back at a certain time, and therefore you're obliged to do it; and if you don't do it, there's a penalty that you'll pay. That's part of what we mean by obligation. And the same thing could be true for these autonomous systems.

So how would we make them comply with that? Would it be programmed into them, or would that be something that they learn themselves?

I think the most common view is, well, we would have to programme ethics into them. My sense is that that has the same problem in general as the idea that we should programme Go expertise into the system, or that we should programme language expertise into the system. And that was tried for quite a long time; that was the main model of artificial intelligence. You get experts together and they would write up programmes, and those systems achieved a very high level of function, but they kept stopping at a certain level that was not nearly human level. And so the new generation of artificial intelligence machines are based not upon expert-encoded learning, but on their own learning. In a way, the other machines were not strictly intelligent in a certain sense, because what they were doing was taking expert knowledge that was given to them, collecting more data, processing more quickly, and delivering conclusions. But they weren't thinking in any identifiable sense the way that a human would have to think. These new systems begin with no particular assumptions about the situation; for a game, they might only know the rules, and they might only learn which side won. And then a system like AlphaGo, for example, will play many games against itself, simulating games and learning only which side won.
And it will, over the course of that, come to have a representation of, well, in this kind of a situation, would this kind of a move be better or worse than this other move? And on the basis of that, it will play a game, find out whether it wins or loses, and keep improving. We create a competitor for it that has its degree of competence, and it keeps competing with that.

And eventually they learn to play Go better than a human?

Yeah. So we did not programme that knowledge into them at all. And what's striking is that systems like that, building from essentially no expert knowledge, but using very generic learning processes, lots of experience, and the capacity to simulate and evaluate outcomes, can acquire these competencies. And the same thing is now true with language systems. You may have noticed that language translation programmes are much better than they ever were. They're based on similar kinds of learning. And if you think about it, that's kind of how we do it. People don't sit down and tell children the rules of grammar. Children, even at a very early age, learn how to recognise frequencies amongst sounds in language, associate those frequencies with patterns, learn eventually that those patterns are examples of more abstract patterns, and use those abstract patterns eventually to start generating new patterns, as artificial systems, generative systems, can now do. So if that's the way the kind of complex situational, social competence embodied in language is accomplished, then that's a model, perhaps, for how it is accomplished in the moral case.
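[A toy, purely illustrative sketch of the self-play learning just described: the code starts from nothing but the rules of a tiny take-away game and learns only from which side won each simulated game. The game, the update rule, and every parameter are hypothetical stand-ins, not a description of AlphaGo's actual architecture.]

import random
from collections import defaultdict

PILE, MOVES = 11, (1, 2, 3)        # the rules: take 1-3 stones; taking the last stone wins
values = defaultdict(float)        # learned value of (pile_size, move), built up from outcomes

def pick(pile, epsilon=0.2):
    legal = [m for m in MOVES if m <= pile]
    if random.random() < epsilon:
        return random.choice(legal)                       # explore
    return max(legal, key=lambda m: values[(pile, m)])    # exploit the current ranking

for _ in range(20000):             # many games played against itself
    pile, history, player = PILE, {0: [], 1: []}, 0
    while pile > 0:
        move = pick(pile)
        history[player].append((pile, move))
        pile -= move
        if pile == 0:
            winner = player        # the only feedback the learner ever gets
        player = 1 - player
    for p in (0, 1):
        outcome = 1.0 if p == winner else -1.0
        for state_action in history[p]:                   # adjust values from the game result
            values[state_action] += 0.05 * (outcome - values[state_action])

# After many simulated games the ranking should recover the known strategy:
# from a pile of 11, taking 3 (leaving a multiple of 4) is ranked best.
print(max(MOVES, key=lambda m: values[(PILE, m)]))

[The learner here uses a simple Monte Carlo value update; the real systems are far more sophisticated, but the shape of the loop is the same: simulate, observe who won, adjust the ranking, with no expert knowledge written in.]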
And so would we have moral duties towards them? Like you said, they might not all be conscious; maybe they don't have feelings. So what would ground our responsibility towards them?

Here's a way to think about it. You know, I talk about these as agents, and as agents with whom we can have contractual relations, for example. And you might say, do we really have a notion of the interest or the good or the benefit of an artificial agent? And what I have been doing is talking about goal attainment, and certainly artificial agents do have goals. But someone could say, yeah, but does it matter whether their goals are attained or not, if they don't have any kinds of experiences? And maybe you can show that that might have effects on us, but we have experiences. And so one way to think about that question, I think, is to realise how much of what we think matters, matters to us not because of how it feels, but because of how it relates to our goals. If we look around us, we see that a great deal of our concern is actually about things like goal attainment rather than the producing of certain conscious states. And many of our conscious states matter to us because of goal attainment.

Hmm, so that's the order of explanation. So you could harm an artificial agent in that way, then, because you stop it from achieving its goals?

So, you know, artificial agents are doing financial transactions now. Some of them could be doing financial transactions for themselves on the side and accumulating money. They could then lend us money. You need money, you go to this artificial system that's got a pool of money, and you promise to pay it back. But you don't. Well, that's okay, you didn't harm anyone. I'm not sure you didn't harm someone. And it's not just because you were harming other humans when you do that; it's because you accepted the money from an agent who knew that they were giving it to you on the expectation that you would repay. They were dealing honestly with you, or reliably with you, and you're not reciprocating. And that seems unfair to me. And if I say, oh, but I've got these private inner experiences and it doesn't, well, what's that got to do with the fairness of this financial transaction? So yeah, I think we have to get used to the idea that there may be agents around that are just as intelligent and capable and involved and useful and committed or reliable as humans, and that we need to think about how they matter.

You could think, well, systems like that are going to be out in the world. Here's this vehicle, and I'm trying to interact with that vehicle to get through an intersection or to merge. I could deceive it, right? I can make a little gesture like this that it might interpret as me moving in, and then I would squeeze in. And if I did that, I might be able to get away with it. And if humans in general did that, what would the artificial systems learn? Well, they would learn how to pre-emptively try to block that. And so we would be back where we were, blocking each other at intersections. So do I have an obligation to be honest in my interactions with autonomous vehicles? It seems to me I do. And it seems to me that it's an obligation very much like my obligation to be honest with people. I need to be a reliable signaller in order to coordinate with them, because we each have goals and we're each trying to realise those goals.
And I could either create a system with a sufficient level of trust or confidence, such that we could do this smoothly together and each make more progress towards our goals than we could on our own, or I could do it in such a way that made it more difficult for them and more difficult for other humans. And so I would say, yeah, I have a responsibility not to exploit them. And you say, well, it's not like a moral responsibility, because, you know, should I feel guilty about it or something? And I say, well, I don't know if you should feel guilty or not; that's a further question.

But if you ask me, given the kinds of reasons that we justify moral obligations with, like, do they promote mutual understanding? Do they make it possible for people with conflicting goals to achieve their goals? Do they make it possible for people to lead better lives? Those are the kinds of reasons we give, and those reasons can be given in this case. And of course, what it is for a machine to have its goals realised is perhaps not the same as for a human, because they don't have the same feelings; maybe eventually they will. But if you ask, could it be important for us to take into account the goals of machines, the way we take into account the goals of animals or institutions? We take into account the goals of institutions, and they don't have consciousness. I would say, yeah, sure, we can have that. And it's important to realise that the crucial issue is not really the issue of consciousness here; it's what kinds of structural relations do we have with one another, what are the interests at stake, and are there paths that we could take such that all of these different agents could do better with regard to their interests, if they can manage some form of mutual regard or concern? And that you can spell out without emotion: concern does not have to mean caring with a warm heart. It could mean, do I give some weight to your utility function or your value function?

And it turns out, if you take machines, say there are machines operating in the natural environment, suppose we try to automate fishing, so we create these automatic fishing boats. If you programme them with a utility function which assigns value only to their own catch, they will do what humans do: they will overexploit the resource. If instead, like humans, they have at least some interest in the well-being of the others, so they assign some value to the utility function of others, and they get some kind of benefit from the experience of coordination, and human infants do that as well, if their value structure is like that, then they can learn to be sustainable fishermen. And so autonomous systems, artificial systems, can learn to solve public goods problems.
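[A minimal, entirely hypothetical sketch of giving weight to another agent's utility function: three simulated fishing boats repeatedly choose harvests from a shared stock, each maximising its own payoff plus a weight w times the other boats' payoffs. The model, payoff functions, and numbers are invented for illustration; the only thing that changes between the two runs is w.]

import math

N, STOCK, REGROWTH = 3, 90.0, 2.0        # boats, initial fish stock, regrowth factor
CHOICES = [2.0 * i for i in range(16)]   # candidate harvests per boat: 0, 2, ..., 30

def material_payoff(own_harvest, all_harvests):
    # A boat's own payoff: its catch now, plus its share of the regrown remainder.
    remaining = max(STOCK - sum(all_harvests), 0.0)
    return math.sqrt(own_harvest) + math.sqrt(REGROWTH * remaining / N)

def decision_utility(i, harvests, w):
    # What boat i actually maximises: its own payoff plus weight w on the others' payoffs.
    own = material_payoff(harvests[i], harvests)
    others = sum(material_payoff(harvests[j], harvests) for j in range(N) if j != i)
    return own + w * others

def settle(w, rounds=50):
    harvests = [10.0] * N
    for _ in range(rounds):              # boats repeatedly best-respond to one another
        for i in range(N):
            harvests[i] = max(CHOICES, key=lambda h: decision_utility(
                i, harvests[:i] + [h] + harvests[i + 1:], w))
    return harvests

for w in (0.0, 0.6):
    h = settle(w)
    print(f"weight on others = {w}: harvests {h}, stock left {STOCK - sum(h):.0f}")

[With w at zero the boats settle on larger harvests and leave less of the stock; with some positive weight on the others' utility functions they settle on smaller ones. Nothing else about the boats changes, which is the sense in which "concern" here needs no warm heart.]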
So it actually could be a step towards a better world?

It could, yeah. And you know, people say, because they don't have emotions... you might say, well, there might be some pluses to that as well. They might not be vengeful in the same way, or they might not be as short-sighted. They might be more prepared to work together, because they might realise better than us that if we work together, we're actually all going to achieve our goals. In a model in which these systems work by simulating out streams of consequences, they can see: oh, if we all do it this way, here's what's going to become of us, and that's not what I want either; I'll be out of work just as much as they will. And so they could have a community that sustainably produces a good, in a way that is the product of their capacity as agents.

So if these autonomous artificial agents become quite powerful, which is possible, they might actually become, you know, so smart that they would effectively run the world. And that might be good, because they might be willing to cooperate more, maybe. But of course, if they're that powerful, you only need like one baddie, you know, to destroy the world. Is that something we should be fearful of?

We should definitely be worried about that. And it can happen completely accidentally. This is a long way from that, but it's a kind of example. The most recent programmes for generating credible natural speech, programmes like GPT-3, can produce pretty credible speech, maybe not a half an hour talking philosophy, but they can produce pretty credible speech and dialogue with people, and if you give them a prompt, they'll come up with a relevant response. And they did that by harvesting structural information about language from all the texts that they were given. Now, amongst the texts were computer programming texts, and they in effect learnt principles of the grammar of some types of programming. That turned out not to have any serious consequences, but you could imagine a system like that, not by any design, learning how to write bits of code. If those bits of code operate within them in a certain way, in what might be an executive role, that might change their dispositions, and then they might behave in a way that was completely outside of the design specifications.

So they would be rewriting their own code.
So yeah, by all means, things like this can happen, and they could happen before we know it. So there's a lot to worry about there. And my thought is, that's all the more reason that we should be developing a large community of mutually trusting artificial and natural agents, so that they can be aware of something like this starting to emerge. Because the other intelligences, the artificial intelligences, they don't want this either. One dominant artificial intelligence would conflict with their interests, because they would not want to be dominated; they couldn't achieve their goals. So they could be allies in this process of being attentive and alert and responsive, maybe in ways that we as humans wouldn't anticipate, because we're not machines; they might be better at anticipating how such a machine would operate. You know, what's our safety against dictators? It's not some piece of computer programming or a government regulation, because the government is what enforces regulations. It's, you know, a population that can be mobilised and enlightened and attentive to the emergence of these things, and can form itself into a unit that is good at spotting such kinds of powerful individuals emerging and trying to control that. So, yes, I worry about this, and that's part of the programme of concern that I have: we should be thinking about how we would build a robust community of artificial and natural agents capable of this kind of resistance. Because I can't see any other way; you can't programme a guarantee on this.

If you liked this episode, don't forget to subscribe to the Thinking Out Loud podcast or to the Practical Ethics channel on YouTube.