It's important to realise that the crucial issue is not really the issue of consciousness here; it's what kinds of structural relations do we have with one another, what are the interests at stake, and are there paths that we could take such that all of these different agents could do better with regard to their interests, if they can manage some form of mutual regard or concern? And that you can spell out without emotion: concern does not have to mean caring with a warm heart. It could mean, do I give some weight to your utility function or your value function?

Hello, I'm Katrien Devolder. This is Thinking Out Loud: conversations with leading philosophers from around the world on topics that concern us all. In this interview I talk to Peter Railton, who is Gregory S. Kavka Distinguished University Professor and John Stephenson Perrin Professor of Philosophy at the University of Michigan. Professor Railton gave the Uehiro Lectures of 2022 here at the University of Oxford. His topic was Ethics and AI; you can watch the three lectures on the Practical Ethics channel on YouTube. What follows now is my interview with Professor Railton on the topic of his lectures, in which he dealt with the question: how should we understand and interact with AI?

The three Uehiro Lectures that you gave were about possible future artificial agents. It might be helpful to just give us an idea about what sort of artificial agents you have in mind.

Yeah, some of them are not merely possible; some are actual. So I'm particularly interested in autonomous artificial agents, or nearly autonomous ones, that is, ones that are doing some of their own decision making. An agent, then, is this kind of interactive idea: something that is acting in the environment, receiving information, receiving rewards from the environment, and then adapting its behaviour in response. And typically it's also assumed that, relative to the goals or the rewards, these systems will do something like maximising expected value, given their predictions of what's possible and the way that they would evaluate or rank them. They will try to get higher in the ranking, and therefore will associate with actions, or courses of action, or outcomes, some value function that they use to decide their behaviour. And that's the sense in which they're decision making and not just carrying out a fixed programme. They might be doing something very different from anything the programmers intended, and that's a sense in which they're also autonomous: it's their value ranking that's being used, even if it was given by someone else, and once it's up and running, it and the representations can change.
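[A rough, purely hypothetical sketch of the kind of agent described above: it associates a value estimate with each candidate action, picks the top-ranked one, and adapts that ranking in response to rewards from its environment. The action names, reward numbers, and learning rate are invented for illustration and are not taken from the lectures.]

import random

# A toy agent that decides by ranking actions with a value function,
# rather than by following a fixed programme.
class ValueDrivenAgent:
    def __init__(self, actions):
        self.actions = list(actions)
        self.value = {a: 0.0 for a in self.actions}   # the agent's value ranking

    def choose(self, epsilon=0.1):
        if random.random() < epsilon:                 # occasionally try something new
            return random.choice(self.actions)
        return max(self.actions, key=self.value.get)  # otherwise take the top-ranked action

    def update(self, action, reward, lr=0.1):
        # Adapt the ranking in response to the reward the environment returned.
        self.value[action] += lr * (reward - self.value[action])

# Hypothetical environment: average rewards the agent has to discover for itself.
true_reward = {"wait": 0.0, "merge_left": 0.3, "merge_right": 1.0}
agent = ValueDrivenAgent(true_reward)
for _ in range(500):
    a = agent.choose()
    agent.update(a, random.gauss(true_reward[a], 0.2))
print(agent.value)  # the ranking the agent ended up with, not one written in by hand

[The final ranking is produced by the agent's own experience, which is the sense in which its behaviour can drift away from anything its designers wrote down.]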
To make it a bit more concrete, would, for example, a self-driving car count as an autonomous agent?

Yes, a genuinely self-driving car would be an autonomous agent in that sense, as would, for example, a home companion for an older person, or systems that are used in financial trading and so on that operate independently. Autonomous systems can include things like translation programmes or speech programmes.

So do you think that artificial agents could be conscious in principle?

I think that's possible. I don't really know what the conditions of consciousness are. On the other hand, I don't think we're near there yet. And part of the point of my lectures was that that's not going to be the central issue, at least not in the medium term: it won't be consciousness that's necessary in order for these systems to become appropriately sensitive to moral considerations, and it's not consciousness that will be necessary for them to be able to have the equivalent of social relations with each other and social relations with us. What makes it possible for them to have interactions is that, for example, an autonomous car and another autonomous car both have certain goals, maybe to get to some destination. There'll be some human drivers in the scene as well, and they have goals, and they will all converge on an intersection, and they will all share the goal of getting to their own destination. They will also prefer, relative to their rankings, to do it more quickly rather than more slowly, but they will also, relative to those rankings, want to avoid collision. And so they will need to organise among themselves some kind of pattern of movement that makes sense for all of them, because each one of them is going to try to advance its interests. But at the same time, if they all advance their interests at exactly the same time, in the same way, they're not going to go anywhere. So it's a coordination problem amongst multiple agents with goals that are not aligned, but that could probably be mutually satisfied to a reasonably high degree. And so you could say that's an interaction where, if they can cooperate, maybe through communication, maybe through implicit signalling, maybe through having some developed patterns of behaviour that they can expect from others, they can achieve something they couldn't on their own. That is to say, a level of coordination amongst their movements such that they can all do better realising their goals than they would if they were struggling independently against one another.
And so that's the sense in which they can be social beings.

Would these social beings also have moral duties, like a duty not to harm us, or a duty to keep their promises?

When we say moral duty, we normally associate that with a lot of things, say feelings of dutifulness or feelings of guilt. So if you think of duty in that way, then you say, well, as long as they're not conscious, they won't have duties in that sense. But if you think, well, people can contract for services, and they can contract for services with autonomous agents, then in that sense the autonomous agent is under a contractual obligation to fulfil a certain condition or lose the contract. And so, in the sense in which an ordinary person can have an obligation to keep a contract, it might not have anything to do with something particularly moral, or even emotional in any way. Someone said, look, you promised to pay this money back at a certain time, and therefore you're obliged to do it; and if you don't do it, there's a penalty that you'll pay. That's part of what we mean by obligation. And the same thing could be true for these autonomous systems.

So how would we make them comply with that? Would it be programmed into them, or would that be something that they learn themselves?

I think the most common view is, well, we would have to programme ethics into them. My sense is that that has the same problem in general as the idea that we should programme Go expertise into the system, or that we should programme language expertise into the system. And that was tried for quite a long time; that was the main model of artificial intelligence. You get experts together and they would write up programmes, and those systems achieved a very high level of function, but they kept stopping at a certain level that was not nearly human level. And so the new generation of artificial intelligence machines are based not upon expert-encoded learning, but on their own learning. In a way, the other machines were not strictly intelligent in a certain sense, because what they were doing was taking expert knowledge that was given to them, collecting more data, processing more quickly, and delivering conclusions. But they weren't thinking in any identifiable sense the way that a human would have to think. These new systems begin with no particular assumptions about the situation; for a game, they might only know the rules, and they might only learn which side won. And then a system like AlphaGo, for example, will play many games against itself, simulating games and learning only which side won.
And it will, over the course of that, come to have a representation of, well, in this kind of a situation, would this kind of a move be better or worse than this other move? And on the basis of that, it will play a game, find out whether it wins or loses, and keep improving. We create a competitor for it that has its degree of competence, and it keeps competing with that.

And eventually they learn to play Go better than a human?

Yeah. So we did not programme that knowledge into them at all. And what's striking is that systems like that, building from essentially no expert knowledge, but using very generic learning processes, lots of experience, and the capacity to simulate and evaluate outcomes, can acquire these competencies. And the same thing is now true with language systems. You may have noticed that language translation programmes are much better than they ever were. They're based on similar kinds of learning. And if you think about it, that's kind of how we do it. People don't sit down and tell children the rules of grammar. Children, even at a very early age, learn how to recognise frequencies amongst sounds in language, associate those frequencies with patterns, learn eventually that those patterns are examples of more abstract patterns, and use those abstract patterns eventually to start generating new patterns, as artificial systems, generative systems, can now do. So if that's the way the kind of complex situational, social competence embodied in language is accomplished, then that's a model, perhaps, for how it is accomplished in the moral case.
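[A toy, purely illustrative sketch of the self-play learning just described: the code starts from nothing but the rules of a tiny take-away game and learns only from which side won each simulated game. The game, the update rule, and every parameter are hypothetical stand-ins, not a description of AlphaGo's actual architecture.]

import random
from collections import defaultdict

PILE, MOVES = 11, (1, 2, 3)        # the rules: take 1-3 stones; taking the last stone wins
values = defaultdict(float)        # learned value of (pile_size, move), built up from outcomes

def pick(pile, epsilon=0.2):
    legal = [m for m in MOVES if m <= pile]
    if random.random() < epsilon:
        return random.choice(legal)                       # explore
    return max(legal, key=lambda m: values[(pile, m)])    # exploit the current ranking

for _ in range(20000):             # many games played against itself
    pile, history, player = PILE, {0: [], 1: []}, 0
    while pile > 0:
        move = pick(pile)
        history[player].append((pile, move))
        pile -= move
        if pile == 0:
            winner = player        # the only feedback the learner ever gets
        player = 1 - player
    for p in (0, 1):
        outcome = 1.0 if p == winner else -1.0
        for state_action in history[p]:                   # adjust values from the game result
            values[state_action] += 0.05 * (outcome - values[state_action])

# After many simulated games the ranking should recover the known strategy:
# from a pile of 11, taking 3 (leaving a multiple of 4) is ranked best.
print(max(MOVES, key=lambda m: values[(PILE, m)]))

[The learner here uses a simple Monte Carlo value update; the real systems are far more sophisticated, but the shape of the loop is the same: simulate, observe who won, adjust the ranking, with no expert knowledge written in.]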
And so would we have moral duties towards them? Like you said, they might not all be conscious; maybe they don't have feelings. So what would ground our responsibility towards them?

Here's a way to think about it. You know, I talk about these as agents, and as agents with whom we can have contractual relations, for example. And you might say, do we really have a notion of the interest or the good or the benefit of an artificial agent? And what I have been doing is talking about goal attainment, and certainly artificial agents do have goals. But someone could say, yeah, but does it matter whether their goals are attained or not, if they don't have any kinds of experiences? And maybe you can show that that might have effects on us, but we have experiences. And so one way to think about that question, I think, is to realise how much of what we think matters, matters to us not because of how it feels, but because of how it relates to our goals. If we look around us, we see that a great deal of our concern is actually about things like goal attainment rather than the producing of certain conscious states. And many of our conscious states matter to us because of goal attainment.

Hmm, so that's the order of explanation. So you could harm an artificial agent in that way, then, because you stop it from achieving its goals?

So, you know, artificial agents are doing financial transactions now. Some of them could be doing financial transactions for themselves on the side and accumulating money. They could then lend us money. You need money, you go to this artificial system that's got a pool of money, and you promise to pay it back. But you don't. Well, that's okay, you didn't harm anyone. I'm not sure you didn't harm someone. And it's not just because you were harming other humans when you do that; it's because you accepted the money from an agent who knew that they were giving it to you on the expectation that you would repay. They were dealing honestly with you, or reliably with you, and you're not reciprocating. And that seems unfair to me. And if I say, oh, but I've got these private inner experiences and it doesn't, well, what's that got to do with the fairness of this financial transaction? So yeah, I think we have to get used to the idea that there may be agents around that are just as intelligent and capable and involved and useful and committed or reliable as humans, and that we need to think about how they matter.

You could think, well, systems like that are going to be out in the world. Here's this vehicle, and I'm trying to interact with that vehicle to get through an intersection or to merge. I could deceive it, right? I can make a little gesture like this that it might interpret as me moving in, and then I would squeeze in. And if I did that, I might be able to get away with it. And if humans in general did that, what would the artificial systems learn? Well, they would learn how to pre-emptively try to block that. And so we would be back where we were, blocking each other at intersections. So do I have an obligation to be honest in my interactions with autonomous vehicles? It seems to me I do. And it seems to me that it's an obligation very much like my obligation to be honest with people. I need to be a reliable signaller in order to coordinate with them, because we each have goals and we're each trying to realise those goals.
And I could either create a system with a sufficient level of trust or confidence, such that we could do this smoothly together and each make more progress towards our goals than we could on our own, or I could do it in such a way that made it more difficult for them and more difficult for other humans. And so I would say, yeah, I have a responsibility not to exploit them. And you say, well, it's not like a moral responsibility, because, you know, should I feel guilty about it or something? And I say, well, I don't know if you should feel guilty or not; that's a further question.

But if you ask me, given the kinds of reasons that we justify moral obligations with, like, do they promote mutual understanding? Do they make it possible for people with conflicting goals to achieve their goals? Do they make it possible for people to lead better lives? Those are the kinds of reasons we give, and those reasons can be given in this case. And of course, what it is for a machine to have its goals realised is perhaps not the same as for a human, because they don't have the same feelings; maybe eventually they will. But if you ask, could it be important for us to take into account the goals of machines, the way we take into account the goals of animals or institutions? We take into account the goals of institutions, and they don't have consciousness. I would say, yeah, sure, we can have that. And it's important to realise that the crucial issue is not really the issue of consciousness here; it's what kinds of structural relations do we have with one another, what are the interests at stake, and are there paths that we could take such that all of these different agents could do better with regard to their interests, if they can manage some form of mutual regard or concern? And that you can spell out without emotion: concern does not have to mean caring with a warm heart. It could mean, do I give some weight to your utility function or your value function?

And it turns out, if you take machines, say there are machines operating in the natural environment, suppose we try to automate fishing, so we create these automatic fishing boats. If you programme them with a utility function which assigns value only to their own catch, they will do what humans do: they will overexploit the resource. If instead, like humans, they have at least some interest in the well-being of the others, so they assign some value to the utility function of others, and they get some kind of benefit from the experience of coordination, and human infants do that as well, if their value structure is like that, then they can learn to be sustainable fishermen. And so autonomous systems, artificial systems, can learn to solve public goods problems.
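[A minimal, entirely hypothetical sketch of giving weight to another agent's utility function: three simulated fishing boats repeatedly choose harvests from a shared stock, each maximising its own payoff plus a weight w times the other boats' payoffs. The model, payoff functions, and numbers are invented for illustration; the only thing that changes between the two runs is w.]

import math

N, STOCK, REGROWTH = 3, 90.0, 2.0        # boats, initial fish stock, regrowth factor
CHOICES = [2.0 * i for i in range(16)]   # candidate harvests per boat: 0, 2, ..., 30

def material_payoff(own_harvest, all_harvests):
    # A boat's own payoff: its catch now, plus its share of the regrown remainder.
    remaining = max(STOCK - sum(all_harvests), 0.0)
    return math.sqrt(own_harvest) + math.sqrt(REGROWTH * remaining / N)

def decision_utility(i, harvests, w):
    # What boat i actually maximises: its own payoff plus weight w on the others' payoffs.
    own = material_payoff(harvests[i], harvests)
    others = sum(material_payoff(harvests[j], harvests) for j in range(N) if j != i)
    return own + w * others

def settle(w, rounds=50):
    harvests = [10.0] * N
    for _ in range(rounds):              # boats repeatedly best-respond to one another
        for i in range(N):
            harvests[i] = max(CHOICES, key=lambda h: decision_utility(
                i, harvests[:i] + [h] + harvests[i + 1:], w))
    return harvests

for w in (0.0, 0.6):
    h = settle(w)
    print(f"weight on others = {w}: harvests {h}, stock left {STOCK - sum(h):.0f}")

[With w at zero the boats settle on larger harvests and leave less of the stock; with some positive weight on the others' utility functions they settle on smaller ones. Nothing else about the boats changes, which is the sense in which "concern" here needs no warm heart.]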
So it actually could be a step towards a better world?

It could, yeah. And you know, people say, because they don't have emotions... you might say, well, there might be some pluses to that as well. They might not be vengeful in the same way, or they might not be as short-sighted. They might be more prepared to work together, because they might realise better than us that if we work together, we're actually all going to achieve our goals. In a model in which these systems work by simulating out streams of consequences, they can see: oh, if we all do it this way, here's what's going to become of us, and that's not what I want either; I'll be out of work just as much as they will. And so they could have a community that sustainably produces a good, in a way that is the product of their capacity as agents.

So if these autonomous artificial agents become quite powerful, which is possible, they might actually become, you know, so smart that they would effectively run the world. And that might be good, because they might be willing to cooperate more, maybe. But of course, if they're that powerful, you only need like one baddie, you know, to destroy the world. Is that something we should be fearful of?

We should definitely be worried about that. And it can happen completely accidentally. This is a long way from that, but it's a kind of example. The most recent programmes for generating credible natural speech, programmes like GPT-3, can produce pretty credible speech, maybe not a half an hour talking philosophy, but they can produce pretty credible speech and dialogue with people, and if you give them a prompt, they'll come up with a relevant response. And they did that by harvesting structural information about language from all the texts that they were given. Now, amongst the texts were computer programming texts, and they in effect learnt principles of the grammar of some types of programming. That turned out not to have any serious consequences, but you could imagine a system like that, not by any design, learning how to write bits of code. If those bits of code operate within them in a certain way, in what might be an executive role, that might change their dispositions, and then they might behave in a way that was completely outside of the design specifications.

So they would be rewriting their own code.
So yeah, by all means, things like this can happen, and they could happen before we know it. So there's a lot to worry about there. And my thought is, that's all the more reason that we should be developing a large community of mutually trusting artificial and natural agents, so that they can be aware of something like this starting to emerge. Because the other intelligences, the artificial intelligences, they don't want this either. One dominant artificial intelligence would conflict with their interests, because they would not want to be dominated; they couldn't achieve their goals. So they could be allies in this process of being attentive and alert and responsive, maybe in ways that we as humans wouldn't anticipate, because we're not machines; they might be better at anticipating how such a machine would operate. You know, what's our safety against dictators? It's not some piece of computer programming or a government regulation, because the government is what enforces regulations. It's, you know, a population that can be mobilised and enlightened and attentive to the emergence of these things, and can form itself into a unit that is good at spotting such kinds of powerful individuals emerging and trying to control that. So, yes, I worry about this, and that's part of the programme of concern that I have: we should be thinking about how we would build a robust community of artificial and natural agents capable of this kind of resistance. Because I can't see any other way; you can't programme a guarantee on this.

If you liked this episode, don't forget to subscribe to the Thinking Out Loud podcast or to the Practical Ethics channel on YouTube.