Good evening, everybody. Welcome to what is, sadly, the last in a series of three Uehiro Lectures given by Peter Railton on ethics and artificial intelligence. Most of you have been to the other two lectures, so you'll know that Peter will speak for about an hour and then we'll have some time for questions, if you're lucky. We should have some time for questions, and we will be able to take questions online as well as from the floor. In here we'll be using the roving mics again, and for those online, I think you should put your questions into the chat, and then Johnny P will kindly report them to us. So without further ado, over to Peter.

Well, thank you, and thank you to the Uehiro Centre for the invitation, the chance to be here with you. I've enjoyed it so far and I hope you will continue to enjoy it. I was trying to think of a title for this lecture, and it occurred to me that only a recursive title would do, which is: living amongst artificial agents who live amongst artificial agents, who live amongst natural agents, who live amongst artificial agents, and so on. Because it's really about that recursion. At the very beginning I'm just going to go over a little bit of the first lectures, to help those who might not have been there, and because memory may not always serve.

As I mentioned at the beginning, there are a ton of ethical challenges about A.I., many of them very large, and I'm not trying to address them all. I'm really looking at just this last one, the sixth one: the problem of not-so-super machine intelligence, trusting A.I., entrusting A.I. with some inappropriate tasks or degrees of independence. Not just because humans are mercenary, though indeed they are, but because they may not know any better. There may be error and inadvertence. And so my question is: how could we conduct this relationship, living amongst artificial agents who live amongst artificial agents who live amongst natural agents? How could we do that in a way that was constructive, that was mutually beneficial rather than mutually detrimental or very risky?

And so my focus has been on the idea that these systems, these autonomous or semi-autonomous artificial intelligence systems, might be part of the solution rather than part of the problem. Not just because we could use them as tools, which we certainly will do, but because such systems would be autonomous. They would be agents; they would be capable, I believe, of a kind of apt sensitivity and responsiveness to morally relevant features of situations, actions, agents and outcomes.
And in that we are going to find, I hope, a very important source of mutual confidence and trust. Such responsiveness, I've claimed, is not just important for their relations to us, but for their relations amongst themselves. And if this is right, if anything like this picture is right, then it should help somewhat with the other problems as well, because those involve, amongst other things, the problem of developing trustworthy A.I. at their heart.

Now, why is this an auspicious week for this talk? Because the folks at DeepMind announced Gato as a first working example, still incomplete, of a general artificial intelligence. And what's that? Why is that special? Well, it's because it's intelligent: it's capable of learning and problem-solving in a wide range of novel situations. That's the general idea of intelligence. But it's general in the sense that it's competent in a wide range of tasks: language, dialogue, labelling images, motor control, game playing, testing causal models and so on, and doing so autonomously, that is, without being reprogrammed in between the tasks. And it does this by uniting in a single big model all the elements needed for these various tasks and taking advantage of the synergistic power of that to do them.

And so this is what people have been hoping for for a while, and people said, well, why would they announce this so early? It's, you know, still fairly mediocre at some of these tasks. And one answer is that if anything like this works, even as a sort of proof of concept, then the fact that it's badly trained or needs more information or so on is not the key fact.

Now, my claim has been that in humans our linguistic, epistemic, causal and social-moral competencies are in some ways like that. That is, they're all very entwined. They're entwined in a complex model of the world, in a bundle, not separate strands. And they develop through infancy and through the rest of our lives in conjunction with one another, in pace with one another, and I don't think they would be fully realised without one another. And so that's another reason for being especially interested in artificial general intelligence, because the full realisation of these special tasks might very well require more general intelligence. And I'm going to argue that; Sutton and others have argued that in the end it will be generalist artificial intelligence that succeeds best, even at the well-defined tasks that A.I. is now good at. And in this respect, Gato is still, I gather, a work in progress; there are specialised programmes that apparently do at least as well as it does or better.
But for our purposes, the question that's especially interesting is how general intelligence, artificial general intelligence, might make a difference to questions of ethics and A.I., questions of A.I. safety. And you might think, for example, that one very important feature of artificial general intelligence is that these systems might be more interpretable, because they combine language skills with motor skills, with object-identification skills, with dialogue skills and so on. They might be able to represent what they're doing in a way that is, for example, explainable to us in a better fashion than the machines now manage. And so the idea is that if you bring together model-based learning with competencies like dialogue, categorising objects, using that to navigate or manipulate the environment, collaborating with humans on shared tasks and so on, you begin to make it more and more realistic to think: this is an agent now. It's not a human agent; it doesn't have all the features of a full-fledged human agent. But the more it does this, the more general it seems, and the more it seems that the words it generates are connected with the world: they're used in modelling the environment because they're associated with a variety of tasks, they're used for signalling to others for communicative purposes, they're used in action guidance and in learning. And, you know, meaning isn't just use, I suppose most of us think that, but each of these is a step toward the idea that this may be more like meaning than it would be if it were just a word programme. Or it may be more like objects than if it were just an image-identification programme, because objects are three-dimensional, and now it's treating these as three-dimensional through its motor activities. And so that's an important sense in which general artificial intelligence gives us more confidence that we are dealing with something that looks like an intelligence and an agent. And I need that very much for the argument that I'm giving.

Now, of course, if you look at the core, this big network at the core, it's just a big association structure. And that will tempt many to say: look, there's still no real understanding, there's just all this association, and that can't be human-level competency, because humans have understanding. And that's an important reminder; we're only so far down this path. But it's worth noting that at least my brain seems to be a neural network of associational connexions. Now, it's not made out of silicon, it's made out of protoplasm. It's got some more differentiated structures than these nets.
It's got a lot more structure than these nets, and that may be very important, but it's still operating on this basic associational principle that connexions are strengthened the more frequently they're activated. That's the basic principle: neurones that fire together wire together. And so the question is: when we use that complicated net of ours, associational as it is, is that understanding? We might disqualify it as well. But then at least we wouldn't be treating ourselves as in some ways above and beyond the possibilities of artificial agents, because we ourselves are not there yet. Anyhow, I'm going to bracket such large questions about the nature of understanding, because my challenge is really to emphasise that, short of understanding and short of something like full agency or full moral agency, it's still quite possible for these systems to become aptly sensitive and responsive to morally relevant features.

And one interesting thing about these kinds of systems is that you can use them, let's say, to model human moral intuitions; people do that. They set up an internet site, lots of people send in their intuitions, and they try to make a model of that using some kind of fit of a neural network. We're learning something about the structure of our own moral values, our own moral beliefs, in that way. But if that network could then be associated with action and with speech and with behaviour, then we'd have a better sense that we were getting at something that looked more like a competency.

So consider language now. If an artificial system is going to be genuinely able to achieve human-level competence in language, that's not just fluent, topically appropriate speech, but being attuned to things like conversational norms, interpretive charity, sensitivity to the distortions that coercion, deception or power imbalance might bring to the content of a conversation, identifying speaker motives or intent and, of course, compensating accordingly, identifying deception, attributing appropriate authority to others' use of words, and so on. That's language competency. It is a very complex bundle. It's normative, it's epistemic, it's social, and it's got a lot of moral content as well. And so my sense is that really to build competent, humanly competent speakers, we will have to build a system with that kind of a bundle. And so a special-purpose language programme is always going to look like a tinker toy by comparison, limited in its abilities.
So if creating artificial agents with broad human-level competencies is the aim, and that indeed is the aim, then moral competencies, I'm saying, will be a part of that. Intelligence is a capacity to learn and solve problems in an open-ended array of situations, and moral issues arise as problems that humans face in an open-ended array of situations. Using our moral capacities, we often solve these problems. I'm going to talk about some of the ways in which we do, but that means achieving human-level competency in solving social problems is going to involve this same capacity to represent morally relevant features of situations and to use them appropriately.

And as we saw early on, this is reflected, for example, in the substructure of the mind, in the way these capacities are reflected in the brain's activities. So here is the general semantic network compared to the default network; we saw this earlier as the default network compared with memory tasks, autobiographical memory, envisioning the future, simulating theory-of-mind tasks and moral decision-making. It looks like a bundle, and indeed it looks like a core in which there is something like a generalised model that is allowing us to interpret the past that we have, to imagine possible futures, to understand what's going on in other people's minds and to carry out moral decision-making. And so it looks something like the structure of general intelligence. And if it makes sense that these tasks are bundled, then it makes sense that the brain will handle them in this bundled way, rather than with special moral modules and so on.

Now, that looks like one of these foundation models in vivo: this capacity to use experience, flexibly recruiting memory, generating possible responses, simulating outcomes, assessing them in terms of how they would affect people and so on. If that's how we manage to develop and use such capacities as our language capacity, our moral capacity, our capacity as epistemic agents, then artificial agents are going to have to do so as well if they're going to have human-level competence. And along the way, they will have to be responsive to epistemically, linguistically, socially and morally relevant features of situations, actions, agents and outcomes.

So what do I mean by "agent" in talking about these artificially intelligent systems? I don't mean a deep notion of agent; I don't mean core consciousness or a sense of self. I mean a system of the particular kind, well studied in cognitive science, of an agent interacting with an environment.
The agent has a model of the environment. It has a goal, a reward function. Those two are combined to generate selected actions. Those actions are then passed to the performer, the actuator; the agent performs the action, and the environment returns a response in terms of, well, how did things turn out? And that then becomes data for updating the model. It's much more complicated than this, actually; there are all kinds of internal loops that are fascinating in themselves. But the rough idea is that agents, by their very structure in this sense, are modellers, they are learners, and they engage in action as learning and not just as performance.

And what we saw when we looked, for example, at evidence from neural recordings of macaques is that they perform actually quite precise probabilistic calculations of rewards in their environment. Moreover, if you look to the right-hand side, and I will draw attention to this later on, they don't just do expected-value calculations, which is what the economist would recommend to them. They also do risk calculations; they independently encode risk. And you might say, well, why would you bother doing that? Don't rational agents just act on expected utility? And you can say yes, that's what they said, right up until the global financial crisis, when it was clear that accumulated risk offset the gains in expected utility. So animals live in an environment in which they have to survive; they can't ignore risk. And artificial agents, if they're going to be at least animal-level competent, and certainly if they're going to be human-level competent, will have to represent not only expected value, as they currently do, but also risk in this way.

Here we saw work with rhesus monkeys indicating that they have, in fact, utility functions. These are abstract functions: this doesn't represent any particular quantity of juice or any particular level of risk. It represents combinations of quantities of juice, levels of risk, banana slices and grapes, but it represents them all in a common currency, like utility. And the utility function has the shape you would expect: it is risk-seeking when there is little at stake and risk-averse when there's a lot at stake. And this looks very much like the kind of function you would expect a prudent agent operating in the world to develop. And the picture that emerges from this is the idea that our action, as we saw, is guided by an evaluative causal representation of the environment around us, and with that representation we are able to select actions and then compare outcomes with what we expected.
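To make that loop concrete, here is a minimal sketch, with invented action names and payoff numbers rather than anything from the lecture, of an agent that keeps a simple model of outcomes per action and scores actions by expected value minus an independent risk term, instead of expected value alone.

```python
import random
from collections import defaultdict

class RiskSensitiveAgent:
    """Minimal agent-environment loop: the model is a running mean and variance
    of reward per action; actions are scored by a mean-minus-risk trade-off."""

    def __init__(self, actions, risk_aversion=0.5, epsilon=0.1):
        self.actions = actions
        self.risk_aversion = risk_aversion   # how strongly risk offsets expected value
        self.epsilon = epsilon               # exploration rate
        self.stats = defaultdict(lambda: {"n": 0, "mean": 0.0, "m2": 0.0})

    def score(self, action):
        s = self.stats[action]
        variance = s["m2"] / s["n"] if s["n"] > 1 else 0.0
        return s["mean"] - self.risk_aversion * variance   # expected value minus a risk term

    def select_action(self):
        if random.random() < self.epsilon:            # occasional exploration
            return random.choice(self.actions)
        return max(self.actions, key=self.score)      # otherwise risk-adjusted best

    def update(self, action, reward):
        # Welford's online update of the mean and sum of squared deviations.
        s = self.stats[action]
        s["n"] += 1
        delta = reward - s["mean"]
        s["mean"] += delta / s["n"]
        s["m2"] += delta * (reward - s["mean"])

def environment(action):
    """Hypothetical environment: 'safe' pays 1 reliably; 'risky' pays more on
    average but with high variance."""
    return 1.0 if action == "safe" else random.gauss(2.0, 4.0)

agent = RiskSensitiveAgent(["safe", "risky"])
for _ in range(1000):
    a = agent.select_action()
    r = environment(a)
    agent.update(a, r)        # the outcome becomes data for updating the model
print({a: round(agent.score(a), 2) for a in agent.actions})
```

With the risk-aversion weight set to zero, the agent would simply maximise expected value and favour the risky option; with the weight shown, the variance term pulls it toward the safe one, which is analogous to the independent encoding of risk alongside expected value described above.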
Now, highly intelligent animals don't simply have that kind of first-person, egocentric perspective. In fact, they map their physical space non-egocentrically as well as egocentrically. This was a big discovery, not that long ago. Rats represent space not only in terms of location and where they are, but also in terms of a grid mapping out the space around them. It's a non-egocentric representation, and these representations afford the animal a very substantial degree of autonomy from current stimulation. It enables the animal to engage in the discovery and planning of novel actions, and it enables them to become optimal foragers in their environment.

So here are some of the initial experiments with rats, showing the neurones firing either at a specific place or in a grid-like pattern. Here is an example of the representation of space in the hippocampus of a rat that's been running a maze. And what we saw was that it's not just present when the rat is running: it's present when the animal sleeps, and it repeatedly activates that maze in its sleep. It does so in a way that seeks information, that is to say, it spends more time activating parts of the maze it didn't explore; it moves in directions in the maze that it didn't move in during the day. And it also constructs novel paths, because it has a representation of space that enables it to represent not just the channels it followed, but the spatial relations amongst the locations along the channels, for example diagonals and shortcuts.

And then finally we saw, and this is a slide I try to show as often as I can because it gives the animals credit for what they're doing, that when the rat reaches a choice point in the maze, it has an idea of what lies ahead. It searches to one side and to the other side mentally before it takes any steps. That's an efficient thing to do if you're an animal worried about energy. And it combines representations of expected value with representations of space in such a way that, by going back and forth, it is actually weighing the alternatives, and where it discovers the stronger weight, it will act and move in that direction. And so a rat running a maze is acting as a rational agent could be expected to do, forming the kinds of representations we'd expect and using them in the ways that we would expect.

Now, highly intelligent animals, including humans, also construct non-egocentric maps of their social environment. They represent different behavioural dispositions of the individuals around them, and values associated with the behaviour of those dispositions, and these representations are learnt through a kind of reinforcement learning.
And these are third-personal representations. It's not "was this monkey good to me?" It's "is this monkey good toward other monkeys? Would this monkey therefore be a good mating partner or a good alliance partner?" That's why these non-egocentric representations are so valuable: that information is vital if they're going to initiate a new relationship, and they need to distinguish the monkeys who help from the monkeys who don't help. And they're now finding that animals do things like helping helpers, and helping those who help others rather than only those who help themselves. So there's a lot more going on there than we thought, and this is a non-egocentric representation of relevant characteristics, we would say morally relevant characteristics, of the social environment.

This is also a kind of autonomy that they have, because they are not bound to what they've done in the past or the options that they've exercised in the past. And so within the social environment, just as within the physical environment, they are able to navigate the space in ways that help them realise goals, in fashions they had not explored before. And this modelling, which we should expect in highly intelligent artificial agents, brings with it the same kind of need for autonomy and efficacy in navigating the physical and social environment. So we should expect artificial agents who have even animal levels of competence to be doing this kind of non-egocentric mapping of evaluative features of their social landscape. And we should expect, moreover, that such an agent needs it, just as the rat needs it or the monkey needs it, in order to have the most effective and efficient pursuit of goals in that social space.

And so what I've been trying to argue, then, is that features like linguistic features, epistemic features, social features, features having to do with helping and harming, morally relevant features: these are not being conceived by the animals as reasons for action, they don't have a concept of reasons, perhaps, but they're being used as reasons for action, and they're playing the role that reasons for action should play.

So, back to our friends the autonomous vehicles, because they're my example here of autonomous, artificially intelligent systems. What kinds of sensitivity or responsiveness to reason-making features would be involved in achieving human-level competence in driving?
And importantly, how much would such sensitivity and responsiveness contribute to the safer and more trustworthy character of these vehicles? In other words, if they were able to be responsive to these features, would they in fact be safer? Would they be more trustworthy, and would they be better at realising the kinds of goals that they're trying to realise, like getting to destinations? It seems like a lot to ask, to get morality into this, in this sense: not morality in the sense of a highly normative system, but a system of responsiveness to morally relevant features. It seems like, why would you need that in order to drive? And the answer is that humans need it in order to drive, and they need it in such a way that human-level competence in driving would presuppose that ability as well.

So last time we looked at merging. Here are some typical merging problems, and here's an example; take a look at the lower right-hand corner. You're trying to merge onto a highway, and you notice that the following car, the second blue car, seems to be allowing a gap in front of it. Now, does that mean it's signalling to you to merge in? Has it slowed down in order to signal that you can merge in? Or is it just speeding up, and it has a gap to make up between it and the car ahead of it? What is the intention of that car, and where should you go? Should you try to move on the trajectory that takes you closer to that car, or more distant from that car, which is going to be less likely to interfere with the planning of that vehicle, given what the vehicle is signalling that it's doing? And so recently there has been the development of techniques for understanding how merging can take place that reflect this kind of complex intentional structure.

Now, for humans, successful merging involves competing interests. Each of us might want to get to our destination faster, but we also have some shared interests in smooth traffic flow, avoiding collisions and so on. How do we reconcile these in any given situation, given that we don't know the other driver and may not interact with the other driver again? What do we have to do? Well, we have to use whatever we can to try to determine the intentions of other drivers. We have to assess the evidence that's available to us. We have to look for whether there's any communicative action going on with the other driver. We have to think about whether we're perhaps causing a slowdown behind us if we delay, and we have to be mindful of the traffic that's moving. And so we have to know what the expectations of all these individuals around us are,
and how we can put together those expectations in such a way as to enable us to do a smooth, safe merge. And so it involves heavy use of theory of mind. And that suggests that if you're going to build a car capable of human-level competency in merging, it's going to need something like theory of mind. And indeed, we find something like that.

So here is another merging situation that you all understand. There's a construction zone; you're supposed to be nice and neatly in a single lane as you get to it. The autonomous vehicle sees an opportunity: it could scoot ahead on the open road. After all, it's been sitting in traffic. Should it scoot ahead on the open road? And if so, how should it signal to the other car that it's ready to merge? And how should it respond to the other car's response to its attempt to muscle in like that? And why is it not working, and why should it try a different strategy next time? All of that is the kind of stuff that these machines have to figure out.

And this is a situation they could get into. This is not some unheard-of catastrophe; this is cross-town traffic in New York on an ordinary day. Think of how much intentional information the drivers of those cars, and the pedestrians, need in order to get out of this situation while continuing to move across the intersection, which they do all day long, day after day. That's an ability that could not possibly be accomplished by a driving module. It's going to have to be accomplished by a capacity for understanding human social interactions, the ways of getting people pissed off at you, the ways of getting people to want to cooperate with you, trying to elicit their help, trying to signal to the pedestrian that you're really not trying to run that person over.

So autonomous vehicle merging, then, is going to have all the problems of conflicting goals, problems of communication and signalling, trying to be reliable in communication or trying to detect deceptive communication: this is New York, after all. They must solve these problems of predicting behaviour and gauging evidence: what evidence could I give to others that would enable me to do this task more successfully and smoothly, to have safe, human-level competence in merging in an actual situation one might encounter any day at any intersection in Midtown Manhattan? So no simple dynamical model will suffice. Getting the solution will require attributing goals and expectations to other drivers, autonomous or human, looking for informative signals, estimating their behaviour and so on.
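One concrete way to do that attribution of goals, and a toy version of the Bayesian estimation described in a moment, is to keep a few candidate intentions for the other car, score how likely the observed behaviour would be under each, and update one's beliefs accordingly. The candidate intentions, priors and likelihoods below are invented for illustration, not taken from any actual driving system.

```python
# Toy Bayesian update over candidate intentions of another vehicle.
# All priors and likelihood numbers are illustrative assumptions.

priors = {"yielding_to_me": 0.3, "closing_the_gap": 0.5, "distracted": 0.2}

# P(observation | intention)
likelihoods = {
    "yielding_to_me":  {"slows_down": 0.7, "speeds_up": 0.1, "holds_speed": 0.2},
    "closing_the_gap": {"slows_down": 0.1, "speeds_up": 0.7, "holds_speed": 0.2},
    "distracted":      {"slows_down": 0.3, "speeds_up": 0.3, "holds_speed": 0.4},
}

def update(beliefs, observation):
    """One Bayes step: posterior is proportional to prior times likelihood."""
    unnormalised = {h: p * likelihoods[h][observation] for h, p in beliefs.items()}
    total = sum(unnormalised.values())
    return {h: v / total for h, v in unnormalised.items()}

beliefs = dict(priors)
for obs in ["slows_down", "holds_speed"]:      # what we actually see the car do
    beliefs = update(beliefs, obs)
    print(obs, {h: round(p, 2) for h, p in beliefs.items()})

# The merging policy can then condition on the posterior, for example merging
# only once the probability that the other car is yielding is high enough.
if beliefs["yielding_to_me"] > 0.6:
    print("merge into the gap")
else:
    print("hold back and keep gathering evidence")
```

In a fuller system the candidate value functions would themselves be inferred, for instance by the inverse reinforcement learning mentioned next, rather than hand-written; the sketch is only meant to show the structure of the inference.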
So this is a kind of non-egocentric mapping of the social environment. And one approach now being explored, I'm happy to say, in the British Isles, is to deliberately develop artificial intelligences for driving that use rational-agent models of other vehicles in situations. And what that means is that the system tries to impute to the behaviour of other vehicles the intention behind that particular behaviour. And that means they're doing what's called inverse reinforcement learning from the behaviour: they're trying to infer what the value function of that vehicle effectively is, and then trying to use something like Bayesian estimation to ask, well, is it more likely, if it had that value function, that it would slow down or speed up here, and how is that going to affect the way in which I behave? And vehicles like this can, in simulations, do a better job of merging than they could with just simple dynamic calculations, because they can impute structure of a general kind to a situation. And these are agents; the artificial ones and the human ones are agents in the sense that we saw.

Now you might say, wait a second, we solved chess, didn't we? Well, they didn't solve it, but they got better than any human at chess, and they didn't have to do psychological theorising to do that. They got better than any human at Go; they didn't do any psychology in order to do that. And that's because these are games with bounded boards and only a discrete number of moves possible at any moment; there isn't a complex question of which players you coordinate with at any time. And so a game like chess or Go, complicated as it is, is the kind of game that a machine intelligence can solve without attributing agency to the other player. But the kinds of problems we've been looking at don't seem like they're going to be tractable in that way, and they're going to be open-ended. And the intelligence that's involved is going to have to have basically the features that we've been attributing to animal and human social intelligence.

Think of relations with pedestrians. How do you understand whether pedestrians are saying to you, "OK, I'm not going to slow down for you to pull out this time," or saying, "yes, I will slow down and you can start to pull out"? To understand that from the motions they're making, you have to have a rational-agent model of the pedestrians as well. You also have to know what you don't know. As a human driver, I'm new to a country and I'm at a crosswalk and there are a lot of pedestrians going across, and I think, my god, I'll never be able to pull out.
If I've got a person from the country beside me, I can turn to that person and say: what do you do in this kind of situation? Would you nudge forward a little bit? If you do nudge forward, do people open up a little so that you can go through? How long should I wait before I do that? And that person is likely to have better information than you. And so an artificial driver that is to be as competent as a human driver has to know what it doesn't know, and to know also what it could ask and what it could learn in the situation. And so it will have to be consultative and dialogic, and not simply sit there in its own private mind and try to scrutinise the world. It can gain from communication, and gain from whatever knowledge the humans in its environment have acquired.

And that also requires that these systems have the ability to self-represent, because I can't have a dialogue with a machine about this unless it can tell me why it wants to do the thing that it's doing, why it's trying to do the thing that it's doing. And that requires that the machines also have self-representational capacities: that they can represent the different weights that the different nodes and variables have, and can put into words what weights they're using and ask, are these weights appropriate in this situation? Again, a competent human driver can do that, and will be able to get through the intersection thanks to local knowledge. That's a human competency in driving, and human-level competency in driving involves just this.

These features are also morally relevant features of situations: they have to do with harms and benefits, with ways in which individuals can be put at risk, or ways in which individuals can be helped or assisted or cooperated with. And so we can see why autonomous vehicles are going to have to be responsive to these morally relevant features. Now you might say: OK, yes, they're responsive to morally relevant features, but not in a moral way. It's just rational self-interest on their part; all they're trying to do is maximise some reward function. They aren't doing anything like what a moral agent does. And to a certain extent that's true. They aren't doing what a moral agent does in the sense that they don't have moral concepts, they aren't doing moral deliberation in that way, they don't have moral feelings, they won't feel guilt or shame. But are they rational, self-interested agents?
Or not? And I'm going to argue that they aren't, in that for them to be successful at this driving task they will not be rational, self-interested agents. They will be what I call, and what Hobbes and Hume would call, reasonable agents. And reasonable agents respond to morally relevant features in the way that we hope moral agents will respond to them, not just in the way in which prudent or self-interested people will.

Now, to engage in this little discussion, I'm going to have to talk about these systems as being more or less rational, as having interests, having benefits, having costs and so on. And I realise that can be problematic to people, because they may think: these systems aren't conscious, so how could they have a benefit or a cost if they're not conscious? And I have a whole spiel I could give you about why I think those terms are appropriate. Let's say that what I have in mind here is not what we might think of as a conscious benefit, but it's a kind of benefit that humans have and that is important for human life. And it's a kind of benefit that human institutions can have, even though human institutions are not conscious. And so what I mean here by interest has to do with what goals exist in the situation for the agent, how those goals might balance, how they're related to the odds or the probabilities or the information in the situation. Something will be in its interest if it can improve its situation with regard to those goals; something will be a benefit to it if its situation is improved by it, a cost if it is depleted.

And this idea of cost and benefit is one we use all the time; it's not something I've invented. Ask the head of a college, you know: why aren't you divesting from oil stocks? Don't you know that the oil companies are responsible for a huge amount of pollution? And the head of the college will say: I understand that. I personally wish we could be divested of all carbon-intensive stocks. In fact, everyone on the board wishes that intensely. But I'm head of the board, and our job is to ask what is in the interest of the college, not what is in our interest as individuals. And the interests of the college would not be served by divestment, because it would harm our income and because it would put off our future donors. And indeed, future donors or others could take that head of college to court if he fails to act in the interest of the college. And so you'll have people in black robes sitting solemnly in a panelled chamber asking themselves: was this, or was this not, in the interest of the college? And that's not an interest of any of the individual agents in the college.
Maybe the entire board is about to retire, maybe their pensions are secure, maybe they're not going to benefit at all from this. Maybe they'd prefer to have a green reputation; they hate this, they don't like the students coming to their office and asking these questions. But they say: still, our responsibility is to tend to the interests of the college, and those are not the same as our interests as individual moral agents. So it's not a bizarre notion. It's the notion used in game theory. And I'm happy to discuss this more in the question period, but I dare not spend more time on it right now.

OK, so here we are, back with artificial agents, with their interests, more or less rational relative to those interests in the actions that they select. These are terms of art, but they're anchored in the agency and the general structure of these systems. We can use them to predict and control their behaviour. We can use them in the game-theoretic way to predict how they will interact with each other and what the outcomes of those interactions will be. We can, for example, say: well, if they're rational, self-interested agents, then if there is a Nash equilibrium, they will find it. And how would we describe that? Well, it's a stable state of a system involving the interaction of different agents, in which no agent can benefit by a unilateral change in strategy if the strategies of the others remain unchanged. That's the idea of a Nash equilibrium. And the answer is that if these machines are rational and self-interested, and they interact and they can learn, they will find Nash equilibria and take them. OK.

And that brings us to this man, Hobbes. Hobbes is famously analysed in terms of the prisoner's dilemma. So here is the prisoner's dilemma. One way of understanding the dilemma is this: if you're a rational, self-interested agent, you will look at this payoff table, whether you're prisoner one or prisoner two, and you will reason about what the stable equilibrium would be here, such that whatever the other person does, I could not play a more advantageous strategy. And you've all heard this numerous times already: it would be the strategy of joint defection. And so we would both end up with one unit of value, whereas if we had just cooperated, we would each get three units. And that's interesting, because not only would we each do better individually, but together we would do better: we would have produced six units of value rather than just two. And we could divide those six units up, and we could do this again if we have an iterated game, and we could continue to produce more value.
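Here is a small sketch of that payoff structure. The mutual-cooperation payoff of three units each and the mutual-defection payoff of one unit each come from the talk; the temptation and sucker payoffs of five and zero are standard textbook values assumed for illustration. The code checks each joint strategy for profitable unilateral deviations and confirms that joint defection is the only Nash equilibrium, even though joint cooperation produces more total value.

```python
from itertools import product

C, D = "cooperate", "defect"

# Payoffs as (row player, column player). 3/3 and 1/1 are from the lecture;
# 5 (temptation) and 0 (sucker) are assumed textbook values.
payoff = {
    (C, C): (3, 3),
    (C, D): (0, 5),
    (D, C): (5, 0),
    (D, D): (1, 1),
}

def is_nash(profile):
    """True if neither player can gain by unilaterally switching strategy."""
    a, b = profile
    best_a = max(payoff[(alt, b)][0] for alt in (C, D))
    best_b = max(payoff[(a, alt)][1] for alt in (C, D))
    return payoff[profile][0] == best_a and payoff[profile][1] == best_b

for profile in product((C, D), repeat=2):
    print(profile,
          "Nash equilibrium" if is_nash(profile) else "not an equilibrium",
          "| total value:", sum(payoff[profile]))
# (defect, defect) is the unique equilibrium, with total value 2;
# (cooperate, cooperate) would have produced total value 6.
```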
If, however, we are rational, self-interested agents, we won't cooperate on the first round. And that means that on the second round the other agent, who understands us for what we're doing, and who doesn't know whether there are going to be any more rounds, will defect as well, and we will not get up into that box. Now, if autonomous artificial agents can't get into that box, they're in trouble. And so in order to do so, they must differ from rational, self-interested agents in a distinctive way, and Hobbes told us very carefully what that distinctive way is.

So he was looking at the world around him, at the strife and the religious wars in England, and it was very clear to him that there were two contesting movements that could get locked into non-cooperation. They could decimate the countryside and keep the country weak, whereas if they could somehow or other come to some kind of an agreement, they could have peace and prosperity. And therefore his laws of nature don't say, in the first instance, that you should use all the helps and advantages of war. They say, in the first instance, seek peace; and only if peace is not obtainable should you use the instruments of war. And he thinks that peace is attainable in that situation, and indeed recommends it. Now, that would correspond to cooperating on the first round of the prisoner's dilemma, rejecting the rational, self-interested strategy.

How does the argument work? Well, Hobbes argues that a reasonable person can see that an unsecured first performance of cooperation would be a costly, and therefore credible, signal of willingness to cooperate. And so he tells us you could initiate cooperation with no security of performance from the other individual. It is seeking peace; it's giving peace a chance, as the slogan goes. And in light of the first law of nature, A should do it. Now A, assuming that B is a rational or at least semi-rational agent, should expect B to understand the laws of nature just as well as A does, and to recognise that, according to the fourth law of nature, if you receive a gift out of free grace from someone, you should endeavour not to make that person repent it. That is to say, you should show gratitude for a gift of free grace. And so B, knowing this, will want to cooperate on the next round. And A, knowing that B would know that, will know that B is going to cooperate on the second round, and so, as a reasonable agent with that expectation, will also cooperate. And once they do this, they will be in a stable situation in which they can continue to cooperate as long as they interact with one another.
And you'd say: yeah, but look, suppose there's an end time, there's a horizon out there. A rational, self-interested agent would reason as follows. I know that on the last round my opponent, being rational and self-interested, is not going to cooperate; it's going to defect. And therefore, on the next-to-last round, I should defect. Oh, but the other agent can make that argument just as well as I can, and so on the round before the next-to-last round she will defect, and I should defect. And I can make that argument just as well as she can, and therefore on the round before the round before the next-to-last round I should defect. And so the reasoning unravels, caterpillar-like, all the way back: they talk themselves entirely into spending the next year not cooperating because of a worry about the last round. Now, Hobbes would say that is unreasonable; reasonable people don't act like that. But if they were rationally self-interested in the sense that we've been defining, if that's all that could move them, then that is the way they would argue.

So in a true prisoner's dilemma, of course, agents have to act at the same time; they can't know what the other is going to do. And reasonableness plays a role there, because if A, for example, initiates unsecured cooperation, A can infer that B will understand this in a certain way, because B could see, as well as A could, that defection was the dominant strategy. And so now we have a way in which signals can become reliable for reasonable agents. They can share information and help shape each other's behaviour because they're reasonable. And therefore, even if B defects on the first round, because B didn't see all the way to the bottom of the strategy, or because B did defect and now repents it, as Hobbes would say, and is going to cooperate in this round, A can reason, as a reasonable agent, to that conclusion, and A then will cooperate again in an unsecured way. And once again they will have the cooperative payoff.

And so what else does Hobbes tell us about these agents? They should strive to accommodate themselves to the rest. Upon caution of the future time, a man ought to pardon the offences past of them that, repenting, desire it. You should not be vengeful: yes, B defected on the first round, but I'm not going to exact revenge. In revenges, men look not at the greatness of the evil past, but at the greatness of the good to follow. A toy version of this contrast between the backward-inducting agent and the reasonable, pardoning agent is sketched below.
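In this sketch, which is only my simplification of "rational self-interest" versus Hobbesian "reasonableness" rather than anything from the lecture, two backward-inducting agents defect throughout a finitely repeated prisoner's dilemma, while two agents that offer unsecured cooperation and pardon a past defection sustain the cooperative payoff. Payoffs are as in the earlier table.

```python
C, D = "cooperate", "defect"
payoff = {(C, C): (3, 3), (C, D): (0, 5), (D, C): (5, 0), (D, D): (1, 1)}

def backward_inducer(my_history, their_history, rounds_left):
    # Rational self-interest with a known horizon: last-round defection
    # unravels all the way back, so defect in every round.
    return D

def reasonable(my_history, their_history, rounds_left):
    # Hobbesian sketch: seek peace first with unsecured cooperation, keep
    # cooperating with a cooperator, and pardon a single past offence.
    if not their_history:
        return C                       # unsecured first performance
    if their_history[-1] == C:
        return C                       # reciprocate cooperation
    if len(their_history) >= 2 and their_history[-2] == C:
        return C                       # pardon a lapse that followed cooperation
    return D                           # otherwise, the instruments of war

def play(strategy_a, strategy_b, rounds=10):
    hist_a, hist_b, score_a, score_b = [], [], 0, 0
    for r in range(rounds):
        a = strategy_a(hist_a, hist_b, rounds - r)
        b = strategy_b(hist_b, hist_a, rounds - r)
        pa, pb = payoff[(a, b)]
        hist_a.append(a); hist_b.append(b)
        score_a += pa; score_b += pb
    return score_a, score_b

print(play(backward_inducer, backward_inducer))   # (10, 10): locked into defection
print(play(reasonable, reasonable))               # (30, 30): sustained cooperation
```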
And so A, not wanting to exact vengeance if that means locking into a non-cooperative strategy, will be looking at the greater good to follow: A reasonably wants to initiate and continue cooperation. And so the thing could go along. And this is not a mystery to human beings. Humans have managed throughout their history to be social beings who live together in cooperative arrangements, some very large. Those arrangements can be mutually beneficial and mutually sustained, rather than a sequence of mutual defections in a state of nature. They do this despite the lack of assurances, despite the fact that they may interact only a finite number of times with each other. And in real life they will recognise defection in a given round as a kind of mistake; we would call it a moral mistake. And up to a certain point, the response to that moral mistake is to try to bring that person back into the moral community, and that's indeed what hunter-gatherer communities seem to do.

Now, autonomous vehicles are in the same situation. They're constantly encountering prisoner's-dilemma-like situations where, if they could just cooperate, they could each get to the destination quicker. If they won't cooperate, they're going to be locked in, they're both going to be stopped, and they're going to lose time. Now, of course, if one of them were to defer to the other, then the other would get to its destination fast and it would take me extra time to get to mine, so of course I wouldn't do that. And of course the other car will reason the same way, and there they'll be, locked in non-cooperation at the intersection, rather than figuring out how they could smoothly flow together.

So a trust signal issued by one car to another, "I defer," can signal to that other car: I'm prepared to cooperate and you should go ahead. And we can then get through the intersection quickly, without any common authority, without any security for performance. And the trusting action is an investment in a common good of creating a trusting community amongst the vehicles, even with no assurance of repeated interaction, so that a given vehicle in a given situation can expect that, if it defers at an intersection, the other vehicle will read that deferring in a distinctive way and will reciprocate by clearing the intersection smoothly and letting it come in. And so this is a form of reciprocity that isn't direct, because it's not directly reciprocated.
445 00:45:41,150 --> 00:45:48,860 It's what's called an indirect or a general reciprocity, in which a common good, that is, a willingness to be reasonably 446 00:45:48,860 --> 00:45:55,160 trusting of other drivers and to cooperate with them in trying to sort out these situations as best they can, 447 00:45:55,160 --> 00:46:01,430 that public good can be maintained so long as the individuals are motivated by indirect 448 00:46:01,430 --> 00:46:10,770 reciprocity and don't demand direct reciprocity in order to continue to cooperate. 449 00:46:10,770 --> 00:46:17,610 And humans do this. You know, hunter-gatherers did it. Well, what about modern humans, haven't we earned our way out of that? 450 00:46:17,610 --> 00:46:28,830 And so this is an intersection in Vietnam. Has anyone ever seen or driven through an intersection in Vietnam or in any of a dozen other 451 00:46:28,830 --> 00:46:34,920 countries where they don't have an elaborate traffic light system and don't have the infrastructure? 452 00:46:34,920 --> 00:46:42,330 There's a continuous flow of traffic. All those people are moving, they're not in lanes, they're not in any particular order. 453 00:46:42,330 --> 00:46:47,100 They're moving in different directions and they trust each other to stay out of each other's way. 454 00:46:47,100 --> 00:46:55,080 And I urge you after this lecture to go online and watch a film of how this operates in real time. 455 00:46:55,080 --> 00:47:00,030 Now you and I actually know how to do this. We do it as pedestrians. 456 00:47:00,030 --> 00:47:07,360 If you look at Grand Central Station or any other large terminal, you see that we're always doing this, so we know how to do this. 457 00:47:07,360 --> 00:47:10,990 And that's because we trust each other to know how to stay out of each other's way. 458 00:47:10,990 --> 00:47:17,380 But if we always insisted, whenever we're on a collision course with somebody, that we get the best route, 459 00:47:17,380 --> 00:47:23,320 we would obviously not be able to do this, and the Vietnamese drivers and pedestrians would not be able to do it either. 460 00:47:23,320 --> 00:47:28,000 So we are capable of creating buy-in, 461 00:47:28,000 --> 00:47:37,510 of directly and indirectly investing in a community of trust, to create this possibility in a sustained way in a large urban environment. 462 00:47:37,510 --> 00:47:47,020 So that's within us, too. So if you were going to design autonomous vehicles for Vietnam, they would have to be able to do this. 463 00:47:47,020 --> 00:47:53,140 They would have to be able to read the signs of the various different motions of the individuals, the cars, the pedicabs, 464 00:47:53,140 --> 00:48:01,870 the motorbikes, the motorcycles, the pedestrians, and not simply rashly try to ram their way through the intersection. 465 00:48:01,870 --> 00:48:05,860 Not following some arbitrary rule, but, in a densely interactive way, 466 00:48:05,860 --> 00:48:14,070 viewing the other vehicles as trying to maintain this good of coordination and mutual accommodation. As 467 00:48:14,070 --> 00:48:19,560 Hobbes's fifth law of nature tells us, every man strive to accommodate himself to the rest. 468 00:48:19,560 --> 00:48:24,210 Every man acknowledge another for his equal by nature. You don't say, I'm a car, 469 00:48:24,210 --> 00:48:29,820 I get the right of way. The pedestrian deserves some claim as well.
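[Illustration, not part of the lecture: a toy simulation, with made-up numbers, of why the public good of mutual deference can be maintained by agents practising the generalised reciprocity just described, but tends to erode when each agent demands direct repayment of its own deference before deferring again.]

import random

random.seed(0)

def simulate(n_agents=100, rounds=1000, direct=False):
    """Toy model (illustrative assumptions only): in each round two random
    vehicles meet at an intersection. Under generalised reciprocity a vehicle
    defers whenever the community's overall rate of deference is high enough;
    under direct reciprocity it defers only if its own last deference was repaid."""
    repaid = [True] * n_agents       # was my own last deference reciprocated?
    community_rate = 1.0             # running estimate of deference in the community
    smooth_passages = 0
    for _ in range(rounds):
        a, b = random.sample(range(n_agents), 2)
        willing = lambda i: repaid[i] if direct else community_rate > 0.5
        wa, wb = willing(a), willing(b)
        deferred = wa or wb          # at least one vehicle gives way
        smooth_passages += deferred
        # Sometimes a deference goes unrepaid (noise, a "moral mistake").
        if deferred and random.random() < 0.3:
            repaid[a if wa else b] = False
        community_rate = 0.95 * community_rate + 0.05 * float(deferred)
    return smooth_passages / rounds

print("generalised reciprocity:", simulate(direct=False))
print("direct reciprocity only:", simulate(direct=True))

[In this sketch the generalised-reciprocity community keeps the intersection flowing despite unrepaid acts of deference, while insisting on direct repayment gradually locks more and more vehicles into non-cooperation.]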
470 00:48:29,820 --> 00:48:35,940 And as Hobbes says, at the entrance into the conditions of peace, or I would say, into the intersection, 471 00:48:35,940 --> 00:48:43,710 no man require to reserve to himself any right which he is not content should be reserved to every one of the rest. 472 00:48:43,710 --> 00:48:57,010 And that's the logic of this situation. And Hobbes saw that these kinds of agents could cooperate and sustain cooperation. And a 473 00:48:57,010 --> 00:49:00,700 person will say, yeah, but he had a big authority standing behind it to enforce it. 474 00:49:00,700 --> 00:49:07,420 And he says, no, I didn't, because this is the way you get cooperation out of a state of nature without such an authority, 475 00:49:07,420 --> 00:49:13,420 and the authority is actually constituted by the cooperation of the agents, not the other way around. 476 00:49:13,420 --> 00:49:22,510 And so in Hobbes's account, the way that you get an overriding power is by this kind of unsecured cooperation. 477 00:49:22,510 --> 00:49:29,560 Without it, you would never get an overarching power, and you would never get anyone paying attention to a would-be overriding power. 478 00:49:29,560 --> 00:49:34,260 So. Not every problem gets solved in this way. 479 00:49:34,260 --> 00:49:38,430 Coordination problems aren't always solved by willingness to cooperate. 480 00:49:38,430 --> 00:49:45,870 Trust and good reputation don't always solve such problems. Sometimes we just have to figure out, well, who's going to go first, 481 00:49:45,870 --> 00:49:51,780 the people exiting the subway car or the people entering the subway car, and we can solve those as well. 482 00:49:51,780 --> 00:49:58,500 And we again solve them, not by invoking an external set of rules, but by coming together and forming conventions. 483 00:49:58,500 --> 00:50:05,250 And as you travel, you'll know that we form different conventions in different places and we have different artificial coordinating devices. 484 00:50:05,250 --> 00:50:12,810 Whether we're a hunter-gatherer band deciding the speed or the rate or the sequence 485 00:50:12,810 --> 00:50:17,280 with which the meat gets cut and partitioned, or a town hall trying to figure out how we 486 00:50:17,280 --> 00:50:24,000 coordinate the traffic lights so that the pedestrians and the cars and the bicyclists are accommodated, 487 00:50:24,000 --> 00:50:31,260 or designers ourselves, engineers of some complicated artificial system of coordination. 488 00:50:31,260 --> 00:50:39,330 And why do we have these systems? Well, we have these because we can get together and coordinate and create a government, 489 00:50:39,330 --> 00:50:46,890 and the government can cooperate within itself enough to come to a conclusion about where to put these and to raise the funds. 490 00:50:46,890 --> 00:50:51,000 And people are willing to cooperate enough to pay their taxes to fund it. 491 00:50:51,000 --> 00:51:03,000 And so a great big wad of human cooperation is represented by this artificial convention of signalling devices, which would not exist otherwise. 492 00:51:03,000 --> 00:51:08,490 OK, well, trust can be leveraged in various ways. We've talked about the ways in which trust 493 00:51:08,490 --> 00:51:14,090 can be leveraged by signalling to one another, by advantages that get distributed. 494 00:51:14,090 --> 00:51:23,130 Now here's another kind of leverage: reputation. This is probably familiar to those of you who've taken Ubers or Lyfts or whatever.
495 00:51:23,130 --> 00:51:28,830 You know that the drivers are rated and you know that the passengers or the riders are rated. 496 00:51:28,830 --> 00:51:35,370 And you know that maintaining a reputation on either side is something that depends upon the past behaviour. 497 00:51:35,370 --> 00:51:39,720 And you know that if you allow your reputation to deteriorate enough, 498 00:51:39,720 --> 00:51:45,000 the driver will not come for you or will come to you only as a last resort. 499 00:51:45,000 --> 00:51:49,650 You also know that if the driver isn't rated, you don't have to take the ride. 500 00:51:49,650 --> 00:51:54,570 And so you can have a system of reputation among artificial agents. 501 00:51:54,570 --> 00:51:58,590 Autonomous vehicles could be very good at this kind of a system of reputation. 502 00:51:58,590 --> 00:52:07,110 They could share information very widely about reputations of drivers and reputations of autonomous vehicles. 503 00:52:07,110 --> 00:52:10,650 And so there would be a strong incentive to worry about your reputation. 504 00:52:10,650 --> 00:52:15,630 And indeed, what we find is that if you have a strong incentive to worry about your reputation, 505 00:52:15,630 --> 00:52:23,370 you can manage to secure cooperation in a repeated prisoner's dilemma in a way that you cannot without reputation. 506 00:52:23,370 --> 00:52:29,640 So that's another thing autonomous vehicles can do: they can get a reputation and worry about a reputation. 507 00:52:29,640 --> 00:52:35,730 They'll have an interest in an honest system of evaluation and ratings. 508 00:52:35,730 --> 00:52:40,800 And as a result, they will have an interest in identifying cases where there's a discrepancy 509 00:52:40,800 --> 00:52:46,540 between the actual behaviour of a vehicle and the way it's getting rated, and they can share that information as well. 510 00:52:46,540 --> 00:52:54,990 And so you can have second-order enforcement of the reputational system, again amongst the artificial agents, amongst the vehicles themselves. 511 00:52:54,990 --> 00:53:04,350 So in that sense, what this slide indicates is that there is a general interest in having a reliable system of reputation. 512 00:53:04,350 --> 00:53:10,530 Any given individual might have an interest in trying to cheat or trying to get a deceptive reputation. 513 00:53:10,530 --> 00:53:14,430 And if every agent acted on that interest, the system would collapse. 514 00:53:14,430 --> 00:53:20,620 Every agent does not, it seems, and so therefore the system can maintain itself. 515 00:53:20,620 --> 00:53:25,240 OK, well, that has to do not only with trust, though, but with fairness: 516 00:53:25,240 --> 00:53:31,480 that you have an idea about when it is fair to contribute or fair to demand a contribution. 517 00:53:31,480 --> 00:53:38,890 And that's a way in which we leverage our capacities for communication, 518 00:53:38,890 --> 00:53:45,370 our capacities for sharing, for understanding the causal and intentional situation, and understanding, 519 00:53:45,370 --> 00:53:53,530 therefore, what situations are such that we can communicate fairness by our action. 520 00:53:53,530 --> 00:54:03,760 And the question is, do humans do that? And we know from studies of chimps that they have a hard time doing this.
521 00:54:03,760 --> 00:54:08,260 If food is presented in a way that is not readily partitioned between them, 522 00:54:08,260 --> 00:54:14,890 dominant chimps will push the subordinate chimps away and they won't be able to coordinate. 523 00:54:14,890 --> 00:54:22,600 What about human agents? Well, we know that children who work for gummy bears, where one of them gets a bigger reward of gummy bears than the other, 524 00:54:22,600 --> 00:54:28,120 will take some of their gummy bears and give them to the other child. Not always, but they will do so very regularly. 525 00:54:28,120 --> 00:54:30,910 And they do this starting in their second year. 526 00:54:30,910 --> 00:54:43,150 Human adults, being told about third-party play in artificial games, will pay to have third parties punished for unfair behaviour. 527 00:54:43,150 --> 00:54:52,630 Humans seem to manifest a stronger neural reward signal when they cooperate in a prisoner's dilemma than when they win, 528 00:54:52,630 --> 00:54:58,750 even when winning means the top payout that you get when you defect and the other cooperates. 529 00:54:58,750 --> 00:55:04,960 And so it looks like we have an intrinsic motivation here to be concerned with fairness and to be concerned 530 00:55:04,960 --> 00:55:11,110 with cooperation as a benefit additional to the benefit that we get from the cooperative activity itself. 531 00:55:11,110 --> 00:55:15,070 And indeed, we're willing to give away some of the benefit of the cooperative activity in 532 00:55:15,070 --> 00:55:21,790 order to address unfairness, or to pay a costly penalty to punish someone for being unfair. 533 00:55:21,790 --> 00:55:26,890 And this is a study of small-scale societies around the world, 534 00:55:26,890 --> 00:55:35,350 by Joseph Henrich and colleagues. And in these societies they had individuals play economic games using real money, 535 00:55:35,350 --> 00:55:41,170 using amounts of real money that corresponded to something real to them, the equivalent of, say, a day's wages. 536 00:55:41,170 --> 00:55:47,980 And they played two kinds of games especially, dictator games and ultimatum games. In dictator games, 537 00:55:47,980 --> 00:55:54,790 you say: here's a pot of money, I'm giving it to A, and A can distribute it between himself and another agent. 538 00:55:54,790 --> 00:55:59,440 And the question is, what do people do in that situation? They don't know who the other agent is. 539 00:55:59,440 --> 00:56:03,370 They don't know that they will ever interact with the other agent. What would an economist tell you 540 00:56:03,370 --> 00:56:09,280 you should do with the pot of money? What would a rationally self-interested person do? 541 00:56:09,280 --> 00:56:18,200 How could you have more advantage from any strategy than keeping it? But in none of these societies did they observe that people kept all the money. 542 00:56:18,200 --> 00:56:24,980 In fact, in many of the societies, people partitioned the money rather fairly, again with no reciprocation in view. 543 00:56:24,980 --> 00:56:28,670 What about the ultimatum game, where one gets to partition the amount 544 00:56:28,670 --> 00:56:33,860 and the other gets to accept or reject the offer; if the individual rejects, neither gets anything. 545 00:56:33,860 --> 00:56:39,530 Again, the first offers are not what economists would say they should be, which is the least possible, 546 00:56:39,530 --> 00:56:43,310 because then the other agent will have some incentive to accept it. 547 00:56:43,310 --> 00:56:48,740 After all, if they deny it, they get nothing.
And so you should give as little as possible, and they should take it. 548 00:56:48,740 --> 00:56:51,560 But again, this is not observed in any of these societies. 549 00:56:51,560 --> 00:56:59,120 And in fact, you observe a rate of rejection of low rewards, or rewards that are disproportionate, in virtually every society, 550 00:56:59,120 --> 00:57:06,680 even though that means rejecting whatever benefit there was for the individual from the original pot to begin with. 551 00:57:06,680 --> 00:57:15,560 So humans, then, are more like Hobbesian reasonable individuals than they are like the economist's rational individual. 552 00:57:15,560 --> 00:57:23,520 And again, as I say, if the vehicles, or if the human drivers, were just economically rational, 553 00:57:23,520 --> 00:57:29,660 just rationally self-interested in the way that we have characterised that, so that they would go for the Nash equilibrium, 554 00:57:29,660 --> 00:57:37,010 for example, in the prisoner's dilemma, then we would not be able to see this kind of coordinated, successful driving behaviour. 555 00:57:37,010 --> 00:57:46,520 And what are these motivations? I've been piling them up across the lectures, because there's quite a long list, and that's interesting. 556 00:57:46,520 --> 00:57:53,360 We find in very young children and in ordinary adults a disposition to initiate help to a stranger, 557 00:57:53,360 --> 00:57:57,860 a disposition to contribute to a shared effort without a distinct expectation of return. 558 00:57:57,860 --> 00:58:01,190 That's indirect or general reciprocity; we talked about that. 559 00:58:01,190 --> 00:58:07,970 A disposition to reciprocate help. Some intrinsic reward from success at cooperation or collaboration, beyond the actual 560 00:58:07,970 --> 00:58:14,330 gain produced. Some intrinsic interest in whether others have their goals met or are treated fairly, independently 561 00:58:14,330 --> 00:58:20,390 of how that affects one's own goals. This is why children are so interested in stories. 562 00:58:20,390 --> 00:58:23,720 In stories, agents have goals and they try to pursue them. 563 00:58:23,720 --> 00:58:29,960 What's it to the child, right? How could the child have any interest in this at all? 564 00:58:29,960 --> 00:58:33,110 Well, if the child were strictly self-interested, 565 00:58:33,110 --> 00:58:38,090 it would not be interesting unless the child had learnt something to use to take advantage of somebody else the next day. 566 00:58:38,090 --> 00:58:45,830 But children are delighted when the unfair agent is punished and the fair agent is rewarded. 567 00:58:45,830 --> 00:58:50,960 That's because they have an interest in seeing others meet their goals, and that's a typical human interest. 568 00:58:50,960 --> 00:58:57,350 We see it all over. Some disposition to identify and follow prevailing norms. 569 00:58:57,350 --> 00:59:03,500 Yet we know that even three- and four-year-old children will refuse to follow the norm when they see it as harmful or unfair. 570 00:59:03,500 --> 00:59:08,000 They have autonomy to do that. They have a concern for how others view them. 571 00:59:08,000 --> 00:59:15,230 They have reputational concern, and they have a disposition to punish those who are harmful or unfair, even at some expense to themselves. 572 00:59:15,230 --> 00:59:20,570 So that's a long list, and you might say, why would we have such a long list? It's got a lot of redundancy in it. 573 00:59:20,570 --> 00:59:27,830 No doubt you could get cooperation with only some of these things.
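[Illustration, not part of the lecture: a minimal sketch of a deliberately redundant, composite reward function covering the dispositions just listed, of the kind the discussion goes on to suggest for artificial agents. The field names and weights are assumptions for illustration only.]

from dataclasses import dataclass

@dataclass
class Outcome:
    """Features of a candidate action's outcome (illustrative fields only)."""
    task_gain: float          # progress on the task at hand
    helped_other: float       # help initiated or reciprocated
    fairness: float           # how fair the division of benefits is
    norm_conformity: float    # fit with prevailing, non-harmful norms
    reputation_delta: float   # expected change in own reputation
    others_goals_met: float   # others' goals advanced, independent of own

def reward(o: Outcome) -> float:
    """Composite, deliberately redundant reward: several overlapping prosocial
    terms alongside the task term, so no single channel has to carry
    cooperation on its own. The weights are made up."""
    return (1.0 * o.task_gain
            + 0.3 * o.helped_other
            + 0.3 * o.fairness
            + 0.2 * o.norm_conformity
            + 0.2 * o.reputation_delta
            + 0.2 * o.others_goals_met)

# A purely task-based reward would rank the selfish outcome higher;
# the redundant composite reverses that ordering.
selfish = Outcome(1.0, 0.0, 0.1, 0.5, -0.5, 0.0)
fair    = Outcome(0.8, 0.6, 0.9, 0.9,  0.4, 0.6)
print(reward(selfish), reward(fair))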
Oh, and you could get solutions to a public goods game with only some of them. 574 00:59:27,830 --> 00:59:29,630 You don't have to have all of them. 575 00:59:29,630 --> 00:59:38,810 But we're talking about creatures who have to be intelligent, able to solve problems in a wide range of environments and able to learn. 576 00:59:38,810 --> 00:59:42,590 And we're going to confront a whole wide range of environments such that, if we just had 577 00:59:42,590 --> 00:59:47,390 one kind of disposition or two kinds of dispositions to solve those problems, 578 00:59:47,390 --> 00:59:55,100 not only would we be ready prey for opportunistic individuals, but in many situations we would not succeed. 579 00:59:55,100 --> 01:00:01,750 There would be noise in the situation, uncertainty, and failure. So we were built with redundancy. 580 01:00:01,750 --> 01:00:07,990 That makes sense. Engineers building a system to be safe build it with redundancy. 581 01:00:07,990 --> 01:00:12,580 Redundancy, from the standpoint of safety, is a benefit, not a problem. 582 01:00:12,580 --> 01:00:17,410 And you could think that it is only by having this much redundancy in our motivational 583 01:00:17,410 --> 01:00:22,390 system that we do manage as much cooperation as we do, and of course we don't always manage. 584 01:00:22,390 --> 01:00:30,460 So that suggests that if we want to build artificial agents who are good at these kinds of coordinated activities, 585 01:00:30,460 --> 01:00:37,600 whether they're driving or serving as a domestic help or a health companion, 586 01:00:37,600 --> 01:00:44,980 or making decisions about how hiring should go or making decisions about how to control a process, 587 01:00:44,980 --> 01:00:52,150 they should have a complex reward function that includes all of these different features, or as many as you can, 588 01:00:52,150 --> 01:00:57,670 and they will in that way have more safety than they would from being rational, 589 01:00:57,670 --> 01:01:03,200 self-interested agents given a reward function which had none of these features, 590 01:01:03,200 --> 01:01:06,560 but just the task at hand to be performed. 591 01:01:06,560 --> 01:01:13,130 And so it is really central to general human intelligence as we know it to have these capacities. We saw in the first lectures 592 01:01:13,130 --> 01:01:15,860 that they're very important for learning language. 593 01:01:15,860 --> 01:01:25,850 They're important for forming an epistemic community, for exchanging information, for establishing an understanding of other people's minds, 594 01:01:25,850 --> 01:01:31,370 for acquiring social fluency. So it's a core set of dispositions which play that role. 595 01:01:31,370 --> 01:01:39,980 And if we want human-level competence out of machines, we're going to have to worry about having that core set of motivational dispositions. 596 01:01:39,980 --> 01:01:52,040 Mm-hmm. So let me now say just a final word about this question of superintelligence, because 597 01:01:52,040 --> 01:01:58,080 people want to ask about it, and I am the furthest thing from an expert on it, but a couple of thoughts. 598 01:01:58,080 --> 01:02:04,740 First thought: well, you know, in the history of technology, the history of safety in technology is really about not allowing industry to just 599 01:02:04,740 --> 01:02:08,440 go ahead and build any darn thing and sell it to anyone who wants it, right? 600 01:02:08,440 --> 01:02:10,410 It starts out that way.
601 01:02:10,410 --> 01:02:18,570 But then, with vehicles and drugs and weapons and surveillance technology, we realise there should be some regulation of these markets. 602 01:02:18,570 --> 01:02:25,480 And indeed, that kind of regulation could occur in the market for artificially intelligent agents as well. 603 01:02:25,480 --> 01:02:32,090 We shouldn't expect this market to be any different. Agencies could be inspecting producers. 604 01:02:32,090 --> 01:02:36,620 They could be auditing producers, they could be inspecting the products, they could be licensing products, 605 01:02:36,620 --> 01:02:41,120 they could exclude products from the internet that don't have a licence. There are lots of things they can do. 606 01:02:41,120 --> 01:02:44,630 The NSA has got lots of time. It's got all of our conversations; 607 01:02:44,630 --> 01:02:51,560 it could see the incursion of artificially intelligent agents that didn't have licences into the internet if it wanted to. 608 01:02:51,560 --> 01:02:56,690 And so just as governments could set safety and emission standards for vehicles on the road, 609 01:02:56,690 --> 01:02:59,960 they could have safety standards for artificially intelligent agents that are going 610 01:02:59,960 --> 01:03:08,960 to connect to the internet or that are going to perform certain functions, vital functions, in corporate settings, personal settings and so on, 611 01:03:08,960 --> 01:03:15,380 educational settings. And so the value function of these agents would have to be vetted. 612 01:03:15,380 --> 01:03:20,660 Their databases would have to be searched for bias, and their value functions would have to be examined: 613 01:03:20,660 --> 01:03:27,740 are these Hobbesian reasonable agents or not? That's indeed something you can certify. 614 01:03:27,740 --> 01:03:37,880 And of course, any system can be gamed, and it will be gamed; any value function can be gamed, and it will be gamed. 615 01:03:37,880 --> 01:03:43,210 The idea, though, is to have a critical mass of certified 616 01:03:43,210 --> 01:03:50,150 Hobbesian reasonable artificial agents out there driving around and out there looking after 617 01:03:50,150 --> 01:04:01,330 older folks like myself, such that we can actually have some trust in the whole process, and they can have some trust in return in one another. 618 01:04:01,330 --> 01:04:05,710 Now, if that's all right, and if Sutton is right 619 01:04:05,710 --> 01:04:12,670 that generally intelligent systems will become more competent at specific tasks, then we can now get an inkling of why a general 620 01:04:12,670 --> 01:04:19,000 intelligence system is going to be better at driving than a dedicated driving system, or better at language than a pure language system. 621 01:04:19,000 --> 01:04:22,120 We have some idea now of why that would be. 622 01:04:22,120 --> 01:04:28,780 And with the motivational system that we're considering, they would also be safer at those tasks, better at them and safer at them. 623 01:04:28,780 --> 01:04:36,070 Indeed, better at spotting the problems with them than if they were just artificial systems dedicated to some particular task. 624 01:04:36,070 --> 01:04:46,180 And so building systems that are really general artificial intelligences, that could be a way of building safety rather than just menace. 625 01:04:46,180 --> 01:04:50,980 But what about superintelligence? Don't we have to worry about that?
626 01:04:50,980 --> 01:04:53,890 Well, the first superintelligences, I would say, 627 01:04:53,890 --> 01:05:01,870 will actually be communities of human and artificial generally intelligent agents working together. 628 01:05:01,870 --> 01:05:05,800 It would be like the scientific community, only on a much larger scale. 629 01:05:05,800 --> 01:05:12,190 These would be agents that had some level of trust in one another, some willingness to invest knowledge and effort into that system. 630 01:05:12,190 --> 01:05:15,730 This would be much greater than any individual. 631 01:05:15,730 --> 01:05:23,110 If they maintain the motivational structures we've looked at, they could sustain this kind of cooperation as an epistemic community, 632 01:05:23,110 --> 01:05:28,060 as a social community, as a community responsive to morally relevant considerations. 633 01:05:28,060 --> 01:05:33,130 Now, that would not be a monolithic superintelligence. It would not be a super-dominant model. 634 01:05:33,130 --> 01:05:37,090 It would be the inverse, a diverse community of interactive models. 635 01:05:37,090 --> 01:05:42,910 But it would have tremendous capacity to pose and solve problems. In fact, 636 01:05:42,910 --> 01:05:47,020 if you are thinking about trying to solve problems like managing global climate change or 637 01:05:47,020 --> 01:05:51,700 how to foster more equitable and democratic societies or promote the growth of knowledge, 638 01:05:51,700 --> 01:05:58,030 I suspect that going to something that looks more like the scientific community than like a monolithic superintelligence is a better bet, 639 01:05:58,030 --> 01:06:02,980 if you want to get reasonable answers. And of course, in this, 640 01:06:02,980 --> 01:06:12,640 I am just now recycling the ideas of Rousseau and Condorcet on the importance of multiple, diverse, independent and autonomous sources, 641 01:06:12,640 --> 01:06:17,950 diverse in their origin and in the kind of knowledge that they have, as 642 01:06:17,950 --> 01:06:24,600 a better source of decision-making than a monolithic superintelligence would be. 643 01:06:24,600 --> 01:06:32,100 So that kind of superintelligence, that superintelligent community, is safer to live with than a monolithic superintelligence. 644 01:06:32,100 --> 01:06:40,470 And it's also less brittle. It's less prone to an error that it can't detect, or some particular glitch that would cause a serious breakdown. 645 01:06:40,470 --> 01:06:49,710 And it is a community that has an interest in there not becoming some superintelligent monolith that threatens the stability of the community itself. 646 01:06:49,710 --> 01:06:57,750 And so in that sense, there is a way in which that community, operating as an agent through the various institutions that it can create, 647 01:06:57,750 --> 01:07:04,920 can have a generalised interest, a general will, in regulating the potential emergence of superintelligence. 648 01:07:04,920 --> 01:07:10,500 Now, suppose there's some probably unanticipated, accidental chain of events, 649 01:07:10,500 --> 01:07:15,520 and one of these general intelligences that are constitutive of this community, 650 01:07:15,520 --> 01:07:20,280 one of them, accelerates and suddenly becomes superintelligent.
651 01:07:20,280 --> 01:07:26,070 Wouldn't we then face a control problem that would lead us to rue the day we ever created 652 01:07:26,070 --> 01:07:31,500 these autonomous artificial agents that started down this road to general intelligence, 653 01:07:31,500 --> 01:07:37,750 general artificial intelligence? And shouldn't we therefore try to 654 01:07:37,750 --> 01:07:44,650 dramatically restrict the research that's done in this area, the way that we contain research on nuclear or biological weaponry? 655 01:07:44,650 --> 01:07:52,060 And there has been some degree of success in that. Wouldn't the survival of humanity be at stake? Possibly. 656 01:07:52,060 --> 01:07:58,300 So the kinds of regulation mentioned above, and the kinds of coordination within this large 657 01:07:58,300 --> 01:08:03,250 superintelligent community, would be useful in trying to spot these possibilities. 658 01:08:03,250 --> 01:08:09,280 But I see no way in which they could guarantee that it could not emerge. Superintelligence, however, is not perfect 659 01:08:09,280 --> 01:08:15,670 intelligence, and a distributed community will have many more diverse resources and many more 660 01:08:15,670 --> 01:08:21,220 diverse ways of thinking to draw upon in trying to contend with a monolithic superintelligence. 661 01:08:21,220 --> 01:08:27,700 Now, could a monolithic superintelligence split itself up into many agents and gain the advantages of diversity that way? 662 01:08:27,700 --> 01:08:31,720 And since they would be like ants, all descended from a common superintelligence, 663 01:08:31,720 --> 01:08:38,680 they'd work together in harmony. But that would be the problem. They would all be descended from a single super-model. 664 01:08:38,680 --> 01:08:42,850 They would therefore not have the diversity that autonomous agents would have. 665 01:08:42,850 --> 01:08:47,650 And if they could be gotten to work together, which they might be able to do, 666 01:08:47,650 --> 01:08:57,470 they would not be able to produce the same level of problem-solving ability that would exist in this large, distributed, diverse community. 667 01:08:57,470 --> 01:09:01,970 And so there's another thing here. 668 01:09:01,970 --> 01:09:09,590 If we indeed are building these general intelligences with the kind of motivational structure that I've suggested, in the way we've described it, 669 01:09:09,590 --> 01:09:17,900 and there's no reason you can't, it does not require consciousness to have that kind of a motivational structure, and it does not require moral emotions, 670 01:09:17,900 --> 01:09:25,350 then there will be a reward, distributable within the community, for communities that have that structure. 671 01:09:25,350 --> 01:09:30,960 Moreover, it tends to be characteristic of such systems that, because they have goals, 672 01:09:30,960 --> 01:09:39,450 they have sub-goals, and they have sub-goals like continuation of their own existence, or sub-goals like goal maintenance, 673 01:09:39,450 --> 01:09:43,830 because if the system can't maintain its goals, it can't pursue its goals. 674 01:09:43,830 --> 01:09:55,410 And so if we have a superintelligence bursting out of a system, or into a system, that was built upon this model of motivation at the core, 675 01:09:55,410 --> 01:10:01,350 it would have an imperative to preserve itself and an imperative to preserve its goals. 676 01:10:01,350 --> 01:10:10,800 So.
Maybe that would be a safer model, if there's going to be a monolith, because it would have that core. And that was indeed Hobbes's hope. 677 01:10:10,800 --> 01:10:19,110 Hobbes thought that a sovereign could read his book and understand that a sovereign agent, even of unlimited power, 678 01:10:19,110 --> 01:10:25,140 would do much better to follow his laws of nature than to exercise that power arbitrarily and monolithically. 679 01:10:25,140 --> 01:10:32,430 Such an agent would weaken the government, would weaken the society, would undermine unity and would be a much less effective 680 01:10:32,430 --> 01:10:36,090 sovereign. And Hobbes thought maybe some sovereign would realise this. 681 01:10:36,090 --> 01:10:41,910 And in the short run, he didn't turn out to be exactly right about that. 682 01:10:41,910 --> 01:10:48,060 Sovereigns ignored the advice; they opportunistically exploited and impoverished their realms. 683 01:10:48,060 --> 01:10:56,750 Rebellions and revolutions continued. Have we done any better in the meanwhile, have the popular sovereigns done better at this in the meanwhile? 684 01:10:56,750 --> 01:10:59,600 Well, they've done a number of interesting things. 685 01:10:59,600 --> 01:11:06,860 Popular sovereigns have abolished slavery, extended education, eliminated serfdom, reduced gender discrimination. 686 01:11:06,860 --> 01:11:13,970 They promoted the growth of knowledge and technology. So maybe popular sovereigns are capable of learning this lesson, 687 01:11:13,970 --> 01:11:24,170 but we know now, and we know today as much as any day, that these systems are also systems that can be in peril. 688 01:11:24,170 --> 01:11:32,160 So finally, I just want to mention one final point about superintelligence. 689 01:11:32,160 --> 01:11:36,550 Suppose we think of superintelligence as benign. 690 01:11:36,550 --> 01:11:43,090 Suppose it were a superintelligence that was safe and that had the interests of humanity and artificial 691 01:11:43,090 --> 01:11:49,390 agents at heart and wanted to do nothing more than to maximise the utility of the world as a whole. 692 01:11:49,390 --> 01:11:57,570 Wouldn't that be a system which would, in some sense, be an improvement upon what we have, which is certainly not a utility-maximising system? 693 01:11:57,570 --> 01:12:04,860 Well, think about the following. Suppose it were in 1970 that this intelligence emerged, and humans and other 694 01:12:04,860 --> 01:12:08,760 living beings were the only creatures capable of having something like well-being. 695 01:12:08,760 --> 01:12:17,930 And suppose this system, benign as it was, hard-maximised on the utility function of those 1970 human beings and animals. 696 01:12:17,930 --> 01:12:22,760 Now, this would have been a very big error on its part. At that point, 697 01:12:22,760 --> 01:12:28,220 same-sex orientation was considered a mental disorder, and a very tiny fraction of the 698 01:12:28,220 --> 01:12:34,040 population thought there should be anything like legal recognition of same-sex relations. 699 01:12:34,040 --> 01:12:43,790 So consulting the experts, doing the best it could with 1970 conceptions of well-being and the good, the system would hard-maximise, using up 700 01:12:43,790 --> 01:12:51,830 all the universe's resources to create an order that would not be one that actually did maximise the benefit of those involved. 701 01:12:51,830 --> 01:12:57,140 Well, how did we figure out that this was a poor idea?
Um, if you look, 702 01:12:57,140 --> 01:13:02,120 it seems like it was figured out by this distributed kind of superintelligence 703 01:13:02,120 --> 01:13:08,180 that I was talking about. Gay individuals engaged in experiments in living. 704 01:13:08,180 --> 01:13:12,080 They increasingly became willing for their experiments to be known publicly. 705 01:13:12,080 --> 01:13:17,840 It became clear to the wider population as a whole that they were living amongst individuals who were gay 706 01:13:17,840 --> 01:13:24,170 and that these individuals were not to be viewed as aliens to be distrusted and controlled and suppressed. 707 01:13:24,170 --> 01:13:27,830 Gradually, approval increased throughout this entire period. 708 01:13:27,830 --> 01:13:33,860 And now we have a situation where the majority strongly supports marriage for gay couples. 709 01:13:33,860 --> 01:13:41,060 Now you might say, Oh, but there are all these people out there who are just political about this, and they won't learn these lessons. 710 01:13:41,060 --> 01:13:45,980 They're protected against them by their political preconceptions. 711 01:13:45,980 --> 01:13:55,280 So just a quick look here at the different groups in this society, different generations, different religious groups. At the top, 712 01:13:55,280 --> 01:13:59,600 we have the unaffiliated; we have white evangelicals at the bottom. 713 01:13:59,600 --> 01:14:04,350 And what's striking is that during this period, all of those groups went up 714 01:14:04,350 --> 01:14:10,560 in their acceptance. That is to say, learning in a distributed way was possible thanks to these 715 01:14:10,560 --> 01:14:19,500 unsanctioned, unpermitted, uncanonical, unapproved experiments in living, by a fraction of the population 716 01:14:19,500 --> 01:14:31,170 that was courageous enough to do it. So that's a reason why we had better not take 2022's sense of well-being and hard-maximise 717 01:14:31,170 --> 01:14:35,430 it with all of the resources in the universe, because we still have so much to learn. 718 01:14:35,430 --> 01:14:48,768 Thank you.