Yeah, I think everybody's here. Welcome to the 2018 Michaelmas term Strachey Lecture, organised by the Computer Science Department in Oxford. This series of distinguished lectures is named after Christopher Strachey, the first Professor of Computation at Oxford University. The Strachey Lectures are generously supported by Oxford Asset Management.

I'm Peter Jeavons. I'm acting head of the Department of Computer Science, and it's my job, and my pleasure, to welcome today's speaker, Professor Rodney Brooks.

Professor Brooks is the Panasonic Professor of Robotics, Emeritus, at MIT, where he was for ten years the director of the MIT Artificial Intelligence Lab, and then of the Computer Science and Artificial Intelligence Laboratory after the AI Lab merged with the Laboratory for Computer Science. He was born in Adelaide, Australia, and received his bachelor's and master's degrees in mathematics before joining Stanford University for his PhD. He was a research scientist at Carnegie Mellon University and at MIT, and on the faculty at Stanford, before joining the MIT faculty in 1984.

Professor Brooks's research has been in computer vision, robotics, artificial life and artificial intelligence. He developed behaviour-based robotics, which has been deployed in robots on the surface of Mars, in thousands of bomb disposal robots in Afghanistan and Iraq, inside Fukushima Daiichi in the immediate aftermath of the 2011 tsunami, in thousands of factories around the world, and in tens of millions of homes in the form of robot vacuum cleaners.

He was co-founder, chairman and CTO of iRobot and of Rethink Robotics. He has won numerous scientific awards, too many to mention, and also starred as himself in the 1997 movie Fast, Cheap and Out of Control, which was named after one of his scientific papers. These days he describes himself as trying to inject realism into public discourse about the AI boom (we may be getting some of that this afternoon) and also working on the question of how order can arise from disorder. There's much more I could say, but I think it's better to let Rodney speak for himself, so I ask you to welcome our speaker, Professor Rodney Brooks.

We've seen people talk about artificial intelligence; it's been around for a long time. Recently there's been a movement called artificial general intelligence, which tries to distinguish itself from artificial intelligence by saying they want to build a complete, intelligent entity. I'm going to show you that I think that's a little bit of over-marketing, because that's been the goal of the original AI people from day one.
Then there's been the idea of artificial superintelligence, which is rather poorly defined, but it is the idea that we're soon going to have artificial intelligence which is much better and smarter than people, and so somehow it's going to kill us all.

Nick Bostrom. Is Nick here, by the way? Nick's not here. Okay. Well, Nick Bostrom did a survey a little while ago on when we're going to get artificial general intelligence, and then there was another sort of survey built around that. The claim is that the median estimates from people in the field are that we'll get artificial general intelligence in 2040 and artificial superintelligence in 2060, and once we get artificial superintelligence, everything's going to go crazy. I think both those numbers are very optimistic.

Nick has had his book on superintelligence, which the press has picked up, I think because he's associated with Oxford and it says, you know, there's a real chance that superintelligence is going to kill us all. The press really picked up on that. But if you go to Nick's web page, these are some of his featured papers. "Where are they? Why I hope the search for extraterrestrial life finds nothing": he even mentions Mars in particular; I hope there's no extraterrestrial life there, because it might come and kill us all. "The Vulnerable World Hypothesis" is about all the ways that things could kill us all. "How Unlikely is a Doomsday Catastrophe?" is about how experiments with physics supercolliders could kill us all. And so on. Even the typology of potential harms is about knowledge that is true, but that we probably shouldn't know, because it might kill us all. The only thing missing is a paper about how talking about AI in a way that says it might kill us all, and thereby damping down AI, will kill us all. But that's maybe a bit self-referential. So I feel sorry for Nick, actually, about what it must be like to be him, because he's really afraid of a lot of stuff. But it hasn't stopped other people.

Martin Rees, a dear friend of mine, the former, or maybe still the, Astronomer Royal, has talked about how AI may end up killing us all. Stephen Hawking is worried about it. Oh, now finally we've got an MIT professor, so he must know about AI, right? No, he's a professor of physics, and he's never given a talk in AI or computer science at MIT. Well, then there's Elon Musk. He must know about AI, and he thinks it is going to kill us all.
It's going to be an apocalypse. Now, he thought that AI and robotics were going to help him build cars, but it turned out that was a little beyond them. And Stuart, I haven't mentioned you, and I'm sorry I left you out. And Alan Turing? I will, I will get to him. Yeah.

But here's the point: maybe I'm just a grumpy old guy, and maybe they're right and I'm just, you know, "get off my lawn". So I decided, well, okay, let's rethink this; am I being unfair? I went back to Marvin Minsky's 1961 paper, "Steps Toward Artificial Intelligence", which, for those who haven't seen it, is an amazingly good paper, well over 100 references, where he talks about how to build artificial intelligence. He breaks it down into five sections: search; pattern recognition, which I think corresponds most closely today to deep learning sorts of things; and then learning, planning and induction. And he frames each of those as methods of controlling search; that's his view. So I thought, well, Marvin did that for artificial intelligence. If I want to really not be just a grumpy old guy, maybe I should rethink things and think about steps towards superintelligence. What would it take to get there? That's what this talk is about: what would it take to get to superintelligence? And I'm going to go through these sections, starting with a brief history of AI.

I'm sure you all know Alan Turing's two very well known papers, "On Computable Numbers" in 1936 and his 1950 "Computing Machinery and Intelligence", which is where we get the Imitation Game and the Turing Test from. But he had another paper, in 1948, which was not published until 1970. His boss, whose name was Sir Charles Darwin (the grandson of the other Charles Darwin), wouldn't let him publish it, and so it was only published posthumously. In it he starts off, and you can see echoes of this paper in his 1950 paper: "The possible ways in which machinery might be made to show intelligent behaviour are discussed. The analogy with the human brain is used as a guiding principle." So I think he was perhaps the first person to talk about computing machinery being able to emulate human intelligence. He specifically talks about "discrete controlling machinery", by which he means essentially a digital computer (remember, this was 1948, before today's vocabulary was around), and he says brains very nearly fall into this class. And he talks in this paper about ways that machines could learn. Then in 1950 he did publish this paper, "Computing Machinery and Intelligence".
"I propose to consider the question, 'Can machines think?'" And this is where he started talking about what he called the Imitation Game, which started out as: could you tell whether it was a man or a woman, if you were just passing questions back and forth to them and they were trying to fool you? He then goes on to use that to say, well, what if it was a machine that could fool you about whether it was a person or not? Then surely it could be said to be thinking; that is essentially his argument. I think, Aaron, you may disagree with me a little bit on this, and I'll come back to it a little later; I go back and forth on what he meant.

He does say there are no convincing arguments of a positive nature to support his views, which is an honest statement from him. He also talks about ESP as being a proven thing, and says that ESP, telepathy and so on, should perhaps be considered, which is a little strange by today's standards.

But he says that if we look at the Imitation Game, and a computer can fool a human observer 70% of the time, that's a good substitute for "can a machine think?". And he says: I believe that in about 50 years' time it will be possible to programme computers, with a storage capacity of about ten to the ninth, to make them play the Imitation Game and win 70% of the time.

Where does he get that ten to the ninth from? Well, he says that the Encyclopaedia Britannica, 11th edition, has two times ten to the ninth bits in it, so somehow that turns into ten to the ninth probably being enough. And that ten to the ninth is the program size, because the computers, he thinks, are fast enough. Then he says a programmer can produce about a thousand digits of program a day (he's referring to bits there), and that's before assemblers, remember, so it was really arduous to write code. So he says 60 workers, working steadily through 50 years, might accomplish the job. And if you multiply it out and make them work 333 days a year each, you get almost exactly ten to the ninth bits. So somehow the two times ten to the ninth bits in the 11th edition of the Encyclopaedia Britannica becomes a program of length ten to the ninth bits. By the way, I'm not sure anyone's ever checked this, but I looked around my living room (that's a 30-year-old robot there), and over in that bottom right-hand corner, what's that? Oh, it's the Encyclopaedia Britannica, the 11th edition. I happen to have it, so I opened it to a random page, counted, and it comes out to about two times ten to the ninth. So he was accurate.
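Just to make that arithmetic explicit, here is a minimal sketch in Python (purely illustrative) of the multiplication Turing is doing:

```python
# Turing's back-of-the-envelope estimate, as described above:
# 60 programmers, each producing about 1,000 bits of program a day,
# working 333 days a year for 50 years.
programmers = 60
bits_per_day = 1_000
days_per_year = 333
years = 50

total_bits = programmers * bits_per_day * days_per_year * years
print(f"{total_bits:,} bits")  # 999,000,000 bits, i.e. almost exactly 10**9
```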
Now, people remember the term artificial intelligence coming from the 1956 AI workshop at Dartmouth. The proposal was only about 12 pages long, and the first few pages were written by John McCarthy. He just goes right in and starts using the words "artificial intelligence", and this is, as far as we can tell, the first use of that term, but he doesn't explain it. "We propose a two month, ten man study of artificial intelligence", on the conjecture that every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it. So, despite what the artificial general intelligence people say, that it's a new thing to go after all aspects of intelligence, certainly John McCarthy, Marvin Minsky and the others thought they were going after all aspects of intelligence back in the fifties.

John does say (well, Turing said we may need more memory but not more speed; John says more speed is probably necessary) that the major obstacle was just knowing how to do the programming. And later, into the sixties and seventies, you saw John and Marvin and other people thinking that we were going to have human-level intelligence within, you know, a decade sort of thing. So that idea has been around for a long time.

Now I'm going to give four approaches to artificial intelligence that have come about, and these are total cartoons. What I'm going to describe here is very cartoonish, just to give a general flavour, so please hold back: it's a cartoon, it's not your book, it's just a cartoon.

The first one is the symbolic approach, of which John McCarthy was definitely a leader. Here you basically have symbols (these are the bolded words here), and those symbols are sort of indivisible things, but you can talk about relationships between them, and you can talk about rules of inference and how the rules of inference work. So you can have something resembling intelligence discussing objects in the world as symbols. The problem has been how you ground those symbols in actuality, in real perception, and that will come up in a minute when we get to deep learning and other things. But these symbols, to the machine that's looking at them, don't have the meaning that we associate with them when we read them. We read a lot into "cat", but to these symbolic systems it could just as well be G0537, an instance of G0083; I've just done substitutions for the symbols there.
And those relationships, "instance of", have a lot of meaning to us, but to the system they are just symbolic sorts of things, with strict rules of inference and so on. So really, something more like this is how things look to the machine. There's a great advantage to these symbols, though: you can compose symbols, and compose things from different subsystems by using the symbols, and that isn't true for some of the other approaches.

The second approach I want to talk about is neural networks; a cartoon again. And it's, you know, version 2.0, 2.1: it's been rediscovered and rebuilt many times. Marvin Minsky, in his 1954 thesis, was doing neural networks at Princeton, and then it came again in the late fifties, came again in the sixties, etc., etc. Just for those who don't know, again as a cartoon, the rough idea of neural networks is that you have these things which are, in some really abstracted way, modelled after neurones, starting with a 1943 paper by McCulloch and Pitts. They're not really like real neurones in any way, but the press likes to pick up on "oh, they're modelled after the brain". It's a very far thing from the brain.

The idea is that you have inputs which come into the neurones on the left of a feedforward network; the neurones do something (I'll tell you what that is in a minute) and push it on to the ones further to the right. The outputs, the ones on the right, are some sort of classes, and the one that is most activated (we'll talk about that in a second) is the winning class for what the inputs are looking at. Each of those neurones is typically a linear weighted sum of the inputs. The inputs are between zero and one, x1 through xn, and the weights are what is learned; they're just a bunch of numbers. These get multiplied and summed to get a number that could be quite big, negative or positive, so you put it through a function which squashes it down to between zero and one again, and that's the individual neurone. So you have inputs from some sort of sensory apparatus, it goes through these hidden layers, feeds forward, and then on the right, if the output labelled "cat" has the highest score, it says there's a cat in the image, or a dog in the image, or a car in the image. But it's all feedforward.
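As a rough illustration of that cartoon, and only the cartoon, here is a minimal feedforward sketch in Python with NumPy. The weights would normally be learned by backpropagation, and real deep networks add convolutions, other non-linearities, softmax outputs and much more; none of that is shown here.

```python
import numpy as np

def squash(z):
    """Squash an arbitrarily large positive or negative number into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def feedforward(x, layers):
    """x: inputs in [0, 1]; layers: list of (weights, bias) pairs.
    Each neurone is a weighted sum of its inputs, squashed to (0, 1).
    Activations only flow forward; nothing flows back towards the input."""
    for W, b in layers:
        x = squash(W @ x + b)
    return x  # one value per output class; the largest is the "winning" label

# Tiny made-up example: 4 inputs -> 3 hidden neurones -> 2 output classes.
rng = np.random.default_rng(0)
layers = [(rng.normal(size=(3, 4)), np.zeros(3)),
          (rng.normal(size=(2, 3)), np.zeros(2))]
scores = feedforward(rng.random(4), layers)
print("winning class:", int(np.argmax(scores)))
```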
And we know that the brain doesn't work like that at all in the human visual system; we'll come back to this in a bit. There are always things going the other direction, from the output towards the input. In, I think it's V1 to V2, one part of the visual cortex, there are ten times as many connections going backward as there are going forward. These networks are all feedforward, and that has some implications.

Then the big revolution, about nine years ago, was something called deep learning, where we went from two or three layers to 12 layers, and now you can find hundreds of layers in some versions of it. This has been the revolution in AI which I think has got everyone thinking AI is important again, just as it was with expert systems decades ago, and it is what is driving everything about AI today: it's exploitation of this particular algorithm. And it was nine years ago that this was done. People tend to think, oh, every three weeks there's a new revolutionary concept coming out in AI, but it's not true; the revolutionary concepts come fairly slowly, and this was a big one.

One of the things that happened was that previously, with those two- or three-layer networks, there'd been some algorithms developed by people. That's a person, you know, a person programming, not analysing the picture of the car, but writing the operators that look at the images: maybe find circles, maybe find straight lines, etc. And those were the inputs to the network, out of which comes a classification. In deep learning, people got rid of that handcrafted input processing. That, by the way, is why speech understanding has got so good. Five years ago, speech understanding systems were still pretty bad, but Alexa, the Amazon Echo, if any of you have used it, is pretty damn good at transcribing speech to text; I'm not talking about understanding, but getting what the words correspond to. Accents don't matter too much, noise in the room doesn't matter too much. And that's because the deep learning got rid of the human-built early operators on the speech signal and learnt much better ones.

People have overgeneralised that to say we should never have the human doing any part of the job. A lot of deep learning people say that, and I think that's a mistake in many ways.

The world came to know about this in a 2014 article in The New York Times by John Markoff, where this image was shown. Google, with a couple of networks, labelled it "a group of young people playing a game of frisbee", and I think most AI researchers were surprised; that was pretty damn good, and much better than expected. Were you surprised, Stuart? Yeah. Yeah, I was surprised.
It was surprising. It was a bit of an eye-opener that they could do this. So that really got the world talking about AI, I think, around 2014.

In with neural networks I'm going to include something called reinforcement learning; I'm just going to lump that together. This was Donald Michie, who was a colleague of Alan Turing at Bletchley Park. He did this in 1961 at the University of Edinburgh. I think he was a professor of surgery, so he wasn't allowed to use a computer, so he built a set of matchboxes with coloured beads in them and had reinforcement learning learn to play the game of tic-tac-toe.

Traditional robotics is the third approach I want to talk about. I think this got started with the, uh oh, just lost his name, someone at MIT. Who? Larry Roberts. Yes, thank you. Larry Roberts, who showed how to take images of polyhedra and get the lines out. Now, at this time taking an image was a job: you took a picture with a camera, you got film, and then you scanned the film, bit by bit, mechanically, to get the image. And he showed that you could get polyhedra, simple polyhedra, from actual images. Then AI went off and asked, okay, what about line drawings, can we figure out the three-dimensional structure from those? And that worked out fairly well, with shadows in there, etc. It was then used for robotics experiments such as the cube stacker at MIT, which looked at a stack of cubes and tried to model them, and the robot, run by the AI program, copied the stack of blocks. Unfortunately, this is the only photo I can find of it, which doesn't really show you much.

SRI, the Stanford Research Institute, which of course is right next door to Stanford, built the robot Shakey in 1970. It lived in a world of polyhedra, with each side of each polyhedron painted in a matte colour, and it built 3D models of the polyhedra and planned how to get around.

This is a photo I took in 1979: that's Hans Moravec up there filming his robot, the Cart, on its only ever outdoor run, because it needed the whole mainframe and normally no one was going to let him use the computer. But this was the last day before the lab closed down to move to campus, so he got the computer to run it all day for this one outdoor run. You can see he built polyhedra for the obstacles. But it took 15 minutes of mainframe processing to process the images and plan a motion of one metre.
And the shadows moved a lot in that 15 minutes, which really screwed up the algorithms.

Traditional robotics, then, in a sense built a complete world model and then planned what to do. This is the robot Freddy at the University of Edinburgh, with a fairly simple polyhedral world that it modelled. By the way, that hand is about a metre wide and a metre high, and that is not a miniature camera there next to it; that's about a 30-centimetre-long camera. We all today think cameras are tiny and plentiful and cheap. They were not, for most of the history of AI.

And then there's behaviour-based robotics, which I attribute originally to Grey Walter, an American who was at Bristol. This is a 1950 paper he wrote on his tortoises; a tortoise is there on the bottom left. This particular one had a circuit with two vacuum tubes, or valves. That's a vacuum tube there: there's a filament which heats up and spits out electrons, an anode there which collects them, and these plates in between can modulate it. That's how people used to do electronics. It's only got two of them, but it could demonstrate all sorts of behaviours; this is from that four-page Scientific American article. The next year he published another one, on a machine that could learn. There it had grown to seven vacuum tubes or valves for the learning, and it was very much like Pavlov: the 3,000 cycles there is a whistle that it could hear, and it could do Pavlov-type experiments. So that was behaviour generated in a sort of holistic way.

I took that and changed it a little bit for digital, where I connected sensors to actuators: instead of building one central model, I maybe had lots of partial models all happening at once, each layer generating different behaviours, with the actuators sort of figuring out what to do with the conflicting demands, and I put them together into networks. This is a robot called Genghis. On the left are the little finite state machines which make up the layers; there are actually six copies of many of them, one for each leg, 57 finite state machines and 12 layers there on the left. And I got these robots to walk around and do interesting things.

But they did have implications. Sojourner, which went to Mars and landed in 1997, was operated from the ground for the primary mission of seven sols, seven Mars days, and for the secondary mission of a further 21 sols. But at sol 28 the behaviour-based system was turned on, and this image was taken on sol 72.
There's Sojourner, off in the distance from the lander, wandering around exploring Mars by itself.

And then, as Pete mentioned, this has been the basis of the Roomba, 20 million of those in homes, and of a lot of robots in Afghanistan and Iraq dealing with roadside bombs, and in Fukushima. We got there a week after it happened and helped with the shutdown; it was really essential for the shutdown of the reactors that were still operating, through the cold shutdown. In the top middle there you see one of the larger robots, a 200-kilogram robot with a suction tube, a really big Roomba, that was used to clean up radioactive material. When I was last in Fukushima in 2015, these robots were still operating, and many more. The cleanup is not going to be finished until 2050, by the way. And then more recently we have put robots in factories using this behaviour-based system.

Around the year 2000, at MIT, Damian Isla and Bruce Blumberg took my finite state machines, which sort of mix logic and behaviour together, and split them into something separate called behaviour trees. Those behaviour trees have become quite popular. At Rethink Robotics, my company, when someone shows the robot what to do, it automatically builds a behaviour tree, on the left there, and you can go and edit it. But more importantly, or perhaps more interestingly, about two thirds of all video games are now programmed with behaviour trees. This is one of the frameworks, Unity; there's a whole bunch of them. So people all around the world are programming the AI of little characters in video games using behaviour trees. If you count the actual number of instances, every one of the horde of those little creatures in the video game that are coming to attack you is running an instance of a behaviour tree, so by raw numbers most AI systems in the world are behaviour-based. I have forgiven Stuart for saying in his book that there's been no known application of behaviour-based robots. Anyway, that was the first edition.
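For anyone who hasn't met behaviour trees, here is a minimal sketch of the idea in Python, just sequence and selector nodes ticking leaf actions. This is illustrative only; real game engines and robot programming tools add running states, blackboards, decorators and much more, and the stub functions at the bottom are hypothetical stand-ins for a game's or robot's own code.

```python
SUCCESS, FAILURE = "success", "failure"

class Action:
    """A leaf node: run a function, report success or failure."""
    def __init__(self, fn):
        self.fn = fn
    def tick(self):
        return SUCCESS if self.fn() else FAILURE

class Sequence:
    """Run children in order; fail as soon as one fails."""
    def __init__(self, *children):
        self.children = children
    def tick(self):
        for child in self.children:
            if child.tick() == FAILURE:
                return FAILURE
        return SUCCESS

class Selector:
    """Try children in order; succeed as soon as one succeeds."""
    def __init__(self, *children):
        self.children = children
    def tick(self):
        for child in self.children:
            if child.tick() == SUCCESS:
                return SUCCESS
        return FAILURE

# A made-up game character: attack if an enemy is visible, otherwise wander.
enemy_visible = lambda: False
attack = lambda: True
wander = lambda: True

character = Selector(Sequence(Action(enemy_visible), Action(attack)),
                     Action(wander))
character.tick()  # called once per game frame or robot control cycle
```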
So this is my summary of the four approaches to AI. Symbolic is very deliberative. Traditional robotics is very deliberative. Behaviour-based, with behaviour trees, can be reactive and deliberative. And neural networks ("I see a cat") don't really know what they're doing; they label. And here I did a really scientific study (that's a joke) of the strengths of the different approaches. For composition, on a scale of 1 to 3, symbolic is the best, because those symbols let you patch different things together via the symbols. But symbolic is pretty bad at being grounded, whereas the neural systems really ground the symbols that are the labels in sensor data, so they're good at grounding. And then, just for fun, I again very scientifically compared this to a child, and I added cognition; remember, this is on a scale of 1 to 3, and these are the numbers that I came up with. I think we're very, very far away from what even a child can do, and I'll talk about that in some detail a bit later.

Now, I think a lot of people have been very bad at predicting the future of AI. I had a piece in Technology Review last year about the seven ways I see people getting their predictions wrong, including treating sufficiently advanced technology as magic, and thereby attributing to it any quality you want for the purposes of an argument about what it's capable of. I'm just going to talk about two of these cases: performance versus competence, and suitcase words.

So, performance versus competence. When we see a person perform some task at some level, we have a good intuition for what else they know around that performance, in that general area. If we see a person describe this image as "a group of young people playing a game of frisbee", we'd expect to be able to ask the person, what's the shape of a frisbee? We'd expect them to know that. We'd expect them to know whether a person can eat a frisbee, whether a three-month-old can play frisbee, roughly how old these people are, that kind of thing. But the labelling systems know nothing about this. They don't have any general competence around these symbols with which they label the images; they have a performance, but not a competence. In the same way, Deep Blue beat Kasparov at chess, but Deep Blue couldn't be a chess coach in any shape or form, whereas a human who beat Kasparov could probably even teach me to play a little better chess than the really lousy chess that I play. It would be weird if a person could label those images and didn't have these more general competences. So people hear about the performance of a system, that it beat the world champion, and go, wow, it must be really intelligent, it must be able to do just about anything. But no, it's very narrow.

And these labelling neural nets label images based on probabilities, when it comes down to it: there's a 90% chance that's a person, a 60% chance that's a person.
But we would never make the mistake of saying, oh, there's a 20% chance that piece of tree is a person. Yet those probabilities get put together and come up with something. Hence adversarial attacks on deep learning have become very popular. This is one of the earliest ones. Here's a deep learning network: it says with 99% confidence that's a guitar, and 100% that's a penguin. Then a program goes and plays around with images, trying to create images, doing hill climbing in a genetic space, and it comes up with this image here, literally this image, which the network says at 100% is a guitar, and that one's a penguin. What it's found is a set of pixels that provoke the early stages of the network, and that's sort of weird, so you get these weird classifications. My favourite there is the school bus: you can sort of see the school-bus-ness of it; for an American, that's roughly what school buses look like, with the yellow. But it's clear that there's no real spatial understanding. Convolutional neural networks actually do lack coherence of spatial input, and I'll come back to this later, because it actually has significant problems for us.
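To give a feel for how simple this kind of attack can be, here is a loose sketch in Python of random hill climbing in pixel space against some image classifier. The `classifier` function is a hypothetical stand-in (any model returning a vector of class probabilities); the work shown on the slide used evolutionary search and gradient methods, but the flavour is the same.

```python
import numpy as np

def fool(classifier, target_class, shape=(64, 64, 3), iters=10_000, seed=0):
    """Hill-climb a random image until `classifier` is confident it belongs
    to `target_class`, regardless of what a human would see in it.
    `classifier(image)` is assumed to return class probabilities."""
    rng = np.random.default_rng(seed)
    image = rng.random(shape)
    best = classifier(image)[target_class]
    for _ in range(iters):
        candidate = np.clip(image + rng.normal(0.0, 0.05, shape), 0.0, 1.0)
        score = classifier(candidate)[target_class]
        if score > best:  # keep any mutation that raises the target confidence
            image, best = candidate, score
    return image, best    # often near 100% confident, yet meaningless to us
```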
Suitcase words. This is a term Marvin Minsky came up with, where he pointed out that many words we use have many, many meanings. What happens is that an AI researcher comes up with something that can do a little bit of classification, or a little bit of recognising, or a little bit of reading. The institutional press office says, we've got to get a press release about this, and before you know it, it gets turned into "local professor has built an AI program that can hallucinate", for instance; we've seen "hallucinating" said of AI systems pretty loosely. This is really wild. But these are suitcase words.

Take the word "learn", for instance; it actually means many, many different things. Learning to play tennis is related to, but rather different from, learning to ride a bicycle. It's certainly a very different sort of process from learning ancient Latin. And even though they're both done at blackboards, learning Latin is again a very different process from learning algebra. I'm good at algebra; I was really lousy at Latin, and I play really lousy chess. Learning to play chess is a different sort of skill again, and learning to play music again; these are all very different skills, but we use that same suitcase word. In my last company I'd have the VCs call me up and say: such-and-such has just come in, down in the Bay Area, and they're doing learning for robots; your robots have to have learning, they've got to have learning, or they'll be beaten. That also shows my love for VCs. And learning your way around a new city is a very different sort of process again. So these are suitcase words, and when we hear that a machine can do a little aspect of one of them, it often gets generalised to the big aspect, which is a much bigger set of skills.

Now, back to Alan Turing, and this is getting back, Aaron, to what he really meant by the Imitation Game: can you tell whether it's a computer or a person answering your questions? I've sometimes thought that he was using it as a rhetorical device to say, well, if you can't tell, then you can't say anything more about thinking, because the machine can fool you; so you can't say a machine can't think, because you just don't know. And I don't know; the step from man versus woman to machine versus person doesn't quite fit with what he was doing, and it's a little strange. Then can it be said to be thinking?

So some people, and I at various times, various versions of me over the decades, have thought that it was just a rhetorical device, a thought experiment to show that in principle we can't rule out that a machine can think. But as I reread his papers now, I see that he talks about how to construct a program for such a machine. He thinks 3,000 person-years, that 50 years of 60 programmers, was way too onerous, so he suggests having the program learn like a child and then play the game. So at some level he does seem to treat it as a real test. And so it's become, supposedly, a benchmark for AI, but it's something that can easily be hacked: all the Turing Test competitions are won by stupid programs which are not intelligent at all.

So I suggest that we get rid of Turing's Imitation Game and think about better tests: get machines to do real tasks in the world. I'll just give you a couple of examples, and we do not have a clue how to do these. I mean, you could give the best AI company in the world 5,000 people devoted to these tasks and they wouldn't get very far in the next five years; if they were really lucky, maybe they'd get somewhere in the next ten years. One is an elder care worker, which is an embodied task: a live-in care provider for an elderly person, over decades. And we will need these, by the way, as the baby boomers get older.
It's got to understand human relationships and expectations in a household, provide physical help to the person, including manipulating their whole body as they get weaker, and understand their degrading language. Alexa can only understand pretty good language, but as a person gets a little hard of hearing and can't get the nouns so well, they start pointing, they start nodding, they start grunting. It has to understand that, provide for human needs, etc., etc.; there's a whole bunch of things it has to do.

A service logistics planner would be a disembodied system: design and implement, perhaps, a new dialysis ward in an existing space, from scratch. This is just an example, and people can do it today. So if we're going to have superintelligence, it had better be able to do all these things, because otherwise it isn't superintelligence. And to do that it's got to do all sorts of geometric reasoning, quantitative geometric reasoning, quantitative physical simulation, understand human needs and fears, understand how family members will feel and act around dialysis, all sorts of stuff.

For the elder care worker, these are the happy pictures you get when you Google "elder care". The reality is not this happy. It has to give physical help to people, and at a certain stage it's really not so happy, because they normally wear a diaper, and figuring out how to help people at that stage is going to be very difficult for a robotic elder care worker. But it has to help them with all sorts of physical tasks.

The service logistics planner I base on the idea of an army colonel. When the US military goes into some new area, an army colonel is given tasks: set up a hospital, set this up, set up schooling, set up all sorts of stuff. Well, say "set up a dialysis ward", and the colonel will be expected to do it. They have to figure out how the patients are going to sit or lie, how many places you need in the dialysis ward, what the flow of people through it is going to be, what the nurses or other attendants need to do with the people, what sort of information they need to give, what sort of feedback, what the layout of the space for dialysis needs to be, what the waiting room is like, how people get from public transportation in the city, or from cars, into and out of whatever facilities: a whole bunch of problems which we cannot have an automatic system do at the moment. And if you say, oh, deep learning will do it, well, you've got to have an awful lot of examples of dialysis wards, and that's going to be pretty hard data to get in the quantities that you need. So I think these sorts of challenges are the right sorts of tests.
If we can't do these sorts of things, we're not getting towards human-level intelligence.

So, what's hard today? On my blog I've got seven of them, somewhat randomly chosen; I'm going to briefly talk about four things that are hard today, that we have no idea how to do, but that superintelligence proponents sort of assume we're pretty good at.

First is real perception, as distinct from labelling those images. Here are some other examples; these are adversarially generated. No one would look at the third one from the left in the top row and say that's an armadillo, but this particular network does. Well, that's sort of fun and games, but it does have some real implications. This is from a hearing in the US Senate two or three months ago, where there's a stop sign that an automatic driving vision system labelled as a 45-mile-an-hour sign. Now, it's got four pieces of tape attached to it; it was produced purposely to fool it. If you look around the S and the T with those two pieces, it's sort of like a four, and over the O and the P, maybe it's sort of like a five, so you can sort of see maybe how it got there. But a stop sign is red. Why didn't it get that that couldn't possibly be a speed limit sign, because it's red? Well, it turns out that colour is not colour. We think colour is just the colour of the pixels, but when you use deep learning and you don't use a human-designed input system, it actually doesn't get colour constancy.

This is from Ted Adelson. What do we see there, anyone? A checkerboard. How do you know it's a checkerboard? Because it's black and white. Oh, it's black and white? See those two squares there? They're the same colour. And there I've expanded the pixels. Now, how could they be the same colour? It looks black and white. Remember all those connections going downward instead of upward? It turns out there's a shadow there and we compensate for it, and that's how we know that it's black and white, which it really is. But if you just look at the pixel values and don't label the individual parts, then you get it wrong.

And what about this one: what colour are the strawberries? Red. Yeah, they're red. Well, of the three quarters of a million pixels, there are only 122 pixels where the red component is bigger than both the green and the blue.
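As a sketch of the kind of count being described, and purely illustratively (the file name here is a hypothetical stand-in for the grey-toned strawberry image on the slide), counting "red-dominant" pixels is a few lines of Python:

```python
import numpy as np
from PIL import Image

# Count pixels whose red channel exceeds both the green and the blue channels.
# "strawberries.png" is a hypothetical stand-in for the slide's image; in the
# grey-toned illusion the count is tiny, yet we still see the berries as red.
pixels = np.asarray(Image.open("strawberries.png").convert("RGB"), dtype=int)
r, g, b = pixels[..., 0], pixels[..., 1], pixels[..., 2]
red_dominant = (r > g) & (r > b)
print(f"{red_dominant.sum()} of {red_dominant.size} pixels are red-dominant")
```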
Those are the ones where the red component is bigger than both the blue and the green, and these are the three pixels which just have the biggest red values in them. They're not very red. When the strawberries are there, you sort of see a little red in the grey right next to them; that's flowing over from you seeing the strawberries, and when you get rid of them, it becomes less red. Some may or may not see that, but that's where you're using "these are strawberries, they must be red". You're reconstructing everything; we do that all the time. And colour constancy is just one of many, many visual tricks that we use, which make good sense. If you can't label the colours and teach the deep learning system about them, it doesn't learn them. So the deep learning network never realises that stop signs are red, because in pixel space a lot of them aren't red, and so it's not a feature, and we don't get colour constancy.

So realism about real problems is different from what I call the tech bros, and there are myriads of them in Silicon Valley: I'm going to apply machine learning to X, I'm going to apply machine learning to Y, I just need data. But they don't necessarily know what data they need to get the robustness. If we're going to have an elder care worker, it had better work in human homes. This is from Aaron editing his thesis; this is what his home looks like, his kitchen. That's the sort of world we have to deal with. But we're pretty good, you know. Anyone know what that is? Yeah, it's a to-go container. And how do we know that? Well, maybe the salt shaker is priming us, that's the salt, but we sort of know what it is. And if you just saw it by itself, or you saw that by itself: it's a rice cooker. You can get that because you know it's a kitchen.

I'm going to do a little experiment now. Does anyone know what steampunk is? Sure. Does anyone not know what steampunk is? Oh, good, you're my test subjects. This is steampunk; okay, that's the style. This is more steampunk. Here, these are mostly costume and make-up; there's more steampunk. Okay, now we do the test: you've had three examples. Is that steampunk? They're at a Maker Faire. No, that's not steampunk. She has goggles, but it's not that steampunk wear is just goggles; the others had goggles too, around their necks. Okay, so they had goggles. Steampunk? Not steampunk? Yeah.
I think that's debatable; I think it's sort of lousy steampunk. Now: steampunk? What about that? Yeah, and it's got nothing to do with goggles. It wasn't in the examples (we had robot arms), but some of you were able to learn that category from three examples, and you got it pretty well. When we use deep learning, we show it millions of examples, hundreds of thousands of times each, all very different. So we don't have real perception.

We also don't have real manipulation. This is the AI lab at Stanford in 1978, and there on the right of the image you see the blue arm. We also had a gold arm, which is off camera; I don't have a picture of it. How do I know the gold arm is there? Because that's me. So here's the gold arm, in the lobby of the computer science department at Stanford today; you might remember these arms, Steve. You see the gripper there: a parallel-jaw gripper, with worm screws, where the two fingers come together in parallel, back and forth. That's 1978. Here's what my company was selling for grippers, in Nice, in 2018: same thing. Not much improvement in the hands. We actually sold kits for parallel grippers and kits for suction grippers. Here's the Schunk catalogue, thousands of pages of different sizes of parallel-jaw grippers, and that's what people use for robots today. We are lousy at manipulation. But humans can do all sorts of manipulation, and a superintelligence should be able to do them too. If superintelligence is going to take over and kill all the people, it had better be able to; well, it doesn't have to cook, but it had better be able to do all the manufacturing that people currently do with their hands. That's Julia Child. Again, it doesn't have to be cooking, but this just shows the dexterity of humans. This is a sushi chef: they're using the force, and changing the angle of the knife, so they don't pull apart the pieces of the fish. We can just figure out how to do these things with our hands. We cannot get... Steve, can we get a robot to do any of these? No, he says. Steve says no, so it must be true. We can't get a robot to do it. And all our manufacturing relies on these sorts of dexterity, with sloppy, floppy objects, etc.

Some people are experimenting with manipulation for people, you know, putting clothes on people and stuff, but we can't really do it yet.
475 00:48:39,960 --> 00:48:45,780 This is from my home state of South Australia, and if Nick were here I would tell him, you don't have to worry. 476 00:48:45,810 --> 00:48:49,980 We can't give the robots knives. They can't cut up rib cages yet. 477 00:48:51,450 --> 00:48:52,530 We're a long way from that. 478 00:48:52,710 --> 00:49:00,450 We can't do most of the manipulation that people can do, only very simple pick and place, even after working on it for 40 years. 479 00:49:01,680 --> 00:49:05,579 So, the next one: read a book. Why read a book? A lot of human knowledge is in books. 480 00:49:05,580 --> 00:49:09,990 If we're going to have superintelligence, maybe it should read, just like we read, to get a lot of knowledge. 481 00:49:10,320 --> 00:49:14,070 And, you know, every so often there's a scare that AI has beaten humans at reading. 482 00:49:14,580 --> 00:49:15,960 A good headline, or maybe not. 483 00:49:16,770 --> 00:49:26,430 This was where, a little while ago, people thought, oh, suddenly this AI system can read. When examined closely, not so well. 484 00:49:26,820 --> 00:49:30,780 And here are some examples. These are the Winograd schemas, from NYU. 485 00:49:31,230 --> 00:49:36,540 Alice tried frantically to stop her daughter from chatting at the party, leaving us to wonder why she was behaving so strangely. 486 00:49:36,540 --> 00:49:40,419 Who's "she"? Well, Alice. 487 00:49:40,420 --> 00:49:43,480 Yeah. But if we change one word, to "barking", now it's the daughter. 488 00:49:45,910 --> 00:49:49,120 The delivery truck zoomed by the school bus because it was going so fast. 489 00:49:49,120 --> 00:49:54,930 What was going so fast? The truck? Yeah. The delivery truck zoomed by the school bus because it was going so slow. 490 00:49:54,940 --> 00:49:59,620 Now it's the school bus. Sam pulled up a chair to the piano, but it was broken, 491 00:49:59,620 --> 00:50:05,170 so he had to stand instead: the chair was broken. But if instead he had to sing, it was the piano that was broken. 492 00:50:05,530 --> 00:50:08,650 Now, to answer all those questions, you're doing all sorts of 493 00:50:08,740 --> 00:50:12,790 different simulations for each question in your head about what's going on. 494 00:50:12,790 --> 00:50:20,920 And our systems can't do those sorts of common-sense, gross simulations at the moment. 495 00:50:21,370 --> 00:50:24,609 And writers who have written books just assume we know a lot of stuff. 496 00:50:24,610 --> 00:50:27,910 You know, you all know that Prince William is taller than his son, Prince George. 497 00:50:28,150 --> 00:50:33,880 But you don't know, and you know that you don't know, whether that's going to be true in 20 years, which is an interesting thing. 498 00:50:34,290 --> 00:50:40,600 You just know that; you didn't even have to get trained on that. And, you know, down the bottom, dolphins eat raw fish. 499 00:50:40,690 --> 00:50:44,890 So do some humans, but dolphins don't usually cook their food. 500 00:50:45,160 --> 00:50:52,000 Most humans do. We just know this stuff. It's background, and it's assumed in all writing that we know this stuff. 501 00:50:52,480 --> 00:50:58,000 Our AI systems have very little common sense. John McCarthy wrote his first paper on this topic in 1958. 502 00:50:58,390 --> 00:51:03,940 We're still a long way from it. And DARPA just announced a $2 billion research programme on common sense. 503 00:51:07,460 --> 00:51:10,480 And, you know, there are a couple of parts to it. Here's one of the pieces.
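[Editorial aside: the Winograd-schema pairs quoted above can be written down as data to make the one-word flip explicit. A minimal Python sketch follows; the tuple layout and the scoring helper are illustrative only, not the official benchmark format.]

# Each entry: (sentence, pronoun, candidate referents, correct referent).
# Flipping a single word ("fast" to "slow", "stand" to "sing") flips the
# answer, which is why resolving the pronoun needs a rough simulation of the
# situation rather than surface statistics.
winograd_pairs = [
    ("The delivery truck zoomed by the school bus because it was going so fast.",
     "it", ("the delivery truck", "the school bus"), "the delivery truck"),
    ("The delivery truck zoomed by the school bus because it was going so slow.",
     "it", ("the delivery truck", "the school bus"), "the school bus"),
    ("Sam pulled up a chair to the piano, but it was broken, so he had to stand instead.",
     "it", ("the chair", "the piano"), "the chair"),
    ("Sam pulled up a chair to the piano, but it was broken, so he had to sing instead.",
     "it", ("the chair", "the piano"), "the piano"),
]

def accuracy(resolver):
    # `resolver` is any hypothetical function from (sentence, pronoun,
    # candidates) to one of the candidates; this just scores it on the pairs.
    correct = sum(resolver(s, p, cands) == answer
                  for s, p, cands, answer in winograd_pairs)
    return correct / len(winograd_pairs)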
504 00:51:10,990 --> 00:51:16,810 The main domains are objects, agents, and places. They're saying, this is what we want you to do: 505 00:51:17,050 --> 00:51:23,500 do all these things for objects, agents and places. Those are months on the right there, not months for the research; 506 00:51:23,860 --> 00:51:28,620 they're the ages, in months, of babies when they can do these things. These are all from Liz Spelke and Susan Carey. 507 00:51:29,020 --> 00:51:34,870 So it's really a multibillion-dollar project on getting the common sense of an 18-month-old. 508 00:51:35,830 --> 00:51:40,180 That's what the funding agency believes is still hard. Not quite at superintelligence yet. 509 00:51:40,960 --> 00:51:43,510 And I'm going to skip over writing or debugging a program. 510 00:51:43,690 --> 00:51:51,130 But some people think that, you know, we're going to have artificial intelligence that's going to rewrite its own code from scratch. 511 00:51:51,430 --> 00:51:57,940 Basically, my argument here is that people keep saying this, but we cannot do it. 512 00:51:58,440 --> 00:52:01,350 And then they say, well, is superintelligence going to rewrite the rules of physics too? 513 00:52:01,360 --> 00:52:08,140 By the way, even Max Tegmark may not believe that; he's a cosmologist. 514 00:52:09,310 --> 00:52:14,140 But, you know, we don't have anything that can understand this program. 515 00:52:14,470 --> 00:52:17,560 And I was going to go through and explain how, when I just glance at it, 516 00:52:17,560 --> 00:52:25,030 I know all sorts of stuff about this program. And none of those deductions is anything that we can have any sort of system do at the moment. 517 00:52:25,540 --> 00:52:30,390 So what do we work on now? I'm running out of time. I think we can't work on those things directly, 518 00:52:30,400 --> 00:52:34,719 you know, the health care worker, the elder care worker. 519 00:52:34,720 --> 00:52:38,140 We can't work on the logistics planner directly. It's too hard. 520 00:52:38,620 --> 00:52:41,379 But these are the sorts of goals that we could work towards. 521 00:52:41,380 --> 00:52:47,980 And if we made progress towards any of these goals, our AI systems would have more common sense and be more robust. 522 00:52:48,460 --> 00:52:56,650 A two-year-old has colour constancy and colour categories, can map form to function, can figure out that they can sit on something; 523 00:52:56,650 --> 00:53:00,850 this can act as a chair even though it's not a chair. So that's form and function. 524 00:53:01,420 --> 00:53:05,559 They have object classes and can categorise objects that look totally new at the pixel level. 525 00:53:05,560 --> 00:53:09,250 Unlike deep learning, they can do one-shot subclass learning. 526 00:53:09,250 --> 00:53:13,810 You take a kid to the zoo and they see a giraffe for the first time in their life. 527 00:53:14,020 --> 00:53:18,310 You don't say, oh, by the way, that's an animal. They know it's an animal. 528 00:53:18,550 --> 00:53:24,160 At age two, after they've seen it at the zoo for a few seconds, you go home and they open a book. 529 00:53:24,400 --> 00:53:33,670 They know that's a giraffe. One-shot learning. A four-year-old can talk and listen; they know about turn-taking, understanding cues.
530 00:53:34,780 --> 00:53:40,420 They know when they're in a conversation with someone and one of the participants changes. If there are multiple 531 00:53:40,930 --> 00:53:44,980 participants in the conversation, they know how to get attention and direct their remarks to someone, 532 00:53:45,310 --> 00:53:51,040 they can tune in to a lot of conversations. They know when someone is suddenly speaking differently from normal. 533 00:53:51,310 --> 00:53:56,380 So they know a lot of stuff. And currently Alexa knows none of this stuff, for instance. 534 00:53:57,190 --> 00:54:00,459 And Alexa is a fantastic push forward, by the way. 535 00:54:00,460 --> 00:54:08,080 I'm not belittling it. But there's a lot more. A six-year-old can estimate from vision how to pick up many objects: 536 00:54:08,080 --> 00:54:12,850 whether it's going to be one hand or two hands, whether they're going to wrap their arms around it, use the whole body; 537 00:54:13,210 --> 00:54:19,540 they pre-shape their hands for a grasp, they can apply and control force appropriate for a task, they can use 538 00:54:19,540 --> 00:54:24,400 chopsticks. A six-year-old who's grown up in a chopstick household can use chopsticks, and that's a complex thing. 539 00:54:24,880 --> 00:54:30,460 They can do all sorts of tasks. None of our industrial robots can do any of these. 540 00:54:31,510 --> 00:54:34,390 They can even pick up, and this is a step towards elder care, 541 00:54:34,570 --> 00:54:41,050 they can pick up cats and dogs and pet them, and they can wipe their own bums, which is important for elder care, too. 542 00:54:44,140 --> 00:54:50,860 An eight-year-old can articulate their beliefs, desires and intentions, and they understand that other people have different 543 00:54:52,000 --> 00:54:57,670 beliefs, desires and intentions. Some of this might be a nine-year-old, but eight-year-olds fit the pattern better. 544 00:54:58,450 --> 00:55:03,280 I apologise. And they can deduce many of these things by observing others. 545 00:55:04,090 --> 00:55:06,340 And we don't have systems that can do that today. 546 00:55:06,460 --> 00:55:13,780 But if we made progress on any of these four things, it would help our systems operate in the world, where things go wrong. 547 00:55:14,470 --> 00:55:20,020 Oh, the superintelligence destroying us; thanks, Nick. But here's where it actually goes wrong. 548 00:55:20,260 --> 00:55:25,230 It goes wrong in the datasets putting bias into systems. 549 00:55:25,240 --> 00:55:29,890 And some of the headlines get a little hysterical, but there are real issues there. 550 00:55:31,480 --> 00:55:37,480 Watson Health: you know, IBM tried to push Watson too fast into health care, and it was a bit of 551 00:55:37,480 --> 00:55:44,080 a big failure. But now they are pushing it into digital marketing, and that's a really, 552 00:55:45,330 --> 00:55:51,280 oh well, anyway, forget it. But what if we're completely off track? 553 00:55:52,360 --> 00:56:00,220 You know, we think building intelligent machines is, ultimately, like building copies of ourselves. 554 00:56:02,490 --> 00:56:10,050 Okay, well, maybe we can do that. But what if we see these two dolphins, A and B, and then we look more closely and we notice that B is a robot? 555 00:56:11,410 --> 00:56:15,940 Are we going to conclude that a bunch of A's got together and built B? 556 00:56:16,930 --> 00:56:20,650 Did the dolphins build the robot dolphin? 557 00:56:21,010 --> 00:56:24,340 We don't think they're capable of it. Okay.
558 00:56:25,240 --> 00:56:29,140 Are we capable of it? Are we capable of building human-level intelligence? 559 00:56:29,710 --> 00:56:33,190 We'd like to think we are. We're pretty arrogant about it. 560 00:56:34,260 --> 00:56:39,120 But maybe we're not. Maybe we're just not good enough, just like we think the dolphins are just not good enough. 561 00:56:39,150 --> 00:56:44,460 Maybe the aliens up there are looking down on us: look at those little humans, trying to make copies of themselves. 562 00:56:45,270 --> 00:56:51,360 But we're just not smart enough. Or maybe we are smart enough, but we're going about it the wrong way. 563 00:56:51,360 --> 00:56:59,070 And I want to use flight as an example here. You know, people always say, well, we don't make things fly by copying birds. 564 00:56:59,790 --> 00:57:05,760 And Lord Kelvin was wrong just a few years before heavier-than-air flight, although it's not clear exactly what he was saying. 565 00:57:06,240 --> 00:57:11,030 But no, the Wright brothers won out. 566 00:57:12,200 --> 00:57:16,729 You know, earlier people had said, well, we need flapping wings, and that wasn't helpful. 567 00:57:16,730 --> 00:57:24,200 And then Lilienthal, I'm maybe mispronouncing his name, really understood that it was a static wing in airflow that was important for gliding. 568 00:57:24,210 --> 00:57:30,260 He did over 2,000 glides before he died doing what he loved: gliding. 569 00:57:31,280 --> 00:57:36,920 And he wrote a book whose German title says it's about human flight inspired by birds. 570 00:57:37,310 --> 00:57:43,639 And Wilbur Wright certainly knew about that book, read that book, and he observed 571 00:57:43,640 --> 00:57:49,520 that birds use a change of shape of their wings in order to roll left and right. 572 00:57:49,520 --> 00:57:51,950 And, yes, they did have better engines, 573 00:57:51,950 --> 00:58:02,030 but the key innovation for Wilbur and Orville Wright was realising that control of the wing shape in the airflow was what was important. 574 00:58:02,360 --> 00:58:05,569 Up until that point, people hadn't been thinking about control. 575 00:58:05,570 --> 00:58:09,950 That turned out to be the critical thing. We use our computers; 576 00:58:10,940 --> 00:58:20,690 we've had, you know, Moore's Law just delivering us better and better computation for a long time, and computation is our current metaphor. 577 00:58:21,230 --> 00:58:24,110 But maybe we're not thinking about things the right way. Maybe. 578 00:58:24,470 --> 00:58:30,440 And maybe, you know, maybe it's going to be 150 years before someone figures that out. 579 00:58:30,750 --> 00:58:34,309 Some things take a long time. We do not know how long it will take. 580 00:58:34,310 --> 00:58:38,750 But I want to end with something that Alan Turing said in 1950. 581 00:58:39,410 --> 00:58:46,540 We can only see a short distance ahead, but we can see plenty there that needs to be done. In AI, there's plenty for us all to do. 582 00:58:46,550 --> 00:58:48,620 So thank you. And I'm sorry I went a bit long. Thanks.