ARI: Hello, I'm Ari.

CLAUDINE: And I'm Claudine.

ARI: Welcome to Proving the Negative. We're a podcast all about exploring the different sides of cybersecurity, from political to computer science, international relations to mathematics. Join us as we talk to our friends about the work they do.

MARY: My name is Mary. My doctoral project was about the security of speech interfaces and voice control. My website says that I am a cybersecurity researcher and an innovator. Being an innovator is something that I am aspiring to at present! I am keen to deepen my understanding and, in the future, use that to innovate. I'm not quite sure what that will be, but something that will be of genuine value to people and of genuine use in contributing to the field.

ARI: What are you curious about?

MARY: More broadly, it's this technical understanding of detail that I find fascinating. The technical detail of how things become vulnerable, how it works in practice, specific bugs in specific systems, and digging into generic descriptions and news headlines to understand how that works in practice. That's something that I've scratched the surface of, and continue to learn about. I didn't start from a technical background - I've tried to gain as much technical understanding as I can during my doctorate and on an ongoing basis. And I've found that really rewarding; especially when I feel that I have fully understood something and I know how it works, that really motivates me.

CLAUDINE: I'm curious - you said you don't come from a technical background. How did you become interested in attacks on speech interfaces?

MARY: Well, a varied background - I came into cybersecurity as a second career, so I had accumulated a fairly mixed bag of knowledge, some of it useful. I started out in humanities learning Latin, which I really enjoyed at the time. It taught me in detail how language and grammar work, and I always wanted to follow that up; I'd studied in my own time. When I got the opportunity to do the doctorate, I was looking to use as much of my prior knowledge as possible. Language technology and speech, even though at the time I didn't know much about it, seemed to fit that bill. As I looked into it, people started to use Alexa and Google more widely. I feel I was quite lucky as well, in that it coincided with a more general trend. I didn't know that at the time - a stroke of luck, and it fitted previous interests.
ARI: Which general trend was that at the time?

MARY: When I started looking at this in 2015, it was still fairly unusual for people to be using Alexa and Google and voice on their phone and things like that. My impression is that that's become more widespread since, more ordinary. Perhaps less so amongst security researchers, who have concerns about Alexa listening in on their conversations - which was kind of why I pursued this research. Certainly, looking at my family and friends, they all seem to be using Alexa in a way that they weren't five years ago. That's not something that I can claim to have foreseen; it's been lucky for me! When I started the research, I wondered if it was too niche, possibly something that would go out of fashion within a couple of years. That doesn't seem so far to have happened, so that's encouraging.

ARI: Are we looking at voice hacking here?

MARY: That sounds quite cool, doesn't it? Very sci-fi! Well, yes, in the sense that hacking is trying to get a system to behave in an unintended way. And that's what I've tried to do with voice control - provide speech or audio input to devices that will provoke them to do something they shouldn't do, according to the developer's specifications. Yes, you could describe that as voice hacking; it's perhaps a little bit in the realms of science fiction for the moment.

ARI: You're also a penetration tester. We talked about general, headline security but also these technical details. You've said you're interested in these details... What is pen testing and what interests you about it?

MARY: I did an internship with a pentesting company during my doctorate. What it consists of is pretending to be the person who's trying to attack a company, but your intentions are good rather than bad. You pre-empt an attacker by attacking their potential victim yourself, without intending any actual harm, and give them a chance to prepare for someone doing that for real. Having been in academia, considering theoretical and abstract topics, and then moving to the real world of vulnerabilities and management of large systems - there is quite a contrast. It was quite a valuable experience to be aware of it. You learn that you might solve something as a theoretical problem, but implementing that in reality, in society, is going to be very different. Some awareness of that as an academic is good.
CLAUDINE: How did academic experience inform professional experience (and vice versa)?

MARY: Oh, that's an interesting question. They are different worlds, and that is as it should be. In industry you're not looking for the perfect solution that will be valid forever; you're looking for something that will be of practical use in a given time and place. Everything is done under time pressure - you don't have time for perfection. Whereas in academia there is certainly the luxury of having that time, and a lack of real-world consequences, to consider things. Each has its own place, really. We need people who are exploring things on a theoretical basis, and there's a need for practical implementation, where academic concerns need to take second place to more immediate, financial, practical things. I certainly wouldn't say that one is more important or significant than the other. I just think they both need to be valued for what they are.

CLAUDINE: Could you talk more about your research - what were you looking at?

MARY: I looked quite specifically at Google and Alexa because they're the most widespread devices in use, at least in the English-speaking Western world. The first thing I needed to do was gain an overview and understanding of the technology involved and the potential security problems that it currently has, and that it could possibly have, and link that to the broader cybersecurity context. So really, the best part of my first and even second year was taken up with that. In the second phase of the project, I did more practical experiments, looking at developer versions of Google Assistant and Alexa Skills - apps for voice devices. For the Google Assistant, what I did specifically was input nonsense sounds: sounds which don't actually have any meaning attached to them (all languages have these - sounds which, in the current usage of the language, carry no meaning). I'd see what would happen if I directed that kind of input to Google Assistant and whether that would trigger any unexpected effects. I was looking for sounds which rhymed with target commands and seeing if I could trick the device into thinking that a nonsense sound was a target command, which worked in a couple of instances - quite satisfying! I also tested the human perception of those nonsense sounds to see whether they would detect them as a target command. As it turns out, humans couldn't hear any meaning in those sounds at all. So it opened up a gap between what a human would hear and how a device would interpret it. Often in security there's a gap between how things are in a system plan and how they are in the real world. That's the essence of it. That gap between how humans heard sounds and how they were picked up by Google was a potential security vulnerability.
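To make the idea concrete, here is a minimal, hypothetical sketch of the kind of search Mary describes: ranking made-up words by how phonetically close they sit to a real command word. The phoneme spellings, command list and similarity measure below are illustrative assumptions, not the method or data used in her research.

```python
# Hypothetical sketch of the "nonsense sound" idea: rank made-up words by how
# phonetically close they are to a real command word. The phoneme spellings
# are rough ARPAbet-style approximations, invented for illustration only.

from difflib import SequenceMatcher

TARGET_COMMANDS = {
    "call": ["K", "AO", "L"],
    "pay":  ["P", "EY"],
    "open": ["OW", "P", "AH", "N"],
}

NONSENSE_CANDIDATES = {
    "kawl":  ["K", "AO", "L"],            # rhymes with "call" but is not a word
    "bawly": ["B", "AO", "L", "IY"],
    "payg":  ["P", "EY", "G"],
    "fopen": ["F", "OW", "P", "AH", "N"],
}

def phonetic_similarity(a, b):
    """Similarity of two phoneme sequences, in the range [0, 1]."""
    return SequenceMatcher(None, a, b).ratio()

def rank_candidates(target):
    """Return nonsense words most likely to be misheard as `target`."""
    target_phones = TARGET_COMMANDS[target]
    scored = [
        (word, phonetic_similarity(phones, target_phones))
        for word, phones in NONSENSE_CANDIDATES.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)

if __name__ == "__main__":
    for command in TARGET_COMMANDS:
        best_word, score = rank_candidates(command)[0]
        print(f"target '{command}': closest nonsense sound '{best_word}' "
              f"(similarity {score:.2f})")
```

In a full experiment, the top-ranked candidates would then be synthesised as audio, played to the assistant, and separately played to human listeners; the security-relevant result is exactly the gap Mary mentions, where the machine hears a command and humans hear nothing meaningful.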
ARI: How have you approached the biggest challenge in your work?

MARY: There are technical challenges in trying to write code, getting things to work... Things inevitably don't work, and you have to redo them... Then there's the immense challenge of information overload, knowing where to stop. You can't realistically know everything there is to know about your subject; on the other hand, you want to know as much as possible to inform it. The most challenging thing, really, has been a bit more personal: trying to be convinced that what I'm doing will (potentially) work and be of use. One particularly challenging aspect of that is that when you start anything, any project, inevitably your early ideas will appear, in retrospect, very naive, very romanticised and, frankly, a little bit embarrassing. To move beyond them, you really have to be prepared to see that, but also not just stay there, and instead move on to finding something that is more realisable and more serious. So that process - being convinced that it's worth persevering and then not getting too caught up in your mistakes - has been the biggest challenge for me, especially in research, where you're not told what to do; you're expected to be exploring something that hasn't been done before, so you really are not going to get it right. Carrying on regardless in research is an important skill, perhaps even more so than technical understanding or reading papers.

CLAUDINE: I have to ask - what was an early idea that you had to move past?

MARY: Of the many examples, I remember submitting an early version of my doctoral proposal plan, and it was something to do with... As I said, I studied Latin. I had this idea that I was going to somehow link the theories of rhetoric in ancient Rome to the security of modern voice control. That was going to (somehow) be of practical use and solve a cybersecurity problem. Within six months of learning something about cybersecurity, I realised that was, very clearly, never going to work.
It was the starting point. You really do have to start somewhere and not get too caught up in the less-than-perfect aspects of your earlier work.

ARI: We ask guests "when did you fail and what have you learned?" Perhaps it's about being open to new ideas - you've got to start somewhere. Especially in an academic environment, where it can be quite critical. How do you use that? How do you keep going? What does resilience look like? Cybersecurity leveraging Latin was certainly an idea. Then you got to a very relevant, useful, impactful idea by the end of your project.

MARY: It is difficult because, as you say, especially in academia, there's this pressure to be convincing, to sound credible and authoritative all the time. And of course, that's just not realistic. You're constantly learning as you go along. There's a balance between being open about your mistakes and humble about what you don't know, while at the same time having a degree of composure and valuing what you DO know... having a responsibility to try and contribute that. It's very much a balance between those two things. You can become obsessed with your mistakes, or you can fail to see them (and continue to make them). So it's finding that happy medium between the two.

ARI: I took part in your experiment! I remember making up nonsense, random words; there was some wordplay involved... What is a target command and how does wordplay fit into that? You said that people didn't understand the words, but the computer did?

MARY: Yes. There were two parts to the practical experiment. One was the nonsense sounds targeting the audio part of the voice-controlled device; the second part was more about understanding words and meaning. Most words have multiple meanings depending on the context they're used in. We disambiguate automatically - we barely think about it. We just use words in their correct context, with their given meaning. It turns out that we don't fully understand how we do that, and we struggle to teach machines how to do it. In my second experiment (the one you took part in, Ari) I was trying to demonstrate that humans could come up with usages of words which meant something different to a target command in a given context, feed them to Alexa, and Alexa would pick them up as that target command (as programmed).
One example is the word "bank". We can say: I was standing on the (river) bank, I lost my balance on the bank... we would understand we're talking about water. But Alexa, because she doesn't understand about banks being next to water, will pick it up as a command to do online banking and act accordingly. So you could trick a system into doing something malicious: you could use a word in a context that sounds totally unrelated and trigger a victim's system to do a banking transaction... any malicious act you can think of, you might be able to work into wordplay. There's a sinister contrast between the playfulness of the wordplay and the maliciousness it could be used for.
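The risk Mary describes can be pictured as keyword-style intent matching that ignores word sense. The toy sketch below is purely illustrative - the intent rules and sentences are invented for this example and say nothing about how Alexa or any real assistant is implemented - but it shows how a context-free matcher fires a banking intent on a sentence about a river bank, while even a crude context check does not.

```python
# Illustrative toy example of why context-free keyword matching is risky.
# Intents, keywords and sentences are made up for demonstration purposes.

NAIVE_INTENTS = {
    "check_balance": ["bank", "balance"],   # fires if any keyword appears
    "play_music":    ["play", "music"],
}

RIVER_CONTEXT_WORDS = {"river", "water", "fishing", "shore", "mud"}

def naive_intent(utterance):
    """Keyword matcher with no sense of word meaning or context."""
    words = set(utterance.lower().replace(",", "").split())
    for intent, keywords in NAIVE_INTENTS.items():
        if words & set(keywords):
            return intent
    return None

def context_aware_intent(utterance):
    """Same matcher, but rejects 'bank' when river-related words are present."""
    words = set(utterance.lower().replace(",", "").split())
    intent = naive_intent(utterance)
    if intent == "check_balance" and words & RIVER_CONTEXT_WORDS:
        return None  # "bank" is being used in its riverside sense
    return intent

if __name__ == "__main__":
    sentence = "I lost my balance on the river bank"
    print(naive_intent(sentence))           # check_balance  (the wordplay misfire)
    print(context_aware_intent(sentence))   # None           (benign reading)
```

The river-word check here is only a heuristic; the broader point in the conversation is that machines still struggle with word-sense disambiguation, which is precisely what the wordplay exploits.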
CLAUDINE: How pervasive do you think this issue actually is amongst voice or speech recognition technologies, across multiple platforms?

MARY: I was looking more at control of devices via speech. For the moment, people are more worried about their devices listening in. Applications of voice control have been limited, maybe because it's difficult to control. You wouldn't want to be controlling a military robot by voice if you couldn't be sure that whatever you said was actually going to happen, or that you weren't going to say one thing and it would do something else. The issue with the security of control by voice is one of limited potential: if we could solve the security, that would open up a number of applications which are currently too risky but might otherwise be quite valuable. The other issue relating to voice that's happened in practice is deepfakes. We have them for images, but it's also happened with voice, where people have been able to do more and more convincing imitations of real individuals' voices - for example, someone being tricked into making a bank transfer. That kind of use of audio technology is becoming a real security concern. That's slightly to the side of what I was looking at in my research, but related.

ARI: Does that mean that we shouldn't let recordings of our voice go online? You've described yourself as a researcher hoping to be an innovator; it sounds like you are working in an innovative space! What do you see as the exciting or troublesome parts here?

MARY: It's an area with many problems and as yet few solutions. To answer your question about voice recordings, I really don't think the answer (in terms of what we should or can be doing) is to just not be recorded, or to limit our communication: A) this is impractical, and B) it is restrictive and fear-based. That isn't the answer. The answer is to find a way to handle this securely. That is my area of innovation that, if you look me up in five years' time, I may or may not have solved.

CLAUDINE: If you had unlimited time, unlimited resources, what would you fix?

MARY: Something which is perhaps even more fanciful than the work I've already been talking about! When Perseverance landed on Mars, one of the things that struck me was that it almost immediately started sending Twitter updates. That was probably a human typing at NASA and not the actual robot on Mars. But having said that, it got me thinking about the fact that, going into space, we're going to be sending robots ahead of us. Alexa is a robot. And I wondered, well, what would the place be for speech technology in that? Could we train the robot to describe to us the things that it finds? How would we make those communications secure? If we ever encounter anything moving on another planet, a robot will be there on our behalf. How would we train a robot to talk to the alien? How would we use speech technology in that kind of context? How would the robot describe a landscape that a human's never seen? As I say, drifting off into the realms of total fantasy.

CLAUDINE: I feel like that could very well be turned into a sci-fi limited series.

ARI: I'd watch that.

CLAUDINE: Yeah, I would, too.

MARY: Okay, next project!

ARI: We're talking about interacting with or describing the world around us. Does accessibility come into this conversation? The potential to voice hack is not just a general issue; it's also very specific for people who use technology in a certain way. Have you encountered accessibility, or how this might affect interaction?

MARY: Certainly - coming back to more immediate human and social topics, that is something I've thought about: moving away from the future, towards people's more immediate concerns. Voice control has a part to play in assisting people with mobility issues - turning on the light from across a room or turning off a kitchen device.
The reliability and security of voice control becomes even more important when it's being used by someone who might be entirely reliant on it, rather than using it as a convenient option. There are all sorts of applications that become very practical, and perhaps less romantic, in that setting. Much as I enjoy thinking about the bigger picture and the exciting topics, I do think that this kind of technology also has more immediate, more human and less glamorous applications that are just as valuable, possibly more valuable.

ARI: Is there a push for this, or is it something that we don't talk about?

MARY: It probably is something that's not talked about enough. When Alexa first came in, and certainly Google Home, it was more of a toy. Very trivial - ask the weather, play music... almost a bit of a gimmick. Perhaps there wasn't enough thought put into these more serious applications, perhaps because it was not immediately attractive to the wider population. There is scope for looking at voice control in that context and improving it, making it more reliable and more genuinely useful.

ARI: Even to install your smart light bulbs and connect it all to Alexa... the barrier to entry is actually quite high. My friend has some smart bulbs that Alexa deals with, and I can't use them - Alexa doesn't recognise my voice but recognises his.

MARY: As someone who spent a day trying to install a voice-controlled kettle - yeah, I agree with that. And that's after four years of doctoral research in the area!

ARI: Right? We are the experts, people!

CLAUDINE: With respect to accessibility and voice commands: is it your sense that, for technology developed for vulnerable populations - individuals who have specific disabilities or needs around technology - security by design is a priority, an afterthought, or not thought of at all?

MARY: We know in cybersecurity that security has been an afterthought, full stop, for the whole of the internet, for these systems that we put together. That problem becomes exacerbated when we have sensitive applications, or applications which are used by more vulnerable people. In a nutshell, security tends to be an afterthought rather than something that's baked in from the beginning in the system design.
There are new security angles to the fact that we can make things happen just by speaking to a device, angles that haven't been explored yet. There are all sorts of implications and security problems to be solved. We've only just scratched the surface.

ARI: What are your tips for keeping up to speed with cybersecurity?

MARY: I try to read blogs, e.g. Bruce Schneier's, which is the go-to for a lot of people. I subscribe to a few newsletters (one produced by one of our alumni!). When I have time, every time there's a cybersecurity story in the news, I try to look into at least some of the background to it. So, beyond the headline, I try to look up some technical blogs and articles and develop some understanding of what actually happened. From a more specialist point of view, I suppose that's quite helpful.

ARI: Who do you go to for these write-ups?

MARY: Well, Google tends to throw up CNET, technical blogs, academic articles. I use Wikipedia if I really don't know anything about a specific area. Just some kind of technical material that is beneath the media, if you like. That's a way of continuing to develop, under realistic time constraints.

Join us next week for another fascinating conversation. In the meantime, you can tweet at us @HelloPTNPod, and you can subscribe on Apple Podcasts or wherever you listen to podcasts. The title there is PTNPod. See you next week.

ARI: Bye!

This has been a podcast from the Centre for Doctoral Training in Cybersecurity, University of Oxford, funded by the Engineering and Physical Sciences Research Council.