OK, great. So welcome, everybody, to this week's graduate lecture; we're continuing in the theme of statistics and ethics and their intersection. Today I'm very happy that Catherine Fletcher is going to speak. She is in the Department of Computer Science and instigator of its Research Ethics Committee, and her talk is "The practicalities of academic research ethics: how to get things done".

Catherine, as I mentioned, is in Computer Science, where she serves as a project and programme manager specialising in interdisciplinary projects, with an interest in bridging academia and industry. Having encountered several ethics and compliance dilemmas in cybersecurity and other data-driven research fields relevant to statistics, she helped set up the Computer Science Department Research Ethics Committee, and to advance this cause she has run a series of workshops, including at the annual conference of the Academic Centres of Excellence in Cyber Security Research, one in Oxford and one at the Alan Turing Institute. She is now managing the Global Pathogen Analysis System in the Nuffield Department of Medicine, which runs in the cloud, working at the sharp end, as she puts it, to help public health agencies respond to the COVID-19 pandemic while also preserving our privacy and national data sovereignty.

So, the talk today: "Practicalities of academic research ethics: how to get things done". It's all very well to talk about truth, beauty and justice for academic research ethics, but how do you do these things at a practical level? I think today we're going to hear from Catherine exactly where to go, who to talk to, and what we need to do or be aware of on various levels, including legal levels. And what are the important implications? If you stumble upon something, what do you do with it? How do you make sure that appropriate safeguards are applied without drowning in bureaucracy? So we'll hear a brief introduction to the key legal and procedural concepts and discuss how they might apply to your work, both within and beyond academia. I think Catherine would welcome a discussion format, so if you've got questions, raise them during the talk if you're happy to be recorded, or type them in the chat and I will read them for you. OK, so over to you, Catherine. Thank you.

Well, thank you very much for the introduction. It's a pleasure to be here. So, as Garrett said, I'm a bit of a fraud, because I've spent about 12 years of my career at the University of Oxford in the Department of Computer Science.
But I have, since May, come across to the Nuffield Department of Experimental Medicine, where I'm helping run this project. As it turns out, they didn't hire me for my ethics background per se, but we're doing cutting-edge research with data again, where ethics and norms and legal structures are lagging behind. So it's something that I'm very sensitive to, and I wanted to help people learn from the mistakes I've made and the lessons I've learnt in trying to navigate the university systems, where literally there is not necessarily one authority to go to, because what you're doing is new, because it's research.

So in that sense, I would love it if you took part in the conversation. The easiest way to do that is probably to drop things into the chat. I won't be able to read it myself because I've only got a laptop, and when I'm sharing my screen I won't be able to see the comments, but Garrett has kindly offered to watch it for me. So please do feel free to use that to ask questions and stop me, or whatever you'd like.

And actually, could I ask you to use the system where, if you go to the chat at the bottom of the screen in Zoom (there's a little picture of a speech bubble), you can click on your own name and change your name to anything you like. May I just ask whether you would be willing to change your name to say whether or not you've had any formal training in research ethics, or something that you would consider formal training, something involving serious thought?

I can't seem to rename myself. So I was wrong: it's not the chat thing, it's the list of participants; that's where you click on your own name and you can rename yourself. Thank you very much for adding that; it's really useful to see as background.

The other question I would ask of you is this. I realise things can change, but do you think, as of right now, the kind of work that you hope to do in the next couple of years in your career would probably benefit from some ethics oversight?

Thank you very much. I really appreciate you putting this stuff in there. You can also use your name for anything else you want to communicate that way; it's entirely up to you what you wish to share. Otherwise, feel free to use the chat for any other written conversation that you want to add to the day. For now, I will get going with sharing my screen. I'll show you my slides and we'll be off to the races.

And I have to also say, I know that I have a terrible habit of speaking very quickly, especially when I get excited,
and even more when I've had a lot of coffee, which I mistakenly did this afternoon. So if I'm going too fast, please pull me up. Or if I'm difficult to understand, tell me, because for a lot of people English isn't even your first language, and we're having a conversation about complicated things. So tell me if I'm going too fast or if you can't understand me.

So, right. How are we going to get stuff done? I am so glad to be here, and like I said, the purpose of this talk is to share what I've learnt so that you don't have to do things the hard way, the way I had to do it. I think it's very important to keep a sense of perspective about this stuff, which is that no one's coming back from the future to stop you from doing any of the things I'll mention, so how bad a decision can it really be? But the thing is, especially since you're at the University of Oxford, it's very likely that you're bright and you're really good at what you're doing, or you wouldn't be here, which means that what you do is likely to be really interesting and new and cutting-edge. And when it's interesting and new and cutting-edge, it can be easy not to be supported by the existing systems, because the existing systems that are meant to be there for safeguards and for help, and even the legal structures, are designed for what happened before, not for what you're making now. And especially since you're here at the University of Oxford, often that means the things we do are more likely to get picked up in higher-impact venues just because they've got the Oxford name on them. So it does actually matter.

But the real reason this matters is not reputation management. It is absolutely not about you protecting the university from scandal. It's about people. The reason we are here is because we care about people, and as humans, that's what really matters. So the important part is that we don't do things that are, I mean, basically, don't be a jerk, right? But especially, don't be inadvertently thoughtless because you didn't see things from somebody else's perspective, because you weren't thinking in those terms. That's where we can come in and help you and pull you up and say: hang on, you can look at it from this angle, from this angle, from this angle. Now, do you still feel that what you're doing is responsible? Or are there any safeguards that you might put in place, for example, to make sure that you're being responsible?
Now, formally speaking, there are a lot of processes in the university, so my talk is going to cover the processes, because there are so many different things. I'm not expecting you all to remember this now. Some of you will never encounter any of these, and a lot of you will say, "Yeah, yeah, that's great, but I don't need it right now." But maybe a year or two from now, or six months from now, you might go, "Oh, wasn't there something about that? I could have sworn there was a team." So the purpose of this talk is actually to tell you about all the things. We'll make sure you have a copy of the slides at the end if you want them, and then, if you need to come back, you can refer to the slides and find all the web links to all the different places around the university.

So the first thing that you need to know is that we have a research integrity policy here at the University of Oxford, and it's absolutely standard across academia worldwide. None of this should come as news to you; none of it should be a surprise. But you also need to have had somebody tell you this at some point, because it's not fair on you if nobody ever told you we had a policy and then you break the policy. How is that fair? You would still be in trouble, right?

So, the obvious things. I've literally pulled this off the university's website, and the link at the top is where all the details are. But your fundamental jobs as a researcher are: to be honest; to comply with any ethical and legal obligations, which come either through national law or through university policies; and to think about the safety, dignity and well-being of yourself and others. That may or may not be the same thing as complying with legal obligations; sometimes they can be very different. Civil disobedience happens occasionally, for example, where people think the law is not protecting people in the way it should be. So those two are not the same thing. It's also really important to remember that this extends not just to other people but to yourself: we as a university also have an obligation towards you, and if we're not supporting you properly, I want you to know that somebody cares, somebody needs to know, and we're supposed to do that. So if you're finding that your dignity, safety and well-being are not being respected, talk to people, because there is a requirement around that.

You are also not supposed to conduct research where there are conflicts of interest; that should be fairly obvious.
But if you're not sure, or if you ever end up with some kind of financial or corporate stake or something like that, you do need to declare these things, especially in papers, obviously. Likewise, around papers and publications, attribution and authorship are really important. Especially if you're a more junior researcher, that can sometimes be a really awkward, difficult conversation to have if you're not sure and you've never done it before. The rule of thumb is: if a paper would not have been possible without someone's input, then that person deserves authorship or equivalent credit. So again, if you're not sure, talk to somebody in the university. Especially if there's an awkward relationship with a collaborator, talk to somebody else to get some advice, because the point is that this should definitely be followed. The last thing, of course, is that you're supposed to follow the requirements and guidance of any professional bodies in your field of research. Now, unfortunately, I don't know what this is for Stats; you might be able to tell me. But in computer science, for example, it's often the ACM, the Association for Computing Machinery I think it stands for, and any other professional bodies that regulate your field.

Specifically, then, the ways you can break those rules are what's known as academic misconduct, and here's a quick list of them. I'm not expecting you to read through this whole thing and memorise it. The two things I wanted to call out for your attention are these. First, failure to follow good practice for proper preservation, management and sharing of primary data. You're in the Stats department, so the idea of managing data is probably more familiar to you than it is in a lot of other areas, but the fact that this is formally grounds for an academic misconduct discussion is important news to some people. The point is, especially in the context where your data might be paid for with public money (if you have anything from a research council, for example, or say the Wellcome Trust or some charitable foundation), that's literally charity money or public taxes paying for your research. If they spent two million pounds commissioning this research, that dataset basically cost two million pounds, and so it is incumbent upon you to treat it with that kind of respect. The other thing you need to do is to make sure that you follow accepted procedures and legal and ethical requirements.
And that, of course, includes humans, but also vertebrates and, hilariously, cephalopods, and the environment. So it's just, again, about having a holistic sense of responsibility. And I'm not here to shout at you; I'm just here to say those are the official policies of the university, and I want to make sure somebody has told you. Because obviously, yeah, I don't understand why this is so hard, right? You just say: if a robot is going to turn evil, don't build it.

So the way we try to help you with that is a ton of different overlapping bodies within the university. The good news is there are lots of different places you can go for help. The bad news is that, especially when you're at the cutting edge, it's not really clear where you need to go or who you need to be answerable to, especially if there are, for example, two different committees involved. One might say yes and one might say no, or one might say "please do this" and another might add slightly conflicting advice. So later on, after I've finished talking, I would love you to ask questions about that, especially as it applies to you. I've been bumping into these things myself over the years, and I'll try to help you navigate it.

First of all, the obvious one is the Central University Research Ethics Committee, otherwise known as CUREC. What I helped set up in the Department of Computer Science is a subcommittee of that; CUREC itself sits above the whole university. Now, that basically has to do with working with people directly, so that might be interviews or surveys, or running workshops, even if the workshop is just collecting data that you're actually going to analyse. Having a workshop on a topic of your research just to talk is fine, but if the output of that workshop becomes part of your dataset, then at that point even the workshop needs ethics clearance. Right? User studies: this is important even if they're anonymous, even if you're only talking to one person, even if you're not asking about anything sensitive. It's not because we don't believe you; it's because we have no way of knowing unless you mention it. You can't just say, "Oh well, I decided this wasn't sensitive." You tell us why it's not sensitive, and then we say, "Oh great, you're absolutely right, off you go." Right?

So for Statistics, your applications actually go through the Medical Sciences Interdivisional Research Ethics Committee, MS IDREC. Your contact there is a lady named Helen Brown, who has a wealth of experience on these things. Interestingly, the reason why you go through medicine is because they deal with a lot of data about people in the context of medical science.
So they've got a lot of expertise in this. But you might also find that, depending on your kind of question, their expertise isn't exactly suited to what you're after. So there's also the Computer Science Research Ethics Committee, which actually sits within a different section of the university, for reasons, but they might be a better fit for you. The point is, you always start with Helen, but based on what Helen says, or before, when you approach her, you might say: I'll go through you if you want, because it's officially your responsibility, but I think computer science would be a better fit for me, and here's why. And again, we can talk about that later; it's just important to understand how it works.

In terms of actual processes, there's a link at the top. This is the whole point of the slides, right? Keep them for later and refer to them when you're ready. But when you're ready, if you just google "CUREC Oxford", that also takes you straight to the application process, and their website is really nicely laid out. I'm summarising what's on the screen: before you start, check the "where and how to apply" flowchart, and there's a handy link. If you do not need to apply for NHS ethics approval, here's their application decision tool to determine how to apply for ethical review of your research. And again, if you get stuck, just write to [INAUDIBLE], and she's lovely.

The last thing, especially in the formal structures, is the idea of integrity and ethics training; we have online and in-person training. If you're just doing a little one-off survey one time, or a little bit of work on data, you don't necessarily need formal training, and I'm not going to suggest that everybody in the university must have this, but I would highly recommend it if you think you're going to be working with data in the long term. Especially, actually, because when you go out to submit your CV to job opportunities, you can say, "I have actually had this training," and that makes you a much more interesting and valuable candidate. But also, the training itself is pretty good. It's very focussed on what you might actually need to do; it's less about notions of truth and justice and more about "here's how you do it, here in Oxford", and it's really helpful.

But the thing is, you only apply to CUREC if you're doing research, and "research" has a definition: it's the Frascati definition of research. Again, there's a link at the top so you can come back to it.
The Frascati definition is something that was agreed by basically everybody who has anything to do with this kind of work with data, and the definition of research is the bullet points below. It has to be all of these things at the same time, not just one: novel, creative, uncertain, systematic and transferable. So before you go to CUREC, you actually have to decide whether what you're doing is research or not, because if it's not technically all of those five things at once, it's outside the remit of CUREC. That doesn't mean they have nothing useful to say to you, but technically speaking you're outside their remit. It also doesn't mean that what you're doing therefore has no ethical considerations, so the rest of this talk is about the other things you might be doing if it's not officially CUREC-able, as it were. Right?

So one of the things you might need to do is a risk assessment. Think about the old-school kind of physical stuff: like, I don't know, are you going to be scuba diving? That's very unlikely in statistics, but there are things that can entail physical risk, like climbing ladders to collect data from something, you know. If so, that would be done through your departmental administrator, or possibly a facilities manager if it involves something about your own building. And if you're travelling for research, talk to your departmental administrator, because they have risk assessments for that. You also get university travel insurance, which is actually very worthwhile. We have a very good policy, and it's important that you have that on your side, especially in the world of COVID and post-COVID as the world begins to reopen; things get complicated. So travel insurance is great, but you only get it if you did the risk assessment and came in through the front door.

But the thing is, that's lovely, and that's all that existed three years ago. Since then, especially as the cybersecurity people that I used to work with in computer science started bumping into this, we instigated a new kind of research risk assessment. That's basically for the kinds of research that don't technically fall under CUREC, which has to do with direct interaction with people and has to be defined as research; there's plenty of other stuff you can do that probably has ethical implications but isn't those things. For that, we've invented research risk assessments, one of which is for vulnerability or network-scanning type research, the kind of thing where you're going to be poking at holes and just seeing what you can learn. And that could even be statistical inference at some point:
if the idea is that you're trying to see whether there's a weakness in a system, that's where this one comes in. And there's another one, which is separate, for machine-learning, pattern-extraction type research, which I think is probably very useful for people in statistics. The method we have in computer science is that we have these documents, and they are meant to be a self-assessment checklist, so you can just pull one down yourself after the talk if you want, or let me know and I can even give you a copy. It's meant to be a self-assessment exercise that you go through for your own benefit, so that you have a record of it. But if, based on your answers as the decision-making tool walks you through it, you kind of go, "Oh, actually, I hadn't thought about that, and I don't know the answer to that question," that's your cue to go out and get help. It could also be that you then submit it to, for example, the Computer Science Research Ethics Committee. Their advice on this is non-binding, but they do have advice; they can say, "Well, we've seen best practice for how people have dealt with that before; we can help you with that." And it's less about stopping you doing your research and more about saying: right, you want to do this research, we want you to do it too, but when you're doing it, please think about this, this and this, and make sure that you're not being stupid, literally. So if you want to find out more about that, again, I can help, but you can also just write directly to ethics@cs.ox.ac.uk, which will get you to the Computer Science Ethics Committee.

There's also data protection. I realise this is a lot of stuff to throw at you over a lecture, and I wish it were more interactive and in person, but data protection is a thing that I'm sure you will all hear a lot about in statistics. The university itself has very specific thoughts about this, and I will say that their website is now magnificent. Again, there's a link there on the slide and it's all laid out; this is a direct screenshot. What is data protection? What is the scope of GDPR? That thing with the little rectangle and the arrow pointing downwards is a really useful tool to walk you through whether or not GDPR would apply to your research. There's a really clear layout of who is responsible and what is required, and there's a lot of information about how you can transfer data, what might be exempt, and practical considerations.
It's a genuinely brilliant website. So if you're dealing with any kind of data about people, basically GDPR applies if they're European citizens, or, to be honest, if there's even a chance that one European citizen might be in your dataset. That's enough: GDPR now applies to you. Follow the rules.

The thing is, one of the things you bump into then is that if you know your research is subject to GDPR, the first thing you probably need to do is a data protection impact assessment. I'm sorry to say that this is not the most fun thing you could do on a Saturday, but it actually is really important. The reason it's important, first of all, is that in the context of academic research it's not enough to just do it right, although that's really important; you do need to do it right. The point of academic research is that it's about learning and teaching and pushing boundaries, and all of that means there's a lot of scope for mistakes. There's also a lot of scope for people who are looking to branch out in their future careers and things. So the whole point is that in academia you're always meant to make extra effort to show that you're being, well, "theatrically" isn't the right word, I don't mean that at all, but almost theatrically careful, because what you're doing is demonstrating that you know how to do all of these things. They might not always be necessary for everything, but you need to demonstrate, in the context of an academic environment, that you are able to do all of these things and think about your research as a whole, not simply in the detail. With that in mind, a data protection impact assessment allows you to demonstrate that, which makes publications happy, makes future employers happy, and makes examiners happy.

But also, GDPR entails a real-life legal risk to the university. If you goof up, the university is on the hook: they will get fined, and they will not be happy with you. But actually, the whole point is that it's not about the fine; it's really not. It's actually about the responsibility part. I'm just saying the university takes it seriously because they will get fined, and that's why they've made such a great website to make it easy for you to do this. So the first step is that you fill in a little form; you can see, again, there's a website that gives you links to everything you need. Then, based on your answers to the form, you find out whether or not you need to do more serious thinking about your research data protection design, basically.
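To make that screening step concrete, here is a minimal sketch, in Python, of the kind of yes/no logic a screening form like that walks you through. This is an illustration only, not the university's actual form or decision tool, and the specific questions and thresholds below are assumptions loosely paraphrased from this talk.

```python
# Illustrative sketch only: NOT the university's screening form or decision tool.
# The questions are assumptions paraphrased from the talk; the real form on the
# Oxford data protection pages is the authoritative source.

from dataclasses import dataclass


@dataclass
class ProjectScreening:
    uses_personal_data: bool               # data about living, identifiable people?
    may_include_eu_uk_subjects: bool       # could even one EU/UK data subject be in the dataset?
    uses_special_category_data: bool       # e.g. health, ethnicity, sexual orientation
    large_scale_or_novel_processing: bool  # e.g. scraping, linking datasets, ML profiling


def gdpr_in_scope(p: ProjectScreening) -> bool:
    """GDPR is in scope (as described in the talk) if personal data is processed
    and any EU/UK data subject might plausibly appear in the dataset."""
    return p.uses_personal_data and p.may_include_eu_uk_subjects


def full_dpia_recommended(p: ProjectScreening) -> bool:
    """Flag a full Data Protection Impact Assessment when GDPR is in scope and the
    processing looks higher-risk (special category data, or large-scale/novel use)."""
    return gdpr_in_scope(p) and (p.uses_special_category_data
                                 or p.large_scale_or_novel_processing)


if __name__ == "__main__":
    project = ProjectScreening(
        uses_personal_data=True,
        may_include_eu_uk_subjects=True,
        uses_special_category_data=False,
        large_scale_or_novel_processing=True,
    )
    print("GDPR in scope:", gdpr_in_scope(project))
    print("Full DPIA recommended:", full_dpia_recommended(project))
```

The point of a sketch like this is simply that the screening stage is a handful of honest yes/no answers; the real form, not this sketch, determines what you actually need to do.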
But the good news is that we have the Research Data Management team, Research Data Oxford. With all of these things so far, you can imagine it's kind of like a web of "well, who do I go to? Is it a compliance issue? Is it an ethics issue? Is it under the Frascati definition or isn't it? Is it GDPR or isn't it?" And I guess that can be really terrifying and confusing, especially if you've never done it. So what do you do? You take a deep breath, and first of all you can talk to your supervisor; but if your supervisor says, "Well, I don't know," go to Research Data Management. These guys are legends. Their website is researchdata.ox.ac.uk. They have real-life humans at the other end: there's a "contact us" button on the website, you click it, it gives you an email address, and a real live human will read your email and get back to you, and they'll do it quickly, and they're brilliant. They'll explain things to you about archiving, because it's their responsibility to look after your data; they can explain about licensing and the legal implications of either reusing something under a licence or putting a licence on your own data or your own software, or whatever it is you're doing. They also have a really good link with the compliance team, to make sure that things like GDPR are joined up. If you go in through the Research Data Management team, they will help you make sure that you're ticking all the boxes. I hope that's clear. I love them; they're brilliant.

So that ends the really quick summary of the types of team in the university, and I would like, later on, to discuss how you know which one you might need. But really quickly, in the interests of making sure we're all speaking the same language: all of those exist as different routes to get, basically, to compliance. Compliance means: do I comply with some specific burden? That might be a regulation, an actual law of the country, or it might be a policy the university has, where the university says all employees or all students must do such-and-such, and you must comply with that, right? So doing CUREC, or doing a GDPR data protection impact assessment form, or any of those other things, is about compliance.

But there's also ethics, and ethics just means: what does my community of like-minded people in the same area, whatever that community is, believe is right? And that's different, community by community, and I'm sure this will have been discussed by a lot of other people in the same lecture series.
But the obvious example of this is, for example, concepts around privacy and self-determination. I think very, very few people in the world will say privacy is bad, and very few would say that about self-determination, basically being able to make decisions about my own self. Everybody agrees that they're good, but how do you ensure them? Well, the hilarious sign on the left of the screen, that's a pen tester. If you're a penetration tester in the software development world, your idea is: I ensure privacy by checking it, and I ensure self-determination by being allowed to prod at the boundaries of the software that I'm using, or prod at it on behalf of other people, and see if it works, see if people are keeping their promises to me. But if you come from a corporate environment where, say, copyright is an issue, then allowing people into private things is not really how you want your intellectual property to be used. There's also the concept of, so I've got this copyright, but if you've heard of copyleft, which is why there's a reversed C there in the middle, the copyleft community have the idea that it's OK to use things, generally speaking, so long as you give credit, for example. So which is it? Do you have the concept that it's OK for anybody to use stuff so long as you acknowledge it, or is it "nope, that was mine, you can't have that, that's not fair"? Those are two different opinions, and they can both be absolutely right, depending on the community you're in.

The last idea is the "I heart FRIES" one. This is more of a social science and medical science stance, and it's about the fact that consent is really important. The way we ensure privacy and self-determination with people is by asking their consent before we do things to them or with them, and that consent needs to be, as it says, free and informed and enthusiastic, i.e. very clear and specific: what are they consenting to? Those are all different ways you can ensure privacy and self-determination, but they are not the same thing at all. So which community are you part of? That depends, for example, on the venue where you're publishing, and so it's actually really hard to navigate.

So why might ethics be directly relevant to your research? Well, maybe you want to reuse data about people, or maybe you even want to collect some directly yourself, and you want to make sure that you're doing it respectfully and safely and robustly, on behalf of yourself and the participants.
You want to be looking after your participants in a good way, but also looking after yourself, because you don't want to have to collect the data again. If you collect all the data and then somebody says, "Oh, this wasn't properly consented," they will throw it out. If it's a thesis examiner, you will be hauled up; if it's a paper you're trying to publish, they will throw that paper out, and you'll have to go back and collect the data again. So in that case, the way you avoid that is to go straight through CUREC, because that's nice and straightforward: people, right? You're fairly directly interacting with them or directly using their data. But if it's not research, then it's not CUREC, and in that case it's probably more about licensing, and then you talk to the Oxford data management team.

Or maybe you are developing tools for decision support, and you want those tools to be robust and usable. In that case, again, if what you're doing is technically research, then it might be a CUREC thing. But it might also be more about explainability. It might be about the data itself: where are you getting it from? Are you allowed to reuse it for this purpose, legally speaking, or also just from an ethical perspective? We can talk about some examples, but you know this, right: statistics data can have all kinds of weird strings attached to it. Documentation, right? If you're developing a tool that people are going to use, it needs to be documented, and while that's not technically part of the ethics approval process, it's absolutely part of being an ethical actor. Proper software development techniques: again, if it's something people are relying on, you need to have built it using robust methods; you can't necessarily put it together with chewing gum and duct tape, because people are depending on it. It might matter how you built it. That might involve things like accessibility: is the tool you're building going to be used by people who might need to interact with it through screen readers, or are you using cascading style sheets? There are all kinds of things around that. It might be that, again because people depend on it, it needs to be secure, or at least resilient, or both. And I can provide guidance and thoughts on this if you want. But basically, going through those: first of all, if it meets the Frascati definition of research and it directly involves people, it's CUREC. If it's not both of those things, then you're like: what do I do?
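Before walking through the options, here is a minimal sketch, in Python, of that routing question as a self-check. It is purely illustrative, not an official CUREC or departmental tool, and the route descriptions are assumptions paraphrased from this talk.

```python
# Illustrative sketch only: encodes the two-part routing question from the talk,
# (1) does the work meet the Frascati definition of research (all five criteria at once),
# and (2) does it directly involve people or their data? The suggested routes are
# paraphrased from the talk, not official guidance.

FRASCATI_CRITERIA = ("novel", "creative", "uncertain", "systematic", "transferable")


def is_frascati_research(criteria_met):
    """All five Frascati criteria must hold at the same time, not just one."""
    return all(criteria_met.get(c, False) for c in FRASCATI_CRITERIA)


def suggested_route(criteria_met, involves_people_directly):
    """Return a rough pointer to the route described in the talk (illustrative only)."""
    if is_frascati_research(criteria_met) and involves_people_directly:
        return "CUREC application (via MS IDREC for Statistics)"
    if involves_people_directly:
        return ("Outside CUREC's remit but still has ethical considerations: "
                "consider a departmental research risk assessment")
    return "Probably a licensing or data-management question: talk to Research Data Oxford"


if __name__ == "__main__":
    answers = {c: True for c in FRASCATI_CRITERIA}
    print(suggested_route(answers, involves_people_directly=True))
```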
So, explainability: that would be handled quite nicely by the computer science vulnerability risk, I mean, the AI risk assessment. If it's about data and where it's coming from, that's also in the computer science AI risk assessment, but you might also want to talk to the research data team. If it's about documentation, hopefully you have expertise on that; but again, if you go through the AI risk assessment, that's one of the things it prompts you about: have you done that properly? And if you don't know where to find the expertise, again, computer science is really good on this because we've been practising, but I'm sure there's loads of expertise in Stats as well. Proper software development techniques: we've got some guidance on that from computer science that is useful. Accessibility: the university has an accessibility office, but also, again, the Oxford data management team are lovely and can connect you. If it's about security and resilience, again, you're very welcome to lean on the computer science research ethics team, who can point you in the direction of whatever expertise you haven't got, because we have a software engineering master's programme and lots of experts who are good at this, hence the guidance; just ask us.

Or maybe you want to train a magnificent deep neural net, and it's hungry for data and you need to feed it. And what you want to feed it is maybe tweets, or social media profiles, or photographs of people that, say, you scraped from Flickr, or, you know, dating profiles: personally identifying data. Is this OK? This gets incredibly complicated, because this is the part where, as I was saying, there's what the law says, and then there's what your conscience tells you or, more importantly, what your community's ethics tells you; but your conscience matters too. Which is: under Twitter's terms of use, for example, anything people tweet isn't actually their intellectual property; it belongs to Twitter. And Twitter has terms of use that come with that, which say that anything on Twitter can be used for academic research, no problem, but it must be cited. The reason you see those little screenshots of tweets is because, under Twitter's terms of use, you need to include the username of the person, the fact that it came from Twitter, and some other metadata, and the way you do that easily is to take a screenshot. The problem is that that includes, like I said, what the terms of use say: you must say who tweeted it and that it came from Twitter.
But if you're trying to be an academic who's protecting the interests of your research participants, or of the unwilling people whose data you statistically scraped, maybe it's not very fair to be naming them by name, because maybe you're calling out specific behaviour that might be embarrassing to them, or might create a vulnerability for them in some way by revealing some personal information about them, maybe whether they're at home or not, or whatever. There's tons of information that people could infer; you might be able to do really fascinating research on it, but it wouldn't be very responsible or fair to name those people by name in your dataset. But Twitter says you can only use their data if you name them by name. How do you balance that? There are loads of dilemmas like that, and again, Helen in MS IDREC has some expertise in that, and so does computer science, so we'd be really happy to help you navigate that minefield.

Right, so imagine you got through that somehow and you decided to train your network, and you wrote a paper on it, and you want to submit it to, I don't know, some journal (I've put one in there, right), and you want to publish your dataset to make it available to the world, because that's good science and that's part of being a responsible researcher, right? That's part of good research ethics and responsible research and, you know, science. But your dataset again contains pictures of people, or tweets by people, which you are technically allowed to have, because maybe it was Creative Commons licensed or whatever. But can you? Is that fair? And in what way can you do it? For example, say you want to include in your paper examples of things that your classifier got correct: can you? So that's about your dataset, or about the publication.

Or another example: maybe you want to create a vision system. This is, I guess, less stats, but people in your department definitely have the skills to do it, so it might well be somebody on this call. You want to create a vision system that can infer something about a person, anything at all about a person, but especially, maybe you think you could infer their sexual orientation, or their gender, or their ethnicity, or their IQ, or their religious affiliation, or their social popularity, or political affiliation, likelihood to commit a crime, academic performance. Almost all of those, it should be obvious to you sitting on the call and thinking about it, are very, very sensitive topics. Some of those are legally protected; that's actually a legally protected category of data.
So gender, ethnicity, things like that, sexual orientation: you might be able to do interesting science, but you need to know what the law says you can do. And also be aware that even if your particular use of it as a researcher, for your use case, is fine, once you've published it, anybody can take that tool and run with it and do anything. So what are the ethical risks? How do you think it through? That's why we've got our machine learning and AI risk assessment pro forma, to help you think that through and ask all the questions that you should have asked before you start, so that you don't build something and then go, "Oh, I shouldn't have done that." It's not about saying no; it's about saying: how do we do this in a way that's thoughtful?

Or finally (this is my wheelhouse, the cybersecurity thing), maybe your research allows you to discover flaws in systems which allow you, or, if misused, could allow criminals or some kind of attacker, to exploit those systems in unexpected ways. That's anything from a classic statistical inference situation where, all of a sudden, if I combine this dataset with that dataset, I can suddenly tell something that nobody thought I could tell, to lots of other things, especially with software and penetration testing, which is less your thing but definitely happens. So what can you do about that? That's something where the Computer Science Research Ethics Committee would be really happy to help you, and that's where we have our vulnerability risk assessment sheet, for the vulnerability and pen-testing type approaches.

And again, the good news is, as those of you who were there for the talk last week will remember, that responsible research and innovation is something where the University of Oxford has actually had some of the world-leading academics who developed these frameworks. So the AREA framework, which stands for Anticipate, Reflect, Engage, Act (A, R, E, A; I keep getting the words in the wrong order, but they're on the screen), is a framework that allows you, again, to work through things. The system they have developed has been incorporated into those risk assessment sheets, but there's also a formal methodology for this. So if you're developing something bigger, some big project: the UK research funding councils, I think, now require that you have thought about the AREA framework for a lot of kinds of research.
And so if that comes up in a grant application, or if you practise doing it because you think it would help your career, but also because it's responsible, we have the experts who actually developed that methodology and who can help you apply it to your particular question. They'd love to; I mean, Max, for those of you who were there, is lovely and would be delighted to help you with that.

But the thing is, I've talked an awful lot and I've still got a section on legal issues. Before I get into it, would anybody like to pull me up and ask any questions? All right, well, I'm more than happy to have a more specific discussion at the end.

The reason we cover the legal issues is because these are things that you may never encounter, and that's absolutely fine, but most people don't even know about them until they find out the hard way. And it matters particularly in the context of university research because, as the classic saying goes, it's one thing to break the law, which is wrong; it's quite another to get caught. And if you're going to publish your research, that's getting caught, right? So you have to make sure that you're compliant with the law, especially for anything that you want to publish, because that's really important: you want to publish it.

So one of my favourites (this comes up much more in computer science, but it might matter for you) is the UK Computer Misuse Act of 1990. Now, the date should probably tell you all you need to know about whether this law is particularly well defined for today's modern world. I've detailed it here on the slide for anybody who wants to refer to it later, but the important thing is that it creates an actual criminal offence in the UK, which is unauthorised access to a computer. It also specifically talks about unauthorised access with the intent of committing further offences, and distributed denial-of-service attacks, things that you obviously know are wrong. But the fact that unauthorised access to a computer is itself a crime is a problem, right? All "unauthorised" means in this context is that the attacker is aware that they are not intended to use the computer in question. That is it. Right? So, as in the paragraph below, breaking organisational policies for computer use is sufficient. For example, sharing credentials within a home environment is fine.
428 00:39:36,970 --> 00:39:40,930 But if there's a company policy saying don't share credentials, 429 00:39:40,930 --> 00:39:46,360 then logging into your colleague's computer, even with their express permission, in their presence, 430 00:39:46,360 --> 00:39:50,170 like they're standing over your shoulder saying, Please type this in for me, I don't know, I've just broken my arm, 431 00:39:50,170 --> 00:39:55,390 can you please put my password in for me? That is in breach of the university's policy in our case, 432 00:39:55,390 --> 00:40:03,040 and therefore a crime. The crime is committed the moment you press the first key of their username, if you intended to log in using their credentials. 433 00:40:03,040 --> 00:40:07,540 That's a really difficult thing sometimes. I don't know, 434 00:40:07,540 --> 00:40:12,220 I don't know that I'm a huge fan, but you need to know that this exists and it's true. But it's also an offence to distribute software 435 00:40:12,220 --> 00:40:18,970 that's made to aid attacks on computers, which sounds pretty obvious on the face of it, but especially in the computer science world, 436 00:40:18,970 --> 00:40:23,590 there are a lot of things that have to do with tools that could be used by forensic experts for 437 00:40:23,590 --> 00:40:28,510 investigations, or tools used by people who are trying to test their own systems for robustness, 438 00:40:28,510 --> 00:40:32,470 who never intend to take that tool and do penetration testing on other systems. 439 00:40:32,470 --> 00:40:37,060 But the same tool would allow you to exploit weaknesses in another system. 440 00:40:37,060 --> 00:40:43,390 You can't create them, and you can't supply software like that, which makes publishing things in computer science tricky sometimes. 441 00:40:43,390 --> 00:40:48,140 Right? So another law is the UK Wireless Telegraphy Act of 2006. 442 00:40:48,140 --> 00:40:54,460 Again, the fact that it's calling it wireless telegraphy probably tells you what you need to know about this law. 443 00:40:54,460 --> 00:40:57,760 But actually the reasoning behind it was basically by analogy, 444 00:40:57,760 --> 00:41:01,900 and it started off back in the days when it was still the wireless telegraph, right? 445 00:41:01,900 --> 00:41:09,230 Which is the idea that we as a society in the UK do not read other people's mail, effectively, because that's the thing about privacy. 446 00:41:09,230 --> 00:41:12,730 The way we respect privacy is you don't read people's stuff that was not intended for you. 447 00:41:12,730 --> 00:41:18,580 So imagine a postcard. A postcard has the name and the address of the person it's going to. 448 00:41:18,580 --> 00:41:25,510 And it also has the message that you've written to the person it's going to. And the deal is, even though it's all there in plain view, 449 00:41:25,510 --> 00:41:30,220 common courtesy dictates you don't read other people's postcards, even if you can see them, right? 450 00:41:30,220 --> 00:41:33,820 So they took the same idea and applied it to wireless communications. 451 00:41:33,820 --> 00:41:40,090 So you must be authorised or an intended recipient of a message to use wireless telegraphy, 452 00:41:40,090 --> 00:41:42,460 basically anything wireless, Wi-Fi, whatever, right, 453 00:41:42,460 --> 00:41:48,590 to obtain information about the contents, sender or address of a message, or to disclose this information.
454 00:41:48,590 --> 00:41:54,430 So again, you can't do scraping of wireless communications, like, full stop. 455 00:41:54,430 --> 00:42:00,970 In the UK, it is really illegal. The way you could do it is to say, I am authorised by the owner of this equipment to do it. 456 00:42:00,970 --> 00:42:08,680 But if you just set up shop in a public square and you put up a Wi-Fi sniffer and sniff what's going by, you have committed a criminal act. 457 00:42:08,680 --> 00:42:14,740 It's really bad. And so publishing that, again, will be very difficult, even if the research is incredibly valuable. 458 00:42:14,740 --> 00:42:19,900 Now, the thing is, if the research is potentially really valuable and there's no other way that you could test it, 459 00:42:19,900 --> 00:42:25,090 there's no sort of dummy experiment that you could do that gives you equivalent proof-of-concept data, then it could well be 460 00:42:25,090 --> 00:42:30,730 that the university will take your side on this and say, we think this is worth it; it's basically almost civil disobedience, 461 00:42:30,730 --> 00:42:34,960 this is worth the fight. But you cannot choose that for the university. 462 00:42:34,960 --> 00:42:39,940 You have to have gone through the official channels and said, I know this is a thing, this is a potential risk, but it's worth it, 463 00:42:39,940 --> 00:42:44,860 and here's why. And then the university can go, OK, 464 00:42:44,860 --> 00:42:50,780 and all of a sudden you're in a position where you've got the university's lawyers behind you and not in front of you. 465 00:42:50,780 --> 00:42:53,960 That's what you want. So that's why I'm telling you this stuff. It's not that the university 466 00:42:53,960 --> 00:43:00,390 will just say no, but they need to know what they've said yes to, because you can't surprise them with this kind of thing. 467 00:43:00,390 --> 00:43:04,530 The UK Investigatory Powers Act 2016 basically is the Wireless Telegraphy Act, 468 00:43:04,530 --> 00:43:12,270 but more so: it covers any kind of public telecommunication system, and all it says is you need lawful authority to carry out any kind of interception. 469 00:43:12,270 --> 00:43:17,820 And that's all. It was designed with the police and GCHQ in mind, so you'd need a warrant. 470 00:43:17,820 --> 00:43:20,850 But the problem is you're a university. You will never have a warrant. 471 00:43:20,850 --> 00:43:27,060 So for private telecoms, public telecoms, the public postal service or wireless communications, 472 00:43:27,060 --> 00:43:32,880 you need authority to intercept. And so the way you do it is you either get the owner of that technology onside, or 473 00:43:32,880 --> 00:43:37,140 you are very careful about it, because you need to know that you're playing with fire. 474 00:43:37,140 --> 00:43:42,240 The next one is the USA Computer Fraud and Abuse Act. Again, 1986, guys. 475 00:43:42,240 --> 00:43:50,400 Not great. OK. It is illegal to engage in unauthorised access to a computer connected to the internet. 476 00:43:50,400 --> 00:43:55,590 This is a US law and we are currently based in the UK. But this is an international world where data crosses borders, right? 477 00:43:55,590 --> 00:44:02,650 And one server that was based in California, or one Amazon data centre, is enough, right, 478 00:44:02,650 --> 00:44:09,760 to make US law applicable. OK, so it is illegal to engage in unauthorised access to a computer connected to the internet.
479 00:44:09,760 --> 00:44:16,540 But worse than the UK version, the statute doesn't define authorisation or 'without authorisation'. 480 00:44:16,540 --> 00:44:23,170 So this is basically a stick to hit people with if you are a corporation that wants to silence research. 481 00:44:23,170 --> 00:44:27,340 So again, you need to be sure that you know what you're doing as a researcher before you get into territory 482 00:44:27,340 --> 00:44:32,890 where this might apply. Other things where it applies: knowingly trafficking in computer passwords. 483 00:44:32,890 --> 00:44:40,150 That sounds obvious, right? But actually, imagine you're doing some kind of statistical research on data dumps, and the data dumps themselves contain 484 00:44:40,150 --> 00:44:46,210 passwords, either explicitly, because you were looking for that, or just as part of some other sort of data breach. 485 00:44:46,210 --> 00:44:49,570 You sending that around to your buddies is knowingly trafficking, 486 00:44:49,570 --> 00:44:54,700 but also you publishing that data set is knowingly trafficking in computer passwords. 487 00:44:54,700 --> 00:44:59,980 So again, this is an easy one that you think sounds obvious, but when it comes down to it, it can be quite problematic. 488 00:44:59,980 --> 00:45:02,380 Similarly, knowingly infecting a computer with a virus; 489 00:45:02,380 --> 00:45:07,990 hilariously, 'infecting', 'computer' and 'virus' aren't really defined, and it's a kind of slightly old-school terminology. 490 00:45:07,990 --> 00:45:12,850 But again, if you make some kind of software that works like an exploit, that could be a problem, right? 491 00:45:12,850 --> 00:45:16,720 And as a quick side note, actually, on classified information. 492 00:45:16,720 --> 00:45:20,680 So say information is in the public domain because it's been leaked, right? 493 00:45:20,680 --> 00:45:27,730 Normally, with a company secret, say the secret recipe for Coca-Cola, right, 494 00:45:27,730 --> 00:45:28,600 you're not allowed to steal it. 495 00:45:28,600 --> 00:45:34,420 But if somebody leaks it onto the internet and you get the leaked version, you can have it, because it's fair game once it's been leaked. 496 00:45:34,420 --> 00:45:37,780 Coca-Cola lost it, right? You can have it. You can't steal it from Coke, 497 00:45:37,780 --> 00:45:42,940 but you can get it off the web. But if anything is ever classified by a government, 498 00:45:42,940 --> 00:45:47,380 in the USA or the UK, and lots of other places, but I know the USA and the UK, 499 00:45:47,380 --> 00:45:53,230 it is still an offence to possess or distribute classified information, even if it's been leaked. 500 00:45:53,230 --> 00:45:59,410 Right. So this is what WikiLeaks did, and that's why everybody got so mad. Now, granted, it's hard for them to prosecute. 501 00:45:59,410 --> 00:46:05,320 So again, there's a real risk here. For researchers it's not huge, but you do need to know they could actually prosecute you for that, and you need to be 502 00:46:05,320 --> 00:46:10,720 aware of that if you ever want to do anything dealing with classified information. OK, almost there, guys. 503 00:46:10,720 --> 00:46:17,530 You've been really patient. We're getting on to the USA Digital Millennium Copyright Act of 1998. 504 00:46:17,530 --> 00:46:28,380 This one, possibly the worst one on the list, is highly politicised; it is tied up in things like the right to repair and fair use and free speech.
505 00:46:28,380 --> 00:46:33,660 But basically, if a company or any organisation, but often a company, 506 00:46:33,660 --> 00:46:42,990 if an organisation or a person puts copyright or access controls, digital access controls, basically encryption, on something, 507 00:46:42,990 --> 00:46:47,310 then it is illegal for you to circumvent that access control tool, 508 00:46:47,310 --> 00:46:53,370 i.e. the encryption. It is illegal for you to break the electronic lock and use something, 509 00:46:53,370 --> 00:46:57,240 even if the thing you're doing, once you've broken the lock, is something you were totally allowed to do. 510 00:46:57,240 --> 00:47:02,430 If it's a piece of software that you bought and paid for and you own, or a DVD that you bought and paid for and you own, 511 00:47:02,430 --> 00:47:04,770 but it's still got encryption on it and you break the encryption to, 512 00:47:04,770 --> 00:47:12,030 like, make your DVD play in a different computer because of the restrictions of the region controls, then you are breaking the law. 513 00:47:12,030 --> 00:47:18,660 And so that matters, again, where the kinds of technologies you might be using have to do with circumvention. 514 00:47:18,660 --> 00:47:22,320 This isn't hugely an issue for statistics, but it can be, depending on the datasets you're playing with. 515 00:47:22,320 --> 00:47:28,440 It could be an issue. But yeah, it also has to do with the right to repair your own devices, which is very annoying. 516 00:47:28,440 --> 00:47:36,000 There's the UK Strategic Export Controls List. Realistically, most of you will never encounter this, but again, you guys are really smart. 517 00:47:36,000 --> 00:47:40,650 Right? And this is the kind of thing that you are bright enough that you might encounter it. 518 00:47:40,650 --> 00:47:47,970 Basically, the summary of it is: if you've made some new technology that is so cool that the UK military would want 519 00:47:47,970 --> 00:47:52,530 to have an exclusive lock on it, because they don't want other countries having access to that technology, 520 00:47:52,530 --> 00:48:01,890 there is an export control that gets put on it, right? And a person wishing to physically export goods, software or technology on a control list, or 521 00:48:01,890 --> 00:48:05,550 to transfer it by electronic means, will need an export licence to be able to do so. 522 00:48:05,550 --> 00:48:13,320 So, you know, possibly even really cool statistical inference techniques could come under this. 523 00:48:13,320 --> 00:48:17,730 Mostly, you won't run into it. But if you think you've got something that is really hot stuff, talk to your supervisor, 524 00:48:17,730 --> 00:48:21,600 talk to your departmental administrator, because you need to know where you stand. 525 00:48:21,600 --> 00:48:27,930 You'll be in trouble, because even emailing a collaborator who's across a national boundary about your project could get you in trouble with that. 526 00:48:27,930 --> 00:48:33,160 And that'll mess up your visa and you'll never be able to travel again; it's really bad. There's also the Prevent duty. 527 00:48:33,160 --> 00:48:37,530 This is really minor, but people who deal with scraped data might encounter it. 528 00:48:37,530 --> 00:48:41,280 So it's basically concerned with preventing people being drawn into terrorism. 529 00:48:41,280 --> 00:48:47,580 The Prevent duty really applies to the university or the department, but it can also apply to you and your research collaborators.
530 00:48:47,580 --> 00:48:52,630 There are two main duties. One is that researchers need to be careful about their exposure to radicalising material. 531 00:48:52,630 --> 00:48:57,630 So if you're scraping the most horrible things that, you know, social media has to offer, 532 00:48:57,630 --> 00:49:02,550 that's really hard on a person psychologically, and you need to take care of yourself. 533 00:49:02,550 --> 00:49:06,690 We want you to take care of yourself, and the university has a duty to take care of you as well. 534 00:49:06,690 --> 00:49:13,300 We don't want you being, like, scarred or traumatised by what you've done. Also, heaven forbid, seeing all of it might actually radicalise you. 535 00:49:13,300 --> 00:49:16,620 So we need to be really careful about this stuff, right? 536 00:49:16,620 --> 00:49:20,040 And you also have a duty to report any material you might come across if you're 537 00:49:20,040 --> 00:49:24,080 doing some research, again, in the bowels of the internet. If you come across 538 00:49:24,080 --> 00:49:28,670 that kind of material, you have a legal duty to report it, and if you're not sure how, we can help you 539 00:49:28,670 --> 00:49:35,280 if you ever get to that stage, ask computer science, or even just Google 'Prevent duty', and people can show you; it's not that hard. 540 00:49:35,280 --> 00:49:43,530 There are similar things about responsible disclosure. So this is the thing where you find flaws, especially software flaws, 541 00:49:43,530 --> 00:49:46,260 generally speaking, but it could well be, again, statistical inferences, 542 00:49:46,260 --> 00:49:53,820 where you can basically poke holes in a system, and especially a system that's owned by either a government or a corporation, who maybe, 543 00:49:53,820 --> 00:49:58,080 maybe they should know about it, maybe they'd rather not know about it, and you never know which it's going to be, 544 00:49:58,080 --> 00:50:03,180 whether they're going to be delighted that you've told them and so happy they can fix it, or whether they'll try and gag you. 545 00:50:03,180 --> 00:50:06,510 The problem is, if they choose to be the person who wants to gag you, 546 00:50:06,510 --> 00:50:11,610 all those laws that I've listed above can be used as a stick to beat you with, and they can silence you. 547 00:50:11,610 --> 00:50:19,470 It's really hard to navigate. Generally speaking, if you're going to do something that involves showing mistakes or weaknesses in a system, 548 00:50:19,470 --> 00:50:25,050 you need to generally try and ask permission first and try to engage the technology owner in the process. 549 00:50:25,050 --> 00:50:29,370 Because the technology owner, in every jurisdiction I've looked into this for, and I've looked around a lot, 550 00:50:29,370 --> 00:50:33,330 always retains the right to prosecute you. Always. 551 00:50:33,330 --> 00:50:37,200 They can agree not to, but that's all you can get. Otherwise, if 552 00:50:37,200 --> 00:50:39,930 you haven't talked to them first, they might just hit you with that stick, and that's it. 553 00:50:39,930 --> 00:50:45,150 And I do know instances of researchers who were hit with gag orders and couldn't publish their work. 554 00:50:45,150 --> 00:50:51,300 So the final thing, and this is the final one, is the General Data Protection Regulation, GDPR.
555 00:50:51,300 --> 00:50:54,570 So we've talked about this a few times and hopefully you all know about it already, 556 00:50:54,570 --> 00:51:02,580 which is that basically personally identifiable information, or personal data, which is actually the legal term, 557 00:51:02,580 --> 00:51:06,210 but in the medical field we talk about personally identifiable information, 558 00:51:06,210 --> 00:51:15,810 so PII in medicine, personal data in GDPR, but same idea: stuff that allows an individual to be directly or indirectly identified by reference to it, right? 559 00:51:15,810 --> 00:51:20,250 So pseudonymisation is great because it takes their name out of it. 560 00:51:20,250 --> 00:51:24,780 But actually, it might not be enough if they're the only person in the dataset, like if you're the only person with, 561 00:51:24,780 --> 00:51:31,740 say, tuberculosis in Oxford; saying one person had tuberculosis might still identify you, right? 562 00:51:31,740 --> 00:51:37,590 Especially when you can combine that with data from elsewhere. There are two concepts, data controller and data processor, 563 00:51:37,590 --> 00:51:38,970 which I'm not going to get into here. 564 00:51:38,970 --> 00:51:46,170 But no matter: if you're getting into GDPR data, the important part is there are legitimate and illegitimate purposes for holding data. 565 00:51:46,170 --> 00:51:51,630 If you don't have a reason to hold this personal data, this personal information, you must not hold it. 566 00:51:51,630 --> 00:51:55,470 But there are totally lawful reasons to have it, and that might be on the basis of consent: 567 00:51:55,470 --> 00:52:00,270 a person directly says, Yes, I consent, hence the cookies on all the websites, which is terrible and not real consent. 568 00:52:00,270 --> 00:52:05,970 But anyway. You might be under contract. But if you keep going down the list, you'll see public interest. 569 00:52:05,970 --> 00:52:10,260 Now, research at the University of Oxford is carried out in the public interest. 570 00:52:10,260 --> 00:52:12,750 So you're, generally speaking, absolutely fine, 571 00:52:12,750 --> 00:52:17,910 but you do need to know this so that you can put the right things into any wording that you might have on a website or whatever, 572 00:52:17,910 --> 00:52:23,590 to make sure that you're covered, so that nobody can come back and hit you with the GDPR and say you were in breach of it. 573 00:52:23,590 --> 00:52:29,210 And that was a heck of a lot of information, and I think your brains are probably all full. 574 00:52:29,210 --> 00:52:42,480 Is there anything you'd like to ask? I've got a hand. 575 00:52:42,480 --> 00:52:44,608 I think under no training here.
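[Editor's illustration, not part of the talk: a minimal Python sketch, with made-up data, of the pseudonymisation point discussed just before the questions. Stripping names is not enough if a combination of the remaining attributes is unique; a quick k-anonymity-style check counts how many records share each combination of quasi-identifiers, and any group of size one, like the single tuberculosis case in the speaker's example, remains effectively identifiable.]

```python
# Hypothetical example: pseudonymised records can still be identifying.
# Count how many records share each combination of quasi-identifiers;
# groups of size 1 are unique combinations and therefore risky to release.
import pandas as pd

records = pd.DataFrame({
    "patient_id": ["p01", "p02", "p03", "p04"],   # pseudonyms, not names
    "city":       ["Oxford", "Oxford", "Oxford", "Oxford"],
    "age_band":   ["30-39", "30-39", "40-49", "30-39"],
    "diagnosis":  ["TB", "asthma", "asthma", "asthma"],
})

quasi_identifiers = ["city", "age_band", "diagnosis"]
group_sizes = records.groupby(quasi_identifiers).size().rename("k")

# Any group with k == 1 is a unique combination, e.g. the only person in
# Oxford in that age band with TB, which may be enough to identify them.
print(group_sizes[group_sizes == 1])
```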