I'd like to welcome Karina Jones. Professor Karina Jones is here to talk to us about some of her work relevant to the big data epidemiology module, which this talk is part of. She's an associate professor at the University of Swansea, and she's worked on a number of big data initiatives, which she will tell us about. She's also really involved in information governance and public engagement, which I believe this talk is really focusing on. Her role is to ensure data protection and maximise the social acceptability of the data used across these various sources.

Okay, thank you very much. Thanks very much for the invitation. It's great to be doing something on this really prestigious course, and in such a lovely summer as well. I know we've got the air con in here, which is great, but isn't it good that we've got a decent summer at last?

So, I like intriguing titles, and I'm going to talk about the jugglers and the black cat. You've had a long day, you've been taught all day, so I thought I'd give you a title that makes you think, well, what's that going to be about then?

But before I talk about that, I'll tell you a little tiny bit about some of the big data initiatives I work with. At the very beginning there, that's our building, the data science building, where we house all the work that we do.

So we have the SAIL Databank, which you may have heard of. It's an anonymised data safe haven of records about the population of Wales, so that's about three million people in Wales; but because it's been running for ten years now and holds retrospective data, it actually has data on about four million people. It can be GP records, hospital records, all sorts of things: screening, cancer registries. We've got births and deaths, but we've also got some data that's not health: education, housing, fire and rescue, that sort of thing.

And the way it works is that the data are anonymised by an NHS body and then come to us in de-identified, linked-up form, because they go via a trusted third party that does the linkage. So we can have a lot of records, a lot of detail about people, but not in a way that has any personal information attached to it. We only have an anonymous linking field that has no currency outside the system.
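As a purely illustrative aside on that trusted-third-party model: the sketch below shows one way a split-file linkage can work, where identifiers never travel with clinical content and the research side only ever sees a keyed pseudonym. The secret key, field names and hashing scheme here are assumptions for illustration, not the SAIL Databank's actual implementation.

```python
import hmac
import hashlib

# Hypothetical secret held only by the trusted third party (TTP).
# Because researchers never see this key, the resulting linking field
# has no "currency" outside the system.
TTP_SECRET_KEY = b"ttp-only-illustrative-secret"

def anonymous_linking_field(identifier: str) -> str:
    """TTP side: derive a stable, keyed pseudonym from an identifier."""
    return hmac.new(TTP_SECRET_KEY, identifier.encode(), hashlib.sha256).hexdigest()

# The data provider splits each record into identifiers and content.
identifiers = {"nhs_number": "9434765919", "name": "Pat", "dob": "1980-01-01"}
clinical    = {"source": "GP", "diagnosis": "asthma", "event_date": "2018-06-01"}

# Only the identifiers go to the TTP, which returns the linking field.
alf = anonymous_linking_field(identifiers["nhs_number"])

# The research environment receives clinical content plus the linking
# field, never the raw identifiers, yet records for the same person
# can still be joined because the pseudonym is deterministic.
research_record = {"alf": alf, **clinical}
print(research_record)
```

A keyed hash rather than a plain hash is used in this sketch so that nobody outside the trusted third party can re-derive the pseudonym from a known identifier.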
And there's tons and tons of research going on with the SAIL Databank; if you want to look it up, I think they will distribute the slides for you, and you can look up the SAIL Databank if you're interested.

As well as that, we've just become one of the Health Data Research UK substantive sites. That's the MRC's new investment, now in place of the Farr Institute, though it is a little bit different: it has set up a number of consortia across the UK to apply cutting-edge techniques and to do research that's really important for public benefit. And we're also an Administrative Data Research Centre, funded by the ESRC; there are four in the UK, and that investment is also being rejigged. That's to take forward the use of administrative data. It doesn't exclude health data, but it's more focused on government data: local authorities, economic and social data, benefits, work and pensions, and also things like education and suchlike as well.

And as Emily mentioned, I'm the academic lead for data governance and public engagement. That's about looking at how to keep the data as useful as possible, but also safe and able to be used in a socially acceptable way. Because, if you think about it, you can lock data up and make it aggregated, and it's incredibly, incredibly safe, but it may not actually be useful for answering research questions; or you can make it very open, but then perhaps you haven't got sufficient safeguards. So it's about finding that proportionate balance, so that the data are useful but are also used within a robust governance framework.

So let's come to the jugglers, with a little bit of context. It's been suggested it can be easier to donate blood, and even organs, than to donate genetic data. This is something that came my way at another meeting I was at in Oxford, actually, in April, and it's an interesting thought. If you think about "donate", it comes from the Latin donum, which means a gift, and the French donner is a very familiar derivative. So do you think that donating data is about gifting data? That donating organs is about gifting them? It's an interesting concept if you think about it in terms of a gift.

So I'm just going to pose a few questions. Are we moving in the right direction with legislation and regulations around this? Is it possible to find a bioethical balance between individual autonomy, personal exploitation and concepts of social responsibility? And can we, as individuals and as decision makers, make properly informed choices?

Now, the thing with data is that there's a tangibility issue. Data is not like a thing. When you think about ownership and rights, they are usually about the right, or the act or state, of possessing a thing. And data is not tangible. It's not a thing.
So the laws around data tend to deal with data protection, safeguarding privacy and confidentiality. And just as a rule of thumb: privacy is about the person, confidentiality is about the data. But it's not about ownership of data as much as it is about safeguarding privacy and confidentiality, because, if you think about it, how can you own what you can't control once someone else is aware of it? If I tell you two things about myself that you don't know: one is that I really, really don't like butternut squash, I think it's awful; the other is that I do love lamb. Now you know two things about me that you didn't know, and I really don't mind you knowing those things. But there are lots of other things about ourselves that perhaps we don't want people to know, or only want some people to know.

An analogy I heard, which I quite like, is that you can think about data a bit like music. You can hear it, but it's not so easy to own it, because you can copy that music. It's quite similar, isn't it: once somebody is aware of data, aware of information, how can you own it? So the laws and the regulations are more about safeguarding your privacy and the confidentiality of the data rather than an ownership concept, though we do tend to say "my data", "your data". It is a bit tricky.

And of course we haven't even discovered every type of data that could be considered personal, so that's a challenge in itself, and there have been recent changes. If we go back however many years we need to, when there was no social media, it couldn't be considered in data protection legislation because it didn't exist. So legislation gets changed to add the concepts of these different kinds of data as we progress and they become available.

But let's think about what it is we could be donating. Here's our little data subject; I've called him Schrödinger's Pat, and I guess you're going to work out why. Pat could donate all sorts of things that pertain to himself. It could be a paired organ, could be a vital organ, could be some tissue, some blood, some stem cells, perhaps from when he was very small, or perhaps he needs to decide about that for a baby or a child of his; his DNA, his health records, social media data, data collected as he uses his phone, something collected via an app like MapMyRun, or store loyalty card data. Or he could donate his whole body after he dies, which is why he's Schrödinger's Pat: because of course, if it's his whole body or a vital organ, then he must be dead, whereas all of the others he can do while he's alive.
But whatever he donates, even if it's tissue or organs, data are being generated. So when we ask whether it's harder or easier to donate data or tissue, we've got to remember that even when it's tissue, it's donating data as well, which is another interesting twist.

So let's think about the different sorts of data, and to whom and how we donate them. We have store loyalty cards, where we go into Sainsbury's or Tesco and we swipe the card and we think, great, we're going to get some points. But the truth is that the data are of much more value to the shops than the little points we get back before we get £10 to spend in the shop. Really it's for the market advantage: they do something nice for the customer, but the primary reason is to gain an advantage. Similarly with social media: we can share things with friends and family, but they are gaining information on us which they can use for their own purposes, and of course we know that they do that.

We'll all be aware of the recent Cambridge Analytica debacle, as it tends to be called; "debacle" always seems to come after "Cambridge Analytica", a bit like the conciliation service, ACAS, one of those things that just go together. Basically, Facebook allowed them to put up a personality test where people filled in details about themselves, and then the company profiled people according to the things they clicked "like" on, and then used that information to target adverts to try and influence the US presidential elections. So there are things like that.

And then, just by reason of using our mobile phones or smartphones, every time we make a call or send a text the company collects a call detail record, which gives the location where we are, where the recipient of the call is, whether it's a text message or a call, and the duration of the call. So there are all sorts of information just being collected there, and they can also profile the URLs you've visited. And we use all sorts of apps, like MapMyRun for GPS locations, or for all sorts of other reasons.

And then we have medical data, which does tend to be better regulated, of course, where we have our health status shared, perhaps with a practitioner or a researcher, and genetic data similarly. But we may also choose to buy some sort of genomic testing via a direct-to-consumer company. That's a different sort of thing, because it isn't really the same as someone sitting down with us and explaining things to us. That's a matter of paying
$200, engaging a company: they send you a little kit to do a cheek swab, and they send you back a result. So we provide our data to these companies in different ways. We may also have apps to monitor our health or a health condition, and depending on who owns that app, the data will go to a different place. If it's owned by the health provider, that's one thing; but if it's owned by a third party, then they will benefit from the data being collected through that app. And we have everything from properly informed consent to scrolling like crazy and agreeing to terms and conditions, which I know I'm guilty of just tending to do.

And there was this story in the BBC News, just on Friday I think it was, which you might have seen; there are lots of stories like this about what the tech giants really do with your data. What they did there was talk to some children, up to about age 13 I think they were, and got them to read the terms and conditions, and they found them very difficult to understand, which I think is fair enough. The researchers concluded that you probably needed a university degree to be able to understand the terms and conditions and give a meaningful decision. The sorts of things that happen are: your location can be tracked even if you don't specifically allow it, so when you're carrying your phone around, you're just being tracked, basically. Data are passed to affiliates; they're not just kept by the company that collects them, and we're then bound by the terms of the third party who now has the data. And, as I mentioned, Cambridge Analytica. Facebook tracks you if the app is on your device, even if you haven't got an account; that's what they found there. Yes, not just if you're not logged in: even if you haven't got an account, if the app is on the phone, Facebook is following that phone. And LinkedIn scans your private messages. There are often "good reasons" for these things, and you can see the inverted commas there: it may be done for privacy, it may be done for security, making sure that nothing inappropriate is going on. But the fact is it's still going on, and what we need to think about is that anything we put into those situations, we shouldn't just assume it's private. It may be, but there are cases where it isn't, because these are very often platforms of largely unaccountable power.

But are we at least with Socrates: do we know what we don't know? No, we don't even know what we don't know. I recently did a study.
I don't know if anybody remembers MOTH? No? Well, I managed to mangle the title of a project into a word-like acronym, MOTH, which I won't spell out for you now, but it was looking at how data governance should be done for using mobile phone data for research. One of the things we asked people, in a series of workshops, which I can repeat here, is: okay, has anyone read the terms and conditions for their mobile phone contract? Anyone? No? Okay, you're just the same as everybody else I've ever asked in the study. Apart from when I was doing a presentation at a law conference; it wasn't part of the study, but I just asked the question, and in a whole room of about 30 people, three, in a law conference full of legal experts, had read the terms and conditions. So it's something we don't do.

And then there have been some interesting social experiments. Over 500 students signed up to a fictitious social media channel, and none of them had read the terms and conditions well enough to notice they'd agreed to hand over their first-born child. There was another example where the terms and conditions of a different study said: if you have read these terms and conditions, you can claim a prize, and it was a decent prize, and it ran for three months. Then one lady emailed and said, I've read the terms and conditions, and they gave her the prize, because nobody else had read them. And even with quite sensitive information, like these direct-to-consumer genomic companies: in a survey of 550 people who had used such a company, most of them thought they were aware of privacy issues and thought the risk of trouble and repercussions was negligible. But 50% of the men and about 30% of the women said they hadn't read the terms and conditions anyway, so they didn't really know whether there were privacy issues to be concerned about. They were just taking it as: I want this thing, therefore I am going to do it.

So what can we think about in terms of perceiving public opinion? And we can ask ourselves the question as well; it's not just the public out there, we are the public in this context too. Do we really engage with information, or are there very good reasons why we don't or can't?

If we look just at the legal position for a minute: Wales has adopted an opt-out system for organ donation after death; we've done this since 2015. England is doing the same, though I'm not quite sure when it's coming in. And the donation of human tissue while you're alive is under the Human Tissue Act, as you know, and the Human Tissue Authority.
There was a study done by the Wales Cancer Bank, and they found that over 95% of patients were willing for surplus tissue to be used by the Wales Cancer Bank for further studies. That's the sort of thing you often see with people who are close to an issue like that.

And the GDPR, as we now know, has come in, along with the new UK Data Protection Act, which has renewed provisions, and some new ones, for processing general and special category data. As part of this, opt-out consent is no longer acceptable. So using it in research, let's say, you can't just say "this is going to happen unless you say no", and the same with using identifiable data. There's more clarity on data processing and the need to be more upfront about what the data controller is going to do; it's not just about having the information available if somebody asks for it, you need to be more upfront, more prospective about it. There's also an improvement on the legalistic, tiny-print terms and conditions that nobody reads. And there's a bit of a no-no on the automated profiling that was going on; whether that will have an effect on these massive, powerful platforms remains to be seen, but that's the idea. And there are clear provisions for research in the public interest, which is a good thing, because the guidance now is that it may not be necessary to rely on consent for some research, provided you can show and defend your position that it's in the public interest. It wouldn't be something like a clinical trial, let's say, but if it's research that's using data, there may be cases where that can be relied upon.

But coming back to the organs: the organs of deceased people can generate DNA and other health-related data about living individuals. So we've got opt-out consent for organ donation after death, but that organ donation can generate DNA profiles about people who are alive and related to this person. So there are some incongruities there, or at least so it appears, and some ethicists have actually argued against that opt-out organ donation after death. Another query that's come up is: does a donation made passively lose the spirit of a gift, if we don't know we're giving something? I don't mean organ donation after death, because that has to be specified; but in terms of passive data collection, if we're doing it passively and we haven't really engaged, or been able to engage, maybe we shouldn't really be thinking of that as a donation, but need to think about whether we really want to do it, or whether it should be considered in another way. Because we don't have opt-out donation for posthumous data, you know, which is interesting.
We can do it for organs, but we can't do it for data.

But of course, Pat isn't an island entire of himself, and when people make decisions, for all of us, we need to think about the other people we implicate. We generally take these four bioethical principles in making decisions based on bioethics. Autonomy: the right to make informed decisions free of coercion. Non-maleficence: not intentionally harming someone through acts of commission or omission. Beneficence: the duty to benefit patients, individuals and society. And justice: equity in the provision of care and resources.

But coming back to that question from the beginning: where's that bioethical balance between individual autonomy, personal exploitation and concepts of social responsibility in these different data contexts? Data provided to a practitioner, that's okay; data provided to a researcher with all the correct procedures in place; but data provided somewhere else, where we don't actually know where it's going? We could be thinking, well, there are different sorts of ethical philosophies going on here. Is autonomy that individual autonomy, or, if it's in relation to a company, is it more a case of "we want this regardless of what effect it has on other people"? Because we know that for some of the companies, and I'll pick on Facebook again, some people who work for them have come out and said: we've actually got psychologists working here, we know how to manipulate people and engineer their thinking in order to get them to do what we want. They have come out and said that that's what is being done.

Altruism we think of as something that benefits other people without necessarily benefiting ourselves, and sometimes with a bit of a sacrifice attached: perhaps someone would be prepared to provide data about themselves, it may take a lengthy interview, they may need to go through some procedures, and it's not going to help them, but it may help research in the future. Or utilitarianism, where it may have benefits for a large number of people, including ourselves. But even so, to what extent can it be said that decisions and practices are actually guided by bioethics, when we've got this huge interplay of actors, and where might rather than right comes to the fore?

So if we think of ourselves, and everybody really, we are bombarded by information and data from all sorts of places.
We've got this little kind of data splat over here, where all sorts of information about health comes to us: books, news media, magazines, social media, health professionals, friends; information comes our way from all sorts of places. Now, as healthcare practitioners, and I've worked in the NHS and I work in a university, we've got an advantaged position, because we're aware that there are good sources of information and where to go to find them. But the general public may not be aware of that, and they're more subject to all of this, with the sources shown quite small here, which for us are perhaps the bigger influence, like research, being a much smaller influence for them.

So when individuals make decisions to donate or not donate their data, we've got this multiplicity of data contexts. We've got active and passive data donation; we've got getting something versus giving something. If I want a mobile phone, or if I need to Skype someone in the next ten minutes and I haven't got Skype installed and must go through this long thing, I'm just going to press accept, because I need the thing or I want the thing. Compare that with giving, where we see ourselves as data subjects in a study or a clinical trial, where, okay, I may be able to benefit someone in the future. There's a getting-versus-giving concept going on across these data domains.

We also have the effect of distance from the issues. The Wales Cancer Bank patients were close to the issue; they could see, well, I've been helped by this, and it's going to benefit others if I allow my data to be used. But take blood donation: maybe we donate blood, maybe we don't, but if we think we could, yes, we will do it; on the other hand, if there were a local need, a big problem near us or affecting people related to us, it may be much more foremost in our minds, and we think we really need to do this. But some issues are important whether or not they're prominent in our perception. And the other thing, of course, is that data for things like this, and research, is not the only show in town. I spent five hours on the train today, and sometimes I just think to myself: if I were to ask any of these people on this train, "have you donated data today, and who did you donate it to?", I would imagine people would say, no, no, I don't do that, I haven't donated any data. But the truth is, it's very unlikely that we haven't; if we're carrying around a mobile phone, or even travelling in a car with cameras and suchlike, there are different sorts of information being collected.
If we've got a Fitbit, and all sorts of things like that, there will be information being collected.

But then sometimes it gets personal, it hits that threshold, and we have to make some sort of decision, whether for ourselves or for somebody else, and then it's about finding a way through the moral mazes. And we come back to this situation: how does Pat know which is the bit of information he can trust? How does he know how to make his decision? It's quite a difficult thing when there's so much information. It's not about getting information any more, is it? Maybe 50 years ago you needed to look hard for things and find things in books; now it's easy to get information, but it's hard to be judicious and to know what we can trust.

So we find ourselves juggling the ones and zeroes. Wittingly or unwittingly, we are the jugglers, because we've got all sorts of things coming our way: decisions we have to make, decisions we don't know we've made, all sorts of things going on. We can donate the same or similar data via different processes: we can sit down face to face with a researcher or a practitioner and talk about sensitive health data; or we can put it on an app and we're not quite sure where it goes; or we can contact a direct-to-consumer genomic sequencing company and provide it to them. Or we can provide different data using the same or different processes: general data or very sensitive data, again via an app, let's say. So: different things in the same way, similar things in different ways. And there are limitations on individual choice, because we're not always aware of who we are donating data to, what data we mean, what they will do with it, whether they will pass it on, and so on. So sometimes I think it's too easy to donate data. It may be quite easy to donate tissue, yes, okay, but sometimes it's too easy to donate data as well, when we're subject to this information overload and distortion.

So let's just look at this little illustration here. If we think of ourselves as this thinking character, we have to make some sort of decision, either for ourselves, or for the people we're caring for as practitioners or who are working with us as researchers, whether it's about research and big data or about care; this is meant to be Pat again, with an unusual skin condition this time. If we think about information coming from primary sources, and we'll just allow ourselves to call this objective, then depending on how it gets to us it goes through different channels, different conduits, and depending on how it's interpreted it may be magnified, diminished or fragmented
by the time it gets to us, and we may not know that. So all we can really do is think to ourselves: well, we need to get as close as possible to the primary source, and think about how data comes to us and where we can go to be as close to the primary source as possible. Because I'm sure we've all been in a situation where someone has said something, and it could be something quite ordinary, but you think: that really doesn't work like that. However they came to think it, obviously somewhere down the line they got some information from somewhere and it got a bit mangled on the way. So I think this sort of thing is just worth thinking about: when you read information as part of your course, as part of your work, as part of just being a person, think about where it comes from, and whether you can get back to as primary a source as possible.

So, having got this far, I'm sure there's got to be time for a bit of alliteration; everybody likes a little bit of alliteration. Minding our Ps and Qs, we've got a series of unknowns when we deal with data. We've got the unknown package, the data content: we may not know what data someone holds about us, or we may not know, for example with genomic data, what the data actually mean, because only some of it is understood; it's not as if it's all fully known as yet. We've got unknowns in who might be using the data, unknowns in how they might use it, and unknowns in where those data might sit. Are they somewhere in a data safe haven, under secure governance, or sitting on somebody's desktop where they could be shared with someone else? We may not know.

But I think we need to stop asking questions that can't be answered properly. What I mean by that is, and it is something that is changing naturally, there's been a lot of work asking the public: do you think your data should be used for research? Well, let's face it, data are being used for research, and asking that question isn't really going to help anybody any more, because if people say no, well, what are we going to do about it anyway? Data are being used, and a lot of data are being used by large companies who may not be as caring towards the data subject as others are, or as, generally, we should be for research integrity. So I think it is moving more towards: how do you think data should be used?
Who are you happy to have using your data, and under what conditions are you happy for your data to be accessed, rather than: should it be used in the first place? And I think we are moving on from that.

But we need to be transparent about information, and custodians need to be able to show trustworthiness to us all, us as people: to show that there are risk mitigation controls in place, to be clear, but also to be clear about the limits and admit that, yes, there are some unknowns, and perhaps we can't totally protect your data against every possible attack that could come from anywhere. For example, the Personal Genome Project, where people are able to share genomic and health data about themselves, actually tells people: we cannot totally guarantee the confidentiality of your information. So at least people can go in with their eyes open; if they want to, they can, and if they don't want to, then that's fine.

So can we make informed choices? Well, I think only if we are asked, and we ask, the right questions, and we're able to engage and we do engage, because otherwise it's quite difficult if we keep that distance. There's a lot of work on public engagement, but I do think there needs to be a recognition that we've all got a responsibility to think about it more for ourselves.

And there are now, as I say, some alternatives to consent, and we need wise interpretations of legislation and regulations. The GDPR provision for research in the public interest is very helpful for researchers working just with data, because data can be anonymised by a trusted third party, as long as it's authorised to hold those data, and then the data can be used in anonymised form. The Digital Economy Act has provisions for data processing, for anonymisation, for non-healthcare organisations, so for government and administrative data organisations: it allows the data to be anonymised by a trusted third party, whereas there wasn't a provision for that before. Then, as I mentioned, we have the opt-out organ donation, which is moving ahead in England, and we now have in England the National Data Guardian opt-out, where people can object to their data being used for anything beyond their direct care.

Some of the thoughts flying around are about licensing data uses, compared to actually donating data. Now, if we think back to our four unknowns, the what, the who, the how and the where, I think that would be challenging in terms of the practicalities, and it would need some really clever technical solutions and governance frameworks around it.
So I'm not quite sure, to be honest, how that would work. If we think of donating as giving and forgetting, is licensing giving and remembering? But how would you then engage with it? You'd need to be able to engage every possible time your data were going to be used.

But we want to promote data sharing, whether it's donated or licensed. We know the "data saves lives" campaign; this is really, really important, and we need to be able to follow the rules and the regulations, but without being rigid. We need to be able to make decisions that have consequences for people without being partial, so that we have socially acceptable ways to use data. So we need wise interpretations, and it's really important to think about what we can lead, what we can shape, and what we need to follow.

So we come to the black cat. As context for this, I was asked to carry out a study commissioned by the Nuffield Trust, along with a couple of other colleagues, to review the harms associated with health and biomedical data. We did that, and what we found was that the largest problem was actually the maladministration of information governance regimes: not that the policies weren't in place, but that there were problems in how they were interpreted and implemented. I won't talk about that now; what I want to talk about is the alternative, the harms that occur when data are not used. This can be difficult to pin down, and there are lots and lots of domain-specific efforts going on, like the AllTrials campaign, where a lot of effort goes into making sure trial data are registered and information is provided properly, so people know what's going on. But the non-use of data is quite a tricky concept, and there's a lack of publications in this problem area.

So what we did was an international case study using the published literature. We looked at: why is health-related data non-use difficult to ascertain? What are the sources of data non-use? What are some of the reasons? And what are the implications for citizens and society? We looked across clinical records, research domains and governance frameworks. It was focused on health, but it is applicable more widely. And it's really important to point out that the harm due to the non-use of data is more than just not gaining the benefits of data use. It's not one or the other; it's not simply that if you don't use the data you forgo those benefits. Yes, you forgo those benefits, but other things will happen as well.
So if we have a look, first of all, at the clinical regimes, starting with sources: there were a variety of different sources of data non-use. One of them was the way in which case notes are managed. Now, this is something you'll all be very familiar with: often a patient has a number of conditions, and so the case notes may be in one clinic, or another clinic, or another clinic. If everything were beautifully joined up with electronic records, that obviously wouldn't be a problem; but while we still depend a lot on paper records, it can be, because when the records are looked for they're not where they're meant to be, and quite possibly the interesting cases, where someone has a number of conditions, could be in a number of places.

Then there are data entry errors, which can happen to anybody, and I've got two really unusual examples for you. I was working in the clinical audit office at one time, and we had a clinical audit assistant who was given a spreadsheet with all sorts of information from different departments: numbers of procedures, bed days, lengths of stay in hospital and suchlike. She was adding data, and she also needed to sort by date and by department. But when she added and then sorted, she sorted by date by sorting just that column, not the whole spreadsheet, so the dates moved within the column while the rows stayed where they were. Then she sorted on department, and again that one column changed and the rows stayed the same. So basically, as she was adding and sorting, and adding and sorting, it became completely useless, a shambles, and it was irreversible, because it had effectively been randomised. It was only found because someone looked at the figures and said, no, no, that's not right, and it obviously became quite a big problem. Now, I can make this one step worse: after she was found not to be suited to working in clinical audit, she was given a job in medical records. And this is true.
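As a purely illustrative aside, the sketch below reproduces that first mistake in miniature, using pandas as an assumed stand-in for the spreadsheet and made-up figures: sorting a single column on its own silently breaks the row-wise correspondence, whereas sorting the whole table by that column keeps each record intact.

```python
import pandas as pd

# Toy audit extract: each row should move as a single unit.
df = pd.DataFrame({
    "date":       ["2018-03-01", "2018-01-15", "2018-02-20"],
    "department": ["Cardiology", "Surgery",    "Oncology"],
    "procedures": [12,           7,            9],
})

# The mistake described above: sorting ONE column in isolation.
# The dates are reordered, but every other column keeps its old order,
# so each date now sits next to the wrong department and count.
broken = df.copy()
broken["date"] = broken["date"].sort_values().values

# The correct operation: sort whole rows together by the key column.
fixed = df.sort_values("date").reset_index(drop=True)

print(broken)  # dates no longer belong to their rows
print(fixed)   # rows moved as units; the data still mean something
```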
I've got another example for you. This was a researcher who had 4,000 records, quite wide database records; he had them in SPSS and he wanted to put them in Excel. So he took a whole weekend, printed the whole lot out, and literally hand-copied them all in. The only reason we found out was that he produced a bar chart and showed it to me, and I looked at the one he'd produced the previous week, and it bore very little resemblance; it wasn't even the same shape. We looked at what had happened: he had literally copied it all in by hand, row by row. So some strange things do happen.

Unfortunately, sometimes it's about limited data recording: we don't record all data, and we know some information that could be useful just isn't recorded. Sometimes there are no clinical systems that are suitable, or they don't fit, and data are in silos. We all know that people go to hospital and are asked, "What medication are you on?", and people find it odd, because they don't know why the GP has given them certain medication and the hospital doesn't know that they're taking it. Of course, as we know, it's not joined up, but the public often think it is, and find it odd when they go to outpatients and are asked about things and think, well, I told that to so-and-so. They just don't know; it really should be more joined up than that.

So, some of the problems that have happened. The National Audit Office did a huge review of case notes across 120 hospitals, and as an example they looked at an antenatal audit. What happened was that, when they investigated, the results of the audit were better than they should have been, because the problem cases, as I mentioned with the case notes, had their notes in other places, and so they hadn't been included in the audit. Where a pregnant woman had a comorbidity, diabetes or something, the case notes weren't available. It was only about 3% of the case notes that weren't available, but it tended to be a bit systematic, in that they were the ones with the conditions.

We did a survey when we set up the UK MS Register, and we asked all the neurology clinics in the UK how they collect their data. There were about 85 at the time, about 40-something replied to the survey, and about six of them still only used paper or a word processor, no clinical system at all. And for many specialities I guess that's still the case, where people may be using an Excel sheet or an Access database. I don't know if there are any nods going on: is everything where you work beautifully put into a clinical system, or is it still, in some cases... anybody? Large amounts of paper with large amounts of free text? Yes, free text which isn't coded or searchable. Yes, indeed, thank you. Although this is improving, it's still something that needs a lot of work, and of course there are different coding standards, so one system doesn't talk to another, and systems themselves are incompatible.

As for the reasons: I mean, the NHS is wonderful, I totally believe that. I've worked in the NHS.
I wasn't clinical myself, but I'm totally aware of the work that goes on. The NHS is so massively overstretched, and it's about the priority, which is patient care. If the work of the NHS is also to include making sure all the data are done right, that needs to be resourced and made possible, rather than expecting it to happen by magic. In some countries there's a pay-for-performance model, which also influences the way things are done. And it's not by design: these things are not done from the beginning, with systems created to be interoperable and suitable; it's a bit bolt-on, isn't it, imposed from the top and not really looking across the big picture.

The implications when this happens are huge: misdiagnosis; unnecessary intrusion, where, if information is missing, people may be asked the same questions again or need to go through the same tests again, because information just isn't available in order to proceed; procedures they need may be delayed; and there may be poor outcomes, emotional damage, deaths, litigation against the NHS, and of course additional costs.

In research, if we're dealing with information based on clinical records, we may find our study is subject to selection bias, because, of course, people who are sicker have more data entries, more data records, and that in itself can cause a selection bias.
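As a purely illustrative aside, here is a minimal simulation of that point, with made-up numbers: if the chance of having usable records rises with how sick someone is, a records-based sample will look sicker than the population it is drawn from.

```python
import random
import statistics

random.seed(0)

# Hypothetical population: 'severity' is some underlying health measure.
population = [random.gauss(0, 1) for _ in range(100_000)]

def has_usable_records(severity: float) -> bool:
    # Sicker people have more healthcare contacts, so a higher chance of
    # appearing in a records-based study at all (illustrative numbers only).
    p = min(1.0, 0.2 + 0.25 * max(0.0, severity))
    return random.random() < p

sample = [s for s in population if has_usable_records(s)]

print(f"true mean severity in the population:   {statistics.mean(population):.3f}")
print(f"mean severity among those with records: {statistics.mean(sample):.3f}")
# The second number is biased upwards: the study 'sees' a sicker group
# than really exists, purely because of whose data were recorded.
```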
Or there may be improper study design that influences how data are used. Sometimes results are held back for personal or political reasons. Sometimes there's cherry-picking, where someone might do a study of "is X influenced by Y", choose a lot of variables on a huge dataset so that something will come out significant, and then report, "hey, this is significant"; but really it's been done in a way that's cherry-picking, which calls research integrity into question.

Some examples. I don't know if you remember this trial; it was in 2006, an immunomodulatory drug, a first-in-man trial. They gave the drug to a handful of people, who had an awful immune response; I think a couple of them nearly died. Yes, that's the one. It later transpired, during the investigation, that a drug close enough to be relevant had been tried in one person and had caused a big problem, but that information hadn't been available at the time.

Clinical trial success rates: commercial clinical trials are reported to have positive results in about 85% of cases, whereas for non-commercial trials it's only about 50%. That's a little quote from Ben Goldacre, and obviously he's thinking about that, and it's interesting: how does that quite work, then?

Then there's the one about the arrhythmia drug, which was quite a big disaster: about 100,000 people died before the problem was realised. It was in the 1980s, and a drug was developed for arrhythmia and given to people who'd had a heart attack. It was fine as long as it was given to people who'd had a heart attack and had arrhythmia; but when it was given to people who'd had a heart attack but didn't have arrhythmia, they died, basically. Over a hundred thousand people died before that mistake was realised. Sadly, there had been a small study, but it wasn't published. In the investigation the authors came forward and said, look, we weren't able to publish this, but if we had been, we might have been able to stop this happening.

The new anticoagulants: this is an interesting one. One of the big drug companies has marketed dabigatran, and the marvellous thing about the new anticoagulants is that they're meant to need no titrating, one dose suits everybody, and there's no need for frequent monitoring. Well, it then transpired that hundreds of people died or had a severe bleed, many times more than on the older-style warfarin. When it was investigated, it was found that the drug company had wilfully withheld data: they knew that it did need to be titrated and it did need to be monitored. They were fined something like $5 million by the FDA in America. But that drug is a blockbuster with an annual revenue of around a billion dollars, and they were fined five million. And after that they also produced and marketed the antidote, and so were able to sell that as well.

So those are some of the problems that can happen with the non-use of data. And of course we have this problem with journal publications, where we know it's so much easier to publish something with a positive result, a really clear finding, than a negative result or a "basically, this just didn't work".

So why does this sort of thing happen? Well, there are few incentives for data sharing and few penalties for not sharing. Sometimes it's about market share, intellectual property, or investment, whether in money or effort; kudos from holding the data, because sometimes we're just under pressure and some people want to keep it to themselves; pressure to publish; and publication bias, as I mentioned.
455 00:45:59,790 --> 00:46:00,840 But as a result, 456 00:46:01,290 --> 00:46:09,510 sometimes we're dealing with, working with, untrustworthy information, because unreliable findings are taken forward because they look shiny, 457 00:46:09,810 --> 00:46:16,200 whereas the second-best or third-best finding, which was actually true, isn't seen as so good and doesn't get taken forward. 458 00:46:17,010 --> 00:46:24,660 True findings may be concealed, with multiple, multiple deaths. In terms of governance frameworks, 459 00:46:24,900 --> 00:46:30,120 these, of course, are there to safeguard individuals and to make sure everything is done properly. 460 00:46:30,720 --> 00:46:35,160 But we'd all be familiar with lengthy and duplicative processes, 461 00:46:35,820 --> 00:46:45,620 inconsistent advice and subjective interpretation, like that life-through-a-lens image, and sometimes even excessive disclosure control. 462 00:46:45,680 --> 00:46:51,350 And when data are anonymized, sometimes there's a drive of 'we must do more, 463 00:46:51,350 --> 00:46:53,990 we must do more to it, in order to make it safer'. 464 00:46:55,610 --> 00:47:02,180 Sometimes there's non-acceptance by an organisation after the regulatory and governance approvals have been gained. 465 00:47:02,390 --> 00:47:11,360 Sometimes a data provider will still say no, they won't provide data, even though ethical approval and everything that's necessary is in place. 466 00:47:13,380 --> 00:47:16,170 Sometimes you get consent bias. 467 00:47:16,410 --> 00:47:24,330 There's an article on a malformation study where there was a systematic difference between those who consented and those who didn't. 468 00:47:24,630 --> 00:47:30,990 And so that can affect the study. That also happens with 469 00:47:31,000 --> 00:47:37,170 wicked problems such as substance misuse and psychosocial problems in young people, 470 00:47:37,410 --> 00:47:41,670 where it is really hard to get hold of the people in order to get consent. 471 00:47:42,090 --> 00:47:49,410 As a result, if there is this insistence on consent, then you can't study the issue properly. 472 00:47:50,430 --> 00:47:54,540 There was a particular problem with substance misuse data in the US, where 473 00:47:54,540 --> 00:47:59,610 the data provider just decided, with no changes in regulations or legislation, 474 00:47:59,910 --> 00:48:06,600 that they would now withhold from research any record that contained an instance of substance misuse. 475 00:48:07,140 --> 00:48:10,200 But that means the subject can't be studied properly. 476 00:48:10,980 --> 00:48:14,660 That was just something that probably needed a lot of fighting back. 477 00:48:14,670 --> 00:48:19,020 There was an article published on it, so I would hope that that has now been resolved. 478 00:48:20,640 --> 00:48:23,520 The reasons can be, of course, that there are framework changes. 479 00:48:23,940 --> 00:48:30,870 Often it's because responsibilities are unclear and there is fear of reputational risk, fear of getting it wrong. 480 00:48:30,990 --> 00:48:37,020 It's easier to say no than it is to think, what happens if I say yes and I get this wrong? 481 00:48:38,070 --> 00:48:43,200 There's this little term here, privacy protectionism, which was coined by some Australian researchers.
482 00:48:43,500 --> 00:48:51,060 And what they mean by that is adding extra controls on the data and doing more and more processing to the data to make it safer, 483 00:48:51,210 --> 00:48:53,010 but actually not making it safer at all. 484 00:48:53,160 --> 00:48:59,520 So it doesn't actually add safeguards; it just adds further and further controls and limits the utility of the data. 485 00:49:00,210 --> 00:49:05,070 And, of course, the extent of public support for the use of all kinds of data isn't known. 486 00:49:06,090 --> 00:49:11,850 As a result, we get an information deficit: research can be delayed or abandoned and benefits left unattained. 487 00:49:13,050 --> 00:49:21,600 I don't know about you, but maybe I'm finding, I'm seeing, more of a caution since the GDPR, and not necessarily because of the GDPR itself, 488 00:49:21,750 --> 00:49:23,430 but because of interpretations. 489 00:49:23,700 --> 00:49:29,730 All of a sudden everybody's a legal expert, because it's a new piece of legislation and lots of people have had to engage with it. 490 00:49:30,150 --> 00:49:36,000 I don't know about you, but, you know, just look at the range of emails I have received from companies I do and don't know. 491 00:49:36,420 --> 00:49:42,209 Either you need to do basically nothing for us to continue to send you this thing, or you need to do an awful lot of 492 00:49:42,210 --> 00:49:47,520 things, lots and lots of clicking and saying yes and reading, in order for us just to send you a newsletter. 493 00:49:47,820 --> 00:49:54,090 And I think, you know, that's just, again, evidence of the differences in how things are interpreted. 494 00:49:54,780 --> 00:49:59,400 And of course the result, again, is multiple deaths and financial burdens to societies. 495 00:50:00,120 --> 00:50:04,020 But it's not just one thing at a time. 496 00:50:04,530 --> 00:50:05,820 There are combined effects. 497 00:50:06,210 --> 00:50:11,550 So if you think of this doctor's handwriting here, which is probably very familiar, and which the doctors among us could probably read: 498 00:50:11,550 --> 00:50:19,380 if we think of how the data are collected, there are issues of data non-use there; 499 00:50:20,340 --> 00:50:24,330 in how it's managed and stored, and what is actually managed and stored; 500 00:50:24,660 --> 00:50:30,990 in how it's used for research; in how it's taken on to inform policies and guidelines; 501 00:50:31,320 --> 00:50:37,920 and then in how it is received back at the bedside or at school, and what is collected at that point. 502 00:50:38,100 --> 00:50:43,410 That comes back around to the start of the journey. So it's a kind of big synergistic effect. 503 00:50:43,560 --> 00:50:50,010 It's an effect that has a knock-on effect. It's a trail and it can compound itself. 504 00:50:51,180 --> 00:50:54,900 So how serious is this problem? Well, as we know, 505 00:50:55,500 --> 00:51:03,090 the non-use of health data is implicated in the deaths of hundreds of thousands of people and billions of dollars in financial burdens for societies. 506 00:51:03,390 --> 00:51:08,580 And this is quite a nice quote from Managed Healthcare Executive.
507 00:51:08,910 --> 00:51:17,790 And what they said was that the greatest threat, the biggest risk, to people with diabetes or heart disease, cancer, HIV, etc., 508 00:51:18,750 --> 00:51:23,310 seems not to be from unauthorised sharing or use of their personal health information, 509 00:51:23,460 --> 00:51:27,750 but rather from a failure to share or inadequate use of that information, 510 00:51:27,930 --> 00:51:32,610 and sometimes even from valuing privacy over protecting an individual's life, 511 00:51:32,790 --> 00:51:36,390 their health and the health of their families, their friends and their neighbours. 512 00:51:37,410 --> 00:51:45,420 I think that's a really, really interesting point, because it's not even clear what we mean by this, is it? 513 00:51:45,600 --> 00:51:49,860 What do we mean by protecting privacy? I could waffle on about that for quite a long time, 514 00:51:50,160 --> 00:51:56,190 because if we think about whether something is disclosed, well, what happens if it is disclosed? 515 00:51:56,340 --> 00:52:01,680 Because sometimes the disclosure of information doesn't cause any problems and sometimes it does. 516 00:52:01,950 --> 00:52:06,810 So not everything needs to be private, but obviously some things do. 517 00:52:07,080 --> 00:52:11,790 And it's quite a difficult balance. So, the recommendations from the study. 518 00:52:12,050 --> 00:52:16,440 They were: there needs to be support for a culture change in data capture. 519 00:52:16,460 --> 00:52:21,680 How do we do this? It needs to be thought about and built in, not a bolt-on solution. 520 00:52:22,040 --> 00:52:25,070 Decision makers need to think beyond their specific settings. 521 00:52:25,700 --> 00:52:30,260 Data capture systems need to be fit for their settings and need to be interoperable. 522 00:52:31,620 --> 00:52:36,890 We need to promote onward data sharing, while recognising the investments that have been made by people. 523 00:52:37,100 --> 00:52:42,679 People may have worked hard for years to build a big cohort of data for research, and 524 00:52:42,680 --> 00:52:46,850 it's understandable that they need to be recognised for having done that work. 525 00:52:47,360 --> 00:52:56,630 But there need to be greater repercussions for wilful non-use of data. Okay, it may be accidental or unintentional, or the data is just not available. 526 00:52:56,900 --> 00:53:04,070 But where data are withheld because 'I don't want to publish yet', or 'I'm going to market this drug and then I'll 527 00:53:04,070 --> 00:53:08,300 deal with the repercussions, because I know I'm going to make far more money than the fine I'm ever going to get', 528 00:53:08,580 --> 00:53:12,710 you know, we need something more than a tiny fine on a massive revenue. 529 00:53:14,210 --> 00:53:16,430 Governance models need to be proportionate. 530 00:53:16,910 --> 00:53:27,380 We don't need the same level of ethical scrutiny and suchlike for something non-intrusive as we do for an intervention. 531 00:53:28,250 --> 00:53:34,790 We need to rebalance that tension between data privacy and utility, with greater support for trustworthy re-use. 532 00:53:35,300 --> 00:53:44,750 I came across a term, data obsolescence, and I just think, no: we really need to be thinking about reusing data, 533 00:53:44,900 --> 00:53:49,730 not creating data for one project and then saying, okay, we can't use that again. 534 00:53:50,030 --> 00:53:53,960 The massive investment that's gone into that needs to be taken into account.
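On the "tiny fine on a massive revenue" point, a quick bit of arithmetic using only the figures quoted earlier in the talk (roughly a $5 million fine against roughly $1 billion of annual revenue) shows why such a penalty carries little deterrent weight:

\[
\frac{\$5\ \text{million}}{\$1{,}000\ \text{million}} = 0.005 \approx 0.5\% \ \text{of a single year's revenue.}
\]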
535 00:53:54,830 --> 00:53:58,040 We need to present the public with a balanced view of data use. 536 00:53:58,310 --> 00:54:01,790 Yes, there are some risks, and yes, there are huge benefits, 537 00:54:01,970 --> 00:54:08,840 but there are also problems when data are not used. And if people choose not to allow their data to be used, 538 00:54:09,050 --> 00:54:18,680 even to say, okay, I don't want my data to be anonymized and used, well, then there needs to be an understanding of what happens when data aren't used. 539 00:54:18,980 --> 00:54:22,700 And that's so that social responsibility can work both ways. 540 00:54:24,380 --> 00:54:27,140 Just some brief thoughts on the recent developments. 541 00:54:27,470 --> 00:54:35,570 It's a great thing that there's now the public interest provision for data use for research, rather than needing to rely on consent all the time, 542 00:54:35,750 --> 00:54:41,900 because obviously with the GDPR the bar for consent has been raised and it is much more stringent. 543 00:54:42,650 --> 00:54:47,270 It's good that the Digital Economy Act allows data sharing for anonymization by 544 00:54:47,270 --> 00:54:52,820 government departments. There have been some comments on the National Data Guardian opt-out, 545 00:54:53,420 --> 00:54:58,399 the opting out of personal data being used beyond direct care. With Understanding Patient Data, 546 00:54:58,400 --> 00:55:02,930 I've written a couple of blogs on this, because there is some concern that if people opt out, 547 00:55:03,080 --> 00:55:08,000 it could bias research and audit and service development because of, 548 00:55:08,000 --> 00:55:14,720 let's say, rare conditions, rare events, diversity, ethnicity and possibly differences in geography. 549 00:55:14,990 --> 00:55:19,970 So that, I think, needs to be carefully looked at. 550 00:55:20,210 --> 00:55:28,970 But it's obviously understandable that it needs to be done properly and not in any way that the public wouldn't find acceptable. 551 00:55:30,650 --> 00:55:33,620 So there's a need for innovation in data governance. 552 00:55:33,920 --> 00:55:44,210 However the data are generated, from wherever Schrodinger's cat produces his donations, in order to get from data to information 553 00:55:44,240 --> 00:55:46,190 there needs to be innovation. 554 00:55:47,000 --> 00:55:56,090 It's not just about following the legislation and regulations and the ethical, legal and social implications and due diligence. 555 00:55:56,090 --> 00:55:58,670 It is about them, but there's more to it than that. 556 00:55:59,090 --> 00:56:05,840 We need to take into account how these things are interpreted, how they're perceived, what sort of reservations there are, 557 00:56:06,050 --> 00:56:13,790 and what the aspirations of the people present in the different stakeholder perspectives are; to consider the literature, 558 00:56:13,790 --> 00:56:22,370 the debate, the media, that information splat we looked at, that sort of shifting body of knowledge that floats around shaping social realities. 559 00:56:22,640 --> 00:56:26,959 So if we're going to do this successfully, we need to be innovative. 560 00:56:26,960 --> 00:56:32,720 We need to do it properly, because otherwise it risks becoming unethical through not using data; 561 00:56:32,850 --> 00:56:41,360 it would swing too far. So, in terms of final words: we need to hone our juggling skills, all of us,
562 00:56:41,360 --> 00:56:47,480 and, as researchers, work towards responsible, trustworthy and socially acceptable use of data, 563 00:56:47,930 --> 00:56:55,790 because the non-use of data is like a black cat: a large, agile, polymorphic, lethal black cat. 564 00:56:56,000 --> 00:57:00,079 And he certainly is out there, and a better understanding of his nature 565 00:57:00,080 --> 00:57:07,970 is needed, because this is a global problem, and he is definitely not truly Schrodinger's pet, because he really is there. 566 00:57:08,830 --> 00:57:12,540 And this is the publication, which you can have the details of. 567 00:57:12,780 --> 00:57:17,550 It's the first known study to address perspectives of health data non-use. 568 00:57:18,030 --> 00:57:25,530 It's informing the parliamentary inquiry on research integrity and the Wellcome Trust initiative on understanding patient data. 569 00:57:26,070 --> 00:57:31,350 So that's my lot. And just thank you very much, and over to your questions. 570 00:57:31,350 --> 00:57:34,979 And that's a picture of where I'm from, Swansea, where I work. 571 00:57:34,980 --> 00:57:37,560 I don't live there actually, but I'll leave that up for you.