1 00:00:00,850 --> 00:00:05,020 Thank you very much, Crystal. It's really nice to be here. 2 00:00:05,020 --> 00:00:15,430 Very strange to be in a glass box, so I hope this works, so you'll have to bear with me if it's a bit strange. 3 00:00:15,430 --> 00:00:22,900 Thank you very much for the invitation to come to the department. So the subject of my talk is indeed statistical ethics. 4 00:00:22,900 --> 00:00:32,800 It's a topic which I approach with some humility, since I know that we have different experiences and views regarding ethics. 5 00:00:32,800 --> 00:00:42,940 In part, that's because statisticians work within a very wide range of different economic, cultural, legal and political settings. 6 00:00:42,940 --> 00:00:53,770 Also, because we work within different branches of our discipline, each involving its own techniques and procedures and its own ethical approach. 7 00:00:53,770 --> 00:01:00,020 Can everybody hear me OK? Can I just check the technology? I'm being indicated that I should speak up. 8 00:01:00,020 --> 00:01:04,060 I'm speaking quite loudly. We can hear you online. 9 00:01:04,060 --> 00:01:10,030 You can hear me online. That's good. That's good. Great. 10 00:01:10,030 --> 00:01:17,860 The other reason why we have a variety of different experiences with respect to ethics is 11 00:01:17,860 --> 00:01:24,520 that many statisticians work in close partnership with professionals from other disciplines, 12 00:01:24,520 --> 00:01:35,080 and those other disciplines have existing conventions. And in such situations, statisticians should make their own ethical principles clear, 13 00:01:35,080 --> 00:01:39,790 but should also respect the ethical principles of their collaborators. 14 00:01:39,790 --> 00:01:44,690 They might be medics, they might be economists, et cetera. 15 00:01:44,690 --> 00:01:53,840 So even within the same setting and branch of statistics, individuals may have very different moral precepts which guide their work. 
16 00:01:53,840 --> 00:02:02,400 So in this talk, I'm not seeking to impose a rigid set of rules with which statisticians should comply. 17 00:02:02,400 --> 00:02:07,840 And I think that's really important; we could come back to that in discussion, if you like. 18 00:02:07,840 --> 00:02:12,850 My own background is, as Crystal said, as an applied social statistician. 19 00:02:12,850 --> 00:02:20,950 I've worked in central government, in this country and in the United Nations, and for a brief period in the NHS. 20 00:02:20,950 --> 00:02:30,070 My more recent roles in academia have involved supporting graduate students and researchers in social science and medicine. 21 00:02:30,070 --> 00:02:35,230 But my interest in ethics began very early in my career, in my twenties. 22 00:02:35,230 --> 00:02:42,670 I chaired the Ethics Committee of the Social Research Association, and we developed professional guidelines, 23 00:02:42,670 --> 00:02:48,430 which were subsequently adopted by the International Statistical Institute. 24 00:02:48,430 --> 00:03:00,160 The aim was to support the social researcher, stroke statistician in making individual ethical judgements and informing them of shared values. 25 00:03:00,160 --> 00:03:09,010 They're called guidelines for a very important reason, and that is because, as the introduction to the guidelines says, 26 00:03:09,010 --> 00:03:17,050 it isn't appropriate to draw up a list of regulations, not least because context is important. 27 00:03:17,050 --> 00:03:24,570 So I'm going to be drawing on those guidelines, and you can reference them here. 28 00:03:24,570 --> 00:03:31,230 International Statistical Institute Declaration on Professional Ethics. 29 00:03:31,230 --> 00:03:42,210 I'm also going to be referring to the United Nations Fundamental Principles of Official Statistics. 30 00:03:42,210 --> 00:03:48,920 And let me just start off by quoting from those fundamental principles. 
31 00:03:48,920 --> 00:03:58,070 Official statistics provide an indispensable element in the information system of a democratic society, serving the government, 32 00:03:58,070 --> 00:04:07,160 the economy and the public with data about the economic, demographic, social and environmental situation. 33 00:04:07,160 --> 00:04:15,800 To this end, official statistics that meet the test of practical utility are to be compiled and made available on an 34 00:04:15,800 --> 00:04:25,580 impartial basis by official statistical agencies to honour citizens' entitlement to public information. 35 00:04:25,580 --> 00:04:36,480 I would say to empower the public as well. That, I think, underpins my talk, and I'm going to structure my talk around four things. 36 00:04:36,480 --> 00:04:44,490 Ensuring honesty and integrity, avoiding overclaiming, promoting transparency and delivering public goods. 37 00:04:44,490 --> 00:04:50,590 So I'm not covering the whole area of ethics, but focussing on these. 38 00:04:50,590 --> 00:04:55,540 So theme one. Ensuring honesty and integrity. 39 00:04:55,540 --> 00:05:04,440 Let me quote from the ISI declaration. While statisticians operate within the value systems of their societies, 40 00:05:04,440 --> 00:05:09,990 they should attempt to uphold their professional integrity without fear or favour. 41 00:05:09,990 --> 00:05:15,900 They should also not engage or collude in selecting methods designed to produce 42 00:05:15,900 --> 00:05:25,020 misleading results or misrepresenting statistical findings by commission or omission. 43 00:05:25,020 --> 00:05:31,440 We're all familiar with the damage wreaked by misinformation, which has, to be quite honest, 44 00:05:31,440 --> 00:05:37,530 undermined trust in statistics, from the false claims made about the windfalls that 45 00:05:37,530 --> 00:05:45,120 Brexit would deliver to the NHS to denying the reality of national PPE shortages. 
46 00:05:45,120 --> 00:05:58,940 To the more recent lies relating to, say, the Starmer issue, the guiding principle of many has been to lie first and avoid questions later. 47 00:05:58,940 --> 00:06:09,350 And research, unfortunately, shows that once the lie is made, it gains a life of its own and can be extremely difficult to counter. 48 00:06:09,350 --> 00:06:19,310 I rather like the saying that lies move round the world and back whilst truth is doing up its shoelaces. 49 00:06:19,310 --> 00:06:27,100 And, of course, social media and echo chambers amplify the lies. 50 00:06:27,100 --> 00:06:35,740 The U.N. Fundamental Principles state, I think rather strangely, in principle four, that statistical agencies 51 00:06:35,740 --> 00:06:46,240 are entitled to comment on erroneous interpretation and misuse of statistics. I find it rather strange because of course they're entitled to. 52 00:06:46,240 --> 00:06:51,340 I would prefer the word entitled in this to be much, much strengthened. 53 00:06:51,340 --> 00:07:00,640 I think we should be aiming for correcting misinterpretations of any data for which we have responsibility. 54 00:07:00,640 --> 00:07:07,450 We can't do it across all data, of course, but any for which we have responsibility. 55 00:07:07,450 --> 00:07:15,040 The ISI declaration talks about the statistician considering the likely consequences of collecting and 56 00:07:15,040 --> 00:07:22,280 disseminating various types of data and says they should guard against predictable misinterpretations or misuse. 57 00:07:22,280 --> 00:07:31,980 So that's rather difficult. So you've got to anticipate the misuse, according to the ISI declaration. 58 00:07:31,980 --> 00:07:40,200 Of course, here in the United Kingdom, the UK Statistics Authority in particular tries to correct misinformation. 59 00:07:40,200 --> 00:07:52,410 For example, in their recent intervention, they rebuked the prime minister and the Home Office for claiming erroneously that crime is falling. 
60 00:07:52,410 --> 00:07:55,020 Unfortunately, they don't have teeth, 61 00:07:55,020 --> 00:08:06,810 and the prime minister repeated very recently the claim that there are more people in work in the UK now than there were two years ago, 62 00:08:06,810 --> 00:08:17,630 and he was rebuked by the statistics authority some time ago for that incorrect statement. 63 00:08:17,630 --> 00:08:26,150 World leaders and famous people are vectors for disinformation if they condone and they normalise false claims. 64 00:08:26,150 --> 00:08:34,030 We have particular problems in the UK, what I call the land of the performance indicator. 65 00:08:34,030 --> 00:08:42,620 And our government confuses setting targets with delivering improvements. 66 00:08:42,620 --> 00:08:52,760 So problems arise when governments or other organisations are both monitoring and being monitored by indicators. 67 00:08:52,760 --> 00:08:59,780 And ranking by the indicators exacerbates the problem because it raises the pressure. 68 00:08:59,780 --> 00:09:07,730 There is an absolutely wonderful report from a few years ago by the Royal Statistical Society called The Good, 69 00:09:07,730 --> 00:09:17,000 the Bad and the Ugly, about the benefits and the disbenefits of performance indicators. 70 00:09:17,000 --> 00:09:25,490 One of the difficulties we have is that when a measure becomes a target, it ceases to be a good measure. 71 00:09:25,490 --> 00:09:31,280 Rather like David Boyle's paradox: if we don't count something, it gets ignored. 72 00:09:31,280 --> 00:09:45,800 If we do count it, it gets perverted. Misinformation refers to claims that are false, but are not necessarily created with an intent to mislead. 73 00:09:45,800 --> 00:09:53,880 So you can counter misinformation. It basically requires explaining what is wrong 74 00:09:53,880 --> 00:10:02,160 and why it's wrong. Disinformation is created expressly to mislead. 
75 00:10:02,160 --> 00:10:10,350 And it requires that, as well as showing what's wrong, you need to investigate who's behind it and why. 76 00:10:10,350 --> 00:10:18,460 And of course, that can be very difficult because some disinformation campaigns can be very sophisticated. 77 00:10:18,460 --> 00:10:29,660 It is sometimes possible to show that a false claim is being shared online in a coordinated manner by a number of social media accounts, 78 00:10:29,660 --> 00:10:37,430 particularly where they share similar traits or behaviour and perhaps are all linked to certain political campaigns. 79 00:10:37,430 --> 00:10:46,140 And there are some tools like CrowdTangle that can be used to investigate patterns of behaviour in such accounts. 80 00:10:46,140 --> 00:10:50,700 I'm grateful to the Reuters Institute for the Study of Journalism here in Green 81 00:10:50,700 --> 00:10:55,710 Templeton College for these explanations and for their work on fake news, 82 00:10:55,710 --> 00:11:04,560 and I would encourage you to look at their communications about the topic of fake news. 83 00:11:04,560 --> 00:11:12,180 I'd like to see statisticians being much more proactive in speaking out about the importance of maintaining 84 00:11:12,180 --> 00:11:21,130 trust in statistics and more involved in the science which tackles mis- and disinformation. 85 00:11:21,130 --> 00:11:29,670 It would also be good to see greater cooperation between fact checkers and statisticians. 86 00:11:29,670 --> 00:11:41,400 Students of this department could do internships at organisations such as Full Fact. Full Fact was established in 2010 by Will Moy, 87 00:11:41,400 --> 00:11:49,200 in part because of Peter Oborne's wonderful book The Rise of Political Lying. 88 00:11:49,200 --> 00:11:55,110 The initiatives of the Royal Statistical Society in establishing a journalism prize 89 00:11:55,110 --> 00:12:01,410 and in training journalists in the sound use of statistics are also to be applauded. 
90 00:12:01,410 --> 00:12:06,510 Some media organisations are also building expertise, 91 00:12:06,510 --> 00:12:12,210 interestingly, sometimes known under the label of digital verification. 92 00:12:12,210 --> 00:12:18,420 And I was extremely pleased when the BBC appointed a head of statistics. 93 00:12:18,420 --> 00:12:26,000 He has a particular challenge to correct the issue of the false balance of reporting. 94 00:12:26,000 --> 00:12:34,310 It's also positive that journalists are more likely to say now when politicians are lying, and there are discussions taking place 95 00:12:34,310 --> 00:12:42,170 that if an MP lies, colleagues can ask the Speaker to have a statement checked by the House of Commons Library. 96 00:12:42,170 --> 00:12:47,250 So I think those are positive developments. 97 00:12:47,250 --> 00:12:51,870 There are also significant benefits when statisticians work in partnership 98 00:12:51,870 --> 00:12:58,020 with scientific communities that are committed to the integrity of reporting, 99 00:12:58,020 --> 00:13:08,310 as illustrated over the last couple of years by the strong involvement of statisticians in the Science Media Centre during the pandemic. 100 00:13:08,310 --> 00:13:18,070 I think it has been really positive. Difficult as the issue is of maintaining trust in data in the UK, 101 00:13:18,070 --> 00:13:26,620 it's significantly worse in countries where the official statistics agency is not independent from the government of the day. 102 00:13:26,620 --> 00:13:37,420 Having worked in the UN, I can give you numerous examples of what I call political numbers: when the statistical system is badly under-resourced, 103 00:13:37,420 --> 00:13:42,950 when the evidence base for such numbers is poor or even non-existent. 104 00:13:42,950 --> 00:13:52,320 When statisticians have no power and no external support system, they're really not free to produce. 105 00:13:52,320 --> 00:14:01,690 
So often the data are produced to satisfy the political masters. 106 00:14:01,690 --> 00:14:08,860 Freedom to speak, truth to power has been seen as a characteristic of democracy. 107 00:14:08,860 --> 00:14:15,790 The need for a set of principles governing official statistics became apparent at the end of the 1980s, 108 00:14:15,790 --> 00:14:26,320 when countries in Central Europe began to change from centrally planned economies to market oriented democracies. 109 00:14:26,320 --> 00:14:34,150 It was essential to ensure that national statistical systems in such countries would be able to produce appropriate, 110 00:14:34,150 --> 00:14:41,220 reliable data that adhere to professional and scientific standards. 111 00:14:41,220 --> 00:14:48,150 However, since then, I think we realise that the principles are of much wider global significance. 112 00:14:48,150 --> 00:14:53,940 And there are examples, even in Western Europe, of direct manipulation of statistics. 113 00:14:53,940 --> 00:15:01,440 So we could look at Greece, for example, in the manipulation of the economic data in order that Greece could join the EU and 114 00:15:01,440 --> 00:15:09,270 then further manipulation and audit an order that Greece could join the euro currency. 115 00:15:09,270 --> 00:15:16,960 There are just too many incentives for governments to mis report, and I'll return this to this later. 116 00:15:16,960 --> 00:15:26,470 This slide shows some of the characteristics of a statistical system which help to protect its independence, 117 00:15:26,470 --> 00:15:30,850 the autonomy of the statistician, whether the statistical legislation, 118 00:15:30,850 --> 00:15:35,830 the existence of an independent board overseeing statistics, 119 00:15:35,830 --> 00:15:45,130 development of codes of conduct and breaches of the code being identified, investigated and importantly publicised. 120 00:15:45,130 --> 00:15:50,860 So it's not just that it happens, but the perception is that it happens. 
121 00:15:50,860 --> 00:16:00,870 The important employment of the senior statisticians, particularly the national statistician, being removed from the political process. 122 00:16:00,870 --> 00:16:08,820 That there's involvement with users, and the users should be involved in setting the agenda, asking the awkward questions. 123 00:16:08,820 --> 00:16:21,040 There are external audits of processes, and any audit body doesn't report to the government of the day, but reports more widely to parliament. 124 00:16:21,040 --> 00:16:29,860 A red flag is the dismissal of a national statistician, as has recently happened in Fiji and in Turkey. 125 00:16:29,860 --> 00:16:39,930 Let me read from the RSS website just today. The Royal Statistical Society and the American Statistical Association note with concern 126 00:16:39,930 --> 00:16:46,830 the dismissal of the head of the Turkish Statistical Institute by Turkey's president 127 00:16:46,830 --> 00:16:52,170 at a time when the country is experiencing high levels of inflation. 128 00:16:52,170 --> 00:17:00,750 The United Nations Fundamental Principles of Official Statistics clearly set out the professional and scientific standards for official statistics 129 00:17:00,750 --> 00:17:10,830 to ensure reliable and robust information to aid decision making. As learned societies with memberships of government statisticians across the world, 130 00:17:10,830 --> 00:17:19,400 the RSS and the ASA look to highlight the importance of these principles in maintaining public trust. 131 00:17:19,400 --> 00:17:31,160 The RSS and the ASA condemn political interference in the production of official statistics and urge President Erdogan to reassure those statisticians 132 00:17:31,160 --> 00:17:40,040 working for the Turkish Statistical Institute that they are free to produce objective statistical information that serves not just the government, 133 00:17:40,040 --> 00:17:51,960 but the Turkish public. 
This is essential to ensure a healthy democracy in Turkey and to maintain international credibility in statistics. 134 00:17:51,960 --> 00:17:58,650 My second theme is the avoidance of overclaiming. 135 00:17:58,650 --> 00:18:06,810 The ISI declaration talks about statisticians depending upon the confidence of the public, and that they should, in their work, 136 00:18:06,810 --> 00:18:16,680 attempt to promote and preserve such confidence without exaggerating the accuracy or explanatory power of their data. 137 00:18:16,680 --> 00:18:21,300 So it's important that we shouldn't overclaim for our results. In particular, 138 00:18:21,300 --> 00:18:27,270 we should not claim representativeness beyond the population we've studied. 139 00:18:27,270 --> 00:18:31,020 To do so risks bringing statistics into disrepute. 140 00:18:31,020 --> 00:18:38,790 I could quote many examples, but perhaps one of the most frequent problem areas is the reporting of opinion polling, 141 00:18:38,790 --> 00:18:43,440 where the data are assumed to be subject only to sampling error, 142 00:18:43,440 --> 00:18:48,540 ironically, even when probability sampling has not been employed, 143 00:18:48,540 --> 00:18:53,640 and where non-sampling error is completely disregarded. 144 00:18:53,640 --> 00:19:03,800 Indeed, in relation to the analysis of many data sets, we see the possible effects of non-response ignored. 145 00:19:03,800 --> 00:19:11,570 And there's an overemphasis on sample size with little understanding of bias. 146 00:19:11,570 --> 00:19:17,030 I love these two quotes, one from Brad Efron. 147 00:19:17,030 --> 00:19:21,080 He talks about scientists having misled themselves into thinking that 148 00:19:21,080 --> 00:19:28,500 if you collect enormous amounts of data, you're bound to get the right answer. 
149 00:19:28,500 --> 00:19:41,680 And Nate Silver, responding to claims by Chris Anderson, the editor of Wired magazine, that the sheer volume of data would obviate the need for theory, 150 00:19:41,680 --> 00:19:46,110 the need for statisticians, I guess, and even for the scientific method. 151 00:19:46,110 --> 00:19:58,860 Nate Silver argues that these views are badly mistaken; he says the numbers have no way of speaking for themselves. 152 00:19:58,860 --> 00:20:08,160 A related ethical issue is that statisticians have a responsibility also to say what we don't know. 153 00:20:08,160 --> 00:20:14,580 Indeed, that can be very informative, as it often highlights priorities within our society. 154 00:20:14,580 --> 00:20:22,110 Deaths in old people's homes at the start of the pandemic is an example which springs to mind. 155 00:20:22,110 --> 00:20:27,810 So I've talked about the importance of not overclaiming, but there is another side to this story. 156 00:20:27,810 --> 00:20:33,330 And the other side is that statisticians are often seen as being too cautious. 157 00:20:33,330 --> 00:20:37,680 It's unhelpful if we resist drawing conclusions from our data. 158 00:20:37,680 --> 00:20:47,040 I have a wonderful T-shirt that I got at an ASA conference that says being a statistician means never having to say, I'm certain. 159 00:20:47,040 --> 00:20:53,880 And as David Spiegelhalter said in last week's Desert Island Discs, statistics do not speak for themselves. 160 00:20:53,880 --> 00:20:57,450 We imbue them with meaning. Incidentally, 161 00:20:57,450 --> 00:21:06,720 isn't that a wonderful sign of the profile of statisticians being raised: a statistician on Desert Island Discs? I don't know if he was the first statistician, 162 00:21:06,720 --> 00:21:11,950 but I certainly can't remember one before. 163 00:21:11,950 --> 00:21:21,600 A topic that we might discuss is whether statisticians should ever make policy recommendations, which was a live issue here during the pandemic. 
164 00:21:21,600 --> 00:21:27,510 I'm not someone who sees a clear line between the objective and the subjective. 165 00:21:27,510 --> 00:21:32,100 What we choose to study, the questions we choose to ask, 166 00:21:32,100 --> 00:21:41,940 the timing of our research, et cetera, et cetera. These are all decisions that we make, but which are going to impact the findings. 167 00:21:41,940 --> 00:21:46,780 Somebody else doing a similar study may come up with different results. 168 00:21:46,780 --> 00:21:53,260 So it's very difficult, in my view, to say what's objective and what's subjective. 169 00:21:53,260 --> 00:22:02,770 I'll come back to this in relation to openness. The other reason why I think we should have a discussion about whether statisticians should ever make 170 00:22:02,770 --> 00:22:10,870 policy recommendations is that many statisticians become real experts in the substantive areas they study. 171 00:22:10,870 --> 00:22:13,660 I could choose examples from this department. 172 00:22:13,660 --> 00:22:21,040 I think you've got a number of statisticians in this department who are real experts in substantive areas, 173 00:22:21,040 --> 00:22:28,330 and it's a great pity if they feel overly constrained in what they can say. 174 00:22:28,330 --> 00:22:37,120 Of course, statisticians are frequently put under pressure by the media to step beyond their area of expertise. 175 00:22:37,120 --> 00:22:43,570 And it's easy to get sidelined by a clever interviewer. 176 00:22:43,570 --> 00:22:50,360 And we need to take account, too, of the weak statistical literacy in our society. 177 00:22:50,360 --> 00:23:01,990 The lack of understanding by the public of the use of modelling and of variability led to an over-interpretation of these as predictions. 178 00:23:01,990 --> 00:23:11,190 Incidentally, I'm reminded of Fiddler's statement that forecasting is very difficult, especially if it's about the future. 
179 00:23:11,190 --> 00:23:19,450 We do need better skills as statisticians in how to operate responsibly and helpfully with the media, 180 00:23:19,450 --> 00:23:25,900 especially in an environment where we're urged to demonstrate greater impact of our work. 181 00:23:25,900 --> 00:23:31,800 And there's a great temptation for academics to become media celebrities. 182 00:23:31,800 --> 00:23:41,080 In particular, I think we need to improve the communication and the understanding of uncertainty. 183 00:23:41,080 --> 00:23:48,160 My third theme is openness or openness and transparency. 184 00:23:48,160 --> 00:23:52,570 The ISI declaration says that statisticians are frequently furnished with information 185 00:23:52,570 --> 00:23:57,460 by the funder or employer who may legitimately require it to be confidential. 186 00:23:57,460 --> 00:24:06,970 Incidentally, elsewhere in the declaration, they talk about the importance of being open about who you are funded by and employed by. 187 00:24:06,970 --> 00:24:16,400 Statistical methods and procedures that have been utilised to produce published data should not, however, be kept confidential. 188 00:24:16,400 --> 00:24:24,830 I can't stress enough the importance of transparency of methodology and openness about uncertainty. 189 00:24:24,830 --> 00:24:32,330 But I also want to champion the moral imperative to share data. 190 00:24:32,330 --> 00:24:39,830 Crystal didn't mention that I used to be director of the UK Data Archive, so this is something that is very close to my heart. 191 00:24:39,830 --> 00:24:50,460 The scientific principle is that data should be available for others to reuse: to confirm, to clarify, to extend, to enhance the results. 192 00:24:50,460 --> 00:24:59,960 That's part of public accountability. We have a responsibility to society, 193 00:24:59,960 --> 00:25:14,830 but of course, also to our funders, to use resources efficiently, and it's important to reduce response burden. 
194 00:25:14,830 --> 00:25:23,800 Steve Fienberg, Miron Straf, Sue Martin, I think it's Sue Martin, wrote this wonderful paper some time ago, 195 00:25:23,800 --> 00:25:33,790 1985, gosh, about the importance of access to data and that publicly funded research data are a public good: 196 00:25:33,790 --> 00:25:40,120 produced in the public interest, they should remain in the public realm. 197 00:25:40,120 --> 00:25:47,000 Of course, there are constraints, as they identified in this quote. 198 00:25:47,000 --> 00:26:00,870 So I would argue also, it's important to archive data because many of the issues we study change over time and most datasets can't be reconstructed. 199 00:26:00,870 --> 00:26:11,400 So where possible and consistent with confidentiality, data should be shared, and at the individual level to facilitate replication. 200 00:26:11,400 --> 00:26:18,920 So not just aggregate data. Great progress has been made with respect to open access journals, 201 00:26:18,920 --> 00:26:27,620 but the importance of integrating data with the associated publication has been paid too little attention. 202 00:26:27,620 --> 00:26:39,740 The concept of data stewardship is gaining recognition. I define this as the responsible use, collection and management of data in a participatory 203 00:26:39,740 --> 00:26:47,680 and rights-preserving way, informed by values and engaging with questions of fairness. 204 00:26:47,680 --> 00:26:55,510 Ways are being explored to allow people to gain increasing levels of control and agency over their data, 205 00:26:55,510 --> 00:27:01,060 from being informed about what's happening to data about themselves through to being empowered 206 00:27:01,060 --> 00:27:07,480 to take responsibility for exercising and actively managing decisions about data governance. 207 00:27:07,480 --> 00:27:20,340 The GDPR requires such consent to be specific, informed, unambiguous and given freely, requiring affirmative action by the user. 
208 00:27:20,340 --> 00:27:27,450 Now, those are fine words, but I'm conscious that there are difficulties in applying them in some circumstances. 209 00:27:27,450 --> 00:27:32,150 We as statisticians often rely on secondary sources. 210 00:27:32,150 --> 00:27:41,300 As combining data sources becomes more prevalent, record linkage in particular can pose privacy challenges. 211 00:27:41,300 --> 00:27:49,880 Similarly, obtaining informed consent, or any consent at all, from units for access to and 212 00:27:49,880 --> 00:27:56,180 linkage of their data from non-survey sources such as administrative data, 213 00:27:56,180 --> 00:28:01,940 which has been collected for an entirely different purpose, continues to be challenging. 214 00:28:01,940 --> 00:28:10,030 And of course, increasingly we are using administrative data in our studies. 215 00:28:10,030 --> 00:28:20,350 In cases where a statistician has been granted access to administrative or medical records or other research material for a new enquiry, 216 00:28:20,350 --> 00:28:29,230 the custodian's permission to use the records should not relieve the statistician from having to consider the likely reactions, 217 00:28:29,230 --> 00:28:35,740 sensitivities and the interests of the subjects concerned, including their entitlement. 218 00:28:35,740 --> 00:28:43,540 I don't think there are easy answers to these challenges, but I don't think we as statisticians discuss them enough. 219 00:28:43,540 --> 00:28:50,500 So as a society, we're seeking to chart a way between a data free-for-all where people feel powerless 220 00:28:50,500 --> 00:29:02,210 about how data is being used and a situation where opportunities for beneficial research are lost because data isn't shared. 221 00:29:02,210 --> 00:29:09,460 Data stewardship can build trust. 
222 00:29:09,460 --> 00:29:19,000 Despite the problems of using secondary data sources, I'd argue that we shouldn't fund from public finances new primary studies 223 00:29:19,000 --> 00:29:24,520 which are carried out in ignorance of what has previously been researched. 224 00:29:24,520 --> 00:29:30,620 When I was at the Data Archive, I worked with the ESRC to develop a data policy, 225 00:29:30,620 --> 00:29:38,830 and that data policy was that deliberate replication is to be applauded, but ignorant duplication 226 00:29:38,830 --> 00:29:49,900 wastes resources, adds to respondent burden and shouldn't be funded. 227 00:29:49,900 --> 00:29:59,260 Four: responsibilities to society, or statistics as a public good, which I've referred to a couple of times. 228 00:29:59,260 --> 00:30:04,900 The ISI urges statisticians to act in the public's interest. 229 00:30:04,900 --> 00:30:12,730 It says that their obligations to employers, clients and the profession can never override the public interest, 230 00:30:12,730 --> 00:30:23,120 and fellows should seek to avoid situations and not enter into undertakings which compromise this responsibility. 231 00:30:23,120 --> 00:30:30,600 So I want to focus on how the agenda for statistics is set with respect to the issues we address. 232 00:30:30,600 --> 00:30:37,710 It's vital that statisticians are free to collect information which might be uncomfortable to those in power. 233 00:30:37,710 --> 00:30:44,300 Statistics give visibility and they can hold a mirror up to our societies. 234 00:30:44,300 --> 00:30:54,340 We also need to acknowledge that those on the fringes of society are often unrepresented or underrepresented in all studies. 235 00:30:54,340 --> 00:31:06,420 I'm talking about people without permanent homes, I'm talking about people in institutions, I'm talking about the poor. 
236 00:31:06,420 --> 00:31:13,890 Those working in academia may have more freedom than those working in government to decide what to study, 237 00:31:13,890 --> 00:31:18,360 though we have seen an increase in directed research funding. 238 00:31:18,360 --> 00:31:24,940 And some of us worry that our funding bodies are not always politically independent. 239 00:31:24,940 --> 00:31:30,970 But working in government, I've been aware of some potential interference in what's collected, 240 00:31:30,970 --> 00:31:36,700 so I can tell you tales of when I was the secretary to the Census Committee and 241 00:31:36,700 --> 00:31:43,260 Margaret Thatcher tried to influence the questions being asked on the census. 242 00:31:43,260 --> 00:31:50,250 As every statistician knows, gathering the data is only half the story; the other half is getting it out to those who need it, 243 00:31:50,250 --> 00:31:57,450 and the story which accompanies the data is important but is more easily manipulated. 244 00:31:57,450 --> 00:32:01,730 So when I worked in the NHS, the Department of Health 245 00:32:01,730 --> 00:32:13,780 tried to stop me producing an interpretation of some statistics that showed a rise in caesarean births year on year. 246 00:32:13,780 --> 00:32:23,300 So you do get interference, and the freedom to collect information, the freedom to report on the data, is critical. 247 00:32:23,300 --> 00:32:30,170 Resistance to such interference has been very dependent upon the professional leadership, 248 00:32:30,170 --> 00:32:38,530 and that professional leadership needs to understand the nuances of the situation and needs sometimes to be brave. 249 00:32:38,530 --> 00:32:42,670 The Royal Statistical Society has also been crucial. 
250 00:32:42,670 --> 00:32:53,530 So it's argued for published dates and times when statistics will be released, to avoid the release of sensitive data being deliberately timed so that they'll get 251 00:32:53,530 --> 00:33:04,220 less coverage, and for a reduction in the number of people who have prior access to official data and the length of time they have access to it. 252 00:33:04,220 --> 00:33:12,970 And I've got a couple of rather old slides here showing you the sorts of things we wanted to avoid. 253 00:33:12,970 --> 00:33:17,920 So Gus O'Donnell, who says he was misquoted, 254 00:33:17,920 --> 00:33:26,320 so I have to add that, but he was quoted as having said that he wanted the Office for National Statistics to be boring, 255 00:33:26,320 --> 00:33:32,380 to put out the plain facts and nothing but the facts on clear and predictable deadlines. 256 00:33:32,380 --> 00:33:38,830 It would then be for politicians and government press officers to interpret the figures, 257 00:33:38,830 --> 00:33:47,650 he added. The RSS wrote to him the same day, and I know because I helped write that letter, 258 00:33:47,650 --> 00:33:52,540 saying it's clearly the task of statisticians to interpret the figures in a 259 00:33:52,540 --> 00:33:58,900 statistical context to facilitate understanding and avoid misunderstanding. 260 00:33:58,900 --> 00:34:08,200 The Code of Practice of the UK Statistics Authority explicitly states that official statistics, accompanied by full and frank commentary, 261 00:34:08,200 --> 00:34:13,750 should be readily accessible to all users and that all UK bodies that are responsible 262 00:34:13,750 --> 00:34:18,550 for official statistics should prepare and disseminate commentary and analysis 263 00:34:18,550 --> 00:34:30,530 that aid interpretation and provide factual information about the policy or operational context of official statistics.
264 00:34:30,530 --> 00:34:37,860 This is the issue that I was talking about, having the predictable deadlines, and some of you may remember this, 265 00:34:37,860 --> 00:34:52,330 but this is a statement that was made on 9/11, where a senior government adviser suggested it was a good day for burying bad news. 266 00:34:52,330 --> 00:34:58,390 So fortunately, we've been able to counter some of these issues, though not in all countries. 267 00:34:58,390 --> 00:35:05,260 I think I've got time just for a quick aside. When I was working in the U.N., 268 00:35:05,260 --> 00:35:14,650 I was at a meeting where the National Statistician of Canada talked about how no politicians, 269 00:35:14,650 --> 00:35:23,690 no government ministers, had access to statistics, except for the government minister with responsibility for those statistics. 270 00:35:23,690 --> 00:35:37,520 And he was given access, he or she was given access, one hour before, in a closed room without any access to technology to distribute information. 271 00:35:37,520 --> 00:35:47,890 So you didn't get his or her spin first. And the head of statistics for Russia was listening to this and then asked for the floor and said 272 00:35:47,890 --> 00:35:52,520 he had listened with respect to the distinguished head of statistics from Canada, 273 00:35:52,520 --> 00:36:01,010 but what he wanted to know was what happened when the minister wished to change the statistics. 274 00:36:01,010 --> 00:36:08,550 So that illustrates the differences between countries. When I was working as a statistician in the U.N., 275 00:36:08,550 --> 00:36:18,420 I was concerned about pressures on poorer countries to collect data that's been determined by wealthier countries, by donor agencies. 276 00:36:18,420 --> 00:36:21,800 What I would call data colonialism.
277 00:36:21,800 --> 00:36:32,480 There is, in my view, an overemphasis on cross-nationally comparable data at the expense sometimes of locally specific data, 278 00:36:32,480 --> 00:36:37,940 which might be used to develop relevant policies that can make a difference in people's lives. 279 00:36:37,940 --> 00:36:45,890 Instead, we have countries getting obsessed with targets and their positions in international league tables. 280 00:36:45,890 --> 00:36:55,400 That was the topic of my RSS presidential address, which I hesitate to say was 21 years ago. 281 00:36:55,400 --> 00:37:08,560 Scary. A thread running through the whole of my talk has been that of the importance of understanding the impact of incentives. 282 00:37:08,560 --> 00:37:13,150 I mentioned incentives, particularly in relation to official data, 283 00:37:13,150 --> 00:37:20,690 but we also need to understand the unintended consequences of the system of academic rewards. 284 00:37:20,690 --> 00:37:29,450 In response to the UK Science and Technology Select Committee's recent call for evidence on science reproducibility, 285 00:37:29,450 --> 00:37:36,420 many early career researchers expressed concern about how they were being measured. 286 00:37:36,420 --> 00:37:42,390 And their impressions were backed up by the Wellcome Trust survey of Research Culture. 287 00:37:42,390 --> 00:37:50,910 Forty-three percent of the respondents thought that metrics were valued over research quality. 288 00:37:50,910 --> 00:38:00,990 Nearly a quarter of early career researchers had felt pressurised by a supervisor to produce a particular result. 289 00:38:00,990 --> 00:38:09,690 And a common response was that the criteria for hiring and firing young researchers needed to be modified, 290 00:38:09,690 --> 00:38:15,420 respondents arguing that we must be careful when we use proxy measures of quality, 291 00:38:15,420 --> 00:38:20,610 such as number of publications or amount of grant funding.
292 00:38:20,610 --> 00:38:31,560 And instead, we should try to reward work that's conducted in a reproducible and rigorous fashion. 293 00:38:31,560 --> 00:38:31,950 Clearly, 294 00:38:31,950 --> 00:38:42,900 something's wrong in a system where so many young researchers feel that there's a mismatch between doing good science and having a successful career. 295 00:38:42,900 --> 00:38:53,640 Dorothy Bishop here in Oxford argues that we should trial giving researchers an automatic and flexible research budget, 296 00:38:53,640 --> 00:39:02,190 say £10,000, targeted at replication. No reviews, no university overheads. 297 00:39:02,190 --> 00:39:06,810 Just pre-registration of the hypotheses to be tested, 298 00:39:06,810 --> 00:39:18,270 as promoted by Ben Goldacre, and open science. Flexibility would allow researchers to pool their grants to 299 00:39:18,270 --> 00:39:23,910 undertake a larger piece of work in voluntary and uncontrived collaborations. 300 00:39:23,910 --> 00:39:31,250 Crucially, researchers would switch from asking "Is this fundable?" to "Is this true?" 301 00:39:31,250 --> 00:39:37,280 I think that's a rather wonderful note on which to end my talk. 302 00:39:37,280 --> 00:39:41,600 Do we measure what we treasure? A question for you. 303 00:39:41,600 --> 00:39:49,397 Thank you.