1 00:00:04,090 --> 00:00:13,570 So it gives me great pleasure to introduce Dai Jenkins and Andy Gittings. Dai is the head of research, 2 00:00:13,570 --> 00:00:18,610 computing and support services for Oxford's Advanced Research Computing Service. 3 00:00:18,610 --> 00:00:24,160 And Andy is the scientific research software advisor for ARC. 4 00:00:24,160 --> 00:00:34,000 And today they're going to tell us about the amazing supercomputing facilities we have at our fingertips here in Oxford and beyond. 5 00:00:34,000 --> 00:00:42,790 And I hope that in this we get an overview of what's possible and how to use it. 6 00:00:42,790 --> 00:00:46,630 So over to you. OK, thanks. 7 00:00:46,630 --> 00:00:55,210 And thanks for having us along today to talk to you. So what we'll try and strike a balance with in this presentation is, well, 8 00:00:55,210 --> 00:00:58,660 hopefully all of it is going to be useful in some kind of way. 9 00:00:58,660 --> 00:01:02,050 It's not meant to be a fully in-depth training course, 10 00:01:02,050 --> 00:01:07,990 but what it is meant to do is give you an overview of the facilities that we have available: 11 00:01:07,990 --> 00:01:15,040 why would you want to use high performance or advanced computing in general, not just through ARC, 12 00:01:15,040 --> 00:01:25,060 and also how you can access our services, looking at what support is available, and so on and so forth. 13 00:01:25,060 --> 00:01:29,350 So getting into the slides: just a quick overview, 14 00:01:29,350 --> 00:01:32,650 key facts about ARC. 15 00:01:32,650 --> 00:01:41,920 So ARC is the University of Oxford Advanced Research Computing team. We provide the central high performance computing resource in Oxford University, 16 00:01:41,920 --> 00:01:47,110 and we are hosted within IT Services.
17 00:01:47,110 --> 00:01:54,970 We are the central university resource, but there are significant other high performance computing resources based around the university. 18 00:01:54,970 --> 00:02:05,020 We are, however, the only one available to all four divisions that is free at the point of access to researchers. 19 00:02:05,020 --> 00:02:14,290 So that means that you can, as a member of the university, request a user account on ARC through a project. 20 00:02:14,290 --> 00:02:20,770 You can then start doing work on the ARC resource without having to find any income, 21 00:02:20,770 --> 00:02:24,910 any funding, in order to gain access to it. 22 00:02:24,910 --> 00:02:32,470 So in terms of numbers, the team is quite small: it's only four staff and myself. 23 00:02:32,470 --> 00:02:35,170 Two of the team 24 00:02:35,170 --> 00:02:40,720 are the systems administrators for the service, and they keep the systems fed and watered and up and running. 25 00:02:40,720 --> 00:02:49,720 And then there's Andy and one other colleague, who provide the applications and user support for the service, 26 00:02:49,720 --> 00:02:56,980 and Andy will be giving more of an overview of what's available there in his half of this talk. 27 00:02:56,980 --> 00:03:01,960 So in terms of the hardware we have available at the moment, we have two principal clusters. 28 00:03:01,960 --> 00:03:11,350 One of those is for high throughput applications, and the other one was billed as the capability cluster, 29 00:03:11,350 --> 00:03:17,200 which is used for what was deemed to be true high performance applications, 30 00:03:17,200 --> 00:03:23,980 which are closely coupled in terms of processor communications and the memory available, and so on and so forth.
31 00:03:23,980 --> 00:03:29,410 I'll go into that in a bit more detail. Those are connected to the high performance 32 00:03:29,410 --> 00:03:37,150 GPFS file system that we have connected to the clusters. In terms of the user base that we have: 33 00:03:37,150 --> 00:03:42,610 as I said earlier on, we are free to all four divisions within the university. 34 00:03:42,610 --> 00:03:47,740 But not all divisions use us as much as others: 35 00:03:47,740 --> 00:03:56,680 MPLS uses around 90 percent of the compute cycles and core hours that are provided by the service, 36 00:03:56,680 --> 00:04:06,760 then it's Social Sciences Division and Medical Sciences Division, and then Humanities, who use us very rarely. 37 00:04:06,760 --> 00:04:11,230 So there is usage across all divisions. In terms of user numbers, 38 00:04:11,230 --> 00:04:20,590 we have 2,500 registered users across the university, and that's increasing by around about 600 users per year. 39 00:04:20,590 --> 00:04:27,550 Out of all of those, around 400 are active and submitting jobs to the system at any one given moment in time. 40 00:04:27,550 --> 00:04:32,230 The service is quite busy: we're running around about 50,000 jobs per month and increasing, 41 00:04:32,230 --> 00:04:39,640 which means that our clusters and the resource available within them are around about 80 percent utilised at all times. 42 00:04:39,640 --> 00:04:45,580 And that's around about the maximum level that we like to run them at, in order that there's 43 00:04:45,580 --> 00:04:52,480 sufficient headroom to turn around jobs within the scheduler and have sufficient throughput on the system. 44 00:04:52,480 --> 00:04:57,430 So, just answering the question of why 80 percent and not 100 percent: it's so that the system 45 00:04:57,430 --> 00:05:04,400 doesn't become almost gridlocked.
46 00:05:04,400 --> 00:05:11,930 So what services do we offer? Well, first of all, there's the access to the cluster resources and also the research applications. 47 00:05:11,930 --> 00:05:18,470 We have a range of cluster resources: we have x86 nodes, GPU nodes and high memory nodes in the system. 48 00:05:18,470 --> 00:05:20,450 And Andy will talk more about this. 49 00:05:20,450 --> 00:05:27,680 But we have over 400 centrally installed applications on the system that we take care of and curate, available for you 50 00:05:27,680 --> 00:05:35,150 to use in order to do your research on the system. As well as providing the hardware and support for that, 51 00:05:35,150 --> 00:05:39,920 there's also user support and the training that goes along with it. 52 00:05:39,920 --> 00:05:47,100 We run face-to-face training as well, although in COVID times 53 00:05:47,100 --> 00:05:57,230 we do most of this online. But if you do have any ad hoc queries, then a ticket can be submitted through to support@arc.ox.ac.uk, 54 00:05:57,230 --> 00:06:02,930 and we will deal with anything that comes across on that by email, and if things do need to be followed up, 55 00:06:02,930 --> 00:06:12,440 we can then do a Teams call in order to help you out. There are also what we call premium services that we offer on the 56 00:06:12,440 --> 00:06:16,790 clusters, and these are in addition to the free resources I talked about earlier on. 57 00:06:16,790 --> 00:06:20,720 Examples of these are node reservations, 58 00:06:20,720 --> 00:06:29,150 so we can effectively book out a set of nodes for you to do a specific piece of work on, and you'll be given a reservation on that. 59 00:06:29,150 --> 00:06:32,960 And then that will be yours to use for a specific period of time. 60 00:06:32,960 --> 00:06:38,480 We do test the case for those quite rigorously, so that's not something we just give as a matter 61 00:06:38,480 --> 00:06:45,350 of course, because it then takes that reserved resource away from the wider user population. 62 00:06:45,350 --> 00:06:51,500 There's also priority time as well, which is a paid-for service, 63 00:06:51,500 --> 00:06:57,860 a paid-for product that we have, and that allows your jobs to be pushed 64 00:06:57,860 --> 00:07:03,620 forward in the queue ahead of other people who are just having the standard service. 65 00:07:03,620 --> 00:07:10,520 And the reasons why you would want to do that are if you are coming very close to a paper submission deadline, or, 66 00:07:10,520 --> 00:07:14,660 you know, it's coming close to your thesis being submitted, or anything like that; 67 00:07:14,660 --> 00:07:18,470 then you can purchase priority time and it will allow you to go ahead in the queue and 68 00:07:18,470 --> 00:07:25,700 get that work done ahead of others. And the final premium service that we use 69 00:07:25,700 --> 00:07:33,710 a lot is co-investment, and that's where a researcher will buy nodes. 70 00:07:33,710 --> 00:07:37,130 Those nodes in effect will be given over to the ARC team. 71 00:07:37,130 --> 00:07:44,900 We will install them in the system and we will feed, water and operate those for the researchers. 72 00:07:44,900 --> 00:07:51,680 The benefit that we can give back to ARC users is that when the researcher that's bought those nodes isn't using them, 73 00:07:51,680 --> 00:07:59,330 we can then do what's called backfilling of jobs onto those nodes, which benefits the wider user community 74 00:07:59,330 --> 00:08:03,230 as well as the researcher that bought the nodes in the first place. 75 00:08:03,230 --> 00:08:11,090 And a large number of our GPU nodes have been bought under that co-investment model.
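As an aside on the 400-plus centrally installed applications mentioned above: on clusters of this kind they are usually exposed through an environment-modules system. The commands below are a generic sketch of that workflow; the package name and version are hypothetical placeholders, not a listing of ARC's actual software tree, and `module spider` in particular assumes an Lmod-style installation.

```shell
# Sketch: discovering and activating centrally installed software
# on an HPC login node (package names here are hypothetical).
module avail                 # list everything the central team has installed
module spider gromacs        # search for a specific package (Lmod systems only)
module load gromacs/2021.3   # put one curated build onto your PATH
module list                  # confirm what is currently loaded
module purge                 # return to a clean environment
```

Loading a module this way, rather than installing software yourself, is what lets the support team curate and update those builds centrally.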
76 00:08:11,090 --> 00:08:17,660 What we can also do as well is provide access for external users of the system, and this is 77 00:08:17,660 --> 00:08:26,150 primarily, as it says here, open access for collaboration: for academic collaborators and commercial partners. 78 00:08:26,150 --> 00:08:35,310 We can arrange free access for academic collaborators to work with you and your teams on ARC, whereas for commercial partners 79 00:08:35,310 --> 00:08:42,590 we'd need to charge them for access to the system. 80 00:08:42,590 --> 00:08:46,700 And finally on this: access to ARC is incredibly easy, and it's free. 81 00:08:46,700 --> 00:08:55,130 In order to access ARC, the first thing you need to do is set up a project, and that can be done by a principal investigator or group lead. 82 00:08:55,130 --> 00:08:59,210 That's done via our web form, and the link for that is shown on the slide. 83 00:08:59,210 --> 00:09:08,120 Once that project has been created, you can then do an individual user account request, and then we can set you up with a user account, 84 00:09:08,120 --> 00:09:14,780 which is then linked to that project account. And then you can begin to use the ARC service. 85 00:09:14,780 --> 00:09:20,000 So it's a fairly easy, simple process to go through, and from start to finish 86 00:09:20,000 --> 00:09:29,190 this can probably be done in about two or three days, more or less, in normal circumstances. 87 00:09:29,190 --> 00:09:34,360 So that's, at a very high level, what we provide. 88 00:09:34,360 --> 00:09:38,940 But what is high performance computing, and why use high performance computing? 89 00:09:38,940 --> 00:09:41,760 So first of all, what is high performance computing? 90 00:09:41,760 --> 00:09:51,030 There's no single, all-encompassing definition of what high performance computing is or isn't.
91 00:09:51,030 --> 00:09:54,810 But for the purposes of framing this topic, 92 00:09:54,810 --> 00:10:02,380 it's really something that can't be performed on a desktop or workstation easily, or just takes so much time on that kind of resource 93 00:10:02,380 --> 00:10:10,820 that it becomes impractical, if not impossible, to do it in a useful manner. Or something may be amenable to being carried out 94 00:10:10,820 --> 00:10:14,160 on multiple processors in parallel. 95 00:10:14,160 --> 00:10:18,840 And that can either be multiple instances of the same kind of job, but each with a slight tweak to the problem; 96 00:10:18,840 --> 00:10:26,010 or it can be using multiple processors in parallel to solve a truly complex problem: a single job that 97 00:10:26,010 --> 00:10:36,600 requires access to multiple processors interchanging information with each other via MPI to come to a solution on that problem. 98 00:10:36,600 --> 00:10:42,150 But overall, I think it's agreed that HPC should enable a person to do a unit of work in less 99 00:10:42,150 --> 00:10:47,880 time, or more work in the same time, or to do something that is otherwise impossible. 100 00:10:47,880 --> 00:10:54,720 And the reason why these images have been chosen, on the left and right hand side of my slide, 101 00:10:54,720 --> 00:11:01,770 is to exemplify that this isn't really something that is new, or has only been done in the last five or 10 years. 102 00:11:01,770 --> 00:11:06,210 The picture at the bottom of the slide is one of the Bombes 103 00:11:06,210 --> 00:11:12,210 that was used to find out the rotor positions of the Enigma machine.
104 00:11:12,210 --> 00:11:20,130 And that would be a classic example of a high throughput computing type job, where you're working on 105 00:11:20,130 --> 00:11:24,250 lots of jobs that are not completely interconnected, lots of different jobs, 106 00:11:24,250 --> 00:11:31,530 but you have the time pressure that you have to figure out what those rotor positions are within a 24 107 00:11:31,530 --> 00:11:38,070 hour period so that the information can actually be of any use; because as soon as you step outside of that period, 108 00:11:38,070 --> 00:11:42,180 then of course all the work you've done previously is absolutely null and void. 109 00:11:42,180 --> 00:11:49,630 And the machine at the top, with Seymour Cray, the casually dressed guy standing next to the big unit, is a Cray, 110 00:11:49,630 --> 00:11:59,820 and that machine was primarily meant to do calculations where the time-to-completion requirement wasn't so great, 111 00:11:59,820 --> 00:12:05,100 but where the computation was so complicated that it would take a huge amount of time for 112 00:12:05,100 --> 00:12:11,940 somebody to do it basically on their own with a slide rule, or working in teams with slide rules. 113 00:12:11,940 --> 00:12:19,260 So high performance computing and high throughput computing are not necessarily very new concepts at all. 114 00:12:19,260 --> 00:12:22,030 Now, 115 00:12:22,030 --> 00:12:29,300 one thing in common with all high performance computing is that it's normally a large system somewhere in a data centre. 116 00:12:29,300 --> 00:12:38,570 And our data centre is at Begbroke Science Park, which is based just outside of Oxford. 117 00:12:38,570 --> 00:12:45,740 So what can research computing be used for? Generally, there are four types of research computing.
118 00:12:45,740 --> 00:12:51,740 There's compute intensive, and, surprise surprise, that's doing things 119 00:12:51,740 --> 00:12:58,910 requiring a large amount of compute with high performance inter-processor communication in order to actually do them. 120 00:12:58,910 --> 00:13:02,930 So this is what we would term traditional, high performance, 121 00:13:02,930 --> 00:13:08,510 heroic supercomputing, where you're doing modelling and simulation problems with things like fluid dynamics, 122 00:13:08,510 --> 00:13:12,440 climate modelling, molecular simulations, et cetera, et cetera. 123 00:13:12,440 --> 00:13:21,590 And where we see most of this kind of work coming onto our systems is through researchers in MPLS and Medical Sciences Division. 124 00:13:21,590 --> 00:13:26,630 Then we can move through to data intensive high performance computing. 125 00:13:26,630 --> 00:13:29,780 And this is, as it says on the tin, 126 00:13:29,780 --> 00:13:37,910 applications requiring or operating on large amounts of data, with quite fast, efficient ingress and egress of data. 127 00:13:37,910 --> 00:13:43,100 So in this case, it's not so much performance within the computational part of the node; 128 00:13:43,100 --> 00:13:47,670 it's performance there, but also in the data 129 00:13:47,670 --> 00:13:54,740 hierarchy, in order to be able to pull data into the compute and back out again. 130 00:13:54,740 --> 00:14:02,990 And the applications that we see in this area are basically bioinformatics, genomics and machine learning applications, 131 00:14:02,990 --> 00:14:11,570 which are using, manipulating and operating on increasingly large amounts of data in order to 132 00:14:11,570 --> 00:14:18,860 come to a solution. Then that brings us to high throughput computing.
133 00:14:18,860 --> 00:14:25,250 Although we said high performance computing is stuff that we can't really do on the desktop, 134 00:14:25,250 --> 00:14:27,530 this kind of work is stuff that you can do on a desktop, 135 00:14:27,530 --> 00:14:33,170 except that what you use high performance computing for is to harness the sheer amount of resource that's kept in a single 136 00:14:33,170 --> 00:14:41,720 place, to operate almost as a thousand, two thousand, four thousand laptops working on things like parameter sweep experiments, 137 00:14:41,720 --> 00:14:50,330 which you could do individually and in serial on a laptop, but it would take you so much time that it would be impractical to do so. 138 00:14:50,330 --> 00:14:55,340 So what you do is run those types of experiments on ARC many times 139 00:14:55,340 --> 00:15:00,770 in parallel, and that decreases your time to completion immeasurably. 140 00:15:00,770 --> 00:15:04,030 So that's high throughput computing, 141 00:15:04,030 --> 00:15:13,010 and it's used in a number of different application areas, and it's prevalent across all of the divisions within the university. 142 00:15:13,010 --> 00:15:21,710 And then we finally get to memory intensive computing, which in some cases is a bit of a tweak on high throughput computing. 143 00:15:21,710 --> 00:15:23,420 Applications are many and varied, 144 00:15:23,420 --> 00:15:32,630 and basically the requirement in these cases is mostly for all of your data to be in memory in order to be operated on. 145 00:15:32,630 --> 00:15:36,530 So either the input is incredibly large and has to sit in memory, or the outputs from what 146 00:15:36,530 --> 00:15:42,230 you are doing require a huge amount of memory to be pushed into. 147 00:15:42,230 --> 00:15:47,600 And there are some examples in economics.
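The parameter-sweep pattern described above maps naturally onto scheduler job arrays; since Slurm is the scheduler named later in the talk, here is a minimal sketch of that approach. The sweep payload, resource values and the index-to-parameter mapping are all illustrative assumptions, not ARC's documented settings.

```shell
#!/bin/bash
# Hedged sketch of a Slurm job array for a parameter sweep.
# Slurm runs one copy of this script per array index, each as its own task.
#SBATCH --job-name=sweep     # label shown in the queue
#SBATCH --array=0-99         # 100 independent tasks, indices 0..99
#SBATCH --ntasks=1           # each task is a single serial process
#SBATCH --time=00:30:00      # wall-clock limit per task

# Slurm sets SLURM_ARRAY_TASK_ID differently in each task;
# use it to derive the parameter value for this sweep point.
PARAM=$(( SLURM_ARRAY_TASK_ID * 2 ))   # hypothetical index -> parameter map
echo "task ${SLURM_ARRAY_TASK_ID}: running with parameter ${PARAM}"
# ./sweep_point --param "${PARAM}"     # hypothetical per-point workload
```

Submitted once with `sbatch sweep.sh`, the scheduler then queues all 100 tasks and backfills them onto free cores, which is exactly the "thousands of laptops" effect described above.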
But overall, the 148 00:15:47,600 --> 00:15:53,360 take home message from this is that ARC provides compute resource across all of 149 00:15:53,360 --> 00:16:04,560 these four areas, and provides a general high performance computing service to the university. 150 00:16:04,560 --> 00:16:12,780 So here's one of the extreme examples that we sometimes give to illustrate what's required. 151 00:16:12,780 --> 00:16:17,970 A desktop PC can provide tens of gigaflops. 152 00:16:17,970 --> 00:16:23,500 That is useful for most day-to-day applications, but in some cases that's just not going to cut it. 153 00:16:23,500 --> 00:16:29,970 An extreme example is things like short range weather forecasting, where "short" is in terms of time 154 00:16:29,970 --> 00:16:38,310 rather than in terms of distance; distance brings different complications. So we have to turn around predictions for the next day, 155 00:16:38,310 --> 00:16:43,320 predictions that you want to run today in order for them to be available for tomorrow. 156 00:16:43,320 --> 00:16:52,620 And for that, the Met Office requires a compute system of circa one petaflop in order to actually do 157 00:16:52,620 --> 00:17:02,670 that and be able to turn around those calculations on a useful timescale, so they're available for the next working day. 158 00:17:02,670 --> 00:17:06,690 So that's an example of where you require extreme compute in order 159 00:17:06,690 --> 00:17:12,030 to do something which we all rather take for granted nowadays. 160 00:17:12,030 --> 00:17:22,680 Just to give an illustration of the size of these computers: the fastest supercomputer in November 2021, as shown on the 161 00:17:22,680 --> 00:17:31,710 Top 500 list, is around about 442 petaflops, which is massive and staggering.
162 00:17:31,710 --> 00:17:36,810 And when you consider that when I first started in high performance computing, 163 00:17:36,810 --> 00:17:42,190 which was back in around about 2007, we had the national service, 164 00:17:42,190 --> 00:17:47,430 HECToR, which then was, I think, 60 teraflops in size. 165 00:17:47,430 --> 00:17:54,150 And so, rolling on around 13, 14, 15 years, we have this size of computer. 166 00:17:54,150 --> 00:18:02,520 HECToR in 2006-7 was in the top 10. This system is the fastest in the top 10, and it is now, on rough calculations, 167 00:18:02,520 --> 00:18:10,470 probably around 7,000 times faster than the HECToR system was back then. 168 00:18:10,470 --> 00:18:17,820 So that's how much things have come on in leaps and bounds, because the applications are driving us in this direction. 169 00:18:17,820 --> 00:18:22,630 So it gives you food for thought. Right. 170 00:18:22,630 --> 00:18:31,230 So, on to the hardware. But before I go on, does anybody have any immediate questions? 171 00:18:31,230 --> 00:18:39,620 OK, so I'll take a quick canter through the actual hardware resources that we have 172 00:18:39,620 --> 00:18:46,550 available, before turning over to Andy, who will talk more about the software elements of the service. 173 00:18:46,550 --> 00:18:49,970 We operate two clusters at ARC. 174 00:18:49,970 --> 00:19:00,050 One is the high throughput cluster, which, as I'll discuss in more detail, is in some senses just a bunch of nodes. 175 00:19:00,050 --> 00:19:07,420 And then we have the ARC cluster, which is our capability cluster, and I'm going to explain more about what these are 176 00:19:07,420 --> 00:19:15,050 in a minute. The way that most users interact with and submit jobs to the system is 177 00:19:15,050 --> 00:19:19,370 through the Slurm scheduler: you submit a job to Slurm and it then assesses it.
178 00:19:19,370 --> 00:19:28,700 And then when the resource becomes available, should other factors be equal, it will launch the job onto the system and it will run. 179 00:19:28,700 --> 00:19:34,410 All of our clusters are based out at the Begbroke Science Park data centre, 180 00:19:34,410 --> 00:19:45,090 just outside Oxford, and these are then connected back into the university's core Odin network. 181 00:19:45,090 --> 00:19:51,990 So taking each cluster in turn: the high throughput cluster. This cluster is purposely set up to prefer small jobs, 182 00:19:51,990 --> 00:19:57,090 which in this case are less than one node in size. 183 00:19:57,090 --> 00:20:03,810 And this is to make sure that only small, high throughput jobs actually get through to the system, and that the system is not going to 184 00:20:03,810 --> 00:20:09,930 get swamped by jobs that are larger than that, which should be targeted at the ARC system. 185 00:20:09,930 --> 00:20:13,830 As well as being a system in its own right, 186 00:20:13,830 --> 00:20:24,060 it also provides a hosting infrastructure for a mix of CPU and GPU nodes that were bought under co-investment. Within this system 187 00:20:24,060 --> 00:20:31,920 we have two high memory nodes available to users; each of those has three terabytes of memory. 188 00:20:31,920 --> 00:20:38,970 We have a whole slew of GPU nodes in that system as well, 189 00:20:38,970 --> 00:20:41,670 and you can get full details on those at the link, 190 00:20:41,670 --> 00:20:50,400 but they range up to the very high performance servers we have in the system, such as the DGX Max-Q, 191 00:20:50,400 --> 00:20:57,960 which has eight NVIDIA V100 GPUs, all connected via a very high bandwidth NVLink interconnect, 192 00:20:57,960 --> 00:21:00,660 and then that has two Intel processors within it.
193 00:21:00,660 --> 00:21:07,050 So this is basically a server that's completely dedicated to high end machine learning, and in some cases biosimulation kind 194 00:21:07,050 --> 00:21:15,180 of research that you can do on that system; and the GPUs on that can do double precision workloads as well. 195 00:21:15,180 --> 00:21:18,390 So we have a number of those in the system. 196 00:21:18,390 --> 00:21:24,900 But what we also have is a large number of more prosumer or consumer type cards, 197 00:21:24,900 --> 00:21:31,950 almost like gaming cards, within the system, with varying amounts of graphics memory that go along with those as well. 198 00:21:31,950 --> 00:21:39,030 And those are more geared towards straightforward machine learning applications. 199 00:21:39,030 --> 00:21:41,100 On top of that, 200 00:21:41,100 --> 00:21:48,060 you also have another 40 just standard CPU nodes within that system for the high throughput workloads, 201 00:21:48,060 --> 00:21:53,610 and these servers are exactly the same specification as those in the ARC capability machine, 202 00:21:53,610 --> 00:22:03,230 but they just do not have the InfiniBand interconnect linking each of the nodes together. 203 00:22:03,230 --> 00:22:11,000 With regard to the co-investment nodes that are put into the system: as I said earlier on, researchers buy them, we feed and water them. 204 00:22:11,000 --> 00:22:15,890 But when the researcher is not using them, they're available to the wider university. 205 00:22:15,890 --> 00:22:19,100 What that means is that jobs can backfill onto those nodes, 206 00:22:19,100 --> 00:22:27,740 but the maximum length of the jobs that can backfill onto those nodes is 12 hours. Full details of those co-investment nodes 207 00:22:27,740 --> 00:22:36,390 and their specifications are also available via the link on this slide.
208 00:22:36,390 --> 00:22:39,370 Now moving on to the ARC cluster, which as I said is the capability cluster: 209 00:22:39,370 --> 00:22:46,470 this is dedicated, surprise surprise, to jobs that are very much larger than one node in size. 210 00:22:46,470 --> 00:22:53,550 And so when they go into the scheduler, the size of the job is going to be a prioritising 211 00:22:53,550 --> 00:23:01,830 factor as to where it goes when it goes into the system. In the ARC cluster 212 00:23:01,830 --> 00:23:06,930 we have 258 compute nodes. 213 00:23:06,930 --> 00:23:16,020 Those compute nodes are all arranged into seven "islands", as they're called, within the system. Within each island 214 00:23:16,020 --> 00:23:21,720 there are around 40 to 44 nodes, 215 00:23:21,720 --> 00:23:28,770 and they're all connected by this InfiniBand interconnect. Within each island 216 00:23:28,770 --> 00:23:37,710 there is one-to-one, non-blocking communication, and that provides the highest performance 217 00:23:37,710 --> 00:23:42,510 communication that we can provide between all nodes within the system, 218 00:23:42,510 --> 00:23:48,450 and the performance of that interconnect would be comparable to what you would see on the national services, 219 00:23:48,450 --> 00:23:58,740 things like ARCHER. You can run jobs between these islands, but there is a slightly higher communications overhead on that, 220 00:23:58,740 --> 00:24:03,190 where it goes from being non-blocking to being a three-to-one contention ratio 221 00:24:03,190 --> 00:24:09,570 in the network connectivity between islands. With contention at three to one, and 222 00:24:09,570 --> 00:24:13,590 the higher those ratios go, the lower the performance you are going to get
223 00:24:13,590 --> 00:24:21,510 from that particular interconnect. The operating system on this is CentOS, 224 00:24:21,510 --> 00:24:31,200 CentOS 8. But we also run some of the other nodes within that system in a legacy configuration, 225 00:24:31,200 --> 00:24:38,970 which looks a bit like Arcus-B; that runs CentOS 7.7, and it's specifically set up for legacy applications that cannot 226 00:24:38,970 --> 00:24:46,980 run on CentOS 8. And as I said before, the scheduler that we have on the system is Slurm. 227 00:24:46,980 --> 00:24:52,290 This system is, in the grand scheme of things, a modest twelve thousand three hundred cores. 228 00:24:52,290 --> 00:24:59,850 But in the Oxford sense, this system provides a significant improvement over what we had previously with Arcus-B and the old HTC system, 229 00:24:59,850 --> 00:25:05,190 which collectively had, I think, probably about five thousand eight hundred cores between them. 230 00:25:05,190 --> 00:25:13,410 So we've more than doubled the capacity of this system over our previous arrangements. 231 00:25:13,410 --> 00:25:22,260 Full details on the system are again given on the ARC website, if you want to look at things in more detail. 232 00:25:22,260 --> 00:25:28,710 So that's the compute side of things; then we have the storage that goes behind all of this. 233 00:25:28,710 --> 00:25:39,060 We operate a very high performance parallel file system, and we have two petabytes of that available to us. 234 00:25:39,060 --> 00:25:44,700 I don't think there's really much more to say on that. There aren't any restrictions on use of that, apart from quotas. 235 00:25:44,700 --> 00:25:48,930 And that's pretty much all there is to say on that one. 236 00:25:48,930 --> 00:25:56,630 I won't go into full details on the performance of that here.
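Pulling the scheduler details above together, a minimal Slurm batch job on a system like this might look as follows. This is a sketch only: the resource values and the program being launched are placeholders, not ARC's documented defaults, and real jobs would add site-specific options from the ARC documentation.

```shell
#!/bin/bash
# Hedged sketch of a minimal Slurm batch script.
#SBATCH --job-name=hello          # label shown in squeue output
#SBATCH --nodes=1                 # single-node job (HTC-style, under one node)
#SBATCH --ntasks-per-node=4       # four processes on that node
#SBATCH --time=01:00:00           # wall-clock limit; the job is stopped after this

echo "running on $(hostname)"     # record which compute node Slurm chose
# srun ./my_program               # srun would launch the parallel tasks here
```

You would submit this with `sbatch job.sh`; `squeue -u $USER` then shows it pending until, as described above, the resource becomes available and Slurm launches it.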
237 00:25:56,630 --> 00:26:05,900 So those last three slides are what we have as a team, based in the Begbroke Science Park data centre. 238 00:26:05,900 --> 00:26:17,660 But what we also have access to as well is the JADE service, and JADE stands for the Joint Academic Data science Endeavour. 239 00:26:17,660 --> 00:26:21,500 And this is a system that was funded by EPSRC, 240 00:26:21,500 --> 00:26:33,800 led by Oxford: a four or five million pound system that is based entirely on the NVIDIA DGX Max-Q product. 241 00:26:33,800 --> 00:26:37,190 So it's essentially owned by the University of Oxford; 242 00:26:37,190 --> 00:26:48,040 we operate it on behalf of researchers across the UK, on behalf of EPSRC, who provided the money. 243 00:26:48,040 --> 00:26:57,860 We bought it, and the system is hosted at the Hartree Centre in the north west of England. 244 00:26:57,860 --> 00:27:03,740 The system itself is based on 63 DGX Max-Q boxes. 245 00:27:03,740 --> 00:27:10,040 It's very similar to the previous JADE service, except that there's just a larger amount of it. 246 00:27:10,040 --> 00:27:17,240 The only other real difference between JADE 1 and 2, apart from the increased capacity of the system, 247 00:27:17,240 --> 00:27:22,610 is the fact that it's connected to a much more high performance file 248 00:27:22,610 --> 00:27:29,090 system on that particular iteration of the cluster. 249 00:27:29,090 --> 00:27:38,670 And what this system is mainly geared to providing is just a higher level of resource, and specifically the ability to do, 250 00:27:38,670 --> 00:27:47,840 in multiples of one box, multi-GPU work, which most universities don't have the resources to be able to do. 251 00:27:47,840 --> 00:27:52,730 We are quite fortunate at Oxford in that we have around about five of these
252 00:27:52,730 --> 00:28:01,340 DGX MAX-Q systems ourselves, but there are a number of universities that do not have access to this kind of kit, 253 00:28:01,340 --> 00:28:08,690 and that's what JADE, and JADE 1 before it, is meant to be: that facility. 254 00:28:08,690 --> 00:28:26,270 So that's all I have to say on those elements. I'm happy to take any questions, but if there aren't any, then I can hand over to Andy. 255 00:28:26,270 --> 00:28:35,930 OK. Sorry, just a quick question, Dai. You mentioned there are different types of nodes on ARC, including GPUs. 256 00:28:35,930 --> 00:28:38,810 Mm hmm. Maybe Andy is going to say a little bit about this, 257 00:28:38,810 --> 00:28:49,160 but if one wanted to run an ARC job on a particular class of node, how would one go about doing that? 258 00:28:49,160 --> 00:28:55,340 You can specify it in Slurm. Andy will cover it a little bit in his talk, but not in too much detail. 259 00:28:55,340 --> 00:28:59,510 But yeah. OK, thanks. Yeah. OK. 260 00:28:59,510 --> 00:29:03,560 I will carry on then. Thanks. Thanks, Dai. 261 00:29:03,560 --> 00:29:07,670 So, yeah, as mentioned, I'm Andy Gittings, of the applications group within the ARC team. 262 00:29:07,670 --> 00:29:18,950 I'm going to cover some of the usual application support type things that ARC do specifically. 263 00:29:18,950 --> 00:29:25,460 We'll look at things like the training courses, user documentation, and what we do with software applications. 264 00:29:25,460 --> 00:29:30,950 And obviously, as I've already mentioned, 265 00:29:30,950 --> 00:29:39,410 one of the things that we do quite a lot of is providing general user assistance via our email address, which is support@arc.ox.ac.uk. 266 00:29:39,410 --> 00:29:48,170 We do often repeat that, so that people know they can send us any kind of support request, and that goes into our general ticketing system.
267 00:29:48,170 --> 00:29:59,300 And usually we're very responsive on that. One member of the ARC team is usually dedicated to answering tickets each day of the week, 268 00:29:59,300 --> 00:30:07,670 but there's only four of us, so we have to double up occasionally. 269 00:30:07,670 --> 00:30:17,390 Let me move on. So, training and engagement: the kinds of things we do regularly are the training courses, 270 00:30:17,390 --> 00:30:21,480 the main one being the Introduction to ARC course, which is for new users. 271 00:30:21,480 --> 00:30:26,890 And if anybody is particularly interested in actually getting an ARC account, 272 00:30:26,890 --> 00:30:37,390 feel free to go onto the training course website, which is shown there — www.arc.ox.ac.uk/training — and book a place. 273 00:30:37,390 --> 00:30:48,010 It's presented twice a month and gives a really good overview of how to use the system and what capabilities there are. 274 00:30:48,010 --> 00:30:52,630 There's also another course that we do called Effective Use of Clusters for Non-programmers, 275 00:30:52,630 --> 00:31:00,490 and this is for users that have maybe been on the Introduction to ARC course, 276 00:31:00,490 --> 00:31:05,080 have a couple of apps they use, and want to get a little bit more out of the system. 277 00:31:05,080 --> 00:31:11,080 And the reason we say non-programmers is because it could be something where a user has been given 278 00:31:11,080 --> 00:31:16,690 some code from GitHub, or just found something, and they want to build it and make it work on ARC, 279 00:31:16,690 --> 00:31:18,340 and this course will give some pointers. 280 00:31:18,340 --> 00:31:26,360 And it also works for people who are just running commercial applications, or any other applications that are pre-installed on the system. 281 00:31:26,360 --> 00:31:35,920 And we try and run that around once per term, and I think we're due to have one later this term.
282 00:31:35,920 --> 00:31:47,020 The other thing that's quite important is that all of the ARC systems run Linux — as we stated, they run CentOS 8 — 283 00:31:47,020 --> 00:31:54,550 and that can come as a little bit of a shock to some users who are more used to a Windows environment. 284 00:31:54,550 --> 00:32:02,500 So we do have a link on this training page to quite a nice, web-based Linux training course. It's not ours, 285 00:32:02,500 --> 00:32:10,420 but it's quite a nice one to get a good grounding in Linux. 286 00:32:10,420 --> 00:32:19,390 Something else we do is our drop-in sessions, and we tend to schedule these around the day after an Introduction to ARC course. 287 00:32:19,390 --> 00:32:24,100 And this is just some time where a couple of the team members will be available on Teams for 288 00:32:24,100 --> 00:32:31,270 people just to drop in and talk to us about any issues that they may have on the system. 289 00:32:31,270 --> 00:32:41,980 And it tends to be primarily for questions that don't really sit very well within a support@arc.ox.ac.uk type email request. 290 00:32:41,980 --> 00:32:49,690 Maybe they need some help going through building some software and they've got a bit stuck, and it's a bit more interactive. 291 00:32:49,690 --> 00:32:54,460 So, you know, that's something where we have these set times when people can join us. 292 00:32:54,460 --> 00:33:00,250 Or of course, if you do raise a support request, we can of course just do those kinds of things ad hoc, 293 00:33:00,250 --> 00:33:05,820 as I said previously. And the other kinds of engagement that we do — 294 00:33:05,820 --> 00:33:17,140 I've just lost that page — are things like attending student welcome events and running any kind of presentation like this on request.
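On that Linux grounding the linked course provides: for anyone coming from Windows, the day-to-day work on a cluster is done with small shell commands like these. This is just an illustrative sketch of the kind of thing such a course covers, not material from the course itself.

```shell
# A few everyday Linux commands a new cluster user relies on.
mkdir -p results                          # make a directory for job output
echo "hello cluster" > results/out.txt    # write text into a file
cat results/out.txt                       # show the file's contents
grep -c cluster results/out.txt           # count matching lines: prints 1
```

Commands like these (plus `cd`, `ls`, and a text editor) are the working vocabulary assumed by the rest of the ARC documentation.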
295 00:33:17,140 --> 00:33:28,480 So we have the user documentation. Currently it's all hosted on www.arc.ox.ac.uk, and that's quite a nice site, 296 00:33:28,480 --> 00:33:33,700 but we're gradually starting to break that out and review what we have there in terms of 297 00:33:33,700 --> 00:33:41,110 the ARC user guide and software guide, and migrating them to Read the Docs type pages, 298 00:33:41,110 --> 00:33:50,320 hosted on GitHub. And that'll make it a little bit easier for users to take the information, export it as a PDF and turn it into a hard copy, 299 00:33:50,320 --> 00:33:56,080 as some people like to read those types of documents in that way. 300 00:33:56,080 --> 00:34:01,210 The documentation has, as I think I mentioned, 301 00:34:01,210 --> 00:34:12,490 all the links for getting a project and user accounts, and it also covers the sections on priority credits 302 00:34:12,490 --> 00:34:24,330 and the service level agreements for things like quality of service for different types of users. 303 00:34:24,330 --> 00:34:33,450 So we have quite a varied audience for the jobs that run on the ARC systems, as you can probably see. 304 00:34:33,450 --> 00:34:43,860 So, you know, we're not experts in any one of these areas, but there's quite a wide range of both applications on the system and experience of users. 305 00:34:43,860 --> 00:34:51,840 But there are lots and lots of common issues that they face in these types of HPC environments, on ARC for example. 306 00:34:51,840 --> 00:34:57,990 So yeah, there's lots of challenges there, and they tend to be the same types of ones. 307 00:34:57,990 --> 00:35:05,910 So, you know, it is mainly issues users have moving from their workstation to a system where, you know, 308 00:35:05,910 --> 00:35:10,170 they're not the only person running on it. They have to queue things up. 309 00:35:10,170 --> 00:35:16,650 They have to write batch scripts.
And so these are the kinds of things that we can help with. 310 00:35:16,650 --> 00:35:24,360 The software we've got available — the catalogue is quite a nice list of what we have. 311 00:35:24,360 --> 00:35:33,180 You might recognise some of the applications, for example R, Anaconda, those types of things. For example, 312 00:35:33,180 --> 00:35:42,240 with R we have about 1,100 libraries available in our R installation, and 313 00:35:42,240 --> 00:35:45,690 we make it quite easy so that you can actually install your own locally as well. 314 00:35:45,690 --> 00:35:52,170 So if you're particularly using R, that's useful. 315 00:35:52,170 --> 00:36:06,690 Anaconda is a nice way of packaging Python code, and you can use virtual environments or conda environments to actually package that up, 316 00:36:06,690 --> 00:36:10,740 and you can run a virtual environment in your own areas. 317 00:36:10,740 --> 00:36:15,210 And so for those types of things, we don't have to worry about installing software for you. 318 00:36:15,210 --> 00:36:28,980 We give you advice, and there's a nice walkthrough example of how to create conda environments on ARC. GCC and Intel 319 00:36:28,980 --> 00:36:33,700 are on here because they're two of our main compiler toolchains that we use: 320 00:36:33,700 --> 00:36:40,410 if you've got code you want to compile for yourself, you can use either of those two things, GCC or Intel. 321 00:36:40,410 --> 00:36:57,810 And those will be paired with things like the MKL type maths libraries for Intel, or the OpenBLAS plus LAPACK types of maths libraries for GCC. 322 00:36:57,810 --> 00:37:07,680 So this is just a nice slide showing the types of applications with their domains. 323 00:37:07,680 --> 00:37:15,420 So, you know, many people probably won't be looking at some of the chemistry or genomics type stuff.
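Going back to the conda environments mentioned a moment ago, the workflow looks roughly like the sketch below. The module, environment and package names here are illustrative, not necessarily ARC's exact setup; ARC's own walkthrough has the precise steps for their systems.

```shell
# Create and use a personal conda environment in your own area.
module load Anaconda3                        # make conda available (name may differ)
conda create -y -n myenv python=3.10 numpy   # build a personal environment
source activate myenv                        # switch into it
python -c "import numpy"                     # its packages are now on the path
```

Because the environment lives in the user's own area, nothing needs to be installed centrally, which is exactly the advice described above.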
324 00:37:15,420 --> 00:37:21,660 But, you know, there's quite a large number of applications on the system. 325 00:37:21,660 --> 00:37:26,820 As I say, there's over 500 now. I think we had 400 earlier on, but it's now over 500 326 00:37:26,820 --> 00:37:32,910 application modules available that are specifically for applications; 327 00:37:32,910 --> 00:37:40,640 there's over a thousand in total with all the supporting dependencies that these applications use. 328 00:37:40,640 --> 00:37:51,110 So when it comes to software support, we tend to be involved in looking at the applications that somebody wants to use, 329 00:37:51,110 --> 00:37:57,890 working out whether we can actually install them, and updating them on request. 330 00:37:57,890 --> 00:38:08,030 We can also provide software build assistance to users, and that tends to be, a little bit like I said earlier, 331 00:38:08,030 --> 00:38:13,220 where people have found software on the internet — they've found a GitHub repository — 332 00:38:13,220 --> 00:38:18,440 they're not very confident in building it themselves, and they'll just say to us, could you install this for us? 333 00:38:18,440 --> 00:38:21,540 And if we think it's something that a few users might 334 00:38:21,540 --> 00:38:26,790 find useful, then we'll install it centrally; otherwise we can help them install it in their own areas. 335 00:38:26,790 --> 00:38:35,310 And on average, we get about four or five application requests a week. Some of them are quite easy; 336 00:38:35,310 --> 00:38:40,600 we can just deal with them in a matter of hours. 337 00:38:40,600 --> 00:38:42,030 You know, it's quite easy to put them on. 338 00:38:42,030 --> 00:38:52,710 Some of them have been somewhat more problematic, and it's taken weeks to get a piece of software working as we'd want it to work on the system.
339 00:38:52,710 --> 00:38:58,800 One of the differences between a system like this and your own workstation is 340 00:38:58,800 --> 00:39:10,980 that we use a feature called Environment Modules to manage the software applications on the system. 341 00:39:10,980 --> 00:39:15,630 So you do a module load and then the name of the application that you want, 342 00:39:15,630 --> 00:39:23,710 and that sets the entire environment up so that you can actually run that piece of software. 343 00:39:23,710 --> 00:39:32,440 I'll give you a quick example, I think, here. So in this example, someone's trying to run R, and it's just not there. 344 00:39:32,440 --> 00:39:38,710 But if you then load the module for R and run R again, you can actually see that it's now on the system: 345 00:39:38,710 --> 00:39:43,270 you can run something and it works fine. And then you can unload the module, and 346 00:39:43,270 --> 00:39:45,400 it just completely disappears. 347 00:39:45,400 --> 00:39:54,400 So that's quite a nice way of managing your environment which, if you had to do it manually, would be quite cumbersome, 348 00:39:54,400 --> 00:39:58,780 especially with the number of modules that we have, and the fact that they can actually 349 00:39:58,780 --> 00:40:04,270 interfere with each other, because they've all been built with different libraries, different compilers and other dependencies. 350 00:40:04,270 --> 00:40:12,210 The fact that the module system allows us to make that consistent is very nice. 351 00:40:12,210 --> 00:40:20,160 When it comes to central software installation, we make our own lives a little bit easier by using a framework called EasyBuild, 352 00:40:20,160 --> 00:40:28,200 and that allows us to use known recipes for building, for the most part, open source software. 353 00:40:28,200 --> 00:40:34,950 It also allows us to install some commercial codes as well, but it's a little bit different for that.
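The R example just described looks something like this as a terminal session. It is an illustrative sketch — the exact module name and messages on ARC may differ:

```shell
# Environment Modules at work: software appears and disappears
# as modules are loaded and unloaded.
$ R --version
-bash: R: command not found    # R is not on the default path
$ module load R                # set up the whole environment for R
$ R --version                  # now it runs
$ module unload R              # unload the module again
$ R --version
-bash: R: command not found    # and it has completely disappeared
```

`module avail` lists what can be loaded, which is how one browses the several hundred application modules mentioned above.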
354 00:40:34,950 --> 00:40:38,610 But what this means is that 355 00:40:38,610 --> 00:40:45,570 it has a predefined set of known compiler and library toolchains. 356 00:40:45,570 --> 00:40:49,980 These are updated twice a year — an A and a B version each year — 357 00:40:49,980 --> 00:41:03,390 so that it has the latest and greatest versions of the GCC, Intel, PGI or NVIDIA HPC SDK type toolchains, and it makes those available. 358 00:41:03,390 --> 00:41:11,100 And then there are other recipes built on those, in order to allow an application to be installed. 359 00:41:11,100 --> 00:41:20,700 And this really helps with reproducibility, because a lot of other academic institutions, in Europe particularly, use EasyBuild. 360 00:41:20,700 --> 00:41:32,370 And so it gives you a known environment where, you know, if you're going to load that particular version of a module, it will work. 361 00:41:32,370 --> 00:41:37,700 Or at least it should. It does also mean that when you build with these, 362 00:41:37,700 --> 00:41:47,030 you've got a basic level of assurance that the application will function, because there are inbuilt tests at the end of the build. 363 00:41:47,030 --> 00:41:52,730 However, we do have a minority of applications that we have to install manually. 364 00:41:52,730 --> 00:42:01,490 These tend to be the restricted-licence applications — things like MATLAB and other types of codes. 365 00:42:01,490 --> 00:42:05,870 Also, there are codes that are licensed as source. 366 00:42:05,870 --> 00:42:12,950 So VASP is one that comes to mind as a source-licensed code, as are some others. 367 00:42:12,950 --> 00:42:19,610 And so they have to be restricted in their access. 368 00:42:19,610 --> 00:42:24,530 Commercially licensed codes are also handled differently.
369 00:42:24,530 --> 00:42:36,260 So you'll find that the module files will also point to a particular licence server, because ARC don't run, or rather don't own, very many licences. 370 00:42:36,260 --> 00:42:42,320 We tend to licence the Intel compilers and a couple of other things, 371 00:42:42,320 --> 00:42:47,630 but everything else is usually owned by another department or group, 372 00:42:47,630 --> 00:42:56,670 and so that needs to be protected so that the wrong people aren't using the wrong code. 373 00:42:56,670 --> 00:43:05,820 When it comes to assisting users who want to build their own code, it tends to be limited to customisation for ARC itself. 374 00:43:05,820 --> 00:43:11,310 We really can't — we don't have the bandwidth, given the number of staff that we have, to provide a 375 00:43:11,310 --> 00:43:17,550 large amount of RSE-type effort to get into code and help people parallelise code. 376 00:43:17,550 --> 00:43:23,040 We can give ideas and a little bit of help, but it's not something that we can easily get into. 377 00:43:23,040 --> 00:43:33,380 It does take a lot of time. One thing that we do find ourselves doing is actually optimising commercial 378 00:43:33,380 --> 00:43:38,540 codes, because there are a number of codes that run on the system. In this case it was CFX, 379 00:43:38,540 --> 00:43:45,530 which is a CFD code, and we found it was running quite poorly on our new system. 380 00:43:45,530 --> 00:43:51,770 And by changing the MPI stack — 381 00:43:51,770 --> 00:43:56,930 the message passing interface that the system uses to communicate between 382 00:43:56,930 --> 00:44:04,460 nodes — from the one that ANSYS supplied to one built locally on our system, 383 00:44:04,460 --> 00:44:08,840 we were able to dramatically improve its performance. 384 00:44:08,840 --> 00:44:14,570 The ideal being nice linear scaling, and we've got it pretty close.
385 00:44:14,570 --> 00:44:18,410 Previously, it was all over the place; it was horrible. Have we got the graph? Unfortunately not, 386 00:44:18,410 --> 00:44:30,700 but it was very poor. Something else that's become very popular recently is the concept of software containers, 387 00:44:30,700 --> 00:44:37,060 and people have heard quite a lot about Docker and Docker images. 388 00:44:37,060 --> 00:44:42,670 We don't support Docker on our system natively, 389 00:44:42,670 --> 00:44:54,760 due to some of the security issues with it. But we can convert Docker images into Singularity containers — we run Singularity on ARC. 390 00:44:54,760 --> 00:45:03,640 That's just a very nice way of being able to package up an application from your own workstation and know that it will actually run on the system, 391 00:45:03,640 --> 00:45:08,260 albeit you probably wouldn't be able to get a parallel run that way, 392 00:45:08,260 --> 00:45:17,830 but it certainly would be great for running multiple instances and doing some kind of Monte Carlo type simulation. 393 00:45:17,830 --> 00:45:28,240 And also, I should point out on that note that Singularity has been renamed to Apptainer, due to it joining the Linux Foundation, 394 00:45:28,240 --> 00:45:40,510 as of November. Dai touched on the fact that we use Slurm — the Simple Linux Utility for Resource Management — 395 00:45:40,510 --> 00:45:47,410 and that's how a user specifies what resources they require for their job. 396 00:45:47,410 --> 00:45:57,130 So you will need to craft a little script to run the application that you want to use, 397 00:45:57,130 --> 00:46:02,770 and in that you have to specify a number of resources that the job requires. 398 00:46:02,770 --> 00:46:07,090 That could be the number of CPUs required, or a large amount of memory.
399 00:46:07,090 --> 00:46:14,380 Maybe it needs a GPU, and maybe you have a reservation or some other similar quality of service requirement. 400 00:46:14,380 --> 00:46:21,610 Once you feed all of that information into your script, Slurm will see it and will make a sensible, 401 00:46:21,610 --> 00:46:25,330 hopefully, decision on where that job runs. 402 00:46:25,330 --> 00:46:31,330 And it may be that the job can run immediately somewhere, or it may have to be queued until the resources are available. 403 00:46:31,330 --> 00:46:38,680 So that's what the resource management does for us. 404 00:46:38,680 --> 00:46:45,370 For any more information on that, if you want to work out how to create a Slurm script, 405 00:46:45,370 --> 00:46:53,530 the best place to learn is the Intro to ARC course, which we run every couple of weeks. 406 00:46:53,530 --> 00:47:02,500 Dai also mentioned the JADE 2 system. We, as a team, provide local support for Oxford users of JADE. 407 00:47:02,500 --> 00:47:11,830 Unfortunately — well, I say unfortunately; that's not fair — we can't provide direct systems administration of the JADE system. 408 00:47:11,830 --> 00:47:19,160 It's not run by us, so we don't quite have the same level of access as we have on the ARC systems. 409 00:47:19,160 --> 00:47:27,580 So when it comes to helping users on ARC with a problem, we usually can get in and help: go into your directories, 410 00:47:27,580 --> 00:47:31,750 have a look at logs, make changes, and help that way. 411 00:47:31,750 --> 00:47:37,480 But we can't do that on JADE, so we have to ask you to raise a ticket with Hartree, 412 00:47:37,480 --> 00:47:42,490 which we can then monitor and help and add information to. 413 00:47:42,490 --> 00:47:50,710 But in order to maybe help with that, we also have some DGX MAX-Q nodes, as I mentioned earlier on, in our HTC cluster.
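To make the batch-script idea concrete, a minimal Slurm submission script of the sort described might look like the sketch below. The partition, module and resource values are illustrative, not ARC's exact configuration:

```shell
#!/bin/bash
#SBATCH --job-name=example      # a name to spot in the queue
#SBATCH --ntasks=1              # one task (process)
#SBATCH --cpus-per-task=4       # CPUs for that task
#SBATCH --mem=8G                # memory for the whole job
#SBATCH --time=01:00:00         # wall-clock time limit
#SBATCH --gres=gpu:1            # request a GPU, if the job needs one
#SBATCH --partition=short       # illustrative partition name

module load R                   # set up the software environment, as before
Rscript my_analysis.R           # the actual work
```

The script would be submitted with `sbatch` and the queue inspected with `squeue`. This is also the answer to the earlier question about node classes: directives such as `--partition` and `--gres` are how a particular class of node, GPU nodes included, is selected.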
414 00:47:50,710 --> 00:47:56,530 So it's probably quite a good idea to run any jobs locally on ARC 415 00:47:56,530 --> 00:48:03,130 first, before then scaling up and running them on the JADE system. 416 00:48:03,130 --> 00:48:18,450 Because, once again, the DGXs in JADE tend to run containers produced by NVIDIA, which we can also use. 417 00:48:18,450 --> 00:48:25,980 So I think that's it from me. Any questions? 418 00:48:25,980 --> 00:48:35,340 Great. Thank you very much, Andy and Dai — that was a wonderful presentation. A round of applause. 419 00:48:35,340 --> 00:48:40,288 Thank you, thank you.