1 00:00:04,090 --> 00:00:13,570 So it gives me great pleasure to introduce Dai Jenkins and Andy Gittings. Dai is the head of research, 2 00:00:13,570 --> 00:00:18,610 computing and support services for Oxford's Advanced Research Computing Service. 3 00:00:18,610 --> 00:00:24,160 And Andy is the scientific research software advisor for ARC. 4 00:00:24,160 --> 00:00:34,000 And today they're going to tell us about the amazing supercomputing facilities we have at our fingertips here in Oxford and beyond. 5 00:00:34,000 --> 00:00:42,790 And I hope that in this we get an overview of what's possible and how to use it. 6 00:00:42,790 --> 00:00:46,630 So over to you. OK, thanks. 7 00:00:46,630 --> 00:00:55,210 And thanks for having us along today to talk to you. So what we'll try and strike a balance with in this presentation is, well, 8 00:00:55,210 --> 00:00:58,660 hopefully all of it is going to be useful in some kind of way. 9 00:00:58,660 --> 00:01:02,050 It's not meant to be a fully in-depth training course, 10 00:01:02,050 --> 00:01:07,990 but what it is meant to do is give you an overview of the facilities that we have available: 11 00:01:07,990 --> 00:01:15,040 why would you want to use high performance or advanced computing in general, not just through ARC, 12 00:01:15,040 --> 00:01:25,060 and also how you can access our services, looking at what support is available, and so on and so forth. 13 00:01:25,060 --> 00:01:29,350 So getting into the slides: just a quick overview, 14 00:01:29,350 --> 00:01:32,650 key facts about ARC. 15 00:01:32,650 --> 00:01:41,920 So ARC is the University of Oxford Advanced Research Computing team. We provide the central high performance computing resource in Oxford University, 16 00:01:41,920 --> 00:01:47,110 and we are hosted within IT Services.
17 00:01:47,110 --> 00:01:54,970 We are the central university resource, but there are significant other high performance computing resources based around the university. 18 00:01:54,970 --> 00:02:05,020 We are, however, the only one available to all four divisions that is free at the point of access to researchers. 19 00:02:05,020 --> 00:02:14,290 So that means that you can, as a member of the university, request a user account on ARC through a project. 20 00:02:14,290 --> 00:02:20,770 You can then start doing work on the ARC resource without having to find any income, 21 00:02:20,770 --> 00:02:24,910 any funding, in order to gain access to it. 22 00:02:24,910 --> 00:02:32,470 So in terms of numbers, the team is quite small: it's only four staff and myself. 23 00:02:32,470 --> 00:02:35,170 Two of the team 24 00:02:35,170 --> 00:02:40,720 are the systems administrators for the service, and they keep the systems fed and watered and up and running. 25 00:02:40,720 --> 00:02:49,720 And then there's Andy and one other colleague, who provide the applications and user support for the service, 26 00:02:49,720 --> 00:02:56,980 and Andy will be giving more of an overview of what's available there in his half of this talk. 27 00:02:56,980 --> 00:03:01,960 So in terms of the hardware we have available at the moment, we have two principal clusters. 28 00:03:01,960 --> 00:03:11,350 One of those is for high throughput applications, and the other one was billed as the capability cluster, 29 00:03:11,350 --> 00:03:17,200 which is used for what was deemed to be true high performance applications, 30 00:03:17,200 --> 00:03:23,980 which are closely coupled in terms of processor communications and the memory available, and so on and so forth.
31 00:03:23,980 --> 00:03:29,410 I'll go into that in a bit more detail. Those are connected to the high performance 32 00:03:29,410 --> 00:03:37,150 GPFS file system that we have connected to the clusters. In terms of the user base that we have: 33 00:03:37,150 --> 00:03:42,610 as I said earlier on, we are free to all four divisions within the university. 34 00:03:42,610 --> 00:03:47,740 But not all divisions use us as much as others: 35 00:03:47,740 --> 00:03:56,680 MPLS uses around 90 percent of the compute cycles and core hours that are provided by the service, 36 00:03:56,680 --> 00:04:06,760 then it's Social Sciences Division and Medical Sciences Division, and then Humanities, who use us very rarely. 37 00:04:06,760 --> 00:04:11,230 So there is usage across all divisions. In terms of user numbers, 38 00:04:11,230 --> 00:04:20,590 we have 2,500 registered users across the university, and that's increasing by around about 600 users per year. 39 00:04:20,590 --> 00:04:27,550 Out of all of those, around 400 are active and submitting jobs to the system at any one given moment in time. 40 00:04:27,550 --> 00:04:32,230 The service is quite busy: we're running around about 50,000 jobs per month and increasing, 41 00:04:32,230 --> 00:04:39,640 which means that our clusters and the resource available within them are around about 80 percent utilised at all times. 42 00:04:39,640 --> 00:04:45,580 And that's around about the maximum level that we like to run them at, in order that there's 43 00:04:45,580 --> 00:04:52,480 sufficient headroom to turn around jobs within the scheduler and have sufficient throughput on the system. 44 00:04:52,480 --> 00:04:57,430 So, just answering the question of why 80 percent and not 100 percent: it's so that the system 45 00:04:57,430 --> 00:05:04,400 doesn't become almost gridlocked.
46 00:05:04,400 --> 00:05:11,930 So what services do we offer? Well, first of all, there's the access to the cluster resources and also the research applications. 47 00:05:11,930 --> 00:05:18,470 We have a range of cluster resources: we have x86 nodes, GPU nodes and high memory nodes in the system. 48 00:05:18,470 --> 00:05:20,450 And Andy will talk more about this. 49 00:05:20,450 --> 00:05:27,680 But we have over 400 centrally installed applications on the system that we take care of and curate, available for you 50 00:05:27,680 --> 00:05:35,150 to use in order to do your research on the system. As well as providing the hardware and support for that, 51 00:05:35,150 --> 00:05:39,920 there's also user support and the training that goes along with it. 52 00:05:39,920 --> 00:05:47,100 We run face-to-face training as well, although in COVID times 53 00:05:47,100 --> 00:05:57,230 we do most of this online. But if you do have any ad hoc queries, then a ticket can be submitted through to support@arc.ox.ac.uk, 54 00:05:57,230 --> 00:06:02,930 and we will deal with anything that comes across on that by email, and if things do need to be followed up, 55 00:06:02,930 --> 00:06:12,440 we can then do a Teams call in order to help you out. There are also what we call premium services that we offer on the 56 00:06:12,440 --> 00:06:16,790 clusters, and these are in addition to the free resources I talked about earlier on. 57 00:06:16,790 --> 00:06:20,720 Examples of these are node reservations, 58 00:06:20,720 --> 00:06:29,150 so we can effectively book out a set of nodes for you to do a specific piece of work on, and you'll be given a reservation on that. 59 00:06:29,150 --> 00:06:32,960 And then that will be yours to use for a specific period of time. 60 00:06:32,960 --> 00:06:38,480 We do test the case for those quite rigorously, so that's not something we just give as a matter 61 00:06:38,480 --> 00:06:45,350 of course, because it then takes that reserved resource away from the wider user population. 62 00:06:45,350 --> 00:06:51,500 There's also priority time as well, which is a paid-for service, 63 00:06:51,500 --> 00:06:57,860 a paid-for product that we have, and that allows your jobs to be pushed 64 00:06:57,860 --> 00:07:03,620 forward in the queue ahead of other people who are just having the standard service. 65 00:07:03,620 --> 00:07:10,520 And the reasons why you would want to do that are if you are coming very close to a paper submission deadline, or, 66 00:07:10,520 --> 00:07:14,660 you know, it's coming close to your thesis being submitted, or anything like that; 67 00:07:14,660 --> 00:07:18,470 then you can purchase priority time and it will allow you to go ahead in the queue and 68 00:07:18,470 --> 00:07:25,700 get that work done ahead of others. And the final premium service that we use 69 00:07:25,700 --> 00:07:33,710 a lot is co-investment, and that's where a researcher will buy nodes. 70 00:07:33,710 --> 00:07:37,130 Those nodes in effect will be given over to the ARC team. 71 00:07:37,130 --> 00:07:44,900 We will install them in the system and we will feed, water and operate those for the researchers. 72 00:07:44,900 --> 00:07:51,680 The benefit that we can give back to ARC users is that when the researcher that's bought those nodes isn't using them, 73 00:07:51,680 --> 00:07:59,330 we can then do what's called backfilling of jobs onto those nodes, which benefits the wider user community 74 00:07:59,330 --> 00:08:03,230 as well as the researcher that bought the nodes in the first place. 75 00:08:03,230 --> 00:08:11,090 And a large number of our GPU nodes have been bought under that co-investment model.
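As an aside on the 400-plus centrally installed applications mentioned above: on clusters of this kind they are usually exposed through an environment-modules system. The commands below are a generic sketch of that workflow; the package name and version are hypothetical placeholders, not a listing of ARC's actual software tree, and `module spider` in particular assumes an Lmod-style installation.

```shell
# Sketch: discovering and activating centrally installed software
# on an HPC login node (package names here are hypothetical).
module avail                 # list everything the central team has installed
module spider gromacs        # search for a specific package (Lmod systems only)
module load gromacs/2021.3   # put one curated build onto your PATH
module list                  # confirm what is currently loaded
module purge                 # return to a clean environment
```

Loading a module this way, rather than installing software yourself, is what lets the support team curate and update those builds centrally.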
76 00:08:11,090 --> 00:08:17,660 What we can also do as well is provide access for external users of the system, and this is 77 00:08:17,660 --> 00:08:26,150 primarily, as it says here, open access for collaboration: for academic collaborators and commercial partners. 78 00:08:26,150 --> 00:08:35,310 We can arrange free access for academic collaborators to work with you and your teams on ARC, whereas for commercial partners 79 00:08:35,310 --> 00:08:42,590 we'd need to charge them for access to the system. 80 00:08:42,590 --> 00:08:46,700 And finally on this: access to ARC is incredibly easy, and it's free. 81 00:08:46,700 --> 00:08:55,130 In order to access ARC, the first thing you need to do is set up a project, and that can be done by a principal investigator or group lead. 82 00:08:55,130 --> 00:08:59,210 That's done via our web form, and the link for that is shown on the slide. 83 00:08:59,210 --> 00:09:08,120 Once that project has been created, you can then do an individual user account request, and then we can set you up with a user account, 84 00:09:08,120 --> 00:09:14,780 which is then linked to that project account. And then you can begin to use the ARC service. 85 00:09:14,780 --> 00:09:20,000 So it's a fairly easy, simple process to go through, and from start to finish 86 00:09:20,000 --> 00:09:29,190 this can probably be done in about two or three days, more or less, in normal circumstances. 87 00:09:29,190 --> 00:09:34,360 So that's, at a very high level, what we provide. 88 00:09:34,360 --> 00:09:38,940 But what is high performance computing, and why use high performance computing? 89 00:09:38,940 --> 00:09:41,760 So first of all, what is high performance computing? 90 00:09:41,760 --> 00:09:51,030 There's no single, all-encompassing definition of what high performance computing is or isn't.
91 00:09:51,030 --> 00:09:54,810 But for the purposes of framing this topic, 92 00:09:54,810 --> 00:10:02,380 it's really something that can't be performed on a desktop or workstation easily, or just takes so much time on that kind of resource 93 00:10:02,380 --> 00:10:10,820 that it becomes impractical, if not impossible, to do it in a useful manner. Or something may be amenable to being carried out 94 00:10:10,820 --> 00:10:14,160 on multiple processors in parallel. 95 00:10:14,160 --> 00:10:18,840 And that can either be multiple instances of the same kind of job, but each with a slight tweak to the problem; 96 00:10:18,840 --> 00:10:26,010 or it can be using multiple processors in parallel to solve a truly complex problem: a single job that 97 00:10:26,010 --> 00:10:36,600 requires access to multiple processors interchanging information with each other via MPI to come to a solution on that problem. 98 00:10:36,600 --> 00:10:42,150 But overall, I think it's agreed that HPC should enable a person to do a unit of work in less 99 00:10:42,150 --> 00:10:47,880 time, or more work in the same time, or to do something that is otherwise impossible. 100 00:10:47,880 --> 00:10:54,720 And the reason why these images have been chosen, on the left and right hand side of my slide, 101 00:10:54,720 --> 00:11:01,770 is to exemplify that this isn't really something that is new, or has only been done in the last five or 10 years. 102 00:11:01,770 --> 00:11:06,210 The picture at the bottom of the slide is one of the Bombes 103 00:11:06,210 --> 00:11:12,210 that was used to find out the rotor positions of the Enigma machine.
104 00:11:12,210 --> 00:11:20,130 And that would be a classic example of a high throughput computing type job, where you're working on 105 00:11:20,130 --> 00:11:24,250 lots of jobs that are not completely interconnected, lots of different jobs, 106 00:11:24,250 --> 00:11:31,530 but you have the time pressure that you have to figure out what those rotor positions are within a 24 107 00:11:31,530 --> 00:11:38,070 hour period so that the information can actually be of any use; because as soon as you step outside of that period, 108 00:11:38,070 --> 00:11:42,180 then of course all the work you've done previously is absolutely null and void. 109 00:11:42,180 --> 00:11:49,630 And the machine at the top, with Seymour Cray, the casually dressed guy standing next to the big unit, is a Cray, 110 00:11:49,630 --> 00:11:59,820 and that machine was primarily meant to do calculations where the time-to-completion requirement wasn't so great, 111 00:11:59,820 --> 00:12:05,100 but where the computation was so complicated that it would take a huge amount of time for 112 00:12:05,100 --> 00:12:11,940 somebody to do it basically on their own with a slide rule, or working in teams with slide rules. 113 00:12:11,940 --> 00:12:19,260 So high performance computing and high throughput computing are not necessarily very new concepts at all. 114 00:12:19,260 --> 00:12:22,030 Now, 115 00:12:22,030 --> 00:12:29,300 one thing in common with all high performance computing is that it's normally a large system somewhere in a data centre. 116 00:12:29,300 --> 00:12:38,570 And our data centre is at Begbroke Science Park, which is based just outside of Oxford. 117 00:12:38,570 --> 00:12:45,740 So what can research computing be used for? Generally, there are four types of research computing.
118 00:12:45,740 --> 00:12:51,740 There's compute intensive, and, surprise surprise, that's doing things 119 00:12:51,740 --> 00:12:58,910 requiring a large amount of compute with high performance inter-processor communication in order to actually do them. 120 00:12:58,910 --> 00:13:02,930 So this is what we would term traditional, high performance, 121 00:13:02,930 --> 00:13:08,510 heroic supercomputing, where you're doing modelling and simulation problems with things like fluid dynamics, 122 00:13:08,510 --> 00:13:12,440 climate modelling, molecular simulations, et cetera, et cetera. 123 00:13:12,440 --> 00:13:21,590 And where we see most of this kind of work coming onto our systems is through researchers in MPLS and Medical Sciences Division. 124 00:13:21,590 --> 00:13:26,630 Then we can move through to data intensive high performance computing. 125 00:13:26,630 --> 00:13:29,780 And this is, as it says on the tin, 126 00:13:29,780 --> 00:13:37,910 applications requiring or operating on large amounts of data, with quite fast, efficient ingress and egress of data. 127 00:13:37,910 --> 00:13:43,100 So in this case, it's not so much performance within the computational part of the node; 128 00:13:43,100 --> 00:13:47,670 it's performance there, but also in the data 129 00:13:47,670 --> 00:13:54,740 hierarchy, in order to be able to pull data into the compute and back out again. 130 00:13:54,740 --> 00:14:02,990 And the applications that we see in this area are basically bioinformatics, genomics and machine learning applications, 131 00:14:02,990 --> 00:14:11,570 which are using, manipulating and operating on increasingly large amounts of data in order to 132 00:14:11,570 --> 00:14:18,860 come to a solution. Then that brings us to high throughput computing.
133 00:14:18,860 --> 00:14:25,250 Although we said high performance computing is stuff that we can't really do on the desktop, 134 00:14:25,250 --> 00:14:27,530 this kind of work is stuff that you can do on a desktop, 135 00:14:27,530 --> 00:14:33,170 except that what you use high performance computing for is to harness the sheer amount of resource that's kept in a single 136 00:14:33,170 --> 00:14:41,720 place, to operate almost as a thousand, two thousand, four thousand laptops working on things like parameter sweep experiments, 137 00:14:41,720 --> 00:14:50,330 which you could do individually and in serial on a laptop, but it would take you so much time that it would be impractical to do so. 138 00:14:50,330 --> 00:14:55,340 So what you do is run those types of experiments on ARC many times 139 00:14:55,340 --> 00:15:00,770 in parallel, and that decreases your time to completion immeasurably. 140 00:15:00,770 --> 00:15:04,030 So that's high throughput computing, 141 00:15:04,030 --> 00:15:13,010 and it's used in a number of different application areas, and it's prevalent across all of the divisions within the university. 142 00:15:13,010 --> 00:15:21,710 And then we finally get to memory intensive computing, which in some cases is a bit of a tweak on high throughput computing. 143 00:15:21,710 --> 00:15:23,420 Applications are many and varied, 144 00:15:23,420 --> 00:15:32,630 and basically the requirement in these cases is mostly for all of your data to be in memory in order to be operated on. 145 00:15:32,630 --> 00:15:36,530 So either the input is incredibly large and has to sit in memory, or the outputs from what 146 00:15:36,530 --> 00:15:42,230 you are doing require a huge amount of memory to be pushed into. 147 00:15:42,230 --> 00:15:47,600 And there are some examples in economics.
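The parameter-sweep pattern described above maps naturally onto scheduler job arrays; since Slurm is the scheduler named later in the talk, here is a minimal sketch of that approach. The sweep payload, resource values and the index-to-parameter mapping are all illustrative assumptions, not ARC's documented settings.

```shell
#!/bin/bash
# Hedged sketch of a Slurm job array for a parameter sweep.
# Slurm runs one copy of this script per array index, each as its own task.
#SBATCH --job-name=sweep     # label shown in the queue
#SBATCH --array=0-99         # 100 independent tasks, indices 0..99
#SBATCH --ntasks=1           # each task is a single serial process
#SBATCH --time=00:30:00      # wall-clock limit per task

# Slurm sets SLURM_ARRAY_TASK_ID differently in each task;
# use it to derive the parameter value for this sweep point.
PARAM=$(( SLURM_ARRAY_TASK_ID * 2 ))   # hypothetical index -> parameter map
echo "task ${SLURM_ARRAY_TASK_ID}: running with parameter ${PARAM}"
# ./sweep_point --param "${PARAM}"     # hypothetical per-point workload
```

Submitted once with `sbatch sweep.sh`, the scheduler then queues all 100 tasks and backfills them onto free cores, which is exactly the "thousands of laptops" effect described above.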
But overall, the 148 00:15:47,600 --> 00:15:53,360 take home message from this is that ARC provides compute resource across all of 149 00:15:53,360 --> 00:16:04,560 these four areas, and provides a general high performance computing service to the university. 150 00:16:04,560 --> 00:16:12,780 So here's one of the extreme examples that we sometimes give to illustrate what's required. 151 00:16:12,780 --> 00:16:17,970 A desktop PC can provide tens of gigaflops. 152 00:16:17,970 --> 00:16:23,500 That is useful for most day-to-day applications, but in some cases that's just not going to cut it. 153 00:16:23,500 --> 00:16:29,970 An extreme example is things like short range weather forecasting, where "short" is in terms of time 154 00:16:29,970 --> 00:16:38,310 rather than in terms of distance; distance brings different complications. So we have to turn around predictions for the next day, 155 00:16:38,310 --> 00:16:43,320 predictions that you want to run today in order for them to be available for tomorrow. 156 00:16:43,320 --> 00:16:52,620 And for that, the Met Office requires a compute system of circa one petaflop in order to actually do 157 00:16:52,620 --> 00:17:02,670 that and be able to turn around those calculations on a useful timescale, so they're available for the next working day. 158 00:17:02,670 --> 00:17:06,690 So that's an example of where you require extreme compute in order 159 00:17:06,690 --> 00:17:12,030 to do something which we all rather take for granted nowadays. 160 00:17:12,030 --> 00:17:22,680 Just to give an illustration of the size of these computers: the fastest supercomputer in November 2021, as shown on the 161 00:17:22,680 --> 00:17:31,710 Top 500 list, is around about 442 petaflops, which is massive and staggering.
162 00:17:31,710 --> 00:17:36,810 And when you consider that when I first started in high performance computing, 163 00:17:36,810 --> 00:17:42,190 which was back in around about 2007, we had the national service, 164 00:17:42,190 --> 00:17:47,430 HECToR, which then was, I think, 60 teraflops in size. 165 00:17:47,430 --> 00:17:54,150 And so, rolling on around 13, 14, 15 years, we have this size of computer. 166 00:17:54,150 --> 00:18:02,520 HECToR in 2006-7 was in the top 10. This system is the fastest in the top 10, and it is now, on rough calculations, 167 00:18:02,520 --> 00:18:10,470 probably around 7,000 times faster than the HECToR system was back then. 168 00:18:10,470 --> 00:18:17,820 So that's how much things have come on in leaps and bounds, because the applications are driving us in this direction. 169 00:18:17,820 --> 00:18:22,630 So it gives you food for thought. Right. 170 00:18:22,630 --> 00:18:31,230 So, on to the hardware. But before I go on, does anybody have any immediate questions? 171 00:18:31,230 --> 00:18:39,620 OK, so I'll take a quick canter through the actual hardware resources that we have 172 00:18:39,620 --> 00:18:46,550 available, before turning over to Andy, who will talk more about the software elements of the service. 173 00:18:46,550 --> 00:18:49,970 We operate two clusters at ARC. 174 00:18:49,970 --> 00:19:00,050 One is the high throughput cluster, which, as I'll discuss in more detail, is in some senses just a bunch of nodes. 175 00:19:00,050 --> 00:19:07,420 And then we have the ARC cluster, which is our capability cluster, and I'm going to explain more about what these are 176 00:19:07,420 --> 00:19:15,050 in a minute. The way that most users interact with and submit jobs to the system is 177 00:19:15,050 --> 00:19:19,370 through the Slurm scheduler: you submit a job to Slurm and it then assesses it.
178 00:19:19,370 --> 00:19:28,700 And then when the resource becomes available, should other factors be equal, it will launch the job onto the system and it will run. 179 00:19:28,700 --> 00:19:34,410 All of our clusters are based out at the Begbroke Science Park data centre, 180 00:19:34,410 --> 00:19:45,090 just outside Oxford, and these are then connected back into the university's core Odin network. 181 00:19:45,090 --> 00:19:51,990 So taking each cluster in turn: the high throughput cluster. This cluster is purposely set up to prefer small jobs, 182 00:19:51,990 --> 00:19:57,090 which in this case are less than one node in size. 183 00:19:57,090 --> 00:20:03,810 And this is to make sure that only small, high throughput jobs actually get through to the system, and that the system is not going to 184 00:20:03,810 --> 00:20:09,930 get swamped by jobs that are larger than that, which should be targeted at the ARC system. 185 00:20:09,930 --> 00:20:13,830 As well as being a system in its own right, 186 00:20:13,830 --> 00:20:24,060 it also provides a hosting infrastructure for a mix of CPU and GPU nodes that were bought under co-investment. Within this system 187 00:20:24,060 --> 00:20:31,920 we have two high memory nodes available to users; each of those has three terabytes of memory. 188 00:20:31,920 --> 00:20:38,970 We have a whole slew of GPU nodes in that system as well, 189 00:20:38,970 --> 00:20:41,670 and you can get full details on those at the link, 190 00:20:41,670 --> 00:20:50,400 but they range up to the very high performance servers we have in the system, such as the DGX Max-Q, 191 00:20:50,400 --> 00:20:57,960 which has eight NVIDIA V100 GPUs, all connected via a very high bandwidth NVLink interconnect, 192 00:20:57,960 --> 00:21:00,660 and then that has two Intel processors within it.
193 00:21:00,660 --> 00:21:07,050 So this is basically a server that's completely dedicated to high end machine learning, and in some cases biosimulation kind 194 00:21:07,050 --> 00:21:15,180 of research that you can do on that system; and the GPUs on that can do double precision workloads as well. 195 00:21:15,180 --> 00:21:18,390 So we have a number of those in the system. 196 00:21:18,390 --> 00:21:24,900 But what we also have is a large number of more prosumer or consumer type cards, 197 00:21:24,900 --> 00:21:31,950 almost like gaming cards, within the system, with varying amounts of graphics memory that go along with those as well. 198 00:21:31,950 --> 00:21:39,030 And those are more geared towards straightforward machine learning applications. 199 00:21:39,030 --> 00:21:41,100 On top of that, 200 00:21:41,100 --> 00:21:48,060 you also have another 40 just standard CPU nodes within that system for the high throughput workloads, 201 00:21:48,060 --> 00:21:53,610 and these servers are exactly the same specification as those in the ARC capability machine, 202 00:21:53,610 --> 00:22:03,230 but they just do not have the InfiniBand interconnect linking each of the nodes together. 203 00:22:03,230 --> 00:22:11,000 With regard to the co-investment nodes that are put into the system: as I said earlier on, researchers buy them, we feed and water them. 204 00:22:11,000 --> 00:22:15,890 But when the researcher is not using them, they're available to the wider university. 205 00:22:15,890 --> 00:22:19,100 What that means is that jobs can backfill onto those nodes, 206 00:22:19,100 --> 00:22:27,740 but the maximum length of the jobs that can backfill onto those nodes is 12 hours. Full details of those co-investment nodes 207 00:22:27,740 --> 00:22:36,390 and their specifications are also available via the link on this slide.
208 00:22:36,390 --> 00:22:39,370 Now moving on to the ARC cluster, which as I said is the capability cluster: 209 00:22:39,370 --> 00:22:46,470 this is dedicated, surprise surprise, to jobs that are very much larger than one node in size. 210 00:22:46,470 --> 00:22:53,550 And so when they go into the scheduler, the size of the job is going to be a prioritising 211 00:22:53,550 --> 00:23:01,830 factor as to where it goes when it goes into the system. In the ARC cluster 212 00:23:01,830 --> 00:23:06,930 we have 258 compute nodes. 213 00:23:06,930 --> 00:23:16,020 Those compute nodes are all arranged into seven "islands", as they're called, within the system. Within each island 214 00:23:16,020 --> 00:23:21,720 there are around 40 to 44 nodes, 215 00:23:21,720 --> 00:23:28,770 and they're all connected by this InfiniBand interconnect. Within each island 216 00:23:28,770 --> 00:23:37,710 there is one-to-one, non-blocking communication, and that provides the highest performance 217 00:23:37,710 --> 00:23:42,510 communication that we can provide between all nodes within the system, 218 00:23:42,510 --> 00:23:48,450 and the performance of that interconnect would be comparable to what you would see on the national services, 219 00:23:48,450 --> 00:23:58,740 things like ARCHER. You can run jobs between these islands, but there is a slightly higher communications overhead on that, 220 00:23:58,740 --> 00:24:03,190 where it goes from being non-blocking to being a three-to-one contention ratio 221 00:24:03,190 --> 00:24:09,570 in the network connectivity between islands. With contention at three to one, and 222 00:24:09,570 --> 00:24:13,590 the higher those ratios go, the lower the performance you are going to get
223 00:24:13,590 --> 00:24:21,510 from that particular interconnect. The operating system on this is CentOS, 224 00:24:21,510 --> 00:24:31,200 CentOS 8. But we also run some of the other nodes within that system in a legacy configuration, 225 00:24:31,200 --> 00:24:38,970 which looks a bit like Arcus-B; that runs CentOS 7.7, and it's specifically set up for legacy applications that cannot 226 00:24:38,970 --> 00:24:46,980 run on CentOS 8. And as I said before, the scheduler that we have on the system is Slurm. 227 00:24:46,980 --> 00:24:52,290 This system is, in the grand scheme of things, a modest twelve thousand three hundred cores. 228 00:24:52,290 --> 00:24:59,850 But in the Oxford sense, this system provides a significant improvement over what we had previously with Arcus-B and the old HTC system, 229 00:24:59,850 --> 00:25:05,190 which collectively had, I think, probably about five thousand eight hundred cores between them. 230 00:25:05,190 --> 00:25:13,410 So we've more than doubled the capacity of this system over our previous arrangements. 231 00:25:13,410 --> 00:25:22,260 Full details on the system are again given on the ARC website, if you want to look at things in more detail. 232 00:25:22,260 --> 00:25:28,710 So that's the compute side of things; then we have the storage that goes behind all of this. 233 00:25:28,710 --> 00:25:39,060 We operate a very high performance parallel file system, and we have two petabytes of that available to us. 234 00:25:39,060 --> 00:25:44,700 I don't think there's really much more to say on that. There aren't any restrictions on use of that, apart from quotas. 235 00:25:44,700 --> 00:25:48,930 And that's pretty much all there is to say on that one. 236 00:25:48,930 --> 00:25:56,630 I won't go into full details on the performance of that here.
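Pulling the scheduler details above together, a minimal Slurm batch job on a system like this might look as follows. This is a sketch only: the resource values and the program being launched are placeholders, not ARC's documented defaults, and real jobs would add site-specific options from the ARC documentation.

```shell
#!/bin/bash
# Hedged sketch of a minimal Slurm batch script.
#SBATCH --job-name=hello          # label shown in squeue output
#SBATCH --nodes=1                 # single-node job (HTC-style, under one node)
#SBATCH --ntasks-per-node=4       # four processes on that node
#SBATCH --time=01:00:00           # wall-clock limit; the job is stopped after this

echo "running on $(hostname)"     # record which compute node Slurm chose
# srun ./my_program               # srun would launch the parallel tasks here
```

You would submit this with `sbatch job.sh`; `squeue -u $USER` then shows it pending until, as described above, the resource becomes available and Slurm launches it.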
237 00:25:56,630 --> 00:26:05,900 So those last three slides are what we have as a team, based in the Begbroke Science Park data centre. 238 00:26:05,900 --> 00:26:17,660 But what we also have access to as well is the JADE service, and JADE stands for the Joint Academic Data science Endeavour. 239 00:26:17,660 --> 00:26:21,500 And this is a system that was funded by EPSRC, 240 00:26:21,500 --> 00:26:33,800 led by Oxford: a four or five million pound system that is based entirely on the NVIDIA DGX Max-Q product. 241 00:26:33,800 --> 00:26:37,190 So it's essentially owned by the University of Oxford; 242 00:26:37,190 --> 00:26:48,040 we operate it on behalf of researchers across the UK, on behalf of EPSRC, who provided the money. 243 00:26:48,040 --> 00:26:57,860 We bought it, and the system is hosted at the Hartree Centre in the north west of England. 244 00:26:57,860 --> 00:27:03,740 The system itself is based on 63 DGX Max-Q boxes. 245 00:27:03,740 --> 00:27:10,040 It's very similar to the previous JADE service, except that there's just a larger amount of it. 246 00:27:10,040 --> 00:27:17,240 The only other real difference between JADE 1 and 2, apart from the increased capacity of the system, 247 00:27:17,240 --> 00:27:22,610 is the fact that it's connected to a much more high performance file 248 00:27:22,610 --> 00:27:29,090 system on that particular iteration of the cluster. 249 00:27:29,090 --> 00:27:38,670 And what this system is mainly geared to providing is just a higher level of resource, and specifically the ability to do, 250 00:27:38,670 --> 00:27:47,840 in multiples of one box, multi-GPU work, which most universities don't have the resources to be able to do. 251 00:27:47,840 --> 00:27:52,730 We are quite fortunate at Oxford in that we have around about five of these
252 00:27:52,730 --> 00:28:01,340 DGX MAX-Q systems ourselves, but there are a number of universities that do not have access to this kind of kit, 253 00:28:01,340 --> 00:28:08,690 and that's what JADE, and JADE 1 before it, is meant to be: that facility. 254 00:28:08,690 --> 00:28:26,270 So that's all I have to say on those elements. I'm happy to take any questions, but if there aren't any, then I can hand over to Andy. 255 00:28:26,270 --> 00:28:35,930 OK. Sorry, just a quick question, Dai. You mentioned there are different types of nodes on ARC, including GPUs. 256 00:28:35,930 --> 00:28:38,810 Mm hmm. Maybe Andy is going to say a little bit about this, 257 00:28:38,810 --> 00:28:49,160 but if one wanted to run an ARC job on a particular class of node, how would one go about doing that? 258 00:28:49,160 --> 00:28:55,340 You can specify it in Slurm. Andy will cover it a little bit in his talk, but not in too much detail. 259 00:28:55,340 --> 00:28:59,510 But yeah. OK, thanks. Yeah. OK. 260 00:28:59,510 --> 00:29:03,560 I will carry on then. Thanks. Thanks, Dai. 261 00:29:03,560 --> 00:29:07,670 So, yeah, as mentioned, I'm Andy Gittings, of the applications group within the ARC team. 262 00:29:07,670 --> 00:29:18,950 I'm going to cover some of the usual application support type things that ARC do specifically. 263 00:29:18,950 --> 00:29:25,460 We'll look at things like the training courses, user documentation, and what we do with software applications. 264 00:29:25,460 --> 00:29:30,950 And obviously, as I've already mentioned, 265 00:29:30,950 --> 00:29:39,410 one of the things that we do quite a lot of is providing general user assistance via our email address, which is support@arc.ox.ac.uk. 266 00:29:39,410 --> 00:29:48,170 We do often repeat that, so that people know they can send us any kind of support request, and that goes into our general ticketing system.
267 00:29:48,170 --> 00:29:59,300 And usually we're very responsive on that. One member of the ARC team is usually dedicated to answering tickets each day of the week, 268 00:29:59,300 --> 00:30:07,670 but there's only four of us, so we have to double up occasionally. 269 00:30:07,670 --> 00:30:17,390 Let me move on. So, training and engagement: the kinds of things we do regularly are the training courses, 270 00:30:17,390 --> 00:30:21,480 the main one being the Introduction to ARC course, which is for new users. 271 00:30:21,480 --> 00:30:26,890 And if anybody is particularly interested in actually getting an ARC account, 272 00:30:26,890 --> 00:30:37,390 feel free to go onto the training course website, which is shown there — www.arc.ox.ac.uk/training — and book a place. 273 00:30:37,390 --> 00:30:48,010 It's presented twice a month and gives a really good overview of how to use the system and what capabilities there are. 274 00:30:48,010 --> 00:30:52,630 There's also another course that we do called Effective Use of Clusters for Non-programmers, 275 00:30:52,630 --> 00:31:00,490 and this is for users that have maybe been on the Introduction to ARC course, 276 00:31:00,490 --> 00:31:05,080 have a couple of apps they use, and want to get a little bit more out of the system. 277 00:31:05,080 --> 00:31:11,080 And the reason we say non-programmers is because it could be something where a user has been given 278 00:31:11,080 --> 00:31:16,690 some code from GitHub, or just found something, and they want to build it and make it work on ARC, 279 00:31:16,690 --> 00:31:18,340 and this course will give some pointers. 280 00:31:18,340 --> 00:31:26,360 And it also works for people who are just running commercial applications, or any other applications that are pre-installed on the system. 281 00:31:26,360 --> 00:31:35,920 And we try and run that around once per term, and I think we're due to have one later this term.
282 00:31:35,920 --> 00:31:47,020 The other thing that's quite important is that all of the ARC systems run Linux — as we stated, they run CentOS 8 — 283 00:31:47,020 --> 00:31:54,550 and that can come as a little bit of a shock to some users who are more used to a Windows environment. 284 00:31:54,550 --> 00:32:02,500 So we do have a link on this training page to quite a nice, web-based Linux training course. It's not ours, 285 00:32:02,500 --> 00:32:10,420 but it's quite a nice one to get a good grounding in Linux. 286 00:32:10,420 --> 00:32:19,390 Something else we do is our drop-in sessions, and we tend to schedule these around the day after an Introduction to ARC course. 287 00:32:19,390 --> 00:32:24,100 And this is just some time where a couple of the team members will be available on Teams for 288 00:32:24,100 --> 00:32:31,270 people just to drop in and talk to us about any issues that they may have on the system. 289 00:32:31,270 --> 00:32:41,980 And it tends to be primarily for questions that don't really sit very well within a support@arc.ox.ac.uk type email request. 290 00:32:41,980 --> 00:32:49,690 Maybe they need some help going through building some software and they've got a bit stuck, and it's a bit more interactive. 291 00:32:49,690 --> 00:32:54,460 So, you know, that's something where we have these set times when people can join us. 292 00:32:54,460 --> 00:33:00,250 Or of course, if you do raise a support request, we can of course just do those kinds of things ad hoc, 293 00:33:00,250 --> 00:33:05,820 as I said previously. And the other kinds of engagement that we do — 294 00:33:05,820 --> 00:33:17,140 I've just lost that page — are things like attending student welcome events and running any kind of presentation like this on request.
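On that Linux grounding the linked course provides: for anyone coming from Windows, the day-to-day work on a cluster is done with small shell commands like these. This is just an illustrative sketch of the kind of thing such a course covers, not material from the course itself.

```shell
# A few everyday Linux commands a new cluster user relies on.
mkdir -p results                          # make a directory for job output
echo "hello cluster" > results/out.txt    # write text into a file
cat results/out.txt                       # show the file's contents
grep -c cluster results/out.txt           # count matching lines: prints 1
```

Commands like these (plus `cd`, `ls`, and a text editor) are the working vocabulary assumed by the rest of the ARC documentation.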
295 00:33:17,140 --> 00:33:28,480 So we have the user documentation. Currently it's all hosted on www.arc.ox.ac.uk, and that's quite a nice site, 296 00:33:28,480 --> 00:33:33,700 but we're gradually starting to break that out and review what we have there in terms of 297 00:33:33,700 --> 00:33:41,110 the ARC user guide and software guide, and migrating them to Read the Docs type pages, 298 00:33:41,110 --> 00:33:50,320 hosted on GitHub. And that'll make it a little bit easier for users to take the information, export it as a PDF and turn it into a hard copy, 299 00:33:50,320 --> 00:33:56,080 as some people like to read those types of documents in that way. 300 00:33:56,080 --> 00:34:01,210 The documentation has, as I think I mentioned, 301 00:34:01,210 --> 00:34:12,490 all the links for getting a project and user accounts, and it also covers the sections on priority credits 302 00:34:12,490 --> 00:34:24,330 and the service level agreements for things like quality of service for different types of users. 303 00:34:24,330 --> 00:34:33,450 So we have quite a varied audience for the jobs that run on the ARC systems, as you can probably see. 304 00:34:33,450 --> 00:34:43,860 So, you know, we're not experts in any one of these areas, but there's quite a wide range of both applications on the system and experience of users. 305 00:34:43,860 --> 00:34:51,840 But there are lots and lots of common issues that they face in these types of HPC environments, on ARC for example. 306 00:34:51,840 --> 00:34:57,990 So yeah, there's lots of challenges there, and they tend to be the same types of ones. 307 00:34:57,990 --> 00:35:05,910 So, you know, it is mainly issues users have moving from their workstation to a system where, you know, 308 00:35:05,910 --> 00:35:10,170 they're not the only person running on it. They have to queue things up. 309 00:35:10,170 --> 00:35:16,650 They have to write batch scripts.
And so these are the kinds of things that we can help with. 310 00:35:16,650 --> 00:35:24,360 The software we've got available — the catalogue is quite a nice list of what we have. 311 00:35:24,360 --> 00:35:33,180 You might recognise some of the applications, for example R, Anaconda, those types of things. For example, 312 00:35:33,180 --> 00:35:42,240 with R we have about 1,100 libraries available in our R installation, and 313 00:35:42,240 --> 00:35:45,690 we make it quite easy so that you can actually install your own locally as well. 314 00:35:45,690 --> 00:35:52,170 So if you're particularly using R, that's useful. 315 00:35:52,170 --> 00:36:06,690 Anaconda is a nice way of packaging Python code, and you can use virtual environments or conda environments to actually package that up, 316 00:36:06,690 --> 00:36:10,740 and you can run a virtual environment in your own areas. 317 00:36:10,740 --> 00:36:15,210 And so for those types of things, we don't have to worry about installing software for you. 318 00:36:15,210 --> 00:36:28,980 We give you advice, and there's a nice walkthrough example of how to create conda environments on ARC. GCC and Intel 319 00:36:28,980 --> 00:36:33,700 are on here because they're two of our main compiler toolchains that we use: 320 00:36:33,700 --> 00:36:40,410 if you've got code you want to compile for yourself, you can use either of those two things, GCC or Intel. 321 00:36:40,410 --> 00:36:57,810 And those will be paired with things like the MKL type maths libraries for Intel, or the OpenBLAS plus LAPACK types of maths libraries for GCC. 322 00:36:57,810 --> 00:37:07,680 So this is just a nice slide showing the types of applications with their domains. 323 00:37:07,680 --> 00:37:15,420 So, you know, many people probably won't be looking at some of the chemistry or genomics type stuff.
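Going back to the conda environments mentioned a moment ago, the workflow looks roughly like the sketch below. The module, environment and package names here are illustrative, not necessarily ARC's exact setup; ARC's own walkthrough has the precise steps for their systems.

```shell
# Create and use a personal conda environment in your own area.
module load Anaconda3                        # make conda available (name may differ)
conda create -y -n myenv python=3.10 numpy   # build a personal environment
source activate myenv                        # switch into it
python -c "import numpy"                     # its packages are now on the path
```

Because the environment lives in the user's own area, nothing needs to be installed centrally, which is exactly the advice described above.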
324 00:37:15,420 --> 00:37:21,660 But, you know, there's quite a large number of applications on the system. 325 00:37:21,660 --> 00:37:26,820 As I say, there's over 500 now. I think we had 400 earlier on, but it's now over 500 326 00:37:26,820 --> 00:37:32,910 application modules available that are specifically for applications; 327 00:37:32,910 --> 00:37:40,640 there's over a thousand in total with all the supporting dependencies that these applications use. 328 00:37:40,640 --> 00:37:51,110 So when it comes to software support, we tend to be involved in looking at the applications that somebody wants to use, 329 00:37:51,110 --> 00:37:57,890 working out whether we can actually install them, and updating them on request. 330 00:37:57,890 --> 00:38:08,030 We can also provide software build assistance to users, and that tends to be, a little bit like I said earlier, 331 00:38:08,030 --> 00:38:13,220 where people have found software on the internet — they've found a GitHub repository — 332 00:38:13,220 --> 00:38:18,440 they're not very confident in building it themselves, and they'll just say to us, could you install this for us? 333 00:38:18,440 --> 00:38:21,540 And if we think it's something that a few users might 334 00:38:21,540 --> 00:38:26,790 find useful, then we'll install it centrally; otherwise we can help them install it in their own areas. 335 00:38:26,790 --> 00:38:35,310 And on average, we get about four or five application requests a week. Some of them are quite easy; 336 00:38:35,310 --> 00:38:40,600 we can just deal with them in a matter of hours. 337 00:38:40,600 --> 00:38:42,030 You know, it's quite easy to put them on. 338 00:38:42,030 --> 00:38:52,710 Some of them have been somewhat more problematic, and it's taken weeks to get a piece of software working as we'd want it to work on the system.
339 00:38:52,710 --> 00:38:58,800 One of the differences between a system like this and your own workstation is 340 00:38:58,800 --> 00:39:10,980 that we use a feature called Environment Modules to manage the software applications on the system. 341 00:39:10,980 --> 00:39:15,630 So you do a module load and then the name of the application that you want, 342 00:39:15,630 --> 00:39:23,710 and that sets the entire environment up so that you can actually run that piece of software. 343 00:39:23,710 --> 00:39:32,440 I'll give you a quick example, I think, here. So in this example, someone's trying to run R, and it's just not there. 344 00:39:32,440 --> 00:39:38,710 But if you then load the module for R and run R again, you can actually see that it's now on the system: 345 00:39:38,710 --> 00:39:43,270 you can run something and it works fine. And then you can unload the module, and 346 00:39:43,270 --> 00:39:45,400 it just completely disappears. 347 00:39:45,400 --> 00:39:54,400 So that's quite a nice way of managing your environment which, if you had to do it manually, would be quite cumbersome, 348 00:39:54,400 --> 00:39:58,780 especially with the number of modules that we have, and the fact that they can actually 349 00:39:58,780 --> 00:40:04,270 interfere with each other, because they've all been built with different libraries, different compilers and other dependencies. 350 00:40:04,270 --> 00:40:12,210 The fact that the module system allows us to make that consistent is very nice. 351 00:40:12,210 --> 00:40:20,160 When it comes to central software installation, we make our own lives a little bit easier by using a framework called EasyBuild, 352 00:40:20,160 --> 00:40:28,200 and that allows us to use known recipes for building, for the most part, open source software. 353 00:40:28,200 --> 00:40:34,950 It also allows us to install some commercial codes as well, but it's a little bit different for that.
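The R example just described looks something like this as a terminal session. It is an illustrative sketch — the exact module name and messages on ARC may differ:

```shell
# Environment Modules at work: software appears and disappears
# as modules are loaded and unloaded.
$ R --version
-bash: R: command not found    # R is not on the default path
$ module load R                # set up the whole environment for R
$ R --version                  # now it runs
$ module unload R              # unload the module again
$ R --version
-bash: R: command not found    # and it has completely disappeared
```

`module avail` lists what can be loaded, which is how one browses the several hundred application modules mentioned above.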
354 00:40:34,950 --> 00:40:38,610 But what this means is that 355 00:40:38,610 --> 00:40:45,570 it has a predefined set of known compiler and library toolchains. 356 00:40:45,570 --> 00:40:49,980 These are updated twice a year — an A and a B version each year — 357 00:40:49,980 --> 00:41:03,390 so that it has the latest and greatest versions of the GCC, Intel, PGI or NVIDIA HPC SDK type toolchains, and it makes those available. 358 00:41:03,390 --> 00:41:11,100 And then there are other recipes built on those, in order to allow an application to be installed. 359 00:41:11,100 --> 00:41:20,700 And this really helps with reproducibility, because a lot of other academic institutions, in Europe particularly, use EasyBuild. 360 00:41:20,700 --> 00:41:32,370 And so it gives you a known environment where, you know, if you're going to load that particular version of a module, it will work. 361 00:41:32,370 --> 00:41:37,700 Or at least it should. It does also mean that when you build with these, 362 00:41:37,700 --> 00:41:47,030 you've got a basic level of assurance that the application will function, because there are inbuilt tests at the end of the build. 363 00:41:47,030 --> 00:41:52,730 However, we do have a minority of applications that we have to install manually. 364 00:41:52,730 --> 00:42:01,490 These tend to be the restricted-licence applications — things like MATLAB and other types of codes. 365 00:42:01,490 --> 00:42:05,870 Also, there are codes that are licensed as source. 366 00:42:05,870 --> 00:42:12,950 So VASP is one that comes to mind as a source-licensed code, as are some others. 367 00:42:12,950 --> 00:42:19,610 And so they have to be restricted in their access. 368 00:42:19,610 --> 00:42:24,530 Commercially licensed codes are also handled differently.
369 00:42:24,530 --> 00:42:36,260 So you'll find that the module files will also point to a particular licence server, because ARC don't run, or rather don't own, very many licences. 370 00:42:36,260 --> 00:42:42,320 We tend to licence the Intel compilers and a couple of other things, 371 00:42:42,320 --> 00:42:47,630 but everything else is usually owned by another department or group, 372 00:42:47,630 --> 00:42:56,670 and so that needs to be protected so that the wrong people aren't using the wrong code. 373 00:42:56,670 --> 00:43:05,820 When it comes to assisting users who want to build their own code, it tends to be limited to customisation for ARC itself. 374 00:43:05,820 --> 00:43:11,310 We really can't — we don't have the bandwidth, given the number of staff that we have, to provide a 375 00:43:11,310 --> 00:43:17,550 large amount of RSE-type effort to get into code and help people parallelise code. 376 00:43:17,550 --> 00:43:23,040 We can give ideas and a little bit of help, but it's not something that we can easily get into. 377 00:43:23,040 --> 00:43:33,380 It does take a lot of time. One thing that we do find ourselves doing is actually optimising commercial 378 00:43:33,380 --> 00:43:38,540 codes, because there are a number of codes that run on the system. In this case it was CFX, 379 00:43:38,540 --> 00:43:45,530 which is a CFD code, and we found it was running quite poorly on our new system. 380 00:43:45,530 --> 00:43:51,770 And by changing the MPI stack — 381 00:43:51,770 --> 00:43:56,930 the message passing interface that the system uses to communicate between 382 00:43:56,930 --> 00:44:04,460 nodes — from the one that ANSYS supplied to one built locally on our system, 383 00:44:04,460 --> 00:44:08,840 we were able to dramatically improve its performance. 384 00:44:08,840 --> 00:44:14,570 The ideal being nice linear scaling, and we've got it pretty close.
385 00:44:14,570 --> 00:44:18,410 Previously, it was all over the place; it was horrible. Have we got the graph? Unfortunately not, 386 00:44:18,410 --> 00:44:30,700 but it was very poor. Something else that's become very popular recently is the concept of software containers, 387 00:44:30,700 --> 00:44:37,060 and people have heard quite a lot about Docker and Docker images. 388 00:44:37,060 --> 00:44:42,670 We don't support Docker on our system natively, 389 00:44:42,670 --> 00:44:54,760 due to some of the security issues with it. But we can convert Docker images into Singularity containers — we run Singularity on ARC. 390 00:44:54,760 --> 00:45:03,640 That's just a very nice way of being able to package up an application from your own workstation and know that it will actually run on the system, 391 00:45:03,640 --> 00:45:08,260 albeit you probably wouldn't be able to get a parallel run that way, 392 00:45:08,260 --> 00:45:17,830 but it certainly would be great for running multiple instances and doing some kind of Monte Carlo type simulation. 393 00:45:17,830 --> 00:45:28,240 And also, I should point out on that note that Singularity has been renamed to Apptainer, due to it joining the Linux Foundation, 394 00:45:28,240 --> 00:45:40,510 as of November. Dai touched on the fact that we use Slurm — the Simple Linux Utility for Resource Management — 395 00:45:40,510 --> 00:45:47,410 and that's how a user specifies what resources they require for their job. 396 00:45:47,410 --> 00:45:57,130 So you will need to craft a little script to run the application that you want to use, 397 00:45:57,130 --> 00:46:02,770 and in that you have to specify a number of resources that the job requires. 398 00:46:02,770 --> 00:46:07,090 That could be the number of CPUs required, or a large amount of memory.
399 00:46:07,090 --> 00:46:14,380 Maybe it needs a GPU, and maybe you have a reservation or some other similar quality of service requirement. 400 00:46:14,380 --> 00:46:21,610 Once you feed all of that information into your script, Slurm will see it and will make a sensible, 401 00:46:21,610 --> 00:46:25,330 hopefully, decision on where that job runs. 402 00:46:25,330 --> 00:46:31,330 And it may be that the job can run immediately somewhere, or it may have to be queued until the resources are available. 403 00:46:31,330 --> 00:46:38,680 So that's what the resource management does for us. 404 00:46:38,680 --> 00:46:45,370 For any more information on that, if you want to work out how to create a Slurm script, 405 00:46:45,370 --> 00:46:53,530 the best place to learn is the Intro to ARC course, which we run every couple of weeks. 406 00:46:53,530 --> 00:47:02,500 Dai also mentioned the JADE 2 system. We, as a team, provide local support for Oxford users of JADE. 407 00:47:02,500 --> 00:47:11,830 Unfortunately — well, I say unfortunately; that's not fair — we can't provide direct systems administration of the JADE system. 408 00:47:11,830 --> 00:47:19,160 It's not run by us, so we don't quite have the same level of access as we have on the ARC systems. 409 00:47:19,160 --> 00:47:27,580 So when it comes to helping users on ARC with a problem, we usually can get in and help: go into your directories, 410 00:47:27,580 --> 00:47:31,750 have a look at logs, make changes, and help that way. 411 00:47:31,750 --> 00:47:37,480 But we can't do that on JADE, so we have to ask you to raise a ticket with Hartree, 412 00:47:37,480 --> 00:47:42,490 which we can then monitor and help and add information to. 413 00:47:42,490 --> 00:47:50,710 But in order to maybe help with that, we also have some DGX MAX-Q nodes, as I mentioned earlier on, in our HTC cluster.
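To make the batch-script idea concrete, a minimal Slurm submission script of the sort described might look like the sketch below. The partition, module and resource values are illustrative, not ARC's exact configuration:

```shell
#!/bin/bash
#SBATCH --job-name=example      # a name to spot in the queue
#SBATCH --ntasks=1              # one task (process)
#SBATCH --cpus-per-task=4       # CPUs for that task
#SBATCH --mem=8G                # memory for the whole job
#SBATCH --time=01:00:00         # wall-clock time limit
#SBATCH --gres=gpu:1            # request a GPU, if the job needs one
#SBATCH --partition=short       # illustrative partition name

module load R                   # set up the software environment, as before
Rscript my_analysis.R           # the actual work
```

The script would be submitted with `sbatch` and the queue inspected with `squeue`. This is also the answer to the earlier question about node classes: directives such as `--partition` and `--gres` are how a particular class of node, GPU nodes included, is selected.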
414 00:47:50,710 --> 00:47:56,530 So it's probably quite a good idea to run any jobs locally on ARC 415 00:47:56,530 --> 00:48:03,130 first, before then scaling up and running them on the JADE system. 416 00:48:03,130 --> 00:48:18,450 Because, once again, the DGXs in JADE tend to run containers produced by NVIDIA, which we can also use. 417 00:48:18,450 --> 00:48:25,980 So I think that's it from me. Any questions? 418 00:48:25,980 --> 00:48:35,340 Great. Thank you very much, Andy and Dai — that was a wonderful presentation. A round of applause. 419 00:48:35,340 --> 00:48:40,288 Thank you, thank you.