WEBVTT 00:00.000 --> 00:09.000 All right, let's get started with Thomas and JUBE. 00:09.000 --> 00:11.000 Thank you. 00:11.000 --> 00:13.000 Good morning, everyone. 00:13.000 --> 00:15.000 I'm Thomas. 00:15.000 --> 00:18.000 Thank you. 00:18.000 --> 00:22.000 I'm working for the Jülich Supercomputing Centre, 00:22.000 --> 00:25.000 where we are hosting the first exascale system of Europe. 00:25.000 --> 00:28.000 And if you're one of the people collecting stickers, 00:28.000 --> 00:30.000 I might have something for you. 00:30.000 --> 00:32.000 Yeah, it's my first time in Boston, 00:32.000 --> 00:34.000 and today I'm going to talk about JUBE, 00:34.000 --> 00:38.000 which is an environment to run systematic benchmarks 00:38.000 --> 00:41.000 and scientific workflows. 00:41.000 --> 00:43.000 So benchmarks, I would say, 00:43.000 --> 00:45.000 are a kind of subcategory of workflows, 00:45.000 --> 00:49.000 so we are executing some workflows in a very systematic way, 00:49.000 --> 00:51.000 with several dependent steps, 00:51.000 --> 00:54.000 and this is usually done by different kinds of user groups. 00:54.000 --> 00:56.000 For example, the system admins, 00:56.000 --> 01:01.000 they use it for procurements or to ensure the stability of the system, 01:01.000 --> 01:06.000 or support staff to detect changes in the behavior of the system, 01:06.000 --> 01:10.000 or to reproduce problems that are reported by users, 01:10.000 --> 01:13.000 or the users themselves, to test their software 01:13.000 --> 01:18.000 and its functionality, or to run their scientific use cases in the end. 01:18.000 --> 01:21.000 And one can do this in different ways, 01:21.000 --> 01:24.000 in better and worse ones: 01:24.000 --> 01:27.000 manually, so you have nothing scripted.
01:27.000 --> 01:29.000 You do everything by hand, 01:29.000 --> 01:31.000 you barely have anything documented, 01:31.000 --> 01:33.000 you are running into problems, 01:33.000 --> 01:35.000 like overwriting your own data in the worst case, 01:35.000 --> 01:38.000 so that's probably not the best way to do it. 01:38.000 --> 01:39.000 Then you have something 01:39.000 --> 01:40.000 that is semi-scripted, 01:40.000 --> 01:42.000 very benchmark-specific; 01:42.000 --> 01:44.000 this is also hard to port 01:44.000 --> 01:47.000 and to give to your colleague, 01:47.000 --> 01:49.000 and they have to guess what you thought, 01:49.000 --> 01:51.000 how the script is supposed to work, 01:51.000 --> 01:53.000 so there's no standard at all. 01:53.000 --> 01:56.000 And then there are more generic tools 01:56.000 --> 01:58.000 for configuring benchmarks, 01:58.000 --> 02:03.000 and this is where JUBE comes into the game. 02:03.000 --> 02:06.000 Let's start with a little bit of history. 02:06.000 --> 02:09.000 I haven't been in HPC for so long, 02:09.000 --> 02:11.000 but the colleagues who started with JUBE, 02:11.000 --> 02:13.000 they told me it started around 2004 02:13.000 --> 02:16.000 with a procurement of a big system we had at that time, 02:16.000 --> 02:18.000 and this was purely Perl-based, 02:18.000 --> 02:21.000 and it was re-implemented in 2014 02:22.000 --> 02:23.000 in Python, 02:23.000 --> 02:27.000 and we are currently at version 2.7.1. 02:27.000 --> 02:29.000 And a little bit more about the history: 02:29.000 --> 02:32.000 JUBE has been the chosen framework 02:32.000 --> 02:35.000 for European benchmarking 02:35.000 --> 02:38.000 since the first years after it was published, 02:38.000 --> 02:40.000 in DEISA and in the PRACE 02:40.000 --> 02:44.000 preparatory phase and in several implementation phases. 02:44.000 --> 02:48.000 And it has also been used in several projects, 02:48.000 --> 02:50.000 nationally and internationally.
02:50.000 --> 02:52.000 Here's a small selection of logos; 02:52.000 --> 02:54.000 you might recognize one or the other, 02:54.000 --> 02:58.000 but this list is not complete. 02:58.000 --> 03:00.000 So what's JUBE actually? 03:00.000 --> 03:03.000 It's a generic, lightweight, configurable 03:03.000 --> 03:04.000 environment to run, 03:04.000 --> 03:08.000 monitor and analyze application execution in a systematic way. 03:08.000 --> 03:10.000 What that means will hopefully become clearer 03:10.000 --> 03:12.000 in the following slides. 03:12.000 --> 03:15.000 And I've already mentioned a few use cases for it; 03:15.000 --> 03:17.000 there are also parameter studies, 03:17.000 --> 03:20.000 production scenarios and more. 03:20.000 --> 03:22.000 We publish it as open source; 03:22.000 --> 03:24.000 it's available on GitHub, 03:24.000 --> 03:27.000 but also in other ways, 03:27.000 --> 03:29.000 which I will come to later. 03:29.000 --> 03:31.000 And a few key concepts, 03:31.000 --> 03:33.000 which I will explain in the following slides, 03:33.000 --> 03:36.000 are workflow creation based on dependent steps, 03:36.000 --> 03:38.000 parameter expansion, 03:38.000 --> 03:40.000 and pattern substitution in files; 03:40.000 --> 03:42.000 we create a generic directory structure 03:42.000 --> 03:44.000 to avoid overwriting of data 03:44.000 --> 03:47.000 and to keep things documented automatically. 03:47.000 --> 03:50.000 So what would a generic, 03:50.000 --> 03:52.000 basic workflow look like? 03:52.000 --> 03:53.000 You have your data, 03:53.000 --> 03:55.000 you have your application, your source code, 03:55.000 --> 03:58.000 but what you would have to provide is a 03:58.000 --> 03:59.000 JUBE configuration file.
03:59.000 --> 04:02.000 You write it either in XML or in YAML; 04:02.000 --> 04:04.000 I will base this talk on XML, 04:04.000 --> 04:05.000 but don't be afraid of it, 04:05.000 --> 04:06.000 there's also YAML 04:06.000 --> 04:08.000 if you prefer that one. 04:08.000 --> 04:09.000 And so it's also not a free lunch, 04:09.000 --> 04:11.000 so you have to provide this file, 04:11.000 --> 04:12.000 but once you have it, 04:12.000 --> 04:13.000 it's standardized, 04:13.000 --> 04:16.000 and basically, once you've understood JUBE, 04:16.000 --> 04:19.000 you might understand all the other benchmarks 04:19.000 --> 04:21.000 that are implemented in JUBE. 04:21.000 --> 04:22.000 So in this example, 04:22.000 --> 04:24.000 we have configured our benchmark 04:24.000 --> 04:28.000 such that it has a first compile step, 04:28.000 --> 04:30.000 and this should create two work packages: 04:30.000 --> 04:31.000 one 04:31.000 --> 04:33.000 should compile the sources with -O1, 04:33.000 --> 04:35.000 and the second one with -O3. 04:35.000 --> 04:38.000 And we would like to have a follow-up step 04:38.000 --> 04:40.000 that depends on the compile step, 04:40.000 --> 04:44.000 and we run a scaling experiment for each of the executables. 04:44.000 --> 04:47.000 We would like to see how it performs on two, 04:47.000 --> 04:50.000 four, and eight thousand CPUs. 04:50.000 --> 04:54.000 And then we want to extract some data from the logs; 04:54.000 --> 04:55.000 in this example, 04:55.000 --> 04:57.000 we are interested in the runtime, 04:57.000 --> 05:01.000 and we provide this either in a table or in a database. 05:01.000 --> 05:05.000 So this is the most basic workflow 05:05.000 --> 05:07.000 that is implemented in JUBE. 05:07.000 --> 05:09.000 But in the following slides, 05:09.000 --> 05:13.000 I will go a little bit more into detail on how we do this.
05:13.000 --> 05:16.000 So this can be either in YAML or in XML; here, 05:16.000 --> 05:17.000 it's in XML, 05:17.000 --> 05:19.000 and you define parameters, 05:19.000 --> 05:20.000 give them a name 05:20.000 --> 05:22.000 and a value or a series of values, 05:22.000 --> 05:23.000 like -O1 and -O3, 05:23.000 --> 05:25.000 and if you have a comma-separated list, 05:25.000 --> 05:27.000 JUBE makes sure that, 05:27.000 --> 05:29.000 where you use this parameter, 05:29.000 --> 05:31.000 it creates two work packages. 05:31.000 --> 05:35.000 So one directory for each of these pipelines 05:35.000 --> 05:37.000 that JUBE is going to execute. 05:37.000 --> 05:39.000 And if you have another parameter, 05:39.000 --> 05:41.000 also with a comma-separated list, 05:41.000 --> 05:43.000 JUBE spans out the tree even further. 05:43.000 --> 05:46.000 So just by adding another comma and another value, 05:46.000 --> 05:49.000 you can easily scale it up to 05:49.000 --> 05:50.000 more compile options, 05:50.000 --> 05:52.000 more execution options, 05:52.000 --> 05:55.000 or whatever your workflow looks like. 05:55.000 --> 05:58.000 And file substitution is another key feature. 05:58.000 --> 06:01.000 We define so-called subs, 06:01.000 --> 06:03.000 and then we define a unique pattern, 06:03.000 --> 06:05.000 or a non-unique pattern; 06:05.000 --> 06:07.000 JUBE searches for it in the file that you give to it 06:07.000 --> 06:11.000 and replaces it either by one of the values 06:11.000 --> 06:13.000 of the parameters defined here, 06:13.000 --> 06:15.000 or by a static string, for example. 06:15.000 --> 06:18.000 And then it replaces the values in the file.
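The parameter expansion and file substitution just described could look roughly like this in JUBE's XML format (a sketch; names such as `compileset`, `opt`, and the `#OPT#` pattern are illustrative placeholders, not taken from the slides):

```xml
<!-- Sketch of a JUBE parameter set and substitution set. -->
<parameterset name="compileset">
  <!-- Comma-separated list: JUBE creates one work package per value. -->
  <parameter name="opt">O1,O3</parameter>
</parameterset>

<substituteset name="subs">
  <!-- Read the template Makefile.in, write Makefile with patterns replaced. -->
  <iofile in="Makefile.in" out="Makefile" />
  <!-- Replace the unique pattern #OPT# by the current parameter value. -->
  <sub source="#OPT#" dest="-$opt" />
</substituteset>
```

Extending the list, e.g. to `O1,O2,O3`, would span out the tree further: one work package per value, wherever the parameter is used.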
06:18.000 --> 06:19.000 So you could think of 06:19.000 --> 06:21.000 having a template of a job script, 06:21.000 --> 06:23.000 and you want to do the scaling experiment, 06:23.000 --> 06:26.000 and JUBE creates three or five or however many 06:26.000 --> 06:28.000 job scripts you want, 06:28.000 --> 06:31.000 just by replacing the values in this template. 06:31.000 --> 06:36.000 The next one was the unique directory structure. 06:36.000 --> 06:40.000 So every time you run jube run with your benchmark file, 06:40.000 --> 06:44.000 JUBE creates a subdirectory to prevent 06:44.000 --> 06:47.000 you from overwriting your previous runs, 06:47.000 --> 06:49.000 because then you wouldn't have them reproducible, 06:49.000 --> 06:50.000 you wouldn't have them documented, 06:50.000 --> 06:54.000 and you might want to look into them at a later point. 06:54.000 --> 06:56.000 So you run jube run, 06:56.000 --> 06:58.000 and you get a new ID, 06:58.000 --> 07:01.000 and for every one of these IDs you see this structure. 07:01.000 --> 07:04.000 So you have some internal configuration files, 07:04.000 --> 07:06.000 which we can ignore at this point, 07:06.000 --> 07:10.000 and on top some directories; 07:10.000 --> 07:12.000 each directory is one work package, 07:12.000 --> 07:15.000 for example, the compile with -O1, 07:15.000 --> 07:16.000 the compile with -O3, 07:16.000 --> 07:18.000 and the different executions. 07:18.000 --> 07:21.000 So they don't interfere with each other. 07:21.000 --> 07:24.000 And within those directories, there's more structure, 07:24.000 --> 07:27.000 and at the end is where the magic happens: 07:27.000 --> 07:30.000 log files, and probably your source code, 07:30.000 --> 07:31.000 your executables, 07:31.000 --> 07:37.000 further scripts, or whatever you want JUBE to do with your data. 07:37.000 --> 07:41.000 I will guide you through this example script. 07:41.000 --> 07:44.000 I hope it's not too small.
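The directory layout being described looks roughly like this (an illustrative sketch; the exact file and directory names may differ between JUBE versions):

```
bench_run/                     # output path from the benchmark file
├── 000000/                    # first jube run: fresh ID, nothing overwritten
│   ├── configuration.xml      # internal configuration, can be ignored here
│   ├── 000000_compile/        # work package: compile with -O1
│   │   └── work/              # sources, Makefile, executable, logs
│   ├── 000001_compile/        # work package: compile with -O3
│   │   └── work/
│   └── ...                    # one directory per execution work package
└── 000001/                    # the next jube run gets the next ID
```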
07:44.000 --> 07:47.000 Let's start here at the bottom. 07:47.000 --> 07:49.000 So we define a step, 07:49.000 --> 07:51.000 and all the rest is just configuration, 07:51.000 --> 07:54.000 or definitions of things that JUBE should do, 07:54.000 --> 07:57.000 but the actual magic happens in the step itself. 07:57.000 --> 08:00.000 So we have created a compile step, 08:00.000 --> 08:03.000 and on the right side you will see what JUBE does with it. 08:03.000 --> 08:05.000 So the first thing that JUBE does: 08:05.000 --> 08:07.000 it uses the compile set. 08:07.000 --> 08:09.000 The compile set is defined here, 08:09.000 --> 08:11.000 and we define two parameters. 08:11.000 --> 08:14.000 The first is exec, with a string value like my_exec, 08:14.000 --> 08:17.000 and we have another parameter with the comma-separated list 08:17.000 --> 08:19.000 that we have seen already. 08:19.000 --> 08:21.000 So what JUBE does is 08:22.000 --> 08:24.000 recognize that it's a comma-separated list. 08:24.000 --> 08:28.000 The comma can also be replaced by any other arbitrary character, 08:28.000 --> 08:32.000 if you have commas in the strings that you want to use. 08:32.000 --> 08:35.000 So JUBE creates two directories, two work packages, 08:35.000 --> 08:38.000 one with -O1 and the other one with -O3. 08:38.000 --> 08:43.000 So we have gone through the parameter sets that we have used, 08:43.000 --> 08:48.000 and then the next thing that JUBE executes is using the sources. 08:48.000 --> 08:50.000 The sources are a file set, 08:50.000 --> 08:55.000 and this can be used to link or to copy some data, 08:55.000 --> 08:58.000 and in this case it copies all the data that is in the files tag: 08:58.000 --> 09:00.000 there is a C directory, 09:00.000 --> 09:02.000 and these are your sources, your .cpp files or something, 09:02.000 --> 09:06.000 and the template Makefile.in.
09:06.000 --> 09:09.000 And it copies them into both of the work packages, 09:09.000 --> 09:12.000 because it knows it has created these work packages. 09:12.000 --> 09:14.000 So we are done with the file set. 09:14.000 --> 09:17.000 The next thing, we use the substitution set, 09:18.000 --> 09:21.000 and these are the substitutions that I've mentioned before. 09:21.000 --> 09:26.000 So we define a file where JUBE should look for patterns, 09:26.000 --> 09:28.000 and we can define our output file. 09:28.000 --> 09:30.000 In this case, the input file is Makefile.in, 09:30.000 --> 09:32.000 which we have already in our directories, 09:32.000 --> 09:34.000 so JUBE finds it, 09:34.000 --> 09:38.000 and it will create a new file where it replaces everything 09:38.000 --> 09:40.000 that is defined in the following lines. 09:40.000 --> 09:43.000 It can be the same file, if you want to overwrite it, 09:43.000 --> 09:45.000 but here we just want to create a new file, 09:45.000 --> 09:48.000 and here we search for a hash pattern 09:48.000 --> 09:52.000 and replace it with the value that is stored in this parameter here. 09:52.000 --> 09:56.000 And we end up with a Makefile in both directories, 09:56.000 --> 10:02.000 and the last thing that is to be done in the step is: 10:02.000 --> 10:05.000 we don't use anything, but we execute something. 10:05.000 --> 10:06.000 This is marked with the do, 10:06.000 --> 10:10.000 and the do's execute shell commands, 10:10.000 --> 10:14.000 as if you would do it on the command line. 10:14.000 --> 10:19.000 And here we run our make command with an argument 10:19.000 --> 10:23.000 that gets its value from the parameter here. 10:23.000 --> 10:27.000 So in one case, we end up with an executable compiled with -O1, 10:27.000 --> 10:29.000 and in the other case with -O3. 10:29.000 --> 10:33.000 And this was the last thing that we had 10:33.000 --> 10:34.000 to do in the two paths of the compile step.
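Putting the pieces together, the compile step being walked through could be sketched like this in JUBE XML (illustrative; set and file names are placeholders):

```xml
<!-- Sketch of the compile step: parameters, copied files, substitution,
     and a shell command, as described in the talk. -->
<fileset name="sources">
  <copy>src/*.cpp</copy>      <!-- your C++ sources -->
  <copy>Makefile.in</copy>    <!-- the template to be substituted -->
</fileset>

<step name="compile">
  <use>compileset</use>  <!-- parameter set: one work package per opt value -->
  <use>sources</use>     <!-- file set: copied into each work package -->
  <use>subs</use>        <!-- substitution set: Makefile.in -> Makefile -->
  <do>make OPT=-$opt</do><!-- a plain shell command, as on the command line -->
</step>
```

Each `<use>` pulls a previously defined set into the step, and `<do>` runs the actual shell command inside every work package directory.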
10:34.000 --> 10:38.000 So we have our first two executables compiled. 10:39.000 --> 10:41.000 And then JUBE will continue. 10:41.000 --> 10:44.000 It finds another step, 10:44.000 --> 10:47.000 and this step depends on the compile step. 10:47.000 --> 10:50.000 So JUBE knows that for the compile step, 10:50.000 --> 10:52.000 we already have two work packages. 10:52.000 --> 10:55.000 So it will also create one directory, 10:55.000 --> 11:00.000 at least one directory, for each compile. 11:00.000 --> 11:05.000 And then JUBE also knows that this directory should be linked 11:05.000 --> 11:09.000 to the -O1 version, and this directory should be linked to the -O3 version. 11:09.000 --> 11:11.000 So that if we execute something here, 11:11.000 --> 11:15.000 JUBE can go via this symbolic link into this directory 11:15.000 --> 11:19.000 and execute the executable that we have compiled there. 11:19.000 --> 11:22.000 And there are more things; 11:22.000 --> 11:24.000 the script is not complete. 11:24.000 --> 11:26.000 We want to analyze things, we can do much more, 11:26.000 --> 11:29.000 but this is really the very basic structure 11:29.000 --> 11:33.000 with some of the key components that JUBE has. 11:34.000 --> 11:38.000 Also, a few command line arguments here. 11:38.000 --> 11:41.000 We can execute the script with jube run on the command line, 11:41.000 --> 11:43.000 either the XML script or the YAML one. 11:43.000 --> 11:47.000 The functionalities are like 99.9% the same, 11:47.000 --> 11:51.000 just a little bit different in the YAML version for a very specific keyword, 11:51.000 --> 11:54.000 but that's not worth mentioning, 11:54.000 --> 11:58.000 and we can also query previous runs.
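The dependent step just described could be sketched like this (illustrative; `runset`, `exec`, and `tasks` are placeholder names):

```xml
<!-- Sketch: a step depending on compile. JUBE creates at least one
     work package per compile work package and links them together. -->
<step name="run" depend="compile">
  <use>runset</use>   <!-- e.g. a tasks parameter like 2,4,8 -->
  <!-- "compile/" is the symbolic link into the matching compile
       work package, so the executable built there can be run here. -->
  <do>compile/$exec $tasks</do>
</step>
```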
11:58.000 --> 12:01.000 So if we want to know which parameter has been set, 12:02.000 --> 12:05.000 or whatever kind of information JUBE knows, 12:05.000 --> 12:07.000 we can query it from the command line, 12:07.000 --> 12:09.000 even after a benchmark has been run, 12:09.000 --> 12:10.000 or while the benchmark is running; 12:10.000 --> 12:13.000 we have the command line help to access the glossary 12:13.000 --> 12:16.000 and many more command line options. 12:16.000 --> 12:18.000 So I hope that by now 12:18.000 --> 12:20.000 you have gotten a bit of an idea 12:20.000 --> 12:23.000 of what JUBE can do, and how it does it. 12:23.000 --> 12:28.000 Now, I would like to go into a few more use case examples 12:28.000 --> 12:30.000 that we have, especially at JSC. 12:31.000 --> 12:35.000 I've already mentioned that we are hosting the exascale system JUPITER, 12:35.000 --> 12:37.000 and in the procurement process, 12:37.000 --> 12:40.000 we had to come up with a big list of benchmarks; 12:40.000 --> 12:42.000 people who are involved in procurements 12:42.000 --> 12:44.000 might have done similar things. 12:44.000 --> 12:47.000 So what we did: we integrated several applications 12:47.000 --> 12:50.000 and synthetic benchmarks in JUBE 12:50.000 --> 12:53.000 and defined them for different execution targets, 12:53.000 --> 12:56.000 like the CPU, the GPU, or both together, 12:56.000 --> 13:00.000 and yeah, we published this benchmark suite on GitHub, 13:00.000 --> 13:02.000 so if you want to get started with JUBE, 13:02.000 --> 13:03.000 or if you're using JUBE, 13:03.000 --> 13:05.000 this might be a nice place to look at, 13:05.000 --> 13:09.000 because there are some recipes that you can use. 13:09.000 --> 13:12.000 So it's nice to have this benchmark suite, 13:12.000 --> 13:14.000 but we also want to make use of it.
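The command line usage mentioned above looks roughly like this (a usage sketch assuming JUBE is installed; `bench_run` stands for the output path defined in the benchmark file):

```
jube run benchmark.xml              # start a new run; gets a fresh ID
jube continue bench_run --id last   # continue pending steps of the last run
jube analyse bench_run --id last    # extract patterns from the logs
jube result bench_run --id last     # print the result table
jube info bench_run --id 0          # query parameters of a previous run
jube help                           # access the glossary and help topics
```

The same commands work with a YAML benchmark file instead of the XML one.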
13:14.000 --> 13:17.000 So we use this benchmark suite 13:17.000 --> 13:19.000 in a continuous benchmarking environment 13:19.000 --> 13:21.000 to get insights into system health. 13:21.000 --> 13:23.000 What do I mean by that? 13:23.000 --> 13:25.000 So we have written CI scripts, 13:25.000 --> 13:28.000 with components like ExaCB that we use, 13:28.000 --> 13:30.000 but you can also do it without them. 13:30.000 --> 13:33.000 Then we go via Jacamar onto our HPC systems, 13:33.000 --> 13:36.000 and then we execute all the JUBE scripts that we have. 13:36.000 --> 13:42.000 We extract the information that we want from this particular example, 13:42.000 --> 13:44.000 and we push it back. 13:44.000 --> 13:46.000 In this case, it's a CSV, 13:46.000 --> 13:48.000 and we use another piece of software 13:48.000 --> 13:50.000 at JSC, which is called LLview, 13:50.000 --> 13:52.000 which we use 13:52.000 --> 13:55.000 usually for system monitoring and reporting 13:55.000 --> 13:56.000 of jobs, 13:56.000 --> 13:59.000 but it also has the capability to 13:59.000 --> 14:04.000 flexibly and generically plot or visualize data. 14:04.000 --> 14:06.000 You can also do this in other ways, 14:06.000 --> 14:08.000 but looking at CSV files, 14:08.000 --> 14:10.000 it's very hard to get some insights, 14:10.000 --> 14:13.000 so we want to have some more interactive visualization 14:13.000 --> 14:15.000 of the data that we get 14:15.000 --> 14:17.000 from running these benchmarks continuously.
14:17.000 --> 14:19.000 So I will give a few examples here. 14:19.000 --> 14:21.000 So we have a list of all benchmarks, 14:21.000 --> 14:23.000 we see their status, 14:23.000 --> 14:25.000 how they performed, 14:25.000 --> 14:28.000 and then we can click on the individual benchmarks; 14:28.000 --> 14:30.000 here we look at HPCG 14:30.000 --> 14:32.000 on a specific system 14:32.000 --> 14:33.000 at JSC, 14:33.000 --> 14:35.000 and everything that you see here, 14:35.000 --> 14:37.000 like the table and the plots, is highly configurable. 14:37.000 --> 14:38.000 So it's not fixed; 14:38.000 --> 14:40.000 you will see this also in the following 14:40.000 --> 14:41.000 plots, 14:41.000 --> 14:43.000 it doesn't depend on the metrics or so, 14:43.000 --> 14:45.000 it's completely configurable. 14:45.000 --> 14:47.000 So what we see here is, 14:47.000 --> 14:49.000 for example, the gigaflops 14:49.000 --> 14:51.000 from HPCG; on the right side, 14:51.000 --> 14:53.000 the runtime; on the x-axis, 14:53.000 --> 14:55.000 the time from, like, 14:55.000 --> 14:57.000 the beginning of December 14:57.000 --> 14:59.000 until, like, last week, 14:59.000 --> 15:01.000 and if we hover over the graphs, 15:01.000 --> 15:03.000 we get even more data; 15:03.000 --> 15:05.000 we call them annotations, 15:05.000 --> 15:08.000 and we can, for example, 15:08.000 --> 15:12.000 see which job ID this data has been taken from, 15:12.000 --> 15:14.000 or any other kind of data that you have available 15:14.000 --> 15:15.000 in your benchmarks 15:15.000 --> 15:17.000 could be visualized here; 15:17.000 --> 15:19.000 same for another kind of benchmark, 15:19.000 --> 15:20.000 this is STREAM; 15:20.000 --> 15:22.000 here we don't show gigaflops, 15:22.000 --> 15:25.000 but we have the bandwidth on the left and on the right, 15:25.000 --> 15:26.000 and the last one is: 15:26.000 --> 15:28.000 we can not only show time series on the x-axis, 15:28.000 -->
15:30.000 but also more of a scaling plot, 15:30.000 --> 15:32.000 also from the STREAM benchmark, 15:32.000 --> 15:34.000 and here we scale over the threads 15:34.000 --> 15:36.000 per task, for example. 15:36.000 --> 15:40.000 So this is how we use our benchmark suite 15:40.000 --> 15:43.000 to also get some continuous insights into our systems. 15:44.000 --> 15:46.000 But that's about the benchmarking side; 15:46.000 --> 15:50.000 we also use JUBE for scientific workflows, 15:50.000 --> 15:51.000 complex ones, 15:51.000 --> 15:55.000 and here I brought you one from the energy system modeling 15:55.000 --> 15:56.000 field. 15:56.000 --> 15:58.000 Usually the state of the art there 15:58.000 --> 16:01.000 is to run dozens of scenarios, 16:01.000 --> 16:04.000 to analyze 16:04.000 --> 16:08.000 or evaluate scenarios for the future. 16:08.000 --> 16:10.000 So the further you go into the future, 16:10.000 --> 16:12.000 the bigger the uncertainty space gets, 16:12.000 --> 16:16.000 and usually the more scenarios you need to really find 16:16.000 --> 16:20.000 the optimal scenario for a future case. 16:20.000 --> 16:23.000 But the state of the art only runs dozens, 16:23.000 --> 16:24.000 and what we did: 16:24.000 --> 16:27.000 we used HPC and ran more than 11,000, 16:27.000 --> 16:30.000 which is quite a lot more than the state of the art in that field, 16:30.000 --> 16:33.000 and we did this for the German power system, 16:33.000 --> 16:36.000 and this is where JUBE comes into play here. 16:36.000 --> 16:38.000 I'm not going into the details of this workflow, 16:38.000 --> 16:41.000 but this should just indicate that 16:41.000 --> 16:44.000 we have many different kinds of data formats 16:44.000 --> 16:46.000 and different kinds of applications, 16:46.000 --> 16:49.000 and JUBE orchestrates them all 16:49.000 --> 16:52.000 and runs them on HPC.
16:52.000 --> 16:55.000 And this study has recently been published, 16:55.000 --> 16:57.000 right before Christmas, in Nature 16:57.000 --> 17:00.000 Communications, with several project partners. 17:00.000 --> 17:02.000 Okay. 17:02.000 --> 17:05.000 So now I hope you got a first impression 17:05.000 --> 17:07.000 of how you can implement things in JUBE 17:07.000 --> 17:09.000 and what it can be used for. 17:09.000 --> 17:12.000 So I'd also like to get more into the software side of things: 17:12.000 --> 17:14.000 what's the current status? 17:14.000 --> 17:17.000 The main development is still done in an internal GitLab, 17:17.000 --> 17:20.000 due to some historical reasons; we started a long time ago 17:20.000 --> 17:21.000 in SVN, 17:21.000 --> 17:22.000 and then we moved to GitLab, 17:22.000 --> 17:25.000 and at some point we also opened a repository on GitHub, 17:25.000 --> 17:28.000 but the transition is still ongoing. 17:28.000 --> 17:30.000 And we have some open issues internally, 17:30.000 --> 17:34.000 but they are really almost all feature requests, 17:34.000 --> 17:38.000 and these feature requests usually do not reach us via GitHub 17:38.000 --> 17:43.000 issues; usually the interested people reach out to us 17:43.000 --> 17:45.000 as developers directly. 17:45.000 --> 17:51.000 That's why you don't see very much activity on GitHub there. 17:51.000 --> 17:53.000 And the last point is more... 17:53.000 --> 17:56.000 maybe someone can help me here with that. 17:56.000 --> 17:59.000 We have tried to publish JUBE on PyPI, 17:59.000 --> 18:02.000 but they claim jube is a prohibited name. 18:02.000 --> 18:04.000 So if you know someone from PyPI... 18:04.000 --> 18:06.000 we opened an issue a few years ago, 18:06.000 --> 18:10.000 and there's not much movement there. 18:10.000 --> 18:13.000 And what are we working on currently?
18:13.000 --> 18:16.000 We are actively working on version 3, 18:16.000 --> 18:20.000 so it's a bit more than 10 years after we made the change 18:20.000 --> 18:22.000 from version 1 to 2. 18:22.000 --> 18:26.000 And here we would like to change some internal status files 18:26.000 --> 18:29.000 into a database to increase or improve the data 18:29.000 --> 18:31.000 persistence, the consistency, 18:31.000 --> 18:32.000 and these things, 18:32.000 --> 18:35.000 and also to gain more speed when we need to 18:35.000 --> 18:37.000 access the data, 18:37.000 --> 18:39.000 and also some new 18:39.000 --> 18:40.000 features; 18:40.000 --> 18:41.000 for example, 18:41.000 --> 18:45.000 you can then not only push data into a database 18:45.000 --> 18:46.000 or into plain text, 18:46.000 --> 18:52.000 but JUBE can also use Matplotlib to directly plot some figures, 18:52.000 --> 18:56.000 and we have some more things coming up there. 18:56.000 --> 18:59.000 Then, I guess like all other projects, 18:59.000 --> 19:01.000 we also want to extend and improve our code testing; 19:01.000 --> 19:03.000 we are not doing that badly with it, 19:03.000 --> 19:06.000 but still, there's always room for improvement, 19:06.000 --> 19:09.000 and of course we keep implementing new features 19:09.000 --> 19:12.000 that are requested by interested people. 19:12.000 --> 19:14.000 And as I've mentioned, 19:14.000 --> 19:16.000 the transition to GitHub is ongoing; 19:16.000 --> 19:19.000 we are still thinking about how to do this in the best way, 19:19.000 --> 19:24.000 but that's also something we are actively looking into. 19:24.000 --> 19:26.000 Then, about the availability of JUBE: 19:26.000 --> 19:28.000 we have it as a tarball or on GitHub, 19:28.000 --> 19:32.000 and you can even install it via EasyBuild or Spack.
19:32.000 --> 19:36.000 We have online documentation, 19:36.000 --> 19:37.000 an advanced 19:37.000 --> 19:38.000 and a beginner tutorial, 19:38.000 --> 19:39.000 an FAQ, 19:39.000 --> 19:40.000 a glossary, 19:40.000 --> 19:42.000 and more on this web page. 19:42.000 --> 19:46.000 You can contact the JSC developers via this email address; 19:46.000 --> 19:48.000 it's all on the web page, 19:48.000 --> 19:50.000 and we also look into GitHub 19:50.000 --> 19:52.000 if you post something there. 19:52.000 --> 19:56.000 And I think that's it already. 19:56.000 --> 19:58.000 Thank you for your attention. 19:58.000 --> 20:02.000 I hope you got inspired to use JUBE, 20:02.000 --> 20:03.000 if you're not yet doing it, 20:03.000 --> 20:06.000 and if you want to get in touch with us, 20:06.000 --> 20:08.000 just reach out to me here, 20:08.000 --> 20:10.000 or at any time in the future. 20:10.000 --> 20:11.000 Thank you. 20:12.000 --> 20:13.000 Thank you. 20:30.000 --> 20:37.000 The question was whether we test all the Python versions, 20:37.000 --> 20:40.000 and we don't do this actively. 20:40.000 --> 20:44.000 This might not be up to date in the docs. 20:44.000 --> 20:46.000 Yeah, I would need to look into this: 20:46.000 --> 20:48.000 what's the latest version 20:48.000 --> 20:50.000 we support and what we are testing right now. 20:50.000 --> 20:52.000 I cannot say this for sure, sorry. 21:00.000 --> 21:01.000 The question is, 21:01.000 --> 21:06.000 how do you make your benchmarks portable across systems? 21:06.000 --> 21:09.000 You need to get access to the systems, 21:09.000 --> 21:11.000 and you implement it in the JUBE script. 21:11.000 --> 21:13.000 So JUBE doesn't know anything about the system 21:13.000 --> 21:16.000 unless you tell it about it. 21:16.000 --> 21:19.000 So you can use something that's called tagging, for example.
21:19.000 --> 21:21.000 You can define: 21:21.000 --> 21:23.000 if you want to use the Intel compiler, 21:23.000 --> 21:25.000 you can create a parameter that runs 21:25.000 --> 21:27.000 ICC, for example, 21:27.000 --> 21:30.000 or a parameter with the same name 21:30.000 --> 21:31.000 that calls 21:31.000 --> 21:32.000 GCC, 21:32.000 --> 21:33.000 and you give it a tag, 21:33.000 --> 21:34.000 and depending on this tag, 21:34.000 --> 21:36.000 you can use them on the command line, 21:36.000 --> 21:39.000 and then you have the flexibility to decide 21:39.000 --> 21:41.000 which kind of compiler you use, 21:41.000 --> 21:43.000 or what kind of architecture you use, 21:43.000 --> 21:47.000 but you need to tell JUBE how it should use the architecture. 21:47.000 --> 21:49.000 So JUBE is just generic, 21:49.000 --> 21:53.000 but you need to tell it what it should do. 21:53.000 --> 21:56.000 So we ship something that we call a platform XML. 21:56.000 --> 21:59.000 This is a generic template of parameters 21:59.000 --> 22:01.000 and settings that we use, for example, 22:01.000 --> 22:02.000 for Slurm, 22:02.000 --> 22:05.000 but we also have it for other 22:05.000 --> 22:07.000 workload managers, 22:07.000 --> 22:10.000 and with those, 22:10.000 --> 22:26.000 you can provide generic files that can be used: 22:26.000 --> 22:27.000 you can include 22:27.000 --> 22:31.000 JUBE descriptions into different benchmarks, 22:31.000 --> 22:33.000 and I think with this mechanism, 22:33.000 --> 22:37.000 you could provide different kinds of configurations 22:37.000 --> 22:39.000 for individual systems, 22:39.000 --> 22:42.000 and then
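The tagging mechanism just described could look roughly like this (a sketch; `compiler_set` and `cc` are placeholder names):

```xml
<!-- Sketch of tag-based switching between system-specific settings. -->
<parameterset name="compiler_set">
  <!-- Selected when the run is started with the "intel" tag. -->
  <parameter name="cc" tag="intel">icc</parameter>
  <!-- Selected otherwise ("!" negates the tag). -->
  <parameter name="cc" tag="!intel">gcc</parameter>
</parameterset>
```

The tag would then be chosen on the command line, e.g. `jube run benchmark.xml --tag intel`, which makes the same benchmark description portable across compilers or systems.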
tell JUBE which one it should use 22:42.000 --> 22:45.000 for the current execution that you're aiming for. 22:53.000 --> 23:02.000 We are doing it in a way that... 23:02.000 --> 23:04.000 so the question was 23:04.000 --> 23:06.000 how we run our benchmarks: 23:06.000 --> 23:08.000 whether we create a reservation beforehand, 23:08.000 --> 23:11.000 or whether we run them in another way. 23:11.000 --> 23:12.000 And what we do: 23:12.000 --> 23:15.000 we create sbatch Slurm scripts 23:15.000 --> 23:17.000 and tell JUBE, okay, 23:17.000 --> 23:19.000 sbatch and then the script, 23:20.000 --> 23:22.000 and we wait until there are free nodes, 23:22.000 --> 23:25.000 so we don't block our general-use system 23:25.000 --> 23:27.000 just for our benchmarking purposes, 23:27.000 --> 23:29.000 unless it's really necessary. 23:29.000 --> 23:32.000 So we use the regular scheduling of Slurm, 23:32.000 --> 23:36.000 and this is how we let our benchmarks run. 23:37.000 --> 23:39.000 The question was 23:39.000 --> 23:42.000 how we can use different kinds of build systems. 23:42.000 --> 23:43.000 So, 23:43.000 --> 23:45.000 JUBE does not know anything about, e.g., EasyBuild, 23:45.000 --> 23:49.000 but if your application uses Python, 23:49.000 --> 24:04.000 you can just run python something. 24:04.000 --> 24:06.000 If you have a compiled language, 24:06.000 --> 24:09.000 you need to tell JUBE how to compile it. 24:09.000 --> 24:11.000 Either you use configure, 24:11.000 --> 24:12.000 CMake, 24:12.000 --> 24:14.000 or any other kind of 24:14.000 --> 24:16.000 build system you have.
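Submitting through the regular Slurm scheduler, as described above, could be sketched like this in a JUBE step (illustrative; `$submit_script` and `$done_file` are placeholder parameters of the kind a platform file might provide):

```xml
<!-- Sketch: submit via sbatch instead of running directly. -->
<step name="run" depend="compile">
  <use>executeset</use>
  <!-- done_file: JUBE treats the step as finished only once the job
       script itself creates this file, so waiting in the queue for
       free nodes is handled without blocking anything. -->
  <do done_file="$done_file">sbatch $submit_script</do>
</step>
```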
24:16.000 --> 24:18.000 So you need to tell JUBE 24:18.000 --> 24:21.000 how you want to have your application compiled. 24:21.000 --> 24:23.000 JUBE doesn't know it, 24:23.000 --> 24:24.000 but you can tell it, 24:24.000 --> 24:26.000 and you can do quite a lot, 24:26.000 --> 24:29.000 or almost everything that you can do on the command line, 24:29.000 --> 24:31.000 so you basically put in what you do on the command line, 24:31.000 --> 24:34.000 and then you have it kind of standardized, 24:34.000 --> 24:36.000 and if you give this script to a friend 24:36.000 --> 24:37.000 or colleague, 24:37.000 --> 24:39.000 if they know the structure of JUBE, 24:39.000 --> 24:42.000 they can get started very fast with understanding 24:42.000 --> 24:44.000 what you have been doing there. 24:53.000 --> 24:56.000 The question is whether JUBE can access executables 24:56.000 --> 24:59.000 that are installed in a different path. 24:59.000 --> 25:02.000 Yes, if you tell JUBE where to find them, 25:02.000 --> 25:03.000 yes; 25:03.000 --> 25:05.000 it's just like 25:05.000 --> 25:07.000 when you do it on the command line: 25:07.000 --> 25:09.000 it's the same if you want to do it in JUBE, 25:09.000 --> 25:11.000 you just need to tell it where to find the data. 25:11.000 --> 25:15.000 And I think I'm getting a sign that my time is up. 25:16.000 --> 25:17.000 Well, no. 25:39.000 --> 25:44.000 The question is whether JUBE can access environment variables, 25:44.000 --> 25:47.000 and how it handles the parameters internally. 26:02.000 --> 26:05.000 I'm not sure if we have really changed anything on this, 26:05.000 --> 26:07.000 but I would say JUBE can do this, 26:07.000 --> 26:09.000 because JUBE can read environment variables, 26:09.000 --> 26:12.000 and depending on the value of the environment variable, 26:12.000 --> 26:15.000 it can create its internal structure. 26:19.000 --> 26:22.000 JUBE can do it when you tell JUBE to do it.
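Reading environment variables into parameters, as discussed in this answer, could be sketched like this (illustrative; `envset`, `host`, `big_run`, and `$tasks` are placeholder names):

```xml
<!-- Sketch: parameters evaluated at run time instead of being static. -->
<parameterset name="envset">
  <!-- mode="shell": the content is run as a shell command and its
       output becomes the parameter value, e.g. an environment variable. -->
  <parameter name="host" mode="shell">echo $HOSTNAME</parameter>
  <!-- mode="python": derive a value from another parameter. -->
  <parameter name="big_run" mode="python">1 if $tasks &gt; 4 else 0</parameter>
</parameterset>
```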
26:22.000 --> 26:24.000 So, 26:24.000 --> 26:26.000 if you don't give the information to JUBE, 26:26.000 --> 26:27.000 JUBE doesn't know it, 26:27.000 --> 26:30.000 but if you tell JUBE where to find this information, 26:30.000 --> 26:33.000 then JUBE can also grab it and put it into the output. 26:42.000 --> 26:43.000 Yes. 26:43.000 --> 26:46.000 And the question was whether one can use one compile step 26:46.000 --> 26:49.000 and reuse it for multiple executions. 26:53.000 --> 26:56.000 There's full flexibility there, I would claim. 26:57.000 --> 26:59.000 The time is now really up here. 26:59.000 --> 27:01.000 Thank you.