WEBVTT

00:00.000 --> 00:12.000
Our next talk, or our next speaker, rather, is Ivan Ponomarev,

00:12.000 --> 00:21.000
presenting "Automated Documentation: From DSL to Dynamic Docs with AsciiDoc

00:21.000 --> 00:29.000
and Antora". Please welcome Ivan. Thank you.

00:29.000 --> 00:37.000
Yeah, we're entering the fourth hour in this room. Things are getting difficult. Thank you for coming.

00:37.000 --> 00:43.400
So, I'm Ivan, and currently I'm working as a team leader at a tech startup called

00:43.400 --> 00:52.000
Synthesized, in London. We produce test data automation you can trust. I also teach computer science and Java

00:52.000 --> 01:00.000
at a number of universities. So, although I identify myself as a software engineer, I probably

01:00.000 --> 01:07.000
produce more documentation, technical documentation and slides, than actual code in my

01:07.000 --> 01:15.000
work. So, I would like to share my experience with the product that I'm currently working

01:15.000 --> 01:24.000
on at Synthesized, in particular how we document it. So, what does this product do?

01:24.000 --> 01:30.000
Roughly speaking, it transforms and generates data in relational databases. At its core,

01:30.000 --> 01:36.000
it has a number of things that we call transformers or generators. There are several

01:36.000 --> 01:43.000
kinds of them. To give you some examples, it can be just a random number generator with some

01:43.000 --> 01:48.000
statistical distribution. Or it can be a person generator, which will

01:48.000 --> 01:57.000
mock up somebody's gender, age, name, and so on and so forth. Or we can give it several categories,

01:57.000 --> 02:04.000
give it probabilities, and it will generate categories and stuff.
And more:

02:04.000 --> 02:12.000
there are 35 of them, and the whole product on top. So, what do we document? What is the

02:12.000 --> 02:19.000
user-facing surface of the product? First of all, things can be configured via

02:19.000 --> 02:27.000
a YAML file, or they can be configured via the user interface, the editor. And of course,

02:28.000 --> 02:34.000
this product evolves over time. It's several years old now. New transformers are added,

02:34.000 --> 02:41.000
features are changed, bugs are fixed. So, we need to keep a number of things in sync

02:41.000 --> 02:46.000
in this product. So, the things that must stay in sync. First

02:46.000 --> 02:52.000
is the JSON Schema for the YAML configuration. Of course, we want to validate this YAML before we even

02:52.000 --> 02:58.000
start feeding it to our product. And we want context-sensitive hints in editors.

02:58.000 --> 03:04.000
So, we need this JSON Schema, always up to date. Also, the projectional editor UI,

03:04.000 --> 03:10.000
which actually follows the same structure. I believe that the people in this room,

03:10.000 --> 03:16.000
you are big fans of everything as code, but not all users are big fans of coding.

03:16.000 --> 03:23.000
So, people want these editors. And we must keep them in sync as well. So, how do we do this?

03:23.000 --> 03:29.000
And finally, and this is the main focus of the talk, there is the documentation, which

03:29.000 --> 03:35.000
must follow the same structure. It must describe it. It must always be up to date. And in particular,

03:35.000 --> 03:41.000
all the code snippets: these code snippets can be copied and pasted into

03:42.000 --> 03:50.000
customers' workflows. And they must work without saying, "Oh, this argument is no longer valid",

03:50.000 --> 03:59.000
or something like this: "This transformation has been renamed."
So, at the very beginning of

03:59.000 --> 04:07.000
the development of this product, a decision was made. And I think we are still getting benefits

04:07.000 --> 04:15.000
from this decision many years later: we are using the DSL approach as a single source of

04:15.000 --> 04:22.000
truth. So, although part of the product is written in Kotlin and part of it is written in

04:22.000 --> 04:33.000
TypeScript, the software developer starts by editing the DSL file, the YAML file, from which large

04:33.000 --> 04:39.000
chunks of artifacts are generated. I counted six in our particular case, like

04:39.000 --> 04:46.000
DTOs, this JSON Schema, and the projectional editor, which is actually auto-generated. We don't

04:46.000 --> 04:56.000
make our front-end engineers follow all the controls and switches by hand. And the documentation is one of

04:57.000 --> 05:05.000
these six outputs. So, what did we choose as the DSL, as the domain-specific language for this?

05:05.000 --> 05:12.000
Surprisingly, in our case, it's the OpenAPI spec. As you can see, it defines all the

05:12.000 --> 05:18.000
type names, it defines the structure, like the properties, what parameters we have, it defines

05:18.000 --> 05:26.000
descriptions, nullability, lots of stuff, actually. So, if you ask why the OpenAPI

05:26.000 --> 05:31.000
spec, why such a probably strange choice, well, it's just because it's

05:31.000 --> 05:38.000
fit for the purpose. It has built-in documentation fields, it supports

05:38.000 --> 05:44.000
non-standard extensions via x- fields. So, if there is something which is not in standard

05:44.000 --> 05:51.000
OpenAPI, you just put a property anywhere which starts with "x-". And you can put in any

05:51.000 --> 05:58.000
additional semantic value you want.
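As an illustration of the x- extension mechanism just described, here is a minimal sketch in Python. The schema fragment, the type name `CategoricalGenerator`, and the extension keys `x-doc-category` and `x-ui-widget` are all hypothetical; the talk does not show the actual spec. The fragment is written as JSON only so that the example needs nothing beyond the standard library.

```python
import json

# Hypothetical OpenAPI-style schema fragment. Every name in it, including
# the "x-" keys, is made up for illustration and is not from the talk.
SPEC = json.loads("""
{
  "components": {
    "schemas": {
      "CategoricalGenerator": {
        "type": "object",
        "description": "Generates values from a fixed set of categories.",
        "x-doc-category": "generators",
        "properties": {
          "categories": {
            "type": "array",
            "items": {"type": "string"},
            "description": "Allowed values."
          },
          "probabilities": {
            "type": "array",
            "items": {"type": "number"},
            "nullable": true,
            "description": "Optional weight per category.",
            "x-ui-widget": "slider-list"
          }
        }
      }
    }
  }
}
""")


def extensions(node: dict) -> dict:
    """Return only the extension fields of a node (keys starting with 'x-')."""
    return {key: value for key, value in node.items() if key.startswith("x-")}


schema = SPEC["components"]["schemas"]["CategoricalGenerator"]
schema_ext = extensions(schema)
prob_ext = extensions(schema["properties"]["probabilities"])
print(schema_ext)
print(prob_ext)
```

A generator for any of the outputs (DTOs, editor UI, documentation) could read such fields alongside the standard ones, which is what makes the spec usable as a DSL rather than only as an API description.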
And, what's important, we have a very high-quality,

05:58.000 --> 06:07.000
widely used OpenAPI parser for the JVM, for Java. And I believe that these days, the success

06:08.000 --> 06:15.000
of a product mostly comes from using the open source that best fits your purpose, and writing

06:15.000 --> 06:24.000
some glue code. And if you choose your libraries carefully, the amount of glue code

06:24.000 --> 06:30.000
is going to be minimal. So, in our case, we already had code generators for Kotlin and for

06:30.000 --> 06:35.000
TypeScript to produce our DTOs. And the remaining step was to generate the documentation.

06:35.000 --> 06:43.000
So, we decided to generate AsciiDoc from the same source, from the OpenAPI spec, and then

06:43.000 --> 06:48.000
rely on the Asciidoctor toolchain for everything that follows. So, actually, only this

06:48.000 --> 06:56.000
tiny yellow part needed to be implemented. And by the way, in 2026, you just ask your agent

06:56.000 --> 07:03.000
to write the glue code, and it can be done really quickly. So, the full pipeline is like this.

07:03.000 --> 07:10.000
As I said, there are a number of branches, a number of artifacts that we are producing. But

07:10.000 --> 07:18.000
we have only a small amount of glue code, from Swagger Parser, using

07:18.000 --> 07:26.000
KotlinPoet. KotlinPoet is also a very reliable library, which is a builder of Kotlin code. So, we

07:26.000 --> 07:32.000
just use this glue code and produce the Kotlin DTOs, the JSON Schemas that we need,

07:32.000 --> 07:41.000
and the documentation. One thing: when we think of auto-generated documentation, we

07:41.000 --> 07:48.000
think of something like Python docstrings or Javadoc. And you know that they produce

07:48.000 --> 07:57.000
HTML right away from the code.
But interestingly, if you want to do something yourself,

07:57.000 --> 08:03.000
what I recommend is to use some semantic markup, because it's much easier to produce than

08:03.000 --> 08:10.000
HTML. And also, it will blend nicely with the rest of your documentation.

08:10.000 --> 08:16.000
Because if the rest of your documentation is written in the same markup, you will get the automated

08:16.000 --> 08:23.000
documentation and the documentation written by humans in the same

08:23.000 --> 08:31.000
visual style, which is important. So, speaking about the DSL: the DSL is a single source

08:31.000 --> 08:37.000
of truth. First of all, it's a project design decision. It's not a decision

08:37.000 --> 08:44.000
that is taken by technical writers. It's taken by some other people, and it must be taken at the very

08:44.000 --> 08:51.000
beginning. And it helps keep multiple parts of the product in sync, including, but not limited

08:52.000 --> 08:59.000
to, the documentation. It's a very powerful approach. The choice of DSL, however, is context-dependent.

08:59.000 --> 09:06.000
In our case, it's the OpenAPI spec. In your case, it can be anything which fits the purpose.

09:06.000 --> 09:12.000
You have lots of options, like a fully custom DSL, if you feel like writing your own custom

09:13.000 --> 09:20.000
parser of something. An existing spec language, such as XSD or OpenAPI, whatever spec language you

09:20.000 --> 09:28.000
feel is fit. It can be a YAML- or XML-based format, or it can be an internal DSL in a host language, such as

09:28.000 --> 09:35.000
Groovy, Ruby or Kotlin, which provide the syntax capabilities to expose an internal DSL.

09:36.000 --> 09:42.000
The best results usually come from choosing a technology with strong out-of-the-box tooling, so you

09:42.000 --> 09:51.000
only need a small amount of custom glue code.
So, let's get to the Asciidoctor part of the presentation.

09:51.000 --> 10:00.000
And there's a question: why Asciidoctor? Why not something else? Well, here I'm a bit opinionated.

10:00.000 --> 10:07.000
For the DSL, I said, choose whatever you like. For Asciidoctor, I'm sorry, Daniel,

10:07.000 --> 10:14.000
I'm opinionated. I'm from the Asciidoctor camp. I believe that it just outperforms all the other

10:14.000 --> 10:19.000
markup languages. It has richer semantics. It has really powerful tables out of the box.

10:19.000 --> 10:25.000
It has attributes and conditional content, which make documentation dynamic. And it has countless

10:25.000 --> 10:32.000
diagram and code integrations. And also, it's truly cross-platform. Because whether

10:32.000 --> 10:39.000
your project is in Ruby, in JavaScript or on the JVM, you're using your own build tools and you

10:39.000 --> 10:46.000
get complete Asciidoctor tooling there. That's quite a rare example

10:46.000 --> 10:52.000
of truly cross-platform tooling. And my favorite feature is include, about which I'm going to

10:52.000 --> 10:58.000
speak separately. Complex tables. This presentation is in

10:58.000 --> 11:05.000
Asciidoctor, by the way. There's a GitHub repo for it. You can have a look. So good luck

11:05.000 --> 11:15.000
if you want to do these things in Markdown. [unclear]

11:16.000 --> 11:22.000
That's a basic thing, the same as in many languages. Let's have a look at an

11:22.000 --> 11:27.000
Asciidoctor example. These callouts, by the way, are supported by

11:27.000 --> 11:34.000
Asciidoctor. It's first-class support, no extensions, quite a unique thing. And yeah, you

11:34.000 --> 11:41.000
can do it in various languages. So, as you probably guessed, here you see

11:41.000 --> 11:45.000
Euclid's 2,500-year-old algorithm for finding the greatest common

11:45.000 --> 11:48.000
divisor, presented in various languages.
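For readers without the slides, here is a sketch of Euclid's algorithm in Python, together with a small randomized check of its defining properties. This is not the code from the slides: the function names are illustrative, and the hand-rolled random loop is an assumption standing in for a property-based testing library.

```python
import math
import random


def gcd(a: int, b: int) -> int:
    """Euclid's algorithm: replace (a, b) with (b, a mod b) until b is zero."""
    while b != 0:
        a, b = b, a % b
    return a


def check_gcd_properties(trials: int = 1000, seed: int = 42) -> bool:
    """Check basic GCD properties on random pairs of positive integers."""
    rng = random.Random(seed)
    for _ in range(trials):
        a = rng.randint(1, 10**6)
        b = rng.randint(1, 10**6)
        g = gcd(a, b)
        # The result must divide both inputs.
        if a % g != 0 or b % g != 0:
            return False
        # The result must not depend on argument order.
        if g != gcd(b, a):
            return False
        # Cross-check against the standard library implementation.
        if g != math.gcd(a, b):
            return False
    return True


print(gcd(48, 18))
print(check_gcd_properties())
```

Python integers do not overflow, so the caveat about integer overflows that applies to fixed-width integer types does not arise in this sketch.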
11:48.000 --> 11:56.000
Lua, Prolog, Ruby, Java. And you may ask what makes you

11:56.000 --> 12:02.000
believe that these are actually valid code snippets, that they actually

12:02.000 --> 12:08.000
do this. Well, I can assure you that all these snippets are tested before the whole

12:08.000 --> 12:15.000
presentation is built, using strong property-based tests, and,

12:15.000 --> 12:20.000
besides integer overflows, they are correct. How do I achieve this? I'm going to

12:20.000 --> 12:29.000
tell you in a minute. But before that, I just can't help showcasing the

12:29.000 --> 12:36.000
integrations that Asciidoctor has, which actually allow you to keep

12:36.000 --> 12:43.000
colorful, powerful pictures, illustrations within your documentation,

12:43.000 --> 12:50.000
in plain text. Graphviz is quite low level, but very flexible.

12:50.000 --> 12:56.000
You can do anything in it. If you want some formal stuff such as UML, you have

12:56.000 --> 13:03.000
PlantUML for this. If you are teaching languages, like me, you need these

13:03.000 --> 13:10.000
syntax diagrams, and I'm really proud to show this slide because this tool in particular

13:10.000 --> 13:17.000
was implemented by my third-year Java students. And thanks to the amazing

13:17.000 --> 13:23.000
Asciidoctor community, it was included with first-class support in

13:23.000 --> 13:27.000
Asciidoctor. So if you want to describe some syntax, you have

13:27.000 --> 13:33.000
JSyntrax for this, and it's supported by Asciidoctor.

13:33.000 --> 13:41.000
It's a universe in itself, right? Charts, like bar charts: if you need

13:41.000 --> 13:45.000
them in your documentation, in your technical documents, you don't need to copy and paste

13:45.000 --> 13:51.000
pictures. [unclear]
And of course, it's not that

13:51.000 --> 13:56.000
frequent that you need a formula in your technical documentation. But if you do need

13:56.000 --> 14:00.000
one, of course, there is only one format for formulas, if you're taking

14:00.000 --> 14:06.000
formulas seriously: that's LaTeX, and Asciidoctor is capable of this out of the

14:06.000 --> 14:14.000
box. So it's extremely powerful, but my favorite feature is still

14:14.000 --> 14:23.000
include. Other toolchains are missing it. So what does it do? Well,

14:23.000 --> 14:28.000
a simple thing, right? Somewhere in my code base, I have this Java file, and I

14:28.000 --> 14:32.000
just include it, and it's included here on this slide. But Java is

14:32.000 --> 14:38.000
verbose, and actually you don't see anything useful on this slide because of the

14:38.000 --> 14:44.000
header and all this stuff. But as you can see, if I put special

14:44.000 --> 14:51.000
comment lines here, I'm tagging a snippet, and then later I can refer to this snippet.

14:51.000 --> 14:57.000
So actually what I see here is not a meaningless Java snippet

14:57.000 --> 15:03.000
copied and pasted from somewhere. It's like a tiny window through which you

15:03.000 --> 15:09.000
see just a snippet of the whole big Java file. And then I put this on the

15:09.000 --> 15:13.000
slide for my students, and I say that this is the solution to the classical

15:13.000 --> 15:17.000
problem of word counting. You have a text file and you count words and you

15:17.000 --> 15:22.000
have a map from string to long, blah blah blah. And the question is, how do we

15:22.000 --> 15:27.000
make sure that this claim is accurate? We need a test for this. Okay, for

15:27.000 --> 15:31.000
the Java string API, it's probably straightforward, but what if it's some other tool

15:32.000 --> 15:37.000
that we are documenting? We need a test. So what do we do?
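The tagged-snippet mechanism can be illustrated with a small Python sketch. In AsciiDoc, a region is marked in the source file with comment lines containing `tag::name[]` and `end::name[]`, and the document pulls it in with an include directive such as `include::WordCount.java[tag=wordcount]` (the file name here is hypothetical). The function `tagged_region` below is a simplified imitation of that selection, not Asciidoctor's implementation, and the example source is Python rather than the Java file from the slides.

```python
from collections import Counter

# Stand-in for an example file kept in the repository. In a real project
# this text would live on disk and be compiled and tested as ordinary code.
EXAMPLE_SOURCE = '''\
from collections import Counter


# tag::wordcount[]
def word_count(text):
    return Counter(text.lower().split())
# end::wordcount[]
'''


def tagged_region(source: str, tag: str) -> str:
    """Return the lines between 'tag::<tag>[]' and 'end::<tag>[]' markers."""
    selected = []
    inside = False
    for line in source.splitlines():
        stripped = line.strip()
        if stripped.endswith(f"end::{tag}[]"):
            inside = False
        elif stripped.endswith(f"tag::{tag}[]"):
            inside = True
        elif inside:
            selected.append(line)
    return "\n".join(selected)


# What the reader of the documentation would see:
snippet = tagged_region(EXAMPLE_SOURCE, "wordcount")
print(snippet)

# What the test suite exercises: the whole file, not a copy of the snippet.
namespace = {}
exec(EXAMPLE_SOURCE, namespace)
counts = namespace["word_count"]("To be or not to be")
print(counts)
```

The point of the design is that the displayed fragment and the tested code are the same bytes, so the snippet cannot drift away from what actually runs.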
Remember that this

15:37.000 --> 15:41.000
is actually not a snippet. This is actually a window into the big file. So this

15:41.000 --> 15:46.000
file can be unit tested. So I'm writing a unit test for this. Oh, sorry,

15:46.000 --> 15:51.000
Java is too verbose. Let me focus on the interesting part. And yeah,

15:51.000 --> 15:55.000
this is a unit test which actually calls the method in question and

15:55.000 --> 16:02.000
verifies that it compiles and that it runs. And here we are. If you keep the

16:02.000 --> 16:07.000
documentation and the tested example code in the same repository, if you use

16:07.000 --> 16:12.000
include with tags to pull in only the relevant chunks, only the relevant fragments,

16:12.000 --> 16:18.000
you build the docs and run the example code tests in the same CI pipeline. And

16:18.000 --> 16:23.000
this includes using the same build tool. This is why it's important that

16:23.000 --> 16:29.000
we have Asciidoctor, right? We can run Asciidoctor in npm. We can run Asciidoctor in

16:29.000 --> 16:35.000
a Ruby build tool. We can run Asciidoctor in Gradle as well. And it's either

16:35.000 --> 16:40.000
green or red. If some of the tests are red, then the build breaks, the

16:40.000 --> 16:47.000
CI fails. You have no documentation. The final part is

16:47.000 --> 16:57.000
Antora. Why Antora? For the DSL, choose whatever you like.

16:57.000 --> 17:06.000
That was my point. For semantic markup, Asciidoctor is the best. That's my opinion.

17:06.000 --> 17:12.000
Antora, well, frankly speaking, you don't have any other choice if you are using

17:12.000 --> 17:19.000
Asciidoctor. That's the truth. What's good about Antora? It has versioned

17:19.000 --> 17:26.000
docs as a first-class concept. So if you have a release cycle for your library or

17:26.000 --> 17:32.000
your product, you can use tags or branches in your Git repository, and Antora

17:32.000 --> 17:39.000
will retrieve them and publish them.
Next, it has an opinionated layout for

17:39.000 --> 17:45.000
modules, assets, examples and stuff. You put some example here. You know where to put

17:45.000 --> 17:51.000
an image, where to put your SQL file, where to put your YAML file example, whatever.

17:51.000 --> 18:02.000
Antora has standard folders for this. It has a specific format for cross-references.

18:02.000 --> 18:12.000
So it has this notion of modules, examples, assets and stuff, which helps a bit when you

18:12.000 --> 18:20.000
want to move stuff around your documentation. When you're not relying on a particular

18:20.000 --> 18:28.000
place in the file system, but your cross-references are expressed in terms of modules, then you

18:28.000 --> 18:35.000
can move your AsciiDoc files from module to module, and your cross-references will

18:35.000 --> 18:41.000
keep working if you are careful enough. So that's a really powerful feature for maintaining

18:41.000 --> 18:49.000
large documentation. Last but not least, it's surprisingly fast. Those of you

18:49.000 --> 18:55.000
who know how long build cycles with Gradle can be, they can take ages.

18:55.000 --> 19:04.000
Antora builds huge sites in a matter of seconds, and this is really impressive these

19:04.000 --> 19:17.000
days. However, during the previous presentations today, we had a lot of cases for,

19:17.000 --> 19:26.000
you know, non-docs-as-code solutions, like wikis or things like, what was it called,

19:26.000 --> 19:34.000
[unclear], right? So this presentation represents the other end of the spectrum.

19:34.000 --> 19:44.000
Everything is code. Everything is tested. Everything is on CI. Well, actually, from my current

19:44.000 --> 19:53.000
experience, I think, and this is still not a solved problem, we must support something

19:53.000 --> 20:00.000
in between.
Because when you've got large Antora

20:00.000 --> 20:05.000
documentation, everything on CI, it becomes really difficult for people like solution architects

20:05.000 --> 20:13.000
or marketing people to edit it, to move content around, to add some descriptive chunks.

20:13.000 --> 20:20.000
Also, it becomes too burdensome to sync it with the release cycles. Like, "I want to update it now",

20:20.000 --> 20:28.000
or, "No, we need to wait for the next release." So if your documentation is mostly descriptive,

20:28.000 --> 20:37.000
is decoupled from the release cycle, you should probably think about something else, like a wiki or the other tools

20:38.000 --> 20:44.000
that we discussed earlier. But if you want to keep it tightly in sync with the code and with the

20:44.000 --> 20:52.000
features of your product, then this approach is definitely the best. So, to the conclusions:

20:52.000 --> 20:58.000
keep the documentation close to the source of truth. Ideally, you're using the same

20:59.000 --> 21:04.000
repository on GitHub for everything, with the documentation and with the code and tests.

21:04.000 --> 21:12.000
Generate documents as plain AsciiDoc and let Asciidoctor do the rest. It's really easy

21:12.000 --> 21:19.000
to code-generate AsciiDoc. Use the include magic in order to include testable examples, and make

21:19.000 --> 21:27.000
all the examples of your code, be it SQL, YAML or Java, testable and tested. And yeah,

21:27.000 --> 21:33.000
the part of writing the documents and the part of engineering the whole solution is actually

21:33.000 --> 21:39.000
the same process. So when they're blended, you get the best results. So thank you for listening.

21:39.000 --> 21:46.000
You can see the source code of this presentation here, with all the examples, and all the code

21:47.000 --> 21:53.000
examples in this presentation are really tested. Thank you.
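The advice to generate documentation as plain AsciiDoc can be sketched in a few lines of Python. This is not the glue code from the talk, which is described as being built on a JVM OpenAPI parser; the schema contents, the function name `render_asciidoc`, and the chosen page layout are assumptions made for illustration. Only standard AsciiDoc syntax is emitted: a section title, a paragraph, and a table with a header row.

```python
# Hypothetical schema, shaped like an OpenAPI schema object. The names and
# descriptions are invented for this example.
SCHEMA = {
    "description": "Generates values from a fixed set of categories.",
    "properties": {
        "categories": {"type": "array", "description": "Allowed values."},
        "probabilities": {"type": "array", "description": "Optional weight per category."},
    },
}


def render_asciidoc(name: str, schema: dict) -> str:
    """Render one schema as an AsciiDoc section with a property table."""
    lines = [
        f"== {name}",
        "",
        schema.get("description", ""),
        "",
        '[cols="1,1,3", options="header"]',
        "|===",
        "| Property | Type | Description",
        "",
    ]
    for prop, spec in schema.get("properties", {}).items():
        # One table row per property, one cell per line.
        lines.append(f"| `{prop}`")
        lines.append(f"| {spec.get('type', 'object')}")
        lines.append(f"| {spec.get('description', '')}")
        lines.append("")
    lines.append("|===")
    return "\n".join(lines)


page = render_asciidoc("CategoricalGenerator", SCHEMA)
print(page)
```

Because the output is semantic markup rather than HTML, the generated section picks up whatever styling the rest of the documentation site uses, which is the blending effect described in the talk.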
22:08.000 --> 22:14.000
About wikis: yeah, first of all, an hour ago there was a presentation

22:14.000 --> 22:21.000
which demonstrated that Asciidoctor actually provides you with the ability to write

22:21.000 --> 22:28.000
your own output templates. So yes, it can produce HTML, it can produce PDF,

22:28.000 --> 22:34.000
there are templates for OpenDocument formats. I haven't heard about, you know,

22:34.000 --> 22:40.000
wiki pages, but I think it shouldn't be that difficult to write your own template, because it's

22:40.000 --> 22:47.000
mainly from one semantic markup to another semantic markup. Yes, please.

22:47.000 --> 22:53.000
Are you familiar with the concept of literate programming?

22:53.000 --> 23:01.000
[unclear] definitely, but no, I'm not familiar. The question was,

23:02.000 --> 23:16.000
am I familiar with literate programming? I've heard the name, but I don't know what it is.

23:16.000 --> 23:24.000
Org-mode syntax? No, no. Let's discuss.

23:24.000 --> 23:28.000
Question of the day. Questions? Yes, please.

23:39.000 --> 23:43.000
You have Terraform and you want it to be beautifully documented.

23:43.000 --> 23:47.000
100%, it does work. This is the idea.

23:47.000 --> 23:51.000
If you have something deep, but you need

23:52.000 --> 23:55.000
uniform, you know, documents.

23:55.000 --> 23:58.000
But yeah, Terraform has additional attributes, and

23:58.000 --> 24:03.000
yeah, I see no problems in implementing this.

24:06.000 --> 24:08.000
Yeah, yeah.

24:10.000 --> 24:12.000
Infrastructure as code.

24:17.000 --> 24:19.000
Not like what?

24:21.000 --> 24:28.000
So the question is, what if the DSL is not the OpenAPI

24:28.000 --> 24:32.000
spec, but some infrastructure-as-code DSL such as

24:32.000 --> 24:36.000
Ansible, Terraform and stuff.
24:36.000 --> 24:40.000
Yes, from what I know about these DSLs,

24:40.000 --> 24:45.000
I would be capable of writing something which

24:45.000 --> 24:50.000
will produce semantic markup for documenting this.

24:50.000 --> 24:53.000
That's definitely possible.

24:53.000 --> 24:57.000
Thank you very much for your help.

24:57.000 --> 24:58.000
Thank you.