WEBVTT

00:00.000 --> 00:13.640
and I am rather pleased to introduce another protein talk. As some of you may know, I am kind

00:13.640 --> 00:21.880
of a protein person. So the other aspect of proteins, apart from the structure, is annotation.

00:21.960 --> 00:32.120
So we have a talk about proteins from Aurélien Luciani. So you are physically based

00:32.120 --> 00:36.040
at the EBI, aren't you? You are not visiting? Yeah, yeah. So I came with Jai, I think it's the

00:36.040 --> 00:39.160
night coming too fast and over two.

00:39.160 --> 00:53.560
Oh, I look maybe, should I do that? I reset, just. It's good that I tried before, I was working

00:53.560 --> 01:04.360
and made this. It's not seeing anything. I've got a third one if this one doesn't

01:04.360 --> 01:20.160
go. Yeah, I'm loving you, you are. So I mean, yeah, there we go. It's thinking about it.

01:20.160 --> 01:27.440
Thinking about it. Thinking about it really hard and then getting out. Yes, okay. People thought

01:27.440 --> 01:36.320
that's can take so long. Okay, let's get started. So hi, everyone, I'm Aurélien Luciani,

01:36.320 --> 01:41.200
I'm a project lead in the UniProt team. So the talk is ProtVista: open-source protein

01:41.200 --> 01:48.400
feature visualization with reusable web components. So as I said, I work at UniProt, which is a

01:48.400 --> 01:52.320
comprehensive, high-quality, publicly accessible resource of protein sequence and function

01:52.320 --> 01:57.760
information. For the interest of today's talk, we're going to focus on annotations.

01:57.760 --> 02:02.640
Those are specific parts of the protein, specific positions in the protein sequence, and the

02:02.640 --> 02:08.880
need that we had was to show those in some kind of visualization. So UniProt is a

02:08.880 --> 02:14.640
consortium composed of EMBL-EBI, where I work, and also PIR in the US and SIB in Switzerland.

02:15.360 --> 02:21.280
So 10 years ago, when I started at the EBI, I was actually in another team, InterPro, and I was

02:21.360 --> 02:29.120
tasked with adding to the InterPro website a visualization that had been done by the PDBe team.

02:29.760 --> 02:35.360
The issue is that the website that we had was in React. The PDBe visualization was in AngularJS,

02:35.360 --> 02:41.040
actually the first version of Angular, and it just wouldn't work. There was no way to make it work,

02:41.040 --> 02:46.400
actually to make them talk to each other, basically, those two bits of code. So I ended up having

02:46.400 --> 02:51.200
to just rewrite the thing, which was a bit of a pity. So at the time we started talking

02:51.200 --> 02:56.560
within the different teams at the EBI to find a way to do visualization that we could share

02:57.360 --> 03:01.760
regardless of the framework that we would be using, and also share outside of the EBI for

03:02.800 --> 03:08.000
usage by other teams. That's when we started working on Nightingale.

03:08.960 --> 03:13.360
So Nightingale is an open-source visualization library of standard web components. This is focused

03:13.360 --> 03:17.040
on protein visualization, so, as I said before, annotations within the protein sequence.

03:19.040 --> 03:25.200
Those are composable components that can be associated in different ways. So you can just use

03:25.200 --> 03:33.040
one component or a bunch of them together, and it's interoperable with other standard components regardless

03:33.040 --> 03:38.640
of the underlying framework. So it's compatible with any component that would be following the

03:38.720 --> 03:46.400
Nightingale APIs, which is just a specific set of web standard APIs that we've decided to use

03:47.440 --> 03:54.560
in order for those components to talk to each other. So we started using web components.

03:55.360 --> 03:59.760
10 years ago it was not well supported.
So it was a bit of a challenge to start, but then

04:00.560 --> 04:07.920
support is way better now. So, web components: it's a group of APIs of the browser that

04:07.920 --> 04:14.560
includes custom elements, shadow DOM and HTML templates, and we use them to be able to develop those

04:14.560 --> 04:20.960
components. So the good thing with that is that it's not dependent on any framework. So regardless of whether

04:20.960 --> 04:28.880
you use React or anything related, like Next, or if you use Angular, or if you use Vue, you should be able

04:28.880 --> 04:34.720
to use those components because they don't depend on those frameworks. And also if you don't use any

04:34.720 --> 04:40.160
framework, that would also work. We tried to limit the number of dependencies that we use within

04:40.160 --> 04:47.120
those components. You have here the sequence length, so from 1 to 770, and we have some domains

04:47.120 --> 04:52.800
drawn there. But actually on the InterPro website, so a different website, we're able to have the same

04:52.800 --> 04:58.800
components look a bit different, but actually they're all using the same components underneath.

04:58.800 --> 05:03.840
And they're being fed different bits of data depending on the website. But they work in the same way.

05:05.600 --> 05:11.520
And this is the same for PDBe. PDBe also uses some Nightingale components in their own specific

05:11.520 --> 05:19.360
way with their own style. So for UniProt, there was a specific need to have a turnkey component with

05:19.360 --> 05:26.320
all of the components that were important for us to view. And that's what ProtVista is.

05:26.320 --> 05:31.840
ProtVista is basically a combination of those Nightingale components, assembled in a specific

05:31.840 --> 05:38.640
way by the UniProt team, fed UniProt data, and having some extra features on top of that.

05:38.640 --> 05:43.840
It's itself a web component wrapping other web components inside.
And the good thing with that

05:43.840 --> 05:49.760
is that you can just use that one and then you will use all the underlying components together,

05:49.760 --> 05:56.320
assembled in a specific way. So the viewer is composed of tracks. The tracks are the

05:56.400 --> 06:00.480
Nightingale components that we saw before. They are the fundamental building blocks.

06:01.680 --> 06:08.160
And each track, as we saw before, can be used individually, or in this case they can be combined together.

06:09.760 --> 06:15.120
And ProtVista, so the wrapping thing, the whole thing, is associating them together, fetching

06:15.120 --> 06:21.520
the data from UniProt APIs and assigning each bit of data to the track responsible for

06:21.520 --> 06:25.520
displaying it. That could be variants, that could be domains, et cetera. And actually

06:25.520 --> 06:32.400
the good thing with that is we also have this structure visualization. That is itself another track.

06:32.960 --> 06:38.400
And that can work with the rest of the tracks, and they can work together and interact with each other.

06:40.480 --> 06:44.480
So the architecture is that we have a manager, and then within the manager you have all the tracks,

06:44.560 --> 06:51.360
and the manager is in charge of listening to what all the tracks are saying, let's say when the

06:51.360 --> 06:57.280
user interacts with them, and then propagating that to the other tracks in there, and also assigning

06:57.280 --> 07:05.040
the data to the right track.
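
A minimal sketch of the manager-and-tracks pattern described above, under stated assumptions: plain TypeScript classes stand in for the web components and a callback stands in for the standard DOM event, so it runs without a browser. `Manager`, `FeatureTrack`, `click` and `applyHighlight` are hypothetical names chosen for illustration, not the Nightingale API.

```typescript
// Hypothetical illustration only: none of these names come from Nightingale.
type HighlightRange = { start: number; end: number };
type ChangeListener = (range: HighlightRange) => void;

// Stand-in for one track (variants, domains, structure viewer, ...).
class FeatureTrack {
  readonly id: string;
  highlight: HighlightRange | null = null;
  private listener: ChangeListener | null = null;

  constructor(id: string) {
    this.id = id;
  }

  // The manager subscribes here; a real component would dispatch a DOM event instead.
  onChange(listener: ChangeListener): void {
    this.listener = listener;
  }

  // Simulates a user clicking a feature: the track only announces the change.
  click(range: HighlightRange): void {
    if (this.listener !== null) {
      this.listener(range);
    }
  }

  // Called by the manager to apply the shared state to this track.
  applyHighlight(range: HighlightRange): void {
    this.highlight = range;
  }
}

// Listens to every registered track and propagates changes to all of them.
class Manager {
  private tracks: FeatureTrack[] = [];

  register(track: FeatureTrack): void {
    this.tracks.push(track);
    track.onChange((range) => this.broadcast(range));
  }

  private broadcast(range: HighlightRange): void {
    for (const track of this.tracks) {
      track.applyHighlight(range);
    }
  }
}

const manager = new Manager();
const variants = new FeatureTrack("variants");
const domains = new FeatureTrack("domains");
const structure = new FeatureTrack("structure");
for (const track of [variants, domains, structure]) {
  manager.register(track);
}

// A click on one track ends up highlighted on every track.
variants.click({ start: 120, end: 120 });
console.log(domains.highlight, structure.highlight);
```

The tracks never reference each other in this sketch; they only know how to announce a change and how to apply one, which is what would let a track be used alone or combined with others.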
So as an example, if a user clicks on a specific variant in one of the

07:05.040 --> 07:13.440
tracks, then the manager, sorry, the element itself will emit a standard event that will be picked up

07:13.600 --> 07:20.400
by the manager, and the manager will go to all the components within itself to be able to assign

07:20.400 --> 07:26.400
that information to all the tracks, which means that you can highlight in one track and you will

07:26.400 --> 07:31.440
highlight on all the tracks, even on the structure viewer, and same thing, you can keep in

07:31.440 --> 07:38.640
sync the zoom, the pan, the panning of the visualization, all together. So under the hood, what do we

07:38.720 --> 07:44.560
have? We have the Lit library for building reusable web components. We also have D3.js for

07:44.560 --> 07:51.280
the data-driven rendering with SVG and canvas. Also, recently we started using canvas to render

07:51.280 --> 07:57.520
specific tracks, and that led to way better performance for when we had a lot of annotations,

07:58.240 --> 08:04.320
and we still have an SVG overlay on top of that. So we made the move to canvas because we were

08:04.400 --> 08:10.320
previously using only SVG, but the more data you have on the screen, the heavier in memory this is,

08:10.320 --> 08:14.480
and that would just not be able to scale with the amount of new data that we get every day.

08:16.320 --> 08:21.200
So this is the work that we did recently, performance optimization. You can see here the

08:21.200 --> 08:28.640
number of annotations, and here the initial load time, and here the interaction time or refresh time,

08:29.120 --> 08:35.280
and the SVG implementation is in blue and the canvas implementation is in green, and you can see that

08:35.280 --> 08:43.200
we had some threshold of acceptable time for those two metrics, and we managed to reduce that by a lot

08:43.200 --> 08:51.920
by using canvas instead of SVG. So we had some challenges.
I mean, it's been 10 years since we

08:51.920 --> 08:59.120
started working on that. This is not a focus, sorry, a project that one person is completely

08:59.120 --> 09:04.480
assigned to, so it's a bit of small work, step by step. And so at the beginning we tried to keep

09:04.480 --> 09:10.880
it pure and not have any dependency at all, but in the end, especially when integrating new developers

09:10.880 --> 09:16.240
in the project, we realized that we needed to integrate some libraries, some lightweight libraries,

09:17.200 --> 09:24.320
to avoid footguns when we had new developers in the project. We also recently embraced

09:24.320 --> 09:31.120
TypeScript, so the compiled code is not in TypeScript, so anyone can use it, but actually if someone

09:31.120 --> 09:36.960
develops, they will be able to have that enhanced experience by using TypeScript and having those

09:36.960 --> 09:44.560
type annotations. We also had some challenges because all of those components were handled in a

09:44.640 --> 09:50.800
monorepo, and it was a bit challenging to make sure that all of them were bundled

09:50.800 --> 09:59.520
independently without having too much data, sorry, too much code in each of them. And as I said before,

09:59.520 --> 10:05.600
we are on the path of performance improvements, to have a full transition to canvas;

10:05.600 --> 10:10.800
we still have some components that are not in canvas, still in SVG, and we will explore WebGL

10:10.800 --> 10:16.240
implementations. We also want to improve developer experience and user experience, because some

10:16.240 --> 10:21.280
users were asking, for example, to rearrange some tracks or to remove some tracks, and so this is

10:21.280 --> 10:27.520
something that we would want to implement. And a big task that we want to do is engaging with a

10:27.520 --> 10:34.400
wider community, not just the EBI, and so this is something that we will do soon.
I just wanted to

10:34.560 --> 10:40.080
mention specifically the Software Sustainability Institute grant that we managed to get, so from tomorrow

10:40.080 --> 10:45.200
and during one year we will get some money to be able to work on the sustainability of this specific

10:45.200 --> 10:53.920
project by organizing one hackathon, having some office hours for developers, and also engaging

10:53.920 --> 11:02.080
with the community more. Here are some links to the different repos, and I don't know if we have

11:02.160 --> 11:13.520
time for a bit of questions? Yeah. Questions? And congratulations to keep you taught there.

11:15.120 --> 11:19.200
So whichever whilst you're taking questions other than questions.

11:21.200 --> 11:24.720
If there are none, I mean, I can just fill a little bit. Oh yeah, there's one.

11:25.120 --> 11:36.080
So actually this is Mol* within Nightingale, so this is a wrapper. This is the one that

11:36.080 --> 11:41.440
is not really lightweight, to be honest, because if you don't use the structure viewer this is

11:41.440 --> 11:46.480
quite lightweight, but if you want to have the structure viewer within it, we have the Nightingale

11:46.480 --> 11:52.560
component, which is a wrapper to be able to talk the same language, with the same API, with the

11:52.640 --> 11:57.360
other components, but inside it it's interacting with the Mol* library, which in itself is

11:57.360 --> 12:03.280
quite big. I mean, yeah, this is something that we're discussing with the PDBe team, to be able to

12:04.560 --> 12:09.680
extract just the bit that we need, and so hopefully that's something that we can do in the future.

12:10.640 --> 12:25.120
And so, yeah, just to highlight the thing that I said before about the Software Sustainability

12:25.120 --> 12:30.480
Institute: we will announce through LinkedIn, I guess this is our main communication channel,

12:31.520 --> 12:38.720
the hackathon and the different office hours that we'll be doing from now until next year, and hopefully

12:38.720 --> 12:44.320
this means that we will be able to fix and improve all the code; some of the code

12:44.320 --> 12:49.680
might be 10 years old, so this is something that we really need to update, and actually have time

12:49.680 --> 12:56.640
to spend on improving that, so that hopefully the community can take over and we can get contributions

12:56.640 --> 13:06.080
even from outside of the EBI.