WEBVTT 00:00.000 --> 00:07.000 Yeah, welcome, everyone. 00:07.000 --> 00:10.000 I'm going to talk today about machine learning on air 00:10.000 --> 00:14.000 and give you an overview of available frameworks and toolboxes 00:14.000 --> 00:19.000 that you can use to improve your own DSP and radio communications. 00:19.000 --> 00:22.000 So, I put a small lie on this slide, 00:22.000 --> 00:25.000 because there will be no transmission over air today. 00:25.000 --> 00:29.000 So, all I'm going to talk about is mostly offline optimization, 00:29.000 --> 00:33.000 and the talk is also mostly focused on communication. 00:33.000 --> 00:37.000 So, that's what my understanding of DSP and radio is, 00:37.000 --> 00:39.000 or when I combine both. 00:39.000 --> 00:41.000 So, yeah. 00:41.000 --> 00:43.000 First of all, who am I? 00:43.000 --> 00:47.000 So, I started using radio sometime in 2015, 00:47.000 --> 00:50.000 then I worked at this small SDR company, 00:50.000 --> 00:52.000 or did an internship there. 00:52.000 --> 00:56.000 Then I became involved with the GNU Radio project 00:56.000 --> 00:58.000 and did some code contributions, 00:58.000 --> 01:03.000 and today I am still responsible for some stuff there. 01:03.000 --> 01:07.000 And then in 2019 I finished my master's 01:07.000 --> 01:10.000 and worked a little bit at ESA, 01:10.000 --> 01:13.000 and I was happy to also have been organizing 01:13.000 --> 01:16.000 a pre-FOSDEM hackfest, 01:16.000 --> 01:20.000 a radio hacking event at ESA, 01:20.000 --> 01:23.000 and since 2021 I'm actually working 01:23.000 --> 01:25.000 with machine learning on communication. 01:25.000 --> 01:27.000 So, that was also my first exposure. 01:27.000 --> 01:31.000 So, you're going to see a little bit of 01:31.000 --> 01:33.000 what I've learned over the last four years. 01:33.000 --> 01:34.000 All right. 01:34.000 --> 01:37.000 So, a short overview of what I'm going to talk about.
01:37.000 --> 01:39.000 First, I want to give you a little introduction 01:39.000 --> 01:43.000 into machine learning for digital signal processing and radio, 01:43.000 --> 01:47.000 then present to you some of the toolboxes 01:47.000 --> 01:51.000 that you can use to run this optimization yourselves, 01:51.000 --> 01:56.000 and then give you a short tutorial, like a small example, 01:56.000 --> 01:59.000 that you can also replicate, or use the public code 01:59.000 --> 02:03.000 that I uploaded as a starting point 02:03.000 --> 02:05.000 to use these toolboxes. 02:05.000 --> 02:10.000 So, first, yeah. 02:10.000 --> 02:13.000 So, first, what is AI and machine learning? 02:13.000 --> 02:15.000 So, right now there's a lot of craze and hype 02:15.000 --> 02:17.000 about AI, everyone wants to do it, 02:17.000 --> 02:20.000 nobody really knows what it is. 02:20.000 --> 02:23.000 So, I want to clarify that 02:23.000 --> 02:25.000 I'm not going to talk about how to use LLMs 02:25.000 --> 02:28.000 to improve your communications or to have 02:28.000 --> 02:31.000 them design your algorithms. 02:31.000 --> 02:33.000 I'm not going to talk about any AI agents 02:33.000 --> 02:36.000 that are also going to do the coding for you. 02:36.000 --> 02:37.000 Yeah. 02:37.000 --> 02:40.000 Also, no using of APIs of chatbots 02:40.000 --> 02:43.000 to come up with clever DSP algorithms. 02:43.000 --> 02:45.000 Oh, there's a point missing, 02:45.000 --> 02:48.000 but we're going to talk about how to apply 02:48.000 --> 02:52.000 machine learning principles to improve the algorithms themselves. 02:52.000 --> 02:55.000 So, there will be a little bit of math. 02:55.000 --> 02:56.000 Oh, yeah. 02:56.000 --> 02:57.000 So, it's not going to be chat. 02:57.000 --> 02:58.000 Yeah. 02:58.000 --> 02:59.000 That's the point.
02:59.000 --> 03:02.000 So, we'll look at the communication system 03:02.000 --> 03:05.000 and see how we can use some of the open source tools 03:05.000 --> 03:08.000 to apply machine learning in the different parts of our system. 03:08.000 --> 03:12.000 So, first of all, what is machine learning now? 03:12.000 --> 03:18.000 So, there's like an old quote from this book. 03:18.000 --> 03:23.000 Basically, we want a computer program 03:23.000 --> 03:26.000 and we want to feed it some experience E 03:26.000 --> 03:29.000 and define some task T it has to solve 03:29.000 --> 03:32.000 with respect to some performance measure P. 03:32.000 --> 03:35.000 So, it's a very basic definition, 03:35.000 --> 03:39.000 and its performance should improve 03:39.000 --> 03:42.000 if you provide more experience. 03:42.000 --> 03:45.000 So, it's this kind of data-driven approach. 03:45.000 --> 03:49.000 And actually, if you think about it more closely, 03:49.000 --> 03:52.000 this means that we have already had machine learning 03:52.000 --> 03:54.000 in communications for a long time. 03:54.000 --> 03:56.000 I will give you an example 03:56.000 --> 03:59.000 in some later slides. 03:59.000 --> 04:03.000 So, first of all, this is like a schematic 04:03.000 --> 04:06.000 representation of how a communication system 04:06.000 --> 04:08.000 can roughly look. 04:08.000 --> 04:11.000 So, you start in the top left with a data source. 04:11.000 --> 04:13.000 So, you have some bits you want to transmit 04:13.000 --> 04:18.000 in a digital system, like pictures or videos or emails. 04:18.000 --> 04:19.000 You compress them, 04:19.000 --> 04:22.000 you put them through some forward error correction, 04:22.000 --> 04:25.000 and then you get this capital B, not capital 04:25.000 --> 04:29.000 but bold B, which we will also use later.
04:29.000 --> 04:33.000 And in our system we then map them 04:33.000 --> 04:36.000 to some representation for the physical layer, 04:36.000 --> 04:40.000 like amplitude shift keying, phase shift keying, 04:40.000 --> 04:44.000 QAM, FSK, different modulation formats. 04:44.000 --> 04:47.000 We put it through some pulse shaping 04:47.000 --> 04:49.000 to put it on the physical medium 04:49.000 --> 04:51.000 and send it through some channel. 04:51.000 --> 04:52.000 I don't know. 04:52.000 --> 04:53.000 It can be anything. 04:53.000 --> 04:54.000 Can be wireless. 04:54.000 --> 04:56.000 Can be a satellite. 04:56.000 --> 04:59.000 Communication channels, cables, fiber optics. 04:59.000 --> 05:00.000 It doesn't really matter. 05:00.000 --> 05:02.000 But, some channel. 05:02.000 --> 05:04.000 And at the receiver, 05:04.000 --> 05:07.000 we have to somehow get rid of all the channel effects. 05:07.000 --> 05:10.000 We have to synchronize our signal again. 05:10.000 --> 05:14.000 And then after that we get some sort of X hat, 05:14.000 --> 05:17.000 which should be as close as possible 05:17.000 --> 05:21.000 to this original X that we got out of the symbol mapper. 05:21.000 --> 05:24.000 And then our demapper will give us, 05:24.000 --> 05:29.000 nowadays mostly or oftentimes, soft values, 05:29.000 --> 05:32.000 which we call these L values, 05:32.000 --> 05:34.000 or log-likelihood ratios. 05:34.000 --> 05:37.000 And then we put them through the channel decoder, decompress it, 05:37.000 --> 05:38.000 and then at the receiver 05:38.000 --> 05:42.000 we hopefully have no errors, or a very low bit error rate. 05:42.000 --> 05:44.000 So this is more or less the definition of the system. 05:44.000 --> 05:46.000 And if you now want to apply machine learning, 05:46.000 --> 05:49.000 you basically have to decide: either 05:49.000 --> 05:52.000 you want to replace the whole transmitter chain, 05:52.000 --> 05:53.000 everything,
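[Editor's note: a minimal sketch of the mapper/demapper pair described above, for readers who want to play along. This is not code from the talk; the Gray labeling and helper names are illustrative, and a hard-decision demapper stands in for the soft (LLR) demapper the speaker describes.]

```python
import numpy as np

# Gray-labeled QPSK: each pair of bits selects one unit-energy complex symbol.
# Adjacent constellation points differ in exactly one bit (Gray labeling).
GRAY_QPSK = {
    (0, 0): (1 + 1j) / np.sqrt(2),
    (0, 1): (1 - 1j) / np.sqrt(2),
    (1, 1): (-1 - 1j) / np.sqrt(2),
    (1, 0): (-1 + 1j) / np.sqrt(2),
}

def map_bits(bits):
    """Symbol mapper: flat bit array (even length) -> QPSK symbols."""
    pairs = np.asarray(bits).reshape(-1, 2)
    return np.array([GRAY_QPSK[tuple(p)] for p in pairs])

def demap_hard(x_hat):
    """Hard-decision demapper: nearest constellation point -> its label bits."""
    points = list(GRAY_QPSK.values())
    labels = list(GRAY_QPSK.keys())
    out = []
    for y in np.atleast_1d(x_hat):
        out.extend(labels[np.argmin([abs(y - p) for p in points])])
    return np.array(out)

bits = np.array([0, 0, 1, 0, 1, 1, 0, 1])
x = map_bits(bits)                           # four unit-energy symbols
assert np.allclose(np.abs(x), 1.0)
assert np.array_equal(demap_hard(x), bits)   # noiseless round trip recovers the bits
```

In a real receiver the demapper would emit the soft L values (log-likelihood ratios) mentioned above rather than hard bits.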
05:53.000 --> 05:56.000 so you just feed in bits and get an almost physical 05:56.000 --> 06:00.000 signal out, or you can also replace single blocks out of this 06:00.000 --> 06:04.000 with neural networks or with other approaches 06:04.000 --> 06:07.000 that have trainable parameters. 06:07.000 --> 06:11.000 And then you can define the task. 06:11.000 --> 06:13.000 You define the performance measure. 06:13.000 --> 06:16.000 I'm going to show you some of the ones 06:16.000 --> 06:18.000 that you can use for communication, 06:18.000 --> 06:21.000 so those that I commonly use for communications. 06:21.000 --> 06:25.000 And then you do data-driven simulation. 06:25.000 --> 06:30.000 So you generate bits that are either uniformly distributed, 06:30.000 --> 06:34.000 or maybe you have some other patterns in your data 06:34.000 --> 06:38.000 that you can also feed into this data source. 06:38.000 --> 06:44.000 And you compute this loss, or the performance measure, 06:44.000 --> 06:48.000 and then you compute the gradient of this objective function that you have. 06:48.000 --> 06:53.000 And the way you do it, and the way I'm going to present today, 06:53.000 --> 06:57.000 is numerically, and you use commonly known frameworks 06:57.000 --> 07:00.000 which provide us this automatic differentiation. 07:00.000 --> 07:04.000 So it's not always easy, from the data sink and the receiver, 07:04.000 --> 07:08.000 to calculate the gradient all the way to the source by hand, 07:08.000 --> 07:10.000 or analytically it's sometimes not possible. 07:10.000 --> 07:13.000 So we rely on this automatic differentiation 07:13.000 --> 07:16.000 and numerical simulations to give us, 07:16.000 --> 07:18.000 let's say, an approximation of this gradient, 07:18.000 --> 07:22.000 because it's of course not exact.
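[Editor's note: to make "getting the gradient numerically" concrete, here is the simplest possible illustration. Real frameworks (TensorFlow, PyTorch, JAX) use automatic differentiation rather than finite differences, but the idea of approximating d(loss)/d(theta) is the same; the toy loss is made up.]

```python
def numerical_gradient(loss, theta, eps=1e-6):
    """Central-difference approximation of the scalar derivative at theta."""
    return (loss(theta + eps) - loss(theta - eps)) / (2 * eps)

# Toy loss with its minimum at theta = 3; the exact gradient at 0 is 2*(0-3) = -6.
loss = lambda t: (t - 3.0) ** 2
g = numerical_gradient(loss, 0.0)
assert abs(g - (-6.0)) < 1e-3
```

Automatic differentiation gives the same number by propagating exact derivative rules through the computation graph, which is why it scales to thousands of parameters where finite differences would not.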
07:22.000 --> 07:27.000 And yeah, in order to improve your parameters, 07:27.000 --> 07:29.000 which are called theta here, 07:29.000 --> 07:34.000 we apply an optimization step where we have this gradient 07:34.000 --> 07:41.000 and we have a step size mu, and we try 07:41.000 --> 07:44.000 to find the minimum of our loss function. 07:44.000 --> 07:47.000 Very simple stuff, I hope. 07:47.000 --> 07:50.000 All right, so what are good objective functions? 07:50.000 --> 07:54.000 So one that you could think about immediately 07:54.000 --> 07:57.000 is probably the mean squared error, where you just compute 07:57.000 --> 08:00.000 the mean squared error between your transmit symbol 08:00.000 --> 08:05.000 and your receive symbol, take the average across your batch 08:05.000 --> 08:08.000 or the time length of your simulation, 08:08.000 --> 08:10.000 and then you have some loss. 08:10.000 --> 08:14.000 And this is already a pretty good one, as we will see later. 08:14.000 --> 08:17.000 Then it gets a bit more complicated. 08:17.000 --> 08:20.000 And I call it now modified cross entropy, 08:20.000 --> 08:23.000 because this is not exactly cross entropy. 08:23.000 --> 08:26.000 This is, let's say, the practitioner's formula 08:26.000 --> 08:30.000 that you can use at the end if you have this kind of simulation, 08:30.000 --> 08:33.000 where on the left side this minus H of X 08:33.000 --> 08:38.000 is the negative entropy of your symbols. 08:38.000 --> 08:40.000 So the source entropy, how much information 08:40.000 --> 08:44.000 you can put into, for example, your QAM. 08:45.000 --> 08:48.000 So for example with 64-QAM you can put in six bits 08:48.000 --> 08:53.000 if you have a uniform occurrence of all the symbols.
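[Editor's note: a minimal sketch of the update step and the mean-squared-error loss just described, theta <- theta - mu * grad. The scenario is made up for illustration: theta is a single complex receiver gain, and the gradient is the (Wirtinger) gradient of the mean |theta*y - x|^2 computed in closed form rather than by a framework.]

```python
import numpy as np

rng = np.random.default_rng(0)
# Transmit symbols: unit-energy QPSK.
x = rng.choice([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j], size=1000) / np.sqrt(2)
y = 0.5j * x                                  # channel: unknown complex gain

theta = 1.0 + 0j                              # initial trainable parameter
mu = 0.1                                      # step size
for _ in range(200):
    x_hat = theta * y
    grad = np.mean(np.conj(y) * (x_hat - x))  # gradient of mean |theta*y - x|^2
    theta = theta - mu * grad                 # the optimization step from the slide

mse = np.mean(np.abs(theta * y - x) ** 2)
assert mse < 1e-3                             # theta converged near 1/(0.5j) = -2j
```

The loop does nothing more than the slide says: compute the loss on simulated data, compute its gradient, and step against it with step size mu.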
08:53.000 --> 08:58.000 And on the right side is more or less this conditional entropy. 08:58.000 --> 09:01.000 Basically this is what you get at the receiver: 09:01.000 --> 09:06.000 you're receiving some complex symbol y_k 09:06.000 --> 09:13.000 and you want to figure out what kind of symbol x_k was sent. 09:13.000 --> 09:15.000 This is the part you want to minimize. 09:15.000 --> 09:18.000 And this whole term, the mutual information, 09:18.000 --> 09:23.000 is basically what you want to maximize, yeah. 09:23.000 --> 09:27.000 And you can also formulate the same bitwise, 09:27.000 --> 09:30.000 where these are now the log-likelihood ratios 09:30.000 --> 09:34.000 and this is still the source entropy, 09:34.000 --> 09:38.000 and you can also use this as a loss function. 09:38.000 --> 09:43.000 If you didn't get exactly where these formulas come from: 09:43.000 --> 09:47.000 I left out basically all of the mathematical steps 09:47.000 --> 09:50.000 leading to these derivations. 09:50.000 --> 09:56.000 This is just what we are going to use later in the simulation. 09:56.000 --> 10:00.000 So I said we have already used this kind of machine learning 10:00.000 --> 10:02.000 for quite a long time in communications, 10:02.000 --> 10:06.000 and actually the first occurrence in the literature 10:06.000 --> 10:11.000 is around 1960, when they came up with the LMS equalizer, 10:11.000 --> 10:16.000 and it's quite a simple system, 10:16.000 --> 10:18.000 where you basically have this system model: 10:18.000 --> 10:26.000 you derive this x hat k by performing equalization 10:26.000 --> 10:30.000 with a vector f on some received values, 10:30.000 --> 10:33.000 and then you can compute this mean squared error. 10:33.000 --> 10:38.000 You can actually derive it manually, find the minimum, 10:38.000 --> 10:41.000 and then you get these two terms. 10:41.000 --> 10:44.000 So the gradients: in the complex-valued world 10:44.000 --> 10:48.000 you get this, or in the real-valued world, you get this.
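[Editor's note: a hedged sketch of the bitwise loss described above, computed from transmitted bits and received log-likelihood ratios (LLRs). The sign convention assumed here is LLR = log P(b=0|y) - log P(b=1|y); with that convention the per-bit cross-entropy penalty, measured in bits, is log2(1 + exp(-(1-2b) * LLR)). The function name is illustrative.]

```python
import math

def bitwise_ce(bits, llrs):
    """Average binary cross entropy in bits per transmitted bit.

    Assumes LLR = log P(b=0|y) - log P(b=1|y); a confident, correct LLR
    contributes almost nothing, a wrong-signed LLR contributes heavily.
    """
    total = 0.0
    for b, llr in zip(bits, llrs):
        total += math.log2(1.0 + math.exp(-(1 - 2 * b) * llr))
    return total / len(bits)

# Confident LLRs that agree with the sent bits give a loss near 0 ...
good = bitwise_ce([0, 1, 0, 1], [8.0, -8.0, 8.0, -8.0])
# ... while completely uninformative LLRs give exactly 1 bit of loss per bit.
blind = bitwise_ce([0, 1, 0, 1], [0.0, 0.0, 0.0, 0.0])
assert good < 0.01 and abs(blind - 1.0) < 1e-9
```

Subtracting this term from the source entropy gives the practitioner's estimate of the achievable rate, which is why minimizing it maximizes the mutual information.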
10:48.000 --> 10:52.000 And you can just apply the same update step 10:52.000 --> 10:54.000 as I showed before. 10:54.000 --> 10:59.000 So you have the previous equalizer taps 10:59.000 --> 11:03.000 and you just subtract this gradient; you can 11:03.000 --> 11:07.000 take either one of those, depending on whether you are real-valued or not. 11:07.000 --> 11:10.000 And then you can basically step into the right direction 11:10.000 --> 11:13.000 and minimize your mean squared error. 11:13.000 --> 11:18.000 So it's, let's say, quite fun to see that this is nothing new. 11:18.000 --> 11:25.000 But in 2017 there was a quite remarkable publication: 11:25.000 --> 11:29.000 let's not do this only at the receiver, 11:29.000 --> 11:33.000 but actually start from the transmitter, 11:33.000 --> 11:36.000 going through some channel to the receiver, 11:36.000 --> 11:40.000 and try to find maybe the best constellation we can transmit 11:40.000 --> 11:41.000 over this channel. 11:41.000 --> 11:45.000 In this case it was a simple AWGN channel. 11:45.000 --> 11:48.000 That's what we're also going to do later in the demo. 11:48.000 --> 11:54.000 And basically at the transmitter, you see, you put in some 11:54.000 --> 11:59.000 bits, or in this case it's a one-hot encoded vector, 11:59.000 --> 12:03.000 where only one element of the vector is one and everything else is zero. 12:03.000 --> 12:10.000 And this leads to getting a complex symbol X that you can transmit. 12:10.000 --> 12:13.000 Put it through this channel, get a complex symbol Y, 12:13.000 --> 12:16.000 put it through another neural network, 12:16.000 --> 12:21.000 and then you just run the optimization chain as we have seen before. 12:22.000 --> 12:26.000 And here on the right are some pictures from the publication. 12:26.000 --> 12:31.000 Basically they got QPSK, PSK and also other constellations, 12:31.000 --> 12:35.000 depending on how they put in the constraints.
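[Editor's note: a hedged sketch of the classic LMS equalizer update just described: the taps f are nudged against the instantaneous gradient of the squared error, f <- f - mu * e * y, exactly the update step from the slide. The 2-tap channel, tap count, and step size are made-up illustration values.]

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.choice([-1.0, 1.0], size=5000)     # known training symbols (BPSK)
y = np.convolve(x, [1.0, 0.4])[: len(x)]   # toy channel with one echo tap

n_taps, mu = 8, 0.01
f = np.zeros(n_taps)                       # equalizer taps, start at zero
sq_errs = []
for k in range(n_taps, len(x)):
    window = y[k - n_taps: k][::-1]        # most recent received samples first
    x_hat = f @ window                     # equalizer output
    err = x_hat - x[k - 1]                 # error vs. the (delayed) sent symbol
    f -= mu * err * window                 # LMS step: f <- f - mu * grad
    sq_errs.append(err ** 2)

# After adaptation the squared error settles near zero: the taps have learned
# an approximate inverse of the channel.
assert np.mean(sq_errs[-500:]) < 0.05
```

Note that no framework is needed here: for this linear filter the gradient of the squared error can be written down by hand, which is exactly why LMS could exist in 1960.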
12:35.000 --> 12:41.000 And later this was also extended to bitwise and end-to-end. 12:41.000 --> 12:45.000 So not only improving the location of the symbols, 12:45.000 --> 12:50.000 but also the labeling, so where to put all the labels. 12:50.000 --> 12:53.000 And some of the results you can see here. 12:53.000 --> 12:58.000 So you get a Gray-coded 16-QAM. 12:58.000 --> 13:05.000 And you can also do derivations or simulations to see how it changes 13:05.000 --> 13:06.000 if you change the SNR. 13:06.000 --> 13:08.000 So points are moving. 13:08.000 --> 13:10.000 So here red is very low SNR. 13:10.000 --> 13:12.000 So here you have points that are close together, 13:12.000 --> 13:15.000 so it effectively transmits 13:15.000 --> 13:19.000 a smaller number of different symbols, and here a larger number. 13:19.000 --> 13:26.000 And it was even extended to this kind of approach where you remove 13:26.000 --> 13:29.000 most parts of the transmitter and the receiver, replacing them 13:29.000 --> 13:30.000 with neural networks. 13:30.000 --> 13:34.000 And you don't even have any synchronization or equalization, 13:34.000 --> 13:38.000 but you have this kind of fully pilotless communication, 13:38.000 --> 13:43.000 which requires, let's say, more neural networks, or deeper neural networks. 13:43.000 --> 13:46.000 But you also get constellations like this, 13:46.000 --> 13:51.000 which you see are highly asymmetric, and one could already, 13:51.000 --> 13:56.000 yeah, assume from that that this is more or less used by the neural networks 13:56.000 --> 14:01.000 to perform this kind of synchronization and equalization. 14:01.000 --> 14:04.000 So we have these publications, 14:04.000 --> 14:11.000 but not necessarily every publication gives us free and open source code. 14:11.000 --> 14:14.000 So what do we need to replicate this?
14:14.000 --> 14:19.000 So for one part, we need these blocks that are used in traditional systems, 14:19.000 --> 14:24.000 because you're not always replacing your whole system with neural networks. 14:24.000 --> 14:28.000 So you need the classical algorithms that you can still use, or are required 14:28.000 --> 14:31.000 to use, depending on your own scenario, 14:31.000 --> 14:37.000 and preferably already in this kind of automatic differentiation framework. 14:37.000 --> 14:41.000 We need these kinds of channel models so we can do simulations, 14:41.000 --> 14:44.000 because we are not always allowed to use the real thing. 14:44.000 --> 14:47.000 And that's also good. 14:47.000 --> 14:53.000 And a lot of utility functions, all in these automatic differentiation frameworks, 14:53.000 --> 14:59.000 so we can leverage this computation of the gradient and we can optimize things. 14:59.000 --> 15:03.000 And now the question is, who should write these toolboxes? 15:03.000 --> 15:05.000 Because the authors aren't always doing that. 15:05.000 --> 15:08.000 But actually the authors should do that. 15:08.000 --> 15:12.000 And in this case the authors also did. 15:12.000 --> 15:18.000 So the first toolbox I'm going to present is called Sionna, 15:18.000 --> 15:22.000 and it is developed by a research group at NVIDIA. 15:22.000 --> 15:25.000 And you can only see the first name, Hoydis, 15:25.000 --> 15:29.000 but actually, in all the previous papers that I showed, 15:29.000 --> 15:33.000 some authors from this group actually contributed to this library. 15:33.000 --> 15:39.000 So they gave back their knowledge to open source and also free software. 15:39.000 --> 15:42.000 It's an Apache 2 license. 15:42.000 --> 15:44.000 So that's quite nice. 15:44.000 --> 15:49.000 And they use TensorFlow as a base, because it was very popular at the time, 15:49.000 --> 15:54.000 and that's also how they created their research papers.
15:54.000 --> 15:58.000 Then I'm also going to show you a little bit about Mokka, 15:58.000 --> 16:02.000 which is more or less my work that I was able to do together with my colleagues. 16:02.000 --> 16:08.000 So whenever we develop things and write papers, I try, 16:08.000 --> 16:12.000 or I also try to make them, to contribute the code, 16:12.000 --> 16:20.000 so we can also have a growing toolbox and give back to the public good. 16:20.000 --> 16:23.000 And our toolbox is based on PyTorch. 16:23.000 --> 16:30.000 And then there's also a third toolbox that I'm just going to put on this slide here, 16:30.000 --> 16:38.000 and I also have the QR code at the end, which is developed by a group at Hong Kong Polytechnic University, 16:38.000 --> 16:40.000 and it's based on JAX. 16:40.000 --> 16:44.000 So let's say if you have some sort of preference yourself for one of these frameworks, 16:44.000 --> 16:52.000 there are already starting points, or already quite well-developed libraries. 16:52.000 --> 16:55.000 So Sionna, what does it consist of? 16:55.000 --> 16:57.000 So what can you optimize? 16:57.000 --> 17:00.000 So for one, they provide a system-level simulator. 17:00.000 --> 17:01.000 So it's like a higher level, 17:01.000 --> 17:04.000 so not the physical layer that I presented before, 17:04.000 --> 17:08.000 but it's about link adaptation, power control, scheduling. 17:08.000 --> 17:12.000 And then they have a physical layer simulator. 17:12.000 --> 17:16.000 So this is all the stuff that I mentioned before. 17:16.000 --> 17:20.000 So they have forward error correction implemented inside the framework, 17:20.000 --> 17:22.000 they have the mapping, channel models, 17:22.000 --> 17:26.000 they already have OFDM and MIMO and also a 5G New Radio 17:26.000 --> 17:28.000 physical layer implementation. 17:28.000 --> 17:35.000 So this is a rather well-developed toolbox.
17:35.000 --> 17:41.000 And since, I think, last year, they published this: 17:41.000 --> 17:44.000 they also have a ray tracing and channel emulator, 17:44.000 --> 17:49.000 where you can define a 3D model of the landscape you want to model, 17:49.000 --> 17:54.000 and you can run a ray tracer to actually get the channel impulse responses, 17:54.000 --> 18:00.000 also for moving targets and different antenna patterns, 18:00.000 --> 18:03.000 different antenna arrays. 18:03.000 --> 18:04.000 Yeah. 18:04.000 --> 18:11.000 And it's more or less electromagnetically accurate channel modeling. 18:11.000 --> 18:13.000 And how can you use it? 18:13.000 --> 18:16.000 Well, you can just run pip install sionna, 18:16.000 --> 18:21.000 or uv add sionna, depending, but it's public on PyPI 18:21.000 --> 18:25.000 and quite easy to get started with. 18:25.000 --> 18:29.000 So for Mokka, for our library, we basically have the same stuff. 18:29.000 --> 18:32.000 We have mappers, synchronization, equalization, 18:32.000 --> 18:36.000 discrete channel models, a fiber-optical channel model, and utilities. 18:36.000 --> 18:39.000 So it's nothing really different. 18:39.000 --> 18:42.000 We don't have this extensive wireless channel model, 18:42.000 --> 18:47.000 and we're also missing forward error correction implemented inside this framework, 18:47.000 --> 18:50.000 because we have a smaller team. 18:50.000 --> 18:57.000 But it's also available on PyPI and you can install it simply by running this command. 18:57.000 --> 19:02.000 And yeah, these are more or less some of the results that we got for, 19:02.000 --> 19:08.000 let's say, a special type of channel and DSP algorithms that we optimized 19:08.000 --> 19:11.000 constellations for. 19:11.000 --> 19:15.000 So that's what I wanted to show here. 19:15.000 --> 19:17.000 Okay, so let's do a short tutorial. 19:17.000 --> 19:21.000 So we had this big graph, the block diagram with all of the things,
19:21.000 --> 19:24.000 and we reduce this to this kind of block diagram. 19:24.000 --> 19:27.000 So we have some data source, the transmitter. 19:27.000 --> 19:34.000 We just use a symbol mapper, have an AWGN channel, and a neural demapper. 19:34.000 --> 19:39.000 So it's a bare-bones system, just to show you the capabilities. 19:39.000 --> 19:46.000 And we use this kind of binary cross entropy to find the constellation and the labels. 19:46.000 --> 19:48.000 And how do we do this? 19:48.000 --> 19:54.000 So this is just to give you an overview that it's technically not that difficult. 19:54.000 --> 19:57.000 So you need to do a bunch of imports. 19:57.000 --> 20:03.000 You can define some variables that we use. 20:04.000 --> 20:09.000 You create all these blocks that are available from Sionna. 20:09.000 --> 20:11.000 So you have like a binary source. 20:11.000 --> 20:16.000 You already have these constellations available, like in the source code. 20:16.000 --> 20:22.000 And in this case, the AWGN channel, the neural demapper, and this loss. 20:22.000 --> 20:32.000 And I mean, the way that you define this kind of channel also means you could easily swap in different channel definitions that are available. 20:32.000 --> 20:37.000 Some of them have preconditions, like you have to increase the sampling rate, do pulse shaping or anything, 20:37.000 --> 20:43.000 but let's say in the easy case, you can simply swap a different channel into the simulation. 20:43.000 --> 20:47.000 And then, how do we perform our end-to-end simulation? 20:47.000 --> 20:59.000 Well, we create bits, or sample bits, map them to symbols, send them through a channel, get some LLRs, compute the binary cross entropy. 20:59.000 --> 21:02.000 And that's it. 21:02.000 --> 21:06.000 And then, if you want to do this kind of optimization step, 21:06.000 --> 21:12.000 we have to wrap this in some sort of model, which is defined in TensorFlow.
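[Editor's note: the end-to-end loop just described (sample bits, map, channel, LLRs, binary cross entropy) can be sketched in plain NumPy. This is deliberately not the Sionna API; it uses BPSK with an exact closed-form demapper so it runs without any framework, and the noise level is a made-up example value.]

```python
import numpy as np

rng = np.random.default_rng(42)
n_bits, noise_var = 10000, 0.25

b = rng.integers(0, 2, n_bits)              # data source: uniform bits
x = 1.0 - 2.0 * b                           # mapper: bit 0 -> +1, bit 1 -> -1
y = x + rng.normal(0.0, np.sqrt(noise_var), n_bits)  # AWGN channel

llr = 2.0 * y / noise_var                   # exact BPSK demapper: log P(b=0)/P(b=1)
# Binary cross entropy in bits, the same practitioner's form used in the talk.
bce = np.mean(np.log2(1.0 + np.exp(-(1.0 - 2.0 * b) * llr)))

ber = np.mean((llr < 0) != b.astype(bool))  # hard decisions, for reference
assert 0.0 < bce < 1.0                      # informative but noisy channel
assert ber < 0.05
```

In the Sionna version the constellation is a trainable tensor and the demapper is a neural network, but the data flow per training step is exactly this chain.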
21:12.000 --> 21:23.000 Get the weights, the trainable weights, get the gradients of the loss with respect to the weights, and apply the step. 21:23.000 --> 21:28.000 So it's also not wildly complicated. 21:28.000 --> 21:40.000 And so I created, or I have, a notebook running that I'm going to show right now. The left QR code goes directly to the Sionna documentation page, 21:40.000 --> 21:45.000 and they have, I don't know how many notebooks and examples, but it's a lot. 21:45.000 --> 21:55.000 And on the right is the code for the notebook that is running, because I took a lot of things out of there to make it a bit simpler. 21:55.000 --> 21:59.000 So let's see. 21:59.000 --> 22:03.000 So here is basically the code that we had before. 22:03.000 --> 22:09.000 And I already executed it all the way to this plot, where we have more or less the transmit constellation. 22:09.000 --> 22:12.000 So we start with 64-QAM. 22:12.000 --> 22:14.000 And here, right now, we have nothing. 22:14.000 --> 22:20.000 And then I'm going to just start this training. 22:21.000 --> 22:26.000 And you basically see, so I think this SNR was selected to be like 10 dB. 22:26.000 --> 22:32.000 So 10 dB is not really suitable to transmit 64-QAM over. 22:32.000 --> 22:45.000 So more or less our machine learning system adapts the constellation to something where it puts some points closer together. 22:45.000 --> 22:56.000 And we sometimes say it sacrifices a bit, because all of these only differ in one bit of the label mapping. 22:56.000 --> 23:03.000 And this now runs for a little bit. 23:03.000 --> 23:11.000 We can continue with the slides, because we can do the same with Mokka. 23:11.000 --> 23:17.000 So this is more or less similar: definitions of the transmission, 23:17.000 --> 23:19.000 we create our blocks. 23:19.000 --> 23:21.000 So it's all quite modular. 23:21.000 --> 23:23.000 So that's also our goal.
23:23.000 --> 23:30.000 So we can create these sorts of block diagrams, or the mental model of a block diagram, 23:30.000 --> 23:33.000 so we can connect them all together 23:33.000 --> 23:38.000 and also more or less run the simulation. 23:38.000 --> 23:46.000 So I have bits, so generate bits, map them to symbols, send them through the channel, get LLRs. 23:46.000 --> 23:55.000 And then in our case, or in the PyTorch way, it's a bit simpler to then do the backpropagation. 23:55.000 --> 24:00.000 So we have this loss and we calculate the gradient all the way to the back, 24:00.000 --> 24:05.000 and then we let the optimizer do a step. 24:06.000 --> 24:09.000 And for that, I have a different simulation. 24:09.000 --> 24:13.000 But we can see, so this has now finished simulating. 24:13.000 --> 24:20.000 So we have this kind of weird-looking constellation, which has some points closer together and some further apart. 24:20.000 --> 24:25.000 And now we can actually continue in this notebook. 24:25.000 --> 24:27.000 So that's the nice thing about Sionna: 24:27.000 --> 24:29.000 they have this error correction. 24:29.000 --> 24:34.000 So we can now create a simulation and run a BER. 24:34.000 --> 24:41.000 So if this doesn't crap out, we just define this and run it. 24:41.000 --> 24:49.000 And so now, basically, I can have a live plot of the bit error rate simulation curve for this kind of 24:49.000 --> 24:54.000 constellation and demapper chain that we created. 24:54.000 --> 24:59.000 So this notebook is available at the link, 24:59.000 --> 25:04.000 and the slides are also available online, on the first slide already. 25:04.000 --> 25:12.000 And so, for Mokka, I've created a different demo. 25:12.000 --> 25:18.000 It's not a Jupyter notebook, but a standalone application where we have like a nice GUI, 25:18.000 --> 25:28.000 and it's a bit more mobile, or more, yeah, it's a bit faster.
25:28.000 --> 25:33.000 More or less the graphical interface; that doesn't mean that the algorithm is faster, 25:33.000 --> 25:36.000 it just looks nicer. 25:36.000 --> 25:44.000 And we can basically, yeah, change the SNR live here, 25:44.000 --> 25:51.000 and you can basically already see in this demo how this has an impact on our receive system. 25:51.000 --> 25:57.000 So this is more or less again a mapper, and this is what the receiver sees. 25:57.000 --> 26:07.000 So this is quite low noise, but if you now increase the noise a lot, 26:07.000 --> 26:14.000 we basically see it's going to have to change the constellation again. 26:14.000 --> 26:17.000 And yeah, here are some other different channel models. 26:17.000 --> 26:24.000 So this is not just for demo purposes, because you can actually do research with this demo. 26:24.000 --> 26:30.000 And yeah, you can find this with this link or with this QR code. 26:30.000 --> 26:35.000 And yeah, this was the short-ish demo. 26:35.000 --> 26:42.000 So you can download or check out the GitHub repositories for all of them, download them from PyPI, 26:42.000 --> 26:49.000 and also use them for your own optimization, or you can create your own RF waveform. 26:49.000 --> 26:59.000 And then, if you're a licensed amateur radio operator, you could transmit it through whatever means you want. 26:59.000 --> 27:10.000 That's about it. And thank you for your attention. 27:11.000 --> 27:13.000 Yeah. 27:13.000 --> 27:14.000 It was very interesting. 27:14.000 --> 27:18.000 My question is, you simulate the fiber-optic channel? 27:18.000 --> 27:19.000 Yes. 27:19.000 --> 27:27.000 What are the physical parameters of the fiber-optic channel? 27:27.000 --> 27:34.000 So the question is, what are the physical parameters or characteristics to put into the fiber-optic transmission? 27:34.000 --> 27:41.000 So for the optical fiber: similar to wireless communication, it also has noise,
27:41.000 --> 27:44.000 but you also have some optical effects. 27:44.000 --> 27:52.000 So there's, for example, the Kerr effect, which more or less means that if you have an increased amplitude, 27:52.000 --> 27:55.000 you have a phase shift that is proportional to this amplitude. 27:55.000 --> 28:01.000 So at the transmitter, for example, you're more or less quite limited in the transmit power. 28:02.000 --> 28:07.000 But optical systems also experience a little bit higher phase noise, 28:07.000 --> 28:10.000 relative to the symbol rates. 28:10.000 --> 28:17.000 So low phase noise systems have like a 15 kilohertz linewidth, 28:17.000 --> 28:24.000 but they have like 32 gigabaud or 60 gigabaud transmit rates. 28:25.000 --> 28:28.000 So these are the, so this is the main effect, 28:28.000 --> 28:33.000 so the Kerr effect, and then, to do nonlinear fiber simulation, 28:33.000 --> 28:38.000 you need to do this kind of split-step Fourier simulation. 28:38.000 --> 28:41.000 But then there's also chromatic dispersion, which I forgot to mention. 28:41.000 --> 28:46.000 So basically the light at different wavelengths is traveling at different speeds. 28:46.000 --> 28:48.000 So you have this kind of group, 28:49.000 --> 28:51.000 linear group delay, 28:52.000 --> 28:54.000 yeah, group delay dispersion. 29:01.000 --> 29:05.000 So the question is if it's suitable to adapt to these multipath errors. 29:05.000 --> 29:06.000 And yes, of course. 29:06.000 --> 29:11.000 So Sionna, for example, they have this multipath 3GPP model, 29:11.000 --> 29:15.000 where you can create channels with multipath. 29:16.000 --> 29:18.000 Let me show this. 29:25.000 --> 29:27.000 So this work here, 29:27.000 --> 29:31.000 so this was done for OFDM over these kinds of multipath channels. 29:31.000 --> 29:34.000 So yeah, this paper at the bottom. 29:34.000 --> 29:37.000 So I put these IEEE papers, but they are,
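[Editor's note: a hedged sketch of one split-step Fourier iteration, the simulation scheme mentioned in the answer above: alternate a linear chromatic-dispersion step (applied in the frequency domain) with a nonlinear Kerr phase rotation (applied in the time domain). The beta2 and gamma values, the sign convention, and the pulse are made-up illustration values, not a calibrated fiber model.]

```python
import numpy as np

def split_step(signal, dt, dz, beta2=-21e-27, gamma=1.3e-3):
    """Propagate a complex baseband signal over one fiber segment of length dz."""
    freqs = 2 * np.pi * np.fft.fftfreq(len(signal), d=dt)
    # Linear half: chromatic dispersion as a quadratic phase ramp over frequency.
    spectrum = np.fft.fft(signal) * np.exp(0.5j * beta2 * freqs**2 * dz)
    signal = np.fft.ifft(spectrum)
    # Nonlinear half: Kerr effect, phase shift proportional to |signal|^2.
    return signal * np.exp(1j * gamma * np.abs(signal) ** 2 * dz)

pulse = np.exp(-np.linspace(-5, 5, 256) ** 2).astype(complex)  # Gaussian pulse
out = split_step(pulse, dt=1e-12, dz=100.0)   # one 100 m step, as in the answer
# Both sub-steps are pure phase rotations, so the pulse energy is preserved.
assert np.isclose(np.sum(np.abs(out) ** 2), np.sum(np.abs(pulse) ** 2))
```

The memory problem mentioned later in the Q&A comes from repeating this step thousands of times (hundreds of kilometers at ~100 m per step) while an autodiff framework stores every intermediate result for the backward pass.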
29:37.000 --> 29:40.000 I think, all of them available on arXiv as well. 29:40.000 --> 29:43.000 That was a mishap on my part for this conference, but yeah. 29:43.000 --> 29:49.000 So this "Trimming the Fat from OFDM": they basically put, on each of the OFDM transmitter and receiver, 29:49.000 --> 29:53.000 this kind of modulators or mappers, 29:53.000 --> 29:56.000 and they removed all of the pilot processing. 29:56.000 --> 30:01.000 So this is all done inside the neural networks. 30:01.000 --> 30:06.000 And they used the wireless channels for this. 30:06.000 --> 30:08.000 So with multipath. 30:08.000 --> 30:12.000 Can you also use this to flip the problem 30:12.000 --> 30:16.000 and instead try to optimize the channel 30:16.000 --> 30:18.000 simulation instead? 30:18.000 --> 30:19.000 Yes. 30:19.000 --> 30:21.000 So that is also done. 30:21.000 --> 30:23.000 That is one of the, 30:23.000 --> 30:27.000 also one of the research areas that people do a lot. 30:27.000 --> 30:29.000 So they use this kind of, 30:29.000 --> 30:36.000 yeah, optimization to find channel models that are difficult to model analytically. 30:36.000 --> 30:39.000 So they more or less have a lot of measurement data, 30:39.000 --> 30:44.000 and now you create, for example... 30:44.000 --> 30:45.000 What's it called? 30:45.000 --> 30:48.000 Yeah, some, some... 30:48.000 --> 30:52.000 I just forgot the name of this type of network again. 30:52.000 --> 30:56.000 So the generic adversarial networks. 30:56.000 --> 30:58.000 Generative adversarial networks. 30:58.000 --> 31:01.000 Yes, where you can basically then sample, 31:01.000 --> 31:05.000 let's say, channels that are similar to what they have seen before. 31:05.000 --> 31:10.000 So yeah, but also for this kind of system with multipath channels, 31:10.000 --> 31:15.000 this all relies a little bit on you needing to have the data.
31:15.000 --> 31:20.000 So either you have an analytical model, or you have enough measurement data to cover, 31:20.000 --> 31:29.000 let's say, the space of possibilities. 31:29.000 --> 31:32.000 Hey, I'm curious: in the beginning 31:32.000 --> 31:35.000 you had like your graph, 31:35.000 --> 31:41.000 and for me, equalization and synchronization were in a specific order. 31:41.000 --> 31:45.000 Is there a specific reason, or is it just schematic? 31:45.000 --> 31:50.000 I mean, with synchronization and equalization, 31:50.000 --> 31:54.000 depending on your specific architecture, 31:54.000 --> 31:58.000 you can put them in different orders as well, or parts of equalization 31:58.000 --> 32:03.000 and parts of synchronization. So yeah, this was just to give an idea. 32:03.000 --> 32:10.000 But yeah, it depends on your specific equalizer and synchronization algorithm 32:10.000 --> 32:15.000 whether you have to put them in a different order. 32:15.000 --> 32:21.000 And are there any limits to the channel simulation? 32:21.000 --> 32:26.000 What can you use for parameters? 32:26.000 --> 32:32.000 So the question is if there are any limits for the channel simulation, or what the parameters are. 32:32.000 --> 32:35.000 And yes. So, 32:35.000 --> 32:39.000 let's say, I did this now on this MacBook because it was 32:39.000 --> 32:44.000 a simple simulation, but if you want to do, for example, these kinds of fiber simulations, 32:44.000 --> 32:52.000 then typically, to get a good result, you need to perform steps of length 32:52.000 --> 32:55.000 100 meters. That is a lot: 32:55.000 --> 33:00.000 if you have a few hundred kilometers of fiber, that's a lot of computation steps. 33:00.000 --> 33:04.000 And in every step, for every calculation, 33:04.000 --> 33:08.000 you have to save the gradients. So that becomes a memory issue as well.
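Since the talk only describes the split-step idea verbally, here is a minimal NumPy sketch of it; the fiber parameters (dispersion coefficient, nonlinear coefficient, sampling rate) are my own illustrative placeholders, not values from the talk. Each step applies chromatic dispersion in the frequency domain and the Kerr phase shift, proportional to the instantaneous power, in the time domain; the step count shows why autodiff over such a simulation eats memory.

```python
import numpy as np

def split_step(signal, n_steps, dz=100.0, beta2=-2.1e-26, gamma=1.3e-3, fs=64e9):
    """Single-polarization split-step Fourier propagation sketch."""
    n = signal.size
    omega = 2.0 * np.pi * np.fft.fftfreq(n, d=1.0 / fs)  # angular frequency grid
    disp = np.exp(0.5j * beta2 * omega**2 * dz)          # dispersion operator per dz
    e = signal.astype(np.complex128)
    for _ in range(n_steps):
        e = np.fft.ifft(np.fft.fft(e) * disp)             # linear part: chromatic dispersion
        e = e * np.exp(1j * gamma * np.abs(e)**2 * dz)    # nonlinear part: Kerr phase shift
    return e

# 100 km of fiber at 100 m resolution -> 1000 steps; with automatic
# differentiation, every step's intermediate field would have to be kept
# for the backward pass, which is where the memory issue comes from.
field_in = np.exp(1j * np.linspace(0.0, 2.0 * np.pi, 1024))
field_out = split_step(field_in, n_steps=1000)
```

Both sub-steps are unitary (pure phase rotations), so the signal energy is preserved through the whole propagation, which is a quick sanity check for this kind of simulation.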
33:08.000 --> 33:13.000 So then, for example, we have big GPUs with like 20 or 40 gigabytes of memory, 33:13.000 --> 33:17.000 and even they are not able to process everything in one step. 33:17.000 --> 33:22.000 But this is more or less the limit. And then also the batch size: 33:22.000 --> 33:31.000 if you have a lot of samples, of course, you also need to save all of these numerically computed gradients in order to perform these optimization steps. 33:31.000 --> 33:34.000 So yeah. 33:34.000 --> 33:40.000 So for your demo, you showed what 33:40.000 --> 33:43.000 you were able to simulate. 33:43.000 --> 33:45.000 Yeah, it's still running. 33:45.000 --> 33:50.000 So how realistic is that? How much do you need, like... 33:50.000 --> 33:53.000 Obviously it doesn't take, like, six months. 33:53.000 --> 33:56.000 No, like a 16 gig or 20 gig channel. 33:56.000 --> 33:57.000 What 33:57.000 --> 33:58.000 would 33:58.000 --> 34:01.000 be a realistic transfer rate there? 34:01.000 --> 34:04.000 You mean now for the training or for the inference? 34:04.000 --> 34:07.000 Like in an actual transmission. 34:07.000 --> 34:10.000 I mean, this depends on your... 34:10.000 --> 34:16.000 So the question is what are the possible transfer rates using neural networks, or neural 34:16.000 --> 34:19.000 transmitters and receivers, on hardware. 34:19.000 --> 34:21.000 That's highly dependent. 34:21.000 --> 34:26.000 So for example, let's say you create a transmitter that you trained. 34:26.000 --> 34:31.000 Most likely you wouldn't use any neural network weights in the transmitter. 34:31.000 --> 34:36.000 You would just extract the constellation and put this into your hardware. 34:36.000 --> 34:42.000 And for the receiver, you can, for example, use a neural network. 34:42.000 --> 34:43.000 You need to, 34:43.000 --> 34:48.000 yeah, see how well it can be adapted to the hardware you have.
34:48.000 --> 34:56.000 But from what I've seen for these kinds of simple cases, the neural networks are quite thin, like three layers. 34:56.000 --> 34:59.000 And that is, yeah, not a lot of computation. 34:59.000 --> 35:01.000 So you can even make the case, 35:01.000 --> 35:04.000 for example, if you compare it to a maximum likelihood receiver, 35:04.000 --> 35:09.000 where you compute the distance to every possible point, that you can be cheaper with a neural network, 35:09.000 --> 35:14.000 because it's doing this intrinsically in a different way.
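To make the comparison concrete, here is a minimal sketch of the maximum-likelihood receiver mentioned above, which computes the distance from each received sample to every possible constellation point. This is my own illustrative example, not code from the talk; the QPSK constellation and noise level are arbitrary choices.

```python
import numpy as np

# Gray-ordered QPSK constellation, normalized to unit average power
QPSK = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)

def ml_demap(rx, constellation):
    # Distance from every received sample to every constellation point,
    # then pick the nearest point (minimum-distance = ML decision for AWGN).
    d = np.abs(rx[:, None] - constellation[None, :])
    return np.argmin(d, axis=1)

rng = np.random.default_rng(0)
tx_idx = rng.integers(0, 4, size=1000)
noise = 0.05 * (rng.normal(size=1000) + 1j * rng.normal(size=1000))
rx = QPSK[tx_idx] + noise

decisions = ml_demap(rx, QPSK)
accuracy = (decisions == tx_idx).mean()
```

The cost of `ml_demap` grows with the constellation size M (M distance terms per sample), while a thin three-layer network has a fixed number of multiplies per sample regardless of M, which is why the network can end up cheaper for large constellations.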