WEBVTT 00:00.000 --> 00:10.160 Welcome to the new people who joined — welcome to the Go Devroom again. 00:10.160 --> 00:14.720 Our next talk is really interesting, because we've been talking about instrumentation a 00:14.720 --> 00:20.200 lot today, but we never talked about doing it without actually changing a single line 00:20.200 --> 00:21.200 of code. 00:21.200 --> 00:26.000 So I have two amazing speakers on stage at the same time, which is Hannah and Kemal, who 00:26.000 --> 00:28.760 are both going to talk about instrumentation. 00:28.760 --> 00:29.760 [Applause] 00:29.760 --> 00:41.280 Okay, so yeah, we're going to talk about instrumentation. 00:41.280 --> 00:44.040 We talked a lot about profiling, that was cool. 00:44.040 --> 00:45.320 Parca got a lot of mentions. 00:45.320 --> 00:49.040 I used to maintain Parca, so I'm glad that it's getting traction. 00:49.040 --> 00:52.080 No, okay, it's good. 00:52.080 --> 00:55.000 Okay, should I repeat myself? 00:55.000 --> 00:56.800 Yeah, let's go. 00:56.800 --> 00:57.800 Okay. 00:57.800 --> 00:58.800 Let's take a look at that. 00:58.800 --> 00:59.800 Okay. 00:59.800 --> 01:12.880 So, cool — yesterday, when I was walking, I was actually thinking about this talk, and 01:12.880 --> 01:17.360 zero code changes and whatnot, and I realized that maybe all the things we're going to 01:17.360 --> 01:22.120 mention today, maybe they are not needed, because in the future only the AIs are going 01:22.120 --> 01:24.640 to write the code. 01:24.640 --> 01:29.200 We can just tell them, okay, just instrument our Go applications, and we can get 01:29.200 --> 01:30.200 away with it. 01:30.200 --> 01:34.600 But until that time, let's see what we can do today. 01:34.600 --> 01:36.800 So first of all, why do we care? 01:36.800 --> 01:37.800 Right?
01:37.800 --> 01:44.400 So the observability promise: you understand the whole system behavior, debug production 01:44.400 --> 01:49.360 issues, and then prevent outages, or troubleshoot them when they happen. 01:49.360 --> 01:56.320 What happens instead is distributed complexity and partial visibility: you don't see all your 01:56.320 --> 02:01.240 metrics, spans, or whatever signals you are collecting, and then, okay, "it works on 02:01.240 --> 02:04.040 my machine", but it just happened in production. 02:04.040 --> 02:07.640 So how do you achieve this? 02:07.640 --> 02:10.320 You need to pay the tax, right? 02:10.320 --> 02:15.920 Today we're going to strictly talk about — or our example is going to be about — distributed 02:15.920 --> 02:21.840 tracing, collecting the traces and whatnot, but technically this could apply to any of the 02:21.840 --> 02:24.600 signals that you collect from your processes. 02:24.600 --> 02:31.720 So you import an SDK into your Go application, initialize your tracer, wrap your handlers 02:31.720 --> 02:37.520 if this is an HTTP-based application, propagate context everywhere — because if you 02:37.520 --> 02:44.000 are now making an HTTP request from the HTTP client, or making a gRPC call, 02:44.000 --> 02:49.200 you need to propagate all these contexts — and then you need to shut down everything gracefully, 02:49.200 --> 02:53.080 make sure everything is flushed to your observability system, and whatnot. 02:53.080 --> 02:57.320 And now, if you are working for a big enterprise company and you have hundreds 02:57.320 --> 03:03.240 of services, microservices, you need to do this again and again and again and again. 03:03.240 --> 03:13.080 So where does the instrumentation get in the way? It's one of the things that led Open 03:13.080 --> 03:15.560 Telemetry to rise up. 03:15.560 --> 03:21.760 You are getting something from Datadog, something from New Relic, or other APM 03:21.760 --> 03:22.760 vendors.
03:22.760 --> 03:29.240 You put that in your code base, and now it's vendor-specific code — and 03:29.240 --> 03:36.600 this is not your business logic — and then it's inconsistent. 03:36.600 --> 03:42.160 Maybe one of the teams implemented something in their service and it exposes 03:42.160 --> 03:46.520 different attributes and labels, and the other service doesn't have them — 03:46.520 --> 03:51.520 how do you ensure consistency, and how do you ensure that you don't clutter 03:51.520 --> 03:58.400 your code base, right? And then there's this anxiety of: I'm collecting all these data 03:58.400 --> 04:04.440 signals and they're useful and everything, but then what is the cost of it, 04:04.440 --> 04:09.320 what is the overhead of doing all this instrumentation? 04:09.320 --> 04:16.320 So we can't do much about that last part — or maybe we can, but it's harder to tweak 04:16.320 --> 04:24.040 the performance side — but what about getting the toil of instrumentation 04:24.040 --> 04:28.040 out of the way and directly providing the value of observability? 04:28.040 --> 04:33.440 So that means no SDK imports, no wrapping of handlers or functions, no context 04:33.440 --> 04:41.000 propagation we need to deal with — observability that just works. 04:41.000 --> 04:47.240 So first of all, let's step aside and talk about instrumentation: what do we mean 04:47.240 --> 04:48.240 by instrumentation?
04:48.240 --> 04:54.480 You have your application, let's say, then you have your backend; your application 04:54.480 --> 04:59.200 interacts with your backend or any other services. This can easily get complicated, but 04:59.200 --> 05:03.440 then what's happening in between? Maybe you need to understand what's happening 05:03.440 --> 05:08.400 at the ingress point, or what's happening if I'm calling another 05:08.400 --> 05:11.240 microservice, whatnot — how do you do that? 05:11.240 --> 05:17.680 So you can use logs — it's the easiest; developers love them, right? 05:17.680 --> 05:24.800 You can just put in a log line, you see it on your local machine, and push that to some service. 05:24.800 --> 05:30.240 But the problem is: yes, it's super convenient to add logs, but they're one of the 05:30.240 --> 05:36.880 hardest signals to store, search, and make sense of — it's a challenging task. 05:36.880 --> 05:42.560 Then you have metrics; they're cheaper, they're easier to collect, but they're aggregated 05:42.560 --> 05:43.560 data, right? 05:43.560 --> 05:50.200 And they don't tell a story about your individual transactions — and that's where tracing 05:50.200 --> 05:51.200 comes in, right? 05:51.200 --> 05:55.880 It's about transactions: you have hierarchical data, you have context propagation, 05:55.880 --> 06:01.880 it's rich — you can derive metrics and logs from the traces, you can build a wide-event system 06:01.880 --> 06:08.760 on top of it, it gives you more data. But again, it's complicated to store — let's not 06:08.760 --> 06:13.880 talk about the storage — and it's complicated to instrument and collect this data. 06:13.880 --> 06:18.160 So that's where auto-instrumentation comes in. 06:18.160 --> 06:19.160 Cool.
06:19.160 --> 06:23.400 So the point of our talk is auto-instrumentation, so let's talk a little bit about 06:23.400 --> 06:28.520 that, but before we get into all of that, we need to talk about all of the manual toil 06:28.520 --> 06:32.880 that comes with the manual instrumentation that Kemal was talking about. 06:32.880 --> 06:39.360 So let's say that we have this HTTP request handler that's just doing some encoding. 06:39.360 --> 06:41.720 Let's say I want to instrument this. 06:41.720 --> 06:46.760 I need to add around 15 lines of code for every single handler that I have — so every 06:46.760 --> 06:49.640 health check that you have, every endpoint that you have. 06:49.640 --> 06:53.080 You've got to start a span, you've got to stop the span, you've got to set the attributes 06:53.080 --> 06:58.360 somehow — it goes on and on, and this doesn't even include starting the tracer and everything like 06:58.360 --> 06:59.360 that. 06:59.360 --> 07:03.520 So what's the point of doing all this work? 07:03.520 --> 07:09.000 I am lazy. I want to have all this data, but I don't want to do it myself, 07:09.000 --> 07:14.000 and as Kemal said, maybe AI will do this for me eventually, but so far I haven't heard 07:14.000 --> 07:15.560 of anything. 07:15.560 --> 07:21.080 I need to do something in between having my application and actually getting data, to 07:21.080 --> 07:24.080 actually profit from having this application. 07:24.080 --> 07:27.880 So this is exactly where auto-instrumentation comes in. 07:27.880 --> 07:32.720 Auto-instrumentation, as you can probably tell by the name, is a way of instrumenting your 07:32.720 --> 07:36.120 code without having to make any code changes. 07:36.120 --> 07:38.520 And there are two different types that we're going to talk about today. 07:38.520 --> 07:44.000 The first one is runtime auto-instrumentation, which, as you can probably tell, happens during 07:44.000 --> 07:46.000 runtime.
07:46.000 --> 07:51.600 In other languages this often works by patching code at runtime, but Go is a compiled 07:51.600 --> 07:54.720 language, so we can't make source code changes at runtime. 07:54.720 --> 07:59.320 So luckily for us, in Go there are other alternatives that don't require source code changes, 07:59.320 --> 08:01.800 and we're going to talk about that. 08:01.800 --> 08:08.800 The other type is compile-time instrumentation, which happens at compile time, surprisingly. 08:08.800 --> 08:13.200 And this works a lot better for compiled languages like Go, because this will 08:13.200 --> 08:18.200 never require you to make source code changes. 08:18.200 --> 08:23.000 Okay, so let's go more in depth into each of these approaches. 08:23.000 --> 08:28.600 For the first one, runtime, basically all of the approaches in Go involve eBPF. 08:28.600 --> 08:33.040 And if you were at the eBPF Devroom yesterday, you heard a lot about it and you probably 08:33.040 --> 08:36.880 know, but as an overview, it uses things called hooks. 08:36.880 --> 08:41.200 Uprobes are in the user space, kprobes are in the kernel space, and then there are also 08:41.200 --> 08:46.920 static hooks called USDT, and all of these hooks give you an opportunity to jump from 08:46.920 --> 08:50.960 your application to some kind of tracing code. 08:50.960 --> 08:55.440 There's another approach called library injection that takes advantage of LD_PRELOAD 08:55.440 --> 09:00.720 to basically do some weird magic that I don't understand to shove some code into your application — 09:00.720 --> 09:03.280 who knows. 09:03.280 --> 09:07.480 So ignoring LD_PRELOAD, let's talk about eBPF.
09:07.480 --> 09:13.040 So as I mentioned earlier, it gives you a hook. When you have your basic application, 09:13.040 --> 09:16.560 you're most likely going to do some communication with the kernel, whether you're in 09:16.560 --> 09:21.280 user space or kernel space, and as you're doing all of this work, that happens behind 09:21.280 --> 09:22.280 the scenes. 09:22.280 --> 09:29.240 eBPF gives you a hook that tells the code: hey, jump to this tracing code, make some spans, 09:29.240 --> 09:31.920 make some traces for me. 09:31.920 --> 09:38.480 Here's an example of eBPF auto-instrumentation: this is the OTel auto-instrumentation 09:38.480 --> 09:45.200 for Go, and it's very, very simple — all you have to do is create this config file. 09:45.200 --> 09:51.480 There's basically nothing here about tracing; all you have to do is bring in the library 09:51.480 --> 09:55.120 that they have, and you don't have to make any source code changes. 09:55.120 --> 09:59.920 This will give you all of your traces, and this one takes advantage of uprobes. We'll talk a 10:00.000 --> 10:06.400 little bit more about this later, but it goes to show how very simple this is. 10:06.400 --> 10:10.280 The other eBPF approach we see in Go is called OBI. 10:10.280 --> 10:15.480 This one is a library that was originally created by Grafana (as Beyla) but was donated to the Open 10:15.480 --> 10:22.160 Telemetry community, and this one also uses eBPF to trace your Go code. 10:22.160 --> 10:25.360 So, a little bit more about OBI: 10:25.360 --> 10:27.680 it supports many different languages, not just Go. 10:27.680 --> 10:33.760 So if you are for some reason not using Go, you can still probably use OBI, and it also has 10:33.760 --> 10:36.320 a lot of coverage for the different protocols that you want to use — 10:36.320 --> 10:40.400 HTTP, gRPC, etc.
10:40.400 --> 10:44.400 The thing about eBPF that is very important to know is that since you're accessing the 10:44.400 --> 10:50.560 kernel, it's going to require you to give it administrative privileges and root access, 10:50.560 --> 10:55.080 which is something we'll delve into a little bit more later on, but this is very important 10:55.080 --> 10:58.600 to keep in mind. 10:58.600 --> 11:02.640 And the way OBI works is pretty similar to what I mentioned previously. 11:02.640 --> 11:07.480 You have your application and the kernel, there's a hook, and behind this hook there's 11:07.480 --> 11:10.800 an OBI sidecar that does all of the tracing for you. 11:10.800 --> 11:16.360 And again, you don't have to make any source code changes. 11:16.360 --> 11:21.560 Similar to the other eBPF application that we saw, this is also very, very easy to set up. 11:21.560 --> 11:25.760 It doesn't require any source code changes, and you can even do this without having to 11:25.760 --> 11:29.880 stop your application, which is kind of crazy. 11:29.880 --> 11:34.320 All you have to do is tell it what port things are being sent to, whether you want traces, whether you 11:34.320 --> 11:42.440 want metrics, use Prometheus, etc., and as soon as you run this very fancy command — with 11:42.440 --> 11:45.960 or without your application still running — you get traces. 11:45.960 --> 11:48.680 So, pretty magical. 11:48.680 --> 11:51.120 So that's it for now for runtime approaches. 11:51.120 --> 11:55.360 Let's talk about the other side of things, which is compile-time instrumentation. 11:55.360 --> 11:58.800 There are two main things that we want to talk about today. 11:58.800 --> 12:02.960 One is Orchestrion — which, I don't know if we mentioned this, but we both work at 12:02.960 --> 12:03.960 Datadog. 12:03.960 --> 12:08.280 So Orchestrion is one of our projects, and it's a compile-time approach that takes 12:08.280 --> 12:11.320 advantage of -toolexec in Go.
12:11.320 --> 12:16.200 And the OpenTelemetry compile-time instrumentation SIG is a special interest group that 12:16.200 --> 12:20.600 is a collaboration between Datadog, Alibaba, and OTel. 12:20.600 --> 12:26.320 This is basically going to be one big library to support all of your compile-time instrumentation 12:26.320 --> 12:27.320 needs. 12:27.320 --> 12:31.640 This one is still a work in progress, but we're working hard on it, and hopefully it will 12:31.640 --> 12:35.920 be more widely available soon. 12:35.920 --> 12:39.560 So how does compile-time instrumentation work? 12:39.560 --> 12:43.840 As you all probably know, you start off with your code, a bunch of compiling things happen, 12:43.840 --> 12:48.120 and then you end up with an executable at some point, which is what is actually run 12:48.120 --> 12:50.480 when you run your application. 12:50.480 --> 12:54.800 Inside the compiler there are a bunch of different steps, but we're going to just look 12:54.800 --> 12:56.200 at a few of them. 12:56.200 --> 13:01.360 The first thing that happens is that your code is broken down into abstract syntax trees, 13:01.360 --> 13:06.640 or ASTs, which are then turned into an intermediate representation, or IR. 13:06.640 --> 13:12.560 And what an AST is, very briefly, is just a tree that consists of a bunch of nodes that represent 13:12.560 --> 13:16.560 your functions, your packages, your variables, et cetera. 13:16.560 --> 13:21.200 Once you have your IR and your ASTs, that gets broken down into machine code; a bunch 13:21.200 --> 13:27.080 of other steps, including linking, happen, and then you get an executable. Very cool. 13:27.080 --> 13:32.600 The thing that we want to focus in on here is the compile step — the AST stuff — and using 13:32.600 --> 13:34.600 -toolexec. 13:34.600 --> 13:42.400 So specifically for Orchestrion, we use the -toolexec flag, and this is also what the OTel 13:42.400 --> 13:44.960 compile-time approach does.
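The `-toolexec` mechanism above can be sketched as a small wrapper program: `go build -toolexec=/path/to/wrapper ./...` invokes the wrapper instead of each toolchain tool (compile, asm, link, ...), with the real tool as the first argument. This is a hedged sketch of the idea, not Orchestrion's actual code; the helper names are our own.

```go
// A -toolexec wrapper sketch: inspect the tool invocation, optionally act on
// it, then run the real tool unchanged.
package main

import (
	"fmt"
	"os"
	"os/exec"
	"path/filepath"
	"strings"
)

// toolName extracts which toolchain tool is being invoked from the argv
// that `go build -toolexec` hands us.
func toolName(argv []string) string {
	if len(argv) == 0 {
		return ""
	}
	return strings.TrimSuffix(filepath.Base(argv[0]), ".exe")
}

// goFiles returns the .go source files in a compile invocation — the files a
// tool like Orchestrion would parse and rewrite before compilation.
func goFiles(argv []string) []string {
	var files []string
	for _, a := range argv {
		if strings.HasSuffix(a, ".go") {
			files = append(files, a)
		}
	}
	return files
}

func main() {
	argv := os.Args[1:]
	if len(argv) == 0 {
		return
	}
	if toolName(argv) == "compile" {
		// A real instrumenter would rewrite goFiles(argv) here and point the
		// compiler at the modified copies instead.
		fmt.Fprintf(os.Stderr, "would instrument: %v\n", goFiles(argv))
	}
	// Pass through to the real tool unchanged.
	cmd := exec.Command(argv[0], argv[1:]...)
	cmd.Stdin, cmd.Stdout, cmd.Stderr = os.Stdin, os.Stdout, os.Stderr
	if err := cmd.Run(); err != nil {
		os.Exit(1)
	}
}
```

Because the wrapper sits between `go build` and the compiler, the instrumentation happens on a copy of the sources — the files in your repository are never touched.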
13:44.960 --> 13:52.240 We use -toolexec to get into the compile steps, traverse the entire AST, and edit 13:52.240 --> 13:54.160 the nodes in the tree. 13:54.160 --> 13:59.480 This is called aspect-oriented programming: you basically give join points, which point 13:59.480 --> 14:05.760 to different nodes in the tree, and then an advice tells you how to change the node. 14:05.760 --> 14:12.320 This is not just Datadog-specific — we support the OpenTelemetry standard — and it supports 14:12.320 --> 14:21.320 a bunch of different packages and other dependencies that can be instrumented automatically. 14:21.320 --> 14:26.080 However, if you are like us and you don't want to use the Datadog libraries under 14:26.080 --> 14:31.480 the hood, you can do something else, which is to create a config file, and what we did 14:31.480 --> 14:37.720 for the purposes of this presentation is to edit Orchestrion so that instead of using our 14:37.720 --> 14:40.920 Datadog tracers, it uses OTel tracers. 14:40.920 --> 14:46.720 This is an extremely simplified config file, because I had very limited space on this 14:46.720 --> 14:51.920 slide — so if you try to copy this, it's probably not going to work — but basically what 14:51.920 --> 14:57.880 it does is tell the Orchestrion library what join points to look at, so for example 14:57.880 --> 15:02.920 the main function of my main package, and then what advice to apply, which is to, at the beginning 15:02.920 --> 15:06.000 of the function, start up a tracer and then start a span. 15:06.000 --> 15:10.960 So very simple stuff, and that means you're not limited to using Datadog if you don't 15:10.960 --> 15:13.800 want to. 15:13.800 --> 15:18.560 As I mentioned previously, this uses -toolexec, but of course if you don't want to do all 15:18.560 --> 15:23.240 these fancy -toolexec things, you can also just use the orchestrion command, and 15:23.240 --> 15:27.640 you can use environment variables, if you so wish, to get things running.
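The "join point plus advice" rewriting described above can be sketched with Go's own standard-library AST packages. This is a deliberately minimal illustration, not how Orchestrion is implemented: the join point is "a function named `main`", the advice is "prepend a `startSpan()` call", and `startSpan` is a hypothetical name.

```go
// Minimal sketch of AST-based compile-time instrumentation: parse a file,
// find a join point, apply an advice, print the rewritten source.
package main

import (
	"bytes"
	"fmt"
	"go/ast"
	"go/parser"
	"go/printer"
	"go/token"
)

const src = `package main

func main() {
	doWork()
}
`

// instrument injects a startSpan() call as the first statement of every
// function whose name matches target, and returns the rewritten source.
func instrument(source, target string) (string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "main.go", source, 0)
	if err != nil {
		return "", err
	}
	ast.Inspect(file, func(n ast.Node) bool {
		fn, ok := n.(*ast.FuncDecl)
		if !ok || fn.Name.Name != target || fn.Body == nil {
			return true
		}
		// The "advice": prepend a call expression to the function body.
		call := &ast.ExprStmt{X: &ast.CallExpr{Fun: ast.NewIdent("startSpan")}}
		fn.Body.List = append([]ast.Stmt{call}, fn.Body.List...)
		return true
	})
	var buf bytes.Buffer
	if err := printer.Fprint(&buf, fset, file); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	out, err := instrument(src, "main")
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

Run inside `-toolexec`, this kind of rewrite is applied to a temporary copy of the sources just before compilation, which is why the binary ends up instrumented while the checked-in code stays clean.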
15:27.640 --> 15:33.120 This will edit your AST while your code is compiling, so by the time it turns into an executable 15:33.120 --> 15:39.840 file, all the tracing is injected, and your source code is untouched. 15:39.840 --> 15:45.560 Okay, now that we've done all that — finally, the meat of the presentation. 15:45.560 --> 15:52.680 Okay, so we have a bunch of other things that we are working on 15:52.680 --> 15:56.320 to tackle the auto-instrumentation problem — we're going to talk about them — but we really 15:56.320 --> 16:01.160 wanted to see what the effect of these things would be, like: okay, I have 16:01.160 --> 16:07.920 the baseline application, then I have manual instrumentation, then I use Orchestrion and 16:07.920 --> 16:13.640 inject some instrumentation, and then, okay, I use OBI — what are the trade-offs? 16:13.640 --> 16:21.320 For that we basically wrote several example applications and set 16:21.320 --> 16:25.800 them up with all the open source tooling; everything is available in a repo, and we will share 16:25.840 --> 16:32.600 the link at the end. We use a Docker-based observability stack, and we came up with some 16:32.600 --> 16:39.440 archetypes to actually generate representative workloads, because, okay, 16:39.440 --> 16:46.240 your app can be idle, it can be I/O-bound, CPU-bound, or maybe it's just a mixed bag of that, 16:46.240 --> 16:49.640 and we would like to see all the differences and how it behaves. 16:49.640 --> 16:56.640 We also collected CPU and memory from the host itself, because when you run an eBPF agent 16:56.640 --> 17:01.640 next to your process, you need to actually consider both of their resources, not just the process's 17:01.640 --> 17:08.840 own — we tried to collect those things. And then, yeah, as I've mentioned before, these 17:08.840 --> 17:13.840 are just simple HTTP applications that produce some traces.
17:13.840 --> 17:19.440 To actually focus on the effect of the approaches, these are not doing a lot of 17:19.440 --> 17:24.040 complicated stuff like making calls to other services, so that we don't introduce 17:24.040 --> 17:31.640 any other noise caused by external services; it's just pure span creation 17:31.640 --> 17:35.160 and trace creation in the end. 17:35.160 --> 17:42.280 On the methodology of the benchmarking and how to do this cleanly: we also 17:42.280 --> 17:48.880 talked a lot about this in the software performance devroom this year, so if you 17:48.880 --> 17:53.200 want to watch that talk as well, you would learn about the methodology and whatnot, but you 17:53.200 --> 17:57.840 can also check the repo, pull it, and test it yourself, right? 17:57.840 --> 18:03.240 For that, these are the scenarios that we tested: the default baseline with no instrumentation, manual 18:03.240 --> 18:09.120 instrumentation, then OBI itself and eBPF auto-instrumentation — these projects will eventually 18:09.120 --> 18:14.720 merge, but they are now separate projects — and Orchestrion, because it's the 18:14.720 --> 18:19.040 ready one, but eventually it will become what we are calling OpenTelemetry 18:19.040 --> 18:26.000 compile-time instrumentation. And again, we have Go application containers for each of these 18:26.000 --> 18:32.000 scenarios, we have an OTel Collector, you can check everything with Jaeger for traces and 18:32.000 --> 18:37.360 Prometheus for metrics, and we generate load with k6; but then we also have a simulation layer, 18:37.360 --> 18:43.360 so when you have a request, we simulate a CPU-bound operation, an I/O-bound operation, 18:43.360 --> 18:50.720 and whatnot; and we use identical bare-metal hardware to actually run these things, and then 18:50.720 --> 18:58.120 we sustain the load for eight minutes. So how do they compare? 18:59.000 --> 19:03.000 All right, exciting.
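The workload archetypes and simulation layer described above might look something like this. This is our own illustration of the idea, not the benchmark repo's code; the archetype names and durations are assumptions.

```go
// Sketch of workload archetypes: each request simulates CPU-bound work,
// I/O-bound work, a mix, or nothing, so instrumentation overhead can be
// compared across representative shapes of load.
package main

import (
	"crypto/sha256"
	"fmt"
	"time"
)

// simulateCPU burns CPU deterministically by hashing in a loop.
func simulateCPU(iterations int) [32]byte {
	sum := sha256.Sum256([]byte("seed"))
	for i := 0; i < iterations; i++ {
		sum = sha256.Sum256(sum[:])
	}
	return sum
}

// simulateIO pretends to wait on a downstream call or disk read.
func simulateIO(d time.Duration) {
	time.Sleep(d)
}

// handleArchetype dispatches a request to the matching simulation — no real
// external calls, so the measurement stays pure span/trace creation.
func handleArchetype(kind string) string {
	switch kind {
	case "cpu":
		simulateCPU(10_000)
	case "io":
		simulateIO(5 * time.Millisecond)
	case "mixed":
		simulateCPU(5_000)
		simulateIO(2 * time.Millisecond)
	default: // "idle": do nothing
	}
	return "done: " + kind
}

func main() {
	for _, k := range []string{"idle", "cpu", "io", "mixed"} {
		start := time.Now()
		fmt.Printf("%s took %v\n", handleArchetype(k), time.Since(start))
	}
}
```

Keeping the work synthetic and deterministic is what lets the same load be replayed against the baseline, manual, eBPF, and compile-time variants on identical hardware.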
19:03.000 --> 19:09.160 Okay, so just to show off some Datadog dashboards — when we make this public, we'll make 19:09.160 --> 19:13.680 it available using Grafana, it's just that we're more familiar with Datadog. 19:13.680 --> 19:18.280 We did a few measurements, the first one being about the latency of the requests and the throughput 19:18.280 --> 19:22.760 of all of the requests coming in. I know this is really hard to see, so I'm just going to 19:22.760 --> 19:30.120 go through this quickly: we have a summary, some CPU and memory metrics, and some host 19:30.120 --> 19:38.200 metrics, again for memory and CPU. So hopefully this is slightly easier to read, if not a 19:38.200 --> 19:43.880 little bit less crowded — though it kind of still is — but we're comparing the baseline, which 19:43.880 --> 19:49.320 again is the default, or rather, no instrumentation; the manual instrumentation, 19:49.400 --> 19:55.640 which is just using the OTel SDK; eBPF; and the toolchain approach, which is still using 19:55.640 --> 20:02.040 OTel under the hood, so it shouldn't introduce too much noise. In the first column you see we 20:02.040 --> 20:12.760 have CPU, and as expected for the different approaches — actually, not as expected, sorry, we updated 20:12.760 --> 20:21.160 these numbers pretty recently — strangely enough... where's my pointer? No pointer. The eBPF 20:21.160 --> 20:27.480 and the toolchain approaches seem to be using less CPU, surprisingly, and I'll do a little bit 20:27.480 --> 20:31.400 more talking about this after I go through all of the columns. For memory, as expected, we're 20:31.400 --> 20:36.120 using a lot more memory, or a little bit more memory, for each approach compared to the baseline. 20:36.120 --> 20:41.080 For latency, we're seeing that requests are going through quicker, and then, as a result, the 20:41.160 --> 20:45.880 throughput is higher than the baseline — so maybe not what you expected to see.
20:47.320 --> 20:51.880 Evidently, I did not pay attention to Kemal and Augusta's talk about benchmarking, 20:53.400 --> 20:57.880 but if you were at the talk, you know there were a bunch of tips that you could be using, 20:57.880 --> 21:05.320 and of course benchmarks are a little finicky, so these are just the latest results, but 21:05.400 --> 21:12.920 let's, in addition to benchmarking, talk about who the winner is. 21:14.520 --> 21:19.000 So, in addition to just pure numbers, we want to know how easy it is, and how, 21:19.800 --> 21:26.200 quote-unquote, good each approach is. So for this, we're just going to be looking at the 21:26.920 --> 21:30.200 eBPF approach and the toolchain approach across four different aspects. 21:31.000 --> 21:35.480 The first one is performance, and as you saw from the previous slides, we're using more 21:35.480 --> 21:40.360 memory, so obviously there's going to be a little bit of overhead when you start instrumenting your 21:40.360 --> 21:46.440 code — a little bit of a trade-off if you want to get data. For stability, as I mentioned earlier, 21:46.440 --> 21:53.560 eBPF requires you to use probes, which often require you to know the offsets of all of the functions 21:53.560 --> 21:58.280 and variables in your code, which means that it's a little hard to use, and if you're 21:58.360 --> 22:04.200 rerunning or rebuilding your code, sometimes things can break. On the other hand, for toolchain approaches, 22:04.200 --> 22:10.120 all you really have to know is that your code compiles, so it's a little bit more stable 22:10.120 --> 22:17.000 from run to run. In terms of security, as I mentioned earlier — and I hope everyone 22:17.000 --> 22:22.360 remembers this, because I told you to remember it — the eBPF approach requires you to give 22:22.360 --> 22:30.360 administrative privileges to the library, which is a little scary.
It's running as root, 22:30.360 --> 22:37.320 and that's not great if there are potentially dangerous people on your network. The toolchain 22:37.320 --> 22:42.440 approach doesn't do this, so it's a little safer. And then, last of all, for portability, or 22:42.440 --> 22:49.000 ease of use, eBPF is great, because for things like OBI you don't even have to restart your 22:49.080 --> 22:55.320 application — you can just drop in the sidecar and things start popping up. However, 22:55.320 --> 23:01.000 it does require you to have a Linux kernel, so if you're using a MacBook, you're kind of out of luck. 23:02.120 --> 23:07.080 The toolchain approach, though — again, as long as your code compiles, it works — 23:07.080 --> 23:12.280 but it does require you to restart your application. So they have their pros and cons. 23:13.000 --> 23:21.160 So, the winner... I'm sorry, there's no clear winner, even though I kind of pitted everyone against each other. It really depends. 23:22.280 --> 23:27.960 eBPF is great if you want flexibility in terms of use, but for 23:29.000 --> 23:34.920 stability and security — protecting your code — compile-time instrumentation might be the way to go for you. 23:34.920 --> 23:42.760 And a little bit more about eBPF hooks and probes and things like that: Donia and her 23:42.760 --> 23:49.640 coworker Chris had a fantastic talk yesterday about the gotchas of using hooks and probes. 23:50.360 --> 23:57.560 And our coworker somehow had another talk yesterday about the performance impacts and 23:58.200 --> 24:03.720 other issues that we've had with eBPF. So if you're interested in learning a little bit more about this, 24:03.720 --> 24:06.120 you can go check out their slides and the recordings. 24:08.200 --> 24:13.960 Do we have time? Yeah, we have enough time to talk about it. But again, as we 24:14.680 --> 24:22.360 said before, we are still thinking about this.
And we are still trying to find ways to push the boundaries 24:22.360 --> 24:27.960 and really find a way to auto-instrument Go applications. So one of the things 24:27.960 --> 24:34.360 already available in other languages and systems is USDT: this is 24:35.160 --> 24:40.360 statically defined tracing points for the user space. Some of the language 24:40.360 --> 24:46.280 runtimes already have this; you can enable them. This is basically injecting a 24:46.280 --> 24:52.120 couple of empty bytes into each function prologue, and you can use these places to 24:52.120 --> 24:58.200 either inject a library or hook in with the uprobes system. What this brings to the table: 24:58.200 --> 25:03.960 all the downsides that OBI needs to deal with right now — calculating the 25:03.960 --> 25:11.560 offsets, where to find things in the binaries, which memory to read — and doing this for each and every Go version. 25:11.560 --> 25:15.800 And most of these things can easily break, because if you tamper with the Go stack while it's 25:15.800 --> 25:21.240 executing, the Go runtime will just panic and your whole app crashes. 25:21.240 --> 25:27.800 But if we can put USDTs into a Go application, we can actually make these things stable and we can 25:27.800 --> 25:33.960 get the best of both worlds. So how do we do this right now? There's a library called 25:33.960 --> 25:40.200 salp, but it's unmaintained, and it's using libstapsdt, which is a native library that 25:40.200 --> 25:48.120 basically generates what's required and links it in at runtime. But again, then you have those probes — 25:48.200 --> 25:52.920 there's no runtime cost, there's no execution cost, they are just dormant — and when you hook in 25:52.920 --> 26:00.280 some bpftrace program or any other eBPF program, you can utilize those events to collect 26:00.280 --> 26:08.040 data.
I'm going to go a bit quicker — we're running out of time. The same strategies are 26:08.040 --> 26:14.040 enabled for other runtimes as well. But again, this isn't working for the 26:14.200 --> 26:20.680 latest Go versions — salp is out of date — and this is basically also dynamic 26:20.680 --> 26:28.440 library loading and whatnot; it's not exactly secure. The other thing you can also do is, 26:28.440 --> 26:33.720 again, the dark magic of injecting a third-party native library, hooking into the actual calls, and 26:33.720 --> 26:41.800 generating these spans. There is a framework for that called Frida, and I also experimented with that. 26:41.800 --> 26:48.200 We also have examples with libstapsdt in the repo, but since they're so unstable, we didn't 26:48.200 --> 26:54.120 include them in the benchmarks. But there is a path we could carve out by supporting 26:54.120 --> 27:03.400 these libraries and whatnot. And this is another approach to do this as well: 27:03.400 --> 27:08.680 they are building their own forks, they are building their own framework based on cgo and 27:08.680 --> 27:16.440 extending the runtime, but it's basically an injection framework. But there's also another idea 27:16.440 --> 27:22.280 that I wanted to pursue, and I've been working on this for a while. So, one of the things 27:22.280 --> 27:28.200 that the Go runtime recently added is the flight recorder. And the flight recorder is for 27:28.200 --> 27:34.360 getting traces out of the scheduler, the GC, and whatnot — it's about the internals of the Go runtime itself. 27:34.360 --> 27:41.560 But I thought, okay, if we can extend it and add some more tracing points, 27:41.560 --> 27:46.760 and then make it aggregate the data and stream it out of the system, we can actually 27:46.760 --> 27:54.840 use this for our purposes as well.
And for that I'm working on a POC to see if it can actually 27:54.840 --> 28:00.600 work, and I got some results, but it needs a lot of performance improvements — that's why we haven't 28:00.600 --> 28:06.840 included it in the benchmarks. We also don't know if the Go runtime team is 28:06.840 --> 28:13.400 going to agree with us. So there's a long way ahead for this proof of concept. And the second one, 28:13.400 --> 28:19.720 another POC that we are working on, is injecting USDT probes directly in the Go toolchain at compile time. 28:19.720 --> 28:25.400 This is achievable. USDT probes work like this: you just need to add another ELF section to your binary, 28:25.400 --> 28:31.000 and then they are discoverable, and they are stable. And if we actually implement this 28:31.000 --> 28:36.200 in the Go compiler and the toolchain, we can have these probes, and we can add these probes 28:36.200 --> 28:42.200 to the standard library; there is no runtime or execution overhead to this, and we can 28:42.200 --> 28:47.000 just enable them for Linux systems, and then an eBPF system or an injection system can 28:47.000 --> 28:53.880 hook into these tracing points that are defined. I came up with some tooling — I actually implemented 28:53.880 --> 29:00.760 this. I have a proof-of-concept PR and whatnot to discover these points, and maybe generate 29:01.640 --> 29:07.640 bpftrace programs to demo these things, and the API definitely looks like this, right: 29:07.640 --> 29:13.400 USDT, add a probe, and then collect the data — maybe capture some arguments, whatnot. I'm still 29:13.400 --> 29:18.920 working on it. It's not stable — that's why it didn't make the cut for the benchmarks — but I will 29:18.920 --> 29:26.120 keep updating the repo that we're going to share. And yeah, if you are interested you can check it 29:26.120 --> 29:31.480 out and see how it goes. So: instrumentation is toil, but observability is helpful.
29:32.120 --> 29:38.920 Auto-instrumentation is possible, with a lot of trade-offs. You need to pick your battles. You can use 29:38.920 --> 29:45.160 the eBPF-based approach, but it's brittle. You can use the compile-time approach, but then you need to change 29:45.240 --> 29:51.240 how you build your application — even if it's minor, it's a change. You can still contribute to all of these: 29:51.240 --> 29:56.200 the open source SIGs are working — there's an OBI SIG, and in OpenTelemetry there's a compile-time 29:56.200 --> 30:01.720 SIG that we are working on. There is the auto-injector framework — you can extend the 30:01.720 --> 30:07.080 auto-injector to include Go — and you can always contribute to Go itself. If you are interested in any 30:07.080 --> 30:13.480 of these subjects, come discover and join these SIGs, say hi, and let's start working on that. 30:13.480 --> 30:22.040 With that, thanks for listening. Thank you. Thank you so much.