WEBVTT 00:30.000 --> 00:41.000 All right, can you guys hear me? 00:41.000 --> 00:45.000 All right, so let's get started. 00:45.000 --> 00:49.000 Please, we have only a couple of minutes 00:49.000 --> 00:53.000 because we are going to present here with Kevin 00:53.000 --> 00:58.000 on contribution processes for MariaDB 00:58.000 --> 01:01.000 and for Postgres. 01:01.000 --> 01:06.000 So I'm going to try to be really quick 01:06.000 --> 01:08.000 and then give the floor to Kevin here. 01:08.000 --> 01:11.000 Okay, so let's get going. 01:11.000 --> 01:14.000 I want to walk you through the MariaDB 01:14.000 --> 01:16.000 server contribution process here 01:16.000 --> 01:20.000 and I want to show you a live example. 01:20.000 --> 01:23.000 So first of all, why? 01:23.000 --> 01:27.000 Basically contributing to database code 01:27.000 --> 01:29.000 bases is hard stuff. 01:29.000 --> 01:33.000 I remember my first contribution to the code base 01:33.000 --> 01:37.000 of the MariaDB server took three months to review. 01:37.000 --> 01:40.000 So it's not trivial. 01:40.000 --> 01:44.000 And that's why I want to show you some support 01:44.000 --> 01:47.000 and I want to show you how it's done basically 01:47.000 --> 01:51.000 so that it's not as intimidating as it might be. 01:52.000 --> 01:55.000 Right, so first of all, let's get the facts. 01:55.000 --> 01:58.000 It's a proper, the MariaDB server code 01:58.000 --> 02:00.000 base is a proper GitHub repository. 02:00.000 --> 02:02.000 It has some extensions. 02:02.000 --> 02:05.000 It has a, like, contributor license agreement 02:05.000 --> 02:08.000 bot which checks your pull request 02:08.000 --> 02:10.000 and then there is a Buildbot 02:10.000 --> 02:14.000 which also tests for regressions, whatever you submit. 02:14.000 --> 02:18.000 And you also need a Jira for larger submissions.
02:18.000 --> 02:22.000 This is basically to describe what you are trying to do 02:22.000 --> 02:25.000 in a better way than just a commit message. 02:25.000 --> 02:28.000 Because, well, as I said, it's complex stuff 02:28.000 --> 02:31.000 so it needs better explanation sometimes. 02:31.000 --> 02:35.000 So how to pick a task to work on. 02:35.000 --> 02:39.000 Basically the best advice that I can give you 02:39.000 --> 02:42.000 is pick something that is important to you. 02:42.000 --> 02:45.000 When people work on scratching their own 02:46.000 --> 02:48.000 itches, so to say, 02:48.000 --> 02:50.000 they are most productive 02:50.000 --> 02:55.000 and they feel more accomplished when they achieve something. 02:55.000 --> 02:57.000 If you are wondering what to do, 02:57.000 --> 02:59.000 there is a list of beginner-friendly tasks 02:59.000 --> 03:02.000 in our contributing document. 03:02.000 --> 03:04.000 So take a look at those. 03:04.000 --> 03:06.000 They can get you started. 03:06.000 --> 03:08.000 There are interesting things there. 03:08.000 --> 03:11.000 Another way of finding contributions 03:11.000 --> 03:14.000 is to basically engage online with other community members 03:14.000 --> 03:18.000 or check Jira for open issues that are appealing. 03:18.000 --> 03:21.000 And then, last but not least, 03:21.000 --> 03:24.000 the MariaDB project participates in the Google 03:24.000 --> 03:26.000 Summer of Code program. 03:26.000 --> 03:30.000 So that's a huge incentive for people working 03:30.000 --> 03:33.000 on open source contributions. 03:33.000 --> 03:35.000 So, right. 03:35.000 --> 03:38.000 This is the contribution process in a nutshell. 03:38.000 --> 03:40.000 So basically you clone the MISK 03:40.000 --> 03:43.000 or MariaDB server repository. 03:43.000 --> 03:45.000 You work on your feature. 03:45.000 --> 03:48.000 And then you submit the pull request from your branch. 03:48.000 --> 03:52.000 And then you make sure that the build passes okay.
03:52.000 --> 03:54.000 All the regression tests are good. 03:54.000 --> 03:56.000 Then you get a review. 03:56.000 --> 03:59.000 So basically you get a preliminary review by yours 03:59.000 --> 04:01.000 truly here. 04:01.000 --> 04:04.000 I try to do that as fast as possible 04:04.000 --> 04:06.000 when the PR appears. 04:06.000 --> 04:08.000 And then you get a final review by an actual developer 04:08.000 --> 04:10.000 owning the code base. 04:10.000 --> 04:13.000 And then make sure to have the final reviewer 04:13.000 --> 04:17.000 also push your change because sometimes 04:17.000 --> 04:20.000 they just approve it and they don't merge the PR, 04:20.000 --> 04:23.000 which is a problem, right. 04:23.000 --> 04:24.000 Okay. 04:24.000 --> 04:29.000 And for the case study that I promised, 04:29.000 --> 04:32.000 it's probably not very visible, 04:32.000 --> 04:36.000 but it is ten lines of actual code. 04:36.000 --> 04:39.000 Not a huge contribution, really. 04:39.000 --> 04:42.000 But it is something tangible. 04:42.000 --> 04:45.000 It was, as I said, based on a Jira, 04:45.000 --> 04:49.000 which describes what was the problem and what 04:49.000 --> 04:51.000 was the fix for it. 04:51.000 --> 04:55.000 Then it also got great test coverage. 04:55.000 --> 05:00.000 All the regression tests were fine and everything. 05:00.000 --> 05:03.000 And then the CLA was signed. 05:03.000 --> 05:07.000 We got the proper treatment on our end properly 05:08.000 --> 05:10.000 There was an active conversation. 05:10.000 --> 05:11.000 I don't know if you see that, 05:11.000 --> 05:14.000 but there are 32 messages going back and forth 05:14.000 --> 05:16.000 for those ten lines of code. 05:16.000 --> 05:21.000 So I'm really impressed by the engagement 05:21.000 --> 05:26.000 that the developer showed in that particular case.
05:26.000 --> 05:30.000 There were also several iterations, 05:30.000 --> 05:33.000 several versions of this thing were 05:33.000 --> 05:36.000 published and presented by the contributor. 05:36.000 --> 05:39.000 And finally, finally, it got merged 05:39.000 --> 05:45.000 by a very senior developer in the MariaDB community, apparently. 05:45.000 --> 05:48.000 So yeah, it's not hard. 05:48.000 --> 05:50.000 You just need to follow the steps 05:50.000 --> 05:56.000 and you get your name into the contributors list, basically. 05:56.000 --> 06:01.000 So, takeaways: feel free to ask me 06:01.000 --> 06:04.000 and communicate thoroughly, communicate often, 06:04.000 --> 06:07.000 and be responsive to requests from reviewers 06:07.000 --> 06:09.000 and be nice and don't panic. 06:09.000 --> 06:13.000 So those are my contact data 06:13.000 --> 06:16.000 if you want to talk to me please do. 06:16.000 --> 06:20.000 And with that, I guess I will give the floor to Kevin. 06:20.000 --> 06:33.000 Am I audible? 06:33.000 --> 06:35.000 Thank you. 06:50.000 --> 07:02.000 Okay, cool. 07:02.000 --> 07:07.000 So thanks, Georgie, for the overview of the MariaDB code base. 07:07.000 --> 07:09.000 So now I'm going to give a similar insight 07:09.000 --> 07:13.000 but into the Postgres code, like the contribution process. 07:13.000 --> 07:15.000 I'm going to give it from the perspective of someone 07:15.000 --> 07:18.000 who's done it just a few months ago for the first time. 07:18.000 --> 07:20.000 I'm relatively early in my career. 07:20.000 --> 07:23.000 And I just want to talk about the Postgres contribution process, 07:23.000 --> 07:25.000 what my personal experience was, 07:25.000 --> 07:27.000 and some takeaways for how it works overall, 07:27.000 --> 07:30.000 and maybe you guys can also do it in the future. 07:30.000 --> 07:32.000 My name is Kevin.
07:32.000 --> 07:35.000 So to begin with, I can talk about how my interest 07:35.000 --> 07:39.000 in Postgres began because it was not like a typical way. 07:39.000 --> 07:42.000 So my first job out of college in 2023 07:42.000 --> 07:44.000 was a company called PeerDB. 07:45.000 --> 07:47.000 PeerDB is an open source ETL tool, 07:47.000 --> 07:48.000 as well as a CDC tool, 07:48.000 --> 07:51.000 that moves data from Postgres to multiple destinations, 07:51.000 --> 07:53.000 including ClickHouse. 07:53.000 --> 07:55.000 PeerDB uses logical replication, 07:55.000 --> 07:58.000 which is the Postgres feature to read changes 07:58.000 --> 08:01.000 as they happen on Postgres. 08:01.000 --> 08:04.000 And I think Rohit and like the PlanetScale 08:04.000 --> 08:07.000 folks talked about logical replication in detail earlier. 08:07.000 --> 08:10.000 It is a pretty complex and a pretty intricate feature, 08:10.000 --> 08:12.000 and it's mostly undocumented. 08:12.000 --> 08:15.000 So you end up needing to read Postgres code a lot 08:15.000 --> 08:18.000 to make the tool stable and work for all cases. 08:18.000 --> 08:20.000 PeerDB was acquired by ClickHouse, 08:20.000 --> 08:21.000 which is why I ended up here. 08:21.000 --> 08:24.000 And I continue doing the same work as part of ClickPipes. 08:24.000 --> 08:26.000 So now I focus on moving data from Postgres 08:26.000 --> 08:30.000 into ClickHouse as reliably as possible. 08:30.000 --> 08:33.000 So let's talk about the Postgres code base. 08:33.000 --> 08:37.000 So Postgres is a 30 plus year old C code base, 08:37.000 --> 08:40.000 and because it's C, and C is fairly feature-light, 08:41.000 --> 08:43.000 you need to build a lot on top of it 08:43.000 --> 08:45.000 to have a clean and structured code base. 08:45.000 --> 08:48.000 So Postgres has its own custom memory subsystem, 08:48.000 --> 08:50.000 has a lot of dynamic dispatch, 08:50.000 --> 08:54.000 has a lot of macro use just to make the code base make sense.
08:54.000 --> 08:57.000 So reading Postgres code is not trivial. 08:57.000 --> 08:59.000 It can be very overwhelming. 08:59.000 --> 09:01.000 There's a ton of files, each file is, you know, 09:01.000 --> 09:03.000 4,000, 5,000 lines of code. 09:03.000 --> 09:07.000 So what I did was to isolate an area that I wanted to focus on, 09:07.000 --> 09:09.000 which was logical replication, and, you know, 09:09.000 --> 09:11.000 focus on those files, those functions, 09:11.000 --> 09:12.000 ignore everything else, 09:12.000 --> 09:14.000 and figure out the area that I wanted to focus on. 09:14.000 --> 09:16.000 One great thing about Postgres is Postgres 09:16.000 --> 09:17.000 has great commit messages. 09:17.000 --> 09:21.000 So every line of code in Postgres has a good commit message 09:21.000 --> 09:23.000 where you can see why that change was made, 09:23.000 --> 09:25.000 and it links back to the mailing list, 09:25.000 --> 09:27.000 and the mailing list we'll get into later. 09:27.000 --> 09:30.000 But like you can read the commit message 09:30.000 --> 09:32.000 and then you can read the links in the commit message 09:32.000 --> 09:34.000 to get even more context out of it. 09:34.000 --> 09:38.000 So that's how the Postgres code base can be very readable 09:38.000 --> 09:41.000 just by the Git history. 09:41.000 --> 09:45.000 So until now, like for the past couple of years, 09:45.000 --> 09:48.000 my sort of relationship with the Postgres code base 09:48.000 --> 09:49.000 was more of a passive one. 09:49.000 --> 09:52.000 I was the guy who just read the code, 09:52.000 --> 09:54.000 you know, figured out customer issues, 09:54.000 --> 09:56.000 like helped things out at the mailing list, 09:56.000 --> 09:59.000 but I wasn't really focused on contributing to Postgres myself. 09:59.000 --> 10:03.000 And the reason that changed was kind of random.
10:04.000 --> 10:08.000 So what happened was I was reading through Postgres docs 10:08.000 --> 10:11.000 and I found a setting called scram_iterations. 10:11.000 --> 10:14.000 This setting, all it does is it just controls the number of times 10:14.000 --> 10:18.000 the password for a user is hashed when creating or authenticating. 10:18.000 --> 10:23.000 It had a maximum value, a really high maximum value. 10:23.000 --> 10:26.000 So the sort of intrusive thought that entered my head was 10:26.000 --> 10:28.000 what happens when you set this to the maximum value. 10:28.000 --> 10:31.000 I just went to a virtual machine 10:31.000 --> 10:33.000 and it stayed running for a month. 10:33.000 --> 10:36.000 Like I left it running and then I realized it was just running forever. 10:36.000 --> 10:40.000 So I realized that maybe there's a bug. 10:40.000 --> 10:44.000 So this is the code that Postgres had 10:44.000 --> 10:47.000 to actually do this hashing thing. 10:47.000 --> 10:50.000 And maybe a few of you can spot the issue immediately here 10:50.000 --> 10:53.000 that was causing an infinite loop. 10:53.000 --> 10:57.000 The problem was that like the, 10:58.000 --> 11:02.000 basically if you set it to the maximum value of i and then increment i, 11:02.000 --> 11:05.000 it becomes a negative value because of integer overflow. 11:05.000 --> 11:09.000 So because of that, with the less-than-or-equal-to sign in the for loop, 11:09.000 --> 11:12.000 it will hit the maximum value, it will still compare true. 11:12.000 --> 11:16.000 Then it'll increment it to a negative value and then the loop would never exit. 11:16.000 --> 11:19.000 So the fix for this was actually fairly simple. 11:19.000 --> 11:21.000 You just change the loop to not do that. 11:21.000 --> 11:26.000 So this was a story of how I accidentally found a bug in Postgres. 11:26.000 --> 11:29.000 And the fix was like relatively simple. 11:29.000 --> 11:31.000 It is a bug that almost nobody would hit.
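The loop problem described above can be sketched in C. This is a minimal illustration under stated assumptions, not the Postgres source and not the actual patch: the function names, the `max_steps` safety cap, and the 8-bit counter are all hypothetical. The talk describes a signed counter going negative; the sketch uses an unsigned 8-bit counter that wraps to 0 instead, because signed overflow is undefined behavior in standard C, and the narrow type lets the wraparound be reached in a handful of steps.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the hashing loop: returns how many rounds ran.
 * With "<=" and iterations at the counter's maximum, the condition can never
 * become false: the counter wraps around before it can exceed the bound.
 * max_steps exists only so this demo terminates.
 */
long rounds_buggy(uint8_t iterations, long max_steps)
{
    long rounds = 0;

    assert(max_steps > 0);
    for (uint8_t i = 1; i <= iterations; i++) {
        if (++rounds >= max_steps)
            break;              /* demo-only escape hatch */
    }
    return rounds;
}

/*
 * One possible rewrite: start at 0 and compare with "<", so the counter
 * never has to hold a value larger than the bound.
 */
long rounds_fixed(uint8_t iterations)
{
    long rounds = 0;

    for (uint8_t i = 0; i < iterations; i++)
        rounds++;
    return rounds;
}
```

For small bounds both versions agree. Only at the maximum (255 for this 8-bit counter) does the first version depend on its safety cap to stop, which mirrors why the bug would rarely be hit in practice.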
11:31.000 --> 11:34.000 But you know, it was something that I had found. 11:34.000 --> 11:36.000 But again, that was half the battle. 11:36.000 --> 11:40.000 Like I had made the fix, but I actually had to contribute the fix upstream. 11:40.000 --> 11:42.000 And this is where the Postgres, you know, 11:42.000 --> 11:44.000 developer experience comes into the picture. 11:44.000 --> 11:48.000 So unlike MariaDB, it is not a standard GitHub repository. 11:48.000 --> 11:53.000 Postgres has its own Git repository that is hosted by Postgres itself. 11:53.000 --> 11:57.000 And you know, the UI is quite different from GitHub. 11:57.000 --> 12:00.000 And there are no PRs. 12:00.000 --> 12:01.000 So there's no issue tracker. 12:01.000 --> 12:02.000 There's no PRs. 12:02.000 --> 12:04.000 There's pretty much nothing for Postgres. 12:04.000 --> 12:08.000 The way you actually commit code is you create a branch locally. 12:08.000 --> 12:10.000 You make your change locally. 12:10.000 --> 12:14.000 And then you use the git format-patch command to make a patch. 12:14.000 --> 12:17.000 And that patch is then attached to an email. 12:17.000 --> 12:19.000 So you actually send an email to the mailing list in Postgres. 12:19.000 --> 12:22.000 And that email contains, you know, your entire patch. 12:22.000 --> 12:25.000 Even if it's like a thousand lines of code, it will be a single file 12:25.000 --> 12:28.000 as your attachment to the email you send to the mailing list. 12:28.000 --> 12:34.000 Basically, like an email address that contains like hundreds of people that are in the Postgres community. 12:34.000 --> 12:43.000 So yeah, like the commands I run in this translate into the patch file that, you know, talks about what files my patch changed. 12:43.000 --> 12:45.000 And you know what the changes are. 12:45.000 --> 12:50.000 And this is what I submit as part of my email to fix a change. 12:51.000 --> 12:53.000 So Postgres has something called a commitfest as well.
12:53.000 --> 13:02.000 So this is the Postgres equivalent of like a code sprint or a review cycle where a bunch of people submit the changes to the commitfest. 13:02.000 --> 13:04.000 You know, they have reviewers. 13:04.000 --> 13:07.000 They have a version of CI. 13:07.000 --> 13:09.000 You know, people reject PRs. 13:09.000 --> 13:10.000 They approve it. 13:10.000 --> 13:11.000 It gets merged. 13:11.000 --> 13:13.000 Sometimes it gets pushed to the next commitfest. 13:13.000 --> 13:17.000 So I think each Postgres release has like five or six commitfests. 13:18.000 --> 13:20.000 And they happen every couple of months. 13:20.000 --> 13:24.000 And this is where most of the Postgres review happens because 13:24.000 --> 13:27.000 Postgres doesn't have a single company behind it. 13:27.000 --> 13:29.000 It's all a bunch of people working on, like, this 13:29.000 --> 13:31.000 part time to review code changes. 13:31.000 --> 13:34.000 So it's kind of decentralized and that's where the commitfest process comes in. 13:34.000 --> 13:39.000 So for me again, like the fix that I had made. 13:39.000 --> 13:42.000 The fix itself was like a few minutes of work. 13:42.000 --> 13:46.000 But it took me a lot more time to figure out how to draft that very first email. 13:46.000 --> 13:49.000 And how to actually attach a patch to Postgres. 13:49.000 --> 13:55.000 So you know, that was the mail I sent to Postgres after like some thinking. 13:55.000 --> 13:59.000 And after some like comments and, you know, 13:59.000 --> 14:02.000 them deciding whether it needs tests or not, 14:02.000 --> 14:04.000 in the end, the change was merged. 14:04.000 --> 14:05.000 It was not merged by me. 14:05.000 --> 14:07.000 So I have no commit access to the Postgres repo. 14:07.000 --> 14:10.000 Someone merged it on my behalf and I was attributed as author. 14:10.000 --> 14:13.000 So this is kind of how the Postgres, 14:13.000 --> 14:16.000 you know, contribution process works. 14:16.000 --> 14:18.000 Yeah.
14:18.000 --> 14:21.000 So again, this was like the first bug I had found. 14:21.000 --> 14:24.000 And around the same time I had made this first commit, 14:24.000 --> 14:29.000 there was a much bigger issue that some customers of PeerDB were facing, 14:29.000 --> 14:34.000 where they were running into an issue with the replication slot creation. 14:34.000 --> 14:37.000 So a replication slot is basically the Postgres, 14:37.000 --> 14:41.000 the thing that PeerDB connects to on Postgres to read the changes from Postgres. 14:41.000 --> 14:44.000 The command to create the slot would hang. 14:44.000 --> 14:47.000 And it would hang in a way where you couldn't stop it. 14:47.000 --> 14:50.000 Like even if you tried to Control-C or send a terminate command to the query, 14:50.000 --> 14:52.000 it would just get stuck perpetually. 14:52.000 --> 14:53.000 You couldn't really stop it. 14:53.000 --> 14:57.000 And what some customers had to do was to entirely restart the database, 14:57.000 --> 15:00.000 which, if it's Postgres and you're asking a customer to restart their Postgres, 15:00.000 --> 15:02.000 it's not a really good look. 15:02.000 --> 15:06.000 So the initial sort of thought, because we couldn't figure out what the issue was, 15:06.000 --> 15:08.000 was: was there an issue with their managed service? 15:08.000 --> 15:10.000 Was there an issue with RDS or GCP or so on and so forth? 15:10.000 --> 15:13.000 And what was a smoking gun for me was, you know, 15:13.000 --> 15:16.000 a customer actually sent their strace output, 15:16.000 --> 15:19.000 so the syscalls from the instance that was having the issue. 15:19.000 --> 15:23.000 And that, you know, helped me track it down. 15:23.000 --> 15:26.000 So this is like a pretty intricate issue, 15:26.000 --> 15:29.000 which I did write a blog post on if you're more interested in details.
15:29.000 --> 15:32.000 But what was actually the issue was, 15:32.000 --> 15:35.000 so when creating a replication slot, 15:35.000 --> 15:39.000 there is a step that requires waiting for older transactions 15:39.000 --> 15:41.000 to complete. 15:41.000 --> 15:44.000 This code doesn't function properly on a read replica. 15:44.000 --> 15:46.000 So on the primary Postgres instance, 15:46.000 --> 15:47.000 it works just fine. 15:47.000 --> 15:50.000 But on the read replica or hot standby, 15:50.000 --> 15:52.000 a bit of this code doesn't function properly. 15:52.000 --> 15:54.000 So it doesn't wait on a transaction, 15:54.000 --> 15:57.000 but it still thinks a transaction is running. 15:57.000 --> 15:59.000 So it just, you know, 15:59.000 --> 16:04.000 continues the loop indefinitely because it's not waiting. 16:04.000 --> 16:06.000 It's just checking and then it keeps failing. 16:07.000 --> 16:09.000 The problem with this loop is, 16:09.000 --> 16:12.000 this loop never checks for interrupts. 16:12.000 --> 16:14.000 Like, even if you signal this backend to, 16:14.000 --> 16:16.000 you know, terminate or cancel or whatever, 16:16.000 --> 16:19.000 because on a read replica, 16:19.000 --> 16:22.000 this code doesn't function the expected way, 16:22.000 --> 16:25.000 it would never receive the signal and so until this loop 16:25.000 --> 16:27.000 exits due to a transaction completing, 16:27.000 --> 16:31.000 the process would become unkillable. 16:31.000 --> 16:35.000 So the solution was not like a simple, 16:35.000 --> 16:37.000 like isolated fix.
16:37.000 --> 16:39.000 So I submitted a patch for this, 16:39.000 --> 16:41.000 the immediate fix, which is to just let a customer 16:41.000 --> 16:44.000 stop the query, not let it run forever, 16:44.000 --> 16:46.000 which was to just allow the slot creation process 16:46.000 --> 16:48.000 to be interrupted by signals, 16:48.000 --> 16:49.000 even on a read replica, 16:49.000 --> 16:52.000 irrespective of, you know, what was happening. 16:52.000 --> 16:54.000 This was committed to the Postgres code base, 16:54.000 --> 16:55.000 so it is now in 16:55.000 --> 16:58.000 most Postgres minor versions. 16:58.000 --> 17:01.000 There was a follow-up patch by a different member of the community, 17:01.000 --> 17:03.000 which is meant to highlight this wait, 17:04.000 --> 17:06.000 so to highlight the fact that we are stuck in this state, 17:06.000 --> 17:08.000 and this patch, 17:08.000 --> 17:10.000 unfortunately, has not been merged yet, 17:10.000 --> 17:12.000 so even though it's like a pretty small change 17:12.000 --> 17:14.000 which is to highlight the thing, 17:14.000 --> 17:17.000 it is not merged right now. 17:17.000 --> 17:20.000 A long term fix is to just not need this loop at all 17:20.000 --> 17:21.000 and to have read replicas, 17:21.000 --> 17:24.000 you know, just wait for transactions in a more efficient manner. 17:24.000 --> 17:27.000 This, unfortunately, needs a lot more thought, 17:27.000 --> 17:30.000 a lot more expertise from the folks who know Postgres, 17:30.000 --> 17:32.000 so it needs to be figured out and then committed. 17:32.000 --> 17:35.000 This is definitely not something that I can do right now.
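The immediate fix described above, letting a polling loop notice a cancel request, can be sketched in plain C. This is an illustration under stated assumptions and not the Postgres patch: `cancel_requested`, `on_cancel`, `wait_for_transaction`, and the two demo conditions are hypothetical names, and a simple signal-handler flag stands in for Postgres's own interrupt machinery.

```c
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>

/* Set from a signal handler; hypothetical stand-in for a cancel request. */
volatile sig_atomic_t cancel_requested = 0;

/* Signal handler: only records that a cancel was asked for. */
void on_cancel(int signo)
{
    (void) signo;
    cancel_requested = 1;
}

/*
 * Hypothetical polling loop: keeps checking until still_running() reports
 * false. Returns true if the wait completed, false if it was cancelled.
 * Without the flag check, a condition that never clears would make this
 * loop impossible to stop from the outside.
 */
bool wait_for_transaction(bool (*still_running)(void))
{
    assert(still_running != NULL);
    while (still_running()) {
        if (cancel_requested)
            return false;       /* the check the stuck loop was missing */
    }
    return true;
}

/* Demo condition: the transaction is already finished. */
bool finishes_immediately(void)
{
    return false;
}

/* Demo condition: the transaction never appears to finish. */
bool never_finishes(void)
{
    return true;
}
```

A caller would install the handler with `signal(SIGINT, on_cancel)` before waiting. With `never_finishes` as the condition, the loop returns only because of the flag check, which is the behavior the immediate fix was after; it does nothing about the busy polling itself, which the talk leaves to the longer-term fix.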
17:35.000 --> 17:38.000 So the conclusion for me, 17:38.000 --> 17:41.000 like, this time last year, 17:41.000 --> 17:43.000 I had no idea how to contribute to Postgres, 17:43.000 --> 17:45.000 and you know, I figured it out over time, 17:45.000 --> 17:49.000 but the weird part for me was like the hardest part of all this 17:49.000 --> 17:51.000 was not necessarily the code itself. 17:51.000 --> 17:53.000 It was just convincing myself that, you know, 17:53.000 --> 17:54.000 I had something worth saying, 17:54.000 --> 17:57.000 like it is not something I was hallucinating, 17:57.000 --> 18:00.000 it's actually something that I found out was a bug, 18:00.000 --> 18:03.000 and sending that very first email to the mailing list saying, 18:03.000 --> 18:04.000 I found this issue. 18:04.000 --> 18:06.000 This is true of open source 18:06.000 --> 18:08.000 in general: even if you find an issue, 18:08.000 --> 18:10.000 it may actually not be an issue, 18:10.000 --> 18:12.000 or even if it is, sometimes a contribution 18:12.000 --> 18:13.000 may get rejected or ignored. 18:13.000 --> 18:16.000 But despite all that, like regardless of that, 18:16.000 --> 18:19.000 like if you have something to contribute to any open source, 18:19.000 --> 18:22.000 whether it's Postgres or MariaDB or ClickHouse or whatever, 18:22.000 --> 18:24.000 even if it's a bug report or a 18:24.000 --> 18:26.000 docs update, like just send it, 18:26.000 --> 18:29.000 and I feel that open source projects need more eyes 18:29.000 --> 18:31.000 on them and not fewer. 18:31.000 --> 18:32.000 Thank you. 18:32.000 --> 18:35.000 Thank you.