WEBVTT

1
00:00:20.480 --> 00:00:22.080
<v Matt Godbolt>Hey, Ben.

2
00:00:22.080 --> 00:00:26.060
<v Ben Rady>Hey Matt, dancing to the theme music.

3
00:00:26.060 --> 00:00:34.530
<v Matt Godbolt>How are you doing, friend? We both are, I know. Again, we've said this before, but the new recording setup means that we get to play the intro music, if only just to get the timings roughly right.

4
00:00:34.530 --> 00:00:35.220
<v Ben Rady>Mm hmm.

5
00:00:35.220 --> 00:00:46.190
<v Matt Godbolt>And so that we are bopping away. And actually, thinking of that, in the background, not that our listener can see, because we never record the video to this, but there's a huge box of nonsense behind me, if you can see it over here.

6
00:00:46.190 --> 00:00:46.860
<v Ben Rady>Mm hmm.

7
00:00:46.860 --> 00:00:54.960
<v Matt Godbolt>So I'm traveling at the moment. I'm at my parents' house in England, which makes this an international recording.

8
00:00:54.960 --> 00:01:14.250
<v Matt Godbolt>And I've been mining all of the stuff that I left behind as a kid to send either home, which was actually kind of a miracle thing. I had two large boxes of cool stuff of mine that I'd left here because, you know, I wasn't planning on being in the States for more than a couple of years, but 14 years on I'm like, well, maybe I should take this home now.

9
00:01:14.250 --> 00:01:14.470
<v Ben Rady>Yep.

10
00:01:14.470 --> 00:01:14.700
<v Ben Rady>Right.

11
00:01:14.700 --> 00:01:19.280
<v Matt Godbolt>I just stuck a sticker on the side of these boxes and I took them to a local shop.

12
00:01:19.280 --> 00:01:19.540
<v Ben Rady>Yeah.

13
00:01:19.540 --> 00:01:30.840
<v Matt Godbolt>Magic happened and then my wife texted me a picture of them on my front porch in America. It's like, I know that's how that works, but I was still so excited to see my things get there in like a day and a half.

14
00:01:30.840 --> 00:01:41.680
<v Matt Godbolt>It was brilliant. But now I'm sending all of this stuff off to the person who composed our theme music, Inverse Phase, Brendan Becker, for his museum.

15
00:01:41.680 --> 00:01:41.740
<v Ben Rady>Nice.

16
00:01:41.740 --> 00:01:58.380
<v Matt Godbolt>In fact, he has got great news. He has just raised enough money to buy his own property, and he's moving his museum of cool games and weird and wonderful computer tech paraphernalia over the years into Pittsburgh.

17
00:01:58.380 --> 00:02:09.400
<v Matt Godbolt>So at some point next year, I will definitely be going to Pittsburgh to see him and the amazing things. Anyway, weirdest thing. What are we talking about today, Ben?

18
00:02:09.400 --> 00:02:19.740
<v Ben Rady>Today, in addition to that, in addition to cool museums, we are talking about

19
00:02:19.740 --> 00:02:25.360
<v Ben Rady>We are talking about messaging systems.

20
00:02:25.360 --> 00:02:40.100
<v Ben Rady>There are various systems in the world that you design so that some service somewhere is going to send you a reasonably small message, and you're going to process that message, and then you're going to send

21
00:02:40.100 --> 00:02:40.100
<v Matt Godbolt>Mm hmm.

22
00:02:40.100 --> 00:02:56.140
<v Ben Rady>one or maybe more or maybe zero messages out to another thing. And the whole architecture of the system is designed with just sort of this message passing in mind. And oftentimes when you have systems like this, you have distributed computing problems.

23
00:02:56.140 --> 00:02:56.520
<v Matt Godbolt>Yep.

24
00:02:56.520 --> 00:03:08.680
<v Ben Rady>You have sort of reproducibility concerns that you need to think about. And so I thought it would be a good idea to talk about some of the things in our experience, having built some systems like this.

25
00:03:08.680 --> 00:03:08.940
<v Matt Godbolt>Right.

26
00:03:08.940 --> 00:03:09.200
<v Matt Godbolt>Yep.

27
00:03:09.200 --> 00:03:12.120
<v Ben Rady>And we can talk about maybe what some of those systems were just for context.

28
00:03:12.120 --> 00:03:20.960
<v Ben Rady>But in our experience building systems like this, what are some things that you should do? And what are some things that you should definitely not do?

29
00:03:20.960 --> 00:03:34.080
<v Matt Godbolt>Interesting, yes. So you mentioned small messages there, so we're not talking about a bulk data thing here. What would be, like, a canonical example of this kind of system?

30
00:03:34.080 --> 00:03:44.740
<v Ben Rady>Yeah. Well, I think starting right off with some of the things that you should not do is I don't think that you should put gigabytes of data into something and call it a message.

31
00:03:44.740 --> 00:03:50.320
<v Matt Godbolt>Okay.

32
00:03:50.320 --> 00:03:58.310
<v Ben Rady>That is something that I would be skeptical of, if someone was like, well, can't we just take this, you know, three gigabyte file and stick it in there?

33
00:03:58.310 --> 00:03:58.500
<v Matt Godbolt>I mean, strictly speaking, you can, but you mean in these kinds of systems, you wouldn't want to...

34
00:03:58.500 --> 00:03:58.860
<v Ben Rady>It's like, maybe you shouldn't do that.

35
00:03:58.860 --> 00:04:21.320
<v Matt Godbolt>So, and again, message passing: we're talking about things like, broadly, something which could be using something like Kafka as a sort of mental model of, hey, you're going to just put a sequence of messages somewhere. Or it could be some other system, but I'm just putting Kafka in my head for now. It's something that probably most of the audience might have heard of.

36
00:04:21.320 --> 00:04:21.540
<v Ben Rady>Yeah.

37
00:04:21.540 --> 00:04:32.230
<v Matt Godbolt>And then obviously that's a great example of something you shouldn't do: putting a massive, massive, massive message into a message queue system. They're usually not good at larger pieces of data.

38
00:04:32.230 --> 00:04:34.040
<v Ben Rady>Mm hmm.

39
00:04:34.040 --> 00:04:44.560
<v Matt Godbolt>Sometimes your recipients will want to discard some messages, and if you force them to download a three gigabyte file just to discover they don't want it, that's not what you wanted.

40
00:04:44.560 --> 00:04:44.560
<v Ben Rady>Mm hmm.

41
00:04:44.560 --> 00:04:58.040
<v Matt Godbolt>That's not good behavior. So, you know, the typical solution I can think of for that is that you normally put bulk data somewhere else, be it, say, an S3 bucket, or some shared file system, or some other system.

42
00:04:58.040 --> 00:05:04.400
<v Matt Godbolt>And then you send a message that says, Hey, there exists some big data over somewhere else that you can get hold of. Is that the kind of thing?

43
00:05:04.400 --> 00:05:18.540
<v Ben Rady>Yeah, yes, that's what I've done as well: you have basically a pointer to some other large piece of data, whether it's a file in object storage or maybe even, you know, one thing I've seen is embedding a SQL query that's bitemporal.

44
00:05:18.540 --> 00:05:24.900
<v Ben Rady>So when you run it, you always get the same results. You can put that in the message and be like, oh, there's some data available here if you want to query it, right?

45
00:05:24.900 --> 00:05:25.520
<v Matt Godbolt>Oh, that's neat.

46
00:05:25.520 --> 00:05:37.900
<v Ben Rady>But the core idea here is: don't put a bunch of data into a messaging system, whether that's just a system that's passing messages or a queue, like Kafka or some other type of queue.

47
00:05:37.900 --> 00:05:37.900
<v Matt Godbolt>Yep.

48
00:05:37.900 --> 00:05:47.180
<v Ben Rady>Instead, put in something that allows the consumers of that stream to fetch that data if and when they want, based on maybe some metadata that you include in the message.
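
NOTE
Editor's sketch, not from the recording: the "pointer to bulk data" idea
described above, often called the claim-check pattern. The in-memory dict
and list are hypothetical stand-ins for an object store (such as an S3
bucket) and a message stream; the field names are assumptions.
```python
import json
# Hypothetical in-memory stand-ins for an object store and a message stream.
object_store = {}
stream = []
def publish_large(key, payload, content_type):
    """Store the bulk payload out of band and publish only a small pointer."""
    object_store[key] = payload
    message = {"type": "data_available", "key": key,
               "size_bytes": len(payload), "content_type": content_type}
    stream.append(json.dumps(message))
def consume(wanted_type):
    """Fetch bulk data only for messages whose metadata says it is wanted."""
    fetched = []
    for raw in stream:
        message = json.loads(raw)
        if message["content_type"] == wanted_type:
            fetched.append(object_store[message["key"]])
    return fetched
publish_large("trades/2024-01-01.csv", b"x" * 1000, "text/csv")
publish_large("images/logo.png", b"y" * 5000, "image/png")
csv_payloads = consume("text/csv")
```
The consumer that only wants CSV never touches the larger PNG payload; it
decides from the small message's metadata alone.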

49
00:05:47.180 --> 00:06:01.240
<v Matt Godbolt>Got it. Yep. Now, obviously, by doing that you have added another system to an otherwise straightforward system. Like, if I was testing this, I would need to mock out the retrieval system separately from the message queue system.

50
00:06:01.240 --> 00:06:06.760
<v Matt Godbolt>There's an allure to saying, Hey, let's just throw it in the one system and then everything's a message and we don't need anything outside.

51
00:06:06.760 --> 00:06:12.960
<v Matt Godbolt>So I can see why, but there is a blurry line. Like, you know, we threw out three gig of data as, like, that's too much, but maybe, you know, 300K, I don't know, maybe.

52
00:06:12.960 --> 00:06:13.000
<v Ben Rady>Mhm.

53
00:06:13.000 --> 00:06:18.620
<v Ben Rady>Yeah. Yeah, yeah, yeah. Yeah, right. It's like that maybe is fine. Yeah.

54
00:06:18.620 --> 00:06:37.580
<v Ben Rady>So yeah, all of that I think starts to become a lot more context sensitive. And maybe it's worthwhile talking about some of the systems that we have built, to paint a little picture of some of this context and be able to talk about the trade-offs in those contexts.

55
00:06:37.580 --> 00:06:44.040
<v Matt Godbolt>Yeah. Okay. So what kind of systems do you want to start with? What would you be happy talking about first?

56
00:06:44.040 --> 00:06:58.080
<v Ben Rady>Well, I mean, the three main systems that I think of that I built that are like this: there is a sort of infrastructure and monitoring system that I built at a trading firm.

57
00:06:58.080 --> 00:07:18.700
<v Ben Rady>And then at that same trading firm, I actually worked on... yes, that is, like, pantomiming the logo of the system that we built. And at that same firm, I actually also built a trading system for event trading.

58
00:07:18.700 --> 00:07:26.540
<v Ben Rady>So this is, like, discrete events that are happening in the world. And, you know, news is an example of that. And we would trade those events.

59
00:07:26.540 --> 00:07:33.540
<v Matt Godbolt>Right. So election results come in kind of thing and you're like, Hey, if, if this person wins and the market moves this way, or, you know, if some, if a drug gets [approved]...

60
00:07:33.540 --> 00:07:40.420
<v Ben Rady>Yeah, tweets, we would trade tweets. I mean, things like that, you know, press releases, those kinds of things.

61
00:07:40.420 --> 00:07:40.520
<v Matt Godbolt>Right.

62
00:07:40.520 --> 00:07:43.660
<v Ben Rady>And that was extremely latency sensitive, right?

63
00:07:43.660 --> 00:07:48.920
<v Ben Rady>Like, that trade is basically, you're racing the speed of light. And so that had its own special constraints.

64
00:07:48.920 --> 00:07:59.420
<v Matt Godbolt>Right, because you and everybody else know that if the Fed puts the interest rate up, then the market will react in a particular way, and you want to either take advantage of it or, you know, protect your own position, whatever.

65
00:07:59.420 --> 00:07:59.420
<v Ben Rady>Exactly.

66
00:07:59.420 --> 00:08:01.360
<v Matt Godbolt>But yeah, interesting.

67
00:08:01.360 --> 00:08:05.230
<v Ben Rady>Yeah, yeah, yeah. So, you know, in that example, a queue is just right out.

68
00:08:05.230 --> 00:08:05.420
<v Matt Godbolt>Yeah.

69
00:08:05.420 --> 00:08:06.440
<v Ben Rady>You can't queue anything.

70
00:08:06.440 --> 00:08:06.440
<v Matt Godbolt>ha

71
00:08:06.440 --> 00:08:18.300
<v Ben Rady>Right, like, that's not going to work. And then probably the third one was the system that we collectively built at Coinbase.

72
00:08:18.300 --> 00:08:19.760
<v Matt Godbolt>I was thinking about that one.

73
00:08:19.760 --> 00:08:34.680
<v Ben Rady>Which was an exchange, right? Like, Coinbase hired Matt and I and a few other people to build a replacement for their cloud-based exchange. And what happened with that is a big long story, which is maybe another podcast, but nonetheless...

74
00:08:34.680 --> 00:08:35.480
<v Matt Godbolt>Or not...

75
00:08:35.480 --> 00:08:44.060
<v Ben Rady>Yeah, right, or not, honestly. You can read about it on the internet if you want. How about that? I think that's the best way to do that.

76
00:08:44.060 --> 00:08:44.320
<v Matt Godbolt>Yes.

77
00:08:44.320 --> 00:08:52.560
<v Ben Rady>But nonetheless, we built an exchange. And that is very much a system like this, where you're passing messages around. So those are the three that sort of spring to mind for me.

78
00:08:52.560 --> 00:09:38.120
<v Matt Godbolt>Right. And just concretely, for those, you know, an exchange in this instance is a service where many people are sending messages into the system to buy and sell a commodity, in this instance various cryptocurrency coins and things. And yeah, we had to process those, and we had to process them fairly, and we had to process them at the lowest latency that was reasonable, and very, very, very reliably. And yeah, we used a very interesting design of a messaging system at the very core, the very guts of how it all fitted together, to give us certain properties that we wanted to be able to tell our clients that we had, you know, like fairness and guarantees over certain things, which was very interesting. Yeah, no, those are cool. Where do you want to start? Do you want to start with the monitoring system, or...?

79
00:09:38.120 --> 00:09:44.500
<v Ben Rady>Well, those are mine. Are there any others that you can kind of throw into the mix here?

80
00:09:44.500 --> 00:10:33.840
<v Matt Godbolt>I mean, I think, in general, receiving market data itself, that is, the information that is the exhaust from an exchange. So the publicly visible information, for some definition of public, about what's going on in any particular market is disseminated as a set of discrete messages that is ordained to you: you get a PDF from the exchange, and they say, this is how we're going to do it. But you have to be able to sort of keep up and read and process that. So there is a message processing system there, and that's the thing I have the most experience with, but I don't get the choice of designing it. I've just got to make sure I hit the spec of what's going on there. So I don't think of that in the same way as the other things. So let's just stick with yours, and I'll see if anything rings a bell with something that I have done.

81
00:10:33.840 --> 00:11:02.860
<v Ben Rady>Okay. But yeah, so examples of things to do and not do. So, you know, in the sort of latency-constrained world that I was living in with that event trade, and I would imagine in other places where you have latency constraints, you need to be very careful about the messages at rest, right? So a more dysfunctional form of this, I think, is you're building a messaging system, but in the middle of your messaging system, you put a database.

82
00:11:02.860 --> 00:11:10.020
<v Ben Rady>So you write data into the database, and then you have some other thing that is pulling data out of the database.

83
00:11:10.020 --> 00:11:10.020
<v Matt Godbolt>Right.

84
00:11:10.020 --> 00:11:14.740
<v Ben Rady>And it's like maybe got like a cursor or something where it's like, you know, I'm at like row 1000.

85
00:11:14.740 --> 00:11:23.560
<v Matt Godbolt>Right. You're tailing a log, effectively. That log just lives in a database, and you've got an insert on one side and a "select the next thing" on the other.

86
00:11:23.560 --> 00:11:30.120
<v Ben Rady>Yeah, yeah, yeah. And the terrible thing about those designs is that they kind of sort of mostly work a little bit, right?

87
00:11:30.120 --> 00:11:30.120
<v Matt Godbolt>Right.

88
00:11:30.120 --> 00:11:40.740
<v Ben Rady>So it's easy to trick yourself into thinking that you have something that will scale and you're like, oh yeah, you know, this database scales, I don't know, whatever, it's some cloud database and it scales infinitely, right?

89
00:11:40.740 --> 00:11:44.160
<v Ben Rady>Or I've got, you know, some cluster of these things and I can just scale it out horizontally.

90
00:11:44.160 --> 00:11:44.160
<v Matt Godbolt>Right, right, right.

91
00:11:44.160 --> 00:12:06.040
<v Ben Rady>But, you know, there's not really any magic there. If you've just got one table and you're writing things into the one table and you have lots of things reading from the one table, you need to really understand what that database is intended to do and what it's capable of doing, and maybe ask the question in that case: you know, do we need something more like Kafka?

92
00:12:06.040 --> 00:12:06.980
<v Ben Rady>Do we need something that is more of a traditional queue?

93
00:12:06.980 --> 00:12:18.940
<v Matt Godbolt>Right, because, I mean, not to throw anything in your way, but a good friend of mine once suggested that using a sequence of numbered files is a perfectly reasonable way of sending messages between systems.

94
00:12:18.940 --> 00:12:31.340
<v Matt Godbolt>And that's true as well. So I don't think you're saying that a database is not a solution to some problems, but certainly when latency is important, you've got too much non-determinism and there's too many moving parts.

95
00:12:31.340 --> 00:12:31.340
<v Ben Rady>Yeah, yeah. Right, yes.

96
00:12:31.340 --> 00:12:43.380
<v Matt Godbolt>So what do you do if you have a latency-sensitive application that needs to be able to react as fast as you possibly can, and you still want it to be a message passing system?

97
00:12:43.380 --> 00:12:43.700
<v Ben Rady>Mm-hmm.

98
00:12:43.700 --> 00:13:06.180
<v Ben Rady>Okay. I mean, so, you know, again, we're calling on some of our prior experiences here. Not storing the messages, right? Like, having the sender and the receiver directly sending messages to each other, either over, you know, TCP or some sort of reliable multicast protocol, which, you know, you can Google various options there and see what you like.

99
00:13:06.180 --> 00:13:07.700
<v Matt Godbolt>I was going to say, that's a whole episode.

100
00:13:07.700 --> 00:13:20.900
<v Ben Rady>Yeah, right. That is a great way to sort of reduce that latency. It does put constraints on the consumers, depending on exactly how you do it, to either not create back pressure or to deal with that back pressure in some way.

101
00:13:20.900 --> 00:13:20.920
<v Matt Godbolt>Yep.

102
00:13:20.920 --> 00:13:27.920
<v Ben Rady>Like, you know, the fundamental question to ask is if the consumer doesn't consume the data, what happens? Right?

103
00:13:27.920 --> 00:13:27.920
<v Matt Godbolt>Yep.

104
00:13:27.920 --> 00:13:50.980
<v Ben Rady>Where does it live? Does it get dropped? Does it get stuffed somewhere else that it reads later? And how would it ever possibly catch up? So there's all sorts of concerns to think about there. But fundamentally, if you've got something where you've got some latency constraint, I think, if you're attacking that problem as "I'm going to write my messages into some sort of storage thingy and then read them back out again,"

105
00:13:50.980 --> 00:13:58.520
<v Ben Rady>you just need to be really careful about what kind of latency that's going to introduce, and maybe just going directly is better.

106
00:13:58.520 --> 00:14:00.720
<v Matt Godbolt>Right.

107
00:14:00.720 --> 00:14:13.820
<v Matt Godbolt>Right, and I suppose in the limit, if you can do this, which obviously we've kind of glossed over already, being on the same physical computer means that you can use shared memory transport type things and a queue that lives only in memory.

108
00:14:13.820 --> 00:14:13.820
<v Ben Rady>Mm hmm. Mm hmm.

109
00:14:13.820 --> 00:14:29.680
<v Matt Godbolt>So there's a queue, but only because you have to have somewhere to put it, you know, so a double buffer, or even, in the limit, like, I'm writing to this thing in process A and process B is just waiting for the okay to read from it as soon as it's finished being written to.

110
00:14:29.680 --> 00:14:40.420
<v Matt Godbolt>But, you know, all the things that I've been thinking about so far have involved some network traffic happening in a more distributed system, rather than something that can be literally co-located.

111
00:14:40.420 --> 00:14:50.420
<v Matt Godbolt>Because, of course, as even more of a limiting case, they're in the same thread and they just literally have memory mapped, and, you know, a global variable is being set or whatever. A shared variable, I should say.

112
00:14:50.420 --> 00:14:51.860
<v Ben Rady>Mm hmm.

113
00:14:51.860 --> 00:15:00.900
<v Matt Godbolt>Yeah, so storing the data, or sorry, durability of the data, is sort of orthogonal. You don't always need durability.

114
00:15:00.900 --> 00:15:08.540
<v Matt Godbolt>Something like Kafka will always give you durability. And as you say, that's the thing that stores it kind of first, and then everybody gets a copy of it from the brokers that have already stored it.

115
00:15:08.540 --> 00:15:21.360
<v Matt Godbolt>There's a quorum-based thing here, and we know that if a message has been sent, then before anyone sees it, some configurable amount of durability has taken place, such that you know that that message has not been lost.

116
00:15:21.360 --> 00:15:22.080
<v Ben Rady>Mm hmm.

117
00:15:22.080 --> 00:15:39.200
<v Matt Godbolt>And it will definitely be there again if you have to go back and get it. And then there's something on the back end as well where you can say, I know that this message definitely got processed by at least one of the people that were supposed to do anything with this message. And so that's really, really good when you're talking about things like financial transactions and other things where you're like, it absolutely needs to happen.

118
00:15:39.200 --> 00:15:44.360
<v Matt Godbolt>We need to have a journal of record. And that journal is more important than the latency hit we have.

119
00:15:44.360 --> 00:15:44.460
<v Ben Rady>Mm-hmm. Yeah.

120
00:15:44.460 --> 00:15:46.100
<v Matt Godbolt>In the case of your event trade, presumably,

121
00:15:46.100 --> 00:15:46.100
<v Ben Rady>Mm-hmm.

122
00:15:46.100 --> 00:16:10.490
<v Matt Godbolt>if you dropped a message, or, again, back-pressure related things here, maybe dropping the message is okay, because it's better to not hold up the fast people for that one slower consumer, and have that message be missed by that consumer, than it is to cause them to potentially fire an order too late, or some other issue there, right?
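
NOTE
Editor's sketch, not from the recording: one way to express "drop for the
slow consumer rather than hold up the fast ones". Each consumer gets a
bounded in-memory buffer, and a full buffer loses the new message. The
class name LossyFanout and the capacity of 2 are illustrative assumptions.
```python
from collections import deque
class LossyFanout:
    """Fan out to consumers through bounded buffers; a full buffer drops the new message."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.buffers = {}
        self.dropped = {}
    def subscribe(self, name):
        self.buffers[name] = deque()
        self.dropped[name] = 0
    def publish(self, message):
        for name, buffer in self.buffers.items():
            if len(buffer) >= self.capacity:
                self.dropped[name] += 1  # slow consumer misses this message
            else:
                buffer.append(message)
    def poll(self, name):
        buffer = self.buffers[name]
        return buffer.popleft() if buffer else None
fanout = LossyFanout(capacity=2)
fanout.subscribe("fast")
fanout.subscribe("slow")
fast_seen = []
for i in range(5):
    fanout.publish(i)
    fast_seen.append(fanout.poll("fast"))  # the fast consumer keeps up
```
The publisher never blocks: the fast consumer sees every message, while
the consumer that never polls keeps only the first two and loses the rest.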

123
00:16:10.490 --> 00:16:10.520
<v Ben Rady>Mm hmm.

124
00:16:10.520 --> 00:16:19.860
<v Ben Rady>Yeah, yeah. And another actually interesting dimension of that particular system, which I think is worth talking about, is that the messages were not sequenced.

125
00:16:19.860 --> 00:16:24.790
<v Ben Rady>We had lots of different messages coming in from different data centers.

126
00:16:24.790 --> 00:16:25.220
<v Matt Godbolt>Interesting.

127
00:16:25.220 --> 00:16:32.910
<v Ben Rady>that were all hitting the same system. And it didn't really matter what sequence they arrived in, right? The system could deal with that in different ways.

128
00:16:32.910 --> 00:16:33.220
<v Matt Godbolt>Oh, that is interesting. Yeah.

129
00:16:33.220 --> 00:16:41.080
<v Ben Rady>But oftentimes, it is very useful to be able to sequence a stream of messages because that allows you to do things like create a state machine

130
00:16:41.080 --> 00:16:41.080
<v Matt Godbolt>Yes.

131
00:16:41.080 --> 00:17:01.240
<v Ben Rady>And then any consumer of that stream should be able to reproduce the same state of the state machine from the sequence of events. And obviously, a classic example of this in finance is building a book. But there are lots of situations in which you want to have a sequenced stream of events that you can use to reproduce state in any consumer that sees that stream.
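
NOTE
Editor's sketch, not from the recording: a toy version of "any consumer
can reproduce the same state from the sequenced stream". The event shapes
(add, reduce, cancel) and field names are assumptions for illustration
only, far simpler than a real exchange feed.
```python
def apply_event(book, event):
    """Apply one event to a book mapping order id to (side, price, quantity)."""
    kind = event["kind"]
    if kind == "add":
        book[event["id"]] = (event["side"], event["price"], event["qty"])
    elif kind == "cancel":
        book.pop(event["id"], None)
    elif kind == "reduce":
        side, price, qty = book[event["id"]]
        book[event["id"]] = (side, price, qty - event["qty"])
    return book
def replay(events):
    """Rebuild state from an empty book by applying events in sequence order."""
    book = {}
    for event in sorted(events, key=lambda e: e["seq"]):
        apply_event(book, event)
    return book
events = [
    {"seq": 1, "kind": "add", "id": "A", "side": "buy", "price": 100, "qty": 10},
    {"seq": 2, "kind": "add", "id": "B", "side": "sell", "price": 101, "qty": 5},
    {"seq": 3, "kind": "reduce", "id": "A", "qty": 4},
    {"seq": 4, "kind": "cancel", "id": "B"},
]
consumer_one = replay(events)
consumer_two = replay(list(reversed(events)))  # arrival order differs, sequence does not
```
Both consumers end with the same book because the sequence number, not
the arrival order, decides the order in which events are applied.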

132
00:17:01.240 --> 00:17:09.560
<v Matt Godbolt>Right, this is like log-structured journals of information, like databases and things; you just need to be able to process them in strict sequence.

133
00:17:09.560 --> 00:17:09.560
<v Ben Rady>Mm-hmm.

134
00:17:09.560 --> 00:17:26.700
<v Matt Godbolt>So, like, you mentioned building a book in our world, which is taking this multicast data that flows from the exchange and applying it as the set of modifications to an empty state, to bring your world up to date with whatever orders are flying around and are currently active.

135
00:17:26.700 --> 00:17:40.860
<v Matt Godbolt>And you absolutely have to apply them in the right sequence, or else things go horribly wrong. But in that instance, there is a single producer; at least for any one book, there is exactly one producer that can give you a sequence number.

136
00:17:40.860 --> 00:18:13.680
<v Matt Godbolt>And therefore you can see if the messages arrive in order. And so that's an easier proposition. And again, for those folks who are thinking, like, TCP: if you've got a single connection that's TCP one end to the other, then the messages that are being sent aren't going to be reordered anyway; that's a property of the transport. But in general, for the kind of UDP messages that we talk about in finance, that's not true. And you need to be able to see if you have either received messages out of order, or in fact missed one, so that you need to go and get it from some other place.
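
NOTE
Editor's sketch, not from the recording: detecting reordering and gaps on
an unordered transport using the producer's sequence numbers. The class
name SequenceTracker is hypothetical, and how a missing message is
actually re-requested is left out, since that depends on the feed.
```python
class SequenceTracker:
    """Release messages in sequence order and report gaps that need recovery."""
    def __init__(self, first_seq=1):
        self.next_seq = first_seq
        self.pending = {}
    def on_message(self, seq, payload):
        """Return the payloads that are now deliverable in order."""
        if seq < self.next_seq or seq in self.pending:
            return []  # duplicate or already delivered
        self.pending[seq] = payload
        ready = []
        while self.next_seq in self.pending:
            ready.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return ready
    def missing(self):
        """Sequence numbers skipped so far, which would be re-requested elsewhere."""
        if not self.pending:
            return []
        return [s for s in range(self.next_seq, max(self.pending)) if s not in self.pending]
tracker = SequenceTracker()
delivered = []
delivered += tracker.on_message(1, "a")
delivered += tracker.on_message(3, "c")   # 2 has not arrived yet
gap = tracker.missing()
delivered += tracker.on_message(2, "b")   # late arrival fills the gap
```
Message 3 is held back until 2 shows up, so downstream code still sees
a, b, c in order even though the network delivered them as a, c, b.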

137
00:18:13.680 --> 00:18:26.400
<v Matt Godbolt>So that's an interesting property, again, of messages. So we've already talked about durability as one sort of dimension. Another dimension is, like, what are the constraints on reproducibility and sequencing, which kind of go hand-in-hand?

138
00:18:26.400 --> 00:18:46.780
<v Matt Godbolt>So just to take another point here: with something like Kafka, by putting it through a broker, somebody who's responsible, at least for a single stream in Kafka, you have also, as well as the durability guarantees, got a single place of record where the ordering is kind of set in stone.

139
00:18:46.780 --> 00:18:47.180
<v Ben Rady>right

140
00:18:47.180 --> 00:18:59.600
<v Matt Godbolt>And so a subsequent read of that will give you back the things in the same order that everyone saw it in. And that's a useful property in some cases. But going back to your event trade, you are saying that that's something that you could actually tolerate.

141
00:18:59.600 --> 00:19:04.980
<v Matt Godbolt>And in fact, you didn't want to take the hit for receiving from multiple systems, right?

142
00:19:04.980 --> 00:19:04.980
<v Ben Rady>Right. Right.

143
00:19:04.980 --> 00:19:23.280
<v Ben Rady>Yeah, the sequencing process would just slow that down, so we couldn't do it, right? You have to just design the system to be tolerant of that. But I think something that's really important to understand, and this is true of Kafka, and this might be just a general CAP theorem thing, is that if you're going to get a sequenced stream of events,

144
00:19:23.280 --> 00:19:34.200
<v Ben Rady>then it can be very difficult to build a system that can scale horizontally with that constraint. Because something has to be, you know, the sequencer.

145
00:19:34.200 --> 00:19:39.880
<v Matt Godbolt>Right, the arbiter of what time things happen, which came first, right? Yeah.

146
00:19:39.880 --> 00:19:51.420
<v Ben Rady>Which came first? Yeah. And in the particular case of Kafka, I forget topic versus stream and exactly how that is. But it's like the thing that gives you that ordering guarantee cannot scale horizontally.

147
00:19:51.420 --> 00:20:06.830
<v Matt Godbolt>That is, yes, the stream within a topic. So topics can have multiple streams, and those streams are kind of a unit by which they are given to individual members of the Kafka cluster, and of course you can have multiple processes and threads and whatever. So essentially, by sending to a single stream, you're sending to a single

148
00:20:06.830 --> 00:20:08.780
<v Ben Rady>right Yeah.

149
00:20:08.780 --> 00:20:21.480
<v Matt Godbolt>...single destination, and that's the thing that gets to decide, but there's only one of them. If you need to go faster, you need two of them, and now suddenly you no longer have this nice guarantee of a total ordering.

150
00:20:21.480 --> 00:20:24.020
<v Matt Godbolt>And that's what we're talking about here, a total ordering.
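
NOTE
Editor's sketch, not from the recording: why adding a second sequencer
gives up total ordering. In Kafka's own terminology the per-topic unit
the speakers describe is a partition, and ordering is guaranteed only
within one partition. The code below is a plain-Python model, not the
Kafka client API; the names Partition and partition_for are hypothetical.
```python
import zlib
class Partition:
    """A single sequencer: it alone assigns offsets for its messages."""
    def __init__(self):
        self.log = []
    def append(self, message):
        self.log.append(message)
        return len(self.log) - 1  # offset within this partition only
def partition_for(key, count):
    """Stable hash so the same key always lands on the same partition."""
    return zlib.crc32(key.encode()) % count
partitions = [Partition() for _ in range(2)]
placements = []
for key, message in [("AAPL", "a1"), ("MSFT", "m1"), ("AAPL", "a2"), ("MSFT", "m2")]:
    index = partition_for(key, len(partitions))
    placements.append((key, index, partitions[index].append(message)))
aapl_offsets = [offset for key, _, offset in placements if key == "AAPL"]
```
Messages with the same key stay ordered relative to each other, but an
offset in one partition says nothing about whether that message came
before or after a message in the other partition.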

151
00:20:24.020 --> 00:20:27.840
<v Ben Rady>Right. Yeah, yeah. So there's some important trade-offs to consider there.

152
00:20:27.840 --> 00:20:34.020
<v Matt Godbolt>So why not just use the time as the total ordering?

153
00:20:34.020 --> 00:20:36.940
<v Ben Rady>[Laughs] Well, how much time do you have? Because, pun intended.

154
00:20:36.940 --> 00:20:40.850
<v Matt Godbolt>Uh, well, you said you, you said you had an hour, so, uh, I'm taking you at your word.

155
00:20:40.850 --> 00:20:51.600
<v Ben Rady>Well, so to start with, what precision? Because whatever precision you choose, you're going to get some amount of collision, right?

156
00:20:51.600 --> 00:20:51.600
<v Matt Godbolt>All of the precision.

157
00:20:51.600 --> 00:20:58.940
<v Ben Rady>These two events happened at the same nanosecond. Which comes first? I don't know, right?

158
00:20:58.940 --> 00:21:00.480
<v Matt Godbolt>I mean, ah yeah, yeah, no, exactly.

159
00:21:00.480 --> 00:21:03.860
<v Ben Rady>Right, like that's not a deterministic sort order, right?

160
00:21:03.860 --> 00:21:14.820
<v Matt Godbolt>And, yeah, you think that never happens, and then, you know, the birthday paradox kind of thing means that it happens a little bit more often than you would otherwise naively think.

161
00:21:14.820 --> 00:21:27.400
<v Matt Godbolt>But yeah, I'm going to admit here, we did use nanoseconds since 1970 as, like, a global key for packets arriving in one of the products I worked on a number of companies ago.

162
00:21:27.400 --> 00:21:27.400
<v Ben Rady>Mmhm.

163
00:21:27.400 --> 00:21:37.090
<v Matt Godbolt>And the solution there was a post-process that arbitrarily picked one of them if it found two that had the same timestamp, and just added one nanosecond until it didn't match anymore, right?
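
NOTE
Editor's sketch, not from the recording: the "add one nanosecond until it
no longer collides" post-process described above. The function name and
sample values are hypothetical; the original product's code is not shown
in the episode.
```python
def make_unique(timestamps_ns):
    """Bump colliding nanosecond timestamps by 1 ns until each key is unique."""
    seen = set()
    unique = []
    for ts in timestamps_ns:
        while ts in seen:
            ts += 1
        seen.add(ts)
        unique.append(ts)
    return unique
keys = make_unique([1000, 1000, 1001, 1000, 2000])
```
The keys come out unique, but the bumped values are arbitrary: the packet
that really arrived at 1001 ends up keyed as 1002, so this buys a usable
key rather than a true ordering.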

164
00:21:37.090 --> 00:21:37.120
<v Ben Rady>Yeah, right, right, right, right, right.

165
00:21:37.120 --> 00:21:45.460
<v Matt Godbolt>It's like, pragmatically, it mostly never happens. But when it does, it really blows your system up. So yeah, and then, so, how much precision?

166
00:21:45.460 --> 00:21:45.580
<v Ben Rady>Yes, it's.

167
00:21:45.580 --> 00:22:14.470
<v Matt Godbolt>Great question. And you know, you and I have been fortunate enough to work in the finance industry, where we already like to have accurate time. So getting time that's accurate to within low digits of nanoseconds is feasible for us, but for most people that isn't an option. You can get milliseconds at best, and NTP will get you within plus or minus fifteen, maybe twenty milliseconds. You know, better than two people synchronizing their watches in an old spy movie, but not that much better.

168
00:22:14.470 --> 00:22:14.720
<v Ben Rady>Right, right.

169
00:22:14.720 --> 00:22:26.280
<v Ben Rady>Yeah, yeah, yeah. And I do think it's sort of that false precision problem that leads you into this trap where you're just like, well, this nanosecond precision timestamp, what are the odds, like they can't even physically arrive at the same time.

170
00:22:26.280 --> 00:22:39.300
<v Ben Rady>Like the photons don't move like that. It's like, okay, but then what happens when your clocks are just off, right? Like you're just, they're just not that precise. And so you get two things that have the same timestamp because your clocks just aren't that precise.

171
00:22:39.300 --> 00:22:55.340
<v Matt Godbolt>Right. And, you know, as soon as you have more than one cable: the photons don't move that way, but you can have two parallel streams of photons that do arrive at exactly the same time, and so it can and does happen. So yeah, you can't just use time. And anyway, whose time are we talking about? Because, you know, yes.

172
00:22:55.340 --> 00:23:09.360
<v Ben Rady>Right, right. Now we're getting into the whole problem. This is a whole other category of this, which is clock domains, right? Like synchronizing time between multiple computers is hard.

173
00:23:09.360 --> 00:23:23.040
<v Ben Rady>It requires thought and oftentimes specialized equipment. And if you just sort of take it for granted that all clocks everywhere are the same, you're setting yourself up for a lot of hurt. Like, the hurt is coming for you.

174
00:23:23.040 --> 00:23:23.520
<v Matt Godbolt>Right.

175
00:23:23.520 --> 00:23:39.560
<v Ben Rady>So anytime that you're going to be comparing time, you need to be thinking about what is the source of those clocks, and how precise are they, and how accurate are they, and how are you going to deal with the differences between them, and what are those differences?

176
00:23:39.560 --> 00:24:08.520
<v Ben Rady>What can they be, and, you know, what are the things there? So it can go all the way to, you know, we've got a GPS antenna that's sitting on the top of the building. And we know the precise geographic coordinates of that antenna. And we know how long the cable is from that antenna to all of the various servers that are using that antenna to synchronize their time. And from the length of those cables, we can compute the delay from the received signal at the antenna to each of the individual computers, right?

177
00:24:08.520 --> 00:24:17.900
<v Ben Rady>And unless you're taking that level of precaution or something kind of like it, I would not trust any nanosecond timestamp to be greater or less than anything else, right?

178
00:24:17.900 --> 00:24:26.340
<v Matt Godbolt>You've missed out even some bits there. You know, like, when we were doing stuff at previous companies, there would be a rubidium-based oscillator with a very high...

179
00:24:26.340 --> 00:24:26.340
<v Ben Rady>Yeah.

180
00:24:26.340 --> 00:24:35.180
<v Matt Godbolt>You know, there's an oven that's got, like, rubidium at some temperature, and that's the thing that you synchronize with the GPS, and everything synchronizes to that with some complicated protocol and...

181
00:24:35.180 --> 00:24:35.510
<v Ben Rady>yeah

182
00:24:35.510 --> 00:24:35.820
<v Ben Rady>Yep.

183
00:24:35.820 --> 00:25:00.340
<v Matt Godbolt>Yeah, well, no, I say it's complicated. This is my favorite protocol. And I remember one of our network engineers saying to me, yeah, we use PPS to synchronize the master clock with the individual clocks on each of the machines. I'm like, PPS, wow, what's that? Because I've heard of NTP, and I've heard of PTP, but PPS? And he's like, it stands for pulse per second.

184
00:25:00.340 --> 00:25:06.560
<v Matt Godbolt>And it's like, literally, it goes to five volts once a second, on the second, and I'm like, oh, right, that's the protocol.

185
00:25:06.560 --> 00:25:07.640
<v Ben Rady>Yeah, yeah, yeah.

186
00:25:07.640 --> 00:25:09.100
<v Matt Godbolt>Just on and off, got it.

187
00:25:09.100 --> 00:25:10.820
<v Ben Rady>This is good. It's a simple protocol.

188
00:25:10.820 --> 00:25:21.600
<v Matt Godbolt>It's a simple protocol. But yeah, again, you talk about the cables: they were very carefully measured and very carefully designed so that the delays they brought in were understood.

189
00:25:21.600 --> 00:25:23.980
<v Matt Godbolt>So yeah, it's complicated.

190
00:25:23.980 --> 00:25:25.500
<v Ben Rady>Yeah, yeah, yeah. Right, right.

191
00:25:25.500 --> 00:25:33.650
<v Matt Godbolt>And reasonable people could disagree, because yeah, you can have a data center full of things that uses your discipline for clock synchronization, which you're maybe happy with.

192
00:25:33.650 --> 00:25:33.940
<v Ben Rady>Oh, yeah.

193
00:25:33.940 --> 00:26:03.020
<v Matt Godbolt>But if you take a message from, say, an exchange, and the exchange says, hey, this happened at this point in time, you have to trust their ability to manage that if you want to say, well, why don't we use their clocks? You know, whatever we're doing on our side, forget it. Let's just use the clocks from the remote people. We have been through this process. You're like, well, that makes sense. You know, they surely have done something sane. And then of course, what if they haven't? I mean, who would ever cast aspersions on our friends who have a difficult job maintaining these systems, but like,

194
00:26:03.020 --> 00:26:03.500
<v Ben Rady>Yeah.

195
00:26:03.500 --> 00:26:18.800
<v Matt Godbolt>Things have gone wrong before, and then suddenly you're thrown into a world of hurt because time went backwards by tens of nanoseconds, and you're like, no, I always expect time to go forwards, because, you know, that's one of the few truths, along with taxes and death: time goes forwards.

196
00:26:18.800 --> 00:26:44.620
<v Ben Rady>Nope, you think it does. But I mean, I think that raises a really good point, which is that one way you can get around this time synchronization difficulty is to never use the system time of the computers that are in the messaging system, and to embed time in the messages, right? And then the ultimate source of the messages is the thing that has to have a reasonably accurate time.

197
00:26:44.620 --> 00:26:58.380
<v Ben Rady>But the sense of time for all of the downstream systems just comes from that. And that is really important if you want to do what we were kind of talking about earlier, where you have a sequence of messages and you're trying to reconstitute state based on that sequence of messages.

198
00:26:58.380 --> 00:27:09.160
<v Ben Rady>If there's any sort of time processing that has to happen, then embedding the time in the messages allows you to reconstitute that state retroactively, right?

199
00:27:09.160 --> 00:27:12.820
<v Ben Rady>So you can go back and you can replay the messages from three months ago

200
00:27:12.820 --> 00:27:13.200
<v Matt Godbolt>Yep.

201
00:27:13.200 --> 00:27:25.340
<v Ben Rady>and reconstitute whatever state you have, even if it depends on time, because it doesn't depend on the clock of the computer that's just running the simulation or the reproduction. It's extracting that time from the messages themselves.

202
00:27:25.340 --> 00:27:25.340
<v Matt Godbolt>Right.

203
00:27:25.340 --> 00:27:28.100
<v Ben Rady>So you will always get exactly the same result.
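
NOTE
A minimal sketch of replaying with time taken only from the messages, as described above.
Message, WindowSum and replay are hypothetical names for illustration; the point is that on_message never reads the host clock, so two replays of the same log agree.
```python
from dataclasses import dataclass
@dataclass(frozen=True)
class Message:
    time_ns: int   # time embedded by the original source of the message
    value: int
class WindowSum:
    """Sum of values seen in the last `window_ns`, measured in message time."""
    def __init__(self, window_ns):
        self.window_ns = window_ns
        self.recent = []
    def on_message(self, msg):
        # "Now" is whatever the message says it is; the host clock is never read.
        cutoff = msg.time_ns - self.window_ns
        self.recent = [m for m in self.recent if m.time_ns > cutoff]
        self.recent.append(msg)
        return sum(m.value for m in self.recent)
def replay(messages):
    state = WindowSum(window_ns=1_000)
    return [state.on_message(m) for m in messages]
log = [Message(0, 1), Message(500, 2), Message(1_200, 4), Message(5_000, 8)]
first_run = replay(log)
second_run = replay(log)
```
Had on_message called the system clock instead, a replay three months later would expire every earlier message immediately and produce different sums.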

204
00:27:28.100 --> 00:28:22.240
<v Matt Godbolt>Yeah, just to take a temporary diversion here, this is one of the things that, in the code base that I was working on, we used different types for the different types of time. So they were literally not comparable or convertible between each other without an explicit thing I could search for in the code saying, like, we're doing this, we're crossing clock domains right now. I am trying to look at the current time as measured by whatever process has given me the time on my computer, and I'm comparing it to the message time that was embedded in the message through some mechanism, and I have to know that that comes with this huge bag of caveats. It's sometimes useful to do it, because one thing you might want to do is measure the skew between the two, just to graph it somewhere, or just to keep track of it, or just to alert if it gets more than a few hundred milliseconds or something out. So you do want to be able to do it, but you definitely don't want to be able to do it just by saying `time t = clock.now - message.time`.

205
00:28:22.240 --> 00:28:23.840
<v Ben Rady>Yeah, yeah.

206
00:28:23.840 --> 00:28:36.020
<v Matt Godbolt>It should be, no, so that's a syntax error, right? The thing is going to fail to compile there. You have to do some work here. And, you know, that's always been a worthwhile thing I've found to do.
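
NOTE
A rough sketch of separate types per clock domain. All names here (LocalTimeNs, MessageTimeNs, skew_across_clock_domains) are hypothetical, not the code base discussed.
Python cannot reject the mix at compile time as described above, so this version raises TypeError at run time instead.
```python
class LocalTimeNs:
    """A reading of this machine's clock, in nanoseconds."""
    def __init__(self, ns):
        self.ns = ns
    def __sub__(self, other):
        # Only readings from the same clock domain may be subtracted.
        if not isinstance(other, LocalTimeNs):
            return NotImplemented
        return self.ns - other.ns
class MessageTimeNs:
    """A timestamp embedded in a message by its sender, in nanoseconds."""
    def __init__(self, ns):
        self.ns = ns
    def __sub__(self, other):
        if not isinstance(other, MessageTimeNs):
            return NotImplemented
        return self.ns - other.ns
def skew_across_clock_domains(local, message):
    """The one explicit, searchable place where the two domains are compared."""
    return local.ns - message.ns
now = LocalTimeNs(1_700_000_000_000_000_500)
sent = MessageTimeNs(1_700_000_000_000_000_000)
skew_ns = skew_across_clock_domains(now, sent)
try:
    now - sent
    mixing_allowed = True
except TypeError:
    mixing_allowed = False
```
Searching the code for skew_across_clock_domains then lists every place that accepts the bag of caveats.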

207
00:28:36.020 --> 00:28:46.540
<v Matt Godbolt>And even within a computer, you know, there are different clocks. You've got monotonic clocks that are guaranteed to not go backwards. You've got clocks that try and adjust because of, like, the NTP drift as they're readjusting themselves.

208
00:28:46.540 --> 00:28:50.560
<v Matt Godbolt>You've got like the CPU cycle counter, which is measured in its own domain.
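
NOTE
A small illustration of several clocks on one machine, using Python's standard time module.
The CPU cycle counter mentioned above is not exposed by the standard library; perf_counter_ns stands in here as another clock with its own arbitrary origin.
```python
import time
wall_ns = time.time_ns()              # wall clock: nanoseconds since 1970, may be stepped or slewed
mono_start = time.monotonic_ns()      # monotonic clock: arbitrary origin, never goes backwards
perf_start = time.perf_counter_ns()   # high-resolution timer, arbitrary origin
time.sleep(0.01)
mono_elapsed = time.monotonic_ns() - mono_start
perf_elapsed = time.perf_counter_ns() - perf_start
```
Only differences within the same clock are meaningful: subtracting mono_start from wall_ns would mix two domains.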

209
00:28:50.560 --> 00:28:50.560
<v Ben Rady>Mmhmm.

210
00:28:50.560 --> 00:29:13.070
<v Matt Godbolt>So this is something that's useful to have more generally. Gosh, this is really going off topic, isn't it? This is great. But no, it's a really important thing to know about. I think it's worth saying as well, just because it's cool, that it is possible to get networking hardware to add a timestamp onto the end of packets that flow through it.

211
00:29:13.070 --> 00:29:13.100
<v Ben Rady>Mmhmm.

212
00:29:13.100 --> 00:29:15.720
<v Matt Godbolt>So there are certain switches that you can configure.

213
00:29:15.720 --> 00:29:15.720
<v Ben Rady>Mmhmm.

214
00:29:15.720 --> 00:30:11.620
<v Matt Godbolt>You can plug them into this PPS and get them to synchronize with your very accurate time source. And then every message that flows through that switch gets a payload tacked on the back of each packet, after the end of what would normally be the UDP packet or the TCP packet or whatever, and you need to use exotic mechanisms to go and actually pull those bytes out, but they are there. And then you can have, like, a source of truth at maybe the edge of your network: as things come in from the outside world, you say, well, this is where we're going to timestamp it. And that's useful for reconstituting the sequence in which they arrived at the edge, which is not necessarily the order in which they arrived at you, because cables can vary within the system and routes within your system can vary, but it gives you something to measure things by. And in particular, when you're doing some of those more latency-sensitive things that we were talking about, you have a sort of ground truth comparison: you can look at that timestamp for the thing that came in, and look at the timestamp of your message that went out of

215
00:30:11.620 --> 00:30:22.320
<v Matt Godbolt>the system. That's literally how long it took, warts and all, every network hop. Anyway, that's one of the many sources of clock domains. And we were talking about clock domains in the context of ordering.

216
00:30:22.320 --> 00:30:22.960
<v Matt Godbolt>So yeah, go ahead.

217
00:30:22.960 --> 00:30:46.610
<v Ben Rady>Yeah, well, and that actually brings up another topic, which is that timestamping is an example of something else that is a really good practice, which is tracing. As the message flows through your system and as it's being processed at each stage, it is quite often useful to be able to embed in the message, or maybe as a wrapper around the message, depending on how you do it,

218
00:30:46.610 --> 00:30:46.920
<v Matt Godbolt>Yes.

219
00:30:46.920 --> 00:31:01.150
<v Ben Rady>information about the tracing. And that can be useful for performance. It can be useful for, you know, error debugging, just general observability, figuring out, like, hey, this message failed to process...

220
00:31:01.150 --> 00:31:01.900
<v Matt Godbolt>Yep. Well, debugging. Yeah.

221
00:31:01.900 --> 00:31:17.800
<v Ben Rady>Why? Where did it stop? What problems did it run into? Or it was really slow to process. Why? What was the bottleneck? What was the slow part? And, you know, sometimes you'll do things like creating some sort of identifier at the point of ingestion or message creation.

222
00:31:17.800 --> 00:31:30.210
<v Ben Rady>And then you can have an external system that refers to the message as it flows through using that identifier. Or sometimes you're literally just adding information into the message object as it's flowing through...

223
00:31:30.210 --> 00:31:30.360
<v Matt Godbolt>Right.

224
00:31:30.360 --> 00:31:40.900
<v Matt Godbolt>That, incidentally, is what we used the nanosecond timestamp for, because obviously the hardware on the outside would put this nanosecond timestamp on every packet. We're like, well, that's a unique identifier, except when it isn't.

225
00:31:40.900 --> 00:31:41.340
<v Ben Rady>Yeah, yeah, yeah, except it's not.

226
00:31:41.340 --> 00:31:53.840
<v Matt Godbolt>But most of the time it is. And then it gives you this sort of unique ID, this sort of trace ID, which carries information in its own right, because it's the time that it arrived as well.

227
00:31:53.840 --> 00:32:12.300
<v Matt Godbolt>Yeah, unfortunately, not always unique. I've variously seen this as, you know, "provenance" or "tracing" or "causality". And I know that the OpenTelemetry project keeps being pointed out to me, and I'm going to start looking at that soon.

228
00:32:12.300 --> 00:32:22.600
<v Matt Godbolt>I keep meaning to. They seem to have a whole bunch of stuff around the telemetry of systems more generally, but I wonder if they have something that can also be used to correlate.

229
00:32:22.600 --> 00:32:40.280
<v Matt Godbolt>That's another one, "correlation IDs" and things like that. One event, and the causality as it traces through your system, and you see all the different events. I mean, even on just a website, just seeing that someone clicked a button and caused an error, and you're like, well, the backend error was caused by this click over here, is useful.
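
NOTE
A minimal sketch of the tracing idea: an identifier created at ingestion, plus per-stage information added as the message flows through.
The envelope layout, the function names and the stage names are assumptions for illustration, not any particular tracing library's API.
```python
import itertools
import time
_next_id = itertools.count(1)
def ingest(payload):
    """Wrap a payload at the edge of the system with a trace ID and an empty hop list."""
    return {"trace_id": next(_next_id), "payload": payload, "hops": []}
def record_hop(envelope, stage):
    """Append the stage name and a monotonic reading as the message passes through."""
    envelope["hops"].append((stage, time.monotonic_ns()))
    return envelope
msg = ingest({"symbol": "XYZ", "price": 101})
for stage in ("parse", "enrich", "publish"):
    record_hop(msg, stage)
stages_seen = [stage for stage, _ in msg["hops"]]
slowest_gap = max(b[1] - a[1] for a, b in zip(msg["hops"], msg["hops"][1:]))
```
The hop times here come from one machine's monotonic clock; once stages run on different machines, comparing them is a clock-domain crossing again.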

230
00:32:40.280 --> 00:32:43.320
<v Matt Godbolt>Anyway, sorry, again, off, really off base here, but yeah.

231
00:32:43.320 --> 00:32:54.300
<v Ben Rady>No, I mean, I think these are all dimensions of this problem that you need to be thinking about if you're going to build systems like this, right?

232
00:32:54.300 --> 00:33:15.460
<v Matt Godbolt>We've talked about various dimensions of messages so far. We talked about durability, we talked about sequencing, we've now talked about tracing, and sort of determinism. And, you know, we opened with: don't put giant blocks of data into your messages.

233
00:33:15.460 --> 00:33:15.460
<v Ben Rady>Yeah, yeah.

234
00:33:15.460 --> 00:33:22.680
<v Matt Godbolt>And we said, be very careful about which clocks you use. What other considerations are there?

235
00:33:22.680 --> 00:33:27.590
<v Matt Godbolt>I mean, so how about your monitoring system? Let's just think a little bit about the monitoring system.

236
00:33:27.590 --> 00:33:27.600
<v Ben Rady>Yeah, yeah, yeah.

237
00:33:27.600 --> 00:33:40.310
<v Matt Godbolt>So that had a very, very high volume of inputs. Like, essentially, it was a centralized monitoring system for the whole company's services. All the services could send all the stats they wanted to it.

238
00:33:40.310 --> 00:33:40.680
<v Ben Rady>yeah

239
00:33:40.680 --> 00:33:42.500
<v Matt Godbolt>And you had to deal with it.

240
00:33:42.500 --> 00:33:50.470
<v Ben Rady>Yeah, I'll tell you one mistake that we made, and this is, you know, good judgment comes from experience and experience comes from bad judgment.

241
00:33:50.470 --> 00:33:50.540
<v Matt Godbolt>[laughs]

242
00:33:50.540 --> 00:33:59.040
<v Ben Rady>And so listeners, I hope that you get to benefit from all of the bad judgment of the people on this podcast and the hard-won experience.

243
00:33:59.040 --> 00:34:11.280
<v Ben Rady>And so when I say like you need to be careful about clock domains and you need to think about like where your source of time is, one of the great mistakes that we made very early on in that project, and it's something that just haunted us forever,

244
00:34:11.280 --> 00:34:30.620
<v Ben Rady>is we allowed people who were sending messages to the system... So the idea behind the system is that you'd have, you know, external clients that could send telemetry data or, I mean, basically anything: prices, internal application metrics, whatever they wanted, they could send to the system.

245
00:34:30.620 --> 00:34:33.560
<v Ben Rady>It worked a little bit like StatsD, if you've ever used StatsD, but it had sort of more, yeah, yeah.

246
00:34:33.560 --> 00:34:40.800
<v Matt Godbolt>Yeah, sort of Prometheus-y type things, but it was designed for more real-time stuff rather than once-a-minute, once-a-second kind of stuff. It was very much like

247
00:34:40.800 --> 00:34:51.800
<v Ben Rady>Yes, yes, yes. The idea behind the system was like, you know, it's cool that Grafana has a chart that updates once a minute, but we need something that can update many times per second because it's monitoring trading systems.

248
00:34:51.800 --> 00:35:03.260
<v Ben Rady>And if something happens, we need to know about it right now. So like human time. But one of the great mistakes that we made with the system was allowing people to put their own timestamps on those messages.

249
00:35:03.260 --> 00:35:07.000
<v Ben Rady>That was a terrible idea. An absolutely terrible idea.

250
00:35:07.000 --> 00:35:09.720
<v Matt Godbolt>It's so easy to do. I can see why you'd want to be able to do this.

251
00:35:09.720 --> 00:35:09.720
<v Ben Rady>Yes.

252
00:35:09.720 --> 00:35:18.340
<v Matt Godbolt>You know, I find this quite often with things like our Prometheus setup, because, you know, like, hey, I've got a build.

253
00:35:18.340 --> 00:35:18.340
<v Ben Rady>Mm hmm.

254
00:35:18.340 --> 00:35:35.200
<v Matt Godbolt>I want to measure my build time and I want to post it. And then sometimes I want to go, actually, I want to go back in time and run the last hundred builds, one day apart from each other. And I want to populate some data in the database so that I don't just have "now data". I have historic data once I've thought I want it.

255
00:35:35.200 --> 00:35:35.200
<v Ben Rady>Yeah, yeah, yeah.

256
00:35:35.200 --> 00:35:41.620
<v Matt Godbolt>Right. And so how bad would it be to let me post stuff that's in the past to you so that I can write my data?

257
00:35:41.620 --> 00:35:41.620
<v Ben Rady>Right.

258
00:35:41.620 --> 00:35:48.000
<v Matt Godbolt>Like, you know, it's a reasonable thing to want to do. So what was the drawback? What made you rue that decision?

259
00:35:48.000 --> 00:36:01.160
<v Ben Rady>Well, because inevitably people want to be able to say like, Oh, and also give me the list of all the messages that were delivered on this day. And now that's just wrong because your timestamp and my timestamp don't line up for whatever reason, right?

260
00:36:01.160 --> 00:36:01.160
<v Matt Godbolt>Right.

261
00:36:01.160 --> 00:36:06.710
<v Ben Rady>It could be that you pre- or post-dated your thing, but you did the calculation wrong.

262
00:36:06.710 --> 00:36:18.180
<v Ben Rady>It could be that what you actually want, when you ask for the data that was delivered on that day, is the data that was actually delivered on that day, and not whatever timestamp it had, because that came out of your log file or whatever.

263
00:36:18.180 --> 00:36:21.920
<v Matt Godbolt>Well, this comes back to almost like the bitemporality thing.

264
00:36:21.920 --> 00:36:21.920
<v Ben Rady>Bitemporality, yes.

265
00:36:21.920 --> 00:36:27.680
<v Matt Godbolt>It's like, you know, there's the time that I got it. And that's the kind of knowledge time. When did I know that you said that you wanted this thing?

266
00:36:27.680 --> 00:36:35.170
<v Matt Godbolt>That's one timestamp. And then the other timestamp is what time did you say that you wanted this thing to be known as of or related to, sorry.

267
00:36:35.170 --> 00:36:35.710
<v Ben Rady>yeah Yes.

268
00:36:35.710 --> 00:36:36.800
<v Ben Rady>Yes.

269
00:36:36.800 --> 00:36:44.360
<v Matt Godbolt>And in almost all situations, those two times are coincident, or so close that nobody cares, but not always.

270
00:36:44.360 --> 00:36:44.360
<v Ben Rady>Mm-hmm, mm-hmm.

271
00:36:44.360 --> 00:36:50.690
<v Matt Godbolt>Right. And I think that's one of the harder things. I don't know if we've ever talked about bitemporality. Maybe we have. I don't know.

272
00:36:50.690 --> 00:36:50.720
<v Ben Rady>I don't know.

273
00:36:50.720 --> 00:36:53.960
<v Matt Godbolt>We must have done in passing. But that's a whole interesting world as well.

274
00:36:53.960 --> 00:36:53.960
<v Ben Rady>Yeah.

275
00:36:53.960 --> 00:37:01.830
<v Matt Godbolt>You know, like, you want to say, on this day, what messages did you send me?

276
00:37:01.830 --> 00:37:02.280
<v Ben Rady>Mm hmm.

277
00:37:02.280 --> 00:37:10.240
<v Matt Godbolt>And then you want to say, on this day, what samples fall in this window? Which is different from when did you tell me about those samples?

278
00:37:10.240 --> 00:37:10.700
<v Ben Rady>Right.

279
00:37:10.700 --> 00:37:12.700
<v Matt Godbolt>Right. That's a very... I mean, again, they're mostly the same.

280
00:37:12.700 --> 00:37:12.740
<v Ben Rady>Right, right, right.

281
00:37:12.740 --> 00:37:13.240
<v Matt Godbolt>But yeah, that's OK.

282
00:37:13.240 --> 00:37:27.740
<v Ben Rady>Yeah, yeah, if I had it to do over again, what I would have said is no, you cannot specify the timestamp, but you can, and this was true already, you can put whatever data you want in your message and you can query based on any of that data.

283
00:37:27.740 --> 00:37:27.740
<v Matt Godbolt>Mm hmm.

284
00:37:27.740 --> 00:37:36.840
<v Ben Rady>So if you want to have your own log timestamp or ingestion timestamp or whatever, you can add that as a field to your message.

285
00:37:36.840 --> 00:37:36.840
<v Matt Godbolt>Yeah.

286
00:37:36.840 --> 00:37:45.620
<v Ben Rady>My system will be blissfully ignorant of it other than it's another field that you can do stuff with and you can do whatever you want with that timestamp.

287
00:37:45.620 --> 00:37:55.860
<v Matt Godbolt>Yeah, that is your piece of data to do with as you wish, but we know when it arrived with us, and that's all we're going to keep as the sort of primary thing that we can...
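
NOTE
A toy sketch of the "if I had it to do over again" design: the store stamps arrival time itself, and any timestamp the sender supplies is just another queryable field.
Store, received_ns and sample_ns are hypothetical names; the injected fake clock exists only to make the example deterministic.
```python
import itertools
class Store:
    """Toy message store that stamps arrival time itself."""
    def __init__(self, clock):
        self.clock = clock      # callable returning the store's own time in ns
        self.rows = []
    def post(self, fields):
        # The sender cannot set received_ns; anything they send is just data.
        self.rows.append({"received_ns": self.clock(), "fields": dict(fields)})
    def received_between(self, start_ns, end_ns):
        return [r for r in self.rows if start_ns <= r["received_ns"] < end_ns]
    def where(self, predicate):
        return [r for r in self.rows if predicate(r["fields"])]
fake_clock = itertools.count(1_000, 10)
store = Store(clock=lambda: next(fake_clock))
store.post({"build_seconds": 42, "sample_ns": 5})       # backfilled, claims an old sample time
store.post({"build_seconds": 40, "sample_ns": 1_005})
delivered = store.received_between(1_000, 1_010)
old_samples = store.where(lambda f: f["sample_ns"] < 100)
```
The two queries answer the two different questions from the discussion: "what did you send me in this window?" and "which samples claim to fall in this window?".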

288
00:37:55.860 --> 00:37:55.860
<v Ben Rady>Yeah.

289
00:37:55.860 --> 00:37:56.520
<v Matt Godbolt>Yeah. yeah

290
00:37:56.520 --> 00:38:03.360
<v Ben Rady>Yes. Also, speaking of timestamps: please, please, please do not put localized timestamps in your messages.

291
00:38:03.360 --> 00:38:03.400
<v Matt Godbolt>Oh.

292
00:38:03.400 --> 00:38:18.080
<v Ben Rady>It's a long. It can be nanosecond precision, it can be millisecond precision, it can be second precision. I don't even care, but it's a number. Please just put a number in there. Don't put some parsed string with a time zone offset.

293
00:38:18.080 --> 00:38:18.080
<v Matt Godbolt>Yeah.

294
00:38:18.080 --> 00:38:19.480
<v Ben Rady>No.

295
00:38:19.480 --> 00:38:26.160
<v Matt Godbolt>No, and store it in UTC for this kind of thing or some well-defined never-changing thing.

296
00:38:26.160 --> 00:38:26.160
<v Ben Rady>Yes.

297
00:38:26.160 --> 00:38:43.460
<v Matt Godbolt>I don't know to what extent it's an open secret or not, but a very large web search company, to this day, to the best of my understanding, still logs everything in West Coast time, which means that

298
00:38:43.460 --> 00:38:51.340
<v Matt Godbolt>its logs and the graphs that go with them have, twice a year, either a big gap or a weird doubling back on themselves type of thing.

299
00:38:51.340 --> 00:38:51.760
<v Ben Rady>Mm hmm.

300
00:38:51.760 --> 00:38:55.810
<v Matt Godbolt>And it's just that the cost of changing it is so high that it hasn't been done.

301
00:38:55.810 --> 00:38:56.280
<v Ben Rady>Mm hmm.

302
00:38:56.280 --> 00:39:24.310
<v Matt Godbolt>But yeah, there's a time and a place for a localized time, and it is in application-level things. Like, if you're trying to talk about what time a trade happened on a particular exchange, it is useful to specify it in the local time of that exchange, say, because you know that the exchange opens at 8:30 local time on that day and closes at 3:30 local time on that day.

303
00:39:24.310 --> 00:39:24.560
<v Ben Rady>Yes.

304
00:39:24.560 --> 00:39:37.040
<v Matt Godbolt>But if you have to sit and try and work it out, or do anything other than compare with straightforward arithmetic operations on a 64-bit number, then you're doing something wrong.

305
00:39:37.040 --> 00:39:43.600
<v Matt Godbolt>If you have to kind of work out what day that was, and then, was it daylight saving or not on that... wait a second, that was in Europe, wasn't it, and they don't do daylight saving...

306
00:39:43.600 --> 00:39:55.040
<v Ben Rady>Absolutely, absolutely. Like religion and politics, time localization should only be discussed in the home. Like, the international standard is a 64-bit number.

307
00:39:55.040 --> 00:40:07.800
<v Ben Rady>And only when you're displaying it, or viewing it, or making a report, do you ever take that 64-bit number and turn it into some localized time that is localized for the person who is viewing it, right?
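
NOTE
A short sketch of "store a number, localize only when displaying". The function name format_for_viewer and the sample value are illustrative assumptions.
The fixed UTC-6 offset is a stand-in for a viewer's zone; a real display layer would use a proper time zone database so that daylight saving is handled.
```python
from datetime import datetime, timedelta, timezone
def format_for_viewer(epoch_ns, viewer_tz):
    """Turn a stored UTC epoch-nanosecond integer into text for one viewer."""
    seconds, nanos = divmod(epoch_ns, 1_000_000_000)
    moment = datetime.fromtimestamp(seconds, tz=viewer_tz)
    return f"{moment:%Y-%m-%d %H:%M:%S}.{nanos:09d} {moment.tzname()}"
stored = 1_700_000_000_123_456_789          # what goes in the message: just a number
utc_text = format_for_viewer(stored, timezone.utc)
central = timezone(timedelta(hours=-6), "CST")   # fixed offset, for illustration only
central_text = format_for_viewer(stored, central)
```
Comparing or subtracting two stored values stays plain integer arithmetic; only the final formatting step knows about time zones.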

308
00:40:07.800 --> 00:40:08.020
<v Matt Godbolt>Yes, or the whatever it is.

309
00:40:08.020 --> 00:40:09.550
<v Ben Rady>Or the system perhaps that is viewing it.

310
00:40:09.550 --> 00:40:10.080
<v Ben Rady>But yes, yes, right.

311
00:40:10.080 --> 00:40:20.200
<v Matt Godbolt>Yes, no, that makes sense. Yeah, I think that is that. And then, yeah, nanoseconds since 1970 is not a bad thing to fit into 64 bits.

312
00:40:20.200 --> 00:40:32.060
<v Matt Godbolt>That'll get you to, I can't remember when, but, you know, it's far enough in the future that, at least right now, I don't have to worry about it before I retire, although, you know, I'm an old man. So maybe the younger folk will have to worry about it.
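
NOTE
A quick calculation of how far nanoseconds since 1970 reach, assuming a signed 64-bit integer (the episode does not say whether signed or unsigned was used).
```python
from datetime import datetime, timedelta, timezone
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
max_signed_ns = 2**63 - 1
# datetime only resolves microseconds, so drop the last three digits.
last_signed = EPOCH + timedelta(microseconds=max_signed_ns // 1_000)
years_signed = max_signed_ns / (365.25 * 24 * 3600 * 1e9)
```
Under that assumption the range is a little over 292 years, so the last representable instant falls in the year 2262.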

313
00:40:32.060 --> 00:40:32.180
<v Ben Rady>Mmhm

314
00:40:32.180 --> 00:40:45.630
<v Matt Godbolt>But there are any number of ways of storing time better than that. Or, you know, you can pick your own epoch, right? It doesn't have to be 1970. 1970 is convenient, because then you could use

315
00:40:45.630 --> 00:40:45.800
<v Ben Rady>Yeah.

316
00:40:45.800 --> 00:40:46.560
<v Ben Rady>Right, right.

317
00:40:46.560 --> 00:41:26.380
<v Matt Godbolt>the Unix date command to kind of move back and forth. In fact, one of the first things I do... yeah, I've checked in all my dot files. Sorry, this is another sidetrack, but one of the fish functions of the shell that I use converts numbers from an epoch time to a displayable time and backwards, right? So I could do epoch and then just type a number in, and then, based on however many digits it's got, it guesses whether it's millis, micros or nanos, and then it prints it out in my current time zone. And it is the single most useful thing. I know people go to epoch-converter.com, which drives me bonkers to see. You know, why would you go to a website with all these flashing ads and things on it just to convert some numbers, when it's something the command line can do? But on the other hand, it's a pain to do.
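
NOTE
A rough Python equivalent of the epoch helper described above; the original is a fish shell function that is not shown in the episode.
The digit-count thresholds and the name guess_epoch are assumptions, and this version prints UTC rather than the current time zone so that the result is reproducible.
```python
from datetime import datetime, timezone
def guess_epoch(value):
    """Guess the unit of an epoch integer from its digit count and convert to UTC."""
    digits = len(str(abs(value)))
    if digits <= 11:
        unit, per_second = "seconds", 1
    elif digits <= 14:
        unit, per_second = "millis", 1_000
    elif digits <= 17:
        unit, per_second = "micros", 1_000_000
    else:
        unit, per_second = "nanos", 1_000_000_000
    moment = datetime.fromtimestamp(value // per_second, tz=timezone.utc)
    return unit, moment.isoformat()
examples = [guess_epoch(v) for v in (1_700_000_000, 1_700_000_000_000, 1_700_000_000_000_000_000)]
```
The guess only works for timestamps reasonably close to the present; a millisecond value from early 1970 would be misread as seconds.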

318
00:41:26.380 --> 00:41:32.920
<v Ben Rady>Yeah. Or you can just open up a JavaScript console in your favorite browser and paste the timestamp into `new Date()`.

319
00:41:32.920 --> 00:41:33.340
<v Matt Godbolt>Yeah, that's true.

320
00:41:33.340 --> 00:41:35.300
<v Ben Rady>And that'll, that'll also give it to you. um

321
00:41:35.300 --> 00:41:36.040
<v Matt Godbolt>That's a great one.

322
00:41:36.040 --> 00:41:36.560
<v Ben Rady>yeah

323
00:41:36.560 --> 00:41:38.940
<v Matt Godbolt>I'm remembering that one.

324
00:41:38.940 --> 00:41:40.960
<v Ben Rady>Yeah, it's super, super convenient most of the time.

325
00:41:40.960 --> 00:41:41.400
<v Matt Godbolt>That one's even more portable than mine, yeah.

326
00:41:41.400 --> 00:41:50.840
<v Ben Rady>Another thing to think about here, and this is kind of getting back to, you know, I was saying, don't put a database in the middle of your messaging system, right? Generally. Sometimes it's fine.

327
00:41:50.840 --> 00:41:50.840
<v Matt Godbolt>Right.

328
00:41:50.840 --> 00:42:05.900
<v Ben Rady>And, you know, as you said before, sometimes it's just a file. But like, okay, if I can't do that, then how am I supposed to bridge the gap? Because there will almost certainly be a gap between the world of stream processing systems and batch processing systems.

329
00:42:05.900 --> 00:42:13.140
<v Ben Rady>Right? Like, at some point, someone's going to want to run a database query or something on your data.

330
00:42:13.140 --> 00:42:13.280
<v Matt Godbolt>Right.

331
00:42:13.280 --> 00:42:37.680
<v Ben Rady>And how do you handle that, right? And also, this kind of ties into a durability thing, where it's like, if you don't have a system like Kafka or some other sort of durable queue in the middle of your system to keep track of the history, you know, you just have UDP packets or you have something else, what should be responsible for keeping the historical record of everything that has ever happened, right?

332
00:42:37.680 --> 00:42:38.500
<v Matt Godbolt>Right. Right.

333
00:42:38.500 --> 00:42:38.680
<v Ben Rady>So I...

334
00:42:38.680 --> 00:42:41.160
<v Matt Godbolt>Which obviously some people don't need and that's fine.

335
00:42:41.160 --> 00:42:47.380
<v Matt Godbolt>If you're a video game server and you've got the player positions that are being updated, then maybe you don't need a log for all time.

336
00:42:47.380 --> 00:42:47.420
<v Ben Rady>Right.

337
00:42:47.420 --> 00:42:47.460
<v Ben Rady>Yes.

338
00:42:47.460 --> 00:42:55.980
<v Matt Godbolt>But you know, if you're working in finance, it's generally a good idea to keep everything forever for all time in case somebody comes and asks you a very awkward question about what happened.

339
00:42:55.980 --> 00:42:55.980
<v Ben Rady>Yes, yes.

340
00:42:55.980 --> 00:43:25.940
<v Ben Rady>Yeah, yeah. And this ties in also to another thing that we were talking about, reproducing state for state machines. So, you know, the cool idea is like, all right, I'm going to take my messages. I'm going to pass them into some system that processes them. There will be no other information that goes into the state machine other than the messages themselves. And therefore, I can completely reproduce the state from the sequence of messages. It's like, yes, that's cool. But what happens when you have seven years' worth of messages and you have to start at the beginning?

341
00:43:25.940 --> 00:43:26.460
<v Matt Godbolt>Right, right, right.

342
00:43:26.460 --> 00:43:40.080
<v Ben Rady>That seems bad. So one of the things that you typically do is you have something that is consuming the stream of messages whose purpose is to store them and also potentially snapshot them.

343
00:43:40.080 --> 00:44:08.780
<v Ben Rady>Right. So you have something that is consuming the messages, it's writing them into some persistent store, maybe it's even transforming them into something that can fit into, like, a database table or some other format that is nice for bulk processing. And another thing that it might be doing is running this sort of state machine and taking a snapshot at some regular interval and then putting that into the storage as well. So that when you need to reproduce the state for some particular point in time,

344
00:44:08.780 --> 00:44:23.440
<v Ben Rady>rather than having to play all seven years' worth of messages through your system, you can jump to, you know, a prior but recent snapshot and then load that state into your system and then only replay the messages forward from there.

345
00:44:23.440 --> 00:44:26.580
<v Ben Rady>And that will be much faster and much more efficient.
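
NOTE
A minimal sketch of the snapshot-then-replay idea described above, not anything from the systems discussed.
It assumes a toy state machine (a running total) and hypothetical names (Snapshot, apply, take_snapshots, state_at).
The point is only that state at message N can be rebuilt from the latest snapshot at or before N plus the tail of the journal.
```python
# Hypothetical names throughout; the "state machine" is just a running total.
from dataclasses import dataclass
@dataclass(frozen=True)
class Snapshot:
    seq: int    # sequence number of the last message folded into `state`
    state: int  # the state itself
def apply(state: int, message: int) -> int:
    """Pure transition: the message is the only input besides the prior state."""
    return state + message
def take_snapshots(journal: list[int], every: int) -> list[Snapshot]:
    """Run the state machine over the journal, snapshotting every `every` messages."""
    snapshots, state = [], 0
    for seq, message in enumerate(journal, start=1):
        state = apply(state, message)
        if seq % every == 0:
            snapshots.append(Snapshot(seq, state))
    return snapshots
def state_at(journal: list[int], snapshots: list[Snapshot], seq: int) -> int:
    """Rebuild the state as of message `seq` from the latest usable snapshot."""
    usable = (s for s in snapshots if s.seq <= seq)
    base = max(usable, key=lambda s: s.seq, default=Snapshot(0, 0))
    state = base.state
    for message in journal[base.seq:seq]:  # replay only the tail after the snapshot
        state = apply(state, message)
    return state
journal = [5, -2, 7, 1, -4, 10, 3]
snapshots = take_snapshots(journal, every=3)
```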

346
00:44:26.580 --> 00:44:27.120
<v Matt Godbolt>Right.

347
00:44:27.120 --> 00:44:33.680
<v Matt Godbolt>Right, right, right. Provided there exists a sensible snapshot format, which is an interesting...

348
00:44:33.680 --> 00:44:33.680
<v Ben Rady>Right. Right.

349
00:44:33.680 --> 00:44:46.640
<v Matt Godbolt>So I think this has now sort of moved into what I think of as like a log-structured journal of, you know, like, you have some database or in-memory representation of the world that you update through seeing these events.

350
00:44:46.640 --> 00:45:21.800
<v Matt Godbolt>For some things, so for example, to build the set of live orders on an exchange, that is the prices of, like, Google and all the people that are trying to buy and sell Google, you can unambiguously snapshot that state and go, okay, at this point in time, at nine in the morning, these are everyone's orders. And now if you just load up this nine AM one, you can carry on. You don't have to have loaded up, you know, the seven AM ones, or the whole day. That's fine, right? But as soon as you start getting to things that have state that is, like, non-trivial,

351
00:45:21.800 --> 00:45:33.400
<v Matt Godbolt>now it becomes a function of the processor of that state. So let me give you an example. What if you were keeping some kind of exponential moving average of some of the factors of that?

352
00:45:33.400 --> 00:45:37.470
<v Matt Godbolt>That depends on how long the window of your exponential...

353
00:45:37.470 --> 00:45:38.340
<v Ben Rady>Uh-huh.

354
00:45:38.340 --> 00:46:17.640
<v Matt Godbolt>is, and some other properties of that. What do you count? Which kinds of information go into that or don't? And now you've got a complicated piece of state that is arguably different for every client. You know, maybe some people care about a 10-day look back, and then other people want a five-minute look back. And so that gets kind of tricky. I don't know where I'm going with this now, but it's not as straightforward for application domains if they have any kind of state that requires some history in order to get to that point, other than the pure, individual add/remove of, say, a book. Unambiguous stuff, yeah.
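
NOTE
A small sketch of why derived state such as an exponential moving average is harder to snapshot than an order book.
The Ema class and the alpha values are assumptions for illustration, not anything from the systems discussed.
A snapshot has to carry the parameter along with the value, and it is only useful to a consumer that uses the same parameter.
```python
from dataclasses import dataclass
from typing import Optional
@dataclass
class Ema:
    alpha: float                   # smoothing factor; a hypothetical per-client setting
    value: Optional[float] = None  # current average, None until the first sample
    def update(self, x: float) -> float:
        """Fold one sample into the average and return the new value."""
        if self.value is None:
            self.value = x
        else:
            self.value = self.alpha * x + (1 - self.alpha) * self.value
        return self.value
prices = [100.0, 102.0, 101.0, 105.0]
fast, slow = Ema(alpha=0.5), Ema(alpha=0.1)
for p in prices[:3]:
    fast.update(p)
    slow.update(p)
# The snapshot is the pair (alpha, value); it cannot seed an EMA with a different alpha.
snapshot = (fast.alpha, fast.value)
resumed = Ema(*snapshot)
resumed.update(prices[3])  # resume from the snapshot and apply only the newest price
fast.update(prices[3])     # the uninterrupted consumer sees the same price
```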

355
00:46:17.640 --> 00:46:32.240
<v Ben Rady>Yeah, and that state can get quite large because of these constraints. And I think this is something that is really important to think about because this kind of snapshotting becomes very important when you think about error recovery, right?

356
00:46:32.240 --> 00:46:42.610
<v Ben Rady>And there's two dimensions of error recovery that I think we can talk about here. One is you've got some consumer of the stream and it's crashed. And now you want to restart it, right?

357
00:46:42.610 --> 00:46:44.060
<v Matt Godbolt>Right? Yep.

358
00:46:44.060 --> 00:46:56.540
<v Ben Rady>What state do you need to let it sort of rejoin the stream, right? Again, do you have to go back to the beginning of time and process seven years' worth of messages for your system to restart? That's gonna be bad, right?

359
00:46:56.540 --> 00:46:59.580
<v Matt Godbolt>Oh, we'll fix it next year when, yeah, it only gets worse, yeah.

360
00:46:59.580 --> 00:47:21.000
<v Ben Rady>So if you, yeah, right. Yes, we've rebooted it and the website will be back online in 2038. So you have to think about the state if you want to be able to recover, and you need to think about how you can reasonably snapshot that state if you want to be able to spin something back up and have it sort of rejoin this stream, right?

361
00:47:21.000 --> 00:47:33.100
<v Ben Rady>And so I think you have to consider that from the very beginning. Like, how big is the state? How often can we snapshot it? What is our sort of acceptable amount of downtime here for these various things?

362
00:47:33.100 --> 00:47:44.780
<v Ben Rady>You know, is it like an hour? Is it a minute? Is it, you know, a month? And how are we going to be able to rejoin this processing? Otherwise, we can never turn this software off, right?

363
00:47:44.780 --> 00:47:49.740
<v Matt Godbolt>Right, which is an option. um Just don't write any bugs.

364
00:47:49.740 --> 00:47:49.740
<v Ben Rady>Yeah, right.

365
00:47:49.740 --> 00:47:53.180
<v Matt Godbolt>Don't have any hardware faults... we'll be golden!

366
00:47:53.180 --> 00:48:01.200
<v Ben Rady>Ha, yes, yeah, yeah. Another dimension to think about with fault tolerance in these systems is poisoned messages.

367
00:48:01.200 --> 00:48:01.340
<v Matt Godbolt>Yeah.

368
00:48:01.340 --> 00:48:11.820
<v Ben Rady>Right, so that is a very common situation where there's a bug in your system or a bug in a producer system, perhaps, and you receive a message

369
00:48:11.820 --> 00:48:25.340
<v Ben Rady>that you can't process. Right? And redundancy here will not save you. Right? You can have 10 redundant systems that are all consuming the stream and processing the messages so that if one, like, runs out of memory or whatever, you know, the other nine are there.

370
00:48:25.340 --> 00:48:34.400
<v Ben Rady>But if they all have the same bug and they all get the same message, the whole point of the distributed state machine is that they are all going to do the same thing, which is not process your message.

371
00:48:34.400 --> 00:48:35.080
<v Matt Godbolt>Or all crash.

372
00:48:35.080 --> 00:48:47.380
<v Ben Rady>That means they might crash. you know All kinds of manner of problems can happen here. So one common approach to dealing with these things is creating what's called a dead letter queue.

373
00:48:47.380 --> 00:49:00.060
<v Ben Rady>So you you have a message that comes in and your system cannot process it, but it's able to detect that it can't process it. Maybe it raises an error, maybe there's some validation step, whatever it is, and it's like, I can't process this message.

374
00:49:00.060 --> 00:49:00.060
<v Matt Godbolt>They all crash.

375
00:49:00.060 --> 00:49:09.060
<v Ben Rady>So what I'm going to do is I'm going to take it and I'm going to put it into another queue, another stream of messages called the dead letter queue.

376
00:49:09.060 --> 00:49:09.060
<v Matt Godbolt>Stream of messages, yeah, yeah.

377
00:49:09.060 --> 00:49:20.780
<v Ben Rady>And it's going to sit there until somebody does something with it. Now, the first thing that you want to do with it is send some kind of notification or alert or something to tell everybody, yes, like you know somebody's getting paged.

378
00:49:20.780 --> 00:49:20.980
<v Matt Godbolt>Someone's phone should go off.

379
00:49:20.980 --> 00:49:49.140
<v Ben Rady>It's like, ah, we just got a message we don't know how to process, right? But if you do that, then depending on the state machine that you're trying to reproduce, if you have one, or just the message processing that you're doing, it can sometimes be OK to say, OK, I'm going to take this message. I'm going to put it in the dead letter queue. And then I'm just going to keep going. Right? I'm going to pretend like I never even got this message because it's malformed or there's some other problematic thing with it, and I'm just going to keep going.

380
00:49:49.140 --> 00:49:56.820
<v Ben Rady>You can obviously run into situations where there's just a bug in your code and this is a message that you need to process and you didn't process it correctly and now your state is wrong.

381
00:49:56.820 --> 00:49:58.920
<v Matt Godbolt>And now you're doomed. Yeah, yeah.

382
00:49:58.920 --> 00:50:14.920
<v Ben Rady>But there are also situations in which you have one of these messages and it is truly something that is malformed and can be ignored, was never supposed to be created in the first place, and now you can just continue on having this in the dead letter queue.

383
00:50:14.920 --> 00:50:26.640
<v Ben Rady>A common pattern that I have used with great effect is being able to basically re-drive those dead letter queue messages back into the main queue if sequencing doesn't matter.

384
00:50:26.640 --> 00:50:26.860
<v Matt Godbolt>Oh, interesting.

385
00:50:26.860 --> 00:50:29.240
<v Ben Rady>But if sequencing matters, then you can't do this.

386
00:50:29.240 --> 00:50:41.700
<v Ben Rady>Right? But if you have a system where there's no sequencer, or there is a sequencer but it doesn't really matter all that much, then you can take these messages and be like, all right, we got this message we don't know how to handle.

387
00:50:41.700 --> 00:50:55.720
<v Ben Rady>It went into the dead letter queue. We're now going to change the code so that it can handle this message in some way, redeploy that, and then re-drive the message back into the queue so that it can be correctly processed and flow all the way through.

388
00:50:55.720 --> 00:50:55.920
<v Matt Godbolt>Right.

389
00:50:55.920 --> 00:50:59.820
<v Ben Rady>Right, and that is a really nice way to handle it if you can.
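
NOTE
A rough sketch of the dead letter queue and re-drive pattern described above, using in-memory deques as stand-ins for real queues.
All names here (process_v1, process_v2, drain, redrive) and the price limit are hypothetical, chosen only for illustration.
As noted in the conversation, re-driving like this is only reasonable when message ordering does not matter.
```python
from collections import deque
def process_v1(message: dict) -> int:
    """Hypothetical handler with a bug: it rejects any price above 9999."""
    if message["price"] > 9999:
        raise ValueError("price out of range")
    return message["price"]
def process_v2(message: dict) -> int:
    """Hypothetical hotfixed handler: accepts any non-negative price."""
    if message["price"] < 0:
        raise ValueError("negative price")
    return message["price"]
def drain(queue: deque, dead_letters: deque, handler, alert=print) -> list:
    """Process every queued message; park failures on the dead letter queue and keep going."""
    results = []
    while queue:
        message = queue.popleft()
        try:
            results.append(handler(message))
        except Exception as error:
            dead_letters.append(message)
            alert(f"dead-lettered {message!r}: {error}")  # stand-in for paging someone
    return results
def redrive(dead_letters: deque, queue: deque) -> None:
    """Move parked messages back onto the main queue, oldest first."""
    while dead_letters:
        queue.append(dead_letters.popleft())
main = deque([{"price": 120}, {"price": 100_000}, {"price": 80}])
dlq = deque()
booked = drain(main, dlq, process_v1)   # the large booking is parked, the others go through
parked = list(dlq)                      # what a human would inspect after the alert
redrive(dlq, main)                      # done only after the fixed handler is deployed
booked += drain(main, dlq, process_v2)  # the parked booking now succeeds, just late
```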

390
00:50:59.820 --> 00:51:14.520
<v Matt Godbolt>If you're able to do that, then yeah, that's a really, and that's, that's so particularly, for example, if this was some, um you know, holiday booking stream of information, you're like a centralized holiday booking thing, and then if someone comes in and they've just booked some suite and some

391
00:51:14.520 --> 00:51:14.520
<v Ben Rady>Yes.

392
00:51:14.520 --> 00:51:26.080
<v Matt Godbolt>the price is higher than you've ever hit before and some internal issue happens and you're like, oh, damn, you know, we can't book this for them because it's legitimately a $100,000-a-night thing.

393
00:51:26.080 --> 00:51:27.740
<v Matt Godbolt>And that just overflows something and we're done.

394
00:51:27.740 --> 00:51:27.740
<v Ben Rady>Uh-huh. Yeah.

395
00:51:27.740 --> 00:51:41.960
<v Matt Godbolt>But you're like, this is really valuable business. Ben, could you hotfix that very, very quickly? Write a test, fix the test, deploy the thing, and then we're gonna put it back in again, and then the booking goes through, albeit 30 minutes, an hour late.

396
00:51:41.960 --> 00:51:52.800
<v Matt Godbolt>At least it gets done, and you capture the revenue, and everyone's happy, and, you know... Yeah, that seems like a really nice way to heal the system in that instance.

397
00:51:52.800 --> 00:51:52.870
<v Ben Rady>Yeah, yeah.

398
00:51:52.870 --> 00:51:52.900
<v Ben Rady>Yeah.

399
00:51:52.900 --> 00:51:58.830
<v Matt Godbolt>But obviously, sometimes it can be a legitimate bug or a malformed message or something like that.

400
00:51:58.830 --> 00:51:59.520
<v Ben Rady>Yeah.

401
00:51:59.520 --> 00:52:03.480
<v Matt Godbolt>Yeah, and you have to be able to deal with it. Yeah, because as you say, fault tolerance was a dimension that you talked about.

402
00:52:03.480 --> 00:52:03.480
<v Ben Rady>Right.

403
00:52:03.480 --> 00:52:21.920
<v Matt Godbolt>So another dimension for message processing systems is that, like, things go wrong, computers go wrong, and it's entirely reasonable to have more than one system listening to this stream of messages and independently processing them and updating them. And then if the machine breaks,

404
00:52:21.920 --> 00:53:07.820
<v Matt Godbolt>well, you've got two more of them, and that's OK. And then you have to have a system behind that system that determines what the actual outcome of any particular update was. But you've got fault tolerance by scaling through a messaging system. And that's a really interesting solution. And it was part of the solution that we put together at the aforementioned cryptocurrency trading place, which was a really interesting solution for a number of things that we were doing, wasn't it? It allowed us to do rolling updates of the code because we could have a quorum of five machines doing the same processing and then take two of them out of the system, upgrade them and then put them back in again and then run them in silent mode and check that everyone still agreed on everything that was happening.

405
00:53:07.820 --> 00:53:18.320
<v Matt Godbolt>And then only when we were confident that we hadn't introduced a new bug, we could add them back into the pool and then start rolling over the other three. And there you go. Now you can do rolling upgrades and you're never down. Hooray.

406
00:53:18.320 --> 00:53:18.320
<v Ben Rady>Mmhmm, mmhmm, mmhmm.

407
00:53:18.320 --> 00:53:50.060
<v Matt Godbolt>It let us do things like have different configurations of those computers, be it through different JVM settings or different hardware or whatever, such that if one of them processed the message faster than the other, or one of them had to GC, say, or one of them was doing some JIT work or whatever, we could make sure that as long as two or three came up with a good answer that we were happy with, the other two could be slower and that's fine. And that meant that we could hide some of our tail latencies

408
00:53:50.060 --> 00:53:50.420
<v Ben Rady>Yeah, yes.

409
00:53:50.420 --> 00:54:01.740
<v Matt Godbolt>in the quorum, which was, you know, so we got all these wonderful... and obviously, yeah, if we had an equipment failure, then, you know, two or three of those machines could die and the site would stay up and we'd be able to process transactions and everything.

410
00:54:01.740 --> 00:54:11.980
<v Matt Godbolt>And that was super cool. And was, was definitely eye opening to me working there in terms of like, Hey, you get a lot of benefits from doing it this way. That's great.

411
00:54:11.980 --> 00:54:36.460
<v Ben Rady>Yeah, I had that same experience. And we had done some things like that at my previous company, when we were basically intentionally creating races between systems because we were trying to get them to run as fast as possible. And it created an opportunity to make the system more fault tolerant, where you'd have, you know, multiple parallel things that are all processing the same stuff. And the first one to finish wins.

412
00:54:36.460 --> 00:54:55.540
<v Ben Rady>And so if there's some variation in the latency because of some, you know, operating system level thing, or a garbage collection because some of this was Java, or something else had happened, or one of them was just offline and was losing every race because it just wasn't processing anything, it was all fine.

413
00:54:55.540 --> 00:54:55.660
<v Matt Godbolt>Yeah.

414
00:54:55.660 --> 00:55:12.660
<v Ben Rady>Right. I think one of the more interesting things from that is if you want to be tolerant of certain types of failures, you know, like gamma ray burst type stuff where bits just get flipped, then the number of systems that you need to do this is three.

415
00:55:12.660 --> 00:55:18.340
<v Ben Rady>The number of counting is three ah because you need to have two of them, not one, not two.

416
00:55:18.340 --> 00:55:20.840
<v Matt Godbolt>Not one. Not two. Five is right out.

417
00:55:20.840 --> 00:55:25.580
<v Ben Rady>And five is actually kind of fine in this case, but you need at least three, right?

418
00:55:25.580 --> 00:55:25.580
<v Matt Godbolt>[laughs]

419
00:55:25.580 --> 00:55:40.240
<v Ben Rady>Because if you have two and one says the answer is A and the other says the answer is B, you don't know which is right. You need three so that you can compare: okay, two of them say it's A and one of them says it's B, so B is suspect.

420
00:55:40.240 --> 00:55:44.980
<v Ben Rady>And if you have five and four of them say it's A and one of them says it's B, then that's even better, right? But you need at least three.
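
NOTE
A tiny sketch of the "you need at least three" argument, assuming each replica reports its answer as a string.
The majority function is a hypothetical helper, not part of any system mentioned here, and it assumes a non-empty list.
With two replicas that disagree there is no strict majority, so nothing can be flagged; with three or five, the odd one out is identifiable.
```python
from collections import Counter
def majority(answers: list):
    """Return (winner, dissenting indices), or (None, all indices) when there is no strict majority."""
    winner, votes = Counter(answers).most_common(1)[0]
    if votes * 2 <= len(answers):  # a tie or a plurality is not enough to trust anyone
        return None, list(range(len(answers)))
    suspects = [i for i, answer in enumerate(answers) if answer != winner]
    return winner, suspects
two = majority(["A", "B"])                   # cannot tell which replica is wrong
three = majority(["A", "A", "B"])            # replica 2 is suspect
five = majority(["A", "A", "A", "A", "B"])   # same conclusion, with more confidence
```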

421
00:55:44.980 --> 00:55:46.180
<v Matt Godbolt>Time to, yeah.

422
00:55:46.180 --> 00:55:47.020
<v Ben Rady>Yeah.

423
00:55:47.020 --> 00:56:02.600
<v Matt Godbolt>If they're all running the same version of the code, it's time to, yes, start looking through your radiation hardening protocol for what on earth happened, or check the `dmesg` for any kind of uncorrectable memory errors and things of that nature. But yeah, that's

424
00:56:02.600 --> 00:56:02.600
<v Ben Rady>Yes. Right.

425
00:56:02.600 --> 00:56:14.940
<v Matt Godbolt>Yeah, I think I've just looked at the time and we've been gabbling, you know, given that we hadn't really got a plan, which, you know, as our regular listener will know, is how we do this.

426
00:56:14.940 --> 00:56:25.480
<v Matt Godbolt>We've covered quite a lot of ground, although I don't know that we covered our intended topic exactly as I would have done if we'd written out something before, because we went on so many tangents, but in a good way.

427
00:56:25.480 --> 00:56:35.040
<v Matt Godbolt>Like, we talked about time, we talked about durability, we talked about scalability, and all these things come out of a message-based system, or can come out of a message-based system.

428
00:56:35.040 --> 00:56:35.040
<v Ben Rady>Mm hmm.

429
00:56:35.040 --> 00:56:47.040
<v Matt Godbolt>And especially if you have this sort of journal-based thing where you say the sequence of messages is the only input into my state machine and I can trivially start from the beginning of time and get to exactly the same state.

430
00:56:47.040 --> 00:56:53.780
<v Matt Godbolt>Or we can snapshot, if we know what internal state is important at different points along that time, and have the best of all worlds, which is super cool.

431
00:56:53.780 --> 00:56:54.140
<v Ben Rady>Yeah.

432
00:56:54.140 --> 00:57:00.960
<v Matt Godbolt>Yeah. So I think, by way of saying maybe we should stop here, is why I bring all that up.

433
00:57:00.960 --> 00:57:00.960
<v Ben Rady>yeah

434
00:57:00.960 --> 00:57:09.260
<v Matt Godbolt>So this has been super cool. And definitely some deep memories there from previous companies coming up.

435
00:57:09.260 --> 00:57:12.920
<v Ben Rady>Oh, yeah. Bringing back the hard lessons of systems past.

436
00:57:12.920 --> 00:57:17.060
<v Matt Godbolt>"We make mistakes so you don't have to."

437
00:57:17.060 --> 00:57:17.960
<v Ben Rady>A new tagline on this podcast.

438
00:57:17.960 --> 00:57:33.740
<v Matt Godbolt>That is our new tagline, okay. I've definitely made plenty of mistakes, as well you know, as I shared on social media: a picture of me driving a car through a place where cars shouldn't go, and I got the car wedged. [ https://bsky.app/profile/matt.godbolt.org/post/3ldh76pqffc2z ]

439
00:57:33.740 --> 00:57:33.940
<v Ben Rady>Uh huh.

440
00:57:33.940 --> 00:57:35.600
<v Matt Godbolt>Yeah, you have to look at me on

441
00:57:35.600 --> 00:57:38.800
<v Ben Rady>Three gigabytes' worth of car in your messaging system and it didn't work.

442
00:57:38.800 --> 00:57:44.050
<v Matt Godbolt>In my 2.9 gigabyte hard drive, yeah, it didn't work very well anyway.

443
00:57:44.050 --> 00:57:45.300
<v Ben Rady>Yeah, yeah, yeah, yeah.

444
00:57:45.300 --> 00:57:57.760
<v Matt Godbolt>Well, I think we should leave it there, my friend. Thank you as ever for joining me in this endeavor of trying to, I don't know what we're doing, trying to what?

445
00:57:57.760 --> 00:57:59.470
<v Ben Rady>Huh. Yeah, yeah, this was a good one.

446
00:57:59.470 --> 00:57:59.880
<v Matt Godbolt>Be entertaining and enjoy ourselves and hopefully be interesting and useful to other people too.

447
00:57:59.880 --> 00:58:03.880
<v Matt Godbolt>All right, friend, until next time.

