WEBVTT

1
00:00:18.240 --> 00:00:19.120
<v Matt Godbolt>Hey Ben.

2
00:00:19.120 --> 00:00:20.260
<v Ben Rady>Hey Matt.

3
00:00:20.260 --> 00:00:26.420
<v Matt Godbolt>One of these days we're going to have to come up with a different intro to this podcast. Something like, "Good afternoon, Benjamin."

4
00:00:26.420 --> 00:00:26.780
<v Ben Rady>Hello, Mr. Godbolt.

5
00:00:26.780 --> 00:00:34.880
<v Matt Godbolt>There we are. That's better. Little variety there for our listener. Yeah, what are we talking about today?

6
00:00:34.880 --> 00:01:04.780
<v Ben Rady>Today I want to talk about an idea that entered my brain a few weeks ago at work, talking to one of our colleagues. He had asked me a question, and it was one of those times where it's like, I'm now going to reflect on what I have done for almost my entire career, something that I have been doing this whole time that I never actually put into words until just now, when you asked me a question about this.

7
00:01:04.780 --> 00:01:06.820
<v Matt Godbolt>Isn't mentoring such a rewarding

8
00:01:06.820 --> 00:01:09.020
<v Ben Rady>Experience?

9
00:01:09.020 --> 00:01:10.160
<v Matt Godbolt>Yes.

10
00:01:10.160 --> 00:01:16.210
<v Ben Rady>Being forced to use the talky bits of your brain, which normally lie dormant in the process of programming

11
00:01:16.210 --> 00:01:17.620
<v Matt Godbolt>It turns out that's true.

12
00:01:17.620 --> 00:01:34.500
<v Ben Rady>There's a whole extra CPU over there that you can use for thinking that you just normally don't. Right? It's like a graphics card in your brain. You can do all kinds of interesting things with it.

13
00:01:34.500 --> 00:01:35.900
<v Matt Godbolt>So he was, yeah. What did he ask for?

14
00:01:35.900 --> 00:02:52.200
<v Ben Rady>Yeah, so we were talking about automated tests, of course, because what the hell else do I talk about? And he was saying, how do you write automated tests when you're doing performance optimization? Do you write a unit test that shows that the performance is bad, then go and improve the performance, and then see your test pass? And I said, no, I don't really do that. I kind of treat performance optimization like refactoring. And he got this look on his face like, wait, what? And I had never really thought about it until that moment. But yes, there is actually, I think, a whole category of things that you can perhaps write automated tests for. I think it's difficult to write good tests for them, but you can, and I sometimes do. But for the most part there is a category of things, of which performance optimization is one, where my explicit intention is to keep the observable behavior constant (and I can define what that means very specifically later) while changing the code in some other dimension. And obviously the most common thing that you do with this is to change the design of the software so that it's easier to understand,

15
00:02:52.200 --> 00:07:21.440
<v Matt Godbolt>Easier to change. That's the very traditional refactoring definition.
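
NOTE Editor's sketch, not from the episode: one way to picture "performance
optimization as refactoring". The function names unique_slow, unique_fast and
check_unique are hypothetical. The point is only that one behavioral test,
which says nothing about speed, passes both before and after the optimization.
```python
# Hypothetical example: every name below is illustrative, not from the episode.
def unique_slow(items):
    """Original version: quadratic, because 'in' scans the list each time."""
    seen = []
    for item in items:
        if item not in seen:
            seen.append(item)
    return seen
def unique_fast(items):
    """Optimized version: same observable result, with set membership checks."""
    seen = set()
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result
def check_unique(unique):
    """Behavioral test: it pins down results only, never timing."""
    assert unique([]) == []
    assert unique([3, 1, 3, 2, 1]) == [3, 1, 2]
    assert unique(["a", "a", "a"]) == ["a"]
# The same test passes before and after the optimization.
check_unique(unique_slow)
check_unique(unique_fast)
```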

16
00:07:21.440 --> 00:07:49.840
<v Ben Rady>Yeah, absolutely. Absolutely. Now I will say again, here comes all the asterisks and caveats to this bold statement that I make at the head of the thing, because the real world is a little messy. There are many situations in which the way that you achieve a performance optimization is by changing the behavior, the actual observable by unit test behavior, right? Like, oh, we're going to just filter out some things from this collection before we process it, for example, would be an example of how you would speed something up

17
00:07:49.840 --> 00:12:04.220
<v Matt Godbolt>Or an API changes so that, well, hey, if I hand you each individual event one by one, it's a lot slower than me giving you, here's just a big old chunk of events, off you go. And then we lose that kind of feel to it, which is like, well, batching or otherwise changes the API. You could probably write an adapter to make the test still fit that hole, but you have still changed the way things look.

18
00:12:04.220 --> 00:12:48.220
<v Ben Rady>Right, right. Yeah. And the tests are there in that case only to make sure that if you accidentally switch from a stable sort to an unstable sort, that doesn't matter. And that sort of gets at the heart of what it means to change code where the behavior doesn't change, as verified by a set of tests. The behavior that matters and the behavior that doesn't matter are specified by the tests. And that's one of the reasons why it's important not to overspecify behavior in your tests, because not only does it make it harder to refactor, but if performance optimization in this way is also refactoring, it makes it harder to optimize your performance because you're asserting behavior that you don't actually care about.

19
00:12:48.220 --> 00:13:03.660
<v Matt Godbolt>That stable sort versus unstable sort is a perfect example of saying it's so easy to write a test that will fail for the wrong reasons if you accidentally assert stability of an order. So just for those who aren't

20
00:13:03.660 --> 00:13:06.760
<v Ben Rady>Familiar, yeah, I guess we should define what stable and unstable sorts are maybe.

21
00:13:06.760 --> 00:14:23.100
<v Matt Godbolt>Exactly. So when you're sorting things, you sort by a key, and that may be only one of the properties of the objects that you have in an array or container. What do you do when there's a tie? Well, one answer is that it doesn't matter: everything that has the same priority, or whatever the key is, can appear in any sequence, because they all have the same value as far as you're concerned. And so with an unstable sort, which is often more optimized, if you have two things with the same key, it doesn't matter if you end up switching them or not switching them. And that's great, and that's often what you want, certainly if you're writing the sort. But sometimes if there's a tie, you still want to keep them in the same sequence; that is, objects that have the same value need to be kept in the same sequence as they were originally. That's a stable sort. Anyway, that is much harder to achieve in a sort algorithm and more expensive, and it could be so easy to write a test where you have things that do have the same values in them, and then you assert them to be exactly some output form, and that exact output form is the stable one. And then later on you change your sort and they get switched around and you're like, why is my test failing?
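
NOTE Editor's sketch, not from the episode: a test that checks a sort without
asserting stability. The task data and the names by_priority and
check_sorted_by_priority are hypothetical. Python's built-in sorted() happens
to be stable, but the check below would keep passing if it were swapped for an
unstable sort, because ties are allowed to come out in either order.
```python
from collections import Counter
# Hypothetical data: (priority, name) pairs with a tie on priority 1.
tasks = [(2, "deploy"), (1, "build"), (1, "lint"), (3, "notify")]
def by_priority(items):
    """Function under test: sorts by the first field only."""
    return sorted(items, key=lambda t: t[0])
def check_sorted_by_priority(result, original):
    keys = [priority for priority, _ in result]
    # The keys must come out in non-decreasing order...
    assert all(a <= b for a, b in zip(keys, keys[1:]))
    # ...and nothing may be lost or duplicated, but ties may be in any order.
    assert Counter(result) == Counter(original)
check_sorted_by_priority(by_priority(tasks), tasks)
# An output with the tied items swapped is accepted too.
swapped = [(1, "lint"), (1, "build"), (2, "deploy"), (3, "notify")]
check_sorted_by_priority(swapped, tasks)
```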

22
00:14:23.100 --> 00:14:25.780
<v Ben Rady>Yeah, right. Exactly. Exactly.

23
00:14:25.780 --> 00:14:51.740
<v Matt Godbolt>And we do this with things like sets, things that return sets of objects. Sometimes you forget whether that set is ordered or not ordered, and then again, changing the internals to make the set deterministically ordered or not could be an optimization. It's like, hey, I don't have to keep them in sorted order. As long as it's still a set of all the things that you said, in any order, that's fine. Yeah, overspecifying is bad anyway, I think.
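
NOTE Editor's sketch, not from the episode: a test that survives a change from
an ordered list to a set. The event data and the names logged_in_list,
logged_in_set and check_logged_in are hypothetical. The test compares
membership only, so it does not care which collection type comes back.
```python
# Hypothetical events: (user, action) pairs; all names here are illustrative.
EVENTS = [("ann", "login"), ("bob", "login"), ("ann", "login"), ("cat", "logout")]
def logged_in_list(events):
    """Original version: a list in first-seen order, with a linear 'in' check."""
    users = []
    for user, action in events:
        if action == "login" and user not in users:
            users.append(user)
    return users
def logged_in_set(events):
    """Optimized version: a set, so iteration order is no longer meaningful."""
    return {user for user, action in events if action == "login"}
def check_logged_in(logged_in):
    # Compare as sets: the test cares about membership, not about order.
    assert set(logged_in(EVENTS)) == {"ann", "bob"}
    # An overspecified version would assert equality with ["ann", "bob"].
check_logged_in(logged_in_list)
check_logged_in(logged_in_set)
```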

24
00:14:51.740 --> 00:15:30.020
<v Ben Rady>Yeah, no, that's exactly right. And switching, for example, from a list to a set in order to achieve some performance optimization, and having a test fail because the test cared about the order when really you don't actually care about the order, you just care that it's a collection, is a perfect example of this. And so that's one of the many reasons why overspecifying in tests, and not being very intentional about what it is that you assert and what it is that you don't assert, can make your tests less useful, because it doesn't let you do this. Another area that I think is like this is thread safety.

25
00:15:30.020 --> 00:19:09.200
<v Matt Godbolt>Oh

26
00:19:09.200 --> 00:19:16.880
<v Ben Rady>Yes, exactly. Think really hard about it and then go explain it to someone else on your team,

27
00:19:16.880 --> 00:19:19.740
<v Matt Godbolt>And if they still think it's a good idea,

28
00:19:19.740 --> 00:19:31.800
<v Ben Rady>A, to use that sidecar GPU to think about it again, and also to make sure that they agree that we're not about to introduce a foot gun into the system and break a whole bunch of things

29
00:19:31.800 --> 00:20:19.540
<v Matt Godbolt>For real. Yeah. It's interesting how much on my team, even recently, I've been discussing the introduction or otherwise of threading, and opinions diverge as to how much trouble that's going to cause. And my healthy skepticism is, if we can avoid it as much as possible, then great, and if we do have to do it, let's use more of a message passing style than locking and threading, and then I've kind of essentially outsourced the correctness of my thread library to something which is a single producer, single consumer queue or a multi, and then hey, the best kind of shared state is the one that you don't share with anyone at all. Mutable shared state is the one that you copy and give to someone else, and then it's not shared.
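
NOTE Editor's sketch, not from the episode: message passing instead of locking,
using Python's standard queue.Queue (a general thread-safe queue, not
specifically a single producer, single consumer one). The worker function and
the message shape are hypothetical. The producer hands over a copy of its
state, so nothing mutable is shared between the two threads.
```python
import copy
import queue
import threading
def worker(inbox: queue.Queue, outbox: queue.Queue) -> None:
    """Consumer thread: it owns whatever it receives and shares nothing."""
    while True:
        message = inbox.get()
        if message is None:  # Sentinel: the producer is done.
            break
        message["values"].append(sum(message["values"]))
        outbox.put(message)
inbox, outbox = queue.Queue(), queue.Queue()
thread = threading.Thread(target=worker, args=(inbox, outbox))
thread.start()
state = {"values": [1, 2, 3]}
inbox.put(copy.deepcopy(state))  # Hand over a copy, so the state is not shared.
inbox.put(None)
thread.join()
result = outbox.get()
# The worker changed its own copy; the producer's state is untouched.
assert state == {"values": [1, 2, 3]}
assert result == {"values": [1, 2, 3, 6]}
```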

30
00:20:19.540 --> 00:20:19.640
<v Ben Rady>But

31
00:20:19.640 --> 00:24:07.920
<v Matt Godbolt>This was not meant to be about threading.

32
00:24:07.920 --> 00:24:33.060
<v Ben Rady>I know, and I think this is the problem that I have with this, and this is why I think this is controversial. I use basically the exact same approach. I'm not writing tests for the dependencies that I have, but I also don't have a better process than upgrading some dependency or the standard library or whatever, running make test, maybe deploying to the test environment and starting the system up and poking it

33
00:24:33.060 --> 00:25:07.000
<v Matt Godbolt>With a stick a bit. Yeah, that's me exactly. Well, I mean, I do this once a week. Every Monday night I upgrade all of Compiler Explorer's dependencies, and I run all the tests again, and then I deploy to our staging instance, and I literally do that. I poke around, I do a few compiles, I have my little list of things I do, and then I go, I guess that works then. And then sometimes it hasn't. Someone goes, oh yeah, this syntax highlighting isn't working anymore. I'm like, oh, shucks. I don't have the bandwidth to write full UI tests for third-party components, for example. And so sometimes things break,

34
00:25:07.000 --> 00:27:30.440
<v Ben Rady>Right? So it's like that. I don't think that I can justifiably call that kind of change refactoring because I have no expectations that problems would be caught. I don't know. I just don't know

35
00:27:30.440 --> 00:30:28.220
<v Ben Rady>Yeah, I suppose if you were doing anything with sort of generated code and you changed the way that you did the code generation, that would be another kind of thing where you would expect your tests to continue to pass, and it's very possible that they would catch some problems. I think it's not quite as unjustifiable a confidence as the "upgrade the libraries and all my tests pass" case, because there we explicitly don't test that behavior. It's sort of like, okay, well, where would you stop testing that behavior? Do you trust the CPU? Do you trust the electrons? You've got to draw the line somewhere. But with the sort of code generation, there might be some situations where you can at least sort of transitively say, if this code generation is broken, something upstream of it is almost certainly going to break. So you could rely on the tests in that way. It's an

36
00:30:28.220 --> 00:31:22.640
<v Ben Rady>Yeah, right, right. Yeah. I mean, the analogy that I used for this is, I feel like there was some physics professor many decades ago who was doing research into quantum physics and then started walking around with risers on the soles of his feet. He was scared that he was going to fall through the floor, and it's like you just sort of do the math and you're like, wait a second, this is possible. And it's sort of like, you can't live your life worried that all of the electrons in the ground beneath you are going to suddenly align in such a way that your legs just sort of pass through the floor, and that's just how it's going to happen. Maybe there's some infinitesimal probability of that, but you just can't do that. You have to have... actually, I did a talk that was sort of adjacent to this once, which was sort of tongue in cheek, saying all engineering is based on faith.

37
00:31:22.640 --> 00:31:24.640
<v Matt Godbolt>It is

38
00:31:24.640 --> 00:32:06.050
<v Ben Rady>Faith-based engineering, and it's like, at a certain point, you just have to have faith that the CPU is going to be able to add one and one and get two. Because if you do everything constantly questioning all of the abstractions beneath you, you won't get anything done. I think what makes a senior software engineer a senior engineer is that they've developed a sense for when that can be trusted, like, I think if I add one and one, I'm going to get two, and when it can't be trusted, where it's sort of like, what do you mean I can't treat this TCP socket like a file? Because you've just got to learn all those situations in which the abstractions are there and sometimes

39
00:32:06.050 --> 00:32:06.320
<v Matt Godbolt>Level

40
00:32:06.320 --> 00:33:31.280
<v Ben Rady>Really, and sometimes they're maybe not. That's what makes

41
00:33:31.280 --> 00:34:36.360
<v Ben Rady>You're right. You're going to fall down a rabbit hole and never come back. And I think this is directly related to what we were talking about in terms of being able to draw this line and have these dimensions of refactoring that go beyond just redesigning. Because again, this ties directly into the problem of overspecifying in tests, or underspecifying if you go the other way. If you're mistrustful of abstractions beneath you that you should be trustful of, or where it just doesn't make economic sense, time-wise, to distrust them, then you're going to overspecify in your tests. And that has many problems. One of which is you're wasting time writing tests for things that are probably not going to break, but also you're limiting the dimensions of refactorability that you will have in the future because of an unwarranted fear. And on the other side, you don't test things enough, because maybe we should have some situations, you and I, where we write some tests specifically for when we do that next upgrade, just to make sure that this thing really still works the way that it should.

42
00:34:36.360 --> 00:35:07.500
<v Matt Godbolt>Well, those things get informed a little bit by experience there. When you've been bitten a couple of times by stupid, whatever, file handling in do library, Bob, you go, well, I tell you what, while I was even trying to work out that this was a problem, I wrote some tests just to pave the way or to explore the space, and I tarted them up a little bit and then we checked them in, because they didn't hurt us any, and they might just solve it the next time. But that always has a feel of bolting the stable door after the horse has bolted, or whatever the saying is. I forget. Shutting them.

43
00:35:07.500 --> 00:36:02.180
<v Ben Rady>But I mean, I think it makes sense to sort of wait to feel some pain on those fronts, because if you did that speculatively, again, where's the boundary? You'd be testing everything. But I think getting that level of specification right, in the Goldilocks zone where it's not too much and not too little, takes experience. It takes some hard-fought experience sometimes, but when you get it right, or when you get it close, you're able to do all of these things. You're able to optimize your code with a set of running tests to tell you that it's not broken. You're able to change your threading model, again with tests that tell you that it's not broken, right? You're able to do all of these other things that we can use our tests for, but if you either overspecify or underspecify, your ability to do that will be greatly limited.

44
00:36:02.180 --> 00:36:10.980
<v Matt Godbolt>Yeah. That's awesome. And you've somehow managed to tie my complete tangent back around to the original topic. Yeah.

45
00:36:10.980 --> 00:36:20.900
<v Ben Rady>Well, it turns out these things are usually related. It's like there's subtle or sometimes not so subtle dependencies on these things, and you can see them, they're there.

46
00:36:20.900 --> 00:36:26.240
<v Matt Godbolt>Well, fantastic. I think that's a great place to pause and

47
00:36:26.240 --> 00:36:27.060
<v Ben Rady>Yeah, call it

48
00:36:27.060 --> 00:36:39.060
<v Matt Godbolt>For today. Yeah, go refactor some code, or go back to my compilation speed, or optimize it. Yeah, I mean, go optimize some code. That's always a fun thing to do.

