WEBVTT

1
00:00:14.760 --> 00:00:19.640
<v Matt Godbolt>Hey Ben,

2
00:00:19.640 --> 00:00:20.500
<v Ben Rady>Hey Matt,

3
00:00:20.500 --> 00:00:21.600
<v Matt Godbolt>How you doing?

4
00:00:21.600 --> 00:00:22.720
<v Ben Rady>Very good.

5
00:00:22.720 --> 00:00:51.260
<v Matt Godbolt>Excellent. Well, we don't normally refer to, uh, things that have happened in the news because that gives us a certain flexibility in the order that we release these recordings. But you and I were literally just talking about the fact that Broadcom has bought VMware and we were gonna talk about some level of containers versus not containers versus virtualization versus whatever. And it seems like we should, we should bring that up. So, let's talk! What do you...

6
00:00:51.260 --> 00:00:52.200
<v Ben Rady>Seems like a good topic.

7
00:00:52.200 --> 00:02:00.920
<v Matt Godbolt>Right? Exactly. It's such a deep one. And you know, we've got varying levels of experience in different technologies for essentially what is, how do I make sure my software works in the environment that I'm expecting it to? And I'm, I'm thinking personally from this point of view, like a developer who deploys server type applications, headless applications that run on machines in the cloud or in data centers and whatever like that. But I guess actually now I'm saying that I was sort of giving that so that, you know, I can't say, um, too much about how UI stuff is developed, but then there's a number of software packages I see for Linux these days that come as a Flatpak file or a, um, uh, what are those things called? There's, there's a, there's a bunch of different... snaps, they're snapshots, which are essentially, here's a, here's a whole operating system's worth of application wrapped into a file system and then presented as if it's like a, a single thing. And it's like a Docker container. Right. But, but differently. So, so it, it's, it is everywhere. And I think we all want software that's easy to deploy and run, but there's a number of ways of achieving it. What are your thoughts?

8
00:02:00.920 --> 00:02:49.400
<v Ben Rady>Uh, well, I mean, I think it's interesting that I think you and I have a sort of a similar perspective in that we look at those tools that way. And if we asked somebody who, you know, was a little bit more focused on infrastructure, they would probably tell us something similar, but definitely not the same in terms of like being able to take, um, you know, a fixed amount of hardware that they have provisioned and paid for and, you know, have, uh, prepurchased the electricity for, and have backup batteries for, and have networking for and say, okay, well, how do I take this sort of fixed resource and allocate it out to all of the, uh, needy, greedy software engineers who keep telling me that they want more servers? Um, well, I gotta have a server for this app and I have a server for this app and a server for this app.

9
00:02:49.400 --> 00:03:27.740
<v Ben Rady>Um, and so, you know, I think from their perspective, they might see some of these virtualization tools as a way to, um, you know, manage those resources more effectively and have, um, a little bit more control over not only just the resources themselves in terms of memory and compute, but also those, the, the sort of blast radius, if one of them, you know, goes horribly wrong. Right? Like, you know, being able to, uh, you know, wipe an image and, and give someone a fresh new server, uh, with a few clicks of a button is way easier than driving down to the data center and unracking a machine that is no longer responsive, um, because somebody did something terrible to it. So...

10
00:03:27.740 --> 00:03:29.650
<v Matt Godbolt>Never, I dunno what you're talking about.

11
00:03:29.650 --> 00:03:30.960
<v Ben Rady>I've never done that.

12
00:03:30.960 --> 00:03:32.050
<v Matt Godbolt>Never done that.

13
00:03:32.050 --> 00:03:32.580
<v Ben Rady>Uh.

14
00:03:32.580 --> 00:03:36.300
<v Matt Godbolt>Oops. I just fork bombed my own machine <laugh>

15
00:03:36.300 --> 00:04:21.660
<v Matt Godbolt>So you're right. Actually, that's a very valid point. Um, those infrastructural things are super important and it's sort of a, a funny thing. We, I was talking about it at lunch with a bunch of folks the other day, and, um, regaling with a story from, from my past and a friend of mine who used to work at, uh, uh, a big airline and those folks are still using mainframes and mainframes have always been able to do all the things that we are now kind of starting to rediscover in that virtualization world. Right. You like, Hey, you want more CPUs? Yeah. We can bolt more CPUs on while the mainframe's still running. Hey, you wanna shut down the mainframe, do maintenance and bring it back up again. Sure. What we can do is we can teleport the mainframe's image up to a backup site. Nobody even notices that your connection is now going to Manchester instead of London.

16
00:04:21.660 --> 00:04:53.300
<v Matt Godbolt>Um, and your terminals keep on responding. Everyone's still doing their requests and the batch jobs are still going. Meanwhile, they power down the main machine, fix the RAM, and then you can teleport it back. And those things have been around for, you know, half a century. And yet we are rediscovering them in terms of what, I mean, specifically you were mentioning things, uh, like, like VMware, um, allow you to manage the resources, really fine grained and make those kinds of like, Hey, we need to move from one machine to another machine. I, it's, it's sort of miraculous. Yeah. That, that it works as well as it does.

17
00:04:53.300 --> 00:04:59.540
<v Ben Rady>Yes, yes, yes. Yeah. Cuz those mainframes were clearly designed with those specific use cases in mind, right.

18
00:04:59.540 --> 00:05:01.840
<v Matt Godbolt>Hardware capabilities to do those things.

19
00:05:01.840 --> 00:05:40.340
<v Ben Rady>Right, right. They built those things from the ground up with like, okay, we're gonna be able to do this offsite backup and we're gonna make sure that it all works. And with all these other things, we sort of backed our way into it because it's like, clearly there's a need to do that. But you know, the old school operating systems and CPU architectures and all these other things that we have, uh, maybe someone gave that a thought a long time ago, but they certainly didn't design the whole ecosystem from the ground up to be able to do that. And so now we're sort of, um, in this state where it's like, that need is still there. The desire is still there and it's this sort of tricky problem of, okay, well how do you actually do that?

20
00:05:40.340 --> 00:06:35.680
<v Matt Godbolt>And yeah, folks like, uh, VMware have got their solution for it. There are obviously other vendors that can do it. And, and of course, I mean, one should, should note as well that the, the chip manufacturers have been slowly heading this way too, adding more and more like hardware level virtualization things. Cause you know, like we've always been able to do these things. It's like my, my hobby of writing little, uh, emulators for old machines, once you can fully emulate something, of course its state is just a bunch of numbers that you've got, you can move that around anywhere you like, and then kind of carry on somewhere else and have, you know, your single step through each frame of a game and then go backwards because you could just emulate from a snapshot, one fewer frame forward and keep, you know, that kind of stuff. So this has always been possible, but it was just infeasible to do it without actually running the same CPU as you are, you are trying to virtualize and, and the same hardware, but then things have come along. But I feel like we're going off base from where we, where I was thinking we were going <laugh> I just got excited, which is what this podcast is about. Right?

21
00:06:35.680 --> 00:07:01.640
<v Ben Rady>Yeah. Yeah, no, I mean, I, I think these, all these things are all are all kind of related, you know, you can, um, and maybe I, I, I don't think we should necessarily dive into this at the start, but one place that you could maybe take, this is like this isn't just about virtual computers. It's also about virtual networking equipment, right? Like if you look at, you know, some of the tools that are out there, it's like, yeah, you, you think that this IP address is a switch, but it's, it's not like

22
00:07:01.640 --> 00:07:22.500
<v Matt Godbolt><laugh>, I mean, one only imagines what's going on in like the, the AWS's and the Google cloud infrastructure in terms of their physical network separation and their ability to, as you say, make it look like you have your own cloud to yourself, knowing that actually no, those, those fibers are the same fibers that everyone else is using between all of the racks. It's all magic.

23
00:07:22.500 --> 00:08:13.100
<v Ben Rady>Right, right. Yeah. Um, but yeah, talking specifically about the, uh, the part of this that is, you know, I, as a software developer, as somebody who's, you know, sort of building a, a total application, you wanna be able to deploy it. That's really important. Uh, you want to be able to, um, you know, connect to the machine that's running it and troubleshoot it and read the logs and, you know, run tcpdump and netstat and all the other wonderful tools that we talked about in some prior podcast. Um, and, you know, still, uh, have the flexibility of, of, of the things that we were talking about in terms of, you know, um, you know, making maximal use of those resources and being able to tear it up and down and being able to, um, you know, build, uh, the definition of what that system is, uh, in a configuration file, rather than, you know, in PCPartPicker.

24
00:08:13.100 --> 00:08:16.680
<v Matt Godbolt>Here's a checklist that Barry has to go down and make sure they all look the same. Right?

25
00:08:16.680 --> 00:08:30.520
<v Ben Rady>Yeah, yeah, yeah, yeah. Right, right. Um, and so, you know, there's, there's, there's lots of, lots of different tools to do this, but I think they all sort of serve the same needs. So what are some of the tools that you've, that you've actually used in anger to do this?

26
00:08:30.520 --> 00:09:18.060
<v Matt Godbolt>Well, I mean, the main one that springs to mind, uh, other than bespoke ones that I guess became Kubernetes when I was at Google, there were some things that became sort of like strange containery type things. I dunno if that's exactly the same thing now, I think out loud, but, but the one I have the most experience with is, is Docker and Docker is a great solution to the, I want to have a reproducible environment that's incrementally built with layers. And so it's relatively efficient if you are only changing the end layers. And, um, and you can definitely have the, that kind of feeling of like, well, if I have a Docker image that I can give to you and you are gonna run it, then I am 99 point, you know, six nines positive that if it worked on my machine, it'll work on your machine, because what I in fact did was ship my machine to you.

27
00:09:18.060 --> 00:10:03.040
<v Matt Godbolt><laugh> and now you are running my machine, which is a blessing and a curse. And I think that's the problem, right? Is that it can be misused like anything, like any technology. Um, my experiences with Docker. So Compiler Explorer started out, actually didn't start out with anything. It just started out with a shell script, running Node.js on a bare machine, and then very quickly it was like, how am I gonna manage this? So I decided to use Docker rather sensibly, at the time. And Docker served us well for many, many years. Um, Docker did not scale with the gigabytes and gigabytes, you know, hundreds of gigabytes of compilers that I wanted to build into the image. The images took longer and longer every time to build. We're gonna take a pause there where my wife comes in through,

28
00:10:03.040 --> 00:10:04.060
<v Ben Rady>Through the couch?

29
00:10:04.060 --> 00:10:05.860
<v Matt Godbolt>Through the back. Yeah. Through the couch,

30
00:10:05.860 --> 00:10:08.980
<v Ben Rady>How have I never realized that there's a door behind your couch?

31
00:10:08.980 --> 00:10:16.980
<v Matt Godbolt>How else do you get between, um, places? You know, we put Floo powder in and then we can go anywhere to any other couch.

32
00:10:16.980 --> 00:10:20.040
<v Ben Rady>Diagon Alley. That's how, that's how this works?

33
00:10:20.040 --> 00:10:21.440
<v Matt Godbolt>That is exactly how this works.

34
00:10:21.440 --> 00:10:21.780
<v A very nice lady>Sorry.

35
00:10:21.780 --> 00:10:22.360
<v Matt Godbolt>It's okay.

36
00:10:22.360 --> 00:10:23.220
<v A very nice lady>The back door's jammed.

37
00:10:23.220 --> 00:10:25.340
<v Matt Godbolt>The back door's jammed. Okay. The back door's.

38
00:10:25.340 --> 00:10:27.740
<v Ben Rady>Oh, no. So you had to come in through the couch door.

39
00:10:27.740 --> 00:10:47.600
<v Matt Godbolt>Come in through the couch door. All right. There goes my dog. And then we're gonna have to try and remember what I was saying and work something out, or just pretend it didn't happen and just put this in and, you know, <laugh> give our listener. Uh, I was just thinking actually I hope our listener isn't called Barry because I always use Barry as, uh, as, as, as my general dogsbody person.

40
00:10:47.600 --> 00:10:48.200
<v Ben Rady>Oh yeah.

41
00:10:48.200 --> 00:10:48.660
<v Matt Godbolt>To do Barry.

42
00:10:48.660 --> 00:10:53.100
<v Ben Rady>I say Steve, for some reason. I don't know why that is Steve that's Steve. Yeah.

43
00:10:53.100 --> 00:10:54.900
<v Matt Godbolt>So we were talking about Docker.

44
00:10:54.900 --> 00:10:58.220
<v Ben Rady>Yeah. You're talking about using Docker in, in, uh, Compiler Explorer.

45
00:10:58.220 --> 00:11:44.580
<v Matt Godbolt>That's right. So the problem with, um, bigger and bigger images is that, um, no matter how you cut it, you are uploading layers upon layers upon layers, upon layers of a, of a piece of software with more and more compilers. And it was just getting unwieldy and there are definitely tricks you could do with volumes and other things like that. And we, we looked at them for a while, but ultimately we backed out when we realized we needed more security than Docker would give us. There were some, at the time, there were some relatively high privilege, um, exploits for breaking out of Docker containers into the wider world. And we were kind of tacitly relying on Docker to also be a sort of protection domain. And the other thing is that if you're running inside that container, even if you, um, even if you don't get privilege escalation outside that container, that container is long lived.

46
00:11:44.580 --> 00:12:50.760
<v Matt Godbolt>So if you're like servicing somebody's request and it was a poison request, and it was able to monkey with the system, it's now monkeyed with that running Docker container. And so it's gonna be there until we restart the machine or restart the docker container. So there was some things we didn't want, um, properties we didn't, um, that we wanted to get in terms of jailing. And once you're in one container, you can't have a container inside a container inside a container arbitrarily. At least at the time you couldn't. So we switched out to a different approach where we just have tarballs and run them on the operating system, but it did serve a need for a long time. And it's a frequent question we get asked is, "Hey, do you publish a Docker container of what you, of a Compiler Explorer, uh, instance that I can just get started?" Because people do just want to do Docker run, blah, and you get that benefit. It just works. Yeah. Um, we have different ways of achieving that, I think. But, um, it, so anyway, that's my experience with Docker. I also have used it at a number of places at work and, um, I think it works great if you plan very carefully, your Docker image layout and the layers are sensible and well managed. And yeah.

47
00:12:50.760 --> 00:12:54.780
<v Ben Rady>So when you say a Docker layer, what do you, what do you mean what's a layer?

48
00:12:54.780 --> 00:13:44.140
<v Matt Godbolt>So Docker logically is, um, a file system, a whole operating file system. It has, um, it literally untars for want of a better, um, explanation into a bunch of temporary directories and then overlays each directory, uh, one directory over the other. So you start with a base image, which is maybe, you know, like your entire Ubuntu distribution. And then you're like, oh, the first thing I'm gonna do is I'm gonna install these 20 apt packages that I need. And so the next layer will be another file system that only contains the things that change between the base system and the system where you ran sudo apt install my hundred packages I needed, my extra packages. And then the next layer might be, oh, and now I'm gonna copy some files from my git repo that I'm running it in into the container at a particular location.

49
00:13:44.140 --> 00:14:50.300
<v Matt Godbolt>And that's another layer of the file system that only contains those copied files and then so on and so forth, each, each layer would add in more bits of the software and configuration. And the cool thing is, of course, is that you only need to regenerate layers that changed. And of course the layers that are immediately after them. So if I change, for example, the base Ubuntu image, of course, everything depends upon that. So I'll have to rerun the commands that populated the later layers and create new layers. But if I'm just changing my application software and I don't change my dependencies and my system dependencies, then oftentimes it's only that last layer of a few hundred kilobytes or so that changes. And so, uh, not only is the build time faster, but the way that Docker, um, distributes itself is as compressed layers. And very often, of course, if you're upgrading software time and time again, those, those base layers are already on the system, in the cache somewhere. And the only thing that you need to do is upload the few hundred K each time, which is fabulous. Yeah. So that's a really good way of, of having, um, a, a sort of incremental deployment of your, of your software.
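
NOTE
Editor's sketch, not from the episode: a toy Python model of the layering described above, assuming each layer can be represented as a dictionary from path to contents. The names merged_view, base, packages, app_v1 and app_v2 are hypothetical, and real Docker layers are filesystem diffs rather than dictionaries.
```python
# Hypothetical model: each layer maps a path to file contents.
def merged_view(layers):
    """Overlay layers in order; later layers win for the same path."""
    view = {}
    for layer in layers:
        view.update(layer)
    return view
base = {"/etc/os-release": "ubuntu", "/bin/sh": "dash"}
packages = {"/usr/bin/curl": "curl-binary"}
app_v1 = {"/app/server.js": "v1"}
app_v2 = {"/app/server.js": "v2"}
image_v1 = [base, packages, app_v1]
image_v2 = [base, packages, app_v2]
# Only layers that differ between the two images need rebuilding and shipping.
changed = [new for old, new in zip(image_v1, image_v2) if old is not new]
print(merged_view(image_v2)["/app/server.js"])
print(len(changed))
```
In this model, swapping the application layer leaves the base and package layers untouched, which is the property that keeps rebuilds and uploads small.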

50
00:14:50.300 --> 00:14:58.260
<v Ben Rady>Yeah. So what happens if I have like a, a layer that's like fetched the latest version of this thing from the internet?

51
00:14:58.260 --> 00:15:38.320
<v Matt Godbolt>Well, that's, that is an excellent question. And, uh, that is one of the biggest problems with something like Docker is that it's very easy, uh, Docker caches based on the text contents of the command that's to run. Oh, okay. So if you just say curl, get me latest version of something pipe through tar -zxf or whatever to extract it, then that command will run exactly once on your machine when it populates that layer. And then if you run again, having uploaded, having changed the, um, the contents on the website that you're curling from, or like a new version of the software is released and the URL doesn't encode that in some way, you know, you're getting, you know, like, right.

52
00:15:38.320 --> 00:15:39.320
<v Ben Rady>You're getting latest or whatever,

53
00:15:39.320 --> 00:15:48.160
<v Matt Godbolt>Bob dot latest exactly. Right? Yeah. Then you won't see that, but unfortunately, anyone who later builds with your Docker container will see that change.
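
NOTE
Editor's sketch, not from the episode: a simplified Python model of a build cache keyed on the parent layer and the command text, which is the behaviour described above for run-style steps. The Builder class, layer_key function and example.invalid URLs are hypothetical, and the real cache considers more than this.
```python
import hashlib
def layer_key(parent_key, command):
    """Cache key derived from the parent layer and the command text only."""
    return hashlib.sha256((parent_key + "\n" + command).encode()).hexdigest()
class Builder:
    def __init__(self):
        self.cache = {}
    def run(self, parent_key, command, execute):
        key = layer_key(parent_key, command)
        if key not in self.cache:
            self.cache[key] = execute()  # only executed on a cache miss
        return key, self.cache[key]
remote = {"latest": "tool 1.0"}
builder = Builder()
fetch_latest = "curl https://example.invalid/tool-latest.tar.gz | tar -zxf -"
_, first = builder.run("base", fetch_latest, lambda: remote["latest"])
remote["latest"] = "tool 2.0"  # upstream publishes a new release
_, second = builder.run("base", fetch_latest, lambda: remote["latest"])
# A pinned URL changes the command text, so the step runs again.
pinned = "curl https://example.invalid/tool-2.0.tar.gz | tar -zxf -"
_, third = builder.run("base", pinned, lambda: remote["latest"])
print(first, second, third)
```
The second build reuses the stale result because the command text did not change, while a fresh builder with an empty cache would fetch the new release. That mismatch is the reproducibility problem being discussed.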

54
00:15:48.160 --> 00:16:36.220
<v Matt Godbolt>And so these things will not necessarily agree. And so it's really important that if you are fetching external resources, uh, and it's so easy not to get this right. But if you are fetching external resources that you get, like a specifically named version of everything that you want to get for two reasons, one, it means that you get reproducibility. If someone else grabs your Docker file and just says, build me this please. And the second thing is that necessarily, if you want to change that image, you have to edit the, the URL that encodes the git sha or the version number or whatever. And which means that it will be rebuilt automatically, but it's hard to do that right. And it's hard to make sure you apply that everywhere. Even things like the base image itself, you know, often time when you say in the Docker file, Hey, I'd like to build something based on a Ubuntu 20.04.

55
00:16:36.220 --> 00:17:21.520
<v Matt Godbolt>That's essentially what you say, you say from Ubuntu colon 20.04 or from Ubuntu colon latest or something like that. And those are kind of like a git pull of whatever someone has tagged as being the 20.04 for Ubuntu. If you really, really want to make sure you get reproducible builds, you need to put the SHA hash of that particular layer in the get command as well. So that, you know, you're always gonna start with the, um, uh, the same version. And of course there's a duality there, right? It's convenient from, you know, from my mindset, it's great to have a totally reproducible build. And that means that I can hand you a Docker file, not, not the contents of the Docker image, right. That's different. But if I hand you just the text that says, this is how to build my world, you will get the same answer that I got every time.

56
00:17:21.520 --> 00:18:27.400
<v Matt Godbolt>And that's really powerful, but it's super inconvenient because, um, every time some little trivial fix in the base image is pushed, you know, a security patch or a security fix or whatever, then I have to think to go back and change the sha to be the latest one. And that kind of feel if I want to keep those things going. And of course the first thing you're gonna do this is almost always what the first, uh, line after the, from Ubuntu is, sudo. Not sudo, cause you're running as root. Is apt-get update and update, update, sorry, upgrade and update, right? Because you want to pull in all of the, the, the things that are latest. There's no kind of version for that. There's no bi-temporality to that. So you're a bit stuck at that point. Um, and that factors into where some of the problems that one has with, with something like Docker, it's a boon, but you have to be really careful how to use it and have to understand these slightly sharp edges. And maybe most people don't care about those, but I know that it's affected us before. And we, we have a, you know, you and I have definitely got, um, an industry where we really want to be able to reproduce what we did before and, and understand it.

57
00:18:27.400 --> 00:18:49.040
<v Matt Godbolt>It's also very easy to generate gigantic layers. If you think about, um, if you, if you don't design your Sy, your Docker file correctly, you know, so in the example I just gave of apt update, apt upgrade, apt install, right? Those are like sensible commands I might type myself if I had a fresh new computer that you handed me,

58
00:18:49.040 --> 00:18:49.940
<v Ben Rady>Right.

59
00:18:49.940 --> 00:19:25.680
<v Matt Godbolt>The simple thing to do would be to run them as three separate layers. And that makes a lot of sense, but I've pulled down a whole bunch of stuff and replaced a bunch of, uh, um, there's a load of temporary files that get pulled into the apt directory that I probably don't need in my production image. Um, I've then updated a whole bunch of stuff, which has replaced a bunch of stuff. And then I'm like maybe installing my own packages. And maybe I remove some system packages that I don't want. Right. And so I've got three or four layers, each of which is strictly additive. And then there is sometimes if you had to delete files, so I, you might be tempted at the end of that to go. And the last thing I do is rm minus rf /var/apt/cache, right?

60
00:19:25.680 --> 00:19:57.600
<v Matt Godbolt>Kill the cache. I don't want it anymore. It's like gigabytes of all the intermediate crap that was downloaded while I was installing my packages. But if you put it as a separate step, unfortunately those already exist. Those intermediate files exist in a layer; that delete can't remove them from the layer. It just marks them as being, you can't see them anymore. It puts tombstones in there. And so your overall size, the number of bytes you need to ship around still contains the layer that has all of those files in it. And then a separate layer that says, and by the way, all those files are gone now. <laugh>

61
00:19:57.600 --> 00:19:58.280
<v Ben Rady>Right, right.

62
00:19:58.280 --> 00:20:23.700
<v Matt Godbolt>So you have to be really careful. So you, what people end up doing is writing a, a, a long stanza of like apt-get and update, and whatnot, as like one giant long single bash command. And at the very end of that, rm minus rf /var/apt/cache and dpkg dash dash, you know, purge, caches, all the things as one thing. So atomically all those things happen. And then it's just the, the end result that gets shipped as the layer.
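
NOTE
Editor's sketch, not from the episode: a toy Python model of why deleting files in a later layer does not shrink what gets shipped. The WHITEOUT marker, the paths and the sizes in megabytes are invented for illustration; only the tombstone behaviour comes from the discussion above.
```python
WHITEOUT = None  # hypothetical marker meaning "hidden by a later layer"
def shipped_bytes(layers):
    """Every layer ships in full, including files deleted by a later layer."""
    return sum(size for layer in layers for size in layer.values() if size is not WHITEOUT)
def visible_bytes(layers):
    """Size of what the running container can actually see."""
    view = {}
    for layer in layers:
        for path, size in layer.items():
            if size is WHITEOUT:
                view.pop(path, None)
            else:
                view[path] = size
    return sum(view.values())
# Install and cleanup as separate steps: the cache stays in the first layer.
separate_steps = [
    {"/var/cache/pkgs.bin": 900, "/usr/bin/tool": 50},
    {"/var/cache/pkgs.bin": WHITEOUT},
]
# Install and cleanup in one command: only the end result becomes a layer.
single_step = [{"/usr/bin/tool": 50}]
print(visible_bytes(separate_steps), shipped_bytes(separate_steps))
print(visible_bytes(single_step), shipped_bytes(single_step))
```
Both variants expose the same files at run time, but in this model the separate-step image still carries the deleted cache in its first layer.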

63
00:20:23.700 --> 00:20:44.380
<v Ben Rady>Yeah. Yeah. And that, I've definitely seen that in Docker files and it's sort of this, like, you know, uh, it just reads as gobbledygook as at the start of the file and you sort of parse it and you sort of figure out what's going on there, but it's, it's not the sort of like clean, you know, one instruction at a time, maybe with a helpful comment as to why you're doing it, um, that you'd want.

64
00:20:44.380 --> 00:21:24.640
<v Matt Godbolt>You, you know, you, you see, sometimes people will write shell scripts that they then copy into the image to run and then delete again afterwards, just because then the shell script is essentially atomic from the point of view of the layers and it's, I mean, it could be a tooling thing. It could be just what you'll get used to. I don't know, but it's easy to get wrong. Yeah. And the thing is that as a developer running locally, you tend not to notice these mistakes because it's necessarily incremental. You've been doing this, you've been building on and building on and building on. Right. And then when you ship the, the, when you docker push for the first time you discover that you've got several layers of a, you know, gigabytes each, and I'm sure you've done this as well, when you've pulled someone else's Docker image and you're like, oh my golly, what on earth is it pulling down?

65
00:21:24.640 --> 00:21:41.840
<v Ben Rady>Why is this Docker image so big is a game that many have played and few have won. <laugh> it's, it's, it's just, it's a really painful experience sometimes. You know, you start cracking, open the layers and trying to figure out what the heck is going on. And it's just like, oh geez, why are we doing this?

66
00:21:41.840 --> 00:22:28.740
<v Matt Godbolt>Right. Right. And I think a lot of the time, people reach for Docker because it's super convenient. Everyone understands it and it does solve a very real need. But I think oftentimes in my experience with the, the kind of things that we do at least, um, a tarball of the code that you're gonna run, maybe containing the node.JS binary, you wanna run it with, or maybe, you know, cause we are in a luxurious position where we own our machines, they live in a data center, we know which machines they're running on, which, you know, probably virtual machines as it happen. So that's another layer of a virtuality above all of this. Um, but if we know a lot of things about what version of libc is it running? What, you know, base operating system are we running?

67
00:22:28.740 --> 00:23:14.780
<v Matt Godbolt>What things can I assume are there, which of course is now a dangerous game to play, which Docker kind of makes you address fully. But most of the time you're like, well, okay, if I've got libc this version, I'll just pass along all my dependencies. Right. And it's not that big, you know, for native applications, often a bit of an, uh, um, a few environment variables. And suddenly now, uh, all of your DLLs will be looked for inside the, the directory you ship. And then you're just like, copy them all with you. And that's a bit bigger, but you know, we're talking tens of megabytes of, of library files here, right. In a little tarball that you extract and will run on a developer's machine and a remote machine. And I guess the other sort of critical part about Docker is that it requires elevated privileges, which means that there's a lot of monkeying around with which user you are running as right.
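
NOTE
Editor's sketch, not from the episode: one way a launcher for an extracted tarball might point the dynamic loader at bundled libraries, assuming a Linux target where LD_LIBRARY_PATH is honoured. The bundle_env function, the /opt/myapp path and the lib subdirectory are hypothetical.
```python
import os
def bundle_env(bundle_dir, base_env=None):
    """Return an environment that searches the bundle's lib directory first."""
    env = dict(os.environ if base_env is None else base_env)
    lib_dir = os.path.join(bundle_dir, "lib")
    existing = env.get("LD_LIBRARY_PATH", "")
    env["LD_LIBRARY_PATH"] = lib_dir + (os.pathsep + existing if existing else "")
    return env
env = bundle_env("/opt/myapp", {"PATH": "/usr/bin"})
print(env["LD_LIBRARY_PATH"])
# A launcher could then pass env=env to subprocess.run for the bundled binary.
```
The bundled directory is placed ahead of any existing search path, so the shipped copies of the libraries are the ones found first.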

68
00:23:14.780 --> 00:23:55.400
<v Matt Godbolt>And that sometimes it's useful. You sometimes you want a totally unprivileged user that's isolated from the rest of the system. Um, and, and, you know, in the, like the kind of was it 12 factor type model where, um, an application sort of consumes only logs to standard out only reads and writes to external things through TCP. That's fine. You treat it as like a black box, but very often it's tempting for developers to carry out. Well, it would be really convenient if I could get to this set of files on the network, or if I could write to this log directory. And so you start passing things, you start puncturing the isolation that Docker gives you. And then suddenly you wonder why on earth you've got a hundred files that are owned by the wrong user,

69
00:23:55.400 --> 00:23:55.960
<v Ben Rady>Right.

70
00:23:55.960 --> 00:24:08.480
<v Matt Godbolt>Excuse me, as a truck going past. Um, but you know, you run this command and then you like try to delete it afterwards. And it goes, I'm sorry, I can't delete that. You know, you need to be root. And you're like, wait a second. I'm not, I, how did you, how are you root?

71
00:24:08.480 --> 00:25:03.190
<v Ben Rady>Yes. How did you write this as root? And I, and I think it is really an unfortunate thing that the default behavior of Docker is to run as root, cuz it's really easy to sort of fall into a trap of, of, um, building an application that accidentally for really no good reason needs those elevated privileges. Right? Like if you had just been forced to think about it for a minute, you would've been like, oh, well we don't. I mean the, the dumbest example I can think of is like we're binding to port a hundred instead of, you know, 2000, right? Like there's no reason in the world why that integer matters to anyone. But if you build a whole application, it's like, yeah, there's 30 other apps that connect to port 100, cuz that's the port that we chose. Um, and not realizing that that requires elevated privileges. Um, then you've, you've just added a whole bunch of, you've added a constraint completely by accident.
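
NOTE
Editor's sketch, not from the episode: the port example above as a small Python check, assuming the common Linux default where ports below 1024 are privileged. The function name and the constant are hypothetical, and the threshold can be configured differently on a given system.
```python
UNPRIVILEGED_PORT_START = 1024  # common Linux default; an assumption here
def needs_elevated_privileges(port):
    """True if binding the port normally needs root or an extra capability."""
    if not 0 < port < 65536:
        raise ValueError("port out of range")
    return port < UNPRIVILEGED_PORT_START
print(needs_elevated_privileges(100), needs_elevated_privileges(2000))
```
Running as an unprivileged user surfaces this kind of accidental constraint early, since the bind to the low port would fail straight away.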

72
00:25:03.190 --> 00:25:04.760
<v Matt Godbolt>Right.

73
00:25:04.760 --> 00:25:14.480
<v Ben Rady>Um, and, and running as a non-privileged user, you'll find that out right away. Um, and there are other things like that too. And I, and I feel like it's almost like the testing thing. Right. And I on brand,

74
00:25:14.480 --> 00:25:16.160
<v Matt Godbolt>Oh my gosh. Testing you say, tell me more!

75
00:25:16.160 --> 00:25:19.620
<v Ben Rady>I know I haven't talked about testing in like a podcast and a half, so

76
00:25:19.620 --> 00:25:20.720
<v Matt Godbolt>I know. All right. You've got...

77
00:25:20.720 --> 00:25:54.840
<v Ben Rady>It's the, you know, part of the reason you write the test first is to make sure that the resulting solution that you come up with is testable. Right. If you build something, uh, and you don't think about tests and then you try to add the test later, it's really hard. And so most people don't right. And it's, and the reason for that is, well, you came up with a perfectly reasonable solution if you completely ignore this other constraint. Yeah. And then you try to add it in later. Right. And so you're doing kind of the same thing when you run, uh, you know, apps as root in Docker is you've, you've got a constraint that would be nice, but you don't even think about it until it's too late.

78
00:25:54.840 --> 00:26:33.240
<v Matt Godbolt>It's invisible, which, okay. So I'm gonna take the other side of that just to sort of, in the defense of a Docker style thing. I know obviously there's many a nuance here, but right. One of the things that Docker gives you kind of out of the gate is deployability, which is another thing that if you don't think about right at the beginning, it's hard to retrofit. We've all seen applications that you're like, well, this is all well and good if I can git clone and I've got full access to the internet and then I can run, uh, these commands and I've got access to these things and I can do whatever. And you're like, that's great on my developer machine. Again, the loudest truck in the world is now outside my house.

79
00:26:33.240 --> 00:26:34.020
<v Ben Rady>They're circling, just circling.

80
00:26:34.020 --> 00:26:55.020
<v Matt Godbolt>They really, no, it's just, he's taunting me. He's reversing it up. This has been the most... I will try and edit some of these things, but I think if you can hear this, dear listener, then I failed to edit the podcast very well. All right. I think they've gone. So, but yeah. Where were we? Um, I was ranting about something.

81
00:26:55.020 --> 00:26:56.300
<v Ben Rady>You were about to defend Docker. It was shocking.

82
00:26:56.300 --> 00:27:34.100
<v Matt Godbolt>It was, I was defending. No, the deployability is an important thing to not have to retrofit afterwards, and Docker kind of hands you that straight away. You're like, well, docker pull, docker run, amazing. Right. My CI is docker build and docker push. And my runtime is docker pull and docker run. And the cool thing is that my developers can run as if they have the CI build because they can docker pull as well, and then docker run as well. And so it ticks tons of boxes. Right? Yeah. It's so lovely, right, from that point of view. Yeah. Again, until you discover that half of your computer is now owned by root and you don't actually have root privileges on it. And then you're like, well, I'm stuck with these files, I guess.

83
00:27:34.100 --> 00:27:39.820
<v Ben Rady>Yes. Right. Until you fire up the container and then, uh, rm them from the containers,

84
00:27:39.820 --> 00:27:40.240
<v Matt Godbolt>Inside the container

85
00:27:40.240 --> 00:27:42.100
<v Ben Rady>The container has root.

86
00:27:42.100 --> 00:27:58.700
<v Matt Godbolt>Yes. I mean a good friend of mine. I will not drop them in the, in, uh, but a good friend of mine has a one-liner that gives you actual root privileges on the machine that you're on, if you have Docker available with non-sudo. It's a convenient little thing to remember and just clicking it. Oh, that's so.

87
00:27:58.700 --> 00:27:59.760
<v Ben Rady>Right.

88
00:27:59.760 --> 00:28:04.000
<v Matt Godbolt>You have Docker, you basically have root. Yeah. Even if you weren't allowed it in the first place.

89
00:28:04.000 --> 00:28:15.620
<v Ben Rady>And if you work in one of those horrible environments where they don't let you have sudo on your own machines, which is insane, but they do exist, you can maybe put in a request for Docker instead and get basically the same thing.

90
00:28:15.620 --> 00:28:37.380
<v Matt Godbolt><laugh> let me just say that this, this, uh, sec, this is a personal opinion that Ben and I hold, um, don't wanna get anyone in trouble with their security teams. Please don't do anything daft with that information, but it is true. Yeah. And it's great for taunting your infrastructure and SecOps folks, if, uh, you indeed need Docker for whatever. Anyway, that's, that's Docker, other containment, containment, container solutions. I mean containment

91
00:28:37.380 --> 00:28:39.920
<v Ben Rady>Containment solutions like from the Ghostbusters.

92
00:28:39.920 --> 00:28:45.870
<v Matt Godbolt>Like from, yeah. I was actually thinking the same thing. Yeah. The light is green. The trap is clean.

93
00:28:45.870 --> 00:29:08.120
<v Ben Rady>Uhhuh <laugh> uh, yeah. So I mean, VMware, so we kicked off this whole discussion with VMware and virtual machines, which are, uh, a very different kind of technology than Docker. Uh, do you think you could give us a two minute overview of the differences between something like VMware or VirtualBox or other sort of virtual machines and Docker, having built many virtual machines in your life?

94
00:29:08.120 --> 00:29:55.640
<v Matt Godbolt>You're, that's, well, my virtual machines have all been, um, eight bit, which makes them considerably easier on some axes. But yeah, so let's explain a little bit about how Docker is working. So at least Docker on Linux, which is my only experience here. So Linux supports, um, namespacing. That is the ability to make groups and, uh, resource allocations that are kind of contained and have their own namespace away from anyone else running on the system. And now obviously you can think about a user as sort of a namespace, vaguely, but you know, if you type ps as a particular user, or ps aux, you can see all of the other users that are running on the system. In this instance, namespaces can cordon off areas of the operating system so that, like, the main operating system can see what's going on.

95
00:29:55.640 --> 00:30:40.300
<v Matt Godbolt>But if you are inside that namespace, if a process is inside that namespace, it only sees things in its own namespace, and namespaces can be file systems. They can be users. They can be, um, oh, uh, CPUs. And that may be secrets, but there's a number of things, a number of, like, um, aspects of the system which can be compartmentalized and held separate. Um, but you're still running the same operating system. And you're still doing all the things that you were doing before. You're just making a new namespace. So what Docker effectively is doing is making a new namespace, um, creating inside that namespace a bunch of links to the outside world, for things like the terminal, for things like, um, oh yeah, network is another namespace you can create, and you can make a namespace. You can make a bridge.

96
00:30:40.300 --> 00:31:23.960
<v Matt Godbolt>That talks one namespace to another as if it was, uh, one of those network devices that we're talking about. Like a, uh, um, and then you're basically running like a regular process, except that if you type ps or if you type ls, you'll only see the world that the container gave you through giving you your own namespace. And it's a bit like, if someone's ever looked at like chroot jails, which was like the precursor to this, where you could say, hey, start a new process and pretend that the root directory, like the slash, the top of the hierarchy, is this subfolder I just made. And then you can never see outside of there. And you could imagine that you are effectively in a jail. You can't see outside of there, and your process can run along and, um, be isolated.
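
NOTE
Editor's sketch, not from the recording: on Linux, the namespaces a process belongs to are visible as symbolic links under /proc/self/ns, each pointing at a target such as pid:[4026531836]. Two processes that share a namespace show the same number. The helper names below are hypothetical, and the code simply returns an empty result on systems without /proc.
```python
import os
import re
NS_DIR = "/proc/self/ns"
def parse_ns_link(link):
    """Split a link target such as 'pid:[4026531836]' into ('pid', 4026531836)."""
    match = re.fullmatch(r"(\w+):\[(\d+)\]", link)
    if match is None:
        raise ValueError(f"unexpected namespace link: {link!r}")
    return match.group(1), int(match.group(2))
def current_namespaces(ns_dir=NS_DIR):
    """Map each entry in /proc/self/ns to its namespace number; empty dict if unavailable."""
    result = {}
    if not os.path.isdir(ns_dir):
        return result
    for entry in sorted(os.listdir(ns_dir)):
        try:
            _kind, inode = parse_ns_link(os.readlink(os.path.join(ns_dir, entry)))
        except (OSError, ValueError):
            continue
        result[entry] = inode
    return result
if __name__ == "__main__":
    for name, inode in current_namespaces().items():
        print(name, inode)
```
Running this once on the host and once inside a container would typically print different numbers for entries such as pid, mnt and net, which is the compartmentalization Matt is describing.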

97
00:31:23.960 --> 00:32:14.300
<v Matt Godbolt>And you can see how you might build like a duplicate operating system image in there and then run it. But it's running really on the main operating system. And that has a really interesting side effect. The kernel calls that you are making are going straight to the host operating system's kernel. There is no kernel that you are running inside your Docker container. So if you're running on, um, kernel version five point star, um, and there's some whizbang new feature that's in kernel version six and above, and you've got a Docker image that's Ubuntu 24, whatever, that wants to use that, it ain't gonna work. No amount of Docker magic will make new features appear in your running kernel. Virtualization, on the other hand, takes this down to the hardware level and pretending effectively like you've, oh God.
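
NOTE
Editor's sketch, not from the recording: because a container shares the host's kernel, asking for the kernel release inside the container reports the host's version, whatever distribution the image is built from. The helper names and the example release strings are illustrative assumptions.
```python
import platform
import re
def parse_kernel_release(release):
    """Turn a release string such as '5.15.0-91-generic' into (5, 15)."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))
def kernel_at_least(required, release=None):
    """Check whether a kernel release meets a (major, minor) minimum.
    With no release given, the running kernel is checked; inside a container that is the host's kernel."""
    if release is None:
        release = platform.release()
    return parse_kernel_release(release) >= tuple(required)
if __name__ == "__main__":
    print("running kernel:", platform.release())
    print("has a 6.0+ kernel:", kernel_at_least((6, 0)))
```
An application that depends on a newer kernel feature could run a check like this at startup and fail with a clear message, rather than discovering the mismatch later.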

98
00:32:14.300 --> 00:32:45.400
<v Matt Godbolt>Now the distractions are a cat hitting the microphone. Uh, at the virtualization level, you are pretending that you have, uh, a CPU and resources, network resources and hardware resources that don't actually exist. And then a full-on kernel boots up in that world. And as far as that kernel is concerned, with a few caveats, it thinks that it's running on a real computer, but it's actually running on a simulation of a computer that's running on the real computer. Now it's.

99
00:32:45.400 --> 00:32:46.960
<v Ben Rady>Kinda like how we're all living in a simulation.

100
00:32:46.960 --> 00:33:04.780
<v Matt Godbolt>We are all living in a simulation, which explains an awful lot. Yes. But yeah, we're all living in some kind of the matrix and all we're doing is we're putting another matrix in our matrix so that we can run, yes, uh, another copy inside of that. So as far as that virtual machine is concerned, it is a full sovereign computer in its own right.

101
00:33:04.780 --> 00:34:00.500
<v Matt Godbolt>And it can do anything it likes, unaware that when it says, hey, oh, I've got a network device over here, what's really happening is that some kind of, um, trap is happening in the CPU when it's accessing or trying to access that device, and an operating system, one layer up in the list of matrices <laugh>, um, goes, oh, wait a second. And much like when, um, a regular operating system misses a page and has to swap it in from disc, and like the process is put to sleep while the image is read, and then it kind of goes, oh, yeah, the memory's there now. The same thing happens at one layer above in what's called the hypervisor, which is like the operating system running the show for all of the operating systems underneath it. And so that hypervisor can arbitrate access to the real network cards and the real physical block devices, like the hard discs that are in the machine, uh, that you're emulating.

102
00:34:00.500 --> 00:34:55.240
<v Matt Godbolt>Um, and then when you say emulation, you think it's gonna be super slow. And in fact, you know, you could obviously write a genuine emulator and then you could, you know, pretend to be an ARM machine when you're running on an x86 or whatever. What typically happens is that, um, these are hardware accelerated. The CPU knows, in quotes, that there are layers and rungs of the hierarchy of, uh, simulation environments. And, um, it gives the hypervisor more privileges than the, um, operating system underneath. And in fact, mostly nowadays, um, the guest operating systems, as they're called, are in cahoots with the virtualization layer. They actually do know that they are living in a simulation, and that allows certain things to be a lot faster. So instead of actually having to emulate a real network card, with this sort of two-way back and forth between the hypervisor and the underlying, uh, operating system, there can be some kind of agreed thing of like, hey, I'd like to talk to the network card.

103
00:34:55.240 --> 00:36:06.140
<v Matt Godbolt>I'm just gonna put all the data I would like you to look at over here. And then, hey, hypervisor, imagine that, uh, you did the, whatever the network card thing gives you. There's a certain amount of collaboration. I'm making that up, in full disclosure. But yeah, what that means is that when you go to your, uh, Amazon account and say, I'd like a new computer, please, that computer is not a real computer. It is just a virtual computer running on someone else's infrastructure. And you get a certain number of CPUs and a certain number of disc I/Os per second, and all that good stuff. And this then comes back to the VMware thing that you were saying in the beginning. This is why infrastructure folks love it, because I can buy two 128-core, uh, terabyte-RAM machines. And then I can hand them out to as many developers as I'd like in, like, two or three or four CPU slices, which I can't even buy. I can't buy a two-CPU computer anymore. And they get to share it, and they all have root on their machine, and there's no way they can bust out of their virtualization environment to get to the hypervisor. And then they can, like, blue screen, their kernel can panic, the whole thing can go down. It's exactly like a normal computer, except that really it's just one tenth of the physical machine you're running on.

104
00:36:06.140 --> 00:36:23.120
<v Ben Rady>Right. Right. So when the annoying developer tells you that they need a server to run their app and you ask what the app is, and they're like, well, this is a Node.js app that runs in one thread. You're like, there's no way on the planet I'm giving you a $10,000 server to run a single threaded Node.js app. So I'm just gonna give you this one little slice.

105
00:36:23.120 --> 00:36:47.000
<v Matt Godbolt>And you think it's a server and it has its own operating system, which means obviously there is a, you know, your storage requirements, both in terms of memory and in terms of disc space go up. Because, you know, like there is a real honest to God, Linux kernel running there and probably on the sibling CPU, like literally on the die, you know, two millimeters away from you is another CPU running someone else's Linux kernel. Right. And never the twain shall talk to each other.

106
00:36:47.000 --> 00:36:49.980
<v Ben Rady>Right. Rowhammer issues and other things aside

107
00:36:49.980 --> 00:36:52.920
<v Matt Godbolt>Yeah, don't, don't gimme an in to talk about that kind of stuff.

108
00:36:52.920 --> 00:36:54.010
<v Ben Rady><laugh>

109
00:36:54.010 --> 00:37:00.680
<v Matt Godbolt>Right. You know, so actually, yeah, right. We're gonna have to now because you poked my buttons.

110
00:37:00.680 --> 00:37:01.960
<v Ben Rady><laugh> Row hammered them!

111
00:37:01.960 --> 00:37:55.420
<v Matt Godbolt>Not Rowhammer, but that's definitely one for another conversation. But what, um, what a reasonable person might do, given what I just said, is say, well, the hypervisor is sat there not doing very much. Doesn't need any CPU resources most of the time, cuz it's reactive to the host operating systems that are really running on the CPUs. Right. But we could potentially say, well, let's give one or so CPU to the hypervisor itself. And it can do some background maintenance activities. What if it scanned through all of the physical memory of the computer and went, wait a second, I've seen this 4k page before. Right? I've got every single one of my 60 guest operating systems, they've all loaded up variants of the same Linux operating system. Why the hell would I have the same 4k pages? You know, like many, many, many 4k pages that are exactly the same.

112
00:37:55.420 --> 00:38:43.860
<v Matt Godbolt>Cause they all loaded like, you know, vmlinuz 4.5.29, whatever. Why don't I just point them all at the same actual physical location and then discard the copies of it? But, like, pretend to all of the individual guest operating systems that they have their own copy, and then it's just copy-on-write. If they try to write to it, then they get their own copy, a bit like, you know, when you fork a process on a single operating system. The same tricks happen. Makes perfect sense. Now obviously you have to do it retroactively. When you fork, you know that every page that you currently have is gonna be shared in the child process, but this is a sort of emergent property of, once you've booted a machine up, eventually some pages will be the same on one machine as they are on another, in which case you deduplicate them. And then you're right, you've got more free memory for the system as a whole. And it seems like there could be nothing wrong with that until the security people come along. <laugh>
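
NOTE
Editor's sketch, not from the recording: a toy model of the page deduplication with copy-on-write that Matt describes. Identical 4k pages from different guests share one stored frame, and a write gives the writing guest its own copy. The class and method names are hypothetical, and keying frames by a SHA-256 digest is a simplification; a real hypervisor works on physical memory and page tables, scans after the fact, and compares page contents rather than trusting a hash.
```python
import hashlib
PAGE_SIZE = 4096
class DedupMemory:
    """Toy model: guests map (guest, page number) onto shared, content-addressed frames."""
    def __init__(self):
        self.frames = {}    # digest: [page bytes, reference count]
        self.mappings = {}  # (guest, page_no): digest
    def load(self, guest, page_no, data):
        """Map a guest page to a frame, sharing an existing frame if the contents match."""
        self._release(guest, page_no)
        digest = hashlib.sha256(data).hexdigest()
        frame = self.frames.setdefault(digest, [data, 0])
        frame[1] += 1
        self.mappings[(guest, page_no)] = digest
    def read(self, guest, page_no):
        """Return the bytes the guest sees for this page."""
        return self.frames[self.mappings[(guest, page_no)]][0]
    def write(self, guest, page_no, offset, payload):
        """Copy-on-write: the writer gets a modified private copy, other guests are untouched."""
        if 0 > offset or offset + len(payload) > PAGE_SIZE:
            raise ValueError("write falls outside the page")
        page = bytearray(self.read(guest, page_no))
        page[offset:offset + len(payload)] = payload
        self.load(guest, page_no, bytes(page))
    def _release(self, guest, page_no):
        """Drop a guest's reference to its current frame, freeing the frame if unused."""
        digest = self.mappings.pop((guest, page_no), None)
        if digest is None:
            return
        frame = self.frames[digest]
        frame[1] -= 1
        if frame[1] == 0:
            del self.frames[digest]
    def physical_frames(self):
        """Number of distinct frames actually stored."""
        return len(self.frames)
```
With 60 guests all loading the same page, physical_frames() stays at 1 until one of them writes, at which point it becomes 2 while the other 59 guests still read the original bytes.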

113
00:38:43.860 --> 00:38:46.800
<v Ben Rady>Yes. And ruin everyone's day

114
00:38:46.800 --> 00:38:49.580
<v Matt Godbolt>And ruin everyone's day, exactly, exactly!

115
00:38:49.580 --> 00:39:51.320
<v Matt Godbolt>Uh, so it was shown that, and maybe I won't go into too much detail for two reasons. One, I don't necessarily know the details. And two, we've probably talked too much about this already. Um, it was shown that if you have the same implementation of OpenSSL or one of the other cryptographic libraries as a co-located virtual machine to you, so I'm gonna just go to Amazon and I'm gonna ask for a hundred EC2 instances, and then I'm gonna run a test to see if I can find that I'm co-located with my target, just by coincidence. I happen to be running on a machine that also has an SSL process somewhere in it, all right? The chances are that obviously those 4k pages will be de-duplicated cuz it's the same .so that we've both got, it's OpenSSL, Ubuntu, whatever version, right. Now I can start doing timing attacks because I know my physical RAM is associated with the same physical RAM that they have. And so if I know which code paths are taken in their code, I can poke around in my cache and sort of try and determine whether.

116
00:39:51.320 --> 00:39:52.600
<v Ben Rady>Oh, and start getting the keys, basically.

117
00:39:52.600 --> 00:39:53.440
<v Matt Godbolt>Exactly.

118
00:39:53.440 --> 00:39:57.480
<v Ben Rady>I got this byte of the key and I got that byte of the key and I don't have it all yet, but that's close enough.

119
00:39:57.480 --> 00:40:20.300
<v Matt Godbolt>It took a long while to read this bit out, cause it must've been in L3, but if it wasn't, then I know it must be in someone's L2 somewhere, and that someone might be, and all these kinds of things. And you could imagine how terrifying that is from a point of view of security. You're like, you've lost the isolation between the virtual machines that aren't even meant to know that their siblings exist. So that's your own fault, Ben <laugh>.

120
00:40:20.300 --> 00:40:21.120
<v Ben Rady>Worth it.

121
00:40:21.120 --> 00:40:23.360
<v Matt Godbolt>We can talk about Rowhammer, another time

122
00:40:23.360 --> 00:40:24.740
<v Ben Rady>Worth it, worth it.

123
00:40:24.740 --> 00:40:38.720
<v Matt Godbolt>So in terms of deployment though, I mean, you sort of alluded to that by saying that, like, as a developer, it's convenient to be able to go to your infrastructure folks and say, can I just have a server to run my little Node.js app?

124
00:40:38.720 --> 00:40:50.000
<v Ben Rady>Or not even talk to them and just, like, run a script that generates one for you, and they keep tabs on it and they know who it's allocated to, and they can call you up and say, hey, you're using 35 servers. Do you really need them? But you know,

125
00:40:50.000 --> 00:40:50.740
<v Matt Godbolt>that's very true.

126
00:40:50.740 --> 00:40:53.000
<v Ben Rady>automate those things. Right. And it's really great when you do.

127
00:40:53.000 --> 00:41:33.740
<v Matt Godbolt>That's very true. I mean, I forget of course that this is what Terraform and the like do for me in Amazon. Right. I just say, hey, I want another computer, and another computer appears. It has never occurred to me that really somewhere behind the scenes all this magic is going on to make that happen, but you know, it just does. Right. And yeah, that puts a lot of power and responsibility, but a lot of power, into developers' hands. You don't have to, like, overload a machine, and you get the isolation that, say, a Docker container would give you, but at a much deeper level. Now, different problems again. Right. You know, at least in your own server, if it's running as root, well, it's only running as root because you made it run as root. Right. <laugh> You've got every choice you like. Yeah.

128
00:41:33.740 --> 00:41:35.120
<v Ben Rady>Yeah.

129
00:41:35.120 --> 00:41:42.120
<v Matt Godbolt>So what do we think about that in terms of like the, the trade offs? What would, what would make you choose one method over another?

130
00:41:42.120 --> 00:42:25.440
<v Ben Rady>I mean, I, I, I tend to lean more toward, you know, having virtual machines and, you know, having, uh, more of like the I'm gonna get this virtual machine. I, I will probably build some very lightweight automation to set it up. But again, the setup of it is mostly just, you know, kind of like you were saying, the apt update apt upgrade, you know, maybe install one or two system packages, but hopefully not if I can avoid it and then just run all my applications as a user, as an unprivileged user. And you know, every version is a new tarball that gets copied up to the computer or maybe have some automated thing that pulls 'em down from a central repository.

131
00:42:25.440 --> 00:42:29.070
<v Matt Godbolt>You've got like a deployment thing that, you've got like a, is it git-deploy?

132
00:42:29.070 --> 00:42:30.740
<v Ben Rady>Oh, git-deploy. Yeah.

133
00:42:30.740 --> 00:42:32.080
<v Matt Godbolt>Is that open? That is open source, right?

134
00:42:32.080 --> 00:43:10.260
<v Ben Rady>That is open source. Yeah. So git-deploy is sort of my Heroku style deployment script that I made, um, that will let you take any server that you have SSH access to, uh, and, um, basically push to it as if it was a git repository. And as a side effect of that, if the push works, that is, your code is not out of sync with everyone else that's deployed to it, it will deploy your application and start it up. And so you get to sort of use the git semantics around push and pull as your mechanism to make sure that you don't accidentally clobber someone else's deployment.
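
NOTE
Editor's sketch, not from the recording and not how git-deploy itself is implemented: a toy model of the git push semantics Ben is relying on. A push is accepted only if it is a fast-forward, meaning the commit already deployed is an ancestor of the commit being pushed. The commit graph is a plain dictionary of hypothetical commit names, and the function names are made up for illustration.
```python
def is_ancestor(parents, ancestor, commit):
    """Return True if `ancestor` is reachable from `commit` by following parent links."""
    stack, seen = [commit], set()
    while stack:
        current = stack.pop()
        if current == ancestor:
            return True
        if current in seen:
            continue
        seen.add(current)
        stack.extend(parents.get(current, ()))
    return False
def accept_push(parents, deployed, pushed):
    """Accept a push only when it builds on what is already deployed (a fast-forward)."""
    if deployed is not None and not is_ancestor(parents, deployed, pushed):
        raise ValueError(f"{pushed} does not contain deployed commit {deployed}; pull first")
    return pushed
if __name__ == "__main__":
    # a is the root; b and c build on each other; d branched from a without b.
    history = {"b": ["a"], "c": ["b"], "d": ["a"]}
    print(accept_push(history, "b", "c"))
```
In this model, pushing d while b is deployed is rejected, which is the "don't clobber someone else's deployment" property: the second deployer has to pull and integrate first.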

135
00:43:10.260 --> 00:43:10.580
<v Matt Godbolt>I see.

136
00:43:10.580 --> 00:43:19.700
<v Ben Rady>Right? Um, and it's, so it's sort of a safer way to be able to empower people to deploy locally from their machines, if that makes sense to do. Now, sometimes that doesn't make sense to do.

137
00:43:19.700 --> 00:43:21.480
<v Matt Godbolt>It doesn't always make sense. Right. Yeah.

138
00:43:21.480 --> 00:43:31.700
<v Ben Rady>But in fact it sort of usually doesn't make sense, but sometimes it makes a ton of sense. And it's really nice to be able to do that in a way that is safer than just, you know, scp <laugh>. Right.

139
00:43:31.700 --> 00:43:39.590
<v Matt Godbolt>Right. But I mean, often, you know, there, there are, there are also places where, or times when you want to be able to push to like a development machine.

140
00:43:39.590 --> 00:43:40.260
<v Ben Rady>Oh absolutely, yeah.

141
00:43:40.260 --> 00:44:04.080
<v Matt Godbolt>A development cluster. And that seems like a good thing there, where I would actually want the feature: I have code on my machine that I want to have running in an environment that I can't reproduce myself locally. It's not ideal to be in a situation where you can't quite reproduce it locally, but sometimes, you know, I wanna batter it with 200, um, machines that are gonna send queries to it. And so I wanna deploy my version that has my fix or whatever.

142
00:44:04.080 --> 00:44:38.070
<v Ben Rady>Yep. Yep. And I mean, you know, speaking of virtualization, like, you can take these things a lot further. And one of the things that I've been playing around with on one of my projects is sort of getting rid of the idea of the production environment. So all of the environments in this project that I'm working on are just branches. There's the main environment for the main branch. And that's where the DNS entry for the top level domain points to. But if you make a new branch, it will automatically spin up a new environment and it will marshal all the services that that environment needs.

143
00:44:38.070 --> 00:44:38.180
<v Matt Godbolt>Ooooo.

144
00:44:38.180 --> 00:44:58.440
<v Ben Rady>And it will do everything that it does. And so if you want to make a change that involves potentially making changes to the infrastructure like, oh, I'm gonna change a security group, or I'm gonna change, you know, the number of servers from, from four to five or whatever it might be. You just create a new branch, you push that branch to GitHub and the infrastructure magically appears.

145
00:44:58.440 --> 00:44:58.740
<v Matt Godbolt>That is awesome.

146
00:44:58.740 --> 00:45:06.680
<v Ben Rady>And the name of that infrastructure is literally the name of the branch. So they're tied together in that way. And when you delete the branch, the infrastructure gets torn down.
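
NOTE
Editor's sketch, not from the recording: the "a branch is an environment" idea can be read as a reconciliation step that compares the branches that exist with the environments that exist. How Ben's project actually provisions infrastructure is not described, so the function and the branch names below are hypothetical.
```python
def reconcile(branches, environments):
    """Return (to_create, to_destroy) so that environments end up matching branches one-to-one."""
    wanted, existing = set(branches), set(environments)
    return sorted(wanted - existing), sorted(existing - wanted)
if __name__ == "__main__":
    create, destroy = reconcile({"main", "fix-login"}, {"main", "old-experiment"})
    print("create:", create)
    print("destroy:", destroy)
```
Under this model, deleting a branch simply makes its environment show up in the destroy list on the next run, and that applies to main as much as to any other branch.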

147
00:45:06.680 --> 00:45:07.860
<v Matt Godbolt>That's super cool.

148
00:45:07.860 --> 00:45:18.340
<v Ben Rady>So, the main branch is always there. That's sort of the quote-unquote production environment. Um, but if you were to ever delete the main branch, it would also actually tear down the, the, the <laugh> production environment, which like...

149
00:45:18.340 --> 00:45:19.440
<v Matt Godbolt>I mean that's probably what you want though,

150
00:45:19.440 --> 00:45:22.370
<v Ben Rady>You know, it's like, it's sort of a weird thing, but it's like, it's like,

151
00:45:22.370 --> 00:45:22.720
<v Matt Godbolt>No, I like it.

152
00:45:22.720 --> 00:45:37.260
<v Ben Rady>coupling those two things together very tightly and saying a branch is an environment. There's no such thing as the dev or the test or the UAT or the production, they're just names of branches. Um, and that is only possible because of virtualization. You couldn't do that any other way.

153
00:45:37.260 --> 00:46:36.460
<v Matt Godbolt>Way on a real machine. No. Well, yeah, for all the reasons, I mean, cost was what I was about to bring up because you know, that, um, I, I'm sort of trying to move Compiler Explorer towards a system, which is a tiny bit more like that where instead of the staging environment that we do testing being kind of like just a subcategory under the production environment, it's like its own AWS account effectively. And then I can do the kind of things you're talking about like, Hey, let's have a new, um, load balancer. Let's try out a different way of doing everything in the staging environment. Uh, but for me that's prohibitively expensive because those resources are not free and they're quite expensive. Like having one load balancer is expensive enough. Uh, and, and I can configure that one load balancer to kind of say, well, if it has slash staging in the thing, then goes to this, this subsection. Right. And that's how it works at the moment. But, um, so there's a trade off to be had there. And obviously, in, in a world of infinite resources, it's no problem that if you create 12 different branches, you've got 12 environments.

154
00:46:36.460 --> 00:46:54.260
<v Ben Rady>Right, right. Right. Well, one of my subtle motivations for doing this, and again, I'm trying this on my own project, but you know, maybe one day I'll get to do this in a, in a, a, a more, um, you know, widely shared, um, company environment is to directly manifest to the bean counters, the cost of so many different branches.

155
00:46:54.260 --> 00:46:55.060
<v Matt Godbolt>Amazing.

156
00:46:55.060 --> 00:47:03.440
<v Ben Rady>It's sort of like, yeah, you know, branches have a cost and it's hard for you to measure that cost. Cause it's mostly cognitive load on developers. What if we just turn that into dollars.

157
00:47:03.440 --> 00:47:04.120
<v Matt Godbolt>Actual dollars

158
00:47:04.120 --> 00:47:08.380
<v Ben Rady>And then you could measure them and be like, why then you'd have accountants, yelling, "Why do we have so many branches?"

159
00:47:08.380 --> 00:47:27.100
<v Matt Godbolt>As one of those folks that sends out the emails and the nags to people saying, like, hey, this PR's been open for three years. Is there any chance of it being closed? I'm totally down with that. Yeah. The cognitive load, when I hit auto complete in, uh, in the <laugh> for my branch name, I'm like, what the heck is this? This person left the company two years ago. Why is it still here?

160
00:47:27.100 --> 00:47:27.100
<v Ben Rady>Yeah. Why is this here?

161
00:47:27.100 --> 00:48:23.000
<v Matt Godbolt>Is this here? Those kinds of things. Yeah. Yeah. No, I like that approach. I like the idea of manifesting. And I think, you know, obviously you've talked about doing it in terms of virtualization. There are ways and means of doing it with the, uh, Docker style approach as well. And I know we're kind of getting close on the amount of time that we've got available, but, um, I'd like to sort of suggest, you know, there's the Kubernetes, um, type approach. There's, uh, Nomad, which is a HashiCorp system, which can be used to run Docker containers. And then there are definitely load balancer type things that can talk to those containers. And you could definitely have it so that every branch that you commit builds a Docker image and pushes it to a tag, a named tag for that branch, and then auto registers a container, a job running in Nomad, to say, hey, I'm the your-project, hyphen, your-branch-name environment that has all of these machines running.
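
NOTE
Editor's sketch, not from the recording: one way to derive a per-branch Docker image tag and a "project-hyphen-branch" job name. Docker tags may only contain letters, digits, underscores, periods and hyphens, may not start with a period or hyphen, and are limited to 128 characters, so branch names such as feature/new-lb need sanitizing. The function names, the fallback value and the replacement character are assumptions for illustration.
```python
import re
MAX_TAG_LENGTH = 128
def image_tag(branch):
    """Turn a git branch name into a string that is valid as a Docker image tag."""
    tag = re.sub(r"[^A-Za-z0-9_.-]", "-", branch).lstrip(".-")[:MAX_TAG_LENGTH]
    # Hypothetical fallback for branch names that sanitize down to nothing.
    return tag or "unnamed"
def job_name(project, branch):
    """Build the 'your project, hyphen, your branch name' style job name."""
    return f"{project}-{image_tag(branch)}"
if __name__ == "__main__":
    print(image_tag("feature/new-lb"))
    print(job_name("myproject", "feature/new-lb"))
```
Note that sanitizing can make two different branches collide (feature/x and feature-x map to the same tag), so a real setup would likely want to detect or forbid that.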

162
00:48:23.000 --> 00:49:17.720
<v Matt Godbolt>So it can be done too. You are still paying the cost. There are processes running in one operating system or one set of operating systems that is the Nomad cluster or the Kubernetes cluster or whatever. It's a sort of lower, um, I guess it's higher level. I dunno how you describe it, where it's cutting, you know, is it lower level or higher level, um, than virtualization? It's definitely higher level, like, measured on the axis that makes sense to me right now. But, uh, yeah, so you can achieve it using that too, which is great. Um, and I mean, I think what all of this kind of comes down to, what we're talking about in this instance, is infrastructure as code, however it's achieved, be it VMs or Nomad machines running Docker stuff. And so we've kind of strayed from the original point about, like, how does one do deployment? How does one use virtualization? What different things are available? But...

163
00:49:17.720 --> 00:49:18.940
<v Ben Rady>Yeah, they're all related though.

164
00:49:18.940 --> 00:49:57.520
<v Matt Godbolt>They are, I guess, related. Yeah. Yeah, yeah. And that infrastructure as code thing is super important, to be able to say, like, yeah, no one has to rack a machine. No one has to, um, physically move any cables around when I stand up this instance here. Um, and that instance is defined by a piece of code, or a configuration file that's generated by code, or just a configuration file that a human edits, which you know, is a fabulous way of tracking. I mean, we are all, especially software engineers, we know where we stand with source control and CI and things like that. So having the, the physical world work that way too, and be able to roll back and all that kind of stuff, is super cool, however it's achieved.

165
00:49:57.520 --> 00:49:58.500
<v Ben Rady>Yeah.

166
00:49:58.500 --> 00:50:05.800
<v Matt Godbolt>All right. Well, this is, this has been a fun exposition of what on earth can we remember about how all this stuff fits together?

167
00:50:05.800 --> 00:50:11.500
<v Ben Rady><laugh> yeah. I feel like we only really like touched on it. Like there's you could probably do a whole other hour on these topics, like you and I talking about Kubernetes.

168
00:50:11.500 --> 00:50:28.060
<v Matt Godbolt>Absolutely. Yeah. I mean, I don't know enough about Kubernetes. I mean, I use Borg at Google, which I, I believe to be related in some way, but I don't know either. I remember them trying to pretend that it wasn't called that. And they used to pretend that it was Anita Borg. It was named after not clearly the, the, the evil people in, in Star Trek.

169
00:50:28.060 --> 00:50:28.840
<v Ben Rady>Right.

170
00:50:28.840 --> 00:51:20.540
<v Matt Godbolt>Uh, yeah. And I think it leaked out because, um, they weren't laundering their, uh, referrers. People were running internal services from machines, and then people would link to, I know not YouTube videos, 'cause that would be Google too, but, you know, link to other people's websites. And then someone went through, like... All the machines had, like, DNS names that uniquely referred to the job that was running on it. Yeah. Which is super convenient for everything. You know, you wanna hit your job and it's running a web server, then you just go to that long name and it hits the machine, and the machine then looks at the, uh, the name you gave it, and then it redirects it to the correct port for that particular instance that you were running. And then there you are. There's your job running and you can look at it. Um, but obviously if you then have a webpage on there that has a, hey, click the cat animation, and you click the cat animation, then you've leaked, you know, twelve.seven.borg.google.dns or whatever.

171
00:51:20.540 --> 00:51:21.990
<v Ben Rady>Right, right, right, right.

172
00:51:21.990 --> 00:51:23.400
<v Matt Godbolt>Oops.

173
00:51:23.400 --> 00:51:24.220
<v Ben Rady>Referers. Referer headers, man. Yeah.

174
00:51:24.220 --> 00:51:33.460
<v Matt Godbolt>Yeah, yeah. All right. We should stop talking. We should stop talking. We've plenty of things to audit, uh, to edit <laugh> that include...

175
00:51:33.460 --> 00:51:34.080
<v Ben Rady>And audit.

176
00:51:34.080 --> 00:51:35.900
<v Matt Godbolt>And audit.

177
00:51:35.900 --> 00:51:37.920
<v Ben Rady><laugh> We can't let the Borg stuff leak out.

178
00:51:37.920 --> 00:51:41.430
<v Matt Godbolt>That's true. That ship has sailed.

179
00:51:41.430 --> 00:51:42.000
<v Ben Rady>Cool.

180
00:51:42.000 --> 00:51:44.640
<v Matt Godbolt>All right. I'll see ya next time.

181
00:51:44.640 --> 00:51:46.640
<v Ben Rady>Bye.

