Building Distributed Systems at Scale - EMx 219
Brent Anderson is a software engineer at Knock. He builds high-scale messaging systems in Elixir. He joins the show to talk about his article, "Using our One and Done library to power idempotent API requests". He begins by explaining the idea behind creating the library and the importance of idempotency.
Special Guests:
Brent Anderson
Show Notes
Brent Anderson is a software engineer at Knock. He builds high-scale messaging systems in Elixir. He joins the show to talk about his article, "Using our One and Done library to power idempotent API requests". He begins by explaining the idea behind creating the library and the importance of idempotency.
Sponsors
Links
- https://knock.app/blog/using-one-and-done-to-power-idempotency
- https://en.wikipedia.org/wiki/Thundering_herd_problem
- https://hex.pm/packages/socket_drano
- https://blog.heroku.com/erlang-in-anger
- http://www.erlang-in-anger.com/
Getting in Touch
- @bja@social.brentjanderson.com
- brant@knock.app
Picks
- Adi - Job
- Allen - Prey
- Brent - Dendron
- Brent - RPG in a Box
- Brent - e-bike
- Sascha - Bullet Journaling
Transcript
SASCHA_WOLF:
Hey everybody, welcome to another episode of Elixir Mix. This week on the panel, we have Allen Wyma,
ALLEN_WYMA:
Hello.
SASCHA_WOLF:
Adi Iyengar, and me, Sascha Wolf, and we have a special guest, and that is Brent Anderson this week. So Brent, why don't you tell the audience why you are here, why we invited you, and what's the lovely thing you're doing every day?
BRENT_ANDERSON:
Yeah, my name is Brent Anderson and I work as a software engineer at Knock. We make notification infrastructure easy for companies that are building products that need better notifications. And we recently released a library called One and Done that is a tool for handling idempotency. So, uh, they wanted me to come on and talk about idempotency and how it works and some of the different ways you can implement it.
SASCHA_WOLF:
By the way, I love the name. I would assume that our listeners know, but let's give them the benefit of the doubt. What is idempotency, Brent?
BRENT_ANDERSON:
Yeah, so idempotency is a big word that I think comes originally from mathematics. And the idea is that when you do something that is idempotent, like multiplying by the number one, it doesn't matter how many times you do it, you'll always get the same result
SASCHA_WOLF:
Mm-hmm.
BRENT_ANDERSON:
and in software and distributed systems, idempotency is useful because sometimes you don't know if an operation that you performed was successful; you don't get a result back that confirms or denies whether it worked. But you don't want to retry it if retrying creates a duplicate. Say you're charging a credit card: you don't want to charge it twice, you want to make sure it only goes through once. So idempotency is important when you want to make sure that something happens exactly once, even in the presence of potential failure.
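(To make the retry scenario concrete, here is a minimal Elixir sketch of the idea, not Knock's implementation: the caller picks an idempotency key once, and because the store remembers the first result, retrying after a timeout can never produce a second charge. The `Payments` module and its Agent-backed store are invented for illustration.)

```elixir
defmodule Payments do
  # Hypothetical in-memory store of already-processed keys; a real
  # system would persist this (Redis, SQL, etc., as discussed below).
  def start_link, do: Agent.start_link(fn -> %{} end, name: __MODULE__)

  # Idempotent charge: the same key always returns the first receipt,
  # no matter how many times the caller retries after a timeout.
  def charge(idempotency_key, amount_cents) do
    Agent.get_and_update(__MODULE__, fn charges ->
      case charges do
        %{^idempotency_key => receipt} ->
          {receipt, charges}

        _ ->
          receipt = %{id: System.unique_integer([:positive]), amount: amount_cents}
          {receipt, Map.put(charges, idempotency_key, receipt)}
      end
    end)
  end
end
```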
SASCHA_WOLF:
Yeah, and I think that the monetary example is like the classic one, right? Like that you don't want to charge people twice. That's going to get you angry customers.
BRENT_ANDERSON:
Yeah, I think that Stripe's docs actually have some really good comments on idempotency and how they implement it for their particular use case. Um, one of the things that was interesting in developing a library for idempotency at Knock is that there are some general principles. I think there's even an IETF draft, like an Internet Engineering Task Force
SASCHA_WOLF:
Mm-hmm.
BRENT_ANDERSON:
draft on how to set up idempotency headers. Um, and, uh, we released the One and Done library in late March or April of 2023. And shortly after that, there was actually a separate plug released on Hex for doing the same type of thing. And what's interesting is seeing the way that different libraries or different companies approach it, because there are some pieces that need to be application specific. And that's one of the things we leaned into with One and Done: making sure that applications got a default experience out of the box that was good, but that was something they could configure for their specific situation. So Stripe, for instance, because of the idempotency guarantees that they need to make (if I remember correctly, it's not in their docs, but there's a blog post of theirs), I want to say that they actually persist their idempotency keys and responses in a SQL database, like in a transactional database, so that they are able to make pretty strong guarantees around how their idempotency responses work. A lot of applications don't need that level of guarantee, and so you can use something like Redis or ETS or other persistence layers. But there's a lot of nuance that can fall into idempotency depending on your application's needs for some of those reasons.
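(For readers who want the Stripe-style strong guarantee Brent describes, a common pattern is a transactional store with a unique index on the key. This is a hedged sketch with an invented Ecto schema, not One and Done's actual storage adapter; `key` is the caller-supplied idempotency key.)

```elixir
defmodule MyApp.IdempotencyKey do
  use Ecto.Schema

  # A unique index on :key (added in a migration) makes the database
  # the arbiter of which of two racing requests claimed the key first.
  schema "idempotency_keys" do
    field :key, :string
    field :response_body, :binary
    timestamps()
  end
end

# Atomically claim the key; with `on_conflict: :nothing`, a losing
# insert comes back without an id, signalling a duplicate.
case MyApp.Repo.insert(%MyApp.IdempotencyKey{key: key},
       on_conflict: :nothing,
       conflict_target: :key
     ) do
  {:ok, %{id: nil}} -> :duplicate   # someone already claimed this key
  {:ok, _record}    -> :claimed     # we own it; process the request
end
```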
SASCHA_WOLF:
I also think that it's worthwhile to point out that not everything needs to be idempotent, right? Because it's something I've seen, especially with newer software engineers: they learn about this principle and they're like, oh, everything should be idempotent. But I mean, I guess, for example, in your specific use case, where you handle notifications, I would assume that it's not necessarily the end of the world if a notification goes out twice to a specific user, right? Or maybe it's different for you.
BRENT_ANDERSON:
Yeah, you know, it kind of depends. It's something that we need to leave up to our customers. And so if they don't pass an idempotency key, then, like, we're not going to worry
SASCHA_WOLF:
Mhm, mhm.
BRENT_ANDERSON:
about it, right? So you don't have to use this feature, certainly at Knock or anywhere, even at Stripe actually, although I think Stripe's SDK does use idempotency by default. The important thing, I think, is, you're right that you don't need idempotency on every endpoint, so you should be selective about where you apply it in your application. And Plug and Phoenix make it pretty easy to specify that just using simple routing pipelines, so that you can say, oh, I want to introduce idempotency or whatever plug on this particular slice of my router, but not all of it. We at Knock rolled out workflow triggers as our first idempotent endpoint. And we have a request now to add it to another endpoint, and it's literally just a line or two of change in the router. There are some other changes we need to make to our SDKs and things to support it, but it's nice being able to just move one line of code into a different scope and you get this feature. And, yeah, one of the things that you mentioned as well, with notifications and whether or not you get duplicate notifications, kind of depends on the application. So we kind of leave that up to our customers.
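(For illustration, here is roughly what that one-line router change looks like in a Phoenix router. The `OneAndDone.Plug` option shown, `cache:`, is an assumption based on this conversation; check the library's docs for its real configuration.)

```elixir
pipeline :api do
  plug :accepts, ["json"]
end

# Only routes piped through :idempotent pay the idempotency cost.
pipeline :idempotent do
  plug OneAndDone.Plug, cache: MyApp.Cache  # option names assumed
end

scope "/v1", MyAppWeb do
  pipe_through [:api, :idempotent]
  post "/workflows/:key/trigger", WorkflowController, :trigger
end

scope "/v1", MyAppWeb do
  pipe_through :api
  get "/messages", MessageController, :index
end
```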
SASCHA_WOLF:
That makes sense. Yeah, I guess if you send a notification like, hey, there's a new message from a friend, if that happens twice, okay. But if there's a notification, hey, I don't know, like, you're... Good gosh, I don't have a good example right now.
BRENT_ANDERSON:
I mean, think about something in billing, or if it's something that's customer-facing involving money, I can imagine that being a little bit more sensitive for certain people.

SASCHA_WOLF:
Yeah, I feel like it makes sense. Yeah.
BRENT_ANDERSON:
You don't necessarily want to tell someone, oh, you were charged some amount, you know, and if that notification goes out twice, then that could be confusing.

SASCHA_WOLF:
That makes sense. Yeah, that makes sense. I feel it's interesting that there is an overlap with, for example, event handling, because in your particular use case, One and Done is more of an API kind of thing, right? Like you use it for building idempotent APIs. But the same principles apply, for example, in event-sourced systems, and that's something we've talked about a whole bunch on the show. Depending on what kind of event consumers you have, you also want to make sure that you only really process an event once, even if something bad happens, usually also in the case of monetary things or integrations with external APIs. So the pattern is honestly kind of the same, right? In the case of an event system, you will be saving, okay, this is the event ID and that is something already handled, for example. So if that gets delivered twice or even thrice or more, then I can safely ignore it. Which I feel is an interesting thing to point out: idempotency is not necessarily only relevant for building APIs, but it's a generally useful principle in applications, especially in distributed systems.
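(A minimal sketch of that consumer-side pattern, assuming events carry a unique id; the table name and event shape are invented for illustration.)

```elixir
defmodule EventConsumer do
  @table :handled_events

  def init do
    # A named set table gives O(1) membership checks by event id.
    :ets.new(@table, [:set, :named_table, :public])
  end

  def handle(%{id: id} = event) do
    # insert_new/2 returns false when the id is already present, so a
    # redelivered event is acknowledged but never processed twice.
    if :ets.insert_new(@table, {id}) do
      process(event)
    else
      :duplicate
    end
  end

  defp process(_event), do: :ok  # business logic goes here
end
```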
BRENT_ANDERSON:
Yeah, no, I think that you've got that 100%. When ingesting events in particular... I mean, we designed One and Done initially for the issue that we had at Knock, which was handling this in Plug applications. Anyone listening to this, if they decide that they like the approach that we're taking with One and Done: we did design it so that it could support other things, not just Plug, but
SASCHA_WOLF:
Mm-hmm.
BRENT_ANDERSON:
we had other HTTP routing layers in mind. You could take that same concept and extend it to handling events, make it so that there's just some middleware that you incorporate in a pipeline in Broadway or something for receiving events, and then choose what you're going to do about those events downstream, including just ignoring them, or handling whatever you need to do for your particular use case when it comes to handling events like that. Distributed systems, I think that's the funny thing about it, is that, at least for me, and maybe this is just me, a few years ago it seemed like the term distributed systems wasn't thrown around quite as much in general.
SASCHA_WOLF:
Mm-hmm.
BRENT_ANDERSON:
Maybe that's just my observation, but distributed systems are far more prevalent these days, especially as we continue to integrate with different endpoints and different services. Really, anybody that's integrating with an API provider, like just about any third-party platform that you may integrate with in your own stack, that kind of introduces some of these distributed systems problems. Even if you're not running some multi-cloud, multi-region Kubernetes cluster with distributed Erlang or something, distributed systems problems emerge at even simple system architectures. I think it becomes a question of scale as to when you start to notice those things, but it is something that I think you'll experience no matter what type of software you're building these days.
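(Sketching the Broadway idea Brent mentioned above: Broadway has no formal middleware API, so here the dedup check simply sits at the top of `handle_message/3`. The `already_handled?/1` and `mark_handled/1` helpers are stand-ins for whatever store you choose, and the producer configuration is omitted.)

```elixir
defmodule MyApp.EventPipeline do
  use Broadway  # start_link/producer configuration omitted for brevity

  alias Broadway.Message

  def handle_message(_processor, %Message{data: data} = message, _context) do
    event = Jason.decode!(data)

    if already_handled?(event["id"]) do
      # Returning the message unchanged still acks it upstream,
      # so the duplicate is dropped without side effects.
      message
    else
      mark_handled(event["id"])
      do_work(event)
      message
    end
  end

  # Stand-ins: back these with ETS, Redis, or your database.
  defp already_handled?(_id), do: false
  defp mark_handled(_id), do: :ok
  defp do_work(_event), do: :ok
end
```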
SASCHA_WOLF:
Yeah, I agree 100%, especially because nowadays there is, as you just said, a lot more integration with third-party providers. I don't know, 10 years ago or something, it might have still been reasonable to say, I built this one PHP application, it kind of handles everything, right? And that is good. Sends out emails, whatever. And nowadays, well, you may be integrating with platforms such as Knock or Braze or some payment provider, because the level of complexity you would have to handle in addition to your core business complexity increases by quite a lot if you want to do that yourself, also to a degree because of the level of quality users expect now. And of course, that is engineering time spent on those supporting subdomains when you could spend the same engineering time on, well, making your core product better. And at that point, as soon as you integrate with a software-as-a-service provider, it becomes a distributed system, as you just laid out. I could imagine that's the big reason why it's becoming so prevalent nowadays, because we are very much in this cloud-native era, as people like to call it. Right. I'm actually curious to hear, Adi, Allen, maybe you have a story about idempotency, because I feel everybody has a story about idempotency. I probably have to adjust this a bit.
ALLEN_WYMA:
Well, I just wanted to also say something about idempotency too, something that people may forget, right? We're talking about API-based idempotency, but there should also be some kind of idempotency protection in UIs, I think, to a certain extent. Because there are many times... like, we did a demo for a client, they had the app, they clicked the button, nothing happened, they clicked a couple more times, and then
SASCHA_WOLF:
Mm-mm.
ALLEN_WYMA:
they were actually making test trades, thank God, not real trades. And the trades finally, like four or five of them, came up at the same time, because we didn't actually protect against this problem and block the button out. So I guess that's one simple way of also saying that idempotency is not just about APIs only; it could also be used in other places too. Yeah, otherwise, on idempotency, I mean, I did have this: I was working on another client app, and I think there's a protection for downloading assets, I believe. You have to have a nonce. I forgot what it is exactly. You guys know what I'm talking about?
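(For Allen's double-click scenario, Phoenix LiveView ships a small guard for exactly this: `phx-disable-with` disables the button until the server acknowledges the event. A minimal sketch, with the module and event name invented.)

```elixir
defmodule TradeLive do
  use Phoenix.LiveView

  # phx-disable-with disables the button and swaps its label until the
  # server round-trip completes, so rapid clicks can't queue up four
  # or five duplicate trades.
  def render(assigns) do
    ~H"""
    <button phx-click="place_trade" phx-disable-with="Placing trade...">
      Place trade
    </button>
    """
  end

  def handle_event("place_trade", _params, socket) do
    # Place the (test!) trade exactly once per acknowledged click.
    {:noreply, socket}
  end
end
```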
SASCHA_WOLF:
I'm not entirely sure to be honest.
ALLEN_WYMA:
I'll look it up while we're chatting, but like...

BRENT_ANDERSON:
You talking like a cross-site request forgery token, that type of thing?

ALLEN_WYMA:
Not exactly like that one, but something similar. Let me look it up. I forgot what it is exactly. And I'll get back to you.

SASCHA_WOLF:
Adi, what is your idempotency story?
ADI_IYENGAR:
I mean, plenty. And like I said, I think it comes with the distributed, asynchronous nature of software; it becomes very important. Yeah, I mean, I've worked in fintech and, more recently, event-driven systems, and it becomes very important the more distributed and asynchronous you are. I am very curious to dig a little deeper into One and Done, because I'm really curious to see how especially the key is being stored and being used. Because we're talking about the idempotency system from a producer perspective or a client perspective, but there's also the server or consumer perspective, if you have distributed consumers as well. That's especially where I've noticed it starts to become really complex, because you want to make sure that multiple events of the same type, with the same idempotency keys, are not consumed by two different servers or consumers at the same time. But anyway, I'd love to learn a little bit more about how One and Done is designed and what kinds of things it takes care of, yeah.
BRENT_ANDERSON:
That's a great question. So, um, I think there are a few points there. The first is that every application is going to require its own approach to solving some of these problems, kind of like I mentioned earlier. I'm under the impression, and I could be wrong on this, but it wouldn't surprise me if there are other companies that do this too, that Stripe does use a strongly consistent data store for handling its idempotency keys. So something like Postgres that sits at the edge. And I could be wrong on this, but it would make sense to me, because if you have, let's say, 100 different edge servers that a request could come into, it'd be important to make sure that if two requests happen very quickly, one after another, you definitely don't end up processing both of those requests under any circumstances. And so you have to make certain trade-offs about how you handle that. And if you were using a strongly consistent data store for storing the keys that were used and the responses that you need to send back, that's its own thing to consider. One and Done starts with a set of basic assumptions out of the box. One of them is that you're using a caching layer that is similar to Nebulex, which is a pretty popular caching library. But you can adapt it to just about any caching platform that you want to use, whether that's Cachex, or if you wanted to write one yourself that uses Redis or Ecto or something; all of those are options that you could incorporate into a deployment of One and Done. And so when a request comes in... let's walk through this request lifecycle and maybe use that to illustrate some of the places where you can make some decisions. So a request comes in, and the first thing is that it needs to have an idempotency key set as an HTTP header. That's just the default convention that we go with in One and Done; a header is what the IETF draft recommends. It's just Idempotency-Key, and then you can set it to whatever value you want. Now, one of the things you noted there is that the idempotency key may need to be scoped to the user that's sending it, right? So at Knock, for instance, we have lots of different customers. Vercel uses us, we have a couple of other customers that use us, and we wouldn't want idempotent requests from one customer to conflict with those from another customer; that could be really bad. In some respects that could even produce a security issue, where if I put in a request and someone else is using the same idempotency key and I'm not properly partitioning these out, then you could end up getting data from someone else's requests, right? And that would be really bad, obviously. So at Knock, what we do is, there's a function you can override in One and Done when you set it up. And you can say, hey, when a request comes in, here is how I want to construct the cache key that you're going to use for loading this information, for going and checking to see if there's been something done with this idempotency key. And the cache key we store includes some of that information, basically some account information, with each of these idempotent requests. So when you make a request, first of all, we authorize it. So we make sure that this is even an acceptable request in the first place.
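(A sketch of the cache-key construction Brent is describing: account scoping plus method, path, and the header value. One and Done exposes this as an overridable function, but the function and field names here are invented for illustration.)

```elixir
# Scope the key to the authenticated account so two customers reusing
# the same Idempotency-Key value can never collide or leak responses.
def cache_key(conn) do
  # Assumes authentication already ran and the header is present.
  [key | _] = Plug.Conn.get_req_header(conn, "idempotency-key")

  Enum.join(
    [
      conn.assigns.current_account.id,  # account/environment partition
      conn.method,                      # e.g. "POST"
      conn.request_path,                # e.g. "/v1/workflows/foo/trigger"
      key                               # caller-chosen idempotency key
    ],
    ":"
  )
end
```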
And if it fails an authorization check, or an authentication check, then we don't even get to the One and Done part. But if it passes, then we say, okay, is this key in our cache? And the key is looking at which environment someone is using in their Knock account; basically, it's just that unique account-identifying information. And then there is the idempotency key in there as well for being able to do that lookup. And we also keep track of the path that you used. Right now at Knock, at the time of recording this, we only support the one endpoint for idempotency. But if we had, say, five or ten different endpoints, you'd want to be able to potentially reuse that same idempotency key for different operations. And so the combination of the account, the idempotency key, the request method that you used, and the path, the HTTP path that you passed in, that makes up the key that is stored in the cache, and it's checked when the request comes through. And if there's a cache miss, then the request just goes through and does what it would normally do. But we register a callback that fires at the end of the request cycle as part of Plug. And
SASCHA_WOLF:
Mm-hmm.
BRENT_ANDERSON:
at the end of the request cycle, we then are able to say, okay, here is the response that comes back, and let's now store this response in the cache and make sure that it's ready to go for when the next request comes in with this key, right? So that's kind of that first request as it comes in: it checks the cache, it checks it in an account-specific way using the path and HTTP method and the idempotency key for the request. And if there's a miss, then once the request is processed, we store that response in the cache. We don't store the full response, but we do store enough of it to be able to send it back for subsequent idempotent requests. One of the things that we also do is we include a hash of the original request. I'll talk about why we do that in a minute, but the gist is to make sure that if customers send a duplicate request, we can confirm that it was sent in a way that ought to be treated as a duplicate, and that it's not because of some misconfiguration on their part. So I think we'll talk about that in just a second. Were there any questions or anything that we wanted to raise at this point? That's kind of the first request loop that happens through this One and Done process: the first request comes in and we get that response cached. And then there's kind of the second half of it, which is what happens if there's a duplicate. But did you have anything you wanted to cover before we get into that second half of the cycle?
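(The end-of-request callback Brent mentions is a standard Plug mechanism, `Plug.Conn.register_before_send/2`. A sketch of the miss path, reusing the `cache_key/1` sketch above; `MyCache` and `replay/2` are stand-ins.)

```elixir
def call(conn, _opts) do
  case MyCache.get(cache_key(conn)) do
    nil ->
      # Cache miss: let the request run, but hook in just before the
      # response is sent so we can capture its status and body.
      Plug.Conn.register_before_send(conn, fn conn ->
        MyCache.put(cache_key(conn), %{
          status: conn.status,
          body: conn.resp_body,
          # fingerprint of the request, checked on replay (see below)
          req_hash: :erlang.phash2(conn.params)
        })

        conn
      end)

    cached ->
      # Cache hit: short-circuit and replay the stored response.
      replay(conn, cached)
  end
end
```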
ADI_IYENGAR:
That makes sense.
SASCHA_WOLF:
I actually have one question, because that is something I had in the back of my head, and I feel now is a good point: how does One and Done, in this particular case, handle, or does it even respect, HTTP methods? Because POST requests, for example, by definition are not expected to be idempotent; that's something which is in the specification. Is there a baked-in assumption in One and Done where you kind of say, hey, for GET requests, we skip the whole idempotency check? Because, I mean, I presume all of us have worked with APIs that don't really use HTTP methods as you would expect. So yeah, I'm curious whether One and Done is doing something opinionated there, and whether it's respecting what the HTTP methods are supposed to be doing. Did I make myself clear? Because I look super confused.
BRENT_ANDERSON:
Yeah, I think so. You cut out there for just a second, but I think I got your question, which was around whether or not One and Done, or any idempotency solution, would consider the HTTP method. And that's a really good point to raise. So we can go back to some of that, because there are different HTTP methods, right? There are GET requests, there are DELETE requests. You have PUT, PATCH, and POST. Those are the common ones that you typically deal with in a REST API, right? GET and DELETE requests are automatically idempotent.
SASCHA_WOLF:
Mm-hmm.
BRENT_ANDERSON:
And so you don't really have to do anything for those types of requests. POST requests are the ones that I think get the most attention with idempotency, because that's where you run the risk of creating duplicates. POST requests are generally for creating something new and for causing some new action to be taken in the API that is going to modify state, right? And so if you're creating a new blog post, or a new charge in Stripe, or triggering a workflow or something, those are all POST requests in those types of APIs. And that would typically be covered in an implementation. PATCH is the other method that the IETF draft on idempotency recommends, if I remember correctly. At Knock, we decided to set it up so that PUT requests are also treated as idempotent. I think that
SASCHA_WOLF:
Yeah.
BRENT_ANDERSON:
you could consider those to automatically be idempotent, because a PUT is supposed to just overwrite things. But we do have some endpoints that we may end up using in an idempotent way that are PUT endpoints in our specific API. And so with One and Done, we went ahead and set it up that way.
SASCHA_WOLF:
Also for PUT requests, you might deal with lost-update situations. I'm not necessarily sure how that plays into the idempotency question here, but yeah.
BRENT_ANDERSON:
Yeah, I mean, if you had a series of PUT requests and they came out of order, that could produce some unexpected results, right? And I think that there may be some separate considerations that you may want to have in designing an API that deals with that type of operation. Most of the time, these types of operations are going to come through POST methods. And by default, One and Done checks, and if it's not a POST request, then it doesn't do anything; it just passes it through transparently. So that's another consideration to make: how are you designing your API? Obviously there's going to be some nuance in how you design an API with how it's going to respect different headers. But generally speaking, the convention is that POST requests would definitely need idempotent handling or idempotent treatment. Yeah.
SASCHA_WOLF:
Yeah, so did I get you right in that One and Done kind of presumes that you're building your API based on the specifications of the HTTP methods, right? Based on
BRENT_ANDERSON:
Yeah.
SASCHA_WOLF:
that, you have the assumption baked in that GET and DELETE, for example, don't need idempotency handled separately, because they are defined as idempotent. While POST, for example, is one which isn't defined as idempotent, so that is where One and Done, especially the plug, then kind of hooks itself in, let's say.
BRENT_ANDERSON:
Yep, that's exactly right.
SASCHA_WOLF:
Okay, that's exactly what I wanted to know. Is there like an escape hatch? Because, I mean, maybe I don't want to build an API that way, but sometimes you have to for a legacy system, right? Where you say, okay, I kind of have to do it this way because, I don't know, the old app is expecting this to be a GET request, but it kind of isn't one, you know? So is there an option to say, you know what? I'm aware that this is not the standard, but I
BRENT_ANDERSON:
Yeah.
SASCHA_WOLF:
really want you to handle GET requests right now.
BRENT_ANDERSON:
Yeah, you can customize every part of the decision-making that
SASCHA_WOLF:
Mm-hmm.
BRENT_ANDERSON:
happens in One and Done. So you could say, okay, I want DELETE requests to be treated in an idempotent way, or GET requests for that matter. Or you can even bypass that check completely. There's a separate function that you can customize when you're setting up One and Done to say,
SASCHA_WOLF:
Nice.
BRENT_ANDERSON:
hey, this is how we're going to decide if this request should be treated in an idempotent way. And the default implementation does do those method checks. But if you want to just completely rip it out and replace it with your own, then you could make a call to Ecto, or call out to some other system, to decide if a request ought to be idempotent. It would increase your latency, but you could do it.
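(So the method check is just the default predicate. An escape hatch might look like this; the `idempotent_request?` option name is invented for illustration, so consult the library docs for the real knob.)

```elixir
# Hypothetical configuration: replace the default "is this a POST?"
# check with your own rule, e.g. also covering one legacy GET route.
plug OneAndDone.Plug,
  cache: MyApp.Cache,
  idempotent_request?: fn conn ->
    conn.method == "POST" or
      (conn.method == "GET" and conn.request_path == "/legacy/create")
  end
```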
SASCHA_WOLF:
Yeah, that's good to know. I think it's an interesting tidbit for listeners too. But yeah, I think that's a good segue to go back to: why are you caching? And why
BRENT_ANDERSON:
Yeah.
SASCHA_WOLF:
are you hashing?
BRENT_ANDERSON:
Yeah, I think we'll touch on that in just a second, because I think it will make more sense as we work through it. Okay, so we've had a request that's come through and we've cached it, right? And then a duplicate request comes in. One of the things that impressed us, as we were looking at other idempotency implementations out there, Stripe's in particular: if you have a request that comes in and it gets cached, and then you have another request that comes in that uses the same idempotency key, but it has a different request body or some other property that is different from the initial request, they will actually reject the request. They'll say, oh, this is a bad request, because there's a discrepancy between what you gave us originally and what you're giving us now. So even though the keys match, the payload that's associated with that key doesn't
SASCHA_WOLF:
Mmm.
BRENT_ANDERSON:
match for the request. And we thought that that was a really important detail, because it can be a really helpful tool for diagnosing problems in systems for our clients, for our customers. So if a customer has not correctly implemented access to our API, if they're not using our SDK, for instance (our SDK does it correctly), but if you were kind of doing it on your own, you'd want to know if something was reusing an idempotency key in a way that could get you into trouble. So let's suppose, going back to Stripe, just because it's such an easy example to talk about charging money: I have a request with an idempotency key and I issue a charge for, you know, $10, right? And then I use that same idempotency key and I issue a charge for $20. What's the correct behavior here? In the case of Stripe, I think that they've correctly said, well, even though you're using the same idempotency key, we're not going to return a success on this, because clearly this is intended to be a different request. The first one is for $10, the second one is for $20. We're going to reject the second request. It's not correct, right? Now, that works if you're charging credit cards, and maybe Stripe has other endpoints where they have to solve the same problem. At Knock, when you trigger a notification, you sometimes need to put attachments on it, right? So if you're sending emails through our system, you may want to include PDFs or something, and that needs to pass through. And those files can get large. We didn't want to blow our cache size up storing copies of these full request bodies. And so when a request comes in, if we accept it as idempotent and we process it and then we store it in the cache, we use, I want to say it's Erlang's phash2, which is a pretty fast hashing algorithm that's built right into Erlang and the BEAM. We take a hash of the request body, and we store that with the cached response. So when a request comes in and we're returning a response that is already in the cache, we compute that hash again for the second request, and then we can compare the hashes to make sure that the requests match. That is something you can disable if you really want to. But we found that it was important for our use case, to make sure that we were able to make good guarantees to our customers, while at the same time not having to store all of this extra information in the cache without really needing to. So that's the reason we do the hashing piece of it. But walking through that second request, then: when it comes in, we generate the same cache key using the exact same mechanism that we did for the first one. And we go and check the cache. And if there is a response that's been serialized and stored there, we pull it out and we look at it. And we say, okay, did the original request match this request in its body? We ignore HTTP headers for that calculation, because you're going to get different request IDs and there can be all sorts of changes at the top side of things. But the request payload itself, the parameters, that should be stable from request to request. And then if it matches, we construct a new response and we send it back.
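(`:erlang.phash2/1` is a real BEAM built-in; the comparison logic around it here is a sketch of what Brent describes, not the library's exact code.)

```elixir
defmodule RequestFingerprint do
  # phash2 returns a small integer, so the cache stores a few bytes
  # instead of a potentially multi-megabyte request body.
  def fingerprint(params), do: :erlang.phash2(params)

  # On a replayed idempotency key, compare fingerprints before serving
  # the cached response; a mismatch means the caller reused a key with
  # a different payload and should get an error, not a silent replay.
  def check(cached_hash, new_params) do
    if fingerprint(new_params) == cached_hash do
      {:ok, :replay_cached_response}
    else
      {:error, :key_reused_with_different_body}
    end
  end
end
```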
One of the things that is a bit of a unique feature of our library, that I haven't seen elsewhere (maybe Stripe does something like this internally, but they don't really talk about it in their consumer documentation), is that when a request comes into our system, we assign a request ID to it, just by default using the default request ID plug. Each of those request IDs is going to be unique across requests. And it can be helpful to know the originating request for the response we're returning. So if the first request comes in and it gets assigned request ID A, and the second one gets assigned request ID B, we actually include the original request ID as a separate header in the response, so that it's easy for you to correlate and say, okay, this response is linked back to this original request that happened. And in Knock's dashboard, you can actually see every API call that you ever make in our logs. When you go and look, you can actually get that lineage, that correlation of all of these requests tying back to the same idempotent request that came in originally. And you can customize which headers you want to preserve in that way with the One and Done library as well. We start with just the request ID stuff, but if a team had a different header that they wanted to preserve between requests, then you can actually do that too.
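(A sketch of that lineage header on the replay path. `Plug.RequestId` really does set `x-request-id`; the response header name used here is a guess at the idea, not necessarily the header One and Done emits.)

```elixir
# When replaying a cached response, surface the request id that
# produced the original response so callers can correlate both
# requests in their logs and traces.
defp replay(conn, cached) do
  conn
  |> Plug.Conn.put_resp_header("x-original-request-id", cached.request_id)
  |> Plug.Conn.send_resp(cached.status, cached.body)
end
```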
SASCHA_WOLF:
That would have been my next question, so it's nice that One and Done can do that. Because I was just thinking, if you, for example, have built-in tracing, then it might be interesting to pass this trace ID along for the operation which is kind of triggering the first request, right? And then get that back on subsequent requests to correlate it in your own observability stack. To say, okay, all of the following requests which got this response are related to that trace over there. So it's pretty cool that you can do that. Nice.
BRENT_ANDERSON:
Yeah, I mean, a lot of this was just scratching our own itch. And, you know, Knock is growing; we have some pretty big customers. We're not the biggest company on the planet, so I'm sure that there are challenges that Stripe or Google or other companies have to deal with in some of these things. But we've seen enough scale that I think it solves for a broad set of use cases that many users of libraries like this are going to need, to set up a good, batteries-included, out-of-the-box type experience for idempotency.
SASCHA_WOLF:
So is there anything else at Knock you're working on which kind of relates to distributed systems? Because I presume, I mean, idempotency is probably not the only topic of concern, I would imagine, in a system like this.
BRENT_ANDERSON:
Yeah, that's a good question. So, to go a little bit into what Knock provides: we make it easy for product teams to incorporate excellent notification experiences for their customers. And what this means is that with Knock, using our dashboard, in just a couple of clicks you can set up workflows that will orchestrate the timing and delivery of notifications across multiple channels. So we support sending emails across a number of different email providers. You can send text messages. You can send push notifications across different platforms. There's Slack, and we have a number of other integrations; we support webhooks. And you may want to do different things with these notifications throughout the lifecycle of one of these workflows. So for instance, you may want to batch together, you know, ten different notifications and deliver them in one shot, right? Instead of someone receiving ten different edit notices from Notion, they would benefit from getting just one. And to be clear, Notion is not a customer of Knock, at least not yet. But we're trying to make it so that it's easy for teams like Notion, and for any team, really, to get excellent notification experiences without having to do all the work that goes into producing that at scale. And so one of the features that we have as part of that is a batteries-included in-app feed product. So if you want to include a feed in your application, it makes it really easy for you to push notifications into a feed that can appear on a website. We have a React SDK, for example, that makes it easy in just a couple of minutes to get notifications, you know, a little bell up in your corner, and you click on it and it gives you the feed showing comments or messages or whatever is relevant for your application. And there's a whole lot of customization that can go into that with preferences and stuff. The problem that we run into is with those notifications: of course, Phoenix has channels, and Phoenix sockets are really excellent. It's one of the reasons that, as a team, we use Phoenix for solving these problems. But as you scale and as you get more connections running in your system, you end up with what's known as the thundering herd problem. Let's say you have 100,000 sockets that are connected to a couple of Phoenix containers that are running on a system somewhere, right? It doesn't really matter where you're running it; it could be on AWS or any other cloud. But the bottom line is that, at a certain point of scale, when you go to release a new version of your application, you're going to be shutting down your old connections. And those connections will then need to rejoin onto whatever the new version of your software is that's running. Unless you're going to be doing hot upgrades with Erlang and Elixir. Which, this is one of those areas where I know that as a community, we're like, yeah, we have hot upgrades, and then everyone's like, ah, I try to avoid them if I can, because it's complicated. But it's a neat power to have, and this is a problem that hot upgrades are kind of meant for. They were originally born out of the need to support telephone platforms without dropping calls.
If you swap out the idea of a telephone call needing to stay up with the idea of a WebSocket connection needing to stay open, it's really the same problem. At Knock, we don't use hot Erlang upgrades. And so we do have this thundering herd problem, where, when we start to shut down old versions of our application to replace them with new ones as we do releases, we have to deal with trying to direct traffic from the old set of nodes to the new set of nodes in our cluster. Have any of you seen a problem like that before, or worked on anything in that space?
ADI_IYENGAR:
Yeah, I've experienced that a couple of times, actually, at Score. We have something internal which solves many problems, and one of them is exactly this problem. Unfortunately, that's not open source yet, but I hope one day they can open source it.
SASCHA_WOLF:
Not directly myself, because for the products I did work on, there was a chance that this would happen at some point, but they never reached the scale, or at least I was not around to see them reach the scale, where that actually became a problem. So I'm curious to hear. Yeah, I mean, my mind already started wandering: okay, I guess there are different ways to go about this. I'm curious to hear what you've been cooking up there to handle this kind of situation, because I feel a lot of engineers don't necessarily have the experience of working on a product of that scale where you actually need to solve these kinds of problems. You know what I'm saying? A lot of products out there don't reach the scale where something such as the thundering herd problem actually becomes relevant to day-to-day operation. So it's kind of cool to hear from somebody who kind of had to work their way through that.
BRENT_ANDERSON:
Yeah, I mean, it's something that we're working through just in general, in continuing to scale our system to support more and more customers. The basic premise of the thundering herd is that as those connections get disconnected from the old deployment, they have to go somewhere, right? Clients are going to start retrying. And if you imagine just yanking the power plug on your existing deployment and everything goes to zero immediately, you're going to end up with however many clients there are out there immediately retrying, right? And there's a lot of different pieces that go into handling this effectively. One is, first of all, that your clients ought to have an exponential backoff strategy if connections aren't working correctly, right? So that's one of the first things to look at. Obviously there's a lot to do on the backend, but if you build your clients in a way that's respectful of the API (and this is not just for building your own products, but also when you're connecting to other services), it helps a lot to make sure that clients don't all retry at the exact same time. Even introducing a little random retry time, where it's like, hey, the connection just dropped, we're going to reconnect, but we're going to randomly reconnect after waiting zero to five seconds: that gives you at least a five-second window to spread out whatever happened to cause this issue. So at least it buys you a little bit more time. And then if the connection fails again, then you say, okay, instead of zero to five seconds, let's wait five to twenty-five seconds, right? Or whatever. I mean, there are some algorithms for exponential backoff that you could work through. But the gist is that every time you need to do a reconnect, you wait just a little bit longer to retry, until you hit some limit where you either give up or you just retry every 30 seconds or whatever makes sense for your application. A lot of this is going to be very sensitive to what you're doing. So for instance, if you were in, say, video games, or something that requires live online multiplayer, real-time type experiences, you might make different trade-offs than something like Knock's experience. Our customers can tolerate a little bit of a disconnect there without anybody really needing to worry about it; generally speaking, we don't guarantee that your socket is always going to be online 24/7, right? But some applications might require that. So that's the first set of things, from the client side. But then on the backend side of things, you have to start looking at, okay, how are we going to gracefully slide the connections from the old deployment over to the new one? And there's a lot of different ways to do that. It kind of depends on the infrastructure that you're using. If you're using Amazon, then I know that there are different traffic-shaping rules that you can use to gradually direct traffic from an old deployment to a new one. There's not really any one set way to do it, so I can only talk in generalities when it comes to how anyone listening to this might apply it themselves. But you have those client considerations: exponential backoff, a little bit of jitter.
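(A minimal sketch of exponential backoff with full jitter, as described; the base and cap constants are arbitrary.)

```elixir
defmodule Backoff do
  @base_ms 1_000
  @cap_ms 30_000

  # Pick a random delay in 1..min(cap, base * 2^attempt). The
  # randomness spreads the herd; the exponent spaces out repeats.
  def delay_ms(attempt) do
    @cap_ms
    |> min(@base_ms * Integer.pow(2, attempt))
    |> :rand.uniform()
  end
end

# Usage: Process.sleep(Backoff.delay_ms(attempt)) before reconnecting.
```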
When you actually go to shut down your old deployment after standing up your new one, one thing that can be really helpful is doing some form of a blue-green deploy. You can also do a rolling deploy or something, but essentially what you want to do is make sure that your new servers, your new software, whatever is going to be serving these sockets, that you have some of those systems ready to receive traffic before you start shutting down the old ones, right? Kubernetes has a lot of different ways that you can spin this up with deployments, and with rolling deploys there's a lot of different variables that you can play with. And to get really specific: if you set max unavailable in your deployment strategy to zero, and you set max surge to 100%, then what that'll do, if you have enough capacity in your cluster, is create a brand new set of pods that matches the ones that you're about to replace. So you have 100% of your capacity ready to go. And you can play with those numbers. But setting those values would give you what's known as a blue-green deploy, where you're able to have the new deployment ready to go, and then you can start to direct traffic to that new system. There's a library on Hex called socket_drano. Out of the gate it does a pretty good job. If you look on GitHub, there's also a fork or two of it that's not on Hex; I can't remember if at Knock we've made a couple of adjustments to it as well for our needs, so there might be a fork that's visible from Knock. But there is a fork out there that makes a few tweaks. What it does is it keeps track of every socket that connects to Phoenix in an ETS table. As your application boots up and sockets come in, it records every single socket in that ETS table using telemetry handlers. And as sockets come and go, they get added to and removed from the table. It doesn't take up all that much RAM; it just quietly sits in the background. But what happens is that socket_drano interrupts the application shutdown sequence in your supervision tree. So you start socket_drano close to the end of your application supervision tree. And once you start receiving all this traffic, it's able to keep track of the sockets, and when it goes to shut down, it pauses the shutdown sequence and will then gradually disconnect all of the sockets that are connected to your application and start to drain them. And you're able to configure a bunch of different parameters for how rapidly it does that draining. You're going to need to tune those parameters for whatever your application looks like. But broadly speaking, you're able to gradually take the traffic off the old deployment and shift it over to the new one. And then, once the node has no more connections on it, you can retire the node. There's a lot of nuance in there that is, again, going to be application specific, but that's a pretty convenient way to get started with this type of thing in the BEAM ecosystem. Even if you're not in the BEAM ecosystem, you can do some of these things by configuring load balancer rules. It's a lot of ops-level considerations around how you're directing your traffic from your old deployment to your new deployment, and then eventually shutting down the old deployment.
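(socket_drano is started in your supervision tree; the options shown here are recalled from its README and should be treated as assumptions to verify against the package docs.)

```elixir
# Start it late in the tree, after the Endpoint, so it can observe
# socket telemetry and intercept shutdown to drain connections
# gradually instead of dropping them all at once.
children = [
  MyApp.Repo,
  MyAppWeb.Endpoint,
  {SocketDrano, refs: :all, strategy: {:percentage, 25, 100}}
]

Supervisor.start_link(children, strategy: :one_for_one, name: MyApp.Supervisor)
```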
ADI_IYENGAR:
Yeah, using telemetry handlers is a pretty cool idea for this. Actually, when I said I have dealt with the problem, I had not dealt with it from a Phoenix socket perspective. So this is an interesting problem with Phoenix sockets, and a very interesting solution with socket_drano. Nice. Very cool.
BRENT_ANDERSON:
Yeah, I mean, I'd say that socket_drano is the best solution that I've found for just dropping into an application. And if you were looking at trying to implement your own from scratch, or just trying to study this, or if you're wanting to make some improvements to it, it's a good starting point, because it does cover a lot of the broad strokes that you need in trying to achieve this type of result in a BEAM application. And if you wanted to apply it to other stacks, you're going to see something similar. If you were using Node.js or Go or something, you're still going to have this process of saying, okay, let's keep track of which connections we need to disconnect; let's now disconnect them. In any scenario, you still need to have something upstream that prevents traffic from being directed back to these targets, right? It doesn't do you much good if you start draining connections but connections are being rerouted right back to you. So there's still an ops-level consideration that happens at your load balancer, of saying, okay, traffic is coming in; how am I going to prevent that traffic from reconnecting to this system? But it still is better than nothing, for sure, even if you don't get that quite right.
SASCHA_WOLF:
You've really kind of nerd sniped me now. Like my brain is going, oh, what kind of different scenarios there might be. I mean, like what you said earlier about, okay, a short disconnect from customers. That's not the end of the world, but what if it would be, right? Like then you could kind of have a system in the middle, which keeps those connections and like kind of does the routing.
BRENT_ANDERSON:
Yeah.
SASCHA_WOLF:
If you have state inside of your application that is maybe ephemeral, but you don't want to shed it, then you could even use clustering. So there's a bunch of wild scenarios which get arbitrarily complex, but it's super fascinating.

BRENT_ANDERSON:
See, it does, but I do have to say that the BEAM is uniquely positioned for handling that. And again,
SASCHA_WOLF:
Mm-hmm.
BRENT_ANDERSON:
I've never used hot Erlang upgrades in anger. I know that there are people that do. But if you do find yourself in a situation where you have mission-critical sockets and you can't just shut them down... I want to say that, at least for a period of time, that's actually what the Heroku router was built out of; it was an Erlang project.
SASCHA_WOLF:
Mm-hmm.
BRENT_ANDERSON:
Um, and that's, you know, famously where a lot of contributions to the Erlang community have come from, the development of that product, so that you were able to have connections that came in and were routed to the correct dyno without those connections dropping, because you could do hot upgrades and manage those connections and things. So, you know, at some point, maybe in Knock's future, we're going to have something like that. At this point, we don't. We are doing this socket roll where we go from deployment to deployment
SASCHA_WOLF:
That makes sense.
BRENT_ANDERSON:
without trying to do those upgrades. But you could do it.
SASCHA_WOLF:
Yeah, I feel at that point it's also a worthwhile consideration to have two separate deployments, where one is really only responsible for holding these sockets, and there you dive into the complexity of having the hot code upgrades. And for your core business application, which has all the business logic, you say, hey, I'm just going to keep my usual deployment cycle, because it is less of an overhead and I have more flexibility in deploying on the regular. The assumption being that this thing in the middle, which is just handling these eternally available socket connections, probably won't be updated that often compared to whatever is actually your core-business-value-delivering piece of software.
BRENT_ANDERSON:
Yeah, I'm aware of some companies that do it exactly that way. Where you just isolate the socket problem from
SASCHA_WOLF:
Mm-hmm.
BRENT_ANDERSON:
everything else.
SASCHA_WOLF:
Yeah.
BRENT_ANDERSON:
And then you, uh...

SASCHA_WOLF:
Then you also have a lot more flexibility on how you want to reconnect, right? Because then the thing that might cause the thundering herd problem is literally in your control. So, yeah, super fascinating, seriously. I got really nerd sniped on this.

BRENT_ANDERSON:
Yep, exactly.
ADI_IYENGAR:
It's also really cool that you bring up that the Heroku router uses that. I've actually noticed it, and I think I read about it as well last year or something. It does work: as you deploy more dynos, your connections still live. It's really cool that you can see Erlang in use right there.
BRENT_ANDERSON:
Yeah, I don't know if they still use Erlang or not for that product, but I definitely know that they did for a period of time, and if they don't anymore, I'd be curious what they switched to, because Erlang is uniquely situated for solving that exact type of problem.
ADI_IYENGAR:
So I think they still do. In fact, I recently heard a lot of infrastructure around routing and dynos and load balancing has been moved to Elixir as well, which is also quite exciting for the community.
SASCHA_WOLF:
I just did a quick Google search in the background, and I was not aware that this is where the short book comes from: Erlang in Anger, right? It's this very short Erlang book, which is kind of about how it is to use the platform, the BEAM, in anger. It was actually born out of exactly those Heroku routing teams working on those exact problems. I was not aware of that connection. This is a really good read, and I would suggest to all of our listeners: if you have a decent grasp of Erlang (you don't have to have an amazing grasp, but you can read Erlang and you kind of understand the basics of the BEAM), then read this, because it's free and it's very short and there's a lot of wisdom in there.
ALLEN_WYMA:
Yeah, I just remembered that that book was written by Fred, right?

SASCHA_WOLF:
Yeah, it's Fred Hebert.

ALLEN_WYMA:
And it's basically made from just, you know, all the pain that he had to go through. So it's pretty nice that he wrote this book and put it out there for free.
SASCHA_WOLF:
Okay, any last questions from you, Adi, Allen? Otherwise, I would kind of bring us to the end.
ALLEN_WYMA:
Going back to what I said before about the nonce, right? So there's this web security thing. I looked it up real quick; I couldn't focus too much, like, listening to you guys and also reading and trying to research. What I understand is that you can send back a header with a nonce, and in the JavaScript in your response, you will put the nonce also on there, to kind of protect your website from having JavaScript injected into your page by people that you obviously wouldn't want injecting stuff on your page. So it's kind of interesting that you can use this nonce for protection for your APIs, protection in the UI, and also protection for client-side JavaScript, which is quite interesting.
ADI_IYENGAR:
Yeah, I think it's a cryptographic technique to not reuse the same things twice. It's pretty cool. I mean, you see examples of it in some of the old ways of communicating, like ciphers and stuff. It's actually pretty neat to extend that to web security; I always thought that was really cool. One more thing I do want to highlight is what Brent briefly mentioned: other ways of handling this through the infrastructure as well. I know Brent mentioned max surge in Kubernetes and stuff. I think that's also something to definitely keep in mind as you are dealing with different aspects of the thundering herd problem. It doesn't necessarily have to be sockets; it can literally be the same problem we were talking about with idempotency, where you could have a thundering herd problem on the consumer side. So I think it's good to keep in mind that, yes, Erlang is great, but you could probably also deal with this with very simple configuration changes in your cloud provider.
SASCHA_WOLF:
So Brent, if people have more questions about distributed systems or about Knock, how can they reach you?
BRENT_ANDERSON:
So I'm on Mastodon, I'm @bja@social.brentjanderson.com, and you can also email me, brant@knock.app, if you have any other direct questions about the platform or One and Done, any of that stuff.
SASCHA_WOLF:
Nice. Okay, then I would say we can move on to picks. Adi, what did you bring us this week? More jobs, more jobs.
ADI_IYENGAR:
I do have a job, actually, last minute. Someone just posted it on our website; otherwise, I would have had no picks. So yeah, we're looking for an embedded systems developer at Score. It's a bit of Elixir, but it's mostly Rust. And yeah, the Elixir stuff is built around Phoenix. But I know a lot of people who listen to the podcast are into Rust. Everyone's into Rust. I see Allen; his eyes lit up when I mentioned it. So yeah, if you guys are interested, I'll put a link in the description.
SASCHA_WOLF:
Nice, thank you very much. Alan!
ALLEN_WYMA:
Yeah, so I heard about this game called Prey, not the new one, but the one from 2006. I don't know if you guys have heard of this game before. It was originally made by 3D Realms just after they did Duke Nukem, I think. Well, it was originally made by them, then they dropped the ball, and somebody else picked it up. Anyways, it's a really interesting game that came out in 2006, like I said, and for the time, I think it's really good. I mean, you shook your head, Sascha, so I guess you probably played this game, or at least heard of it.
SASCHA_WOLF:
Yes, I heard of it. I never played it. I only played the later game of the exact same name, which has basically
ALLEN_WYMA:
Yeah.
SASCHA_WOLF:
nothing to do with it.
ALLEN_WYMA:
Yeah.
SASCHA_WOLF:
But yeah.
ALLEN_WYMA:
The history of it is kind of interesting. It's typical 3D Realms, you know: spend a long time, keep rebuilding and rebuilding until you mess up, and then somebody else has to finish it. But the guys who finished it did a really awesome job. You cannot find the game on Steam, so I actually went to Kinguin, bought a key, and could activate it there. That's kind of a workaround. But yeah, I've been playing it and I was really quite blown away by the game. It's got really cool stuff. Just check it out on YouTube, and yeah, I think if you like FPS games, it's pretty cool. So that's my pick.
SASCHA_WOLF:
Nice. And I only have one pick this week, and this is really something close and dear to my heart. I've mentioned in the past, especially in the picks section, that I recently got diagnosed with ADHD. One of the big challenges for me is organizing everything that lies in the future, especially when it has no fixed date and no fixed time. Everything with a fixed date and time goes in my digital calendar, but a bunch of things are not quite that: for example, planning the yearly inspection of my car, or getting a birthday present, all those things that need to happen but have no specific point in time I can just put in my calendar. Those are the things that tend to slip. I've recently found a very interesting method to capture this, which goes by the principle of as much structure as necessary, as little as possible. And that is bullet journaling. Bullet journaling is super popular; a bunch of you have probably heard about it. But the interesting tidbit for me is that the guy who originally came up with it did so because he has ADHD. He tried a bunch of different things throughout his life, over decades, nothing worked, and this is what he ended up with that did. I've been doing it for three weeks now, and I can only say: for me it works. It really works. I'm already planning a gift for my wife for Mother's Day here in Germany, which is usually something I would have done last minute, like, oh shit, it's tomorrow, what do I do? It's actually next week, the Sunday after this recording. And that is just giving me so much peace of mind, honestly. I have one journal for work and one for private things. It also enables me to be a lot more mindful about my work practices, because I now close my day every time with a twenty-minute bullet journaling exercise where I write down things that still need to be done, check through my to-dos, ask whether I got anything done today and whether there's another to-do I need to capture for the next few days. Then I close the thing and I feel work is done for the day. That is also something I've struggled with in the past, where you leave the desk and close your laptop, but work kind of stays with you mentally. So honestly, your mileage may vary, but I can't speak highly enough about bullet journaling. It has the potential to literally change my life. So yeah, maybe check it out. Even if you don't necessarily have ADHD, if you struggle with personal organization, this is a very mindful technique, but also a very useful one. Brent, do you have any picks for us?
BRENT_ANDERSON:
I do. I've actually got three, and I would rate them all very highly. One of them I threw onto my list of ideas as you were talking about bullet journaling, and it hurts a little bit to share because it recently went into maintenance mode and I haven't found a good substitute yet, but there's still a vibrant community around a product called Dendron. If you're familiar with Notion or Roam Research, or Obsidian for instance, go to dendron.so; it's like those tools, but it's all markdown. If you don't want to use it, then go check out obsidian.md. They're all very similar, but I really like Dendron because it's all markdown and it's all in VS Code, and I use VS Code every day, so it's just an extension you initialize. It has some really neat, very thoughtful touches to how you organize information, so that you don't just put information into one of these knowledge management systems, but you can also get it back out again. I think they have the right philosophy around it, even if they weren't able to monetize it and had to put it in maintenance mode. It's a cool product. And if you haven't used something like that, Obsidian also does bullet journaling and stuff like that. It's a cool little tool. So the first pick would be Dendron.

Second pick: I've noticed a lot of people get into writing software from wanting to write their own video games and just being interested in how they're written. Yeah, we've got some hands raised on this thing; that was the case for me. I've dabbled in games and stuff. I'm not a gamer by any means, but I've always been interested in how they work, how you program these things, and in being able to tell meaningful stories through that type of medium. And there is a tool that doesn't get enough reach. It's called RPG in a Box; you go to rpginabox.com. It is a passion project from a fellow who's been working on it for several years, and he's been making regular product updates for a really long time. It has everything you need to build all sorts of different types of games, whether that's RPGs or, as some people use it, FPSs. I really like it because it's very casual. You can pick up something like Unity or Unreal or Godot or some of these other big engines that maybe you've heard of, and they have so many things you can do with them, but it feels like there's just too much to really sit down and casually make something fun. RPG in a Box takes care of that: it's got an asset library built in, you can generate sound effects, you can build your own models and animate them because they're just voxels, and then you can construct stuff with it. It's surprising how you can take just a couple of hours and actually get something playable. And it's fun and it's interesting. So if you've ever wanted to scratch that itch and play around with it, go check this out. Buy a license; it's totally worth it. It's some of the best value for getting into this. If you were doing it with something like Unreal or Unity, you'd spend just as much on an asset pack if you were really going to get into it, and this comes with a lot of cool stuff. So, RPG in a Box.
And then the last one I threw out there: I recently got an e-bike, and I'd forgotten just how much fun it is to get on a bike. I spend enough time working on abstract thundering herd problems and stuff that this is a reminder, at least to myself if I were listening to this, that getting outside on a bike is fun and a good way to do something different from the regular work I do in software. So, can't recommend that enough.
SASCHA_WOLF:
Is there any specific e-bike you can recommend? Or is it more of a general "go biking, it's fun"? Hahaha.
BRENT_ANDERSON:
I did my research and chose to purchase from Rad Power Bikes. They're a direct-to-consumer brand, and I really like it. I got the RadWagon, which is kind of an extended caboose for carrying cargo and stuff, so it's got a fun little thing going there. But there are a lot of different ones to choose from. I know that in North America you can get stuff from REI and tons of other brands, and there are some really fancy ones with Bosch motors if you're wanting to go more upmarket. But if you're just trying to get started, there are some really good entry options. If you go to Rad Power Bikes, they've got a pretty good support system, and I'd say they're pretty fairly priced for the value you get.
SASCHA_WOLF:
Nice. I also really like your RPG in a Box pick, because I can even see it being useful for somebody with more game dev experience; it seems amazingly useful for prototyping as well, you know?
BRENT_ANDERSON:
Oh yeah, it gets all of the not-fun parts of video game development out of the way; they're just not there. When you're using a full-fledged engine, obviously you can do anything you want, but that means you have to do everything. In RPG in a Box,
SASCHA_WOLF:
Yeah.
BRENT_ANDERSON:
it's very carefully crafted to give you just enough to be able to do just about anything you'd reasonably want to do. You might run into some limitations, but I don't know that there are really that many. And the developer is very responsive, with a really active community. It's a fun little tool, and it deserves a bit more praise for filling the niche that it does.
SASCHA_WOLF:
Super cool, seriously. Cool pick. Okay, it was a pleasure having you, Brent.
BRENT_ANDERSON:
Yeah, no, it's been great. Thank you for having me.
SASCHA_WOLF:
And I hope all of you enjoyed listening to us rambling about distributed systems. I guess we can call this "Building Distributed Systems at Scale" or something like that. And I hope you tune in next time for another episode of Elixir Mix. Bye.