Pitchfork, Falcon, and Performant HTTP Servers - RUBY 587
Jean Boussier is a Staff Engineer on Shopify's Ruby and Rails infrastructure team. He joins the show to talk about Pitchfork. He explains what Pitchfork is and how it works, why he wrote it, and walks through some of its most useful features.
Special Guests:
Jean Boussier
Show Notes
Educational Links
- GitHub - Shopify/pitchfork
- To Thread or Not to Thread: An In-Depth Look at Ruby’s Execution Models
- Heap Profiler: Measure memory allocations and retentions of a Ruby code snippet
- Rails’ eager_load_namespaces
- The Deadline pattern
- Circuit Breakers in Ruby with Semian
- Performance tuning Falcon
- Ruby’s New GVL Instrumentation API
Picks
- Chuck - Sushi Go Party!
- Chuck - Riverside.fm
- Jean - Ruby 3.2.0 Released
- Valentino - watchmeforever - Twitch
- Valentino - ivoanjo/gvl-tracing
Transcript
Charles Max_Wood:
Hey there, and welcome back to another episode of the Ruby Rogues Podcast. This week on our panel, we have Valentino Stoll.
Valentino_Stoll:
Heya now.
Charles Max_Wood:
I'm Charles Max Wood from Top End Devs, and this week we have a special guest, it's Jean Boussier. Jean, do you wanna introduce yourself? Let us know who you are and why you're famous.
Jean_Boussier:
Hi, hey there. Yeah, famous, I don't know, but I've been working at Shopify for nine years now. And
Charles Max_Wood:
Oh wow.
Jean_Boussier:
yeah, yeah, I know. And I ended up being, like... I'm now a Rails core contributor. I'm also a Ruby committer and maintain a bunch of gems here and there. And over my time at Shopify I did many kinds of, like, SRE kind of work, and now I'm on the Ruby and Rails infrastructure team, so I work, like, quasi full-time on either open source or, like, upgrading open source projects inside Shopify and things like that.
Charles Max_Wood:
Very cool. So we brought you on to talk about Pitchfork. Now Pitchfork, it looks like it's a Rack HTTP server. I think the ones that I've used that kind of compare are Unicorn and Puma. But
Jean_Boussier:
Absolutely.
Charles Max_Wood:
do you want to kind of give us the 10,000 foot view on what Pitchfork is and why people might want to
Jean_Boussier:
Yeah,
Charles Max_Wood:
think about it?
Jean_Boussier:
we definitely need to dive in to explain how it works. But basically,
Charles Max_Wood:
Right.
Jean_Boussier:
it started as a fork of Unicorn. So in terms of ergonomics, it's going to be very much like Unicorn. But it's going to use way, way less memory than Unicorn, because it uses this semi-new technique, which is reforking, to improve the copy-on-write performance of your application. We're definitely going to have to explain that a bit later. Basically, it's
Charles Max_Wood:
Right.
Jean_Boussier:
better at sharing memory.
Charles Max_Wood:
All right, so yeah, so it's an HTTP server. You're probably going to have less, what, memory usage. So yeah, so just to get started with using it, do I just pick it up and use it just like Unicorn or Puma? Or do I have to do anything else to get it to go?
Jean_Boussier:
Well, so hopefully in a bit, yes. But right now, there's a huge challenge. Right now, it's very much, like, I think I make it clear in the README: it's experimental. I'm currently working on onboarding one of our apps in production to iron out any kind of issues we might have. But it's always going to be slightly harder than Puma and Unicorn to onboard. Because what it does is that, to improve the memory usage, it periodically reforks your application. So if
Charles Max_Wood:
Okay.
Jean_Boussier:
you're familiar with Puma and Unicorn, you know, like, they boot your application and then they fork once, or, like, they fork multiple workers from the same process, and then they
Charles Max_Wood:
Mm-hmm.
Jean_Boussier:
stop there. What Pitchfork does is that frequently, like, on a regular basis, it's gonna tell one of the workers: okay, you're the new master process now. It's going to switch.
Charles Max_Wood:
Oh, interesting.
Jean_Boussier:
Yeah, it's going to become the new master, and, like, the old workers are going to one by one shut down, and the new master is going to create new ones. And that's how it achieves, like, a much better copy-on-write performance. Because as your process lives... you know, like, when you fork, you get an exact copy of the memory of your parent.
Charles Max_Wood:
Mm-hmm.
Jean_Boussier:
But you don't have a physical copy. It's just, like, you're pointing to the same memory region. And then
Charles Max_Wood:
Right.
Jean_Boussier:
whenever you touch, like, you write into one of those memory pages, the kernel stops you, makes a real copy, and gives that to you. Which means that when you fork a process, like, I don't know, a one-gigabyte process, you end up with two one-gigabyte processes. And if you look at the
Charles Max_Wood:
Uh-huh.
Jean_Boussier:
RSS, you know, like, the memory usage reported by the kernel, it's going to say, oh yeah, it's using two gigabytes. In truth, it's using just one, because everything is shared.
Charles Max_Wood:
Right.
Jean_Boussier:
And that's how Puma and Unicorn don't use too much memory. The problem is that Ruby doesn't behave particularly well here. It's not that it's badly designed, but it's not trying to optimize for this so much. So for example,
Charles Max_Wood:
Okay.
Jean_Boussier:
when Ruby compiles your file, inside the VM you have some bytecode. And to improve performance, inside its bytecode, Ruby reserves some teeny memory regions where it caches things. Like, for example, if you look up a constant, when it compiles the method that uses that constant, it's going to reserve a teeny two, three bytes. And the first time it executes your method, it'll say, oh, I don't know what that constant is, and it's going to do a slow constant lookup. It's a fairly heavy operation. But then it stores the result in that teeny cache, an inline cache, and on the next execution it's going to say, oh, I already know where that constant is. And so the problem with that is that with Unicorn or Puma, you boot your application, but you never execute most of the code, right? Let's say you have a controller. You're not going to call your controller during your boot phase. It's only after you've forked that you're going to execute it. And
Charles Max_Wood:
Right.
Jean_Boussier:
so you fork, you share 100% of your memory, and then after just a few requests, you dirty pages after pages, and you're only going to share half of it, basically. From
Charles Max_Wood:
Mm-hmm.
Jean_Boussier:
what I measured in production, most applications after just 100 requests, they fall to 60% or 40% shared memory only. So I forgot the question.
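On Linux, the shared-versus-resident accounting Jean describes is visible in `/proc`. A minimal sketch (assuming a kernel new enough to expose `smaps_rollup`) that estimates how much of a worker's resident memory is still shared:

```ruby
# Summarize a process's memory from /proc/<pid>/smaps_rollup (Linux >= 4.14).
# Usage: ruby cow_share.rb <pid of a forked worker>
pid = ARGV.fetch(0, Process.pid.to_s)

fields = {}
File.foreach("/proc/#{pid}/smaps_rollup") do |line|
  fields[$1] = $2.to_i if line =~ /\A(\w+):\s+(\d+) kB/
end

rss    = fields.fetch("Rss")
shared = fields.fetch("Shared_Clean", 0) + fields.fetch("Shared_Dirty", 0)
puts "RSS: #{rss} kB, still shared: #{shared} kB (#{(100.0 * shared / rss).round(1)}%)"
```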
Charles Max_Wood:
Well, let me see if I can restate and understand what you're talking about.
Jean_Boussier:
And so, yes. No, no, no, sorry, that was for the onboarding. Oh yeah, go ahead, go ahead, please.
Charles Max_Wood:
Okay. So effectively what you're talking about, and we've covered processes and copy-on-write and stuff in previous episodes. It's been quite a long time since we've talked about it, but yeah, effectively you fork the process and it says, well, I'm just going to, rather than copy everything that I need from the old process, use the same memory space as the old process. And then when I have to overwrite something that was in the old memory space, that's when I add that specifically to my memory because I don't wanna pollute the other memory space. And so what that does is it allows your processes to run without a ton of memory bloat on stuff that everything that's forked off of that one process already knows about.
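That behavior is easy to see in a few lines of plain Ruby on any Unix (a minimal sketch; the array size is arbitrary):

```ruby
# Copy-on-write in action: the child reads the parent's data with no
# physical copy; writing is what forces the kernel to duplicate pages.
data = Array.new(500_000) { |i| "item-#{i}" }

pid = fork do
  puts data[42]        # "item-42": read straight through shared pages
  data.each(&:upcase!) # mutating dirties pages, triggering real copies
end
Process.wait(pid)
```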
Jean_Boussier:
Yeah, that's an excellent explanation. Because in a running web server, I mean, it's going to depend from one application to another. But you are going to have a huge part of the memory that is going to be the same in every process. You're going
Charles Max_Wood:
Right.
Jean_Boussier:
to have the memory you use to render your request and things like that. And you're going to have all the supporting memory: that's all your code, your classes, your translation data, all these things. And so, yeah, the way it achieves much better memory performance is that it's just able to share much more. The downside, since you refork, and that's why fork is not very popular in many circles, is that when you fork, two bad things happen. First, all the threads but the one you forked from end up dead in the child. So if you have a background thread that was, like, started before the fork, it's dead in your children.
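That first hazard can be reproduced in a few lines of plain Ruby (a minimal sketch; any Unix):

```ruby
worker = Thread.new { loop { sleep 1 } } # e.g. a gem's background thread

pid = fork do
  # The child inherits the Thread object, but the thread itself is dead:
  puts worker.alive? # => false; anything waiting on it would hang forever
end
Process.wait(pid)
puts worker.alive?   # => true; it is still running in the parent
```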
Charles Max_Wood:
Okay.
Jean_Boussier:
So that's the first challenge. If you have some libraries that do this, they start a thread to do things, and then you fork, and the thread is dead, and it no longer works. A famous library that does this is gRPC. And they do this silently. And historically, if the thread was dead, your main thread would wait for something that the background thread was supposed to do, and never did, and the process would just lock up forever. So very bad things can happen if you have some dead threads. And the second thing that many applications, or lots of code, don't necessarily deal with very well, is that you inherit all your file descriptors. That's a Unix term, but file descriptors are everything that's an open file, an open socket. So if you have something like, say, a database connection or a Redis connection, and you fork, both the child and the parent are effectively on the same connection. So for instance, say the parent does a query and the child does one as well, they might get each other's responses. So you can have terrible things happen.
Charles Max_Wood:
Oh. I mean, what's wrong with that?
Jean_Boussier:
You know, so very often, like, the client... yeah, I guess it's just like you're playing Russian roulette with your network packets, basically.
Charles Max_Wood:
Uh-huh.
Jean_Boussier:
So this can... like, people who have done Ruby deployments for a while, either with Unicorn or Puma, they're somewhat familiar with this, because they know that, for instance, you know, for Active Record or something, you need to configure Puma to close connections before fork and re-establish after. So they know about this. But the thing is, like, the vast majority of the code is not executed during the boot phase, and Puma just
Charles Max_Wood:
Mm-hmm.
Jean_Boussier:
boots your application and then forks. So there's lots of code out there that wouldn't support being forked after it has been used, code that currently works with Puma but wouldn't work with Pitchfork. So that's why I put a big warning on Pitchfork. It's like, you need to really own your stack, being able to vet your dependencies to know it's going to work. Either that, or you need a very good staging on which to run tests to figure out the problems. Don't go willy-nilly enabling it in production, because you're going to have trouble. That's for sure. My hope, though, is that in the near future it's going to be a bit easier. Because, historically, I don't know if you remember the Rails 3 days, where you had the thread-safe mode for Rails because nobody was running threads at the time. And in the end, it didn't take that long for the community to just make most of the code out there thread-safe. And so I'm working on, like, popular gems and things like that to make them fork-safe, like, refork-safe. So for instance, I made patches for the connection_pool gem to automatically detect when it was forked and just abandon the connections and start over. Just today, I was making a PR to an HTTP gem that's called Excon. I'm just looking at the popular gems that may have trouble with this, and I'm just submitting patches so that the ecosystem becomes more compatible with Pitchfork. And I'm just generally working on improving the ecosystem around this, so that hopefully in, I don't know, a year, two years, enabling Pitchfork just becomes, like, adding a gem to your Gemfile and you're good to go.
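The Puma configuration Jean alludes to looks roughly like this (a sketch using Puma's documented `before_fork` and `on_worker_boot` hooks; recent Rails versions handle the Active Record part automatically):

```ruby
# config/puma.rb
workers 4

before_fork do
  # Close connections in the parent so children don't inherit live sockets.
  ActiveRecord::Base.connection_pool.disconnect! if defined?(ActiveRecord)
end

on_worker_boot do
  # Each forked worker re-establishes its own connections.
  ActiveRecord::Base.establish_connection if defined?(ActiveRecord)
end
```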
Charles Max_Wood:
Makes sense. So I kind of want to step in and just see if I understand what Pitchfork is doing then. So what you said was that Puma effectively stands up the app and then forks immediately. And so everything
Jean_Boussier:
Yes.
Charles Max_Wood:
that forks off of that master process, they only share whatever's initialized on the app. And so what you're saying is, with Pitchfork, when you fork off of something that's been running for a while, it'll share more memory, because it's loaded in the stuff that it needed while it ran, and so it'll share a larger percentage of the memory.
Jean_Boussier:
So that is true. That is one of the things, because there's this pattern that is very common in the Ruby community, which is, like, the or-equals (`||=`), like, lazy memoization kind of thing. And so it's not uncommon to see applications that grow in memory after
Charles Max_Wood:
Uh-huh.
Jean_Boussier:
they've been deployed and then they stabilize, right? So there's
Charles Max_Wood:
Right.
Jean_Boussier:
definitely this dimension, and Pitchfork's gonna help with that, because it reforks after a few requests, when your memory has stabilized. And so that's going to help. But in addition, there is the thing I talked about a bit earlier, which is, say, the inline caches in the virtual machine. There are memory regions that, especially when you've just booted your application, are not yet initialized. And then once you execute that code path once, it's stable: it writes into it, so it invalidates copy-on-write, but it doesn't invalidate copy-on-write on every execution. So there's this idea that for the first few requests, your memory pages are very volatile. Like, they're going to be invalidated a lot. But as you go over more and more requests, as your application warms up, that no longer will be the case. So the heuristic Pitchfork uses right now, to know when to do a new generation, like, refork the workers, is based on request counts. And what I've experimented with for now is just, like, oh, after 100 requests, you do a new generation. And then after a thousand more requests, you do yet another generation. And the idea is that after a while, you don't even need to refork anymore. It's just stabilized, right? Because if you look at the copy-on-write efficiency graph, if you look at the memory metrics and you see how many pages are shared, you're going to start at 100%, and you're going to drop very, very quickly to something like 50-ish percent, and then you're going to have a very slow curve. It's going to be a logarithmic or asymptotic kind of curve with a very slow decrease. And once you're there, it's good. You can just forget about it. It's stable. So it's a bit like a JIT that needs to warm up. It's kind of the same idea, but you
Charles Max_Wood:
Right.
Jean_Boussier:
need to warm up to share more memory.
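In Pitchfork's config file, that heuristic is expressed with the `refork_after` directive (option name per the project README; the thresholds below echo the illustrative numbers Jean mentions, not a recommendation):

```ruby
# pitchfork.conf.rb
worker_processes 16

# Promote a warmed-up worker to be the new mold after 100 requests,
# then again after 1000 more; no further reforking after that.
refork_after [100, 1000]
```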
Charles Max_Wood:
That's really cool. So what made you want
Jean_Boussier:
Yeah,
Charles Max_Wood:
to write this?
Jean_Boussier:
right, right, right. That's a very long story. So as I said, I've been working at Shopify for like nine years or something. And I've been working on Monolith a lot, which is gigantic. Really, you have no idea how big it
Charles Max_Wood:
Uh
Jean_Boussier:
is.
Charles Max_Wood:
huh.
Jean_Boussier:
And so I've been focusing on memory usage for a very long time. It's been, I think, like, four or five years. It's kind of a chore I come back to regularly: I just look at, like, oh, where did we degrade? Where did we start using more memory lately? I have a set of dashboards. I've written something that's called heap-profiler that takes heap dumps and just makes you, like, a kind of summary of where your memory is being used. I've been doing a lot of work in that area for a while. And credit where it's due: at some point, I found an experimental feature in Puma that had been written by, I'm really sorry, he'll recognize himself, someone who wrote, like, an experimental feature for Puma. But it was pretty much just the same idea of, like, reforking. They call it fork worker, I think, which is the same idea as Pitchfork. But the way it's implemented has, like, a major limitation. So in Puma, you have, like, the master control process, like, the cluster process that forks the workers. And what they did is that they said, oh, when you trigger a refork, then the first worker, like, forks new workers by itself. But then what you end up with is, like, you have the cluster, which is a parent of the worker zero, which is a parent of the new generation of workers. So now you not only have, like, a parent and some children, you have, like, a grandparent. The problem with that is that if the middle process, the worker zero, dies, your workers become orphans. And, you know, like, sometimes you hear about zombie processes, it's not really the same thing, but, like... in Unix, when you want to do a daemon, like, daemonize a process, you fork twice, right? You make a middle process, and then you let the middle process die. And the grandchild is reattached to PID 1, which is, like, a process nobody tracks. And for a web server, that's very bad, because it means you have no control over that child anymore, and you don't know what happened. And so I was like, whoa, this is a cool idea, but there's no way I'd run that in production. It's just too dangerous, right? And then I was, I don't know, on Hacker News or something, and I saw something that was added to Linux just five years ago or something, which is called child subreaper. It's a new API in Linux where, as a process, you can declare yourself as, not exactly the new init, but you're gonna say: if any of my children or grandchildren or whatever become detached, I need them to be reattached to me. And that's when it hit me
Charles Max_Wood:
Hmm.
Jean_Boussier:
that it was possible for a process to, like, fork siblings of itself, if the parent was cooperative. And that's when I put two and two together. I was like, oh, we could do this cool thing. And that's when I started working on Pitchfork, because it was a combination of those two ideas. Oh, and I should say, the need for this feature means that the main appeal of Pitchfork is Linux-only. You can run Pitchfork on macOS or FreeBSD or whatever, but you won't be able to enable reforking unless you have this feature.
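The feature in question is `PR_SET_CHILD_SUBREAPER`, added in Linux 3.4. A hedged sketch of calling it from Ruby via Fiddle (Pitchfork's actual implementation may differ):

```ruby
require "fiddle"

PR_SET_CHILD_SUBREAPER = 36 # constant from <linux/prctl.h>

libc  = Fiddle.dlopen(nil)
prctl = Fiddle::Function.new(
  libc["prctl"],
  [Fiddle::TYPE_INT] + [Fiddle::TYPE_ULONG] * 4,
  Fiddle::TYPE_INT
)

# From now on, orphaned descendants are reparented to this process
# instead of PID 1, so it can still wait(2) on its grandchildren.
raise "prctl failed" if prctl.call(PR_SET_CHILD_SUBREAPER, 1, 0, 0, 0) == -1
```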
Charles Max_Wood:
So I guess the other question I have is, is this something that you're using at Shopify now and what kind of results are you getting out of it?
Jean_Boussier:
Right. So unfortunately, not quite yet. I'm actually working on it this week. Because I wrote Pitchfork last year, towards the end of the year, and then there was the new Ruby release coming up. And since I had experience in CI at Shopify, I'm the guy who just runs nightly builds to make sure the release is going to be nice and clean for everyone. So I just put Pitchfork on the side for a bit. Then this week, some team said, oh, we have this problem, whatever, do you want to try Pitchfork on our app? And I was like, you know what? I'm gonna come. So I'm working on making their app fork-safe right now, and hopefully I should deploy it, like, next week or something. So if you want to know, just follow me on Twitter or whatever. I will probably share graphs or numbers soon. That's my hope. And other than that, right now: micro-benchmarks. If you look at the Pitchfork repo, there's a specially crafted application to showcase what I talked about before, like the inline caches and things. It's an application that's about 300 megabytes as one process. And I benchmark Puma with two workers and two threads against Pitchfork with four workers. And Puma can end up using, like, twice the memory Pitchfork does. But again, I don't want to give too much hope. All those results are extremely dependent on what your app is doing. Different applications are going to have very different memory profiles, and some will be able to benefit much more from Pitchfork than others. So it's going to be very, very variable, regardless.
Valentino_Stoll:
So I never knew that you could eager load namespaces like that in Rails. Do you find yourself, like, at Shopify, throwing a ton of stuff in that eager load process?
Jean_Boussier:
A million things.
Valentino_Stoll:
Or is that like a huge no-no? Like, hey, we're getting to our limit. Like, maybe we should back off a little.
Jean_Boussier:
No, actually, so yeah, that's funny, because that's kind of a misconception I try to fight against. There's many developers who try to lazy-initialize things, saying, oh, we might not need it, so we only compute it when we need it, and then we store it because it costs a lot to compute. And the problem with that is, if it's not initialized during boot, first, the first hit to that code path is going to be slower than the remaining ones. So I'm going to talk about something that many people may have experienced: it's, like, those shark-fin-looking latency graphs when you deploy. If, when you deploy your app, you see latency rising, it's very likely that you have way too many lazily initialized things in your application. It's not a good thing. I'd much rather make the boot time a bit slower if it means that I have, like, much smoother deploys where latency stays stable. And so, yes, you have this thing in Rails, which is eager_load_namespaces, which is very useful for these things, where you can have something lazy in development, and then you put it in that list, and Rails will call it for you, like, eager-load it. It's going to call a method on it that allows you to pre-warm those objects during boot. So that's going to ensure that it's booted in the parent, so that's going to be in a memory region that is going to be copy-on-write shared later on. So you actually might not pay as much as you think. Like, I'm going to add, like, one megabyte to my process, but if you have, say, 10 children, you're not adding one megabyte per worker, you're effectively adding 100 kilobytes. And that's why, like, that's another dimension of how to reduce memory usage in production: use bigger server boxes. I hear a lot of complaints about Ruby memory usage, but I think it's in part because a large part of the community is using things like Heroku with, like, 500 megs or something like that. So they can only have, like, two, three Puma workers. Whereas we run something like 32 Unicorn workers, or
Charles Max_Wood:
Hmm.
Jean_Boussier:
one of our apps even has, like, 90 children, because... As I said, you have one gig, and then if you have 10 processes, it's effectively going to be 100 megs each. So the more workers, Puma workers, Unicorn workers, doesn't matter, the more you use of those, the less having some precomputed memory in your application is a big deal. So you really want to benefit from that. And so yes, eager_load_namespaces is a very underused API, in my opinion. And ours at Shopify, there's probably 50 entries in there. It's just kind of ridiculous.
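Concretely, anything pushed onto that list just needs to respond to `eager_load!`, which Rails calls during boot. A minimal sketch (`PricingRules` is a hypothetical lazily-built cache):

```ruby
# config/application.rb
module MyApp
  class Application < Rails::Application
    # Rails calls PricingRules.eager_load! during boot, so the cache is
    # built once in the parent and shared with workers via copy-on-write.
    config.eager_load_namespaces << PricingRules
  end
end

# app/models/pricing_rules.rb (hypothetical)
module PricingRules
  def self.eager_load!
    all
  end

  def self.all
    @all ||= YAML.load_file("config/pricing_rules.yml")
  end
end
```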
Valentino_Stoll:
That's funny. One thing I want to call out in this article I was reading, To Thread or Not to Thread, is I didn't know that garbage collection locks up the GVL across threads.
Jean_Boussier:
Right. Right. That's...
Charles Max_Wood:
Hmm.
Jean_Boussier:
That's another dimension of using Pitchfork versus Puma or Unicorn. Traditionally, I think a lot of people have migrated to Puma for various reasons. It's a very good server. I don't want to seem like I'm dunking on it or whatever. I think for most people, it's going to remain the best server for quite a while. But Puma started using threads, and the community in general started using threads. It's true as well for Sidekiq and for many others, as a way to fit in those small Heroku boxes. I mean, I don't want to say 500 megs is small, I know there's probably some people who are going to say that's gigantic already. But I think threads became popular in the Ruby community in good part to reduce memory usage. But then they bring this problem of, you're sharing your memory space, which means you share the global VM lock. So you're going to suffer, because you're going to hit the GVL. I think you had a very good episode with Ivo Anjo a few months back, which explains this in great detail.
Jean_Boussier:
And the second thing is that, yes, the GC is also stop-the-world. It's going to stop all your threads. So if you have two requests, and one is spewing a lot of garbage, allocating a lot of objects that the garbage collector has to clean after it, the other request, which is totally innocent but ends up being in the same process, is going to be regularly stopped as well. So, what I like, and why we kept deploying Unicorn for so long at Shopify... we still, like, all the main apps are still on Unicorn. And sometimes I get the question of, why didn't you migrate to Puma? I thought Unicorn was deprecated or abandoned or whatever. It's because of this isolation: being able to say this endpoint that is really well optimized isn't impacted by a less important endpoint that maybe wasn't as optimized. They won't impact each other. And it's a very good property for us. That's very important. And the other thing is, of course, resiliency. If for some reason something is buggy, like, you know, the gRPC bug I told you about previously, where the VM is locked up or whatever, with Unicorn, you know there's only one request per process. You just use the Unicorn timeout, which basically kills the process. And when I say kill, it's, like, a Unix SIGKILL kind of thing, and there's no coming back from that. And you know for sure all resources, everything's going to be cleaned up. Whereas interrupting a request in Puma is a bit more tricky. If you try to kill the thread, maybe it was holding onto something, and you might leave the process in a weird state. So I sleep better at night knowing we just have a very hard timeout, and if something really funky is happening, it's just gonna go away.
Valentino_Stoll:
Yeah, it's so funny that timeouts are such a problem in Ruby.
Jean_Boussier:
Yeah, I don't think it's particularly, you know, like it's just very hard because it's very hard to really compute, okay, what's the maximum amount of time this request is going to take, right? Because you
Charles Max_Wood:
Mm-hmm.
Jean_Boussier:
can benchmark it, but, like, for every single network call you do, you would need to sum up all the timeouts. So if you have, like, a one-second timeout to Redis and you do 10 Redis calls, that's 10 seconds, which is awfully long for a web request. I mean, it depends what you do, but for the most part, I would consider anything above 100 milliseconds to be a bit unnecessary. But of course, you're not going to set the timeout on one of your databases to 10 milliseconds to accommodate that. You cannot lower the timeout enough. And then you also have the problem of batch requests. It's very common, especially in Shopify, when you have an admin panel of some sort. I don't know, delete products or delete rows. Generally, you have a batch function where you can delete 10 at once, or 100 at once, or a thousand at once. And it's very hard, from a big application's perspective, to say, okay, I'm going to enforce that every single one of my requests is going to finish within this time, right? There's a good pattern for that, it's kind of not really the subject of today, which is the deadline pattern. Instead of having a fixed timeout for every single one of your calls, you have, like, an object: you allocate, say, five seconds for your request, and every time you do a call, you subtract from that deadline object. And so if you only have, like, 10 milliseconds remaining and you try to do a call, it's like, oh, sorry, you blew your budget, and we stop there, instead of waiting for the web server to say, oh, it's been too long, sorry.
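A minimal sketch of that deadline object (the names are illustrative, not from a specific gem):

```ruby
class Deadline
  Exceeded = Class.new(StandardError)

  def initialize(seconds)
    @expires_at = now + seconds
  end

  def remaining
    @expires_at - now
  end

  # Call before each network request; use the result as that call's timeout.
  def check!
    raise Exceeded, "request blew its time budget" if remaining <= 0
    remaining
  end

  private

  def now = Process.clock_gettime(Process::CLOCK_MONOTONIC)
end

deadline = Deadline.new(5.0) # allocate five seconds for the whole request
# each network call then spends from the budget, e.g. timeout: deadline.check!
```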
Valentino_Stoll:
Yeah, I think I've also heard of that as short-circuiting. Or is that different?
Jean_Boussier:
I think you might mean circuit breakers.
Valentino_Stoll:
Oh, circuit breakers, yeah.
Jean_Boussier:
Which are, like, yeah, slightly different. That's
Charles Max_Wood:
Mm-hmm.
Jean_Boussier:
also a gem I worked on, you might have heard of it: Semian.
Valentino_Stoll:
Oh yeah.
Jean_Boussier:
But it's a slightly different thing. It's more about detecting errors. It's like, if you do five calls to the database and they all failed, you just start to assume the database is down, and you stop doing calls: you immediately fail. The reason is to avoid... like, a failing call is fine. The problem is the timing-out call. Because if you have a one-second timeout to your database and you do five calls, it's five seconds. So you might instead say, okay, the database is dead for now, and we fail in, like, one millisecond. So it's a way to propagate errors faster, to avoid cascading failures.
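For reference, a hedged sketch of wiring Semian up (option names per the Semian README; the values and the `redis` client are illustrative):

```ruby
require "semian"

Semian.register(
  :shared_redis,
  tickets: 2,          # bulkhead: at most 2 concurrent callers
  error_threshold: 5,  # open the circuit after 5 errors
  error_timeout: 10,   # seconds before the circuit tries to close again
  success_threshold: 2 # close after 2 consecutive successes
)

Semian[:shared_redis].acquire do
  redis.get("some_key") # fails instantly while the circuit is open
end
```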
Valentino_Stoll:
I see. So I had a question about Falcon and
Jean_Boussier:
Hehehehehehe
Valentino_Stoll:
how Pitchfork maybe parallels in some ways, because I know Falcon is not forking by default, but it does have a forking option.
Jean_Boussier:
Yeah, I haven't looked too much into it, but I think in Ruby, you know, you could think that Unicorn is a forking server and Puma is a threading server, but actually in Ruby there's no reason not to fork, at least for splitting your load a bit, and then use either threads or fibers. I haven't used Falcon much. I think the whole async, like, fiber thing is nice, but I think it's nice because it allows me to do things that before, people would have reached for, say, Node.js or Go for. You know, like, some very I/O-intensive things, and when I say I/O-intensive, I don't mean like 80% I/O, I mean like 99.99% I/O. I mean something where, you know, you subscribe to a Redis, like, you listen to a Redis stream or something, and then you just forward that back to clients that are listening on WebSockets or something, those kinds of teeny proxy things. Historically, five years ago, everybody was doing that in, say, Node.js, and I really don't like Node.js. So I'm happy about it. Next time I need to do something like that, I can just do it using those async things. But all the issues I have with threads, which is the shared memory, the GVL, the GC, all these things, they're essentially the same with fibers. Fibers are basically, like, very lightweight threads. You can only have one fiber executing at once. So the problem remains. It's just a way to have tons and tons of threads. And actually, they're even slightly worse than threads. Because let's say you use Puma, and you have, like, your two threads, and for some reason you have a Fibonacci endpoint. You know, you compute Fibonacci, and someone is asking for Fibonacci of 50. So for Ruby, that's probably gonna take 30 or 50 seconds to compute, right? And that's just gonna be the CPU crunching, it's not ever gonna do any I/Os, it's just gonna crunch. Well, with Puma, the Ruby VM is gonna say, after 100 milliseconds: okay, this thread has had too much CPU, I'm gonna let the other thread run for a bit. And every once in a while it's going to let it continue, but it's just going to prevent it from using all the CPU in the process and starving other requests or other co-located things. If you do this with fibers and Falcon, it's just going to compute its Fibonacci for 50 seconds, and no other request is going to go through in that meantime. Because fibers are not preemptive. What Samuel has been working on is, like, to make them automatically yield when they do I/Os. But if you don't do any I/O, you're just gonna keep running. And so for me, I think they're great for doing, like, some very tight systems where you don't use too many dependencies, you really know what you're doing, and you're not exposed to too many services. But yeah, I would be scared to... I mean, at Shopify there's, like, a lot of people literally committing to our monolith on a regular basis, and, like, some bad code goes through sometimes. So having a system that's resilient to mistakes, where you're like, okay, a mistake went through and we can correct it later, it's very important for us. So, like, yeah, that wouldn't work for us. I know there's a lot of people asking to make Rails compatible with Falcon. I did a bit of work in that direction. But I think it's a minority use case. But again, like what I said for Pitchfork, every app is different and is going to see different benefits from different systems. So I don't want to say, like, don't use it. I would just like people to be very aware of the trade-offs, basically.
Because any good engineer should tell you that nothing is just better than something else. It's all trade-offs, right? You always lose something. There's no... there's no free lunch.
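A small sketch of the difference Jean describes, using the `async` gem (Samuel Williams' fiber scheduler) for the fiber side; the busy loop stands in for the Fibonacci endpoint:

```ruby
require "async" # gem install async

def busy(seconds) # CPU-bound stand-in for the Fibonacci endpoint
  finish = Process.clock_gettime(Process::CLOCK_MONOTONIC) + seconds
  nil while Process.clock_gettime(Process::CLOCK_MONOTONIC) < finish
end

# Threads: the VM preempts CPU-bound work (roughly a 100ms quantum),
# so the second thread gets CPU time while the first is crunching.
t1 = Thread.new { busy(1); puts "thread: crunch done" }
t2 = Thread.new { puts "thread: ran during the crunch" }
[t1, t2].each(&:join)

# Fibers: a task that never does IO never yields, so its sibling
# is starved until the crunch finishes.
Async do |task|
  task.async { busy(1); puts "fiber: crunch done" }
  task.async { puts "fiber: only runs after the crunch" }
end
```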
Valentino_Stoll:
This makes a lot of sense to me why you would go the forking route now, because at first I was definitely questioning it. Why continue the long drawn out process of this forking route? It seemed like things were getting more along the threaded approach and then
Jean_Boussier:
Yeah, absolutely.
Valentino_Stoll:
pulled back. Then optimizing the memory as it runs definitely makes a lot more sense, because the average use case is, what, you have a database, you have Redis, and then that's it, and you're serving requests. And that's, like, a typical Rails app cycle. And then you have other services kind of built in, but for the most part, you have your primary bulk monolith stuff up, and then it just optimizes itself as it stays up. And with the copy-on-write optimizations and YJIT and stuff,
Charles Max_Wood:
Mm-hmm.
Valentino_Stoll:
to help support forking as the happy path, the Rails way. I guess that makes a lot of sense.
Jean_Boussier:
Well, I don't have the pretension to say that it should become the happy path, but at least that it should be one of the supported paths. You know, I just again,
Valentino_Stoll:
Sure,
Jean_Boussier:
like.
Valentino_Stoll:
I mean, like you're saying, every app is unique.
Jean_Boussier:
Exactly.
Valentino_Stoll:
But I
Jean_Boussier:
Every
Valentino_Stoll:
mean, in
Jean_Boussier:
up
Valentino_Stoll:
my
Jean_Boussier:
is
Valentino_Stoll:
experience,
Jean_Boussier:
a snowflake.
Valentino_Stoll:
every app is a snowflake. But I mean, in my experience, I've consulted a lot and a lot of Rails apps are very typical and follow that optimizations
Jean_Boussier:
Right.
Valentino_Stoll:
that would be brought from a forking server. So it makes a lot of sense. But I'm curious to see what your thoughts are on how to juggle those two and where Shopify sees this. Because I imagine like Shopify has other things other than its monolith, probably hundreds of apps,
Jean_Boussier:
Yeah,
Valentino_Stoll:
maybe thousands
Jean_Boussier:
absolutely.
Valentino_Stoll:
that maybe wouldn't work well also with pitchfork or a forking approach.
Jean_Boussier:
Yeah, yeah, yeah.
Valentino_Stoll:
How do you juggle those two and help keep people on the right path?
Jean_Boussier:
Yeah, so I mean, the way we approach things is that the company is too big. There's just way too many apps and way too many developers to, like, hold everyone by the hand and do everything. So the way we try to work is, what we want is to write shiny tools so that you reach for those because they look cool, rather than do your own thing on your own. And that's how we help with standardization, rather than, you know, you-really-have-to-use-this cops or whatever mandatory kind of procedures. But ideally, at least the way I see it, I like to, as you said before, create a happy path for people to go into, rather than try to police them afterwards. And so, to give you an idea of the current situation: all the bigger apps, like the monolith and the Storefront Renderer, the ones that get a lot of traffic but are also under a lot of scrutiny, we definitely use Unicorn for them. And then for the long tail of smaller apps, especially the internal ones and stuff like that, Puma is more the default right now. Depending on how my effort goes to make Pitchfork really more plug-and-play, more easy to use, by making most of the ecosystem compatible with reforking, maybe we'll switch to it as a default later on. But, like, Puma is definitely gonna stay a staple for quite a while. And... sorry, I forgot the question.
Valentino_Stoll:
I was gonna say, yeah, how do you juggle it?
Jean_Boussier:
Oh yeah, how we juggle the two. And also, the memory pressure is more and more of a problem as the apps get bigger, right? As I said, we use 32, like, we fork 32 times. We use 32 workers for the web server on the monolith, because we need to really shrink the memory usage. But we have some internal apps that don't even need 32 workers to serve the amount of traffic they get. So there is also the question of at which point Pitchfork or things like that become an interesting memory optimization. Because, as I was saying, we have a gigantic monolith, but then we have a long tail of, like, probably a thousand, I'm not even sure of the number to be honest, but, like, a thousand Rails apps left and right. Some are internal, some are, like, applications that you can install on the Shopify platform, so they're developed, like, as an add-on kind of thing. There's really a lot of different applications, and they're all different.
Charles Max_Wood:
So I'm not sure what else to ask about this. Are there other things that people should know about Falcon or Pitchfork
Jean_Boussier:
Um...
Charles Max_Wood:
that we haven't asked?
Jean_Boussier:
Yeah, I'm not too sure what else to add, to be honest. I think that's probably the bulk of it.
Charles Max_Wood:
All right, well, I found the project on GitHub under Shopify slash Pitchfork, so you all can find it there. And yeah, let's go ahead and just move on to the next segment of the show then and do our self-promo shout outs. So Valentino, we'll start with you. What are you working on that people should know about?
Valentino_Stoll:
Let's see. Right now I'm continuing to work on an AI project at Doximity called DocsGPT. Really excited to release it officially. It's out now, you can play with it. But basically, we're putting some of these ChatGPT tools in the hands of doctors in the context of healthcare. So we're doing a lot of fun stuff, having fun playing with the AI. It's too fun, you know?
Charles Max_Wood:
Very cool. That does sound like fun. I think I mentioned this last week, but I'm still working on getting the launch of the Catapult Your Coding Career podcast up. I'm hoping to get it to the point where it's a daily podcast, probably five to 10 minutes. And I've been doing coaching with people. I've just had people coming in and we kind of do a preliminary coaching call. Some of them sign up for actual coaching, but a lot of them are asking the same kinds of things. The other thing is that I went through a process when I was a newer developer, just learning new things and advancing and having a mentor and stuff like that. So I wanted to share the things that I learned that worked for me to level up in the ways that really mattered, so that I could get the job that I wanted. Just walking through that, and then from there, once you're a senior developer, architect, whatever, then what? How do I go from there to making a difference in the community and contributing and creating content? I'm planning on covering the whole process, talking through a lot of the issues and answering the questions. You can check that out at catapultyourcodingcareer.com, or you can just find it on topendevs.com. Jean, what are you working on that people should know about?
Jean_Boussier:
Well, as I was saying earlier, I'm working on onboarding some apps to Pitchfork, but at the same time, I'm also working on tuning the garbage collector settings in our monolith, like, this week. Because we recently upgraded to Ruby 3.2, and there were some kind of major changes, so we need to redo our tuning, and we're trying to really reduce the latency impact of GC,
Charles Max_Wood:
Mm-hmm.
Jean_Boussier:
So it's probably gonna take a while, but we're hoping to come up with a blog post with a guide on how to tune the GC. Even though the first line of the guide is gonna be: you probably don't need to do it, but...
Charles Max_Wood:
Yeah, but it's something else to tinker with.
Jean_Boussier:
Yeah, yeah, but you can do some interesting things. You can instruct the GC that, oh, I'm definitely going to need that many objects at runtime, so just don't bother, start directly with that much space available. You also have a bunch of... like, the GC uses heuristics. Basically, it just tracks
Charles Max_Wood:
Mm-hmm.
Jean_Boussier:
a bunch of metrics and says, oh, when I cross this threshold, I'm just going to do a major GC, I'm going to go over all the heap. On big heaps, that can take a long time. And we just found that the GC was way off on some assumptions. And we probably have some nice things to share. I mean, it's going to take a teeny while before we actually write the post, like a good month or something.
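The knobs in question are the `RUBY_GC_*` environment variables, and `GC.stat` is how you see what the collector is doing before touching anything (a sketch; the values are illustrative, not Shopify's settings):

```ruby
# Look at the heuristics' inputs first:
p GC.stat(:major_gc_count)  # full-heap (major) collections so far
p GC.stat(:minor_gc_count)
p GC.stat(:heap_live_slots) # live objects, a hint for initial sizing

# Then, for example, pre-size the heap so boot doesn't trigger a cascade
# of growth-driven GCs. These are set in the environment, not in code:
#   RUBY_GC_HEAP_INIT_SLOTS=1000000
#   RUBY_GC_HEAP_GROWTH_FACTOR=1.1
```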
Charles Max_Wood:
Cool.
Jean_Boussier:
But yeah, you can also shoot yourself in the foot by setting the limits too low or something. Like, some developers at Shopify had been trying to optimize their service by doing... I don't know if you remember, back in the day, people were doing out-of-band GC. It was this idea of, you don't disable GC, but after the request is complete, you run GC, to try to make space so that hopefully you don't actually trigger GC while you perform the request. Basically, you raise the limits, and then you explicitly trigger the GC at certain points. But by doing so, they actually made things worse. They called us to say, oh, we have a problem with the GC. And the first step we did was just, like, oh, we're going to reset to the default config. And it just immediately went way faster. So that's why I really want to warn people out there: just don't do it unless you're sure you need it.
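For context, the out-of-band GC idea Jean mentions was typically a Rack middleware along these lines (a sketch of the historical pattern, not a recommendation; as he says, it made things worse for that team):

```ruby
# Trigger GC between requests every N requests, hoping to avoid paying
# for a collection in the middle of one. Measure before copying this.
class OutOfBandGC
  def initialize(app, every: 10)
    @app, @every, @count = app, every, 0
  end

  def call(env)
    @app.call(env)
  ensure
    @count += 1
    if @count >= @every
      @count = 0
      GC.start
    end
  end
end
```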
Charles Max_Wood:
Makes sense to me. All right, well let's do our regular picks. Valentino, do you got some picks?
Valentino_Stoll:
Sure. So thanks to Jean's GVL instrumentation API. I know Ivo has given this recommendation before, but he has a GVL tracing gem now. You can trace GVL stats. Pretty great stuff. I see you also have contributed to that, Jean, so thank you.
Jean_Boussier:
Yeah.
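Usage is tiny, per the gvl-tracing README (the file name is arbitrary; the resulting trace opens in a viewer like Perfetto, and the gem relies on the Ruby 3.2+ GVL instrumentation API underneath):

```ruby
require "gvl-tracing"

GvlTracing.start("gvl_trace.json")

# Some threaded workload to observe contention on the GVL:
threads = 4.times.map { Thread.new { 10_000_000.times { rand } } }
threads.each(&:join)

GvlTracing.stop
```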
Valentino_Stoll:
I'm poking around in there, too, trying to see what's happening, now that I know that threads lock up like that during garbage collection. It's just interesting to see what's happening under the hood. The other pick I have is called Watch Me Forever. One of my coworkers shared this, and it's basically an AI-generated sitcom that mimics Seinfeld. I recommend checking that out as well.
Charles Max_Wood:
Awesome. All right. I'm gonna throw in a few picks. So this week I'm going to pick, oh, what was it that I picked? Because I always start with a board game, and my brain is... oh, I picked Sushi Go Party. So Sushi Go Party is a game you can play with two to eight players, and it tells you how to kind of build the deck up, because you have different food items that you add into it, it's a sushi restaurant, and they each score differently, right? So some of them are worth just straight-up points. Some of them are worth, like, three of them are worth 10 points, or two of them are worth seven points. Or, like, there's the spoon that allows you to basically put a placeholder down, and then when you use the spoon, you put two cards down and you put the spoon back in your hand. But when you play a card, you play a card, and then you pass your hand to the left. And so you're constantly having to choose, you know, trade-offs, because somebody may take the other card you want before the hand gets back to you. It's pretty simple. BoardGameGeek ranks it at a 1.3 weight, so it's a pretty easy game. But it's a lot of fun. And my sister and brother-in-law, when they come over to play games, a lot of times if we have just a half hour before they're going to take off, that's what they do. They'll just play that, and then they'll go. So anyway, it's a fun game. I really like it, so I'm going to pick it. As far as other picks go, I'm going to pick Riverside. That's what we're using to record this. Like I said, I'm starting new podcasts, and Riverside is awesome, even if it's just me by myself or with one other person. If their bandwidth isn't good or something like that, it compensates for all that stuff, because you still get a local recording. But the other thing is that it makes it really easy for my team to pick it up, because they can pull the recording off of Riverside, basically pull it out of the cloud, and my editor can do his thing. And when we're recording, you can mark a clip right in there. So we're going to start actually doing 30-second to one-minute clips of us talking, right? Whatever the highlights are. Anyway, it's gonna be awesome. So we're really getting into that and making that work. And then one of the other things we're gonna do with those videos is we're gonna open up a TikTok account. And it's gonna be a mix of clips from the show and of me basically sharing some of the stuff from either Catapult Your Coding Career, or I may just get on and say, hey, here's what I'm doing in my office or whatever, and you'll be able to see that on TikTok and Instagram. So if you wanna follow us there, it's Top End Devs in both places. And yeah, anyway, it's been real fun to just kinda play with some of those tools and see what you can do. And then the last pick I'm gonna pick is Midjourney AI. I've been using it to generate artwork for various things. It's a lot cheaper than having 99designs do a podcast artwork design. If it generates a picture I like and I can put the podcast name on it, I am totally good with doing that, right? As long as it reflects what the show's about. I do kind of like the logo approach, like the Ruby Rogues podcast artwork, right? It looks more like a logo and less like an AI-generated picture. But sometimes when I tell it to generate, like, a logo or logo artwork, it puts, like, weird text in it, because of other content, right? That's the way that it works. They've got images that have been put in and ingested into their AI algorithm, and so when you ask for something, it uses the keywords for those other things, and so you get weird text artifacts in it. And I don't necessarily know, or want to know, how to go in and clean those up. It's also nice because I've been putting the artwork in emails and stuff like that on the email list. And so it's been nice to kind of have something that I know isn't copyrighted, right? That I can stick in. So unless it's, you know, got something blatant that it pulled off of somewhere that is copyrighted, you know, it's good to go. So yeah, I'm gonna pick Midjourney AI. And I think that's what I've got. Jean, what are your picks?
Jean_Boussier:
It's not going to be very original, but I'm going to pick Ruby 3.2, because it just came out not so long ago. And I think, I mean,
Charles Max_Wood:
Yeah.
Jean_Boussier:
we've seen an increase of people upgrading fairly early. I've seen a bit of chatter about it on Twitter, Reddit, or whatever, but I really want to recommend upgrading. It just performs very well. It's quite stable. There's a few teeny YJIT corner-case bugs, but those should be fixed in a matter of days or weeks now, because the point release is very soon. And there's some nice things. Like Valentino said, there's a GVL API I implemented, which allows you to instrument and to know whether you're using too many threads or not.
Charles Max_Wood:
Hmm.
Jean_Boussier:
There's YJIT, obviously. I work closely with the YJIT team, and I know they're dying for more, not necessarily public chatter, but more public feedback about, like, did it work well on your
Charles Max_Wood:
Right.
Jean_Boussier:
app? Things like that. So yeah, maybe I've said too much. There's this old common knowledge in the Ruby community of, like, the point-zero releases being RCs at best, and that you should wait for, like, the point-one or point-two. But this
Charles Max_Wood:
Mm-hmm.
Jean_Boussier:
has changed and like there's no problem putting a 3.2.0 in production. So people, if you hear me, please do. You're missing out.
Charles Max_Wood:
Cool.
Jean_Boussier:
Thank you.
Charles Max_Wood:
Yeah, I've been putting anything new that I build on 3.2, and it's cool. I like it. All right, well, we're gonna go ahead and wrap up here. Thanks for coming, Jean.
Jean_Boussier:
Thank you. Thank you for inviting me.
Charles Max_Wood:
Thanks, Valentino. Until next time, folks.
Valentino_Stoll:
Yeah.
Charles Max_Wood:
Max out.