161 RR Docker Deploys with Sam Saffron
The Rogues talk to Sam Saffron about deploying in Docker.
Show Notes
Special Guest: Sam Saffron.
Transcript
AVDI:
I’m just imagining some horrible Australian creature is chewing on the wires right now.
JAMES:
[Chuckles]
[This episode is sponsored by Rackspace. Are you looking for a place to host your latest creation? Want terrific support, high performance all backed by the largest open source cloud? What if you could try it for free? Try out Rackspace at RubyRogues.com/Rackspace and get a $300 credit over six months. That’s $50 per month at RubyRogues.com/Rackspace.]
[Snap is a hosted CI and continuous delivery service that goes far beyond letting you do continuous deployment. Snap’s first class support for deployment pipelines lets you push any healthy build to multiple environments automatically and on demand. This means with Snap, you can deploy your staging environment today. Verify it works and later deploy the exact same build to production. Snap deploys your application to cloud services like Heroku, Digital Ocean, AWS, and many, many more. You can also use Snap to push your gems to RubyGems. Best of all, setting up your build is simple and intuitive. Try Snap free for 30 days. Sign up at SnapCI.com/RubyRogues.]
CHUCK:
Hey everybody and welcome to episode 161 of the Ruby Rogues Podcast. This week on our panel, we have James Edward Gray.
JAMES:
Good morning, or afternoon. I’m so confused.
CHUCK:
Avdi Grimm.
AVDI:
Hello, hello.
CHUCK:
David Brady.
DAVID:
You can build all of the logical gates out of NANDs, which means that all of computer science is based on the theory that two wrongs do make a right.
JAMES:
[Chuckles]
CHUCK:
I’m Charles Max Wood from DevChat.TV. And this week we have a special guest and that’s Sam Saffron.
SAM:
Good morning from Australia.
CHUCK:
So, we’ve had you on the show twice already?
SAM:
Once.
CHUCK:
Just once?
SAM:
Yeah.
JAMES:
Yeah. We were talking about Discourse.
SAM:
We were talking about Discourse. It’s a very [inaudible]…
CHUCK:
Well, having Jeff is like having you twice, right?
SAM:
[Chuckles]
DAVID:
Nobody’s going to touch the ‘it felt like twice’ joke? We’re all just going to leave that? [Laughter]
JAMES:
Yeah, we’re just going to leave that.
DAVID:
This is actually my first time meeting you, Sam. So, welcome.
SAM:
Ah, thank you.
JAMES:
I met Sam in person.
CHUCK:
So, do you want to introduce yourself for David’s sake?
SAM:
Yes. My name is Sam Saffron. I work for Discourse, a startup that is building forum software. It’s all open source and we use it on Ruby Parley. Previously, I worked at Stack Overflow and a few other companies. I love Ruby. I love working on performance issues.
JAMES:
Hey, so do we. [Laughter]
DAVID:
Okay. I had this list of just fun zingers for you for the whole show. And now I have nothing to go with.
JAMES:
[Chuckles]
DAVID:
I guess I’m just glad you’re here. Okay.
SAM:
Oh, well I’m glad I’m here too.
DAVID:
No, that’s awesome. That’s awesome. Welcome.
SAM:
Thank you.
CHUCK:
So, I’m going to give a little bit of introduction here. First off, I do have something that was funny that happened to me related to Parley. I have a little bookmark and I clicked it to go to Parley. And I got nothing. [Chuckles]
SAM:
Aww.
CHUCK:
And I was like, “Oh my gosh. It’s down.” And usually I just tweet and you guys are like, two seconds, it’s back up right? Because it’s happened twice, I think, where it was down for a minute.
But I’m like, “Oh no. It’s down.” And then I realized that it was https.
SAM:
Ah, yes. [Chuckles] Now, did we change Parley?
CHUCK:
No. The bookmark was https.
JAMES:
Ah.
SAM:
That makes sense, yeah.
CHUCK:
Yeah.
SAM:
And there’s also, there’ve been some recent issues where I had to enable CORS for JavaScript, which was really weird because we started collecting JavaScript errors just floating around in Discourse instances. It turns out that you have to enable CORS headers on the actual JavaScript files if you want it to work with a CDN. It’s just a huge, long complicated story.
CHUCK:
Yes.
SAM:
But yeah, there was a period where a bunch of people were getting white screens because of those changes.
CHUCK:
Yeah. I’ve had to do that with AngularJS.
DAVID:
So, deploying Discourse is a nightmare, is what you’re saying?
CHUCK:
Used to be.
SAM:
Oh, it used to be.
DAVID:
We should talk about that.
SAM:
Yes. We should definitely talk about that.
DAVID:
[Laughs]
CHUCK:
So, here’s the setup that I had. I set up Discourse a while back using the big, long explanation where you go in and you set up. You install Ruby and you install Apache. So, you go through all that fun stuff. Or Nginx, I don’t remember. Anyway, I went in to install another Discourse more recently. And you have this fancy little Docker repo that all you do is go in and you clone it, you put the information into the config file, and then you basically tell it to start.
SAM:
That’s pretty much it, yeah.
JAMES:
Alright. That kind of thing is just ruining computers.
[Laughter]
CHUCK:
And I was sitting here thinking, “I have other apps that I deploy to multiple servers.”
SAM:
[Laughs]
CHUCK:
“And I hate my life unless it’s Discourse. So, I need to know how to do this.”
SAM:
Yeah.
AVDI:
I don’t know. That sounds easy. That sounds too easy.
SAM:
[Chuckles] That sounds too easy.
JAMES:
It’s too easy.
SAM:
Like as developers, a lot of the times we object to this kind of stuff.
JAMES:
We do, right?
SAM:
If it’s too easy, right. Like, why do we want to do this?
JAMES:
Yeah.
SAM:
It’s like we like the hard way. [Chuckles]
SAM:
And I guess historically, we’ve been doing stuff the hard way. If you look at the history of a lot of deploys that I’ve done with Rails apps, it’s a lot of voodoo, because you’re slowly building this stack out of duct tape and upgrading a little version here and changing a config file there. And it becomes a nightmare to keep track of what you did. So, a lot of times when you’re building up a Rails app, you have no way of reproducing it. So, you just pray that nothing ever goes wrong, because if something goes wrong, you know that you’re in for a day of pain. Am I talking… Do you guys have that feeling sometimes when you’re deploying?
CHUCK:
Oh, yes.
DAVID:
[Snorts] Oh. [Chuckles]
JAMES:
I’ve never had that problem. [Chuckles]
DAVID:
We weren’t chiming in and commenting because we were all recoiling from our microphones in shock.
JAMES:
Right, right. [Laughter]
CHUCK:
It’s particularly bad when it’s somebody else’s app. So, things like Discourse or I’ve done a bunch with Instructure Canvas which is a learning management system. And they’ve got these production deploy instructions and it takes you a half hour to read it and three hours to do it. [Chuckles]
SAM:
Yeah. And then when you think about apps conceptually, they’re not as developed as [inaudible]. Nginx. It’s running Sidekiq. It’s running Rails. But at the end of the day, conceptually, it’s just an app. And it’s got a bunch of stuff that’s happening with it. It’s an app. It’s one thing. And what Docker allows us to do is to start thinking about applications as units that can be packaged and deployed. The analogy that they try to make is shipping containers and the way they changed how all of the shipping worked in the 20th century, because they used to just put goats on a boat and move them between countries. There was no standard way of moving stuff. And that just completely revolutionized the way that we were able to move goods around the world, because it was one size shipping container that everybody used. And similarly, Docker is trying to introduce that concept. It’s like one size, one type of container, that everybody can use and share. And they can share the configuration and they can share the builds and just basically build on each other like little pieces of Lego. And that is where I found it very, very appealing as a solution. And unlike virtual machines, you don’t pay the price. With virtual machines, you’re paying a big price on performance because you’re not going to get native performance when you’re executing, whereas in Docker you’re not virtualized, which is a very, very key concept there.
CHUCK:
Yeah. Can I clarify that really quickly?
SAM:
Yes.
CHUCK:
Because basically the way it works is in Linux, in the Linux kernel, they’ve come up with this thing basically called Linux containers.
SAM:
Yes.
CHUCK:
And the way that they work, as opposed to a virtual machine, where basically you have a program that emulates hardware, is that the container actually shares the kernel with your host machine.
SAM:
That’s right.
CHUCK:
And so, you don’t incur any of the costs of that extra layer. And so, it creates its own container. And depending on your operating system and how they manage it, they’ve actually done a pretty good job in Ubuntu I know to keep them pretty well sandboxed from your main machine and from the other containers. So, even though it’s sharing kernel functions and things like that, it’s actually a very secure way of splitting up your functionality between different sandboxes or containers within your server.
SAM:
Yeah. And this technology has a lot of history. It’s not that Docker came and invented it.
DAVID:
Yeah. This sounds like paravirtualization, right?
SAM:
I wouldn’t call it virtualization.
DAVID:
Sorry. Paravirtualization.
CHUCK:
I’m not sure what you mean by that.
AVDI:
I’m not sure if it qualifies as paravirtualization.
DAVID:
Oh, okay.
AVDI:
But I could be wrong.
SAM:
Yeah.
DAVID:
My understanding on PVM is you take an image and you go to bring it up and it basically says, what are all the hardware bindings that we need? Oh, you need these? Okay, you can have them natively.
SAM:
Ah, no, no. No, no, no, no, no.
DAVID:
Okay.
SAM:
It’s completely native. With Linux containers, Google, I think, did a lot of the work to bring it to the mass market. But it’s basically a reinvention of BSD jails, I guess, which is where this concept started. And the idea is basically to contain a process. So, you restrict it.
JAMES:
For sandboxing?
SAM:
Yeah, for sandboxing. So yeah, you’ve got, you’re allowed to use this much memory, you’re allowed these capabilities, you’re allowed to do these things, and here are your devices, and go. So, you’re not really operating at any kind of virtualized layer at all. The kernel’s just stopping you from doing stuff. That’s all.
DAVID:
Right. Well, paravirtualization, it’s totally native once it gets out of the way. It lines everything up and lets you run.
AVDI:
If I could sort of break it down a little bit as I understand it. Virtualization is when you have a complete emulation of hardware. So, you have a guest operating system which is running, believes itself to be running on raw hardware but in fact it’s running on an emulation of a machine. Paravirtualization actually introduces some cheats, basically. You modify the guest operating system so that it can communicate directly to the hypervisor and thereby speed some things up. So it’s virtualization but with some optimization where the guest knows that it’s actually running virtualized. But with this, yeah, what we’re talking about here is more akin to a jail. It’s sort of the logical conclusion of a chroot jail, I believe that would be…
SAM:
Yes.
AVDI:
A good, yeah.
DAVID:
Okay. Okay, so paravirtualization makes it almost native and Docker’s actually really making it native, is what you’re saying.
SAM:
Yeah.
AVDI:
Well, yeah, yeah.
SAM:
I mean, it’s not Docker. It’s Linux containers really, that are making it native.
DAVID:
Okay.
SAM:
So, Docker likes to pick all of these technologies. And one interesting thing is the goal for Docker is to get mass adoption. So, at the moment, they use this thing called libcontainer that talks directly to the kernel and sets all of these things up. They have this idea that you can plug in different backends eventually. And theoretically, somebody could build a non-jailed backend for Docker if they wanted to do so. And that might let you run it on, say Apple native, or use chroot jails instead. So, because they’re trying to get adoption, they’re trying to support a wide range of technologies to solve the same problem. And so, it’s also not inconceivable that somebody would write an adaptor for Docker that would do virtual machines if they wanted. And then you’d get a bunch of the other features plus you use virtual machines, et cetera.
AVDI:
So, Docker is coordinating things as a layer on top of the underlying technology, then.
SAM:
That is correct, very correct.
CHUCK:
Yeah. It’s written in Go.
SAM:
It’s written in Go. And historically, it was using LXC, Linux containers, which has been around for a long time. It just went 1.0. And then they said, “Ah look. It’s too hard for us to coordinate LXC so let’s leave that as a pluggable backend. And we’ll write it ourselves,” and they called it libcontainer. So, that shipped recently. And that changed a whole bunch of stuff with the way that we deployed and worked with Docker. The main thing is you couldn’t attach to containers anymore. We’ll go through that soon. But the fascinating thing as a developer is you launch this thing up and you’re root inside this image that isn’t really root [chuckles]. So, we have that. And that’s not the only technology that’s taken there. The other technology that they build on very heavily and they built on from day one is AUFS, which is Another Union File System.
JAMES:
The file system, yeah. I’ve heard about this.
SAM:
So, the first piece of the Lego was these containers and you thought about those chroot jails. The other very big piece of it is having this layering file system. And what that means is that you can get some of the stuff that you get in VMs where you revert to snapshot. So, you can snapshot it. And then when you’re building up these Docker images, you’ll snapshot, snapshot, snapshot, and that enables a few things in the workflow. So, if you’re working on an image and you’re snapshotting during the time, if you ever want to continue working on it, you can just pick up from the last snapshot and keep going. And that means that it’s very, very fast to work on it. It also means that it’s very fast to boot these things, because they’re just booting it from a snapshot. They don’t need to do any kind of massive amount of work. It does have some disadvantages because this layering file system doesn’t ship everywhere. And that was a big problem initially with Docker, because they couldn’t get adoption because only Ubuntu really had it very simply. And then recently, they introduced some other backends. They have got an experimental backend for Btrfs. I’d imagine if they ever do BSD, they’ll have one for ZFS. They have one for, whatchamacallit, LVM. So the underlying technology that drives LVM, that ships with everything on Linux, there’s this virtual file system that you can use and snapshot. And they can ship with that. But it hasn’t really worked that well for me in history. We actually have a check in the Docker install. So, if you’re using this kind of file system, then be warned. You may have issues. I think it’s not the perfect use of Docker until they iron out all of the bugs. Yeah, an underlying technology there is device mapper that drives all of this. And that’s a kernel feature that ships pretty much everywhere. But Red Hat is working really hard on that particular backend and I’d imagine it’ll become very stable as we go.
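The layering and snapshot idea Sam describes can be pictured with a small sketch. This is only an analogy, not how AUFS or Docker is implemented: it assumes each layer is a plain Python dict from path to contents, and it uses the standard library's ChainMap so that reads fall through from the newest layer to older ones while writes only touch the top layer. The paths and contents are made up for illustration.

```python
from collections import ChainMap

# Hypothetical base layer: path -> contents (illustrative values only).
base = {"/etc/os-release": "ubuntu", "/usr/bin/ruby": "ruby 2.0"}

# An empty writable layer stacked on top of the read-only base.
image = ChainMap({}, base)
image["/var/www/app.rb"] = "v1"      # the write lands in the top layer only

# "Snapshot": keep a handle to the current stack, then continue
# working in a fresh top layer that sits above it.
snapshot = image
image = image.new_child()
image["/var/www/app.rb"] = "v2"      # shadows v1 without modifying it

print(image["/var/www/app.rb"])      # read is served by the newest layer
print(snapshot["/var/www/app.rb"])   # the snapshot still sees its own version
print(image["/usr/bin/ruby"])        # read falls through to the base layer
```

Picking up from the last snapshot is cheap in this picture because nothing is copied: a new layer is just added on top of the existing ones.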
CHUCK:
So, I’ve played around with Docker and I’ve played around with the Docker files.
SAM:
Yeah.
CHUCK:
I’m really curious as to how you built the Docker setup for Discourse.
SAM:
Yeah, and it’s very interesting. So, before any of that, there’s this one big discussion in the Docker community, which is whether you go for the model of one process per container or you go for a model of multiple processes per container. And there’s violent disagreement everywhere about this. Some people violently think that a container should just be treated as a process and you need to do all of the coordination outside of it. And some other people think that you can just use them as disposable VMs and as a way to ship software. So for me, I wanted to use Docker as a vehicle to ship Discourse. And to ship Discourse, it would be much more complicated for me to build a system where I’m coordinating multiple Docker containers, because when you think about what Discourse is built of, Discourse is composed of a bunch of processes. There’s Nginx. I really wanted to use Unicorn for Discourse for a couple of reasons. And it was very hard to get everybody to use it because it’s a little bit harder to configure, with Nginx plus Unicorn plus a process supervisor to make sure that Unicorn never stops working. So, there end up being a whole bunch, and then there’s Postgres as well, of course, running, and Redis. So, you’ve got all these processes that you need people to set up. And if I was going to go the one container per process, I’d end up having four or five of these. And I’d need to coordinate all the networking between them and a whole bunch of other more complicated stuff. Now, Docker didn’t even have a lot of the team features built in when I was looking at it initially, which came a bit later. And I just thought I’ll just use it as a disposable VM. It makes it much easier for me to get this out there and install it and configure it. But there were a bunch of limitations there. I wanted to build a system that’s very, very flexible that we could use in production and also everybody else could use. So, I built a system that is a modifiable system.
For example, you can bring up a Discourse image that doesn’t have Postgres in it and the Postgres is living somewhere else. Or you can bring up a Discourse image that just fits everything in one. But unfortunately, the Docker files are these really, really dumb files that just allow you to do some very, very basic operations. So, when you look at a Docker file, it’s got…
AVDI:
Docker file is the configuration file?
SAM:
Yeah. It’s a configuration file that allows you to specify how it’s going to build up the base image for Docker. So, for your container, the term would be. The term that they use is image. And then once it’s running they’ll call it a container. So, these images are built from Docker files which just look like little text files. And they tell you, add this file, add that file, run this command, and pretty much those are most of the primitives that you have there. You’ve got a couple of other primitives that say expose this port or run this initial command when you start up the image. But except for that, it’s a very, very, very simple text file. And it doesn’t allow you to compose stuff. You can’t say, “Look, if you’re running it with these parameters, then compose this little bit in. And if you’re running with that parameter compose that bit in.” It doesn’t have primitives that allow you to do things that in general you’d expect to want to do. Like say you’ve got a text file and you want to replace a bunch of stuff there, you want to change this word to that word, which really helps if you’re doing templating stuff. Our Nginx template doesn’t have everything that is specific so we may want to change a bunch of things in it during the time that we build your specific image. For example, if you’ve got any very specific settings like HTTPS settings and whatnot. So, the lack of composability actually meant that I needed another system on top of Docker to coordinate building up these images. So, there are two things that I do. The first thing is I build what I call a base image that’s built using traditional Docker files. And by the way, all of this stuff is open source under MIT. People can take this system and extend it for their own use and build their own one on top of it if they want to.
CHUCK:
Is that in the Docker Discourse repo?
SAM:
Yeah, in the Docker Discourse repo. In fact, I use it to deploy the logster website, which is pretty cool. So, I actually used it to deploy a different Rails website, which I think is very, very, very nifty. And I’ll go through that soon. But the thing there is that the Docker file format wasn’t flexible enough for what we needed. And then I started looking elsewhere to see what other features am I missing from here? And it turned out we were only missing a handful of features. So I thought, “Yeah, I can build a little DSL for this and just have a YAML file that configures it and have these few features that I need.” I called this little system pups. You could easily just use Chef or Puppet or any of these other things for it. But I wanted something absolutely trivial for this and that didn’t involve much learning. So, the whole pups thing is literally maybe 200 lines of code that read this YAML file and coordinate it and say, “Yeah, if you want to mix in a template, you mix it in like this,” et cetera. And that is the other piece of technology that goes and builds up these images. So, we build this base image that has all of the applications you need, like Postgres and Nginx, et cetera, and all the versions that you need. And then once that’s all ready and you go and run our command, like launcher bootstrap Discourse, it will run pups inside it. It will read all of these templates and it’ll plug in all the values into it and bring it up so you can use it. And internally, we use runit. Have you guys had any experience with various things? Bluepills, runits, et cetera? Monits?
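The kind of "change this word to that word" templating step that Sam says Docker files lack, and that a small tool like pups fills in, might look roughly like the sketch below. It is not the pups implementation or its real YAML format: the step keys ("from", "to"), the $name placeholder syntax, the Nginx snippet, and the parameter values are all assumptions made up for illustration, and plain dicts stand in for a parsed YAML file.

```python
import re


def apply_replacements(text, steps, params):
    """Apply each {'from': regex, 'to': template} step to text, in order."""
    for step in steps:
        replacement = step["to"]
        # Fill hypothetical $name placeholders from the params dict.
        for name, value in params.items():
            replacement = replacement.replace(f"${name}", str(value))
        # A function replacement keeps backslashes in the value literal.
        text = re.sub(step["from"], lambda match, r=replacement: r, text)
    return text


# Hypothetical generic template shipped in a base image.
nginx_template = "server {\n  listen 80;\n  server_name _;\n}\n"

# Hypothetical site-specific steps, as they might appear in a config file.
steps = [
    {"from": r"server_name\s+\S+;", "to": "server_name $hostname;"},
    {"from": r"listen 80;", "to": "listen $port;"},
]
params = {"hostname": "forum.example.com", "port": 8080}

print(apply_replacements(nginx_template, steps, params))
```

Keeping the steps as data rather than code is what lets one generic base image be specialised per site at bootstrap time.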
JAMES:
Yeah.
SAM:
All of those guys?
JAMES:
I’ve used Monit and…
SAM:
God.
[Laughter]
CHUCK:
Yeah.
SAM:
I can hear the pain, yes, years ago, when it used to leak so much memory.
JAMES:
Right.
SAM:
You needed a God monitor for God as well.
JAMES:
[Chuckles]
CHUCK:
I’ve used God and I think I’ve used runit once or twice.
SAM:
Yeah. So, with these, I found runit to be the most appealing of all the ones out there because it’s so super lightweight. When you look at the actual processes that runit has, it follows this Linux philosophy in that it’s very simple, little tools that only do one thing. And it really meshed. It meshed with me, the concepts of the guy who wrote it. And it has been rock solid. Really, I’ve never had anything bad. Whatever features it has, it just does really, really, really well. And before that, we were using bluepill, which is not really maintained much anymore. And it’s Ruby so it’s much heavier when you’re talking about, so you’ve got a process monitor that’s consuming 10 Megs of RAM as opposed to something that is consuming hundreds of bytes. Truly, hundreds of bytes to monitor a process and just works rock, rock solid. [Inaudible], you’d have dependencies and all of that. There’s an interesting blog post about this, if anybody’s interested in this kind of problem. It’s called, I think, ‘Process Management: A Solved Problem’ or something. So, I ended up using runit and moving away from using bluepill. And that has worked out extraordinarily well for us. And now everybody’s using runit because they’re using our Discourse Docker. So, that’s a full circle.
DAVID:
That’s the beauty of open source. You solved it well.
SAM:
Yeah.
DAVID:
So, we all thank you.
SAM:
[Chuckles] So now, people can stop thinking about, “Am I using runit? Am I using Unicorn? Am I using Nginx?” and they go, “I’m using Discourse, right?”
DAVID:
Right.
SAM:
And it’s time to update Discourse. So, that was the history behind it. So, we use pups for bringing up these images. And then at the end of the day, you just have your Docker image. And it’s ready to go and it takes care of all of these problems for you. And you don’t need to think about these things at all. And there’s a huge advantage to using the Discourse base image and system. Say there is Heartbleed one day. How do you get everybody that’s using Discourse not to be vulnerable to Heartbleed? It’s pretty simple. I just build a new base image that isn’t vulnerable, I go to my file, my launcher file, and I say, “No, don’t depend on base image number two. Depend on base image number three.” And then everybody that pulls Discourse from their own will get it and everybody that updated will pick up on the new base image. I keep all of the… there are two types of data that your deployment just carries around. There is stuff that you want to keep around and stuff that you want to throw away. And the stuff that you want to keep around, I keep on the host operating system. So, that would be your Postgres database or your log files.
JAMES:
Right.
SAM:
And stuff that I don’t care if I throw away, like temp files and other crap that just you build while you’re using the system, that all gets chucked out. So, for example, precompiled assets will be built but then thrown away and rebuilt. So, the beauty is that you can just replace the base image completely and just bring it up, and you’ve got a new base image with all of your data and none of the data that you don’t care about.
AVDI:
There’s a question that brings to my mind. It seems like there is a divide between what the Docker file does or whatever your provisioning system does. I guess there’s a line between the instructions for building up a system versus the image, the frozen image of the system. And I remember back a long time ago working at a place where we basically just created whole AWS images and then, every time we wanted to change something, we’d just update the image and then push that out. But there were a lot of problems with that. And there’s a big push to move away from that model and more to the model of using something like Chef or Puppet.
SAM:
Yes, definitely.
AVDI:
To build the system, completely from instructions. You build it up from a pristine source.
SAM:
Yes.
AVDI:
And you build it up from the instructions. With Docker, how do you place that line between things that are in the image and things that are configured onto the image when it gets set up?
JAMES:
This is what the file system’s about, right?
SAM:
No, it’s instructions all the way through. There are just two types of instructions. So, there are the instructions to get to the base image, which is the Docker file. And that is just about, for me I draw the line as that’s about what software goes on the box and where the software lives. So, that’s where I draw that first line. And then the second line is where our configuration system kicks in, which could be Puppet if you prefer to use Puppet or whatnot. And that does the more complex configuration at core, because that will involve, “Let’s edit this little configuration file a bit and change a line here. And let’s move this thing over here and create a symlink there.” So, once a configuration gets a little bit more fiddly, I prefer to use a tool that’s a little bit more powerful to do that.
AVDI:
Okay. So, when you say what software is on the box, you’re talking about what version of what operating system package is installed?
SAM:
Yeah. I’m just talking about running apt-get commands pretty much. There are a couple of other things, but it’s mostly running apt-get commands and cleaning repos.
AVDI:
And just to totally clarify, I’m sorry I’m dumb about this stuff.
SAM:
Oh, that’s fine.
AVDI:
Are you talking about those commands being embedded in the Docker file, or are you talking about you run those commands once while setting up the image and then you freeze the image?
SAM:
Yes, I run the Docker file locally. And once I have that done, I’ll stamp an image with a version. And then I take that image and I push it to the public Docker repository.
AVDI:
Okay.
SAM:
So, it’s really beautiful in that everybody can just share whatever image they want with whoever. But you don’t have an obligation to share the Docker file. But I think it’s very, very important if you’re doing any of this work to really share the Docker file as well so other people can build on your image. Otherwise, there’s an image out in the cloud that nobody knows how to recreate, which is pretty bad.
AVDI:
Okay. So, you don’t have any config files that are doing apt-get postgres because that’s already frozen into the image. But then you do have configuration that’s done to set up application configuration files. Is that accurate?
SAM:
That is very accurate.
AVDI:
Okay.
SAM:
And it’s more than just application, because everything’s applications. SSH is running on the box and that’s an application in some ways. And I need to configure it and fiddle with it a bit so it works the way it should work and bring all of the keys across, et cetera. It’s really cool when you’re using Discourse Docker, you can just ./launcher ssh into my image and it just takes you in there magically because it already preconfigured all the keys for you.
AVDI:
How do you make that decision? Is the decision to make the base software part of the image, is that because it comes up a lot faster if you don’t have to run all those apt-get commands every time?
SAM:
Yes, definitely. There are two parts of it. One is that it comes up faster. Other part is that you really want to freeze versions at some point to make stuff a lot more predictable, because if you’d be rebuilding and doing that, you don’t really know what version everybody’s running and that becomes complicated, but also very, very slow, all these things.
AVDI:
Right. And I guess, yeah.
CHUCK:
So, when it updates Discourse, it’s actually just updating the image from the sky?
SAM:
So, it depends on what you’re doing. If you’re bootstrapping an image, you’ll probably already have it locally. So, you’ve already downloaded that. But if I ever want to update the base things, like we decide, I don’t know, it’s time for everybody to move from Postgres 9.2 to 9.3 like we did, then I’ll update the base image. And then at that point, people can get that upgrade. Actually Postgres 9.2 to 9.3 was very interesting in that the base image prepared Postgres 9.3. But then, my bootstrapping process needed to take care of the database upgrade. So, I had to spin up a database and take it and massage it into Postgres 9.3 if it detected that it was Postgres 9.2. So, I actually handled even a database upgrade with this system, which is pretty fascinating.
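The "if it detected that it was Postgres 9.2" step could be sketched as follows. This is an assumption about one way to do the detection, not the actual Discourse bootstrap code: it relies on the PG_VERSION file that Postgres writes into a cluster's data directory, and the function names and the temporary directory used in the demo are hypothetical.

```python
import tempfile
from pathlib import Path


def parse_version(text):
    """Turn a version string such as '9.2' into a comparable tuple (9, 2)."""
    return tuple(int(part) for part in text.strip().split("."))


def needs_upgrade(data_dir, target_version):
    """Return True if the cluster in data_dir predates target_version."""
    version_file = Path(data_dir) / "PG_VERSION"
    if not version_file.exists():
        return False  # no existing cluster on disk, nothing to migrate
    return parse_version(version_file.read_text()) < parse_version(target_version)


# Demo against a throwaway directory standing in for the shared data volume.
with tempfile.TemporaryDirectory() as data_dir:
    (Path(data_dir) / "PG_VERSION").write_text("9.2\n")
    print(needs_upgrade(data_dir, "9.3"))
```

Because the database lives on the host volume rather than in the image, a check like this has to run at bootstrap time, after the new base image has already brought the newer Postgres binaries with it.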
AVDI:
I have to say, this all makes me super mad. [Chuckles]
AVDI:
And I’ll tell you why. Because I used to work in a project that was basically about doing, sort of like doing CI in the cloud except it was more about accelerating test runs using parallelism than it was about CI. And this was long before, years before any of the current CI in the cloud services. And you know, we’ve discovered quickly that every project, every significantly-sized project had its own weird setup, with special services that needed to be running in order to run their test suite. And I remember evaluating the lightweight container solutions that existed at the time.
SAM:
Yeah.
AVDI:
[Chuckles] And they were all just too immature.
SAM:
Yeah, and it’s moved so much in the last couple of years. It’s amazing.
AVDI:
And yeah, it seems like it moved very quickly in just a couple of years, because for a long time, this technology has been part of Linux for ages.
SAM:
Yes.
AVDI:
But in a dusty, ill-used form for a long time.
SAM:
Yeah, I think the interfaces as well. The beauty about Docker is that they went and said, “Look. Let’s make this simple. So, we’ll introduce an abstraction in front that just takes away a lot of this mess,” because running LXC at the time is really, really complicated. Those commands are nasty and the files that are used to configure these containers are nasty, complicated. So, they just said, “Look. We’ll give you less features but we’ll simplify it and we’ll streamline it and we’ll take care that it isn’t buggy.” And that’s what they did. So, that’s how they managed to push these technologies out there, just by putting an adaptor on top of it, basically.
AVDI:
Yeah, which is wonderful, because I think it was very easy to set something up and be like, “I think that I’ve limited this machine’s memory. But I’m not actually sure.”
SAM:
Yeah, that’s true. That’s very true. Those are some of the more complicated parts.
CHUCK:
So, I have a few more questions here. One is, you said that some of this stuff lives outside of the container?
SAM:
That’s correct.
CHUCK:
So, how do you manage all of that? I guess it’s all in that launcher script that’s there.
SAM:
Yeah. Docker has this concept of you can have these shared volumes with the host.
CHUCK:
Okay.
SAM:
So, you can say, “Look, this directory on the host is this directory inside the container.” So, that’s basically how you can share data. You can share sockets that way as well, if you want to. And you can also share ports. You can say, “This port on the host is this port on the container.” So, it’s actually fun when you’re in a container and you can just listen on port 80 and you don’t have to worry about who’s on port 80.
CHUCK:
Yeah.
SAM:
But then, externally you just say, “Look, forward port 80 to port 2023.” So, that’s basically how you manage, I guess boundaries between the container and your host, by just sharing volumes or exporting ports.
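[Illustration, not part of the recording: a small Python sketch of how shared volumes and forwarded ports show up on a `docker run` command line, using the `-v host:container` and `-p host:container` flags. The helper name, the image name, and the paths are hypothetical. The port pair mirrors Sam's example of a container listening on 80 and the host exposing 2023.]

```python
def docker_run_args(image, volumes=None, ports=None):
    """Build the argument list for a detached `docker run`."""
    args = ["docker", "run", "-d"]
    for host_dir, container_dir in (volumes or {}).items():
        # -v maps a directory on the host to a directory in the container.
        args += ["-v", f"{host_dir}:{container_dir}"]
    for host_port, container_port in (ports or {}).items():
        # -p maps a port on the host to a port in the container.
        args += ["-p", f"{host_port}:{container_port}"]
    args.append(image)
    return args


# Hypothetical image and paths, for illustration only.
command = docker_run_args(
    "example/app",
    volumes={"/var/example/shared": "/shared"},
    ports={2023: 80},
)
```

The resulting list can be handed to `subprocess.run(command, check=True)` on a machine that has Docker installed.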
CHUCK:
So, one other question I have is in Discourse itself, it will tell you when there’s an update and you just click on the button. And then you click a link or maybe two links.
SAM:
Yeah.
CHUCK:
Depending on what has to be updated. And it just goes in and updates it. So, how much of that is Docker and how much of that is Discourse?
SAM:
That’s 100% Discourse actually, because the container’s up and running and it’s finished at that state. Now, you have two options when you want to upgrade it. You can either just burn it and start from scratch, which is what we actually do for our production deploys. We just burn everything and start from absolute scratch. And we don’t rebuild the base image, but we’ll rebuild all the configuration. And the other option is if you know exactly how everything is wired up and all of the pieces, all of the layer pieces, are exactly in the right space, at that point, you can say, “Look. I know that you make this little change here and this little change there and you can update the system.” And that’s what Discourse Docker does. It knows how the file system is laid out. It knows where everything is. It knows that it can send a signal to the Unicorn master process that can take care of restarting it. So, that just receives a Linux signal. It will just stop it and start it without people feeling any outage. So, the way Unicorn does it is it just spins up the new Unicorn and it slowly weans all the connections off the old Unicorn and puts them on the new one. So, we know how everything is wired up and that’s how we can do the Discourse Docker thing. It just does a Git pull, it precompiles assets, and then it tells everything to restart. And that’s how you’re getting that experience there, which is great because nobody experiences an outage. These are things that with Passenger, you’d have to pay the money to get the Passenger that does these zero downtime deploys and stuff like that, whereas we’re able to get all of that for free. And there are a bunch of other things that people are getting for free, like they’re getting out-of-band GC for free, because the way we’ve got Unicorn configured, it does out-of-band GC for us. They’re getting to use jemalloc, which if you told people, “Configure Ruby but don’t just configure it. Configure it this way. You have to use LD_PRELOAD this then that.”
AVDI:
[Laughs]
SAM:
“And then you have to set these environment variables,” and you’re going, “Oh, come on. Really?” So, we take care of that for you, I guess. And that makes everything run a lot faster. And recently I’m thinking, “Oh yeah, might as well add DoS protection for all of these people.” So, I can just amend our Nginx config. I think about Discourse as an app now. I don’t think about it as, “Well, this is what I can do and this is what I can’t do.” I can think about it holistically as, these are all the dependencies and this is how I can amend my dependencies and make them work better to have the whole experience. I think that’s a very healthy way to think about applications you’re building.
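[Illustration, not part of the recording: a Python sketch of the in-place upgrade Sam outlines (Git pull, precompile assets, restart without an outage), written as an ordered list of shell commands. `USR2` and `QUIT` are the signals Unicorn documents for starting a replacement master and retiring the old one. Whether the Discourse updater issues exactly these commands is an assumption, and the function name is hypothetical.]

```python
def upgrade_steps(unicorn_master_pid):
    """Return the shell commands for an in-place upgrade, in order."""
    pid = str(unicorn_master_pid)
    return [
        ["git", "pull"],
        ["bundle", "exec", "rake", "assets:precompile"],
        # USR2 asks the running Unicorn master to start a new master
        # (with the new code) alongside itself.
        ["kill", "-USR2", pid],
        # QUIT lets the old master finish in-flight requests and exit.
        # A real script would first wait until the new master is serving.
        ["kill", "-QUIT", pid],
    ]
```

Each entry could be run with `subprocess.run(step, check=True)` from the application directory inside the container.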
JAMES:
It’s like the cruise ship experience. Everything’s already taken care of for you.
SAM:
[Laughs] Yeah, that’s true. But yeah, it’s fascinating.
CHUCK:
Well yeah, the thing that I see is that you could set up a system like this for just a generic Rails app and then from there…
SAM:
Exactly.
CHUCK:
Customize it to whatever you want, or whatever.
SAM:
Yeah, definitely.
CHUCK:
So, then you just think about it as little boxes that you stack in wherever you need them.
SAM:
Yeah, exactly. And I think Hongli has been trying to do that. He’s got a base image that he works on. He calls it passenger-docker. And you can link to it in the show notes. And he’s trying to give everybody a base image that they can build on. And it uses a lot of the same concepts that we use. He uses runit. We use runit. It’s actually very similar in lots of ways. We ship Postgres 9.3, and we wanted to have very tight control of our dependencies. So, we forked efforts and we build our base image and he builds his base image. But I definitely recommend either looking at ours or his. A very interesting thing is that if you want to use Discourse Docker to deploy your own Rails app, you can look at what I did for logster. And I’ll put a link through. And logster is like a little Rack middleware that you can just insert in any Rack app or Rails app that allows you to look at the logs in a GUI. And we built this for Discourse because it was getting impossible to manage how our customers look at logs. We didn’t want our customers to go, “Ah yeah.” We didn’t want to go, “Ah yeah, go to the file system and look at this file and tell me what’s going on there.” We wanted to give them a GUI so they can look and tell us what’s going on. So, I built logster to solve that problem. And something very fascinating about this is that it has this little folder in it called docker, which contains the Docker container configuration. So, I can actually go in and update this little YAML file and it will seamlessly deploy that website for me automatically without me needing to think about anything. And that’s nothing to do with Discourse, which is fascinating as well.
AVDI:
I have another question.
SAM:
Sure.
AVDI:
In a world of Docker, does Vagrant still matter?
SAM:
Well, you would still…
CHUCK:
Yeah, you use Chef to install Docker, and then you go from there, right?
SAM:
You actually use Vagrant to boot up a boot2docker image, because now Docker doesn’t work natively in Windows or on Mac. So, they’ve got a base VM they use.
AVDI:
Oh, okay.
SAM:
So, you’d use Vagrant to boot that up and do some minimal orchestration to start off.
AVDI:
Okay.
SAM:
But you can use Docker for a large amount of it. Recently I’ve been thinking of moving my dev environment to Docker as well and just using, doing all of my dev in a Docker image. And that way, I don’t need to, because there’s a lot of configuration needed for a production system. A dev system multiplies that by two. And that’s what you need to do for it. So, it would be nice for me to get rid of all of these magic voodoo things that I did on my computer to get it to go.
AVDI:
Right.
SAM:
And instead just use pristine images. And then also, it makes it easy. Want to hack on Discourse? Well, two seconds, and just use this image.
AVDI:
Interesting.
CHUCK:
Ah. Vagrant will also provision to Docker and there is a plugin that will allow you to do LXC, or the Linux containers.
SAM:
Yeah. I wouldn’t… LXC would be handled by Docker, so you wouldn’t really use that [inaudible] Vagrant. You’d use its provisioning pieces. But yeah, definitely, it doesn’t say that all of these tools, Chef, Puppet, don’t need to live in existence with it. It’s just you can mesh pretty much all of these tools with Docker.
AVDI:
Yeah, I’m just trying to figure out where they all fit together.
SAM:
Okay, yeah. So, the Chef and Puppet would fit where we use pups, which is a super dumb YAML-based bootstrapper. So, you’d use Chef if you’re more comfortable with Chef over there, because those tools give you more flexibility. So, you’d think in the Dockerfile, it’s where the trivial stuff lives, where you just want to run a Bash script basically, or take a file and chuck it in there. Those are the kinds of things that you’ll do. But whenever you want to do anything more complicated, then another tool is probably a very good idea to put in there.
AVDI:
Yeah, then Vagrant still fits in as a way to coordinate bringing up development environments?
SAM:
I’d say Vagrant sits there as yeah, just a way of just starting up, your general boot up button. That will just bring, I guess the boot2docker image if you’re using Docker. So, you’d bring that into Vagrant and then you boot that up, and then you’d have Vagrant orchestrate the initial commands that Docker needs to run.
AVDI:
Okay.
SAM:
Like Docker building this image and then Docker running that.
AVDI:
Okay.
SAM:
Whereas historically, you’d use Vagrant. You’d use Vagrant and then you’d use Chef to coordinate that all the way through. But with the Docker approach, I think it’s a little bit faster and it’s a lot cleaner. It feels to me cleaner, because you’re dealing with very, very pristine states.
AVDI:
But I guess for me, developing on Linux, it might be… I wouldn’t need the VM part of that.
SAM:
Yeah, you wouldn’t, exactly. So, for you developing on Linux, if you move to a Docker-based setup, then you wouldn’t really need any of that. You could just use Docker straight.
AVDI:
Okay.
SAM:
And just have a few Dockerfiles that do that. The key I think is to start small, because a lot of this can be daunting when you look at doing ops. Because at the end of the day, you’re like, “Look, I built this magical castle,” but it had to start from a little wall.
JAMES:
[Chuckles]
SAM:
And a little of the foundation and all that. So, the key to actually be successful and be able to use these technologies is to start really, really small and say, “I just want to do this little thing,” and then slowly build on it and become more confident with the tools. So, don’t set yourselves goals that are just too, too hard to achieve, because that’s the way to set yourself up for failure.
AVDI:
If I want to start deploying apps using Docker, what is the simplest way to do that? Is there a Heroku of Docker?
SAM:
There is this thing called Dokku, which is a bunch of Bash scripts that use it. There isn’t a clear winner. There’s a whole bunch of little tools out there, what I mentioned, the passenger-docker stuff, the Discourse Docker stuff, the Dokku stuff. There’s another thing called Fig. I’d recommend looking at them all and seeing where you’re comfortable at and which one you’re comfortable using. I don’t think there’s a clear winner now of this is the way you should do it. Interestingly, in Docker one, in the Docker… they had a Docker conference this week. And Solomon, who created Docker, talked about a new piece they’re adding to this whole Lego castle that they’re building called libswarm. And they’re trying to abstract away the orchestration part of it. So, they’re trying to create a language that is common between all of these providers that do this kind of deployment stuff to say, “Okay, in Docker Land you’ll say put this image on this machine,” and then you can choose whatever provider you want that knows how to put an image on a machine. And that’s what libswarm is trying to solve. So, they’re trying to solve that other problem of moving these pieces around and provisioning stuff up and down. So, this stuff is really, really early. And a lot of the stuff with Docker, you’re going to have to get your hands dirty and you’re going to have to build experience with it. I’d say the best way to start, if you want to deploy Docker in production would be just to figure out how to run your tests inside the container, because once you’re able to run your tests, you’ve basically figured out how to spin up the whole container. And at that point, moving to being able to deploy it is actually a small step.
AVDI:
Okay.
SAM:
So, that’s a great, great way to start.
AVDI:
And then deploying it would be a matter of spinning up a machine somewhere and then just going in and setting it up using Dokku or something like that.
SAM:
Yeah. The way our production system works is that we build a base image for everybody. So, that’s just standard stuff. And then we publish it to a private registry. So, there’s this concept in Docker that you can have a private registry of images that isn’t shared with the world. And then we just run SSH commands on the various computers to say, “Yeah, just pull this base image and start it.” So, it’s literally three commands to start up our images wherever they need to be started. So, that piece is really simple once you’ve got your own private registry, because you definitely don’t want to put all of your passwords everywhere. And your software might be closed source. You may not want to share it with the world. So, you can still get to use the Docker system by doing that.
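[Illustration, not part of the recording: a Python sketch of a "three commands over SSH" deploy against a private registry. Sam does not name the three commands, so the pull / remove / run sequence here is an assumption, as are the host name, registry address, and container name.]

```python
def deploy_commands(host, image, name, host_port, container_port=80):
    """Return the ssh invocations that replace one running container."""
    remote = [
        # Fetch the new image from the private registry.
        f"docker pull {image}",
        # Remove the old container, stopping it if it is still running.
        f"docker rm -f {name}",
        # Start the new container on the same fixed host port.
        f"docker run -d --name {name} -p {host_port}:{container_port} {image}",
    ]
    return [["ssh", host, command] for command in remote]


# Hypothetical names, for illustration only.
plan = deploy_commands(
    host="web1.internal",
    image="registry.internal:5000/example/app",
    name="app",
    host_port=2000,
)
```

A CI job could loop over its hosts and run each entry of `plan` with `subprocess.run(step, check=True)`.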
AVDI:
And I guess the one prerequisite to that is that you configure your hosts to have access to the Docker repository and to have Docker installed.
SAM:
That’s correct.
AVDI:
Okay.
SAM:
And also, you have to start thinking about if you’re going to be doing that, keeping Docker up to date, because it’s moving so fast. So, you have to have some scripts to update Docker.
AVDI:
Right.
SAM:
And update whatever configuration you need. But it’s been, for us, tremendously successful, especially because we’re doing a lot of virtual hosting. So, some people want this plugin and some people want that plugin. So, just having the ability to spin up a ton of images for tons of customers and some people are on this version with these plugins, and some people are on that version with that plugin. And none of this stuff interferes with each other and I don’t have to worry about crazy file system structures or virtual machines, et cetera. It’s been incredibly successful for us.
CHUCK:
So, do you just create separate versions for those plugins?
SAM:
Yeah. I create a separate image. I create a separate image for them and I publish it to our repository. And then I load balance it. So, we’ll run, for each customer, multiple images on multiple machines, and we’ll load balance them using HAProxy. And those images are mirror images across multiple machines.
CHUCK:
And where do you store the actual images? Are they just up in the cloud?
SAM:
No, no. They’re stored locally, because these images get big. You’re talking about one and a half gigs. You don’t want to have a deploy take forever because you’re pushing a one and a half gig image that you’re going to throw away later.
CHUCK:
Yeah, but you said you put them into your repository. So, where is that? Is that just a server somewhere?
SAM:
Ah, so the Docker registry, our internal Docker registry, runs locally. It’s cool: to get your Docker registry running, all you have to do is type docker run registry. [Chuckles] And then you’ve got your own local registry running. So, we’ll actually throw that away once in a while as well to get new versions and whatnot.
AVDI:
So, you have a machine somewhere that’s just sitting there running that?
SAM:
Yeah. Yeah, pretty much.
AVDI:
Okay.
CHUCK:
And then it’s only available to your internal network?
SAM:
Correct. So yeah, we publish the public image that we build off to everybody. But yeah, the private images, with customer configurations and plugins and assets precompiled, those ones just live locally, which is cool.
CHUCK:
That is very cool.
SAM:
Yeah. And it’s cool that we do our smoke tests on the actual production image, because you can throw it away later on. So, we create this production image and then we’ll run it a bit and we’ll access the website and make sure that, “Ah yeah, you can get to the homepage. You can go to the topics page. You can go here. You can go there.” And once we’re happy with it, we just throw away all of that work that we just did and say, “Ah yeah, this image, this version, it’s good.” And then we’ll just publish that to GitHub and we’ll tell GitHub, “This is where the last good build was.” And then everybody can benefit from our internal smoke test that actually ran Discourse and accessed pages on Discourse.
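[Illustration, not part of the recording: a Python sketch of a smoke test in the spirit Sam describes, requesting a few pages from a freshly started container and reporting which ones fail. The function names and the example paths are hypothetical. The `fetch` parameter exists only so the check can be exercised without a running server.]

```python
from urllib.request import urlopen


def fetch_status(url):
    """Return the HTTP status code for a GET request."""
    with urlopen(url, timeout=10) as response:
        return response.status


def smoke_test(base_url, paths, fetch=fetch_status):
    """Return the paths that did not answer with HTTP 200."""
    failures = []
    for path in paths:
        try:
            ok = fetch(base_url + path) == 200
        except OSError:
            # urllib reports connection problems and HTTP error
            # responses as OSError subclasses.
            ok = False
        if not ok:
            failures.append(path)
    return failures
```

An empty result would mark the image as good. Anything else names the pages to investigate before publishing the build.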
AVDI:
That’s pretty cool. And so, I guess you have, for that image you have Postgres running inside so that that gets thrown away as well?
SAM:
Yeah. It doesn’t matter, yeah. So, you throw it away.
DAVID:
So, we’ve been talking about this model of Docker as shipping containers. And shipping containers clearly have a many-to-one relationship to a piece of hardware. Is that true? Can a container span multiple machines?
SAM:
No, no, no. Containers are on…
DAVID:
Okay. So, if you wanted to deploy a large SOA architecture using Docker, you’d have to have containers for each service or at least for each machine in the cluster.
SAM:
Well, in general they’d recommend if you’re doing something like that, you’d want to have a container per service, at least.
DAVID:
Okay.
SAM:
And then you’d have a coordinator that does all of this orchestration. And there are a bunch of things out there that are starting to gain a lot of traction, like CoreOS would be one thing that I’d definitely look at if you’re looking at doing that kind of stuff.
DAVID:
Yeah.
SAM:
There’s OpenStack as well. So, those kinds of things, tools that are out there that deal with, “I need to coordinate multiple machines and orchestrate these big architectures.”
DAVID:
Yeah. So, you’d actually have a coordinator living in one of these Docker containers then?
SAM:
Potentially.
DAVID:
So, if you wanted to…
SAM:
It’s not a requirement.
DAVID:
Okay.
SAM:
Because it’s just a coordinator. It can be anything or anywhere.
DAVID:
Yeah.
SAM:
You can set it up however you want to set it up.
DAVID:
Okay, very cool.
SAM:
But yeah, there is a huge advantage of, even for us internally we started moving to a model where everything I try to get deployed in Docker, mainly because I want to document it and to know how to do it. So for example, there’s this thing called logstash that I set up in the past. And I just, I’m kicking myself for setting it up the hard way, because now I don’t know how to update it. And it’s stopped working. And I’m very, very frustrated. Whereas the other thing that I set up was we’ve got the graphs that use StatsD, which is very, very cool. If you’ve seen Graphite and StatsD, it’s something. It’s like your own private New Relic in many ways. It just gives you graphs of what’s going on. And that I set up using Docker and I made that image public, so anybody can use it. But the whole experience of updating it later on just became a pleasure, because I’d just rebuild the image and just spin it up and I’m done. So, I don’t need to worry about all of this stuff.
DAVID:
Yeah.
SAM:
So definitely, services is a great, great thing. And certain things, it takes a while to figure it out, like setting up Java and how do I do it? And Elasticsearch is also, got its own little quirks. And Elasticsearch is like a hydra. It just tries to look for all of the other Elasticsearches around. And sometimes, you want to contain it so it doesn’t require Elasticsearches everywhere. So, it’s really handy to have Docker containers for these kinds of things that are complicated to set up and you want to contain properly so they don’t walk through your environment and do things that you don’t want them to do.
JAMES:
So, the sandboxing idea still applies on some level.
SAM:
Yeah, definitely. It totally applies. And it’s great. It’s great to have your own IP address that you can just run [inaudible] on and do whatever you want with.
JAMES:
But when you have containers coming and going, how do you handle things like HAProxy or whatever, knowing where to send requests?
SAM:
Oh, well we dedicate ports, in the ports that we map out. So internally, it’s running at port 80 and externally, it’s running in port 2000. And that will be forever. And that’s in our configuration file. So, HAProxy knows to always look for port 2000 on machine X and it expects a service to be there. I know there’s a lot to chew here.
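[Illustration, not part of the recording: a Python sketch that renders an HAProxy `backend` section from a fixed list of machine and port pairs, matching the idea that the proxy always expects a service on a known port. The backend name, server names, and hosts are hypothetical, and round-robin balancing is an assumption.]

```python
def haproxy_backend(name, servers):
    """Render an HAProxy backend section for (label, host, port) entries."""
    lines = [f"backend {name}", "    balance roundrobin"]
    for label, host, port in servers:
        # "check" turns on health checks, so a container that is being
        # rebuilt drops out of rotation until it answers again.
        lines.append(f"    server {label} {host}:{port} check")
    return "\n".join(lines) + "\n"


# Hypothetical machines, for illustration only.
config = haproxy_backend(
    "customer_a",
    [("web1", "10.0.0.1", 2000), ("web2", "10.0.0.2", 2000)],
)
```

Because the ports never change, this section only needs regenerating when a machine is added or removed, not on every deploy.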
JAMES:
Yeah.
DAVID:
Yeah, yeah.
JAMES:
It’s a big, it’s a big thing.
SAM:
Yeah. But it’s very exciting and it’s version 1.0 now. It’s no longer this experimental thing the kids are playing with. The company’s been funded, so it’s going to be around for a while, and it’s looking, it’s definitely picking up. And I think it’s really good for us, because for a while it felt for me in the Rails world that this problem has been solved and the solution is Heroku. And then everybody just used Heroku. And for me, it didn’t really sit right, being a control freak and wanting to know how everything works and have control of these things. And it’s also fairly expensive. Really, if you look at the pricing of Digital Ocean versus Heroku, they’re completely different scales of pricing. So, it’s very interesting to see that these things are now coming and becoming more accessible to everybody. So, anybody can be their own cloud provider. [Chuckles]
JAMES:
I just noticed this minutes before we did this call. It was in Ruby Weekly this week, so thank you Peter. But there is an open source app called dawn which aims to be the Heroku platform as a service using Docker. Have you seen this yet?
SAM:
I haven’t seen it.
JAMES:
Yeah.
SAM:
But I’m not surprised, because [chuckles] every few weeks, a new one of these is going to pop up and have been popping up as well.
JAMES:
Ah, cool. So, yeah.
SAM:
It’s definitely a great sign that there’s a lot of stuff operating in this ecosystem. And it’s always a great sign for technology.
AVDI:
You just mentioned Digital Ocean.
SAM:
Yeah.
AVDI:
And it’s worth noting that they actually have a Docker application, as they call it.
SAM:
Yeah, that’s right.
AVDI:
So, when you start up your image, you can actually select Docker on Ubuntu and I guess you’ll get an image. I have never, I haven’t done this. I’m just looking at this on the web. But I guess you’ll get an image which is just already running with Docker installed.
SAM:
Yeah, that’s correct. So, that saves you a whole bunch of pain. Also, with Docker, you definitely want to be on the latest stable version. And it’s unfortunate that Ubuntu 14.04 just came out and it ships with Docker so you can do that to get Docker out of the box. But you get version 0.9, which is already considered old.
AVDI:
Right.
SAM:
So, you definitely want to use the official Docker repos when you’re playing with this stuff.
AVDI:
Which thankfully, they do have Debian/Ubuntu repos so you can just add those and you can just pull straight from them.
SAM:
Yes. Yes, definitely.
AVDI:
Which is always nice.
SAM:
Yeah. But this is just like the days of Ruby where you’d never do apt-get install ruby unless you’re a bit crazy.
AVDI:
[Laughs] Yeah.
CHUCK:
Have you played much with Docker on Mac OS or anything like that?
SAM:
Well, you don’t… the only way to play it is with a system that supports it. So, you need to be on Linux. So, you’re going to have to start from boot2docker, which is like the super lightweight VM that just runs the kernel pretty much and is [inaudible] Docker.
AVDI:
Here’s a stupid question. If I’m running, let’s say Ubuntu 13.10…
SAM:
Yeah.
AVDI:
Can I run a Docker image which is Ubuntu 14.04?
SAM:
Yes. So, there’s the kernel thing. You’re going to be running in the kernel of the host. That’s just a kernel. The distribution itself is not usually dependent on the kernel.
AVDI:
Okay.
SAM:
Very rarely will it be dependent on the kernel. So, all you want to do is make sure that you’ve got a recent enough kernel to run Docker. So, if you’re running 12.04 you want to make sure that you’re on the latest 12.04 and make sure that you get kernel 3.8 or above.
AVDI:
Okay.
SAM:
So, as long as you’re on that kernel, you’re fine. So for example, on 3.8 I can even run 14.04 which is using a different base kernel really, in all of their builds, but it runs just fine.
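[Illustration, not part of the recording: a Python sketch of the "kernel 3.8 or above" check Sam mentions, comparing the host's kernel release against a minimum. The function names are hypothetical, and the 3.8 threshold is taken from the conversation rather than from current Docker requirements.]

```python
import platform
import re


def kernel_version(release):
    """Parse the major and minor numbers from a release such as '3.8.0-29-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def kernel_supports_docker(release=None, minimum=(3, 8)):
    """Return True if the kernel release is at least the minimum version."""
    if release is None:
        # Fall back to the kernel of the machine running this script.
        release = platform.release()
    return kernel_version(release) >= minimum
```

Comparing tuples of integers avoids the string-comparison trap where "3.13" would sort before "3.8".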
AVDI:
Interesting. Okay, so as long as it’s a system call compatible kernel version, you can actually have that, the whole upgraded OS. You’re just running it on an older kernel.
SAM:
Yeah, exactly, which is supported in Linux. They always support taking your kernel back or pulling it up a bit.
AVDI:
Right. It’s just occasionally there are systems-based services that might…
SAM:
Depend on a particular syscall that didn’t even exist in the kernel.
AVDI:
Exactly, yeah.
SAM:
So yeah, you definitely want to be a bit careful about that and ideally be on the latest kernel. But it works fine. We haven’t seen any issues in being on a different version.
CHUCK:
So, one other question I have for you is would it make sense to set up a server with multiple containers with multiple Discourses in them or other Rails apps in them?
SAM:
Yeah.
CHUCK:
And then, do the load balancing across your containers on one box?
SAM:
I actually do that in some places. And the advantage that you get is that if you have a data container and a web container, then you can basically spin up a new web container with the old one running and then just swap it across. And if you chuck HAProxy in front of that, then you can move it across without any downtime. So, the process would be HAProxy is balancing these two containers that run on different ports. Then if you want to rebuild it, then you take one of them out, you rebuild the image, you swap it in, and then take the other one out. So, you can use that technique of running multiple containers on one box to get zero downtime deploys at that level, if you want to do a full software rebuild and the Unicorn restart isn’t good enough for you.
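[Illustration, not part of the recording: a Python sketch of the rolling swap Sam describes, where containers behind the load balancer are taken out, rebuilt, and put back one at a time. It only plans the order of operations. The action names and container names are hypothetical, and the actual HAProxy and Docker calls are left out.]

```python
def rolling_rebuild(containers):
    """Return (action, container, in_rotation) steps, one container at a time."""
    in_rotation = list(containers)
    steps = []
    for name in containers:
        # Take one container out of the load balancer...
        in_rotation.remove(name)
        steps.append(("drain", name, tuple(in_rotation)))
        # ...rebuild it while the others keep serving...
        steps.append(("rebuild", name, tuple(in_rotation)))
        # ...and put it back before touching the next one.
        in_rotation.append(name)
        steps.append(("enable", name, tuple(in_rotation)))
    return steps


plan = rolling_rebuild(["web_a", "web_b"])
```

With two or more containers, every step in the plan leaves at least one container in rotation, which is where the zero downtime comes from.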
CHUCK:
Besides the added complexity, are there any downsides to that approach?
SAM:
The added complexity is about it. [Chuckles]
CHUCK:
Okay.
SAM:
We run HAProxy in front and I strongly recommend it if you’re doing anything like enterprise-y scale, because HAProxy does a whole bunch of stuff really, really well. And the load balancing pieces and the throttling pieces are just fantastic. And it’s a very, very great piece of software. So, we actually even terminate SSL on our HAProxy and then just use HTTP internally everywhere. So, we’ve just got one point where we’re terminating SSL.
CHUCK:
Alright. Anything else we need to talk about with Docker or deploying with Docker?
SAM:
[Chuckles] It’s a very, very big, big, big kind of change to the way stuff had been. So, I think everybody should look at it and should play with it and should start small. Don’t expect to do a Rails app, because Rails apps are very big and complicated. If you want to play with that, then I think a great way would be to reverse engineer what Discourse is doing because it’s all out in the open and all the configuration files are there. So, just work through it and see what it is that we did to make it. And then just take those lessons. And it doesn’t mean you have to use our tool, but at least the lessons are good enough to apply everywhere. The worst thing that can be is if you’ve got these systems that perform magic and you have no idea how the magic works. So, I think the key is to having a good understanding on how you would build something like this. And that will allow you to diagnose stuff if it goes wrong. You don’t want to move all of your infrastructure to Docker and then just have one person that can debug it. [Chuckles]
That’s kind of a problem. So, if any place is moving to more of a Docker-style deploy, everybody needs to be trained on this. Everybody needs to learn about this and have a good understanding of how it works all the way down.
CHUCK:
One other thing that just occurred to me is if I am going to deploy, traditionally I’ve used things like Capistrano.
SAM:
Yeah.
CHUCK:
So, do you do the same kind of thing to deploy Docker containers?
SAM:
Yeah.
CHUCK:
And then your recipe essentially builds the image instead of checking [inaudible]?
SAM:
Yeah. It’s so simple really, because it’s running three Docker commands. So, a Bash script is enough to do our deployments. Do we really need to bring a tool in place when three Bash commands and running them over SSH does the job? Yeah, we don’t really need this.
CHUCK:
But don’t you have a whole bunch of instances?
SAM:
Yeah. Ah, we use Jenkins to manage all of those. So, we’ve got a big Jenkins thing. We just click the button whenever we want to deploy.
CHUCK:
I see. And then it goes to each machine.
SAM:
Yeah, and runs those commands.
CHUCK:
And says, “Run these commands.” Okay.
SAM:
Yeah, which works really well. And yeah, you definitely need to have some sort of frontend. Otherwise, it’s going to get a bit crazy.
AVDI:
The thing that excites me about this is I’ve been wanting to spend more time just building very, very small play applications in different languages, just enough to learn that language. And I feel like this may provide a great way to easily deploy those play applications. Does that seem like a reasonable use for Docker?
SAM:
Oh, yes. To me, that would be the perfect thing. This is the perfect vehicle for polyglot applications that have complicated dependencies.
AVDI:
Like if I want to build an application in Idris or something.
SAM:
Yeah.
AVDI:
Just to hipster namedrop for a moment.
JAMES:
Did you just make that up?
AVDI:
[Chuckles] I did not.
JAMES:
Wow. Okay, cool.
DAVID:
I haven’t even heard of it.
SAM:
Me too. Wow. [Chuckles]
AVDI:
Oh my gosh. I just got so much hipster points right there. [Laughter]
DAVID:
And as soon as James and I start using it, Avdi’s going to leave. [Laughter]
SAM:
But I’d say yeah. It’s a perfect fit. It was built for this kind of stuff where you need to pull in these custom repos or do a build using make and fun things. Like if you wanted to, say ship an experimental version of something like PhantomJS and that thing takes half an hour to compile, Docker becomes this super blessing there, because you precompile the binary and just ship the binary. And you don’t have that worry of getting that out there and having people wait half an hour.
CHUCK:
Awesome. Alright, well if there’s nothing else, let’s go ahead and do the picks.
SAM:
Sure, yeah. I think that’s about it. Trying to think. Ah yeah, one other thing that I did want to mention that is cool, if people are looking at this, is that we work really hard to make Discourse consume the least amount of memory it can. And to do that, one of the things that we did in our Docker image was we use Unicorn and that uses copy-on-write memory. So, you get a nice 30% saving of memory use between these processes. But then we also built a system that forks out Sidekiq from the master Unicorn process. And that’s cool in that it became practical once we were in Docker to do those kinds of things, because we have a nice, clean environment to experiment with. So yeah, to constrain memory, we’ve worked really, really hard there. And those are things that other people can learn from if they’re dealing with Ruby and they want to try and keep memory down, especially in light of the latest Rubies that have come out and had a few memory issues. 2.1.2 is just out and it’s much better memory-wise. And also talking about Ruby and memory, I think it’s definitely worth mentioning that [inaudible] last week did a very big patch to master to constrain memory even more with the new generational GC. So, that is very exciting and coming in the next version of Ruby at the end of the year.
JAMES:
So, you’re saying that the master process boots up in Unicorn and then you fork the Sidekiq background worker process off of that.
SAM:
That’s correct.
JAMES:
So, it already has your entire environment loaded and you get that copy-on-write memory savings there.
SAM:
Exactly, yeah.
JAMES:
That’s cool.
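[Editor’s note: The forking approach discussed here can be sketched in plain Ruby. This is a minimal illustration under stated assumptions, not Discourse’s actual code: `load_application` and `fork_worker` are hypothetical names, the array stands in for a fully booted Rails environment, and `fork` requires a Unix-like platform.]

```ruby
# Hypothetical stand-in for an expensive boot step (loading Rails, gems, etc.).
def load_application
  Array.new(50_000) { |i| "row-#{i}" }
end

# Fork a worker from a process that has already loaded the application.
# The child inherits the parent's memory; pages stay shared between the
# two processes until one of them writes to a page (copy-on-write).
def fork_worker(app_data, name)
  reader, writer = IO.pipe

  pid = fork do
    reader.close
    # The child can use app_data immediately, without loading it again.
    writer.puts "#{name} (pid #{Process.pid}) sees #{app_data.size} rows"
    writer.close
    exit!(0) # leave the child without running the parent's at_exit hooks
  end

  writer.close
  report = reader.read # returns once the child has closed its end
  reader.close
  Process.wait(pid)
  report
end

# Load once in the "master", then fork the background worker from it.
app_data = load_application
puts fork_worker(app_data, "background-worker")
```

[Because the worker is forked after the data is loaded, it pays no second boot cost, and the operating system only copies the pages that are later modified.]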
SAM:
And it’s pretty fascinating, when you look at a user tool like smem to see where the memory’s sitting at. And we’re sitting at 140 Meg PSS for a big app like Discourse. So, 140 Megs of memory used per process is really low in the Rails land. So yeah, that’s about it I think.
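[Editor’s note: PSS (proportional set size) charges each process its private memory plus an equal share of every page it shares with other processes, which is why it is a fair measure for forked, copy-on-write workers. On Linux the kernel exposes the per-mapping figures in /proc/PID/smaps. The sketch below sums them in Ruby; `pss_kb` and `sample_smaps` are hypothetical helper names, and the sample text is made-up data in the smaps layout, not output from Discourse.]

```ruby
# Sum every "Pss:" field in a Linux smaps dump and return kilobytes.
# Anchoring on the start of the line skips the separate "SwapPss:" field.
def pss_kb(smaps_text)
  smaps_text.scan(/^Pss:\s+(\d+)\s+kB/).flatten.map(&:to_i).sum
end

# Made-up excerpt in the smaps layout, used when /proc is unavailable.
def sample_smaps
  <<~SMAPS
    00400000-0040b000 r-xp 00000000 08:01 1234 /usr/bin/ruby
    Size:                 44 kB
    Rss:                  40 kB
    Pss:                  20 kB
    01000000-01100000 rw-p 00000000 00:00 0 [heap]
    Size:               1024 kB
    Rss:                 512 kB
    Pss:                 512 kB
    SwapPss:               0 kB
  SMAPS
end

smaps_path = "/proc/self/smaps"
if File.readable?(smaps_path)
  puts "This process: #{pss_kb(File.read(smaps_path))} kB PSS"
else
  puts "Sample data: #{pss_kb(sample_smaps)} kB PSS"
end
```

[A tool like smem reports the same kind of total per process, so a script like this is mainly useful for logging the number from inside a running worker.]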
JAMES:
That’s awesome.
CHUCK:
It is so cool. It’s fun to play with.
SAM:
Yeah, it is, definitely. And the easiest way in is to just grab an image, run /bin/bash, and start exploring inside it. You can get a Bash prompt in two seconds if you just do docker run -it ubuntu:14.04 /bin/bash. If you want to hit the ground running and start playing with it, that’s pretty much the easiest thing. You can get a Bash prompt inside a container and start mucking around.
CHUCK:
Alright. Well, let’s go ahead and do the picks.
SAM:
Cool. I’ve got one.
CHUCK:
James, do you want to start us with the picks?
JAMES:
Sure. Just one this time, but Xiki is this thing we’ve talked about in the past on this show. It’s come up a couple of times now. And I even have a hard time describing it. It’s like an interface between shell command, shells, text editors, whatever, and it’s a really modern shell interface is how I would describe it, with custom commands and interactive commands and commands that you can click through and explore, and just this really cool thing. Anyway, it just got put up on Kickstarter for a round of enhancement and integrating with a bunch of text editors. And there’s some pretty cool rewards on this project, like the creator Craig will help you make commands for it if you want and record videos doing that and then share those and stuff. So, if you haven’t seen Xiki, now is a great time to go check it out and see what it is via the Kickstarter page. If you have seen it before and know how mind-expanding it can be and want to see it go to the next level, now is probably the right time to give a little help. So, Xiki Kickstarter, that’s my pick.
CHUCK:
Awesome. Avdi, what are your picks?
AVDI:
I will start with Xero spelled with an X. I spent the last several days importing. I decided I needed to improve my bookkeeping and based on a bunch of recommendations, I decided to go with Xero which is a FreshBooks, not FreshBooks, QuickBooks competitor. So, I’ve been importing all of my records from the last year and a half into it. And so far, I’ve been reasonably impressed with it. It seems like a really industrial-strength bookkeeping application. I will caution anyone who goes out to try it that it is not a simple application at all. It’s not one of these applications where you can just jump in and start getting stuff done without understanding what’s going on. There are features that you won’t even see unless you know how to enable them and stuff like that. But it has really comprehensive documentation and good documentation search and pretty helpful customer service. So, with some study I was able to get things set up the way I needed to. And yeah, it’s been pretty great so far. I’ve been surviving on FreshBooks and some other stuff for a while. But I don’t know. If you use FreshBooks, you know that it’s pretty much oriented just towards invoicing. If your business is more sales-oriented than services-oriented, FreshBooks just, I don’t know, it didn’t cut it for me. So yeah, Xero. And sort of in conjunction with that, I did need to munge a bunch of bank records in order to import them into Xero. And so, one of my other picks is the Ruby CSV library, which I believe was created by our own James Edward Gray.
JAMES:
Never heard of it. [Laughter]
AVDI:
And yeah, that was a huge help in just doing the munging on the CSV files and getting them into a state that made sense for import. So, those are two picks. One other quick pick. I have been through several run tracking applications for tracking my runs. And I’ve been disappointed by a few of them. I first used RunKeeper because it seemed like everybody else was using it. But that was just sad on Android. It did things like it would, when it did its voice coaching, if it had two different things it wanted to tell me at the same time, it would actually talk over itself. [Chuckles]
AVDI:
Which was a fascinating little study in how not to do parallel programming. [Chuckles]
JAMES:
It’s more efficient that way, Avdi, to give you multiple instructions at once.
[Laughter]
AVDI:
Yeah. And so, after that I switched to Endomondo for a while. But I was having a lot of problems with its GPS tracking. Anyway, I finally switched to Runtastic on the basis of a recommendation from a runner friend. And so far, Runtastic has been really nice. So, I recommend Runtastic. And that’s it for me.
CHUCK:
Alright. David, what are your picks?
DAVID:
Just the one. I want to divide our audience into the jerks who knew that season 3 of Sherlock was out on Netflix and didn’t tell me. And you know who you are, because the rest of you have switched off the podcast and gone straight to Netflix to start listening and watching Sherlock. So, those of you that are still here, you’re jerks. And that’s my pick, [chuckles] is Sherlock is out on Netflix.
JAMES:
The third season is so good.
DAVID:
It’s amazing. I love that Mrs. Watson is a real character. They’ve done a good job reimagining her for the modern day.
JAMES:
Totally agree. Very surprising addition.
CHUCK:
Cool. I’ve got a couple of picks. The first one is the Docker try it page where you can actually go in and type in the Docker commands and do all the cool stuff. It walks you through, as a tutorial, how to use Docker. The next pick I have is ‘The Miracle Morning’. And I’ve been doing this for about a week now and it is awesome. The book is on Amazon. It’s by Hal Elrod. And basically, it is a morning routine that has helped me increase my energy. I’ve been much more efficient in getting things done. I just feel better during the day. I can’t say enough good things about it. It is amazing. So, if you’re down or you start running out of energy during the day or you’re having trouble focusing, this really helped me with all that stuff. So, go pick it up. ‘The Miracle Morning’. The other pick I have is a book. It’s a book I’ve been reading for a while. I am probably about two-thirds of the way through it but I have been enjoying it immensely. And that is the ‘Steve Jobs’ biography by Walter Isaacson. And it’s really interesting. I don’t know that I’m going to go and emulate Steve Jobs. But it’s been interesting to see how things went with Apple, Pixar, NeXT, and all of the other things that were going on. So, if you’re interested in Steve Jobs or in the history of Apple to a certain degree, go pick up that book. It’s pretty interesting. And those are my picks. Sam, what are your picks?
SAM:
I have one pick. I’ll pick i3wm. I’ve been using this tiling window manager for a while now and I’m really enjoying it. If anybody hasn’t tried any tiling window managers and has access to Linux, I think it’s very well worth trying. There are a few other options out there as well, but mine’s i3wm.
CHUCK:
Sounds like fun. Before we wrap up, I want to make a couple of announcements. First off, Amy Knight is the winner of the Midwest.io ticket. Congratulations.
DAVID:
Congrats.
CHUCK:
Thank you for listening and posting stuff to us.
JAMES:
Awesome.
CHUCK:
The other announcement is go pick up ‘Refactoring: Ruby Edition’. If you haven’t started reading it, it is pretty cool. And I’ve been enjoying it. I’m a couple of chapters in. And we’re looking forward to talking to the authors. You can also pick up, I think it’s called ‘Refactoring with Ruby’. It’s a companion to that book. So, go read them and we’ll be talking about them on the show. And I don’t think there’s anything else, so we will wrap up and we’ll catch you all next week.
[This episode is sponsored by Codeship. Codeship is a hosted continuous deployment service that just works. Set up continuous integration in a few steps and automatically deploy when all your tests have passed. Codeship has great support for a lot of languages and test frameworks. It integrates with GitHub and Bitbucket and lets you deploy cloud services like Heroku, AWS, Nodejitsu, Google App Engine, or your own servers. Start with their free plan. Setup only takes three minutes. Codeship, continuous deployment made simple.]
[A special thanks to Honeybadger.io for sponsoring Ruby Rogues. They do exception monitoring, uptime, and performance metrics and are an active part of the Ruby community.]
[Hosting and bandwidth provided by the Blue Box Group. Check them out at Bluebox.net.]
[Bandwidth for this segment is provided by CacheFly, the world’s fastest CDN. Deliver your content fast with CacheFly. Visit CacheFly.com to learn more.]
[Would you like to join a conversation with the Rogues and their guests? Want to support the show? We have a forum that allows you to join the conversation and support the show at the same time. You can sign up at RubyRogues.com/Parley.]