Linguistic Antipatterns With Jimmy Koppel - RUBY 583
Jimmy Koppel is the founder of Mirdin. He also has a Ph.D. in programming languages from MIT. He joins the show alongside Chuck to talk about "Linguistic Antipatterns": recurring bad practices in naming and documentation that make programs harder to understand. He begins by sharing some examples, how to identify them, and how to avoid them.
Hosted by:
Charles Max Wood
Special Guests:
Jimmy Koppel
Show Notes
Links
- Advanced Software Design Web Course - Mirdin
- Dropbox - rubocop.md - Simplify your life
- two Ruby case studies
- Linguistic Antipatterns
- Mirdin
- Jimmy Koppel
- Twitter: @jimmykoppel
Picks
- Charles - Forbidden Sky
- Jimmy - Tigris & Euphrates
- Jimmy - Might and Magic series - MobyGames
Transcript
Charles Max_Wood:
Hey, welcome back to another episode of the Ruby Rogues podcast. This week, I'm your host, Charles Max Wood. I don't know where everybody else is, but they're not here. We're here with Jimmy Koppel. Jimmy, do you wanna introduce yourself, let people know who you are, why you're world famous and all that good stuff?
Jimmy:
Yes, Charles, it's gonna be a pleasure. So, I'm Jimmy, and my entire life is motivated by trying to get the world to spend less on software maintenance. So I'm kind of known for two big thrusts. I have a PhD from MIT in programming languages, specifically in the field of what I call meta-meta programming, which is kind of a pretentious name, but it's a lot cooler than software language engineering, which is another name for
Charles Max_Wood:
Right.
Jimmy:
the same thing. Meta-programming, as Rubyists should be well aware, is about programs that generate code or edit programs: tools. And I noticed there are a lot of really awesome tools that people in academia have developed that you can't really use, even though they're so cool and they have studies showing they make you so much faster. And I think it's because they've been really hard to build, and you build it on work and you get it for some special use case. So my PhD largely focused on programs that... So,
Charles Max_Wood:
Mm,
Jimmy:
hence,
Charles Max_Wood:
okay.
Jimmy:
meta-meta programming. I'm also the founder of Mirdin. Mirdin is a code quality company. We take good software engineers and turn them into great software engineers. Our main thing is the web course, the Advanced Software Design course, where we teach software engineers a lot of really advanced techniques that are very broadly applicable to all kinds of languages, but that most people have no familiarity with. We teach people how to make code which is easy to modify and hard to make mistakes in. I've gotten a lot of rave reviews from that, people saying that it's made programming more enjoyable because they feel like they're kind of on top of the world and see what's going on. I've trained about 300 people so far. Now, as far as world famous, I know you were kind of jesting with that, but I actually do have a little bit of a world claim to fame that's outside of programming. About three years ago now, there were the 2020 elections, and you might have heard about some drama around a couple of voting apps. So first there was Shadow,
Charles Max_Wood:
Mm-hmm.
Jimmy:
used to run the Iowa caucuses, and it made the results come out a week late. Then after that there was Voatz, which is an actual mobile voting app, supposedly
Charles Max_Wood:
I've used Voatz.
Jimmy:
blockchain based.
Charles Max_Wood:
I've used Voatz.
Jimmy:
Really? Okay.
Charles Max_Wood:
Yeah.
Jimmy:
Well, Voatz was not used in the federal election in 2020. It was not used in West Virginia because of myself and my co-author, Mike Specter. We found numerous security vulnerabilities in it and got them published in the New York Times and a bunch of other places around the world. That's actually how
Charles Max_Wood:
Nice.
Jimmy:
I got my Chinese name. My Chinese name is based on the characters that a Taiwanese newspaper used for me when reporting the... So that is my world claim to fame.
Charles Max_Wood:
That's cool. Yeah, I've used Voatz. I'm a delegate here in Utah, and we've used it for some party business, to vote in leadership and stuff like that. So, yeah. Anyway, very interesting. So, we brought you on. I don't know if we reached out to you about this or if you proposed it, but: linguistic antipatterns. I find the idea really fascinating. Do you want to start us off just talking about how you define it, how to identify them, and maybe the problems they cause?
Jimmy:
Sure. So, linguistic antipattern. Well, if you want to be kind of crude, you can say it's a fancy word for bad naming, but simply
Charles Max_Wood:
Mm-hmm.
Jimmy:
it's specific ways of bad naming that you can train yourself to identify and avoid, and whose consequences you can then predict. So we have numerous examples of these, some of which have been very bad. There are some basic things, such as: apparently there are quite a few methods in Java that are named like getters. I'm
Charles Max_Wood:
Mm-hmm.
Jimmy:
looking at one right now. It's called getMethodBodies, but it doesn't return anything. It's void. So that caused a little
Charles Max_Wood:
Okay,
Jimmy:
bit of confusion.
Charles Max_Wood:
yeah
Jimmy:
But it gets worse from there. So my favorite example is... This one happened within a major software company where probably most people listening have used their product quite a few times. But I'm not gonna name the company. I'm not gonna give the name of the internal framework. I'll just call it Odrin, which is our fake name for it.
Charles Max_Wood:
Mm-hmm.
Jimmy:
So there's a function called createAsyncRequest in this framework.
Charles Max_Wood:
Huh.
Jimmy:
You get a thing, then you invoke it. And you'd expect
Charles Max_Wood:
Right.
Jimmy:
that it's an async request. And that expectation is so reasonable that one of my former students spent a full week trying to debug this, and they showed it to the creators of this framework, and they couldn't figure it out either. Because
Charles Max_Wood:
Oh no.
Jimmy:
createAsyncRequest does not create an async request. It creates an async request template that must be instantiated. So because the expectation of what this does is so strong, and because there's not the kind of compile-time or runtime error feedback to tell you exactly why you're using it in the incorrect way, this is something that's easy to misread. You would just read this and say it's correct.
Charles Max_Wood:
I can definitely see where that would be an issue. And yeah, I mean, I would expect it to make an async request, right? Not give me a template back that I have to instantiate and work on. So did the framework change it?
Jimmy:
I did not hear the rest of the story and
Charles Max_Wood:
hahahaha
Jimmy:
it's not at all open source so it's not like I can just look it up.
Charles Max_Wood:
Okay.
Jimmy:
My guess is probably not, because: giant company, backwards compatibility.
Charles Max_Wood:
Great. Makes sense. So, I mean, just diving into this a little bit, I've run into issues, right, where I wrote almost all the code, I named crap poorly, right? And then I went back and I used something making the assumption that it did what I called it. And yeah, it's, you know, it's not a unique problem to bigger systems or bigger teams or anything like that. This is something
Jimmy:
Absolutely.
Charles Max_Wood:
that everybody kind of falls into.
Jimmy:
Yes.
Charles Max_Wood:
So how do you identify it? Do you just identify it when people bring stuff up like this, or is there a less painful way to do that?
Jimmy:
Yes, there are a number of things you can use. So one, let's see. I know there are a number of static typing approaches for Ruby right now. I'm not sure how widely adopted they are. But heavy static typing can definitely help a lot. And that's actually a pretty general thing, which is part of the more general idea of asking when some API can be confused for something else. So for instance, one thing I teach is that if, in a typed language, you see a function that takes in three integers in a row, you should
Charles Max_Wood:
Mm-hmm.
Jimmy:
get nervous because that's just inviting people to pass in integers in the wrong order.
Charles Max_Wood:
Right.
Jimmy:
I have a story from over four years ago, where I spent 10 to 20 hours debugging something that was caused by such an error.
Charles Max_Wood:
Okay,
Jimmy:
So.
Charles Max_Wood:
yeah, that sounds interesting.
Jimmy:
So the thing to note about confusability in general, beyond just confusability that comes from linguistic anti-patterns, is to think not just what you might use it for, but what will happen if you have the wrong expectation. So for instance, suppose you have a stack and you can never remember which is push and which is pop. You get them mixed up.
Charles Max_Wood:
Mm-hmm.
Jimmy:
Well, even if you get them backwards, you're going to find out very quickly you got them backwards because they work totally differently.
Charles Max_Wood:
Right.
Jimmy:
But for something that sends, that makes a background request, or has some effects, or something with concurrency, where you usually don't want your program to be overly dependent on how stuff's being scheduled, those are the kinds of things where it gets dangerous, and you should take more care to specifically design things in a way where a mistake in using it is going to cause a compile-time or runtime error.
Charles Max_Wood:
Yeah. I mean, I just want to back up for a second. One thing you mentioned that helps is strong typing, but in Ruby, I mean, we're on a Ruby podcast, we don't have that. So... Is there a way around that? We
Jimmy:
Oh
Charles Max_Wood:
can
Jimmy:
yeah.
Charles Max_Wood:
go into something that takes the arguments in a second. Cause I'd like to talk about that too, but yeah.
Jimmy:
Oh yeah, absolutely. So, heavier use of keyword arguments, for instance. That can stop things from being confused in a different way.
Charles Max_Wood:
Right.
Jimmy:
Also a little more interesting is avoiding using the same names in different objects. So.
Charles Max_Wood:
What do
Jimmy:
and
Charles Max_Wood:
you mean by that, using the same name
Jimmy:
So
Charles Max_Wood:
in different objects?
Jimmy:
this is not a big example, but it's the first one that comes to mind. Say you have a user and it has an .id. And then you have
Charles Max_Wood:
Uh huh.
Jimmy:
a task and it has an .id. So sure.
Charles Max_Wood:
like in Rails.
Jimmy:
Well, that is inviting someone to pass in a task when you're expecting a user, and you call .id. And then you're using a user ID in the place of a task ID.
Charles Max_Wood:
right.
Jimmy:
And I'm not sure what that one is, but there are a lot of others like that. It's like, the same name on a different object means a different thing, but now, without static
Charles Max_Wood:
Mm-hmm.
Jimmy:
typing, you could confuse them. So
Charles Max_Wood:
Right.
Jimmy:
you can intentionally try to make the names more distinct, like you can say user.uid or task.tid instead. Now, that's one source of error gone.
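To make the point about distinct names concrete, here is a minimal Ruby sketch. The `User` and `Task` structs and the `owner_label` method are hypothetical, invented for illustration only; the idea is simply that once the accessors differ, passing the wrong kind of object fails immediately instead of silently yielding the wrong ID.

```ruby
# Hypothetical models for illustration; in a Rails app both would expose `id`.
User = Struct.new(:uid, :name)
Task = Struct.new(:tid, :title)

# Expects a User. Because it calls `uid`, handing it a Task fails loudly
# with NoMethodError instead of quietly using the task's identifier.
def owner_label(user)
  "owner ##{user.uid}"
end

user = User.new(7, "Ada")
task = Task.new(42, "Write docs")

puts owner_label(user)

begin
  owner_label(task) # a Task has no `uid` accessor
rescue NoMethodError => e
  puts "caught: #{e.class}"
end
```

The trade-off is that this departs from the usual `id` convention, so it is a judgment call rather than a rule.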
Charles Max_Wood:
Right, that makes sense. Going back to the idea of the arguments, right? Since Ruby doesn't have the strong typing, I mean, you can pass basically anything into any argument, and as long as it matches the call signature, right? It has two arguments, three arguments, or 10 arguments. It'll at least call it, right? I mean, it may error out further down the line, but it will call it. Keyword arguments, that's something that I use pretty heavily. If it just takes one argument, I may or may not use keyword arguments. Two arguments, I may or may not. But once I get to three, I'm looking for ways to start using keyword arguments because, yeah, I can't keep it straight, right? Even if it seems obvious. A lot of times I'll come back later and go, I thought this order was obvious, but it clearly isn't.
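A small Ruby sketch of that difference, using hypothetical method names (`resize_positional` and `resize` are not from any library): with three positional integers the call site gives no hint about order, while required keyword arguments label each value and reject a misspelled name.

```ruby
# Positional version: three integers in a row are easy to pass in the wrong order.
def resize_positional(width, height, depth)
  { width: width, height: height, depth: depth }
end

# Keyword version: each value is labeled at the call site, and a missing or
# misspelled keyword raises ArgumentError instead of silently misbehaving.
def resize(width:, height:, depth:)
  { width: width, height: height, depth: depth }
end

p resize_positional(10, 20, 30)
p resize(height: 20, depth: 30, width: 10) # order no longer matters

begin
  resize(width: 10, hight: 20, depth: 30) # typo in the keyword name
rescue ArgumentError => e
  puts "caught: #{e.message}"
end
```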
Jimmy:
That's right. Another thing that can be done: I have a general rule about avoiding Boolean arguments, or at least Boolean non-keyword arguments. Because
Charles Max_Wood:
Mm-hmm.
Jimmy:
if I call foo(true), you might have no idea what that means. And there are lots of cases like this. An example I have is in Java: you can call task.await, and then you pass in a Boolean, and it tells it whether this awaiting can be interrupted. But based on what I've told you, you probably don't know whether true means interruptible or uninterruptible.
Charles Max_Wood:
Right.
Jimmy:
So the solution is: every language has some way of doing something like enums.
Charles Max_Wood:
Mm-hmm.
Jimmy:
So you can have an enum value called interruptible and an enum value called uninterruptible. Now you're not going to confuse those. And further, if you read a piece of code that passes in interruptible instead of true, you don't need to look at the docs. You just know what that means.
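Ruby has no built-in enum type, so one common stand-in is a fixed set of symbols. The sketch below assumes a hypothetical `await_task` method (not a real Ruby or Java API) and shows how a named mode reads at the call site, and how a bare Boolean can be rejected outright.

```ruby
# Hypothetical API for illustration; `await_task` is not a real library method.
WAIT_MODES = %i[interruptible uninterruptible].freeze

def await_task(mode:)
  unless WAIT_MODES.include?(mode)
    raise ArgumentError, "mode must be one of #{WAIT_MODES.inspect}, got #{mode.inspect}"
  end

  mode == :interruptible ? "waiting (can be interrupted)" : "waiting (cannot be interrupted)"
end

puts await_task(mode: :interruptible)
puts await_task(mode: :uninterruptible)

begin
  await_task(mode: true) # a bare Boolean is rejected rather than guessed at
rescue ArgumentError => e
  puts "caught: #{e.message}"
end
```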
Charles Max_Wood:
Yep, makes total sense to me. I'm
Jimmy:
No,
Charles Max_Wood:
just trying to think
Jimmy:
it's...
Charles Max_Wood:
through other things, right? I mean,
Jimmy:
Yeah, well it's...
Charles Max_Wood:
some of the things we're talking about are, I recently was reading Clean Code, and a lot of these ideas are in there, not all of them, but
Jimmy:
Alright.
Charles Max_Wood:
several of them are, right? But a lot of times, too, I run into issues, just as an example, with the documentation for the APIs that I've been working with for my client, right? As far as knowing what I can pass in, or how to pass it in, or how to format what I'm passing in, and things like that. And so I see that at kind of a higher level when we're talking about arguments that come in, where, yeah, you're not quite sure what the convention is for that, or what they did in this particular edge case to allow you to do something that's a little bit outside the normal CRUD operations.
Jimmy:
Curious. You said they were doing something a bit outside normal CRUD operations? I think you missed a keyword there.
Charles Max_Wood:
So for example, they do logistics, right? So
Jimmy:
Uh-huh.
Charles Max_Wood:
it's: you order a thing and you want it to show up at your house. Right. And so they do everything from inventory to warehouse, stuff like that. Right. And so, you know, you've got your normal operations, like you have an order, right. And there are a lot of different ways to change the state of an order. And some of it's just data that lives on the order object, or in the database in the orders table. Changing that's an update operation, right? If I want
Jimmy:
Uh?
Charles Max_Wood:
to delete the order, I can call delete on the API. But if I want to change the state of it to canceled or shipped or things like that, a lot of times the documentation isn't clear on how to do that, because it's not your standard update to the system,
Jimmy:
Yeah.
Charles Max_Wood:
if that makes sense
Jimmy:
Yeah, that's, you know, the order is a state machine. And there's like changing the fields. There's changing the state. You expect
Charles Max_Wood:
Right.
Jimmy:
changing the state to affect a lot more things. And it would make sense for
Charles Max_Wood:
Yep.
Jimmy:
it to have a different API for doing so.
Charles Max_Wood:
Yeah. And sometimes that's not entirely clear, because they don't expect you to pass a set of changes to the data that's in the table or things like that. And so it's, okay, so what do I pass? Do you need the ID? Do you need the ID and some other piece of data? Do you need the order number? I'm working on an integration, so do you need the original ID from the integration that created it?
Jimmy:
Yeah, so I'm hearing that the signature or documentation of these functions is not particularly crisp, and also possibly
Charles Max_Wood:
Mm-hmm.
Jimmy:
hearing a mismatch between how the API is designed and how it actually works, which makes it not obvious where to find the various updaters you
Charles Max_Wood:
Right.
Jimmy:
need for different kinds of updates.
Charles Max_Wood:
Yeah. And you know, the reason I'm bringing it up is because we were talking about the way that the functions go together and the way that the arguments are named or the keywords are provided, um, you know, the formats that those things can be in. And it just occurred to me that it doesn't have to be within my code. It could be. As I call out to services, we have some of these same problems. As we dive into this, I'm a little curious, are there other kinds of linguistic anti-patterns that you find pretty commonly? Or maybe you can give us an example of somewhere where this happens within the Ruby ecosystem.
Jimmy:
Yeah, so I was actually hunting through my list of examples an hour or two ago, hoping to find one in Ruby that I already had prepared. But alas, not. There was one that I had remembered as being in Ruby, but it actually was in Python.
Charles Max_Wood:
Okay.
Jimmy:
Although it's possible the Ruby library has the same one. That story is about how, well, in PyYAML there is a function called danger_load. That makes you think that normal load is safe, but it's not.
Charles Max_Wood:
Ha! So what's the difference?
Jimmy:
Oh boy,
Charles Max_Wood:
Now I'm curious.
Jimmy:
I'm gonna have to look that one up. So... Let's see. Let's see... So, the safe behavior. Okay, so YAML, you might have heard some of the rage against YAML, which looks great, beautiful, readable, but it's actually
Charles Max_Wood:
Mm-hmm.
Jimmy:
terribly, totally gnarly. There's an amazing and hilarious website called NoYAML.com that features some of these. But I don't think that even talks about these security vulnerabilities, which were perhaps a full decade ago now, where you can have a YAML file, and when you call something like YAML.load,
Charles Max_Wood:
Uh huh.
Jimmy:
it is going to construct Ruby objects. That
Charles Max_Wood:
Mm-hmm.
Jimmy:
is, you can annotate your YAML so that it's going to construct Ruby objects. And the person providing the YAML file picks which objects. So if there is enough stuff imported, then they can basically do whatever they want to your system, just by you calling YAML.load. So that is super dangerous. In my quick skim right now, I have this article pulled up. I can't quickly find the difference between danger_load and load, but the safe behavior in question is to not run arbitrary code.
Charles Max_Wood:
Right. So the safe version being, yeah, don't run arbitrary code. Does that, how is that not safe then? I mean, what other?
Jimmy:
That is a safe version. Thing is
Charles Max_Wood:
Okay,
Jimmy:
that's a
Charles Max_Wood:
that's
Jimmy:
YAML.load.
Charles Max_Wood:
not what it does.
Jimmy:
YAML.load does not, does not do that.
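For Ruby specifically, the standard library's Psych parser offers `YAML.safe_load`, which only builds basic types and refuses tags that ask for arbitrary objects. A minimal sketch follows; the class name `Gadget` in the tagged document is made up for illustration. As far as I know, Psych 4.0 and later (shipped with Ruby 3.1) also changed `YAML.load` to behave safely by default and moved the old behavior to `YAML.unsafe_load`, so what plain `load` does depends on the version in use.

```ruby
require "yaml"

plain  = "name: demo\nretries: 3\n"
tagged = "--- !ruby/object:Gadget {}\n" # asks the parser to build a Ruby object

# safe_load only builds basic types (hashes, arrays, strings, numbers, ...).
config = YAML.safe_load(plain)
p config

begin
  YAML.safe_load(tagged)
rescue Psych::DisallowedClass => e
  puts "refused: #{e.class}"
end
```

Calling the explicitly named safe variant keeps the intent visible at the call site regardless of which Psych version is installed.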
Charles Max_Wood:
I gotcha. Yeah, I seem to remember there being, oh, I can't remember the term, but yeah, they found a vulnerability in Rails that was related to what you're talking about, or Ruby was related to that YAML. There was some kind of remote code execution that, yeah, people could run on your Ruby apps or Rails apps.
Jimmy:
It's always fun.
Charles Max_Wood:
Yeah, and there was a CVE related to that, but that was a few years ago.
Jimmy:
But still, it's like the original really frightening YAML parsing bug was... I don't remember what year it was, but I remember that Heartbleed was 2014.
Charles Max_Wood:
Mm-hmm.
Jimmy:
And when Heartbleed came out, people were talking about the importance of having a really catchy name for your vulnerability because the Ruby
Charles Max_Wood:
Ha!
Jimmy:
YAML parsing bug was old news at that point and was
Charles Max_Wood:
Right.
Jimmy:
as or more dangerous than Heartbleed, but didn't have such a catchy name. So that puts it at, like, at least a decade old. I guess we're still getting YAML vulnerabilities, so the root cause was not fixed.
Charles Max_Wood:
Interesting. So, I mean, what are we supposed to do? Because there are all kinds of Ruby apps that use YAML to configure them.
Jimmy:
So... And that is a big question. And as always, backwards compatibility. Backwards
Charles Max_Wood:
Right.
Jimmy:
compatibility is the thing that gets in the way. So, I'm going to talk about a related API design error that I believe is now fixed, and the process to get there. And as I'm talking, you should get a lot of ideas for what the YAML solution should be. So have you ever heard of the CURLOPT_SSL_VERIFYHOST bug?
Charles Max_Wood:
Huh.
Jimmy:
So, this thing came out, I think, over a decade ago. I talked earlier about the danger of Boolean parameters. Well, in C, you can have things that look like Boolean parameters and aren't, because Booleans are just integers. So you're connecting to a server. What do you want to happen? You want to make sure when you're talking to Google.com, it's actually Google.com and not a guy parking in front of your house with a router, broadcasting a signal pretending to be Google.com. By the way, the former head of site integrity at Facebook got hacked by his own employees doing exactly that.
Charles Max_Wood:
No way.
Jimmy:
Yeah, well, when I heard the story a dozen years ago, he asked us not to share the full details, but a lot of the details are published in TechCrunch. But it's been long enough, I think I can say: the employees who did it, they left their car outside his house for months with a boat battery powering their machine. And then they rollerbladed to work. So anyway, that's an example of a man-in-the-middle attack. They can be scary. So that's why we have certificates, and HTTPS, and TLS/SSL. The server has a private key, and a certificate which is signed by some certificate authority, and only the guy with the private key can give evidence that he's actually that website.
Charles Max_Wood:
Uh huh.
Jimmy:
So when you're using curl as a library, you have some app making an SSL connection. You want to check those certificates. You see this CURLOPT_SSL_VERIFYHOST flag. And yes,
Charles Max_Wood:
Uh
Jimmy:
you
Charles Max_Wood:
huh.
Jimmy:
want to verify the host. You want to check certificates. You set it to true.
Charles Max_Wood:
Right.
Jimmy:
If you do that, it actually disables the certificate checking. You shouldn't pass in true. You should pass in 2. So, there's a great paper from 10 or 15 years ago called The Most Dangerous Code in the World, in which
Charles Max_Wood:
Uh huh.
Jimmy:
they went through a bunch of applications. Yep, that's 2012. They found these man-in-the-middle vulnerabilities in a lot of applications because of this API design flaw. So you can say it's in a similar situation as with the YAML. So here's how the documentation for CURLOPT_SSL_VERIFYHOST reads today. When the verify value was set to 1, i.e. true: in 7.28.0 and earlier, it was treated as a debug option. Then they patched the old versions to kind of ignore that. From 7.28.1 to 7.65.3, setting it to 1 made curl_easy_setopt return an error and leave the flag untouched. From 7.66.0 onwards, it treats 1 and 2 the same. So the upshot of the story is: they basically patched old versions to kind of give a notification, had a long period of giving an error for doing that, and then a longer period of kind of deprecating it. So for YAML, it would probably be a similar story: you release patches for all the old versions saying, no loading objects that execute arbitrary code unless you specifically call a special method called danger_load. And actually, I think that would kind of be the story. And perhaps you eventually remove that feature altogether, whatever it is. The big thing is, I'm sure there are lots of uses for people loading arbitrary Ruby objects from YAML using this feature,
Charles Max_Wood:
Mm-hmm.
Jimmy:
but you should never have it happen unless you really want it to.
Charles Max_Wood:
Right. Makes sense. Now, you sent over a couple of case studies. Do these tie into this idea too? I'm talking about the Bundler case study and the RuboCop case study.
Jimmy:
Um, yeah, broadly. So the story on these case studies is... Let's see, one of my early students was a guy named Gabriel, and he really wanted to take my course, but he was in Brazil.
Charles Max_Wood:
Uh huh.
Jimmy:
And, well, Brazil has not been doing well economically for a long time. Back then,
Charles Max_Wood:
Right.
Jimmy:
the real had just fallen a lot against the dollar. So I said, you know, I think you're a really bright guy, so how about I give you a huge discount on the course, but then you write two case studies. And he did. That's how these two Ruby case studies I have came out.
Charles Max_Wood:
OK.
Jimmy:
So these are about either messy code or code flaws in two popular Ruby programs, RuboCop and Bundler. The Bundler one is interesting in that it's kind of a story about how a real security vulnerability happened, a handful of years ago. It's very easy to accidentally... So you have a Bundler file, and
Charles Max_Wood:
Mm-hmm.
Jimmy:
you say, here's my primary source, download the gems from here, but for this one gem, download from place B.
Charles Max_Wood:
Right.
Jimmy:
But actually it would give place B precedence over place A. This makes you vulnerable to what's called a supply chain attack, where someone
Charles Max_Wood:
Right.
Jimmy:
uploads to place B a library that has the same name as the thing you actually want, but it... does whatever they'd like.
Charles Max_Wood:
Right.
Jimmy:
So, in other communities, there have been some very infamous cases of this kind of supply chain attack. I'm trying to remember the name of the library. I believe this was in JavaScript, in the Node world, where... So the attack was: someone got administrative access. No, he took over this one small project from someone else who didn't want to maintain it anymore, and made some good contributions. Then, once he had control, he changed one of the minified dependencies of that program in order to add some obfuscated code, which... So we have a minified dependency in this library. It's being used in Electron, which is being used in some cryptocurrency wallet. It's like
Charles Max_Wood:
Uh huh.
Jimmy:
three layers of dependencies. And if it was running in that program, and the person had at least 100 Bitcoin, then it would drain their account. So that's the supply chain attack. But this error in Bundler, you can call it a bug, you can call it unexpected behavior. There's a saying: a program without a specification cannot be wrong, it can only be surprising. I think this was underspecified enough to just be surprising. But anyway, I don't know how much it was exploited, but it definitely made these kinds of supply chain attacks quite easy. And from looking at the code, it turned out this almost certainly was not what the creators intended the behavior to be. You specify one dependency with a secondary source. You think that should only apply to that one dependency. But because of the way the code
Charles Max_Wood:
Right.
Jimmy:
was engineered, without an explicit, well-structured representation of these sources, you instead have these kind of lower-level primitives used to represent them, global data structures. So it's just like, oh, we're going to prioritize the most recent source that was specified,
Charles Max_Wood:
Mm-hmm.
Jimmy:
which happens to never be the global source statements you put at the top of the file.
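The precedence problem described above can be illustrated with a toy model. This is a sketch only: the `Source` struct, both resolver functions, and the example URLs are hypothetical and are not Bundler's actual internals. It contrasts a "most recently declared source wins" policy with one where a secondary source applies only to the gems explicitly pinned to it.

```ruby
# Toy model only: these names are NOT Bundler's internals.
Source = Struct.new(:url, :gems)

# Surprising policy: whichever source was declared last wins for every gem it offers.
def resolve_last_declared(gem_name, sources)
  match = sources.reverse.find { |s| s.gems.include?(gem_name) }
  match && match.url
end

# Scoped policy: a secondary source is consulted only for gems pinned to it.
def resolve_scoped(gem_name, default:, pins: {})
  (pins[gem_name] || default).url
end

primary   = Source.new("https://primary.example", %w[rails internal_tool])
# Imagine an attacker has uploaded a fake "rails" to the secondary source.
secondary = Source.new("https://secondary.example", %w[internal_tool rails])

puts resolve_last_declared("rails", [primary, secondary])
puts resolve_scoped("rails", default: primary, pins: { "internal_tool" => secondary })
puts resolve_scoped("internal_tool", default: primary, pins: { "internal_tool" => secondary })
```

Under the first policy the fake "rails" is picked up from the secondary source; under the scoped policy only `internal_tool` ever comes from there.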
Charles Max_Wood:
Yep, I seem to remember the particular issue with the npm package. And I think we talked about it on JavaScript Jabber.
Jimmy:
Yeah.
Charles Max_Wood:
And yeah, there have been a few instances where stuff like that's happened. Some of it is fairly, you know, not as malicious, right, where somebody just deleted their npm library and a bunch of people depended on it, and so it was bad.
Jimmy:
Yeah,
Charles Max_Wood:
But
Jimmy:
that was left-pad.
Charles Max_Wood:
it wasn't fatal, right? It was just... I'm thinking of left-pad when I say that, right? And so
Jimmy:
Yeah.
Charles Max_Wood:
the package got deleted, people couldn't build, so NPM actually put it back, was the solution that they came up with. But yeah, library that the maintainer didn't want to maintain anymore, they handed it off. And yeah, somebody took it over and put malicious code in it. And then yeah, they got access to stuff that you didn't want them to have access to.
Jimmy:
Mm-hmm.
Charles Max_Wood:
Are there... And again, how do you mitigate that? How do you stay on top of that kind of a thing?
Jimmy:
So that's the kind of thing that... I don't have a good answer to. That's just a system-wide problem and not really something that an individual team, unless you're a megacorp, can really handle.
Charles Max_Wood:
Mm-hmm.
Jimmy:
So, in some of my own projects, and things for clients, I've tried putting in a practice of being careful when adding dependencies, or on every upgrade. Like, we put notes saying: this thing is this, we need it for this reason, I looked at the code, it seems fine. But if you really commit to that, to really checking every dependency, then either you can't do it and you have to kind of do it half-baked, in a way where you would still be vulnerable to the Electron Bitcoin attack I mentioned earlier, like three layers of dependencies
Charles Max_Wood:
Mm-hmm.
Jimmy:
in an obfuscated, minified file. Good luck. It's
Charles Max_Wood:
Right.
Jimmy:
like, either you do that, or you say, I never upgrade, or you just don't use dependencies, and then, lo and behold,
Charles Max_Wood:
Right.
Jimmy:
despite your high and mighty ideals, you still find yourself working several times slower than everyone else.
Charles Max_Wood:
Right.
Jimmy:
I think there should be community-organized or corporate-group-organized auditing of common libraries. I want to be able to say: here are 500 of the most common Node packages, and an industry consortium hired some people to do a basic, level-one cursory audit that there's no malicious code, and then basic checks of their dependencies, for every version. Because I think there are enough people who are going to be concerned about these supply chain attacks. It's a coordination problem. Surely we can find a group that funds such an effort.
Charles Max_Wood:
Yeah. One thing that I know we've talked about, again on JavaScript Jabber: we had Feross Aboukhadijeh, and he got
Jimmy:
Mm.
Charles Max_Wood:
on and he talked about some of these supply chain issues and a tool he's working on called Socket that does it for npm packages. I don't know of anything that does this same kind of thing for Ruby packages, though. And effectively, I mean, you mentioned it was like this obfuscated, encrypted version of the code, right? And so you can't see into it, but it does stuff. And he talked about, you know, essentially his tool scans for that, among other things. And that's what kind of drew my mind to it. But yeah, it'd be really interesting to have a tool like that that's running those community scans and saying, hey, this has these red flags to it, right?
Jimmy:
Yeah, as a little small aside, I was an intern with Feross at Facebook 12 years ago.
Charles Max_Wood:
Oh,
Jimmy:
So
Charles Max_Wood:
okay.
Jimmy:
he was around for the man-in-the-middle phishing story that I mentioned. I'm
Charles Max_Wood:
Mm-hmm.
Jimmy:
not sure if he attended the session where the story was told, but he was there. He's also currently married to a friend of mine, someone that I was in the Thiel Fellowship with.
Charles Max_Wood:
Okay.
Jimmy:
So this thing, I don't know a good way to import it into the Ruby world, but this is the kind of thing that a good type system or capability system will do. Actually, you might be able to put a capability system into Ruby. A capability system means something like this: instead of being able to say File.open anywhere you want, there's a file system object that gets passed around from constructor to constructor, method to method. And this file system object has the open function. So if you have a section of code and you don't pass the file system object to that code, it cannot touch the file system. Or maybe you pass in a restricted version of that object with restricted permissions. With that kind of control of effects and capabilities, you don't have things where you import a string formatting library and then it logs your keystrokes. Because, like, why would a string-formatting library need keyboard and network access? You just don't give it that.
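A minimal sketch of the capability idea Jimmy describes, assuming hypothetical class names (`FileSystem`, `ReadOnlyFileSystem`, `ConfigReader`, `StringFormatter`) that do not come from any real library. Note that plain Ruby enforces this only by convention: nothing stops code from calling `File.open` directly, so this shows the shape of the design rather than a real sandbox.

```ruby
require "tmpdir"

class PermissionError < StandardError; end

# Hypothetical capability object: the one agreed-upon way to reach the file system.
class FileSystem
  def open(path, mode = "r", &block)
    File.open(path, mode, &block)
  end
end

# A restricted version of the same capability: read-only, confined to one directory.
class ReadOnlyFileSystem
  def initialize(root)
    @root = File.realpath(root)
  end

  def open(path, mode = "r", &block)
    raise PermissionError, "read-only capability" unless mode == "r"

    full = File.expand_path(path, @root)
    unless full.start_with?(@root + File::SEPARATOR)
      raise PermissionError, "#{path} is outside the permitted directory"
    end
    File.open(full, mode, &block)
  end
end

# Receives no file system object, so by convention it has no way to touch files.
class StringFormatter
  def shout(text)
    text.upcase + "!"
  end
end

# Needs to read files, so it is handed a capability explicitly.
class ConfigReader
  def initialize(fs)
    @fs = fs
  end

  def read(name)
    @fs.open(name) { |f| f.read }
  end
end

Dir.mktmpdir do |dir|
  File.write(File.join(dir, "app.conf"), "debug=true")
  reader = ConfigReader.new(ReadOnlyFileSystem.new(dir))
  puts reader.read("app.conf")
  puts StringFormatter.new.shout("done")
end
```

The design choice is that access is decided by the caller: whoever constructs `ConfigReader` picks how much of the file system it may see, and the formatter is never given anything.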
Charles Max_Wood:
Right. Yep. I think we've kind of deviated a little bit from the case study, which basically just said that you could be pulling from a source you don't necessarily trust, or get code that you don't necessarily want. I'm curious then, I mean, are there ways to mitigate that particular problem? Can you tell it which server to pull from for a particular gem? I kind of scanned the case study a little bit and it said that in your Gemfile, it was at one point, I guess, if it's not anymore, honoring the last entry. Can you go in and say, well, use the first entry on these, or these are the exceptions,
Jimmy:
So...
Charles Max_Wood:
or...?
Jimmy:
So I'm not sure about now, but at the time the bug was first reported, there was syntax for doing that. But it didn't actually do that. You would say... let me pull it up. Okay, all right, so here I've pulled up the page explaining the security flaw. So you could write source, private.com, do, and then in the do block, you
Charles Max_Wood:
Mm-hmm.
Jimmy:
declare gems. So you would
Charles Max_Wood:
Okay.
Jimmy:
think that would scope this secondary source to just within that do block, but it didn't.
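To make the syntax concrete, here is a sketch of the block form Jimmy is describing, together with a toy reader that records the scoping a user would expect. `ToyGemfile` is a hypothetical stand-in written for this illustration, not Bundler's implementation, and the URLs and gem names are placeholders.

```ruby
# Toy reader for Gemfile-style declarations (NOT Bundler): it records which
# source each gem is declared under, i.e. the scoping a reader would expect.
class ToyGemfile
  attr_reader :declared

  def initialize
    @declared = {}
    @current = nil
  end

  # Without a block: sets the default source for the gems that follow.
  # With a block: applies only to gems declared inside the block.
  def source(url)
    if block_given?
      previous = @current
      @current = url
      yield
      @current = previous
    else
      @current = url
    end
  end

  def gem(name)
    @declared[name] = @current
  end
end

gemfile = ToyGemfile.new
gemfile.instance_eval do
  source "https://rubygems.org"
  gem "rails"

  # Expected meaning: only internal_billing comes from the private source.
  source "https://gems.private.example" do
    gem "internal_billing"
  end
end

gemfile.declared.each { |name, url| puts "#{name} <- #{url}" }
```

The flaw under discussion is that the affected Bundler versions did not actually limit the secondary source to the gems inside the block, even though the syntax reads this way.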
Charles Max_Wood:
Right. Oh, really?
Jimmy:
So with such wrong behavior being provided by the library, when we talk about solutions, any solution that's given to the end user is going to be a workaround for this bad behavior. I believe the workaround
Charles Max_Wood:
Right.
Jimmy:
was just that, for every single gem you import, you explicitly say which source that particular one comes from. So when you talk about real solutions, it has to be a solution within the Bundler codebase. And
Charles Max_Wood:
Right.
Jimmy:
so here, a deeper cause of this problem was a software design flaw: not embedding the concepts of... of gems and sources explicitly into the code.
Charles Max_Wood:
I'm sorry, say that again?
Jimmy:
Not making the concepts of gems and sources explicit in the code.
Charles Max_Wood:
Okay. I mean, most of the time, if I have something that's not on RubyGems that I want... It's in a GitHub repo somewhere that I have access to, but
Jimmy:
Yeah.
Charles Max_Wood:
it does occur when you're using something like Sidekiq Enterprise or some of the other gems out there that have a paid option, right? They have their own gem repository that you're going to pull your libraries from. And so in those cases, yeah, if they've got a gem that's named the same as something else, I can see it confusing the two and pulling the wrong thing.
Jimmy:
So the thing about this error is that most of the time it's benign, except when it's malicious, which is a reason that it went under the radar for so long.
Charles Max_Wood:
right.
Jimmy:
Usually it will be like: it looks at the first repository, which is the first repository in the precedence list, which is whatever the last thing in the file was.
Charles Max_Wood:
Right.
Jimmy:
And it's not there. And then it goes to the next thing, which is, say, the global repository. Everything's happy. You never notice anything's wrong until someone maliciously adds such a gem to the repository that gets checked first.
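The lookup order described here can be sketched as a toy model. This is an assumption-laden illustration of the behavior as Jimmy recounts it, not Bundler's code; `Repo`, `resolve`, and the gem names are hypothetical.

```ruby
# Toy model: a repository is just a name plus the list of gems it hosts.
Repo = Struct.new(:name, :gems)

# Walk the precedence list and return the first repository that has the gem.
def resolve(gem_name, precedence)
  precedence.find { |repo| repo.gems.include?(gem_name) }
end

secondary = Repo.new("secondary", ["internal_billing"])
global    = Repo.new("global", ["rack", "rails"])

# As described above: the source listed last in the file is checked first.
precedence = [secondary, global]

# Benign case: "rack" is missing from the secondary repository, so the
# lookup falls through to the global one and nobody notices anything odd.
puts resolve("rack", precedence).name

# Malicious case: someone publishes a gem named "rack" to the repository
# that is checked first. The same lookup now silently changes its answer.
secondary.gems << "rack"
puts resolve("rack", precedence).name
```

Nothing in the resolution step records where a gem was supposed to come from, which is why the change of origin is silent.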
Charles Max_Wood:
Right. Yep, that makes sense. And so if your secondary repository is... most of the time they're trusted sources, right? But yeah, if you're pulling them from some other source that has something that, yeah, for whatever reason you want something different from what's on RubyGems, I could conceivably see that.
Jimmy:
Yeah.
Charles Max_Wood:
It also seems though that this is something you should just do sparingly and then it's not an issue.
Jimmy:
Oh, yeah, that might be great advice. But a lot of people, you know, probably some listeners here, are in a case where they need a ton
Charles Max_Wood:
Right.
Jimmy:
of special gems or paid proprietary gems. And then, so,
Charles Max_Wood:
Yeah.
Jimmy:
and then your advice, it's like, you know, it might feel like someone in 1990 telling people to not use open source or not use
Charles Max_Wood:
Hahaha,
Jimmy:
dependencies at all.
Charles Max_Wood:
fair enough.
Jimmy:
So, to flesh out a bit what I was saying earlier about gems and sources not being represented well in the code: I have open right now the source code for this version of Bundler. And it has a source list class, but it does not have a source class. I believe sources are just strings or just URLs.
Charles Max_Wood:
Uh huh. Yeah, I haven't even looked at Bundler's source code. But yeah, I definitely see the issue there. Before we run out of time, I did wanna talk about the RuboCop case study as well. You want to just kind of explain what that is and how it worked?
Jimmy:
Yeah, RuboCop is a linter for Ruby, which
Charles Max_Wood:
Mm-hmm.
Jimmy:
can find simple buggy patterns as well as bad idioms or bad style.
Charles Max_Wood:
Mm-hmm.
Jimmy:
So the flaws that Gabrielle found in RuboCop and put in this case study were less dramatic. It's more about how, when you're invoking it programmatically, there are a lot of ways to mess up. So for instance, there are a ton of options, and some of the options are only legal to set if another one is also set. And so within the RuboCop code, they have... yeah, they have a pretty messy function which tests whether there's a legal combination of arguments. So, so
Charles Max_Wood:
Uh-huh.
Jimmy:
here the consequence is not a major security vulnerability or a supply chain attack possibility. It's just
Charles Max_Wood:
Right.
Jimmy:
some messy code within RuboCop and some potential for bugs, but probably ones that will just give you an error and not make it to production on the user side. But let's...
Charles Max_Wood:
Right.
Jimmy:
You know, just because it doesn't stop your house from burning down doesn't mean that keeping a clean house is useless.
Charles Max_Wood:
Yeah.
Jimmy:
So here the problem is more the way that they take options as just a bunch of independent things of primitive types, as opposed to creating a proper DSL or a proper object hierarchy that makes it easy to write down the options you want and hard to write down invalid options.
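One way to picture the contrast, as a sketch only: the option names and classes below (`CacheDisabled`, `CacheEnabled`, `LintOptions`) are invented for illustration and are not RuboCop's actual API. The idea is that a setting which only makes sense when another is enabled lives inside the object representing that other setting.

```ruby
# Flat primitives: nothing stops a caller from supplying a cache directory
# while caching is turned off, so a separate validation function is needed.
flat_options = { cache: false, cache_root: "/tmp/lint-cache" }

# Structured alternative: the dependent setting is a field of the option
# it depends on, so the invalid combination cannot be written down.
class CacheDisabled
  def enabled?
    false
  end
end

class CacheEnabled
  attr_reader :root

  def initialize(root:)
    @root = root
  end

  def enabled?
    true
  end
end

class LintOptions
  attr_reader :cache

  def initialize(cache: CacheDisabled.new)
    unless cache.is_a?(CacheDisabled) || cache.is_a?(CacheEnabled)
      raise ArgumentError, "cache must be CacheDisabled or CacheEnabled"
    end
    @cache = cache
  end
end

plain  = LintOptions.new
cached = LintOptions.new(cache: CacheEnabled.new(root: "/tmp/lint-cache"))

puts plain.cache.enabled?
puts cached.cache.root
```

With this shape there is no way to express "a cache directory with caching off", because `CacheDisabled` has no `root` at all; the check moves from a validation function into the structure of the data.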
Charles Max_Wood:
Okay. Yeah, I mean, RuboCop is something that I use. And looking at the case study, I mean, I see stuff for the command line options.
Jimmy:
Huh.
Charles Max_Wood:
But is that what you're talking about, or are you talking about the configuration file?
Jimmy:
Yeah, actually, I think, I think what I was saying refers to both the command line and programmatic options.
Charles Max_Wood:
Right. Yeah, I mean, one of the things I see is that there are a number of places it can pull the Ruby version from, which may make it hard for you to identify which Ruby version it's supposed to use. That's a problem that's generally true of a lot of Ruby systems: you can have your Ruby version listed in your bundle file, you can have your Ruby version listed in a Ruby version file, you can have it listed in, like, a .rvmrc or a .ruby... what is it? .ruby-version or something like that. I can't remember. But there are a whole bunch of different places you can put it. And depending on the kind of... they don't always necessarily follow the same order in looking them up. And so you could have one tool running one Ruby version and then another tool running another, because you haven't universally updated all of your options to the same version of Ruby, right? Because you deploy to, say, Heroku and it runs one version of Ruby, and so then you need the compatibility, so you change it, right? And then you run it locally out of the, you know, .ruby-version. Yeah, I can kinda see
Jimmy:
So
Charles Max_Wood:
that.
Jimmy:
classic. Yeah, that's, it's like each one of these tools wants to be king of the world and to like own your Ruby configuration. And then none of them do.
Charles Max_Wood:
Yeah, very interesting. But yeah, they have a million different command line configs that you can set up. And then... Yeah. Um, --only can't have a specific cop. --except can't have the top-level cop that checks syntax. Yeah. Yeah, I definitely see where a lot of this comes from. --cache is a Boolean. And you also run into this in your code; when we were talking about arguments and stuff, you mentioned, you know, sending an argument that's true, but Ruby evaluates different values as truthy or falsy, so you could pass in any number of things and get interesting results. It's
Jimmy:
Yeah.
Charles Max_Wood:
really interesting just to think about how we overload a lot of this stuff.
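A short illustration of the truthiness point Chuck raises. In Ruby only `nil` and `false` are falsy, so a flag argument accepts far more values than a caller might intend. The helper names `truthy?` and `strict_flag` are hypothetical, written just for this sketch.

```ruby
# Reports how Ruby treats a value in a conditional.
def truthy?(value)
  value ? true : false
end

# Only false and nil come out as false; 0, "", "false" and [] are all truthy.
[true, false, nil, 0, "", "false", [], :no].each do |value|
  puts "#{value.inspect} -> #{truthy?(value)}"
end

# A stricter alternative for a flag that must be a real boolean.
def strict_flag(value)
  return value if value == true || value == false

  raise ArgumentError, "expected true or false, got #{value.inspect}"
end

puts strict_flag(false)
begin
  strict_flag("false")
rescue ArgumentError => e
  puts e.message
end
```

Rejecting non-boolean values at the boundary is one way to keep a string such as `"false"` from quietly taking the true branch.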
Jimmy:
The story about passing any truthy thing actually reminds me of a very dramatic story I had 13 years ago that caused my break with dynamic typing. Right now, I think people would associate me heavily with static typing. I'm a big Haskell guy. I was even on the program committee of the Haskell Symposium last year. I teach a lot of people in other languages, even like Ruby and Python, about how they can use a type system in a way more advanced way than they knew about, and how they can get some benefits of type systems, like I talked about earlier with the name thing. But I actually kind of grew up as a Ruby and Clojure person, as a big dynamic typing guy. And I still actually do some Ruby. It's my tool of choice when something gets too complicated for bash scripts, actually.
Charles Max_Wood:
Mm-hmm.
Jimmy:
Once upon a time I picked up a book on Z shell, thinking I'd get really good at shell scripting. About halfway through, I put it down, and realized that if I kept reading, I'd become a worse person. It's beneficial that I don't know too much advanced shell, because then I can't use it. So, the story about my break with dynamic typing. I was working on an app that lets you play any card game with anyone else on the internet. So it wasn't about programming the rules of card games; it was programming in card physics. That is, you can pick up cards, put down cards, put them in your hand. You can put down a card and flip it face up at the same time.
Charles Max_Wood:
Mm-hmm.
Jimmy:
So it's in that feature where, on my computer, I go to put down a card... I have a card in my hand face down. I put it down on the table face down. On your computer, it puts it down face up. What happened? So I go and look at, like, is the flip-over flag set? It says false. On my computer, it stays face down. On your computer, it says false and it flips. I'm debugging this, like, I'm printing it out; it says false, but it enters the true branch. And it took
Charles Max_Wood:
Right.
Jimmy:
me multiple days, like, grepping the codebase, until I figured out what was going on, which was that false was not a false. It was not a boolean with a lowercase b. It was a Boolean with a capital B.
Charles Max_Wood:
Mmm.
Jimmy:
In Java, and therefore in every JVM language, including Clojure, they have these boxed primitives. So when I send the false over the wire, it gets sent as a lowercase-b boolean, gets read in as a capital-B Boolean, and false as a boxed Boolean is a true value.
Charles Max_Wood:
Oh.
Jimmy:
I realized that was an error that could not happen in a statically typed language. And so that was the moment when my
Charles Max_Wood:
Uh huh.
Jimmy:
big shift away from dynamically typed to statically typed languages happened.
Charles Max_Wood:
Very cool. I don't know if I would have ever found it. Cool, well, we're kinda getting toward the end of our time, so I'm gonna push us into the self-promotion arena. Is there something you wanna talk about that you've been working on lately that people should know about, or ways people can hire you?
Jimmy:
Sure. So there's the small and the large. So first, check out the linguistic antipatterns that I've been talking about. If you go to linguistic-antipatterns.com, spelled like it sounds, you'll find our webpage where we have these listed, with examples explained, and you can go on to the associated GitHub repository and share new examples in the discussion forum or propose new ideas for linguistic antipatterns, as we're building up this community resource. As mentioned earlier, my life nowadays is all about teaching developers how to write better code. Turning good developers into great ones. So,
Charles Max_Wood:
Mm-hmm.
Jimmy:
go to mirdin.com slash courses and find all of our offerings. We have our express course, about an hour or two a week for several weeks, starting January 23rd. And then a few weeks later we have our full course starting. And you also said to share, like, random things that I like, like board games.
Charles Max_Wood:
Oh yeah, yeah go ahead and do some pics and then I'll go.
Jimmy:
Yeah, sure. So, yeah, so I used to be very active in my university's board game club. So like 10 years ago, I got to know about all the hot Euro games. My favorite two
Charles Max_Wood:
Hmm.
Jimmy:
easily are Tigris and Euphrates and Dominant Species. Tigris and Euphrates is one where someone can be building a huge empire, but then in one turn, you take that empire and use it... So, more recently, I'm a die-hard fan of the Might and Magic series of games and the Heroes of Might and Magic strategy spin-off series. In fact, a project I've worked on for longer than I care to admit is the first mod of Heroes of Might and Magic 2, which came out in 1996. I'm the first person to reverse engineer it and hack it and mod it, which is how I got the skills for the election hacking stuff I mentioned earlier. ironfi.st, Ironfist. But recently, not by my hands, a Polish company is coming out with the Heroes of Might and Magic 3 board game. They just had an ultra-successful Kickstarter, and I'm super excited about that.
Charles Max_Wood:
Very cool. I'll have to check that out. I am going to do a handful of picks myself. Let me do the self-promotion stuff first. We have Rails RemoteConf coming up. It's gonna be at the beginning of February, so keep an eye out for that. And then we are also chasing down book club ideas. So if you want to join the book club, we're going to be wrapping up Clean Architecture throughout this month, and then we're going to start a new book in February. So if you're interested in being part of the book club, go to topenddevs.com slash book club, and you can sign up, and then we will let you know how that is all looking. As my picks, I always start with a board game. I just wanna call out real quick, so, the Tigris and Euphrates game: I always give you the weight on Board Game Geek. It's two to four players. It says it's a 90-minute play time, and it's a 3.5 weight. So it's a somewhat complicated game. But I like those ones, so. The game I'm gonna pick is Forbidden Sky. If you've played, like, Forbidden Island or Forbidden Desert, this is just another variation. It is 2-5 players. With 2 players, we got through it. My wife and I did, pretty fast, last night. We actually won it for the first time. We've played it like 7 or 8 times. What we did is we actually played with 4, quote, players, but she ran 2 of them and I ran 2 of them. And anyway, it's kind of a difficult one to win. In fact, I found a forum thread on BoardGameGeek talking about Forbidden Sky and basically saying, is it even possible to win it? And people were talking about different approaches. And so yeah, we did win it. I think what we basically had to do was... in a lot of games, you can kind of take risks, and the risks aren't that high, right? So, you know, you stay out where you might get hurt and take the risk. It turns out that every time we played, we'd eventually get to the point where we were willing to do that.
And yeah, then one of the characters would die. And Forbidden Sky, it's kind of like Forbidden Desert or Forbidden Island, where you're trying to get everybody out of the desert or off the island, and you're collecting artifacts. In this game what you're doing is building a machine that will power a rocket that will get you off the platform in the sky. And so you wind up putting tiles together that allow you to place the components of your machine, and then you wire them all together and take off. And so this time what we did is, effectively, if there was something we wanted versus moving back to a safer spot, we would move back to the safer spot. And that was effectively what we had to do to win. So we'll see how that goes with three players or two players. But anyway, it was fun. Board Game Geek gives Forbidden Sky a weight of 2.59. So it's slightly more complicated than a kind of casual game. But, you know, well within the realm of a game that you could pick up for your family and not be super thrown off by how complex it is, right? It's not a game where I would say only heavy gamers would want to go play this one. So anyway, I'm going to pick that and then... Yeah, I went through a whole bunch of personal stuff earlier this week, so I'm not as prepared on this as I usually am. So I think I'm just gonna leave it with the board game. But yeah, hoping to see you all around at the different things that we've got going on. And I guess we'll wrap it here. And until next time, Max out.
Jimmy:
Alright, thank you.
Charles Max_Wood:
All right, hang on just a second. Make sure that we get you all uploaded.