Michael_Berk:
Welcome back to another episode of Adventures in Machine Learning. I'm one of your hosts, Michael Berk, and I do data engineering and machine learning at Databricks, and I'm joined by my co-host.
Ben_Wilson:
Ben Wilson. I write code at Databricks.
Michael_Berk:
We'll take it. Um, so today we are going to be doing sort of a part two to last week's episode, but it's the inverse of last week's episode. Last week we talked to Chuck. He's a software engineer and he's looking to sort of enter the ML world, maybe use some ML models in a deployment, but he's probably not going to be doing super in-depth ML engineering or data science. Now, Ben and I come from the data science and ML world, and Ben has successfully made the transition to being a software engineer. I'm not going to spoil my future yet, but I'm thinking about doing something similar, or at least learning a lot about software engineering. So that's going to be the spirit of this discussion. We're just going to be shooting the shit, talking about our experience and the transition from data science to software engineering. Sound good, Ben?
Ben_Wilson:
Sounds great.
Michael_Berk:
So how about you kick us off with the very beginning of your career? What were you doing? What was it like?
Ben_Wilson:
The very, very beginning. Well, out of high school, I joined the U.S. Navy. But the beginning of my tech career would be my transition from what I used to do many, many years ago. When I got out of the military, I worked in traditional engineering, applying the knowledge of mechanical and electrical engineering I gained in the military to something called process engineering. You work at a factory, and there's some tool or some technology that you need to improve or adapt or make better in some way, like enhance it. Or you're just doing recipe control. Think of it like you're a baker: there are so many ingredients, and you need to figure out what quantities of those ingredients to use, how to mix them together, and how to apply them. That's how things are made in factories; you control those machines and how they are used to make things. I did that in a number of different types of industries and found that, while incredibly fascinating and exciting... I love learning new things as much as I possibly can. I've always been like that since I was a young kid. But once I've learned something to a certain degree and there's no way to continue learning in that area, it becomes kind of boring. And I don't like that sort of time-card-punching aspect of work, where you're coming in and doing the same thing every day. So I managed to transition from being a process engineer to working at another factory, but in a completely different capacity. Instead of being a process engineer at that factory, I worked for a group that was part of the integration engineering department. Integration engineering is usually the people that are pulling data about what the process engineers are doing and looking at the results at the end of the line, saying: how many units did we produce of widget A that we can sell? How many do we have to recycle, or how many do we just have to discount in price because they're not quite up to par?
This is in semiconductor manufacturing, so there's not a lot of discounting that happens. It does happen depending on what you're making, but for the most part, a failed chip on a wafer is scrap; it gets thrown away. Same with a bad wafer. If you have too many failed wafers in an entire run, you scrap the entire run, depending on what the problem is. Integration engineering, and particularly yield analysis, is focused on figuring out why. It's all root cause analysis. Like, hey, this run went three months, and there are 25 wafers in that run. Why did these three have a yield of 10% while the rest had a yield of 99%? That percentage is the ratio of good to bad chips on the entire wafer. So you're given the keys to the kingdom with data. You can pull data from any tool in the factory, going back years if you want. And you have to figure out what the common paths through the factory are. In each of the roughly 1,600 stages, a wafer could go through anywhere from one tool with 10 units on that tool to 50 tools with 8 units on each of those tools, depending on what the step is and what you're doing. So there are lots of paths that you can potentially go through the factory. My approach was the mindset of: well, I'm pretty good at finding patterns in things, so I'll just pull all the data and see if I can find commonality, like all the problem wafers went through this one tool. You're writing just basic SQL against massive databases and saying, what's the count of the bad wafers that went through all the different chambers, and what pops up as the top 10? Then I'll go and look in those tool logs and see if I see something weird. And that works while you're getting started. The department that I joined, I was one of the first three people in it; it had just been started, because it was a relatively new factory when I got there. It's really easy to find stuff when your yields are kind of low and there are a lot of problems because it's a new factory. Once you start tearing down all that low-hanging fruit, there's much less of a chance of finding, oh, we have scrap wafers throughout the factory run. You figure out, like, oh, that tool had a messed-up recipe for three days, and everything that went through that one chamber is bad. That's easy to find with just basic analytics. Fast forward a year and a half from there, and none of that's happening anymore. Or if it is happening, we're catching it while it's happening. So when we're looking at the end-of-line yields, it's just patterns on wafers. We're like, well, how do we get from an average of 96% yield to 98% yield? Ben and team, go figure it out. And I was like, I have no idea. But I had learned all this stuff at a previous company about advanced statistical process control using SAS software, sort of old-school machine learning.
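The "count the bad wafers and look at the top 10" query Ben describes is a plain group-by aggregation. Here's a self-contained sketch against a toy SQLite table; the schema, tool names, and numbers are invented for illustration:

```python
import sqlite3

# Toy stand-in for a factory history table (schema hypothetical).
con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE wafer_history (wafer_id TEXT, tool TEXT, chamber TEXT, scrapped INTEGER)"
)
con.executemany(
    "INSERT INTO wafer_history VALUES (?, ?, ?, ?)",
    [
        ("w1", "etch_04", "A", 1),
        ("w2", "etch_04", "A", 1),
        ("w3", "etch_04", "A", 1),
        ("w4", "litho_02", "B", 1),
        ("w5", "litho_02", "C", 0),
    ],
)

# "What's the count of the bad wafers per tool/chamber, top 10?"
top = con.execute(
    """
    SELECT tool, chamber, COUNT(*) AS bad_wafers
    FROM wafer_history
    WHERE scrapped = 1
    GROUP BY tool, chamber
    ORDER BY bad_wafers DESC
    LIMIT 10
    """
).fetchall()
print(top)  # [('etch_04', 'A', 3), ('litho_02', 'B', 1)]
```

Any tool/chamber pair that floats to the top of a ranking like this becomes the first place to go read tool logs, which is exactly the workflow described above.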
Michael_Berk:
Wait, you kicked off the SAS initiative at that company?
Ben_Wilson:
[inaudible]
Michael_Berk:
Was it a team decision or?
Ben_Wilson:
It was just kind of a group discussion where we were saying, what tools do we need to figure this stuff out? It wasn't me being a hero, saying this is the only way to do it. There were about 20 of us, all really smart people. We just sat down and asked, what do we need to figure this stuff out? We all tried a ton of different things. I just had the benefit of having the training at a previous company from the SAS Institute, the maker of that software. So I had all their books, even stuff that you can't get online, and I was going through the training materials, stacks and stacks of books, because we had all been signed up for all those classes. So I was like, oh, jeez. Linear regression, OK; in SAS, how do I do this with multivariate data? All right, go to that chapter, look through the examples. And I just started applying those techniques, refining them, and getting more and more creative as time went on, exploring more of their software suite. Looking back on it now, I was like, man, I got exposure to a lot of the different foundations of applied machine learning, because there were so many different things going on at that factory with data. We had defect detection stuff, which is image processing: with images off of an SEM, if we align what we're expecting, wafer to wafer, on this particular coordinate shot, can we detect if there are differences with a fixed aperture? Can we determine if there are differences in resolution at this particular layer that we're getting a picture of? We started to classify those. We didn't know what to call all this stuff at the time, but it amounted to image classification as people know it today. There were people at the time doing far more advanced stuff, but we were just brute-forcing our way through it by reading docs, trying stuff out, and seeing if we could save the factory money. That was our end goal. And we did.
Billions of dollars were saved in the six years that I worked there, with a great team of people that were super smart.
Michael_Berk:
That's crazy.
Ben_Wilson:
But then.
Michael_Berk:
Wait, question before we move on. So with a lot of these sort of anomaly detection systems, you often have to make a trade-off from a business perspective between the false positive rate and the false negative rate, essentially between success and failure. How did you guys think about the trade-off between detecting something correctly versus detecting something incorrectly?
Ben_Wilson:
So I can't get into the full details of that. I'm still technically under NDA from that company, but.
Michael_Berk:
When does the NDA expire? We'll re-record.
Ben_Wilson:
It's a 20-year NDA, actually,
Michael_Berk:
Jesus.
Ben_Wilson:
so 12 years from now. No, maybe it's eight years from now. But there are processes in place when you're dealing with that amount of money in product, and when there's another big company that's paying you to make those devices. There's no one person that decides to throw a wafer away. If you're making some old tech, that wafer could be $10,000 because of the chips that are on it. If you're making something cutting edge, state of the art, that wafer could be $250,000, one wafer, and there's 400,000 wafers in the factory at any given time. If you start saying, I think this is a problem, we've got to scrap it, executives are going to start saying, what are you doing?
Michael_Berk:
Yeah.
Ben_Wilson:
Every wafer counts. So a lot of scrutiny goes into those decisions, as it should, and major scrap events are decided by people extremely high up, higher than the people that ran that factory. So what we were trying to do was not so much identify, through eliminating false positives and false negatives, that we have a good enough idea to say we need to scrap this. It was more: where can we tie this back to? We'd take a defect that we detected, trace it to the tool that ran the processing right before that metrology step, let the owner of that tool know, and tell them to clean their tool.
Michael_Berk:
Got it. That makes sense. So you would focus on not just identifying issues, but sort of alerting the creator of the tool to fix whatever the issue is.
Ben_Wilson:
Yeah, the main, the maintainer of the tool.
Michael_Berk:
Maintain. Got it.
Ben_Wilson:
So the process engineer, like the job that I used to do before that, and the technicians to say, Hey, you need to clean this or adjust something or do something with your recipe so that you're not creating this problem.
Michael_Berk:
Got it. Yeah, that's super cool. It's interesting. The reason I asked about false positive versus false negative rates is that I'm working on a couple of projects that all encompass this concept. Anytime you have an automated system surfacing an action to the business, you want to make sure there's confidence in that action. And if there are issues with the model, obviously that's a problem. But even if there aren't issues with the model, there's sort of a trade-off between inertia and action that changes that inertia. So it's really interesting to hear that you guys were doing this way back when, and it makes a lot of sense in a physical system: you would want to make sure that all of your processes are optimal. So that's super cool.
Ben_Wilson:
Yeah, and stuff like defect detection on silicon wafers. The makers of the tools that do the metrology have, well, back then it was computer vision, where it was trying to classify each individual defect and it knew its exact location. But nowadays, those tool manufacturers have deep learning models embedded into the tools themselves
Michael_Berk:
I'm sure, yeah.
Ben_Wilson:
that are shockingly accurate and make predictions: hey, I see this defect at this location; there's a 93% probability that this is going to kill the die if we continue processing it. And then they make a determination: if you send this to cleaning again before the next processing stage, this will probably be fine. Or: I'm detecting something here that is a device killer; I'm 100% confident that this die is dead. And it'll mark that all the way through further metrology so that it doesn't keep on pinging on it. It'll say, hey, this chip is dead, so we don't need to keep looking at it. We know that there was a defect at layer 5.1 that caused, you know, basically a bridge.
Michael_Berk:
Got it. So you go into the Navy, do stuff, leave,
Ben_Wilson:
Huh.
Michael_Berk:
work in semiconductor processing, and you brought some statistical knowledge to that semiconductor processing organization. And after doing some quote unquote data science, I think you ran linear regression. So you're officially a data scientist.
Ben_Wilson:
Mm-hmm.
Michael_Berk:
What happened next?
Ben_Wilson:
Then I got jobs back to back working as an actual data scientist, like, job title, and both in completely different industries. I learned a lot of different approaches to ML that had nothing to do with what I was doing in factories, stuff like dealing with text or dealing with graph algorithms. And the job that I got before working at Databricks was in retail, where it's like, hey, we have a recommendation engine to build.
Michael_Berk:
Ben is slowly dying over there, you alright?
Ben_Wilson:
A really bad cold. Yeah.
Michael_Berk:
Cheers.
Ben_Wilson:
But that's where I got exposure to forecasting as well: inventory forecasting and sales forecasting. And the big difference from what I was doing before working at that company was that I was no longer doing stuff that was just for internal use only. Before, we were writing scripts that build models and triggering those via a cron scheduler, or just running them ad hoc, where you have a script and you're like, hey, execute this, I want the results for everything that processed in the last six hours. But when I worked at the retail company, it's like, hey, we need to have predictions that are ready every day at 5 a.m., and you need to get this automated.
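The "triggering those via a cron scheduler" step is, at its simplest, a one-line crontab entry; the script path and log location below are hypothetical:

```shell
# Fields: minute hour day-of-month month day-of-week, then the command.
# Run the batch prediction script every day at 05:00, logging stdout and stderr.
0 5 * * * /usr/bin/python3 /opt/jobs/daily_predictions.py >> /var/log/daily_predictions.log 2>&1
```

`0 5 * * *` fires at 05:00 every day, and `2>&1` folds stderr into the same log so a failed run is visible the next morning. Teams typically outgrow this into a real orchestrator once jobs have dependencies and retries, which is part of the production transition discussed here.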
Michael_Berk:
Got it. So that probably involves some software engineering.
Ben_Wilson:
So I had no idea how to do any of that. Like, zero clue. I knew how to write basic functions. I knew how to do stuff in SAS at that time, and I knew how to do basic Python stuff; I could write a simple function. But I had never written a test for a function. I had never written a class. I had no idea how any of that stuff worked. And I had certainly never deployed anything. So I took it upon myself to do three separate things that prepared me for that sort of work. One: I found some people that were really cool and would help me out with all the stupid questions I was going to ask. Second: I bought a bunch of books, and in the process of buying those books, I set out, based on the advice of mentors, to just build a bunch of projects; like, hey, figure out how to do these things with the guidance of these books, and ask them for help if I got really stuck. And the third thing was to look at a bunch of examples that were available on the internet. This wasn't reading through blog posts of the hello world for a library; this was somebody submitting an example of, like, an end-to-end deployment of an ML model. And also, I just got to know a bunch of people in industry at tech meetups and stuff, and I started asking them questions. Like, oh, what's the tech stack that you use? Oh, how do you do this? And I'd listen to everything they said. Right after talking to them, I'd take a bunch of notes, and then I knew that over the next week or so, that was all the stuff I'd have to go and read about. But I'd also have to go and try it and break it. And by breaking, I mean every single thing I tried the first time, I broke. And you learn a lot. You learn how to read stack traces. You learn how to write tests. And in the process of learning those things about debugging, you start to almost subconsciously learn how to write better code.
Michael_Berk:
Yeah, I've definitely experienced that as well. After you break literally everything under the sun, you start developing intuition on what will break when. That's interesting. So it sounds like your transition into the quote unquote data science world was completely organic. You didn't have this vision of becoming, well, data science probably wasn't even a thing then. You sort of brought statistical knowledge to physical processes, and then people were like, we actually want these processes to run in a scheduled manner and ideally not break. And that's where software came in. Is that essentially the path you took?
Ben_Wilson:
I mean, the term data scientist I've always found really weird, because it's just this coined phrase that somebody came up with and it stuck, and people are like, oh, it's my job title now, I'm super cool. But many, many job titles over the last 50, 60, 70 years would classify today as data scientists; they were just called different things. At the factory, we were called yield analysis engineers, but it was data science that we were doing, and analytics. At other companies that I worked at, at some of those previous factories, there were statisticians on staff. They were building models. They were evaluating how things were behaving in the factory using what we would call data science today. And there were researchers working on large-scale compute clusters who would definitely be called machine learning engineers or data scientists today, and they were doing deep learning back in the 1970s on high-compute clusters. So a lot of people think the data scientist job title just started in the last 15 years, but people have been doing this stuff for a really long time. It's just cool now. It's even cooler with everybody talking about chatbots.
Michael_Berk:
Yeah, it's officially a buzzword, and that always helps, because buzzwords originate from a process that existed for a while. It actually amazed me to learn this. I was sort of new to the ChatGPT world when the 3-point-whatever version was released to the public, and it turns out that there were chatbots with similar or even more powerful functionality for years prior, and deep learning was invented in the seventies and was used. Actually, I don't know if it was invented in the seventies. But even crazier is that some of the most cutting-edge statistical methods, or at least methods that are still used today, were invented in the fricking twenties. We really have been leveraging a long history of information and a long history of innovation and research. And so, yeah, the newest, greatest things are the best, but they're also not very tried and true. And I'm going to stop talking.
Ben_Wilson:
I couldn't have said it better. It's easy for people to look at a new magical thing that comes out, wonder how it works, and sort of by default associate this new functionality as something that was just born from the team that built it. But if you were to ask any of the people on that team, they'd be the first to say, no, it took 30 papers to come to this next decision; everything grows on itself. That's going to continue to happen. It's just how research works. That's how science works. It's going to be interesting to see what the future holds. But the other line of questioning that you had was about that transition into making stuff run in production, and what sorts of skills had to be learned to get that to happen. Going through the process of learning how to write code that doesn't suck, and understanding new technologies relatively quickly so that it can be done in a way that isn't dangerous, is relatively challenging. I would say I would not have been able to do it at all if I hadn't asked a lot of questions too. And there are multiple different angles to the asking-questions part that I don't think a lot of people outside of software development think about. You can ask a question via, you know, Slack or chat or email, or in person, face to face: hey, I want to know more about X, can you explain it to me? And I would argue that's the least efficient way of learning something when interacting with a mentor, because you're the one guiding it based on what your bias tells you you know and don't know. Our minds put things, particularly technical things that we're supposed to have a really good handle on, into this sort of bucket of: yeah, I know this. I've got this under control.
Then we also know the things that we have kind of a fuzzy understanding of, things we need more information on. But we generally have no idea about that vast ocean of things we're ignorant of. The more you know, and the further you go on your journey, the more you realize how deep and long that ocean is. Eventually you realize that there are oceans within oceans, and you're never going to get to see them all. But for software engineers, one of the tools that's used, not just for protecting your production code base and making sure that things that get deployed to production are going to work well, is the code review. We file a branch to merge to our main branch and we have this pull request. For people writing code, that review is the single most efficient way of learning, because it exposes what you think you know but actually don't know that well, as well as the things you had no idea about, or where you didn't know there was a better way to do it, or where the way you were thinking about the design isn't necessarily wrong, it's just that there's a way that makes the code more maintainable in the future.
Michael_Berk:
Yeah, just to piggyback on that, it's really interesting that you say that's the most efficient way to learn. I'm sort of junior in my career, but I've started managing a couple of people, and it's been interesting to see what techniques they like versus what techniques I like, and what techniques lead to the best result short term versus long term. If I need something done in two hours, let's say I say: do this, do this, go implement it. But then they might not retain long-term understanding of why I was thinking that way, of alternative designs and that type of thing. So if you're investing in that person, it's often best to just say: try to figure it out and then come back to me with ideas, and then I can steer. It's sort of the PR approach. There are many different management styles, and basically how your mentor treats you says a lot about how they think about your future. If they're investing in you, they're probably not telling you exactly what to do.
Ben_Wilson:
Yeah, definitely. I mean, if you're a direct supervisor of somebody and you see them as a blood-filled version of ChatGPT 3.5, and you're like, hey, I need you to build these four functions for me that do these things and then write at least a dozen unit tests that cover their functionality, that person's just going to be like, okay, I can do that. And they'll go off and do one of two things. If they already know how to do it, they're going to pull from their reserves of knowledge, or maybe reference something very similar that they've done in the past, and they're going to get it done really quickly and ship it out. And if it works, great. But if they don't know what they're doing with it, and they don't have a good rapport with that supervisor to say, hey, I don't quite grok this, can you walk me through how you would think about it, then they're going to default to the other avenue, which is Stack Overflow. They might win the Stack Overflow lottery.
Michael_Berk:
Yeah, we love that lottery. Big fan.
Ben_Wilson:
And find, like, the gold standard example. If they have any sort of, you know, self-respect, they'll use that as a reference and rewrite what they're going to do based on that implementation, write a bunch of comments in there to explain what's going on, and then also say, hey, I got this idea from the Stack Overflow post at this following link. But if they just don't care, they're going to copy pasta, and that code is then perpetuated into yet another code base.
Michael_Berk:
Wait, I have a quick question, and I'm curious if you experienced this as well. Throughout school I was, like, relatively smart, but I think my strength, and also my weakness, was that I was really good at quote unquote mastering things, but I was really bad at just getting a B plus or, like, an A minus. I was really efficient at getting an A plus, but I also needed a lot of studying and research to get to that level, and I couldn't sort of half-ass things. So what I have been recently trying out is this: I have required customer work and just Databricks work that I have to do, and I put on headphones, time-box it, like, literally set a timer on my computer, and get it done. I don't care how it works. I don't care why it works. I don't care when it works. As long as it works, we're good to go. And then I try to reserve time to learn at the pace that I enjoy. An example is going down a rabbit hole for, what was the most recent rabbit hole? For buffered loggers in Python, how those work. And I spent, like, my full 60 minutes that day just creating a prototype, trying to break stuff, trying to explore stuff. Now I think I have a reasonable understanding of conceptually how they're designed, but there's plenty relevant to buffered loggers that I just will not ever get to. So I've been dividing my time between going as fast as possible and not caring why things work, and then slowing down and actually enjoying the learning process when I'm not under deadline. Do you do something similar, or do you learn everything relatively thoroughly before you actually implement it in a production setting?
Ben_Wilson:
I wish. I personally don't know any software engineers that behave that way who aren't still in school, like in a doctoral research program or something. When you're getting paid to write code, nobody cares how you do it, except for the people that are paying you, and those aren't the other people on your team; they are the management of the company. They expect that the code will work. They don't care how it works; they're not interested in that. They want a product. Because we work in a capitalist society, those products make us money, which gives us our salaries. If you're not producing products at the pace your management expects, you won't be getting a salary anymore. It's very important to put priority on product when you're developing software. Even in developing models as well: data scientists aren't immune to this either. They get a little bit more leeway, I think, because people in management generally don't understand what they're doing. But people in management definitely understand the process of software development. It's been around long enough, and they know that, hey, they said it's going to take three weeks to do this, and it's been three and a half weeks. Where's the product? If you come back with, well, I wanted to understand more about this esoteric concept before I actually wrote the code, you might get away with that. After that, they're going to be like, no hard feelings, there's the door. Rightfully so. You can't do that. It's all about learning the foundations and having a really good understanding of how to generally write code and solve problems with code. This isn't "hey, learn this language really well." For software engineers, nobody has a job title of Python software engineer or Java software engineer. I'm sure those job titles exist, but if you're a true professional software engineer, it doesn't matter what language it is.
You just need to understand patterns of development, patterns of writing things, and how to figure something out in such a way that you can ship a product on time. There will be moments along the way where you're like, I really want to know how buffered readers work. If that comes up in a feature development cycle, you can spend a little time to learn what you need to learn about it, to be like, okay, I'm using one of these to write a stream of data into. Well, I need to understand: is there some sort of trigger where I need to dump everything that's queued up out somewhere? How do I change the size of this? What's the impact on the system? And usually you just write some tests real quick, or simulate some data, to see, oh, this is the behavior of this. There are certain times where you would need to go deep on something like that, I mean, like, really deep: if the solution to your problem is that you need to implement your own buffering system, because you can't use the one that's built into the language, or you have to extend it. That's where open source projects come from: somebody feeling like, hey, the thing that's part of my language doesn't do what I need it to do, and I really need it to do these other things. And that's when you do that project where you're going so far down the rabbit hole that you're interfacing with very base modules in a language that are marked developer-only, or that aren't generally interfaced with by most people using the language. And that stuff's cool. But if you were to do projects like that back to back, over and over, throughout an entire year of working as a software engineer, I think you'd get burned out. The mental load of doing that over and over with deadlines is pretty intense. Most good TLs are going to split up work like that among the team so that in any given quarter, maybe 20% of your team is doing work like that.
And then the next quarter, a different 20% of the team is doing work like that. You don't want to overload just a couple of people with the cool stuff, because the cool stuff can become not so cool if that's all you're doing. You just end up feeling exhausted by the end of the week, mentally drained, like, well, I was operating at this high level for this amount of time. Eventually that no longer becomes cool. Can confirm: Netflix and just a vegetative state, watching the screen with bright lights moving in front of me, not paying attention to anything. So it's very important to not go that deep when learning everything.
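The "write some tests real quick" approach applied to the buffered-logger rabbit hole mentioned above can be done entirely with Python's standard library: `logging.handlers.MemoryHandler` buffers records and flushes to a target handler when it hits capacity or sees a severe-enough record. A minimal prototype:

```python
import io
import logging
import logging.handlers

# Target handler writes to an in-memory stream so the experiment is self-contained.
stream = io.StringIO()
target = logging.StreamHandler(stream)
target.setFormatter(logging.Formatter("%(levelname)s %(message)s"))

# Buffer up to 3 records; also flush immediately on ERROR or worse.
buffered = logging.handlers.MemoryHandler(
    capacity=3, flushLevel=logging.ERROR, target=target
)

log = logging.getLogger("buffered_demo")
log.setLevel(logging.INFO)
log.propagate = False  # keep output confined to our handler
log.addHandler(buffered)

log.info("one")
log.info("two")
print(repr(stream.getvalue()))       # nothing flushed yet: ''

log.info("three")                    # third record hits capacity -> flush
print("one" in stream.getvalue())    # True

log.info("four")
log.error("boom")                    # flushLevel reached -> flush again
buffered.close()
print("boom" in stream.getvalue())   # True
```

This answers exactly the questions raised above: the flush trigger is capacity or `flushLevel`, and the buffer size is just the `capacity` argument.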
Michael_Berk:
It's super interesting that you say you shouldn't do that. Since joining Databricks, I feel like my consumption of crap TV has really increased, and I kind of like it. As long as I still have sort of a balanced and full life outside of work, it's really fun to just be constantly overwhelmed, going in every single day and being required to be at your peak capacity to output on time. And I mean, it's probably different for everyone, and maybe it's not sustainable. I've only been at Databricks for, like, I don't know, eight months or something. But, uh, yeah, I don't know. It's fun.
Ben_Wilson:
Well, going deep into technical things is slightly different than the concept of just having an intense work pace. You know, if you work at this company, you're going to have an intense work pace regardless of what you're doing. There's a lot of stuff that goes on. But if you're spending 10 hours a day digging into the source code of something, there's no documentation for this stuff. It's not like, oh, I can go to the user guide of this library and they explain how they built it. No developer does that. Nobody has time for that. So if you're interfacing with low-level stuff, or seeing, like, how would I build something from scratch that meets this need, the amount of complexity is severe enough that you're like, okay, I'm not just building this one thing. There are actually, you know, 80 components to this that all need to be built and working together in concert at some point, but I can't build it all at the same time, because nobody's going to review that PR that's 50,000 lines of code. So how do I break this problem up into 15 different component parts that I develop over a period of three months? There are only so many projects like that that you can do and not feel mentally overwhelmed, where you're just focusing on this one thing for a really long period of time and every day is just pure intensity. Yeah, it's a lot.
Michael_Berk:
Yeah. One of the,
Ben_Wilson:
particularly with the deadlines.
Michael_Berk:
Yeah. Yeah. Deadlines are weird, man. One of the things that I've been really enjoying as well is, with the pace of always having to learn new things, you start learning. It sounds kind of corny, but you start learning how to learn. You start finding patterns in what is valuable information and what is not. And maybe I'm just bad at that and always have been, question mark, but you start to see, when you identify a component of knowledge, let's say, let's go with the buffered logging example: knowing how it flushes to a CSV file. Let's say you don't really need to know that, because it's a function call that's already written for you. But theoretically, if you didn't know that it's not relevant to your run process, you would have to go into the source code and see how it actually flushes to a writer, or to a CSV file, let's say. So one of the really, really fun things that I've been enjoying is figuring out that pattern recognition for when a piece of knowledge is relevant to your process versus when it's not.
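To make Michael's buffered-logging example concrete, here is a hypothetical sketch of the kind of thing whose internals you could either take on faith or read in source: a tiny logger that queues rows and flushes them to CSV in batches. The class name `BufferedCsvLogger`, the `batch_size` parameter, and the threshold of 3 are all made up for illustration:

```python
import csv
import io

class BufferedCsvLogger:
    """Queue rows in memory and flush them to the underlying file in batches."""

    def __init__(self, fileobj, batch_size=3):
        self.writer = csv.writer(fileobj)
        self.batch_size = batch_size
        self.pending = []

    def log(self, row):
        self.pending.append(row)
        # The flush "trigger" is simply hitting the batch threshold.
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        self.writer.writerows(self.pending)
        self.pending.clear()

out = io.StringIO()
logger = BufferedCsvLogger(out, batch_size=3)
logger.log(["a", 1])
logger.log(["b", 2])
# Nothing written yet -- two rows are queued, below the batch threshold.
assert out.getvalue() == ""
logger.log(["c", 3])  # the third row triggers the flush
assert out.getvalue().splitlines() == ["a,1", "b,2", "c,3"]
```

The point of the exercise matches the discussion: as a caller you only need `log()` and `flush()`; when the row actually lands in the file is an internal detail you dig into only when it becomes relevant to your process.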
Michael_Berk:
Did you cultivate that over time or did you always have that or do you not have that?
Ben_Wilson:
I think this is an interesting topic. I've had this conversation with people in the past where they're like, hey, when you think about how something works, how does your brain work? Are you visual, or do you think about component pieces, or do you hear somebody talking in your head in a different voice as you solve problems? And from a young age, I've always had this... I wouldn't call it a skill. It's just how my brain works internally: when I think about something, or I look at something and I want to start thinking about how it works, it's almost like my head starts disassembling it, and then I see how everything kind of fits together. And that same level of abstraction happens with stuff like software, which is why I can write code relatively efficiently and understand how things work. Because that's what's going on inside my mental model: I'm looking at a function, but to me, when I look at that, it's like a box or, you know, a cube that's moving over to one side. And then from all of the places where it's called, there are strings being tied to other boxes that are using it. And then when I'm thinking about navigating through the code, I'm like a tightrope walker who's walking between each of the boxes and seeing, oh, what happens if I jump from this rope to this rope? Well, that's a different, you know,
Michael_Berk:
What does the box represent again?
Ben_Wilson:
Like a function or a method in code,
Michael_Berk:
God, it's so... component.
Ben_Wilson:
or it could be a class. Yeah. Just something that is called. But that sort of visual and spatial understanding of the world, that's just how my mind's always worked. And with that mental model, I kind of use it to think of other patterns that I can see. I'm like, oh, this is kind of like how this thing works. It's similar to this other thing that I understood, you know, and I kind of understand the causality of using different things in different ways. I don't know if that makes it easier for me to learn some of these concepts or harder. I don't have any other frame of reference. It's just how I think about things.
Michael_Berk:
Yeah, that's super interesting. It'd be cool to get a tour of some smart people's brains and see what structures they leverage, because a lot of it just sort of seems like magic. They just produce a good output: they think critically, they think quickly. And yeah, it's just so crazy that different people can have a visual model versus a completely intuition-based model. Like, I don't have boxes. I have nodes and edges, and I sort of think about how concepts connect via this, like, soft... Wow, I actually haven't thought about this before, so, innovation on the podcast. But it's sort of like a weird, almost kernel-based connection between
different areas of a graph, where the kernel transforms the shape so that it's congruent and fits like a puzzle.
Ben_Wilson:
Interesting.
Michael_Berk:
But it's definitely not as structured as boxes, as you say. It's very high dimensional, weird space.
Ben_Wilson:
Yeah, when I was a kid, I used to look at stuff like appliances that are around the house, or things that were in my dad's garage, like tools, like the lawnmower. It still happens today when I'm looking at something where I know something's broken and I've got to go out and fix it or figure out what's going on with it. My mind just still does it: it disassembles the thing in three-dimensional space into the components that are within it. I remember the first time realizing that there was something weird with that was actually looking at the lawnmower in the garage and just seeing it from the top. I'm like, how does it cut grass? Then my mind just started taking the cover off. Then inside there was this thing with a... I didn't know what it was called, but it's like, oh, this is an engine and it has a rotor attached to it, and there are pistons attached to the rotor. My mind sort of hazily deconstructed this, and then, lifting the lawnmower onto its side and opening the cover up, I validated that, yeah, that's how it works. It spins like this, and there's this shaft that goes into this thing. So my mind still does that. Sometimes it's kind of distracting and annoying, but it helps me visualize, which was a boon when talking to a lot of customers at Databricks when they speak in abstract terms about something complex that they want to build. I'm not building those horrible PowerPoint presentations of architectural diagrams that drive me up a wall because they're never detailed enough.
Michael_Berk:
Ha.
Ben_Wilson:
But I'm building what I think the components are actually doing when they're sending a packet of JSON data from one place to another. What do these things do? Okay, you have all this data, you're running Structured Streaming on a Spark cluster, and then you're connected to Kinesis, which is sharded out to 500 nodes. Which worker gets which packet of data, and how does it do retry policy? And I'm just visualizing that. Like, oh, here are all these VMs, it's 500 boxes, and they're all shooting out strings to all of these different worker nodes that are a fixed size, and they can auto scale as well. So I'm mentally imagining boxes just popping in and out of existence on those connections, and then seeing little packets of data moving to them and then going back the other way to do a retry fetch from a particular shard with an offset. So, I don't know, maybe I'm crazy. That's just how my brain works.
Michael_Berk:
This is wild. I remember one of the first times I ever heard Ben talk about something that I know something about, I was like, holy shit, this guy really knows his stuff. And then he did it for like 500 more topics. And every single time he used so much complex technology, I was like, this man just has a Rolodex in his brain. But it turns out that he just decomposes everything into a 3D Iron Man-type model. That's insane, I didn't know that. That's super cool. All right.
Ben_Wilson:
I'm just like a visuo-spatial sort of person, which means
Michael_Berk:
Just let me-
Ben_Wilson:
I'm absolute garbage at stuff like learning foreign languages. I'm terrible at that. I suck at art, like badly suck at art. My brain is just not wired for stuff like that. I can be creative when thinking about things that are spatial in nature. So constructing code is spatial, you know, for me at least it is, and coming up with ideas about how to solve problems like that works. But doing stuff that is unstructured, like art? Not good.
Michael_Berk:
Yeah.
Ben_Wilson:
But music makes sense to me because music is structural. So when I listen to something like a symphony... this is like the episode of Mind Talk. I've told this to a couple of people and it's blown their minds. When I hear a 40-piece orchestra, I'm actually hearing, or seeing in my mind, a 40-person orchestra and the 18 different components,
Michael_Berk:
What the hell?
Ben_Wilson:
but I'm listening to each person individually. So when I'm listening to the music, I'm, like, seeing that person playing. I think this person is playing the second violin, and they're also playing along with four other people, and I can hear that line in the music. Which makes music really fun when you listen to complex stuff like classical music, and it makes music really boring when you listen to crappy pop music. But yeah, another weird anecdote.
Michael_Berk:
Wait, so do you find complexity attractive?
Ben_Wilson:
Uh, in witnessing? Uh, yes. In creating?
Michael_Berk:
why in creating, because theoretically, witnessing, like, why are they different?
Ben_Wilson:
I like the fact... my favorite thing is a simple solution to a complex topic. There's a sort of beauty in simplicity.
Michael_Berk:
Cause all the pieces are working in harmony to a simple outcome.
Ben_Wilson:
Yeah.
Michael_Berk:
Got it.
Ben_Wilson:
So if we're going to continue with the music theme, what if I had you, as the conductor and the writer of music for an orchestra, and I told you, hey, I want you to create the most complex and discordant sound that you possibly can? The most effective way to do that is to have each instrument group of the orchestra starting at the lowest: the double basses out there playing their lowest note, maybe that's an E, I don't remember, with each subsequent instrument playing one half step away from each other, and to say, hey, play random notes. Is that complex? Yeah, there's a lot going on, and one would say it's pure chaos. That would probably drive me to run out of the room. I would not be able to deal with that. It would sound so bad to me, and there would be no order or beauty to it. There are entire genres of music that I just can't listen to, because it's either, you know, ridiculously simplistic and formulaic, just repetitive and annoying, or it's somebody sort of trolling an audience in the name of art. Like, what if I play this thing that creates not just tension and dissonance but pure chaos for a bit, and then I don't resolve it? It's like, why are you doing this? You're just doing
Michael_Berk:
Yeah.
Ben_Wilson:
this because you can, but that's complexity for the sake of complexity. People do that. They may know there's a simpler way to do it, but they want to show off. They want people to look at their code and be like, wow, this person is super advanced and super clever. But anybody who's actually super advanced and super clever with software who sees something like that is going to know the better way to do it, or the simpler way to do it. They look at that and they're like, maybe it's a super junior person who wrote that. But if they know who the person was that wrote it, and they know that person should know better than to do that, you're just like, you know, what an idiot. Like, what a complete asshole. Why would you write code like that?
Michael_Berk:
Yeah,
Ben_Wilson:
So.
Michael_Berk:
it's taking this to the maximum meta of that statement. So let's say you start off with sort of a fixed type of brain in childhood. Then throughout life you can certainly augment skills, but typically the way your brain works is the way your brain works. And I am of the opinion that you can actually do a lot to change the fundamental nature of how your brain operates, but it takes time, and you need to be exposed to a specific environment for a long period of time.
Ben_Wilson:
I believe that.
Michael_Berk:
One thing, okay, cool. Good so far. One thing that I think people never fricking work on, and it's really, really high ROI and kind of annoying that this isn't more pop-culture-y, is changing the goals that you shoot for. So, like Ben just mentioned, if you are looking to solve a problem efficiently, you will never write unnecessarily complex code. But if you examine the rationale for writing that code, it's: I'm probably a little insecure about my code-writing abilities, I want to try this new thing that I might not know works or doesn't work, and anything else that's wrapped up in creating a suboptimal solution. Those are the issues. It's not your skill set. It's your desire for output, and what output is good to you. And so one of the things that I've been playing around with over the past couple of years is reframing what I strive towards, in the hope of having an optimal result. That's a very, like, hand-wavy description of it, but it's really, really valuable. If you shoot for the right things, often the results come.
Ben_Wilson:
And to add to that, I would just say the vast majority of professional software engineers are not under the illusion that what they're writing is the optimal solution. Anything that you create is usually created under a certain state of duress, right? You need to get this thing shipped. You need to make sure your tests pass. You need to make sure that you're doing everything you can within that deadline to get this thing out there. If the pursuit is the most perfect code at all times, you're probably going to miss your deadline. If you're the person writing the code, you're going to miss stuff. We all have creator bias. It's not because we think we're so good. It's because your mind went down a path while writing this thing, and you're not constantly looking back at the entire path you went down to be like, hey, did I clean up all of those branches back there? I remember I went off to the right there for a minute, and then I had to go back to this Y in the road and move down this other path. Did I trash that right path that I went down? So that's what peer review is for. A fresh set of eyes coming in and looking at it and being like, hey, dude, I don't think this code is actually used anymore, or you can drop this exception because this shouldn't throw anymore. So that's the whole process of shipping code for other people to review: code that's not optimal, or that you know is not the best you could do if given infinite time and perfect solitude and wonderful health while writing it. The most important tip that I have for people is don't be afraid of what people are going to think about what you're producing. Realize that everybody is seeing it the same way, provided that you're a nice person and people want to work with you. If you're nice and you're cool, everybody just wants you to continue being nice and cool to them, so they're going to be nice and cool to you back.
And they just want to make sure that what you're producing is good, so that the product works correctly. But they also want to make it so that you understand what you missed, or teach you a better way of doing something by saying, hey, did you think of this, or could we try this? So long as you can get out of your own way with feedback. Even if you are producing something where you don't fully understand the best way to do it, realize that if somebody else sees that, they're not going to be like, man, Michael just checked this code in, what a muppet, did you see how jacked up this was? Nobody is thinking that, I promise you. Nobody who matters thinks that. If you're sending it to people who are doing peer reviews who are your peers, they're going to be like, hey, dude, I see you wrote all this stuff, this whole function, but there's this thing in the native library that does this exact thing. And they're not going to say it does it better than how you implemented it. They're going to say it does this so that you don't have to, which is a hint: you don't want to maintain what you just built, because now you own it, and it's another thing that could break when a new version of the library gets released. You now have to update it, you know, because it's going to break, it's going to deprecate. You're going to get all this great feedback from people that you're cool with and who are cool with you. That's why I said earlier on, that's the fastest way to learn: getting out of your own way.
Michael_Berk:
Yeah, I guess I agree. And even if people do shit on your code, that's fine. Like, they
Ben_Wilson:
Yeah, don't take it personally.
Michael_Berk:
They might say it in a rude way, but yeah, don't take it personally, and take the nugget of information that they do have and learn from it.
Ben_Wilson:
Yeah, the only time to completely disregard comments on things that you produce is if somebody is just trying to attack you. It's an ad hominem attack, like, why did you build this? This is dumb. I built something better. It's like, okay, I'll take a look at what you built, but that's not what I was told to do. I was told by my boss to build this thing. And usually the answer there is, yeah, the boss is like, I don't like that guy, you know. If somebody's being a jerk just to be a jerk, you can usually tell, because there's nothing constructive that they're producing. They just say, this sucks. It's good to push back on people like that and say, could you elucidate? Could you tell me a little bit more about why this doesn't meet the needs, or why you don't like this? People who are doing it just to hate on something typically can't answer that question. They don't know what could potentially be wrong and they're just being a jerk, or they want to disengage from that conversation as soon as possible. Either way, you win.
Michael_Berk:
Can you hear that? Pause real quick. It's freaking take-your-child-to-work day.
Ben_Wilson:
No! That's what that is. That's every day for me.
Michael_Berk:
True. I'll have this edited out, but hopefully in a sec. Okay, they're gone.
Ben_Wilson:
That's gone.
Michael_Berk:
All right, so we're coming up on time, so I will try to wrap. We did not achieve any of the things we tried to achieve, but this was a fun conversation. So, some of the nuggets that I thought were interesting: the way that Ben learned early in his career was, first, find people to help you. They can guide you on projects and also provide feedback. Second, buy a bunch of books. I'm a fan of blog posts, but books are really, really good when you have to go deep. And third, go on the internet and try to implement stuff that's already out there. It's rare that there are code repositories covering super in-depth or cutting-edge topics, but at the same time, open source technology is pretty advanced, so you can find some cool stuff out there. The other component that we talked about was ways to ask questions, whether that's from the mentor or mentee perspective, and we'll take the mentor perspective in this summary. If you're looking to invest in the people who report to you, explaining how to do stuff directly is probably not the best option. Throwing them in the deep end is a common approach, because it's very easy for mentors to do that and then just go about their day. But a middle ground of creating sort of sandbox environments, where your direct report can play and learn in a structured manner, is typically the best way to manage someone for growth. Everybody has different styles, though, and each person is different, so tailoring your style to that person is super relevant. So I think that concludes it. Ben, do you have any final thoughts?
Ben_Wilson:
No, it's a good summary.
Michael_Berk:
Alright, well until next time, it's been Michael Burke and my co-host.
Ben_Wilson:
I'm Ben Wilson.
Michael_Berk:
Have a good day everyone.
Ben_Wilson:
Take it easy.