LLMs on Azure - ML 123

In today's episode, we speak with Jeff Prosise, founder of Wintellect, an Azure-focused software consulting company. Expect to learn about your next LLM MVP on Azure, the societal impact of AI, the bifurcation of model size, and much more!


Transcript


Michael:
Welcome back to another episode of Adventures in Machine Learning. I'm one of your hosts, Michael Burke, and I do data engineering and machine learning at Databricks and I'm joined by my cohost.
 
Ben:
Ben Wilson, and I use LLMs to write my docs for me at Databricks.
 
Michael:
Yes. Well done once again. So today we have Jeff Prosise. He started off as sort of a mechanical and aerospace engineer and then moved into the software world. And we might touch on some of that background a bit, but we're really going to be focusing on his software experience, and he has a ton of it. He has nine books and hundreds of magazine articles, all centering around the topic of software. He co-founded a company, and that was acquired recently by Atmosera, and now they provide sort of software solutions for industry. And his current title is something that I would love to learn about: it is chief learning officer. So Jeff, what is a chief learning officer?
 
Jeff Prosise:
That's a great question, Michael. I was hoping you could tell me, actually. No. So the company that I founded more than 20 years ago, Wintellect, had two primary lines of business. One was the consulting business, where we helped customers build software solutions large and small. The other line of business was doing hardcore developer training. So over the years, we've trained tens of thousands of developers at companies like Microsoft to help make their software developers better, acquaint them with the latest tools and languages and platforms and libraries and technologies and all. So when Wintellect merged with Atmosera two years ago, because I had always been pretty heavily involved with the training arm of Wintellect, it was just sort of natural for me to take that over at Atmosera as well. So my title became Chief Learning Officer. I'm actually having new business cards printed this week.

Michael:
That's exciting.

Jeff Prosise:
And we're changing that title to Chief Artificial Intelligence Officer, because I sit on a lot of calls with customers and potential customers talking about AI. And sometimes they wonder, what does this guy know about AI? His title is Chief Learning Officer.
 
Michael:
Interesting.
 
Ben:
Should have just prefaced it with machine, and then people would say, oh, machine learning officer, got it.

Jeff Prosise:
Machine... look, I hadn't thought about that. Yes, CMLO. Yes. So yeah, the title is probably going to officially change soon, but I love the training industry. We've been doing it for a long time. Love working with smart people. It's one of the things I love about what I do: I get to meet, work with, and interact with a lot of really smart people. And what does that involve? Figuring out ways to infuse AI into business processes to make companies better, to make them more efficient. To me, it's not even work, it's just fun. I love doing this stuff.
 
Michael:
So what is your day to day? I'm still not getting a clear picture. Talking to customers, for sure, but what does that entail?
 
Jeff Prosise:
Yeah, it's a good question. So my day to day: I still do a few training classes a year. We have a number of people that teach these classes, but I do a few a year myself; I'm doing one next week for a large company that you've heard of. I spend a fair amount of time doing research. As you know, AI is fast moving right now. In fact, every day it seems like there's something new to learn about. So I try to spend 20 to 25% of my time doing research, writing code, reading papers, and updating the AI course that I wrote to make sure it's current and covers all the latest developments and bells and whistles. And then I do spend a significant amount of time with customers, a lot of sales calls with our sales staff, hearing about the problems customers are having, brainstorming about ways that AI may help. And then sometimes I get to roll my sleeves up and actually get involved in helping to architect and implement those solutions as well. So a little bit of training, a little bit of research, a lot of working with customers, designing and building solutions, and just brainstorming what the solutions will be. That accounts for the bulk of my time.
 
Michael:
Interesting.
 
Ben:
I'm really curious what you've seen. Everybody who's even somewhat aware of what's going on in the tech world has followed the last eight months, since the November bombshell where, all of a sudden, laypeople are aware of AI again for the first time in a decade. But that hype doesn't seem to be slowing down. It's speeding up, and people are actually using this stuff for real now. But we've talked on our podcast in the past couple of months about how this stuff isn't new. It's been around for a very long time. Even the foundational models, the GPT stuff that OpenAI has been working on, the Transformers architecture, has been around for years. How do you see those conversations change in customers that you're interacting with when people are suddenly realizing that? They don't have to do it how you would have had to have done it two, three years ago, where you take the Transformers architecture and the white paper, you take PyTorch and build it from scratch, and then say, where do I get my training data from? Does that change the dynamic of how you're interacting with these customers?
 
Jeff Prosise:
It does. So November was a key time, as you mentioned. It's when most of the world became aware of AI, and in particular of large language models, which have completely transformed the industry and have raised the cognizance level around AI to heights it has never reached before. I've got to tell you, I knew that AI had hit the mainstream when, in December, my wife came home from a meeting at church and said, have you heard about this thing called ChatGPT? We talked about it in the meeting tonight. My wife is a sweet person. We recently celebrated our 40th anniversary, but she's not a technical person. And it occurred to me, if she's talking about ChatGPT, AI has hit the mainstream. So yeah, it has changed conversations in a lot of ways. And a couple of things that stand out to me, Ben, are, number one, customers are much, much more aware of AI and its capabilities and the kinds of things that it can do for them, because they've seen demos. They've gone to the OpenAI website, set up an account, played with ChatGPT, and they want to know what it can do for them. So it's interesting. In some of the conversations we have with customers, these customers are very knowledgeable about AI, but more commonly, they don't know a lot about it yet. There's almost this fear that if we don't get on top of this, some of our competitors will, so we have to do something now. I answer a lot of questions for these customers, and to be frank, and I'm sure you can understand this, there's quite a lot of education to be done, because there is this perception that AI can do anything now. You guys know it can't do everything. It can do a lot of things. So part of what I do is help steer the customers, saying here's what we can do and here's what we can't do. One thing I've noticed in particular is that people tend to think ChatGPT can work absolute magic. I often spend a little bit of time with these customers kind of educating them about where ChatGPT came from, going back to the famous Transformers paper in 2017 that you mentioned, and Google's BERT model, which was sort of built on top of that, and how we could fine-tune it and make it do amazing things. But now with the GPT models, you know, we've upped the ante there. One thing I try to get across to them, and I still speak at a number of conferences, I enjoy doing that, I love educating people, seeing the light bulbs go on, and I frequently now speak about what ChatGPT is, where it came from, and how it works. And one of the key points I try to get across is that it is a statistical word generator, a next-word predictor, nothing more, nothing less. I often walk people through the transformer architecture, through BERT, tell them how we got to GPT-1, 2, 3, ChatGPT, and GPT-4. And I show them this animation I put together that shows how you give the model a prompt, it generates a token from that prompt, then it iteratively goes back in using the same prompt as input plus the token that it generated, and it generates the next token. And when you understand that, you understand why ChatGPT can't do math, for example. All it can do is predict the next word. So tons and tons of customer interest since November, no doubt. Even in the last three or four months, I've fielded an order of magnitude more calls from interested customers than before. But there's a big educational component to it too: letting people know this is not magic, here's how it works, here's what we can do, and here's what we can't do.
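[Editor's note: for readers who want Jeff's animation in code form, here is a minimal sketch of the generate-one-token-and-feed-it-back loop he describes. The `model` and `tokenizer` objects and their methods are hypothetical stand-ins, not any particular library's API.]

```python
def generate(model, tokenizer, prompt, max_new_tokens=50):
    """Greedy autoregressive decoding: the model only ever predicts the
    next token; 'generation' is that one prediction run in a loop."""
    tokens = tokenizer.encode(prompt)
    for _ in range(max_new_tokens):
        probs = model.next_token_probabilities(tokens)  # dict: token_id -> probability
        next_token = max(probs, key=probs.get)          # greedy pick; sampling adds variety
        tokens.append(next_token)                       # feed it back in with the prompt
        if next_token == tokenizer.eos_token_id:        # stop at end-of-sequence
            break
    return tokenizer.decode(tokens)
```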
 
Ben:
Yeah, one thing that I've heard from a lot of people, particularly with the advent of the difference between... you can go to Hugging Face and you can get GPT-2, the grandfather to the modern ChatGPTs, and you can take it, fine-tune it, train it on your data, kind of look at it and be like, man, this thing's kind of dumb. Like, it doesn't really understand context very well. It hallucinates a ton. And sometimes it just generates a bunch of gibberish. Even if you dial the temperature way down, it still sometimes just kind of goes crazy. And then people look at the difference between that and 3.5 turbo and say, wow, what a generational leap here. It's like, yeah, a couple of years of development effort there. But then they start to wonder, okay, what if I take some of these modern open source corollaries to the OpenAI proprietary models, the ones Microsoft helped them build, with the hardware at least? And then there's the generational leap to GPT-4, which went GA last night, so everybody can use it now with the API keys. And people wonder, well, can I take, you know, LLaMA, or, you know, MPT from Mosaic, and get that to perform exactly like GPT-4? And then they train it, or try to use it just raw, pre-trained, and they're like, it still doesn't perform the same. Do you actually talk with customers to explain mixture of experts and what that architecture would look like? That's actually what GPT-4 is. It's like, hey, there's like 30 different models all running in parallel, basically, and each of them is answering the question, and we vote on what the best response is. Do you see that... those proprietary models and all of the engineering work that's going into that, which is not necessarily data science work or ML work but more like pure software engineering to get that mixture of experts working. Do you see that eventually coming to the open source, and how is that going to challenge some of these SaaS vendors?
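[Editor's note: GPT-4's internals are unconfirmed rumor; the "many models, vote on the best response" setup Ben describes is closest to response-level ensembling, whereas production mixture-of-experts systems route tokens to experts inside the network. A toy sketch of the voting idea, with stand-in expert functions:]

```python
from collections import Counter

def committee_answer(experts, prompt):
    """Each 'expert' model answers independently; the most common
    answer wins. This is ensemble voting over whole responses, the
    spirit of what Ben describes, not true in-network MoE routing."""
    answers = [expert(prompt) for expert in experts]
    winner, votes = Counter(answers).most_common(1)[0]
    return winner

# Usage with trivial stand-in experts:
experts = [lambda p: "42", lambda p: "42", lambda p: "41"]
print(committee_answer(experts, "What is 6 x 7?"))  # -> "42"
```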
 
Jeff Prosise:
Yes, the short answer is yes. The longer answer is, with regard to having conversations with customers about using, for example, the OpenAI models versus some of the models they can pull from Hugging Face, we have that conversation often. In fact, I had a conversation with a customer yesterday about this. It's a customer that we're just starting to do a bunch of work for, really about using generative AI to build something that basically takes a lot of information from PDFs generated by governmental entities and puts a question-answering solution over the top of it. And we built a POC, we demoed it for them yesterday, and they loved it. But they also said, hey, is OpenAI our only option here? They said, what if OpenAI goes away someday? Are there private models that we could use? So my answer was, yes, it's always possible OpenAI could go away someday. I doubt that'll happen, especially with, you know, the backing that Microsoft has put behind them. But yes, there are also models that you can get from places like Hugging Face that will do some of this. In fact, one thing we've done at Atmosera, we've actually been building a test harness and have done some quantitative testing on various LLMs. And what we found is what I think everyone has found, and it's what I shared with this customer yesterday. Yeah, we can pull something from Hugging Face. It's not going to cost you anything, and it's going to be lower latency, because with sufficient hardware we can run it locally. But it's not as good as the GPT models. You can't beat what ChatGPT and GPT-4 can do in terms of ingesting content or context that you provide and providing an answer to a question, for example, that's very much like one generated by a human. So are there other models? Yes. Will they get better over time? Absolutely. For now, for our customers who want top-notch performance, top-notch results, OpenAI is the way to go, in my opinion. And one thing we steer them towards, by the way, something that won't surprise you guys: there are two ways to utilize these OpenAI models. You can make REST calls out to OpenAI itself, to its servers, or you can stand up private instances of these models in Azure and call them that way. We highly recommend to our customers, especially our enterprise customers, that they go the Azure route. One, it doesn't cost anything more, but now it alleviates a lot of those security and privacy concerns, because you're not passing your data out to some public web server somewhere. It also gives you more control over rate limits, availability, and things like that. So we often build our demos against the public OpenAI models, but when we build something for production for our customers, we use Azure to do that.
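[Editor's note: a sketch of the two calling patterns Jeff contrasts, using the 2023-era (pre-1.0) `openai` Python SDK. The Azure resource and deployment names are placeholders.]

```python
import openai  # 2023-era (pre-1.0) SDK

# Option 1: REST calls to OpenAI's own servers.
openai.api_key = "sk-..."  # your OpenAI key
resp = openai.ChatCompletion.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Hello"}],
)

# Option 2: a private Azure OpenAI deployment (names are placeholders).
openai.api_type = "azure"
openai.api_base = "https://my-resource.openai.azure.com/"
openai.api_version = "2023-05-15"
openai.api_key = "<azure-openai-key>"
resp = openai.ChatCompletion.create(
    engine="my-gpt4-deployment",  # Azure addresses deployments, not model names
    messages=[{"role": "user", "content": "Hello"}],
)
```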
 
Ben:
Yeah, and we just actually, five days ago, my team built the integration to that service for MLflow, which will be released soon. But we found that as well. We were like, oh, it comes with Azure Active Directory integration too. So you don't even have to worry about managing a key anywhere. And with our integration, it's a single service principal key that you just attach to our bastion server, or our reverse proxy server. And then whoever logs in, as long as you're logged in with Azure Active Directory, you can just start hitting it, and you get attribution for who's doing what. You know, you get that logging that's part of that. And you don't have to worry about anything, really, just how much you're eventually using. You are going to get billed for it, but...
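[Editor's note: a hedged sketch of the Azure Active Directory flow Ben alludes to, using the `azure-identity` package to fetch a token instead of storing a static API key. Endpoint and deployment names are placeholders, and this is the generic Azure OpenAI pattern, not the MLflow integration itself.]

```python
from azure.identity import DefaultAzureCredential
import openai  # 2023-era (pre-1.0) SDK

# Fetch an AAD token for the Cognitive Services scope; the signed-in
# identity (or a service principal) authenticates, so no key is stored.
credential = DefaultAzureCredential()
token = credential.get_token("https://cognitiveservices.azure.com/.default")

openai.api_type = "azure_ad"
openai.api_key = token.token
openai.api_base = "https://my-resource.openai.azure.com/"  # placeholder
openai.api_version = "2023-05-15"

resp = openai.ChatCompletion.create(
    engine="my-gpt35-deployment",  # placeholder deployment name
    messages=[{"role": "user", "content": "ping"}],
)
```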
 
Jeff Prosise:
Exactly. You are going to get billed for it. I have to say, with ChatGPT in particular, I'm amazed at how much you can do for how little money. I think I had a bill on my OpenAI account for like a buck fifty last month, and I had used it a ton. I mean, when you run the numbers on ChatGPT, you get more than a quarter million words for about a dollar. I don't know how they run this thing for the cost they do. Certainly Microsoft has something to do with that. But yeah, Azure is absolutely the way to go. You know, when Microsoft made that original investment in OpenAI a few years ago, or a couple of years ago, I was a little bit skeptical. Like, what's the big picture here? Boy, a billion dollars is a lot of money. In retrospect, it was a brilliant move, because you can't go to AWS or GCP and stand up an instance of ChatGPT. You can do that in Azure. And Azure, I think it's safe to assume, will remain the sole cloud provider that can serve up these models. So, you know, hats off to Satya Nadella and the others, whoever was involved in the decision. They laid out a lot of money, but I think it was a great decision, and we're seeing that right now.
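[Editor's note: a back-of-envelope check on Jeff's "quarter million words for about a dollar," assuming the mid-2023 gpt-3.5-turbo list price of roughly $0.002 per 1,000 tokens and the common rule of thumb of about 0.75 English words per token.]

```python
# Both inputs below are assumptions, not quoted figures from the episode.
price_per_1k_tokens = 0.002                       # USD, mid-2023 list price
tokens_per_dollar = 1_000 / price_per_1k_tokens   # 500,000 tokens
words_per_dollar = tokens_per_dollar * 0.75       # ~375,000 words
print(f"~{words_per_dollar:,.0f} words per dollar")  # comfortably over 250K
```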
 
Ben:
So how do you see the arms race playing out? I'm sure you saw the news: our company last week announced we just acquired basically the company that's the big challenger to OpenAI, with slightly different model architectures, potentially a lot more data, and full cloud compute backing their training.
 
Jeff Prosise:
Wow, I did not see that news. So I'm going to have to dig through this. That's really, really interesting, Ben. In terms of how the arms race plays out, right now OpenAI is sort of the 800-pound gorilla, right? They have state-of-the-art models. And some of them, especially, you know, ChatGPT, which is based on GPT-3.5, are very economical to use. But it's going to be good for industry if there are alternatives. Competition is almost always good. So it's going to be interesting to see how Databricks plays this. I'm sure there'll be others making similar moves. The way I look at it, it's all good. I'll have my choice of models. I can test different ones before I build something for a customer to put into production. And the competition will also create price pressure. That means hopefully I can run these models economically. I mean, what we'd all love to do is to be able to run these models locally, right? Not so much because we're worried about the cost, but because we're worried about latency, worried about availability, and things like that. One thing I think a lot of people don't understand yet is, you know, you don't run models like ChatGPT on a throwaway computer with a Pentium in it. You've got to make some investments in hardware to run these models in a performant way. But I do look forward to the day. I mean, we have models like this that we can run locally now, nothing that I'm aware of that really competes with OpenAI's models, but that's going to change. That's going to change, and that's going to give us more options, let us write better software, do a better job for our customers.
 
Ben:
Yeah, I have seen the challengers that we cannot talk about, and I've seen the performance of the single expert. And it's interesting to see some of the research that's coming out of, particularly, the Stanford NLP lab, where they're starting to work on the next generation, like the successor to transformers. Like, hey, how can we get medium-term storage memory built into an accessible expert model that's a thousand times smaller than what a transformer model currently is? So when we're talking about parameter counts, that arms race, we're already hearing word from some of these AI foundational model companies that are saying, yeah, we're going to be going to 200, 400, 800 billion parameters with these models. And you're like, where are you going to run that? The GPUs don't exist to hold that in memory right now. So they're looking at, okay, how do we use current, relatively cheap GPU hardware? How do we offload the model weights so that we're doing some sort of intelligent fetch, or querying a vector DB service, instead of having to read files into memory? What I wanted to get your take on is, what do you think is going to happen when one of these competitors says, hey, we're all about altruism, we think this is for everybody, and we don't want to charge for this? We'll charge for setting it up and getting the infrastructure for you, but we're going to take the core guts of what makes this thing magical and give it away to the world for free. What do you think happens then?
 
Jeff Prosise:
Well, I think things get more exciting. I mean, ultimately, you know, when we're spec'ing something out for our customers, what we care about really is performance and the quality of the output that is generated. Performance is a hardware issue, right? If you're willing to put enough into the hardware, you don't have to worry too much about performance. I mean, you're running inference, you're not training the model, hopefully. If these models can produce output that rivals that of the OpenAI models, I think customers will flock to them. I mean, you know, no one likes to call an API. Even if you're standing all this up in Azure, maybe you've got an Azure App Service running that's talking to an instance of GPT-4 that you've stood up in Azure. You've put them in the same region, same data center, so you have an incredibly high bandwidth connection between them. There's still a latency there. And you're passing information back and forth, and you don't totally control the route that it takes, although the story behind that in Azure is pretty good. No, I think people will embrace it. And I think that's probably where we're headed. But it's interesting you mentioned that folks at Stanford are working on what comes next after Transformers. There was a really interesting paper published a few months ago by some researchers at MIT making the case that we have pretty much reached the limit of what AI can do at this point, unless we find an alternative to neural networks, or an alternative to transformers, or we find a better way to train the neural network, something better than the backpropagation algorithms we use now, or until hardware suddenly becomes several orders of magnitude faster, maybe quantum computing or something like that. So, you know, everyone is asking now, should we be afraid of AI? What should we be doing? It actually concerns me a lot. We're hearing politicians talk about how we need to place limits on this. Frankly, people in Washington, DC, who mostly are lawyers, are the last people in the world that can make quantitative judgments about this stuff. What I've been telling people is I don't think we have a lot to fear. And there are a couple of reasons for that. One is, AI is much more limited, I believe, than most people think.
 
Ben:
Mm-hmm.
 
Jeff Prosise:
I often tell people, we have figured out very clever solutions to very specific problems. Image classification, for example: is this a dog photo or a cat photo? Object detection in images, the kind of thing that a self-driving car needs, identifying various objects, their positions, confidence levels, things like that. And now we've made great strides in natural language processing. So there are a handful of things we can do there. We can do neural machine translation really well. We've gotten pretty good at generative AI, text classification, document summarization, keyword extraction. But outside of that, AI can't do much. There's much, much more that it can't do than it can do. So when I see people, and really smart people like Geoffrey Hinton, who I'm sure is ten times smarter than I am, when I see him say, hey, we need to put the brakes on this, it really gives me pause. And I've done a lot of soul searching recently, saying to myself, what am I missing? There are people a lot smarter than me saying we need to be careful here. Fortunately, there are a lot of smart people who are now pushing back on that. Andrew Ng at Stanford recently pushed back, and he said something to the effect of, if we want humanity to survive and to thrive, we need to accelerate AI, not slow it down. François Chollet, who was the creator of Keras. Never met him in real life, but I have a ton of respect for him. Love his book, follow him on Twitter. He's also pushing back and saying, hey, let's think about this. I've been thinking about writing a piece myself about good AI. There are so many things AI is doing right now to make this world a better place. And if we're going to put the brakes on it, that's fine, but let's make a decision with good data. Let's realize what we're giving up by slowing down on AI research. So sorry, I got off on a tangent there, but it's something I hear a lot about and I'm passionate about. I was at a family reunion a couple of weeks ago, and one of my wife's cousins said, wow, this is getting kind of scary. What do you think about AI? Should we be concerned? I mean, everybody is talking about it. And frankly, a lot of what you're hearing out of Washington and in the news media and on Twitter and stuff, I think is pretty uninformed.
 
Ben:
I couldn't agree more. We've had discussions on this topic on the podcast before. What blows my mind is everybody's concerned about this right now with respect to generative AI and these LLMs, and people don't seem to know, or even think about the fact, that a lot of the decisions being made about things that you actually care about in your life have been automated for decades now. They might not be sophisticated deep learning models, though some of them are transitioning to that. But when you go over to the car dealership and you're 23 years old, you just graduated from college, you really want that car as sort of a gift to yourself, or, hey, you want a new car to get to your first job, and you fill out the paperwork, and it comes back and says, declined. No human made that decision. There's not some person sitting at your bank who's just ready for when that application comes in, who can read through the data and say, yeah, we believe Michael's trustworthy, or no, we don't know if he has enough history. It's all algorithm based, and it's based on machine learning models. That's something that people can understand from an example. There are far more insidious things that have been trained and are in production running at the government level that I think a lot of people don't really understand. Policy decisions are made based on some of these models: where do we zone people in the construction and expansion of a metropolitan area? Where do we actually want to deploy police to for… making sure that crime doesn't happen in a city? All of that stuff is based on historical bias, which is a lot of the problem people have with how things work in the world. They don't really understand how those decisions are being made. The granularity of some of this decision making is so extreme that humans can't handle it, so it's offloaded to machines. You're like, hey, the shift work for police in St. Louis, Missouri: why are so many of them in this area of the city versus this other area of the city? Why do I drive through whatever side of town and I can't go three minutes without seeing a cop car driving by, or pulling somebody over, or interacting with somebody along the side of the road, with a particular ethnic bias that people have a serious problem with? And we wonder why there are these sorts of issues. It's because some of these decisions are being made by our own biased and flawed data. So when I hear that argument of, well, we need to pump the brakes and we need to slow down, it's like, nobody's really using these LLMs yet in any way that could be dangerous. They're insanely useful for people in most industries, and provided that there's some sort of quality guards on them, I think it's fine. But nobody's really talking about the really big elephants in the room, which are bad models built from good intentions on bad data, which people then have a problem with and blame, you know, politicians for. It's like, they didn't make these policies. The government approved contractors to come in and build this stuff. They're doing it for the money and putting this stuff in production. So I always just find stuff like that interesting, because it's new and everybody's talking about it, all of the vitriol gets attached to the new thing, because people don't really understand what it is. But then the people who do understand are like, wait a minute, why are you getting all bent out of shape? You're fine with the fact that all this other stuff is automated, but you don't like something that seems human because it can put text on the screen? Really?
 
Michael:
Yeah, I think you hit the nail on the head. It's about understanding, and because it can walk and talk, it is a lot more understandable, and people think it's a lot more advanced than it is. If we develop an algorithm that breaks hashing, that is World War 7. Like, the world is done, and that would be absolutely silent, and no one would really understand how it works. But because ChatGPT is very easy to interact with and sort of on your side, it helps you brainstorm, it helps you problem solve, people start using pronouns for it. I've heard several colleagues slip up a few times. I think it's a fundamental lack of understanding of how the model actually works, and there's lots of scary stuff.
 
Jeff Prosise:
There is, and you know, I think there's certainly a challenge here in controlling expectations. People are concerned about AI models being biased, and that's a very valid concern. By the way, one thing I often tell people is it's not the model that's biased, it's the data that you trained it with. The model is agnostic, right? And guess what? We train these models on data generated by people. People are flawed. All of us are. AI may never be any better than that. We can't expect AI to be perfect. We can't expect it to be unbiased. We can do our best to make it unbiased by controlling the datasets that we train with. But at the end of the day, all these models are doing is learning from content we've generated, massive volumes of that content. And you can try all day to remove every last trace of bias from it. You'll never get all the way there. And how do you define that anyway? I just wish that people would understand that. You know, I read in the news yesterday about a new law or ordinance passed in New York City. It's called Local Law 144 or 177 or something. Did you see it?
 
Ben:
Mm-hmm.
 
Jeff Prosise:
The gist of it is that if a company is going to use AI or machine learning in the hiring process, there are now a lot of guidelines they have to conform to in New York City. For example, they have to document the algorithm that's used to do the selection of the hires. Lots of other stuff. And there are some pretty severe monetary penalties for companies that are hiring and recruiting without going through this process to get approved by New York City. And I look at that and I think, what are they trying to do? Kill this stuff off? You know, even if you completely remove any kind of machine learning or AI from the hiring process, now the process is being driven by humans, who are inevitably flawed and biased. So I see a lot of warning signs here. I definitely agree with Andrew Ng. There are so many good things AI can do for the world and for society. We are at risk, I believe, of preventing some even more good things from happening because of our own misunderstanding. And that's my fear. What if there were federal legislation that says before you deploy an AI model, you have to document the algorithms used? Well, you guys know, one of the hardest things about AI, or machine learning in general, is explaining why a model reached the conclusions that it did. There's a whole new field called explainable AI where we're just trying to figure that out. But can you explain, when you input a prompt to ChatGPT, why it generated what it did? No, you really can't. It's baked into those 175 billion parameters. Even if you crank temperature down to zero to get more repeatable and reproducible results, you still can't explain it. We just need to be really careful. It gets back to education, I think. Folks like us that work in this industry, who work with these models and understand them, we need to be really proactive about making sure people know what they are and how they work. Again, ChatGPT is a glorified next-word predictor. It's really good at what it does. But in theory, you could just build a straight-up statistical model that does what ChatGPT does. The real brilliance of ChatGPT is that rather than build that statistical model by hand, OpenAI built this really cool neural network, trained it with tons of data, and it built that statistical model in the parameters. There's a lot of education that we have to do here. And by the way, if I may, you mentioned one use for machine learning: processing loan applications, making judgments about who should get a loan and who shouldn't. I think one of the biggest wins for machine learning is in fraud detection with credit card companies. It's what got me interested in this field 10 or 12 years ago. I travel quite a lot around the world. I carry my Delta SkyMiles American Express wherever I go and collect, you know, frequent flyer miles with Delta. And once a year, once every two years, the card number gets stolen. Inevitably, Amex calls me and confirms that it wasn't me trying to make the purchase. When I say no, they cancel the card and overnight me a new one. It is amazing how good their fraud detection algorithms, for lack of a better word, are. Better, in my opinion, than what I've seen from Visa and MasterCard. How do they do it? They do it through machine learning, right? Not once, not once, has someone successfully used my Amex card fraudulently, thanks to the machine learning models that Amex runs.
 
Michael:
That you know about.

Jeff Prosise:
No... not that I know about it. Well, exactly right. And I do look over my statement every month, by the way, just to be sure. And only on one or two occasions have they errantly flagged a legitimate transaction as a fraudulent one. Yesterday, my wife made a purchase with her card, which is linked to mine. I got a text message from Amex saying, hey, can you verify that this was you or her? I answered yes, and it went right through. There are so many things that we would have to do without if we legislated too heavily around AI. We just need to be careful. And folks like the three of us need to be super proactive about educating people about what it is, how it works, what it can do, what it can't do, et cetera.
 
Ben:
Yeah.
 
Michael:
All right, well, let me jump in real quick. I have a scenario for both of you. You have just been appointed king of the world. You can implement three policies going forward over the next, let's say, five to 50 years that will influence how AI is developed. What are those three policies, from a fairness and ethical, whatever, standpoint?
 
Jeff Prosise:
Wow, Ben, I'm going to let you go first because I'm not sure what to say.
 
Ben:
I think it would be more like an emperor, right? But I would say my first thing is to reevaluate education policies globally and ensure that everyone has access to quality education, and that there's much more incentive for gifted people who can educate the next generation to be compensated appropriately. You don't get the best and the brightest joining the education system for children, because the pay is garbage and the work is government-controlled and not particularly creative any longer. So there are a lot of issues with it, but I would institute some sort of policy that would make sure that within 50 years, the general population of the planet is aligned in the way that the educated elite are in the world today. That isn't to say everybody's going to learn STEM, but we should be providing more funding, I think, to the next generation for education. And that will allow more people to focus on things to advance us as a species, not just on this planet, but elsewhere as well. The second thing would be to put some sort of funding and effort towards development of technology related to AI that is self-monitoring. The speed at which we're developing things seems really fast to everybody, but we've been doing a couple of tests on our side, Databricks Engineering, of trying to use some of these open source LLMs and stuff like OpenAI, with just test projects saying, okay, I've got four hours free, how far can I get with just prompt engineering? I was using ChatGPT with GPT-4 last week as a test, and I wanted to see what's the most complicated application I can have this thing build that actually compiles and runs effectively. And I'll do the copying and pasting, and I'll tell it when it's getting it wrong. But I want full unit test coverage. I want actual production-grade code. And in four hours, I estimated that it did the same amount of work that a developer would have taken five days to do. And that would be a staff-level engineer. Just due to the speed of writing boilerplate. It's stuff that just takes time: writing unit tests, writing documentation. It just eats up time that's not really useful. It's not using the creative aspect or the technical acumen of that developer. So knowing that, if that process gets better, and we can offload even more of the annoying parts of software engineering, and have these generative AI models become more aware of the quality of what they're generating, and have a much larger context, then I could do something like, okay, I'm going to sit down for two weeks and prompt engineer this thing so that it can create a self-monitoring system. I want you to evaluate your own bias in your training data, create a program that will do that, and cull out problematic information that is in conflict with... the following values, and have those democratically voted on by the world. Say, what are our values as a species? Have people from all sorts of different backgrounds compile that list of things that we should value as a species, have that system be written by the AI, and have humans validate it and use it, getting some sort of feedback into the training process itself, where we can say, how careful are we being with what this thing is learning? Is the training data motivating anything that's trained from it to be beneficent towards humanity and to the furtherance of us as a species? That would be cool.
And that would be a big priority that I would have. I think people want to do something like that, but that task is so daunting and so complex that I think it's actually beyond the capabilities of humans. But...

Michael:
Yet.

Ben:
But a sufficiently advanced LLM or generative AI system could potentially be the one that creates an application that does something like that. And then the third thing, my third task to finish up your question: I would abolish the...
 
Jeff Prosise:
Yes.
 
Michael:
Noted.
 
Jeff Prosise:
Well, if we get to vote on who gets to be our emperor, then I'm voting for you.
 
Ben:
Thanks.
 
Michael:
I don't think that's how empires work, but...
 
Ben:
I think it's conquest, lots of bloodshed, but...
 
Jeff Prosise:
He's a benevolent emperor. So, no, gosh, I'm not sure I have a great answer to your question, Michael. I would say AI is like any other technology. It can be used for good and bad. Bad people are always gonna use it to do bad things. But if the good outweighs the bad, we need to be cognizant of that. And we need to do what we can to steer towards the good AI and guard ourselves from the bad AI. So I think I would, and I'm not sure how I would do this, but I want to put up some guardrails, not around AI, but around the way we restrict AI. So that if we are going to restrict it in some ways, we're very, very sure of what we're doing. Again, I'm not sure how I'd actually implement that, but I do agree with Ben wholeheartedly. Education is key. I live in Tennessee, and one of the things that I'm proud of about Tennessee is that two years of community college are free in the state of Tennessee. I think we're one of the only states that do that. It all goes back to our governor from a few years ago and the gentleman who is currently president of the University of Tennessee, who had a vision to make higher education easily available to Tennesseans. We have some very, very poor areas in the state. I grew up in Appalachia, one of the poorest areas of the country. And education is the key to that. And I think I would try to steer the education dollars. I mean, one of the things we've seen in the news recently is whether the government is gonna forgive student loan debt, and we can all have various opinions about that. But my goodness, if we're going to make all that money available, let's try to make the country a better place. Let's look at the professions where we have shortages, like teachers, engineers, anything in STEM, and make that money more easily available for those fields, and make it less likely that someone is going to go $150,000 into debt to Harvard to get a degree that's not gonna let them pay that off. So I would work to make us much, much smarter about how we implement education, how we make it available. Trying to make it, yeah, available, but available in a way that benefits all of us. Yeah, I can't think of number three. Michael, you've given me a lot to think about, so I may have to get back to you on number three.
 
Michael:
I'm glad.
 
Michael:
Got it. Interesting. Yeah. The purpose of that question was basically: a lot of people gripe about the current status of regulation, and no one likes the outlook for machine learning, whether it's overly regulated, under-regulated, whatever it may be, whether you understand the algorithm or not. And I think there are some just really fundamentally difficult problems. Like, how do you define fairness? Take the police stationing example. What is a fair algorithm? Well, there are higher and lower crime neighborhoods. Should ethnicity be a predictor? Maybe, maybe not. It's just such a complex topic, where there are trade-offs, and humans have to evaluate, as a species or as a government or as a city, where you want to place your morals. Do you care about lives? Do you care about true fairness? If it's true fairness, make it random assignment. That is the most fair option. So it's just really, I think, a fundamentally difficult question, and I was curious. So thank you for educating at least me.
 
Ben:
Well, the interesting thing about the police monitoring, and it's something that I've debated with people before, and I always ask that sort of Socratic line of questioning: are crime rates in certain areas of cities higher because there's police there, or is it because of the people that live there? When you look at the statistical distribution of what people are being booked for... I've looked in my own county here, Wake County, North Carolina, and seen the areas of the county where police presence is highest. They don't do it in residential areas. They do it along main thoroughfares, usually of people coming into the county or leaving the county. So most of the things that people are busted for in Wake County are DUIs, shoplifting in certain areas, like big box stores where that seems to be prevalent, and drug possession, usually marijuana. And it's from traffic stops, because they do random stops sometimes, and it's just a policy here. But when you look at that distribution of those top three things in other cities, look at it geographically, and you see, if you're just looking at the raw numbers and statistically lying to yourself because it's convenient, being like, see, this part of the inner city has much higher arrest rates. It's like, yeah, what are people getting busted for? Possession of marijuana. They've got, you know, a dime bag of pot for personal use, which, by the way, in a bunch of states now is totally legal. So they're getting arrested and booked for that. But that isn't to say in the affluent area of the city people aren't walking around with pot on them. It's just cops aren't there patting people down. So how does that skew that data? Are we creating an artificial statistics problem because we're putting enforcement in that area? I don't know the true answer to that, but those are the questions that I like to ask people when they say, oh, see, I would never live there, the crime's really high. I'm like, really? What crime is it? Is it violent crime? Well, no? Okay, let's look at the data.
 
Jeff Prosise:
I was going to say, what you're really saying, Ben, is look at the data, and that's something that we don't do often enough, right? We jump to conclusions. You know, I was just having a conversation with a friend this morning about social media, with all the stuff that's going on with Twitter and Threads. And I said to him, on balance, I think society would be better off had social media never been invented.
 
Ben:
I agree.
 
Jeff Prosise:
Now, the genie is out of the bottle. I would never be a proponent of eliminating it. But yeah, I think we'd probably be better off without it. Think about this with people, you know, concerned about AI today and talking about all the bad things it could be used for: one of the reasons that we can have these discussions is the internet, right, and social media. Think back 30 years, when the internet was in its infancy. Some of the same things that are being said about AI today, oh, bad people are going to use it for bad things, could have been said about the internet, but it wasn't as easy for us to have that discourse back then, because there was no internet. So, you know, I think obviously the internet was transformational for planet Earth, for the way we live our lives, transformational to our culture. AI is going to be the same way, perhaps even more so. But let's be very careful, let's be especially careful. We don't want to see it misused, and we don't want to be a part of that, but we also want to be very careful about just saying no. I just pulled up in my browser New York City Local Law 144, and it says, in the city, it shall be unlawful for an employer to use an AI tool (this is for recruiting and hiring) unless the tool has been independently audited and notice has been provided to each such candidate or employee who resides in the city. Now, if you read through the whole thing, you see it's much more strenuous, and there are some pretty hefty financial fines.

Ben:
Wow.

Jeff Prosise:
What it's going to mean is hiring in New York City is going to be left completely to humans. No one in their right mind, given the requirements and the penalties for not meeting them, would use any kind of tool that uses ML or AI in their hiring decisions. So, you know, I look at that, based on what I know about it, and I would consider that a bad law. And I'm afraid that there are more bad laws coming, in part because of the hype. There is a lot of hype around AI now, no doubt about it, a lot of positive hype. There's a lot of negative hype too. I mean, you turn on CNN, or I like to watch BBC, because I feel like I get a little bit more objective a view of what's going on, not just in the US but around the world. But I turn on BBC and they're talking about, wow, what are we going to do about AI? Boy, that's a dangerous question, especially when politicians ask it.
 
Ben:
Or ignorant newscasters.
 
Jeff Prosise:
or ignorant newscasters, yeah.
 
Ben:
That question, that law that you talk about, though, I'm trying to wrap my head around the fact that if they just use AI, meaning basically some sort of statistical model that's making inference on a resume or a person's background... is there a service out there that doesn't use that nowadays? Resume screening is built into most HR products, in fact, all of the ones that I'm aware of people using. All that basically means is that New York, if they actually follow through with that law, is going to destroy their tech economy, 'cause people are going to be like, I'm not going to try to hire people here, we're just going to close the office, 'cause this is just too, too obnoxious. If you're a successful company and you have a New York office and you're large enough, think about some of the big companies that are in New York, you put 10 job postings up a quarter for a particular department. How many applicants are you going to get if you're a sought-after company? 20,000? 50,000? A hundred thousand? A million? In order to go through all of those resumes and do an adequate job, you would need an army of people whose entire job is just reading those things every day. So yeah, good luck, New York. That's pretty stupid. On the scale of smart to just galactically moronic, that goes on the galactic end for me.
 
Michael:
Wait, Jeff, so your argument is it's going to be prohibitively expensive? Like, what's the issue?
 
Jeff Prosise:
I think companies will not do business in New York, or they will keep at arm's length any tool they use for recruiting and hiring that uses ML or AI. Here's the headline on TechCrunch two days ago: New York City's anti-bias law for hiring algorithms goes into effect. I won't read it, people can do a search and see for themselves, but it says after months of delays, New York City today began enforcing Local Law 144. And it goes on to basically document how onerous this law is. When you do a search on Local Law 144, you know what else you find besides the news headlines? You see law firms and the guidance they have put out. So it's going to be great for lawyers, not very good for companies doing business in New York City. So yeah, I am with you based on what I know, Ben. It looks galactically stupid to me. But that's okay. Maybe I'm wrong. I've been wrong before, many times. This may be a good thing. And maybe, if it is bad law, New York City will back off from it. The scary thing is, though, what if something like this were put in place at the federal level? That's what I'm afraid politicians are going to do. I don't mean to be negative about it, but I'm very, very concerned. And I think a lot of this is being driven by the scary things that people are seeing on TV and seeing on social media.

Ben:
AI art.

Jeff Prosise:
AI is arguably mankind's greatest invention. I would put it right up there with the airplane, for example. Do we really want to snuff this invention out? Can we ever expect it to be perfect? Absolutely not. How do we define perfect, by the way? Try to quantify that. I tell people all the time, I see people complaining, ChatGPT hallucinates. Well, it does.
 
Ben:
So do humans.

Jeff Prosise:
Yeah, exactly. You've got to remember, all ChatGPT knows came from content, lots of content, generated by humans. In that content there is bias, there is misinformation, there is stuff that's flat-out wrong. And OpenAI is working really hard right now to try to mitigate that. But we're never going to get all the way there. It's humanly impossible, and non-humanly impossible too. So we just need to be careful, I think. You know, one thing I find interesting, going back to that paper published by MIT a few months ago about the wall we're hitting with our current technologies: I do think that AI is self-regulating to some extent, at least for now, until we come up with something better than the transformer, or something better than backpropagation, or, you know, better than spinning up 2,000 Nvidia A100 GPUs. I mean, you've probably read the reports. Depending on which report you believe, it costs OpenAI anywhere from a few hundred thousand dollars up to a million dollars to do one complete training run on ChatGPT. That's not sustainable. And without Microsoft's backing of billions of dollars, most likely OpenAI could not be doing what they're doing here. But until we come up with something better, it is somewhat self-limiting. So you know what I think is going to happen? There's been an explosion of AI since November, as you pointed out, Ben. People are aware of it as never before. It seems like every day new things are being done, new announcements are being made. I think we're going to see that slow down for a bit, because we are hitting that wall. And by the way, another really important point that paper made was that it's not just about the models, the software, and the hardware they're trained on. It's the data. Why can we build image generators like Stable Diffusion and Midjourney? Because we have the LAION-5B dataset, with almost 6 billion text-image pairs scraped from the internet. We couldn't have done that 30 years ago, because we didn't have the internet. We couldn't have gathered those images. But suppose you want to build something that's substantially better than Stable Diffusion at generating images. So you say, okay, I'm going to use 50 billion images rather than five. Where are you going to find 50 billion images? We are running out of data to train these models with. So are there going to be new announcements, cool new things that AI can do, in coming weeks, months, and years? Absolutely. Are we going to see the evolutionary advancements that we've seen in this last year? I think we're going to see a lot fewer of those.
 
Ben:
There's some fascinating stuff that we've been exposed to internally with regards to where I think this might go. At Data and AI Summit, a week and a half ago, it was announced that Databricks has this new Lakehouse AI initiative, which is, hey, we're going to make it simpler to take one of these open source architectures and not just do fine-tuning of one of them on your data. Let's try to see if we have customers out there, and it turns out we do, a lot of them, where they're collecting so much data that is super valuable, but it would never be open sourced. It's so proprietary or so valuable that it's... locked in a very secure environment. We have clients where you're not talking about, oh, they have 200 terabytes of data. It's more like they have eight exabytes of data sitting in their account, and it's not clean, it's not perfect or anything. But running through the process of getting it into a state that could make it available for training is out there. And some of these new architectures that are being discussed, it's like, hey, we can really train from scratch some of these open source models, provided we have a modest amount of compute. Not, you know, the BERT-based transformer model architecture, but some of these newer ones that are coming out. And some of the reports that we saw from Mosaic's announcements recently, where their MPT architecture, and their next one after that, is something where you don't need, you know, 10,000 GPUs in order to train it. You can do it with 20, and it only takes, you know, five days to train. So in the next couple of years, I agree with you: maybe the technology and the advancements in generalized learning aren't going to be that great. I'm sure there's going to be some foundational leaps, but it's not going to be as it has been the last nine months. But think of the services that companies are going to be able to create. Imagine ChatGPT with the GPT-4 model, but on your company's data, exposing an endpoint that your own customers can interact with to get deep understanding of whatever service you provide, or given as an internal tool for your own workforce to use. Think about how this revolutionizes stuff like internal documentation. We're like, hey, I don't even know who to ask how to do this process that I need to get done in the next two hours. What if I asked our chatbot that has been trained on all of our data, hey, what is the answer here? And that's releasing on our platform very soon. I think it's in private preview now, where you can interact with the Lake Sense AI helper and ask it plain text questions, and it generates queries for you that answer the exact question that you want. But that's because it has access to all the data. It knows where that data is. It knows what that data really means. And it's supplemented with a bunch of metadata. But when that's opened up to an entire company's data reserves, not just text data but also tabular data as well... think about the first companies that figure that out. Name an industry, it doesn't matter, and use that internally, where regulators won't even be involved. They don't have to know. It's just a tool you're using internally. Think of the insights of something that understands English and can answer questions, but has access to your proprietary data. You start asking it questions like, what do you think we should do? I'm going to give you three options. Which one's going to make us the most amount of money? And it sits there and thinks for 30 seconds, pulls a whole bunch of data, and says, based on the last 75 years of your company's history, and all of this other data that I have access to, here's the highest probability of what's going to make you money, and here are the steps in order to do it. Think about how companies are going to start obliterating their competition when they start using this technology.
 
Jeff Prosise:
No doubt. Well, to your point, what you just described is the number one use case for these GPT models, for LLMs in general. Ninety percent of the calls I take from customers right now, they want to know: how can we build a rich, natural-language semantic search interface over our own data, over data we've accumulated over years, over HR documents we've dropped into SharePoint, and things like that? How can we use these models to provide that? We're doing a lot of that for customers right now. I wish I could talk about some of the things we're doing. I can't, but this is the real value of generative AI.

And I love the fact that Databricks is going to provide sort of a canned solution for that, because everybody's going to love that. Azure is doing the same thing as well. Microsoft, among the many announcements they made at Build last month, is adding services to Azure OpenAI that make it easy to put GPT models over internal documents. There's actually an early public preview of that out now; I was just playing with it the other day. But that's the sweet spot.

And you know, you can't talk about OpenAI today without talking about LangChain, right? Why is LangChain getting so much traction? Because it does a lot of the work that helps you chunk up the documents, generate the embedding vectors, drop them into a vector database, and make this stuff work. And I have to tell you, it's pure magic. In a demo I was doing for a customer yesterday, we did exactly that. We put LangChain over a bunch of PDFs they had accumulated and given us for testing, built a fairly simple web interface to it, and I was actually amazed by how good this thing was at answering questions. And they were amazed as well. So yeah, it's the sweet spot.

I have to tell you, Ben, I'm not much of a visionary, but a year ago I went to some of my friends and said, I think there's a business opportunity here that is huge. Once people learn about these GPT models, the first thing they're gonna want to do, every company in the world, is put them over their own data. And by the way, one of the beauties of doing this is that everyone knows ChatGPT loves to hallucinate. Well, it loves to hallucinate when you don't put guardrails up around it. But when we build apps like the ones we're talking about, we restrict it to a specific context and we say: if you can't find the answer in this context we've provided through the vector database and the chunking and all, simply say, I don't know.
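(Editor's note: the chunk-embed-store-retrieve pipeline Jeff describes looks roughly like the sketch below, written against LangChain's classic 2023-era API, which has since been reorganized. The file name, question, and model choices are illustrative assumptions, not details from the demo.)

```python
# Hedged sketch of "put a GPT model over your own PDFs" with LangChain.
# Requires: pip install langchain openai faiss-cpu pypdf, and OPENAI_API_KEY set.
from langchain.chains import RetrievalQA
from langchain.chat_models import ChatOpenAI
from langchain.document_loaders import PyPDFLoader
from langchain.embeddings import OpenAIEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import FAISS

# 1. Load and chunk the documents ("hr_handbook.pdf" is a hypothetical file).
docs = PyPDFLoader("hr_handbook.pdf").load()
chunks = RecursiveCharacterTextSplitter(chunk_size=1000,
                                        chunk_overlap=100).split_documents(docs)

# 2. Generate embedding vectors and drop them into a vector store.
store = FAISS.from_documents(chunks, OpenAIEmbeddings())

# 3. Retrieve the most relevant chunks, then let the model answer from them.
qa = RetrievalQA.from_chain_type(
    llm=ChatOpenAI(model="gpt-4"),
    retriever=store.as_retriever(search_kwargs={"k": 4}),
)
print(qa.run("What is our parental leave policy?"))
```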
 
Ben:
I don't know. Yep. Mm-hmm.
 
Jeff Prosise:
And it's willing to do that. One of the points I made to this customer yesterday, when we did the demo to show them what we can build, was: I can't guarantee it will never hallucinate. I can tell you that since we're restricting its context to the data you've provided, it is far, far less likely to hallucinate than ChatGPT. So you know what, when someone goes to the ChatGPT website and says, describe molecular biology in the style of Dr. Seuss, and it does it, it's amazing. But you know what that is? It's a parlor trick, right? There
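(Editor's note: the "say I don't know" guardrail Jeff mentions typically lives in the system prompt. A minimal sketch follows; the wording and the "Contoso" name are hypothetical, not quoted from any product or from this demo.)

```python
# Illustrative grounding prompt; the exact wording is an assumption.
SYSTEM_PROMPT = """You are an assistant that answers questions for Contoso employees.
Use ONLY the context below, which was retrieved from company documents.
If the answer is not contained in the context, reply exactly: "I don't know."

Context:
{context}
"""

def build_prompt(retrieved_chunks: list[str]) -> str:
    # Stuff the retrieved chunks into the system prompt before calling the model.
    return SYSTEM_PROMPT.format(context="\n\n".join(retrieved_chunks))
```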
 
Ben:
Mm-hmm.
 
Jeff Prosise:
is no commercial value in that. But when you can put large language models over your company's own data, it's a game changer. We talked with one customer last year who has a lot of proprietary IP they hold internally, developed by a number of engineers over the years. Many of these engineers are reaching retirement age and starting to retire. So they were looking to build a knowledge base, essentially, from video interviews, internal emails, and internal documents, to capture the institutional knowledge that was going to leave the company when those engineers left. They said, how do we do that? Well, what's the answer? A large language model is your best friend. You can easily transcribe those videos. You can make everything you've gathered, documents, emails, video transcripts, into one big knowledge base, and there is no better way today to surface that information than with a large language model.
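(Editor's note: the transcription step Jeff mentions really is a few lines with the open-source Whisper model; a minimal sketch follows, with a hypothetical file name.)

```python
# Sketch: transcribe a retiring engineer's video interview into text that can
# join emails and documents in the knowledge base.
# Requires: pip install openai-whisper, plus ffmpeg on the system.
import whisper

model = whisper.load_model("medium")  # model size is a quality/speed trade-off
result = model.transcribe("interview_2023_06_14.mp4")  # hypothetical file
with open("interview_2023_06_14.txt", "w") as f:
    f.write(result["text"])
```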
 
Ben:
Mm-hmm.
 
Jeff Prosise:
That is the sweet spot.
 
Michael:
Well
 
Ben:
Definitely.
 
Michael:
put.
 
Jeff Prosise:
And you don't worry as much about hallucination. The real challenge, and you've probably run into this too, is the plumbing, but it's pretty easy whether you're using LangChain or doing it manually. Microsoft has an open source library they've just introduced, very early right now, called Semantic Kernel, which I'm spending a fair amount of time with, because given a choice, I'm a .NET developer. I love the C# programming language. I love .NET as a runtime and as a platform. With LangChain, it's Python or JavaScript only, at least right now. But building these things, putting ChatGPT over the top of custom data, is absolutely where it's at.
 
Michael:
Yeah, so Jeff and Ben, I know we could keep going for at least another seven hours. I personally have a bunch of questions that sadly went unanswered, but that's how it works. So I will quickly wrap. Today we talked about a lot of really cool things; these are some points that stuck out to me.

First, if you're going to use GPT in production, standing it up in Azure is really effective. It's not available as easily in GCP and AWS, and it's great out of the box, so if you aren't looking to train your own model from scratch, GPT is a great solution.

Some industry notes, reflecting the Databricks Data and AI Summit last week: there's going to be a divergence in models. Some are going to get bigger, and people are just going to throw money at them. Others are going to get smaller, trained on more focused datasets. Look for that trend; it's really interesting, and I'm curious how it plays out.

Regarding the fear of artificial intelligence, once again: relax, we're all fine. One of the reasons ChatGPT might be seen as a dangerous technology is that it's easier to work with and easier to understand; people can talk to it. And again, think about if encryption were broken for the world. That would be one of the scariest things to ever happen in the history of humanity; we have so much infrastructure based on it. Really understanding the underlying algorithm and architecture informs how you should think about the dangerousness of a technology.

And finally, if we're going to solve all the world's problems, we should invest in education, emphasize self-evaluation in models, and be intelligent about our restrictions and laws.

So Jeff, if people want to learn more about you and your work, where should they go?
 
Jeff Prosise:
So they can check out my blog at atmosera.com, or just search on Jeff Prosise's blog. I tend to blog in spurts; I've got a bunch of stuff sitting behind the dam that I need to find time to blog about. I also wrote my first book in more than 20 years last year. It's called Applied Machine Learning and AI for Engineers, and it's available from O'Reilly. It was a work of passion for me. I decided many years ago I'd never write another book, people don't learn that way anymore, but what's happening is so exciting and so transformational and so important that I just felt compelled to write it. You can go to Amazon, search on my name, and find it pretty easily. I also speak at several conferences each year, mostly overseas but sometimes in the US, so do a search and you'll find out where I'm speaking next. I always love meeting new people and talking about this stuff. As much time as I've spent with this over the last decade or so, I'm still learning things every day, and it's great to meet people, talk to people, and get fresh perspectives, which sometimes changes my mind about how things are going as well.
 
Michael:
Beautiful. Well, it has been an absolute pleasure.
 
Ben:
Likewise.
 
Michael:
And until next time, it's been Michael Burke and my cohost
 
Ben:
Ben Wilson.
 
Michael:
and have a good day, everyone.
 
Ben:
See you next time.