Vector Search in Rails Applications - RUBY 601

Andrei Bondarev builds AI/ML-capable software products. He joins Chuck on the show to talk about vector search, also known as semantic search. He gives an overview of the concept, explains why it matters and how it can be used in a Rails application, and much more.

Hosted by:

Charles Max Wood

Special Guests:

Andrei Bondarev

Show Notes

Socials

GitHub: andreibondarev
Twitter: @rushing_andrei

Picks

Andrei - The Battle of Polytopia
Andrei - Designing Data-Intensive Applications
Charles - Star Realms

Transcript

Charles Max Wood:

Hey, welcome back to another episode of the Ruby Rogues podcast. This week, I'm your panel, Charles Max Wood. Dave and Valentino are out, so it's just gonna be me asking all of my noob questions to Andrei Bondarev, who is our guest today. Andrei, do you want to introduce yourself? Let people know who you are and why you're famous, all that good stuff.

Andrei Bondarev:

Sure, so my name is Andrei Bondarev. I've been building software products for a little over 12 years. My tool of choice has always been Ruby. I've played around with a lot of different languages, Python, Node.js, JavaScript, some other ones, and I kept coming back to Ruby, its ecosystem, and its community. Recently I've been diving into the emerging AI/ML space and taking a stab at building applications with large language models.

Charles Max Wood:

Good deal. And we've talked a bit about some of the large language models and machine learning and stuff like that on this show. We also have a machine learning show, Adventures in Machine Learning, if people are interested in that. But yeah, we invited you on to talk about Vector Search, which seems like it kind of splits the difference between the two. Do you want to just talk briefly about what Vector Search is? I have this vague idea, but I don't think I can explain it.

Andrei Bondarev:

Sure. So vector search, synonymous with semantic search, is the idea of being able to interpret the meaning or concepts behind the underlying query, as opposed to a traditional keyword search that just attempts to match your dataset against the exact keywords that the user passes.

Charles Max Wood:

So for example, I could ask a question instead of just typing in the couple of primary words that I'm looking for.

Andrei Bondarev:

Right.

Charles Max Wood:

That makes sense. It sounds fancy. Um, so how do I go about adding vector search? Well, let's back up for a minute. Why, why do I need vector search as opposed to keyword search? Like, is there a fundamental difference in experience or something like that?

Andrei Bondarev:

Yeah, I think so. Of course, the answer is it depends, right? You have to look at specific use cases.

Charles Max Wood:

Okay.

Andrei Bondarev:

In some instances, you do want to give the user the ability to search for specific keyword matches, but vector search is a lot more flexible in terms of discovery. Just like humans have a hard time communicating with each other, problem solving, and coming to a consensus, when you're searching through a large data set, you don't always know what you're looking for, right? You kind of have a vague idea. You're putting some queries out there, and the faster the system is able to return what you're looking for, or what it thinks you're looking for, the better the experience is overall.

Charles Max Wood:

So yeah, I mean, I guess, yeah, the speed of return and the accuracy of results, that makes sense. One thing though that I've seen is, you know, you mentioned the large language models, and a lot of times we're talking about ChatGPT and things like that, that people are out there using. And I've had this conversation with a number of people regarding both having ChatGPT, I don't know, write code, and also just having ChatGPT answer questions, especially when it gets into long-form content, recognizing the nuance of the conversation and things like that. So, if I put a transcript, for example, from my show into a large language model and expect it to come back with the results, a lot of times it misses. A lot of times it's not accurate. A lot of times it is accurate. So I've been using ChatGPT to do some writing and things like that, or asking questions. And when I ask it questions, it still doesn't always get it right. And a lot of that really just depends on the data it's trained on. So ChatGPT, for example,

Andrei Bondarev:

Mm-hmm.

Charles Max Wood:

is trained on a subset of the internet. And so where the internet is generally wrong, ChatGPT is also going to be generally wrong. And just to give a little bit more context, I know I'm talking way longer than I want to in order to ask a question, but I was talking to a friend of mine who's a parliamentarian, right? So he's the guy that sits in the meeting and makes sure that everybody follows Robert's Rules of Order. And if there's a question about procedure or whether we can or can't do something according to the rules of the meeting and Robert's Rules, he's the guy that looks at the rules and makes a ruling. And so he was asking parliamentary questions on ChatGPT, and it was actually giving him wrong answers. And then when he asked it where it got the information from, it would actually cite, you know, it's in this chapter, this section of Robert's Rules of Order, and it was incorrect. And so I guess my concern is, you know, how much data do you have to give this? How much training do you have to give it in order for it to be able to say, you know, if I type in, how do I use Devise on Rails? How much training does it have to have, or how good does my data have to be, in order for it to give me the right answer?

Andrei Bondarev:

Yeah, so that's a very good question, and so is the case you built leading up to it. You can actually test this really easily by asking ChatGPT to provide the latest scientific or research papers on a given topic. And it will just,

Charles Max Wood:

Oh, okay.

Andrei Bondarev:

it will, maybe the top level, the root domain, will be correct. But what follows, the actual path, the actual link, is completely false, fake. So from the standpoint of vector search, the way it typically works is that you would generate embeddings from your datasets. So let's say you have a database, a list of products, right? Let's take a typical e-commerce example. You have a list of products.

Charles Max Wood:

Mm-hmm.

Andrei Bondarev:

You would generate embeddings for every product. That would include different information such as the title, the description, the properties, the sizing, etc. And then the vector search on top of that data set would actually translate a user's query into a vector embedding. It would then go to a vector search database and try to find the closest N matches. It would then take those matches, take that data, pass it to an LLM and say, can you synthesize an answer based on this user's question? So at any point in time, it's taking the most relevant database entries and using an LLM to provide a coherent answer.

Charles Max Wood:

Okay. So... I guess my question is then, is it actually going to try and give me an answer to the question or is it gonna give me links to the products or what kind of a result are we looking at getting?

Andrei Bondarev:

So as a developer, you can control that.

Charles Max Wood:

Okay.

Andrei Bondarev:

And there are kind of different paths you can take. In the scenario that I just mentioned, the LLM is going to try to give you an answer to your question. But you can also augment that answer with, hey, these are the links to the products referenced above, because you know which records are going to be extracted from the vector search database when that search is executed.

Charles Max Wood:

That makes sense. Yeah. So, we could, for example, have it come back and say, you know, so if I ask it, if I have this model of car and I, you know, which tires do you have, then it could give me a list of tires. Or it could explain maybe these tires are great for off-road and these tires are great for on-road or whatever, right? And so, I can give as comprehensive

Andrei Bondarev:

Yep.

Charles Max Wood:

an answer as I want as the programmer.

Andrei Bondarev:

Yep, yep.

Charles Max Wood:

That makes sense. And that's kind of interesting just from the standpoint of, like, I run the podcast website and when I do a search, you know, I generally just want it to link back to whichever podcast episode, but I could, for example, have timestamps and feed it timestamps, and that could be interesting, right? You ask a question and it says, hey, in this podcast, around this timestamp, these hosts discussed the thing.

Andrei Bondarev:

Right, right, right. Yeah, I mean, take an example of putting a vector search on top of your podcasts, right? So presumably you have a description for every podcast, right, a title, the topics that have been discussed. And let's say I'm searching for the word politics, right? So I'm just curious if you guys have ever touched on any politics within your podcast. With keyword search, if the word politics is not explicitly mentioned in any of your

Charles Max Wood:

Right.

Andrei Bondarev:

show notes, I'm not going to get that answer back. But if there are similar concepts described, adjacent concepts to

Charles Max Wood:

Mm-hmm.

Andrei Bondarev:

the field of politics, described in your show notes, with vector search, I will get a result back, right? So you may mention that a friend of yours is a parliamentarian, right?

Charles Max Wood:

Uh huh.

Andrei Bondarev:

And so with vector search, if I search for politics, I expect to get that show back, that result back.

Charles Max Wood:

That makes sense. So how do I go about building this in? I'm still not completely sold that this is always the right answer. And you said it depends. So yeah,

Andrei Bondarev:

Of course.

Charles Max Wood:

it really, right, you design the experience you want. You're not locked into the keyword style search, which I think is the point, right? So if you want to design a different experience, you can have it, which is exciting. How do I put this in, right? How do I decide, okay, this is... it's gonna give back maybe a little bit more, a little bit different result or different kind of result. And I want a vector search, right? I want people to be able to say, hey, Chuck talked about politics on this episode. How do I start putting that together?

Andrei Bondarev:

Mm-hmm. So there are a lot of vector search databases entering the market. There's five, six, and an ever-growing

Charles Max Wood:

Mm-hmm.

Andrei Bondarev:

number of them currently. It's kind of still unclear who will make it through the end of the year or become a leader over the next couple of years. There are also a lot of traditional database vendors adding those capabilities to

Charles Max Wood:

Okay.

Andrei Bondarev:

existing products. Elasticsearch is doing it. Postgres has a pgvector extension.

Charles Max Wood:

Oh, interesting.

Andrei Bondarev:

Pinecone

Charles Max Wood:

Okay, cool.

Andrei Bondarev:

is a standalone, really popular SaaS tool, a closed-source proprietary solution. There are Milvus, Qdrant, and Weaviate; they have open source offerings supplemented with a paid cloud offering. So the field is moving very rapidly, and I think as engineering leaders, you kind of have to assess at that moment in time and do a couple of POCs to try to understand the current capabilities: how big is your data set, how fast can you index that data into a vector search database, what does the infrastructure look like, what are the compute, memory, and hosting requirements needed to run an open source solution, do you want to maintain it, do you have people on staff that have those skill sets? And yeah, go through a proper discovery.

Charles Max Wood:

Yeah. It makes sense. So, you said some of them are open source. I don't know that I've seen any. I like the idea of adding it to Postgres since that's usually what I'm using anyway.

Andrei Bondarev:

Yeah, so pgvector, as far as I know, is not affiliated with the official Postgres project. It was written as an external extension to Postgres.

Charles Max Wood:

Okay.

Andrei Bondarev:

And I've read some kind of mixed opinions about it. I haven't used it all that extensively so I don't want to comment if someone is using it and has great results with it.

Charles Max Wood:

Makes sense.

Andrei Bondarev:

And just to add to that, I mean, some of the other ones are heavily VC-backed companies and have a lot of resources to pour into the development of these systems. So as far as I can tell, pgvector is purely an open source solution right now. There's no commercial offering or roadmap. So you could just expect that some of the for-profit commercial vector search databases are going to get ahead.

Charles Max Wood:

That makes sense. In your blog post, I did see that you were using Weaviate, and I talked to them a little bit at JS

Andrei Bondarev:

Yep.

Charles Max Wood:

Nation in Amsterdam earlier in the month. And yeah, it was kind of interesting to talk about it. Before we get too deep into some of this, I'm a little curious, like, what is the difference between a vector database and sort of your standard relational database?

Andrei Bondarev:

Yeah, so a standard relational database like MySQL or Postgres, and I guess SQLite could be put into the same class. They've been around for a while. People have built a ton of applications on top of them. They work, they're well supported. There's a huge community around them. The whole application layer is also built out. If we talk about Rails, ActiveRecord, the ORM, has incredible support for all of those databases.

Charles Max Wood:

Mm-hmm.

Andrei Bondarev:

The underlying solution is tailored to be able to execute SQL queries very, very efficiently. Vector search databases are a much younger type of system, and they kind of started from the other end. They started from being able to compute embeddings and then also being able to execute similarity searches. Another word for it is a nearest neighbor search.

Charles Max Wood:

Gotcha. So... So when I'm querying it, yeah, it's not doing the same kind of lookup. How do you tell it what a near neighbor is or how do you tell it what is close to what? Is that part of the large language model or is

Andrei Bondarev:

Mm-hmm.

Charles Max Wood:

that something else?

Andrei Bondarev:

Yeah, so maybe I should explain what a vector embedding is in the first place.

Charles Max Wood:

Mm-hmm.

Andrei Bondarev:

So a vector embedding has a property called dimension and it's basically the length of it or think of it as an array of float numbers and

Charles Max Wood:

Uh-huh.

Andrei Bondarev:

how big is that array, right? Whether it's 100 float numbers or 1024 float numbers. And that is the semantic representation of your data. So that captures the underlying meaning. And actually all of those vectors can be mapped out in a coordinate system, but the coordinate system is not a 2D or 3D coordinate system.

Charles Max Wood:

Okay.

Andrei Bondarev:

It can potentially have a thousand dimensions, right?

Charles Max Wood:

Oh wow.

Andrei Bondarev:

So that's what that dimension property refers to. And so when you index your data, right, so let's go back to that product example, you generate a vector embedding, right, so a large array of float numbers, a representation of your data. And when you take a user's query, you generate a vector embedding from that query as well. So you generate an array of float numbers from

Charles Max Wood:

Okay.

Andrei Bondarev:

the user's query, right, to kind of translate it into a vector embedding representation. And then there are different formulas to try and find the closest distance between those vectors, right? Between the one that's

Charles Max Wood:

Mm-hmm.

Andrei Bondarev:

coming in as a user's query and the ones that are stored inside of your vector search database. For your specific data sets, you should test out different methods. You can calculate distance based on a cosine formula. There's another called dot product. There's a fair amount of configuration there that you have to test out to see what yields the best results for you.
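
A minimal sketch of the two measures Andrei names, dot product and cosine similarity, written in plain Ruby. The three-dimensional vectors are made-up stand-ins; real embeddings come from an embedding model and typically have hundreds or thousands of dimensions. The method names are illustrative and not taken from any particular library.

```ruby
# Dot product: multiply matching coordinates and add the results.
def dot_product(a, b)
  raise ArgumentError, "dimension mismatch" unless a.length == b.length

  a.zip(b).sum { |x, y| x * y }
end

# Length (Euclidean norm) of a vector.
def magnitude(v)
  Math.sqrt(dot_product(v, v))
end

# Cosine similarity: 1.0 means same direction, 0.0 means orthogonal.
# Note: a zero vector has no direction, so the result is not meaningful for it.
def cosine_similarity(a, b)
  dot_product(a, b) / (magnitude(a) * magnitude(b))
end

# Cosine distance is the usual "smaller is closer" form of the same measure.
def cosine_distance(a, b)
  1.0 - cosine_similarity(a, b)
end

# Made-up embeddings for a user's query and a stored product.
query   = [0.9, 0.1, 0.0]
product = [0.8, 0.2, 0.1]

puts "dot product:       #{dot_product(query, product)}"
puts "cosine similarity: #{cosine_similarity(query, product)}"
puts "cosine distance:   #{cosine_distance(query, product)}"
```

Dot product is sensitive to vector length while cosine similarity only looks at direction, which is one reason the two can rank the same results differently and why testing both on your own data is worthwhile.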

Charles Max Wood:

That makes sense. So effectively it takes, just to rephrase really simply, it takes

Andrei Bondarev:

Mm-hmm.

Charles Max Wood:

what I give it for my listing. So let's say it's a podcast episode or, in your example, a product, right? And it takes all the information in there and it generates a series of numbers, and those numbers are effectively coordinates in, you know, a large dimensional space, right? Which is effectively what a vector

Andrei Bondarev:

Yep.

Charles Max Wood:

is, right? I remember in math, you know, in college, you know, they taught us vector was effectively the amplitude and direction, right? And so it's basically from the origin or from the middle out to some point out there, right? And that's your vector. And so effectively then what you're doing is you're saying, okay, if I put a search in then it's also going to generate a vector. And then there are a bunch of different algorithms to determine how close it is in the thousand dimension space or whatever. And so it returns all the things that are close to it according to whatever algorithm it has for picking the stuff that's nearby.

Andrei Bondarev:

Correct.
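
The search Chuck summarizes can be sketched as a brute-force nearest neighbor lookup. This is only an illustration under stated assumptions: the `IndexedEpisode` struct, the episode titles, and the hand-made three-dimensional vectors are all hypothetical, and a real vector search database would use an approximate index rather than comparing against every record.

```ruby
# Hypothetical record: a title plus its (made-up) embedding.
IndexedEpisode = Struct.new(:title, :embedding)

def cosine_similarity(a, b)
  dot  = a.zip(b).sum { |x, y| x * y }
  norm = ->(v) { Math.sqrt(v.sum { |x| x * x }) }
  dot / (norm.call(a) * norm.call(b))
end

# Return the k records whose embeddings are most similar to the query,
# most similar first. Compares against every record (brute force).
def nearest(index, query_embedding, k: 2)
  index.max_by(k) { |record| cosine_similarity(record.embedding, query_embedding) }
end

INDEX = [
  IndexedEpisode.new("Robert's Rules and parliamentary procedure", [0.9, 0.1, 0.1]),
  IndexedEpisode.new("Background jobs in Rails",                  [0.1, 0.9, 0.2]),
  IndexedEpisode.new("Board games we love",                       [0.1, 0.2, 0.9])
].freeze

# Pretend this vector is the embedding of the query "politics".
politics_query = [0.8, 0.2, 0.1]

nearest(INDEX, politics_query, k: 2).each { |record| puts record.title }
```

None of the titles contains the word "politics", so a keyword match would find nothing; the lookup succeeds only because the made-up vectors place the parliamentary episode close to the query.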

Charles Max Wood:

That makes sense to me. It feels a little bit like magic, right? Because I convert all this text to numbers and then magically the other numbers correlate with it somehow. And so that's the right answer, but it seems to work because people are using it. So, you know, I'll take that much of it on faith. Incidentally, when we've talked about neural networks on Adventures in Machine Learning, right? It's the same kind of thing. It's like, we have these weights and they're numbers. And we don't know why those numbers work, but we get good results. So, right, this is the same kind of deal.

Andrei Bondarev:

Right, right.

Charles Max Wood:

Okay, so yeah, so let's say that I take all of my data and I seed it into this vector database and I want it to be searchable. What do I have to do in, say, a Rails app in order to make this work?

Andrei Bondarev:

Yeah, so you can either use a vector search database and an LLM directly and use their APIs, whether you want to use Weaviate or Qdrant or Milvus, and there's a number of LLMs to choose from as well. We've actually been working on a library called Langchain.rb,

Charles Max Wood:

Okay.

Andrei Bondarev:

which is a set of abstraction layers on top of all of the different LLMs and all of the different vector search databases, in addition to some tools to help you build an application. So we're building much tighter integration into Rails, so you could just plug it into your ActiveRecord model and index your data as a callback, for example.
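
The "index your data as a callback" idea can be sketched in plain Ruby without Rails. Everything here is a hypothetical stand-in and not Langchain.rb's or ActiveRecord's actual API: `FakeVectorIndex` plays the vector search database, `fake_embed` plays the embedding model, and `EpisodeRecord#save` plays a model save followed by an after-save callback.

```ruby
# Hypothetical in-memory stand-in for a vector search database.
class FakeVectorIndex
  def initialize
    @vectors = {}
  end

  # Insert or overwrite the embedding stored for a record id.
  def upsert(id, embedding)
    @vectors[id] = embedding
  end

  def fetch(id)
    @vectors[id]
  end
end

# Hypothetical embedding function: counts a few hand-picked words.
# A real application would call an embedding model instead.
VOCABULARY = %w[rails search vector].freeze

def fake_embed(text)
  words = text.downcase.scan(/[a-z]+/)
  VOCABULARY.map { |term| words.count(term).to_f }
end

# Plain-Ruby stand-in for a model. In Rails, the reindex step would be
# registered as a callback rather than called from `save` by hand.
class EpisodeRecord
  attr_reader :id
  attr_accessor :title, :description

  def initialize(id:, title:, description:, index:)
    @id = id
    @title = title
    @description = description
    @index = index
  end

  def save
    # (database write omitted in this sketch)
    reindex
    true
  end

  private

  # Recompute the embedding from the current attributes and store it.
  def reindex
    @index.upsert(id, fake_embed("#{title} #{description}"))
  end
end

index = FakeVectorIndex.new
episode = EpisodeRecord.new(id: 601, title: "Vector Search in Rails",
                            description: "Semantic search with embeddings",
                            index: index)
episode.save
p index.fetch(601)
```

Because the embedding is recomputed on every save, the index stays in step with the record whenever its title or description changes.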

Charles Max Wood:

Oh, interesting. That's very interesting because I've used, like, Searchkick. We did an episode actually and talked quite a bit about search, but it was more along the lines of keyword search with Elasticsearch and things like that. And yeah, you know, a lot of those give you the same deal, right? Where you can have it index when something is changed or created, or you can go in and say, hey, re-index the whole collection, or things like that. So it sounds like you're providing a lot of those same kinds of tools.

Andrei Bondarev:

Exactly, yeah.

Charles Max Wood:

And then the rest of it is just having a driver that sends my query into the database and gets the result back, right?

Andrei Bondarev:

Right. And we also have tools that make building the whole experience end to end much easier. A big part of interfacing with LLMs is prompt engineering, so we have tools that let you construct, manage, and maintain your prompts. That's prompt management. They make sure that for whichever call you're making, you're not exceeding an LLM's context window, so you're not exceeding the allowed number of tokens to pass. We've also built a data pipeline that lets you import files. So if you want to import local PDF files or text files or Markdown files or Word docs, we take care of a lot of that plumbing that needs to be done when it comes to parsing the files and then chunking them before they get imported into a vector search database.
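
Two of the chores mentioned here, chunking text and staying inside a context window, can be sketched in plain Ruby. This is not how Langchain.rb does it; the method names are hypothetical, chunks are split on whitespace, and the token count is a rough estimate based on the common rule of thumb of about four characters per token for English text. Real tokenizers give different numbers.

```ruby
# Split text into chunks of `size` words, where each chunk repeats the
# last `overlap` words of the previous one so context is not cut cold.
def chunk_words(text, size: 50, overlap: 10)
  raise ArgumentError, "overlap must be smaller than size" unless overlap < size

  words = text.split
  step = size - overlap
  chunks = []
  start = 0
  while start < words.length
    chunks << words[start, size].join(" ")
    break if start + size >= words.length # the last chunk reached the end

    start += step
  end
  chunks
end

# Very rough token estimate (assumption: ~4 characters per token).
def estimate_tokens(text)
  (text.length / 4.0).ceil
end

# Keep leading chunks until the estimated token budget would be exceeded.
def fit_to_budget(chunks, max_tokens:)
  used = 0
  chunks.take_while do |chunk|
    used += estimate_tokens(chunk)
    used <= max_tokens
  end
end

text = "Vector search interprets the meaning behind a query instead of matching exact keywords"
chunks = chunk_words(text, size: 6, overlap: 2)
chunks.each { |chunk| puts chunk }
puts fit_to_budget(chunks, max_tokens: 20).length
```

The overlap trades a little extra storage for a better chance that a sentence split across two chunks is still retrievable from at least one of them.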

Charles Max Wood:

Right, and so this is all done in Langchain?

Andrei Bondarev:

Yes. Yeah.

Charles Max Wood:

So...

Andrei Bondarev:

Yeah, so there's...

Charles Max Wood:

No, go ahead.

Andrei Bondarev:

So, yeah, the original implementation, LangChain, is in Python and TypeScript. And as I've been working within this space, I felt like we ought to have our own solution in the Ruby

Charles Max Wood:

Mm-hmm.

Andrei Bondarev:

world as well.

Charles Max Wood:

Right.

Andrei Bondarev:

So this is a LangChain-inspired, Ruby-flavored one.

Charles Max Wood:

OK, is it called LangChain in Python?

Andrei Bondarev:

Yes, yeah.

Charles Max Wood:

Okay. I was just curious if somebody, yeah, came up with some other creative name for it. That's cool. And yeah, just looking at the Langchain library, it looks like you connect to a bunch of different LLMs. So OpenAI, which is effectively ChatGPT, and Hugging Face. I haven't used any of the rest of these. So yeah, it'll just do a vector search by generating a prompt that you send over to OpenAI. And then I get my response back.

Andrei Bondarev:

Yep.

Charles Max Wood:

Very cool. And so conceivably, because this is one area that I've wanted to get into a bit with some of these results... So let me back up for a second. So effectively, then, my vector search is just a prompt to an LLM.

Andrei Bondarev:

Your vector search is a query to your vector search database and then you take those results and pass them to an LLM to synthesize an answer.

Charles Max Wood:

I'm not sure what you mean by that. So I make the query to my database and I get results back, right?

Andrei Bondarev:

Yeah, so let me, I'll do another take of this. So the vector search is generating an embedding from a user's query, calculating the closest distance to

Charles Max Wood:

Right?

Andrei Bondarev:

that embedding among the ones found in your vector search database. And you can serve those results right away to the user.

Charles Max Wood:

Okay, and that's what I've been assuming.

Andrei Bondarev:

If it's an open-ended question, if what you're building is an actual Q&A, then in order to translate those results into a proper natural language style response, you would actually construct a prompt with the original user's

Charles Max Wood:

Ah,

Andrei Bondarev:

question,

Charles Max Wood:

huh.

Andrei Bondarev:

and the top N results found in the vector search database.

Charles Max Wood:

Mm-hmm.

Andrei Bondarev:

And then you would prompt the LLM to write back an answer, given the results found.
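
A minimal sketch of the prompt construction step described here: combine the user's question with the top N retrieved results and ask the model to answer from them. The method name, the prompt wording, and the sample results are hypothetical, and the call to the LLM itself is left out because it depends on which provider and client you use.

```ruby
# Build a question-answering prompt from the user's question and the
# texts of the top N results returned by the vector search.
def build_qa_prompt(question:, results:)
  # Number each result so the answer can point back at its sources.
  context = results.each_with_index.map { |text, i| "[#{i + 1}] #{text}" }.join("\n")

  <<~PROMPT
    Answer the question using only the context below.
    If the context does not contain the answer, say that you do not know.

    Context:
    #{context}

    Question: #{question}
    Answer:
  PROMPT
end

# Made-up search results standing in for records from a vector search database.
results = [
  "Episode notes: a parliamentarian explains Robert's Rules of Order.",
  "Episode notes: indexing records with background jobs."
]

prompt = build_qa_prompt(question: "Has the show ever touched on politics?",
                         results: results)
puts prompt
# The prompt string would then be sent to whichever LLM you have chosen.
```

Since you know which records went into the prompt, you can show links to them next to the generated answer, as mentioned earlier in the conversation.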

Charles Max Wood:

Right. So this is the kind of thing you would expect to see out of a chat bot or something.

Andrei Bondarev:

Right.

Charles Max Wood:

And so, yeah, so I have my vector search database that has, say, all 4,700-and-something podcast episodes that we've done on Top End Devs. And so it feeds in the show notes and the transcript and the title and the short description and everything else. And then when it gets all that back, it constructs a prompt that it sends over to ChatGPT or Hugging Face or something like that and says, hey, this is all the information I have about this episode. Can you synthesize a two or three sentence response that tells them, hey, according to this episode, this is your answer?

Andrei Bondarev:

Yep, that's right.

Charles Max Wood:

Okay. That's pretty slick. So

Andrei Bondarev:

Yeah,

Charles Max Wood:

now

Andrei Bondarev:

so the...

Charles Max Wood:

my question is, oh go ahead.

Andrei Bondarev:

I was going to say, so the main use cases that we try to solve, or try to empower developers to solve, with Langchain.rb are things like vector search, building chatbots, and also building a Q&A on top of your proprietary data sets.

Charles Max Wood:

Okay. So yeah, and that was the question I was going to ask is, okay, so yeah, how are people using this? Is it just chat bots or are there other places that you're seeing people use that? I mean, I guess I could imagine that you could get a written out response on a webpage for doing a search, but people aren't accustomed to that.

Andrei Bondarev:

Yeah. Another experimental thing that we've been looking into, and built a couple of tools for in Langchain.rb, is agents. So basically trying to treat an LLM as a general problem solver. There's a lot of emerging research and different prompting techniques, for example, to try to have an agent work on a variety of general problems.

Charles Max Wood:

Okay. I'm kind of curious, is this something you're actually using in production somewhere? Or was this just an interesting problem for you to solve?

Andrei Bondarev:

The agents?

Charles Max Wood:

No, just any of the vector search stuff that we talked about so far. Like, are you using this at work?

Andrei Bondarev:

We've, so I've built a couple of pet projects, but I'm working on another product where we're gonna offer a Q&A on top of your documents. So actually, I'll do another take of that. Do you

Charles Max Wood:

Okay.

Andrei Bondarev:

mind repeating the question?

Charles Max Wood:

Yeah, I'm just going to mark the clip real quick so that our editor knows to hit it.

Andrei Bondarev:

Thank you.

Charles Max Wood:

All right. Yeah, so my question is, we're kind of talking about all these interesting solutions. So is this something you're using on your own products or at work somewhere or something like that?

Andrei Bondarev:

So I run a small software development firm. And actually, the way I stumbled into this space was this year after concluding a large... Sorry, I'll go again if that's OK.

Charles Max Wood:

It's all good, yep.

Andrei Bondarev:

Do you want me to just go?

Charles Max Wood:

Yeah, just go.

Andrei Bondarev:

All right. So I run a small software development firm. Last year, we worked with a client, a well-known public company that offers data analytics solutions on top of regulatory data, like rules, laws, and legislation, national and international. We helped them rebuild their core application, and a big part of it was rebuilding their search experience. So we built a traditional Elasticsearch cluster, the whole data pipeline, the APIs, and the front end to it. And as this AI wave started rolling in this year, I kind of had an epiphany that what we had just built, the project that we had just concluded, was sort of outdated because it was a traditional keyword search. So I started exploring the vector search space and all of the different databases. We're actively working with our current clients to enhance their products with these new capabilities, vector search and LLMs, and working with product managers to try to figure out where these capabilities can be added and where it makes sense to add them.

Charles Max Wood:

Very cool. And you know, what was your experience putting that stuff together? I mean, it sounds like the toolkit is pretty comprehensive and makes it pretty easy, but did you invent a lot of this stuff as you went or?

Andrei Bondarev:

Yeah, so I would say that the overall field is still very young. People are prototyping and testing a lot of different approaches.

Charles Max Wood:

Uh-huh.

Andrei Bondarev:

You know, even the whole field of prompt engineering is still kind of a black box.

Charles Max Wood:

Mm-hmm.

Andrei Bondarev:

It's unclear why you're getting certain results given certain prompts. You don't really have much visibility into the underlying system, especially if it's a closed source one like OpenAI.

Charles Max Wood:

The other end of that is that you also don't always know how to write the prompt that's going to get you the information you want, right?

Andrei Bondarev:

Exactly. Yeah. Exactly. So we've been looking at a lot of different libraries across other languages as well.

Charles Max Wood:

Mm-hmm.

Andrei Bondarev:

Looking at how things are done in Python and in Semantic Kernel. There's daily research coming out on the best prompts and how to work with LLMs in general. So we're trying to take all of that, put a kind of Ruby spin on top of it and offer that to the Ruby ecosystem.

Charles Max Wood:

Gotcha. So yeah, with what you're building with Langchain though, did you build Langchain as kind of a, hey, I'd like better tooling on this stuff? Or was this something you found and you've been contributing to it since?

Andrei Bondarev:

No, I'm the original author.

Charles Max Wood:

You're the author. Okay.

Andrei Bondarev:

And I've been very humbled and very amazed at the amount of attention that the project has been getting. I'm super thankful to all of the contributors and everyone that is actively participating in the project and in the discussion about the direction that this library needs to go and

Charles Max Wood:

Mm-hmm.

Andrei Bondarev:

what are we missing, what are we lacking, what should be added. We have a Discord server where we're a community of people building these types of capabilities into our products or launching new products. So we're in the Discord server talking about these things.

Charles Max Wood:

Yeah, this is super cool and it's definitely an area I want to explore. And it'd be interesting to see, yeah, which people found more useful, right? Just kind of the standard, uh, keyword to link to, you know, here's where the keywords appear in the show notes or whatever, or, you know, yeah, whether they want more of an answer and then, Hey, here's some context right here. The podcast episodes we referenced and the other things that we write. But yeah.

Andrei Bondarev:

Yeah,

Charles Max Wood:

So.

Andrei Bondarev:

and I think...

Charles Max Wood:

Okay.

Andrei Bondarev:

I find that the Ruby community tends to be very pragmatic.

Charles Max Wood:

Uh-huh.

Andrei Bondarev:

We focus very heavily on productivity, on developer happiness, on maintainability, on good patterns, good software development fundamentals, not reinventing the wheel, right? Obviously a lot of those values have been driven by Ruby on Rails and permeated throughout the whole ecosystem. Those are the principles that we're adopting with Langchain.rb. So I think, as opposed to some of the other implementations or languages, we're looking at all the problems through that lens, because we don't want to add tooling for tooling's sake. We want to make sure it reflects opinionated best practices, how we think LLM-based applications should be built.

Charles Max Wood:

Right. So one other question I have is, what's next? You know, what's coming next for Langchain.rb? Let's just start there. I have a few other what's-coming-next questions, but yeah, let's start there.

Andrei Bondarev:

So we'd like to add more vector search database vendors. We'd like to add more LLM vendors as well. We'd like to add support for local open source LLMs as well, so people can start prototyping those types of applications. And we're looking for feedback from anyone that's willing to provide it, on the documentation and on the abstractions we currently have in the project. We're very open to any kind of constructive criticism and definitely wanna make sure that what we're building is rooted in real use cases and not completely invented.

Charles Max Wood:

Makes sense. What do you see coming next for vector search and LLM technologies?

Andrei Bondarev:

I'm hoping that the open source LLMs will catch up to the proprietary ones, so that they're as powerful and also efficient enough that developers can either run them locally or host them themselves, so that we're not just tied to OpenAI, because OpenAI has some of the strongest models currently available. There's also a growing field of task-specific models, so models that are trained to accomplish specific tasks like summarization. So I think as opposed to using one powerful, general-purpose model for everything, people are going to start chunking out their workloads to specific models. In terms of the vector search space, like I've mentioned earlier, there are so many different solutions. I think a year from now there will probably be fewer players, since folks will have had much more time to actually battle test those systems. So I think the number of offerings is going to be significantly reduced, and we'll all just kind of mob over to a smaller set.

Charles Max Wood:

Right. So. Yeah. So just getting started, where do you suggest people start?

Andrei Bondarev:

Well, I would suggest checking out Langchain.rb and trying to add those capabilities to your existing products. Or even checking out the repo, cloning the repo, reading through the code base and seeing how the different things are accomplished: what kind of techniques we're using, what kind of prompts we're using, what kind of settings we're using for different vector search databases. I think that would be a good foot in the door if you are a Rubyist.

Charles Max Wood:

Sounds good. If people wanna connect with you or if they have other questions, is there a good place to find you online?

Andrei Bondarev:

Twitter is great. I think we can put some links in the show notes.

Charles Max Wood:

Yeah, we can definitely put a link in the show notes.

Andrei Bondarev:

Yeah, Twitter would probably be best.

Charles Max Wood:

All right. Yeah, we'll put a link to Twitter in the show notes then. All right, well then we'll just move ahead and we'll do some picks. I never know if our guests have listened to the show before, so I'll just give you a quick explanation. Picks are essentially just shout-outs about stuff we like. So, you know, it can be technical, it can be non-technical. I'll just give out a few examples here. I usually pick a board game as part of my picks. Now this last month I've been traveling a ton, so I haven't been hanging out with my board game friends, right? Which is usually where I get a new board game pick, because it's like, hey, we tried this game and it was great. So I'm gonna go back to one that I've played quite a bit on my phone, actually. It's called Star Realms. You can buy the cards, but it is much easier to just play the game on your phone, right? Because you don't have to buy expansions and you don't have to put it away or separate them out or whatever. So Star Realms, I wonder if it's on BoardGameGeek. BoardGameGeek is where I go to tell people how complicated a game is. I would imagine it's on here. Yeah, it looks like it is. So Star Realms, yeah, it came out in 2014. It's a two-player game. 20 minutes per round, yeah, that sounds about right. The weight is 1.93. Casual gamer with a semi-complicated game, right? So it's not an incredibly simple game, but it's not so complicated that a casual gamer couldn't pick it up. That's kind of what I'm looking at, right? This is a board game weight of 1.93, and I'm looking at a two for the casual gamer, right? Somebody who likes a game that makes them think but doesn't want a game that's super complicated. And so this one's right in there, right? It's pretty easy to pick up, stuff like that. So yeah, I like that. Like I said, you can pick it up, you can play it on your phone, you can play it against the computer, or you can play it against your friends on the app. And anyway, it's awesome.
So I'm gonna pick Star Realms. And then last week, I took my wife and kids to Disneyland. And so I'm gonna do some Disneyland picks this week. Now Disneyland's a ton of fun. They have California Adventure and Disneyland, and you can buy tickets that allow you to switch parks, the Park Hopper passes. Those tend to cost about $25 more per person. And since I have five kids, that was just a little bit expensive. It's $25 per person per day, and we were there for four days, right? So it would have cost us an extra $700 to get the Park Hopper pass. So what we did is we just got the one-park passes for four days, and then we just picked a park and that's where we were for the day. So we spent two days in California Adventure and two days in Disneyland. And oh man, it was such a good time. A few things I'm going to throw out. They did have the Genie Plus passes. Those are also $25 per person per day. We did buy those. Now the Genie Plus, if you haven't been for a while, you're probably more familiar with the Fast Passes. The Genie Plus is now the Fast Pass. You have to pay to use the Lightning Lane. But it was well worth it because you use the app, right? So unlike the old Fast Passes, you don't have to have your park ticket in your pocket and run over to the machine to stick it in the machine. You just tell it, I want to ride this ride using the Lightning Lane, and then it reserves your spot. And then when you get there, you just scan the barcode on your phone and you go into the Lightning Lane, and it's way faster. So that was well worth it. The newer rides, so there were a couple of rides in each one. I think there were two rides in each one that you had to pay for on top of the Genie Plus in order to get a Lightning Lane pass to those. So like the Radiator Springs Racers in California Adventure, and there was the new Rise of the Resistance. And I think it was because it was new that people really wanted those tickets. Anyway, so they were an extra like 20 bucks per person.
So we didn't pay to do Lightning Lane passes on those. But we had a great time. The Genie Plus is well worth it. The other thing that it does, though, is it automatically gives you the park photo pass. And so, when you ride on like Space Mountain or Incredicoaster or some of the other ones that take your picture, it gives you a code. Splash Mountain was closed by the time we were there. That's the one that's probably most famous for having the pictures. You can put the code into the app and you get a high-quality picture, but they also have people with digital cameras around the park, like high-quality cameras. And so we got our pictures taken by photographers that work for Disney. And you also can get it at the character encounters and stuff like that. You just have them scan a code on your phone and then it shows up and then you can download the pictures off of the app. So anyway, that was worth it. We also stayed at the Grand Californian Hotel, which is an on-resort hotel. The perk is you can get into the park a half hour early, so we were most of the way to the big rides by the time other people got into the park. But the hotel room itself wasn't that impressive. And overall, I don't think it was worth what it cost, because you can stay in a hotel nearby, and then you can either drive in and park at the park, or you can, anyway. So yeah, that's kind of my take on Disneyland. But I've been going to Disneyland since I was a kid. And so a lot of the stuff is more nostalgia than anything else. But it was fun to go with my kids and watch them enjoy it, especially my seven-year-old. She just loved it. So I'm going to pick that. And then one last thing I'm going to pick is, I'm looking at doing a podcasting workshop about the beginning of August, and it'll be a three-month course and I'll walk you through all the steps of starting a podcast. I've just had so many people ask me how to start one. It's terrific for your personal brand.
It's a great way to go if you want to meet people in the community, because almost everybody out there, if you send them an email and say, hey, you want to come on the podcast, they'll come on the podcast. Even the people where you think, never will this person who's kind of at the top of the community come on a tiny show like mine. And you would be surprised. They don't care as much about how big your show is as they care that they get the opportunity to put the word out and talk about the stuff they care about and see if they can make the impact they want to make on the community. The other thing is, it's also been an opportunity for me to say, hey, I want to meet general people in the community, and put something out there and say, hey, jump on a 15, 20 minute call with me. Right. And so I get just regular programmers from around the world, and we get to talk and see what they care about, what's good about their career, what things they wish they could learn more about, stuff like that, and just make connections. And so anyway, it's just a win-win for me. And you don't have to break the bank in order to start a podcast. So if you go to topenddevs.com slash podcast workshop, I should have that landing page up here within the next couple of days. And then you can go and you can reserve your spot in the workshop. Right now I'm looking at charging about $2,500 for the three-month workshop. And then yeah, there'll be a series of lessons and you'll probably just get the whole course all upfront. And then what we'll do is we'll just have office hours calls. Right, so you can go at your own pace.
If you can blow through it in a week, I mean, it's kind of a lot, but you know, what we're talking about is not just getting it launched and out there, but here's how you actually find people and grow and reach out to guests and all the kind of stuff that you need to know to make it really successful. Launching a podcast all on its own is really not that involved. But yeah, it's knowing who to reach out to and stuff like that. The other thing is that if I know somebody who would be a good guest on your show, I don't mind reaching out on your behalf, right, if you join the workshop. And so it's like, hey, I want somebody who's been in the industry for 40 years and written eight books on stuff and, you know, whatever, somebody of that caliber on my show, and I was thinking this person. If I know them, I don't mind reaching out and saying, hey, I've got some podcasting students. They're launching their shows. They'd like to have you on. And see if they're interested. So anyway, that's what I'm doing. And I have gone way too long with my picks. So Andrei, what are your picks?

Andrei Bondarev:

So I'm gonna pick a show, a game, a book, and a museum. So the show, so Black Mirror is back for another season. I started watching it and it's phenomenal. It's very dystopian.

Charles Max Wood:

Mm-hmm. It's a good show,

Andrei Bondarev:

be prep...

Charles Max Wood:

what I've seen of it.

Andrei Bondarev:

it's great. Be prepared for an entertaining watch. So the game I'll pick is a phone game called Polytopia. It's a strategy game. I'll leave it up to you to check it out, but just a warning that you will end up wasting a lot of time playing this game because it's very addicting. I love the multiplayer

Charles Max Wood:

Is it a

Andrei Bondarev:

mode.

Charles Max Wood:

computer game or a-

Andrei Bondarev:

It's a phone game, yeah.

Charles Max Wood:

That's a, oh, on your phone.

Andrei Bondarev:

On your phone, yeah.

Charles Max Wood:

Oh, okay. Okay, good deal.

Andrei Bondarev:

The book, I've been going back to this one over and over and over again. It's a very intense read. It's a very heavy read. Super technical. This one's called Designing Data-Intensive Applications by Martin Kleppmann. It looks at the landscape of different databases, it explains how they work, how to make different decisions when you're choosing between a NoSQL or relational database, how to stream data, batch process, message brokers, et cetera. It's a great read. And then I'll pick a museum, which is the Chess Hall of Fame here in St. Louis. If you're a fan of chess, it's an absolute joy to visit. Last time I went there, they had a Bobby Fischer and Boris Spassky special exhibit about that famous rivalry. And I was nerding out in there for quite some time.

Charles Max Wood:

Awesome. Yeah, I just looked up Designing Data-Intensive Applications, because that's something that sounds like it would be an incredibly good fit for our book club. Now, I'm gonna share another tip. I have Capital One Shopping installed as a plugin on my browser, and it found it on eBay for $11 cheaper than Amazon. So, you know,

Andrei Bondarev:

Yeah.

Charles Max Wood:

the delivery's not overnight, but if you don't need it tomorrow, right? Anyway. So I'm going to throw that out as a help if somebody's looking to buy it. But yeah, that looks like an amazing book. And it's showing that there's an audiobook too. All right, cool. Well, thanks for coming, Andrei. This was super fun. And yeah, another thing for me to go play with. Another cool technology.

Andrei Bondarev:

Yup.

Charles Max Wood:

But yeah,

Andrei Bondarev:

Thank

Charles Max Wood:

thanks

Andrei Bondarev:

you.

Charles Max Wood:

for coming and talking through this. This was fun.

Andrei Bondarev:

Thank you, thanks for having me. This was great.

Charles Max Wood:

All right, well, we'll wrap it up here, folks. Until next time, Max out.
