What (un)exactly do you mean by semantic search?

Qdrant offers high-performance vector search at scale with any deployment model.

Connect with Brian on LinkedIn or email the Qdrant team at support@qdrant.io.

Congratulations to user Brad Larson for winning a Populist badge for their answer to Find the tangent of a point on a cubic bezier curve.

TRANSCRIPT

[Intro Music]

Ryan Donovan: Hello, and welcome to the Stack Overflow podcast, a place to talk all things software and technology. I am your host, Ryan Donovan, and today we are talking about the difference between vector databases and Lucene architectures, when they're appropriate to use, one or the other, and if there is a composable, portable way to use them instead. So, my guest for that is Brian O'Grady, Head of Field Research and Solutions Architecture at Qdrant. Welcome to the show.

Brian O'Grady: Hi, Ryan. Thanks for having me on.

Ryan Donovan: Before we get into the database topics today, can you tell us a little bit about how you got into software and technology?

Brian O'Grady: When I graduated from university back in 2016, I initially went into finance, actually. I was working at Fidelity Investments. After some time there, I got an opportunity at Goldman Sachs, and I entered their technical organization and worked there for four years as a data scientist. I then moved over to Shopify, where I was a data scientist in MLE for about two and a half years. And after that, I moved into the world of databases. So, I moved on to DataStax, and once that was acquired by IBM, I jumped ship and moved over to Qdrant, which was a quasi-competitor at that point because DataStax had its own vector search offering.
So, I saw Qdrant a lot in the field and came to appreciate its product begrudgingly.

Ryan Donovan: Yeah. It seems like today every database has a vector offering, right? And today, we're talking about when it's appropriate to use the vector database and the sort of older technologies. For folks who may be unaware, what is the Lucene database?

Brian O'Grady: Apache Lucene is a text search engine that was developed in the late 90s. Very mature, very rich feature set for text search. It is the underlying index that powers Elasticsearch, Solr, OpenSearch, and even if you are like me, working with Cassandra, even powers the text indexes in Cassandra. So, for a long time, if you were ever on a website and searching for red Nike shoes, and you saw the search results pop up underneath, more than likely that was being powered by Lucene.

Ryan Donovan: Obviously, vector database has become very popular because of AI, but Lucene, and Elastic, and things that are powered by Lucene are still very popular. What's the sort of comparison the way of thinking of when each one is appropriate and when do they have trade-offs?

Brian O'Grady: It's definitely going to depend on what you're trying to achieve with your application. So, if you think about when do people use Elasticsearch, OpenSearch, what kind of general use cases are they looking for, a lot of times it's these live applications where you're surfacing e-commerce results to an end user. A lot of what actually drives their revenue are logs and analytics. So, this is just, "hey, I have a dump of all of my security logs." You dump them in Elastic, and you just occasionally search through to find whatever happened on this specific day, right? And this is where a text search is really good because typically, in security events logging, you're looking for exact terms, right? You're saying, "I want to know exactly where this error appeared."

Ryan Donovan: Some other trace UUID or whatever.

Brian O'Grady: Yeah.
And the issue with if you try to do vector search for the same thing is you wouldn't get exact matches because vector search is approximate, and you lose information. So, if you tried to search for this exact UUID and embed it as a vector, and then try to search that vector against other vectors: number one, you're losing information when you're doing the embedding because that's a natural flaw, like natural process loss; and then number two, you're only going to be getting approximate results because vector search at scale is an approximate search, whereas text search is always an exact search, right? You're always getting exact results back.

So, that's, I would say, a key use case for Elastic and OpenSearch. People doing logging analytics where you need that exact search functionality, but when you start diving into you have maybe [a] user-facing application that needs to service a lot of requests, you don't necessarily need to have exact matches. Maybe you want people to– when people search iPhone, you want to also be able to surface them other types of phones out there. You don't just want to surface iPhones, you want to give them a bunch of options. Text search will fail here because text search will only look for pieces of text that include iPhone, whereas, quote-unquote, semantic search, which is really representing text as embeddings, tends to preserve this idea that different phone types are related to each other. So, you can surface non-exact results to people that may actually still be relevant to what they're looking for. At scale, this is typically where at Qdrant, we see Lucene-based architectures start to fail.

Ryan Donovan: A lot of database types today, they have a vector add-on, right? Is that sort of vector add-on useful in conjunction? Does it help mitigate those failures, or are you better off with a sort of pure vector database for those sorts of things?

Brian O'Grady: Yeah. What I'll say is that not all vector indexes are created equal.
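
[The contrast drawn above between exact text search and embedding-based search for the "iPhone" query can be sketched in a few lines of Python. This is an illustration only, not code from the episode or from any product mentioned in it. The catalog, the three-dimensional vectors, and the helper names `keyword_search` and `vector_search` are all hypothetical; real embeddings come from a model and have hundreds of dimensions, and a production vector index would use approximate nearest-neighbor search rather than the brute-force cosine ranking used here.]

```python
import math

# Hypothetical catalog. The 3-d vectors are hand-made toy "embeddings",
# chosen so that the phone products point in a similar direction.
CATALOG = {
    "iPhone 15 case": [0.9, 0.1, 0.0],
    "Pixel 8 smartphone": [0.8, 0.2, 0.1],
    "Galaxy S24 smartphone": [0.85, 0.15, 0.05],
    "red Nike running shoes": [0.0, 0.1, 0.95],
}


def keyword_search(query, docs):
    """Exact term match: keep documents that contain every query token."""
    tokens = query.lower().split()
    return [d for d in docs if all(t in d.lower().split() for t in tokens)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def vector_search(query_vec, catalog, k=3):
    """Brute-force ranking by cosine similarity (exact, not approximate)."""
    ranked = sorted(catalog, key=lambda d: cosine(query_vec, catalog[d]), reverse=True)
    return ranked[:k]


# Toy embedding assumed for the query "iphone" (made up for this sketch).
QUERY_VEC = [0.9, 0.1, 0.0]

keyword_hits = keyword_search("iphone", CATALOG)
vector_hits = vector_search(QUERY_VEC, CATALOG)

print("keyword:", keyword_hits)
print("vector: ", vector_hits)
```

[In this toy setup the keyword search returns only the document that literally contains "iphone", while the similarity ranking also surfaces the other phones and leaves the shoes out of the top three. The same mechanism explains why embedding a UUID is a poor fit: nearby vectors are returned, not a guaranteed exact string match.]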
And if we look at who are the, what we call 'vector natives' in the space, we think, okay, there's Qdrant, there's Milvus, there's Pinecone, right? Pinecone's a big one. And I think that's the main three or four that you see people going out after there. Then, you contrast with, okay, what offerings are there that are what we call in the industry bolt-on vector search indexes. The main one that people tend to be using is—it's either two, it's either they have an existing Elastic or OpenSearch cluster, they want to add semantic search, AKA vector search capability, they bolt on a vector index to their existing cluster, and voila, they blow up, they run out of memory, and then they have to completely resize and think about how to go about their workload. The other common one outside of Elastic would be, and probably the most well-known example, is Postgres. Postgres has their pgvector. Very common. People are using it all the time. And what's interesting is that I think a lot of other vector database companies out there see someone using pgvector and they'll write a lot of, "no, pgvector is bad, it doesn't scale." And I'm a bit like– I don't see it as much of a threat. I actually see someone using pgvector as an indicator that they will eventually use Qdrant. Yeah, I see this so frequently that people start out on pgvector because it's super simple to use. You know it's just gonna work. People can stand it up in Docker locally, and they can just add on pgvector extension, and suddenly they have everything. They have transactional data, they have vector search running in the same workload, right? The issue becomes when they actually hit scale, and typically what I see is around– they'll have 10 million rows in their database, and they'll have their vector search running, and suddenly their latencies are spiking to 60 seconds for a single request. 
And by the way, also their traditional SQL transactional workload is failing because the vector search index in the background is taking up so much computational power that they have to eventually separate them and go to a dedicated service. I think a lot of people are afraid of pgvector, but again, to me, it's more like– I've called it almost like a gateway drug to vector search. As soon as you use pgvector, you're in.

Ryan Donovan: Yeah. Postgres almost seems like the new MySQL, right? It's the new starter database. But that sort of answers an interesting question that I've had, where it's like, now I see so many specialized databases popping up. I've also seen some that are like, "you only need this one database." What is the advantage of just focusing on a vector database?

Brian O'Grady: Yeah. I think it's like anything else in technology, that there are a lot of use cases where, yes, your single database or your single monolithic architecture will work up until a certain point, right? And it's not just databases, right? I go on to a lot of GitHub repos, and there are those monolithic repos where they have everything under the sun, right? And maybe Qdrant is a part of it, and I'm trying to update the version, and suddenly I can't just update the version because it's a monolithic repo, now the build time is astronomically large. I'm running into weird dependency conflicts because this version of LangChain Qdrant is different than the LangChain version that comes with the LangChain community; they have installed some other part of the repo. So, what I think it comes down to is Qdrant's take on this is that we want to follow the Unix philosophy, where it's do one thing and do it extremely well.
So, you can imagine a scenario where people who follow that philosophy, they tend to find specialized tools for their different tasks, and then just coordinate them properly, rather than trying to work with a more monolithic architecture that tries to handle everything, which can become untenable with time and scale, right? So, I think it becomes easier to maintain. It allows for better separation of concerns. To say you want to have a dedicated vector database, maybe some people say there's some added overhead of it now having to coo
