
Interesting. Curious what you found better about LanceDB compared to other vector DBs like Qdrant, Chroma, or others.


I initially built this same "chat with PDFs" prototype with LangChain and qdrant. I then rebuilt it from scratch for the sake of learning and comparison.

Some context: I've been a jack-of-all-trades data scientist / machine learning engineer for the past 15 years (officially titled as an MLE the last four years).

I share that only because I think it plays a role in how I'm typically accustomed to working.

1. I found LangChain to be overkill for this use case. While it might let some people build more quickly, I found it cumbersome. My suspicion is that this is largely because of my background - I understand how to build much of what's "under the hood" in LangChain. Because of this, it felt overly abstracted, and I found the docs difficult to navigate and sometimes incomplete.

2. I used Qdrant via their Docker image and it was simple to set up and start using. I didn't try to push the limits with it, so I can't say anything about performance. Because Qdrant runs as an HTTP service, I found that it didn't fit well into my workflow - I'm accustomed to being able to visually inspect my data inside the interpreter, debug, try out commands, and interact and experiment with my results. Again, my suspicion is that this is my own bias in how I typically work. Qdrant otherwise seemed very nice.

3. LanceDB felt powerful yet lightweight, and fit well into my workflow. It was far more intuitive for me. It was as if sqlite, the python data ecosystem, and a vector database had a child and named it LanceDB. Under the hood, it's built on Apache Arrow and integrates nicely with pandas, allowing me to seamlessly go from LanceDB table on disk, to pandas dataframe, and into some analysis or investigation of my LanceDB query results. This line [1] is a great example of why I liked it. This feels nicer to me than the world of API params and HTTP requests.

1. https://github.com/gjreda/scratch-pdf-bot/blob/main/gpt_pdf_...


Thank you for elaborating. I concur about LangChain and Qdrant.

With LangChain it was a struggle to figure out what was going on under the hood; I had to pull together multiple pieces from multiple notebooks simply to see what the Conversational Retriever Chain does. And then it was trivial to implement a variant of it myself with all the pieces transparently in one place.
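
Roughly, a conversational retrieval chain does three things: it condenses the chat history plus the follow-up into a standalone question, retrieves the relevant chunks for that question, and then answers from those chunks. A minimal sketch of that flow, assuming hypothetical `llm` and `retrieve` callables (stand-ins for whatever model and vector store you use; this is not LangChain's API, and the prompt wording is made up for illustration):

```python
from typing import Callable, List, Tuple

# Hypothetical stand-ins: `llm` maps a prompt to text, `retrieve` maps a
# query and k to the top-k text chunks from whatever vector store is in use.
LLM = Callable[[str], str]
Retriever = Callable[[str, int], List[str]]
History = List[Tuple[str, str]]  # (user question, assistant reply) pairs


def condense_question(llm: LLM, history: History, question: str) -> str:
    """Rewrite a follow-up as a standalone question using the chat history."""
    if not history:
        return question  # first turn: nothing to condense
    transcript = "\n".join(f"User: {q}\nAssistant: {a}" for q, a in history)
    prompt = (
        "Rewrite the follow-up as a standalone question.\n\n"
        f"Chat history:\n{transcript}\n\n"
        f"Follow-up: {question}\nStandalone question:"
    )
    return llm(prompt).strip()


def answer(llm: LLM, retrieve: Retriever, history: History,
           question: str, k: int = 4) -> str:
    """One chat turn: condense, retrieve, answer, then record the turn."""
    standalone = condense_question(llm, history, question)
    chunks = retrieve(standalone, k)
    context = "\n---\n".join(chunks)
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {standalone}\nAnswer:"
    )
    reply = llm(prompt).strip()
    history.append((question, reply))  # keep the original wording in history
    return reply
```

With everything in two small functions, it is easy to print the condensed question or the retrieved chunks at each turn, which is the kind of visibility that was hard to get from the chain abstraction.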

I like the Qdrant interface and docs, and it seems to work well so far. For local testing I used their Python client rather than Docker, and it was seamless to switch to their cloud. My use case doesn’t involve pandas (maybe I wanted a breather from years of pandas-wrangling!); I think the OpenAI cookbook repo has examples of using pandas in combination with Qdrant (and many others).



